1. Audio Conversion Tool
    2. Sound waves, converted to binary-number data, are stored in computer memory as files by the audio hardware. These files are interpreted by Audio ToolBox™ as audio waveforms, the shapes of which are analogous to the characteristics of the recorded sound. Thus Audio ToolBox lets you change the binary numbers that have been stored on your computer by your audio hardware. Just as the contents of a spreadsheet file are made intelligible and usable by a spreadsheet program, so are the binary amplitude values made useful by Audio ToolBox. Audio ToolBox stores these large amounts of data in files on your computer hard disk. Unlike most other converters, Audio ToolBox does not load a copy of its data into memory when you process a sound file; all processing is done directly on the data stored on the hard disk. This allows Audio ToolBox to manipulate sound files of any size, up to your available disk space.

      1. When Would You Use Convert?
      2. The audio conversion tool lets you convert to and from Microsoft Multimedia Wave (8, 16 & MS ADPCM), linear 16 and unsigned 8 bit, Dialogic 4 and 8 bit, plus other industry standard formats. Optionally choose volume adjustments, dynamic compression/expansion and fast or high fidelity filtering and re-sampling algorithms.

        The following sections refer to options as seen on the ToolBox Apprentice screen. If you prefer to use Audio ToolBox from the DOS environment, and not use ToolBox Apprentice, simply skip to the section "Conversion Command Line Options".

      3. Convert Source Audio File
      4. The Source Audio File field lets you enter the name of your source file. You can also choose a source file by using the Browse Button (be sure to position your cursor in the Source Audio File field before clicking the Browse Button).

        1. File Data Type

Your audio files are defined by both a "file type" and a "data type". Audio ToolBox supports many different file types and data formats. You may need to review information provided by your system vendor to determine which one is best for your particular needs.

Information required includes:

File Data Type: be sure to consider the following:
What kind of sound file? What kind of data is stored in the file? These files may contain pure digitized data or digitized data combined with format, control, and annotation information.
Which of the many specialized encoding methods does it use?
Frequency: Do you want telephony sampling frequency or higher fidelity?
        1. File Types

The audio conversion tool supports two different basic file types: Pure files and Wave files.

Pure files contain only the digitized sound.
Wave files contain raw audio data plus additional format, control, and annotation information.
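
To illustrate the difference between the two file types, here is a minimal Python sketch. It is not part of Audio ToolBox; it uses Python's standard `wave` module, and the helper name `make_wave_bytes` is made up for this example. It wraps a few raw samples in a Wave container and shows that the Wave file is the Pure data preceded by header information.

```python
import io
import struct
import wave

def make_wave_bytes(pcm: bytes, rate_hz: int = 11025) -> bytes:
    """Wrap raw 16 bit mono PCM in a Wave container and return the file bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)        # mono
        w.setsampwidth(2)        # 2 bytes per sample = 16 bit
        w.setframerate(rate_hz)  # sample rate stored in the header
        w.writeframes(pcm)
    return buf.getvalue()

# Four 16 bit little-endian samples: this is all a "Pure" file would hold.
pure = struct.pack("<4h", 0, 1000, -1000, 0)

# The same samples as a Wave file: header first, then the audio data.
wav = make_wave_bytes(pure)
header_size = len(wav) - len(pure)
```

If a Wave file is read as though it were a Pure file, those `header_size` leading bytes are treated as audio samples.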
        1. Data Types

Following are some industry standard PCM data types and a brief description. Some of the data types are pretty sophisticated, so it’s difficult to provide much more information given the scope of this document. Your version of the audio conversion tool supports many, but not all, of these data types. Please contact VISI regarding specialized or custom formats.

Pika AVA: Pika Technology 4 bit Adaptive Delta PCM, similar to Dialogic 4 bit ADPCM.
Brooktrout CVSD: Brooktrout Technology 1 bit Continuously Variable Slope Delta Modulation.
Dialogic ADPCM: Dialogic Corporation 4 bit Adaptive Delta PCM.
Dialogic PCM: Dialogic Corporation 8 bit u-law similar to the C-ITU (CCITT) G.711 u-law.
Float: 32 bit floating point data with a 24 bit mantissa and 8 bit exponent. This is an IEEE standard.
FFT Real: A 32 bit floating point frequency representation of the real portion of an audio signal.
C-ITU (CCITT) G.711: A-Law (inverted A-Law) or u-law (folded u-law): An international standard 14 bit to 8 bit compression.
C-ITU (CCITT) G.721: An international standard 4 bit sub-band encoded Adaptive Delta PCM.
C-ITU (CCITT) G.723 (ITU-T) 3 bit ADPCM: An international standard 3 bit sub-band encoded Adaptive Delta PCM.
Harris CVSD: Harris Semiconductor 1 bit Continuously Variable Slope Delta Modulation.
Multimedia unsigned 8 bit: unsigned 8 bit (0 to 255) linear data samples.
Multimedia linear 16 bit: signed 16 bit normalized (+/- 32,767) linear two’s complement data samples. This format is an industry standard for storing digitized sound.
MS Wave ADPCM: Microsoft Adaptive Delta PCM.
New Voice CVSD: New Voice 1 bit Continuously Variable Slope Delta Modulation.
Perception CVSD: Perception Technology 1 bit Continuously Variable Slope Delta Modulation.
Rhetorex 3 bit ADPCM: Rhetorex Corporation super secret proprietary Adaptive Delta PCM.
Rhetorex 4 bit ADPCM: Rhetorex Corporation super secret proprietary Adaptive Delta PCM.
TTI 8 bit: Talking Technology 8 bit unsigned PCM similar to Multimedia unsigned 8 bit.
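
As a rough illustration of how the two linear Multimedia data types relate, the following Python sketch maps samples between unsigned 8 bit and signed 16 bit. The function names are hypothetical and the mapping is one common convention, not necessarily the one Audio ToolBox uses internally.

```python
def u8_to_s16(sample: int) -> int:
    """Map an unsigned 8 bit sample (0..255) to the signed 16 bit range."""
    # 128 is the 8 bit midpoint (silence); shifting left by 8 scales to 16 bit.
    return (sample - 128) << 8

def s16_to_u8(sample: int) -> int:
    """Map a signed 16 bit sample to unsigned 8 bit (0..255), dropping the low byte."""
    return (sample >> 8) + 128
```

Going from 16 bit to 8 bit discards the low-order byte of each sample, so a round trip from 8 bit to 16 bit and back is exact, but the reverse is not.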
        1. Files Converted with the Wrong File Data Type
        2. If you convert a file with the wrong File Data Type, Audio ToolBox will try to detect this condition and report an error. Some data types, however, cannot be accurately detected and the converted file will sound like static, or have sections of "static" at the beginning, end, and possibly intermixed. This "static" is actually header, format and audio information misinterpreted as audio data. Simply re-convert the file with the proper file data type.

          Since Audio ToolBox supports many different file data types, you may need to review information provided by your system vendor to determine which one will work for you.

        3. File Frequency
        4. The source file frequency specifies the sample rate, or "recording speed", used to create the file. This number is similar to the speed of vinyl records: LPs were typically recorded at 33 RPM while other types were recorded at 45 or even 78 RPM. Just as your LPs won’t sound right when played at 45 RPM, your audio files will sound strange if you use the wrong file frequencies.

        5. Files Converted with the Wrong Frequency

If you convert a file with the wrong Frequency, the file will usually sound fast or slow but otherwise sound intelligible. Simply re-convert the file with the proper frequency. Since the source and destination frequencies are related, the following may help you determine how to change your frequency settings:

If the file sounds too fast when played back, the Source Frequency is set too high or the Destination Frequency is set too low.
If the file sounds too slow when played back, the Source Frequency is set too low or the Destination Frequency is set too high.

Since Audio ToolBox supports virtually any frequency, you may need to review information provided by your system vendor to determine the correct frequency for your particular application.
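
The fast/slow rules above can be checked with a little arithmetic. The Python sketch below is a simplified model, not Audio ToolBox code, and the function and parameter names are assumptions. It supposes that the converter re-samples by the ratio of the Destination to the Source setting, and that your playback system plays at a fixed rate.

```python
def playback_speed(true_rate_khz: float, source_setting_khz: float,
                   dest_setting_khz: float, playback_rate_khz: float) -> float:
    """Return the apparent playback speed (1.0 = correct, >1 fast, <1 slow).

    true_rate_khz      rate the source file was really recorded at
    source_setting_khz Source Frequency given to the converter
    dest_setting_khz   Destination Frequency given to the converter
    playback_rate_khz  rate at which the playback system plays the result
    """
    # The converter scales the sample count by dest/source; the player then
    # consumes samples at playback_rate instead of the true rate.
    return (source_setting_khz * playback_rate_khz) / (true_rate_khz * dest_setting_khz)
```

For example, an 8 kHz recording converted with a Source Frequency of 11.025 kHz gives a speed greater than 1.0 (too fast), while a Source Frequency of 6 kHz gives a speed less than 1.0 (too slow).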

        1. Indexed Files

As we’ve said earlier, Audio ToolBox supports Pure (raw audio data) and Multimedia Wave (headered) files. Indexed (VBase) files are groups of related audio segments which are accessed by specifying an index number. When converting an indexed file, you must first use the index chop tool to separate it into its component parts. Then use the index pack tool to put it back together.

      1. Convert Destination Audio File
      2. The Destination Audio File field lets you enter the name of your destination file. Just as in the Source Audio File field, you can also choose a destination file by using the Browse Button (be sure to position your cursor in the Destination Audio File field before clicking the Browse Button).

        1. Backup Checkbox

        By default, Audio ToolBox creates a backup copy of your file if you perform an "in place" conversion (i.e., do not specify a destination audio file). This protects your original file if you accidentally forget to specify a destination file (and wish to retain the original), or if you do not like the resulting conversion and wish to go back to the original. If this checkbox is set, Audio ToolBox backs up the most recent copy of your source file and names the backup with the extension ".BK1". Turn this checkbox off if you are short of disk space or do not wish to retain a backup copy of your original file.

      3. Convert Filters

During conversion, Audio ToolBox re-samples the source file (recorded at any frequency) so that the resulting destination file matches the specified output frequency.

Anti-aliasing Filter: The anti-aliasing filter removes unwanted high frequencies when converting from a higher to a lower frequency.

Use the default low-pass 2nd order float anti-aliasing filter for standard fidelity. Use the low-pass 6th order or FFT FIR anti-aliasing filters for improved fidelity. Use the "fast" filters for improved speed on 286/386 class computers. Turn the filters off for maximum conversion speed.

Re-sample Algorithm: The re-sampling algorithm determines how audio data will be generated or removed to adjust a file for the new frequency.

Use the default linear algorithm for standard fidelity. Use the re-sample FFT algorithm for best fidelity.
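
To give a feel for what linear re-sampling means, here is a minimal Python sketch of linear interpolation between neighboring samples. It is an illustration only: the function name is hypothetical, it applies no anti-aliasing filter, and it should not be read as the algorithm Audio ToolBox actually implements.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Re-sample a list of samples from src_rate to dst_rate by linear interpolation."""
    if not samples:
        return []
    n_out = int(len(samples) * dst_rate / src_rate)
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate      # fractional position in the source
        j = min(int(pos), last)            # sample at or before that position
        frac = pos - j
        a = samples[j]
        b = samples[j + 1] if j < last else samples[j]
        out.append(a + (b - a) * frac)     # straight line between a and b
    return out
```

Doubling the frequency generates a new sample halfway between each original pair; halving it removes every second sample.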

      1. Convert Options
      2. Convert options let you specify additional parameters to your Convert command so that you can finely tune the results to your liking.

        1. Convert Auto-Crop
        2. The Convert Auto-Crop command option looks at the selected sound segment and trims the quiet sections from the beginning and end, automatically, without accidentally erasing any audible part of the file.

          The guard time parameter sets the amount of time to leave at either end of the cropped sound segment. Increase the guard time for more silence at the beginning and end, decrease for less time.
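
          The effect of the guard time can be sketched in Python as follows. This is a simplified model, not Audio ToolBox code: the function name and the fixed silence `threshold` are assumptions made for the example, and the real tool may detect silence differently.

```python
def auto_crop(samples, rate_hz, guard_sec=0.05, threshold=500):
    """Trim quiet samples from both ends, leaving guard_sec of margin at each end."""
    # Indexes of every sample louder than the (assumed) silence threshold.
    loud = [i for i, s in enumerate(samples) if abs(s) > threshold]
    if not loud:
        return []                                  # nothing audible to keep
    guard = round(guard_sec * rate_hz)             # guard time in samples
    start = max(loud[0] - guard, 0)
    end = min(loud[-1] + 1 + guard, len(samples))
    return samples[start:end]
```

With the default guard time of 0.05 seconds, a file sampled at 6 kHz keeps up to 300 samples of silence on either side of the audible portion.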

        3. Convert Normalize
        4. Convert Normalize scans the selected audio segment and tries to set the maximum volume level "To" an ideal target level with a permissible number of "Over" exceptions. The ideal "To" level is the volume level that would be achieved if the signal were a perfectly symmetrical wave; the "Over" exceptions compensate for the fact that most real audio waveforms are not symmetrical.

          The "To" level is expressed as a percentage of the maximum amplitude possible with the current data format. The default value of 80% should be appropriate for most applications.

          In the "Over" parameter, you allow the volume to go over the target range some percentage of times in order to achieve a better overall volume level. Set this parameter to zero to force no portion of this signal to go over the target level.

          When you normalize a file, Audio ToolBox sets the levels of the file to an optimum level based upon a theoretical ideal. If you normalize multiple files, one at a time, the end result will be that they are each equal to an ideal, thus they each equal each other. You may remember from your school mathematics that if A=Z and B=Z, then A=B. This is the same basic principle used for equalizing multiple files, one at a time.

          When you Convert Normalize to adjust the overall volume level of a group of files, the files will sound roughly equal in volume.
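
          One way to picture the "To" and "Over" parameters is the Python sketch below. It assumes that "Over" is the percentage of samples allowed to exceed the target level; that interpretation, the function name and the parameter names are assumptions for illustration, not a description of the actual Audio ToolBox algorithm.

```python
def normalize_gain(samples, full_scale=32767, to_pct=80.0, over_pct=1.0):
    """Return the gain that brings the signal up to to_pct of full scale,
    letting roughly over_pct percent of the samples exceed that target."""
    mags = sorted(abs(s) for s in samples)
    if not mags or mags[-1] == 0:
        return 1.0                                   # empty or silent: leave alone
    # Number of the loudest samples permitted to go over the target.
    allowed_over = min(int(len(mags) * over_pct / 100.0), len(mags) - 1)
    reference = mags[len(mags) - 1 - allowed_over]   # loudest sample that must fit
    if reference == 0:
        return 1.0
    return (full_scale * to_pct / 100.0) / reference
```

Setting `over_pct` to zero uses the single loudest sample as the reference, so nothing exceeds the target. Larger values ignore a few isolated peaks and yield a higher overall level.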

        5. Convert Dynamic Compress / Expand
        6. Convert Dynamic Compress / Expand lets you improve the dynamic range of the audio. The dynamic compressor / expander works by analyzing the audio signal and increasing the volume level on sections that are too soft while decreasing the volume level on sections that are too loud.

          The "Maximum" parameter limits the amount that the compressor / expander will increase or decrease the volume level of the audio. Increase this number (expressed in decibels) for a more noticeable effect.

          Use this command to improve the "depth" and "range" of the recorded voice.

        7. Convert Volume Adjust

        The Convert Volume Adjust command permanently changes the loudness (volume level) of a previously recorded file. Use this command when an existing recording is too loud, or when a portion of a recording is not loud enough.

        The Convert Volume Adjust command adjusts the amplitude of the selected sound segment by the number of decibels that you have chosen. A value of 0 dB specifies that the volume level will remain the same. Negative values specify that the level should be quieter; positive values specify that the volume level should be increased.

        Technically speaking, Convert Volume Adjust increases or decreases the signal voltage by the specified decibel level.
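
        The relationship between decibels and signal amplitude is the standard one: a change of X dB multiplies the amplitude by 10^(X/20). The Python sketch below illustrates this for 16 bit samples. The function names and the clipping behavior are assumptions for the example and are not taken from Audio ToolBox.

```python
def db_to_gain(db: float) -> float:
    """Convert a decibel change to an amplitude (voltage) multiplier."""
    return 10.0 ** (db / 20.0)

def adjust_volume(samples, db, full_scale=32767):
    """Scale each sample by the given number of decibels, clipping at full scale."""
    gain = db_to_gain(db)
    return [max(-full_scale, min(full_scale, round(s * gain))) for s in samples]
```

A setting of +6 dB roughly doubles the amplitude, -6 dB roughly halves it, and 0 dB leaves the samples unchanged.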

      3. Convert Equalize
      4. The Equalize button lets you fine tune the sound of your words, to change the quality of the vibrations themselves. While a microphone records the whole range of vibration in sound levels, the equalizer can be used to make certain ranges softer and other ranges louder, so that the net effect makes you sound different.

        When you select Equalize, you will be presented with a dialog box that lets you adjust the relative levels of frequency bands. By moving the controls up or down from the center point, you will increase or decrease the volume level of the audio in those frequency bands. After adjusting the frequency band controls, press "OK" to apply the changes to the recorded audio during the conversion process.

      5. Convert Advanced
      6. Audio ToolBox offers additional features which you may find useful in handling foreign file formats and specialized conversion tasks. You can specify a starting byte position to skip custom file headers, set input file length for extracting embedded data, even select byte and bit ordering options for reading files from non-Intel computers!

        1. Position
        2. This is the position of the input file, measured in bytes, where you will begin your conversion (default is byte 0). Use this setting to skip a portion of the beginning of the source audio file.

        3. Length
        4. This is the length of the input file, measured in bytes (default is entire file). Use this setting to convert only a portion of the source audio file and leave some portion of the end of the source audio file unconverted.

        5. Swap Input
        6. Use this setting to swap input bits, nibbles, bytes or words (default is none). Use this setting when converting files from a non-Intel based computer.

        7. Swap Output
        8. Use this setting to swap output bits, nibbles, bytes or words (default is none). Use this setting when converting files for a non-Intel based computer.
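
        The combined effect of Position, Length and a byte swap can be sketched in Python as follows. This is an illustration with a hypothetical function name, not Audio ToolBox code. It selects a byte range from the input and then exchanges the two bytes of every 16 bit sample.

```python
def select_and_swap_bytes(data: bytes, position: int = 0, length=None) -> bytes:
    """Take `length` bytes starting at `position`, then swap each pair of bytes."""
    end = len(data) if length is None else position + length
    chunk = bytearray(data[position:end])
    if len(chunk) % 2:
        chunk = chunk[:-1]          # ignore a trailing odd byte
    # Exchange the even-indexed and odd-indexed bytes ("high-low" <-> "low-high").
    chunk[0::2], chunk[1::2] = chunk[1::2], chunk[0::2]
    return bytes(chunk)
```

For example, skipping a 3 byte header and converting the next 4 bytes turns the sample bytes 12 34 56 78 (hexadecimal) into 34 12 78 56.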

        9. Configuration Files

        Lists the names of the global, input and output configuration files (default is none). Use this setting to override internal conversion variables. For VISI technical support use only.

      7. Conversion Command Line Options
      8. The following tables describe in detail the audio conversion command line parameters and their usage. If you prefer to use only the ToolBox Apprentice, and work solely from the Windows environment, simply skip this section.

        PCMCvt

        PCMCvt [-help] OldFile [NewFile] [-ac -b -di -do -fi -fo -li -ng -ni -no -pi -qa -qd -qe -ql -qr -q2 -q6 -si -so -va -vd -vn]

        Convert to and from Microsoft Multimedia Wave (8, 16 & MS ADPCM), linear 16 and unsigned 8 bit, Dialogic 4 and 8 bit, plus other industry standard formats. Optionally choose volume adjustments, dynamic compression/expansion and fast or high fidelity filtering and re-sampling algorithms.

        Parameters Description

        -h Displays abbreviated help screen.

        OldFile Source file name specification.

        NewFile Destination output file name specification. If not specified, Audio ToolBox will overwrite the input file.

        -acX.x Auto-crop (default sound guard time is 0.05 sec). Use this setting to automatically remove silence from the beginning and end of the file. Increase the guard time for more silence at the beginning and end.

        -b Inhibits creation of a backup file when no output file is specified. Use this setting to save disk space.

        -diXXX Optional input data format; default is Wave Multimedia. Use this setting to specify the format of the source audio file.

        Value Meaning

        DLG004 Dialogic 4 bit (OKI).
        Default frequency is 6 kHz.

        DLG008 Dialogic 8 bit (u-Law).
        Default frequency is 6 kHz.

        G11F08 C-ITU (CCITT) G.711 (Folded) u-Law 8 bit.
        Default frequency is 8 kHz.

        G11I08 C-ITU (CCITT) G.711 (Inverted) A-Law 8 bit.
        Default frequency is 8 kHz.

        G21004 C-ITU (CCITT) G.721 4 bit ADPCM.
        Default frequency is 8 kHz.
        (C-ITU (CCITT) installation only)

        MPC008 Multimedia 8 bit.
        Default frequency is 11.025 kHz.

        MPC016 Multimedia 16 bit.
        Default frequency is 11.025 kHz.

        NWV001 NewVoice CVSD 1 bit.
        Default frequency is 32 kHz.
        (New Voice installation only)

        PTC001 Perception Technology CVSD 1 bit.
        Default frequency is 32 kHz.
        (Perception Technology installation only)

        VIS016 VIS Interchange format 16 bit.
        Frequency specified in file header.

        WAV000 Wave Multimedia (input default).
        Frequency specified in file header.

        -doXXX Optional output data format; user selects default when running the setup program. Use this setting to specify a destination audio file format that differs from the default.

        Value Meaning

        DLG004 see input data format.

        DLG008 see input data format.

        G11F08 see input data format.

        G11I08 see input data format.

        G21004 see input data format.

        MPC008 see input data format.

        MPC016 see input data format.

        NWV001 see input data format.

        PTC001 see input data format.

        WAV008 Wave Multimedia 8 bit.
        Default frequency is 11.025 kHz.

        WAV016 Wave Multimedia 16 bit.
        Default frequency is 11.025 kHz.

        VIS016 VIS Interchange format 16 bit.
        Default frequency is 11.025 kHz.

        -fiX.x Frequency of input file. Use this setting to override the default sample frequency of the source audio file.

        Value Meaning

        X.x Frequency in X.x kHz (default is 11.025).

        -foX.x Frequency of output file. Use this setting to override the sample frequency of the destination audio file.

        Value Meaning

        X.x Frequency in X.x kHz
        (see input data format for default frequency).

        -liXXX Length of input file in bytes (default is entire file). Use this setting to convert only a portion of the source audio file.

        -ngXXX Name of global configuration file (default is none). Use this setting to override internal conversion variables. For VISI technical support use only.

        -niXXX Name of input configuration file (default is none). Use this setting to override internal conversion variables. For VISI technical support use only.

        -noXXX Name of output configuration file (default is none). Use this setting to override internal conversion variables. For VISI technical support use only.

        -piXXX Position of input file beginning (default is byte 0). Use this setting to skip a portion of the source audio file.

        -qa Quality All: DTMF touch-tone and low pass anti-aliasing Fast Fourier Transform (FFT) filters on, FFT re-sampling on. (default is none). Use this setting for improved fidelity.

        -qdXXX DTMF touch-tone FFT filter on/ off (default is "off"). Use this setting to remove touch-tone frequencies during conversion.

        -qeXXX Equalizer FFT filter on/ off (default is "off"). Use this setting to perform frequency equalization. Default equalizer settings are optimized for voice band boost.

        Comma delimited custom parameters may follow to indicate: Gain, low frequency, upper frequency, band settings (up to 32 bands): -qeG,L,H,B1,B2...B32

        Value Meaning

        G Overall gain setting.

        L Low end of equalizer band range. Default is 0 Hz.

        H High end of equalizer band range. Default of 0 sets high range to maximum (i.e., Nyquist) frequency of source file.

        B1...B32 Gain levels (in dB) of bands spread evenly between low and high equalizer band range.

        -qlXXX Low pass anti-aliasing FFT filter on/ off (default is "off"). Use the "on" setting for improved fidelity.

        -qrXXX Re-sample FFT filter on/ off (default is "off"). Use the "on" setting for improved fidelity.

        -q2XXX Low pass 2nd order anti-aliasing filter off/ on/ fast (default is "on"). Use the default "on" setting for standard fidelity. Use "fast" for standard fidelity with improved speed on 286/386 class computers. Use "off" for maximum conversion speed.

        -q6XXX Low pass 6th order anti-aliasing filter off/ on/ fast (default is "off"). Use the "on" setting for improved fidelity. Use "fast" for improved fidelity with improved speed on 286/386 class computers.

        -siXXX Swap input data based on "bit", "nibble", "byte" or "word" alignment (default is none). Use these settings when converting files from a non-Intel based computer.

        -soXXX Swap output data based on "bit", "nibble", "byte" or "word" alignment (default is none). Use these settings when converting files for a non-Intel based computer.

        -vaX.x Volume adjust (+/-) (default is 0 dB). Use this setting to increase or decrease the volume level of the destination audio file.

        -vdX.x Volume dynamic compress/ expand (default is 6.0 dB). Use this setting to improve the dynamic range of the destination audio file.

        -vnX.x Volume normalize (default is 1.0% over range). Use this setting to adjust the volume levels across multiple files.

        Example

        Convert the sample 11 kHz 16 bit Multimedia Wave file PCMCvt.Wav into a Dialogic 6 kHz 4 bit ADPCM file by entering:

        PCMCvt PCMCvt.Wav AccTst.v04 <ENTER>

        Convert the sample Dialogic 6 kHz 4 bit ADPCM file AccTst.v04 into the 11 kHz 16 bit Multimedia Wave file PCMNew.Wav. Auto-crop silence, activate the low pass anti-aliasing and re-sampling FFT filters, activate the dynamic compressor/expander and normalize volume.

        PCMCvt AccTst.v04 PCMNew.Wav -diDLG004 -fi6 -doMPC016 -fo11.025 -ac -qlON -qrON -vd -vn <ENTER>

        If we were to specify the additional equalizer command:

        -qe-3,0,3,12,-12,0,0,0,0,0,0

        we would reduce the overall volume by 3 dB (the first -3), set the equalizer range to between 0 and 3 kHz (the 0,3), increase the first band at 350 Hz by 12 dB (the 12), reduce the band at 700 Hz by 12 dB (the -12), and leave the rest of the bands alone (the 0 dB settings). This example shows the settings for an eight band equalizer; you can specify up to 32 bands and Audio ToolBox will adjust the equalizer accordingly.
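
        The layout of the custom equalizer parameters can be made explicit with a short Python sketch that splits the switch into its fields. The function name and the returned dictionary keys are hypothetical; the sketch only handles the comma delimited custom form (not the plain on/off form) and is not how PCMCvt itself parses its command line.

```python
def parse_equalizer(arg: str) -> dict:
    """Split a custom -qe switch into gain, low, high and per-band gains."""
    if not arg.startswith("-qe"):
        raise ValueError("not an equalizer switch: " + arg)
    fields = [float(x) for x in arg[3:].split(",")]
    if len(fields) < 3:
        raise ValueError("expected at least gain, low and high values")
    gain, low, high = fields[:3]     # G, L, H
    bands = fields[3:]               # B1...B32
    if len(bands) > 32:
        raise ValueError("at most 32 bands are supported")
    return {"gain": gain, "low": low, "high": high, "bands": bands}

eq = parse_equalizer("-qe-3,0,3,12,-12,0,0,0,0,0,0")
```

Here `eq["gain"]` is -3, the range runs from 0 to 3, and `eq["bands"]` holds the eight band settings beginning with 12 and -12.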

        Remarks

        C-ITU (CCITT) G.721, New Voice and Perception Technology input and output formats are only available if the user selects them when running the setup program.

      9. Generalized File Conversion

Audio ToolBox provides general purpose switches for converting additional file formats.

The following example shows how to convert Apple Macintosh AIFF 16 bit linear files recorded at 44.1 kHz and transferred to the IBM PC compatible computer.

pcmcvt In.aif Out.vox -dimpc016 -dodlg004 -sibyte
-pi128 -fi44.1 -fo6

The previous command line tells the PCMCvt program to:

Convert the input file "IN.AIF" to the output file "OUT.VOX"
Read the input file as "Multimedia 16 bit linear" data; write the output file as "Dialogic 4 bit" data.
Swap the input file bytes to convert from Apple "high-low" byte order to PC "low-high" byte order.
Skip the first 128 bytes of the input file AIFF header.
Read the input file frequency as 44.1 kHz (samples/ second), write the output file as 6 kHz.

The following example shows how to convert PIKA Technology 4 bit ADPCM files recorded at 9.3 kHz into Dialogic 4 bit ADPCM at 6 kHz.

pcmcvt In.vox Out.vox -diDlg004 -dodlg004 -sinibble -fi9.3 -fo6

The previous command line tells the conversion program to:

Convert the input file "IN.VOX" to the output file "OUT.VOX"
Read the input file as "Dialogic 4 bit" data; write the output file as "Dialogic 4 bit" data.
Swap the input file nibbles to convert from PIKA "first-last" nibble order to Dialogic "last-first" nibble order.
Read the input file frequency as 9.3 kHz (samples/ second), write the output file as 6 kHz.
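
For readers unfamiliar with nibble swapping, the Python sketch below shows the operation on raw bytes: the upper and lower 4 bits of each byte trade places, which reverses the order of the two 4 bit ADPCM samples packed into that byte. The function name is hypothetical and the code is an illustration, not taken from PCMCvt.

```python
def swap_nibbles(data: bytes) -> bytes:
    """Exchange the high and low 4 bit halves of every byte."""
    return bytes(((b & 0x0F) << 4) | (b >> 4) for b in data)
```

For example, the bytes 12 AB (hexadecimal) become 21 BA, and swapping twice restores the original data.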

Similarly, the following command will convert from Talking Technology 8 bit 6.757 kHz (skipping the 32 byte header) into the default Dialogic 4 bit at 6 kHz:

PCMCvt In.TTI Out.Dlg -diMPC008 -fi6.757 -pi32

In addition, many commercial formats have an equivalent PCMCvt data type. For example:

Industry Name         PCMCvt Type

AIFF                  Multimedia 16 bit with header
BiCom ADPCM           Dialogic OKI with "-sinibble" switch
Pika Technology       Dialogic OKI with "-sinibble" switch
Sun .au               C-ITU (CCITT) G.711 u-Law
Talking Technology    Multimedia 8 bit
VoiceTek CVSD         Harris CVSD
Vynet CVSD            Perception Technology CVSD

These techniques can be used to convert from virtually any file format and data type to a supported audio format.