Phone System Audio Format Guide
Everything you need to know about voicemail and IVR audio formats — MP3 vs WAV, sample rate, mono vs stereo, length limits, and what each major platform actually accepts.
MP3 vs WAV: when to use which
| Use case | Recommended format |
|---|---|
| Cloud phone systems (RingCentral, Zoom, Workspace Voice) | MP3 64–128 kbps |
| Microsoft Teams Phone (auto attendants & queues) | WAV (16 kHz, 16-bit, mono PCM) |
| Traditional PBX (Asterisk, Cisco, Avaya) | WAV (8 kHz, 8-bit, mono μ-law) |
| Nextiva NextOS | WAV (8 kHz, 8-bit, mono μ-law) |
| Personal voicemail (mobile / Google Voice personal) | In-app recording (no upload supported) |
Sample rate & bit depth
- 8 kHz / 8-bit μ-law — the historic telephony standard. Universally accepted, modest fidelity. Use for traditional PBX and Nextiva.
- 16 kHz / 16-bit PCM — HD-voice. Required by Microsoft Teams Phone and most modern cloud platforms.
- 44.1 kHz / 16-bit PCM — CD quality. Phone systems usually downsample, often badly. Avoid as a final phone-system upload.
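One way to confirm a file's sample rate, bit depth, and channel count before upload is to read its header. The sketch below uses Python's standard `wave` module and compares a file against the 16 kHz / 16-bit / mono target listed for Microsoft Teams Phone. The names `TEAMS_TARGET`, `describe_wav`, `mismatches`, and `make_silent_wav` are illustrative, not part of any platform's tooling. Note that `wave` reads uncompressed PCM only, so it will not open μ-law WAV files.

```python
import io
import wave

# Target taken from the format table: 16 kHz, 16-bit (2 bytes), mono PCM.
TEAMS_TARGET = {"rate": 16000, "sample_width": 2, "channels": 1}


def describe_wav(source):
    """Return sample rate, sample width in bytes, and channel count of a PCM WAV."""
    with wave.open(source, "rb") as wav:
        return {
            "rate": wav.getframerate(),
            "sample_width": wav.getsampwidth(),
            "channels": wav.getnchannels(),
        }


def mismatches(actual, target):
    """List the fields where the file differs from the target format."""
    return [key for key in target if actual[key] != target[key]]


def make_silent_wav(rate, sample_width, channels, seconds=1):
    """Build an in-memory WAV of zero-valued samples, for demonstration only."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(rate)
        wav.writeframes(bytes(rate * sample_width * channels * seconds))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    cd_quality = make_silent_wav(44100, 2, 2)  # 44.1 kHz, 16-bit, stereo
    print(mismatches(describe_wav(cd_quality), TEAMS_TARGET))  # ['rate', 'channels']
```

`describe_wav` also accepts a file path, so the same check can be run on an exported greeting on disk.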
Mono vs stereo
Always mono. Phone networks are single-channel. Stereo files are either downmixed silently — sometimes badly — or rejected outright by stricter systems. The only reason to keep a stereo master is your archive copy. Every uploaded greeting should be mono.
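If you only have a stereo master, a controlled downmix avoids leaving the conversion to the phone system. The following is a minimal sketch that assumes 16-bit interleaved PCM frames (left, right, left, right, ...) and simply averages the two channels. The function name `downmix_to_mono` is hypothetical, and a dedicated audio tool may do this with better handling of phase and level.

```python
import array
import struct
import sys


def downmix_to_mono(stereo_frames: bytes) -> bytes:
    """Average interleaved 16-bit left/right samples into a single channel."""
    samples = array.array("h")
    samples.frombytes(stereo_frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    left, right = samples[0::2], samples[1::2]
    # Floor-averaging two int16 values always stays inside the int16 range.
    mono = array.array("h", ((l + r) // 2 for l, r in zip(left, right)))
    if sys.byteorder == "big":
        mono.byteswap()
    return mono.tobytes()


if __name__ == "__main__":
    # Two stereo frames: (1000, 3000) and (-200, 200).
    stereo = struct.pack("<4h", 1000, 3000, -200, 200)
    print(struct.unpack("<2h", downmix_to_mono(stereo)))  # (2000, 0)
```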
Length limits
| Platform | Max greeting length | Max file size |
|---|---|---|
| RingCentral | 5 min (auto receptionist), 3 min (user) | 5 MB |
| Microsoft Teams Phone | 5 min | 5 MB |
| Zoom Phone | 60 sec (auto receptionist) | 2 MB |
| Google Workspace Voice | 5 min | ~10 MB |
| Nextiva NextOS | 5 min | 5 MB |
Most professional voicemail greetings should be 15–25 seconds anyway, well under every platform's limit.
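The length and size limits interact with the format you choose, which is easy to check with arithmetic. The sketch below estimates file size from duration and format, then compares it with the limits in the table. It assumes 1 MB means 1,000,000 bytes, ignores the small file header, and treats Google Workspace Voice's approximate 10 MB as exactly 10. The names `LIMITS`, `pcm_size_bytes`, `mp3_size_bytes`, and `fits` are illustrative.

```python
# Limits copied from the table: (max seconds, max megabytes).
LIMITS = {
    "RingCentral (auto receptionist)": (300, 5),
    "RingCentral (user)": (180, 5),
    "Microsoft Teams Phone": (300, 5),
    "Zoom Phone (auto receptionist)": (60, 2),
    "Google Workspace Voice": (300, 10),  # table says "~10 MB"
    "Nextiva NextOS": (300, 5),
}


def pcm_size_bytes(seconds, rate, bytes_per_sample, channels=1):
    """Approximate size of uncompressed audio data (header not counted)."""
    return int(seconds * rate * bytes_per_sample * channels)


def mp3_size_bytes(seconds, kbps):
    """Approximate size of a constant-bitrate MP3."""
    return int(seconds * kbps * 1000 / 8)


def fits(platform, seconds, size_bytes):
    """True if both the duration and the size are within the platform's limits."""
    max_seconds, max_mb = LIMITS[platform]
    return seconds <= max_seconds and size_bytes <= max_mb * 1_000_000


if __name__ == "__main__":
    short = pcm_size_bytes(20, 16000, 2)    # 20 s at 16 kHz / 16-bit mono
    print(short, fits("Microsoft Teams Phone", 20, short))    # 640000 True
    full = pcm_size_bytes(300, 16000, 2)    # 5 min at the same format
    print(full, fits("Microsoft Teams Phone", 300, full))     # 9600000 False
```

Under these assumptions, a 16 kHz / 16-bit mono WAV reaches 5 MB after roughly 156 seconds, so the size cap can bind before the length cap does. A 15–25 second greeting stays far below either limit.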
Common pitfalls
- Quiet recording rejected by phone system
  Phone systems don't boost volume server-side. Always normalize before upload.
- Distorted playback after upload
  Usually a sample rate mismatch. Confirm the file is at the rate your platform requires.
- Greeting cuts off at the end
  Some platforms trim the trailing fraction of a second. Add 1–2 seconds of silence to the end of your greeting.
- Stereo file rejected
  Always export mono. No exceptions.
- MP3 ID3 tags causing rejection
  Some PBX systems can't parse MP3 metadata. Strip ID3 tags before upload (our generator does this by default).
- Music too loud under voiceover
  Narrow phone bandwidth makes harsh frequencies more noticeable. Mix music 12–15 dB below voice.
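Three of the fixes above reduce to simple sample arithmetic: normalizing level, padding the end with silence, and turning a dB offset into a gain factor. The following is a rough sketch on lists of 16-bit sample values. The 0.9 target peak and the 1.5 second default pad are assumptions chosen for illustration, not platform requirements, and the function names are hypothetical. Peak normalization is only one approach (loudness-based normalization is another).

```python
def peak_normalize(samples, target_peak=0.9, full_scale=32767):
    """Scale 16-bit samples so the loudest one sits near target_peak of full scale."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return list(samples)  # silence or empty input: nothing to scale
    gain = target_peak * full_scale / peak
    # Clamp to the int16 range in case rounding pushes a value over.
    return [max(-32768, min(32767, round(s * gain))) for s in samples]


def pad_with_silence(samples, rate, seconds=1.5):
    """Append zero-valued samples so a trimmed tail does not cut into speech."""
    return list(samples) + [0] * int(rate * seconds)


def db_to_gain(db):
    """Convert a level change in decibels to a linear amplitude factor."""
    return 10 ** (db / 20)


if __name__ == "__main__":
    quiet = [100, -200, 50]
    print(peak_normalize(quiet))
    print(len(pad_with_silence(quiet, rate=8000)))  # 12003
    # Music mixed 12 dB below voice is scaled to about a quarter of its amplitude.
    print(round(db_to_gain(-12), 3))  # 0.251
```

Mixing music 15 dB below voice corresponds to a factor of about 0.178 by the same formula.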
Skip the conversion — generate phone-system-ready audio
Our AI generator exports in the right format for your phone system automatically. Pick the platform, pick a voice, type your script — done.
Frequently asked questions
Should I use MP3 or WAV?
WAV is more universally accepted for telephony — every PBX takes it. MP3 is fine for most modern cloud systems (RingCentral, Zoom, Workspace Voice). When in doubt, WAV.
Why mono and not stereo?
Phone networks are mono. Stereo files either get downmixed badly or rejected outright. Always export mono.
Why 8 kHz on some platforms?
Traditional telephony is bandwidth-limited to about 4 kHz of audio — that's why 8 kHz sample rate (twice that, for Nyquist) is the historic standard. Modern HD-voice systems use 16 kHz, but 8 kHz still works everywhere.
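The arithmetic behind that answer can be written out, taking the 4 kHz audio bandwidth stated above as the highest frequency to capture. The second expression adds the resulting data rate for 8-bit μ-law samples, which is the familiar 64 kbit/s telephony channel.

```latex
f_s \ge 2 f_{\max} = 2 \times 4\,\text{kHz} = 8\,\text{kHz},
\qquad
8000\,\tfrac{\text{samples}}{\text{s}} \times 8\,\tfrac{\text{bits}}{\text{sample}} = 64\,\text{kbit/s}
```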
Free consultation at hello@phonegreetings.ai