Introducing Kaitom Voice V3 - Next-Generation Thai Text-to-Speech

We're excited to announce Kaitom Voice V3, the next generation of our Thai Text-to-Speech API. This major update brings significant improvements to speech quality, introduces smart text normalization, and simplifies integration with a modern JSON-based API.
What's New in V3
Kaitom Voice V3 represents a complete overhaul of our TTS engine, delivering the most natural-sounding Thai speech synthesis we've ever created.
Smart Text Normalization
V3 automatically handles complex text elements that previously required pre-processing:
| Type | Input | Spoken Output |
|---|---|---|
| Numbers | 1,234.56 | "หนึ่งพันสองร้อยสามสิบสี่จุดห้าหก" |
| Dates | 27/01/2569 | "วันที่ยี่สิบเจ็ดมกราคมพ.ศ.สองพันห้าร้อยหกสิบเก้า" |
| Currency | ฿1,500 | "หนึ่งพันห้าร้อยบาท" |
| Time | 14:30 | "สิบสี่นาฬิกาสามสิบนาที" |
| Percentages | 25% | "ยี่สิบห้าเปอร์เซ็นต์" |
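Note that the date row is read as a Buddhist Era (พ.ศ.) year. If your application needs the same date as a Gregorian value, the conversion is a fixed 543-year offset. The snippet below is a client-side sketch only: the `parse_thai_date` name and the strict dd/mm/yyyy input format are assumptions made for illustration, and it does not describe how the V3 engine normalizes text internally.

```python
from datetime import date

# Buddhist Era (B.E.) years run 543 years ahead of Gregorian (C.E.) years.
BUDDHIST_ERA_OFFSET = 543


def parse_thai_date(text: str) -> date:
    """Parse a dd/mm/yyyy string whose year is in the Buddhist Era.

    Hypothetical helper for illustration; assumes exactly three
    slash-separated numeric fields.
    """
    day, month, be_year = (int(part) for part in text.split("/"))
    return date(be_year - BUDDHIST_ERA_OFFSET, month, day)


if __name__ == "__main__":
    print(parse_thai_date("27/01/2569").isoformat())
```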
Automatic Language Detection
No more specifying language modes! V3 automatically detects and handles Thai-English mixed content:
Hello and Welcome! ยินดีต้อนรับสู่ iApp Technology ผู้นำด้าน AI ของประเทศไทย
Simplified JSON API
We've modernized the API with a clean JSON interface:
curl -X POST 'https://api.iapp.co.th/v3/store/audio/tts' \
--header 'apikey: YOUR_API_KEY' \
--header 'Content-Type: application/json' \
--data '{"text": "สวัสดีครับ น้องไข่ต้ม เวอร์ชั่น 3"}'
Streaming Audio Output
- 24 kHz mono PCM streamed — start playback as soon as bytes arrive
- Real-time factor ~0.3–0.5 — 10 s of audio synthesized in 3–5 s
- Up to ~1,000 Thai characters per request (longer text auto-chunks server-side)
- Wrap with a 44-byte WAV header on the client to play or save as .wav
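The WAV-wrapping step in the last bullet can be sketched with Python's standard `wave` module. The 24 kHz sample rate and mono channel count come from the list above; the 16-bit sample width is an assumption, since the bit depth is not stated here, and `pcm_chunks_to_wav` and `pcm_duration_seconds` are illustrative names rather than part of the API.

```python
import io
import wave
from typing import Iterable

SAMPLE_RATE_HZ = 24_000   # from the post: 24 kHz
CHANNELS = 1              # from the post: mono
SAMPLE_WIDTH_BYTES = 2    # ASSUMPTION: 16-bit PCM; not stated in the post


def pcm_chunks_to_wav(chunks: Iterable[bytes]) -> bytes:
    """Collect raw PCM chunks and return them as a complete WAV file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(SAMPLE_RATE_HZ)
        for chunk in chunks:
            # writeframesraw appends data; the header sizes are
            # patched when the writer is closed.
            wav.writeframesraw(chunk)
    return buf.getvalue()


def pcm_duration_seconds(num_pcm_bytes: int) -> float:
    """Playback length of a raw PCM payload under the assumed format."""
    return num_pcm_bytes / (SAMPLE_RATE_HZ * CHANNELS * SAMPLE_WIDTH_BYTES)
```

The function accepts any iterable of byte chunks, so it can be fed the pieces of a streamed HTTP response as they arrive. If the actual bit depth differs from the assumed 16 bits, only `SAMPLE_WIDTH_BYTES` needs to change.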
🎤 NEW: Thai Voice Cloning
V3 introduces voice cloning for Thai as a separate endpoint. Provide an 8–12 second clean Thai voice clip plus its literal transcript, and the synthesized speech will mimic that voice:
curl -X POST 'https://api.iapp.co.th/v3/store/audio/tts/clone' \
--header 'apikey: YOUR_API_KEY' \
--form 'text=สวัสดีครับ วันนี้ทดสอบการโคลนเสียง' \
--form 'speed=1.0' \
--form 'ref_text=ฮัลโหล สวัสดีครับ ผมชื่อไข่ต้ม' \
--form 'ref_audio=@reference.wav' \
--output 'output.pcm'
Voice cloning currently supports Thai language only. See the interactive cloning demo to record yourself and try it in the browser.
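Before uploading a reference clip, it can help to check locally that it falls in the 8–12 second range mentioned above. The following is a minimal sketch using Python's standard `wave` module; it assumes the clip is an uncompressed WAV file (as in the `reference.wav` example), and the helper names are hypothetical. It only checks length and says nothing about how the endpoint itself validates uploads.

```python
import io
import wave

# Recommended reference clip length from the post: 8 to 12 seconds.
MIN_SECONDS = 8.0
MAX_SECONDS = 12.0


def wav_duration_seconds(wav_file) -> float:
    """Return the duration of a WAV file (path or binary file object)."""
    with wave.open(wav_file, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_valid_reference_length(seconds: float) -> bool:
    """True if the clip length is inside the recommended range."""
    return MIN_SECONDS <= seconds <= MAX_SECONDS


if __name__ == "__main__":
    # Build a 10-second silent clip in memory to exercise the helpers.
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(16000)
        w.writeframes(b"\x00\x00" * 160000)
    buf.seek(0)
    print(is_valid_reference_length(wav_duration_seconds(buf)))
```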
Quick Start
Python
import requests
url = "https://api.iapp.co.th/v3/store/audio/tts"
headers = {
    "apikey": "YOUR_API_KEY",
    "Content-Type": "application/json",
}
data = {"text": "สวัสดีครับ น้องไข่ต้ม เวอร์ชั่น 3"}
response = requests.post(url, headers=headers, json=data)
response.raise_for_status()  # stop here on an HTTP error
# The body is raw 24 kHz mono PCM; add a WAV header before saving as .wav
with open("output.pcm", "wb") as f:
    f.write(response.content)
JavaScript
const response = await fetch("https://api.iapp.co.th/v3/store/audio/tts", {
  method: "POST",
  headers: {
    "apikey": "YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({ text: "สวัสดีครับ น้องไข่ต้ม เวอร์ชั่น 3" })
});
const blob = await response.blob();
V3 vs V2 Comparison
| Feature | V2 | V3 |
|---|---|---|
| API Format | Form Data | JSON |
| Language Mode | Required (TH / TH_MIX_EN) | Auto-detected |
| Text Normalization | Basic | Smart (numbers, dates, currency) |
| Max Characters | Unlimited | 10,000 |
| Audio Quality | Standard | 24 kHz streamed PCM |
Pricing
V3 is currently in Alpha and is FREE to use until 31 May 2026:
- /v3/store/audio/tts (default Kaitom voice): 1 IC per 400 chars (Alpha Free until 2026-05-31)
- /v3/store/audio/tts/clone (Thai voice cloning): 1 IC per 400 chars (Alpha Free until 2026-05-31)
Pricing for general availability will be announced before the alpha period ends.
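To get a rough feel for the listed rate of 1 IC per 400 characters, here is a small estimator. Two things in it are assumptions and not confirmed by this post: that partial blocks are rounded up per request, and that characters are counted as Unicode code points (so Thai vowel and tone marks each count as one). Treat the result as an approximation only.

```python
import math

CHARS_PER_IC = 400  # from the post: 1 IC per 400 chars


def estimate_ic(text: str) -> int:
    """Estimate IC cost for one request.

    ASSUMPTIONS: partial 400-character blocks round up, and length is
    measured in Unicode code points via len().
    """
    return math.ceil(len(text) / CHARS_PER_IC)


if __name__ == "__main__":
    sample = "สวัสดีครับ น้องไข่ต้ม เวอร์ชั่น 3"
    print(len(sample), "characters ->", estimate_ic(sample), "IC (estimated)")
```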
Use Cases
E-Learning & Education
Convert educational content into audio lessons with proper pronunciation of numbers, dates, and technical terms.
Chatbots & Virtual Assistants
Create natural-sounding voice responses for Thai chatbots with automatic language handling.
Content Creation
Generate professional voiceovers for videos and podcasts with high-quality audio output.
Accessibility
Make digital content accessible to visually impaired users with clear, natural speech.
IVR Systems
Build interactive voice response systems with smart text normalization for phone numbers, amounts, and dates.
Migration Guide
Migrating from V2 to V3 is straightforward:
Before (V2):
curl -X POST 'https://api.iapp.co.th/v3/store/speech/text-to-speech/kaitom' \
--header 'apikey: YOUR_API_KEY' \
--form 'text="สวัสดีครับ"' \
--form 'language="TH"'
After (V3):
curl -X POST 'https://api.iapp.co.th/v3/store/audio/tts' \
--header 'apikey: YOUR_API_KEY' \
--header 'Content-Type: application/json' \
--data '{"text": "สวัสดีครับ"}'
Key changes:
- New endpoint: /v3/store/audio/tts
- Content-Type: application/json
- Request body: JSON format {"text": "..."}
- No language parameter needed
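If your codebase already builds V2 form fields, the key changes above can be applied in one small adapter. This is a sketch rather than an official migration tool: `v2_form_to_v3_request` is a hypothetical helper name, and it assumes your V2 fields are held in a dict with a `text` key.

```python
import json
from typing import Dict, Tuple

V3_TTS_URL = "https://api.iapp.co.th/v3/store/audio/tts"


def v2_form_to_v3_request(
    form_fields: Dict[str, str], api_key: str
) -> Tuple[str, Dict[str, str], bytes]:
    """Turn V2-style form fields into a V3 URL, headers, and JSON body.

    Only "text" is carried over; "language" is dropped because V3
    detects the language automatically.
    """
    body = {"text": form_fields["text"]}
    headers = {"apikey": api_key, "Content-Type": "application/json"}
    # ensure_ascii=False keeps Thai text readable in the payload.
    payload = json.dumps(body, ensure_ascii=False).encode("utf-8")
    return V3_TTS_URL, headers, payload


if __name__ == "__main__":
    url, headers, payload = v2_form_to_v3_request(
        {"text": "สวัสดีครับ", "language": "TH"}, "YOUR_API_KEY"
    )
    print(url)
    print(payload.decode("utf-8"))
```

The returned tuple can be passed to whichever HTTP client you already use.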
Try It Now
Ready to experience the next generation of Thai TTS?
- Interactive Demo - Try V3 directly in your browser
- API Documentation - Complete technical reference
- Get API Key - Start building today
What's Next
We're continuously improving Kaitom Voice. Upcoming features include:
- Additional voice options (coming soon)
- SSML support for fine-grained control
- Voice cloning for English and other languages
Feedback
We'd love to hear your feedback on Kaitom Voice V3! Join our community:
- Discord: discord.gg/kYcpmdEcS2
- Email: sale@iapp.co.th
- Phone: 086-322-5858
Kaitom Voice V3 is available now for all iApp API users. Existing V1 and V2 APIs will continue to be supported.