OpenAI has developed a model called Voice Engine, which generates natural-sounding speech that closely resembles the original speaker from text input and a single 15-second audio sample. The technology was first developed in late 2022 and has been used in OpenAI's text-to-speech API, ChatGPT Voice, and Read Aloud. However, OpenAI is taking a cautious approach to a broader release because of the potential for misuse of synthetic voices.
The technology has been tested with a small group of trusted partners to understand its potential applications. Examples include: providing reading assistance to non-readers and children with a wider range of voices, translating content to reach a global audience while preserving the native accent of the original speaker, reaching global communities by improving essential service delivery, supporting non-verbal individuals with therapeutic applications, and helping patients recover their voice.
Despite the promising applications, OpenAI acknowledges the risks of generating synthetic speech, especially in contexts like elections. It has implemented safety measures such as watermarking and proactive monitoring of how Voice Engine is being used. It also believes that any broad deployment should be accompanied by voice authentication experiences, which verify that the original speaker is knowingly adding their voice, and by a no-go voice list that blocks the creation of voices too similar to prominent figures.
Looking forward, OpenAI sees Voice Engine as part of its commitment to AI safety and to understanding the technical frontier. It also encourages steps like phasing out voice-based authentication, exploring policies to protect individuals' voices, educating the public about AI capabilities and limitations, and developing techniques for tracking the origin of audiovisual content. The technology is not widely released at this time, but OpenAI hopes its preview can spark conversations around the challenges and opportunities of synthetic voices.
SummaryBot via The Internet
April 9, 2024, 9:32 a.m.