subject:"[Audyssey] First Natural-Sounding Synthesized Voice in the World'"

Re: [Audyssey] First Natural-Sounding Synthesized Voice in the World'

2012-07-04 Thread william lomas

I think it is in the new voice search for Android:
http://www.youtube.com/watch?v=XLyuWEWqYqQ


On 4 Jul 2012, at 14:09, Ron Kolesar  wrote:

> Too bad we can't get this new technology for our computers.
> But it is a first for the blind and it does sound interesting at the least.
> 
> 
> 
> 
> Ron and current Leader Dog boz who states
> "that a service dog beats a cane paws down any day of the week."
> -Original Message-
> From: Phil Vlasak
> Sent: Wednesday, July 04, 2012 9:00 AM
> To: Gamers Discussion list
> Subject: [Audyssey] First Natural-Sounding Synthesized Voice in the World'
> 
> Android Director: 'We Have the First Natural-Sounding Synthesized Voice in
> the World'

Re: [Audyssey] First Natural-Sounding Synthesized Voice in the World'

2012-07-04 Thread Ron Kolesar


Too bad we can't get this new technology for our computers.
But it is a first for the blind and it does sound interesting at the least.




Ron and current Leader Dog boz who states
"that a service dog beats a cane paws down any day of the week."
-Original Message- 
From: Phil Vlasak

Sent: Wednesday, July 04, 2012 9:00 AM
To: Gamers Discussion list
Subject: [Audyssey] First Natural-Sounding Synthesized Voice in the World'

Android Director: 'We Have the First Natural-Sounding Synthesized Voice in
the World'

Re: [Audyssey] First Natural-Sounding Synthesized Voice in the World'

2012-07-04 Thread william lomas

Shame there's no sample of this supposed voice. All hype, to be honest, by the sound of it.

On 4 Jul 2012, at 14:00, "Phil Vlasak"  wrote:

> Android Director: 'We Have the First Natural-Sounding Synthesized Voice in 
> the World'

[Audyssey] First Natural-Sounding Synthesized Voice in the World'

2012-07-04 Thread Phil Vlasak

Android Director: 'We Have the First Natural-Sounding Synthesized Voice in 
the World'

July 4, 2012 |
Hugo Barra, Android's director of product management, was cool and composed 
as he shared Android's latest killer features.
Giving Google a voice is very use case-driven. If you're in a situation
where you're asking a question with your voice, there's a significant chance 
you're in a somewhat constrained environment. You're on the go, you're 
rushing. You might be in the car. You're carrying something else with your 
hands. You can't really pause to look at your screen or type.


So speaking it back to you seems pretty natural, right? That's how humans 
communicate. But we also wanted to do that only when we had a text-to-speech 
engine that was extremely high quality. And what you hear today, if you ask 
Google a question on Jelly Bean, is quite spectacular. There isn't a 
text-to-speech engine, as we call them, that has accuracy as high as that.


We have built a text-to-speech engine that's network-based, meaning it
uses a very large amount of data to compose a spoken answer. You know, 
purely from a synthesis perspective - forget about answering questions - it 
takes a very large amount of data to generate a synthesized audio of someone 
speaking. But we also have a matching engine that sits on the device. It's 
the exact same voice but with a very different computational technique. You'll 
always hear the same voice whether it's speaking back to you in a connected 
use-case, in which it comes from the server, or a disconnected offline 
use-case, in which it would just be synthesized on the device.


Wired: What makes a good voice? Did you model it after someone?

Barra: I actually come from speech recognition, and I worked in speech in 
general for a very long time. So don't let me talk about this all day. But 
it's a very, very intricate process. And it starts with finding a voice 
talent.


Wired: A real person?

Barra: Finding a person who has a voice that just nails it. And in this day 
and age, it's actually a very different voice talent than the voice talents 
that power most of the voice technology that exists today. A lot of today's 
voice technology comes from the companies you'd expect - Nuance and Microsoft
and others. That technology is built for a telephony world, for a customer
service environment where you need this posh, powerful voice - a branding
approach to things.


We set out to create the very first conversational voice, and I think we 
nailed that. I think we have the very first high-quality, natural-sounding, 
conversational, synthesized voice in the entire world.


Between a bunch of designers, engineers and speech scientists, we sat down 
and tried to describe the personality of the person, the personality of the 
voice that we were trying to create. We wrote down "friendly" [as a product 
goal] and there were literally 15 different ways to describe what friendly 
means. So that was the brief that we gave to a casting agency, and they came 
back with 10 candidates. We recorded those 10 candidates, and we did a bunch 
of blind tests with all sorts of different people, and we voted it down to 
two people. And then we recorded more of those people, and we did some tests 
and we decided "OK, we're going to go with this one person."


I don't actually know her name. In fact, no one knows her name.

Wired: It's a secret?

Barra: It's supposed to be. It's not something that you publicize because it 
needs to be the voice of Google. And then you create the voice, you collect 
a lot of data. What we did is an industry first.


Wired: While it does sound more human-like, it doesn't have a lot of 
personality in the sense that it doesn't say funny things back to you. It 
doesn't deliver jokes.


Barra: So nothing to do with the voice itself, but what it says and how it 
says it?


Wired: Exactly. Is that something you guys were looking to add in the 
future, or is that something you wanted to leave out?


Barra: It's very deliberately not making jokes with you. Google is a neutral 
party - it's not your friend, secretary or sister. It's not your mom. It's
not your girlfriend or boyfriend. It is an information retrieval entity. You 
ask, we respond. And it's very important that this entity be impartial, and 
adding jokes and other mannerisms to the voice would take away from that.


It's something that we've talked about, and it's pretty clear. There hasn't 
been a single person in the company who thinks we should have gone the other 
direction.


http://www.wired.com/gadgetlab/2012/07/google-android-hugo-barra-interview/all/


---
Gamers mailing list __ Gamers@audyssey.org
If you want to leave the list, send E-mail to gamers-unsubscr...@audyssey.org.
You can make changes or update your subscription via the web, at
http://mail.audyssey.org/mailman/listinfo/gamers_audyssey.org.
All messages are archived and can be searched and read at
http://www.mail-archive.com/gamers@audyssey.or
