[gentoo-user] Re: speech recognition?

2016-06-03 Thread James
Hans  c5ace.com> writes:


> > Is there speech recognition software, or the like, that can listen in
> > on a phone call and put on screen as text what the other person is
> > saying?

> http://nuance.com/dragon/index.htm

I just ran across this article:

http://chrislord.net/index.php/2016/06/01/open-source-speech-recognition/


hth,
James





[gentoo-user] Re: speech recognition?

2016-05-26 Thread Hans

On 16/05/16 00:34, lee wrote:

> Hi,
>
> Is there speech recognition software, or the like, that can listen in
> on a phone call and put on screen as text what the other person is
> saying?
>
> I'd like to connect that to a softphone so that someone who suffers from
> very bad hearing can talk to people on the phone more easily.  It must
> work for German.
>
> If there's a phone capable of this, I'd like to know about it.
>
> Surely we should be able to achieve this with today's technology.


There is commercial dictation software for Windows and Mac. It may
work with Wine.

http://nuance.com/dragon/index.htm




[gentoo-user] Re: speech recognition?

2016-05-18 Thread James
James  tampabay.rr.com> writes:


> lee  yagibdah.de> writes:

You know, a solution that *may* fit your needs just dawned on me.

If folks were to first run speech-to-text software on their (originating)
end, then your friend would receive mostly accurate text communications.
Single-source (single-voice) solutions become more accurate over time as
one uses them. If this sort of conversion occurred before the text was
sent to your friend, it would be much easier to support and refine.


Sadly, it does nothing for new sources of contact, until they refine their
own speech-to-text system.

Sometimes a simultaneous video feed helps folks comprehend, by reading lips
whilst discerning audio signals, but that requires low-latency connections.


hth,
James









[gentoo-user] Re: speech recognition?

2016-05-17 Thread James
lee  yagibdah.de> writes:


> Is there speech recognition software, or the like, that can listen in
> on a phone call and put on screen as text what the other person is
> saying?

I like to say that there are two main categories of effort here: one
very doable (a single voice), the other (unlimited voices) plausibly
intractable atm.



> I'd like to connect that to a softphone so that someone who suffers from
> very bad hearing can talk to people on the phone more easily.  

This is possible for only a few voices. If their speech patterns have been
analyzed and stored, with ample resources, then what you seek is possible;
accuracy is the constraint.


> If there's a phone capable of this, I'd like to know about it.

If you are after a solution that can work with any voice, even limited
to a single language, then the answer is a long way away. Some would say
intractable. There is the question of the accuracy required, and of the
complexity of vocabulary, sentence structure and allowed nominal variation
in the voice(s).


> Surely we should be able to achieve this with today's technology.

With Google-sized resources, you can mask the problem with templates
for many different voices, but the underlying problems abound without limit.
What you actually do is 'train' the Google system to customize its
translation of a given voice, very accurately over time.


Now say I disguise my voice with a throat infection, depressed attitude,
exuberance, etc.; you can see the troubles. In fact, the day after
watching horrible English cinema, which is often contagious (Monty Python's
Life of Brian), I often develop a temporary 'Manchester slang' in the
vernacular. There are endless, unlimited gyrations should one want to have
a bit-o-fun with language, particularly when any number of 'hill_billy'
contaminants manifest.

My mathematical belief is that the problem is intractable, certainly as you
approach a high level of required accuracy. In fact, folks routinely joust
with one another over the 'looseness of language' and the many varieties
of layered meanings.

Truly intractable, but a 'dumbed down' approximation surely will exist at
some point.  Include IBM in your searches, as they did quite a bit of
foundational research in a variety of related (speech/sound/voice) areas.

Still, Google's offering might prove acceptable for your needs.


hth,
James