Re: [g-a-devel] [u-a-dev] gnome-speech, and audio output, moving forward.

Bill Haneman Tue, 18 Sep 2007 10:58:17 -0700

HI Luke, Will, and all:

For what it's worth, I agree with the bulk of what's been said already. 
It will be fantastic to get some more sanity in the speech/audio arena.

As for the first item Will identifies as a 'proposal', namely relying on 
the TTS engine to return digital sound samples rather than doing the 
output itself, I think this is a great idea but I would just suggest 
looking carefully at the potential latency issues there.

Also, key requirements of any speech/audio integration API(s) include 
the ability to know, at least roughly, two pieces of information: what 
is currently in the output queue and approximately how close to 
completion it is, and the ability to "sync up" and actually know, at 
some point in time, exactly what has been spoken. These are subtly 
different, in that the second one requires information about completion 
as opposed to "approximate progress". I think the second one implies at 
least some degree of interrupt capability in the audio output stream as 
well. Use cases include audio/voice synchronization, braille 
synchronization, and (perhaps more importantly), the ability to reliably 
break an utterance into pieces and restart output at a known point.

As for moving away from Bonobo Activation (note; not the same as 
"Bonobo" in the broad sense), I think this makes sense. I also think 
moving away from the use of CORBA for gnome-speech IPC is a good idea; 
the speech APIs seem like excellent candidates for dBUS migration and we 
have very few, if any, platform bincompat guarantees to deal with as 
long as the consumers of the speech interfaces are kept in the loop.

Best regards,

Bill

Willie Walker wrote:
> Hi Luke:
>
> First of all, I say "Hear, hear!"  The audio windmill is something
> people have been charging at for a long time.  Users who rely upon
> speech synthesis working correctly and integrating well with the rest of
> their environment are among those that need reliable audio support most
> critically.
>
> I see two main proposals in the below:
>
> 1) Modify gnome-speech drivers to obtain samples from their
>    speech engines and then handle the audio playing themselves.
>    This is different from the current state where the
>    gnome-speech driver expects the speech engine to do all the
>    audio management.
>
>    This sounds like an interesting proposal.  I can tell you
>    for sure, though, that the current gnome-speech maintainer
>    has his hands full with other things (e.g., leading Orca).
>    So, the work would need to come from the community.
>
> 2) As part of #1, move to an API that is pervasive on the system.
>    The proposed API is GStreamer.
>
>    Moving to a pervasive API is definitely very interesting, and
>    I would encourage looking at a large set of platforms:  Linux
>    to Solaris, GNOME to KDE, etc.  An API of recent interest is 
>    Pulse Audio (https://wiki.ubuntu.com/PulseAudio), which might
>    be worth watching.  I believe there might be many significant
>    improvements in the works for OSS as well.
>
> In the bigger scheme of things, however, there is discussion of
> deprecating Bonobo.  Bonobo is used by gnome-speech to activate
> gnome-speech drivers.  As such, one might consider alternatives to
> gnome-speech.  For example, SpeechDispatcher
> (http://www.freebsoft.org/speechd) or TTSAPI
> (http://www.freebsoft.org/tts-api-provider) might be something to
> consider.  They are not without issue, however.  Some of the issues
> include cumbersome configuration, reliability, etc.  I believe that's
> all solvable with work.  The harder issue in my mind is that they will
> introduce an external dependency for things like GNOME, and I've also
> not looked at what their licensing scheme is.
>
> Will
>   

_______________________________________________
Gnome-accessibility-devel mailing list
[email protected]
http://mail.gnome.org/mailman/listinfo/gnome-accessibility-devel

Re: [g-a-devel] [u-a-dev] gnome-speech, and audio output, moving forward.

Reply via email to