Re: [Sugar-devel] GSOC 2010: Speech Recognition in Sugar

2010-04-06 Thread chirag jain
Hi Christoph,

Thanks for the encouraging words! :)

Yes, after English, creating language models for Spanish will be a great
idea so that we can cover a larger section of users. In fact, I have decided
to cover the following four languages over the summer: English, Spanish,
German and Hindi.

Although I know that Hindi has very few users, I would still like to
implement it because that would make it easier for me to test the framework
in my locality.
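
Supporting several languages means picking the right language model for
each user. A minimal sketch of one way to do that, assuming the model is
chosen from the user's locale string; the MODELS table, its directory names
and model_for_locale are hypothetical placeholders, not real PocketSphinx
paths or functions:

```python
# Hypothetical sketch: choose a language model from a locale string.
# The directory names are placeholders, not real PocketSphinx paths.

MODELS = {
    "en": "models/en",
    "es": "models/es",
    "de": "models/de",
    "hi": "models/hi",
}


def model_for_locale(locale_name, default="en"):
    """Return the model directory for a locale such as 'es_AR.UTF-8'."""
    # Drop the encoding suffix, then the territory, keeping the language code.
    language = locale_name.split(".")[0].split("_")[0].lower()
    # Fall back to the default language when no model is available.
    return MODELS.get(language, MODELS[default])
```

Falling back to English keeps recognition usable on machines whose locale
has no model yet.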

Regards

On Tue, Apr 6, 2010 at 6:29 AM, Christoph Derndorfer <
christoph.derndor...@gmail.com> wrote:

> Hi,
>
> I just looked at your updated proposal and it's looking very good indeed.
>
> I also think that Benjamin's comments are spot-on and so achieving (a) in
> combination with supporting not only English but also Spanish (arguably the
> most important language when you look at current OLPC / Sugar deployments)
> would certainly be a big success and a great foundation for follow-up
> projects.
>
> Cheers,
> Christoph
>
> --
> Christoph Derndorfer
> co-editor, olpcnews
> url: www.olpcnews.com
> e-mail: christ...@olpcnews.com
>



-- 
Chirag Jain

Undergraduate Student
Netaji Subash Institute of Technology
New Delhi
___
Sugar-devel mailing list
Sugar-devel@lists.sugarlabs.org
http://lists.sugarlabs.org/listinfo/sugar-devel


Re: [Sugar-devel] GSOC 2010: Speech Recognition in Sugar

2010-04-06 Thread Christoph Derndorfer
Hi,

I just looked at your updated proposal and it's looking very good indeed.

I also think that Benjamin's comments are spot-on and so achieving (a) in
combination with supporting not only English but also Spanish (arguably the
most important language when you look at current OLPC / Sugar deployments)
would certainly be a big success and a great foundation for follow-up
projects.

Cheers,
Christoph

-- 
Christoph Derndorfer
co-editor, olpcnews
url: www.olpcnews.com
e-mail: christ...@olpcnews.com


Re: [Sugar-devel] GSOC 2010: Speech Recognition in Sugar

2010-04-04 Thread chirag jain
Hi,



On Sat, Apr 3, 2010 at 7:37 AM, Benjamin M. Schwartz <
bmsch...@fas.harvard.edu> wrote:

> I think your proposal is very interesting.  It contains a number of
> different ideas.  One major division is between Voice Commands and Speech
> Recognition.  Each of these contains many other possibilities. My biggest
> suggestion is to specify further which possibilities you want to work on.
>  I recommend you schedule the _easiest_ thing first, before moving on to
> the hard things.  Most GSoC students are too ambitious and never produce
> anything useful.
>
Thanks, Benjamin, for the quick reply and for providing some very useful
suggestions.


> Some specific ideas:
>
> Voice Commands:
>  - integrate with a text-command system like Gnome Do [1], so that the
> commands are accessible through the keyboard as well as microphone.  Also
> look at Perlbox [2].  (Note that neither Gnome Do nor Perlbox can be used
> directly.)
>  - integrate with GnomeVoiceControl [3], which already uses PocketSphinx
> and should be highly compatible with Sugar.   This could allow voice
> control of unmodified Activities.
>
I have already gone through GnomeVoiceControl, which I think is the best
option for integrating into Sugar, because it uses PocketSphinx, which is
lightweight and thus should be compatible with devices like the XO-1.0.
The run-time memory requirements of PocketSphinx are up to 20 MB. Over the
next few days, I will be testing PocketSphinx in Sugar and familiarizing
myself further with GnomeVoiceControl.


> Speech Recognition:
>  - supply text to any unmodified activity
>  - control input language easily for multilingual users
>
> [1] http://do.davebsd.com/index.shtml
> [2] http://perlbox.sourceforge.net/
> [3] http://live.gnome.org/GnomeVoiceControl
>
I have broken the proposal into the following parts, which should be done in
sequence:

a) My first priority this summer is to enable "Sugar Voice Control". This
includes:

1. Testing PocketSphinx on Sugar.
2. Studying GnomeVoiceControl in more detail.
3. Sugarizing GnomeVoiceControl.
4. Providing a command-line interface that starts speech recognition in the
background and begins accepting "Speech Commands".
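
The command handling in step 4 could be sketched roughly as follows,
assuming the recognizer hands back each utterance as a plain text string.
The names COMMANDS, normalize and dispatch are hypothetical, and the
recognizer itself is left out rather than guessed at:

```python
# Hypothetical sketch: map recognized phrases to actions.
# The recognizer is not shown; we assume it yields plain text strings.

def normalize(utterance):
    """Lower-case the utterance and collapse runs of whitespace."""
    return " ".join(utterance.lower().split())


def dispatch(utterance, commands):
    """Run the handler registered for the utterance, if any.

    Returns the handler's result, or None when the phrase is unknown.
    """
    handler = commands.get(normalize(utterance))
    if handler is None:
        return None
    return handler()


# Illustrative command table; real handlers would drive the Sugar shell.
COMMANDS = {
    "open write": lambda: "launch:Write",
    "go home": lambda: "view:home",
}

if __name__ == "__main__":
    for heard in ["Open  Write", "go home", "mumble"]:
        print(heard, "->", dispatch(heard, COMMANDS))
```

Keeping the phrase table separate from the dispatcher would let the same
commands be reached from the keyboard as well as the microphone.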

b) After the successful implementation of Sugar Voice Control, we can then
look into providing speech-recognized text to unmodified Sugar activities.
Thus activities like Write can be made to get the required input either
from the keyboard or through the microphone. This includes:

1. Providing a speech recognition button in the Sugar frame (for example, at
the top right) which, when clicked, starts recognizing speech in the
background. Clicking the same button again stops the recognition process.

2. A keyboard shortcut like Alt+S for starting speech recognition.

3. A speech recognition control panel for controlling the various parameters.
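
The button in item 1 and the shortcut in item 2 could share one piece of
state. A minimal sketch, assuming both simply call the same toggle; the
class RecognitionToggle and its start/stop callbacks are hypothetical, and
no Sugar or GTK wiring is shown:

```python
# Hypothetical sketch: one toggle shared by the frame button and the shortcut.

class RecognitionToggle:
    """Tracks whether background recognition is running."""

    def __init__(self, start, stop):
        self._start = start    # called when recognition should begin
        self._stop = stop      # called when recognition should end
        self.active = False

    def toggle(self):
        """Flip the state, call the matching callback, return the new state."""
        if self.active:
            self._stop()
        else:
            self._start()
        self.active = not self.active
        return self.active


if __name__ == "__main__":
    events = []
    control = RecognitionToggle(lambda: events.append("start"),
                                lambda: events.append("stop"))
    control.toggle()   # e.g. the frame button is clicked
    control.toggle()   # e.g. Alt+S is pressed
    print(events)
```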

c) The last part can be creating an API for providing easy Speech
Recognition access to activity developers.
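
One possible shape for such an API, purely as a sketch: activities register
a callback and receive each recognized string. SpeechService and its
methods are hypothetical names and do not exist in Sugar:

```python
# Hypothetical sketch of an activity-facing API; none of these names exist
# in Sugar today.

class SpeechService:
    """Delivers recognized text to every subscribed callback."""

    def __init__(self):
        self._listeners = []

    def subscribe(self, callback):
        """Register a callable that accepts one recognized string."""
        self._listeners.append(callback)

    def unsubscribe(self, callback):
        """Stop delivering text to a previously registered callable."""
        self._listeners.remove(callback)

    def publish(self, text):
        """Called by the recognizer side with each recognized utterance."""
        # Iterate over a copy so callbacks may unsubscribe while running.
        for callback in list(self._listeners):
            callback(text)
```

An activity would call subscribe() with its own text handler and
unsubscribe() when it closes.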

My aim is to achieve at least part a) this summer and, if time permits, I
would also like to implement part b). Part c) can be taken care of later.

Regards
-- 
Chirag Jain

Undergraduate Student
Netaji Subash Institute of Technology
New Delhi


Re: [Sugar-devel] GSOC 2010: Speech Recognition in Sugar

2010-04-03 Thread Benjamin M. Schwartz
I think your proposal is very interesting.  It contains a number of
different ideas.  One major division is between Voice Commands and Speech
Recognition.  Each of these contains many other possibilities. My biggest
suggestion is to specify further which possibilities you want to work on.
 I recommend you schedule the _easiest_ thing first, before moving on to
the hard things.  Most GSoC students are too ambitious and never produce
anything useful.

Some specific ideas:

Voice Commands:
 - integrate with a text-command system like Gnome Do [1], so that the
commands are accessible through the keyboard as well as microphone.  Also
look at Perlbox [2].  (Note that neither Gnome Do nor Perlbox can be used
directly.)
 - integrate with GnomeVoiceControl [3], which already uses PocketSphinx
and should be highly compatible with Sugar.   This could allow voice
control of unmodified Activities.

Speech Recognition:
 - supply text to any unmodified activity
 - control input language easily for multilingual users

[1] http://do.davebsd.com/index.shtml
[2] http://perlbox.sourceforge.net/
[3] http://live.gnome.org/GnomeVoiceControl



[Sugar-devel] GSOC 2010: Speech Recognition in Sugar

2010-04-03 Thread chirag jain
Hi,

As a student at Delhi University, India, I would like to participate in GSOC
2010 with a proposal for Speech Recognition in Sugar.

I have created a Sugar wiki page for my proposal, which you can visit at:
http://wiki.sugarlabs.org/go/Summer_of_Code/2010/speech-recognition

I am looking for comments and feedback on this idea that will help me
to improve it further.

Thanks and Regards

-- 
Chirag Jain

Undergraduate Student
Netaji Subash Institute of Technology
Delhi University
New Delhi
India