BTW, what are suitable returns from Emacs report functions for you?

The choices that come to my mind at the moment are:

- simple get it returned
Yes, as in:
position = Emacs_fetch ("current character position", buffer = "focus")
buffer_list = Emacs_fetch ("buffer list")

Obviously we have to figure out what's a reasonable set of first arguments for this function. I am perfectly willing to use internal Lisp names, including calling functions to get data. Actually, that seems like it might be a very reasonable thing to do. Sort of like SQL queries, only more understandable.
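
To make that concrete, here is a minimal sketch of how internal Lisp names could serve directly as query keys. The entry point and the name speech-fetch are hypothetical, not an existing API:

;; Hypothetical dispatcher: the speech side passes the name of an
;; internal Lisp function plus optional arguments and gets the result
;; back.  Buffer objects and the like would still need to be turned
;; into strings before leaving Emacs.
(defun speech-fetch (query &rest args)
  "Call the Lisp function named QUERY with ARGS and return its result."
  (apply (intern query) args))

;; What the Emacs_fetch calls above might map to:
;;   (speech-fetch "point")        ; current character position
;;   (speech-fetch "buffer-list")  ; list of buffer objects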

- display in mode-line

That seems to be more of an Emacs_set function. Could you elaborate on what you're thinking of?
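
If it is the set direction, a minimal sketch might look like the following, assuming we only want to push a short status string into the mode line. This is the GNU Emacs side (XEmacs handles the mode line a little differently), and speech-status / speech-set-status are made-up names:

(defvar speech-status ""
  "Text pushed from the speech environment for display in the mode line.")

;; Hook the variable into the standard mode-line mechanism.
(unless (memq 'speech-status global-mode-string)
  (setq global-mode-string
        (append global-mode-string '(speech-status))))

(defun speech-set-status (text)
  "Show TEXT in the mode line of every window."
  (setq speech-status text)
  (force-mode-line-update t))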

- message-buffer

Actually, I was thinking of any buffer. If we had access to the current buffer, it would be possible to reimplement VR-mode in a more generic fashion.
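
For the fetch side of that, a sketch, assuming we just want the name and text of whatever buffer the user means (speech-fetch-buffer is a made-up name; the calls inside are standard Emacs Lisp):

(defun speech-fetch-buffer (&optional name)
  "Return (NAME . TEXT) for buffer NAME, defaulting to the current buffer."
  (with-current-buffer (or name (current-buffer))
    (cons (buffer-name)
          (buffer-substring-no-properties (point-min) (point-max)))))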

- tool-tip

Again, could you elaborate? I do suspect it's something you would set through the set-data function.

- so-called speed-bar

Instead of a simple return, it might be sent to a program...


My currently preferred Emacs is XEmacs, for political reasons[1].

I'm not sure what you need in a technical description. Normally in a
speech recognition environment you use either fixed grammars or
continuous dictation. I am building a hybrid where you use a fixed
grammar with contextually dependent elements and interact with GUI
elements to make an unspeakable process speakable.

The process of making the unspeakable speakable involves identifying and
extracting information from the application, transforming it into a
speakable form, and displaying it in a second application where it can be
manipulated. See blog.esjworks.com for more complete examples.

I expect that most of the action routines for a complete grammar will
just be Emacs keystrokes invoking Emacs methods via keyboard input. It
would be nice to do a direct injection of commands to eliminate errors
in command execution caused by injecting characters at too fast a rate.
A direct access channel would also allow us to query the buffer for
state information which could be used to influence the action routine.
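
One possible shape for that direct channel, assuming Emacs runs its server (M-x server-start) so emacsclient --eval can carry both commands and queries; speech-do is a name I made up for the sketch:

;; The action routine shells out, e.g.:
;;   emacsclient --eval "(speech-do '(forward-line 1))"
;; instead of typing keys, so there is no race with injected characters.
(defun speech-do (form)
  "Evaluate FORM, then report where we ended up."
  (eval form)
  (list :buffer (buffer-name)
        :line   (line-number-at-pos)
        :column (current-column)))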

The commands I asked for, which have no need to export information to
any external program, would help me get a better feel for whether I'm on
the right track or not. If there's something I use regularly and it
"feels right" without vocal damage from excessive use, then I'm on the
right path. If not, I need to look at the problem again and come up with
a better solution.

An example of a more complicated spoken command is the "get method"
command. The first thing the command does is search to the right for the
next method. An alias for it would be "get next method". Going in the
other direction would be "get previous method". Once the method was
identified, it would be placed in the region, mark on the left, point on
the right. The action routine for the grammar would then invoke a GUI
helper program for manipulating symbol names and pass the existing name
along to it. The resulting changed method name would be returned via a
different grammar and action routine, "use <transformation type>", and
the result would be placed back into the buffer, replacing what was in
the region.
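
A rough sketch of what the Emacs side of "get method" could look like for a Python buffer; the function name and the simple def regexp are mine, for illustration only:

(defun speech-get-method ()
  "Select the next method name after point and return it as a string.
Leaves the region around the name: mark on the left, point on the right."
  (interactive)
  (when (re-search-forward "\\_<def[ \t]+\\([A-Za-z_][A-Za-z0-9_]*\\)" nil t)
    (push-mark (match-beginning 1) t t)
    (goto-char (match-end 1))
    (match-string-no-properties 1)))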

Making any sense?



It does. However, it's a new and vast matter for me. So let's proceed step by step and see how it goes.

I didn't get here overnight. It took me 18 years to become injured, 10 years to become frustrated with speech recognition, and then another five years to figure out how not to become frustrated, only to become frustrated again because I couldn't pay people to write the code for me. The joys of being a serial-entrepreneur, self-employed type person. If you think this is interesting, you should see what I'm doing for diabetes self-management tools. I really need to get that done so I can pass the demo to potential funders, which is part of the reason why I need these Emacs extensions. Everybody has an ulterior motive. :-)


Let's start with the report function, telling where you are in the code.
Agreed? So I will dig a little bit into the question of how the results from Emacs are taken up in your environment.
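
As a starting point, a sketch of such a report function, assuming which-func.el (bundled with GNU Emacs; XEmacs would need its own equivalent) for the enclosing function name. speech-where-am-i is a made-up name:

(require 'which-func)

(defun speech-where-am-i ()
  "Return buffer, line, column and enclosing function at point."
  (list :buffer   (buffer-name)
        :line     (line-number-at-pos)
        :column   (current-column)
        :function (which-function)))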

This has been problematic from day one. The blind spot in speech recognition has been that, since getting information from an application is "too hard", commands have become open-loop commands, using the content of the command to generate keystroke or menu injections that activate a function within the application. Emacs examples would be something like:

(date | time | log) stamp = stamp.string($1);

search (forward={ctrl+s}|back={ctrl+r}) = $1;
yank (again={Esc}y|it back={ctrl+y}) = $1;
go to line = {Alt+g};
repeat (it={ctrl+u} | twice={ctrl+u2} | thrice={ctrl+u3} ) = $1;

copy    (character={esc}xdelete-forward-char{enter}{ctrl+y} |
         word= {esc}d{ctrl+y}|
         line={esc}xset-mark{ctrl-e}{esc}w
        )= $1;

kill    (character = {ctrl+d} |
         word={esc}d|
         line={ctrl+k}
        )= $1;

delete  (character = {esc}xdelete-backward-char{enter}|
         word={esc}xbackward-kill-word{enter}|
         line={ctrl+u}0k{}
        )= $1;

left    (character={ctrl+b}|
         word={esc}b|
         line={ctrl+u}0k{esc}xforward-line{enter}
        )= $1;

right   (character={ctrl+f}|
         word={esc}f|
         line={esc}xforward-line{enter}
        )= $1;

Sorry if that doesn't make much sense, but the grammar is expressed in a rather odd way in Vocola, mixing action with grammar.

copy (character | word | line) is expressed as:

copy    (character={esc}xdelete-forward-char{enter}{ctrl+y} |
         word= {esc}d{ctrl+y}|
         line={esc}xset-mark{ctrl-e}{esc}w
        )= $1;

The right-hand side of each grammar expression is the key sequence emitted. Note: in this example, not all of the key sequences are correct; I fix them when I need them for the first time. You could say this is programming by placeholder.

There has been no activity toward finding a general way to return data to the speech recognition environment. There have been some hacks using the cut-and-paste buffer mechanism, but that's rather unsophisticated. I think the best model is "tell me what I want" rather than trying to force anything into the speech recognition action routines. The only potential exception to that would be a change of some major state which necessitates moving to a new grammar and action routine state. For example, if you change modes, you would want a new grammar. We might handle that by making the first thing we ask be "what is the state"; if it's the same as last time, we don't change anything, otherwise we initiate a grammar change.

This also highlights another difference between current thinking and what I think is a better solution. Current thinking only changes grammar if you change applications. I believe we should change grammar if an internal state changes, because that should allow you to disambiguate commands through reduction of scope (you'll hear me say that a lot). In python-mode, an example would be enabling commands for shell operation and debugging only after you enter a shell buffer, versus a globally active grammar which would require longer, more unique commands.
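
A sketch of the "what is the state" probe that would drive such a grammar change. speech-grammar-state is a made-up name, and treating any comint-derived buffer as "shell" is just one possible rule:

(defun speech-grammar-state ()
  "Return a small token describing the state that selects a grammar."
  (list :buffer (buffer-name)
        :mode   major-mode
        :shell  (and (derived-mode-p 'comint-mode) t)))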

In summary, I would say that the easiest model is to continue with the grammar and action routine triggers the way we do now, with action routines querying the application for data and pushing data to the application.

I should probably stop here, because I'm probably giving you intellectual heartburn at this point; it's a lot to digest, and I'm late for my astronomy club's work party. Observing sites don't maintain themselves.

Later
--- eric