Eric S. Johansson wrote:
> (warning, I am an unrepentant curmudgeon and negative filter. Interpret
> the following accordingly. If I'm wrong on any points, and someone
> wants to correct me, I will gladly learn.)
>
> In a nutshell, not much.

I agree that it does look limited at the moment and that NaturallySpeaking is the only viable path. ViaVoice is outdated on Linux, and the Windows version of NS is better anyway. As an unrepentant optimist, though, I can see the following path forward.

In short: create a copyleft (GPL) tool to transfer text from NaturallySpeaking on Windows to Linux. A few starts have been made on this, but it needs to be organised as a proper community project and driven forward by several people.

The user interface should aim to be better than what the native Windows NS version has. It should be speech engine and OS agnostic. That way you'll get people using it to transfer speech between all sorts of different systems, and it will get more use and development. You should be able to easily plug in a free engine like Sphinx (so these will be encouraged to improve) or even Vista's native system, which will be very widespread.

My biggest gripe with NS is the editing interface. The actual recognition is quite good IMO, but when you do make a mistake it is very awkward to fix it without using the keyboard. If you give an edit command and that is not understood correctly either, then you get a meaningless sentence and you are no longer able to easily correct the one you originally wanted to fix. The end result is that you totally lose the flow of what you were trying to express.

The user interface is what we would have to reconstruct in whole or in part anyway, so it's no big loss. We should make it much more configurable, so you can work around whatever shortcomings it has, and encourage community contributions to improving usability.

Use the NS macro system to send custom commands, and use scripting on the receiving end to allow it to adapt to applications. I presume the macro functionality in NS is configured so that the pattern recognition is quite good on the macros you define yourself. So when you say 'Paste in my address' it generally works.

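To make the "scripting on the receiving end" part a bit more concrete, here is a minimal Python sketch. It assumes each voice macro simply types a fixed marker string of the form **MACRO-NAME** into the dictated text, and that the receiver gets the whole stream as a string. The names split_stream, apply_stream, COMMANDS and delete_last_sentence are hypothetical, made up for this illustration; nothing here is an existing NS or Sphinx API, and the "sentence" rule is deliberately naive.

```python
import re

# Assumed marker convention: **MACRO-SOME-NAME** embedded in the dictated text.
MARKER = re.compile(r"\*\*MACRO-([A-Z]+(?:-[A-Z]+)*)\*\*")


def split_stream(stream):
    """Yield ("text", str) and ("command", name) items in stream order."""
    pos = 0
    for match in MARKER.finditer(stream):
        if match.start() > pos:
            yield ("text", stream[pos:match.start()])
        yield ("command", match.group(1))
        pos = match.end()
    if pos < len(stream):
        yield ("text", stream[pos:])


def delete_last_sentence(text):
    """Naive rule: cut back to the end of the previous sentence (., ! or ?)."""
    body = text.rstrip()
    if body and body[-1] in ".!?":
        body = body[:-1]  # ignore the terminator of the sentence being removed
    cut = max(body.rfind(ch) for ch in ".!?")
    return text[:cut + 1]  # cut == -1 leaves an empty string


# Hypothetical command table; a different table could be used per application.
COMMANDS = {"DELETE-SENTENCE": delete_last_sentence}


def apply_stream(stream, commands=COMMANDS):
    """Rebuild the document text, running commands instead of typing them."""
    text = ""
    for kind, value in split_stream(stream):
        if kind == "text":
            text += value
        elif value in commands:
            text = commands[value](text)
        # Unknown commands are dropped rather than typed into the document.
    return text


if __name__ == "__main__":
    raw = "Hello there. This is wrong.**MACRO-DELETE-SENTENCE** This is right."
    print(apply_stream(raw))
```

Keeping the command table as a plain dictionary is one way to get the configurability: swapping in a different table for an editor, a terminal or a mail client would change what each spoken command does without touching the Windows side.
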
We can (ab)use this macro facility for our own editing needs. We would define a set of macros that would be processed by the NS engine and would give us a known and parseable string. So saying 'Macro: delete sentence' would actually insert the text **MACRO-DELETE-SENTENCE** into the text stream. If you were watching the text on the Windows system, the real text would be interspersed with such commands, but on the Linux system receiving the stream it would just Do the Right Thing. The big advantage is that it's very configurable this way, so we can make it do what we want.

We might eventually be able to get the engine running in Wine. Frankly, I'm not too interested in having the whole of NS run in Wine because of the interface. If we can make a better interface and can demonstrate a need for speech recognition (a commercial need), then we may well see the owners of the code port the speech engine to Linux. Low latency kernels should be a big draw for them as well.

Now we just need someone willing to go on the barricades and front such a project :) Perhaps we can start this off as a Google Summer of Code project.

Henrik

--
Ubuntu-accessibility mailing list
Ubuntu-accessibility@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-accessibility