Re:ideological speed bumps

Tim Cross Sat, 15 May 2010 18:00:32 -0700

Eric S. Johansson writes:
 > I've had this conversation with a couple of OSS developers and the answers
 > always leave me very uncomfortable.
 >
 > The problem is how does one live by OSS principles when essential tools are
 > vehemently closed and the barriers to replacements are decade scale and no
 > one is working on them?
 >
 > The problem I refer to is the use of speech recognition as a tool for
 > dealing with upper extremity disabilities. There is only one vendor for
 > continuous speech large vocabulary recognition. There are maybe one or two
 > universities in the world conducting research into speech recognition. All
 > the open-source toolkits are hampered by design criteria (fixed grammar,
 > small vocabulary) and there is no corpus sufficient to build acoustical
 > models. Recognition engines are multimillion dollar efforts to build and
 > corpus collection is even more expensive. Speech recognition also requires
 > very specialized knowledge and the people skilled in the art are owned by
 > industry. Therefore, a rational person would assume that OSS speech
 > recognition is not coming anytime in the near future, maybe not even in my
 > lifetime.
 >
 > A rational person would also assume that part of the way to tackle the
 > problem is to nibble at the edges from the application side to the
 > recognizer side, gradually increasing the availability of OSS components so
 > that the disabled person can minimize their dependence on proprietary or
 > closed source applications.
 >
 > A lot of us disabled programmers have done a good job of nibbling around
 > the edges, but there are a lot of cases where we don't have the knowledge
 > and need the help of project related people. For example, the Emacs
 > integration mode with NaturallySpeaking (VR-mode) doesn't work right. It is
 > incredibly fragile and breaks apparently at random. When I asked for help
 > from various Emacs wizards to help keep it up-to-date and maybe even
 > integrated into Emacs source in the hopes that it would be less likely to
 > break, I was told there was no chance of help because it was linked to a
 > proprietary package.
 >
 > That doesn't leave us in a very good place because if that attitude persists
 > from the ideologically pure, disabled users have a shrinking number of
 > open-source applications they can use because the users require the use of
 > a proprietary package.
 >
 > How does one deal with the real world issue that disabled users will need
 > proprietary packages integrated with open source applications to keep them
 > from being forced into using 100% proprietary applications with no options?
 > 
Hi Eric,


The points you raise and your observations are all true, but I don't think
there is a good answer. What it really boils down to is that OSS is largely
about solutions that have been developed by users scratching their own itch.
Unfortunately, voice recognition is an extremely complex and difficult itch
to scratch, and the number of developers with the necessary skills who want
to scratch it is very small.

I don't think the problem is impossible to fix, but it is likely that it will
take some time. In the mid 90's, after losing my sight, there were no decent
OSS text-to-speech systems and hardly anything available for blind users to
use Unix or Linux. Essentially, we had to use Windows/DOS and a terminal. Now,
15 years later, the situation is very different. There are some good quality
TTS engines, some quite sophisticated TTS based interfaces for both terminal
and GUI environments and both good quality free and relatively cheap TTS
engines available. Back in the mid 90's, many thought we would never have good
quality OSS TTS engines.

It has been a number of years since I've looked at the status of voice
recognition in the OSS world. Working on the existing OSS projects would seem to be a
good proactive approach. In addition to this, two other approaches that might
be worth pursuing, especially by anyone who is interested in this area and
doesn't feel they have the technical skill to actually work in the development
area, would be to lobby commercial vendors to either make some of their code
open source or to provide low cost licenses and to lobby for project funding to
support OSS projects that are working on VR. A significant amount of the work
done to improve TTS interfaces has been made possible because of effective
lobbying and gaining of support from commercial and government bodies.

As an example of what can be achieved here: through lobbying efforts, it is
now possible to obtain an end-user license for ViaVoice TTS at a very
reasonable price. Previously, you had to purchase the whole SDK to get the
runtime and it was very expensive. This provided users with a good quality
TTS. While it is not OSS, it has provided a 'bridge' while decent OSS TTS
engines have been developed. I used this solution for a number of years. Now,
since switching to 64bit, I use a good quality OSS TTS engine, an option that
was not available, or more precisely, was not yet mature enough, only a few
years ago.

I'm possibly a little more optimistic regarding the future of OSS VR. Voice
recognition is rapidly moving from living in a very specialised domain to
being much more general purpose. This is largely due to the growth in small form
factor devices, such as mobile phones. I've been told that the Google Nexus One
phone has quite good VR support. This is an indication that decent VR
applications that run in an OSS environment are becoming more prevalent. While
it's true that most of these apps have been developed commercially and are not
OSS, I suspect they will 'leak' into the OSS world over time. The growth in
commercial VR solutions also adds to our knowledge and understanding of VR.
While much of this knowledge may still be proprietary in nature, this sort of
knowledge tends to find its way out into the public domain over time. It is
also likely that, as demand for VR solutions increases, more university
research will occur, as it will be seen as something with good commercial
potential, i.e. good funding opportunities.

Unfortunately, it is also true that the accessibility benefits of technology
such as VR will all too often be a secondary issue to commercial interests.
There will be a lag time between this technology existing and it being
accessible to those who would really benefit from it. This is probably the
downside of the free market economy where developments are driven by profits.
However, it is also the perceived profits that ensure commercial resources are
invested into understanding the problem and developing workable solutions.
We are still a long way from the sort of society that would put
accessibility needs before individual and corporate greed. In fact, we are
still a long way from getting mainstream recognition of accessibility issues
to the level they should be, which is why I think lobbying and raising issues
outside the accessibility community is so important. 

Tim


-- 
Tim Cross
tcr...@rapttech.com.au

There are two types of people in IT - those who do not manage what they 
understand and those who do not understand what they manage.

-- 
Ubuntu-accessibility mailing list
Ubuntu-accessibility@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-accessibility
