Currently, SpellChecker uses the trigram algorithm to find similar words.
It works well for keyboard fumbles, but not well enough for short words
or for languages like French, where the same sound can be written in
several different ways.
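
To make the limitation concrete, here is a minimal Python sketch of trigram
matching. It is a simplification and not the actual Lucene SpellChecker code:
the padding character, the function names and the Jaccard-style score are all
assumptions chosen for illustration.

```python
def trigrams(word):
    """Return the set of 3-character substrings of the padded, lowercased word."""
    # Padding with "_" (an arbitrary choice) lets word boundaries count too.
    padded = "_" + word.lower() + "_"
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a, b):
    """Share of trigrams the two words have in common (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)
```

With this sketch, trigram_similarity("lucene", "lucine") shares 3 of 9
trigrams, while the French homophones "eau" and "o" share none at all, and
"faute" versus the phonetic misspelling "fote" share only 1 of 8.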
Spell checking is a classic computing task, and aspell provides some
nice and free (it's GNU) phonetic ("sound") dictionaries. Lots of
dictionaries are available.
I wrote a Python parser that generates the translation code in several
languages: Python, PHP and Java. A bit like what Snowball does.
A little more work is needed to generate Lucene-compliant code. But is
the Python generator good enough for Lucene, or does it have to be
ported to Java before it can go into the Lucene source?
I'll soon start on a PhonemeSpellChecker which overrides the trigram
SpellChecker.
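
A rough idea of what phonetic lookup could look like, sketched in Python. The
rules below are a tiny made-up table for French, not aspell's real phonetic
data, and PhoneticIndex is a hypothetical stand-in, not the planned
PhonemeSpellChecker class.

```python
from collections import defaultdict

# Toy rules (assumption): spelling pattern -> sound. Longer patterns come first.
TOY_RULES = [("eau", "o"), ("au", "o"), ("ph", "f"), ("qu", "k")]


def phonetic_key(word, rules=TOY_RULES):
    """Rewrite a word left to right, replacing known patterns by their sound."""
    word = word.lower()
    key = []
    i = 0
    while i < len(word):
        for pattern, sound in rules:
            if word.startswith(pattern, i):
                key.append(sound)
                i += len(pattern)
                break
        else:
            # No rule matched here: keep the letter unchanged.
            key.append(word[i])
            i += 1
    return "".join(key)


class PhoneticIndex:
    """Groups dictionary words by phonetic key so homophones are found together."""

    def __init__(self, words):
        self.by_key = defaultdict(set)
        for w in words:
            self.by_key[phonetic_key(w)].add(w)

    def suggest(self, word):
        """Return dictionary words that sound like `word`, excluding itself."""
        return sorted(self.by_key.get(phonetic_key(word), set()) - {word})
```

For example, PhoneticIndex(["bateau", "photo", "faute"]).suggest("fote")
returns ["faute"], a match the trigram approach scores very low.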
The next step is to implement a word cutter, just like Google Suggest.
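
Assuming "word cutter" means splitting a run-together query back into known
words, a minimal dictionary-based sketch in Python could look like this. The
function name and the "fewest words wins" preference are assumptions, and a
real version would more likely rank candidates by term frequency in the index.

```python
def split_words(text, vocabulary):
    """Return a split of `text` into vocabulary words, or None if impossible.

    Among all valid splits, one with the fewest words is returned.
    """
    text = text.lower()
    # best[i] holds a split of text[:i], or None when text[:i] cannot be split.
    best = [None] * (len(text) + 1)
    best[0] = []
    for end in range(1, len(text) + 1):
        for start in range(end):
            piece = text[start:end]
            if best[start] is not None and piece in vocabulary:
                candidate = best[start] + [piece]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[len(text)]
```

With the vocabulary {"spell", "check", "checker", "er"}, the input
"spellchecker" is cut into ["spell", "checker"], and an input that cannot be
covered by the vocabulary gives None.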
Any suggestions?
M.