The SpellChecker code mixes indexing, n-gram handling, and querying in one place, so extending it as-is will not produce clean code. Would it make sense to first refactor SpellChecker, separating the dictionary-reading function from the indexing/searching functions? SpellChecker would then get a method to add a SpellEngine, an interface which looks like:

interface SpellEngine {
        public void addWord(String word);
        public String[] suggestSimilar(String word, int numSug);
}

plus something to sort the suggestions by, such as each suggestion's "distance" from the input word.
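
To make the idea concrete, here is a rough sketch of what one pluggable engine could look like. Only the SpellEngine interface comes from the proposal above; the EditDistanceSpellEngine class, its in-memory word set, and the choice of Levenshtein distance as the sort key are assumptions for illustration, not existing contrib/spellchecker code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// The interface proposed in this message.
interface SpellEngine {
    void addWord(String word);
    String[] suggestSimilar(String word, int numSug);
}

// Hypothetical engine (not part of contrib/spellchecker): a brute-force scan
// over an in-memory word set, ranked by Levenshtein distance to the input.
class EditDistanceSpellEngine implements SpellEngine {
    private final Set<String> words = new LinkedHashSet<>();

    @Override
    public void addWord(String word) {
        words.add(word);
    }

    @Override
    public String[] suggestSimilar(String word, int numSug) {
        List<String> candidates = new ArrayList<>(words);
        // Smallest distance first; ties are broken alphabetically so the
        // order is stable. The distance is recomputed on every comparison,
        // which is acceptable for a sketch but wasteful for a real dictionary.
        candidates.sort(Comparator
                .comparingInt((String w) -> distance(word, w))
                .thenComparing(Comparator.naturalOrder()));
        int n = Math.min(Math.max(numSug, 0), candidates.size());
        return candidates.subList(0, n).toArray(new String[0]);
    }

    // Two-row dynamic-programming Levenshtein distance.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

With "lucene", "lucent" and "search" added, asking for two suggestions for "lucen" should return "lucene" then "lucent": both are one edit away, and the tie is broken alphabetically. A phonetic (aspell-style) engine could implement the same interface with a different ranking.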

M.

On 9 Jul 07, at 02:38, Chris Hostetter wrote:


: Right now, SpellChecker uses the trigram algorithm to find similar words. It
: works well for keyboard fumbles, but not well enough for short words
: or for languages like French, where the same sound can be written
: in different ways.
: Spellchecking is a classic computing task, and aspell provides some
: nice and free (it's GNU) sound dictionaries. Lots of dictionaries are
: available.

The topic of "spell correction" as it pertains to Lucene users can really
have two meanings:
  a) an attempt to suggest potential spelling corrections for query strings
     provided by a user, as a form of input pre-processing
  b) using Lucene as a tool to suggest spelling corrections based on a known
     corpus.

The contrib/spellchecker code is an application of (b) -- it may in fact
be useful for (a), but that doesn't mean there aren't other non-Lucene
tools for achieving (a) as well.

: I wrote a Python parser which generates translation code in different
: languages: Python, PHP and Java. A bit like the Snowball stuff.
: Some work is still needed to generate Lucene-compliant code. But is the
: Python generator good enough for Lucene, or must it be translated
: into Java to go into the Lucene source?

The Lucene-Java repository tends to be about Java code, but
contrib/javascript is an example of code that isn't Java and may still be
of general use to Lucene-Java users.


-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



