Re: LowerCaseFilterFactory and spellchecker

Chris Hostetter Tue, 04 Dec 2007 17:01:48 -0800

: It does make some sense, but I'm not sure that it should be blindly analyzed
: without adding logic to handle certain cases (like the QueryParser does).
: What happens if the analyzer produces two tokens?  The spellchecker has to
: deal with this appropriately.  Spell checkers should be able to "reverse
: analyze" the suggestions as well, so "Pyhton" gets corrected to "Python" and
: not "python".  Similarly, "ad-hco" should probably suggest "ad-hoc" and not
: "adhoc".


These all seem like arguments in favor of using the query analyzer for the 
source field ... yes, the person making the schema has to think carefully 
about what the analyzer does, but they already have to be equally careful 
about what the indexing analyzer does.

Bottom line: if the indexing analyzer is used to build the dictionary, the 
query analyzer should be used before looking up entries in the dictionary.

"Python" is only a good suggestion for "Pyhton" if searching for "Python" 
is going to return something. "python" might be a better suggestion.  
Likewise "Python" might be a good suggestion for "python" if it's always 
capitalized in the source field.
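
To make that concrete, here is a rough sketch in plain Python (not Solr or 
Lucene code). Everything in it is an assumption for illustration: the 
function names are hypothetical, the "analyzers" are simple stand-ins for a 
real analyzer chain, and difflib is swapped in for the actual spellchecker's 
distance measure.

```python
import difflib


def lowercase_analyzer(text):
    # Hypothetical stand-in for a chain ending in a lowercase filter.
    return text.lower().split()


def case_preserving_analyzer(text):
    # Hypothetical stand-in for a chain that only splits on whitespace.
    return text.split()


def build_dictionary(field_values, index_analyzer):
    # The dictionary holds terms exactly as the indexing analyzer emits them.
    terms = set()
    for value in field_values:
        terms.update(index_analyzer(value))
    return terms


def suggest(user_input, dictionary, query_analyzer):
    # Run the input through the query analyzer first, so lookups use the
    # same term form that the dictionary was built with.
    suggestions = []
    for token in query_analyzer(user_input):
        if token in dictionary:
            suggestions.append(token)
            continue
        # difflib is an assumption here, not what the spellchecker uses.
        close = difflib.get_close_matches(token, sorted(dictionary), n=1)
        suggestions.append(close[0] if close else token)
    return suggestions


# Lowercased source field: "python" is the term a search would match.
lowercased = build_dictionary(["Python and Solr"], lowercase_analyzer)
print(suggest("Pyhton", lowercased, lowercase_analyzer))  # ['python']

# Case-preserving source field: "Python" is what is actually indexed.
preserved = build_dictionary(["Python"], case_preserving_analyzer)
print(suggest("python", preserved, case_preserving_analyzer))  # ['Python']
```

In both cases the suggestion is whatever term exists in the source field, 
which is the only form guaranteed to return something when searched.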

-Hoss
