Another thing you might look into is stemming. The Porter stemmer
included in Solr is "aggressive", meaning that it will tend to do
weird things with misspellings. A different stemmer called KStem,
available from www.lucidimagination.com/Downloads, is less
aggressive. Porter turns "changes" and "changing" into "chang",
while KStem does not go this far.

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters/Kstem
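
To make the difference concrete, here is a rough sketch in Python. It is
a toy only: neither function is the real Porter or KStem algorithm, and
the suffix list and the tiny LEXICON are made up for illustration. It
only shows the general idea of blind suffix stripping versus stripping
only when the result is a known word.

```python
# Toy illustration only: these are NOT the Porter or KStem algorithms.
SUFFIXES = ("ing", "es", "ed", "s")

# Hypothetical mini-lexicon, standing in for the word list that a
# lighter, dictionary-aware stemmer might consult.
LEXICON = {"change", "index", "search"}


def aggressive_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def light_stem(word):
    """Strip a suffix only if the result (or result + 'e') is a known word."""
    if word in LEXICON:
        return word
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            base = word[: -len(suffix)]
            for candidate in (base, base + "e"):
                if candidate in LEXICON:
                    return candidate
    # Unknown or misspelled words are left untouched.
    return word


for w in ("changes", "changing", "chagnes"):
    print(w, aggressive_stem(w), light_stem(w))
```

The blind version chops the misspelling "chagnes" down to "chagn", while
the lexicon-checked version leaves it alone, so it stays visible to a
spellchecker.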

On Thu, Dec 17, 2009 at 12:59 PM, Lance Norskog <goks...@gmail.com> wrote:
> Character-based NGrams are a good tool for this problem. MLT is a
> document-wide numerical analysis.
>
> If the common types of OCR mistakes are different from what NGrams
> create, you might tune the ngram generator. For example, swapping
> letters might not happen very often. Single- and multi-word errors
> must happen a lot.
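
As a rough sketch of the character n-gram idea above (plain Python, not
Solr's own n-gram filters): split each word into overlapping trigrams and
compare the sets. The function names, the "_" boundary padding, and the
Jaccard overlap score are my assumptions for illustration, not something
Solr prescribes.

```python
def char_ngrams(word, n=3):
    """Return the set of character n-grams of a word, with boundary padding."""
    padded = f"_{word}_"  # "_" marks the start and end of the word
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def ngram_similarity(a, b, n=3):
    """Jaccard overlap of the two words' n-gram sets, between 0.0 and 1.0."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a and not grams_b:
        return 1.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# A typical OCR confusion: "rn" read as "m".
print(ngram_similarity("modern", "modem"))
print(ngram_similarity("modern", "search"))
```

Tuning the generator would then mean changing n or the padding so that
the error types you actually see still share most of their grams.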
>
> If you do a facet query on your indexed terms, you will get a lot of
> facets with only one appearance in the index. These are often
> misspellings. It is possible to automate pulling these and creating a
> matching set of synonyms for words that appear in the spelling index.
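
The automation described above might look roughly like this, assuming
you have already pulled the term counts out of the facet response into a
dict. The function names, the min_trusted_count threshold, and the
similarity cutoff are hypothetical; the list of frequent terms stands in
for the spelling index. The output uses the "wrong => right" mapping
form of a Solr synonyms file.

```python
import difflib


def suggest_synonyms(term_counts, min_trusted_count=5, cutoff=0.75):
    """Map terms seen exactly once to the closest frequently seen term."""
    # Frequent terms stand in for the words in the spelling index.
    trusted = [t for t, c in term_counts.items() if c >= min_trusted_count]
    suggestions = {}
    for term, count in term_counts.items():
        if count != 1:
            continue
        matches = difflib.get_close_matches(term, trusted, n=1, cutoff=cutoff)
        if matches:
            suggestions[term] = matches[0]
    return suggestions


def to_synonym_lines(suggestions):
    """Format suggestions as 'misspelling => correction' lines."""
    return [f"{wrong} => {right}" for wrong, right in sorted(suggestions.items())]


# Hypothetical facet counts; "docurnent" is an OCR misread of "document".
counts = {"document": 120, "docurnent": 1, "library": 40, "zebra": 1}
for line in to_synonym_lines(suggest_synonyms(counts)):
    print(line)
```

Rare but legitimate words (like "zebra" here) have no close frequent
neighbour and are skipped, but the generated list would still want a
manual review before going into production.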
>
> On Tue, Dec 15, 2009 at 12:57 PM, Chris Hostetter
> <hossman_luc...@fucit.org> wrote:
>>
>> : My first problem is that I need suggestions even when the
>> : expression has returned results. It seems that suggestions only
>> : appear when there are no results. Is there a way to do so?
>>
>> can you give us an example of what your queries look like?  with the
>> example configs, i can get matches, as well as suggestions...
>>
>>
>> http://localhost:8983/solr/spell?q=ide&spellcheck=true
>>
>> : The second question is: For the purposes that I've mentioned, is the
>> : best way to use spellchecker or mlt component? Or some other (as a
>> : fuzzy query)?
>>
>> there's no clear cut answer to that -- i don't remember anyone else ever
>> asking about anything particularly similar to what you're doing, so i
>> don't know that there is any precedent for a "best" way to go about it.
>>
>>
>>
>> -Hoss
>>
>>
>
>
>
> --
> Lance Norskog
> goks...@gmail.com
>



-- 
Lance Norskog
goks...@gmail.com
