Re: spellcheck with misspelled words in index
I think you can just tell the spellchecker to supply only more popular suggestions, which would naturally omit these rare misspellings:

  <str name="spellcheck.onlyMorePopular">true</str>

-Peter

--
Peter M. Wolanin, Ph.D.
Momentum Specialist, Acquia. Inc.
peter.wola...@acquia.com
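For context, here is a minimal sketch of where the spellcheck.onlyMorePopular setting Peter suggests could live in solrconfig.xml. The handler name "/spell" is made up for illustration, and the sketch assumes a search component is already registered under the name "spellcheck"; neither detail comes from the thread.

```xml
<!-- solrconfig.xml (sketch): "/spell" is a hypothetical handler name,
     and "spellcheck" is assumed to be the registered component name -->
<requestHandler name="/spell" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- turn the spellcheck component on for every request to this handler -->
    <str name="spellcheck">true</str>
    <!-- only return suggestions that occur more often than the queried term -->
    <str name="spellcheck.onlyMorePopular">true</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

The same parameter can also be passed per request as spellcheck.onlyMorePopular=true instead of being set as a default. Chris reports in his original message that this setting produced suggestions too often for correctly spelled words, so it is worth testing against real queries before adopting it.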
spellcheck with misspelled words in index
Hi,

I'm having some trouble getting the correct results from the spellcheck component. I'd like to use it to suggest correct product titles on our site; however, some of our products have misspellings in them that are outside of our control. For example, there are 2 products with the misspelled word cusine (and 25k with the correct spelling cuisine). So if someone searches for the word cusine on our site, I would like to show the 2 misspelled products, and a suggestion with "Did you mean cuisine?". However, I can't seem to ever get any spelling suggestions when I search for the word cusine, and correctlySpelled is always true. Misspelled words that don't appear in the index work fine.

I noticed that setting onlyMorePopular to true will return suggestions for the misspelled word, but I've found that it doesn't work well for other words and produces suggestions too often for correctly spelled words.

I had incorrectly thought that by setting thresholdTokenFrequency higher on my spelling dictionary, these misspellings would not appear in my spelling index and thus I would get suggestions for them, but as I see now, the spellcheck doesn't quite work like that.

Is there any way to get spelling suggestions to work for these misspellings in my index if they have a low frequency?

Thanks in advance,
Chris
Re: spellcheck with misspelled words in index
We had the same thing to deal with recently, and a great solution was posted to the list. Create a stopwords filter on the field you're using for your spell checking, and then populate a custom stopwords file with known misspelled words:

  <fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="misspelled_words.txt"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
  </fieldType>

Your spell field would look like this:

  <field name="spell" type="textSpell" indexed="true" stored="true" multiValued="true"/>

Then add words like cusine to misspelled_words.txt

-Jay
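To round out Jay's textSpell field type and spell field, the sketch below shows one way the remaining pieces might be wired together. It assumes the product titles live in a field called "title" (a hypothetical name, not given in the thread) and uses the index-based spellchecker with a component name and index directory chosen for illustration.

```xml
<!-- schema.xml (sketch): "title" is an assumed source field name -->
<copyField source="title" dest="spell"/>

<!-- solrconfig.xml (sketch): build the spelling dictionary from the
     filtered "spell" field rather than from the raw title field -->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">textSpell</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">spell</str>
    <!-- directory name is illustrative -->
    <str name="spellcheckIndexDir">./spellchecker</str>
  </lst>
</searchComponent>
```

With this setup, misspelled_words.txt would hold one known misspelling per line (for example, cusine). Because the stop filter drops those words from the spell field only, the products themselves should still be searchable through their original fields. After changing the file, the documents need to be reindexed and the spelling index rebuilt (for instance with spellcheck.build=true) before the change takes effect.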