Re: Spell checking: Is there a way to exclude words known to be wrong?
On Tue, Jul 14, 2009 at 6:37 PM, Erik Hatcher wrote:

> Use the stopwords feature with a custom mispeled_words.txt and a
> StopFilterFactory on the spell check field ;)

Very cool! :)

--
Regards,
Shalin Shekhar Mangar.
Re: Spell checking: Is there a way to exclude words known to be wrong?
Use the stopwords feature with a custom mispeled_words.txt and a StopFilterFactory on the spell check field ;)

Erik

On Jul 13, 2009, at 8:27 PM, Jay Hill wrote:

> We're building a spell index from a field in our main index with the
> following configuration:
>
> textSpell
>     default
>     spell
>     ./spellchecker
>     true
>
> This works great and re-builds the spelling index on commits as expected.
> However, we know there are misspellings in the "spell" field of our main
> index. We could remove these from the spelling index using Luke, however
> they will be added again on commits. What we need is something similar to
> how the protwords.txt file is used. So that when we notice misspelled words
> such as "beginnning" being pulled from our main index we could add them to
> an exclusion file so they are not added to the spelling index again.
>
> Any tricks to make this possible?
>
> -Jay
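
Erik's suggestion could be sketched in schema.xml roughly as follows. This is a minimal sketch under assumptions, not the poster's actual schema: it assumes the "spell" field uses the textSpell field type, and the tokenizer and lowercase filter are illustrative choices. Only StopFilterFactory and the mispeled_words.txt file name come from the thread.

```xml
<!-- schema.xml (sketch): analysis chain for the field that feeds the spelling index -->
<fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- tokenizer and lowercasing are assumptions, not taken from the thread -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- drop known misspellings so they never become terms in the "spell" field;
         mispeled_words.txt sits in the conf directory, one word per line -->
    <filter class="solr.StopFilterFactory" words="mispeled_words.txt" ignoreCase="true"/>
  </analyzer>
</fieldType>

<!-- assumed field definition; the thread only names the field "spell" -->
<field name="spell" type="textSpell" indexed="true" stored="false" multiValued="true"/>
```

Because an index-based spell checker reads its dictionary from the terms already indexed in the source field, a word added to the file only disappears from the spelling index once the documents containing it have been re-indexed and the spelling index rebuilt.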
Re: Spell checking: Is there a way to exclude words known to be wrong?
I don't think there is a way currently, but it might make a nice patch. Or you could just implement a custom SolrSpellChecker - both FileBasedSpellChecker and IndexBasedSpellChecker are actually like maybe 50 lines of code or less. It would be fairly quick to just plug a custom version in as a plugin.

--
- Mark

http://www.lucidimagination.com

On Mon, Jul 13, 2009 at 8:27 PM, Jay Hill wrote:

> We're building a spell index from a field in our main index with the
> following configuration:
>
> textSpell
>     default
>     spell
>     ./spellchecker
>     true
>
> This works great and re-builds the spelling index on commits as expected.
> However, we know there are misspellings in the "spell" field of our main
> index. We could remove these from the spelling index using Luke, however
> they will be added again on commits. What we need is something similar to
> how the protwords.txt file is used. So that when we notice misspelled words
> such as "beginnning" being pulled from our main index we could add them to
> an exclusion file so they are not added to the spelling index again.
>
> Any tricks to make this possible?
>
> -Jay
Spell checking: Is there a way to exclude words known to be wrong?
We're building a spell index from a field in our main index with the following configuration:

textSpell
    default
    spell
    ./spellchecker
    true

This works great and re-builds the spelling index on commits as expected. However, we know there are misspellings in the "spell" field of our main index. We could remove these from the spelling index using Luke, however they will be added again on commits. What we need is something similar to how the protwords.txt file is used. So that when we notice misspelled words such as "beginnning" being pulled from our main index we could add them to an exclusion file so they are not added to the spelling index again.

Any tricks to make this possible?

-Jay
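
Only the values of the configuration above survive in this archive; the XML markup around them is missing. Assuming the layout of the SpellCheckComponent entry in Solr's example solrconfig.xml, the five values would map roughly as shown below. The element and parameter names are an inference from that stock layout, not text from the original message.

```xml
<!-- solrconfig.xml (reconstruction under the assumption stated above) -->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <!-- field type whose analyzer is applied to the spell check query -->
  <str name="queryAnalyzerFieldType">textSpell</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <!-- field in the main index that supplies the dictionary terms -->
    <str name="field">spell</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
    <!-- rebuild the spelling index on every commit, as described in the message -->
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>
```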