Re: Spell checking: Is there a way to exclude words known to be wrong?

2009-07-14 Thread Shalin Shekhar Mangar
On Tue, Jul 14, 2009 at 6:37 PM, Erik Hatcher wrote:

> Use the stopwords feature with a custom mispeled_words.txt and a
> StopFilterFactory on the spell check field ;)
>
>
Very cool! :)

-- 
Regards,
Shalin Shekhar Mangar.


Re: Spell checking: Is there a way to exclude words known to be wrong?

2009-07-14 Thread Erik Hatcher
Use the stopwords feature with a custom mispeled_words.txt and a  
StopFilterFactory on the spell check field ;)


Erik
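
For anyone finding this later, a rough sketch of what that could look like in schema.xml. It assumes the "spell" field uses the textSpell type named in Jay's config; the tokenizer and lowercase filter are placeholders, not Jay's actual analysis chain.

```xml
<!-- schema.xml (sketch): analysis chain below is an assumption -->
<fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- drop known misspellings before they reach the "spell" field -->
    <filter class="solr.StopFilterFactory"
            words="mispeled_words.txt" ignoreCase="true"/>
  </analyzer>
  <analyzer type="query">
    <!-- no stop filter here, so a misspelled query term is still checked -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="spell" type="textSpell" indexed="true" stored="false"
       multiValued="true"/>
```

mispeled_words.txt would sit in the conf directory next to schema.xml, one word per line (e.g. beginnning). Documents that are already indexed keep their old terms until they are re-indexed, so the misspellings should only drop out of the spelling index after a re-index and rebuild.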


On Jul 13, 2009, at 8:27 PM, Jay Hill wrote:


We're building a spell index from a field in our main index with the
following configuration:

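<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">textSpell</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">spell</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>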



This works great and rebuilds the spelling index on commits as expected.
However, we know there are misspellings in the "spell" field of our main
index. We could remove these from the spelling index using Luke, but
they would be added again on commits. What we need is something similar to
how the protwords.txt file is used, so that when we notice misspelled words
such as "beginnning" being pulled from our main index, we could add them to
an exclusion file so they are not added to the spelling index again.

Any tricks to make this possible?

-Jay




Re: Spell checking: Is there a way to exclude words known to be wrong?

2009-07-13 Thread Mark Miller
I don't think there is a way currently, but it might make a nice patch. Or
you could just implement a custom SolrSpellChecker. Both
FileBasedSpellChecker and IndexBasedSpellChecker are maybe 50 lines of code
or less each, so it would be fairly quick to plug in a custom version as a
plugin.

-- 
- Mark

http://www.lucidimagination.com

On Mon, Jul 13, 2009 at 8:27 PM, Jay Hill wrote:

> We're building a spell index from a field in our main index with the
> following configuration:
>  
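> <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
>   <str name="queryAnalyzerFieldType">textSpell</str>
>   <lst name="spellchecker">
>     <str name="name">default</str>
>     <str name="field">spell</str>
>     <str name="spellcheckIndexDir">./spellchecker</str>
>     <str name="buildOnCommit">true</str>
>   </lst>
> </searchComponent>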
>
>  
>
> This works great and rebuilds the spelling index on commits as expected.
> However, we know there are misspellings in the "spell" field of our main
> index. We could remove these from the spelling index using Luke, but
> they would be added again on commits. What we need is something similar to
> how the protwords.txt file is used, so that when we notice misspelled words
> such as "beginnning" being pulled from our main index, we could add them to
> an exclusion file so they are not added to the spelling index again.
>
> Any tricks to make this possible?
>
> -Jay
>


Spell checking: Is there a way to exclude words known to be wrong?

2009-07-13 Thread Jay Hill
We're building a spell index from a field in our main index with the
following configuration:
  
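<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">textSpell</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">spell</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>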

  

This works great and rebuilds the spelling index on commits as expected.
However, we know there are misspellings in the "spell" field of our main
index. We could remove these from the spelling index using Luke, but
they would be added again on commits. What we need is something similar to
how the protwords.txt file is used, so that when we notice misspelled words
such as "beginnning" being pulled from our main index, we could add them to
an exclusion file so they are not added to the spelling index again.

Any tricks to make this possible?

-Jay