[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597816#action_12597816
 ] 

Otis Gospodnetic commented on SOLR-572:
---------------------------------------

Shalin,
I think onlyMorePopular and extendedResults should be optional, so that for 
plain text dictionaries this information is simply omitted when we cannot 
derive it.  Even if we take words from plain text files and index them into a 
Lucene index, their frequency will remain 1.

Does the index-time analyzer make sense?  I don't have the sources handy, but 
doesn't the Lucene SpellChecker (SC) take the input word and chop it up into 
2- and 3-grams before indexing?  If so, how would an index-time analyzer come 
into play?
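
To make the question concrete, here is a minimal sketch of the kind of n-gram 
splitting described above.  It is not the Lucene SpellChecker source; the class 
and method names (NGramSketch, ngrams) are made up for illustration, and the 
real implementation may choose gram sizes differently.

```java
import java.util.ArrayList;
import java.util.List;

public class NGramSketch {
    // Hypothetical helper: returns every contiguous substring of length n,
    // left to right. A word shorter than n yields an empty list.
    static List<String> ngrams(String word, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }

    public static void main(String[] args) {
        String word = "lucene";
        System.out.println(ngrams(word, 2)); // [lu, uc, ce, en, ne]
        System.out.println(ngrams(word, 3)); // [luc, uce, cen, ene]
    }
}
```

If the word is split this way before indexing, any analyzer would have to run 
on the whole word first, since the grams themselves are no longer words.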

In principle, if taking plain text files and indexing the words in them into a 
Lucene SC index solves problems, I think that's acceptable - such indices are 
likely to be relatively small, so they should be quick to build and not require 
a lot of memory.



> Spell Checker as a Search Component
> -----------------------------------
>
>                 Key: SOLR-572
>                 URL: https://issues.apache.org/jira/browse/SOLR-572
>             Project: Solr
>          Issue Type: New Feature
>          Components: spellchecker
>    Affects Versions: 1.3
>            Reporter: Shalin Shekhar Mangar
>             Fix For: 1.3
>
>         Attachments: SOLR-572.patch, SOLR-572.patch
>
>
> Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
> following features:
> * Allow creating a spell index on a given field and make it possible to have 
> multiple spell indices -- one for each field
> * Give suggestions on a per-field basis
> * Given a multi-word query, give only one consistent suggestion
> * Process the query with the same analyzer specified for the source field and 
> process each token separately
> * Allow the user to specify minimum length for a token (optional)
> Consistency criteria for a multi-word query can consist of the following:
> * Preserve the correct words in the original query as they are
> * Never give duplicate words in a suggestion
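
The two consistency criteria quoted above could be sketched roughly as follows. 
This is only an illustration under stated assumptions, not the code in the 
attached patch: the class name, the suggest method, the dictionary set and the 
per-token candidate lists are all hypothetical stand-ins for whatever the 
component actually uses.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ConsistentSuggestionSketch {
    // dictionary: words treated as correctly spelled (assumption).
    // candidates: hypothetical ranked corrections per misspelled token, best first.
    static String suggest(List<String> tokens, Set<String> dictionary,
                          Map<String, List<String>> candidates) {
        // Correct words are reserved first so that no correction repeats them.
        Set<String> used = new HashSet<>();
        for (String token : tokens) {
            if (dictionary.contains(token)) {
                used.add(token);
            }
        }
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            String chosen = token; // correct words pass through unchanged
            if (!dictionary.contains(token)) {
                // Take the best candidate not already present in the suggestion;
                // if none is left, the original token is kept.
                for (String c : candidates.getOrDefault(token, List.of())) {
                    if (used.add(c)) {
                        chosen = c;
                        break;
                    }
                }
            }
            out.add(chosen);
        }
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        Set<String> dict = Set.of("solr", "spell", "checker");
        Map<String, List<String>> cands = Map.of(
                "spel", List.of("spell"),
                "chekcer", List.of("checker"));
        // prints: solr spell checker
        System.out.println(suggest(List.of("solr", "spel", "chekcer"), dict, cands));
    }
}
```

Each token is handled separately, and the whole query yields a single 
suggestion string, which matches the "only one consistent suggestion" item in 
the feature list.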

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
