[jira] Updated: (SOLR-572) Spell Checker as a Search Component

Grant Ingersoll (JIRA) Wed, 28 May 2008 05:46:08 -0700

     [ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Grant Ingersoll updated SOLR-572:
---------------------------------

    Attachment: SOLR-572.patch

OK, here's a start on the token stuff.  

*_NOTE: This patch does not work yet! The tests do not pass and I haven't 
fully implemented the SpellingQueryConverter. I have a few other things to 
attend to over the next couple of days, so I wanted to post this as a starting 
point for others to look at and comment on the approach before I get back to 
it (but feel free to take it up, too)._*

The basic gist of it is to hand off analysis to a pluggable piece called the 
SpellingQueryConverter, which produces a collection of Tokens (which contain 
offsets into the original query String).
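
To make the idea concrete, here is a minimal sketch of a converter that turns a 
raw query String into tokens carrying offsets. It is not the code in the patch: 
SpellToken and WhitespaceSpellingConverter are hypothetical stand-ins, and a 
plain whitespace split is assumed in place of the source field's Analyzer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for a token: term text plus offsets into the query. */
final class SpellToken {
    final String text;
    final int startOffset; // inclusive index into the original query
    final int endOffset;   // exclusive index into the original query

    SpellToken(String text, int startOffset, int endOffset) {
        this.text = text;
        this.startOffset = startOffset;
        this.endOffset = endOffset;
    }
}

/** Hypothetical converter; assumes whitespace splitting instead of a real Analyzer. */
final class WhitespaceSpellingConverter {
    private static final Pattern WORD = Pattern.compile("\\S+");

    /** Produce one token per whitespace-delimited word, recording its offsets. */
    static List<SpellToken> convert(String query) {
        List<SpellToken> tokens = new ArrayList<>();
        Matcher m = WORD.matcher(query);
        while (m.find()) {
            tokens.add(new SpellToken(m.group(), m.start(), m.end()));
        }
        return tokens;
    }

    /** Use a token's offsets to splice a suggestion back into the original query. */
    static String replace(String query, SpellToken token, String suggestion) {
        return query.substring(0, token.startOffset)
                + suggestion
                + query.substring(token.endOffset);
    }

    public static void main(String[] args) {
        String query = "solr serch";
        for (SpellToken t : convert(query)) {
            System.out.println(t.text + " [" + t.startOffset + "," + t.endOffset + ")");
        }
    }
}
```

The offsets are what let a caller rewrite only the misspelled span and leave 
the rest of the user's query untouched.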

I'm still torn on how best to achieve this.  In some sense, there has to be 
some interaction with some form of a Query Parser.  I think it needs to be a 
Query Parser that uses the source field's Analyzer for the parsing.  This way, 
the output Query is properly analyzed and we can then extract just the 
"spellcheckable" terms from it (e.g. those in a TermQuery, PhraseQuery, 
????)
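
As a rough illustration of the extraction step, the sketch below walks a parsed 
query tree and collects only the terms from node types treated as 
spellcheckable. The node classes (TermNode, PhraseNode, BooleanNode, RangeNode) 
are simplified, hypothetical stand-ins and not Lucene's Query classes; which 
types should count as spellcheckable is exactly the open question above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified query tree; not Lucene's Query hierarchy. */
interface QueryNode {}

final class TermNode implements QueryNode {
    final String term;
    TermNode(String term) { this.term = term; }
}

final class PhraseNode implements QueryNode {
    final List<String> terms;
    PhraseNode(String... terms) { this.terms = Arrays.asList(terms); }
}

final class RangeNode implements QueryNode {
    final String lower;
    final String upper;
    RangeNode(String lower, String upper) { this.lower = lower; this.upper = upper; }
}

final class BooleanNode implements QueryNode {
    final List<QueryNode> clauses;
    BooleanNode(QueryNode... clauses) { this.clauses = Arrays.asList(clauses); }
}

final class SpellcheckableTerms {
    /** Collect terms from term and phrase nodes, in query order. */
    static List<String> extract(QueryNode root) {
        List<String> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    private static void collect(QueryNode node, List<String> out) {
        if (node instanceof TermNode) {
            out.add(((TermNode) node).term);
        } else if (node instanceof PhraseNode) {
            out.addAll(((PhraseNode) node).terms);
        } else if (node instanceof BooleanNode) {
            // Recurse into each clause of a compound query.
            for (QueryNode clause : ((BooleanNode) node).clauses) {
                collect(clause, out);
            }
        }
        // Any other node type (here: RangeNode) is assumed not spellcheckable and skipped.
    }

    public static void main(String[] args) {
        QueryNode query = new BooleanNode(
                new TermNode("solr"),
                new PhraseNode("spell", "chekcer"),
                new RangeNode("a", "z"));
        System.out.println(extract(query));
    }
}
```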

Does this make sense?

> Spell Checker as a Search Component
> -----------------------------------
>
>                 Key: SOLR-572
>                 URL: https://issues.apache.org/jira/browse/SOLR-572
>             Project: Solr
>          Issue Type: New Feature
>          Components: spellchecker
>    Affects Versions: 1.3
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 1.3
>
>         Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch
>
>
> Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
> following features:
> * Allow creating a spell index on a given field and make it possible to have 
> multiple spell indices -- one for each field
> * Give suggestions on a per-field basis
> * Given a multi-word query, give only one consistent suggestion
> * Process the query with the same analyzer specified for the source field and 
> process each token separately
> * Allow the user to specify minimum length for a token (optional)
> Consistency criteria for a multi-word query can consist of the following:
> * Preserve the correct words in the original query as they are
> * Never give duplicate words in a suggestion

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
