[ 
https://issues.apache.org/jira/browse/SOLR-81?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12470295
 ] 

Otis Gospodnetic commented on SOLR-81:
--------------------------------------

Adam,

I took a look at your patch.  It looks like you copied over the various 
n-gram tokenizer classes and their unit tests that I put in Lucene's 
contrib/analyzers/....  Was that on purpose?  I intentionally put 
those n-gram tokenizers under Lucene's contrib because they are generic and 
not Solr-specific.  Thus, my patch contains only the classes that are 
Solr-specific:

src/java/org/apache/solr/analysis/EdgeNGramTokenizerFactory.java
src/java/org/apache/solr/analysis/NGramTokenizerFactory.java
src/java/org/apache/solr/analysis/BaseTokenizerFactory.java

Instead of copying the source classes from Lucene's contrib/analyzers/...., 
my patch adds the new jar built from those sources:
lib/lucene-analyzers-2.1-dev.jar

Plus:
lib/lucene-spellchecker-2.1-dev.jar
example/solr/conf/schema.xml

I have some locally modified code for this issue that was not part of the 
first patch.  I was going to attach the updated patch, assuming you didn't 
really want those few generic tokenizer classes copied from Lucene over to 
Solr.  But because changes are now in two places, so to speak, let's do the 
following to unify our work:

Could you please:
- open a new LUCENE issue, or just reopen the one where I originally attached 
this code, and post your patch to the Lucene tokenizers there.
- prepare a new patch for this issue and make sure it contains only the 
Solr-specific classes (see above), plus those two jars.

I'll upload my patch for schema.xml so you can see my config (your patch 
didn't include it) and make sure your code changes are in sync with it.

Finally, are you making use of this code somehow already?
One thing that is completely missing from this patch is the RequestHandler 
that knows how to take the input (a query string) and get suggestions for 
alternative spellings via a SpellChecker instance.  I have some 
NGramRequestHandler code locally, but it is unfinished.


> Add Query Spellchecker functionality
> ------------------------------------
>
>                 Key: SOLR-81
>                 URL: https://issues.apache.org/jira/browse/SOLR-81
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>            Reporter: Otis Gospodnetic
>            Priority: Minor
>         Attachments: SOLR-81-edgengram-ngram.patch, SOLR-81-ngram.patch
>
>
> Use the simple approach of n-gramming outside of Solr and indexing n-gram 
> documents.  For example:
> <doc>
> <field name="word">lettuce</field>
> <field name="start3">let</field>
> <field name="gram3">let ett ttu tuc uce</field>
> <field name="end3">uce</field>
> <field name="start4">lett</field>
> <field name="gram4">lett ettu ttuc tuce</field>
> <field name="end4">tuce</field>
> </doc>
> See:
> http://www.mail-archive.com/solr-user@lucene.apache.org/msg01254.html
> Java clients: SOLR-20 (add delete commit optimize), SOLR-30 (search)
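
The n-gram document in the issue description above can be produced by a few 
lines of client-side code.  What follows is only a minimal sketch of that 
"n-gram outside of Solr" approach, not code from either patch: the class name 
NGramFields and the methods grams() and fields() are hypothetical, and only 
the field names (word, startN, gramN, endN) come from the example.  It builds 
field values only and does not do any XML escaping or posting to Solr.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any SOLR-81 patch.
class NGramFields {

    // Returns every contiguous substring of length n, left to right.
    static List<String> grams(String word, int n) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + n <= word.length(); i++) {
            out.add(word.substring(i, i + n));
        }
        return out;
    }

    // Builds the field name -> value pairs for one word, using the
    // naming scheme from the issue description (start3, gram3, end3, ...).
    static Map<String, String> fields(String word, int minN, int maxN) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("word", word);
        for (int n = minN; n <= maxN; n++) {
            List<String> g = grams(word, n);
            if (g.isEmpty()) {
                continue; // word is shorter than n, so there is nothing to index
            }
            doc.put("start" + n, g.get(0));
            doc.put("gram" + n, String.join(" ", g));
            doc.put("end" + n, g.get(g.size() - 1));
        }
        return doc;
    }

    public static void main(String[] args) {
        // Prints one name=value line per field for the word "lettuce".
        fields("lettuce", 3, 4)
            .forEach((name, value) -> System.out.println(name + "=" + value));
    }
}
```

For "lettuce" with n from 3 to 4 this yields the same seven fields as the 
<doc> example above.  Words shorter than n simply get no fields for that n.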
