[
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598512#action_12598512
]
oleg_gnatovskiy edited comment on SOLR-572 at 5/20/08 3:39 PM:
---------------------------------------------------------------
Hey guys, I created a dictionary index from the following XML file:
<add>
  <doc>
    <field name="id">1000000000</field>
    <field name="word">pizza</field>
  </doc>
  <doc>
    <field name="id">1000000001</field>
    <field name="word">club</field>
  </doc>
  <doc>
    <field name="id">1000000002</field>
    <field name="word">bar</field>
  </doc>
</add>
My config is the following:
<searchComponent name="spellcheck"
                 class="org.apache.solr.handler.component.SpellCheckComponent">
  <lst name="dictionary">
    <str name="name">default</str>
    <str name="type">index</str>
    <str name="field">word</str>
    <!--<str name="indexDir">c:/temp/spellindex</str>-->
  </lst>
</searchComponent>
and word is defined in schema.xml as:
<field name="word" type="string" indexed="true" stored="true"
       required="false"/>
When I run a query with the following URL:
http://localhost:8983/solr/select/?q=barr&spellcheck=true&spellcheck.dictionary=default&spellcheck.count=10
I get the following response:
<lst name="spellcheck">
  <lst name="suggestions">
    <int name="numFound">1</int>
    <arr name="barr">
      <str>bar</str>
    </arr>
  </lst>
</lst>
which is what I expect.
However, with this URL:
http://localhost:8983/solr/select/?q=bar&spellcheck=true&spellcheck.dictionary=default&spellcheck.count=10
where bar is correctly spelled, I get the following:
<lst name="spellcheck">
  <lst name="suggestions">
    <int name="numFound">1</int>
    <arr name="bar">
      <str>barr</str>
    </arr>
  </lst>
</lst>
Could you please tell me where the word "barr" is coming from, and why it is
being suggested?
Thanks!
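For anyone who wants to reproduce this, here is a minimal Python sketch of the check described above. It builds the same request URL, parses a spellcheck response of the shape shown, and flags any suggested word that is not in the source dictionary. The function names (spellcheck_url, parse_suggestions, unknown_suggestions) are made up for illustration and are not part of Solr, and the response layout is assumed to match the snippets in this comment.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def spellcheck_url(base, query, dictionary="default", count=10):
    """Build a request URL with the same parameters used in this comment."""
    params = {
        "q": query,
        "spellcheck": "true",
        "spellcheck.dictionary": dictionary,
        "spellcheck.count": count,
    }
    return base + "?" + urlencode(params)


def parse_suggestions(xml_text):
    """Map each queried term to its suggested words.

    Assumes the <lst name="spellcheck"> layout quoted above; the layout of
    other patch versions may differ.
    """
    root = ET.fromstring(xml_text)
    # The spellcheck block may be the root itself or nested in a full response.
    if root.get("name") == "spellcheck":
        node = root
    else:
        node = root.find(".//lst[@name='spellcheck']")
    if node is None:
        return {}
    suggestions = node.find("lst[@name='suggestions']")
    if suggestions is None:
        return {}
    return {
        arr.get("name"): [s.text for s in arr.findall("str")]
        for arr in suggestions.findall("arr")
    }


def unknown_suggestions(suggestions, dictionary_words):
    """Keep only suggested words that are absent from the source dictionary."""
    known = set(dictionary_words)
    flagged = {}
    for term, words in suggestions.items():
        extra = [w for w in words if w not in known]
        if extra:
            flagged[term] = extra
    return flagged
```

Feeding the second response above into parse_suggestions and then into unknown_suggestions with the words pizza, club and bar reports "barr" for the term "bar", which is the unexpected result in question.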
> Spell Checker as a Search Component
> -----------------------------------
>
> Key: SOLR-572
> URL: https://issues.apache.org/jira/browse/SOLR-572
> Project: Solr
> Issue Type: New Feature
> Components: spellchecker
> Affects Versions: 1.3
> Reporter: Shalin Shekhar Mangar
> Fix For: 1.3
>
> Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch
>
>
> Expose the Lucene contrib SpellChecker as a Search Component. Provide the
> following features:
> * Allow creating a spell index on a given field and make it possible to have
> multiple spell indices -- one for each field
> * Give suggestions on a per-field basis
> * Given a multi-word query, give only one consistent suggestion
> * Process the query with the same analyzer specified for the source field and
> process each token separately
> * Allow the user to specify minimum length for a token (optional)
> Consistency criteria for a multi-word query can consist of the following:
> * Preserve the correct words in the original query as they are
> * Never give duplicate words in a suggestion
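
The consistency criteria above could be sketched roughly as follows in Python. This is not the implementation in the attached patches: difflib.get_close_matches stands in for the Lucene SpellChecker, and suggest_query, its cutoff parameter, and the fallback of keeping an unmatched token unchanged are all assumptions made for illustration.

```python
import difflib


def suggest_query(tokens, dictionary, cutoff=0.6):
    """Return one suggested word per token, forming a single consistent query.

    Correct words are preserved as they are, and a correction is never a
    word that already appears elsewhere in the suggestion.
    """
    known = set(dictionary)
    # Reserve correctly spelled tokens so corrections cannot duplicate them.
    used = {t for t in tokens if t in known}
    result = []
    for token in tokens:
        if token in known:
            result.append(token)  # preserve correct words unchanged
            continue
        # Stand-in for the spell checker: closest dictionary words first.
        candidates = difflib.get_close_matches(token, dictionary, n=5, cutoff=cutoff)
        # Take the best candidate not already used; otherwise keep the token.
        choice = next((c for c in candidates if c not in used), token)
        used.add(choice)
        result.append(choice)
    return result
```

With the dictionary pizza, club, bar, calling suggest_query(["pizza", "barr"], ["pizza", "club", "bar"]) keeps "pizza" and corrects only "barr".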