same

On 22.03.2012 10:00, Markus Jelsma wrote:
Can you try spellcheck.q?


On Thu, 22 Mar 2012 09:57:19 +0100, tom <dev.tom.men...@gmx.net> wrote:
hi folks,

i think i found a bug in the spellchecker but am not quite sure:
this is the query i send to solr:

http://lh:8983/solr/CompleteIndex/select?
&rows=0
&echoParams=all
&spellcheck=true
&spellcheck.onlyMorePopular=true
&spellcheck.extendedResults=no
&q=a+bb+ccc++dddd
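
For anyone who wants to reproduce this, the same request can be assembled programmatically. The following is only a minimal Python sketch, assuming the host `lh` and core `CompleteIndex` from the URL above; `build_spellcheck_url` and its `spellcheck_q` argument are hypothetical names used for illustration, with the latter mapping to the `spellcheck.q` parameter suggested in the reply.

```python
from urllib.parse import urlencode

# Host and core are taken from the query above; adjust for your own setup.
BASE = "http://lh:8983/solr/CompleteIndex/select"

def build_spellcheck_url(q, spellcheck_q=None, base=BASE):
    """Build the select URL with the spellcheck parameters used in this post.

    Hypothetical helper: if spellcheck_q is given, it is sent as spellcheck.q.
    """
    params = [
        ("rows", "0"),
        ("echoParams", "all"),
        ("spellcheck", "true"),
        ("spellcheck.onlyMorePopular", "true"),
        ("spellcheck.extendedResults", "no"),
        ("q", q),
    ]
    if spellcheck_q is not None:
        params.append(("spellcheck.q", spellcheck_q))
    # urlencode encodes spaces as '+', so the double space becomes '++'.
    return base + "?" + urlencode(params)

url = build_spellcheck_url("a bb ccc  dddd")
url_with_sq = build_spellcheck_url("a bb ccc  dddd", spellcheck_q="a bb ccc  dddd")
print(url)
print(url_with_sq)
```

Note that the query string contains two spaces between `ccc` and `dddd`, which is what produces `ccc++dddd` in the URL.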

and this is the result:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">4</int>
<lst name="params">
<str name="echoParams">all</str>
<str name="spellcheck">true</str>
<str name="echoParams">all</str>
<str name="spellcheck.extendedResults">no</str>
<str name="q">a bb ccc dddd</str>
<str name="rows">0</str>
<str name="spellcheck.onlyMorePopular">true</str>
</lst>
</lst>
<result name="response" numFound="43" start="0" />
<lst name="spellcheck">
<lst name="suggestions">
<lst name="bb">
<int name="numFound">1</int>
<int name="startOffset">2</int>
<int name="endOffset">4</int>
<arr name="suggestion">
<str>abb</str>
</arr>
</lst>
<lst name="cccc1">
<int name="numFound">1</int>
<int name="startOffset">5</int>
<int name="endOffset">8</int>
<arr name="suggestion">
<str>ccc</str>
</arr>
</lst>
<lst name="cccc2">
<int name="numFound">1</int>
<int name="startOffset">5</int>
<int name="endOffset">8</int>
<arr name="suggestion">
<str>ccc</str>
</arr>
</lst>
<lst name="dddd">
<int name="numFound">1</int>
<int name="startOffset">10</int>
<int name="endOffset">14</int>
<arr name="suggestion">
<str>dvd</str>
</arr>
</lst>
</lst>
</lst>
</response>
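
To make the oddity easier to spot, each suggestion key can be compared with the slice of the query that its offsets point to. This is a rough Python sketch, not part of Solr; `check_suggestions` is a hypothetical helper, and `SAMPLE` is a trimmed copy of the response above with only two of the entries.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the response above (only the bb and cccc1 entries).
SAMPLE = """<response>
<lst name="spellcheck">
<lst name="suggestions">
<lst name="bb">
<int name="numFound">1</int>
<int name="startOffset">2</int>
<int name="endOffset">4</int>
<arr name="suggestion"><str>abb</str></arr>
</lst>
<lst name="cccc1">
<int name="numFound">1</int>
<int name="startOffset">5</int>
<int name="endOffset">8</int>
<arr name="suggestion"><str>ccc</str></arr>
</lst>
</lst>
</lst>
</response>"""

# The query as sent, with two spaces before dddd.
QUERY = "a bb ccc  dddd"

def check_suggestions(xml_text, query):
    """Return (key, query_slice, matches) for each suggestion entry."""
    root = ET.fromstring(xml_text)
    path = "./lst[@name='spellcheck']/lst[@name='suggestions']/lst"
    rows = []
    for entry in root.findall(path):
        key = entry.get("name")
        ints = {i.get("name"): int(i.text) for i in entry.findall("int")}
        # Slice of the query that the reported offsets cover.
        span = query[ints["startOffset"]:ints["endOffset"]]
        rows.append((key, span, key == span))
    return rows

for key, span, ok in check_suggestions(SAMPLE, QUERY):
    print(key, repr(span), "ok" if ok else "MISMATCH")
```

For the sample above, `bb` matches its slice of the query, while `cccc1` covers the slice `ccc` and is flagged as a mismatch.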

now, i know this is just a technical query that i ran for a test
regarding suggestions; i discovered the oddity by chance, and it is
unrelated to that test.
my question is how the suggestions cccc1 and cccc2 come about. from
what i understand from the wiki, the entries in
spellcheck/suggestions should only be (misspelled) substrings of the
user query.

the setup/context is thus:
- the word ccc exists 11 times in the index, but cccc1 and cccc2 don't


http://lh:8983/solr/CompleteIndex/terms?terms=on&terms.fl=spell&terms.prefix=ccc&terms.mincount=0


<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1</int>
</lst>
<lst name="terms">
<lst name="spell">
<int name="ccc">11</int>
</lst>
</lst>
</response>
- the analyzer for the spellchecker yields the terms as entered, i.e.
a|bb|ccc|dddd
- the config is thus

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">

<str name="queryAnalyzerFieldType">textSpell</str>

<lst name="spellchecker">
<str name="name">default</str>
<str name="field">spell</str>
<str name="spellcheckIndexDir">./spellchecker</str>
</lst>
</searchComponent>


does anyone have a clue what's going on?


