Oops.  Sorry.  I'm hijacking my own thread to put a real Subject in place...

Bob Sandiford | Lead Software Engineer | SirsiDynix
P: 800.288.8020 X6943 | bob.sandif...@sirsidynix.com
www.sirsidynix.com 


> -----Original Message-----
> From: Bob Sandiford
> Sent: Monday, April 25, 2011 5:34 PM
> To: solr-user@lucene.apache.org
> Subject:
> 
> Hi, all.
> 
> We're having some trouble with the Solr spellcheck response.  We're
> running version 3.1.
> 
> Overview:  If we search for something really ugly like:
> "kljhklsdjahfkljsdhf book rck"
> 
> then when we get back the response, there's a suggestions list for
> 'rck', but no suggestions list for the other two words.  For 'book',
> that's fine, because it is 'spelled correctly' (i.e. we got hits on the
> word) and there shouldn't be any suggestions.  For the ugly thing,
> though, there aren't any hits, yet it gets no suggestions list either.
> 
> The problem is that when we're handling the result, we can't tell the
> difference between no suggestions for a 'correctly spelled' term, and
> no suggestions for something that's odd like this.
> 
> (Now - this is happening with searches that aren't as obviously garbage
> - this was just to illustrate the point).
> 
> Our setup:
> We're running multiple shards, which may be part of the issue.  For
> example, 'book' might be found in one of the shards, but not another.
> 
> I don't *think* this has anything to do with our schema, since it's
> really how the Search Suggestions are being returned to us.
> 
> What we'd really like to see is the response coming back with an
> indication that a word wasn't found / had no suggestions.  We've hacked
> around in the code a little bit to do this, but were wondering if
> anyone has come across this, and what approaches you've taken.
> 
> Here's the XML we're getting back from the search:
> 
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
> 
> <lst name="responseHeader">
>   <int name="status">0</int>
>   <int name="QTime">56</int>
>   <lst name="params">
>     <str name="spellcheck">true</str>
>     <str name="facet">true</str>
>     <str name="sort">score desc, RELEVANCE_SORT_nsort desc</str>
>     <str name="shards.qt">spellcheckedStandard</str>
>     <str name="hl.mergeContiguous">true</str>
>     <str name="facet.limit">1000</str>
>     <str name="hl">true</str>
>     <str name="fl"> ELECTRONIC_ACCESS_display ISBN_display TITLE_boost
> FORMAT_display score MEDIA_TYPE_display AUTHOR_boost LOCALURL_display
> UPC_display id DOC_ID_display CHILD_SITE_display DS_EC
> PRIMARY_AUTHOR_boost PRIMARY_TITLE_boost DS_ID TOPIC_display
> ASSET_NAME_display OCLC_display</str>
>     <str name="shards">localhost:8983/solr/SD_ILS/,localhost:8983/solr/SD_ASSET/</str>
>     <arr name="facet.field">
>       <str>AUTHOR_facet</str>
>       <str>FORMAT_facet</str>
>       <str>LANGUAGE_facet</str>
>       <str>PUBDATE_nfacet</str>
>       <str>SUBJECT_facet</str>
>       <str>ABCDEF_cfacet</str>
>     </arr>
>     <str name="qt">spellcheckedStandard</str>
>     <arr name="fq">
>       <str>ACCESS_LEVEL_nfacet:"0"</str>
>       <str>CLEARANCE_nfacet:"0"</str>
>       <str>NEED_TO_KNOWS_facet:"@@EMPTY@@"</str>
>       <str>CITIZENSHIPS_facet:"@@EMPTY@@"</str>
>       <str>RESTRICTIONS_facet:"@@EMPTY@@"</str>
>     </arr>
>     <str name="facet.mincount">1</str>
>     <str name="indent">true</str>
>     <str name="hl.fl">*</str>
>     <str name="rows">12</str>
>     <str name="hl.snippets">5</str>
>     <str name="start">0</str>
>     <str name="q">TITLE_boost:"kljhklsdjahfkljsdhf book rck"~100^200.0
> OR PRIMARY_AUTHOR_boost:"kljhklsdjahfkljsdhf book rck"~100^100.0 OR
> DOC_TEXT:"kljhklsdjahfkljsdhf book rck"~100^2 OR
> PRIMARY_TITLE_boost:"kljhklsdjahfkljsdhf book rck"~100^1000.0 OR
> AUTHOR_boost:"kljhklsdjahfkljsdhf book rck"~100^20.0 OR
> textFuzzy:kljhklsdjahfkljsdhf~0.7 AND textFuzzy:book~0.7 AND
> textFuzzy:rck~0.7</str>
>   </lst>
> </lst>
> <result name="response" numFound="0" start="0" maxScore="0.0"/>
> <lst name="facet_counts">
>   <lst name="facet_queries"/>
>   <lst name="facet_fields">
>     <lst name="AUTHOR_facet"/>
>     <lst name="FORMAT_facet"/>
>     <lst name="LANGUAGE_facet"/>
>     <lst name="PUBDATE_nfacet"/>
>     <lst name="SUBJECT_facet"/>
>     <lst name="ABCDEF_cfacet"/>
>   </lst>
>   <lst name="facet_dates"/>
>   <lst name="facet_ranges"/>
> </lst>
> <lst name="highlighting"/>
> <lst name="spellcheck">
>   <lst name="suggestions">
>     <lst name="rck">
>       <int name="numFound">5</int>
>       <int name="startOffset">362</int>
>       <int name="endOffset">365</int>
>       <int name="origFreq">0</int>
>       <arr name="suggestion">
>         <lst>
>           <str name="word">rock</str>
>           <int name="freq">24000</int>
>         </lst>
>         <lst>
>           <str name="word">rick</str>
>           <int name="freq">6048</int>
>         </lst>
>         <lst>
>           <str name="word">rack</str>
>           <int name="freq">84</int>
>         </lst>
>         <lst>
>           <str name="word">reck</str>
>           <int name="freq">78</int>
>         </lst>
>         <lst>
>           <str name="word">ruck</str>
>           <int name="freq">30</int>
>         </lst>
>       </arr>
>     </lst>
>     <bool name="correctlySpelled">false</bool>
>   </lst>
> </lst>
> </response>
> 
> 
> 
> Thanks!
> 
> Bob Sandiford | Lead Software Engineer | SirsiDynix
> P: 800.288.8020 X6943 | bob.sandif...@sirsidynix.com
> www.sirsidynix.com
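
To make the ambiguity concrete, here is a minimal client-side sketch in Python. It is not part of the code we hacked on and not anything Solr provides. It assumes the response layout shown in the XML above. The names `suggested_terms`, `split_terms` and the trimmed `SAMPLE` response are made up for illustration. It only separates terms that came back with a suggestions list from terms that did not. It cannot tell a correctly spelled term from an unknown one, which is exactly the problem.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical copy of the spellcheck part of the response above
# (only two of the suggestions are kept to keep the sample short).
SAMPLE = """<response>
<lst name="spellcheck">
  <lst name="suggestions">
    <lst name="rck">
      <int name="origFreq">0</int>
      <arr name="suggestion">
        <lst><str name="word">rock</str><int name="freq">24000</int></lst>
        <lst><str name="word">rick</str><int name="freq">6048</int></lst>
      </arr>
    </lst>
    <bool name="correctlySpelled">false</bool>
  </lst>
</lst>
</response>"""


def suggested_terms(xml_text):
    """Map each term that has a suggestions entry to its suggested words."""
    root = ET.fromstring(xml_text)
    entries = root.findall(
        "./lst[@name='spellcheck']/lst[@name='suggestions']/lst"
    )
    result = {}
    for entry in entries:
        words = entry.findall("./arr[@name='suggestion']/lst/str[@name='word']")
        result[entry.get("name")] = [w.text for w in words]
    return result


def split_terms(xml_text, query_terms):
    """Split query terms into those with suggestions and those without.

    Terms in the second list are ambiguous: the response alone does not
    say whether they were found in the index or simply had no suggestions.
    """
    found = suggested_terms(xml_text)
    with_suggestions = {t: found[t] for t in query_terms if t in found}
    ambiguous = [t for t in query_terms if t not in found]
    return with_suggestions, ambiguous


if __name__ == "__main__":
    terms = ["kljhklsdjahfkljsdhf", "book", "rck"]
    with_suggestions, ambiguous = split_terms(SAMPLE, terms)
    print("suggestions:", with_suggestions)
    print("no entry (correct or unknown?):", ambiguous)
```

With the sample above, 'book' and the garbage term land in the same "no entry" bucket, so the client has nothing to go on without some extra signal in the response.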

