Oops. Sorry. I'm hijacking my own thread to put a real Subject in place...

Bob Sandiford | Lead Software Engineer | SirsiDynix
P: 800.288.8020 X6943 | bob.sandif...@sirsidynix.com
www.sirsidynix.com

> -----Original Message-----
> From: Bob Sandiford
> Sent: Monday, April 25, 2011 5:34 PM
> To: solr-user@lucene.apache.org
> Subject:
>
> Hi, all.
>
> We're having some troubles with the Solr Spellcheck Response.  We're
> running version 3.1.
>
> Overview:  If we search for something really ugly like:
> "kljhklsdjahfkljsdhf book rck"
>
> then when we get back the response, there's a suggestions list for
> 'rck', but no suggestions list for the other two words.  For 'book',
> that's fine, because it is 'spelled correctly' (i.e. we got hits on the
> word) and there shouldn't be any suggestions.  For the ugly thing,
> though, there aren't any hits.
>
> The problem is that when we're handling the result, we can't tell the
> difference between no suggestions for a 'correctly spelled' term, and
> no suggestions for something that's odd like this.
>
> (Now - this is happening with searches that aren't as obviously garbage
> - this was just to illustrate the point).
>
> Our setup:
> We're running multiple shards, which may be part of the issue.  For
> example, 'book' might be found in one of the shards, but not another.
>
> I don't *think* this has anything to do with our schema, since it's
> really how the Search Suggestions are being returned to us.
>
> What we'd really like to see is the response coming back with an
> indication that a word wasn't found / had no suggestions.  We've hacked
> around in the code a little bit to do this, but were wondering if
> anyone has come across this, and what approaches you've taken.
>
> Here's the xml we're getting back from the search:
>
>
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
>
> <lst name="responseHeader">
>   <int name="status">0</int>
>   <int name="QTime">56</int>
>   <lst name="params">
>     <str name="spellcheck">true</str>
>     <str name="facet">true</str>
>     <str name="sort">score desc, RELEVANCE_SORT_nsort desc</str>
>     <str name="shards.qt">spellcheckedStandard</str>
>     <str name="hl.mergeContiguous">true</str>
>     <str name="facet.limit">1000</str>
>     <str name="hl">true</str>
>     <str name="fl"> ELECTRONIC_ACCESS_display ISBN_display TITLE_boost
>       FORMAT_display score MEDIA_TYPE_display AUTHOR_boost LOCALURL_display
>       UPC_display id DOC_ID_display CHILD_SITE_display DS_EC
>       PRIMARY_AUTHOR_boost PRIMARY_TITLE_boost DS_ID TOPIC_display
>       ASSET_NAME_display OCLC_display</str>
>     <str name="shards">localhost:8983/solr/SD_ILS/,localhost:8983/solr/SD_ASSET/</str>
>     <arr name="facet.field">
>       <str>AUTHOR_facet</str>
>       <str>FORMAT_facet</str>
>       <str>LANGUAGE_facet</str>
>       <str>PUBDATE_nfacet</str>
>       <str>SUBJECT_facet</str>
>       <str>ABCDEF_cfacet</str>
>     </arr>
>     <str name="qt">spellcheckedStandard</str>
>     <arr name="fq">
>       <str>ACCESS_LEVEL_nfacet:"0"</str>
>       <str>CLEARANCE_nfacet:"0"</str>
>       <str>NEED_TO_KNOWS_facet:"@@EMPTY@@"</str>
>       <str>CITIZENSHIPS_facet:"@@EMPTY@@"</str>
>       <str>RESTRICTIONS_facet:"@@EMPTY@@"</str>
>     </arr>
>     <str name="facet.mincount">1</str>
>     <str name="indent">true</str>
>     <str name="hl.fl">*</str>
>     <str name="rows">12</str>
>     <str name="hl.snippets">5</str>
>     <str name="start">0</str>
>     <str name="q">TITLE_boost:"kljhklsdjahfkljsdhf book rck"~100^200.0
>       OR PRIMARY_AUTHOR_boost:"kljhklsdjahfkljsdhf book rck"~100^100.0 OR
>       DOC_TEXT:"kljhklsdjahfkljsdhf book rck"~100^2 OR
>       PRIMARY_TITLE_boost:"kljhklsdjahfkljsdhf book rck"~100^1000.0 OR
>       AUTHOR_boost:"kljhklsdjahfkljsdhf book rck"~100^20.0 OR
>       textFuzzy:kljhklsdjahfkljsdhf~0.7 AND textFuzzy:book~0.7 AND
>       textFuzzy:rck~0.7</str>
>   </lst>
> </lst>
> <result name="response" numFound="0" start="0" maxScore="0.0"/>
> <lst name="facet_counts">
>   <lst name="facet_queries"/>
>   <lst name="facet_fields">
>     <lst name="AUTHOR_facet"/>
>     <lst name="FORMAT_facet"/>
>     <lst name="LANGUAGE_facet"/>
>     <lst name="PUBDATE_nfacet"/>
>     <lst name="SUBJECT_facet"/>
>     <lst name="ABCDEF_cfacet"/>
>   </lst>
>   <lst name="facet_dates"/>
>   <lst name="facet_ranges"/>
> </lst>
> <lst name="highlighting"/>
> <lst name="spellcheck">
>   <lst name="suggestions">
>     <lst name="rck">
>       <int name="numFound">5</int>
>       <int name="startOffset">362</int>
>       <int name="endOffset">365</int>
>       <int name="origFreq">0</int>
>       <arr name="suggestion">
>         <lst>
>           <str name="word">rock</str>
>           <int name="freq">24000</int>
>         </lst>
>         <lst>
>           <str name="word">rick</str>
>           <int name="freq">6048</int>
>         </lst>
>         <lst>
>           <str name="word">rack</str>
>           <int name="freq">84</int>
>         </lst>
>         <lst>
>           <str name="word">reck</str>
>           <int name="freq">78</int>
>         </lst>
>         <lst>
>           <str name="word">ruck</str>
>           <int name="freq">30</int>
>         </lst>
>       </arr>
>     </lst>
>     <bool name="correctlySpelled">false</bool>
>   </lst>
> </lst>
> </response>
>
>
>
> Thanks!
>
> Bob Sandiford | Lead Software Engineer | SirsiDynix
> P: 800.288.8020 X6943 | bob.sandif...@sirsidynix.com
> www.sirsidynix.com
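
To make the ambiguity in the quoted message concrete, here is a minimal Python sketch (not our production code, and not a fix). It parses a trimmed copy of the spellcheck section from the response above with the standard library and lists the query terms that have no entry under "suggestions". The function names, the SAMPLE string, and the hard-coded term list are all hypothetical and only for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the spellcheck section of the response quoted above
# (only two of the five suggestions kept, for brevity).
SAMPLE = """<response>
  <lst name="spellcheck">
    <lst name="suggestions">
      <lst name="rck">
        <int name="numFound">5</int>
        <int name="origFreq">0</int>
        <arr name="suggestion">
          <lst><str name="word">rock</str><int name="freq">24000</int></lst>
          <lst><str name="word">rick</str><int name="freq">6048</int></lst>
        </arr>
      </lst>
      <bool name="correctlySpelled">false</bool>
    </lst>
  </lst>
</response>"""


def suggestions_by_term(xml_text):
    """Map each term that has a suggestions entry to its suggested words."""
    root = ET.fromstring(xml_text)
    path = "./lst[@name='spellcheck']/lst[@name='suggestions']/lst"
    result = {}
    for entry in root.findall(path):
        words = entry.findall("./arr[@name='suggestion']/lst/str[@name='word']")
        result[entry.get("name")] = [w.text for w in words]
    return result


def terms_without_entry(query_terms, suggestions):
    """Return the query terms for which the response has no suggestions entry."""
    return [t for t in query_terms if t not in suggestions]


suggestions = suggestions_by_term(SAMPLE)
ambiguous = terms_without_entry(["kljhklsdjahfkljsdhf", "book", "rck"], suggestions)
print(suggestions)
print(ambiguous)
```

Both 'kljhklsdjahfkljsdhf' and 'book' end up in the same "no entry" list, so the client has nothing in the response to tell a correctly spelled term from one that simply had no suggestions.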