Hi James,

Did you see my comments on the issue?

On Aug 11, 2010, at 12:28 AM, Dyer, James wrote:

> Tom,
>
> I'm going to also need this to work with 1.4.1 within the next month or two
> so if someone else doesn't back-port it to 1.4.1 then I probably will. I
> also would like to see this working with shards. The PossibilityIterator
> class likely can be made a lot simpler. If nobody else takes care of these
> items I will try to find time to do so myself prior to making it work with
> 1.4.1.
>
> James Dyer
> E-Commerce Systems
> Ingram Book Company
> (615) 213-4311
>
> -----Original Message-----
> From: Tom Phethean (JIRA) [mailto:j...@apache.org]
> Sent: Tuesday, August 10, 2010 10:01 AM
> To: dev@lucene.apache.org
> Subject: [jira] Commented: (SOLR-2010) Improvements to SpellCheckComponent
> Collate functionality
>
>
> [ https://issues.apache.org/jira/browse/SOLR-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896903#action_12896903 ]
>
> Tom Phethean commented on SOLR-2010:
> ------------------------------------
>
> Ok, thanks. Do you know if there is a rough timescale on that?
>
>> Improvements to SpellCheckComponent Collate functionality
>> ---------------------------------------------------------
>>
>> Key: SOLR-2010
>> URL: https://issues.apache.org/jira/browse/SOLR-2010
>> Project: Solr
>> Issue Type: New Feature
>> Components: clients - java, spellchecker
>> Affects Versions: 1.4.1
>> Environment: Tested against trunk revision 966633
>> Reporter: James Dyer
>> Assignee: Grant Ingersoll
>> Priority: Minor
>> Attachments: SOLR-2010.patch, SOLR-2010.patch
>>
>>
>> Improvements to SpellCheckComponent Collate functionality
>> Our project requires a better Spell Check Collator. I'm contributing this
>> as a patch to get suggestions for improvements and in case there is a
>> broader need for these features.
>> 1. Only return collations that are guaranteed to result in hits if
>> re-queried (applying original fq params also).
>> This is especially helpful
>> when there is more than one correction per query. The 1.4 behavior does not
>> verify that a particular combination will actually return hits.
>> 2. Provide the option to get multiple collation suggestions
>> 3. Provide extended collation results including the # of hits re-querying
>> will return and a breakdown of each misspelled word and its correction.
>> This patch is similar to what is described in SOLR-507 item #1. Also, this
>> patch provides a viable workaround for the problem discussed in SOLR-1074.
>> A dictionary could be created that combines the terms from the multiple
>> fields. The collator then would prune out any spurious suggestions this
>> would cause.
>> This patch adds the following spellcheck parameters:
>> 1. spellcheck.maxCollationTries - maximum # of collation possibilities to
>> try before giving up. Lower values ensure better performance. Higher
>> values may be necessary to find a collation that can return results.
>> Default is 0, which maintains backwards-compatible behavior (do not check
>> collations).
>> 2. spellcheck.maxCollations - maximum # of collations to return. Default is
>> 1, which maintains backwards-compatible behavior.
>> 3. spellcheck.collateExtendedResult - if true, returns an expanded response
>> format detailing collations found. default is false, which maintains
>> backwards-compatible behavior.
>> When true, output is like this (in context):
>> <lst name="spellcheck">
>>   <lst name="suggestions">
>>     <lst name="hopq">
>>       <int name="numFound">94</int>
>>       <int name="startOffset">7</int>
>>       <int name="endOffset">11</int>
>>       <arr name="suggestion">
>>         <str>hope</str>
>>         <str>how</str>
>>         <str>hope</str>
>>         <str>chops</str>
>>         <str>hoped</str>
>>         etc
>>       </arr>
>>     </lst>
>>     <lst name="faill">
>>       <int name="numFound">100</int>
>>       <int name="startOffset">16</int>
>>       <int name="endOffset">21</int>
>>       <arr name="suggestion">
>>         <str>fall</str>
>>         <str>fails</str>
>>         <str>fail</str>
>>         <str>fill</str>
>>         <str>faith</str>
>>         <str>all</str>
>>         etc
>>       </arr>
>>     </lst>
>>     <lst name="collation">
>>       <str name="collationQuery">Title:(how AND fails)</str>
>>       <int name="hits">2</int>
>>       <lst name="misspellingsAndCorrections">
>>         <str name="hopq">how</str>
>>         <str name="faill">fails</str>
>>       </lst>
>>     </lst>
>>     <lst name="collation">
>>       <str name="collationQuery">Title:(hope AND faith)</str>
>>       <int name="hits">2</int>
>>       <lst name="misspellingsAndCorrections">
>>         <str name="hopq">hope</str>
>>         <str name="faill">faith</str>
>>       </lst>
>>     </lst>
>>     <lst name="collation">
>>       <str name="collationQuery">Title:(chops AND all)</str>
>>       <int name="hits">1</int>
>>       <lst name="misspellingsAndCorrections">
>>         <str name="hopq">chops</str>
>>         <str name="faill">all</str>
>>       </lst>
>>     </lst>
>>   </lst>
>> </lst>
>> In addition, SOLRJ is updated to include
>> SpellCheckResponse.getCollatedResults(), which will return the expanded
>> Collation format. getCollatedResult(), which returns a single String, is
>> retained for backwards-compatibility. Other APIs were not changed but will
>> still work provided that spellcheck.collateExtendedResult is false.
>> This likely will not return valid results if using Shards. Rather, a more
>> robust interaction with the index would be necessary than what exists in
>> SpellCheckCollator.collate().
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem using Solr/Lucene:
http://www.lucidimagination.com/search
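
For anyone trying the patch from a client, here is a rough sketch of how the three parameters and the extended collation format quoted above might be used together. It uses only the Python standard library. The helper names (collation_params, parse_collations) and the SAMPLE fragment are made up for illustration and are not part of the patch or of SolrJ. The parameter defaults chosen here are arbitrary, and the XML layout is assumed to match the example in the issue description.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def collation_params(query, max_tries=10, max_collations=3, extended=True):
    """Build request parameters enabling the collation options from the issue.

    Hypothetical helper: the three spellcheck.* collation names come from the
    issue description; the values passed in are arbitrary illustrations.
    """
    return {
        "q": query,
        "spellcheck": "true",
        "spellcheck.collate": "true",
        "spellcheck.maxCollationTries": str(max_tries),
        "spellcheck.maxCollations": str(max_collations),
        "spellcheck.collateExtendedResult": "true" if extended else "false",
    }


def parse_collations(xml_text):
    """Return (collationQuery, hits, corrections) for each extended collation.

    Assumes the response layout shown in the issue description.
    """
    root = ET.fromstring(xml_text)
    results = []
    for node in root.iter("lst"):
        if node.get("name") != "collation":
            continue
        query = node.find("str[@name='collationQuery']").text
        hits = int(node.find("int[@name='hits']").text)
        pairs = node.find("lst[@name='misspellingsAndCorrections']")
        corrections = {s.get("name"): s.text for s in pairs.findall("str")}
        results.append((query, hits, corrections))
    return results


# Trimmed fragment of the example response from the issue description.
SAMPLE = """
<lst name="spellcheck">
  <lst name="suggestions">
    <lst name="collation">
      <str name="collationQuery">Title:(how AND fails)</str>
      <int name="hits">2</int>
      <lst name="misspellingsAndCorrections">
        <str name="hopq">how</str>
        <str name="faill">fails</str>
      </lst>
    </lst>
  </lst>
</lst>
"""

if __name__ == "__main__":
    # Query string a client could append to its select URL.
    print(urlencode(collation_params("Title:(hopq AND faill)")))
    for query, hits, corrections in parse_collations(SAMPLE):
        print(query, hits, corrections)
```

Leaving spellcheck.maxCollationTries at its default of 0 would skip the hit check entirely, so a client that wants verified collations has to set it to a positive value, as the sketch does.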