[ https://issues.apache.org/jira/browse/LUCENE-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475072#comment-13475072 ]
Simon Willnauer commented on LUCENE-3846:
-----------------------------------------

just for the record here are my benchmark numbers for the latest branch code:

{noformat}
Test class requires enabled assertions, enable globally (-ea) or for Solr/Lucene subpackages only: org.apache.lucene.search.suggest.LookupBenchmarkTest

-- prefixes: 6-9, num: 7, onlyMorePopular: true
FuzzySuggester        queries: 50001, time[ms]: 4650 [+- 12.56], ~kQPS: 11
AnalyzingSuggester    queries: 50001, time[ms]: 444 [+- 1.89], ~kQPS: 113
JaspellLookup         queries: 50001, time[ms]: 181 [+- 0.96], ~kQPS: 275
TSTLookup             queries: 50001, time[ms]: 229 [+- 2.35], ~kQPS: 218
FSTCompletionLookup   queries: 50001, time[ms]: 245 [+- 3.54], ~kQPS: 204
WFSTCompletionLookup  queries: 50001, time[ms]: 121 [+- 1.72], ~kQPS: 413

-- prefixes: 100-200, num: 7, onlyMorePopular: true
FuzzySuggester        queries: 50001, time[ms]: 5432 [+- 20.86], ~kQPS: 9
AnalyzingSuggester    queries: 50001, time[ms]: 403 [+- 1.47], ~kQPS: 124
JaspellLookup         queries: 50001, time[ms]: 129 [+- 1.24], ~kQPS: 389
TSTLookup             queries: 50001, time[ms]: 68 [+- 4.03], ~kQPS: 739
FSTCompletionLookup   queries: 50001, time[ms]: 254 [+- 2.60], ~kQPS: 197
WFSTCompletionLookup  queries: 50001, time[ms]: 82 [+- 1.03], ~kQPS: 610

-- construction time
FuzzySuggester        input: 50001, time[ms]: 450 [+- 1.86]
AnalyzingSuggester    input: 50001, time[ms]: 449 [+- 1.82]
JaspellLookup         input: 50001, time[ms]: 40 [+- 3.80]
TSTLookup             input: 50001, time[ms]: 111 [+- 3.33]
FSTCompletionLookup   input: 50001, time[ms]: 213 [+- 4.36]
WFSTCompletionLookup  input: 50001, time[ms]: 156 [+- 2.08]

-- prefixes: 2-4, num: 7, onlyMorePopular: true
FuzzySuggester        queries: 50001, time[ms]: 3571 [+- 12.15], ~kQPS: 14
AnalyzingSuggester    queries: 50001, time[ms]: 997 [+- 5.73], ~kQPS: 50
JaspellLookup         queries: 50001, time[ms]: 494 [+- 2.25], ~kQPS: 101
TSTLookup             queries: 50001, time[ms]: 1846 [+- 9.67], ~kQPS: 27
FSTCompletionLookup   queries: 50001, time[ms]: 221 [+- 1.57], ~kQPS: 227
WFSTCompletionLookup  queries: 50001, time[ms]: 457 [+- 9.05], ~kQPS: 109

-- RAM consumption
FuzzySuggester        size[B]: 889,138
AnalyzingSuggester    size[B]: 889,138
JaspellLookup         size[B]: 9,815,128
TSTLookup             size[B]: 9,858,792
FSTCompletionLookup   size[B]: 466,520
WFSTCompletionLookup  size[B]: 507,640
{noformat}

> Fuzzy suggester
> ---------------
>
>                 Key: LUCENE-3846
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3846
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.1
>
>         Attachments: LUCENE-3846_fuzzy_analyzing.patch, LUCENE-3846.patch, LUCENE-3846.patch, LUCENE-3846.patch, LUCENE-3846.patch, LUCENE-3846.patch
>
>
> Would be nice to have a suggester that can handle some fuzziness (like spell
> correction) so that it's able to suggest completions that are "near" what you
> typed.
> As a first go at this, I implemented 1T (ie up to 1 edit, including a
> transposition), except the first letter must be correct.
> But there is a penalty, ie, the "corrected" suggestion needs to have a much
> higher freq than the "exact match" suggestion before it can compete.
> Still tons of nocommits, and somehow we should merge this / make it work with
> analyzing suggester too (LUCENE-3842).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
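
For readers who want a concrete picture of the "1T" rule in the quoted issue description (up to 1 edit, a transposition counting as one edit, first letter must be correct), here is a minimal sketch. It is only an illustration under assumptions: the class name FuzzyMatchSketch and both method names are hypothetical, it compares two strings directly, and it does not claim to reflect how the attached patches implement the matching. The frequency penalty mentioned in the description is not modeled, since the issue text does not say how large it is.

```java
// Hypothetical illustration of the "1T, first letter fixed" rule.
// Not taken from the LUCENE-3846 patches.
final class FuzzyMatchSketch {

    /** True if a and b share their first character and differ by at most one
     *  substitution, insertion, deletion, or adjacent transposition. */
    static boolean withinOneEdit(String a, String b) {
        if (a.isEmpty() || b.isEmpty() || a.charAt(0) != b.charAt(0)) {
            return false; // first letter must be correct
        }
        int la = a.length();
        int lb = b.length();
        if (Math.abs(la - lb) > 1) {
            return false; // one edit changes the length by at most one
        }
        // Skip the common prefix; i is the first position where they differ.
        int i = 0;
        while (i < la && i < lb && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        if (i == la && i == lb) {
            return true; // identical strings, zero edits
        }
        if (la == lb) {
            // One substitution at position i.
            if (a.regionMatches(i + 1, b, i + 1, la - i - 1)) {
                return true;
            }
            // One transposition of positions i and i + 1.
            return i + 1 < la
                && a.charAt(i) == b.charAt(i + 1)
                && a.charAt(i + 1) == b.charAt(i)
                && a.regionMatches(i + 2, b, i + 2, la - i - 2);
        }
        // Lengths differ by one: a single extra character at position i
        // of the longer string.
        String longer = la > lb ? a : b;
        String shorter = la > lb ? b : a;
        return longer.regionMatches(i + 1, shorter, i, shorter.length() - i);
    }

    /** True if some prefix of suggestion is within one edit of typed. */
    static boolean fuzzyPrefixMatch(String typed, String suggestion) {
        // Only prefixes whose length is within one of the typed length can match.
        for (int len = typed.length() - 1; len <= typed.length() + 1; len++) {
            if (len >= 1 && len <= suggestion.length()
                && withinOneEdit(typed, suggestion.substring(0, len))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(fuzzyPrefixMatch("lucne", "lucene search")); // transposed "ne"
        System.out.println(fuzzyPrefixMatch("xucen", "lucene search")); // wrong first letter
    }
}
```

Because the first characters must already agree, the first differing position is never 0, so no edit can touch the first letter. A suggester over 50001 inputs would not scan candidates pairwise like this; the sketch only pins down which typed prefixes count as "near".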