[ https://issues.apache.org/jira/browse/LUCENE-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480428#comment-13480428 ]
Dawid Weiss commented on LUCENE-4481:
-------------------------------------

I've been following the discussion but I don't get the original problem -- could you provide an example of a token sequence (graph) for which the problem occurs? Is it in the patch somewhere (I looked at it but missed it somehow)?

> AnalyzingSuggester may fail to return correct topN suggestions
> --------------------------------------------------------------
>
>                 Key: LUCENE-4481
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4481
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.1, 5.0
>
>         Attachments: LUCENE-4481.patch, LUCENE-4481.patch, LUCENE-4481.patch,
> LUCENE-4481.patch
>
>
> I hit this when working on LUCENE-4480.
> Because AnalyzingSuggester may prune some of the topN paths found by FST's
> Util.TopNSearcher, the queue size limit of topN makes the overall search
> inadmissible, i.e. it may incorrectly prune paths that would have led to a
> competitive path.
> However, such pruning is rare: it happens only for graph token streams, and
> even then only when competitive analyzed forms share the same surface forms.
> The simplest way to fix this is to make the queue unbounded but this is
> likely a sizable performance hit ... I haven't tested yet. It's even
> possible the way the dups happen (always at the "end" of the suggestion,
> because we tack on 0 byte followed by ord dedup byte) prevents this bug from
> even occurring and so this could all be a false alarm! I have to try to make
> a test case showing it ...
> A cop-out solution would be to expose a separate queueSize or queueMultiplier
> (over the topN) so that if users are affected by this they could crank up the
> queue size or multiplier.

--
This message is automatically generated by JIRA.
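For readers who, like the commenter, want a concrete case: the sketch below is a toy model of the failure described in the issue, not Lucene code. It assumes the search has already produced complete candidates, each pairing an analyzed form with a surface form and a cost, and that the bounded queue simply keeps the cheapest `topN * queueMultiplier` of them before duplicates by surface form are dropped. The class `BoundedQueueDemo`, the `Candidate` type, the sample data, and the `suggest` method are all hypothetical; only the name `queueMultiplier` is borrowed from the workaround proposed in the issue. The real Util.TopNSearcher prunes partial paths during the search, which this model does not attempt to reproduce.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class BoundedQueueDemo {

    /** Hypothetical stand-in for one completed path (lower cost is better). */
    static final class Candidate {
        final String analyzed;
        final String surface;
        final int cost;

        Candidate(String analyzed, String surface, int cost) {
            this.analyzed = analyzed;
            this.surface = surface;
            this.cost = cost;
        }
    }

    /** Two analyzed forms share one surface form; a third is distinct. */
    static List<Candidate> sampleCandidates() {
        return List.of(
            new Candidate("wifi", "wifi network", 1),
            new Candidate("wi fi", "wifi network", 2),
            new Candidate("wireless", "wireless router", 3));
    }

    /**
     * Keeps the cheapest topN * queueMultiplier candidates (the "bounded
     * queue"), then drops duplicate surface forms and returns up to topN.
     */
    static List<String> suggest(List<Candidate> candidates, int topN, int queueMultiplier) {
        int queueSize = topN * queueMultiplier;

        List<Candidate> sorted = new ArrayList<>(candidates);
        sorted.sort((a, b) -> Integer.compare(a.cost, b.cost));

        // The bound is applied BEFORE deduplication: this is the assumption
        // that makes the result incomplete when duplicates fill the queue.
        List<Candidate> kept = sorted.subList(0, Math.min(queueSize, sorted.size()));

        Set<String> surfaces = new LinkedHashSet<>();
        for (Candidate c : kept) {
            if (surfaces.size() == topN) {
                break;
            }
            surfaces.add(c.surface); // duplicates collapse here
        }
        return new ArrayList<>(surfaces);
    }

    public static void main(String[] args) {
        List<Candidate> candidates = sampleCandidates();
        // Queue bounded at topN: both slots go to the same surface form.
        System.out.println(suggest(candidates, 2, 1)); // [wifi network]
        // Queue twice as large: the second distinct suggestion survives.
        System.out.println(suggest(candidates, 2, 2)); // [wifi network, wireless router]
    }
}
```

In this model a multiplier of 1 returns a single suggestion even though two distinct ones exist, because the duplicate occupied the second queue slot. Raising the multiplier trades memory and time for a lower chance of losing a competitive path, which is the trade-off the issue calls a cop-out; whether the real suggester can hit this case at all is exactly what the issue leaves open.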