[ https://issues.apache.org/jira/browse/LUCENE-6585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593522#comment-14593522 ]
Adrien Grand commented on LUCENE-6585: -------------------------------------- +1 > Make ConjunctionDISI flatten sub ConjunctionDISI instances > ---------------------------------------------------------- > > Key: LUCENE-6585 > URL: https://issues.apache.org/jira/browse/LUCENE-6585 > Project: Lucene - Core > Issue Type: Improvement > Reporter: Adrien Grand > Priority: Minor > Attachments: LUCENE-6585.patch > > > Today ConjunctionDISI wraps some sub (two-phase) iterators. I would like to > improve it by flattening sub iterators when they implement ConjunctionDISI. > In practice, this would make "+A +(+B +C)" be executed more like "+A +B +C" > (only in terms of matching, scoring would not change). > My motivation for this is that if we don't flatten and are unlucky, we can > sometimes hit some worst cases. For instance consider the 3 following > postings lists (sorted by increasing cost): > A: 1, 1001, 2001, 3001, ... > C: 0, 2, 4, 6, 8, 10, 12, 14, ... > B: 1, 3, 5, 7, 9, 11, 13, 15, ... > If we run "+A +B +C", then everything works fine, we use A as a lead, and > advance B 1000 by 1000 to find the next match (if any). > However if we run "+A +(+B +C)", then we would iterate B and C 2 by 2 over > the entire doc ID space when trying to find the first match which occurs on > or after A:1. > This is an extreme example which is unlikely to happen in practice, but > flattening would also help a bit on some more common cases. For instance > imagine that A, B and C have respective costs of 100, 10 and 1000. If you > search for "+A +(+B +C)", then we will use the most costly iterator (C) to > confirm matches of B (the least costly iterator, used as a lead) while it > would have been more efficient to confirm matches of B with A first, since A > is less costly than C. -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org