[ 
https://issues.apache.org/jira/browse/LUCENE-6308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379227#comment-14379227
 ] 

David Smiley commented on LUCENE-6308:
--------------------------------------

The pain of not using _any_ review tool (no matter how imperfect any might be) 
is quite apparent to me with these patch updates and multi-user interaction.  
Let the pain continue?  Rob; you could ignore the RB if so long as if the 
patches are here for you to do the perf testing.  I don't mean to suggest we 
should use a review tool for every JIRA issue.

bq. SpanNear: when the subSpans are initialized and one of them is null, null 
is returned for the NearSpans... .  I think SpanNear should have at least two 
subqueries, I did not expect that to break backward compatibility.  The single 
subquery case could be restored, but is it really needed? When this happens the 
caller should be aware that the slop is going to be ignored, and in the ordered 
case there should be no match at all because the subSpans should not overlap 
then.

I simply recognized it as a difference and wondered why.  If some thing 
"should" not happen (you claim SpanNear should have at least 2 sub-queries) 
then an assertion or comment to that effect would be good.  As to the 
particulars of this case, I'm not sure at the moment without further inspection.

bq. SpanNotQuery and SpanScorer: as I understood there is no need for these two 
implement TwoPhaseIterator because that is only needed when there is some form 
of conjunction. SpanScorer only takes a single Spans, and for SpanNot advance() 
on the excluded spans only is good enough.

I believe it's useful for any aggregation, even over one so that the TPI 
benefits percolate across parent/child queries in aggregate.  [~jpountz] please 
check me on this.  Having said this I don't know why TPIs aren't propagated 
more generally; maybe a TODO?.  

On to my example:  If any Query that wraps another Query (technically a Scorer 
wrapping another Scorer, or Span wrapping another Span here) fails to propagate 
the TwoPhaseIterator, then the benefit of the TPI is only localized to the 
nodes that directly support it (e.g. BooleanQuery & SpanNearQuery).  Consider 
the case of a top level BooleanQuery wrapping two MUST clauses, one clause with 
a medium-frequency term, and the other clause being a SpanNearQuery between 
some other SpanQuery derivatives which include a low frequency term.  With your 
patch's SpanScorer _not_ supporting TPI, the top level BQ won't see a TPI so it 
will drive the query by the low-cost SpanQuery which will force it to visit 
positions for documents that could have been filtered by the top level term 
query's document list.

bq. Removing Spans/intervals completely is not an option: with (nested) 
proximity queries there is no way to avoid iterating over intervals/spans. We 
can make the scoring consistent between Spans and PhraseQuery, and PhraseQuery 
with getSpans() would also be good for performance of common cases.

I could have been clearer.  I simply mean I look forward to the _dichotomy_ 
being gone or made easier.  That is, today, a SpanNearQuery only accepts 
SpanQueries, which forces us to convert other queries to them (even TermQuery), 
which is a pain.  Any way, that discussion should be left to LUCENE-2878 so I 
shouldn't have brought it up here.

> SpansEnum, deprecate Spans
> --------------------------
>
>                 Key: LUCENE-6308
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6308
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: Trunk
>            Reporter: Paul Elschot
>            Priority: Minor
>         Attachments: LUCENE-6308-changeapi.patch, 
> LUCENE-6308-changeapi.patch, LUCENE-6308-changeapi.patch, 
> LUCENE-6308-changeapi.patch, LUCENE-6308.patch, LUCENE-6308.patch, 
> LUCENE-6308.patch, LUCENE-6308.patch, LUCENE-6308.patch, LUCENE-6308.patch
>
>
> An alternative for Spans that looks more like PositionsEnum and adds two 
> phase doc id iteration



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to