[ 
https://issues.apache.org/jira/browse/LUCENE-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491667#comment-14491667
 ] 

Robert Muir commented on LUCENE-6373:
-------------------------------------

{quote}
Indeed, two phase doc id iteration for SpanOr is not simple. I think I'm 
getting there though. It needs two tests that I think I have seen before, but I 
could not find where:
One is to avoid calling an approximation.match() again when match is accepted. 
This can be done by keeping the last doc for which approximation.match() 
returned true.
The other is to distinguish between approximation+acceptance and normal 
acceptance of a matching doc. This can be done by keeping the last doc for 
which twoPhaseCurrentDocMatches returned true.
This is still cooking, and I don't expect to finish this very soon.
{quote}
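
For illustration, the first of the two tests in the quote (remembering the last
doc for which the match check returned true) could be sketched roughly as
below. This is not code from the patch: CachingMatcher and its IntPredicate
argument are hypothetical stand-ins for a two-phase iterator and its match
check, and the call counter exists only to make the effect visible.

```java
import java.util.function.IntPredicate;

/** Hypothetical sketch: remembers the last doc ID whose match check succeeded. */
public class CachingMatcher {
  // Stand-in for the expensive second-phase check (an assumption, not Lucene API).
  private final IntPredicate expensiveMatch;
  // Last doc for which the check returned true; -1 means none yet.
  private int lastMatchedDoc = -1;
  // Number of times the underlying check actually ran (illustration only).
  private int calls = 0;

  public CachingMatcher(IntPredicate expensiveMatch) {
    this.expensiveMatch = expensiveMatch;
  }

  /** Returns whether doc matches, skipping the check if doc was already accepted. */
  public boolean matches(int doc) {
    if (doc == lastMatchedDoc) {
      return true; // already verified, so do not run the check again
    }
    calls++;
    boolean matched = expensiveMatch.test(doc);
    if (matched) {
      lastMatchedDoc = doc;
    }
    return matched;
  }

  public int calls() {
    return calls;
  }
}
```

Only accepted docs are remembered in this sketch, so a rejected doc would be
checked again if asked about twice.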

I had the same problems with SpanNot, and it was too hard for me to debug them 
with the current tests.

Since LUCENE-6411, TestSpanSearchEquivalence exercises many more assertions. It 
wraps every part of the span query tree in AssertingSpanQuery, so each 
operation gets consistent checks, and bugs are easier to track down because 
the stack trace usually points to the offending code.

It also wraps the two-phase iterator and ensures you don't call matches() twice:
{code}
    @Override
    public boolean matches() throws IOException {
      if (approximation.docID() == -1
          || approximation.docID() == DocIdSetIterator.NO_MORE_DOCS) {
        throw new AssertionError(
            "matches() should not be called on doc ID " + approximation.docID());
      }
      if (lastDoc == approximation.docID()) {
        throw new AssertionError(
            "matches() has been called twice on doc ID " + approximation.docID());
      }
      // ... more checks
{code}

It also checks that you don't call nextStartPosition() if you didn't call 
matches(), or if it returned false, and so on:
{code}
  @Override
  public int nextStartPosition() throws IOException {
    assert state != State.DOC_START
        : "invalid position access, state=" + state + ": " + in;
    assert state != State.DOC_FINISHED
        : "invalid position access, state=" + state + ": " + in;
    assert state != State.DOC_UNVERIFIED
        : "invalid position access, state=" + state + ": " + in;
    // ... more checks
{code}
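
To show the idea behind these state checks in isolation, here is a minimal,
self-contained sketch. It is not Lucene's asserting wrapper: the class
CheckedPositions, its methods, and the DOC_VERIFIED state are hypothetical
simplifications, and it throws IllegalStateException instead of using assert so
that the checks fire even when assertions are disabled.

```java
/** Hypothetical, simplified state-checking wrapper; not Lucene's implementation. */
public class CheckedPositions {
  private enum State { DOC_START, DOC_UNVERIFIED, DOC_VERIFIED, DOC_FINISHED }

  private State state = State.DOC_START;

  /** Moves to a new candidate doc whose match has not been verified yet. */
  public void advanceToCandidate() {
    state = State.DOC_UNVERIFIED;
  }

  /** Records the outcome of the match check for the current candidate. */
  public void recordMatch(boolean matched) {
    if (state != State.DOC_UNVERIFIED) {
      throw new IllegalStateException("match check not expected, state=" + state);
    }
    // A rejected candidate stays unverified, so positions remain off limits.
    state = matched ? State.DOC_VERIFIED : State.DOC_UNVERIFIED;
  }

  /** Marks the iterator as exhausted. */
  public void finish() {
    state = State.DOC_FINISHED;
  }

  /** Position access is only legal once the current doc has been verified. */
  public int nextStartPosition() {
    if (state != State.DOC_VERIFIED) {
      throw new IllegalStateException("invalid position access, state=" + state);
    }
    return 0; // placeholder position; a real implementation would iterate positions
  }
}
```

Because the failure is raised at the point of misuse, the stack trace leads
directly to the caller that skipped the match check.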

> Complete two phase doc id iteration support for Spans
> -----------------------------------------------------
>
>                 Key: LUCENE-6373
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6373
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Paul Elschot
>         Attachments: LUCENE-6373-SpanOr.patch
>
>
> Spin off from LUCENE-6308, see comments there from about 23 March 2015.


