I dug into this a bit further and found that it happens with the stock Lucene query as well, in 4.10.2. I looked at the code on trunk, and I don't think the situation is different there. Basically, if you delete a parent document, orphaning one or more child documents, and then merge, the ToParentBlockJoinQuery (TPBJQ) fails with an exception if the orphaned docs are matched. I find this pretty surprising, and it would not be hard to do something more expected: the orphaned children can simply be skipped by testing for this condition (instead of just asserting the contrary, as we do now).
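
To make the idea concrete, here is a rough sketch of the skip, not the actual Lucene code. It uses java.util.BitSet as a stand-in for the parent bitset, and the names OrphanSkipSketch and joinToParents are made up for illustration. The assumption is that matched children whose parent was deleted and merged away sit at the end of the segment, so nextSetBit finds no parent after them and returns -1.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy model only: doc ids are plain ints and parentBits marks which ids are parents.
class OrphanSkipSketch {

    // Maps each matched child to the next parent at or after it,
    // skipping children that have no parent instead of asserting.
    static List<Integer> joinToParents(BitSet parentBits, int[] matchedChildren) {
        List<Integer> parents = new ArrayList<>();
        int lastParent = -1;
        for (int child : matchedChildren) {
            int parentDoc = parentBits.nextSetBit(child);
            if (parentDoc == -1) {
                // Orphaned child: no parent follows it, so skip it.
                continue;
            }
            if (parentDoc != lastParent) {
                // First child seen for this parent: emit the parent once.
                parents.add(parentDoc);
                lastParent = parentDoc;
            }
        }
        return parents;
    }

    public static void main(String[] args) {
        BitSet parentBits = new BitSet();
        parentBits.set(2); // docs 0 and 1 are children of parent 2
        // Docs 3 and 4 stand for children whose parent was deleted and merged away.
        int[] matchedChildren = {0, 1, 3, 4};
        System.out.println(joinToParents(parentBits, matchedChildren));
    }
}
```

In this toy model the two trailing orphans are dropped and only parent 2 is returned. Orphans that are followed by some other parent's block would not be caught by a -1 check alone, so this covers only the case that trips the assertion.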

If some committer speaks up and agrees, I'll at least open an issue.

-Mike

On 2/5/2015 12:02 PM, Michael Sokolov wrote:
I've run into an exception, and I'm trying to understand whether it is something that can just happen if the index doesn't conform to the expectations of the TPBJQ, or if I've somehow messed things up in my extension of that query.

The exception I'm seeing is in BlockJoinScorer.nextDoc(). It's clear that the assertion below is being contravened: no parentDoc is found for some child doc.

        // Gather all children sharing the same parent as
        // nextChildDoc

        parentDoc = parentBits.nextSetBit(nextChildDoc);
        assert parentDoc != -1;

        //System.out.println("  nextChildDoc=" + nextChildDoc);
        if (
            // parentDoc = -1 shouldn't happen, but it did.  I'm not sure
            // if this is a consequence of our allowing parents to be a
            // child -- I don't think so -- it seems more likely the index
            // can just get in a state where there are children with no
            // parent, and that could cause this?
            parentDoc == -1 ||
            (acceptDocs != null && !acceptDocs.get(parentDoc))
            ) {
          // Parent doc not accepted; skip child docs until
          // we hit a new parent doc:
          do {
            nextChildDoc = childScorer.nextDoc();
          } while (nextChildDoc <= parentDoc);

What I'm wondering is why we believe that assertion? Is there something that guarantees the state of the index beyond the user having indexed their documents correctly?

I'm concerned that a change I made to the query may be causing this, but I can't see how. What I did is to allow a parent doc to also be a child doc, and I also passed acceptDocs when creating the childScorer, so that child docs are filtered by the prevailing filter, as well as parent docs.

Any pointers or ideas are welcome! Thanks.

-Mike

