Trejkaz created LUCENE-6865:
-------------------------------

             Summary: BooleanQuery2ModifierNodeProcessor breaks the query node 
hierarchy
                 Key: LUCENE-6865
                 URL: https://issues.apache.org/jira/browse/LUCENE-6865
             Project: Lucene - Core
          Issue Type: Bug
            Reporter: Trejkaz


We discovered that one of our own implementations of QueryNodeProcessor was 
seeing node.getParent() returning null for nodes other than the root of the 
query tree.

I put a diagnostic processor around every processor which runs and found that 
BooleanQuery2ModifierNodeProcessor (and possibly others, although it isn't 
clear) are mysteriously setting some of the node references to null.

Example query tree before:

{noformat}
GroupQueryNode, parent = null
  WithinQueryNode, parent = GroupQueryNode
    QuotedFieldQueryNode, parent = WithinQueryNode
    GroupQueryNode, parent = WithinQueryNode
      AndQueryNode, parent = GroupQueryNode
        GroupQueryNode, parent = AndQueryNode
          OrQueryNode, parent = GroupQueryNode
            QuotedFieldQueryNode, parent = OrQueryNode
            QuotedFieldQueryNode, parent = OrQueryNode
        GroupQueryNode, parent = AndQueryNode
          OrQueryNode, parent = GroupQueryNode
            QuotedFieldQueryNode, parent = OrQueryNode
            QuotedFieldQueryNode, parent = OrQueryNode
{noformat}

And after BooleanQuery2ModifierNodeProcessor.process():

{noformat}
GroupQueryNode, parent = null
  WithinQueryNode, parent = GroupQueryNode
    QuotedFieldQueryNode, parent = WithinQueryNode
    GroupQueryNode, parent = WithinQueryNode
      AndQueryNode, parent = GroupQueryNode
        BooleanModifierNode, parent = AndQueryNode
          GroupQueryNode, parent = null
            OrQueryNode, parent = GroupQueryNode
              QuotedFieldQueryNode, parent = OrQueryNode
              QuotedFieldQueryNode, parent = OrQueryNode
        BooleanModifierNode, parent = AndQueryNode
          GroupQueryNode, parent = null
            OrQueryNode, parent = GroupQueryNode
              QuotedFieldQueryNode, parent = OrQueryNode
              QuotedFieldQueryNode, parent = OrQueryNode
{noformat}

Looking at QueryNodeImpl, there is a lot of fiddly logic in there. Removing 
children can trigger setting the parent to null, but setting the parent can 
also trigger the child removing itself, so it's near impossible to figure out 
why this could be happening, but I'm closing in on it at least. My initial 
suspicion is that cloneTree() is responsible, because ironically the number of 
failures of this sort _increase_ if I try to use cloneTree to defend against 
mutability bugs.

The fix I have come up with is to clone the whole API but making QueryNode 
immutable. This removes the ability for processors to mess with nodes that 
don't belong to them, but also obviates the need for a parent reference in the 
first place, which I think is the entire source of the problem - keeping the 
parent and child in sync correctly is obviously going to be hard, and indeed we 
find that there is at least one bug of this sort lurking in there.

But even if we rewrite it, I figured I would report the issue so that maybe it 
can be fixed for others.


Code to use for diagnostics:

{code}
import java.util.List;

import org.apache.lucene.queryparser.flexible.core.QueryNodeException;
import org.apache.lucene.queryparser.flexible.core.config.QueryConfigHandler;
import org.apache.lucene.queryparser.flexible.core.nodes.QueryNode;
import 
org.apache.lucene.queryparser.flexible.core.processors.QueryNodeProcessor;

public class DiagnosticQueryNodeProcessor implements QueryNodeProcessor
{
    private final QueryNodeProcessor delegate;

    public TreeFixingQueryNodeProcessor(QueryNodeProcessor delegate)
    {
        this.delegate = delegate;
    }

    @Override
    public QueryConfigHandler getQueryConfigHandler()
    {
        return delegate.getQueryConfigHandler();
    }

    @Override
    public void setQueryConfigHandler(QueryConfigHandler queryConfigHandler)
    {
        delegate.setQueryConfigHandler(queryConfigHandler);
    }

    @Override
    public QueryNode process(QueryNode queryNode) throws QueryNodeException
    {
        System.out.println("Before " + delegate.getClass().getSimpleName() + 
".process():");
        dumpTree(queryNode);

        queryNode = delegate.process(queryNode);

        System.out.println("After " + delegate.getClass().getSimpleName() + 
".process():");
        dumpTree(queryNode);

        return queryNode;
    }

    private void dumpTree(QueryNode queryNode)
    {
        dumpTree(queryNode, "");
    }

    private void dumpTree(QueryNode queryNode, String prefix)
    {
        System.out.println(prefix + queryNode.getClass().getSimpleName() +
                           ", parent = " + (queryNode.getParent() == null ? 
"null" : queryNode.getParent().getClass().getSimpleName()));
        List<QueryNode> children = queryNode.getChildren();
        if (children != null)
        {
            String childPrefix = "  " + prefix;
            for (QueryNode child : children)
            {
                dumpTree(child, childPrefix);
            }
        }
    }
}
{code}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to