[ 
https://issues.apache.org/jira/browse/LUCENE-6369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377861#comment-14377861
 ] 

Adrien Grand commented on LUCENE-6369:
--------------------------------------

bq. Was anyone hitting an issue due to shallow cloning?

We have had test failures due to shallow cloning when using the query cache. 
For instance, consider a test that would look like this (similar to what 
TestSloppyPhraseQuery2 does):

{code}
  public void testFoo() {
    PhraseQuery query = ...;
    for (int i = 0; i < 10; ++i) {
      query.setSlop(i);
      doTest(query);
    }
  }

  private void doTest(Query q) {
    BooleanQuery filtered = new BooleanQuery();
    filtered.add(q, Occur.MUST);
    filtered.add(q, Occur.FILTER);

    assertEquals(searcher.count(q), searcher.count(filtered));
  }
{code}

Then the first iteration would put a BooleanQuery around a PhraseQuery with a 
slop of 0 in the cache. But then later iterations will modify the slop of the 
inner phrase query, making it unreachable in the cache (which is essentially a 
LinkedHashMap). This is why we have seen these test failures which time out due 
to the fact that the cache was not able to evict entries. If we have this issue 
in our tests, I'm afraid this would be an issue as well for our users.

An alternative to deep cloning would be to make queries immutable (up to the 
boost) and Query.clone() final. I gave up on the idea since it would be a very 
large change (because it impacts **very** common queries such as BooleanQuery 
and PhraseQuery) and it would be nice to have a fix for this issue for 5.1. But 
if this is the only viable option then we can disable automatic query caching 
for now, add a note to the query cache javadocs that queries should not be 
modified after they been passed to IndexSearcher and work on making queries 
immutable.

bq. Also, what happens to deeply nested boolean queries during a rewrite?

Good call. Even if the depth should remain low in practice, I agree that a N² 
complexity is not good! I think we could easily fix it by making 
BooleanQuery.rewrite do a shallow clone by calling super.clone() and shallow 
cloning the list of clauses instead of deep cloning.

> Make queries more defensive and clone deeply
> --------------------------------------------
>
>                 Key: LUCENE-6369
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6369
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>             Fix For: 5.1
>
>         Attachments: LUCENE-6369.patch
>
>
> It is very important for the query cache that queries be either immutable or 
> clone deeply so that they cannot change after having been put into the cache.
> There are three issues that need to be addressed:
>  - mutable queries such as boolean or phrase queries do not clone deeply
>  - queries that wrap mutable objects such as TermQuery's term
>  - filters inherit Query's default clone impl which is not enough in most 
> cases



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to