[ 
https://issues.apache.org/jira/browse/LUCENE-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593199#comment-14593199
 ] 

Adrien Grand commented on LUCENE-6570:
--------------------------------------

bq. I can appreciate that goal, but I don't think it's ever going to be 
feasible to turn that on by default in the truly generic case of any arbitrary 
Lucene application, where people might have custom Query impls.

What makes custom query impls different?

bq. Consider again that REUSED_FILTER example I mentioned in my last comment – 
assuming the application is "well behaved", and doesn't call setBoost at add 
times: even w/o the implicit clone in BooleanQuery it should work great with a 
query cache enabled, and would use a lot less ram than with the implicit 
sub-query cloning in the BooleanQuery.Builder.

I don't think "would use a lot less ram" is accurate: clone() is shallow, so the 
main data structures are still shared with the clone. Take TermsQuery, which in 
my experience is the usual culprit when keys (queries) use more memory than the 
values (doc id sets): clone() does not copy "termData", so storing a TermsQuery 
plus its clone costs only 24 bytes more than storing the TermsQuery alone (I 
just tested). Since the query cache typically needs to cache only a few queries 
to be efficient, the total difference would only be about a few KB.
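
To illustrate why a shallow clone is cheap, here is a minimal sketch. BigQuery 
and its fields are hypothetical stand-ins, not Lucene's TermsQuery, and the 
sketch does not reproduce the 24-byte measurement; it only shows that the clone 
is a distinct object that shares the large array with the original.

```java
public class ShallowCloneSketch {
    // Hypothetical stand-in for a query that owns a large payload.
    // This is NOT Lucene's TermsQuery; names are made up for illustration.
    static class BigQuery implements Cloneable {
        final byte[] termData; // large, never mutated after construction
        float boost = 1f;      // small mutable field

        BigQuery(byte[] termData) {
            this.termData = termData;
        }

        @Override
        public BigQuery clone() {
            try {
                // Object.clone() is shallow: it copies the array reference,
                // not the array contents.
                return (BigQuery) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // cannot happen, we are Cloneable
            }
        }
    }

    public static void main(String[] args) {
        BigQuery original = new BigQuery(new byte[1 << 20]);
        BigQuery copy = original.clone();
        copy.boost = 2f; // only affects the copy

        System.out.println(copy != original);                   // distinct objects
        System.out.println(copy.termData == original.termData); // shared payload
        System.out.println(original.boost);                     // original untouched
    }
}
```

The extra cost of the clone is one small object (header plus a few fields), 
independent of how large the shared array is.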

bq. But if an application does start trying to keep references to previously 
constructed Query instances, and call mutating methods (like setBoost) at 
runtime, then really they aren't going to be able to safely use the query cache 
at all – regardless of whether you have this implicit clone in BooleanQuery's 
builder.

A longer-term plan, once all our queries are fixed, is to update clone()'s 
documentation to say that it must return an independent instance. That leaves 
two options: either clone deeply, or clone shallowly and be immutable. By the 
way, this is where this issue comes from: we wanted to avoid having to 
deep-clone queries in order to use them as cache keys (see LUCENE-6369).
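
As a rough sketch of why mutable cache keys are a problem and immutable ones 
are not, consider the following. MutableQuery and ImmutableQuery are 
hypothetical classes, and a plain HashMap stands in for the query cache; none 
of this is Lucene's actual implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CacheKeySketch {
    // Hypothetical mutable query: equals/hashCode depend on mutable state.
    static final class MutableQuery {
        final List<String> clauses = new ArrayList<>();

        @Override public boolean equals(Object o) {
            return o instanceof MutableQuery
                && clauses.equals(((MutableQuery) o).clauses);
        }
        @Override public int hashCode() { return clauses.hashCode(); }
    }

    // Hypothetical immutable query: state is copied once and never changes.
    static final class ImmutableQuery {
        private final List<String> clauses;

        ImmutableQuery(List<String> clauses) {
            this.clauses = Collections.unmodifiableList(new ArrayList<>(clauses));
        }
        @Override public boolean equals(Object o) {
            return o instanceof ImmutableQuery
                && clauses.equals(((ImmutableQuery) o).clauses);
        }
        @Override public int hashCode() { return clauses.hashCode(); }
    }

    // Mutating a key after insertion changes its hash, so the entry is lost.
    static boolean mutatedKeyIsStillFound() {
        Map<MutableQuery, String> cache = new HashMap<>();
        MutableQuery q = new MutableQuery();
        q.clauses.add("a");
        cache.put(q, "doc id set");
        q.clauses.add("b"); // key changes while it sits in the map
        return cache.containsKey(q);
    }

    // The immutable key is unaffected by later changes to its source list.
    static boolean immutableKeyIsStillFound() {
        Map<ImmutableQuery, String> cache = new HashMap<>();
        List<String> source = new ArrayList<>();
        source.add("a");
        ImmutableQuery q = new ImmutableQuery(source);
        cache.put(q, "doc id set");
        source.add("b"); // no effect: the constructor made a defensive copy
        return cache.containsKey(q);
    }

    public static void main(String[] args) {
        System.out.println(mutatedKeyIsStillFound());   // false
        System.out.println(immutableKeyIsStillFound()); // true
    }
}
```

With mutable keys the cache would have to deep-clone each query on insertion 
to be safe; with immutable keys it can store the instance it was given.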

> Make BooleanQuery immutable
> ---------------------------
>
>                 Key: LUCENE-6570
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6570
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>             Fix For: 6.0
>
>         Attachments: LUCENE-6570.patch
>
>
> In the same spirit as LUCENE-6531 for the PhraseQuery, we should make 
> BooleanQuery immutable.
> The plan is the following:
>  - create BooleanQuery.Builder with the same setters as BooleanQuery today 
> (except setBoost) and a build() method that returns a BooleanQuery
>  - remove setters from BooleanQuery (except setBoost)
> I would also like to add some static utility methods for common use-cases of 
> this query, for instance:
>  - static BooleanQuery disjunction(Query... queries) to create a disjunction
>  - static BooleanQuery conjunction(Query... queries) to create a conjunction
>  - static BooleanQuery filtered(Query query, Query... filters) to create a 
> filtered query
> Hopefully this will help keep tests not too verbose, and the latter will also 
> help with the FilteredQuery deprecation/removal.
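
The plan quoted above could look roughly like the sketch below. It is only an 
illustration under simplifying assumptions: BoolQuery, Clause and Occur are 
hypothetical stand-ins, sub-queries are plain strings rather than Query 
objects, and setBoost is left out. It is not the attached patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BuilderSketch {
    // Hypothetical clause types, loosely modelled on the issue description.
    enum Occur { MUST, SHOULD, FILTER }

    static final class Clause {
        final String query; // a string stands in for a sub-query
        final Occur occur;

        Clause(String query, Occur occur) {
            this.query = query;
            this.occur = occur;
        }
    }

    // Immutable query: no setters, clauses are fixed at build() time.
    static final class BoolQuery {
        private final List<Clause> clauses;

        private BoolQuery(List<Clause> clauses) {
            // Defensive copy so later builder changes cannot leak in.
            this.clauses = Collections.unmodifiableList(new ArrayList<>(clauses));
        }

        List<Clause> clauses() { return clauses; }

        // All mutation happens on the builder, never on the query.
        static final class Builder {
            private final List<Clause> clauses = new ArrayList<>();

            Builder add(String query, Occur occur) {
                clauses.add(new Clause(query, occur));
                return this;
            }

            BoolQuery build() { return new BoolQuery(clauses); }
        }

        static BoolQuery disjunction(String... queries) {
            Builder b = new Builder();
            for (String q : queries) b.add(q, Occur.SHOULD);
            return b.build();
        }

        static BoolQuery conjunction(String... queries) {
            Builder b = new Builder();
            for (String q : queries) b.add(q, Occur.MUST);
            return b.build();
        }

        static BoolQuery filtered(String query, String... filters) {
            Builder b = new Builder().add(query, Occur.MUST);
            for (String f : filters) b.add(f, Occur.FILTER);
            return b.build();
        }
    }

    public static void main(String[] args) {
        BoolQuery.Builder builder = new BoolQuery.Builder().add("a", Occur.MUST);
        BoolQuery first = builder.build();
        builder.add("b", Occur.SHOULD); // does not affect the query already built
        System.out.println(first.clauses().size());
        System.out.println(BoolQuery.filtered("q", "f1", "f2").clauses().size());
    }
}
```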



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
