[
https://issues.apache.org/jira/browse/LUCENE-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593199#comment-14593199
]
Adrien Grand commented on LUCENE-6570:
--------------------------------------
bq. I can appreciate that goal, but i don't think it's ever going to be
feasible to turn that on by default in the truly generic case of any arbitrary
lucene application, where people might have custom Query impls.
What makes custom query impls different?
bq. Consider again that REUSED_FILTER example i mentioned in my last comment –
assuming the application is "well behaved", and doesn't call setBoost at add
times: even w/o the implicit clone in BooleanQuery it should work great with a
query cache enabled, and would use a lot less ram than with the implicit
sub-query cloning in the BooleanQuery.Builder.
I don't think "would use a lot less ram" is accurate: clone() is shallow so the
main data-structures would still be shared with the clone. For instance if you
consider TermsQuery which is in my experience the bad guy that can sometimes
make keys (queries) use more memory than the values (doc id sets), clone() does
not clone "termData", so between storing a single TermsQuery and storing a
TermsQuery and its clone, there are only 24 bytes of RAM of difference (I just
tested). Since the query cache typically only needs to cache a few queries to
be efficient, the total difference would only be a few kB.
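To illustrate the shallow-clone point, here is a minimal sketch. BigQuery is a hypothetical stand-in class, not Lucene's TermsQuery, and its field names are assumptions made for the example. It only shows that a clone produced by Object.clone() shares the large array with the original, so the extra cost is roughly one small object.

```java
// Hypothetical stand-in for a query with a large payload; not Lucene's TermsQuery.
final class BigQuery implements Cloneable {
    final byte[] termData;   // large payload, shared by shallow clones
    float boost = 1.0f;      // small per-instance state

    BigQuery(byte[] termData) {
        this.termData = termData;
    }

    @Override
    public BigQuery clone() {
        try {
            // Object.clone() copies field values, so the array reference
            // is copied, not the array contents.
            return (BigQuery) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // unreachable: the class is Cloneable
        }
    }
}

class ShallowCloneDemo {
    public static void main(String[] args) {
        BigQuery original = new BigQuery(new byte[1 << 20]);
        BigQuery copy = original.clone();
        copy.boost = 2.0f; // changes the clone only

        System.out.println(copy != original);                   // true
        System.out.println(copy.termData == original.termData); // true
        System.out.println(original.boost);                     // 1.0
    }
}
```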
bq. But if an application does start trying to keep references to previously
constructed Query instances, and call mutating methods (like setBoost) at
runtime, then really they aren't going to be able to safely use the query cache
at all – regardless of whether you have this implicit clone in BooleanQuery's
builder.
A longer-term plan, once all our queries are fixed, is to upgrade clone()'s
documentation to say that it has to return an independent instance. So a query
has two options: either clone deeply, or clone shallowly and be immutable. By
the way, this is where this issue arises from: we wanted to avoid having to
deep-clone queries to use them as cache keys (see LUCENE-6369).
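The cache-key concern can be sketched as follows. MutableQuery is a hypothetical class, not a Lucene type; it only mimics a query whose equals() and hashCode() depend on a boost that can be changed after construction. A plain HashMap stands in for the query cache here, which is also an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical mutable query with a setBoost-style setter; not a Lucene class.
final class MutableQuery {
    private final String term;
    private float boost = 1.0f;

    MutableQuery(String term) {
        this.term = term;
    }

    void setBoost(float boost) {
        this.boost = boost;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof MutableQuery)) {
            return false;
        }
        MutableQuery q = (MutableQuery) o;
        return term.equals(q.term) && Float.compare(boost, q.boost) == 0;
    }

    @Override
    public int hashCode() {
        return Objects.hash(term, boost);
    }
}

class CacheKeyDemo {
    public static void main(String[] args) {
        // A HashMap stands in for the query cache (assumption for this sketch).
        Map<MutableQuery, String> cache = new HashMap<>();
        MutableQuery key = new MutableQuery("lucene");
        cache.put(key, "cached doc id set");

        key.setBoost(2.0f); // mutation after insertion changes hashCode()

        System.out.println(cache.containsKey(key)); // false: entry is stranded
        System.out.println(cache.size());           // 1: it still occupies memory
    }
}
```

If the key were immutable, or if the cache stored an independent copy of it, the entry would stay reachable, which is why the two options above are either deep cloning or immutability.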
> Make BooleanQuery immutable
> ---------------------------
>
> Key: LUCENE-6570
> URL: https://issues.apache.org/jira/browse/LUCENE-6570
> Project: Lucene - Core
> Issue Type: Task
> Reporter: Adrien Grand
> Assignee: Adrien Grand
> Priority: Minor
> Fix For: 6.0
>
> Attachments: LUCENE-6570.patch
>
>
> In the same spirit as LUCENE-6531 for the PhraseQuery, we should make
> BooleanQuery immutable.
> The plan is the following:
> - create BooleanQuery.Builder with the same setters as BooleanQuery today
> (except setBoost) and a build() method that returns a BooleanQuery
> - remove setters from BooleanQuery (except setBoost)
> I would also like to add some static utility methods for common use-cases of
> this query, for instance:
> - static BooleanQuery disjunction(Query... queries) to create a disjunction
> - static BooleanQuery conjunction(Query... queries) to create a conjunction
> - static BooleanQuery filtered(Query query, Query... filters) to create a
> filtered query
> Hopefully this will help keep tests not too verbose, and the latter will also
> help with the FilteredQuery deprecation/removal.
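The plan quoted above could look roughly like the sketch below. SketchBooleanQuery and its Builder are hypothetical names, clauses are simplified to plain strings, and nothing here is taken from the attached patch; it only illustrates the builder-plus-immutable-result shape and a disjunction-style static helper.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the proposed shape; not Lucene's BooleanQuery.
final class SketchBooleanQuery {
    private final List<String> clauses; // clauses simplified to strings

    private SketchBooleanQuery(List<String> clauses) {
        // Defensive copy, so later changes to the builder cannot leak in.
        this.clauses = Collections.unmodifiableList(new ArrayList<>(clauses));
    }

    List<String> clauses() {
        return clauses;
    }

    // Static helper in the spirit of the proposed disjunction(Query...).
    static SketchBooleanQuery disjunction(String... queries) {
        Builder builder = new Builder();
        for (String q : queries) {
            builder.add(q);
        }
        return builder.build();
    }

    // All mutation happens on the builder; the built query has no setters.
    static final class Builder {
        private final List<String> clauses = new ArrayList<>();

        Builder add(String clause) {
            clauses.add(clause);
            return this;
        }

        SketchBooleanQuery build() {
            return new SketchBooleanQuery(clauses);
        }
    }
}
```

With this shape, a query obtained from build() keeps its clauses even if the same builder is reused afterwards, so it can be shared or used as a cache key without cloning.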