[
https://issues.apache.org/jira/browse/LUCENE-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12986281#action_12986281
]
Simon Willnauer commented on LUCENE-2868:
-----------------------------------------
bq.Here's my take on the patch, including ability to cache weight objects.
I have a couple of comments here - first I can not apply your patch to the
current trunk can you update it?
* you keep a cache per IndexSeacher (btw. QueryDataCache is missing in the
patch) which is used to cache several things across searches. This is very
dangerous! While I don't know how it is implemented I would guess you need to
synchronized access to it so it would slow down searches ey?
* Caching Scorers is going to break since Scorers are stateful and might be
advanced to different documents. Yet, I can see what you are trying to do here
since doing work in a scorer is costly so common TermQueries for instance
should not need to load the same posting list twice. There are two things which
come to my mind right away. 1. Postinglist caching - should be done on a codec
level IMO 2. Building PerReaderTermState only once for a common TermQuery.
While caching PostingLists is going to be tricky and quite a task reusing
PerReaderTermState could work fine as far as I can see if you are in the same
searcher.
* Caching Weights is kind of weird - what is the reason for this again? The
only thing you really save here is setup costs which are generally very low.
Overall I don' t like that this way you tightly couple something to Weight /
Query etc. for a single purpose what could be solved with some kind of query
optimization phase similar to what I had in my last patch and Earwin has
proposed. I think we should not tight couple things like that into lucene. This
is really extremely application dependent in the most cases and we should only
provide the infrastructure to do it.
bq. Earwin - I think we should make a new issue and get something like that
implemented in there which is more general than what I just sketched out. If
you could share your code that would be awesome!
Earwin, any new on this - shall I open an issue for that?
bq. It occurs to me that the name of the common class that gets created in
IndexSearcher and passed around should probably be named something more
appropriate, like QueryContext. That way people will feel free to extend it to
hold all sorts of query-local data, in time. Thoughts?
You refer to ScorerContext? This class was actually not intended to be
expendable its public final until now. I am not sure if we should open that up
though.
> It should be easy to make use of TermState; rewritten queries should be
> shared automatically
> --------------------------------------------------------------------------------------------
>
> Key: LUCENE-2868
> URL: https://issues.apache.org/jira/browse/LUCENE-2868
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Query/Scoring
> Reporter: Karl Wright
> Attachments: lucene-2868.patch, query-rewriter.patch
>
>
> When you have the same query in a query hierarchy multiple times, tremendous
> savings can now be had if the user knows enough to share the rewritten
> queries in the hierarchy, due to the TermState addition. But this is clumsy
> and requires a lot of coding by the user to take advantage of. Lucene should
> be smart enough to share the rewritten queries automatically.
> This can be most readily (and powerfully) done by introducing a new method to
> Query.java:
> Query rewriteUsingCache(IndexReader indexReader)
> ... and including a caching implementation right in Query.java which would
> then work for all. Of course, all callers would want to use this new method
> rather than the current rewrite().
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]