Wolf Siberski wrote:
Now I found another solution which requires more changes, but IMHO is
much cleaner:
- when a query computes its Weight, it caches it in an attribute
- a query can be 'frozen'. A frozen query always returns the cached
  Weight when calling Query.weight().

Originally there was no Weight in Lucene, only Query and Scorer. Weight was added so that searching would not modify a Query, which lets a Query instance be reused. Searcher-dependent state of the query is meant to reside in the Weight. IndexReader-dependent state resides in the Scorer. Your "freezing" a query violates this. Can't we create the weight once in Searcher.search?
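
To make the split concrete, here is a rough sketch of it. The class names below are hypothetical stand-ins, not the real Lucene classes, and the idf-style formula is only an assumption for illustration. The point is that the query stays immutable while each search builds its own weight from the searcher's statistics:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; these are NOT the real Lucene classes.
final class WeightOnceDemo {

    /** Assumed minimal view of a searcher: only collection statistics. */
    interface Searcher {
        int docFreq(String term);
        int maxDoc();
    }

    static final class MapSearcher implements Searcher {
        private final Map<String, Integer> docFreqs;
        private final int maxDoc;

        MapSearcher(Map<String, Integer> docFreqs, int maxDoc) {
            this.docFreqs = new HashMap<>(docFreqs);
            this.maxDoc = maxDoc;
        }

        @Override public int docFreq(String term) { return docFreqs.getOrDefault(term, 0); }
        @Override public int maxDoc() { return maxDoc; }
    }

    /** Immutable query: searching never writes to it, so it can be reused. */
    static final class TermQuery {
        final String term;
        final float boost;

        TermQuery(String term, float boost) {
            this.term = term;
            this.boost = boost;
        }

        Weight createWeight(Searcher searcher) {
            return new Weight(this, searcher);
        }
    }

    /** Searcher-dependent state lives here, not in the query. */
    static final class Weight {
        final float value;

        Weight(TermQuery query, Searcher searcher) {
            // Illustrative idf-style formula; the exact formula is an assumption.
            double idf = Math.log(searcher.maxDoc() / (double) (searcher.docFreq(query.term) + 1)) + 1.0;
            this.value = (float) idf * query.boost;
        }
    }

    /** Mirrors the idea of creating the weight once inside the search call. */
    static float search(TermQuery query, Searcher searcher) {
        Weight weight = query.createWeight(searcher); // built once per search
        return weight.value; // a real scorer would combine this with per-document data
    }

    public static void main(String[] args) {
        Map<String, Integer> docFreqs = new HashMap<>();
        docFreqs.put("lucene", 9);
        TermQuery query = new TermQuery("lucene", 2.0f);
        // The same query instance is searched against two different collections.
        System.out.println(search(query, new MapSearcher(docFreqs, 10)));
        System.out.println(search(query, new MapSearcher(docFreqs, 1000)));
    }
}
```

Because nothing is cached on the query, the same instance can be handed to several searchers without one search influencing another.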


This approach requires that weights can be serialized. Interestingly,
Weight already implements Serializable, but the current implementation
doesn't work for all weight classes. The reason is that some weights
hold a reference to a searcher which is of course not serializable.
We can't make it transient either, because this searcher is the source
of the Similarity needed by scorers.

On closer look it turned out that the searcher is used only for two
things: as source for a Similarity, and as docFreqs&maxDoc source.
docFreq&maxDoc are only necessary to initialize the weights, but not
needed by scorers. So instead of providing the Searcher, I now provide
a Similarity and a DocFreqSource to the weights. Only the Similarity is
stored by weights.

We need to make sure, however, that this is the correct Similarity. It should still be the result of Query.getSimilarity(Searcher), which doesn't appear to be the case in your patch.


As for DocFreqSource versus Searcher, couldn't the Searcher be passed as a source for docFreqs and simply have Weights not keep a pointer to it? This isn't a big deal, but it would substantially minimize the API changes.
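
Roughly what I have in mind, as a sketch only: the weight reads docFreq and maxDoc from the searcher in its constructor and then lets go of it, so the weight itself serializes cleanly. All names here are hypothetical, not the classes in the patch, and the idf-style formula is assumed for illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-ins for illustration; these are NOT the real Lucene classes.
final class DetachedWeightDemo {

    /** Deliberately not Serializable, like a searcher that holds open index resources. */
    interface Searcher {
        int docFreq(String term);
        int maxDoc();
    }

    /** Uses the searcher during construction only; no field refers to it afterwards. */
    static final class TermWeight implements Serializable {
        private static final long serialVersionUID = 1L;

        private final String term;
        private final float idf;

        TermWeight(String term, Searcher searcher) {
            this.term = term;
            // Illustrative idf-style formula; the exact formula is an assumption.
            this.idf = (float) (Math.log(searcher.maxDoc() / (double) (searcher.docFreq(term) + 1)) + 1.0);
        }

        String getTerm() { return term; }
        float getIdf() { return idf; }
    }

    /** Test helper: a searcher that reports the same statistics for every term. */
    static Searcher fixedSearcher(final int docFreq, final int maxDoc) {
        return new Searcher() {
            @Override public int docFreq(String term) { return docFreq; }
            @Override public int maxDoc() { return maxDoc; }
        };
    }

    /** Serializes and deserializes a weight, as would happen when it is sent to a remote searcher. */
    static TermWeight roundTrip(TermWeight weight) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(weight);
            }
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (TermWeight) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("weight could not be serialized", e);
        }
    }

    public static void main(String[] args) {
        TermWeight weight = new TermWeight("lucene", fixedSearcher(4, 5));
        TermWeight copy = roundTrip(weight);
        System.out.println(copy.getTerm() + " idf=" + copy.getIdf());
    }
}
```

The createWeight-style entry point can keep taking a Searcher; only what the weight retains changes.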

As an (IMHO) positive side effect, Similarity got rid of
Searcher dependencies, which leads to a better split of responsibilities:
- Similarity only provides scoring formulas
- Searcher (resp. DocFreqSource) provides the raw data (tf/df/maxDoc)
This change affects quite a few classes (because the createWeight() signature
is changed), but the modifications are pretty straightforward.

But couldn't the signature change be avoided if the Weight constructors immediately called Query.getSimilarity(Searcher) to get their Similarity, and no longer kept a pointer to the Searcher?
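
As a sketch of that alternative, with hypothetical names rather than the real Lucene classes: createWeight still takes a searcher, the weight constructor asks the query for its Similarity right away, and only the Similarity is stored. The default of delegating to the searcher's Similarity is an assumption made for this example:

```java
// Hypothetical stand-ins for illustration; these are NOT the real Lucene classes.
final class SimilarityLookupDemo {

    interface Similarity {
        float idf(int docFreq, int maxDoc);
    }

    interface Searcher {
        Similarity getSimilarity();
        int docFreq(String term);
        int maxDoc();
    }

    static class TermQuery {
        final String term;

        TermQuery(String term) {
            this.term = term;
        }

        /** Assumed default: use the searcher's Similarity. Subclasses may override. */
        Similarity getSimilarity(Searcher searcher) {
            return searcher.getSimilarity();
        }

        /** Signature unchanged: the searcher is still the only argument. */
        TermWeight createWeight(Searcher searcher) {
            return new TermWeight(this, searcher);
        }
    }

    static final class TermWeight {
        private final Similarity similarity; // kept for scorers
        private final float idf;             // computed once, up front

        TermWeight(TermQuery query, Searcher searcher) {
            // Resolve the Similarity through the query so that overrides are honoured.
            this.similarity = query.getSimilarity(searcher);
            this.idf = similarity.idf(searcher.docFreq(query.term), searcher.maxDoc());
            // The searcher is not stored in any field.
        }

        Similarity getSimilarity() { return similarity; }
        float getIdf() { return idf; }
    }

    /** Test helper: a searcher with a fixed Similarity and fixed statistics. */
    static Searcher fixedSearcher(final Similarity similarity, final int docFreq, final int maxDoc) {
        return new Searcher() {
            @Override public Similarity getSimilarity() { return similarity; }
            @Override public int docFreq(String term) { return docFreq; }
            @Override public int maxDoc() { return maxDoc; }
        };
    }

    public static void main(String[] args) {
        final Similarity searcherSimilarity = (docFreq, maxDoc) -> 1.0f;
        final Similarity querySimilarity = (docFreq, maxDoc) -> 7.0f;
        Searcher searcher = fixedSearcher(searcherSimilarity, 3, 10);

        TermQuery plain = new TermQuery("lucene");
        TermQuery custom = new TermQuery("lucene") {
            @Override Similarity getSimilarity(Searcher s) { return querySimilarity; }
        };

        System.out.println(plain.createWeight(searcher).getIdf());
        System.out.println(custom.createWeight(searcher).getIdf());
    }
}
```

A query that overrides getSimilarity gets its own Similarity into the weight, which is the behaviour I'd want the patch to preserve.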


From my point of view, the patch submitted now is a sound solution
for Bug 31841 (at least I like it :-) ).
The next thing which IMHO needs to be done is a review by someone else.

I've made a quick review, but it would be nice if others looked at this too.

Thanks again for all your work here!

Doug
