Hi,

I am working on getting Solr upgraded to Lucene 9.8 [0] and I wanted to
raise visibility on an issue I ran into.

I believe PR#12380 [1] introduced a change that calls `finish()` on the
LeafCollector [2]. The trouble is on Solr side we have a few collectors
that extend SimpleCollector. (to be more precise there is a
DelegatingCollector in between but that does not change things).
SimpleCollector returns `this` on the `getLeafCollector` call, so now there
are 2 calls to the `finish()` method on the same collector instance (one as
a leaf, one at the end).

One example I am working with is CollapsingQParserPlugin$OrdScoreCollector
[3] where I am seeing a few tests fail because calling `finish` twice will
mess up the results. I don't know yet if there are others.

My first question is to validate this analysis with someone that knows this
code (and perhaps the Solr code too), and ideally also take a quick look at
my fix [4].

Second is related to PR#12380. is it expected that the 'finish' method is
idempotent? per my tests this seems to be called twice now in some cases
and it will be the case for any implementation extending SimpleCollector.
also given that SimpleCollector's getLeafCollector method is final, there
is almost no room for passing some state wrt. this being a leaf vs not.


thanks,
alex


[0] https://github.com/apache/solr/pull/1958
[1] https://github.com/apache/lucene/pull/12380
[2]
https://github.com/apache/lucene/blob/releases/lucene/9.8.0/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L779
[3]
https://github.com/apache/solr/blob/f0fcd300c896b858ae83235ecdb0a109eaea5cea/solr/core/src/java/org/apache/solr/search/CollapsingQParserPlugin.java#L594
[4]
https://github.com/apache/solr/commit/fc8a2ffe8951f31aa3b65fac2adc9eaa3fee6258

Reply via email to