I think I'd lean towards a third solution: tighten up
TopScoreDocCollector (make it final, remove ability to change its PQ,
make things private) and have it focus on high performance collection
by score. This is the default collector for Lucene searches so I think
keeping it high performance (not adding additional method calls nor
boolean checks) is important. You choose this for performance
Then introduce a new hit collector that makes subclassing / changing
the PQ easier. You choose this one for flexibility.
And 2.9 is a great chance to do this since we've deprecated the old
class (TopDocCollector) in favor of TopScoreDocCollector.
Mike
Shai Erera wrote:
Thanks Chris/Hoss (not sure who sent the original reply).
I don't like calling pq.lessThan, as pq.insert and
pq.insertWithOverflow call it anyway internally and since it would
add a method call (something that was tried to be avoided in the
current implementation), I prefer the code I proposed below.
BTW, I introduced 1356, so I take full responsibility on this
overlooking. The main reason for 1356 was to allow creating
extensions of TopDocCollector so they can be of the same type, and
share the topDocs() and totalHIts() implementations.
I can file an issue. Any other comments?
Shai
On Sat, Mar 21, 2009 at 3:48 AM, Chris Hostetter <hossman_luc...@fucit.org
> wrote:
(resending msg from earlier today during @apache mail outage -- i
didn't get a copy from the list, so i'm assuming no one did)
---------- Forwarded message ----------
Date: Fri, 20 Mar 2009 15:29:13 -0700 (PDT)
: TopDocCollector's (TDC) implementation of collect() seems a bit
problematic
: to me.
This code isn't an area i'm very familiar with, but your assessment
seems
correct ... it looks like when LUCENE-1356 introduced the ability to
provide a PriorityQueue to the constructor, the existing
optimization when
the score was obvoiusly too low was overlooked.
It looks like this same bug got propogated to TopScoreDocCollector
when it was introduced as well.
: Introduce in TDC a private boolean which signals whether the
default PQ is
: used or not. If it's not used, don't do the 'else if' at all. If
it is used,
: then the 'else if' is safe. Then code could look like:
my vote would just be to change the ">=" comarison to a hq.lessThan
call
... but i can understand how your proposal might be more efficient
-- I'll
let the performance experts fight it out ... but i definitely think
you
should fil a bug.
-Hoss
---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org