The old search API is already removed in trunk.
Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

  _____

From: John Wang [mailto:john.w...@gmail.com]
Sent: Tuesday, October 20, 2009 3:28 AM
To: java-dev@lucene.apache.org
Subject: Re: lucene 2.9 sorting algorithm

Hi Michael:

Was wondering if you got a chance to take a look at this. Since deprecated APIs are being removed in 3.0, I was wondering if/when we would decide on keeping the ScoreDocComparator API, so that it would be kept for Lucene 3.0.

Thanks

-John

On Fri, Oct 16, 2009 at 9:53 AM, Michael McCandless <luc...@mikemccandless.com> wrote:

Oh, no problem...

Mike

On Fri, Oct 16, 2009 at 12:33 PM, John Wang <john.w...@gmail.com> wrote:
> Mike, just a clarification on my first perf report email.
> The first section, numHits is incorrectly labeled, it should be 20 instead
> of 50. Sorry about the possible confusion.
> Thanks
> -John
>
> On Fri, Oct 16, 2009 at 3:21 AM, Michael McCandless
> <luc...@mikemccandless.com> wrote:
>>
>> Thanks John; I'll have a look.
>>
>> Mike
>>
>> On Fri, Oct 16, 2009 at 12:57 AM, John Wang <john.w...@gmail.com> wrote:
>> > Hi Michael:
>> > I added classes: ScoreDocComparatorQueue and OneSortNoScoreCollector
>> > as a more general case. I think keeping the old api for
>> > ScoreDocComparator and SortComparatorSource would work.
>> > Please take a look.
>> > Thanks
>> > -John
>> >
>> > On Thu, Oct 15, 2009 at 6:52 PM, John Wang <john.w...@gmail.com> wrote:
>> >>
>> >> Hi Michael:
>> >> It is open, http://code.google.com/p/lucene-book/source/checkout
>> >> I think I sent the https url instead, sorry.
>> >> The multi PQ sorting is fairly self-contained, I have 2 versions, 1
>> >> for string and 1 for int, each is a Collector impl.
>> >> I shouldn't say the Multi Q is faster on int sort, it is within the
>> >> error boundary. The diff is very very small, I would say they are
>> >> about equal.
>> >> If you think it is a good thing to go this way, (if not for the perf,
>> >> just for the simpler api) I'd be happy to work on a patch.
>> >> Thanks
>> >> -John
>> >> On Thu, Oct 15, 2009 at 5:18 PM, Michael McCandless
>> >> <luc...@mikemccandless.com> wrote:
>> >>>
>> >>> John, looks like this requires login -- any plans to open that up, or,
>> >>> post the code on an issue?
>> >>>
>> >>> How self-contained is your Multi PQ sorting? EG is it a standalone
>> >>> Collector impl that I can test?
>> >>>
>> >>> Mike
>> >>>
>> >>> On Thu, Oct 15, 2009 at 6:33 PM, John Wang <john.w...@gmail.com>
>> >>> wrote:
>> >>> > BTW, we have a little sandbox for these experiments. All my test
>> >>> > code is at the URL below. It is not very polished.
>> >>> >
>> >>> > https://lucene-book.googlecode.com/svn/trunk
>> >>> >
>> >>> > -John
>> >>> >
>> >>> > On Thu, Oct 15, 2009 at 3:29 PM, John Wang <john.w...@gmail.com>
>> >>> > wrote:
>> >>> >>
>> >>> >> Numbers Mike requested for Int types:
>> >>> >>
>> >>> >> only the time/cputime are posted, others are all the same since the
>> >>> >> algorithm is the same.
>> >>> >>
>> >>> >> Lucene 2.9:
>> >>> >> numhits: 10
>> >>> >> time: 14619495
>> >>> >> cpu: 146126
>> >>> >>
>> >>> >> numhits: 20
>> >>> >> time: 14550568
>> >>> >> cpu: 163242
>> >>> >>
>> >>> >> numhits: 100
>> >>> >> time: 16467647
>> >>> >> cpu: 178379
>> >>> >>
>> >>> >>
>> >>> >> my test:
>> >>> >> numHits: 10
>> >>> >> time: 14101094
>> >>> >> cpu: 144715
>> >>> >>
>> >>> >> numHits: 20
>> >>> >> time: 14804821
>> >>> >> cpu: 151305
>> >>> >>
>> >>> >> numHits: 100
>> >>> >> time: 15372157
>> >>> >> cpu: 158842
>> >>> >>
>> >>> >> Conclusions:
>> >>> >> They are very similar, the differences are all within error bounds,
>> >>> >> especially with lower PQ sizes, where the second sort alg is again
>> >>> >> slightly faster.
>> >>> >>
>> >>> >> Hope this helps.
>> >>> >>
>> >>> >> -John
>> >>> >>
>> >>> >> On Thu, Oct 15, 2009 at 3:04 PM, Yonik Seeley
>> >>> >> <yo...@lucidimagination.com> wrote:
>> >>> >>>
>> >>> >>> On Thu, Oct 15, 2009 at 5:33 PM, Michael McCandless
>> >>> >>> <luc...@mikemccandless.com> wrote:
>> >>> >>> > Though it'd be odd if the switch to searching by segment
>> >>> >>> > really was most of the gains here.
>> >>> >>>
>> >>> >>> I had assumed that much of the improvement was due to ditching
>> >>> >>> MultiTermEnum/MultiTermDocs.
>> >>> >>> Note that LUCENE-1483 was before LUCENE-1596... but that only
>> >>> >>> helps with queries that use a TermEnum (range, prefix, etc).
>> >>> >>>
>> >>> >>> -Yonik
>> >>> >>> http://www.lucidimagination.com
>> >>> >>>
>> >>> >>> ---------------------------------------------------------------------
>> >>> >>> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
>> >>> >>> For additional commands, e-mail: java-dev-h...@lucene.apache.org
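
For reference, the Int "time" figures quoted in this thread can be compared as relative differences, which makes the "within error bounds" reading easier to check. This is only a minimal sketch and not code from the thread or from Lucene: the class name SortTimingDiff and the method relativeDiffPercent are hypothetical, and since the units of the posted figures are not stated, only ratios are computed.

```java
// Hypothetical helper, not part of Lucene or of the test code discussed in this thread.
class SortTimingDiff {

    // Percentage by which `candidate` is lower than `baseline`
    // (negative when the candidate took longer than the baseline).
    static double relativeDiffPercent(long baseline, long candidate) {
        return 100.0 * (baseline - candidate) / baseline;
    }

    public static void main(String[] args) {
        int[] numHits = {10, 20, 100};
        // "time" values as posted in the thread for the Int sort.
        long[] lucene29 = {14619495L, 14550568L, 16467647L};
        long[] myTest   = {14101094L, 14804821L, 15372157L};

        for (int i = 0; i < numHits.length; i++) {
            System.out.printf("numHits=%d: %.2f%%%n",
                    numHits[i], relativeDiffPercent(lucene29[i], myTest[i]));
        }
    }
}
```

With the posted figures the differences are single-digit percentages and go in both directions (the numHits 20 run is slower in "my test", the other two are faster).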