Isn't the issue the creation of a new Comparator each time the scorers
are sorted? That could be easily fixed by keeping single comparator
around to do all the work.

Yes, it's the Comparator, but I think even if you kept it around, the
Array.sort would still clone the Scorers, no?

Peter

On 10/23/06, Paul Elschot <[EMAIL PROTECTED]> wrote:

On Monday 23 October 2006 22:12, Peter Keegan wrote:
> I did some profile testing with the new ConjuctionScorer in 2.1 and
> discovered a new bottleneck in ConjunctionScorer.sortScorers. The
> java.utils.Arrays.sort method is cloning the Scorers array on every
sort,
> which is quite expensive on large indexes because of the size of the
'norms'
> array within, and isn't necessary.

Isn't the issue the creation of a new Comparator each time the scorers
are sorted? That could be easily fixed by keeping single comparator
around to do all the work.

Regards,
Paul Elschot


> We rewrote this to use simple insertion sorting as follows:
>
>   private void sortScorers() {
> // squeeze the array down for the sort
> //    if (length != scorers.length) {
> //      Scorer[] temps = new Scorer[length];
> //      System.arraycopy(scorers, 0, temps, 0, length);
> //      scorers = temps;
> //    }
>     insertionSort( scorers,length );
>     // note that this comparator is not consistent with equals!
> //    Arrays.sort(scorers, new Comparator() {         // sort the array
> //        public int compare(Object o1, Object o2) {
> //          return ((Scorer)o1).doc() - ((Scorer)o2).doc();
> //        }
> //      });
>
>     first = 0;
>     last = length - 1;
>   }
>   private void insertionSort( Scorer[] scores, int len)
>   {
>       for (int i=0; i<len; i++) {
>           for (int j=i; j>0 && scores[j-1].doc() > scores[j].doc();j-- )
{
>               swap (scores, j, j-1);
>           }
>       }
>       return;
>   }
>   private void swap(Object[] x, int a, int b) {
>     Object t = x[a];
>     x[a] = x[b];
>     x[b] = t;
>   }
>
> The squeezing of the array is no longer needed.
> We also initialized the Scorers array to 8 (instead of 2) to avoid
having to
> grow the array for common queries, although this probably has less
> performance impact.
>
> This change added about 3% to query throughput.
>
> Peter
>
>
> On 10/17/06, Yonik Seeley (JIRA) <[EMAIL PROTECTED]> wrote:
> >
> >      [ http://issues.apache.org/jira/browse/LUCENE-443?page=all ]
> >
> > Yonik Seeley resolved LUCENE-443.
> > ---------------------------------
> >
> >     Fix Version/s: 2.1
> >        Resolution: Fixed
> >          Assignee: Yonik Seeley
> >
> > Thanks!  I just committed this.
> >
> > > ConjunctionScorer tune-up
> > > -------------------------
> > >
> > >                 Key: LUCENE-443
> > >                 URL: http://issues.apache.org/jira/browse/LUCENE-443
> > >             Project: Lucene - Java
> > >          Issue Type: Bug
> > >          Components: Search
> > >    Affects Versions: 1.9
> > >         Environment: Linux, Java 1.5, Large Index with 4 million
items
> > and some heavily nested boolean queries
> > >            Reporter: Abdul Chaudhry
> > >         Assigned To: Yonik Seeley
> > >             Fix For: 2.1
> > >
> > >         Attachments: Conjunction20060921.patch,
ConjunctionScorer.java,
> > ConjunctionScorer.java
> > >
> > >
> > > I just recently ran a load test on the latest code from lucene ,
which
> > is using a new BooleanScore and noticed the ConjunctionScorer was
crunching
> > through objects , especially while sorting as part of the skipTo call.
It
> > turns a linked list into an array, sorts the array, then converts the
array
> > back to a linked list for further processing by the scoring engines
below.
> > > 'm not sure if anyone else is experiencing this as I have a very
large
> > index (> 4 million items) and I am issuing some heavily nested queries
> > > Anyway, I decide to change the link list into an array and use a
first
> > and last marker to "simulate" a linked list.
> > > This scaled much better during my load test as the java gargbage
> > collector was less - umm - virulent
> >
> > --
> > This message is automatically generated by JIRA.
> > -
> > If you think it was sent incorrectly contact one of the
administrators:
> > http://issues.apache.org/jira/secure/Administrators.jspa
> > -
> > For more information on JIRA, see:
http://www.atlassian.com/software/jira
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
> >
> >
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Reply via email to