Isn't the issue the creation of a new Comparator each time the scorers are sorted? That could be easily fixed by keeping single comparator around to do all the work.
Yes, it's the Comparator, but I think even if you kept it around, the Array.sort would still clone the Scorers, no? Peter On 10/23/06, Paul Elschot <[EMAIL PROTECTED]> wrote:
On Monday 23 October 2006 22:12, Peter Keegan wrote: > I did some profile testing with the new ConjuctionScorer in 2.1 and > discovered a new bottleneck in ConjunctionScorer.sortScorers. The > java.utils.Arrays.sort method is cloning the Scorers array on every sort, > which is quite expensive on large indexes because of the size of the 'norms' > array within, and isn't necessary. Isn't the issue the creation of a new Comparator each time the scorers are sorted? That could be easily fixed by keeping single comparator around to do all the work. Regards, Paul Elschot > We rewrote this to use simple insertion sorting as follows: > > private void sortScorers() { > // squeeze the array down for the sort > // if (length != scorers.length) { > // Scorer[] temps = new Scorer[length]; > // System.arraycopy(scorers, 0, temps, 0, length); > // scorers = temps; > // } > insertionSort( scorers,length ); > // note that this comparator is not consistent with equals! > // Arrays.sort(scorers, new Comparator() { // sort the array > // public int compare(Object o1, Object o2) { > // return ((Scorer)o1).doc() - ((Scorer)o2).doc(); > // } > // }); > > first = 0; > last = length - 1; > } > private void insertionSort( Scorer[] scores, int len) > { > for (int i=0; i<len; i++) { > for (int j=i; j>0 && scores[j-1].doc() > scores[j].doc();j-- ) { > swap (scores, j, j-1); > } > } > return; > } > private void swap(Object[] x, int a, int b) { > Object t = x[a]; > x[a] = x[b]; > x[b] = t; > } > > The squeezing of the array is no longer needed. > We also initialized the Scorers array to 8 (instead of 2) to avoid having to > grow the array for common queries, although this probably has less > performance impact. > > This change added about 3% to query throughput. > > Peter > > > On 10/17/06, Yonik Seeley (JIRA) <[EMAIL PROTECTED]> wrote: > > > > [ http://issues.apache.org/jira/browse/LUCENE-443?page=all ] > > > > Yonik Seeley resolved LUCENE-443. > > --------------------------------- > > > > Fix Version/s: 2.1 > > Resolution: Fixed > > Assignee: Yonik Seeley > > > > Thanks! I just committed this. > > > > > ConjunctionScorer tune-up > > > ------------------------- > > > > > > Key: LUCENE-443 > > > URL: http://issues.apache.org/jira/browse/LUCENE-443 > > > Project: Lucene - Java > > > Issue Type: Bug > > > Components: Search > > > Affects Versions: 1.9 > > > Environment: Linux, Java 1.5, Large Index with 4 million items > > and some heavily nested boolean queries > > > Reporter: Abdul Chaudhry > > > Assigned To: Yonik Seeley > > > Fix For: 2.1 > > > > > > Attachments: Conjunction20060921.patch, ConjunctionScorer.java, > > ConjunctionScorer.java > > > > > > > > > I just recently ran a load test on the latest code from lucene , which > > is using a new BooleanScore and noticed the ConjunctionScorer was crunching > > through objects , especially while sorting as part of the skipTo call. It > > turns a linked list into an array, sorts the array, then converts the array > > back to a linked list for further processing by the scoring engines below. > > > 'm not sure if anyone else is experiencing this as I have a very large > > index (> 4 million items) and I am issuing some heavily nested queries > > > Anyway, I decide to change the link list into an array and use a first > > and last marker to "simulate" a linked list. > > > This scaled much better during my load test as the java gargbage > > collector was less - umm - virulent > > > > -- > > This message is automatically generated by JIRA. > > - > > If you think it was sent incorrectly contact one of the administrators: > > http://issues.apache.org/jira/secure/Administrators.jspa > > - > > For more information on JIRA, see: http://www.atlassian.com/software/jira > > > > > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: [EMAIL PROTECTED] > > For additional commands, e-mail: [EMAIL PROTECTED] > > > > > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]