[jira] Issue Comment Edited: (LUCENE-1997) Explore performance of multi-PQ vs single-PQ sorting API

Yonik Seeley (JIRA) Thu, 29 Oct 2009 11:10:26 -0700

    [ https://issues.apache.org/jira/browse/LUCENE-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771466#action_12771466 ]


Yonik Seeley edited comment on LUCENE-1997 at 10/29/09 6:09 PM:
----------------------------------------------------------------

Here's some more mud to help clear the water ;-)  This is with the latest JDK7, 
tested twice to be sure; all results were within 0.5 percentage points of 
each other.

Linux odin 2.6.28-16-generic #55-Ubuntu SMP Tue Oct 20 19:48:32 UTC 2009 x86_64 
GNU/Linux
Java(TM) SE Runtime Environment (build 1.7.0-ea-b74) (Oct 15 2009)
java -Xms2048M -Xmx2048M -Xbatch -server
Phenom II x4 3GHz (dynamic freq scaling turned off) 

||Source||Seg size||Query||Tot hits||Sort||Top N||QPS old||QPS new||Pct change||
|random|balanced|<all>|5000000|rand int|10|28.02|18.86|{color:red}-32.7%{color}|
|random|balanced|<all>|5000000|rand int|25|27.93|18.80|{color:red}-32.7%{color}|
|random|balanced|<all>|5000000|rand int|50|23.89|21.77|{color:red}-8.9%{color}|
|random|balanced|<all>|5000000|rand int|100|23.74|21.21|{color:red}-10.7%{color}|
|random|balanced|<all>|5000000|rand int|500|22.92|17.30|{color:red}-24.5%{color}|
|random|balanced|<all>|5000000|rand int|1000|21.99|14.64|{color:red}-33.4%{color}|
|random|balanced|<all>|5000000|rand string|10|23.63|20.58|{color:red}-12.9%{color}|
|random|balanced|<all>|5000000|rand string|25|22.74|20.42|{color:red}-10.2%{color}|
|random|balanced|<all>|5000000|rand string|50|16.88|21.93|{color:green}29.9%{color}|
|random|balanced|<all>|5000000|rand string|100|19.32|21.42|{color:green}10.9%{color}|
|random|balanced|<all>|5000000|rand string|500|18.58|18.14|{color:red}-2.4%{color}|
|random|balanced|<all>|5000000|rand string|1000|18.08|15.25|{color:red}-15.7%{color}|
|random|balanced|<all>|5000000|country|10|23.89|20.70|{color:red}-13.4%{color}|
|random|balanced|<all>|5000000|country|25|22.59|20.58|{color:red}-8.9%{color}|
|random|balanced|<all>|5000000|country|50|16.84|22.04|{color:green}30.9%{color}|
|random|balanced|<all>|5000000|country|100|16.68|21.71|{color:green}30.2%{color}|
|random|balanced|<all>|5000000|country|500|19.65|18.60|{color:red}-5.3%{color}|
|random|balanced|<all>|5000000|country|1000|17.70|15.48|{color:red}-12.5%{color}|
|random|log|<all>|5000000|rand int|10|28.31|18.94|{color:red}-33.1%{color}|
|random|log|<all>|5000000|rand int|25|23.75|22.09|{color:red}-7.0%{color}|
|random|log|<all>|5000000|rand int|50|23.99|21.90|{color:red}-8.7%{color}|
|random|log|<all>|5000000|rand int|100|23.75|21.47|{color:red}-9.6%{color}|
|random|log|<all>|5000000|rand int|500|22.83|18.41|{color:red}-19.4%{color}|
|random|log|<all>|5000000|rand int|1000|21.99|15.96|{color:red}-27.4%{color}|
|random|log|<all>|5000000|rand string|10|22.92|20.61|{color:red}-10.1%{color}|
|random|log|<all>|5000000|rand string|25|23.36|22.27|{color:red}-4.7%{color}|
|random|log|<all>|5000000|rand string|50|16.96|22.12|{color:green}30.4%{color}|
|random|log|<all>|5000000|rand string|100|19.61|21.59|{color:green}10.1%{color}|
|random|log|<all>|5000000|rand string|500|18.02|19.03|{color:green}5.6%{color}|
|random|log|<all>|5000000|rand string|1000|18.54|16.51|{color:red}-10.9%{color}|
|random|log|<all>|5000000|country|10|24.32|20.65|{color:red}-15.1%{color}|
|random|log|<all>|5000000|country|25|23.46|20.72|{color:red}-11.7%{color}|
|random|log|<all>|5000000|country|50|22.71|20.62|{color:red}-9.2%{color}|
|random|log|<all>|5000000|country|100|16.78|21.78|{color:green}29.8%{color}|
|random|log|<all>|5000000|country|500|19.14|19.22|{color:green}0.4%{color}|
|random|log|<all>|5000000|country|1000|17.61|16.79|{color:red}-4.7%{color}|

Same setup, just w/o -Xbatch
java -Xms2048M -Xmx2048M -server

||Source||Seg size||Query||Tot hits||Sort||Top N||QPS old||QPS new||Pct change||
|random|balanced|<all>|5000000|rand int|10|28.63|24.16|{color:red}-15.6%{color}|
|random|balanced|<all>|5000000|rand int|25|28.24|19.51|{color:red}-30.9%{color}|
|random|balanced|<all>|5000000|rand int|50|29.24|19.21|{color:red}-34.3%{color}|
|random|balanced|<all>|5000000|rand int|100|27.42|20.03|{color:red}-27.0%{color}|
|random|balanced|<all>|5000000|rand int|500|26.38|16.82|{color:red}-36.2%{color}|
|random|balanced|<all>|5000000|rand int|1000|26.34|14.40|{color:red}-45.3%{color}|
|random|balanced|<all>|5000000|rand string|10|27.61|20.29|{color:red}-26.5%{color}|
|random|balanced|<all>|5000000|rand string|25|25.84|21.80|{color:red}-15.6%{color}|
|random|balanced|<all>|5000000|rand string|50|19.47|21.69|{color:green}11.4%{color}|
|random|balanced|<all>|5000000|rand string|100|19.17|19.40|{color:green}1.2%{color}|
|random|balanced|<all>|5000000|rand string|500|18.29|16.87|{color:red}-7.8%{color}|
|random|balanced|<all>|5000000|rand string|1000|17.09|14.35|{color:red}-16.0%{color}|
|random|balanced|<all>|5000000|country|10|22.48|21.42|{color:red}-4.7%{color}|
|random|balanced|<all>|5000000|country|25|20.86|21.88|{color:green}4.9%{color}|
|random|balanced|<all>|5000000|country|50|20.26|21.67|{color:green}7.0%{color}|
|random|balanced|<all>|5000000|country|100|18.32|19.60|{color:green}7.0%{color}|
|random|balanced|<all>|5000000|country|500|17.93|17.01|{color:red}-5.1%{color}|
|random|balanced|<all>|5000000|country|1000|18.92|14.48|{color:red}-23.5%{color}|
|random|log|<all>|5000000|rand int|10|28.71|24.35|{color:red}-15.2%{color}|
|random|log|<all>|5000000|rand int|25|28.47|19.55|{color:red}-31.3%{color}|
|random|log|<all>|5000000|rand int|50|28.19|19.38|{color:red}-31.3%{color}|
|random|log|<all>|5000000|rand int|100|27.89|20.31|{color:red}-27.2%{color}|
|random|log|<all>|5000000|rand int|500|25.13|17.64|{color:red}-29.8%{color}|
|random|log|<all>|5000000|rand int|1000|26.51|15.55|{color:red}-41.3%{color}|
|random|log|<all>|5000000|rand string|10|27.81|20.39|{color:red}-26.7%{color}|
|random|log|<all>|5000000|rand string|25|25.66|21.96|{color:red}-14.4%{color}|
|random|log|<all>|5000000|rand string|50|17.70|20.17|{color:green}14.0%{color}|
|random|log|<all>|5000000|rand string|100|19.28|19.63|{color:green}1.8%{color}|
|random|log|<all>|5000000|rand string|500|18.03|17.45|{color:red}-3.2%{color}|
|random|log|<all>|5000000|rand string|1000|18.84|15.29|{color:red}-18.8%{color}|
|random|log|<all>|5000000|country|10|22.58|21.47|{color:red}-4.9%{color}|
|random|log|<all>|5000000|country|25|21.09|20.36|{color:red}-3.5%{color}|
|random|log|<all>|5000000|country|50|21.03|21.80|{color:green}3.7%{color}|
|random|log|<all>|5000000|country|100|18.45|21.38|{color:green}15.9%{color}|
|random|log|<all>|5000000|country|500|17.89|17.69|{color:red}-1.1%{color}|
|random|log|<all>|5000000|country|1000|18.93|15.62|{color:red}-17.5%{color}|
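
For reading the tables above: the "Pct change" column is consistent with the relative change of "QPS new" against "QPS old", i.e. (new - old) / old. The sketch below reproduces two cells from the first table. The function names are made up for illustration and are not taken from sortBench.py, and because the printed QPS values are already rounded, a recomputed cell could differ from the table in the last digit.

```python
def pct_change(qps_old: float, qps_new: float) -> float:
    """Relative change of the new QPS versus the old QPS, in percent."""
    return 100.0 * (qps_new - qps_old) / qps_old


def jira_cell(pct: float) -> str:
    """Format a percentage as a Jira table cell: red if negative, green otherwise."""
    color = "red" if pct < 0 else "green"
    return "{color:%s}%.1f%%{color}" % (color, pct)


# First row of the -Xbatch table: rand int, top 10, 28.02 -> 18.86 QPS.
print(jira_cell(pct_change(28.02, 18.86)))
# rand string, top 50, balanced segments: 16.88 -> 21.93 QPS.
print(jira_cell(pct_change(16.88, 21.93)))
```

A negative value means the new code answered fewer queries per second than the old code for that row.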



      was (Author: [email protected]):
    Here's some more mud to help clear the water ;-)  This is with the latest 
JDK7, tested twice to be sure; all results were within 0.5 percentage 
points of each other.

Linux odin 2.6.28-16-generic #55-Ubuntu SMP Tue Oct 20 19:48:32 UTC 2009 x86_64 
GNU/Linux
Java(TM) SE Runtime Environment (build 1.7.0-ea-b74) (Oct 15 2009)
Phenom II x4 3GHz (dynamic freq scaling turned off) 

||Source||Seg size||Query||Tot hits||Sort||Top N||QPS old||QPS new||Pct change||
|random|balanced|<all>|5000000|rand int|10|28.02|18.86|{color:red}-32.7%{color}|
|random|balanced|<all>|5000000|rand int|25|27.93|18.80|{color:red}-32.7%{color}|
|random|balanced|<all>|5000000|rand int|50|23.89|21.77|{color:red}-8.9%{color}|
|random|balanced|<all>|5000000|rand int|100|23.74|21.21|{color:red}-10.7%{color}|
|random|balanced|<all>|5000000|rand int|500|22.92|17.30|{color:red}-24.5%{color}|
|random|balanced|<all>|5000000|rand int|1000|21.99|14.64|{color:red}-33.4%{color}|
|random|balanced|<all>|5000000|rand string|10|23.63|20.58|{color:red}-12.9%{color}|
|random|balanced|<all>|5000000|rand string|25|22.74|20.42|{color:red}-10.2%{color}|
|random|balanced|<all>|5000000|rand string|50|16.88|21.93|{color:green}29.9%{color}|
|random|balanced|<all>|5000000|rand string|100|19.32|21.42|{color:green}10.9%{color}|
|random|balanced|<all>|5000000|rand string|500|18.58|18.14|{color:red}-2.4%{color}|
|random|balanced|<all>|5000000|rand string|1000|18.08|15.25|{color:red}-15.7%{color}|
|random|balanced|<all>|5000000|country|10|23.89|20.70|{color:red}-13.4%{color}|
|random|balanced|<all>|5000000|country|25|22.59|20.58|{color:red}-8.9%{color}|
|random|balanced|<all>|5000000|country|50|16.84|22.04|{color:green}30.9%{color}|
|random|balanced|<all>|5000000|country|100|16.68|21.71|{color:green}30.2%{color}|
|random|balanced|<all>|5000000|country|500|19.65|18.60|{color:red}-5.3%{color}|
|random|balanced|<all>|5000000|country|1000|17.70|15.48|{color:red}-12.5%{color}|
|random|log|<all>|5000000|rand int|10|28.31|18.94|{color:red}-33.1%{color}|
|random|log|<all>|5000000|rand int|25|23.75|22.09|{color:red}-7.0%{color}|
|random|log|<all>|5000000|rand int|50|23.99|21.90|{color:red}-8.7%{color}|
|random|log|<all>|5000000|rand int|100|23.75|21.47|{color:red}-9.6%{color}|
|random|log|<all>|5000000|rand int|500|22.83|18.41|{color:red}-19.4%{color}|
|random|log|<all>|5000000|rand int|1000|21.99|15.96|{color:red}-27.4%{color}|
|random|log|<all>|5000000|rand string|10|22.92|20.61|{color:red}-10.1%{color}|
|random|log|<all>|5000000|rand string|25|23.36|22.27|{color:red}-4.7%{color}|
|random|log|<all>|5000000|rand string|50|16.96|22.12|{color:green}30.4%{color}|
|random|log|<all>|5000000|rand string|100|19.61|21.59|{color:green}10.1%{color}|
|random|log|<all>|5000000|rand string|500|18.02|19.03|{color:green}5.6%{color}|
|random|log|<all>|5000000|rand string|1000|18.54|16.51|{color:red}-10.9%{color}|
|random|log|<all>|5000000|country|10|24.32|20.65|{color:red}-15.1%{color}|
|random|log|<all>|5000000|country|25|23.46|20.72|{color:red}-11.7%{color}|
|random|log|<all>|5000000|country|50|22.71|20.62|{color:red}-9.2%{color}|
|random|log|<all>|5000000|country|100|16.78|21.78|{color:green}29.8%{color}|
|random|log|<all>|5000000|country|500|19.14|19.22|{color:green}0.4%{color}|
|random|log|<all>|5000000|country|1000|17.61|16.79|{color:red}-4.7%{color}|

  
> Explore performance of multi-PQ vs single-PQ sorting API
> --------------------------------------------------------
>
>                 Key: LUCENE-1997
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1997
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-1997.patch, LUCENE-1997.patch, LUCENE-1997.patch, 
> LUCENE-1997.patch, LUCENE-1997.patch, LUCENE-1997.patch, LUCENE-1997.patch, 
> LUCENE-1997.patch
>
>
> Spinoff from the recent "lucene 2.9 sorting algorithm" thread on java-dev,
> where a simpler (non-segment-based) comparator API is proposed that
> gathers results into multiple PQs (one per segment) and then merges
> them at the end.
> I started from John's multi-PQ code and worked it into
> contrib/benchmark so that we could run perf tests.  Then I generified
> the Python script I use for running search benchmarks (in
> contrib/benchmark/sortBench.py).
> The script first creates indexes with 1M docs (based on
> SortableSingleDocSource, and based on wikipedia, if available).  Then
> it runs various combinations:
>   * Index with 20 balanced segments vs index with the "normal" log
>     segment size
>   * Queries with different numbers of hits (only for wikipedia index)
>   * Different top N
>   * Different sorts (by title, for wikipedia, and by random string,
>     random int, and country for the random index)
> For each test, 7 search rounds are run and the best QPS is kept.  The
> script runs singlePQ then multiPQ, records the resulting best QPS
> for each, and produces a table (in Jira format) as output.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


