[jira] Issue Comment Edited: (LUCENE-1997) Explore performance of multi-PQ vs single-PQ sorting API

Mark Miller (JIRA) Mon, 02 Nov 2009 16:49:26 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772792#action_12772792
 ]


Mark Miller edited comment on LUCENE-1997 at 11/3/09 12:48 AM:
---------------------------------------------------------------

bq. 100th page at the same time index is at 100 segments? How many very's would 
you give it?

I'm not claiming 100th page with many segments - I have no info on that, and I 
agree it would be more rare. But it has come to my attention that 100th page is 
more common than I would have thought. (sorry - I wasn't very clear on that in 
my last comment - I am just referring to the deep paging - I previously would 
have thought it's more rare than I do now - though even before, it's something I 
wouldn't want to see a huge perf drop on)

In any case - no one is saying this change won't happen. Just that it's not 
likely to happen soon.

*edit*

Let me answer the question though - based on my experience with the 
mergefactors people like to use, and the cost of optimizing, I would say 100 
segments deserves no very. At best, it might be semi rare. Mixed with the 100th 
page req, I'd take it to rare. But that's just me guessing based on my 
Lucene/Solr experience - so it's not worth a whole ton.

      was (Author: markrmil...@gmail.com):
    bq. 100th page at the same time index is at 100 segments? How many very's 
would you give it?

I'm not claiming 100th page with many segments - I have no info on that, and I 
agree it would be more rare. But it has come to my attention that 100th page is 
more common than I would have thought. (sorry - I wasn't very clear on that in 
my last comment - I am just referring to the deep paging - I previously would 
have thought it's more rare than I do now - though even before, it's something I 
wouldn't want to see a huge perf drop on)

In any case - no one is saying this change won't happen. Just that it's not 
likely to happen soon.
  
> Explore performance of multi-PQ vs single-PQ sorting API
> --------------------------------------------------------
>
>                 Key: LUCENE-1997
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1997
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-1997.patch, LUCENE-1997.patch, LUCENE-1997.patch, 
> LUCENE-1997.patch, LUCENE-1997.patch, LUCENE-1997.patch, LUCENE-1997.patch, 
> LUCENE-1997.patch, LUCENE-1997.patch
>
>
> Spinoff from recent "lucene 2.9 sorting algorithm" thread on java-dev,
> where a simpler (non-segment-based) comparator API is proposed that
> gathers results into multiple PQs (one per segment) and then merges
> them at the end.
> I started from John's multi-PQ code and worked it into
> contrib/benchmark so that we could run perf tests.  Then I generified
> the Python script I use for running search benchmarks (in
> contrib/benchmark/sortBench.py).
> The script first creates indexes with 1M docs (based on
> SortableSingleDocSource, and based on wikipedia, if available).  Then
> it runs various combinations:
>   * Index with 20 balanced segments vs index with the "normal" log
>     segment size
>   * Queries with different numbers of hits (only for wikipedia index)
>   * Different top N
>   * Different sorts (by title, for wikipedia, and by random string,
>     random int, and country for the random index)
> For each test, 7 search rounds are run and the best QPS is kept.  The
> script runs singlePQ then multiPQ, and records the resulting best QPS
> for each and produces a table (in Jira format) as output.
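
For readers unfamiliar with the two approaches named in the issue title, here is a minimal Python sketch of the idea, not the Lucene code or the attached patch. It assumes each segment is just a list of integer sort keys and that "top N" means the N largest keys; the function names top_n_single_pq and top_n_multi_pq are hypothetical and exist only for this illustration.

```python
import heapq
from itertools import islice


def top_n_single_pq(segments, n):
    """Collect the n largest keys using one bounded heap shared by all segments."""
    heap = []  # min-heap holding the best n keys seen so far
    for segment in segments:
        for key in segment:
            if len(heap) < n:
                heapq.heappush(heap, key)
            elif key > heap[0]:
                # Replace the current weakest entry with the better key.
                heapq.heapreplace(heap, key)
    return sorted(heap, reverse=True)


def top_n_multi_pq(segments, n):
    """Collect the n largest keys per segment, then merge the per-segment results."""
    # heapq.nlargest returns each segment's best keys in descending order.
    per_segment = [heapq.nlargest(n, segment) for segment in segments]
    # Merge the already-sorted per-segment lists and keep the first n overall.
    merged = heapq.merge(*per_segment, reverse=True)
    return list(islice(merged, n))


if __name__ == "__main__":
    segments = [[5, 1, 9], [7, 3], [8, 2, 6]]
    print(top_n_single_pq(segments, 3))
    print(top_n_multi_pq(segments, 3))
```

Both functions return the same result. The multi-PQ variant holds up to N entries per segment before the final merge, which is why the combination of many segments and a deep page (large N) is the case debated in the comment above.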

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
