[jira] [Commented] (SOLR-6803) Pivot Performance

Neil Ireson (JIRA) Thu, 11 Dec 2014 03:57:08 -0800

    [ 
https://issues.apache.org/jira/browse/SOLR-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242440#comment-14242440
 ]


Neil Ireson commented on SOLR-6803:
-----------------------------------

Well, I compiled 4.10.2 with the above changes (i.e. moving the getSubset call 
down a couple of lines) and obtained the following times.

4.10.2 - patched (Processing time in ms)
| Values (#) | Combined (ms) | Facet (ms) | Pivot (ms) |
|        100 |            25 |         37 |         79 |
|       1000 |           203 |         77 |        135 |
|      10000 |          1577 |        289 |        404 |
|     100000 |          2985 |       1096 |       1158 |
|     500000 |          3474 |       3892 |       2921 |
|    1000000 |          4421 |       7123 |       4655 |

So these results look to be back in line with version 4.9. I ran "ant test" and 
everything passed, but I don't know whether the PivotFacetProcessor code in the 
subField != null block is covered by the tests.
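
To put the pivot numbers side by side, here is a small sketch that computes the pivot time relative to 4.9.1 for both 4.10.1 and the patched 4.10.2 build. The timings are copied from the tables in this thread; the class and method names are made up for illustration and are not part of PivotPerformanceTest.java.

```java
public class PivotSlowdown {
    // Number of pivot values per test, as listed in the tables.
    public static final int[] VALUES = {100, 1000, 10000, 100000, 500000, 1000000};

    // Pivot timings in ms, copied from the tables in this thread.
    public static final long[] PIVOT_4_9_1 = {52, 115, 310, 978, 2476, 3725};
    public static final long[] PIVOT_4_10_1 = {75, 265, 1826, 16594, 99682, 208873};
    public static final long[] PIVOT_4_10_2_PATCHED = {79, 135, 404, 1158, 2921, 4655};

    // How many times slower (or faster, if below 1.0) a run is than the baseline.
    public static double ratio(long timeMs, long baselineMs) {
        return (double) timeMs / baselineMs;
    }

    public static void main(String[] args) {
        System.out.printf("%10s %18s %18s%n", "Values", "4.10.1 / 4.9.1", "patched / 4.9.1");
        for (int i = 0; i < VALUES.length; i++) {
            System.out.printf("%10d %18.2f %18.2f%n",
                    VALUES[i],
                    ratio(PIVOT_4_10_1[i], PIVOT_4_9_1[i]),
                    ratio(PIVOT_4_10_2_PATCHED[i], PIVOT_4_9_1[i]));
        }
    }
}
```

At 1,000,000 values the unpatched 4.10.1 pivot is roughly 56 times slower than 4.9.1, while the patched build is within about 25% of it. Note that the 4.9.1 and 4.10.1 figures are averages of five runs, so the comparison is only indicative.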

> Pivot Performance
> -----------------
>
>                 Key: SOLR-6803
>                 URL: https://issues.apache.org/jira/browse/SOLR-6803
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.10.2
>            Reporter: Neil Ireson
>            Priority: Minor
>         Attachments: PivotPerformanceTest.java
>
>
> I found that my pivot search for terms per day was taking an age so I knocked 
> up a quick test, using a collection of 1 million documents with a different 
> number of random terms and times, to compare different ways of getting the 
> counts.
> 1) Combined = combining the term and time in a single field.
> 2) Facet = for each term set the query to the term and then get the time 
> facet 
> 3) Pivot = use the term/time pivot facet.
> The following two tables present the results for version 4.9.1 vs 4.10.1, as 
> an average of five runs.
> 4.9.1 (Processing time in ms)
> | Values (#) | Combined (ms) | Facet (ms) | Pivot (ms) |
> |        100 |            22 |         21 |         52 |
> |       1000 |           178 |         57 |        115 |
> |      10000 |          1363 |        211 |        310 |
> |     100000 |          2592 |       1009 |        978 |
> |     500000 |          3125 |       3753 |       2476 |
> |    1000000 |          3957 |       6789 |       3725 |
> 4.10.1 (Processing time in ms)
> | Values (#) | Combined (ms) | Facet (ms) | Pivot (ms) |
> |        100 |            21 |         21 |         75 |
> |       1000 |           188 |         60 |        265 |
> |      10000 |          1438 |        215 |       1826 |
> |     100000 |          2768 |       1073 |      16594 |
> |     500000 |          3266 |       3686 |      99682 |
> |    1000000 |          4080 |       6777 |     208873 |
> The results show that, as the number of pivot values increases (i.e. number 
> of terms * number of times), pivot performance in 4.10.1 gets progressively 
> worse.
> I tried to look at the code but there were a lot of changes in pivoting 
> between 4.9 and 4.10, and so it is not clear to me what has caused the 
> performance issues. However, the results seem to indicate that if the pivot 
> were simply a combined facet search, it could potentially produce better and 
> more robust performance.
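
For readers who have not opened the attachment, the three approaches in the description roughly correspond to the request shapes sketched below. This is only an illustration: `facet`, `facet.field`, `facet.pivot`, `q` and `rows` are standard Solr request parameters, but the field names `term`, `time` and `term_time` and all class and method names are assumptions, not taken from PivotPerformanceTest.java.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class PivotQueryShapes {
    // Parameters shared by all three approaches: facet counts only, no documents.
    public static Map<String, String> base(String q) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("q", q);
        p.put("rows", "0");
        p.put("facet", "true");
        return p;
    }

    // 1) Combined: one facet over a single field holding term and time together.
    //    "term_time" is a hypothetical field name.
    public static Map<String, String> combined() {
        Map<String, String> p = base("*:*");
        p.put("facet.field", "term_time");
        return p;
    }

    // 2) Facet: one request per term, restricting the query to that term
    //    and faceting on the (hypothetical) "time" field.
    public static Map<String, String> facetPerTerm(String term) {
        Map<String, String> p = base("term:" + term);
        p.put("facet.field", "time");
        return p;
    }

    // 3) Pivot: a single request using the term/time pivot facet.
    public static Map<String, String> pivot() {
        Map<String, String> p = base("*:*");
        p.put("facet.pivot", "term,time");
        return p;
    }

    // Joins the parameters into a URL-encoded query string (requires Java 10+).
    public static String toQueryString(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            joiner.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(toQueryString(combined()));
        System.out.println(toQueryString(facetPerTerm("foo")));
        System.out.println(toQueryString(pivot()));
    }
}
```

The second approach issues one request per term, so its cost grows with the number of distinct terms, whereas the first and third are single requests.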



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

