[jira] [Commented] (SOLR-7036) Faster method for group.facet

Erick Erickson (JIRA) Sun, 23 Oct 2016 19:55:09 -0700

    [ https://issues.apache.org/jira/browse/SOLR-7036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15600766#comment-15600766 ]


Erick Erickson commented on SOLR-7036:
--------------------------------------

OK, I was all ready to (finally) commit this patch this weekend and ran into problems.

The trivial one is that for 6x, SolrIndexSearcher.getLeafReader is now 
getSlowAtomicReader, but that's easy.

The two more "interesting" ones are:

1> In 6x I now get test failures in SimpleFacetsTest.testSimpleGroupedFacets. 
The problem is that over in FacetFieldProcessor, around line 220, maxTopVals is 
calculated. Due to the casting to a long and back to an int, maxTopVals becomes 
a negative number and this line barfs with a negative index exception:

final PriorityQueue<Slot> queue = new PriorityQueue<Slot>(maxTopVals).

Aha, sez I, it must be changes to FacetFieldProcessor. So I started diving into 
that code and got lost. This only happens in the "uif" case where facet.limit < 
0 from testSimpleGroupedFacets. So I tried to see how it all worked originally 
and.... the other tests that have a negative facet.limit don't get to that code.

Since the faceting code is something I'm not that up on I gave up at that 
point. Any clues as to what the right thing to do is?
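
For anyone who wants to see the failure mode in isolation, here is a minimal sketch of how a long-to-int narrowing cast can go negative, plus one possible way to clamp before narrowing. It is not the FacetFieldProcessor code: the class, the method names and the slotCount parameter are hypothetical, and treating a negative limit as "no limit" is an assumption based on the facet.limit < 0 case above.

```java
public class TopValsOverflow {

    /** Narrows with no range check: the result wraps once offset + limit exceeds Integer.MAX_VALUE. */
    public static int uncheckedTopVals(long offset, long limit) {
        return (int) (offset + limit);
    }

    /**
     * Clamps in long arithmetic first and narrows afterwards.
     * Assumption: a negative limit means "no limit", and slotCount (>= 0)
     * is the number of candidate values, so the queue never needs more room.
     */
    public static int clampedTopVals(long offset, long limit, int slotCount) {
        long wanted = limit < 0 ? Integer.MAX_VALUE : offset + limit;
        long capped = Math.min(wanted, (long) slotCount);
        return (int) Math.max(0L, capped);
    }

    public static void main(String[] args) {
        // 1 + Integer.MAX_VALUE does not fit in an int, so the cast wraps negative.
        System.out.println(uncheckedTopVals(1L, Integer.MAX_VALUE));
        // The clamped variant stays within [0, slotCount].
        System.out.println(clampedTopVals(1L, Integer.MAX_VALUE, 500));
        System.out.println(clampedTopVals(0L, -1L, 500));
    }
}
```

Whether clamping to the number of slots is the right fix for the "uif" path is a separate question; the sketch only shows why the value passed to the PriorityQueue constructor can end up below zero.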

****************
2> Trunk has moved to the DocValues iterator (see LUCENE-7407) so we have to 
make two separate patches. I tabled that though until we get 6x in shape. 
Anyone who wants to should feel free to do that though.





> Faster method for group.facet
> -----------------------------
>
>                 Key: SOLR-7036
>                 URL: https://issues.apache.org/jira/browse/SOLR-7036
>             Project: Solr
>          Issue Type: Improvement
>          Components: faceting
>    Affects Versions: 4.10.3
>            Reporter: Jim Musil
>            Assignee: Erick Erickson
>             Fix For: 5.5, 6.0
>
>         Attachments: SOLR-7036.patch, SOLR-7036.patch, SOLR-7036.patch, 
> SOLR-7036.patch, SOLR-7036.patch, SOLR-7036.patch, SOLR-7036_zipped.zip, 
> jstack-output.txt, performance.txt, source_for_patch.zip
>
>
> This is a patch that speeds up the performance of requests made with 
> group.facet=true. The original code that collects and counts unique facet 
> values for each group does not use the same improved field cache methods that 
> have been added for normal faceting in recent versions.
> Specifically, this approach leverages the UnInvertedField class which 
> provides a much faster way to look up docs that contain a term. I've also 
> added a simple grouping map so that when a term is found for a doc, it can 
> quickly look up the group to which it belongs.
> Group faceting was very slow for our data set and when the number of docs or 
> terms was high, the latency spiked to multi-second requests. This solution 
> provides better overall performance -- from an average of 54ms to 32ms. It 
> also dropped our slowest performing queries way down -- from 6012ms to 991ms.
> I also added a few tests.
> I added an additional parameter so that you can choose to use this method or 
> the original. Add group.facet.method=fc to use the improved method or 
> group.facet.method=original which is the default if not specified.
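
The grouping-map idea in the issue description above can be illustrated with a small, self-contained sketch. This is not the code from the attached patch and it does not use Solr classes: the termToDocs map is a hypothetical stand-in for a term-to-docs lookup, docToGroup is a hypothetical array mapping each doc id to a group ordinal, and the class and method names are made up for illustration.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

public class GroupFacetSketch {

    /**
     * For each facet term, counts the distinct groups that have at least
     * one doc containing the term.
     *
     * termToDocs: hypothetical term -> doc ids lookup
     * docToGroup: hypothetical doc id -> group ordinal map
     * numGroups:  number of distinct group ordinals
     */
    public static Map<String, Integer> countGroupsPerTerm(
            Map<String, int[]> termToDocs, int[] docToGroup, int numGroups) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        BitSet seenGroups = new BitSet(numGroups);
        for (Map.Entry<String, int[]> entry : termToDocs.entrySet()) {
            seenGroups.clear();
            for (int doc : entry.getValue()) {
                // Constant-time lookup of the group this doc belongs to.
                seenGroups.set(docToGroup[doc]);
            }
            // Each group is counted once per term, however many docs matched.
            counts.put(entry.getKey(), seenGroups.cardinality());
        }
        return counts;
    }
}
```

With docs 0 and 1 in group 0, doc 2 in group 1, and docs 3 and 4 in group 2, a term found in docs 0, 1 and 2 counts two groups, and a term found in docs 3 and 4 counts one.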



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

