[ https://issues.apache.org/jira/browse/SOLR-7036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15691507#comment-15691507 ]
Erick Erickson commented on SOLR-7036:
--------------------------------------

I'm thoroughly confused about the state of these two JIRAs, this one and
SOLR-4763.

1> Do JSON facets supersede this? Should we just be moving to JSON facets? If
yes, has the refinement step been added to JSON facets? Or is it even
necessary/relevant?

2> Does enabling DocValues sidestep this problem? We're recommending docValues
for grouping and faceting after all. On some tests I did, having DocValues for
these fields made the timings roughly equal, but that may just mean I'm not
testing correctly.

3> Last time I was in here on 23-Oct, there were some problems with the patch.
Any progress on that front? I just ran the test that was failing, so it looks
like maybe the changes for SOLR-9654 [~yo...@apache.org] might have addressed
point <1>. Not sure there's really anything to be done for <2>.

> Faster method for group.facet
> -----------------------------
>
>                 Key: SOLR-7036
>                 URL: https://issues.apache.org/jira/browse/SOLR-7036
>             Project: Solr
>          Issue Type: Improvement
>          Components: faceting
>    Affects Versions: 4.10.3
>            Reporter: Jim Musil
>            Assignee: Erick Erickson
>             Fix For: 5.5, 6.0
>
>         Attachments: SOLR-7036.patch, SOLR-7036.patch, SOLR-7036.patch,
> SOLR-7036.patch, SOLR-7036.patch, SOLR-7036.patch, SOLR-7036_zipped.zip,
> jstack-output.txt, performance.txt, source_for_patch.zip
>
>
> This is a patch that speeds up the performance of requests made with
> group.facet=true. The original code that collects and counts unique facet
> values for each group does not use the same improved field cache methods that
> have been added for normal faceting in recent versions.
> Specifically, this approach leverages the UninvertedField class, which
> provides a much faster way to look up docs that contain a term. I've also
> added a simple grouping map so that when a term is found for a doc, it can
> quickly look up the group to which it belongs.
> Group faceting was very slow for our data set, and when the number of docs or
> terms was high, latency spiked to multiple seconds per request. This solution
> provides better overall performance -- from an average of 54ms to 32ms. It
> also dropped our slowest performing queries way down -- from 6012ms to 991ms.
> I also added a few tests.
> I added an additional parameter so that you can choose to use this method or
> the original. Add group.facet.method=fc to use the improved method, or
> group.facet.method=original, which is the default if not specified.
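
For readers following the description above, here is a minimal Java sketch of the "grouping map" idea: as each term is found for a matching doc, the doc's group is looked up directly, and a facet value is counted once per distinct group rather than once per doc. This is not the code from the attached patch and it does not use Solr's UninvertedField. The class name GroupFacetSketch, the method countGroupsPerTerm, and the array-based inputs (docTerms, docToGroup) are hypothetical stand-ins chosen for illustration.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Illustrative only: plain arrays stand in for Solr's per-document term
// lookup and for the doc -> group map mentioned in the issue description.
public class GroupFacetSketch {

    /**
     * For each facet term, count the number of distinct groups that contain
     * at least one matching doc with that term.
     *
     * @param matchingDocs doc ids matched by the query
     * @param docTerms     docTerms[doc] lists the facet terms of that doc
     * @param docToGroup   docToGroup[doc] is the group ordinal of that doc
     */
    static Map<String, Integer> countGroupsPerTerm(
            int[] matchingDocs, String[][] docTerms, int[] docToGroup) {
        // term -> set of group ordinals already counted for that term
        Map<String, Set<Integer>> groupsSeen = new TreeMap<>();
        for (int doc : matchingDocs) {
            int group = docToGroup[doc]; // direct array lookup per doc
            for (String term : docTerms[doc]) {
                groupsSeen.computeIfAbsent(term, t -> new HashSet<>()).add(group);
            }
        }
        // Collapse each set to its size: one count per distinct group.
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Set<Integer>> e : groupsSeen.entrySet()) {
            counts.put(e.getKey(), e.getValue().size());
        }
        return counts;
    }

    public static void main(String[] args) {
        // Docs 0 and 1 share group 0; docs 2 and 3 are in groups 1 and 2.
        int[] docToGroup = {0, 0, 1, 2};
        String[][] docTerms = {{"red"}, {"red", "blue"}, {"red"}, {"blue"}};
        int[] matchingDocs = {0, 1, 2, 3};
        System.out.println(countGroupsPerTerm(matchingDocs, docTerms, docToGroup));
    }
}
```

In this toy data, "red" appears in three docs but only two groups, so a group-aware count differs from a plain per-doc facet count. A real implementation would presumably work on term and group ordinals rather than strings and hash sets; the sketch only shows the counting rule.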