[jira] [Commented] (SOLR-7036) Faster method for group.facet

md (JIRA) Sat, 16 Jul 2016 11:04:33 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-7036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15380870#comment-15380870
 ]


md commented on SOLR-7036:
--------------------------

Could you please send a jstack thread dump taken during the run?

Sent from my iPhone

On 15 July 2016, at 21:36, Jamie Swain (JIRA) <j...@apache.org> wrote:


   [ 
https://issues.apache.org/jira/browse/SOLR-7036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15379870#comment-15379870
 ]

Jamie Swain commented on SOLR-7036:
-----------------------------------

[~mdvir1] I tried using the 7 files you sent last night, applied to 
f9c94706416c80dcdc4514256c2e4cbf975c386b.  I was able to build and run Solr, 
and then add around 500k docs to it.  I tried a normal query, a grouped query, 
a facet query, and a group + facet query, and all worked fine without 
"group.facet.method=uif".  If I add "group.facet.method=uif", I never get 
a response to my request; the request appears to hang.

I'm going to dig into this more later today, and I'll probably run this 
under the debugger to see what is happening.

This is what my query looks like (the parameters as echoed in the response 
header):
{code}
"responseHeader": {
   "zkConnected": true,
   "status": 0,
   "QTime": 422,
   "params": {
       "q": "*:*",
       "facet.field": "colorFamily",
       "json.nl": "flat",
       "omitHeader": "false",
       "group.facet": "true",
       "rows": "30",
       "facet": "true",
       "wt": "json",
       "group.field": "styleIdColor",
       "group": "true"
   }
},
{code}
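
For anyone who wants to reproduce the hanging request, here is a minimal sketch that turns the parameters above into a request URL and adds "group.facet.method=uif". The host, port, and collection name are hypothetical placeholders, and the "group.facet.method" parameter exists only with the patch from this issue applied. The sketch uses the Charset overload of URLEncoder.encode, so it assumes Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class GroupFacetQuerySketch {

    /** Returns the request parameters from the comment above, in a stable order. */
    public static Map<String, String> params() {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("q", "*:*");
        p.put("rows", "30");
        p.put("wt", "json");
        p.put("json.nl", "flat");
        p.put("omitHeader", "false");
        p.put("facet", "true");
        p.put("facet.field", "colorFamily");
        p.put("group", "true");
        p.put("group.field", "styleIdColor");
        p.put("group.facet", "true");
        // Parameter under test in this thread; only available with the patch applied.
        p.put("group.facet.method", "uif");
        return p;
    }

    /** URL-encodes each key and value and joins the pairs with '&'. */
    public static String toQueryString(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            joiner.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Hypothetical host, port, and collection name; adjust for your setup.
        String base = "http://localhost:8983/solr/mycollection/select";
        System.out.println(base + "?" + toQueryString(params()));
    }
}
```

Removing the "group.facet.method" entry gives the variant that was reported to work.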

In my schema, the colorFamily field used for faceting is like this:
{code}
<field stored="true" indexed="true" name="colorFamily" type="string" 
docValues="true"/>
{code}

The Solr logs don't show me much for this, unfortunately.


Faster method for group.facet
-----------------------------

               Key: SOLR-7036
               URL: https://issues.apache.org/jira/browse/SOLR-7036
           Project: Solr
        Issue Type: Improvement
        Components: faceting
  Affects Versions: 4.10.3
          Reporter: Jim Musil
          Assignee: Erick Erickson
           Fix For: 5.5, 6.0

       Attachments: SOLR-7036.patch, SOLR-7036.patch, SOLR-7036.patch, 
SOLR-7036.patch, performance.txt, source_for_patch.zip


This is a patch that speeds up requests made with group.facet=true. The 
original code that collects and counts unique facet values for each group does 
not use the improved field cache methods that have been added for normal 
faceting in recent versions.

Specifically, this approach leverages the UnInvertedField class, which provides 
a much faster way to look up docs that contain a term. I've also added a simple 
grouping map so that when a term is found for a doc, the group to which that 
doc belongs can be looked up quickly.

Group faceting was very slow for our data set, and when the number of docs or 
terms was high, latency spiked to multi-second requests. This solution provides 
better overall performance: the average dropped from 54ms to 32ms, and our 
slowest queries dropped from 6012ms to 991ms.

I also added a few tests.

I added an additional parameter so that you can choose between this method and 
the original: group.facet.method=fc selects the improved method, and 
group.facet.method=original (the default if not specified) selects the 
original.
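
The counting idea described above can be illustrated with a minimal, self-contained sketch. This is not the code from the patch and it does not use Solr's classes. It assumes a plain term-to-docs map in place of the uninverted field and a doc-to-group map in place of the grouping map, and the class, method, and variable names are hypothetical. For each facet term it counts the distinct groups that contain at least one matching doc with that term.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class GroupFacetSketch {

    /**
     * Counts, for each facet term, the number of distinct groups that contain
     * at least one doc with that term.
     *
     * @param termToDocs facet term -> ids of matching docs that contain the term
     * @param docToGroup doc id -> group value (stand-in for the grouping map)
     */
    public static Map<String, Integer> countGroupsPerTerm(
            Map<String, int[]> termToDocs, Map<Integer, String> docToGroup) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, int[]> e : termToDocs.entrySet()) {
            Set<String> groupsSeen = new HashSet<>();
            for (int doc : e.getValue()) {
                // One map lookup per doc finds the group it belongs to.
                String group = docToGroup.get(doc);
                if (group != null) {
                    groupsSeen.add(group);
                }
            }
            counts.put(e.getKey(), groupsSeen.size());
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<Integer, String> docToGroup = new HashMap<>();
        docToGroup.put(0, "style1-red");
        docToGroup.put(1, "style1-red");
        docToGroup.put(2, "style2-red");
        docToGroup.put(3, "style3-blue");

        Map<String, int[]> termToDocs = new HashMap<>();
        termToDocs.put("red", new int[] {0, 1, 2});
        termToDocs.put("blue", new int[] {3});

        // Docs 0 and 1 share a group, so "red" counts 2 groups, not 3 docs.
        System.out.println(countGroupsPerTerm(termToDocs, docToGroup));
        // prints {blue=1, red=2}
    }
}
```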



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
