[ 
https://issues.apache.org/jira/browse/SOLR-4763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720994#comment-13720994
 ] 

Tom Burton-West commented on SOLR-4763:
---------------------------------------

I have similar problems with performance, but in my case memory use is an issue 
as well. This is probably an extreme use case, but I thought it might be 
helpful to add to the discussion.

We currently index close to 11 million books with the entire book being a Solr 
document.  We are considering instead indexing pages as the Solr document and 
using grouping to return results organized by book.
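
For readers less familiar with grouping, here is a minimal sketch of the
request parameters such a page-level setup might send. The field names
book_id and subject and the query field ocr are hypothetical and not taken
from our schema; the group.* and facet.* parameter names are standard Solr
request parameters.

```python
from urllib.parse import urlencode


def grouped_query_params(query, group_field="book_id",
                         facet_field="subject", use_group_facet=False):
    """Build a Solr query string that groups page documents by book.

    group_field and facet_field are hypothetical field names chosen
    for illustration only.
    """
    params = [
        ("q", query),
        ("group", "true"),              # turn on result grouping
        ("group.field", group_field),   # one group per book
        ("group.ngroups", "true"),      # also report the number of groups
        ("facet", "true"),
        ("facet.field", facet_field),
    ]
    if use_group_facet:
        # Facet counts are computed per group (per book).
        params.append(("group.facet", "true"))
    else:
        # Facet counts are based on the top document of each group.
        params.append(("group.truncate", "true"))
    return urlencode(params)


truncate_qs = grouped_query_params("ocr:whale")
group_facet_qs = grouped_query_params("ocr:whale", use_group_facet=True)
```

The two query strings differ only in the last parameter, which is the
difference being timed in the tests described in this comment.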

I'm currently testing an index of about 1 million books indexed at the page 
level and spread over 3 shards, about 360 million pages in total.  For a 
worst-case query that returns about 200 million documents, group.truncate takes 
about 10 seconds (which is acceptable for us as a worst case).  However, 
group.facet takes closer to 15 minutes.  We are running on a server with 74GB 
of memory, 32GB of which is dedicated to the JVM.  With group.facet, memory use 
for this query climbs above roughly 30GB and then multiple full garbage 
collections occur.  

In contrast, for normal rather than worst-case queries, our 90th percentile 
queries (which return only a few million hits rather than 200 million) took 
about 700 ms with group.truncate and 2000 ms with group.facet.
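
To put the timings above side by side, here is a rough calculation of the
slowdown of group.facet relative to the truncate variant. The inputs are the
approximate figures quoted in this comment ("about 10 seconds", "closer to 15
minutes"), so the ratios are only indicative.

```python
# Approximate timings quoted above, converted to seconds.
worst_truncate_s = 10            # worst-case query, truncate variant
worst_group_facet_s = 15 * 60    # worst-case query, group.facet
typical_truncate_s = 0.7         # 90th percentile query, truncate variant
typical_group_facet_s = 2.0      # 90th percentile query, group.facet

# Slowdown factor of group.facet in each case.
worst_ratio = worst_group_facet_s / worst_truncate_s
typical_ratio = typical_group_facet_s / typical_truncate_s

print(f"worst case: about {worst_ratio:.0f}x slower")
print(f"90th percentile: about {typical_ratio:.1f}x slower")
```

The gap is roughly 90x in the worst case but only about 3x for the 90th
percentile queries.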

I'm wondering how much of the performance degradation others are observing 
might be due to memory requirements and garbage-collection slowdowns.

Tom

                
> Performance issue when using group.facet=true
> ---------------------------------------------
>
>                 Key: SOLR-4763
>                 URL: https://issues.apache.org/jira/browse/SOLR-4763
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.2
>            Reporter: Alexander Koval
>
> I do not know whether this is a bug or not, but calculating facets with 
> {{group.facet=true}} is too slow.
> I have a query that returns:
> {code}
> "matches": 730597,
> "ngroups": 24024,
> {code}
> 1. All queries with {{group.facet=true}}:
> {code}
> "QTime": 5171
> "facet": {
>     "time": 4716
> {code}
> 2. Without {{group.facet}}:
> * First query:
> {code}
> "QTime": 3284
> "facet": {
>     "time": 3104
> {code}
> * Next queries:
> {code}
> "QTime": 230,
> "facet": {
>     "time": 76
> {code}
> So I think that with {{group.facet=true}} Solr doesn't use a cache to 
> calculate facets.
> Is it possible to improve the performance of facets when 
> {{group.facet=true}}?

