[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979898#comment-13979898 ]
Brett Lucey commented on SOLR-2894:
-----------------------------------

Andrew actually raised that question to me yesterday as well, and I spent a little bit of time looking into it.

For the initial request to a shard, we only lower the mincount if the facet limit is set to something other than -1. In your case, the limit is 10 for the top-level pivot. We know we will (at most) get back 15 terms from each shard in this case. Because we are only faceting on a limited number of terms, having a mincount of 0 here gives us the benefit of potentially avoiding refinement.

In refinement requests, we still need to know when a shard has responded to us with its count for a term, so the mincount is -1 in that case because we are interested in the term even if the count is zero. It allows us to mark the shard as having responded and continue on. It's possible that we might be able to change this, but at the point of refinement it's a rather targeted request, so I don't expect there to be a significant benefit to doing so. In your case, with the facet limit being -1 on f2-f5, no refinement would be performed anyway.

When we designed this implementation, the most important factor for us was speed, and we were willing to get it at a cost of memory. By making these changes, we reduced queries which previously took around 70 seconds for us down to around 600 milliseconds.

I suspect that the biggest factor in the poor memory utilization is the wide-open nature of using a facet.limit of -1, especially on a pivot so deep. Keep in mind that for each level of depth you add to a pivot, the memory and time required will grow exponentially. Don't forget that if you are querying a node and all of the shards are located within the same Java VM, you are incurring the memory cost of both shards plus the node responding to the user query, all within the same heap.
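To make the growth with pivot depth concrete, here is a rough back-of-the-envelope sketch in Python. It is not taken from the patch: the function name, the per-level distinct-value counts, and the example numbers are all hypothetical, and it ignores shard over-requesting (such as the 15 terms returned for a limit of 10). It only computes a worst-case upper bound on the number of pivot entries one shard could return, assuming every value combination actually occurs in the index.

```python
def worst_case_pivot_nodes(cardinalities, limits=None):
    """Upper bound on pivot entries across all levels of one pivot.

    cardinalities: distinct value count per pivot field, outermost first
                   (hypothetical inputs, not read from any index).
    limits:        facet.limit per level; a negative value means unlimited.
    """
    if limits is None:
        limits = [-1] * len(cardinalities)
    total = 0
    width = 1  # number of entries at the current level
    for distinct, limit in zip(cardinalities, limits):
        # Each parent entry can have at most this many children.
        per_parent = distinct if limit < 0 else min(distinct, limit)
        width *= per_parent
        total += width
    return total


if __name__ == "__main__":
    fields = [50, 50, 50, 50, 50]  # made-up cardinalities for a 5-level pivot
    print(worst_case_pivot_nodes(fields))                        # all levels unlimited
    print(worst_case_pivot_nodes(fields, [10, -1, -1, -1, -1]))  # limit only the top level
    print(worst_case_pivot_nodes(fields, [10] * 5))              # limit every level
```

Real indexes are usually much sparser than this bound, but it shows why limiting only the top level leaves most of the cost in place when the deeper levels use a facet.limit of -1.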
I took a quick look at the code today while waiting for some other processes to finish, and I don't see any obvious low-hanging fruit to free up a small amount of memory.

> Implement distributed pivot faceting
> ------------------------------------
>
> Key: SOLR-2894
> URL: https://issues.apache.org/jira/browse/SOLR-2894
> Project: Solr
> Issue Type: Improvement
> Reporter: Erik Hatcher
> Fix For: 4.9, 5.0
>
> Attachments: SOLR-2894-reworked.patch, SOLR-2894.patch,
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch,
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch,
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch,
> SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch,
> SOLR-2894.patch, SOLR-2894.patch, dateToObject.patch
>
>
> Following up on SOLR-792, pivot faceting currently only supports
> undistributed mode. Distributed pivot faceting needs to be implemented.

--
This message was sent by Atlassian JIRA
(v6.2#6252)