I think what you're seeing might be a result of the overrequesting done in phase #1 of a distributed facet query.
The purpose of overrequesting is to mitigate the possibility of a constraint that should be in the topN for the collection as a whole, but is just outside the topN on every shard -- so it never makes it to the second phase of the distributed calculation.

The amount of overrequest is, by default, a multiplicative function of the user-specified facet.limit with a fudge factor (IIRC: 10+(1.5*facet.limit)).

If you're using an explicitly high facet.limit, you can try setting the overrequest ratio/count to 1.0/0 respectively to force Solr to only request the # of constraints you've specified from each shard, and then aggregate them...

https://lucene.apache.org/solr/6_3_0/solr-solrj/org/apache/solr/common/params/FacetParams.html#FACET_OVERREQUEST_RATIO
https://lucene.apache.org/solr/6_3_0/solr-solrj/org/apache/solr/common/params/FacetParams.html#FACET_OVERREQUEST_COUNT

One side note related to the workaround you suggested...

: One simple solution, in my case would be, now just thinking of it, run
: the query with no facets and no rows, get the numFound, and set that as
: facet.limit for the actual query.

...that assumes that the number of facet constraints returned is limited by the total number of documents matching the query -- in general there is no such guarantee because of multivalued fields (or faceting on tokenized fields), so this type of approach isn't a good idea as a generalized solution.

-Hoss
http://www.lucidworks.com/
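
For illustration only, here is a rough Python sketch of the arithmetic and parameters discussed above. It assumes the per-shard limit really is computed as 10+(1.5*facet.limit), which is quoted from memory and not checked against the Solr source. The helper names (shard_facet_limit, facet_params), the "category" field, and the sample documents are hypothetical. The parameter names facet.overrequest.ratio and facet.overrequest.count are the ones behind the FacetParams constants linked above.

```python
from urllib.parse import urlencode


def shard_facet_limit(facet_limit, ratio=1.5, count=10):
    """Per-shard limit for phase 1, assuming limit * ratio + count."""
    if facet_limit <= 0:
        # Non-positive limits (e.g. -1 for "no limit") are passed through
        # unchanged in this sketch.
        return facet_limit
    return int(facet_limit * ratio) + count


def facet_params(field, facet_limit, overrequest=True):
    """Build request parameters for a facet-only query (rows=0)."""
    params = {
        "q": "*:*",
        "rows": 0,
        "facet": "true",
        "facet.field": field,
        "facet.limit": facet_limit,
    }
    if not overrequest:
        # Ask each shard for exactly facet.limit constraints.
        params["facet.overrequest.ratio"] = "1.0"
        params["facet.overrequest.count"] = "0"
    return params


print(shard_facet_limit(100))                      # 160
print(shard_facet_limit(100, ratio=1.0, count=0))  # 100
print(urlencode(facet_params("category", 100000, overrequest=False)))

# Why numFound is not an upper bound on the number of constraints:
# two matching documents with a multivalued field yield five constraints.
docs = [{"tags": ["a", "b", "c"]}, {"tags": ["d", "e"]}]
num_found = len(docs)
constraints = {tag for doc in docs for tag in doc["tags"]}
print(num_found, len(constraints))                 # 2 5
```

The last few lines show the multivalued case in miniature: with numFound=2 used as facet.limit, three of the five constraints would be cut off.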