It turns out there are in fact documents for the same group in different shards, which must be what is causing this problem. It looks like we have a slight flaw in how we were trying to use the composite id routing.
Thanks for putting me on the right path.

On Fri, Aug 22, 2014 at 11:14 AM, Andrew Shumway <andrew.shum...@issinc.com> wrote:

> The Co-location section of this document
> http://searchhub.org/2013/06/13/solr-cloud-document-routing/ might be of
> interest to you. It mentions the need for using Solr Cloud routing to
> group documents in the same core so that grouping can work properly.
>
> --Andrew Shumway
>
>
> -----Original Message-----
> From: Bryan Bende [mailto:bbe...@gmail.com]
> Sent: Friday, August 22, 2014 9:01 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Incorrect group.ngroups value
>
> Thanks Jim.
>
> We've been using the composite id approach where we put group value as the
> leading portion of the id (i.e. groupValue!documentid), so I was expecting
> all of the documents for a given group to be in the same shard, but at
> least this gives me something to look into. I'm still suspicious of
> something changing between 4.6.1 and 4.8.1, because we've had the grouping
> implemented this way for a while, and only on the exact day we upgraded did
> someone bring this problem forward. I will keep investigating, thanks.
>
>
> On Fri, Aug 22, 2014 at 9:18 AM, jim ferenczi <jim.feren...@gmail.com>
> wrote:
>
> > Hi Bryan,
> > This is a known limitation of the grouping.
> > https://wiki.apache.org/solr/FieldCollapsing#RequestParameters
> >
> > group.ngroups:
> >
> >
> > *WARNING: If this parameter is set to true on a sharded environment,
> > all the documents that belong to the same group have to be located in
> > the same shard, otherwise the count will be incorrect. If you are
> > using SolrCloud <https://wiki.apache.org/solr/SolrCloud>, consider
> > using "custom hashing"*
> >
> > Cheers,
> > Jim
> >
> >
> >
> > 2014-08-21 21:44 GMT+02:00 Bryan Bende <bbe...@gmail.com>:
> >
> > > Is there any known issue with using group.ngroups in a distributed
> > > Solr using version 4.8.1?
> > >
> > > I recently upgraded a cluster from 4.6.1 to 4.8.1, and I'm noticing
> > > several queries where ngroups will be more than the actual groups
> > > returned in the response. For example, ngroups will say 5, but then
> > > there will be 3 groups in the response. It is not happening on all
> > > queries, only some.
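
The diagnosis reached in this thread (documents for one group ending up in more than one shard) can be checked with a small script. The following is only a rough sketch, not anything from the thread itself: it assumes you have already collected (group value, shard name) pairs for your documents, for example by querying each shard separately with distrib=false, and it assumes the distributed ngroups value behaves like the sum of the per-shard distinct group counts, which is consistent with the wiki warning quoted above. The function names and the sample data are made up for illustration.

```python
from collections import defaultdict


def shards_by_group(docs):
    """Map each group value to the set of shards that hold its documents.

    `docs` is an iterable of (group_value, shard_name) pairs; how those
    pairs are collected is left to the caller (hypothetical input format).
    """
    placement = defaultdict(set)
    for group_value, shard in docs:
        placement[group_value].add(shard)
    return placement


def split_groups(docs):
    """Return only the groups whose documents live in more than one shard."""
    return {
        group: sorted(shards)
        for group, shards in shards_by_group(docs).items()
        if len(shards) > 1
    }


def summed_ngroups(docs):
    """Group count under the assumption that each shard counts its own
    distinct groups and the per-shard counts are simply added together."""
    return sum(len(shards) for shards in shards_by_group(docs).values())


# Made-up sample data: groups A and B each have documents in two shards.
sample_docs = [
    ("A", "shard1"), ("B", "shard1"), ("C", "shard1"),
    ("A", "shard2"), ("B", "shard2"),
]

print("split groups:", split_groups(sample_docs))
print("summed per-shard count:", summed_ngroups(sample_docs))
print("actual distinct groups:", len(shards_by_group(sample_docs)))
```

With this sample data the summed count is 5 while there are only 3 distinct groups, which mirrors the symptom described in the original question. Any group reported by split_groups points at documents whose ids were not routed by the group value.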