I don't think there is any way around custom logic in this case. If the indexing process knows that documents have to be grouped, then it had better put them together on the same shard.
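A rough sketch of what I mean by custom logic, in Python. It hashes the group field value to pick a shard, so all documents of one group end up in the same batch. The shard URLs and the function names are made up for illustration, and it assumes you post each batch directly to its core and that Solr does not redistribute the documents afterwards, which I have not verified for the build discussed here.

```python
import hashlib
from collections import defaultdict

# Hypothetical update endpoints, one per shard; replace with your own cores.
SHARD_URLS = [
    "http://localhost:8983/solr/shard1/update",
    "http://localhost:8984/solr/shard2/update",
]


def shard_for_group(group_value, num_shards):
    """Map a group value to a shard index using a hash that is stable across runs."""
    digest = hashlib.md5(str(group_value).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def partition_by_group(docs, group_field, shard_urls):
    """Bucket documents per shard URL so that each group stays on a single shard."""
    batches = defaultdict(list)
    for doc in docs:
        index = shard_for_group(doc[group_field], len(shard_urls))
        batches[shard_urls[index]].append(doc)
    return dict(batches)
```

Each batch would then be sent to its own shard by the indexing client. The hash is taken with hashlib rather than Python's built-in hash() so that the mapping stays the same between indexing runs.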
-Saroj

On Mon, Jun 11, 2012 at 6:37 AM, Nitesh Nandy <niteshna...@gmail.com> wrote:

> Martijn,
>
> How do we add a custom algorithm for distributing documents in Solr Cloud?
> According to this discussion
> http://lucene.472066.n3.nabble.com/SolrCloud-how-to-index-documents-into-a-specific-core-and-how-to-search-against-that-core-td3985262.html
> , Mark discourages users from using custom distribution mechanism in Solr
> Cloud.
>
> Load balancing is not an issue for us at the moment. In that case, how
> should we implement a custom partitioning algorithm.
>
>
> On Mon, Jun 11, 2012 at 6:23 PM, Martijn v Groningen <
> martijn.v.gronin...@gmail.com> wrote:
>
> > The ngroups returns the number of groups that have matched with the
> > query. However if you want ngroups to be correct in a distributed
> > environment you need
> > to put document belonging to the same group into the same shard.
> > Groups can't cross shard boundaries. I guess you need to do
> > some manual document partitioning.
> >
> > Martijn
> >
> > On 11 June 2012 14:29, Nitesh Nandy <niteshna...@gmail.com> wrote:
> > > Version: Solr 4.0 (svn build 30th may, 2012) with Solr Cloud (2 slices
> > > and 2 shards)
> > >
> > > The setup was done as per the wiki:
> > > http://wiki.apache.org/solr/SolrCloud
> > >
> > > We are doing distributed search. While querying, we use field collapsing
> > > with "ngroups" set as true as we need the number of search results.
> > >
> > > However, there is a difference in the number of "result list" returned
> > > and the "ngroups" value returned.
> > >
> > > Ex:
> > >
> > > http://localhost:8983/solr/select?q=message:blah%20AND%20userid:3&&group=true&group.field=id&group.ngroups=true
> > >
> > >
> > > The response XMl looks like
> > >
> > > <response>
> > >   <script/>
> > >   <lst name="responseHeader">
> > >     <int name="status">0</int>
> > >     <int name="QTime">46</int>
> > >     <lst name="params">
> > >       <str name="group.field">id</str>
> > >       <str name="group.ngroups">true</str>
> > >       <str name="group">true</str>
> > >       <str name="q">messagebody:monit AND usergroupid:3</str>
> > >     </lst>
> > >   </lst>
> > >   <lst name="grouped">
> > >     <lst name="id">
> > >       <int name="matches">10</int>
> > >       <int name="ngroups">9</int>
> > >       <arr name="groups">
> > >         <lst>
> > >           <str name="groupValue">320043</str>
> > >           <result name="doclist" numFound="1" start="0">
> > >             <doc>...</doc>
> > >           </result>
> > >         </lst>
> > >         <lst>
> > >           <str name="groupValue">398807</str>
> > >           <result name="doclist" numFound="5" start="0" maxScore="2.4154348">...
> > >           </result>
> > >         </lst>
> > >         <lst>
> > >           <str name="groupValue">346878</str>
> > >           <result name="doclist" numFound="2" start="0">...</result>
> > >         </lst>
> > >         <lst>
> > >           <str name="groupValue">346880</str>
> > >           <result name="doclist" numFound="2" start="0">...</result>
> > >         </lst>
> > >       </arr>
> > >     </lst>
> > >   </lst>
> > > </response>
> > >
> > > So you can see that the ngroups value returned is 9 and the actual number
> > > of groups returned is 4
> > >
> > > Why do we have this discrepancy in the ngroups, matches and actual number
> > > of groups. Is this an open issue ?
> > >
> > > Any kind of help is appreciated.
> > >
> > > --
> > > Regards,
> > >
> > > Nitesh Nandy
> >
> >
> >
> > --
> > Met vriendelijke groet,
> >
> > Martijn van Groningen
>
>
>
> --
> Regards,
>
> Nitesh Nandy