I'm not so sure this is as bad as it sounds. When your collection is
sharded, no single node knows about the documents in other shards/nodes,
so to find the total number, a query will need to go to every node.

Building something that sends a single request to every node, combines
their collection statistics and aggregates them into a single result
sounds very complicated, and is likely overkill.

Do you need to collect this information often? Do you have a lot of
collections?
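
If it does turn out to be frequent, the one-thread-per-collection idea
Erick mentions below could be sketched roughly as follows. This is only a
sketch under assumptions: CollectionCounts and countAll are made-up names,
not part of Solr or SolrJ, and the counter argument is a placeholder for
whatever per-collection query you already run (for example a rows=0 query
whose numFound you read). It still sends one request per collection; it
just overlaps the waiting.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToLongFunction;

// Hypothetical helper for illustration; not part of Solr or SolrJ.
class CollectionCounts {

    // Upper bound on worker threads (an arbitrary choice for this sketch).
    private static final int MAX_THREADS = 16;

    // Runs one count per collection in parallel and returns name -> count,
    // in the same order as the input list. The counter is supplied by the
    // caller; in real use it would wrap the existing per-collection query.
    static Map<String, Long> countAll(List<String> collections,
                                      ToLongFunction<String> counter) {
        Map<String, Long> counts = new LinkedHashMap<>();
        if (collections.isEmpty()) {
            return counts;
        }
        ExecutorService pool = Executors.newFixedThreadPool(
                Math.min(collections.size(), MAX_THREADS));
        try {
            // Submit every count first so the requests overlap.
            List<Future<Long>> futures = new ArrayList<>();
            for (String name : collections) {
                Callable<Long> task = () -> counter.applyAsLong(name);
                futures.add(pool.submit(task));
            }
            // Then collect the results in input order.
            for (int i = 0; i < collections.size(); i++) {
                counts.put(collections.get(i), futures.get(i).get());
            }
            return counts;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while counting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a count failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With a SolrJ client you would pass a function that runs your existing
query against the named collection and returns its numFound. Whether this
helps depends on how much of the time is spent waiting on each request
rather than on the Solr nodes being busy.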

Upayavira


On Fri, Jun 5, 2015, at 06:29 AM, Zheng Lin Edwin Yeo wrote:
> I'm trying to write a SolrJ program in Java to read and consolidate all
> the information into a JSON file. The client will just need to call this
> SolrJ program and read this JSON file to get the details. But the problem
> is that we are still querying Solr once for each collection; this time it
> is just done in a for-loop in the SolrJ program, whereas previously it
> was done on the client side. I'm not sure whether this will improve
> performance.
> 
> For your suggestion on spawning a bunch of threads, does it mean the same
> thing as I did?
> 
> Regards,
> Edwin
> 
> 
> On 5 June 2015 at 12:03, Erick Erickson <erickerick...@gmail.com> wrote:
> 
> > Have you considered spawning a bunch of threads, one per collection
> > and having them all run in parallel?
> >
> > Best,
> > Erick
> >
> > On Thu, Jun 4, 2015 at 4:52 PM, Zheng Lin Edwin Yeo
> > <edwinye...@gmail.com> wrote:
> > > > The reason we wanted to do a single call is to improve performance,
> > > > as our application needs to list the total number of records in each
> > > > of the collections, and the number of records that match the query in
> > > > each of the collections.
> > >
> > > Currently we are querying each collection one by one to retrieve the
> > > numFound value and display them, but this can slow down the system
> > > > significantly when the number of collections grows. So we are thinking of
> > > ways to improve the speed in this area.
> > >
> > > Any other methods which you can suggest that we can do to overcome this
> > > speed problem?
> > >
> > > Regards,
> > > Edwin
> > > On 5 Jun 2015 00:16, "Erick Erickson" <erickerick...@gmail.com> wrote:
> > >
> > >> Not in a single call that I know of. These are really orthogonal
> > >> concepts. Getting the cluster status merely involves reading the
> > >> Zookeeper clusterstate whereas getting the total number of docs for
> > >> each would involve querying each collection, i.e. going to the Solr
> > >> nodes themselves. I'd guess it's unlikely to be combined.
> > >>
> > >> Best,
> > >> Erick
> > >>
> > >> On Thu, Jun 4, 2015 at 7:47 AM, Zheng Lin Edwin Yeo
> > >> <edwinye...@gmail.com> wrote:
> > >> > Hi,
> > >> >
> > >> > Would like to check, are we able to use the Collections API or any
> > >> > other method to list all the collections in the cluster together
> > >> > with the number of records in each of the collections in one output?
> > >> >
> > >> > Currently, I only know of the List Collections call,
> > >> > /admin/collections?action=LIST. However, this only lists the names of
> > >> > the collections that are in the cluster, but not the number of records.
> > >> >
> > >> > Is there a way to show the number of records in each of the
> > >> > collections as well?
> > >> >
> > >> > Regards,
> > >> > Edwin
> > >>
> >
