I'm trying to write a SolrJ program in Java to read and consolidate all the
information into a JSON file. The client will just need to call this SolrJ
program and read the JSON file to get the details. The problem is that we
are still querying Solr once for each collection; this time it is done in
a for-loop in the SolrJ program, whereas previously it was done on the
client side. I'm not sure whether this will lead to any performance
improvement.

Regarding your suggestion to spawn a bunch of threads, is that the same
thing as what I did?
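
For comparison, the thread-per-collection idea might look roughly like the
sketch below: one task per collection is submitted to a thread pool, so the
queries run concurrently instead of one after another as in a for-loop. This
is only a sketch under assumptions. `countQuery` is a hypothetical placeholder
for whatever SolrJ call returns numFound for one collection, and the class and
method names are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToLongFunction;

public class ParallelCollectionCounts {

    /**
     * Runs countQuery once per collection, all in parallel, and returns
     * the counts keyed by collection name (in the order given).
     * countQuery is a hypothetical stand-in for the real per-collection
     * Solr query; it is not part of SolrJ.
     */
    public static Map<String, Long> countAll(List<String> collections,
                                             ToLongFunction<String> countQuery) {
        Map<String, Long> counts = new LinkedHashMap<>();
        if (collections.isEmpty()) {
            return counts;
        }
        // One thread per collection, as suggested in the thread.
        ExecutorService pool = Executors.newFixedThreadPool(collections.size());
        try {
            // Submit every query first so they all start before we wait.
            Map<String, Future<Long>> pending = new LinkedHashMap<>();
            for (String name : collections) {
                Callable<Long> task = () -> countQuery.applyAsLong(name);
                pending.put(name, pool.submit(task));
            }
            // Collect the results; get() blocks until that task is done.
            for (Map.Entry<String, Future<Long>> e : pending.entrySet()) {
                counts.put(e.getKey(), e.getValue().get());
            }
            return counts;
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while counting", ex);
        } catch (ExecutionException ex) {
            throw new IllegalStateException("A count query failed", ex.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With a sequential for-loop the total time is roughly the sum of the
per-collection query times. With the parallel version it should be closer to
the slowest single query, provided the Solr nodes can handle the concurrent
load.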

Regards,
Edwin


On 5 June 2015 at 12:03, Erick Erickson <erickerick...@gmail.com> wrote:

> Have you considered spawning a bunch of threads, one per collection
> and having them all run in parallel?
>
> Best,
> Erick
>
> On Thu, Jun 4, 2015 at 4:52 PM, Zheng Lin Edwin Yeo
> <edwinye...@gmail.com> wrote:
> > The reason we wanted to do a single call is to improve performance,
> > as our application needs to list the total number of records in each
> > of the collections, and the number of records that match the query in
> > each of the collections.
> >
> > Currently we are querying each collection one by one to retrieve the
> > numFound value and display them, but this can slow down the system
> > significantly when the number of collections grows. So we are thinking of
> > ways to improve the speed in this area.
> >
> > Are there any other methods you can suggest to overcome this
> > speed problem?
> >
> > Regards,
> > Edwin
> > On 5 Jun 2015 00:16, "Erick Erickson" <erickerick...@gmail.com> wrote:
> >
> >> Not in a single call that I know of. These are really orthogonal
> >> concepts. Getting the cluster status merely involves reading the
> >> Zookeeper clusterstate whereas getting the total number of docs for
> >> each would involve querying each collection, i.e. going to the Solr
> >> nodes themselves. I'd guess it's unlikely to be combined.
> >>
> >> Best,
> >> Erick
> >>
> >> On Thu, Jun 4, 2015 at 7:47 AM, Zheng Lin Edwin Yeo
> >> <edwinye...@gmail.com> wrote:
> >> > Hi,
> >> >
> >> > I would like to check: are we able to use the Collections API or any
> >> > other method to list all the collections in the cluster together with
> >> > the number of records in each of the collections in one output?
> >> >
> >> > Currently, I only know of the List Collections call,
> >> > /admin/collections?action=LIST. However, this only lists the names of
> >> > the collections that are in the cluster, but not the number of records.
> >> >
> >> > Is there a way to show the number of records in each of the
> >> > collections as well?
> >> >
> >> > Regards,
> >> > Edwin
> >>
>
