So is it a better approach to query for fewer rows, say 500, and keep
increasing the start parameter? Wouldn't that be slower, since the start
parameter keeps increasing and I will also be sorting by the same field in
each of the queries made to the multiple shards?

Also, does it make sense to have all these documents in the same shard? I
went for this approach because the shard that is queried the most is small,
which saves a lot of time on all the stats queries. This shard is only about
5 GB, whereas the entire index will be about 50 GB.
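
For reference, a minimal sketch of how a request could be limited to the small shard by passing Solr's shards parameter explicitly instead of relying on the default shard list. The host, core names, and sort field below are hypothetical, and build_select_url is just an illustrative helper, not part of any Solr client.

```python
from urllib.parse import urlencode, parse_qs, urlsplit


def build_select_url(base_url, query, sort, start, rows, shards=None):
    """Build a /select URL; if shards is given, restrict the request to them."""
    params = {"q": query, "sort": sort, "start": start, "rows": rows, "wt": "json"}
    if shards:
        # The shards parameter takes a comma-separated list of host:port/path entries.
        params["shards"] = ",".join(shards)
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical endpoint and shard address, for illustration only.
url = build_select_url(
    "http://localhost:8983/solr/master",
    query="*:*",
    sort="created desc",
    start=0,
    rows=500,
    shards=["localhost:8983/solr/shardA"],
)
print(url)
print(parse_qs(urlsplit(url).query)["shards"])
```

Whether this helps depends on the queries really matching documents in only that one shard.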

Thanks for the help,
Rohit

On Mon, Nov 5, 2012 at 4:02 PM, Walter Underwood <wun...@wunderwood.org> wrote:

> Don't query for 5000 documents. That is going to be slow no matter how it
> is implemented.
>
> wunder
>
> On Nov 5, 2012, at 1:00 PM, Rohit Harchandani wrote:
>
> > Hi,
> > So it seems that when I query multiple shards with the sort criteria for
> > 5000 documents, Solr queries all shards and gets a list of document ids,
> > then adds the document ids to the original query and queries all the
> > shards again.
> > This second step, joining the query results with the unique ids to get
> > the remaining fields, is turning out to be really slow. It takes a while
> > to search for a list of unique ids. Is there any config change to make
> > this process faster?
> > Also, what does isDistrib=false mean when Solr generates the queries
> > internally?
> > Thanks,
> > Rohit
> >
> > On Fri, Oct 19, 2012 at 5:23 PM, Rohit Harchandani <rhar...@gmail.com> wrote:
> >
> >> Hi,
> >>
> >> The same query is always fired, for 500 rows. The only thing that differs
> >> is the "start" parameter.
> >>
> >> The 3 shards are in the same instance on the same server. They all have
> >> the same schema, but the inherent type of the documents is different.
> >> Also, most of the app's queries go to shard "A", which has the smallest
> >> index size (4 GB).
> >>
> >> The query is made to a "master" shard, which by default goes to all 3
> >> shards for results. (Also, the query that I am trying matches documents
> >> only in shard "A" mentioned above.)
> >>
> >> Will try debugQuery now and post it here.
> >>
> >> Thanks,
> >> Rohit
> >>
> >>
> >>
> >>
> >> On Thu, Oct 18, 2012 at 11:00 PM, Otis Gospodnetic <
> >> otis.gospodne...@gmail.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> Maybe you can narrow this down a little further.  Are there some
> >>> queries that are faster and some slower?  Is there a pattern?  Can you
> >>> share examples of slow queries?  Have you tried &debugQuery=true?
> >>> These 3 shards.... is each of them on its own server or?  Is the slow
> >>> one always the one that hits the biggest shard?  Do they hold the same
> >>> type of data?  How come their sizes are so different?
> >>>
> >>> Otis
> >>> --
> >>> Search Analytics - http://sematext.com/search-analytics/index.html
> >>> Performance Monitoring - http://sematext.com/spm/index.html
> >>>
> >>>
> >>> On Thu, Oct 18, 2012 at 12:22 PM, Rohit Harchandani <rhar...@gmail.com> wrote:
> >>>> Hi all,
> >>>> I have an application which queries a Solr instance with 3 shards
> >>>> (4 GB, 13 GB and 30 GB index size respectively) holding 6 million
> >>>> documents in all.
> >>>> When I start 10 threads in my app to make simultaneous queries to Solr
> >>>> (with rows=500 and a different start parameter, sort on 1 field and no
> >>>> facets), each returning 500 different documents, sometimes I see that
> >>>> most of the responses come back quickly (500ms-1000ms), but the last
> >>>> response takes close to 50 seconds (QTime).
> >>>> I am using the latest 4.0 release. What is the reason for this delay?
> >>>> Is there a way to prevent this?
> >>>> Thanks and regards,
> >>>> Rohit
> >>>
> >>
> >>
>
> --
> Walter Underwood
> wun...@wunderwood.org
>
>
>
>
