Re: paging size in SOLR

jame vaalet Sun, 14 Aug 2011 05:35:38 -0700

Thanks Erick ... so that means it depends on the memory allocated to the JVM.


Going back to the queryResultCache, I have a doubt.
Say I have 10 threads running 10 different queries, each searching the same
multi-sharded index (millions of docs) in parallel.
Each query has a large number of results, so all of them have to be paged.
Which threads' (queries') result sets will be cached, so that subsequent
pages can be retrieved quickly?
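
To make the question concrete, here is a toy sketch. It assumes the queryResultCache behaves like an LRU cache holding one entry per distinct query (query, sort and filters together), with a capacity set by its size attribute in solrconfig.xml. The class and function names are made up for illustration; this is not Solr's code, and in a multi-sharded setup each shard would presumably keep its own cache.

```python
from collections import OrderedDict


class ToyQueryResultCache:
    """Toy LRU cache keyed by query string; NOT Solr's implementation."""

    def __init__(self, size):
        self.size = size              # max number of cached queries (assumed)
        self.entries = OrderedDict()  # query -> cached doc ids

    def get(self, query):
        if query not in self.entries:
            return None
        self.entries.move_to_end(query)  # mark as most recently used
        return self.entries[query]

    def put(self, query, doc_ids):
        self.entries[query] = doc_ids
        self.entries.move_to_end(query)
        if len(self.entries) > self.size:
            self.entries.popitem(last=False)  # evict least recently used


def run_queries(cache_size, queries):
    """Run each query once, then count how many are still cached."""
    cache = ToyQueryResultCache(cache_size)
    for q in queries:
        if cache.get(q) is None:
            cache.put(q, [f"{q}-doc{i}" for i in range(3)])
    return sum(1 for q in queries if q in cache.entries)


ten_queries = [f"q{i}" for i in range(10)]
print(run_queries(cache_size=16, queries=ten_queries))  # all ten fit
print(run_queries(cache_size=4, queries=ten_queries))   # older entries evicted
```

Under these assumptions, all 10 result sets stay cached as long as the cache size is at least 10; with a smaller cache, the least recently used queries are dropped and their next page triggers a new search.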

On 14 August 2011 17:40, Erick Erickson <erickerick...@gmail.com> wrote:

> There isn't an "optimum" page size that I know of, it'll vary with lots of
> stuff, not the least of which is whatever servlet container limits there
> are.
>
> But I suspect you can get quite a few (1000s) without
> too much problem, and you can always use the JSON response
> writer to pack in more pages with less overhead.
>
> You pretty much have to try it and see.
>
> Best
> Erick
>
> On Sun, Aug 14, 2011 at 5:42 AM, jame vaalet <jamevaa...@gmail.com> wrote:
> > Speaking about page sizes, what is the optimum page size that should be
> > retrieved each time?
> > I understand it depends on the data you are fetching back from each hit
> > document, but let's say that whenever a document is hit I am fetching back
> > 100 bytes worth of data from each of those docs in the index (along with
> > the solr response statements).
> > This makes 100*x bytes worth of data in each page if x is the page size.
> > What is the optimum value of x that solr can return each time without
> > running into exceptions?
> >
> > On 13 August 2011 19:59, Erick Erickson <erickerick...@gmail.com> wrote:
> >
> >> Jame:
> >>
> >> You control the number via settings in solrconfig.xml, so it's
> >> up to you.
> >>
> >> Jonathan:
> >> Hmmm, that seems right, after all the "deep paging" penalty is really
> >> about keeping a large sorted array in memory.... but at least you only
> >> pay it once per 10,000, rather than 100 times (assuming page size is
> >> 100)...
> >>
> >> Best
> >> Erick
> >>
> >> On Wed, Aug 10, 2011 at 10:58 AM, jame vaalet <jamevaa...@gmail.com>
> >> wrote:
> >> > When you say queryResultCache, does it cache n results only for the
> >> > last query, or for more than one query?
> >> >
> >> >
> >> > On 10 August 2011 20:14, simon <mtnes...@gmail.com> wrote:
> >> >
> >> >> Worth remembering there are some performance penalties with deep
> >> >> paging, if you use the page-by-page approach. May not be too much of
> >> >> a problem if you really are only looking to retrieve 10K docs.
> >> >>
> >> >> -Simon
> >> >>
> >> >> On Wed, Aug 10, 2011 at 10:32 AM, Erick Erickson
> >> >> <erickerick...@gmail.com> wrote:
> >> >> > Well, if you really want to you can specify start=0 and rows=10000
> >> >> > and get them all back at once.
> >> >> >
> >> >> > You can do page-by-page by incrementing the "start" parameter as
> >> >> > you indicated.
> >> >> >
> >> >> > You can keep from re-executing the search by setting your
> >> >> > queryResultCache appropriately, but this affects all searches so
> >> >> > might be an issue.
> >> >> >
> >> >> > Best
> >> >> > Erick
> >> >> >
> >> >> > On Wed, Aug 10, 2011 at 9:09 AM, jame vaalet <jamevaa...@gmail.com>
> >> >> > wrote:
> >> >> >> hi,
> >> >> >> I want to retrieve all the data from solr (say 10,000 ids) and my
> >> >> >> page size is 1000.
> >> >> >> How do I get the data (pages) back one after the other? Do I have to
> >> >> >> increment the "start" value by the page size each time, starting
> >> >> >> from 0, and iterate?
> >> >> >> In that case, am I querying the index 10 times instead of once, or
> >> >> >> will the result be cached somewhere after the first query for the
> >> >> >> subsequent pages?
> >> >> >>
> >> >> >>
> >> >> >> JAME VAALET
> >> >> >>
> >> >> >
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> >
> >> > -JAME
> >> >
> >>
> >
> >
> >
> > --
> >
> > -JAME
> >
>



-- 

-JAME
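
For reference, a minimal sketch of the page-by-page approach discussed in this thread: one request per page, with "start" advanced by the page size each time. The q, start, rows and wt parameters are standard Solr request parameters; the helper name page_params is made up here, and the sketch assumes the total hit count is already known (in practice it could be read from numFound in the first response).

```python
from urllib.parse import urlencode


def page_params(total, page_size, query="*:*"):
    """Yield one Solr-style query string per page (hypothetical helper)."""
    for start in range(0, total, page_size):
        # The last page may be shorter than page_size.
        rows = min(page_size, total - start)
        yield urlencode({"q": query, "start": start, "rows": rows, "wt": "json"})


pages = list(page_params(total=10000, page_size=1000))
print(len(pages))   # number of requests needed
print(pages[0])     # first page: start=0
print(pages[-1])    # last page: start=9000
```

Each string would be appended to the select handler URL of the index being queried. Fetching 10,000 ids at a page size of 1000 therefore means 10 requests, versus a single request with start=0 and rows=10000.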
