I've allocated 4GB to Solr, so the remaining 4GB of the machine's 8GB is
free for the OS disk cache.
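
To put that split in perspective, here is a rough back-of-the-envelope
sketch in Python. It uses only the figures from this thread (8GB RAM, 4GB
heap, 43GB index); the function name is just for illustration, and the
result is an upper bound, since the OS and other processes also need some
of that memory.

```python
# Figures taken from this thread: 8GB RAM total, 4GB JVM heap, 43GB index.
TOTAL_RAM_GB = 8
JVM_HEAP_GB = 4
INDEX_SIZE_GB = 43


def cacheable_fraction(total_ram_gb, jvm_heap_gb, index_size_gb):
    """Upper bound on the share of the index the OS page cache can hold.

    Assumes all memory not given to the JVM is available for caching,
    which is optimistic.
    """
    free_gb = max(total_ram_gb - jvm_heap_gb, 0)
    return min(free_gb / index_size_gb, 1.0)


fraction = cacheable_fraction(TOTAL_RAM_GB, JVM_HEAP_GB, INDEX_SIZE_GB)
print(f"At most {fraction:.1%} of the index fits in the page cache")
```

So under these assumptions less than a tenth of the index can be cached at
any one time, and 5000 distinct queries will likely touch far more than that.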

I think that at any point in time there can be at most <number of threads>
concurrent requests, which seems to make sense (does it?).

As I increase the number of threads, the load average shown by top goes up
to as high as 80%. But if I keep the number of threads low (~10), the load
average never goes beyond ~8. So that's probably the number of requests I
can expect Solr to serve concurrently on this index size with this hardware.
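
The relation between threads, latency and throughput can be sketched with
Little's Law (throughput = concurrency / average latency). This is only a
rough model: it assumes a closed loop in which each test thread sends its
next query as soon as the previous one returns, and the function names are
made up for illustration. The numbers are the ones from my original mail
quoted below (10 threads, ~50ms rising to ~6000ms, 20 queries per second
as the target).

```python
def max_throughput_qps(concurrency, avg_latency_ms):
    """Little's Law: sustained queries/sec for N busy threads."""
    return concurrency * 1000.0 / avg_latency_ms


def max_latency_ms(concurrency, target_qps):
    """Largest average latency that still reaches target_qps."""
    return concurrency * 1000.0 / target_qps


# 10 threads at the fast and slow ends of what I measured.
fast = max_throughput_qps(10, 50)     # start of the test run
slow = max_throughput_qps(10, 6000)   # end of the test run

# Average latency needed to reach 20 queries/sec with 10 threads.
budget = max_latency_ms(10, 20)

print(f"at 50ms:   {fast:.1f} qps")
print(f"at 6000ms: {slow:.2f} qps")
print(f"latency budget for 20 qps: {budget:.0f} ms")
```

Under this model, 10 threads only reach 20 queries per second if the
average query time stays at or below about half a second.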

Can anyone give a general opinion as to how much hardware should be
sufficient for a Solr deployment with an index size of ~43GB, containing
around 2.5 million documents? I'm expecting it to serve at least 20 requests
per second. Any experiences?
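
On Tom's suggestion below to look at the 90th and 99th percentile query
times and not just the average, something like the following could be run
over the timings my test script collects. It is a minimal sketch using the
nearest-rank method; the sample latencies are invented purely to show the
shape of the output and are not measurements from my setup.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(math.ceil(pct * len(ordered) / 100), 1)
    return ordered[rank - 1]


# HYPOTHETICAL latencies in milliseconds, for illustration only.
latencies_ms = [50] * 90 + [400] * 9 + [6000]

mean_ms = sum(latencies_ms) / len(latencies_ms)
p90_ms = percentile(latencies_ms, 90)
p99_ms = percentile(latencies_ms, 99)

print(f"mean={mean_ms:.0f}ms p90={p90_ms}ms p99={p99_ms}ms")
```

With made-up data like this, a handful of very slow queries pulls the mean
well above the typical query time, which is why the percentiles are worth
reporting separately.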

Thanks

On Fri, Mar 12, 2010 at 12:47 AM, Tom Burton-West <tburtonw...@gmail.com> wrote:

>
> How much of your memory are you allocating to the JVM and how much are you
> leaving free?
>
> If you don't leave enough free memory for the OS, the OS won't have a large
> enough disk cache, and you will be hitting the disk for lots of queries.
>
> You might want to monitor your Disk I/O using iostat and look at the
> iowait.
>
> If you are doing phrase queries and your *prx file is significantly larger
> than the available memory then when a slow phrase query hits Solr, the
> contention for disk I/O with other queries could be slowing everything
> down.
> You might also want to look at the 90th and 99th percentile query times in
> addition to the average. For our large indexes, we found at least an order
> of magnitude difference between the average and 99th percentile queries.
> Again, if Solr gets hit with a few of those 99th percentile slow queries
> and you're not hitting your caches, chances are you will see serious
> contention for disk I/O.
>
> Of course if you don't see any waiting on i/o, then your bottleneck is
> probably somewhere else:)
>
> See
>
> http://www.hathitrust.org/blogs/large-scale-search/slow-queries-and-common-words-part-1
> for more background on our experience.
>
> Tom Burton-West
> University of Michigan Library
> www.hathitrust.org
>
>
>
> >
> > On Thu, Mar 11, 2010 at 9:39 AM, Siddhant Goel <siddhantg...@gmail.com> wrote:
> >
> > > Hi everyone,
> > >
> > > I have an index corresponding to ~2.5 million documents. The index size
> > > is 43GB. The machine running Solr has a Dual Processor Quad Core Xeon
> > > 5430 - 2.66GHz (Harpertown) - 2 x 12MB cache, 8GB RAM, and a 250 GB HDD.
> > >
> > > I'm observing a strange trend in the queries that I send to Solr. The
> > > query times for queries that I send earlier are much lower than those
> > > for queries I send afterwards. For instance, if I write a script to
> > > query Solr 5000 times (with 5000 distinct queries, most of them
> > > containing not more than 3-5 words) with 10 threads running in
> > > parallel, the average query time goes from ~50ms in the beginning to
> > > ~6000ms. Is this expected, or is there something wrong with my
> > > configuration? Currently I've configured the queryResultCache and the
> > > documentCache to contain 2048 entries (hit ratios for both are close
> > > to 50%).
> > >
> > > Apart from this, a general question I want to ask: is this hardware
> > > enough for this scenario? I'm aiming at achieving around 20 queries
> > > per second with the hardware mentioned above.
> > >
> > > Thanks,
> > >
> > > Regards,
> > >
> > > --
> > > - Siddhant
> > >
> >
>
>
>
> --
> - Siddhant
>
>
>
> --
> View this message in context:
> http://old.nabble.com/Solr-Performance-Issues-tp27864278p27868456.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


-- 
- Siddhant
