Hi,

Thanks for your responses. It actually feels good to be able to locate where
the bottlenecks are.

I've created two sets of data - in the first one I'm measuring the time taken
purely on Solr's end, and in the other one I'm including network latency
(just for reference). The data that I'm posting below contains the time taken
purely by Solr.

I'm running 10 threads simultaneously and the average response time (for
each query in each thread) remains close to 40 to 50 ms. But as soon as I
increase the number of threads to something like 100, the response time goes
up to ~600ms, and further up when the number of threads is close to 500. So
yes, the average time definitely depends on the number of concurrent requests.
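
For reference, here is a minimal sketch of how the two sets of timings could be collected side by side. It is not the script I actually used. It assumes Solr is reachable at the hypothetical URL below and that wt=json is enabled; QTime is the time Solr reports in the response header, while the wall-clock figure also includes response writing and network transfer. The names run_load, timed_query and fetch_from_solr are made up for this illustration.

```python
import json
import time
import urllib.parse
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

# Assumption: adjust host, port and path to match your own Solr setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def fetch_from_solr(query):
    """Send one query to Solr and return the parsed JSON response."""
    params = urllib.parse.urlencode({"q": query, "wt": "json"})
    with urllib.request.urlopen(f"{SOLR_SELECT_URL}?{params}") as resp:
        return json.load(resp)


def timed_query(query, fetch=fetch_from_solr):
    """Return (QTime reported by Solr in ms, wall-clock time in ms)."""
    start = time.perf_counter()
    body = fetch(query)
    wall_ms = (time.perf_counter() - start) * 1000.0
    return body["responseHeader"]["QTime"], wall_ms


def run_load(queries, num_threads, fetch=fetch_from_solr):
    """Run all queries on num_threads workers; return both averages in ms."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        results = list(pool.map(lambda q: timed_query(q, fetch), queries))
    solr_ms = [r[0] for r in results]
    wall_ms = [r[1] for r in results]
    return mean(solr_ms), mean(wall_ms)
```

Calling run_load(queries, 10) and then run_load(queries, 100) with the same query list gives the kind of comparison described above.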

> Going from memory, debugQuery=on will let you know how much time
> was spent in various operations in SOLR. It's important to know
> whether it was the searching, assembling the response, or
> transmitting the data back to the client.


I just tried this. The information that it gives me for a query that took
7165ms is - http://pastebin.ca/1835644

So out of the total time 7165ms, QueryComponent took most of the time. Plus
I can see the load average going up when the number of threads is really
high. So it actually makes sense. (I didn't add any other component while
searching; it was a plain /select?q=query call).
As I mentioned earlier in this mail, I'm maintaining separate sets of data
with and without network latency, and I don't think the network is the
bottleneck.
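
In case it helps anyone reading the debugQuery output, below is a rough sketch for ranking the components by time spent. It assumes the response was requested with wt=json and that the "timing" section has "prepare" and "process" phases, each holding a per-component entry with a "time" field. The component names and the numbers in the sample are invented for illustration; they are not the values from my pastebin, and the exact key names may differ between Solr versions.

```python
def component_times(timing):
    """Return (phase, component, ms) tuples, slowest first."""
    rows = []
    for phase in ("prepare", "process"):
        for name, value in timing.get(phase, {}).items():
            # Each phase also carries its own scalar "time" total; skip it.
            if isinstance(value, dict):
                rows.append((phase, name, value["time"]))
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Hypothetical timing section, shaped like response["debug"]["timing"].
sample_timing = {
    "time": 100.0,
    "prepare": {
        "time": 2.0,
        "QueryComponent": {"time": 2.0},
        "DebugComponent": {"time": 0.0},
    },
    "process": {
        "time": 98.0,
        "QueryComponent": {"time": 90.0},
        "DebugComponent": {"time": 8.0},
    },
}

for phase, name, ms in component_times(sample_timing):
    print(f"{phase:8s} {name:16s} {ms:8.1f} ms")
```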


> How many threads does it take to peg the CPU? And what
> response times are you getting when your number of threads is
> around 10?
>

If the number of threads is greater than 100, that really takes its toll on
the CPU. So that's probably the number.

When the number of threads is around 10, the response times average to
something like 60ms (and 95% of the queries fall within 100ms of that
value).
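
Since percentiles were suggested earlier in the thread, here is a small sketch of how the average and the 90th/95th/99th percentiles could be computed from a list of per-query times. It uses the simple nearest-rank definition of a percentile, which is an assumption on my part; other definitions interpolate between samples. The function names are illustrative only.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples_ms):
    """Return the mean and a few tail percentiles of the response times (ms)."""
    return {
        "mean": sum(samples_ms) / len(samples_ms),
        "p90": percentile(samples_ms, 90),
        "p95": percentile(samples_ms, 95),
        "p99": percentile(samples_ms, 99),
    }
```

A large gap between the mean and p99 would point to a few slow queries dominating, rather than all queries slowing down evenly.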

Thanks,




>
> Erick
>
> On Fri, Mar 12, 2010 at 3:39 AM, Siddhant Goel <siddhantg...@gmail.com
> >wrote:
>
> > I've allocated 4GB to Solr, so the rest of the 4GB is free for the OS
> disk
> > caching.
> >
> > I think that at any point of time, there can be a maximum of <number of
> > threads> concurrent requests, which happens to make sense btw (does it?).
> >
> > As I increase the number of threads, the load average shown by top goes
> up
> > to as high as 80%. But if I keep the number of threads low (~10), the
> load
> > average never goes beyond ~8). So probably thats the number of requests I
> > can expect Solr to serve concurrently on this index size with this
> > hardware.
> >
> > Can anyone give a general opinion as to how much hardware should be
> > sufficient for a Solr deployment with an index size of ~43GB, containing
> > around 2.5 million documents? I'm expecting it to serve at least 20
> > requests
> > per second. Any experiences?
> >
> > Thanks
> >
> > On Fri, Mar 12, 2010 at 12:47 AM, Tom Burton-West <tburtonw...@gmail.com
> > >wrote:
> >
> > >
> > > How much of your memory are you allocating to the JVM and how much are
> > you
> > > leaving free?
> > >
> > > If you don't leave enough free memory for the OS, the OS won't have a
> > large
> > > enough disk cache, and you will be hitting the disk for lots of
> queries.
> > >
> > > You might want to monitor your Disk I/O using iostat and look at the
> > > iowait.
> > >
> > > If you are doing phrase queries and your *prx file is significantly
> > larger
> > > than the available memory then when a slow phrase query hits Solr, the
> > > contention for disk I/O with other queries could be slowing everything
> > > down.
> > > You might also want to look at the 90th and 99th percentile query times
> > in
> > > addition to the average. For our large indexes, we found at least an
> > order
> > > of magnitude difference between the average and 99th percentile
> queries.
> > > Again, if Solr gets hit with a few of those 99th percentile slow
> queries
> > > and
> > > your not hitting your caches, chances are you will see serious
> contention
> > > for disk I/O..
> > >
> > > Of course if you don't see any waiting on i/o, then your bottleneck is
> > > probably somewhere else:)
> > >
> > > See
> > >
> > >
> >
> http://www.hathitrust.org/blogs/large-scale-search/slow-queries-and-common-words-part-1
> > > for more background on our experience.
> > >
> > > Tom Burton-West
> > > University of Michigan Library
> > > www.hathitrust.org
> > >
> > >
> > >
> > > >
> > > > On Thu, Mar 11, 2010 at 9:39 AM, Siddhant Goel <
> siddhantg...@gmail.com
> > > > >wrote:
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > I have an index corresponding to ~2.5 million documents. The index
> > size
> > > > is
> > > > > 43GB. The configuration of the machine which is running Solr is -
> > Dual
> > > > > Processor Quad Core Xeon 5430 - 2.66GHz (Harpertown) - 2 x 12MB
> > cache,
> > > > 8GB
> > > > > RAM, and 250 GB HDD.
> > > > >
> > > > > I'm observing a strange trend in the queries that I send to Solr.
> The
> > > > query
> > > > > times for queries that I send earlier is much lesser than the
> queries
> > I
> > > > > send
> > > > > afterwards. For instance, if I write a script to query solr 5000
> > times
> > > > > (with
> > > > > 5000 distinct queries, most of them containing not more than 3-5
> > words)
> > > > > with
> > > > > 10 threads running in parallel, the average times for queries goes
> > from
> > > > > ~50ms in the beginning to ~6000ms. Is this expected or is there
> > > > something
> > > > > wrong with my configuration. Currently I've configured the
> > > > queryResultCache
> > > > > and the documentCache to contain 2048 entries (hit ratios for both
> is
> > > > close
> > > > > to 50%).
> > > > >
> > > > > Apart from this, a general question that I want to ask is that is
> > such
> > > a
> > > > > hardware enough for this scenario? I'm aiming at achieving around
> 20
> > > > > queries
> > > > > per second with the hardware mentioned above.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Regards,
> > > > >
> > > > > --
> > > > > - Siddhant
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > - Siddhant
> > >
> > >
> > >
> > > --
> > > View this message in context:
> > > http://old.nabble.com/Solr-Performance-Issues-tp27864278p27868456.html
> > > Sent from the Solr - User mailing list archive at Nabble.com.
> > >
> > >
> >
> >
> > --
> > - Siddhant
> >
>



-- 
- Siddhant
