On 9/11/2013 2:57 PM, Deepak Konidena wrote:
I guess at this point in the discussion, I should probably give some more
background on why I am doing what I am doing. Having a single Solr shard
(multiple segments) on the same disk is posing severe performance problems
under load, in that calls to Solr cause a lot of connection timeouts. When
we looked at the ganglia stats for the Solr box, we saw that while memory,
cpu and network usage were quite normal, the i/o wait spiked. We are unsure
what caused the i/o wait and why there were no spikes in the cpu/memory
usage. Since the Solr box is a beefy box (multi-core setup, huge ram, SSD),
we'd like to distribute the segments to multiple locations (disks) and see
whether this improves performance under load.

@Greg - Thanks for clarifying that.  I just learnt that I can't set them up
using RAID as some of them are SSDs and some others are SATA (spinning
disks).

@Shawn Heisey - Could you elaborate more about the "broker" core and
delegating the requests to other cores?

On the broker core - I have a core on my servers that has no index of its own. In the /select handler (and others) I have placed a shards parameter, and many of them also have a shards.qt parameter. The shards parameter is how a non-cloud distributed search is done.

http://wiki.apache.org/solr/DistributedSearch
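
For illustration only, here is a minimal Python sketch of what a distributed request through such a broker core might look like when shards and shards.qt are passed on the request itself rather than set in the handler configuration. The host names, core names, and the "/search" handler are hypothetical placeholders, not my actual setup. Each shards entry is host:port/path with no "http://" prefix.

```python
from urllib.parse import urlencode


def build_distributed_query(broker_url, shards, query, shards_qt=None):
    """Build a /select URL for a broker core that fans the query out to shards.

    broker_url: base URL of the (index-less) broker core.
    shards:     list of shard addresses in host:port/path form, no scheme.
    shards_qt:  optional request handler that each shard should use.
    """
    params = {"q": query, "shards": ",".join(shards)}
    if shards_qt is not None:
        params["shards.qt"] = shards_qt
    return broker_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical hosts and core names, for illustration only.
url = build_distributed_query(
    "http://broker.example.com:8983/solr/broker",
    ["idx1.example.com:8983/solr/shard1", "idx2.example.com:8983/solr/shard2"],
    "*:*",
    shards_qt="/search",
)
print(url)
```

Putting the same parameters in the handler definition instead means clients can query the broker core like any ordinary core, without knowing where the shards live.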

Addressing your first paragraph: You say that you have lots of RAM ... but is there a lot of unallocated RAM that the OS can use for caching, or is it mostly allocated to processes, such as the java heap for Solr?

Depending on exactly how your indexes are composed, you need up to 100% of the total index size available as unallocated RAM. With SSD, the requirement is less, but cannot be ignored. I personally wouldn't go below about 25-50% even with SSD, and I'd plan on 50-100% for regular disks.

There is some evidence to suggest that you only need unallocated RAM equal to 10% of your index size for caching with SSD, but that is only likely to work if you have a lot of stored (as opposed to indexed) data. If most of your index is unstored, then more would be required.
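
To make the arithmetic concrete, here is a rough Python sketch that turns the percentages above into a range of unallocated RAM for a given index size. The function name and the 80 GB example are made up for illustration, and the fractions are just my rules of thumb from above, not hard limits.

```python
def recommended_free_ram_gb(index_size_gb, on_ssd):
    """Return (low, high) unallocated RAM in GB to leave for the OS disk cache.

    Uses the rule-of-thumb fractions from this message:
    25-50% of the index size on SSD, 50-100% on regular disks.
    """
    low_frac, high_frac = (0.25, 0.50) if on_ssd else (0.50, 1.00)
    return index_size_gb * low_frac, index_size_gb * high_frac


# Hypothetical 80 GB index.
print(recommended_free_ram_gb(80, on_ssd=True))   # (20.0, 40.0)
print(recommended_free_ram_gb(80, on_ssd=False))  # (40.0, 80.0)
```

Remember that this is RAM left over after the java heap and every other process have taken their share, not the total installed in the box.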

Thanks,
Shawn
