Hi all,

After reading 
http://carsabi.com/car-news/2012/03/23/optimizing-solr-7x-your-search-speed/ , 
I thought I'd do my own experiments. I used 2M docs from Wikipedia, indexed in 
Solr 4.0 Beta on a standard EC2 large instance. I compared an unsharded and a 
2-shard configuration (the latter set up with SolrCloud following the 
http://wiki.apache.org/solr/SolrCloud example). I wrote a simple Python script 
to throw randomly chosen queries from a hand-compiled list at Solr. The only 
"extra" I had turned on was facets (on document category).
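
For anyone wanting to reproduce this, a load script of this kind might look 
roughly like the sketch below. It is not the script used for these numbers. 
The URL, the facet field name "category" and the helper names are all 
assumptions for illustration, and it sends queries one at a time, whereas a 
real throughput test would run several of these loops concurrently.

```python
import json
import random
import time
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed values for illustration; adjust to the actual core path and schema.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"
FACET_FIELD = "category"


def build_query_url(query, base_url=SOLR_SELECT_URL, facet_field=FACET_FIELD):
    """Build a select URL with JSON output and faceting on one field."""
    params = [
        ("q", query),
        ("wt", "json"),
        ("facet", "true"),
        ("facet.field", facet_field),
    ]
    return base_url + "?" + urlencode(params)


def run_queries(queries, n_searches, fetch=None, rng=random):
    """Send n_searches randomly chosen queries, one after another.

    Returns (qtimes, total_results, errors), where qtimes holds the
    client-side wall-clock time in seconds of each successful search.
    `fetch` is a hypothetical hook so the HTTP call can be swapped out.
    """
    if fetch is None:
        fetch = lambda url: urlopen(url).read()
    qtimes, total_results, errors = [], 0, 0
    for _ in range(n_searches):
        url = build_query_url(rng.choice(queries))
        start = time.perf_counter()
        try:
            body = json.loads(fetch(url))
            total_results += body["response"]["numFound"]
        except Exception:
            errors += 1
            continue
        qtimes.append(time.perf_counter() - start)
    return qtimes, total_results, errors
```

Timing on the client side like this includes network and response parsing, 
so it will read higher than the QTime Solr reports in its response header.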

To my surprise, the throughput of the 2-shard configuration is almost exactly 
half that of the unsharded index, and the mean query time nearly doubles:

unsharded
4983912891 results in 24920 searches; 0 errors
70.02 mean qps
0.35s mean query time, 2.25s max, 0.00s min
90%   of qtimes <= 0.83s
99%   of qtimes <= 1.42s
99.9% of qtimes <= 1.68s

2-shard
4990351660 results in 24501 searches; 0 errors
34.07 mean qps
0.66s mean query time, 694.20s max, 0.01s min
90%   of qtimes <= 1.19s
99%   of qtimes <= 2.12s
99.9% of qtimes <= 2.95s
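
In case the way these figures are derived matters, here is a sketch of how a 
report in this shape could be computed from a list of per-query times. The 
function names are hypothetical, and the nearest-rank percentile definition is 
an assumption; the numbers above may have been computed differently.

```python
import math


def percentile(sorted_times, pct):
    """Nearest-rank percentile: smallest sample with >= pct% of samples at or below it."""
    rank = max(1, math.ceil(pct / 100.0 * len(sorted_times)))
    return sorted_times[rank - 1]


def summarize(qtimes, wall_seconds, total_results=0, errors=0):
    """Format a report from per-query times (seconds) and total wall-clock time."""
    times = sorted(qtimes)
    lines = [
        "%d results in %d searches; %d errors" % (total_results, len(times), errors),
        # Throughput is measured against wall-clock time, not summed query time,
        # so it reflects however many clients were running in parallel.
        "%.2f mean qps" % (len(times) / wall_seconds),
        "%.2fs mean query time, %.2fs max, %.2fs min"
        % (sum(times) / len(times), times[-1], times[0]),
    ]
    for pct in (90, 99, 99.9):
        label = "%g%%" % pct
        lines.append("%-5s of qtimes <= %.2fs" % (label, percentile(times, pct)))
    return "\n".join(lines)
```

Note that a single outlier only moves the max and, slightly, the mean; the 
percentile lines are unaffected unless the outliers are more than 0.1% of 
searches.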

All caches were set to 4096 items, and cache performance looks OK in both cases 
(hit ratios close to 1.0, 0 evictions). I gave the unsharded JVM -Xmx1G and 
each shard JVM -Xmx500M.

I must be doing something stupid - surely this result is unexpected? Does 
anybody have any thoughts where it might be going wrong?

cheers,
Tom
