With low qps and multi-core servers, I believe one reason to have multiple shards on one server is to provide better parallelism for a request, and thus reduce your response time.
-- Ken On Aug 2, 2011, at 11:06am, Jonathan Rochkind wrote: > What's the reasoning behind having three shards on one machine, instead of > just combining those into one shard? Just curious. I had been thinking the > point of shards was to get them on different machines, and there'd be no > reason to have multiple shards on one machine. > > On 8/2/2011 1:59 PM, Burton-West, Tom wrote: >> Hi Markus, >> >> Just as a data point for a very large sharded index, we have the full text >> of 9.3 million books with an index size of about 6+ TB spread over 12 shards >> on 4 machines. Each machine has 3 shards. The size of each shard ranges >> between 475GB and 550GB. We are definitely I/O bound. Our machines have >> 144GB of memory with about 16GB dedicated to the tomcat instance running the >> 3 Solr instances, which leaves about 120 GB (or 40GB per shard) for the OS >> disk cache. We release a new index every morning and then warm the caches >> with several thousand queries. I probably should add that our disk storage >> is a very high performance Isilon appliance that has over 500 drives and >> every block of every file is striped over no less than 14 different drives. >> (See blog for details *) >> >> We have a very low number of queries per second (0.3-2 qps) and our modest >> response time goal is to keep 99th percentile response time for our >> application (i.e. Solr + application) under 10 seconds. >> >> Our current performance statistics are: >> >> average response time 300 ms >> median response time 113 ms >> 90th percentile 663 ms >> 95th percentile 1,691 ms >> >> We had plans to do some performance testing to determine the optimum shard >> size and optimum number of shards per machine, but that has remained on the >> back burner for a long time as other higher priority items keep pushing it >> down on the todo list. >> >> We would be really interested to hear about the experiences of people who >> have so many shards that the overhead of distributing the queries, and >> consolidating/merging the responses becomes a serious issue. >> >> >> Tom Burton-West >> >> http://www.hathitrust.org/blogs/large-scale-search >> >> * >> http://www.hathitrust.org/blogs/large-scale-search/scaling-large-scale-search-500000-volumes-5-million-volumes-and-beyond >> >> -----Original Message----- >> From: Markus Jelsma [mailto:markus.jel...@openindex.io] >> Sent: Tuesday, August 02, 2011 12:33 PM >> To: solr-user@lucene.apache.org >> Subject: Re: performance crossover between single index and sharding >> >> Actually, i do worry about it. Would be marvelous if someone could provide >> some metrics for an index of many terabytes. >> >>> [..] At some extreme point there will be diminishing >>> returns and a performance decrease, but I wouldn't worry about that at all >>> until you've got many terabytes -- I don't know how many but don't worry >>> about it. >>> >>> ~ David >>> >>> ----- >>> Author: https://www.packtpub.com/solr-1-4-enterprise-search-server/book >>> -- >>> View this message in context: >>> http://lucene.472066.n3.nabble.com/performance-crossover-between-single-in >>> dex-and-sharding-tp3218561p3219397.html Sent from the Solr - User mailing >>> list archive at Nabble.com. -------------------------- Ken Krugler +1 530-210-6378 http://bixolabs.com custom data mining solutions