We have 16 shards on 4 physical servers. Shard size was determined by
measuring query response times as a function of doc count. Multiple shards
per server provides parallelism. In a VM environment, I would lean towards 1
shard per VM (with 1/4 the RAM). We implemented our own distributed search
(pre-Solr) and the extra sort/merge processing is not a performance issue.

Peter


On Tue, Aug 2, 2011 at 2:35 PM, Burton-West, Tom <tburt...@umich.edu> wrote:

> Hi Jonothan and Markus,
>
> >>Why 3 shards on one machine instead of one larger shard per machine?
>
> Good question!
>
> We made this architectural decision several years ago and I'm not
> remembering the rationale at the moment. I believe we originally made the
> decision due to some tests showing a sweetspot for I/O performance for
> shards with 500,000-600,000 documents, but those tests were made before we
> implemented CommonGrams and when we were still using attached storage.  I
> think we also might have had concerns about Java OOM errors with a really
> large shard/index, but we now know that we can keep memory usage under
> control by tweaking the amount of the terms index that gets read into
> memory.
>
> We should probably do some tests and revisit the question.
>
> The reason we don't have 12 shards on 12 machines is that current
> performance is good enough that we can't justify buying 8 more machines:)
>
> Tom
>
> -----Original Message-----
> From: Markus Jelsma [mailto:markus.jel...@openindex.io]
> Sent: Tuesday, August 02, 2011 2:12 PM
> To: solr-user@lucene.apache.org
> Subject: Re: performance crossover between single index and sharding
>
> Hi Tom,
>
> Very interesting indeed! But i keep wondering why some engineers choose to
> store multiple shards of the same index on the same machine, there must be
> significant overhead. The only reason i can think of is ease of maintenance
> in
> moving shards to a separate physical machine.
> I know that rearranging the shard topology can be a real pain in a large
> existing cluster (e.g. consistent hashing is not consistent anymore and
> having
> to shuffle docs to their new shard), is this the reason you choose this
> approach?
>
> Cheers,
> bble.com.
>

Reply via email to