In any case, this is really "the sizing question", and generic answers
are not reliable. Here's a long blog post about why, but the net-net is
"prototype and measure". Fortunately you can prototype with just a few
nodes (I usually want at least 2 shards) and extrapolate reasonably
well.

https://lucidworks.com/blog/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
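
For a sense of scale before prototyping, here is a rough back-of-envelope
sketch in Python using only the figures quoted in the original question
below. The function names are made up for illustration. It assumes
1 TByte = 1024 GBytes, and it assumes that each Solr replica keeps its own
full copy of the index in HDFS, so the two replication factors multiply.
It says nothing about query performance, which is exactly what the
prototype has to measure.

```python
# Back-of-envelope shard arithmetic from the numbers quoted in this thread.
# Assumption: 1 TByte = 1024 GBytes.

def per_shard_gb(index_tb, shards):
    """Index size per shard in GBytes, for a single copy of the index."""
    return index_tb * 1024 / shards

def docs_per_shard(total_docs, shards):
    """Average number of documents per shard."""
    return total_docs / shards

def raw_storage_tb(index_tb, solr_replicas, hdfs_replication):
    """Raw disk used in HDFS.

    Assumption: every Solr replica writes its own full index copy to HDFS,
    so Solr replicas and HDFS block replication multiply.
    """
    return index_tb * solr_replicas * hdfs_replication

INDEX_TB = 16.1             # index size reported in the question
TOTAL_DOCS = 3_000_000_000  # documents reported in the question

for shards in (27, 200):
    print(
        f"{shards:>3} shards: "
        f"{per_shard_gb(INDEX_TB, shards):6.1f} GB/shard, "
        f"{docs_per_shard(TOTAL_DOCS, shards) / 1e6:6.1f}M docs/shard"
    )

# Current layout: 1 Solr copy, 3x HDFS replication.
print(f"current raw storage:  {raw_storage_tb(INDEX_TB, 1, 3):.1f} TB")
# Proposed layout: 3 Solr replicas, HDFS replication left at 3x (assumption).
print(f"proposed raw storage: {raw_storage_tb(INDEX_TB, 3, 3):.1f} TB")
```

Under those assumptions, going from 27 to 200 shards takes each shard from
roughly 610 GBytes to roughly 82 GBytes, while adding 3 Solr replicas on
top of 3x HDFS replication would roughly triple the raw disk footprint
unless the HDFS replication factor is lowered.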

Best,
Erick

On Fri, Jan 13, 2017 at 10:29 AM, Susheel Kumar <susheel2...@gmail.com> wrote:
> As per Scott@FullStory, you should see benefits with many smaller shards
> rather than a few bigger ones. Upgrading to Solr 6.2 would also be better,
> as many improvements have been made in handling multiple shards. See the
> presentation below:
>
> http://www.slideshare.net/lucidworks/large-scale-solr-at-fullstory-presented-by-scott-blum-fullstory
>
>
> Thnx
> Susheel
>
> On Fri, Jan 13, 2017 at 12:56 PM, Joe Obernberger <
> joseph.obernber...@gmail.com> wrote:
>
>> Hi All - we've been experimenting with Solr Cloud 5.5.0 with a 27 shard
>> (no replication - each shard runs on a physical host) cluster on top of
>> HDFS.  It currently just crossed 3 billion documents indexed with an index
>> size of 16.1TBytes.  In HDFS with 3x replication this takes up 48.2TBytes.
>>
>> Each shard is then hosting about 610GBytes of index.  The HDFS cache size
>> is very low at about 8GBytes.  Suffice it to say, performance isn't very
>> good, but again, this is for experimentation.
>>
>> If we were to redo this, would it be better to create many shards - maybe
>> 200 with 3 replicas each (600 in all) with the goal being to withstand a
>> server going out, and future expansion as more hardware is added?  I know
>> this is a very general question.  Thanks very much in advance!
>>
>> -Joe
>>
>>
