I would call your solution more of a workaround, like any similar solution of
this kind.
Issue SOLR-6027 has now been open for three years and the world has changed.
Instead of racks full of blades, where you had many dedicated bare-metal
servers, you now have huge machines with 256GB of RAM and many CPUs.
Virtualization has taken over.
To get some independence from the physical hardware under these conditions,
you have to spread the shards across several physical machines by using
virtual servers.
From my point of view it is a good solution to have 5 virtual 64GB servers
on 5 different huge physical machines and to start 2 Solr instances on each
virtual server.
If I split each 64GB virtual server into two 32GB virtual servers, there would
be no gain. We still wouldn't have 10 huge machines (no gain in fault
tolerance), and we would have to administer and monitor 10 virtual servers
instead of 5 (plus the ZooKeeper servers).

It is state of the art that you shouldn't have to care about the individual
servers within the cloud; that is the main point of a cloud.
The leader should always know who the members of its cloud are, how to reach
them (IP address), and how the users of the cloud (the collections) are
distributed across the cloud.

It would be great if a solution to SOLR-6027 led to some kind of
"automatic mode" for server distribution, without any special configuration.

Regards,
Bernd


On 08.05.2017 at 17:47, Erick Erickson wrote:
> Also, you can specify custom placement rules, see:
> https://cwiki.apache.org/confluence/display/solr/Rule-based+Replica+Placement
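For illustration only (not from the thread): a rough SolrJ sketch of creating
a collection with such a placement rule, keeping fewer than two replicas of
any shard on one physical host so the two JVMs on the same machine never hold
the same shard. The collection name, config name, ZooKeeper address, and the
exact Builder/setRule calls are assumptions and vary between SolrJ versions;
the same rule can be passed as rule=... on the Collections API CREATE call.

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;

public class CreateWithPlacementRule {
    public static void main(String[] args) throws Exception {
        // Hypothetical ZooKeeper ensemble address; adjust to your own.
        try (CloudSolrClient client = new CloudSolrClient.Builder()
                .withZkHost("zk1:2181,zk2:2181,zk3:2181").build()) {

            CollectionAdminRequest.Create create =
                    CollectionAdminRequest.createCollection(
                            "mycollection", "myconfig", 5, 2);

            // Assumed rule syntax per the rule-based placement docs:
            // for each shard, fewer than 2 replicas on any one host.
            create.setRule("replica:<2,host:*");

            create.process(client);
        }
    }
}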
> 
> But Shawn's statement is the nub of what you're seeing: by default,
> multiple JVMs on the same physical machine are considered separate
> Solr instances.
> 
> Also note that if you want to, you can specify a createNodeSet when you
> create the collection, and in particular the special value EMPTY. That'll
> create a collection with no replicas and you can ADDREPLICA to
> precisely place each one if you require that level of control.
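For illustration only (again not from the thread, with made-up collection,
config, and node names): a rough SolrJ sketch of that approach, creating the
collection with createNodeSet=EMPTY and then placing each replica explicitly
via ADDREPLICA. Node names are assumed to follow the host:port_solr form seen
under /live_nodes; exact SolrJ method names can differ between versions.

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;

public class ManualReplicaPlacement {
    public static void main(String[] args) throws Exception {
        // Hypothetical ZooKeeper ensemble address; adjust to your own.
        try (CloudSolrClient client = new CloudSolrClient.Builder()
                .withZkHost("zk1:2181,zk2:2181,zk3:2181").build()) {

            // Create the shards but no replicas at all (createNodeSet=EMPTY).
            CollectionAdminRequest.Create create =
                    CollectionAdminRequest.createCollection(
                            "mycollection", "myconfig", 5, 1);
            create.setCreateNodeSet("EMPTY");
            create.process(client);

            // Now place each replica exactly where you want it (ADDREPLICA).
            CollectionAdminRequest.AddReplica first =
                    CollectionAdminRequest.addReplicaToShard("mycollection", "shard1");
            first.setNode("server1:8983_solr");
            first.process(client);

            CollectionAdminRequest.AddReplica second =
                    CollectionAdminRequest.addReplicaToShard("mycollection", "shard1");
            second.setNode("server2:8983_solr");
            second.process(client);

            // ...repeat for shard2..shard5, pairing nodes that live on
            // different physical hosts.
        }
    }
}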
> 
> Best,
> Erick
> 
> On Mon, May 8, 2017 at 7:44 AM, Shawn Heisey <apa...@elyograg.org> wrote:
>> On 5/8/2017 5:38 AM, Bernd Fehling wrote:
>>> boss ------ shard1 ----- server2:7574
>>>        |             |-- server2:8983 (leader)
>>
>> The reason this happened is that you've got two nodes running on
>> every server.  From SolrCloud's perspective, there are ten distinct
>> nodes, not five.
>>
>> SolrCloud doesn't notice the fact that different nodes are running on
>> the same server(s).  If your reaction to hearing this is that it
>> *should* notice, you're probably right, but in a typical use case, each
>> server should only be running one Solr instance, so this would never happen.
>>
>> There is only one case I can think of where I would recommend
>> running multiple instances per server, and that is when the required
>> heap size for a single instance would be VERY large.  Running two
>> instances with smaller heaps can yield better performance.
>>
>> See this issue:
>>
>> https://issues.apache.org/jira/browse/SOLR-6027
>>
>> Thanks,
>> Shawn
>>
