Honza

Thanks for your response.

Regards
Nilakantan

-----Original Message-----
From: Jan Friesse [mailto:[email protected]] 
Sent: Thursday, June 19, 2014 8:20 PM
To: Mahadevan, Nilakantan (STSD); Patrick Hemmer; [email protected]
Subject: Re: [corosync] automatic membership discovery

Mahadevan,

> Hi,
> 
> Just a thought: would it also be possible to make this an optional
> feature when setting up the cluster? The feature is good, but I would
> like a way to ensure that the existing nodes do not accept a new node
> unless it is present in their local config file. That would give
> system managers the flexibility to choose whichever behaviour is
> appropriate for them.

Sure. Such a feature (if implemented) would certainly mean setting
something like "auto_accept_node" to on, and it would not be the default.

Honza

> 
> Regards
> Nilakantan
> 
> 
> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Jan Friesse
> Sent: Thursday, June 19, 2014 7:20 PM
> To: Patrick Hemmer; [email protected]
> Subject: Re: [corosync] automatic membership discovery
> 
> Patrick,
> so, just to recapitulate your idea: let's say you have a cluster with 2
> nodes and decide to add a third one. Your idea is to configure the 3rd
> node properly (so that if we distributed that config file and called
> reload on every node, everything would work), in other words to add the
> 3rd node ONLY to the config file on the 3rd node and then start corosync.
> The other nodes would simply accept the node and add it to their
> membership (and probably to some kind of automatically generated
> persistent list of nodes). Do I understand that correctly?
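>
> For concreteness, a sketch of the nodelist the 3rd node's config file
> would carry in that scenario, while the existing two nodes still only
> list each other (addresses and node IDs invented for illustration):
>
>     nodelist {
>         node {
>             ring0_addr: 192.168.1.1
>             nodeid: 1
>         }
>         node {
>             ring0_addr: 192.168.1.2
>             nodeid: 2
>         }
>         node {
>             # Present only in the config file on the 3rd node itself;
>             # nodes 1 and 2 would learn about it at join time.
>             ring0_addr: 192.168.1.3
>             nodeid: 3
>         }
>     }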
> 
> Because if so, I believe it would also mean changing the config file on
> every node, simply to keep them in sync. And honestly, keeping the
> config files in sync is definitely the way I would like to go, but that
> way is very hard. Every single thing must be very well defined (like
> what is synchronized and what is not).
> 
> Regards,
>   Honza
> 
Patrick Hemmer wrote:
>> From: Patrick Hemmer <[email protected]>
>> Sent: 2014-06-16 11:25:40 EDT
>> To: Jan Friesse <[email protected]>, [email protected]
>> Subject: Re: [corosync] automatic membership discovery
>>
>>
>> On 2014/06/16 11:25, Patrick Hemmer wrote:
>>> Patrick,
>>>
>>>> I'm interested in having corosync automatically accept members into
>>>> the cluster without manual reconfiguration. Meaning that when I bring
>>>> a new node online, I want to configure it with the existing nodes,
>>>> and have those nodes automatically add the new node to their
>>>> nodelist.
>>>>
>>>> From a purely technical standpoint, this doesn't seem like it would
>>>> be hard to do. The only two things you have to do to add a node are
>>>> add the nodelist.node.X.nodeid and ring0_addr keys to cmap. When the
>>>> new node comes up, it starts sending out messages to the existing
>>>> nodes. The ring0_addr can be discovered from the source address, and
>>>> the nodeid is in the message.
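>>>>
>>>> As a rough illustration only (a sketch, not tested code; the node
>>>> index and address are invented), setting those two keys through the
>>>> C cmap API could look something like this:
>>>>
>>>>     #include <stdio.h>
>>>>     #include <corosync/cmap.h>
>>>>
>>>>     int main(void)
>>>>     {
>>>>         cmap_handle_t handle;
>>>>
>>>>         if (cmap_initialize(&handle) != CS_OK)
>>>>             return 1;
>>>>
>>>>         /* Hypothetical third node: nodelist index 2, nodeid 3,
>>>>          * address invented for the example. */
>>>>         if (cmap_set_uint32(handle, "nodelist.node.2.nodeid", 3) != CS_OK ||
>>>>             cmap_set_string(handle, "nodelist.node.2.ring0_addr",
>>>>                             "192.168.1.3") != CS_OK)
>>>>             fprintf(stderr, "failed to set nodelist keys\n");
>>>>
>>>>         cmap_finalize(handle);
>>>>         return 0;
>>>>     }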
>>>>
>>> I need to think about this a little more deeply. It sounds like it
>>> may work, but I'm not entirely sure.
>>>
>>>> Going even further, when using the allow_downscale and
>>>> last_man_standing features, we can automatically remove nodes from
>>>> the cluster when they disappear. With last_man_standing, the quorum's
>>>> expected_votes value is automatically adjusted when a node is lost,
>>>> so it makes no difference whether the node is offline or removed.
>>>> Then with the auto-join functionality, it'll automatically be added
>>>> back in when it re-establishes communication.
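>>>>
>>>> For reference, the sort of quorum configuration this assumes (a
>>>> sketch; the values are only illustrative) would be roughly:
>>>>
>>>>     quorum {
>>>>         provider: corosync_votequorum
>>>>         # Recalculate expected_votes/quorum as nodes leave the cluster.
>>>>         last_man_standing: 1
>>>>         # Allow expected_votes to shrink when nodes leave cleanly.
>>>>         allow_downscale: 1
>>>>     }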
>>>>
>>>> It might then even be possible to write the cmap data out to a file
>>>> when a node joins or leaves. That way, if corosync restarts and the
>>>> corosync.conf hasn't been updated, the nodelist can be read from this
>>>> saved copy. If the saved copy is out of date and some nodes are
>>>> unreachable, they would simply be removed, and added back when they
>>>> join. This wouldn't even have to be a part of corosync: some external
>>>> utility could watch the cmap values and take care of setting them
>>>> when corosync is launched.
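>>>>
>>>> A very rough sketch of such an external watcher (an illustration
>>>> only, not an existing utility), using the cmap tracking API to get
>>>> notified about nodelist changes so it could persist them somewhere:
>>>>
>>>>     #include <stdio.h>
>>>>     #include <corosync/corotypes.h>
>>>>     #include <corosync/cmap.h>
>>>>
>>>>     /* Called whenever a key under "nodelist." is added, changed or
>>>>      * removed. A real utility would rewrite its saved nodelist file
>>>>      * here instead of just printing. */
>>>>     static void nodelist_changed(cmap_handle_t handle,
>>>>             cmap_track_handle_t track_handle, int32_t event,
>>>>             const char *key_name, struct cmap_notify_value new_value,
>>>>             struct cmap_notify_value old_value, void *user_data)
>>>>     {
>>>>         printf("nodelist key changed: %s (event %d)\n",
>>>>                key_name, (int)event);
>>>>     }
>>>>
>>>>     int main(void)
>>>>     {
>>>>         cmap_handle_t handle;
>>>>         cmap_track_handle_t track_handle;
>>>>
>>>>         if (cmap_initialize(&handle) != CS_OK)
>>>>             return 1;
>>>>
>>>>         /* Track every key under the nodelist. prefix. */
>>>>         if (cmap_track_add(handle, "nodelist.",
>>>>                 CMAP_TRACK_ADD | CMAP_TRACK_DELETE |
>>>>                 CMAP_TRACK_MODIFY | CMAP_TRACK_PREFIX,
>>>>                 nodelist_changed, NULL, &track_handle) != CS_OK) {
>>>>             cmap_finalize(handle);
>>>>             return 1;
>>>>         }
>>>>
>>>>         cmap_dispatch(handle, CS_DISPATCH_BLOCKING);
>>>>         cmap_finalize(handle);
>>>>         return 0;
>>>>     }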
>>>>
>>>> Ultimately this allows us to have a large scale dynamically sized 
>>>> cluster without having to edit the config of every node each time a 
>>>> node joins or leaves.
>>>>
>>> Actually, this is exactly what pcs does.
>> Unfortunately pcs has lots of issues.
>>
>>  1. It assumes you will be using pacemaker as well.
>>     In some of our uses, we are using corosync without pacemaker.
>>
>>  2. It still has *lots* of bugs, and even more once you start trying
>>     to use non-Fedora-based distros.
>>     Some bugs have been open on the project for a year and a half.
>>
>>  3. It doesn't know the real address of its own host.
>>     What I mean is the case where a node is sitting behind NAT. We plan
>>     on running corosync inside a Docker container, and the container
>>     goes through NAT when it needs to talk to another host. So pcs
>>     would need to know the NATed address in order to advertise it to
>>     the other hosts. With the method described here, that address is
>>     automatically discovered.
>>
>>  4. It doesn't handle automatic cleanup.
>>     If you remove a node, something has to go and clean that node up.
>>     Basically you would have to write a program to connect to the quorum
>>     service and monitor for nodes going down, and then remove them. But
>>     then what happens if that node was only temporarily down? Who is
>>     responsible for adding it back into the cluster? If the node that
>>     was down is responsible for adding itself back in, what if another
>>     node joined the cluster while it was down? Its list will be
>>     incomplete. You could do a few things to try and alleviate these
>>     headaches, but automatic membership just feels more like the right
>>     solution.
>>
>>  5. It doesn't allow you to adjust the config file.
>>
>>
>>
>>
>>>> This really doesn't sound like it would be hard to do. I might even 
>>>> be willing to attempt implementing it myself if this sounds like 
>>>> something that would be acceptable to merge into the code base.
>>>> Thoughts?
>>>>
>>> Yes, but the question is whether it is really worth it. I mean:
>>> - With multicast you have FULLY dynamic membership (see the sketch
>>> below)
>>> - pcs is able to distribute the config file, so adding a new node to a
>>> UDPU cluster is easy
>>>
>>> Do you see any use case where pcs or multicast doesn't work? (To
>>> clarify: I'm not criticizing your idea (actually I find it
>>> interesting), but I'm trying to find the real killer use case for this
>>> feature, whose implementation will almost certainly take quite a lot
>>> of time.)
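>>>
>>> For illustration (a sketch; the addresses are made up), the multicast
>>> case is just the classic totem interface configuration with no
>>> nodelist at all:
>>>
>>>     totem {
>>>         version: 2
>>>         # Multicast transport: membership is discovered dynamically,
>>>         # so no nodelist section is needed.
>>>         interface {
>>>             ringnumber: 0
>>>             bindnetaddr: 192.168.1.0
>>>             mcastaddr: 239.255.1.1
>>>             mcastport: 5405
>>>         }
>>>     }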
>>
>> Aside from the pcs issues mentioned above, having this in corosync
>> just feels like the right solution: no external processes involved,
>> no additional lines of communication, and real-time on-demand
>> updating. The end goal might be achievable by modifying pcs to resolve
>> those issues, but is that the right way? If people want to use crmsh
>> instead of pcs, do they not get this functionality?
>>
>>> Regards,
>>>   Honza
>>>
>>>> -Patrick
>>>>
>>
>>
> 


_______________________________________________
discuss mailing list
[email protected]
http://lists.corosync.org/mailman/listinfo/discuss
