Honza,

Thanks for your response.

Regards
Nilakantan

-----Original Message-----
From: Jan Friesse [mailto:[email protected]]
Sent: Thursday, June 19, 2014 8:20 PM
To: Mahadevan, Nilakantan (STSD); Patrick Hemmer; [email protected]
Subject: Re: [corosync] automatic membership discovery

Mahadevan,

> Hi,
>
> Just a thought, would it also provide the flexibility to make this an
> optional feature while setting up the cluster. This feature is good,
> but if there is a way for me to ensure that the existing nodes do not
> accept unless the new nodes are present in the local Config file. In
> that case it would give the flexibility to system managers to choose
> whichever is appropriate for them

sure. Such feature (if implemented) would mean for sure to set something
like "auto_accept_node" to on and not being default.

Honza

>
> Regards
> Nilakantan
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Jan Friesse
> Sent: Thursday, June 19, 2014 7:20 PM
> To: Patrick Hemmer; [email protected]
> Subject: Re: [corosync] automatic membership discovery
>
> Patrick,
> so just to recapitulate your idea. Let's say you have cluster with 2 nodes.
> Now, you will decide to add third node. Your idea is about properly configure
> 3rd node (so if we would distribute that config file, call reload on every
> node, everything would work), in other words, add 3rd node ONLY to config
> file on 3rd node and then start corosync. Other nodes will just accept node,
> add it to their membership (and probably some kind of automatically generated
> persistent list of nodes). Do I understand it correctly?
>
> Because if so, I believe it would mean also change config file, simply to
> keep them in sync. And honestly, keeping config file is for sure a way I
> would like to go, but that way is very hard. Every single thing must be very
> well defined (like what is synchronized and what is not).
>
> Regards,
> Honza
>
> Patrick Hemmer wrote:
>> From: Patrick Hemmer <[email protected]>
>> Sent: 2014-06-16 11:25:40 EDT
>> To: Jan Friesse <[email protected]>, [email protected]
>> Subject: Re: [corosync] automatic membership discovery
>>
>>
>> On 2014/06/16 11:25, Patrick Hemmer wrote:
>>> Patrick,
>>>
>>>> I'm interested in having corosync automatically accept members into
>>>> the cluster without manual reconfiguration. Meaning that when I
>>>> bring a new node online, I want to configure it for the existing
>>>> nodes, and those nodes will automatically add the new node into their
>>>> nodelist.
>>>> From a purely technical standpoint, this doesn't seem like it would
>>>> be hard to do. The only 2 things you have to do to add a node are
>>>> add the nodelist.node.X.nodeid and ring0_addr to cmap. When the new
>>>> node comes up, it starts sending out messages to the existing nodes.
>>>> The ring0_addr can be discovered from the source address, and the nodeid
>>>> is in the message.
>>>>
>>> I need to think about this little deeper. It sounds like it may
>>> work, but I'm not entirely sure.
>>>
>>>> Going even further, when using the allow_downscale and
>>>> last_man_standing features, we can automatically remove nodes from
>>>> the cluster when they disappear. With last_man_standing, the quorum
>>>> expected votes is automatically adjusted when a node is lost, so it
>>>> makes no difference whether the node is offline, or removed. Then
>>>> with the auto-join functionality, it'll automatically be added back
>>>> in when it re-establishes communication.
>>>>
>>>> It might then even be possible to write the cmap data out to a file
>>>> when a node joins or leaves. This way if corosync restarts, and the
>>>> corosync.conf hasn't been updated, the nodelist can be read from
>>>> this save. If the save is out of date, and some nodes are
>>>> unreachable, they would simply be removed, and added when they join.
>>>> This wouldn't even have to be a part of corosync. Could have some
>>>> external utility watch the cmap values, and take care of setting
>>>> them when corosync is launched.
>>>>
>>>> Ultimately this allows us to have a large scale dynamically sized
>>>> cluster without having to edit the config of every node each time a
>>>> node joins or leaves.
>>>>
>>> Actually, this is exactly what pcs does.
>> Unfortunately pcs has lots of issues.
>>
>> 1. It assumes you will be using pacemaker as well.
>>    In some of our uses, we are using corosync without pacemaker.
>>
>> 2. It still has *lots* of bugs. Even more once you start trying to use
>>    non-fedora based distros.
>>    Some bugs have been open on the project for a year and a half.
>>
>> 3. It doesn't know the real address of its own host.
>>    What I mean is when a node is sitting behind NAT. We plan on running
>>    corosync inside a docker container, and the container goes through
>>    NAT if it needs to talk to another host. So pcs would need to know
>>    the NAT address to advertise it to the other hosts. With the method
>>    described here, that address is automatically discovered.
>>
>> 4. Doesn't handle automatic cleanup.
>>    If you remove a node, something has to go and clean that node up.
>>    Basically you would have to write a program to connect to the quorum
>>    service and monitor for nodes going down, and then remove them. But
>>    then what happens if that node was only temporarily down? Who is
>>    responsible for adding it back into the cluster? If the node that
>>    was down is responsible for adding itself back in, what if another
>>    node joined the cluster while it was down? Its list will be
>>    incomplete. You could do a few things to try and alleviate these
>>    headaches, but automatic membership just feels more like the right
>>    solution.
>>
>> 5. It doesn't allow you to adjust the config file.
>>
>>
>>>> This really doesn't sound like it would be hard to do.
>>>> I might even be willing to attempt implementing it myself if this
>>>> sounds like something that would be acceptable to merge into the
>>>> code base.
>>>> Thoughts?
>>>>
>>> Yes, but question is if it is really worth of it. I mean:
>>> - With multicast you have FULLY dynamic membership
>>> - PCS is able to distribute config file so adding new node to UDPU
>>>   cluster is easy
>>>
>>> Do you see any use case where pcs or multicast doesn't work? (to
>>> clarify. I'm not blaming your idea (actually I find it interesting)
>>> but I'm trying to find out real killer use case for this feature
>>> which implementation will take quite a lot time almost for sure).
>>
>> Aside from the pcs issues mentioned above, having this in corosync
>> just feels like the right solution. No external processes involved,
>> no additional lines of communication, real-time on-demand updating.
>> The end goal might be able to be accomplished by modifying pcs to
>> resolve the issues, but is that the right way? If people want to use
>> crmsh over pcs, do they not get this functionality?
>>
>>> Regards,
>>> Honza
>>>
>>>> -Patrick
>>>>
>>>> _______________________________________________
>>>> discuss mailing list
>>>> [email protected]
>>>> http://lists.corosync.org/mailman/listinfo/discuss
