> On 13 Nov 2015, at 7:31 AM, Vladimir Kuklin <vkuk...@mirantis.com> wrote:
>
> Hi, Andrew
>
> Thanks for the quick turnaround.
>
> > The one I linked to in my original reply does:
> >
> > https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/rabbitmq-cluster
>
> I do not have logs from testing this script. Maybe Bogdan has something to
> say about the results of testing it. At first glance it does not contain the
> gigantic number of workarounds we injected into our script to handle the
> various situations when a node fails to join, or tries to join a cluster
> that does not want to accept it (in that case you need to kick it from the
> cluster with forget_cluster_node, which starts an RPC multicall in the
> rabbitmq internals to all cluster nodes, including the dead one, and hangs
> forever). We actually started a long time ago with an approach similar to
> the one in the script above, but we faced a lot of issues when a node tries
> to join a cluster after a dirty failover or after a long time out of the
> cluster.
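For reference, the kick-it-out sequence described above amounts to roughly
the following (a sketch only; the node name is a placeholder and exact
behaviour varies by RabbitMQ version):

    # Run on a surviving node to eject a failed peer from the Mnesia cluster.
    # Dropping the Erlang distribution link first reduces the chance of
    # forget_cluster_node hanging on an RPC call to the dead node.
    rabbitmqctl eval "disconnect_node('rabbit@failed-node')."
    rabbitmqctl forget_cluster_node rabbit@failed-node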
That's really good info, much appreciated.

Peter, Oyvind: Sounds like it would be worth validating the agent we're using
in these types of situations. Based on that we can plot a path forward.

> I do not have all the logs of which particular cases we were handling while
> introducing that additional logic (it was an agile process, if you know
> what I mean :-) ), but we finally came up with this almost-2K-line script.
> We are actively communicating with the Pivotal folks on improving the
> methods for monitoring RabbitMQ cluster nodes, or even on switching to the
> RabbitMQ clusterer+autocluster plugins and writing a new, smaller and
> fancier OCF script, but I guess this is only planned for future Fuel
> releases. :-)
>
> > > Changing the state isn't ideal but there is precedent, the part that has
> > > me concerned is the error codes coming out of notify.
> > > Apart from producing some log messages, I can't think how it would
> > > produce any recovery.
> > >
> > > Unless you're relying on the subsequent monitor operation to notice the
> > > error state.
> > > I guess that would work but you might be waiting a while for it to notice.
> >
> > Yes, we are relying on subsequent monitor operations. We also have several
> > OCF check levels to catch the case when one node does not have the
> > rabbitmq application started properly (btw, there was a strange bug where
> > we had to wait for several non-zero-level checks to fail to get the
> > resource to restart: http://bugs.clusterlabs.org/show_bug.cgi?id=5243).
>
> Regarding this bug - it was very easy to reproduce: just add an additional
> check to the 'Dummy' resource with a non-intersecting interval that returns
> the ERR_GENERIC code, while the default check returns the SUCCESS code. You
> will find that the resource restarts only after 2 consecutive failures of
> the non-zero-level check.

Ack. I've asked some people to look into it.
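For the record, the reproduction described above would look roughly like this
(a sketch, not the exact code attached to the bug; the Dummy agent is patched
by hand and the intervals are arbitrary):

    # Patch ocf:pacemaker:Dummy's monitor so only the deeper check fails
    # (the stock agent has no such branch; purely illustrative):
    dummy_monitor() {
        if [ "${OCF_CHECK_LEVEL:-0}" -gt 0 ]; then
            return $OCF_ERR_GENERIC   # non-zero-level check always fails
        fi
        return $OCF_SUCCESS           # default check always passes
    }

    # Configure two monitor operations with non-intersecting intervals:
    pcs resource create test-dummy ocf:pacemaker:Dummy \
      op monitor interval=10 timeout=20 \
      op monitor interval=17 timeout=20 OCF_CHECK_LEVEL=10

Per the report, the resource should restart after the first failure of the
OCF_CHECK_LEVEL=10 monitor, but it only restarts after the second.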
> On Thu, Nov 12, 2015 at 10:58 PM, Andrew Beekhof <abeek...@redhat.com> wrote:
>
> > On 12 Nov 2015, at 10:44 PM, Vladimir Kuklin <vkuk...@mirantis.com> wrote:
> >
> > Hi, Andrew
> >
> > > Ah good, I understood it correctly then :)
> > > I would be interested in your opinion of how the other agent does the
> > > bootstrapping (ie. without notifications or master/slave).
> >
> > > That makes sense, the part I'm struggling with is that it sounds like
> > > the other agent shouldn't work at all.
> > > Yet we've used it extensively and not experienced these kinds of hangs.
> >
> > Regarding other scripts - I am not aware of any other scripts that
> > actually handle a cloned rabbitmq server. I may be mistaken, of course.
> > So if you are aware of scripts that succeed in creating a rabbitmq
> > cluster which actually survives 1-node or all-node failure scenarios and
> > reassembles the cluster automatically - please let us know.
>
> The one I linked to in my original reply does:
>
> https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/rabbitmq-cluster
>
> > > Changing the state isn't ideal but there is precedent, the part that has
> > > me concerned is the error codes coming out of notify.
> > > Apart from producing some log messages, I can't think how it would
> > > produce any recovery.
> > >
> > > Unless you're relying on the subsequent monitor operation to notice the
> > > error state.
> > > I guess that would work but you might be waiting a while for it to notice.
> >
> > Yes, we are relying on subsequent monitor operations. We also have several
> > OCF check levels to catch the case when one node does not have the
> > rabbitmq application started properly (btw, there was a strange bug where
> > we had to wait for several non-zero-level checks to fail to get the
> > resource to restart: http://bugs.clusterlabs.org/show_bug.cgi?id=5243).
>
> It appears I misunderstood your bug the first time around :-(
> Do you still have logs of this occurring?
>
> > I now remember why we did notify errors - for error logging, I guess.
> >
> > On Thu, Nov 12, 2015 at 1:30 AM, Andrew Beekhof <abeek...@redhat.com> wrote:
> >
> > > On 11 Nov 2015, at 11:35 PM, Vladimir Kuklin <vkuk...@mirantis.com> wrote:
> > >
> > > Hi, Andrew
> > >
> > > Let me answer your questions.
> > >
> > > This agent is active/active, but it actually marks one of the nodes as a
> > > 'pseudo'-master, which is used as a target for the other nodes to join.
> > > We also check which node is the master and use that in the monitor
> > > action to check whether this node is clustered with the 'master' node.
> > > When we bootstrap the cluster, we need to decide which node to mark as
> > > the master. Then, when it starts (actually, promotes), we can finally
> > > pick up its name through the notification mechanism and ask the other
> > > nodes to join this cluster.
> >
> > Ah good, I understood it correctly then :)
> > I would be interested in your opinion of how the other agent does the
> > bootstrapping (ie. without notifications or master/slave).
> >
> > > Regarding disconnect_node+forget_cluster_node, this is quite simple - we
> > > need to eject the node from the cluster. Otherwise it is still mentioned
> > > in the list of cluster nodes, and a lot of cluster actions, e.g.
> > > list_queues, will hang forever, as will the forget_cluster_node action
> > > itself.
> >
> > That makes sense, the part I'm struggling with is that it sounds like the
> > other agent shouldn't work at all.
> > Yet we've used it extensively and not experienced these kinds of hangs.
> >
> > > We also handle this case whenever a node leaves the cluster. If you
> > > remember, I wrote an email to the Pacemaker ML about getting
> > > notifications on the node unjoin event: '[openstack-dev]
> > > [Fuel][Pacemaker][HA] Notifying clones of offline nodes'.
> >
> > Oh, I recall that now.
> >
> > > So we went another way and added a dbus daemon listener that does the
> > > same when a node leaves the corosync cluster (we know that this is a
> > > little bit racy, but the disconnect+forget pair of actions is
> > > idempotent).
> > >
> > > Regarding notification commands - we changed the behaviour to one that
> > > fit our use cases better and passed our destructive tests. It could be
> > > Pacemaker-version dependent, so I agree we should consider changing this
> > > behaviour. But so far it has worked for us.
> >
> > Changing the state isn't ideal but there is precedent, the part that has
> > me concerned is the error codes coming out of notify.
> > Apart from producing some log messages, I can't think how it would produce
> > any recovery.
> >
> > Unless you're relying on the subsequent monitor operation to notice the
> > error state.
> > I guess that would work but you might be waiting a while for it to notice.
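A common way to make that reliable - a sketch of the general pattern only,
not the Fuel agent's actual code - is to have notify record the failure and
let the next monitor surface it, since Pacemaker ignores notify exit codes
anyway:

    # join_cluster is a hypothetical helper; HA_RSCTMP is the standard
    # resource-agents temp directory.
    rabbit_notify() {
        if ! join_cluster; then
            touch "${HA_RSCTMP}/rabbit-join-failed"   # remember the failure
        fi
        return $OCF_SUCCESS   # notify exit codes are ignored by Pacemaker
    }

    rabbit_monitor() {
        if [ -f "${HA_RSCTMP}/rabbit-join-failed" ]; then
            rm -f "${HA_RSCTMP}/rabbit-join-failed"
            return $OCF_ERR_GENERIC   # now Pacemaker sees it and recovers
        fi
        # ... normal health checks here ...
        return $OCF_SUCCESS
    }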
> > > On Wed, Nov 11, 2015 at 2:12 PM, Andrew Beekhof <abeek...@redhat.com> wrote:
> > >
> > > > On 11 Nov 2015, at 6:26 PM, bdobre...@mirantis.com wrote:
> > > >
> > > > Thank you Andrew.
> > > > Answers below.
> > > > >>>
> > > > Sounds interesting, can you give any comment about how it differs from
> > > > the other[i] upstream agent?
> > > > Am I right that this one is effectively A/P and won't function without
> > > > some kind of shared storage?
> > > > Any particular reason you went down this path instead of full A/A?
> > > >
> > > > [i] https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/rabbitmq-cluster
> > > > <<<
> > > > It is based on multistate clone notifications. It requires nothing
> > > > shared but the cluster information base (CIB), where all Pacemaker
> > > > resources are stored anyway.
> > > > And it is fully A/A.
> > >
> > > Oh! So I should skip the A/P parts before "Auto-configuration of a
> > > cluster with a Pacemaker"?
> > > Is the idea that the master mode is for picking a node to bootstrap the
> > > cluster?
> > >
> > > If so, I don't believe that should be necessary provided you specify
> > > ordered=true for the clone.
> > > This allows you to assume in the agent that your instance is the only
> > > one currently changing state (by starting or stopping).
> > > I notice that rabbitmq.com explicitly sets this to false… any particular
> > > reason?
> > >
> > > Regarding the pcs command to create the resource, you can simplify it to:
> > >
> > > pcs resource create --force --master p_rabbitmq-server ocf:rabbitmq:rabbitmq-server-ha \
> > >   erlang_cookie=DPMDALGUKEOMPTHWPYKC node_port=5672 \
> > >   op monitor interval=30 timeout=60 \
> > >   op monitor interval=27 role=Master timeout=60 \
> > >   op monitor interval=103 role=Slave timeout=60 OCF_CHECK_LEVEL=30 \
> > >   meta notify=true ordered=false interleave=true master-max=1 \
> > >   master-node-max=1
> > >
> > > That is, if you update the stop/start/notify/promote/demote timeouts in
> > > the agent's metadata.
> > >
> > > Lines 1602, 1565, 1621, 1632, 1657, and 1678 have the notify command
> > > returning an error.
> > > Was this logic tested? Because Pacemaker does not currently
> > > support/allow notify actions to fail.
> > > IIRC Pacemaker simply ignores them.
> > >
> > > Modifying the resource state in notifications is also highly unusual.
> > > What was the reason for that?
> > >
> > > I notice that on node down, this agent makes disconnect_node and
> > > forget_cluster_node calls.
> > > The other upstream agent does not, do you have any information about the
> > > bad things that might happen as a result?
> > >
> > > Basically I'm looking for what each option does differently/better, with
> > > a view to converging on a single implementation.
> > > I don't much care in which location it lives.
> > >
> > > I'm CC'ing the other upstream maintainer, it would be good if you guys
> > > could have a chat :-)
> > >
> > > > All running rabbit nodes may process AMQP connections. The master
> > > > state is only an initial join point for the cluster, which the other
> > > > slaves then join.
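For comparison, the ordered-clone alternative suggested above might look
roughly like this, reusing the agent and parameters from the earlier pcs
example (an untested sketch; pcs clone syntax varies between versions):

    # A plain ordered clone instead of master/slave: instances start one at
    # a time, so the first instance up can bootstrap the cluster and the
    # rest simply join it, with no promote/demote or master bookkeeping.
    pcs resource create p_rabbitmq-server ocf:rabbitmq:rabbitmq-server-ha \
      erlang_cookie=DPMDALGUKEOMPTHWPYKC node_port=5672 \
      op monitor interval=30 timeout=60 \
      clone ordered=true interleave=true notify=true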
> > > > Note: here you can also find the event flow charts [0].
> > > >
> > > > [0] https://www.rabbitmq.com/pacemaker.html
> > > >
> > > > Regards,
> > > > Bogdan
>
> --
> Yours Faithfully,
> Vladimir Kuklin,
> Fuel Library Tech Lead,
> Mirantis, Inc.
> www.mirantis.com
> vkuk...@mirantis.com

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev