> On 13 Nov 2015, at 7:31 AM, Vladimir Kuklin <vkuk...@mirantis.com> wrote:
>
> Hi, Andrew
>
> Thanks for the quick turnaround.
>
> > The one I linked to in my original reply does:
> >
> > https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/rabbitmq-cluster
>
> I do not have logs from testing this script. Maybe Bogdan has something to
> say about the results of testing it. At first glance it does not contain the
> gigantic number of workarounds we injected into our script to handle the
> various situations when a node fails to join, or tries to join a cluster
> that does not want to accept it (in that case you need to kick it from the
> cluster with forget_cluster_node, which starts an RPC multicall in the
> rabbitmq internals to all cluster nodes, including the dead one, and hangs
> forever). We actually started a long time ago with an approach similar to
> the one in the script above, but we faced a lot of issues when a node tries
> to join a cluster after a dirty failover or after a long time out of the
> cluster.
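For reference, the kick-it-out sequence described above amounts to roughly
the following (a sketch only; the node name is a placeholder and exact
behaviour varies by RabbitMQ version):

    # Run on a surviving node to eject a failed peer from the Mnesia cluster.
    # Dropping the Erlang distribution link first reduces the chance of
    # forget_cluster_node hanging on an RPC call to the dead node.
    rabbitmqctl eval "disconnect_node('rabbit@failed-node')."
    rabbitmqctl forget_cluster_node rabbit@failed-node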
That's really good info, much appreciated.

Peter, Oyvind: Sounds like it would be worth validating the agent we're using
in these types of situations. Based on that we can plot a path forward.

> I do not have all the logs of which particular cases we were handling while
> introducing that additional logic (it was an agile process, if you know
> what I mean :-) ), but we finally came up with this almost-2K-line script.
> We are actively communicating with the Pivotal folks on improving the
> methods for monitoring RabbitMQ cluster nodes, or even on switching to the
> RabbitMQ clusterer+autocluster plugins and writing a new, smaller and
> fancier OCF script, but I guess this is only planned for future Fuel
> releases. :-)
>
> > > Changing the state isn't ideal but there is precedent, the part that has
> > > me concerned is the error codes coming out of notify.
> > > Apart from producing some log messages, I can't think how it would
> > > produce any recovery.
> > >
> > > Unless you're relying on the subsequent monitor operation to notice the
> > > error state.
> > > I guess that would work but you might be waiting a while for it to notice.
> >
> > Yes, we are relying on subsequent monitor operations. We also have several
> > OCF check levels to catch the case when one node does not have the
> > rabbitmq application started properly (btw, there was a strange bug where
> > we had to wait for several non-zero-level checks to fail to get the
> > resource to restart: http://bugs.clusterlabs.org/show_bug.cgi?id=5243).
>
> Regarding this bug - it was very easy to reproduce: just add an additional
> check to the 'Dummy' resource with a non-intersecting interval that returns
> the ERR_GENERIC code, while the default check returns the SUCCESS code. You
> will find that the resource restarts only after 2 consecutive failures of
> the non-zero-level check.

Ack. I've asked some people to look into it.
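For the record, the reproduction described above would look roughly like this
(a sketch, not the exact code attached to the bug; the Dummy agent is patched
by hand and the intervals are arbitrary):

    # Patch ocf:pacemaker:Dummy's monitor so only the deeper check fails
    # (the stock agent has no such branch; purely illustrative):
    dummy_monitor() {
        if [ "${OCF_CHECK_LEVEL:-0}" -gt 0 ]; then
            return $OCF_ERR_GENERIC   # non-zero-level check always fails
        fi
        return $OCF_SUCCESS           # default check always passes
    }

    # Configure two monitor operations with non-intersecting intervals:
    pcs resource create test-dummy ocf:pacemaker:Dummy \
      op monitor interval=10 timeout=20 \
      op monitor interval=17 timeout=20 OCF_CHECK_LEVEL=10

Per the report, the resource should restart after the first failure of the
OCF_CHECK_LEVEL=10 monitor, but it only restarts after the second.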
> On Thu, Nov 12, 2015 at 10:58 PM, Andrew Beekhof <abeek...@redhat.com> wrote:
>
> > On 12 Nov 2015, at 10:44 PM, Vladimir Kuklin <vkuk...@mirantis.com> wrote:
> >
> > Hi, Andrew
> >
> > > Ah good, I understood it correctly then :)
> > > I would be interested in your opinion of how the other agent does the
> > > bootstrapping (ie. without notifications or master/slave).
> >
> > > That makes sense, the part I'm struggling with is that it sounds like
> > > the other agent shouldn't work at all.
> > > Yet we've used it extensively and not experienced these kinds of hangs.
> >
> > Regarding other scripts - I am not aware of any other scripts that
> > actually handle a cloned rabbitmq server. I may be mistaken, of course.
> > So if you are aware of scripts that succeed in creating a rabbitmq
> > cluster which actually survives 1-node or all-node failure scenarios and
> > reassembles the cluster automatically - please let us know.
>
> The one I linked to in my original reply does:
>
> https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/rabbitmq-cluster
>
> > > Changing the state isn't ideal but there is precedent, the part that has
> > > me concerned is the error codes coming out of notify.
> > > Apart from producing some log messages, I can't think how it would
> > > produce any recovery.
> > >
> > > Unless you're relying on the subsequent monitor operation to notice the
> > > error state.
> > > I guess that would work but you might be waiting a while for it to notice.
> >
> > Yes, we are relying on subsequent monitor operations. We also have several
> > OCF check levels to catch the case when one node does not have the
> > rabbitmq application started properly (btw, there was a strange bug where
> > we had to wait for several non-zero-level checks to fail to get the
> > resource to restart: http://bugs.clusterlabs.org/show_bug.cgi?id=5243).
>
> It appears I misunderstood your bug the first time around :-(
> Do you still have logs of this occurring?
>
> > I now remember why we did notify errors - for error logging, I guess.
> >
> > On Thu, Nov 12, 2015 at 1:30 AM, Andrew Beekhof <abeek...@redhat.com> wrote:
> >
> > > On 11 Nov 2015, at 11:35 PM, Vladimir Kuklin <vkuk...@mirantis.com> wrote:
> > >
> > > Hi, Andrew
> > >
> > > Let me answer your questions.
> > >
> > > This agent is active/active, but it actually marks one of the nodes as a
> > > 'pseudo'-master, which is used as a target for the other nodes to join.
> > > We also check which node is the master and use that in the monitor
> > > action to check whether this node is clustered with the 'master' node.
> > > When we bootstrap the cluster, we need to decide which node to mark as
> > > the master. Then, when it starts (actually, promotes), we can finally
> > > pick up its name through the notification mechanism and ask the other
> > > nodes to join this cluster.
> >
> > Ah good, I understood it correctly then :)
> > I would be interested in your opinion of how the other agent does the
> > bootstrapping (ie. without notifications or master/slave).
> >
> > > Regarding disconnect_node+forget_cluster_node, this is quite simple - we
> > > need to eject the node from the cluster. Otherwise it is still mentioned
> > > in the list of cluster nodes, and a lot of cluster actions, e.g.
> > > list_queues, will hang forever, as will the forget_cluster_node action
> > > itself.
> >
> > That makes sense, the part I'm struggling with is that it sounds like the
> > other agent shouldn't work at all.
> > Yet we've used it extensively and not experienced these kinds of hangs.
> >
> > > We also handle this case whenever a node leaves the cluster. If you
> > > remember, I wrote an email to the Pacemaker ML about getting
> > > notifications on the node unjoin event: '[openstack-dev]
> > > [Fuel][Pacemaker][HA] Notifying clones of offline nodes'.
> >
> > Oh, I recall that now.
> >
> > > So we went another way and added a dbus daemon listener that does the
> > > same when a node leaves the corosync cluster (we know that this is a
> > > little bit racy, but the disconnect+forget pair of actions is
> > > idempotent).
> > >
> > > Regarding notification commands - we changed the behaviour to one that
> > > fit our use cases better and passed our destructive tests. It could be
> > > Pacemaker-version dependent, so I agree we should consider changing this
> > > behaviour. But so far it has worked for us.
> >
> > Changing the state isn't ideal but there is precedent, the part that has
> > me concerned is the error codes coming out of notify.
> > Apart from producing some log messages, I can't think how it would produce
> > any recovery.
> >
> > Unless you're relying on the subsequent monitor operation to notice the
> > error state.
> > I guess that would work but you might be waiting a while for it to notice.
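A common way to make that reliable - a sketch of the general pattern only,
not the Fuel agent's actual code - is to have notify record the failure and
let the next monitor surface it, since Pacemaker ignores notify exit codes
anyway:

    # join_cluster is a hypothetical helper; HA_RSCTMP is the standard
    # resource-agents temp directory.
    rabbit_notify() {
        if ! join_cluster; then
            touch "${HA_RSCTMP}/rabbit-join-failed"   # remember the failure
        fi
        return $OCF_SUCCESS   # notify exit codes are ignored by Pacemaker
    }

    rabbit_monitor() {
        if [ -f "${HA_RSCTMP}/rabbit-join-failed" ]; then
            rm -f "${HA_RSCTMP}/rabbit-join-failed"
            return $OCF_ERR_GENERIC   # now Pacemaker sees it and recovers
        fi
        # ... normal health checks here ...
        return $OCF_SUCCESS
    }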
> > > On Wed, Nov 11, 2015 at 2:12 PM, Andrew Beekhof <abeek...@redhat.com> wrote:
> > >
> > > > On 11 Nov 2015, at 6:26 PM, bdobre...@mirantis.com wrote:
> > > >
> > > > Thank you Andrew.
> > > > Answers below.
> > > > >>>
> > > > Sounds interesting, can you give any comment about how it differs from
> > > > the other[i] upstream agent?
> > > > Am I right that this one is effectively A/P and won't function without
> > > > some kind of shared storage?
> > > > Any particular reason you went down this path instead of full A/A?
> > > >
> > > > [i] https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/rabbitmq-cluster
> > > > <<<
> > > > It is based on multistate clone notifications. It requires nothing
> > > > shared but the cluster information base (CIB), where all Pacemaker
> > > > resources are stored anyway.
> > > > And it is fully A/A.
> > >
> > > Oh! So I should skip the A/P parts before "Auto-configuration of a
> > > cluster with a Pacemaker"?
> > > Is the idea that the master mode is for picking a node to bootstrap the
> > > cluster?
> > >
> > > If so, I don't believe that should be necessary provided you specify
> > > ordered=true for the clone.
> > > This allows you to assume in the agent that your instance is the only
> > > one currently changing state (by starting or stopping).
> > > I notice that rabbitmq.com explicitly sets this to false… any particular
> > > reason?
> > >
> > > Regarding the pcs command to create the resource, you can simplify it to:
> > >
> > > pcs resource create --force --master p_rabbitmq-server ocf:rabbitmq:rabbitmq-server-ha \
> > >   erlang_cookie=DPMDALGUKEOMPTHWPYKC node_port=5672 \
> > >   op monitor interval=30 timeout=60 \
> > >   op monitor interval=27 role=Master timeout=60 \
> > >   op monitor interval=103 role=Slave timeout=60 OCF_CHECK_LEVEL=30 \
> > >   meta notify=true ordered=false interleave=true master-max=1 \
> > >   master-node-max=1
> > >
> > > That is, if you update the stop/start/notify/promote/demote timeouts in
> > > the agent's metadata.
> > >
> > > Lines 1602, 1565, 1621, 1632, 1657, and 1678 have the notify command
> > > returning an error.
> > > Was this logic tested? Because Pacemaker does not currently
> > > support/allow notify actions to fail.
> > > IIRC Pacemaker simply ignores them.
> > >
> > > Modifying the resource state in notifications is also highly unusual.
> > > What was the reason for that?
> > >
> > > I notice that on node down, this agent makes disconnect_node and
> > > forget_cluster_node calls.
> > > The other upstream agent does not, do you have any information about the
> > > bad things that might happen as a result?
> > >
> > > Basically I'm looking for what each option does differently/better, with
> > > a view to converging on a single implementation.
> > > I don't much care in which location it lives.
> > >
> > > I'm CC'ing the other upstream maintainer, it would be good if you guys
> > > could have a chat :-)
> > >
> > > > All running rabbit nodes may process AMQP connections. The master
> > > > state is only an initial join point for the cluster, which the other
> > > > slaves then join.
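For comparison, the ordered-clone alternative suggested above might look
roughly like this, reusing the agent and parameters from the earlier pcs
example (an untested sketch; pcs clone syntax varies between versions):

    # A plain ordered clone instead of master/slave: instances start one at
    # a time, so the first instance up can bootstrap the cluster and the
    # rest simply join it, with no promote/demote or master bookkeeping.
    pcs resource create p_rabbitmq-server ocf:rabbitmq:rabbitmq-server-ha \
      erlang_cookie=DPMDALGUKEOMPTHWPYKC node_port=5672 \
      op monitor interval=30 timeout=60 \
      clone ordered=true interleave=true notify=true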
> > > > Note: here you can also find the event flow charts [0].
> > > >
> > > > [0] https://www.rabbitmq.com/pacemaker.html
> > > >
> > > > Regards,
> > > > Bogdan
>
> --
> Yours Faithfully,
> Vladimir Kuklin,
> Fuel Library Tech Lead,
> Mirantis, Inc.
> www.mirantis.com
> vkuk...@mirantis.com

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev