Re: [openstack-dev] [Fuel] Speed Up RabbitMQ Recovering

Andrew Beekhof Tue, 19 May 2015 19:09:07 -0700

> On 20 May 2015, at 6:05 am, Andrew Woodward <xar...@gmail.com> wrote:
> 
> 
> 
> On Thu, May 7, 2015 at 5:01 PM Andrew Beekhof <abeek...@redhat.com> wrote:
> 
> > On 5 May 2015, at 1:19 pm, Zhou Zheng Sheng / 周征晟 <zhengsh...@awcloud.com> 
> > wrote:
> >
> > Thank you Andrew.
> >
> > on 2015/05/05 08:03, Andrew Beekhof wrote:
> >>> On 28 Apr 2015, at 11:15 pm, Bogdan Dobrelya <bdobre...@mirantis.com> 
> >>> wrote:
> >>>
> >>>> Hello,
> >>> Hello, Zhou
> >>>
> >>>> I am using Fuel 6.0.1 and find that the RabbitMQ recovery time is long
> >>>> after a power failure. I have a running HA environment, then I reset the
> >>>> power of all the machines at the same time. I observe that after reboot it
> >>>> usually takes 10 minutes for the RabbitMQ cluster to appear running in
> >>>> master-slave mode in pacemaker. If I power off all 3 controllers and
> >>>> only start 2 of them, the downtime can sometimes be as long as 20
> >>>> minutes.
> >>> Yes, this is a known issue [0]. Note that many bugfixes, such as
> >>> [1],[2],[3], were merged for the MQ OCF script, so you may want to try
> >>> backporting them as well by following the guide [4].
> >>>
> >>> [0] https://bugs.launchpad.net/fuel/+bug/1432603
> >>> [1] https://review.openstack.org/#/c/175460/
> >>> [2] https://review.openstack.org/#/c/175457/
> >>> [3] https://review.openstack.org/#/c/175371/
> >>> [4] https://review.openstack.org/#/c/170476/
> >> Is there a reason you’re using a custom OCF script instead of the 
> >> upstream[a] one?
> >> Please have a chat with David (the maintainer, in CC) if there is 
> >> something you believe is wrong with it.
> >>
> >> [a] 
> >> https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/rabbitmq-cluster
> >
> > I'm using the OCF script from the Fuel project, specifically from the
> > "6.0" stable branch [alpha].
> 
> Ah, I’m still learning who is who... I thought you were part of that project
> :-)
> 
> >
> > Compared with the upstream OCF code, the main difference is that the Fuel
> > RabbitMQ OCF is a master-slave resource. The Fuel RabbitMQ OCF does more
> > bookkeeping, for example, blocking client access when the RabbitMQ cluster
> > is not ready. After reading the code, I believe the upstream OCF should be
> > OK to use as well, but it might not fit into the Fuel project. As
> > far as I have tested, the Fuel OCF script is good, except that the full
> > reassembly time is sometimes long and, as I found out, that is mostly because
> > the Fuel MySQL Galera OCF script keeps pacemaker from promoting the RabbitMQ
> > resource, as I mentioned in the previous emails.
> >
> > Maybe Vladimir and Sergey can give us more insight on why Fuel needs a
> > master-slave RabbitMQ.
> 
> That would be good to know.
> Browsing the agent, promote seems to be a no-op if rabbit is already running.
> 
> 
> The reason for master/slave is how the OCF script is structured to deal
> with rabbit's poor ability to handle itself in some scenarios. Hopefully
> the state transition diagram [5] is enough to clarify what's going on.
> 
> [5] http://goo.gl/PPNrw7


Not really.
It seems to be under the impression you can skip started and go directly from 
stopped to master.
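
To make the ordering concrete, here is a rough sketch in Python of the
transitions a master/slave resource is expected to follow: a stopped
resource has to be started (as a slave) before it can be promoted, and a
master has to be demoted before it is stopped. The state names, action
names, and helper functions are simplified labels assumed for this sketch;
they are not taken from Pacemaker's source or from the Fuel OCF agent.

```python
# Simplified model of role transitions for a master/slave (multi-state)
# resource. All names here are illustrative assumptions for this sketch,
# not identifiers from Pacemaker or the Fuel RabbitMQ OCF script.
TRANSITIONS = {
    ("stopped", "start"): "started",    # "started" means running as a slave
    ("started", "promote"): "master",
    ("master", "demote"): "started",
    ("started", "stop"): "stopped",
}


def apply_action(state, action):
    """Return the next state, or raise ValueError if the step is not allowed."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action} a resource that is {state}") from None


def run(actions, state="stopped"):
    """Apply a sequence of actions in order, starting from `state`."""
    for action in actions:
        state = apply_action(state, action)
    return state
```

With this model, run(["start", "promote"]) ends in "master", while
run(["promote"]) raises ValueError because there is no direct step from
stopped to master.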
