01.04.2013 20:09, David Vossel wrote:
> ----- Original Message -----
>> From: "Vladislav Bogdanov" <bub...@hoster-ok.com>
>> To: pacemaker@oss.clusterlabs.org
>> Sent: Monday, April 1, 2013 10:35:39 AM
>> Subject: Re: [Pacemaker] Speeding up startup after migration
>>
>> 01.04.2013 17:28, David Vossel wrote:
>>>
>>> ----- Original Message -----
>>>> From: "Vladislav Bogdanov" <bub...@hoster-ok.com>
>>>> To: pacemaker@oss.clusterlabs.org
>>>> Sent: Friday, March 29, 2013 2:03:27 AM
>>>> Subject: Re: [Pacemaker] Speeding up startup after migration
>>>>
>>>> 29.03.2013 03:31, Andrew Beekhof wrote:
>>>>> On Fri, Mar 29, 2013 at 4:12 AM, Benjamin Kiessling
>>>>> <mittages...@l.unchti.me> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> we've got a small pacemaker cluster running which controls an
>>>>>> active/passive router. On this cluster we've got a semi-large (~30)
>>>>>> number of primitives which are grouped together. On migration it takes
>>>>>> quite a long time until each resource is brought up again because they
>>>>>> are started sequentially. Is there a way to speed up the process,
>>>>>> ideally to execute these resource agents in parallel? They are fully
>>>>>> independent so the order in which they finish is of no concern.
>>>>>
>>>>> I'm guessing you have them in a group? "Don't do that" and they will
>>>>> fail over in parallel.
>>>>
>>>> Does current lrmd implementation have batch-limit like cluster-glue's
>>>> one had? Can't find where it is.
>>>
>>> The batch-limit option is still around, but has nothing to do with
>>> the lrmd. It does limit how many resources can execute in parallel, but at
>>> the transition engine level rather than the lrmd.
>>
>> Yep, I know that option, it was there for a very long time.
>>
>> So, if I understand correctly, new lrmd runs as many simultaneous jobs
>> as possible. Unfortunately, in some circumstances this would result in
>> high node load and timeouts. Is there a way to somehow limit that load?
>
> Isn't that what the batch-limit option does? Or are you saying you
> want a batch limit type option that is node specific? Why are you
> concerned about this behavior living in the LRMD instead of at the
> transition processing level?

There was a limit in glue's lrmd, and I think it was there for a reason.
I do not know which behavior is better; they are just different.

>
> I believe if we do any batch limiting type behavior at the LRMD
> level we're going to run into problems with the transition timers in the crmd.

Did that change in the crmd after the lrmd replacement?

> The LRMD needs to always perform the actions it is given as soon as possible.

Yes, but... heavy load on a host (e.g. because 150 CPU-intensive
operations run in parallel) may cause monitor timeouts, then resource
restarts, then stop timeouts and fencing.
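
To make the concern concrete, here is a minimal sketch of what a per-node cap on parallel operations looks like. It is not Pacemaker or lrmd code: run_operations, max_parallel and op_seconds are hypothetical names, and the sleep is only a stand-in for a resource agent action. It merely shows that a semaphore keeps the number of simultaneously running jobs at or below the cap, however many are queued.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_operations(n_ops, max_parallel, op_seconds=0.01):
    """Run n_ops dummy operations, at most max_parallel at a time.

    Returns the peak number of operations observed running together.
    All names here are illustrative, not part of any Pacemaker API.
    """
    gate = threading.Semaphore(max_parallel)  # the per-node cap
    lock = threading.Lock()                   # protects the counters
    running = 0
    peak = 0

    def operation():
        nonlocal running, peak
        with gate:  # wait here while max_parallel jobs are already active
            with lock:
                running += 1
                peak = max(peak, running)
            time.sleep(op_seconds)  # stand-in for a resource agent action
            with lock:
                running -= 1

    # One worker per job, so only the semaphore limits concurrency.
    with ThreadPoolExecutor(max_workers=n_ops) as pool:
        for _ in range(n_ops):
            pool.submit(operation)
    return peak


if __name__ == "__main__":
    # 150 queued operations, never more than 10 running at once.
    print(run_operations(150, 10))
```

With the cap in place the jobs take longer overall but the host is never asked to run all 150 at once, which is the trade-off between the two behaviors discussed above.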