> > Hi,
> >
> > We are now trying to show a good performance report to a potential
> > customer.
> > Our customer's requirements are as follows:
> > * There are more than 100 resources on one node.
> > * 100 resources are included in one group, so they would start/stop
> > sequentially.
> > * Fail over for all of 100 resources should complete within 1 minute.
> 
> that's less than a second per resource (since members of a group are
> started sequentially)... is your resource capable of starting so
> quickly?
> 
> in truth, i think that for a group that size, 1 minute is an unrealistic
> deadline (assuming it's not just full of Dummy resources)
> 
> > * Heartbeat stable 2.1 (maybe release as 2.1.4, soon)
> > It took about 4 minutes for fail over.
> >
> > * Heartbeat-dev(5072025b79b8) + Pacemaker-0.7(ee6832884524)
> > It took about 3 minutes for fail over.
> > It's getting better!
> > Is this some effect of the new xml parser?
> 
> possible - but more likely the performance optimization i've been
> doing over the last couple of weeks and months.
> 
> did you cause the DC to fail, or another node?
> because the load spikes generated by electing a new DC have been
> reduced by 70-80% (no, that's not a typo)
> 
> and before you ask, no, these changes will never be part of 2.1.x
> 
> > The hb_report output is so huge that I created a bugzilla entry (as an enhancement).
> > http://developerbugs.linux-foundation.org/show_bug.cgi?id=1935
> >
> > Do you have any good idea to speed up fail over time?
> 
> split the group up :)

That's what I thought...
I got more detail: the target system will have 9 nodes, 8 active and
1 standby.
Each node has one group which contains 15 resources.
As a test, I tried 1 active + 1 standby with 8 groups (15 resources per group).
Failover took about 70 - 80 seconds.
Is that a reasonable time?
Are there any tunable values?
For example: MAXMSG, MAXUNCOMPRESSED, max_child_count...
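
To make the numbers above concrete, here is a rough back-of-the-envelope
sketch. It is only an assumption-laden model, not a measurement: the
function names are made up for illustration, every resource is assumed to
take the same time to start, members of a group start sequentially, and
cluster overhead (DC election, policy engine runs, messaging) is ignored.

```python
def per_resource_budget(deadline_s, resources_in_group):
    """Seconds available per resource if a sequential group must
    finish starting within deadline_s."""
    return deadline_s / resources_in_group


def estimated_start_time(groups, resources_per_group, start_time_s,
                         groups_in_parallel=True):
    """Idealized start time for several groups.

    Members of one group start one after another. Whether separate
    groups really run fully in parallel is an assumption here; in
    practice it may be limited by settings such as max_child_count.
    """
    per_group = resources_per_group * start_time_s
    return per_group if groups_in_parallel else groups * per_group


if __name__ == "__main__":
    # Original request: 100 resources in one group, 60 second deadline.
    print("budget, 1 group of 100:", per_resource_budget(60, 100), "s/resource")
    # Split layout from the test: 15 resources per group.
    print("budget, groups of 15:", per_resource_budget(60, 15), "s/resource")
    # Hypothetical 0.5 s start time per resource, 8 groups of 15.
    print("parallel groups:", estimated_start_time(8, 15, 0.5), "s")
    print("serialized groups:", estimated_start_time(8, 15, 0.5, False), "s")
```

In this simplified model, splitting the resources into groups of 15 raises
the per-resource budget from well under a second to a few seconds, but
only if the groups really do start concurrently.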

> 
> >
> > It would be best if the performance improvement were available with
> > Heartbeat 2.1.4.
> > I know this kind of performance improvement is not so easy, but this
> > is a matter of the greatest urgency. If it comes in after the nearest
> > release, we are planning to backport it to 2.1.4 for our customer
> > individually.
> 
> i think you'll find that is an extremely non-trivial task

Yeah, I know we should recommend Heartbeat 3.0(?) + Pacemaker 1.0,
but it seems that we have no time to wait for them this time.
(it's just our schedule)

Thanks,
Junko


