> > Hi,
> >
> > We are now trying to show a good performance report to the potential
> > customer.
> > Our customer's requests are here;
> > * There are more than 100 resources on one node.
> > * 100 resources are included in one group, so they would start/stop
> >   sequentially.
> > * Fail over for all of 100 resources should complete within 1 minute.
>
> that's less than a second per resource (since members of a group are
> started sequentially)... is your resource capable of starting so
> quickly?
>
> in truth, i think that for a group that size, 1 minute is an unrealistic
> deadline (assuming it's not just full of Dummy resources)
>
> > * Heartbeat stable 2.1 (maybe release as 2.1.4, soon)
> >   It took about 4 minutes for fail over.
> >
> > * Heartbeat-dev(5072025b79b8) + Pacemaker-0.7(ee6832884524)
> >   It took about 3 minutes for fail over.
> >   It's getting better!
> >   Is this some effect of the new xml parser?
>
> possible - but more likely the performance optimization i've been
> doing over the last couple of weeks and months.
>
> did you cause the DC to fail or another node?
> because the load spikes generated by electing a new DC have been
> reduced by 70-80% (no, that's not a typo)
>
> and before you ask, no, these changes will never be part of 2.1.x
>
> > hb_report are so huge, I created the bugzilla as enhancement.
> > http://developerbugs.linux-foundation.org/show_bug.cgi?id=1935
> >
> > Do you have any good idea to speed up fail over time?
>
> split the group up :)

That's what I thought...
I got more detail: the target system would have 9 nodes, 8 active and
1 standby. Each node has one group which contains 15 resources.
I tried 1 active + 1 standby with 8 groups (each group has 15 resources)
as a test. It took about 70 - 80 seconds for fail over.
Is that a reasonable time?
Is there any tunable value?
For example: MAXMSG, MAXUNCOMPRESSED, max_child_count...

> > It would be best if the performance improvement is available with
> > Heartbeat 2.1.4.
> > I know this kind of performance improvement is not so easy, but this
> > is a matter of the greatest urgency. If it comes in after the nearest
> > release, we are planning to backport it to 2.1.4 for our customer
> > individually.
>
> i think you'll find that is an extremely non-trivial task

Yeah, I know we should recommend Heartbeat 3.0(?) + Pacemaker 1.0,
but it seems that we have no time to wait for them this time.
(it's just our schedule)

Thanks,
Junko
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems