On Jul 10, 2008, at 1:16 PM, Junko IKEDA wrote:

> Hi,

> We are now trying to put together a good performance report for a potential
> customer.
> The customer's requirements are:
> * There are more than 100 resources on one node.
> * The 100 resources are all in one group, so they start/stop sequentially.
> * Failover of all 100 resources should complete within 1 minute.

That's less than a second per resource (since members of a group are started sequentially)... are your resources capable of starting that quickly?

In truth, I think that for a group of that size, 1 minute is an unrealistic deadline (assuming it's not just full of Dummy resources).

> * Heartbeat stable 2.1 (perhaps to be released as 2.1.4 soon)
> Failover took about 4 minutes.

> * Heartbeat-dev (5072025b79b8) + Pacemaker-0.7 (ee6832884524)
> Failover took about 3 minutes.
> It's getting better!
> Is this an effect of the new XML parser?

Possibly - but more likely the performance optimizations I've been doing over the last few weeks and months.

Did you make the DC fail, or another node?
I ask because the load spikes generated by electing a new DC have been reduced by 70-80% (no, that's not a typo).

And before you ask: no, these changes will never be part of 2.1.x.

> The hb_reports are so huge that I created the Bugzilla entry as an enhancement:
> http://developerbugs.linux-foundation.org/show_bug.cgi?id=1935

> Do you have any good ideas for speeding up the failover time?

Split the group up :)
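For example, instead of one group of 100, you could use several smaller groups of 25 or so, colocated on the same node and only ordered against each other where the startup order really matters; groups that aren't ordered can have their members started in parallel. A rough sketch (written in Pacemaker 1.0-style CIB syntax rather than the 2.1.x DTD, and with made-up resource and group names):

  <resources>
    <group id="grp-a">
      <primitive id="rsc-001" class="ocf" provider="heartbeat" type="Dummy"/>
      <!-- ... resources 002-025 ... -->
    </group>
    <group id="grp-b">
      <primitive id="rsc-026" class="ocf" provider="heartbeat" type="Dummy"/>
      <!-- ... resources 027-050 ... -->
    </group>
    <!-- grp-c and grp-d hold the remaining 50 resources -->
  </resources>
  <constraints>
    <!-- keep all the pieces on the same node -->
    <rsc_colocation id="col-b-with-a" rsc="grp-b" with-rsc="grp-a" score="INFINITY"/>
    <rsc_colocation id="col-c-with-a" rsc="grp-c" with-rsc="grp-a" score="INFINITY"/>
    <rsc_colocation id="col-d-with-a" rsc="grp-d" with-rsc="grp-a" score="INFINITY"/>
    <!-- add rsc_order constraints only where one piece genuinely must wait for another,
         e.g. <rsc_order id="order-a-then-b" first="grp-a" then="grp-b"/> -->
  </constraints>

The point is that the sequential-start chain inside any one group becomes much shorter, so the unordered groups can be brought up in parallel (subject to whatever limit the cluster places on concurrent operations).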


> It would be best if the performance improvement were available in Heartbeat
> 2.1.4.
> I know this kind of performance improvement is not so easy, but this is a
> matter of the greatest urgency. If it only lands after the next release, we
> are planning to backport it to 2.1.4 for our customer ourselves.

I think you'll find that to be an extremely non-trivial task.