On 4/22/2017 11:51 PM, Andrei Borzenkov wrote:
As a real life example (not Linux/pacemaker) - a panicking node flushed disk buffers, so it was not safe to access the shared filesystem until this was complete. This could take quite a lot of time, so without an agent on the *surviving* node(s) that monitors and acknowledges this process, this resulted in data corruption.
If your syncs take that long, pay an extra nickel and buy a disk shelf with dual-ported SAS drives and a pair of SSDs for the log device. Otherwise what you're looking at is effectively downtime during the failover, and having "quite a lot" of it kinda defeats the purpose, I should think.
Dima