2008/6/20 Lars Marowsky-Bree <[EMAIL PROTECTED]>:
> On 2008-06-19T22:52:55, Xinwei Hu <[EMAIL PROTECTED]> wrote:
>
>> > True. It is possible to break sfex, but the probability that that
>> > is going to happen is extremely low and could be due only to a
>> > very pathological timing. One way to make this probability still
>>
>> From my previous experience, I always got _NO_ from customers when
>> there is a possibility of data corruption.
>> So I don't think "extremely low" is a valid excuse. ;)
>
> But it's always just "extremely low". Even STONITH could fail (the device
> could be misconfigured to reset the wrong outlet, or report success when
> it in fact failed), there could be issues in the very HA stack, the
> kernel could cause data corruption in the fs, the storage could fail,
> etc. And that's just random failure, ignoring malicious attackers or
> careless sysadmins.
>
> We're never 100% certain.
>
> sfex relies on timing, yes, but with such considerable safety margins

Do we have any systematic method to analyze the "safety margin" yet?
If not, I won't go along with the "considerable" claim.

> that it's "safe enough". NCS SBD basically trusts the other nodes too.
>
> I think it would be a valuable addition, in particular if it could get
> it into daemon mode. This could be the first step towards a real "quorum
> resource" which a future quorum plugin framework could utilize, too.

If you are talking about turning sfex into today's qdisk, then I
totally agree.

>> dskcm has its own problems too.
>> Heartbeat doesn't support the idea of "link priority" or "link
>> fallback", so the disk is always kept busy with the communication.
>> It constantly consumes several hundred KB of disk I/O bandwidth.
>>
>> And as we are switching to the openais stack, I don't think I'm going to
>> improve it any further.
>>
>> dskcm attracted a lot of interest when people were testing/comparing
>> different HA solutions.
>> I did several POCs on this myself, but I am not aware of anyone using it
>> in a _production_ environment yet.
>
> It would be interesting to see whether this could be added to openAIS.

Yeah, I'm preparing for that too. ;)

>
> Regards,
>     Lars
>
> --
> Teamlead Kernel, SuSE Labs, Research and Development
> SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
> "Experience is the name everyone gives to their mistakes." -- Oscar Wilde
>
> _______________________________________________________
> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/