On 2007-11-06T09:53:13, Alan Robertson <[EMAIL PROTECTED]> wrote:

> Cutting out that debug should be OK - or raising it to happen if debug is > 1
> would probably also be OK. If you're seeing this happen a lot, that's not a
> good thing. Getting behind 200 messages seems like a lot to me - off hand.

It's not. It happens very quickly. Alan, when did you last run CTS on a
9-node cluster? ;-)

Network transmissions take time, and more time than passing a message from
one process to the next via IPC. When the TE initiates the full load of
actions nearly instantaneously, the network layer lags behind. And keep in
mind that all of this originates from the DC, so the DC's network
connectivity is a choke point.

Here, we have ~250 actions, and the messages started being dumped once the
backlog reached ~200 messages or so. Once past that threshold, the logging
mania started, _and_ the logging mania made the MCP even slower, so it was
more likely to stay behind.

> Are you also seeing retransmissions?

A few (it also overflows the network buffers), but not very many.

> Just because you have a lot of processors doesn't mean that Xen is
> scheduling you properly. You have two different schedulers going on here -
> so the opportunities for problems go up rather rapidly.

That is not quite true; the hypervisor has essentially scheduled the guests
to one CPU each, and doesn't need to interfere with the local scheduling.
Except for the networking and other shared resources, of course, the guests
run concurrently and independently, and each has a full CPU to itself.

Regards,
    Lars

-- 
Teamlead Kernel, SuSE Labs, Research and Development
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde

_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
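
For illustration only, here is a toy model of the effect described above: a
burst of ~250 actions queued at once, a sender that drains them more slowly,
and a warning logged for every message sent while the backlog is above ~200.
This is not heartbeat code. The function name, the per-tick budget, and the
cost of a log line are all made-up assumptions; only the 250 and 200 figures
come from the numbers mentioned in this thread.

```python
def simulate_burst(actions=250, budget_per_tick=10, warn_threshold=200,
                   log_cost=1, log_when_behind=True):
    """Toy model: drain a burst of queued messages, optionally paying
    extra work for a log line while the backlog exceeds the threshold.

    All parameters except `actions` and `warn_threshold` are hypothetical
    values chosen for illustration, not measurements.
    Returns (ticks needed to drain the queue, log lines written).
    """
    if budget_per_tick <= 0:
        raise ValueError("budget_per_tick must be positive")

    backlog = actions      # the whole burst is queued at once
    budget = 0             # unused or overdrawn work carries over between ticks
    ticks = 0
    log_lines = 0

    while backlog > 0:
        ticks += 1
        budget += budget_per_tick
        while backlog > 0 and budget > 0:
            cost = 1       # one unit of work to send one message
            if log_when_behind and backlog > warn_threshold:
                cost += log_cost   # logging competes with sending
                log_lines += 1
            budget -= cost
            backlog -= 1

    return ticks, log_lines


if __name__ == "__main__":
    with_logging = simulate_burst(log_when_behind=True)
    without_logging = simulate_burst(log_when_behind=False)
    print("with logging:    ticks=%d, log lines=%d" % with_logging)
    print("without logging: ticks=%d, log lines=%d" % without_logging)
```

With these invented numbers the run that logs needs 30 ticks to drain the
queue instead of 25, and it spends twice as long above the threshold. The
exact figures mean nothing; the point is only that paying for a log line per
delayed message keeps the sender behind for longer.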