On 2007-11-06T09:53:13, Alan Robertson <[EMAIL PROTECTED]> wrote:

> Cutting out that debug should be OK - or raising it to happen if debug
> is > 1 would probably also be OK.  If you're seeing this happen a lot,
> that's not a good thing.  Getting behind 200 messages seems like a lot
> to me - offhand.

It's not. It happens very quickly. Alan, when did you last run CTS on a
9-node cluster? ;-)

Network transmission takes time, and more time than handing a message
from one process to the next via IPC. When the TE initiates the full
load of actions nearly instantaneously, the network layer lags behind.

And keep in mind that this is all from the DC, so the DC's network
connectivity is a choke point.

Here we have ~250 actions, and the messages started being dumped once we
were ~200 messages or so behind. Once past that threshold, the logging
mania started, _and_ the logging itself made the MCP even slower, so it
was more likely to stay behind.
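
To make that feedback loop concrete, here is a toy model. It is not
heartbeat code: the function name, the tick-based timing, the drain rate
and the logging penalty are all made-up assumptions. Only the ~250
actions and the ~200 message threshold come from the numbers above. It
queues the whole burst at once, drains a fixed number of messages per
tick, and drains fewer while the backlog is over the threshold because
time goes into logging instead.

```python
def simulate(actions=250, threshold=200, drain_per_tick=10, log_penalty=0):
    """Return (ticks_to_drain, ticks_over_threshold) for one burst.

    All parameters are illustrative assumptions, not measured values.
    """
    backlog = actions      # the whole burst is queued nearly instantly
    ticks = 0
    ticks_over = 0
    while backlog > 0:
        rate = drain_per_tick
        if backlog > threshold:
            # Over the threshold, each tick also pays for debug logging,
            # which leaves less capacity for actually sending messages.
            ticks_over += 1
            rate = max(1, drain_per_tick - log_penalty)
        backlog -= min(rate, backlog)
        ticks += 1
    return ticks, ticks_over


if __name__ == "__main__":
    print("no logging cost:  ", simulate(log_penalty=0))
    print("with logging cost:", simulate(log_penalty=8))
```

With these invented numbers the sender spends 5 ticks over the threshold
when logging is free and 25 ticks when logging eats most of each tick,
which is the "more likely to stay behind" effect in miniature.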

> Are you also seeing retransmissions?

A few (it also overflows the network buffers), but not very many.

> Just because you have a lot of processors doesn't mean that Xen is 
> scheduling you properly.  You have two different schedulers going on here - 
> so the opportunities for problems go up rather rapidly.

That is not quite true; the hypervisor has essentially scheduled the
guests to one CPU each, and doesn't need to interfere with the local
scheduling. Except for networking and other shared resources, of
course, the guests run concurrently and independently, and each has a
full CPU to itself.


Regards,
    Lars

-- 
Teamlead Kernel, SuSE Labs, Research and Development
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde

_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
