Mike Rosenlof wrote:
>
> ________________________________________
> From: Jan Friesse [[email protected]]
> Sent: Monday, March 03, 2014 2:30 AM
> To: Mike Rosenlof; [email protected]
> Subject: Re: [corosync] overhead? send/receive cpg messages
>
> Mike Rosenlof wrote:
>>
>> Hi,
>>
>> I'm using corosync 1.4.1 on RHEL 6.2. I have a cluster of two nodes and
>> we are passing node-to-node messages with the CPG message API
>> (cpg_model_initialize, cpg_join, cpg_dispatch, etc...)
>>
>> While the messaging is idle, if I run 'tcpdump' on the corosync source port
>> 5404, we get three messages every couple of seconds:
>>
>> 14:47:41.384240 IP g5se-48521a.hpoms-dps-lstn > g5se-ad665b.netsupport: UDP, length 107
>> 14:47:41.384693 IP g5se-ad665b.hpoms-dps-lstn > g5se-48521a.netsupport: UDP, length 107
>> 14:47:41.593953 IP g5se-48521a.hpoms-dps-lstn > 239.192.42.210.netsupport: UDP, length 119
>>
>> 14:47:43.288532 IP g5se-48521a.hpoms-dps-lstn > g5se-ad665b.netsupport: UDP, length 107
>> 14:47:43.289109 IP g5se-ad665b.hpoms-dps-lstn > g5se-48521a.netsupport: UDP, length 107
>> 14:47:43.498385 IP g5se-48521a.hpoms-dps-lstn > 239.192.42.210.netsupport: UDP, length 119
>>
>> etc...
>>
>> Now when the node application sends a message to another node
>> stat=cpg_mcast_joined( commHandle.cpgHandle, CPG_TYPE_FIFO, &iov, 1 );
>>
>> tcpdump captures a surprisingly large number of packets
>>
>> 14:47:45.525430 IP g5se-ad665b.hpoms-dps-lstn > g5se-48521a.netsupport: UDP, length 107
>> 14:47:45.525576 IP g5se-48521a.hpoms-dps-lstn > g5se-ad665b.netsupport: UDP, length 107
>> 14:47:45.526354 IP g5se-ad665b.hpoms-dps-lstn > g5se-48521a.netsupport: UDP, length 107
>> [snip, 59 length 107 messages deleted!]
>> 14:47:45.540345 IP g5se-48521a.hpoms-dps-lstn > g5se-ad665b.netsupport: UDP, length 107
>> 14:47:45.540603 IP g5se-ad665b.hpoms-dps-lstn > g5se-48521a.netsupport: UDP, length 107
>>
>>
>> Does anybody have an idea why there are so many messages for corosync to
>> transmit a message of approximately 12 bytes?
>>
>> thank you for any insight here...
>>
>
> [Jan Friesse ]
> Corosync rotates a token between nodes as a heartbeat (every few seconds),
> and when messages are sent, the token must rotate more quickly. The token
> itself is quite small.
>
> To see actual messages, filter mcast packets.
>
> Does this answer your question?
>
>>
>
> [me] Not exactly. I can see that there is a heartbeat going around the nodes
> while idle; what alarmed me was this part of the sequence:
>
>> 14:47:45.525430 IP g5se-ad665b.hpoms-dps-lstn > g5se-48521a.netsupport: UDP, length 107
>> 14:47:45.525576 IP g5se-48521a.hpoms-dps-lstn > g5se-ad665b.netsupport: UDP, length 107
>> 14:47:45.526354 IP g5se-ad665b.hpoms-dps-lstn > g5se-48521a.netsupport: UDP, length 107
>> [snip, 59 length 107 messages deleted!]
>> 14:47:45.540345 IP g5se-48521a.hpoms-dps-lstn > g5se-ad665b.netsupport: UDP, length 107
>> 14:47:45.540603 IP g5se-ad665b.hpoms-dps-lstn > g5se-48521a.netsupport: UDP, length 107
>
> This is at a point where one node sent out a message (15 bytes) with
> cpg_mcast_joined() and (note the timestamps) in the course of less than
> 20msec there were over a hundred messages to support that one data packet.
> That's the part that seems like a lot of overhead.
>
Corosync is able to "hold" the token for some time to lower network
utilization (this is what you see BEFORE sending the message). This mode is
activated AFTER totem.seqno_unchanged_const (see corosync.conf) token
rotations without any message sent by any member. So the >100 messages
you've seen are token rotations. After that (because nothing is sent),
Corosync falls back to "hold" mode.

This optimization is a trade-off between latency and network utilization. So
if you really dislike that behavior, you can set seqno_unchanged_const to a
smaller value (like 1).

> Is this a configuration issue? It's a cluster of two nodes...

No, this is pretty normal.

>
> --mike
>

Regards,
  Honza

_______________________________________________
discuss mailing list
[email protected]
http://lists.corosync.org/mailman/listinfo/discuss
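
For reference, the seqno_unchanged_const setting mentioned in the reply lives in the totem section of corosync.conf. The fragment below is only a sketch of how the suggested value of 1 might be written; it assumes the rest of the existing totem section is left unchanged, and the right value for a given cluster depends on how much latency versus idle traffic is acceptable. The corosync.conf(5) man page documents a default of 30 rotations.

```
totem {
        # ... existing totem settings (version, interface, etc.) stay as they are ...

        # Number of token rotations without any multicast traffic before
        # corosync starts holding the token. A smaller value means hold
        # mode is entered sooner after the last message, so fewer token
        # packets follow each send.
        seqno_unchanged_const: 1
}
```

Corosync has to be restarted on each node for a changed totem setting like this to take effect.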
