On Thursday 20 March 2008 06:58:45 Carlo Marcelo Arenas Belon wrote:
> On Mon, Mar 17, 2008 at 10:31:10PM +0100, Paul Millar wrote:
> haven't used Xen in this setup, but had a similar setup using kvm [..]

Yes, this is similar to my setup. The only major difference is that I
don't attach eth0 to the bridge; I use iptables to NAT outgoing
connections.

> > I've noticed two problems with this setup. First, multicast binding to
> > particular device when sending UDP seems to be broken.
>
> binding to br0 works fine for me (can't send and listen to the metrics
> through multicast packets) and of course, sniffing the interface (br0 or
> eth0) show packets going out and in (in both at the same time, as they are
> bridged).

I suspect that this is working "by accident" :-)

If the multicast binding isn't actually taking place, your kernel will
send packets out of the first non-loopback device (this is my experience;
however, it's a kernel-specific algorithm), i.e. eth0. As you say, since
it is bridged, you will see traffic on both br0 and eth0. You could test
this by removing eth0 from the bridge and seeing whether you continue to
see the multicast traffic on br0.

When I tell gmond to bind to the bridge interface ("br-xen"), I do see
traffic, but only on eth0. The debug output confirmed that gmond is
picking up the option, but it somehow isn't acting on it.

> I suspect the br-xen device doesn't support multicast IOCTLs making your
> gmond deaf (have seen something like that with an intel wireless adapter
> once).

Well, that could be, but it worked with a 3.0.x-series gmond and I've not
changed the Xen installation. My suspicion here is that, with the move to
the 3.1 code-base, we've introduced a regression against bug #140:

  http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=140

> > The second problem is with a Xen guest host transmitting over the
> > internal bridge. [...] no actual metrics are recorded.
>
> a 3.1 gmond sending its metrics to a 3.0 gmond showed something similar in
> my tests, and that is expected (compatibility is only granted at the XML
> layer between gmetad and gmond)

This may be true, but the "other" gmond is also from trunk. If I telnet
to its TCP port, I see it identify itself as:

  <GANGLIA_XML VERSION="3.1.0." SOURCE="gmond">

> did you see this problem with both 3.1 gmond?,

Yes, I'm only using code from the current trunk. I wanted to avoid
complications from running daemons from the 3.0 and 3.1 codebases
concurrently.

> is the same observed if the metrics are collected using unicast instead?

Hmm, good question. I've just checked and it works. Switching back to
multicast, it fails again. In both cases I can see network traffic (on
the bridge) consistent with metric updates being transmitted.

The failure mode is specific: only the metrics are lost, whilst the host
entry is maintained (with the TN value resetting, as expected). I suspect
that the code that deals with updating gmond's cache of metrics is
somehow confused, resulting in those metrics not being recorded; yet the
fact that something was received from the host *is* recorded, hence the
host entry is maintained.

HTH,

Paul.
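
P.S. For anyone wanting to check the interface-binding behaviour
independently of gmond, here is a minimal Python sketch. It is not gmond's
code, just my assumption of the simplest equivalent test: it sets
IP_MULTICAST_IF on a UDP socket and reads the option back, so you can see
whether the kernel accepted the binding. The function names are made up
for illustration, and the bridge address in the usage comment is a
placeholder, not a value from my setup.

```python
import socket


def make_multicast_sender(interface_ip: str, ttl: int = 1) -> socket.socket:
    """Create a UDP socket whose outgoing multicast uses interface_ip.

    interface_ip is the IPv4 address assigned to the interface that
    should carry the traffic (for example, the bridge's own address).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Keep probe packets on the local segment.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    # Without this call the kernel chooses the outgoing interface itself.
    sock.setsockopt(
        socket.IPPROTO_IP,
        socket.IP_MULTICAST_IF,
        socket.inet_aton(interface_ip),
    )
    return sock


def bound_interface(sock: socket.socket) -> str:
    """Return the IPv4 address currently set as the multicast interface."""
    raw = sock.getsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF, 4)
    return socket.inet_ntoa(raw)


# Usage (hypothetical bridge address; substitute your own):
#   sock = make_multicast_sender("192.0.2.1")
#   print(bound_interface(sock))
#   sock.sendto(b"probe", ("239.2.11.71", 8649))  # assumed stock gmond channel
# then sniff br-xen and eth0 to see which device the probe leaves on.
```

If the address read back matches the bridge but the probe still only
shows up on eth0, the problem is below the socket layer; if it does not
match, the option never took effect.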
_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers