Federico Sacerdoti wrote:
So, as Steven and others have mentioned, we have a problem with ganglia
metrics. Metrics currently lie in a flat namespace, with no hierarchical
groupings. I have talked with Matt and Mason (a ganglia developer and my
boss) about this problem, and would like to lay out some of our ideas.
Hey, you guys WERE listening all those times I went on and on about this
subject. :)
Another advantage of hierarchies comes from object-oriented design.
Attributes in the Branch tag, such as DMAX (which controls when metrics
get deleted), become the defaults for all metrics below that branch.
These can be overridden by individual metrics, analogous to overriding
base-class methods in an OO class tree. This gives us an easy way to
assign attribute values to a whole group of metrics.
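To make the inheritance idea concrete, here's a minimal C sketch -- the struct and every name in it are hypothetical, not actual gmond code -- of a metric tree where a metric with no DMAX of its own falls through to the nearest ancestor branch's value:

```c
/* A minimal sketch, assuming a hypothetical metric tree -- these are
 * not actual gmond structures.  A metric with no DMAX of its own
 * inherits the nearest ancestor branch's value, like a base-class
 * method lookup. */
#include <assert.h>
#include <stddef.h>

#define DMAX_UNSET (-1)  /* sentinel: inherit from the parent branch */

struct metric_node {
    const char *name;
    int dmax;                    /* seconds before deletion, or DMAX_UNSET */
    struct metric_node *parent;  /* NULL at the root */
};

/* Walk up the tree until some ancestor supplies a DMAX. */
int effective_dmax(const struct metric_node *n)
{
    while (n != NULL && n->dmax == DMAX_UNSET)
        n = n->parent;
    return (n != NULL) ? n->dmax : 0;  /* 0 = never delete, by convention */
}
```

A branch would set its DMAX once, and every metric underneath inherits it unless it overrides locally.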
It seems to me this would also make the "DSO-ification" of the monitoring
core a smoother process, not to mention a cleaner one from the standpoint
of those developing the DSOs. :)
A third advantage is cleaner namespaces. You can call 'cpu_num' simply
'num'. Similar naming simplifications are possible for the other
metrics. The most significant advantage is that we only have to worry
about name collisions among siblings in the tree. There can be a 'num'
metric in another branch (for example, the 'num' of network interfaces).
So how do we name metrics in the XDR packet if we adopt a metric
hierarchy? This is a difficult problem, since we want to allow new
metrics to appear at any time. Imagine an XDR packet comes in. We need
to identify the metric, and update its value in our hash tables.
I was thinking of "yet another hash" that has a hashed-up number based on
the name or hierarchy position of the metric as a key. The idea being,
this number is shorter than using the fully-qualified name of the metric
all the time.
So instead of encoding "cpu.idle" we encode 0x03FA450A and that field's 50%
shorter (even better if we get to "processes.top.1.cpu_percentage"), and
only have to multicast the real string name once. The hierarchical
information is stored (as a pointer, at the very least) in this hash.
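As an illustration of the short-key idea (my own sketch -- FNV-1a is just one choice of stable string hash, and real code would still need a collision-resolution step at create-branch time):

```c
/* A sketch of the "short numeric key" idea: hash the fully-qualified
 * metric name down to 32 bits.  FNV-1a is my own choice here for
 * illustration; any stable string hash would do, and a real
 * implementation would still need to resolve collisions when a
 * branch/metric is created. */
#include <assert.h>
#include <stdint.h>

uint32_t metric_key(const char *fqname)
{
    uint32_t h = 2166136261u;        /* FNV-1a offset basis */
    while (*fqname) {
        h ^= (uint8_t)*fqname++;
        h *= 16777619u;              /* FNV-1a prime */
    }
    return h;
}
```

Four bytes on the wire no matter how long "processes.top.1.cpu_percentage" gets, and the full string only has to be multicast once, alongside its key.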
What's really going to be key here is not so much the idea of making the
statically-#define'd metric hash dynamic, but keeping it up to date...
If we go far enough in this it'll look like SNMP, only more collaborative. :)
I believe the answer is that new nodes get their branch hierarchy all at
once from the oldest gmond in the cluster (which I will call the eldest
node). Matt has been talking about this for some time, as it will solve
some other problems as well. If we get an XDR metric packet that
specifies an unknown branch, we discard it. However, we realize that we
must have missed something, so we query the eldest node for its metric
hierarchy. If we can't find the eldest node, we query the second eldest,
and so on. We also query the second eldest if we didn't learn anything
new from the eldest itself. (This handles the case where the eldest
node has incomplete information.)
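That fallback sequence could look something like this in C (everything here, including the callback signature, is a hypothetical sketch of the eldest-first query loop, not real gmond code):

```c
/* Hypothetical sketch of the eldest-first fallback: nodes_by_age is
 * sorted eldest first, and query() returns nonzero if that node
 * taught us something new about the hierarchy.  None of these names
 * come from the real gmond source. */
#include <assert.h>
#include <stddef.h>

typedef int (*query_fn)(int node_id);

/* Returns the id of the first node that answered usefully, or -1. */
int resolve_unknown_branch(const int *nodes_by_age, size_t n, query_fn query)
{
    for (size_t i = 0; i < n; i++)
        if (query(nodes_by_age[i]))
            return nodes_by_age[i];
    return -1;
}

/* Example stub callbacks for trying it out: */
static int node7_answers(int id) { return id == 7; }
static int nobody_answers(int id) { (void)id; return 0; }
```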
I would suggest a fallback method (at least an option) of consulting an
"authoritative host." Maybe even a host running gmetad could be used as a
fallback (after all, it's going to have to keep track of all this stuff
too), although I don't necessarily think I'd recommend that.
At the very least this will help us during development, and it's possible
that some users might have a particular gmond running on "more reliable"
hardware (this isn't a dig at any one platform, I was thinking along the
lines of redundant PSUs and such) to be responsible for keeping track of
cluster metric metadata.
The assumption is that the eldest node has been listening to all the
"create-branch" messages, and has a complete metric tree.
This is gonna sound like DNS. If anyone doesn't know DNS, speak up now
before I get too snug in wearing my hostmaster hat again...
The primary node (eldest) may actively send sync'ing messages to the
secondary node (second-eldest) in case of the primary's untimely death.
Since I assume all traffic mentioned here will be on the multicast channel,
a separate conduit between primary and secondary is probably redundant -
eldest and second-eldest will behave identically except that the
second-eldest won't answer queries unless the eldest misses a heartbeat or
leaves a query unanswered for more than "query_timeout" seconds.
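So the second-eldest's decision boils down to two checks. A hedged sketch (the parameter names and the exact policy are my reading of the proposal, not settled protocol):

```c
/* A sketch of the second-eldest's "should I answer?" test: answer
 * only if the eldest has missed a heartbeat, or has let a pending
 * query age past query_timeout.  All names and the exact policy are
 * my reading of the proposal, not settled protocol. */
#include <assert.h>
#include <stdbool.h>

bool secondary_should_answer(long now, long eldest_last_heartbeat,
                             long heartbeat_interval,
                             long query_age, long query_timeout)
{
    bool eldest_silent = (now - eldest_last_heartbeat) > heartbeat_interval;
    bool query_stale   = query_age > query_timeout;
    return eldest_silent || query_stale;
}
```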
Individual nodes are always "authoritative" for branches of the metric tree
which they themselves have implemented. The query packet format needs to
have an optional destination field which contains the multicast hostname/IP
of a member node. If a node receives a query addressed to it from the
elder server, it responds by re-sending its "create-branch/create-metric"
messages to the cluster. This should be the only time these messages are
*rebroadcast* by a node.
On joining the network, a new node will announce itself and wait for the
heartbeats to start flowing in before it sends any multicasts besides
heartbeat, hostname, gmond_started and gmond_version. The elder gmond
should, upon receiving a new gmond heartbeat, transmit the metric tree.
The new gmond, as it receives the tree, compares it to its internal metrics
and sends "create-branch/create-metric" messages for each metric it
supports that is not in the tree received from the elder.
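That join-time diff might look like this (a sketch with flat string arrays standing in for the real metric tree; all function and metric names here are hypothetical):

```c
/* A sketch of the join-time diff: announce only the metrics we
 * implement that the elder's tree doesn't already contain.  Flat
 * string arrays stand in for the real metric tree; all names here
 * are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int in_tree(const char *name, const char **tree, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(tree[i], name) == 0)
            return 1;
    return 0;
}

/* Fills 'out' with the metrics we must announce; returns how many. */
size_t metrics_to_announce(const char **local, size_t nlocal,
                           const char **tree, size_t ntree,
                           const char **out)
{
    size_t k = 0;
    for (size_t i = 0; i < nlocal; i++)
        if (!in_tree(local[i], tree, ntree))
            out[k++] = local[i];
    return k;
}
```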
Cripes, this is turning into an RFC. Should I just write this up as such?
This email message is getting too long, but I would go on about how we
could use the idea of database indexes to quickly locate any branch in
the tree.
Heh, in that case that renders the first part of my message redundant. ;)
I hope I have been relatively clear about these ideas. I realize this
problem is pretty dense, and this solution is in its infancy. But the
point I would like to drive home is that a naming hierarchy is helpful
for specific reasons, and that its efficient implementation is possible
in the ganglia framework.
Dense, yes, but the area of metrics is just about the only one in the
Ganglia design that *doesn't* scale well (kudos, Matt & co.). I'm sure
that we can work this out if we just keep banging those rocks together. :)