Hello, I apologize for bothering you all with this issue, but it seems to have stumped those on the ganglia-general mailing list, so I'm hoping the developers can help me out.
I'm using Ganglia v3.0.3 on openSUSE 10.3, which came pre-configured on a Microway cluster. It's slightly modified to integrate their "Microway Control" feature, which is basically a button on the Ganglia homepage that leads to their TriCom/NodeWatch thermal monitoring web page. I don't think the issue I'm having has anything to do with their customization, but I wanted to mention it up front in case the possibility exists.

The issue I'm facing is an incorrect boottime and uptime for my master and all slave nodes. The discussion I had on ganglia-general can be found here: http://www.mail-archive.com/ganglia-gene...@lists.sourceforge.net/msg04814.html

To summarize: from the Ganglia homepage for my cluster, if I click on any of the nodes (master or slaves), the boottime is reported as Wed, 31 Dec 1969 16:00:00 -0800, and the uptime is calculated from this boottime (it currently reads 14442 days, 13:50:04).

I have a total of three separate clusters running the same version of Ganglia (we'll call them cluster1, cluster2, and cluster3, with master1, master2, and master3 respectively). cluster2 and cluster3 exhibit the same issue; cluster1 does not. The only software difference between the three is that cluster1 is running SUSE (not openSUSE) 10.1. The only hardware difference between the three (besides CPU/RAM/HDD and the number of slave nodes) is that cluster1 uses an older type of TriCom/NodeWatch hardware, which I don't believe would affect this. cluster2 and cluster3 also have an InfiniBand network in addition to their Ethernet network, whereas cluster1 simply has multiple Ethernet networks.

Also, on cluster3, up until yesterday the TriCom/NodeWatch hardware was cabled incorrectly, which caused their NodeWatch web page to report no data. Even so, the Ganglia homepage for cluster3 was able to produce proper data for each node (apart from the issue we're discussing here); e.g. the node load graphs all report good data.
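As a sanity check on the numbers quoted above: the reported boottime is exactly what a boot time of 0 (the Unix epoch) looks like when rendered in a UTC-8 timezone, and the uptime is then just "now minus 0". The Python sketch below is only an illustration of that arithmetic, not how gmond or the Ganglia web frontend actually computes these values. The function names, the fixed UTC-8 offset, and the "now" timestamp (back-derived from the uptime quoted above) are my own assumptions.

```python
from datetime import datetime, timedelta, timezone


def parse_btime(stat_text):
    """Return the btime value (seconds since the epoch) found in
    /proc/stat-style text, or None if there is no btime line."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "btime":
            return int(fields[1])
    return None


def describe_boot(btime, now, utc_offset_hours=-8):
    """Format a boot time and the uptime derived from it.

    utc_offset_hours=-8 is an assumption chosen to match the -0800
    offset in the reported boottime.
    """
    tz = timezone(timedelta(hours=utc_offset_hours))
    boot = datetime.fromtimestamp(btime, tz)
    uptime = timedelta(seconds=now - btime)
    return boot.strftime("%a, %d %b %Y %H:%M:%S %z"), uptime


if __name__ == "__main__":
    # Hypothetical "now": 14442 days, 13:50:04 after the epoch,
    # back-derived from the uptime shown on the Ganglia page.
    now = 14442 * 86400 + 13 * 3600 + 50 * 60 + 4
    boot_str, uptime = describe_boot(0, now)
    print(boot_str)  # Wed, 31 Dec 1969 16:00:00 -0800
    print(uptime)    # 14442 days, 13:50:04
```

In other words, the symptom is consistent with the frontend receiving a boot time of 0 (or no value at all) rather than a wrong but non-zero one. On a node, `parse_btime(open("/proc/stat").read())` would show the value the kernel itself reports.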
/proc/stat reports the correct btime value. gmond is running as "nobody", which is able to read /proc/stat. Bernard, the gentleman who was helping me on ganglia-general, is on the right track in suspecting that gmond is not getting the btime value from /proc/stat, but we're not sure why.

Any assistance in this matter is greatly appreciated.

Thanks in advance!

- Ken

_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers