Rick Mohr wrote:
On Mon, 23 Jan 2006, Ben Hartshorne wrote:
snip
When I go into the page for a single host and click on the 'gmetrics'
link, I find that all of my metrics have a record of being received
within the last two minutes (my time period). And yet, their graphs
show up empty.
Any
This, combined with your last message, makes it look like gmetad's not
getting any data from gmond. That could be because gmond is not
configured to accept connections from the gmetad host (yeah, even
localhost!) or that there's some other major config wackiness going on.
There doesn't
source code.
Any other pointers will be helpful.
Thanks,
Utsav Agarwal
On Tue, 16 Aug 2005 14:43:45 -0700, steven wagner [EMAIL PROTECTED] wrote:
Change the order of the hosts entry to:
IP address node1.domain.com node1 any-other
Hmm, I must have been on vacation or something.
Regardless, I don't have this code.
And for the record, I never said I was happy about having to run gmond
as root instead of nobody. :)
Adeyemi Adesanya wrote:
Hi Christopher.
I have not received a response from Steven Wagner so I will send
Steve Feehan wrote:
Although, I suppose it would be just as good to use lo0 as this is
a one node cluster. So what is the recommendation for a one node
cluster? Multicast is, well, is there a point? And what sort of
route do I need?
Yeah, feel free to use loopback for this. I am in the
I'm sorry to report that you should be getting metric data back on
Tru64. Sadly, I can't offer any developmental support here now because
all our Alpha are belong to dumpster (although for the record, I am the
one to blame for the monitoring core running on Tru64 to begin with...
sorry about
dio wrote:
Solaris 2.6
gmond v2.5.5
/usr/sbin/gmond daemon fails to start.
/usr/sbin/gmond -d1 yields cpustuff: Not enough space
not quite sure where the error is coming from.
tnx,
--dio
The getkval() function is erroring out for some reason. Is the
monitoring core running as root?
Brooks Davis wrote:
On Thu, Nov 06, 2003 at 10:02:07PM -0500, Krishna Kumar wrote:
Hi,
I installed ganglia on my solaris (with gmetad).. when I try to start the
daemon, it gives me this error..
$ /gmond start
/etc/rc.d/init.d/functions not found
I've copied the gmond.init to /usr/sbin.. Is
Adeyemi Adesanya wrote:
Hi There.
I spent some time digging through the archives but I am unable to find a way
of running gmond as a non-root user on Solaris. Is this out of the question
or is there some way to patch the code? All of our critical servers run
Solaris, that's where the real
I'm guessing that your new Ganglia cluster and your old Ganglia cluster
are sending metrics out on the same multicast address.
The fix is easy in one sense, but difficult in another (depending on the
nature and quality of your cluster management tools):
Change the multicast IP or port on one
Marcia Prescott wrote:
Thanks for the ideas.
I ended up looking through the log file. It turns out
that the clock on the machine running the metadaemon was 5
minutes fast. I suppose most people would have a
network in sync. Anyway, because my metadaemon was
ahead, it would create a rrd for a
I have no answers, only vaguely-informed statements and half-formed
questions (welcome to free software's version of tech support!).
It is interesting to note that 4 hours = 16 real data points (at
15-minute polling intervals). That's a suspiciously round number...
However, if this was just
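The arithmetic behind that 16 is easy to check. A small sketch, assuming nothing beyond the figures quoted above (a 4-hour window, 15-minute polling); the function name is made up for illustration.

```python
def points_in_window(window_hours, poll_minutes):
    """Number of whole polling intervals that fit in the window."""
    return (window_hours * 60) // poll_minutes


print(points_in_window(4, 15))  # 16
```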
Dave Bradshaw wrote:
Dear Steve,
Thanks for the advice. I have done what you suggested. I have also turned off
one of the gmetad daemons so it is now only running on the machine with the web
frontend.
Now when I fire up the web frontend I get the following message: -
Ganglia cannot find
Dave Bradshaw wrote:
Dear All,
Am I an idiot?
My rule of thumb is not to ask this sort of question on a list unless
I'm asking something already covered in the docs. There's always a
chance some wisecracker out there will answer it.
:)
Where am I going wrong?
Different clusters
I have no specific solutions for you but here are some potentially
helpful tidbits which may permit you to shoot your own trouble:
Does the monitoring core die right away?
Does it dump core?
Does it die when you run it in debug mode?
Does debug mode tell you anything more about the error?
Do
Hector M. Jacas wrote:
Hello to all!
I am looking for the way to build and to install a version of GMOND for
Tru64 v5.1A.
Last year, when we had a meaningful number of Alphas on the premises
running Tru64, I ported the monitoring core to that platform. It is an
experience I don't recall
Kevin James Flasch wrote:
* Check some of the gmond-only nodes' XML port output. How many nodes
do they see? Do they see 289-295 nodes or just their own output?
I believe you're referring to the mcast_port (by default 8649). When I telnet
to it, I see what appears to be all/most of them.
Бойко Николай wrote:
Hi all, I have added some metrics to ganglia, and tried to change the ganglia
webfrontend code so that it can now show ALL metrics, averaged over all nodes, on
one page. Then you have no need to look at each node and check its metrics. I have
had some success, but only with standard ganglia
David Aikema wrote:
Quoting steven wagner [EMAIL PROTECTED]:
On a sufficiently new (>= 2.5.0, I believe...) monitoring core, metrics
and hosts should expire according to their DMAX attributes. Restarting
the monitoring cores which will be polled by your metadaemon will clear
the hosts out
Ken MacInnis wrote:
On Wed, 7 May 2003, David Bickle wrote:
Still having problems. I've compiled gcc 3.2.2 from source with the
CPU=sparc64. I'm running Solaris 8. I have also compiled ganglia with
--enable-sparc64. gmond
still won't launch for some reason. Check this:
bash-2.03$ file
as root. Why is it complaining about /dev/ksyms not being
32-bit? Am I missing a configure option?
Thanks Again,
On Wed, 7 May 2003, steven wagner wrote:
Ken MacInnis wrote:
On Wed, 7 May 2003, David Bickle wrote:
Still having problems. I've compiled gcc 3.2.2 from source
Make sure you're using the latest gmetad and web front-end. Latest version
is 2.5.3, and it incorporates fixes to directly address both issues (a was
addressed in 2.5.2, b in 2.5.3).
I've been having trouble with gaps for months - check the ganglia-general
archives for various musings on
M. Michael Barmada wrote:
Hi,
I'm wondering if anyone has had success compiling gmetad on OSX? Even
after getting everything else working (installing rrd through fink
required some additional arguments to configure to get all the libraries
recognized), 'make' keeps failing in the gmetad
/sw/include could be a Fink include install directory. Fink defaults to
putting installed and built software in /sw, IIRC ... (I'm not running it
at the moment on my Powerbook, which needs a 10.2 upgrade...)
matt massie wrote:
Today, M. Michael Barmada wrote forth saying...
I'm wondering
Hi Arnie,
Sounds like you need to change some multicast IPs. All the nodes that you
want to appear in a single cluster should have the same multicast IP.
Despite your best efforts to explain it, I think you're probably the best
person to determine how you want your grid layout to look. :)
matt massie wrote:
prashant-
so when a node in the cluster dies the cluster size changes but the dead
node is not reported?
this is a new problem that i haven't heard of before. did gmond get
restarted after the node failed? ganglia knows that a node has died when it
stops getting heartbeats
Leif Nixon wrote:
Well, this is a new one - at least for me.
One of our clusters was rebooted last week, due to a physical
relocation. Now the ganglia XML data doesn't contain any mention of
the cluster frontend, even though gmond is running fine and responding
to the XML data port:
nixon
Leif Nixon wrote:
Steven Wagner [EMAIL PROTECTED] writes:
That's how I found out that my front-end was *three* hops away from
the test cluster and I'm thinking you have either a monitoring core
config issue or a host/network config issue to track down... (maybe a
host/network device between
Henry Leyh wrote:
I cannot find anything unreasonable here. The polling interval seems to
be correct. Note that we do not have private 192.168... addresses for the
cluster nodes.
Yup, all that looks reasonable. My grab bag o' fixes is officially empty. :)
One thing I guess you could try is
Santanu Das wrote:
Actually, I meant to ask how to change the label, so that instead of
Unspecified Grid it says something like HEP DataGrid or whatever else.
Did somebody say, undocumented feature ?
gmetad and the web front-end control the grid stuff - this is a new
feature addition as of 2.5.2, which was
Nicholas Henke wrote:
OK -- so check this link, it is all of our clusters:
http://www.liniac.upenn.edu/ganglia.
Notice how the overall graph is spotty, but none of the others are? How
do I fix that ?
Nic
Hard to conclusively say without putting gmetad into debug mode and sifting
through a
That metric isn't currently supported on Solaris. I have an idea of how to
do it but I simply haven't had the time to work on it. Basically it
involves walking the /proc tree looking for processes in the Run state and
multicasting that number.
If someone else wants to write the code for it,
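For anyone picking this up, the shape of the idea is roughly as follows. This is only a sketch in Python using the Linux-style text layout of /proc/<pid>/stat, not Solaris code (as far as I know Solaris exposes binary psinfo files instead, so the real work in solaris.c would have to read those). The names count_running and proc_root are hypothetical.

```python
import os


def count_running(proc_root="/proc"):
    """Count processes whose state field is 'R' (Linux-style stat files)."""
    running = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():          # only numeric directories are PIDs
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                data = f.read()
        except OSError:                  # process exited while we walked
            continue
        # The state letter follows the parenthesised command name.
        fields = data[data.rfind(")") + 1:].split()
        if fields and fields[0] == "R":
            running += 1
    return running
```

The resulting count is the number that would then be handed to the multicast side as the metric value.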
John Francis Lee wrote:
Thanks again!
Setting the debug level to 10 showed me that gmetad was unable to
connect to itself! I changed the datasource specification to 'localhost'
from the machine's FQDN and things worked!
What I get now is
'There are 10 nodes up and running. There are no nodes
Joe Griffin wrote:
Hi All,
Is there any similar information on gmetric?
I found a script I would like to use in number 16 of:
http://ganglia.sourceforge.net/gmetric/
However, I cannot get gmetric to print any output.
For example, I tried:
/usr/bin/gmetric --name Resource_Usage_Rank 2 --value
John Francis Lee wrote:
Greetings,
I've downloaded and installed the software to the machines in our
internet cafe, have gmond running on all and gmetad on one.
When I try to view the setup with the ganglia-webfrontend I get a lot of
messages on the order of:
Warning: ksort() expects
John Francis Lee wrote:
Thanks for the help!
I followed your suggestions and attached the output of each telnet command.
Both were able to connect, and the machine running gmond responded with
data.
Maybe there's something wrong with php?
Take another look at the metadaemon's output:
[DTD
I noticed in CVS some comments about a job view, allowing for a
user-specified graph start time and duration. However there doesn't appear
to be any kind of interface for it.
I'm not afraid of rolling my own (in fact, I think it might be fun to roll
that into another application
clusters.
The monitoring core probably shouldn't be running on the front-end. The
metadaemon should be enough.
I know in a previous reply, Steven Wagner has said that this should work, but
I am not able to get it to behave that way. Am I missing something very
obvious ?
You know, gentle
Lester Vecsey wrote:
Looking through the key_metrics.h file it seems that linux machines get a
different set of keys from aix, and so on. There's a basic core set of keys
that are on all platforms, but then when it gets to things like pkts_in it's
only available for linux.
In particular pkts_in
Lester Vecsey wrote:
I find it useful to select certain graphs and copy/paste the URL to some of
the images to call them from my own html page, and I noticed that the graphs
have a '(now )' value that is passed in with the v= argument to graph.php.
Certainly graph.php should be able to have
Lester Vecsey wrote:
I was going to investigate this further to see exactly what kind of values
the gmond process is coming up with in the relevant sections of code, but I
thought I'd ask here. Also, does anyone know if IBM has a library for 4.3
for the vmgetinfo function? It's also mentioned in
Chris Stone wrote:
Ganglia is great. I got it up and running on my linux cluster in short
order. I do have one nagging detail I'd like to remedy.
/var/lib/ganglia/rrds/ contains a directory called unspecified. My
ganglia web page also lists this name as the name of the cluster, ie.
+ (a value between 120-150)
}
else
do nothing
/quotage
Steven Wagner wrote:
All those values are in seconds.
The mcast_min/max values specify the range (randomly determined on each
round of execution) of interval between TRANSMISSIONS of the metric.
The other two values
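To make the mcast_min/mcast_max description concrete, here is a minimal Python sketch of picking a fresh interval on each round. It assumes a uniform whole-second choice, which the text above does not actually state; the function name and the 120/150 figures in the usage line are illustrative only.

```python
import random


def next_transmit_delay(mcast_min, mcast_max):
    """Pick the delay in seconds before the next transmission of a metric.

    Assumption: the delay is drawn uniformly from the inclusive range.
    """
    if mcast_min > mcast_max:
        raise ValueError("mcast_min must not exceed mcast_max")
    return random.randint(mcast_min, mcast_max)


# Each round draws again, so successive delays usually differ.
print([next_transmit_delay(120, 150) for _ in range(3)])
```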
Adil Hasan wrote:
Hello,
I quickly took a look at Ganglia and it looks like a nice tool for
monitoring some of our servers. However, I'd like to be able to run as a
non root user. Is it possible to do this? Or, is there another tool that
would be better suited for non-root users?
The fnord content was too low to be from the REAL Illuminati.
I suppose my fnord detector code might be broken, but I fed the front page
of cnn.com through it and it went crazy so I'm pretty sure it's working...
Doug fNordwall wrote:
I admit, this was the first spam that I've actually found
[EMAIL PROTECTED] wrote:
I get a lovely bit of code. It seems to be working.
Depends on the length and breadth of the code. If it's displaying metrics,
then it's working. If it just has the DTD and there's no real data (no
CLUSTER or HOST tags), it ain't.
Also, did you install it in
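The "does it have CLUSTER or HOST tags" check can be automated rather than eyeballed over telnet. A rough Python sketch, assuming the XML port is the 8649 mentioned elsewhere on the list; fetch_xml and count_tags are made-up helper names, and the sample document in the usage line is invented, not real gmond output.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_xml(host, port=8649, timeout=5.0):
    """Read everything the XML port sends until it closes the connection."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def count_tags(xml_bytes):
    """Return how many CLUSTER and HOST elements the document contains."""
    root = ET.fromstring(xml_bytes)
    return {
        "CLUSTER": sum(1 for _ in root.iter("CLUSTER")),
        "HOST": sum(1 for _ in root.iter("HOST")),
    }


sample = b'<GANGLIA_XML><CLUSTER NAME="demo"><HOST NAME="n1"/></CLUSTER></GANGLIA_XML>'
print(count_tags(sample))
```

If both counts come back zero for count_tags(fetch_xml("somehost")), you are in the "just the DTD, no real data" case.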
[EMAIL PROTECTED] wrote:
Hi all,
[points at Ben]
HA-ha!
OK, now that we've gotten the Nelson laugh out of the way...
[not being a ROCKS guy, I defer on all these points to anyone who is
*cough*fed*cough*]
I just installed a ROCKS 2.21 cluster, which seemed to have ganglia 1.05 or
Andrew Gill wrote:
I'm trying to get Ganglia to work on Solaris 8, and
seem to be hitting my head against a wall. I can
compile it without any problems, using gcc-3.2.
However, the gmond binary exits immediately (return
code 0) and no gmond process runs in the background.
A 'truss' of gmond
[EMAIL PROTECTED] wrote:
Orest.
Does the ganglia toolkit have a way to slow down the
database update rate, from every 15 seconds to every 30 (or 60)?
If you find metrics are updating too often, you can modify the values in
$GANGLIA_SOURCE/gmond/metric.h (look for mcast_min and mcast_max).
If you're
Matt once wondered (on the dev list) why I don't write documentation. So
after a solid day of SCSI troubleshooting, I thought I'd, you know,
contribute...
---
Here are the metrics that are widely supported across different platforms
(or, in a few cases, the ones we *wish* were supported
This may or may not be it, but when I first set up the ganglia frontend, I
needed to turn on register_globals in my php.ini file. The variables
passed to the different scripts (notably graph.php) just weren't being
accessed.
Then again, that was the first release... this may have been fixed
Cripes, way to freak out the developers. I hope you never see The
Adventures of Pluto Nash on an airplane, otherwise you might loudly
declare that you just saw a bomb. :P
This is normal behavior - 239.2.11.71 is a multicast address. Ganglia's
entire metric transmission system is based
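If in doubt about whether an address seen in a packet trace is multicast, Python's standard ipaddress module will tell you. A tiny sketch; the second address is just an arbitrary private unicast address for contrast.

```python
import ipaddress


def is_multicast(address):
    """True if the address falls in a multicast range (224.0.0.0/4 for IPv4)."""
    return ipaddress.ip_address(address).is_multicast


print(is_multicast("239.2.11.71"))   # True
print(is_multicast("192.168.1.10"))  # False
```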
:[EMAIL PROTECTED] Behalf Of Steven
Wagner
Sent: Tuesday, September 17, 2002 3:15 PM
To: ganglia-general@lists.sourceforge.net
Subject: Re: [Ganglia-general] Ganglia is not secure. (WOLF!)
Cripes, way to freak out the developers. I hope you never see The
Adventures of Pluto Nash on an airplane
Jeffrey B. Layton wrote:
At least you are thinking about security. You would be surprised how
many people don't even think about it! Don't feel bad.
Jeff
I'd also like to add that the timing of this e-mail was *perfect* as we are
readying a nice shiny new release and, if there WAS a major
Try running the monitoring cores in debug mode (in the foreground) to see
if they're receiving multicast packets from other hosts. You may need to
increase your mcast_ttl value.
Remember that all monitoring cores must use the same multicast address and
port, otherwise they won't hear one
If memory serves me correctly, the heartbeat metric was not added until
midway through our long CVS-only push from 2.4.1 to 2.5.0. Before this
implementation, it was difficult to really be sure whether a node was down
or had just randomly decided to wait more than 20-30 seconds to transmit a
HPC Mail Acct. wrote:
Hi Matt + list,
chorus of Hi, mailguy! :P
One other small unrelated thing - From your documentation:
If you want to monitor a node but do not want it to show up in the list
of hosts returned by gmond for gexec use, simply start gmond on that node
with the --no_gexec
Remember that RRD files are of a fixed size. In other words, they should
never grow beyond their original size when created. That's why they call
'em round-robin databases. :)
So the only reason new RRDs would be created is if new metrics were added
for existing hosts or if new hosts were
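The fixed-size behaviour is easy to picture with a toy model. This Python sketch is not how rrdtool stores data on disk (a real RRD file is allocated at full size up front); it only illustrates the round-robin idea that new samples overwrite the oldest ones. The class name is made up.

```python
from collections import deque


class RoundRobinArchive:
    """Toy fixed-capacity archive: the oldest sample is dropped when full."""

    def __init__(self, slots):
        self._values = deque(maxlen=slots)

    def update(self, value):
        self._values.append(value)

    def values(self):
        return list(self._values)


archive = RoundRobinArchive(slots=3)
for sample in range(5):
    archive.update(sample)
print(archive.values())  # [2, 3, 4]
```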
markp wrote:
Is anyone experiencing a high load with gmetad? I've run this daemon on
a high end intel 933mhz dual proc machine with 1gb of memory and RH
7.2. Loads get and stay as high as 3. I get worse results on single
processor machines, loads as high as 6.7. Kill the daemon and it drops
Joe Kaiser wrote:
Hi,
I am interested in getting greater granularity on some of the metrics,
especially over greater lengths of time. For example, if I wanted to
see the one hour cpu load and how it changed over an hour/day/week and I
wanted to have the same granularity at one week as I do at
Martin Margo wrote:
Dear Mr. Massie
Sir, I am really sorry to bug you again this time.
But I have finally sorted out all kinds of problems
and am finally getting closer to the problem.
I execute the /sbin/gmetad script and viewed the
/logs/gmetad.log file and in there it said
User of
Martin Margo wrote:
Hi Steven, thanks a lot for your help. I checked
out the logs and restarted the daemon a couple of
times, and waited for 5-10 minutes. I took a look
at the daemon logs and in it, it said
Use of uninitialized value in hash element at ./gmetad line 109.
over and over again to
Yujun_Wu wrote:
I am working on getting the monitoring info out of ganglia and putting it
into a grid-level monitoring tool. I find I can do this in three ways
after browsing the ganglia documentation:
1. telnet remote.cluster.nodename 8649
2. gstat
3. through rrdb
The first one (using
Joe Griffin wrote:
Hello,
I have two clusters running ganglia/gmetad
wonderfully. Each cluster has its own name
and gmetad separates the clusters by those names
(the headnode name).
I have a third cluster which has two types of
nodes within the same cluster (type1 and type2).
But gmetad
[EMAIL PROTECTED] wrote:
I am trying to run gmond and gmetad for the first time, and I am having
trouble getting it to work. I think the problem involves either the
version of perl I am using or that I am trying to run it on Solaris. The
machines I am trying to run the ganglia monitor on are
Well, I have no idea if this is an official solution but it sure as heck
worked for me. I thought I'd share.
Here's the problem I was having, in a nutshell:
* Boxes in my Solaris cluster appeared to disappear and reappear between
page views of gmetad-frontend. i.e., metacluster view says
Try adding debug_level 10 (or 100 - just greater than one) to your
/etc/gmond.conf and start gmond again to see where it dies.
Also, you *are* running it as superuser, right? It setuids itself but does
seem to need to be started by root...
Aaron Lott wrote:
Has anyone had luck getting
Ionescu Razvan-RIONESC1 wrote:
Hi!
Could anybody tell me what Linux kernel version is needed for running Gmond
(and Gmetad)? Or what modules are mandatory? I use a 2.4.5 kernel and it didn't
work; in fact I am able to get an XML, but without any information about
nodes. I worked with a 2.4.17
Just wondering if anyone else has experienced problems with one cluster's
metrics not being reported consistently in a gmetad multi-cluster setup.
At the moment I have a (fairly homogenous) 30-node all-Linux cluster that
reports very strongly (although for some reason cpu_num is reported as 1,
Gonéri Le Bouder wrote:
On Wed 26/06/2002 at 18:41, Steven Wagner wrote:
Gonéri Le Bouder wrote:
Is it possible to increase the time between two multicasts?
Yes, but you need to edit the source and recompile gmond to do it.
Open $TOP_DIR/gmond/metric.h and revise the values upwards
Good news, everyone! Most of the hardcore development I've been doing
on solaris.c for ganglia-monitoring-core 2.3.1b1 (the last version to
compile and execute for me on Solaris 8) is now finished. Since I'm
monitoring a group of fileservers, I've also added some metrics.
This means that,