Re: [Ganglia-general] pcp and /etc/services

2002-05-06 Thread Steven Wagner
Hi Joe! (and list!) Joe Griffin wrote: When I installed ganglia/gmond, /etc/services did not have an entry. Is it required? Although I am still in the testbed stage with ganglia, I didn't need to install anything extra on my Redhat 7.1 monitoring/exec-only boxes. It should be as simple as

[Ganglia-general] monitoring core 2.4.0 and Solaris 8/SPARC woe

2002-05-28 Thread Steven Wagner
Just wondering if anyone has (anecdotal or better) evidence of getting the monitoring core working on Solaris 8. I just tried cranking up gmond on a Netra t1 test box - it compiles but dumps core (Bus error). A little gdb work seems to indicate that it is having malloc problems setuid'ing to

[Ganglia-general] follow-up: Ganglia and Solaris 8

2002-05-28 Thread Steven Wagner
So after trawling the Sourceforge archives, I found that the monitoring core breaks between version 2.3.1b1 and 2.3.1b3 - 2.3.1b1 compiled and runs on my Netra t1 test box. I'd still like to know if anyone's using a later version than this with Sun boxes running Solaris (preferably Solaris 8)

[Ganglia-general] Updated irix.c submitted to sourceforge.

2002-06-06 Thread Steven Wagner
I added support for a couple of metrics to the monitoring core's irix.c. The changes I made were current as of 2.3.1b4, but I didn't see any work on the platform in the changelog so I assume it'll work with -current. Basically it adds support for user/system/idle CPU percentage reporting. Si

[Ganglia-general] gmond/solaris - ready for alpha

2002-06-12 Thread Steven Wagner
Good news, everyone! Most of the hardcore development I've been doing on solaris.c for ganglia-monitoring-core 2.3.1b1 (the last version to compile and execute for me on Solaris 8) is now finished. Since I'm monitoring a group of fileservers, I've also added some metrics. This means that, in

Re: [Ganglia-general] Delay between 2 multicast

2002-06-26 Thread Steven Wagner
Gonéri Le Bouder wrote: gmond multicasts data about every 15s. That isn't justified for our usage. One multicast every minute would suit us better. But I can't find a way to set that up. :( Is it possible to increase the time between 2 multicasts? Thanks, Gonéri Yes, but you need to edit the

Re: [Ganglia-general] Delay between 2 multicast

2002-06-27 Thread Steven Wagner
Gonéri Le Bouder wrote: On Wed 26/06/2002 at 18:41, Steven Wagner wrote: Gonéri Le Bouder wrote: Is it possible to increase the time between 2 multicasts? Yes, but you need to edit the source and recompile gmond to do it. Open $TOP_DIR/gmond/metric.h and revise the values upwards

[Ganglia-general] [gmetad] Intermittent results reported?

2002-06-28 Thread Steven Wagner
Just wondering if anyone else has experienced problems with one cluster's metrics not being reported consistently in a gmetad multi-cluster setup. At the moment I have a (fairly homogenous) 30-node all-Linux cluster that reports very strongly (although for some reason cpu_num is reported as 1,

Re: [Ganglia-general] What version of Linux Kernel...?

2002-07-01 Thread Steven Wagner
Ionescu Razvan-RIONESC1 wrote: Hi! Could anybody tell me what Linux kernel version is needed for running Gmond (and Gmetad)? Or what module are mandatory? I use a 2.4.5 kernel and didn't work, in fact I am able to get an XML, but without any information about nods. I worked with a 2.4.17 kernel,

Re: [Ganglia-general] multiple logical clusters on same subnet?

2002-07-01 Thread Steven Wagner
Joe Kaiser wrote: Hi, I work for a High Energy Physics lab, and we are evaluating ganglia for some of our monitoring needs. I have a cluster on the same subnet that I want to separate into several logical clusters. I have been able to do so thus far by putting different logical clusters on a di

[Ganglia-general] [gmetad] spotty updates - solution :)

2002-07-03 Thread Steven Wagner
Well, I have no idea if this is an "official" solution but it sure as heck worked for me. I thought I'd share. Here's the problem I was having, in a nutshell: * Boxes in my Solaris cluster appeared to disappear and reappear between page views of gmetad-frontend. i.e., metacluster view says

Re: [Ganglia-general] slackware 8

2002-07-03 Thread Steven Wagner
Try adding debug_level 10 (or 100 - just greater than one) to your /etc/gmond.conf and start gmond again to see where it dies. Also, you *are* running it as superuser, right? It setuids itself but does seem to need to be started by root... Aaron Lott wrote: Has anyone had luck getting gang

Re: [Ganglia-general] Re: slackware 8

2002-07-03 Thread Steven Wagner
Aaron Lott wrote: I ran gmond and this what I'm getting [EMAIL PROTECTED]:~# gmond /etc/gmond.conf options name is Nimzo mcast_channel is 239.2.11.71 mcast_port is 8649 mcast_if is eth1 mcast_ttl is 1 mcast_threads is 2 xml_port is 8649 xml_threads is 2 trusted hosts are ... num_nodes is 10 num_

Re: [Ganglia-general] Disappearing Graphs

2002-07-09 Thread Steven Wagner
I hit a few bumps setting up the front-end myself. If you're getting as far as the page partially loading (just not displaying graphs) then maybe the front-end is having trouble invoking RRDtool. Turn on warnings in PHP (or check your web server's error log) to see if the program is complaini

Re: [Ganglia-general] Disappearing Graphs

2002-07-09 Thread Steven Wagner
matt massie wrote: > 5. what platform did you install gmetad on? Geez you like numbered lists five items long, huh Matt? ;) This just reminded me, I thought I should add that I'm running gmetad on a Solaris 8 box using Perl 5.005_03 and 5.6.1 (yeah, I have both binaries available ... long s

Re: [Ganglia-general] Disappearing Graphs

2002-07-09 Thread Steven Wagner
Michael Dingwall wrote: Took a look at my apache error_log file and with every graph it tells me that Permission is Denied. I tried to mess around with permissions in a couple places, but that didn't work either. Is rrdtool readable/executable by the user Apache runs as? rrdtool doesn't see

Re: [Ganglia-general] Disappearing Graphs

2002-07-09 Thread Steven Wagner
matt massie wrote: > gmetad (which must be running) is contacted on port 8651. After the joy of discovering that '8649' is 'UNIX', I was disappointed to find out that '8651' translates best to 'VOL1'.
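A quick way to check the port-number pun, assuming it refers to spelling words on a standard telephone keypad (that reading is an interpretation, not something the thread states). The helper name word_to_digits is made up for this sketch:

```python
# Standard telephone keypad letter groups.
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}

# Invert the table so each letter points at its digit.
LETTER_TO_DIGIT = {ch: d for d, letters in KEYPAD.items() for ch in letters}


def word_to_digits(word):
    """Map letters to keypad digits; characters that are not letters pass through."""
    return "".join(LETTER_TO_DIGIT.get(ch, ch) for ch in word.upper())


print(word_to_digits("UNIX"))  # 8649
print(word_to_digits("VOL1"))  # 8651
```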

Re: [Ganglia-general] Disappearing Graphs

2002-07-09 Thread Steven Wagner
Michael Dingwall wrote: Hey guys, Thanks for the help. Found out that the owner for the rrds had been changed to nobody. That really screwed it up. Also, I don't think that they have be owned by the apache user, because they show up when owned by the root. So, as I just told you the pictu

Re: [Ganglia-general] perl and solaris

2002-07-11 Thread Steven Wagner
[EMAIL PROTECTED] wrote: I am trying to run gmond and gmetad for the first time, and I am having trouble getting it to work. I think the problem involves either the version of perl I am using or that I am trying to run it on Solaris. The machines I am trying to run the ganglia monitor on are us

Re: [Ganglia-general] A gmetad question

2002-07-12 Thread Steven Wagner
Joe Griffin wrote: Hello, I have two clusters running ganglia/gmetad wonderfully. Each cluster has its own name and gmetad separates the clusters by those names (the headnode name). I have a third cluster which has two types of nodes within the same cluster (type1 and type2). But gmetad modif

Re: [Ganglia-general] Help with getting info out of ganglia

2002-07-15 Thread Steven Wagner
Yujun_Wu wrote: I am working on getting the monitoring info out of ganglia and put them into a grid-level monitoring tool. I find I can do this in three ways after browsing the ganglia documentation: 1. telnet remote.cluster.nodename 8649 2. gstat 3. through rrdb The first one (using telnet)
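The telnet approach (option 1 above) can also be scripted. This is a minimal sketch, assuming gmond writes its XML to anyone who connects on port 8649 and then closes the connection, which is what a telnet session shows. The function names and the choice of decoding are assumptions of this example, not part of Ganglia:

```python
import socket


def read_until_close(conn, bufsize=4096):
    """Collect bytes from a connected socket until the peer closes it."""
    chunks = []
    while True:
        data = conn.recv(bufsize)
        if not data:  # empty read means the other side closed the connection
            break
        chunks.append(data)
    return b"".join(chunks)


def fetch_gmond_xml(host, port=8649, timeout=10.0):
    """Connect to a gmond XML port and return everything it sends as text."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        # Undecodable bytes are replaced rather than raising, since the
        # encoding of the reply is not verified here.
        return read_until_close(conn).decode("utf-8", errors="replace")
```

Calling fetch_gmond_xml("remote.cluster.nodename") would then return the same text the telnet session prints, ready to hand to an XML parser.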

Re: [Ganglia-general] GMETAD problem

2002-07-23 Thread Steven Wagner
Martin Margo wrote: Hi all, I have been having some problem with Gmetad. I have installed the program and it seems to work fine. I specified both the sources and trusted_host files and I start up the gmetad daemon. % sudo gmetad 127.0.0.1 xxx.xxx.xx.33 xxx.xxx.xx.42 xxx.xxx.xx.34 xxx.xxx.xx.

Re: [Ganglia-general] GMETAD problem

2002-07-24 Thread Steven Wagner
Martin Margo wrote: Hi Steven, thanks a lot for your help. I checked out the logs and restarted the daemon couple of times, and waited for 5-10 minutes. I took a look at the daemon logs and in it, it said Use of uninitialized value in hash element at ./gmetad line 109. over and over again to th

Re: [Ganglia-general] gmetad source code problem

2002-07-25 Thread Steven Wagner
Martin Margo wrote: Dear Mr. Massie Sir, I am really sorry to bug you again this time. But I have finally sorted out all kind of problems and have finally getting closer to the problem. I execute the /sbin/gmetad script and viewed the /logs/gmetad.log file and in there it said "User of unitia

Re: [Ganglia-general] raising granularity of gmetad

2002-08-06 Thread Steven Wagner
Joe Kaiser wrote: Hi, I am interested in getting greater granularity on some of the metrics, especially over greater lengths of time. For example, if I wanted to see the one hour cpu load and how it changed over an hour/day/week and I wanted to have the same granularity at one week as I do at o

Re: [Ganglia-general] high load with gmetad

2002-08-21 Thread Steven Wagner
markp wrote: Is anyone experiencing a high load with gmetad? I've run this daemon on a high end intel 933mhz dual proc machine with 1gb of memory and RH 7.2. Loads get and stay as high as 3. I get worse results on single processor machines, loads as high as 6.7 Kill the daemon and it drops b

Re: [Ganglia-general] high load with gmetad

2002-08-22 Thread Steven Wagner
Remember that RRD files are of a fixed size. In other words, they should never grow beyond their original size when created. That's why they call 'em round-robin databases. :) So the only reason new RRDs would be created is if new metrics were added for existing hosts or if new hosts were ad
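To illustrate why a round-robin database never grows, here is a toy ring buffer. It is only a sketch of the idea and does not reflect RRDtool's actual file format or consolidation logic; the class name is invented for the example:

```python
class RoundRobinArchive:
    """Fixed number of slots; new samples overwrite the oldest ones."""

    def __init__(self, slots):
        self.values = [None] * slots  # storage is allocated once, up front
        self.next_index = 0

    def update(self, value):
        # Write into the current slot, then advance and wrap around.
        self.values[self.next_index] = value
        self.next_index = (self.next_index + 1) % len(self.values)

    def in_order(self):
        """Return the stored samples from oldest to newest."""
        i = self.next_index
        return self.values[i:] + self.values[:i]


archive = RoundRobinArchive(3)
for sample in range(5):
    archive.update(sample)
print(len(archive.values))  # 3
print(archive.in_order())   # [2, 3, 4]
```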

Re: [Ganglia-general] Figured some stuff out for SuSE (was: libssl and libcrypto in SuSE openssl rpms)

2002-08-26 Thread Steven Wagner
HPC Mail Acct. wrote: Hi Matt + list, :P One other small unrelated thing - From your documentation: "If you want to monitor a node but do not want it to show up in the list of hosts returned by gmond for gexec use, simply start gmond on that node with the --no_gexec option." This option i

Re: [Ganglia-general] Unusual behaviour of gmond 2.4.1

2002-08-28 Thread Steven Wagner
Try running the monitoring cores in debug mode (in the foreground) to see if they're receiving multicast packets from other hosts. You may need to increase your mcast_ttl value. Remember that all monitoring cores must use the same multicast address and port, otherwise they won't hear one anot

Re: [Ganglia-general] Problems with gstat

2002-08-28 Thread Steven Wagner
If memory serves me correctly, the heartbeat metric was not added until midway through our long CVS-only push from 2.4.1 to 2.5.0. Before this implementation, it was difficult to really be sure whether a node was down or had just randomly decided to wait more than 20-30 seconds to transmit a m

Re: [Ganglia-general] Ganglia is not secure. (WOLF!)

2002-09-17 Thread Steven Wagner
Cripes, way to freak out the developers. I hope you never see "The Adventures of Pluto Nash" on an airplane, otherwise you might loudly declare that you just saw a bomb. :P This is normal behavior - 239.2.11.71 is a multicast address. Ganglia's entire metric transmission system is based aro

Re: [Ganglia-general] Ganglia is not secure. (WOLF!)

2002-09-17 Thread Steven Wagner
TECTED] [mailto:[EMAIL PROTECTED] Behalf Of Steven Wagner Sent: Tuesday, September 17, 2002 3:15 PM To: ganglia-general@lists.sourceforge.net Subject: Re: [Ganglia-general] Ganglia is not secure. (WOLF!) Cripes, way to freak out the developers. I hope you never see "The Adventures of Pluto

Re: [Ganglia-general] Ganglia is not secure. (WOLF!)

2002-09-17 Thread Steven Wagner
Jeffrey B. Layton wrote: At least you are thinking about security. You would be surprised how many people don't even think about it! Don't feel bad. Jeff I'd also like to add that the timing of this e-mail was *perfect* as we are readying a nice shiny new release and, if there WAS a major sec

Re: [Ganglia-general] Web Front End Problems

2002-10-01 Thread Steven Wagner
This may or may not be it, but when I first set up the ganglia frontend, I needed to turn on register_globals in my php.ini file. The variables passed to the different scripts (notably graph.php) just weren't being accessed. Then again, that was the first release... this may have been fixed l

Re: [Ganglia-general] explanation of metrics

2002-10-03 Thread Steven Wagner
Matt once wondered (on the dev list) why I don't write documentation. So after a solid day of SCSI troubleshooting, I thought I'd, you know, "contribute..." --- Here are the metrics that are widely supported across different platforms (or, in a few cases, the ones we *wish* were supported ac

Re: [Ganglia-general] Monitoring

2002-10-04 Thread Steven Wagner
Leif Nixon wrote: So, once you've gotten Ganglia to pull in metrics from gazillions of nodes in umpteen clusters, and got pretty graphs of everything, what do you use for monitoring the values? I mean, when a machine goes down, you don't want just a webpage to be updated, you want something to tr

Re: [Ganglia-general] Monitoring

2002-10-07 Thread Steven Wagner
Leif Nixon wrote: Steven Wagner <[EMAIL PROTECTED]> writes: Yes, that's what I did last week. It ain't no fun. Nagios' handling of passive service checks isn't flexible enough. And passive host checking Just Isn't Done. Once again, considering you have the so

Re: [Ganglia-general] update rate

2002-10-07 Thread Steven Wagner
[EMAIL PROTECTED] wrote: Orest. Does ganglia toolkits have posibilities to slow down database updating rate not 15 seconds but 30 (60 ) ? If you find metrics are updating too often, you can modify the values in $GANGLIA_SOURCE/gmond/metric.h (look for mcast_min and mcast_max). If you're s

Re: [Ganglia-general] overriding the location variable

2002-10-07 Thread Steven Wagner
Doug Nordwall wrote: I'm attempting to use the rack view in the new ganglia, and I do not want to be forced to custom write out a gmond.conf for every node in the cluster. Currently, the location variable is gathered from there, and I don't appear to be able to override it with gmetric. I have

Re: [Ganglia-general] overriding the location variable

2002-10-08 Thread Steven Wagner
Doug Nordwall wrote: Are we talking about the same location variable? Yes. :)

Re: [Ganglia-general] Ganglia 2.5.0 on Solaris 8

2002-10-08 Thread Steven Wagner
Andrew Gill wrote: I'm trying to get Ganglia to work on Solaris 8, and seem to be hitting my head against a wall. I can compile it without any problems, using gcc-3.2. However, the gmond binary exits immediately (return code 0) and no gmond process runs in the background. A 'truss' of gmond do

Re: [Ganglia-general] Ganglia 2.5.0 on Solaris 8

2002-10-10 Thread Steven Wagner
Andrew Gill wrote: It seems as though that is the problem. The last few lines of output (in debug mode) are: gmond: /dev/ksyms is not a 32-bit kernel namelist kvm_open: Error 0 *** WARNING kvm_open() failed. prepare for a segfault ... *** *** kvm_open() failed, are you running gmond as

Re: [Ganglia-general] problem defining a cluster name

2002-10-15 Thread Steven Wagner
Also, everything connected to the same multicast IP is, for all intents and purposes, on the same cluster... as far as the monitoring core's concerned. So, if you have three nodes: 1. binky (10.0.0.2) multicasting on 232.2.72.5 cluster name "work is hell" 2. sheba (10.0.0.3) multicastin

Re: [Ganglia-general] Newbie error

2002-10-16 Thread Steven Wagner
[EMAIL PROTECTED] wrote: Hi all, [points at Ben] HA-ha! OK, now that we've gotten the Nelson laugh out of the way... [not being a ROCKS guy, I defer on all these points to anyone who is *cough*fed*cough*] I just installed a ROCKS 2.21 cluster, which seemed to have ganglia 1.05 or somethi

Re: [Ganglia-general] Newbie error

2002-10-17 Thread Steven Wagner
[EMAIL PROTECTED] wrote: > I get a lovely bit of code. It seems to be working. Depends on the length and breadth of the code. If it's displaying metrics, then it's working. If it just has the DTD and there's no real data (no or tags), it ain't. Also, did you install it in $HTTPD_DOCRO

Re: [Ganglia-general] The Illuminati Order

2002-10-23 Thread Steven Wagner
The fnord content was too low to be from the REAL Illuminati. I suppose my fnord detector code might be broken, but I fed the front page of cnn.com through it and it went crazy so I'm pretty sure it's working... Doug fNordwall wrote: I admit, this was the first spam that I've actually found a

Re: [Ganglia-general] The Illuminati Order

2002-10-24 Thread Steven Wagner
Tarjei Knapstad wrote: On Wed, 2002-10-23 at 20:16, Doug Nordwall wrote: I admit, this was the first spam that I've actually found amusing and potentially useful. I was a bit put off when I realized that my degree in chemistry was not going to do me any good in their society :) 6. No human

Re: [Ganglia-general] question about ganglia

2002-10-25 Thread Steven Wagner
Adil Hasan wrote: Hello, I quickly took a look at Ganglia and it looks like a nice tool for monitoring some of our servers. However, I'd like to be able to run as a non root user. Is it possible to do this? Or, is there another tool that would be better suited for non-root users? tha

Re: [Ganglia-general] update rate

2002-10-28 Thread Steven Wagner
a right to study painting, poetry, music, and architecture." John Adams - Original Message - From: "Steven Wagner" <[EMAIL PROTECTED]> To: Sent: Monday, October 07, 2002 11:25 AM Subject: Re: [Ganglia-general] update rate [EMAIL PROTECTED] wrote: Orest. Does gangl

Re: [Ganglia-general] update rate

2002-10-28 Thread Steven Wagner
else do nothing Steven Wagner wrote: All those values are in seconds. The mcast_min/max values specify the range (randomly determined on each round of execution) of interval between TRANSMISSIONS of the metric. The other two values specify the range of the interval between
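The mcast_min/mcast_max behaviour described above can be pictured as drawing a fresh delay from that range on every round. A minimal sketch, assuming a uniform draw; gmond itself is written in C and may choose the value differently, and the function name and the 15/20 second figures are hypothetical:

```python
import random


def next_send_delay(mcast_min, mcast_max, rng=random):
    """Pick how many seconds to wait before the next transmission."""
    if mcast_min > mcast_max:
        raise ValueError("mcast_min must not exceed mcast_max")
    # uniform() returns a float anywhere in the closed range.
    return rng.uniform(mcast_min, mcast_max)


# Hypothetical values: wait somewhere between 15 and 20 seconds.
delay = next_send_delay(15, 20)
```

Raising both bounds spaces the transmissions further apart, which is the effect of editing the values in metric.h and recompiling.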

Re: [Ganglia-general] aix mem_free, 4.3

2002-10-29 Thread Steven Wagner
Lester Vecsey wrote: I was going to investigate this further to see exactly what kind of values the gmond process is coming up with in the relevant sections of code, but I thought I'd ask here. Also, does anyone know if IBM has a library for 4.3 for the vmgetinfo function? It's also mentioned in t

Re: [Ganglia-general] webfrontend config question

2002-10-29 Thread Steven Wagner
Chris Stone wrote: Ganglia is great. I got it up and running on my linux cluster in short order. I do have one nagging detail I'd like to remedy. /var/lib/ganglia/rrds/ contains a directory called "unspecified". My ganglia web page also lists this name as the name of the cluster, ie. "unspecif

Re: [Ganglia-general] Solaris binaries

2002-11-05 Thread Steven Wagner
sun4u should be sun4u. I do all my builds on a Netra t1 and deploy the resulting binary on 20 E450s. But they all run Solaris 8. I am also the one "responsible" (if you can call it that) for Ganglia's performance (or lack thereof) on the Solaris platform. My development environment was gcc

Re: [Ganglia-general] RE: Must localhost always be first in GEXEC_SVRS ?

2002-11-05 Thread Steven Wagner
Karl Kopper wrote: -Original Message- From: Brent N. Chun [mailto:[EMAIL PROTECTED] When we do this on node 1 everything works as expected: On host unfiwcl1 -- unfiwcl1:/# echo $GEXEC_SVRS unfiwcl1 unfiwcl2 unfiwcl3 unfiwcl4 unfiwcl1:/# gexec -n 0 hostname 1 unfiwcl2 0 unfiwcl1 2 unfi

ganglia-general@lists.sourceforge.net

2002-11-06 Thread Steven Wagner
Lester Vecsey wrote: I find it useful to select certain graphs and copy/paste the URL to some of the images to call them from my own html page, and I noticed that the graphs have a '(now )' value that is passed in with the &v= argument to graph.php. Certainly graph.php should be able to have acc

ganglia-general@lists.sourceforge.net

2002-11-06 Thread Steven Wagner
lated with '&v=1', etc, and other values. - Original Message ----- From: "Steven Wagner" <[EMAIL PROTECTED]> Try leaving off the v name/value pair. Seems to work...

Re: [Ganglia-general] Can't see the other gmonds.

2002-11-25 Thread Steven Wagner
This might be FAQ-worthy, we seem to be getting variations on this question a lot. So I'll throw in some (nearly) useless or tangential information to this post, which I've added to throughout the day while working on other projects. BTW, that pic didn't help. So, for the record: Monitoring

Re: [Ganglia-general] (no subject)

2002-12-02 Thread Steven Wagner
kvm_open() is returning a permission denied error. Make sure you're running gmond as root. You should get a slightly different set of errors if you are struck by the mighty mismatched-compiler problem. Just in case, check your gmond ... file /path/to/your/gmond The output should look like

Re: [Ganglia-general] How to take a box out of a cluster?

2002-12-06 Thread Steven Wagner
2.5.1 should support the concept of DMAX for individual metrics. I think that extends to hosts, as well. Basically it's metric aging - if a metric hasn't been transmitted for X seconds, take it off the list. It's designed exactly for this type of thing - getting rid of hosts that have been de
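The aging rule described above ("if it hasn't been transmitted for X seconds, take it off the list") can be sketched as follows. This is an illustration of the idea only, not gmond's code; the function name is invented, and treating a DMAX of 0 as "never expire" is an assumption of this sketch:

```python
def prune_stale(last_seen, dmax, now):
    """Return only the entries heard from within the last dmax seconds.

    last_seen maps a metric or host name to the timestamp (in seconds)
    of its most recent transmission.
    """
    if dmax == 0:
        # Assumption for this sketch: 0 disables aging entirely.
        return dict(last_seen)
    return {name: ts for name, ts in last_seen.items() if now - ts <= dmax}


# Hypothetical data: "old_node" was last heard from 110 seconds ago.
seen = {"node01": 100, "old_node": 10}
print(prune_stale(seen, dmax=60, now=120))  # {'node01': 100}
```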

[Ganglia-general] Re: catalyst switches vlans and gmonds

2002-12-10 Thread Steven Wagner
Lester Vecsey wrote: When I ping the multicast address, i.e. the default gmond multicast address of 239.2.11.71, on a network of say 20 gmonds, I get back one successful result and then 19 marked as "DUP!" When you ping a multicast address, you should get a response from every host listening o

Re: [Ganglia-general] cross platform gmond clusters

2002-12-18 Thread Steven Wagner
Lester Vecsey wrote: Looking through the key_metrics.h file it seems that linux machines get a different set of keys from aix, and so on. Theres a basic core set of keys that are on all platforms, but then when it gets to things like pkts_in its only available for linux. In particular pkts_in is

Re: [Ganglia-general] Kernel Upgrade killed my happy Ganglia

2002-12-19 Thread Steven Wagner
Phil Forrest wrote: Hello All, Once upon a time, I had a happy ganglia monitor that was giving me valuable data on all nodes of my 48 node cluster. Then I got a request from a user to upgrade the kernel. After I upgraded the kernels across the cluster, my ganglia could only see the data from

Re: [Ganglia-general] Kernel Upgrade killed my happy Ganglia

2002-12-19 Thread Steven Wagner
When in doubt, use telnet. See what telnetting to node02 from prism (which I assume is one of the head systems) on port 8649 gets you. You should at the very least get a Ganglia DTD and XML for one node. If you don't, something is really wrong (congratulations, you found a weird bug). If yo

Re: [Ganglia-general] Kernel Upgrade killed my happy Ganglia

2002-12-19 Thread Steven Wagner
I'd check the network equipment if I were you. The specifics of that are of course vendor-dependent (HP makes Gig-E switches? What's next, gaming consoles?). Make sure it hasn't been configured to drop multicast traffic or something (it could happen!). Oh yeah, and try increasing mcast_ttl

Re: [Ganglia-general] Kernel Upgrade killed my happy Ganglia

2002-12-19 Thread Steven Wagner
Kent IV, William (WW) wrote: I've got an almost identical problem, except I'm using a 3Com 3924 GigE switch (and Dlink DGE-500T adapters). Also, the motherboards have on-board 10/100 connections that aren't being used. Yeah, that sounds like a horse of an entirely different color. Maybe the

Re: [Ganglia-general] Cannot parse RRD - Does ganglia require a specific version of RRD TOOL ?

2003-01-13 Thread Steven Wagner
It's probably that colon in the pathname. IIRC, colons have special significance in a RRD commandline. Maybe you can escape it or refer to it using some alternate method in the config file? Vu, Phuong A (MP Technology) wrote: I have everything setup and running, except the graphs from RRD ar

Re: [Ganglia-general] How to setup multiple clusters using different multicast IP

2003-01-15 Thread Steven Wagner
ontend. I don't want XYZ to be part of the monitored clusters. The monitoring core probably shouldn't be running on the front-end. The metadaemon should be enough. I know in previous reply, Steven Wagner has said that this should work, but I am not able to get it to behave th

Re: [Ganglia-general] performance concerns

2003-01-16 Thread Steven Wagner
Nicholas Henke wrote: On Mon, 23 Dec 2002 18:40:26 -0500 Nicholas Henke <[EMAIL PROTECTED]> wrote: Hello-- I have installed ganglia on several of our clusters, but it would seem that the multi-cast channel is pumping a ton of data. On a 96 node cluster, I am seeing around 20-3

Re: [Ganglia-general] performance concerns

2003-01-17 Thread Steven Wagner
Lester Vecsey wrote: - Original Message - From: "Steven Wagner" <[EMAIL PROTECTED]> Hey, I wonder what would happen if someone specified a non-multicast IP (running gmond, of course) as the target multicast network... anyone ever try that? Even if this worked,

[Ganglia-general] web front-end: the phantom "job" view

2003-01-21 Thread Steven Wagner
I noticed in CVS some comments about a "job" view, allowing for a user-specified graph start time and duration. However there doesn't appear to be any kind of interface for it. I'm not afraid of rolling my own (in fact, I think it might be fun to roll that into another application entirely...

Re: [Ganglia-general] ganglia-webfrontend

2003-01-27 Thread Steven Wagner
John Francis Lee wrote: Greetings, I've downloaded and installed the software to the machines in our internet cafe, have gmond running on all and gmetad on one. When I try to view the setup with the ganglia-webfrontend I get a lot of messages on the order of: Warning: ksort() expects parameter

Re: [Ganglia-general] ganglia-webfrontend

2003-01-27 Thread Steven Wagner
John Francis Lee wrote: Thanks for the help! I followed you suggestions and attach the output of each telnet command. Both were able to connect, and the machine running gmond responded with data. Maybe there's something wrong with php? Take another look at the metadaemon's output: >[DTD del

Re: [Ganglia-general] ganglia-webfrontend

2003-01-28 Thread Steven Wagner
John Francis Lee wrote: Thanks again! Setting the debug level to 10 showed me that gmetad was unable to connect to itself! I changed the datasource specification to 'localhost' from the machine'd fqdn and things worked! What I get now is 'There are 10 nodes up and running. There are no nodes d

Re: [Ganglia-general] ganglia-webfrontend

2003-01-28 Thread Steven Wagner
I'd really appreciate it . thanks logan donaldson [EMAIL PROTECTED] On Tuesday, January 28, 2003, at 01:02 PM, Steven Wagner wrote: John Francis Lee wrote: Thanks again! Setting the debug level to 10 showed me that gmetad was unable to connect to itself! I changed the datasource specificatio

Re: [Ganglia-general] gmetric question

2003-01-28 Thread Steven Wagner
Joe Griffin wrote: Hi All, Is there any similar information on gmetric? I found a script I would like to use in number 16 of: http://ganglia.sourceforge.net/gmetric/ However, I cannot get gmetric to print any output. For example, I tried: /usr/bin/gmetric --name "Resource_Usage_Rank 2" --valu

Re: [Ganglia-general] solaris not reporting running processes

2003-01-30 Thread Steven Wagner
That metric isn't currently supported on Solaris. I have an idea of how to do it but I simply haven't had the time to work on it. Basically it involves walking the /proc tree looking for processes in the Run state and multicasting that number. If someone else wants to write the code for it,

Re: [Ganglia-general] Problem with added metrics

2003-02-14 Thread Steven Wagner
I wouldn't hold my breath to see any form of notification pop up in 2.x (unless, of course, someone's about to spring something on everyone). 3.x is being designed with a more open framework and, although it's still fairly early along to tell for sure, should at the very least be able to suppo

Re: [Ganglia-general] grid graphs missing parts

2003-02-21 Thread Steven Wagner
Nicholas Henke wrote: OK -- so check this link, it is all of our clusters: http://www.liniac.upenn.edu/ganglia. Notice how the overall graph is spotty, but none of the others are? How do I fix that ? Nic Hard to conclusively say without putting gmetad into debug mode and sifting through a co

Re: [Ganglia-general] showing graphs

2003-02-27 Thread Steven Wagner
Santanu Das wrote: Hi, Any body can tell me what is wrong with. http://farm002.hep.phy.cam.ac.uk/cavendish/ Why those graphs are coming empty? You didn't specify if there was a prize for doing so, but what the heck... Chances are (I'm guessing, here), you have gmetad running on the front-e

Re: [Ganglia-general] a few questions

2003-03-05 Thread Steven Wagner
Santanu Das wrote: Actually I did mean to say how to change the label like in spite of "Unspecified Grid" some thing like "HEP DataDrid" or else. Did somebody say, "undocumented feature" ? gmetad and the web front-end control the "grid" stuff - this is a new feature addition as of 2.5.2, wh

Re: [Ganglia-general] Webfrontend graph's time resolution

2003-03-05 Thread Steven Wagner
Henry Leyh wrote: Hi, We have the ganglia monitor core 2.5.2 installed on two clusters (20 and 68 hosts, different subnets, connected via gmond's "trusted_hosts") and watch it with gmetad/webfrontend 2.5.2 running on a machine which belongs to one of the two clusters. What we observe now (aft

Re: [Ganglia-general] Webfrontend graph's time resolution

2003-03-06 Thread Steven Wagner
Henry Leyh wrote: I cannot find anything unreasonable here. The polling interval seems to be correct. Note that do not have private 192.168... addresses for the cluster nodes. Yup, all that looks reasonable. My grab bag o' fixes is officially empty. :) One thing I guess you could try is r

Re: [Ganglia-general] Cluster frontend not reporting

2003-03-11 Thread Steven Wagner
Leif Nixon wrote: Well, this is a new one - at least for me. One of our clusters was rebooted last week, due to a physical relocation. Now the ganglia XML data doesn't contain any mention of the cluster frontend, even though gmond is running fine and responding to the XML data port: nixon $

Re: [Ganglia-general] Cluster frontend not reporting

2003-03-11 Thread Steven Wagner
Leif Nixon wrote: Steven Wagner <[EMAIL PROTECTED]> writes: That's how I found out that my front-end was *three* hops away from the test cluster and I'm thinking you have either a monitoring core config issue or a host/network config issue to track down... (maybe a host/networ

Re: [Ganglia-general] Cluster frontend not reporting

2003-03-11 Thread Steven Wagner
Leif Nixon wrote: Steven Wagner <[EMAIL PROTECTED]> writes: I guess it's possible that g0 is sending metrics on one interface and listening on another Which is exactly what's happening, according to my trusty tcpdump. Is that supposed to be possible? 8^) Errr... yes

Re: [Ganglia-general] Various questions related to Ganglia

2003-03-20 Thread Steven Wagner
Jesper Frank Nemholt wrote: Hi! I have a couple of questions related to Ganglia. I'll say! I've read some of the Ganglia documentation but haven't tried installing it yet. Oooh, there's a good start. What I usually need is a tool that can tell me, and allow me to tell some service respon

Re: [Ganglia-general] Incorrect number of cpu's reported

2003-03-21 Thread Steven Wagner
Hi Jason, Assuming that Itanium gmond uses gmond/machines/linux.c, the number of CPUs is retrieved using the glibc call "get_nprocs()" with the remainder of the CPU info parsed from /proc/cpuinfo. You may want to check the source. If you grep the gmond output (telnet one-of-your-monitoring-c
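As a cross-check against what gmond reports, one can count the processor entries in /proc/cpuinfo directly. Note that this is not the get_nprocs() call mentioned above, just a sketch for comparison. It assumes the Linux layout where each CPU has a line whose key is "processor"; the function name and sample text are made up:

```python
def count_processors(cpuinfo_text):
    """Count 'processor' entries in /proc/cpuinfo-style text."""
    count = 0
    for line in cpuinfo_text.splitlines():
        key, sep, _value = line.partition(":")
        # Only lines of the form "processor : N" start a new CPU block.
        if sep and key.strip() == "processor":
            count += 1
    return count


# Hypothetical two-CPU excerpt.
sample = "processor : 0\nvendor : Example\n\nprocessor : 1\nvendor : Example\n"
print(count_processors(sample))  # 2
```

On a live Linux box the text would come from open("/proc/cpuinfo").read(); if the count disagrees with cpu_num in the gmond output, that narrows down where to look.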

Re: [Ganglia-general] problem with gmetad

2003-03-24 Thread Steven Wagner
When I first read this thread I thought it was a problem with permissions, but I doubt gmetad would have been able to create the RRD file in the first place if that were the case. Last week I was grumbling about how gmetad seemed to be recreating a specific RRD for every host on every update pa

Re: [Ganglia-general] Multicast configuration

2003-03-25 Thread Steven Wagner
a. Definitely a. When a host opens a multicast socket, the kernel sends a join message *for that IP* and should start receiving traffic for that multicast network from that point on. Listening on different ports... hmmm, that didn't work the last time I tried it, but I'm pretty sure that it
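The join described above can be sketched in Python with the standard socket module. This is an illustration under stated assumptions, not gmond's own code: the helper names build_mreq and join_group are hypothetical, and the group address and port in the usage comment are example values only.

```python
import socket
import struct


def build_mreq(group_ip, interface_ip="0.0.0.0"):
    """Pack the membership request: group address, then local interface.

    0.0.0.0 lets the kernel choose the interface, which matters on
    multi-homed hosts that send on one interface and listen on another.
    """
    return struct.pack(
        "4s4s", socket.inet_aton(group_ip), socket.inet_aton(interface_ip)
    )


def join_group(group_ip, port, interface_ip="0.0.0.0"):
    """Open a UDP socket and ask the kernel to join the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # The kernel sends the join for this group when the option is set.
    sock.setsockopt(
        socket.IPPROTO_IP,
        socket.IP_ADD_MEMBERSHIP,
        build_mreq(group_ip, interface_ip),
    )
    return sock


# Example usage (values are illustrative):
#   sock = join_group("239.2.11.71", 8649)
#   data, sender = sock.recvfrom(65535)
```

Membership is per group IP, so two sockets joined to the same group on different ports still share the same join at the network level; the port only filters what each socket receives.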

Re: [Ganglia-general] Display problem

2003-03-26 Thread Steven Wagner
matt massie wrote: prashant- so when a node in the cluster dies the cluster size changes but the dead node is not reported? this is a new problem that i haven't heard of before. did gmond get restarted after the node failed? ganglia knows that a node dies when it stops getting heartbeats f

Re: [Ganglia-general] gmond --without-kvm?

2003-03-27 Thread Steven Wagner
Preston Smith wrote: On Thu, Mar 27, 2003 at 01:31:36AM -0500, Lester Vecsey ([EMAIL PROTECTED]) wrote: I compiled gmond on FreeBSD 4.4-RELEASE and I'm running it with a non-root account. /dev/mem on the machine isn't accessible from this account, and so there's a segfault on kvm_open when I r

Re: [Ganglia-general] nodes reporting on each other

2003-04-01 Thread Steven Wagner
Hi Arnie, Sounds like you need to change some multicast IPs. All the nodes that you want to appear in a single cluster should have the same multicast IP. Despite your best efforts to explain it, I think you're probably the best person to determine how you want your grid layout to look. :) A

Re: [Ganglia-general] displays on large (1000 nodes) cluster

2003-04-01 Thread Steven Wagner
Poirier, Keith wrote: Just wondering if anyone has done any modification of the Ganglia web front-end with regards to large (1000+ nodes) clusters. I see that the meta view handles multiple clusters, but when it comes to a single large cluster, the graphics of the nodes (up/down/loaded etc) and t

Re: [Ganglia-general] High level design principles/philosophy?

2003-04-02 Thread Steven Wagner
Mark Seger wrote: I wasn't sure if this is the right place to ask, but I figured if not, someone will certainly tell me. 8-) The developers list should hold at least some answers to your questions. From it you can get a pretty good sense of Ganglia's direction and capabilities (not to mentio

Re: [Ganglia-general] gmetad on OSX

2003-04-08 Thread Steven Wagner
M. Michael Barmada wrote: Hi, I'm wondering if anyone has had success compiling gmetad on OSX? Even after getting everything else working (installing rrd through fink required some additional arguments to configure to get all the libraries recognized), 'make' keeps failing in the gmetad direc

Re: [Ganglia-general] gmetad on OSX

2003-04-08 Thread Steven Wagner
/sw/include could be a Fink include install directory. Fink defaults to putting installed and built software in /sw, IIRC ... (I'm not running it at the moment on my Powerbook, which needs a 10.2 upgrade...) matt massie wrote: Today, M. Michael Barmada wrote forth saying... I'm wondering i

Re: [Ganglia-general] Display problems etc...

2003-04-14 Thread Steven Wagner
Make sure you're using the latest gmetad and web front-end. Latest version is 2.5.3, and it incorporates fixes to directly address both issues (a was addressed in 2.5.2, b in 2.5.3). I've been having trouble with gaps for months - check the ganglia-general archives for various musings on it..

Re: [Ganglia-general] Trouble launching gmond

2003-05-07 Thread steven wagner
Ken MacInnis wrote: On Wed, 7 May 2003, David Bickle wrote: Still having problems. I've compiled gcc 3.2.2 from source with the CPU=sparc64. I'm running Solaris 8. I have also compiled ganglia with --enable-sparc64. gmond still won't launch for some reason. Check this: bash-2.03$ file /usr/

Re: [Ganglia-general] Trouble launching gmond

2003-05-07 Thread steven wagner
g cpu_num value (20) to ncpus. Segmentation Fault Yes I am running as root. Why is it complaining about /dev/ksyms not being 32-bit? Am I missing a configure option? Thanks Again, On Wed, 7 May 2003, steven wagner wrote: Ken MacInnis wrote: On Wed, 7 May 2003, David Bickle wrote

Re: [Ganglia-general] Gmond on Solaris 2.6/2.8

2003-05-07 Thread steven wagner
I'd like to remind all of you that when I ported the monitoring core to Solaris, I never tested it on more than a four-way E420R (or on anything pre-Solaris 7) ... I tested it on the widest range of Sun iron available to me, which isn't very much. This looks like a Steve's Crappy Coding proble
