On Tue, Nov 25, 2003 at 01:53:51PM -0800, Brooks Davis wrote:
On Tue, Nov 25, 2003 at 01:43:19PM -0800, steven wagner wrote:
I'm sorry to report that you should be getting metric data back on
Tru64. Sadly, I can't offer any developmental support here now because
all our Alpha are belong to dumpster (although for the record, I am the
one to blame for the monitoring core running on Tru64 to begin with...
sorry about that!).
The metrics aren't being reported by the monitoring core. Either
something went wrong with the build (just because it compiled doesn't
mean it really works ... ) or something is wrong at runtime. To check
runtime, run the monitoring core in debug mode and see what kind of data
you get out of it.
I'm not familiar with the Tru64 code, but a number of metric
implementations require that you be root to run them. This means they
fail in interesting ways with the default behavior of running as nobody.
It might be worth a try to use the config file to cause gmond to run as
root all the time and see if that fixes the problem.
-- Brooks
Running gmond as root (i.e., no_setuid on) does not seem to make a
difference (although, if it did, I don't think that would be such a
great idea).
I did have a small problem compiling machine.c -> machines/osf.c.
There was a line break in a string on line 163 (I believe). Here is
a trivial patch:
diff -ur ganglia-monitor-core-2.5.5-orig/gmond/machines/osf.c ganglia-monitor-core-2.5.5-mod/gmond/machines/osf.c
--- ganglia-monitor-core-2.5.5-orig/gmond/machines/osf.c 2002-09-06 15:06:15.000000000 -0400
+++ ganglia-monitor-core-2.5.5-mod/gmond/machines/osf.c 2003-11-24 15:07:14.000000000 -0500
@@ -163,8 +163,7 @@
{
alpha = 0.5 * (timediff / 30.0e7);
beta = 1.0 - alpha;
- debug_msg("* * * * Setting alpha to %f and beta to %f because timediff
-= %d",alpha,beta,timediff);
+ debug_msg("* * * * Setting alpha to %f and beta to %f because timediff = %d",alpha,beta,timediff);
}
else
{
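
For what it's worth, standard C does not allow a raw line break inside a
string literal (some compilers have accepted it as an extension), which
would explain the build problem. If the long line is a nuisance, adjacent
string literals are concatenated by the compiler, so the format string can
be split legally. A minimal sketch, not the actual osf.c code:
format_alpha_msg is a hypothetical stand-in that writes into a buffer
instead of calling gmond's debug_msg, and timediff is assumed to be an int
only because the original format uses %d.

```c
#include <stdio.h>

/* Hypothetical stand-in for the debug_msg() call in the patch above:
 * formats the message into buf and returns snprintf's result. */
static int format_alpha_msg(char *buf, size_t len,
                            double alpha, double beta, int timediff)
{
    /* Adjacent string literals are joined at compile time, so the
     * format string spans two source lines without a raw newline. */
    return snprintf(buf, len,
                    "* * * * Setting alpha to %f and beta to %f "
                    "because timediff = %d",
                    alpha, beta, timediff);
}
```

Called as format_alpha_msg(buf, sizeof buf, 0.5, 0.5, 30), this produces
the same single-line message the patched debug_msg call would print.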
I will try to rebuild and pay closer attention to any other problems.
Running with -d2, here is a snippet of the output:
pthread_attr_init
creating cluster hash for 2 nodes
hash_create size = 2
hash->size is 3
gmond initialized cluster hash
Using multicast-enabled interface alt0
mcast listening on 239.2.11.71 8649
XML listening on port 8649
listening thread(s) have been started
mcast_listen_thread() started 26375680
mcast_listen_thread() started 15824384
listening thread(s) have been started
cleanup thread has been started
multicasting on channel 239.2.11.71 8649
created monitor thread
set_metric_value() exec'd cpu_num_func (1)
set_metric_value() exec'd cpu_speed_func (2)
set_metric_value() exec'd mem_total_func (3)
set_metric_value() exec'd swap_total_func (4)
set_metric_value() exec'd boottime_func (5)
set_metric_value() exec'd sys_clock_func (6)
set_metric_value() exec'd machine_type_func (7)
set_metric_value() exec'd os_name_func (8)
set_metric_value() exec'd os_release_func (9)
set_metric_value() exec'd gexec_func (25)
set_metric_value() exec'd heartbeat_func (26)
my start_time is 1069797762
set_metric_value() exec'd mtu_func (27)
set_metric_value() exec'd location_func (28)
my location is unspecified
mcast_value() mcasting cpu_num value
encoded 8 XDR bytes
XDR data successfully sent
mcast_value() mcasting cpu_speed value
encoded 8 XDR bytes
XDR data successfully sent
mcast_value() mcasting mem_total value
encoded 8 XDR bytes
XDR data successfully sent
mcast_value() mcasting swap_total value
encoded 8 XDR bytes
XDR data successfully sent
mcast_value() mcasting boottime value
encoded 8 XDR bytes
XDR data successfully sent
mcast_value() mcasting sys_clock value
encoded 8 XDR bytes
XDR data successfully sent
mcast_value() mcasting machine_type value
encoded 16 XDR bytes
XDR data successfully sent
mcast_value() mcasting os_name value
encoded 12 XDR bytes
XDR data successfully sent
mcast_value() mcasting os_release value
encoded 12 XDR bytes
XDR data successfully sent
set_metric_value() exec'd cpu_user_func (10)
mcast_value() mcasting cpu_user value
encoded 8 XDR bytes
XDR data successfully sent
set_metric_value() exec'd cpu_nice_func (11)
mcast_value() mcasting cpu_nice value
encoded 8 XDR bytes
XDR data successfully sent
set_metric_value() exec'd cpu_system_func (12)
mcast_value() mcasting cpu_system value
encoded 8 XDR bytes
XDR data successfully sent
set_metric_value() exec'd cpu_idle_func (13)
* * * * Setting alpha to 0.000147 and beta to 0.999853 because timediff
* = 0
CPU: Just ran table(). Got: usr 6630 , nice 0 , sys 32021 , idle
18884810, 1024hz.
CPU:--before-------------------------------------------------------------
CPU cycles:
CPU: now: 6630 , 0, 32021, 18884810 old: 6626 , 0 , 32014 ,
18884803 diffs: 0, 0, 0, -1073217344
CPU:i is 0 : new - old = difference, delta 6630 - 6626 = 4,4
CPU:i is 1 : new - old = difference, delta 0 - 0 = 0,4
CPU:i is 2 : new - old = difference, delta 32021 - 32014 = 7,11
CPU:i is 3 : new - old = difference, delta 18884810 - 18884803 = 7,18
CPU:percentages - half_total is 9, total_change is 18
CPU:--after--------------------------------------------------------------
CPU cycles:
CPU: later: 6630 , 0, 32021, 18884810 old: 6630 , 0 , 32021 ,
18884810 diffs: 4, 0, 7, 7
CPU: ** ** ** ** ** Are percentages electric? Try user 222%, nice 0%
, sys 389% , idle 389%
....
and much more...
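
One possible reading of the "Are percentages electric?" line: the values
222, 0, 389 and 389 add up to 1000, so they may be tenths of a percent
being printed as whole percents. With the diffs shown (4, 0, 7, 7;
total_change 18, half_total 9), rounding (diff * 1000 + half_total) /
total_change gives exactly those numbers. A minimal sketch under that
assumption; I have not checked that osf.c computes it this way, and both
function names are hypothetical.

```c
#include <stdio.h>

/* Convert tick deltas into tenths of a percent, rounded to nearest
 * (222 means 22.2%). Assumed to mirror the calculation in the log. */
static void tenths_of_percent(int n, const long *diffs, long *out)
{
    long total = 0;
    long half;
    int i;

    for (i = 0; i < n; i++)
        total += diffs[i];
    if (total == 0)
        total = 1;              /* avoid dividing by zero on an idle sample */
    half = total / 2;           /* "half_total" in the debug output */

    for (i = 0; i < n; i++)
        out[i] = (diffs[i] * 1000 + half) / total;
}

/* Format a tenths-of-a-percent value as a percentage, e.g. 222 as "22.2%". */
static int format_tenths(char *buf, size_t len, long tenths)
{
    return snprintf(buf, len, "%ld.%ld%%", tenths / 10, tenths % 10);
}
```

Read that way, the sample above would be user 22.2%, nice 0.0%, sys 38.9%
and idle 38.9%, which is at least plausible for a four-state breakdown.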
--
Any statement of the form "X is the one, true Y" is FALSE.
PGP fingerprint 655D 519C 26A7 82E7 2529 9BF0 5D8E 8BE9 F238 1AD4