On Tue, Nov 25, 2003 at 01:53:51PM -0800, Brooks Davis wrote:
> On Tue, Nov 25, 2003 at 01:43:19PM -0800, steven wagner wrote:
> > I'm sorry to report that you should be getting metric data back on 
> > Tru64.  Sadly, I can't offer any developmental support here now because 
> > all our Alpha are belong to dumpster (although for the record, I am the 
> > one to blame for the monitoring core running on Tru64 to begin with... 
> > sorry about that!).
> > 
> > The metrics aren't being reported by the monitoring core.  Either 
> > something went wrong with the build (just because it compiled doesn't 
> > mean it really works ... ) or something is wrong at runtime.  To check 
> > runtime, run the monitoring core in debug mode and see what kind of data 
> > you get out of it.
> 
> I'm not familiar with the Tru64 code, but a number of metric
> implementations require that you be root to run them.  This means they
> fail in interesting ways with the default behavior of running as nobody.
> It might be worth a try to use the config file to cause gmond to run as
> root all the time and see if that fixes the problem.
> 
> -- Brooks

Running gmond as root (i.e. no_setuid on) does not seem to make a
difference (although if it did, I don't think running as root would be
such a great idea).

I did have a small problem compiling machine.c -> machines/osf.c.
There was a line break inside a string literal on line 163 (I believe).
Here is a trivial patch:

diff -ur ganglia-monitor-core-2.5.5-orig/gmond/machines/osf.c ganglia-monitor-core-2.5.5-mod/gmond/machines/osf.c
--- ganglia-monitor-core-2.5.5-orig/gmond/machines/osf.c    2002-09-06 15:06:15.000000000 -0400
+++ ganglia-monitor-core-2.5.5-mod/gmond/machines/osf.c 2003-11-24 15:07:14.000000000 -0500
@@ -163,8 +163,7 @@
     {
       alpha = 0.5 * (timediff / 30.0e7);
       beta = 1.0 - alpha;
-      debug_msg("* * * * Setting alpha to %f and beta to %f because timediff
-= %d",alpha,beta,timediff);
+      debug_msg("* * * * Setting alpha to %f and beta to %f because timediff = %d",alpha,beta,timediff);
     }
   else
     {
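
For anyone hitting the same compile error: C does not allow a raw line
break inside a string literal, but adjacent literals are joined by the
compiler, so a long format string can still be split across source lines.
A minimal sketch of that approach follows. It uses snprintf as a stand-in
for debug_msg (whose definition is not shown here), and the names ALPHA_FMT
and format_alpha_msg are hypothetical, not taken from osf.c.

```c
#include <stdio.h>
#include <stddef.h>

/* Stand-in for the format string passed to debug_msg(). Adjacent string
 * literals are concatenated at compile time, so this is a single string
 * with no embedded newline. */
static const char *ALPHA_FMT =
    "* * * * Setting alpha to %f and beta to %f "
    "because timediff = %d";

/* Format the message into buf. Returns what snprintf returns: the number
 * of characters that would have been written, excluding the NUL. */
int format_alpha_msg(char *buf, size_t len,
                     double alpha, double beta, int timediff)
{
    return snprintf(buf, len, ALPHA_FMT, alpha, beta, timediff);
}
```

Splitting the literal this way keeps source lines short without changing
the text that is printed.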


I will try to rebuild and watch more closely for other problems.

Running with -d2, here is a snippet of the output:

pthread_attr_init
creating cluster hash for 2 nodes
hash_create size = 2
hash->size is 3
gmond initialized cluster hash
Using multicast-enabled interface alt0
mcast listening on 239.2.11.71 8649
XML listening on port 8649
listening thread(s) have been started
mcast_listen_thread() started 26375680
mcast_listen_thread() started 15824384
listening thread(s) have been started
cleanup thread has been started
multicasting on channel 239.2.11.71 8649
created monitor thread
set_metric_value() exec'd cpu_num_func (1)
set_metric_value() exec'd cpu_speed_func (2)
set_metric_value() exec'd mem_total_func (3)
set_metric_value() exec'd swap_total_func (4)
set_metric_value() exec'd boottime_func (5)
set_metric_value() exec'd sys_clock_func (6)
set_metric_value() exec'd machine_type_func (7)
set_metric_value() exec'd os_name_func (8)
set_metric_value() exec'd os_release_func (9)
set_metric_value() exec'd gexec_func (25)
set_metric_value() exec'd heartbeat_func (26)
my start_time is 1069797762
                                                                                
set_metric_value() exec'd mtu_func (27)
set_metric_value() exec'd location_func (28)
my location is unspecified
mcast_value() mcasting cpu_num value
encoded 8 XDR bytes
XDR data successfully sent
mcast_value() mcasting cpu_speed value
encoded 8 XDR bytes
XDR data successfully sent
mcast_value() mcasting mem_total value
encoded 8 XDR bytes
XDR data successfully sent
mcast_value() mcasting swap_total value
encoded 8 XDR bytes
XDR data successfully sent
mcast_value() mcasting boottime value
encoded 8 XDR bytes
XDR data successfully sent
mcast_value() mcasting sys_clock value
encoded 8 XDR bytes
XDR data successfully sent
mcast_value() mcasting machine_type value
encoded 16 XDR bytes
XDR data successfully sent
mcast_value() mcasting os_name value
encoded 12 XDR bytes
XDR data successfully sent
mcast_value() mcasting os_release value
encoded 12 XDR bytes
XDR data successfully sent
set_metric_value() exec'd cpu_user_func (10)
mcast_value() mcasting cpu_user value
encoded 8 XDR bytes
XDR data successfully sent
set_metric_value() exec'd cpu_nice_func (11)
mcast_value() mcasting cpu_nice value
encoded 8 XDR bytes
XDR data successfully sent
set_metric_value() exec'd cpu_system_func (12)
mcast_value() mcasting cpu_system value
encoded 8 XDR bytes
XDR data successfully sent
set_metric_value() exec'd cpu_idle_func (13)
* * * * Setting alpha to 0.000147 and beta to 0.999853 because timediff = 0
CPU: Just ran table().  Got:  usr 6630 , nice 0 , sys 32021 , idle 18884810, 1024hz.
CPU:--before-------------------------------------------------------------
CPU cycles:
CPU:   now: 6630 , 0, 32021, 18884810   old:  6626 , 0 , 32014 , 18884803 diffs: 0, 0, 0, -1073217344
CPU:i is 0 : new - old = difference, delta  6630 - 6626 = 4,4
CPU:i is 1 : new - old = difference, delta  0 - 0 = 0,4
CPU:i is 2 : new - old = difference, delta  32021 - 32014 = 7,11
CPU:i is 3 : new - old = difference, delta  18884810 - 18884803 = 7,18
CPU:percentages - half_total is 9, total_change is 18
CPU:--after--------------------------------------------------------------
CPU cycles:
CPU:   later: 6630 , 0, 32021, 18884810   old:  6630 , 0 , 32021 , 18884810 diffs: 4, 0, 7, 7
CPU: ** ** ** ** ** Are percentages electric?  Try user 222%, nice 0% , sys 389% , idle 389%
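
The numbers in that last line are consistent with values scaled to tenths
of a percent rather than whole percent: with diffs of 4, 0, 7, 7 and a
total_change of 18, (4 * 1000 + 9) / 18 = 222 and (7 * 1000 + 9) / 18 = 389,
i.e. 22.2% and 38.9%. The sketch below reproduces that arithmetic. It is
modeled on the rounding used by the percentages() helper in the classic top
utility; that osf.c computes its values the same way is an assumption
inferred from the log, and the names scaled_percentages and print_scaled
are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: convert per-state tick differences into values
 * scaled by 1000 (so 222 means 22.2%). Adding half of the total before
 * the integer division rounds to the nearest unit. */
void scaled_percentages(int cnt, const long *diffs, int *out)
{
    long total_change = 0;
    long half_total;
    int i;

    for (i = 0; i < cnt; i++)
        total_change += diffs[i];

    /* Avoid dividing by zero when no ticks have elapsed. */
    if (total_change == 0)
        total_change = 1;

    half_total = total_change / 2;
    for (i = 0; i < cnt; i++)
        out[i] = (int)((diffs[i] * 1000 + half_total) / total_change);
}

/* Print a scaled value as a percentage with one decimal place. */
void print_scaled(const char *label, int scaled)
{
    printf("%s %d.%d%%\n", label, scaled / 10, scaled % 10);
}
```

If this reading is right, the computed values may be fine and only the
debug message labels them with the wrong unit.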

....

and much more...
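
One more inconsistency in the output above: alpha is printed as 0.000147
while timediff is printed as 0. With alpha = 0.5 * (timediff / 30.0e7), as
in the patch context, a timediff of 0 would give an alpha of 0, so the two
printed values cannot both be right. Inverting the formula puts timediff at
roughly 88200. One possible explanation, not verified against the
declaration in osf.c, is that timediff is not an int and the %d conversion
therefore prints the wrong value. The helper names below are hypothetical.

```c
/* Forward formula, as it appears in the patch context. */
double alpha_from_timediff(double timediff)
{
    return 0.5 * (timediff / 30.0e7);
}

/* Inverse of the formula above: recover the timediff that a printed
 * alpha value implies. */
double implied_timediff(double alpha)
{
    return alpha * 2.0 * 30.0e7;
}
```

Calling implied_timediff(0.000147) gives a value close to 88200, subject to
the rounding of the six decimal places printed for alpha.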

> 
> -- 
> Any statement of the form "X is the one, true Y" is FALSE.
> PGP fingerprint 655D 519C 26A7 82E7 2529  9BF0 5D8E 8BE9 F238 1AD4



-- 
Steve Feehan
