I think I see the problem.

I don't want Ganglia to report on the nodes running pbs_server. I only 
care about the compute nodes. Having non-compute nodes in Ganglia 
skews the usage statistics.


Since Job Monarch must piggyback on an existing host entry, I must use 
BATCH_HOST_TRANSLATE to map localhost.localdomain to one of my compute 
nodes. Correct?


<HOST NAME="localhost.localdomain" IP="127.0.0.1" REPORTED="1216758634"
      TN="2" TMAX="20" DMAX="0" LOCATION="unspecified" GMOND_STARTED="0">
  <METRIC NAME="MONARCH-QJ" VAL="0" TYPE="uint32" UNITS="jobs" TN="0"
          TMAX="60" DMAX="60" SLOPE="both" SOURCE="gmetric"/>
  <METRIC NAME="MONARCH-RJ" VAL="1" TYPE="uint32" UNITS="jobs" TN="0"
          TMAX="60" DMAX="60" SLOPE="both" SOURCE="gmetric"/>
  <METRIC NAME="MONARCH-HEARTBEAT" VAL="1216758634" TYPE="string" UNITS=""
          TN="0" TMAX="60" DMAX="60" SLOPE="both" SOURCE="gmetric"/>
  <METRIC NAME="MONARCH-JOB-3373-0"
          VAL="status=R start_timestamp=1216742945 name=mem_muncher poll_interval=30 domain=wdi-net.com queue=main reported=1216758634 queued_timestamp=1216742945 owner=weather nodes=ictc01n08.wdi-net.com"
          TYPE="string" UNITS="" TN="0" TMAX="60" DMAX="60"
          SLOPE="both" SOURCE="gmetric"/>
</HOST>
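
For anyone who wants to repeat this check without paging through the dump by hand, here is a rough Python sketch that lists which HOST entries carry MONARCH metrics. It is only an illustration, not part of Job Monarch or Ganglia: the function names are made up, and fetch_xml assumes gmetad is reachable on localhost at xml_port 8651 (the usual default), so adjust both for your setup.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_xml(host="localhost", port=8651, timeout=10.0):
    """Read the XML dump gmetad writes on its xml_port.

    host and port are assumptions; use the values from your gmetad.conf.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # gmetad closes the connection after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def hosts_with_monarch_metrics(xml_data):
    """Map each HOST NAME to the MONARCH-* metric names found under it."""
    root = ET.fromstring(xml_data)
    found = {}
    for host in root.iter("HOST"):
        names = [
            metric.get("NAME")
            for metric in host.iter("METRIC")
            if metric.get("NAME", "").startswith("MONARCH")
        ]
        if names:
            found[host.get("NAME")] = names
    return found
```

Calling hosts_with_monarch_metrics(fetch_xml()) on the gmetad server should return a dictionary keyed by whichever host name the Job Monarch metrics are attached to, which in my case would be localhost.localdomain.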


Thanks


Daniel Bourque
Sr. Systems Engineer
WeatherData Service Inc
An Accuweather Company

Office (316) 266-8013
Office (316) 265-9127 ext. 3013
Mobile (316) 640-1024



Seth Graham wrote:
> Daniel Bourque wrote:
>> Hi,
>>
>>     my setup is as follows: 2 PBS head nodes running torque, moab, 
>> ganglia and a group of compute nodes running pbs_mom. Ganglia's gmond 
>> is running on the head nodes in mute mode.
>>
>> I'm trying to get rid of the "localhost.localdomain" node that now 
>> shows up in ganglia because job monarch reports as 
>> localhost.localdomain.
>>   
>
> I'm not sure what this means, because jobmonarch doesn't report "as" 
> anything; instead, it adds metrics to an existing host's entry in the 
> XML tree (specifically, your pbs server). If you telnet to your 
> xml_port on the gmetad server and dump to a file, and search for 
> 'MONARCH', you'll see that everything is included inside a pair of 
> <HOST> tags. The only place the hostname is set is within that HOST tag.
>
> It seems to me your pbs server is confused about its own hostname. 
> Either a bad entry in /etc/hosts, or assuming a redhat system, 
> something in /etc/sysconfig is setting the machine's name to 
> localhost.localdomain (which is a default).
>
>

_______________________________________________
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general
