It almost sounds like those nodes are going into a sleep cycle.  My
impression is that Ganglia relies on a heartbeat sent by gmond on each
client node, so if the nodes dropped into a power-save state they would
stop reporting and probably show as down.  An outside query (a PBS job,
for example) would probably wake them up, though, which would fit with
jobs still running fine on them.
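
If it helps narrow things down, here is a rough Python sketch for seeing
which hosts have gone quiet.  It assumes gmond is answering on its default
TCP port (8649) and that the XML it returns carries TN (seconds since the
host was last heard from) and TMAX (expected reporting interval) on each
HOST element.  The "4 x TMAX" cutoff is my recollection of what the web
front end uses to mark a host down, so treat it as an assumption; the
function names are just made up for this example.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host="localhost", port=8649, timeout=5.0):
    """Read the XML dump gmond sends to a client that connects to its TCP port."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # gmond closes the connection once the dump is complete
                break
            chunks.append(data)
    return b"".join(chunks)


def stale_hosts(xml_data, factor=4):
    """Return (name, tn, tmax) for each host with TN greater than factor * TMAX.

    The factor of 4 is an assumed threshold, not something confirmed
    against this cluster's Ganglia version.
    """
    root = ET.fromstring(xml_data)
    stale = []
    for host in root.iter("HOST"):
        tn = int(host.get("TN", "0"))
        tmax = int(host.get("TMAX", "0"))
        if tmax and tn > factor * tmax:
            stale.append((host.get("NAME"), tn, tmax))
    return stale
```

Running stale_hosts(fetch_gmond_xml()) on the head node shortly after a
gmond restart, and again an hour or two later, should show whether TN on
nodes 121-140 just keeps climbing, which is what I would expect if their
heartbeats simply stop.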

It might be worth checking whether the BIOS settings on those nodes match
the others (or whether there is a related BIOS bug).  That could also
explain why only some of the nodes are doing it, especially if the nodes
were purchased in phases.

Just a thought, probably a silly one.


On Wed, 19 Jan 2005 07:53:23 -0800, Bernard Li <[EMAIL PROTECTED]> wrote:
> Hey David:
> 
> Are the downed nodes always the same or are they sort of random?
> 
> Can you check /var/log/messages on the nodes and see if there are any
> clues to why Ganglia is reporting them as down?
> 
> Cheers,
> 
> Bernard
> 
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of
> > Dr. David F. Robinson
> > Sent: Saturday, January 15, 2005 7:04
> > To: oscar-users@lists.sourceforge.net;
> > oscar-devel@lists.sourceforge.net
> > Subject: [Oscar-users] ganglia
> >
> >
> > Ganglia is reporting nodes 121-140 of my 140 node system as
> > down.  If I do a
> >
> > cexec '/etc/init.d/gmond restart' all of the nodes show up as
> > available.
> > However, after an hour or two these nodes go back to a 'down' state.
> >
> > They do not show up under a 'pbsnodes -l' command and they
> > are working fine.
> > I can submit and run jobs on these nodes.
> >
> > Any suggestions?
> >
> > Thanks in advance,
> >
> > David
> >
> > -------------------------------------------------------
> > The SF.Net email is sponsored by: Beat the post-holiday blues
> > Get a FREE limited edition SourceForge.net t-shirt from ThinkGeek.
> > It's fun and FREE -- well, almost....http://www.thinkgeek.com/sfshirt
> > _______________________________________________
> > Oscar-users mailing list
> > Oscar-users@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/oscar-users
> >
> 


