Torque was really easy to install, but I suspect my /etc/hosts file is misconfigured, because I can't get the cluster nodes to respond. The cluster has 3 machines, and each one has the following /etc/hosts file:

        127.0.0.1       localhost.localdomain   localhost
        199.17.152.17   runner
        199.17.152.135  muscovey
        199.17.152.13   pekin
        (( other workstations follow ))
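One way to rule out the hosts file itself is to parse it and confirm that every cluster node is listed and that none of them resolves to a loopback address. The following is only a minimal sketch: the helper names (`parse_hosts`, `loopback_names`) are made up for illustration, and the sample text is the fragment quoted above rather than a live /etc/hosts.

```python
import ipaddress


def parse_hosts(text):
    """Return a list of (ip, [names]) pairs from /etc/hosts-style text."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        fields = line.split()
        try:
            ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # not an address line, e.g. a placeholder note
        entries.append((fields[0], fields[1:]))
    return entries


def loopback_names(entries):
    """Names that map to a loopback address such as 127.0.0.1."""
    names = set()
    for ip, hosts in entries:
        if ipaddress.ip_address(ip).is_loopback:
            names.update(hosts)
    return names


# Sample input copied from the hosts file shown above.
HOSTS = """\
127.0.0.1       localhost.localdomain   localhost
199.17.152.17   runner
199.17.152.135  muscovey
199.17.152.13   pekin
(( other workstations follow ))
"""

entries = parse_hosts(HOSTS)
cluster = {"runner", "muscovey", "pekin"}

# Cluster names that resolve to loopback (other hosts could not reach them).
on_loopback = cluster & loopback_names(entries)
# Cluster names that do not appear in the file at all.
missing = cluster - {name for _, hosts in entries for name in hosts}

print("on loopback:", sorted(on_loopback))
print("missing:", sorted(missing))
```

If both lists come back empty, as they should for the fragment above, the hosts file is at least internally consistent and the problem may lie elsewhere.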

Now, when I have the pbs_server running on runner, and the pbs_mom daemons running on muscovey, pekin, and runner, I get the following status message:

        [EMAIL PROTECTED] torque-2.1.6]# pbsnodes -a
        pekin
             state = down
             np = 1
             ntype = cluster

        muscovey
             state = down
             np = 1
             ntype = cluster

        runner
             state = down       
             np = 1
             ntype = cluster
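For a cluster with more than a handful of nodes, listings like this are easier to check with a script. The sketch below assumes the simple layout shown above (a node name on its own line, followed by indented `key = value` lines, with blank lines between nodes); `parse_pbsnodes` is a hypothetical helper, not part of Torque.

```python
def parse_pbsnodes(text):
    """Parse `pbsnodes -a`-style text into {node: {attribute: value}}."""
    nodes = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            current = None  # a blank line ends the current node block
            continue
        if current is not None and "=" in line:
            key, value = line.split("=", 1)
            nodes[current][key.strip()] = value.strip()
        else:
            current = line  # a bare line starts a new node block
            nodes[current] = {}
    return nodes


# Sample input copied from the listing above.
SAMPLE = """\
pekin
     state = down
     np = 1
     ntype = cluster

muscovey
     state = down
     np = 1
     ntype = cluster

runner
     state = down
     np = 1
     ntype = cluster
"""

nodes = parse_pbsnodes(SAMPLE)

# The state field is treated as a comma-separated list of flags.
down = sorted(
    name for name, attrs in nodes.items()
    if "down" in attrs.get("state", "").split(",")
)
print("down nodes:", down)
```

Here all three nodes are reported down, including runner, which is the machine running pbs_server itself.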

I realize this is a pretty low-level question, but what the heck is wrong with my /etc/hosts file?

regards,

NT


PS: the troubleshooting output given by Torque is:

        [EMAIL PROTECTED] torque-2.1.6]# momctl -d 3

        Host: runner/runner   Version: 2.1.6
        WARNING:  server not specified (set $pbsserver)
        PID:                    30531
        HomeDirectory:          /var/spool/torque/mom_priv
        MOM active:             2518 seconds
        Server Update Interval: 45 seconds
        LOGLEVEL:               0 (use SIGUSR1/SIGUSR2 to adjust)
        Communication Model:    RPP
        TCP Timeout:            20 seconds
        NOTE:  no prolog configured
        Alarm Time:             0 of 10 seconds
        Trusted Client List:    199.17.152.17,127.0.0.1
        Configured to use /usr/bin/scp -rpB
        NOTE:  no local jobs detected

        diagnostics complete
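The diagnostics above mix routine settings with a WARNING and a couple of NOTE lines. A small filter makes those stand out; this is a rough sketch that assumes flagged lines begin with `WARNING:` or `NOTE:` as in the output above, and `momctl_flags` is an illustrative name only.

```python
import re


def momctl_flags(text):
    """Return (level, message) pairs for WARNING/NOTE lines in momctl output."""
    flags = []
    for line in text.splitlines():
        match = re.match(r"\s*(WARNING|NOTE):\s*(.*)", line)
        if match:
            flags.append((match.group(1), match.group(2).strip()))
    return flags


# Sample input copied from the momctl -d 3 output above (abridged).
OUTPUT = """\
Host: runner/runner   Version: 2.1.6
WARNING:  server not specified (set $pbsserver)
PID:                    30531
NOTE:  no prolog configured
Trusted Client List:    199.17.152.17,127.0.0.1
NOTE:  no local jobs detected
"""

flags = momctl_flags(OUTPUT)
for level, message in flags:
    print(f"{level}: {message}")
```

The only WARNING it surfaces is the one about `$pbsserver` not being set, which may be worth checking alongside the hosts file.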



- - - - - - - - - - - - - - - - - - - - - - -

Nathan Moore
Physics
Winona State University
[EMAIL PROTECTED]
AIM:nmoorewsu

- - - - - - - - - - - - - - - - - - - - - - -


On Jan 2, 2007, at 7:23 PM, Chris Samuel wrote:

On Wednesday 03 January 2007 08:06, Chris Dagdigian wrote:

Both should be fine although if you are considering *PBS you should
look at both Torque (a fork of OpenPBS I think)

That's correct, it (and ANU-PBS, another fork) seem to be the de facto queuing
systems in the state and national HPC centers down here.

Torque is just *so* much better than OpenPBS used to be (not that it was
particularly hard).

cheers,
Chris
--
 Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager
 Victorian Partnership for Advanced Computing http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
