Dear Ben,

I meant that the server's entry in the hosts file on the clients should match
the server's hostname exactly. According to the OSCAR FAQ,
<quote>
PBS is VERY sensitive to the contents of the /etc/hosts file. /etc/hosts
must have whatever name the hostname command returns on the cluster network
interface.
</quote>
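
A quick way to check this on a node might look like the sketch below. It is
only an illustration, not part of OSCAR or PBS: the function names
(addresses_for, check_local) are made up here, and it assumes an exact,
case-sensitive match against the names listed on each hosts line.

```python
import socket


def addresses_for(name, hosts_text):
    """Return the IP addresses whose hosts-file line lists `name`.

    Assumption: names are compared exactly (case-sensitive), which is
    stricter than real resolver behaviour.
    """
    matches = []
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        if name in names:
            matches.append(ip)
    return matches


def check_local(path="/etc/hosts"):
    """Look up this machine's own hostname in its hosts file."""
    name = socket.gethostname()
    with open(path) as f:
        return name, addresses_for(name, f.read())
```

Calling check_local() on the server and on each client should return the
cluster-network address; an empty list means the name that hostname prints is
missing from that machine's /etc/hosts.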

By the way, I think MPI and PBS are separate components, which is probably
why it works in some cases. LAM/MPI could be another story.

Cheers
MJ


on 11/19/2001 3:06 PM, Zaiyong Sun at [EMAIL PROTECTED] wrote:
> Meng-Juei Hsieh wrote:
> [ben@control ben]$ more /etc/hosts
> 127.0.0.1       localhost       localhost.localdomain
> 192.168.2.1     control.cfdlab.lsu.edu  control lui_oscar nfs_oscar pbs_oscar
> 192.168.2.2     cfd2.cfdlab.lsu.edu     cfd2
> [ben@control ben]$
> 
>> the hosts file on the nodes.
>> <skip>
>> 10.0.0.250      server lui_oscar nfs_oscar pbs_oscar
>> <skip>
>> 
>> [mjhsieh@server ~]$ hostname
>> server
> 
> [ben@control ben]$ hostname
> control.cfdlab.lsu.edu
> [ben@control ben]$
> 
> Recently, our cluster has not worked very well with the LAM/MPI system. The
> problem is that our jobs are being held (the job state shows as sleeping)
> after running for sometimes several thousand and sometimes several hundred
> time steps. But someone who uses MPICH with PBS claims everything is okay.
> I do not know where the problem is or how to work it out.
> 
> Best,
> Ben


_______________________________________________
Oscar-users mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/oscar-users
