Dear Ben,

What I meant is that the server's entry in the hosts file on the clients should match the server's hostname exactly. According to the OSCAR FAQ:

<quote>
PBS is VERY sensitive to the contents of the /etc/hosts file. /etc/hosts must have whatever name the hostname command returns on the cluster network interface.
</quote>
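To make the FAQ's requirement concrete, here is a rough Python sketch of the comparison, not an OSCAR or PBS tool. The function names names_in_hosts and hostname_is_listed are made up for illustration, and the sample text is the hosts file and hostname output quoted in this thread. It only tests whether the exact hostname string appears as a name in the hosts text; it does not check which network interface the address belongs to.

```python
def names_in_hosts(hosts_text):
    """Return a dict mapping every name found in hosts-file text to its IP."""
    names = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue  # skip blank or comment-only lines
        fields = line.split()
        ip, aliases = fields[0], fields[1:]
        for name in aliases:
            names.setdefault(name, ip)  # keep the first entry for a name
    return names


def hostname_is_listed(hostname, hosts_text):
    """True if the exact hostname string appears as a name in the hosts text."""
    return hostname in names_in_hosts(hosts_text)


# Sample data copied from the messages quoted in this thread.
BEN_HOSTS = """\
127.0.0.1 localhost localhost.localdomain
192.168.2.1 control.cfdlab.lsu.edu control lui_oscar nfs_oscar pbs_oscar
192.168.2.2 cfd2.cfdlab.lsu.edu cfd2
"""

print(hostname_is_listed("control.cfdlab.lsu.edu", BEN_HOSTS))  # True
print(names_in_hosts(BEN_HOSTS)["control.cfdlab.lsu.edu"])      # 192.168.2.1
```

On a real node, the same functions could be fed the contents of /etc/hosts and the string printed by the hostname command; running that on the server and on each client would show whether the names agree.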
By the way, I think MPI and PBS are separate components, which is probably why it works in some cases. As for LAM/MPI, that could be another story.

Cheers,
MJ

on 11/19/2001 3:06 PM, Zaiyong Sun at [EMAIL PROTECTED] wrote:

> Meng-Juei Hsieh wrote:
>
> [ben@control ben]$ more /etc/hosts
> 127.0.0.1 localhost localhost.localdomain
> 192.168.2.1 control.cfdlab.lsu.edu control lui_oscar nfs_oscar pbs_oscar
> 192.168.2.2 cfd2.cfdlab.lsu.edu cfd2
> [ben@control ben]$
>
>> the hosts file on the nodes.
>> <skip>
>> 10.0.0.250 server lui_oscar nfs_oscar pbs_oscar
>> <skip>
>>
>> [mjhsieh@server ~]$ hostname
>> server
>
> [ben@control ben]$ hostname
> control.cfdlab.lsu.edu
> [ben@control ben]$
>
> Recently, our cluster has not worked very well with the LAM_MPI system.
> The problem is that our jobs are being held, with the job state shown as
> sleeping, after running sometimes several thousand and sometimes several
> hundred time steps. But someone who uses MPICH with PBS says everything
> is okay. I do not know where the problem is or how to work it out.
>
> Best,
> Ben

_______________________________________________
Oscar-users mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/oscar-users
