Hello,

Several things are wrong with your configuration.
First of all, as the blog mentioned, Grid Engine is very sensitive to proper host names. You must have the host names set up correctly, otherwise things will not work. With that in mind, the following things are wrong in the hosts file below:

1) You are using the loopback address for both ‘localhost’ and ‘ubuntu-frontend’. Don’t do this. Keep the loopback entry and remove the ‘ubuntu-frontend’ entry that points at a loopback address (the line ‘127.0.1.1 ubuntu-frontend’).

1b) Almost forgot - you also added ‘ubuntu-frontend’ as an alias to loopback on the line ‘127.0.0.1 localhost ubuntu-frontend’. Remove ‘ubuntu-frontend’ from that line and keep ‘localhost’.

2) Keep the entry ‘10.10.1.1 ubuntu-frontend’; that one is OK.

3) I am not sure whether you have multiple network interfaces on your qmaster. If you do, you will need an SGE host aliases file; read ‘man 5 host_aliases’ in the SGE man pages.

4) Once the /etc/hosts file is set up properly, you have to make sure that ‘hostname’ also returns the proper name for the machine. On Ubuntu you have to edit /etc/hostname as well and make sure that it contains the name of the frontend machine, which is ‘ubuntu-frontend’. You will likely have to restart services or reboot the machine for the hostname change to take effect.

Finally, when you run qhost, the qmaster is just giving you status information on the nodes in the cluster. In this case ubuntu-node1 is configured in Grid Engine, but the execution daemon is not running on the host ‘ubuntu-node1’; that is why you get the dashes ‘- - -’. You may be able to remedy this easily by just installing the execd on ubuntu-node1, or it may be more complex if you do not have a shared filesystem (i.e. NFS) between the frontend node and ubuntu-node1.

Anyhow, fixing the frontend master name resolution is the first thing to worry about.

Regards,
Bill.
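
The hosts-file checks in points 1) and 1b) can be sketched as a small script. This is only an illustration, not part of Grid Engine: the function name loopback_conflicts and the SAMPLE text are made up for this example, and it assumes Python 3 is available on the frontend. It reports every cluster host name that a hosts file maps to a loopback address (anything in 127.0.0.0/8, or ::1).

```python
import ipaddress


def loopback_conflicts(hosts_text, cluster_names):
    """Return (line_number, address, name) for each cluster name bound to loopback.

    hosts_text    -- contents of an /etc/hosts style file
    cluster_names -- set of host names that must resolve to a real interface
    """
    conflicts = []
    for lineno, raw in enumerate(hosts_text.splitlines(), start=1):
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            # Not a parseable IP address in the first column; skip the line.
            continue
        if not addr.is_loopback:
            continue
        for name in fields[1:]:
            if name in cluster_names:
                conflicts.append((lineno, str(addr), name))
    return conflicts


# Hypothetical sample: the first lines of the hosts file quoted in this thread.
SAMPLE = """\
127.0.0.1 localhost ubuntu-frontend
127.0.1.1 ubuntu-frontend
10.10.1.1 ubuntu-frontend
10.10.1.2 ubuntu-node1
"""

if __name__ == "__main__":
    names = {"ubuntu-frontend", "ubuntu-node1"}
    for lineno, addr, name in loopback_conflicts(SAMPLE, names):
        print(f"line {lineno}: {name} is mapped to loopback address {addr}")
```

To check a real machine, read the text from /etc/hosts instead of SAMPLE; an empty result means no cluster name is bound to loopback. It does not verify point 4), so ‘hostname’ still has to be checked separately.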
> On Aug 28, 2015, at 5:04 PM, Dimar Jaime González Soto <[email protected]> wrote:
>
> Hi every one I have configurated a network following the steps of this site :
> http://verahill.blogspot.cl/2012/06/setting-up-sun-grid-engine-with-three.html
>
> The issue is when I got "localhost" like host name when I run qhost:
>
> cbuach@ubuntu-frontend:~$ qhost
> HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
> -------------------------------------------------------------------------------
> global                  -               -     -       -       -       -       -
> localhost               lx26-amd64     16     -   31.4G       -   48.4G       -
> ubuntu-node1            -               -     -       -       -       -       -
>
> Plus like you see it doesn't have comunication with node "ubuntu-node1".
>
> My host file in ubuntu-frontend is:
> 127.0.0.1 localhost ubuntu-frontend
> 127.0.1.1 ubuntu-frontend
> 10.10.1.1 ubuntu-frontend
> 10.10.1.2 ubuntu-node1
> 10.10.1.3 ubuntu-node2
> 10.10.1.4 ubuntu-node3
> 10.10.1.5 ubuntu-node4
> 10.10.1.6 ubuntu-node5
> 10.10.1.7 ubuntu-node6
> 10.10.1.8 ubuntu-node7
> 10.10.1.9 ubuntu-node8
> 10.10.1.10 ubuntu-node9
> 10.10.1.11 ubuntu-node10
>
> In ubuntu-node1 the hostfile is:
>
> 127.0.0.1 localhost ubuntu-node1
> 127.0.1.1 ubuntu-node1
> 10.10.1.1 ubuntu-frontend
>
> I think that are issues related to the hosts files. Any help would be great.
>
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users

William Bryce | VP Products
Univa Corporation, Toronto
E: [email protected] | D: 647-9742841 | Toll-Free (800) 370-5320
W: Univa.com <http://univa.com/> | FB: facebook.com/univa.corporation <http://facebook.com/univa.corporation> | T: twitter.com/Grid_Engine <http://twitter.com/Grid_Engine>
