Good news, I found the problem. Even though "security_mode" is set to "none" in the "bootstrap" file, qmaster wants more than plaintext communication. My guess is that without a certificate it cannot initialize the SSL components, which means qmaster expects neither SSL nor plaintext. Maybe "Munge" communication? (If this sounds silly to people who know Munge, please go ahead and add some more documentation about "SGE and Munge"; I have no idea about it yet.)
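For anyone who wants to double-check what their own installation declares, here is a minimal sketch that reads the "security_mode" entry from a bootstrap file. It rests on assumptions: that the file lives at $SGE_ROOT/$SGE_CELL/common/bootstrap, that it consists of whitespace-separated "key value" lines with "#" comments, and that the cell is named "default" when SGE_CELL is unset. The function names read_bootstrap and read_security_mode are made up for illustration and are not part of SGE.

```python
import os


def read_bootstrap(path):
    """Parse 'key value' lines into a dict, skipping blanks and '#' comments.

    The file layout is an assumption about the bootstrap format.
    """
    settings = {}
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)  # split on any whitespace, once
            settings[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return settings


def read_security_mode(path):
    """Return the security_mode value, or None if the key is absent."""
    return read_bootstrap(path).get("security_mode")


if __name__ == "__main__":
    root = os.environ.get("SGE_ROOT")
    cell = os.environ.get("SGE_CELL", "default")  # assumed default cell name
    if root is None:
        print("SGE_ROOT is not set; cannot locate the bootstrap file")
    else:
        path = os.path.join(root, cell, "common", "bootstrap")
        if os.path.exists(path):
            print(path, "->", read_security_mode(path))
        else:
            print("no bootstrap file at", path)
```

This only reports what the file says. As described above, the declared value and what qmaster actually accepts may differ, so it is a sanity check rather than a diagnosis.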
After generating the sgeCA infrastructure and setting "security_mode" to "csp", the commlib error disappears and communication with qmaster works.

Unfortunately I am not deep enough into the code to send a patch right now. It will take some time. If somebody already knows how to fix it, please tell us.

Greetings ...
Marco

On 02.11.2017 15:09, Marco Schmidt wrote:
> Same behavior on debian 8 (jessie).
>
> Seems I have to go deeper into the code for debugging.
>
> Greetings ...
> Marco
>
>
> On 01.11.2017 15:46, Marco Schmidt wrote:
>> Hello,
>>
>> I have been trying for several days now to make SGE run on Debian 9 (stretch)
>> using the source from the darcs repository.
>>
>> It was really easy to build the debian packages and install them.
>>
>> The master is running (it responds to qping), but any other command
>> (qconf, qstat) fails with a "comm error".
>>
>> Yes, another of these "comm errors". I am quite experienced with these,
>> because I run the Gridengine in many versions and this is the most
>> common error I get. Usually it's because of wrong entries in /etc/hosts.
>> And until now, I always found a solution.
>>
>> From the client (on same host as qmaster):
>> # qconf -sql
>> error: unable to contact qmaster using port 6444 on host "xxxx.fqdn"
>>
>> In the qmaster log:
>> 11/01/2017 15:36:27|listen|xxxx|E|commlib error: got read error (closing
>> "xxxx.fqdn/qconf/15")
>> 11/01/2017 15:36:27|listen|xxxx|E|commlib error: got select error
>> (closing "xxxx.fqdn/qconf/17")
>>
>> It seems that they communicate, but not successfully.
>>
>> security is set to "none".
>>
>> Has anybody any idea?
>> Any idea how to debug this?
>> Somebody who runs it on debian 9 (stretch)?
>>
>> I built the packages on Ubuntu Xenial (16.04), installed a new qmaster
>> with them and get the same error.
>> (I will try to send some patches, because it was not as straightforward
>> as with debian stretch).
>>
>> Currently I am trying to build the packages for debian 8 (jessie) to see if it
>> works there.
>>
>> Greetings ...
>> Marco
>>
>>
>>
_______________________________________________
SGE-discuss mailing list
https://arc.liv.ac.uk/mailman/listinfo/sge-discuss
