Hello all,
We have a rather old installation of SGE that has been running for years
without any problems. For the last 2-3 weeks I've been seeing an
odd problem: when issuing any command (qsub, qstat, qping, etc.) I get
the following error:
error: commlib error: access denied (server host resolves destination host
"<server address>" as "(HOST_NOT_RESOLVABLE)")
error: unable to contact qmaster using port 6444 on host "<server address>"
There are several odd things about this:
* Nothing has changed on the server or the clients in the months
before the error started appearing.
* This happens from most of the clients, but not all.
* The error persists for 5-10 minutes, and then everything works fine.
* Both gethostbyname and gethostbyaddr return the correct values from
the client while the error occurs (I haven't had a chance to try
them from the master during these episodes).
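
To run the same check from the master during an episode, something like the following could be left running there. This is only a rough sketch, assuming Python 3 is available on the master. It goes through the system resolver via Python's socket module, so it is not the SGE utility and may not match what commlib itself sees. The function name check_resolution and the hosts passed on the command line are placeholders of mine.

```python
import socket
import sys
import time


def check_resolution(name, forward=socket.gethostbyname,
                     reverse=socket.gethostbyaddr):
    """Resolve name -> address -> name and record where the round trip fails.

    The forward/reverse parameters exist only so the lookups can be
    swapped out (e.g. for testing); by default the system resolver is used.
    """
    result = {"name": name, "address": None, "reverse_name": None, "error": None}
    try:
        result["address"] = forward(name)
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
        result["reverse_name"] = reverse(result["address"])[0]
    except OSError as exc:
        # socket.gaierror and socket.herror are both subclasses of OSError
        result["error"] = str(exc)
    return result


if __name__ == "__main__":
    # Hypothetical usage: python3 check_resolution.py client1 client2 ...
    # Prints one timestamped line per host every 30 seconds until interrupted.
    hosts = sys.argv[1:]
    while hosts:
        for host in hosts:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(stamp, check_resolution(host))
        time.sleep(30)
```

If the reverse name comes back empty or different from the client name only while qsub/qstat are failing, that would point at reverse lookup on the master side rather than at the clients.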
I have a feeling that this has something to do with DNS and reverse
lookups, but I don't know where to start debugging it.
Does anyone have any idea what I should look at?
Thanks,
--
Valerio Luccio (212) 998-8736
Center for Brain Imaging 4 Washington Place, Room 157
New York University New York, NY 10003
"In an open world, who needs windows or gates ?"