What was the name of the ENV variable? How was it being used by qsub and/or the job script?
On Thu, Oct 29, 2015 at 03:02:22PM -0400, Wagner, Justin wrote:
For anybody who is interested I found the root cause of this crash of qsub. The root cause is that we had an environment variable whose key was blank that was an artifact of another bug, and this environment variable key causes qsub to crash every single time. Hopefully somebody is familiar enough with the qsub code to look at why that might cause a crash. If not, I can cook up a simple script to show the problem.

Justin

From: [email protected] [mailto:[email protected]] On Behalf Of Wagner, Justin
Sent: Tuesday, September 22, 2015 10:02 AM
To: [email protected]
Subject: [gridengine users] Possible Causes of: critical error: unrecoverable error - contact systems manager

I am running SoGE 8.1.0 and recently I had a problem when submitting a job to the grid via qsub, and qsub returned the error "critical error: unrecoverable error - contact systems manager"

I am trying to narrow down the root cause of this issue. I am able to send the same exact command, from the same exact user, on the same exact submit host, and get the command to work. However, I am using a script that is getting executed by Jenkins to launch the job, and I am also able to reliably reproduce the error when I use the "rebuild" plugin to rebuild the same build. I am suspecting that some environment variable is different between these two cases, and is causing this critical error, however I haven't been able to identify any differences there as of yet.

Can somebody point me to the source that is throwing this error, or possibly give me a list of what the possible causes are for this error?

Thanks,
Justin

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
--
Jesse Becker (Contractor)
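
For anyone who wants to check whether their own submit environment contains an entry like the one described above (a variable whose key is blank), here is a minimal sketch in Python. It is not taken from the thread and says nothing about why qsub fails; it only looks for raw KEY=VALUE entries where nothing precedes the first '='. The function names blank_key_entries and read_proc_environ are hypothetical, and reading /proc/<pid>/environ assumes a Linux submit host.

```python
import os
from typing import List


def blank_key_entries(raw_entries: List[bytes]) -> List[bytes]:
    """Return raw KEY=VALUE entries whose key (text before the first '=') is empty."""
    suspicious = []
    for entry in raw_entries:
        key, sep, _value = entry.partition(b"=")
        # A well-formed entry has a non-empty key followed by '='.
        # An entry such as b"=something" has a separator but no key.
        if sep and key == b"":
            suspicious.append(entry)
    return suspicious


def read_proc_environ(pid: str = "self") -> List[bytes]:
    """Read the NUL-separated environment block of a process.

    Assumption: Linux, where /proc/<pid>/environ holds the environment
    the process was started with.
    """
    with open(os.path.join("/proc", pid, "environ"), "rb") as fh:
        return [e for e in fh.read().split(b"\0") if e]


if __name__ == "__main__":
    for entry in blank_key_entries(read_proc_environ()):
        print("blank-key entry:", entry)
```

Running this at the top of the script that Jenkins executes, just before the qsub call, and again from an interactive shell on the same submit host, would show whether a blank-key entry is present in one case and not the other.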
