The name of the ENV variable is blank, i.e. the key is an empty string.  I 
know that seems a bit strange, so I'll give more detail.

The culprit generating this blank ENV variable key name is the 
Shell::EnvImporter Perl module, and it can easily be worked around.

Basically, when the value of a key in the environment is multiline, the module 
generates an additional environment key for each additional line, using 
whatever is on that line as the key, even if the line is blank (this is the 
root problem).  The resulting empty key in the environment was then causing 
qsub to crash.

For example:
use strict;
use warnings;
use Shell::EnvImporter;

# Delete all environment variables
delete @ENV{keys %ENV};
# Multiline value that ends with a blank line
$ENV{TEST_VAR} = "Line1\nLine2\n\n";
my $runner = Shell::EnvImporter->new(
    shell       => "some_shell",
    file        => "/some/nonrelevant/.sourcedfile",
    auto_run    => 1,
    auto_import => 1,
) or die $@;
# Print every key/value pair now present in the environment
foreach my $key (sort keys %ENV) {
    print "[$key]:[$ENV{$key}]\n";
}

This results in the following output:

[]:[]
[Line2]:[]
[TEST_VAR]:[Line1]

That first line, "[]:[]", is the evidence of the blank environment variable 
key name that I'm talking about.

The expected result is:
[TEST_VAR]:[Line1\nLine2\n\n]
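
To illustrate how a multiline value could turn into extra keys, here is a minimal sketch. It assumes the environment is read back as "KEY=value" text and parsed one line at a time; that is my guess at the failure mode, not the actual Shell::EnvImporter code. The helper parse_env_lines is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: parse "KEY=value" text naively, one line at a time.
# ASSUMPTION: this mimics the suspected failure mode; it is not the
# Shell::EnvImporter implementation.
sub parse_env_lines {
    my ($text) = @_;
    my %parsed;
    # A limit of -1 keeps trailing empty lines instead of discarding them.
    for my $line (split /\n/, $text, -1) {
        my ($key, $value) = split /=/, $line, 2;
        $key   = '' unless defined $key;    # blank line: empty key
        $value = '' unless defined $value;  # no "=" on the line: empty value
        $parsed{$key} = $value;
    }
    return %parsed;
}

# TEST_VAR as it would appear in a line-oriented dump of the environment.
my %result = parse_env_lines("TEST_VAR=Line1\nLine2\n\n");
print "[$_]:[$result{$_}]\n" for sort keys %result;
```

Under that assumption, the continuation line "Line2" becomes its own key and the blank lines collapse into a single empty key, which matches the three entries printed above.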

When we submit jobs to the grid, we use the "-V" option, which exports the 
current environment variables to the context of the job.  I believe qsub is 
core dumping when it tries to export this blank environment variable key.
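
One possible workaround, sketched under assumptions: drop any key that is not a conventional variable name before calling qsub. The helper invalid_env_keys and the name pattern are my own illustration, and the pattern is stricter than what the OS allows, so it may also drop unusual but legitimate names. It removes the empty key, but it does not restore the truncated multiline value or remove stray keys such as "Line2".

```perl
use strict;
use warnings;

# Hypothetical helper: return the keys of an environment hash that are not
# conventional variable names (letters, digits, underscore; no leading digit).
sub invalid_env_keys {
    my ($env) = @_;
    return grep { !/\A[A-Za-z_][A-Za-z0-9_]*\z/ } keys %$env;
}

# Stand-in for %ENV after the import described above.
my %env = ('' => '', 'Line2' => '', 'TEST_VAR' => 'Line1');

# Collect the offending keys, then delete them with a hash slice.
my @dropped = invalid_env_keys(\%env);
delete @env{@dropped};

print "kept: ", join(', ', sort keys %env), "\n";
```

In a real script the same two steps would presumably be applied to %ENV itself just before the qsub call; I have only sketched it against a plain hash here.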

Justin

-----Original Message-----
From: Jesse Becker [mailto:becke...@mail.nih.gov] 
Sent: Thursday, October 29, 2015 12:33 PM
To: Wagner, Justin
Cc: users@gridengine.org
Subject: Re: [gridengine users] Possible Causes of: critical error: 
unrecoverable error - contact systems manager

What was the name of the ENV variable?  How was it being used by qsub and/or 
the job script?

On Thu, Oct 29, 2015 at 03:02:22PM -0400, Wagner, Justin wrote:
>For anybody who is interested I found the root cause of this crash of qsub.
>
>The root cause is that we had an environment variable whose key was blank that 
>was an artifact of another bug, and this environment variable key causes qsub 
>to crash every single time.
>
>Hopefully somebody is familiar enough with the qsub code to look at why that 
>might cause a crash.  If not, I can cook up a simple script to show the 
>problem.
>
>Justin
>
>From: users-boun...@gridengine.org 
>[mailto:users-boun...@gridengine.org] On Behalf Of Wagner, Justin
>Sent: Tuesday, September 22, 2015 10:02 AM
>To: users@gridengine.org
>Subject: [gridengine users] Possible Causes of: critical error: 
>unrecoverable error - contact systems manager
>
>I am running SoGE 8.1.0 and recently I had a problem when submitting a job to 
>the grid via qsub, and qsub returned the error "critical error: unrecoverable 
>error - contact systems manager"
>
>I am trying to narrow down the root cause of this issue.  I am able to send 
>the same exact command, from the same exact user, on the same exact submit 
>host, and get the command to work.   However, I am using a script that is 
>getting executed by Jenkins to launch the job, and I am also able to reliably 
>reproduce the error when I use the "rebuild" plugin to rebuild the same build. 
> I am suspecting that some environment variable is different between these two 
>cases, and is causing this critical error, however I haven't been able to 
>identify any differences there as of yet.
>
>Can somebody point me to the source that is throwing this error, or possibly 
>give me a list of what the possible causes are for this error?
>
>Thanks,
>
>Justin
>
>_______________________________________________
>users mailing list
>users@gridengine.org
>https://gridengine.org/mailman/listinfo/users

--
Jesse Becker (Contractor)

