Ah, if it's perl, it might be easy. It might just be the difference between
system("...string...") and system(@argv).
Sent from my phone. No type good.
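For anyone following along: in Perl, system() given a single string that contains shell metacharacters hands that string to the shell, while system() given a list runs the program directly with no shell involved. Below is a rough sketch of the same distinction using Python's subprocess module as an analogue. It is not the site's wrapper (which has not been posted), and the argument is a made-up example containing a semicolon, with "echo" standing in for the real program.

```python
import subprocess


def run_as_string(args):
    """Join the arguments into one command line and let /bin/sh parse it."""
    command_line = " ".join(["echo"] + args)
    result = subprocess.run(command_line, shell=True,
                            capture_output=True, text=True)
    return result.stdout


def run_as_list(args):
    """Pass the arguments as a list; no shell ever looks at them."""
    result = subprocess.run(["echo"] + args,
                            capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Hypothetical argument: the part after ";" is a harmless echo.
    args = ["foo;echo bar"]
    print(repr(run_as_string(args)))  # the shell splits at ";"
    print(repr(run_as_list(args)))    # the argument arrives intact
```

With the string form the shell runs two commands and the output is 'foo\nbar\n'; with the list form the single argument is delivered unchanged and the output is 'foo;echo bar\n'.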
On Sep 4, 2014, at 8:35 AM, "Matt Thompson"
<[email protected]> wrote:
Jeff,
I actually misspoke earlier. It turns out our srun is a *Perl* script around
the SLURM srun. I'll speak with our admins to see if they can massage the
script to not interpret the arguments. If possible, I'll ask them if I can
share the script with you (privately or on the list) and maybe you can see how
it is affecting Open MPI's argument passing.
Matt
On Thu, Sep 4, 2014 at 8:04 AM, Jeff Squyres (jsquyres)
<[email protected]> wrote:
On Sep 3, 2014, at 9:27 AM, Matt Thompson
<[email protected]> wrote:
> Just saw this, sorry. Our srun is indeed a shell script. It seems to be a
> wrapper around the regular srun that runs a --task-prolog. What it
> does...that's beyond my ken, but I could ask. My guess is that it probably
> does something that helps keep our old PBS scripts running (sets
> $PBS_NODEFILE, say). We used to run PBS but switched to SLURM recently. The
> admins would, of course, prefer all future scripts be SLURM-native scripts,
> but there are a lot of production runs that use many, many PBS scripts.
> Converting those would need slow, careful QC to make sure any "pure SLURM"
> versions act as expected.
Ralph and I haven't had a chance to discuss this in detail yet, but I have
thought about this quite a bit.
What is happening is that one of the $argv elements OMPI passes is of the form
"foo;bar". Your srun script is interpreting the ";" as the end of the command
and the "bar" as the beginning of a new command, and mayhem ensues.
Basically, your srun script is violating what should be a very safe assumption:
that the $argv we pass to it will not be interpreted by a shell. Put
differently: your "srun" script behaves differently than SLURM's "srun"
executable. This violates OMPI's expectations of how srun should behave.
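To make the failure mode concrete, here is a small simulation. It assumes the wrapper re-evaluates its arguments as shell text (for example via eval); the actual script has not been shown in this thread, so both wrapper bodies below are hypothetical. printf stands in for the real srun binary and prints each argument it receives in angle brackets.

```python
import subprocess

# Hypothetical wrapper bodies (sh). printf plays the role of the real srun.
REPARSING_WRAPPER = r'''eval "printf '<%s>\n' $*"'''
PASSTHROUGH_WRAPPER = r'''printf '<%s>\n' "$@"'''


def run_wrapper(script, args):
    """Run an sh script body with args as its positional parameters."""
    # "srun-wrapper" becomes $0; everything after it becomes $1, $2, ...
    result = subprocess.run(["sh", "-c", script, "srun-wrapper", *args],
                            capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Made-up arguments; "echo bar" makes the injected command visible.
    args = ["job", "foo;echo bar"]
    print(run_wrapper(REPARSING_WRAPPER, args), end="")
    print(run_wrapper(PASSTHROUGH_WRAPPER, args), end="")
```

The re-parsing wrapper reports <job> and <foo> and then runs "echo bar" as a separate command; the pass-through wrapper reports <job> and <foo;echo bar>, which is what a compiled srun would see.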
My $0.02 is that if we "fix" this in OMPI, we're effectively penalizing all
other SLURM installations out there that *don't* violate this assumption (i.e.,
all of them). Ralph may disagree with me on this point, BTW -- like I said, we
haven't talked about this in detail since Tuesday. :-)
So here's my question: is there any chance you can change your "srun" script to
a script language that doesn't recombine $argv? This is a common problem,
actually -- sh/csh/etc. script languages tend to recombine $argv, but other
languages such as perl and python do not (e.g.,
http://stackoverflow.com/questions/6981533/how-to-preserve-single-and-double-quotes-in-shell-script-arguments-without-the-a).
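As a sketch of what "not recombining" looks like in practice, the Python below forwards an argument list unchanged, and also shows a fallback for the case where a single command string cannot be avoided: quote every argument first. The function names are made up for illustration, and echo again stands in for the real srun.

```python
import shlex
import subprocess


def forward(argv):
    """Preferred: hand the argument list on unchanged, with no shell."""
    result = subprocess.run(["echo", *argv],
                            capture_output=True, text=True)
    return result.stdout


def forward_via_shell(argv):
    """Fallback: if a command string is required, quote each argument."""
    quoted = " ".join(shlex.quote(arg) for arg in argv)
    result = subprocess.run("echo " + quoted, shell=True,
                            capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    argv = ["foo;echo bar"]  # hypothetical argument with a semicolon
    print(repr(forward(argv)))
    print(repr(forward_via_shell(argv)))
```

Both calls print 'foo;echo bar\n': the semicolon reaches the target program as ordinary data in either case.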
--
Jeff Squyres
[email protected]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
_______________________________________________
users mailing list
[email protected]
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post:
http://www.open-mpi.org/community/lists/users/2014/09/25263.php
--
"And, isn't sanity really just a one-trick pony anyway? I mean all you
get is one trick: rational thinking. But when you're good and crazy,
oooh, oooh, oooh, the sky is the limit!" -- The Tick
Link to this post:
http://www.open-mpi.org/community/lists/users/2014/09/25264.php