Hi Reuti,

> -----Original Message-----
> From: Reuti <[email protected]>
> Sent: Wednesday, March 28, 2018 11:52 PM
> To: Mun Johl <[email protected]>
> Cc: [email protected]
> Subject: Re: [gridengine users] Weird interaction between TCL and SGE
>
> Hi,
>
> On 29.03.2018 at 06:40, Mun Johl wrote:
>
> > Hi,
> >
> > I'm updating some of our TCL scripts to submit jobs to the grid and I've
> > run into an issue I can't explain. In the TCL script I initialize a
> > variable thusly:
> >
> >   set hostname ""
> >
> > and that gets passed into the qsub command as follows:
> >
> >   set status [catch {exec qsub -b yes -cwd -sync n -V -q short.q -l vl $hostname tclsh $sim} result]
>
> What does $hostname stand for? Do you want the job to start on a
> particular machine?

Precisely. That is the end goal.

> > SGE v8.1.9 is not happy with that for some reason, and I get nasty emails
> > such as the following:
> >
> > ===========================================================================
> > failed before prolog: shepherd exited with exit status 7: before prolog
> > Shepherd trace:
> > 03/28/2018 16:25:20 [495:19306]: shepherd called with uid = 0, euid = 495
> > 03/28/2018 16:25:20 [495:19306]: starting up 8.1.9
> > 03/28/2018 16:25:20 [495:19306]: setpgid(19306, 19306) returned 0
> > 03/28/2018 16:25:20 [495:19306]: do_core_binding: "binding" parameter not found in config file
> > 03/28/2018 16:25:20 [495:19306]: no prolog script to start
> >
> > Shepherd pe_hostfile:
> > sim.company.com 1 [email protected] UNDEFINED
> >
> > Furthermore, the spool/sim/messages file has the following messages:
> >
> > 03/28/2018 21:15:36| main|sim|E|shepherd of job 134.1 died through signal = 11
> > 03/28/2018 21:15:36| main|sim|E|abnormal termination of shepherd for job 134.1: no "exit_status" file
> > 03/28/2018 21:15:36| main|sim|E|can't open file active_jobs/134.1/error: No such file or directory
> >
> > If I initialize hostname to a space (" "), the job will run just fine.
> >
> > Moreover, if I assign a var to something like "-l vl" to reserve a
> > consumable resource, SGE complains thusly:
> >
> >   qsub: invalid option argument "-l vl"
>
> Looks like TCL will pass the argument as one, but SGE expects two.
> Separating them as "-l" and "vl" might work. On the command line the
> splitting is done by the shell, where $foo will be split but "$foo" won't,
> and the latter will raise an error too. I have no clue whether TCL has an
> option to `eval` the expression to split the options.

I'm not sure I understand: It does appear as if the "-l" and "vl" are
separated by a space character. Is that not enough?

Thanks,

--
Mun

> -- Reuti
>
> > But if I "hard code" that option in the qsub command, the job will run
> > correctly.
> >
> > I've tried to scrub 'hostname' of non-printable characters, and to strip
> > any (unseen) white space, but nothing seems to work. I don't know what
> > gets embedded into the TCL strings that SGE seems to dislike. It doesn't
> > help that I'm relatively new to TCL.
> >
> > Any suggestions would be appreciated.
> >
> > Regards,
> >
> > --
> > Mun
> > _______________________________________________
> > users mailing list
> > [email protected]
> > https://gridengine.org/mailman/listinfo/users
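
For reference, here is a minimal Tcl sketch of the splitting issue discussed above. It is an illustration only, not a verified fix for this setup. The point is that Tcl does not re-split a substituted variable on whitespace: $var always becomes exactly one argument to exec, so "-l vl" arrives at qsub as a single argument, and an empty string still arrives as one (empty) argument. How qsub reacts to that empty argument is not something the sketch can show. The helper names count_args and build_qsub_cmd are hypothetical, sim.tcl and node01 are placeholder values, the `-l hostname=...` request is an assumption about how the host would be selected, and {*} needs Tcl 8.5 or later.

```tcl
# Hypothetical helper: returns how many separate arguments it was called with.
# exec hands its words to the child process in the same one-word-per-argument way.
proc count_args {args} {
    return [llength $args]
}

set opt "-l vl"   ;# one string that happens to contain a space
set hostname ""   ;# empty string, as in the original script

# A substituted variable is always a single word, spaces or not.
puts [count_args qsub $opt tclsh sim.tcl]          ;# prints 4 ("-l vl" is one argument)
# An empty string is still passed along, as an empty argument.
puts [count_args qsub $hostname tclsh sim.tcl]     ;# prints 4
# {*} expands a list into separate words; an empty list adds no words.
puts [count_args qsub {*}$opt tclsh sim.tcl]       ;# prints 5
puts [count_args qsub {*}$hostname tclsh sim.tcl]  ;# prints 3

# Hypothetical helper: build the command as a list and append optional parts
# only when they are non-empty. "-l hostname=..." is an assumed request form.
proc build_qsub_cmd {hostname sim} {
    set cmd [list qsub -b yes -cwd -sync n -V -q short.q -l vl]
    if {$hostname ne ""} {
        lappend cmd -l hostname=$hostname
    }
    lappend cmd tclsh $sim
    return $cmd
}

puts [build_qsub_cmd "" sim.tcl]
puts [build_qsub_cmd node01 sim.tcl]
```

With a helper like this, the submission could be written as `set status [catch {exec {*}[build_qsub_cmd $hostname $sim]} result]`, so that an unset hostname contributes no argument at all instead of an empty one.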
