Hi,

Am 29.03.2018 um 06:40 schrieb Mun Johl:

> Hi,
>  
> I’m updating some of our TCL scripts to submit jobs to grid and I’ve run into 
> an issue I can’t explain.  In the TCL script I initialize a variable thusly:
>  
> set hostname ""
>  
> and that gets passed into the qsub command as follows:
>  
> set status [catch {exec qsub -b yes -cwd -sync n -V -q short.q -l vl $hostname tclsh $sim} result]

What does $hostname stand for? Do you want the job to start on a particular 
machine?


> SGE v8.1.9 is not happy with that for some reason, and I get nasty emails 
> such as the following:
>  
> ===========================================================================
> failed before prolog: shepherd exited with exit status 7: before prolog 
> Shepherd trace:
> 03/28/2018 16:25:20 [495:19306]: shepherd called with uid = 0, euid = 495
> 03/28/2018 16:25:20 [495:19306]: starting up 8.1.9
> 03/28/2018 16:25:20 [495:19306]: setpgid(19306, 19306) returned 0
> 03/28/2018 16:25:20 [495:19306]: do_core_binding: "binding" parameter not 
> found in config file
> 03/28/2018 16:25:20 [495:19306]: no prolog script to start
>  
> Shepherd pe_hostfile:
> sim.company.com 1 [email protected] UNDEFINED
>  
> Furthermore, the spool/sim/messages has the following messages:
>  
> 03/28/2018 21:15:36|  main|sim|E|shepherd of job 134.1 died through signal = 11
> 03/28/2018 21:15:36|  main|sim|E|abnormal termination of shepherd for job 134.1: no "exit_status" file
> 03/28/2018 21:15:36|  main|sim|E|can't open file active_jobs/134.1/error: No such file or directory
>  
> If I initialize hostname to a space (“ “), the job will run just fine.
>  
> Moreover, if I assign a var to something like “-l vl” to reserve a consumable 
> resource, SGE complains thusly:
>  
> qsub: invalid option argument "-l vl"

It looks like Tcl passes the value to qsub as a single argument, while SGE 
expects two. Passing "-l" and "vl" as separate arguments might work. On the 
command line the splitting is done by the shell: an unquoted $foo is split 
into words, whereas "$foo" is not and would raise the same error. I don't 
know whether Tcl has something like `eval` to re-evaluate the expression so 
that the options are split.
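
A minimal sketch of the splitting issue in Tcl, assuming Tcl 8.5 or later, 
where the {*} prefix expands a list into one argument per element (on older 
versions `eval exec qsub $opts ...` re-parses the line in a similar way). 
The proc name qsub_command and the script name run_sim.tcl are placeholders 
for illustration, not something from your setup, and the sketch only builds 
the argument list instead of calling qsub. It may also bear on the empty 
hostname case: a plain $hostname holding "" is still handed over as one 
empty argument, while an empty list expanded with {*} adds nothing. Whether 
that empty argument is what upsets the shepherd is only a guess.

```tcl
# Plain substitution always yields exactly one argument per word:
#   set opt "-l vl";  exec qsub $opt ...       -> one argument "-l vl"
#   set hostname "";  exec qsub $hostname ...  -> one empty argument
# {*} expands a list into one argument per element instead, and an
# empty list expands to no arguments at all.

# Hypothetical helper: build the qsub argument list from optional extras.
proc qsub_command {extra script} {
    return [list qsub -b yes -cwd -sync n -V -q short.q {*}$extra tclsh $script]
}

set sim run_sim.tcl   ;# placeholder script name

puts [qsub_command {-l vl} $sim]   ;# "-l" and "vl" are separate words
puts [qsub_command {} $sim]        ;# no extra words, not even an empty one

# To actually submit, expand the list again when calling exec:
#   set status [catch {exec {*}[qsub_command {-l vl} $sim]} result]
```

Keeping the options in a Tcl list rather than in one string means nothing 
has to be split by hand, and values containing spaces stay intact.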

-- Reuti


> But if I “hard code” that option in the qsub command, the job will run 
> correctly.
>  
> I’ve tried to scrub ‘hostname’ of non-printable characters, and to strip any 
> (unseen) white space, but nothing seems to work.  I don’t know what gets 
> embedded into the TCL strings that SGE seems to dislike.  It doesn’t help 
> that I’m relatively new to TCL.
>  
> Any suggestions would be appreciated.
>  
> Regards,
>  
> --
> Mun
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users

