Hi,

On 01.04.2011 at 12:33, lars van der bijl wrote:
> Hey everyone.
>
> We're having some issues with jobs being killed with exit status 137.

137 = 128 + 9

$ kill -l
 1) SIGHUP   2) SIGINT   3) SIGQUIT  4) SIGILL   5) SIGTRAP
 6) SIGABRT  7) SIGBUS   8) SIGFPE   9) SIGKILL  ...

So the job was killed with SIGKILL. Did you request too small a value for h_vmem or h_rt?

-- Reuti

> This causes the task to finish and start its dependent task, which is causing
> all kinds of havoc.
>
> Submitting a job with a very small max memory limit gives me this as an
> example.
>
> $ qacct -j 21141
> ==============================================================
> qname        test.q
> hostname     atom12.**
> group        **
> owner        lars
> project      NONE
> department   defaultdepartment
> jobname      stest__out__geometry2
> jobnumber    21141
> taskid       101
> account      sge
> priority     0
> qsub_time    Fri Apr 1 11:22:30 2011
> start_time   Fri Apr 1 11:22:31 2011
> end_time     Fri Apr 1 11:22:39 2011
> granted_pe   smp
> slots        4
> failed       100 : assumedly after job
> exit_status  137
> ru_wallclock 8
> ru_utime     0.281
> ru_stime     0.167
> ru_maxrss    3744
> ru_ixrss     0
> ru_ismrss    0
> ru_idrss     0
> ru_isrss     0
> ru_minflt    70739
> ru_majflt    0
> ru_nswap     0
> ru_inblock   8
> ru_oublock   224
> ru_msgsnd    0
> ru_msgrcv    0
> ru_nsignals  0
> ru_nvcsw     1072
> ru_nivcsw    439
> cpu          2.240
> mem          0.573
> io           0.145
> iow          0.000
> maxvmem      405.820M
> arid         undefined
>
> Does anyone know of a reason why the task would be killed with this error
> state, or how to catch it?
>
> Lars
>
> _______________________________________________
> users mailing list
> users@gridengine.org
> https://gridengine.org/mailman/listinfo/users
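
For reference, the "137 = 128 + 9" arithmetic can be sketched in a few lines of Python. This is only an illustration, not part of Grid Engine: describe_exit_status is a hypothetical helper name, and it assumes the usual shell convention that an exit status above 128 means "terminated by signal (status - 128)". A program can also return such a value on its own, so treat the result as a hint rather than proof.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Explain an exit status using the shell's 128 + N signal convention.

    Hypothetical helper for illustration; it only interprets the number
    and does not query Grid Engine in any way.
    """
    if status > 128:
        signum = status - 128
        try:
            # Map the signal number to its symbolic name on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"unknown signal {signum}"
        return f"killed by signal {signum} ({name})"
    return f"exited normally with code {status}"


# The exit_status reported by qacct for job 21141 in this thread.
print(describe_exit_status(137))
```

A wrapper script could apply a check like this to the exit status of a finished task before dependent work is allowed to proceed, though how that hooks into the job dependencies is left open here.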