Hey everyone.

We're having some issues with jobs being killed with exit status 137. This
causes the task to finish and start its dependent task, which is causing all
kinds of havoc.

Submitting a job with a very small max memory limit gives me this as an
example.

$ qacct -j 21141
==============================================================
qname        test.q
hostname     atom12.**
group        **
owner        lars
project      NONE
department   defaultdepartment
jobname      stest__out__geometry2
jobnumber    21141
taskid       101
account      sge
priority     0
qsub_time    Fri Apr  1 11:22:30 2011
start_time   Fri Apr  1 11:22:31 2011
end_time     Fri Apr  1 11:22:39 2011
granted_pe   smp
slots        4
failed       100 : assumedly after job
exit_status  137
ru_wallclock 8
ru_utime     0.281
ru_stime     0.167
ru_maxrss    3744
ru_ixrss     0
ru_ismrss    0
ru_idrss     0
ru_isrss     0
ru_minflt    70739
ru_majflt    0
ru_nswap     0
ru_inblock   8
ru_oublock   224
ru_msgsnd    0
ru_msgrcv    0
ru_nsignals  0
ru_nvcsw     1072
ru_nivcsw    439
cpu          2.240
mem          0.573
io           0.145
iow          0.000
maxvmem      405.820M
arid         undefined

Does anyone know of a reason why the task would be killed with this error
state, or how to catch it?
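
For reference, by the usual shell convention an exit status above 128 means
the process was terminated by a signal numbered (status - 128), so 137 would
correspond to signal 9 (SIGKILL). Below is a minimal Python sketch of that
decoding. The helper name decode_exit_status is hypothetical and not part of
Grid Engine, and the sketch assumes the convention applies to the exit_status
field reported by qacct.

```python
import signal


def decode_exit_status(status):
    """Interpret a shell-style exit status.

    Assumption: values above 128 follow the common 128 + signal-number
    convention. Returns a (kind, number, signal_name) tuple.
    """
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = "unknown"
        return ("signal", signum, name)
    # Ordinary exit code, no signal involved.
    return ("exit", status, None)


# Decode the exit_status value from the qacct output above.
print(decode_exit_status(137))
```

A wrapper script or post-processing step could use a check like this to tell
"killed by a signal" apart from a normal non-zero exit before letting
dependent tasks proceed.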

Lars
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
