Hey everyone. We're having some issues with jobs being killed with exit status 137. This causes the task to finish and its dependent task to start, which is causing all kinds of havoc.
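For context on the number itself: by the usual shell convention, an exit status above 128 means the process was terminated by a signal, and the signal number is the status minus 128. So 137 would be 128 + 9, i.e. signal 9 (SIGKILL). The small sketch below just decodes a status that way. The function name describe_exit_status is made up for illustration and is not part of Grid Engine, and it only tells you which signal was involved, not who sent it.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret a shell-style exit status (hypothetical helper, not an SGE API)."""
    if status > 128:
        # Shell convention: 128 + N means "terminated by signal N".
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number unknown on this platform; report it numerically.
            name = f"signal {signum}"
        return f"killed by {name}"
    if status == 0:
        return "success"
    return f"exited with code {status}"


# On Linux this prints "killed by SIGKILL" for the status from qacct.
print(describe_exit_status(137))
```

Since SIGKILL cannot be trapped by the job itself, a check like this would have to run in whatever wrapper or script inspects the status afterwards.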
Submitting a job with a very small maximum memory limit gives me this as an example:

$ qacct -j 21141
==============================================================
qname        test.q
hostname     atom12.**
group        **
owner        lars
project      NONE
department   defaultdepartment
jobname      stest__out__geometry2
jobnumber    21141
taskid       101
account      sge
priority     0
qsub_time    Fri Apr 1 11:22:30 2011
start_time   Fri Apr 1 11:22:31 2011
end_time     Fri Apr 1 11:22:39 2011
granted_pe   smp
slots        4
failed       100 : assumedly after job
exit_status  137
ru_wallclock 8
ru_utime     0.281
ru_stime     0.167
ru_maxrss    3744
ru_ixrss     0
ru_ismrss    0
ru_idrss     0
ru_isrss     0
ru_minflt    70739
ru_majflt    0
ru_nswap     0
ru_inblock   8
ru_oublock   224
ru_msgsnd    0
ru_msgrcv    0
ru_nsignals  0
ru_nvcsw     1072
ru_nivcsw    439
cpu          2.240
mem          0.573
io           0.145
iow          0.000
maxvmem      405.820M
arid         undefined

Does anyone know of a reason why the task would be killed with this error state? Or how to catch it?

Lars