Hi,

On 04.02.2014 at 17:10, Mazouzi wrote:
> We recently noticed that the CPU time value recorded in the accounting file
> is wrong (value is too high) for parallel SMP jobs (OpenMP, Gaussian09)
>
> SGE version: OGS 2011.11
> System: CentOS 6.4
> kernel: 2.6.32-220.23.1.el6.x86_64
>
> qacct -j 3559
> ...
> jobnumber    3559
> taskid       undefined
> account      sge
> priority     0
> qsub_time    Mon Jan 13 21:47:08 2014
> start_time   Mon Jan 13 21:47:09 2014
> end_time     Sat Jan 25 21:33:12 2014
> granted_pe   openmp
> slots        8
> failed       0
> exit_status  0
> ru_wallclock 1035963
> ru_utime     12824765754.210
> ru_stime     5630063030.787

Adding up the two lines above gives a similar result. So the CPU time measured by SGE (the "cpu" value below) reflects the one recorded by the kernel. What was specified in the Gaussian input file for %nprocs=? Though it can't be that high.

-- Reuti

> ru_maxrss    853692
> ru_ixrss     0
> ru_ismrss    0
> ru_idrss     0
> ru_isrss     0
> ru_minflt    187417160
> ru_majflt    0
> ru_nswap     0
> ru_inblock   1948210454
> ru_oublock   -1658595332
> ru_msgsnd    0
> ru_msgrcv    0
> ru_nsignals  0
> ru_nvcsw     169985434
> ru_nivcsw    24673526
> cpu          18454828784.997
> mem          17468450365.385
> io           6282.100
> iow          0.000
> maxvmem      10.010G
> arid         undefined
>
>
> The job duration is 1035963 s (12 d), so how can the CPU time be
> 18454828784? Is this a known bug in recent versions of Linux?
>
> The PE definition is:
>
> pe_name            openmp
> slots              9999
> user_lists         NONE
> xuser_lists        NONE
> start_proc_args    NONE
> stop_proc_args     NONE
> allocation_rule    $pe_slots
> control_slaves     TRUE
> job_is_first_task  FALSE
> urgency_slots      min
> accounting_summary TRUE
>
> Regards,
>
>
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
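For anyone who wants to redo the arithmetic, here is a minimal sketch in Python. The numbers are copied from the quoted qacct output; the helper name `cpu_upper_bound` is made up for illustration, and the bound itself rests on the assumption that the job never ran on more cores than its 8 granted slots.

```python
from decimal import Decimal

# Values copied from the `qacct -j 3559` output quoted above.
ru_wallclock = 1035963                  # seconds
slots = 8
ru_utime = Decimal("12824765754.210")
ru_stime = Decimal("5630063030.787")
cpu = Decimal("18454828784.997")


def cpu_upper_bound(wallclock_s, n_slots):
    """Hypothetical helper: the most CPU seconds a job could plausibly
    accumulate, ASSUMING it used at most n_slots cores for its whole run."""
    return wallclock_s * n_slots


total = ru_utime + ru_stime
bound = cpu_upper_bound(ru_wallclock, slots)

print("ru_utime + ru_stime:", total)
print("equals cpu field   :", total == cpu)
print("plausible maximum  :", bound)
print("cpu / maximum      :", round(cpu / bound))
```

If the sum equals the cpu field, SGE is only passing on what it was given for user and system time, so the oversized numbers would have to come from further down than the accounting summary.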
