On 13.02.2016 at 01:28, Lane, William wrote:

> We track our SGE cluster statistics through the accounting file, which we
> import into a separate MySQL database.
> 
> We've had some really strange results show up for job CPU time usage recently:
> 
> mysql> SELECT owner, job_number AS job_num, CPU/3600 AS "CPU",
>            ru_wallclock/3600 AS "RT",
>            FROM_UNIXTIME(start_time) AS strt_time,
>            FROM_UNIXTIME(end_time),
>            FROM_UNIXTIME(submission_time) AS sbmt_time,
>            (CPU/3600)
>        FROM csclprd3
>        WHERE (start_time >= UNIX_TIMESTAMP('2015-07-01'))
>          AND (start_time < UNIX_TIMESTAMP('2015-08-01'))
>          AND (start_time <> end_time)
>          AND (start_time <> 0) AND (end_time <> 0)
>          AND ((CPU/3600) > (6*(ru_wallclock/3600)))
>          AND owner='pangjx';
> +--------+---------+--------------+----------+---------------------+-------------------------+---------------------+--------------+
> | owner  | job_num | CPU          | RT       | start_time          | FROM_UNIXTIME(end_time) | sbmt_time           | (CPU/3600)   |
> +--------+---------+--------------+----------+---------------------+-------------------------+---------------------+--------------+
> | pangjx |  143320 | 2777777.7775 | 145.3114 | 2015-07-23 21:56:51 | 2015-07-29 23:15:32     | 2015-07-23 21:56:40 |

2777777.7775 * 3600 = 9999999999 (ten nines), which looks like a fixed-point
type overflow: the value was probably clipped to the column's maximum. How is
the column defined in your DB?
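
As a quick check of that arithmetic, here is a minimal Python sketch. The
DECIMAL(10,0) definition is only an assumption for illustration; the real
column definition is unknown until the schema is checked. MySQL clips
out-of-range values to the column limit in non-strict SQL mode (and raises an
error in strict mode), which the hypothetical clamp_to_column helper mimics.

```python
from decimal import Decimal

# Value reported by the query for the suspicious jobs (CPU / 3600, in hours).
cpu_hours = Decimal("2777777.7775")

# Undo the division to recover the stored CPU time in seconds.
cpu_seconds = cpu_hours * 3600

# Assumption: the column behaves like DECIMAL(10,0), whose largest value is ten nines.
assumed_precision = 10
column_max = Decimal(10) ** assumed_precision - 1


def clamp_to_column(value, maximum=column_max):
    """Mimic a store that saturates at the column maximum instead of failing."""
    return min(value, maximum)


print("stored CPU seconds:", cpu_seconds)
print("equals assumed column maximum:", cpu_seconds == column_max)
print("a larger value would be stored as:", clamp_to_column(Decimal("12345678901")))
```

If the stored value matches the maximum of the actual column type, the real
CPU time in the accounting file was larger than the column can hold, and the
import clipped it.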

-- Reuti


> 2777777.7775 |
> | pangjx |  154178 |       7.8869 |   0.7439 | 2015-07-29 15:02:28 | 2015-07-29 15:47:06     | 2015-07-29 15:02:18 |       7.8869 |
> | pangjx |  154265 |       7.4861 |   0.7106 | 2015-07-29 15:02:29 | 2015-07-29 15:45:07     | 2015-07-29 15:02:23 |       7.4861 |
> | pangjx |  154244 |       5.0086 |   0.6397 | 2015-07-29 15:02:28 | 2015-07-29 15:40:51     | 2015-07-29 15:02:23 |       5.0086 |
> | pangjx |  154196 |      10.0081 |   0.6386 | 2015-07-29 15:02:28 | 2015-07-29 15:40:47     | 2015-07-29 15:02:19 |      10.0081 |
> | pangjx |  154136 |       5.1428 |   0.6375 | 2015-07-29 15:02:28 | 2015-07-29 15:40:43     | 2015-07-29 15:02:17 |       5.1428 |
> | pangjx |  154217 |       5.2989 |   0.5658 | 2015-07-29 15:02:28 | 2015-07-29 15:36:25     | 2015-07-29 15:02:19 |       5.2989 |
> | pangjx |  154233 |       4.3808 |   0.5581 | 2015-07-29 15:02:28 | 2015-07-29 15:35:57     | 2015-07-29 15:02:22 |       4.3808 |
> | pangjx |  154157 |       5.4767 |   0.5517 | 2015-07-29 15:02:28 | 2015-07-29 15:35:34     | 2015-07-29 15:02:18 |       5.4767 |
> | pangjx |  154152 |       3.1375 |   0.4356 | 2015-07-29 15:02:28 | 2015-07-29 15:28:36     | 2015-07-29 15:02:18 |       3.1375 |
> | pangjx |  143359 | 2777777.7775 | 127.8125 | 2015-07-23 21:56:52 | 2015-07-29 05:45:37     | 2015-07-23 21:56:42 | 2777777.7775 |
> | pangjx |  143334 | 2777777.7775 | 123.9389 | 2015-07-23 21:56:51 | 2015-07-29 01:53:11     | 2015-07-23 21:56:41 | 2777777.7775 |
> | pangjx |  143329 |     945.1042 | 115.6944 | 2015-07-23 21:56:51 | 2015-07-28 17:38:31     | 2015-07-23 21:56:41 |     945.1042 |
> | pangjx |  143355 |     766.4269 | 100.3900 | 2015-07-23 21:56:52 | 2015-07-28 02:20:16     | 2015-07-23 21:56:42 |     766.4269 |
> | pangjx |  143377 | 2777777.7775 |  99.1744 | 2015-07-23 21:56:52 | 2015-07-28 01:07:20     | 2015-07-23 21:56:43 | 2777777.7775 |
> 
> It is impossible for a job's CPU time usage to be 27777 times greater than
> the runtime of the job, isn't it?
> 
> Our cluster has nowhere near 27000 cores (even with hyperthreading).
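
That plausibility argument can be written down as a small check: CPU time
divided by wallclock time gives the average number of cores a job must have
kept busy. The sketch below uses three rows copied from the query output;
MAX_PLAUSIBLE_CORES is a hypothetical placeholder, not the real slot count of
this cluster.

```python
def implied_cores(cpu_hours, wallclock_hours):
    """Average number of cores needed to accumulate this CPU time in this runtime."""
    return cpu_hours / wallclock_hours


# (job_number, CPU hours, wallclock hours) copied from the query output.
rows = [
    (143320, 2777777.7775, 145.3114),
    (143329, 945.1042, 115.6944),
    (154178, 7.8869, 0.7439),
]

# Hypothetical upper bound; replace with the real number of slots in the cluster.
MAX_PLAUSIBLE_CORES = 1024

for job, cpu, rt in rows:
    cores = implied_cores(cpu, rt)
    flag = "SUSPECT" if cores > MAX_PLAUSIBLE_CORES else "ok"
    print(f"{job}: ~{cores:.1f} cores implied ({flag})")
```

Rows flagged this way are candidates for a clipped or corrupted CPU value
rather than genuine usage.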
> 
> -Bill L.


_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
