[slurm-dev] Re: Slurm & CGROUP

2017-03-15 Thread Wensheng Deng
It should be (sorry): we 'cp'ed a 5GB file from scratch to node local disk On Wed, Mar 15, 2017 at 11:26 AM, Wensheng Deng wrote: > Hello experts: > > We turn on TaskPlugin=task/cgroup. In one Slurm job, we 'cp'ed a 5GB job > from scratch to node local disk, declared 5 GB memory for the job, an

[slurm-dev] Re: Slurm & CGROUP

2017-03-15 Thread Wensheng Deng
NoShare? > > Chris > > > > From: Wensheng Deng > Sent: 15 March 2017 10:28 > To: slurm-dev > Subject: [ext] [slurm-dev] Re: Slurm & CGROUP > > It should be (sorry): > we 'cp'ed a 5GB file from scratch to node local disk > > > On Wed, Mar 15, 2017 a

[slurm-dev] Re: Slurm & CGROUP

2017-03-15 Thread Wensheng Deng
>> >> From: Wensheng Deng >> Sent: 15 March 2017 10:28 >> To: slurm-dev >> Subject: [ext] [slurm-dev] Re: Slurm & CGROUP >> >> It should be (sorry): >> we 'cp'ed a 5GB file from scratch to node local di

[slurm-dev] Re: Slurm & CGROUP

2017-03-16 Thread Janne Blomqvist
xplicitly exclude shared usage from our measurement: > > > JobAcctGatherType=jobacct_gather/cgroup > JobAcctGatherParams=NoShare? > > Chris > > > > From: Wensheng Deng mailto:w...@nyu.edu>> >

[slurm-dev] Re: Slurm & CGROUP

2017-03-17 Thread Wensheng Deng
e shared usage from our measurement: > > > > > > JobAcctGatherType=jobacct_gather/cgroup > > JobAcctGatherParams=NoShare? > > > > Chris > > > > > > > > From: Wensheng Deng mailto:w...@nyu.edu>>

[slurm-dev] Re: Slurm & CGROUP

2017-03-17 Thread Shenglong Wang
wrote: > > > > > > We explicitly exclude shared usage from our measurement: > > > > > > JobAcctGatherType=jobacct_gather/cgroup > > JobAcctGatherParams=NoShare? > > > > Chris > > > > > > ___

[slurm-dev] Re: Slurm & CGROUP

2017-03-17 Thread Sam Gallop (NBI)
tely, because of a bug this plugin does not report cache usage either. I've contributed a bug/fix to address this (https://bugs.schedmd.com/show_bug.cgi?id=3531). --- Samuel Gallop Computing infrastructure for Science CiS Support & Development From: Wensheng Deng [mailto:w...@nyu.edu] Sen

[slurm-dev] Re: Slurm & CGROUP

2017-03-17 Thread Wensheng Deng
e > of a bug this plugin does not report cache usage either. I've contributed a > bug/fix to address this (https://bugs.schedmd.com/show_bug.cgi?id=3531). > > > > *---* > > *Samuel Gallop* > > *Computing infrastructure for Science* > > *CiS Support & Development* &

[slurm-dev] Re: Slurm & CGROUP

2017-03-17 Thread Sam Gallop (NBI)
@nbi.ac.uk<mailto:computing.helpd...@nbi.ac.uk> or call phone extension 1234. From: Wensheng Deng [mailto:w...@nyu.edu] Sent: 17 March 2017 15:06 To: slurm-dev Subject: [slurm-dev] Re: Slurm & CGROUP For the case of the simple 'cp' test job which copies a 5 GB file, the issu

[slurm-dev] Re: Slurm & CGROUP

2017-03-17 Thread Wensheng Deng
ture for Science* team on *group phone extension * *2003**.* > > If your request is urgent, please contact the *NBIP Computing Helpdesk* > at computing.helpd...@nbi.ac.uk or call *phone extension **1234**.* > > > > *From:* Wensheng Deng [mailto:w...@nyu.edu] > *Sent:* 17 M

[slurm-dev] Re: Slurm & CGROUP

2017-03-17 Thread Sam Gallop (NBI)
and The Sainsbury Laboratory From: Wensheng Deng [mailto:w...@nyu.edu] Sent: 17 March 2017 15:39 To: slurm-dev Subject: [slurm-dev] Re: Slurm & CGROUP Thank you. I had some doubt about the accuracy of memory.stat. Sam, what slurm conf parameters do you recommend to try your fix in bug
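
Since the thread turns on how much of a job's cgroup memory is page cache rather than process memory, a small illustration of reading memory.stat may help. This is only a sketch, assuming the cgroup v1 memory controller, whose memory.stat file lists one "name value" pair per line, including `cache` and `rss` in bytes. The function names and the sample figures are made up for illustration and are not taken from the thread or from Slurm.

```python
def parse_memory_stat(text):
    """Parse cgroup v1 memory.stat content into a dict of integer byte counts."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Each valid line is "<name> <value>"; skip anything else.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def cache_fraction(stats):
    """Return the share of (cache + rss) that is page cache, or 0.0 if both are zero."""
    cache = stats.get("cache", 0)
    rss = stats.get("rss", 0)
    total = cache + rss
    return cache / total if total else 0.0


# Hypothetical values resembling a job that only copied a 5 GB file:
# almost all of the charged memory is page cache, very little is rss.
SAMPLE = """cache 5368709120
rss 10485760
mapped_file 0
"""

if __name__ == "__main__":
    stats = parse_memory_stat(SAMPLE)
    print(f"cache={stats['cache']} rss={stats['rss']} "
          f"cache share={cache_fraction(stats):.3f}")
```

On a real node the text would come from the job's memory.stat file under the memory controller mount; the exact path depends on the site's cgroup configuration and is left out here on purpose.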

[slurm-dev] Re: Slurm & CGROUP

2017-03-17 Thread Ryan Cox
myself for the time being. Ryan On 03/17/2017 08:46 AM, Sam Gallop (NBI) wrote: Re: [slurm-dev] Re: Slurm & CGROUP Hi, I believe you can get that message ('Exceeded job memory limit at some point') even if the job finishes fine. When the cgroup is created (by SLURM) it updates m

[slurm-dev] Re: Slurm & CGROUP

2017-03-17 Thread Nicholas McCollum
MemoryKill (the documentation says to use this with > > caution, see https://slurm.schedmd.com/slurm.conf.html), or you can > > try and account for the cache by using the jobacct_gather/cgroup.  > > Unfortunately, because of a bug this plugin does not report cache usage > > either.  I

[slurm-dev] Re: Slurm & CGROUP

2017-03-17 Thread Wensheng Deng
ve plugin, you could also try the JobAcctGatherParams > > > parameter NoOverMemoryKill (the documentation says to use this with > > > caution, see https://slurm.schedmd.com/slurm.conf.html), or you can > > > try and account for the cache by using the jobacct_gather/cgrou