> On 31.03.2015, at 14:56, Gowtham <[email protected]> wrote:
> 
> 
> I think I found the mistake in my submission script. 
> 
>  hard resource_list:         mem_free=128.00G
> 
> should be
> 
>  hard resource_list:         mem_free=2.00G
> 
> so that the job with 64 processors requests 128 GB of RAM in total. Correct?
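(As a concrete sketch of the corrected submission for a 64-slot parallel job; the parallel environment name "mpi" below is only a placeholder, the per-slot value is the one quoted above:)

  #$ -pe mpi 64
  #$ -hard -l mem_free=2G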

Yes, the consumables are multiplied by the slot count (unless the consumable 
attribute is set to JOB instead of YES in the complex definition).
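
For reference, the distinction is the "consumable" column of the complex definition (viewable/editable with `qconf -mc`); the line below is only an illustrative sketch of a typical setup, not this cluster's actual configuration:

  #name     shortcut  type    relop requestable consumable default urgency
  mem_free  mf        MEMORY  <=    YES         YES        0       0

With consumable set to YES, the requested amount is debited once per granted slot; with JOB it is debited once per job, regardless of the slot count.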

-- Reuti

> 
> Best regards,
> g
> 
> --
> Gowtham, PhD
> Director of Research Computing, IT
> Adj. Asst. Professor, Physics/ECE
> Michigan Technological University
> 
> (906) 487/3593
> http://it.mtu.edu
> http://hpc.mtu.edu
> 
> 
> On Tue, 31 Mar 2015, Gowtham wrote:
> 
> | 
> | Hi Reuti,
> | 
> | It's a 64-processor job, and my hope/plan is that it requests 2 GB per 
> | processor for a total of 128 GB. But each compute node only has 64 GB of 
> | RAM in total (60 GB of which are set to requestable/consumable).
> | 
> | I could be mistaken, but I think the job is looking for 128 GB of RAM per 
> | node? Please correct me if I am wrong.
> | 
> | Best regards,
> | g
> | 
> | --
> | Gowtham, PhD
> | Director of Research Computing, IT
> | Adj. Asst. Professor, Physics/ECE
> | Michigan Technological University
> | 
> | (906) 487/3593
> | http://it.mtu.edu
> | http://hpc.mtu.edu
> | 
> | 
> | On Tue, 31 Mar 2015, Reuti wrote:
> | 
> | | 
> | | > On 31.03.2015, at 14:17, Gowtham <[email protected]> wrote:
> | | > 
> | | > 
> | | > Please find it here:
> | | > 
> | | >  http://sgowtham.com/downloads/qstat_j_74545.txt
> | | 
> | | OK, but where is SGE looking for 256GB? For now each slot will get 
> | | 128G as requested.
> | | 
> | | -- Reuti
> | | 
> | | 
> | | > 
> | | > Best regards,
> | | > g
> | | > 
> | | > --
> | | > Gowtham, PhD
> | | > Director of Research Computing, IT
> | | > Adj. Asst. Professor, Physics/ECE
> | | > Michigan Technological University
> | | > 
> | | > (906) 487/3593
> | | > http://it.mtu.edu
> | | > http://hpc.mtu.edu
> | | > 
> | | > 
> | | > On Tue, 31 Mar 2015, Reuti wrote:
> | | > 
> | | > | Hi,
> | | > | 
> | | > | > On 31.03.2015, at 13:13, Gowtham <[email protected]> wrote:
> | | > | > 
> | | > | > 
> | | > | > In one of our clusters, which has homogeneous compute nodes (64 GB of 
> | | > | > RAM each), I have set mem_free as a requestable and consumable resource. 
> | | > | > Following the mailing list archives, I have done:
> | | > | > 
> | | > | >  for x in `qconf -sel`
> | | > | >  do
> | | > | >    qconf -mattr exechost complex_values mem_free=60G $x
> | | > | >  done
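(To verify that the setting took effect on a given host, the exec host configuration can be inspected; the host name below is just a placeholder:)

  qconf -se node001
  # the complex_values line should now list mem_free=60G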
> | | > | > 
> | | > | > Every job that gets submitted by every user has the following line 
> | | > | > in the submission script:
> | | > | > 
> | | > | >  #$ -hard -l mem_free=2G
> | | > | > 
> | | > | > for single processor jobs, and
> | | > | > 
> | | > | >  #$ -hard -l mem_free=(2 * NPROCS)G
> | | > | > 
> | | > | > for a parallel job using NPROCS processors.
> | | > | > 
> | | > | > 
> | | > | > All single-processor jobs run just fine, and so do many parallel 
> | | > | > jobs. But some parallel jobs, when the participating processors are spread 
> | | > | > across multiple compute nodes, keep on waiting.
> | | > | > 
> | | > | > When inspected with 'qstat -j JOB_ID', I notice that the job is 
> | | > | > looking for (2 * NPROCS)G of RAM on each compute node. How would I go about 
> | | > | > resolving this issue? If additional information is necessary from my end, 
> | | > | > please let me know.
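(To make the arithmetic behind this concrete, assuming the 64-slot job and the 60G per-node consumable configured above:

  requested per slot               : mem_free = (2 * 64)G = 128G
  needed on a node hosting k slots : k * 128G
  consumable mem_free per node     : 60G  (of 64 GB physical)

so even a node hosting a single slot of the job would need 128G of consumable mem_free, which no 64 GB node can provide; hence the job stays pending.)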
> | | > | 
> | | > | Can you please post the output of `qstat -j JOB_ID` for such a job?
> | | > | 
> | | > | -- Reuti
> | | > | 
> | | > | 
> | | > | > 
> | | > | > Thank you for your time and help.
> | | > | > 
> | | > | > Best regards,
> | | > | > g
> | | > | > 
> | | > | > --
> | | > | > Gowtham, PhD
> | | > | > Director of Research Computing, IT
> | | > | > Adj. Asst. Professor, Physics/ECE
> | | > | > Michigan Technological University
> | | > | > 
> | | > | > (906) 487/3593
> | | > | > http://it.mtu.edu
> | | > | > http://hpc.mtu.edu
> | | > | > 
> | | > | 
> | | > | 
> | | 
> | | 
> | 


_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
