> On 31.03.2015 at 14:17, Gowtham <[email protected]> wrote:
>
>
> Please find it here:
>
> http://sgowtham.com/downloads/qstat_j_74545.txt

OK, but where is SGE looking for 256GB? For now, each slot will get 128G as requested.

-- Reuti

>
> Best regards,
> g
>
> --
> Gowtham, PhD
> Director of Research Computing, IT
> Adj. Asst. Professor, Physics/ECE
> Michigan Technological University
>
> (906) 487/3593
> http://it.mtu.edu
> http://hpc.mtu.edu
>
>
> On Tue, 31 Mar 2015, Reuti wrote:
>
> | Hi,
> |
> | > On 31.03.2015 at 13:13, Gowtham <[email protected]> wrote:
> | >
> | >
> | > In one of our clusters with homogeneous compute nodes (64 GB RAM each), I have set mem_free as a requestable and consumable resource. Following the mailing list archives, I have done:
> | >
> | > for x in `qconf -sel`
> | > do
> | >   qconf -mattr exechost complex_values mem_free=60G $x
> | > done
> | >
> | > Every job submitted by every user has the following line in the submission script:
> | >
> | > #$ -hard -l mem_free=2G
> | >
> | > for single-processor jobs, and
> | >
> | > #$ -hard -l mem_free=(2/NPROCS)G
> | >
> | > for a parallel job using NPROCS processors.
> | >
> | >
> | > All single-processor jobs run just fine, and so do many parallel jobs. But some parallel jobs, when the participating processors are spread across multiple compute nodes, keep waiting.
> | >
> | > When inspected with 'qstat -j JOB_ID', I notice that the job is looking for (2 * NPROCS)G of RAM in each compute node. How would I go about resolving this issue? If additional information is necessary from my end, please let me know.
> |
> | Can you please post the output of `qstat -j JOB_ID` for such a job?
> |
> | -- Reuti
> |
> |
> | >
> | > Thank you for your time and help.
> | >
> | > Best regards,
> | > g
> | >
> | > --
> | > Gowtham, PhD
> | > Director of Research Computing, IT
> | > Adj. Asst. Professor, Physics/ECE
> | > Michigan Technological University
> | >
> | > (906) 487/3593
> | > http://it.mtu.edu
> | > http://hpc.mtu.edu
> | >
> | > _______________________________________________
> | > users mailing list
> | > [email protected]
> | > https://gridengine.org/mailman/listinfo/users
> |
> | _______________________________________________
> | users mailing list
> | [email protected]
> | https://gridengine.org/mailman/listinfo/users
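For context, the per-node figure SGE reports comes from how consumables are
accounted for parallel jobs: the value given with -l mem_free is a per-slot
request, and on every host the scheduler multiplies it by the number of slots
granted there. A minimal sketch of the complex entry this setup relies on,
assuming the stock mem_free attribute was switched to consumable (the column
values below are illustrative, not taken from the cluster in question):

    # qconf -sc | grep mem_free
    #name      shortcut   type     relop   requestable  consumable  default  urgency
    mem_free   mf         MEMORY   <=      YES          YES         0        0

With that in place, the per-host capacity attached via "qconf -mattr exechost
complex_values mem_free=60G ..." is the pool that each host's running jobs are
debited against.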

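Accordingly, the value in the submission script should be what each slot
needs, not the job total; requesting the total means every participating host
must provide that amount for every slot it contributes. A hedged example for a
job wanting 2G per slot (the parallel environment name "mpi", the slot count,
and the mpirun line are placeholders, not taken from the original post):

    #!/bin/bash
    #$ -N memfree_demo
    #$ -pe mpi 16              # 16 slots, possibly spread over several hosts
    #$ -hard -l mem_free=2G    # per-slot request: a host granting 4 of the 16
                               # slots must have 4 * 2G = 8G of mem_free free

    mpirun -np $NSLOTS ./my_parallel_program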