Re: [slurm-dev] Re: How to account how many cpus/gpus per node has
   been allocated to a specific job?

   You should be able to do this with profiling data:

   [1]https://slurm.schedmd.com/hdf5_profile_user_guide.html


   Just use the jobacct_gather plugin.


   This is very probably overkill for your requirements, though:
   'scontrol -dd show job' should be enough if real-time data suffices,
   since its detailed output reports the allocation on each node.

   [2]https://slurm.schedmd.com/scontrol.html#OPT_details_1
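
   For what it's worth, here is a rough sketch of pulling the per-node
   numbers out of that output. The sample text, the node names and the
   helper names (count_ids, per_node_allocation) are made up for
   illustration, and the exact layout of the "Nodes=... CPU_IDs=..."
   lines differs between Slurm versions, so treat the field names as an
   assumption and check them against your own 'scontrol -dd show job'
   output first.

```python
# Illustrative sample only: a hypothetical job on hypothetical nodes.
# The real layout of these detail lines depends on the Slurm version.
SAMPLE = """\
JobId=1234 JobName=wrap
   NumNodes=2 NumCPUs=3 NumTasks=2
     Nodes=node01 CPU_IDs=0 Mem=4096 GRES_IDX=mic(IDX:0)
     Nodes=node02 CPU_IDs=4-5 Mem=4096 GRES_IDX=mic(IDX:1)
"""


def count_ids(spec):
    """Count the IDs in a range list such as '0-3,8'."""
    total = 0
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        total += int(hi or lo) - int(lo) + 1
    return total


def per_node_allocation(text):
    """Return (nodes, cpu_count, gres) for each per-node detail line.

    'nodes' may be a hostlist expression (e.g. node[01-02]) when several
    nodes share the same layout; it is returned unexpanded.
    """
    rows = []
    for line in text.splitlines():
        # Split the line into KEY=VALUE tokens and ignore anything else.
        fields = dict(tok.split("=", 1) for tok in line.split() if "=" in tok)
        if "Nodes" in fields and "CPU_IDs" in fields:
            rows.append(
                (fields["Nodes"],
                 count_ids(fields["CPU_IDs"]),
                 fields.get("GRES_IDX", ""))
            )
    return rows


for nodes, cpus, gres in per_node_allocation(SAMPLE):
    print(nodes, cpus, gres)
```

   In practice you would feed it the text captured from the scontrol
   command instead of SAMPLE. This only works while the job is still
   known to the controller, which is the "real time" caveat above.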


   On 11/09/2016 04:05 AM, Ran Du wrote:

   Dear Chris,
   Glad to receive your quick reply. I went through the Slurm accounting
   database; in the job_table there are only the total allocated numbers,
   the same as what we get from the sacct command.
   However, the scheduler must have information about the number
   allocated on each node separately, or it could not track how many
   resources are left on each node. The question is whether Slurm keeps
   these per-node numbers in files (e.g. log files or the database) or
   only in memory. I am going to read the other docs and man pages to
   see if there is any lead.
   I think you are right about the GRES allocation; let's wait for
   confirmation from the SchedMD developers.
   Thanks again for your kind help.
   Best regards,
   Ran

   On Wed, Nov 9, 2016 at 9:31 AM, Christopher Samuel
   <[3][email protected]> wrote:


     On 09/11/16 12:15, Ran Du wrote:

     > Thanks a lot for your reply. However, it's not what I want to
     > get. For the example of Job 6449483, it is allocated with only
     > one node, what if it was allocated with multiple nodes? I'd like
     > to get the accounting statistics about how many CPUs/GPUs
     > separately on each node, but not the sum number on all nodes.

     Oh sorry, that's my fault, I completely misread what you were after
     and managed to invert your request!

     I don't know if that information is included in the accounting data.

     I believe the allocation is uniform across the nodes, for instance:

     $ sbatch --gres=mic:1 --mem=4g --nodes=2 --wrap /bin/true

     resulted in:

     $ sacct -j 6449484 -o jobid%20,jobname,alloctres%20,allocnodes,allocgres
     JobID                JobName    AllocTRES            AllocNodes AllocGRES
     -------------------- ---------- -------------------- ---------- ------------
     6449484              wrap       cpu=2,mem=8G,node=2  2          mic:2
     6449484.batch        batch      cpu=1,mem=4G,node=1  1          mic:2
     6449484.extern       extern     cpu=2,mem=8G,node=2  2          mic:2

     The only oddity there is that the batch step is of course
     only on the first node, but it says it was allocated 2 GRES.
     I suspect that's just a symptom of Slurm only keeping a total
     number.

     I don't think Slurm can give you an uneven GRES allocation, but
     the SchedMD folks would need to confirm that, I'm afraid.

     All the best,
     Chris
     --
     Christopher Samuel Senior Systems Administrator
     VLSCI - Victorian Life Sciences Computation Initiative
     Email: [4][email protected]
     Phone: [5]+61 (0)3 903 55545
     [6]http://www.vlsci.org.au/
     [7]http://twitter.com/vlsci

   



   [1] https://slurm.schedmd.com/hdf5_profile_user_guide.html
   [2] https://slurm.schedmd.com/scontrol.html#OPT_details_1
   [3] mailto:[email protected]
   [4] mailto:[email protected]
   [5] tel:%2B61%20%280%293%20903%2055545
   [6] http://www.vlsci.org.au/
   [7] http://twitter.com/vlsci

