Re: [slurm-users] How to get command from a finished job

2020-04-30 Thread Luis Huang
We use the elasticsearch plugin. This information is kept in there.

From: slurm-users on behalf of Gestió Servidors
Sent: Thursday, April 30, 2020 3:39:33 AM
To: slurm-users@lists.schedmd.com
Subject: [slurm-users] How to get command from a finished job

Hello,

I would like to know whether there is any way to get the same information I
can get for a running or pending job with “scontrol show jobid=” once the job
has finished. When it has finished, “scontrol show jobid=” doesn’t work and
“sacct -j jobid” doesn’t show all the information I need. For example, with
“scontrol show jobid” I can see which command was submitted, its workdir, and
the stderr and stdout files. I think this information cannot be obtained with
“sacct” once the job has finished.

Thanks.



Re: [slurm-users] How to get command from a finished job

2020-04-30 Thread Ole Holm Nielsen

On 30-04-2020 15:34, Bjørn-Helge Mevik wrote:
> Gestió Servidors writes:
>
>> For example, with "scontrol show jobid" I can see which command was
>> submitted, its workdir, and the stderr and stdout files. I think this
>> information cannot be obtained with "sacct" once the job has finished.
>
> The workdir is available with sacct, IIRC.  For other types of
> information, I believe you can add code to your job_submit.lua that stores
> it in the job's AdminComment field, which sacct can display.


Yes, the command to print only the workdir is:

sacct -j $jobid -nP -o WorkDir

I have added this to my "showjob" command:
https://github.com/OleHolmNielsen/Slurm_tools/tree/master/jobs

The other fields Command, StdErr, StdIn, StdOut are apparently not in 
the Slurm database, see "man sacct".
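
As a rough sketch of how the query above might be wrapped for reuse, the
shell function below prints the working directory recorded for a finished
job. The function name job_workdir is hypothetical. It uses the sacct options
already shown, plus -X, which limits output to the job allocation so that job
steps do not add extra lines. It assumes accounting storage is enabled.

```shell
# Hypothetical helper: print the working directory recorded for a finished job.
# Assumes Slurm accounting is enabled so that sacct can still see the job.
job_workdir() {
    jobid=$1
    if [ -z "$jobid" ]; then
        echo "usage: job_workdir JOBID" >&2
        return 2
    fi
    # -n: no header, -P: parsable output, -X: allocation only (skip job steps)
    sacct -j "$jobid" -X -n -P -o WorkDir | head -n 1
}
```

After sourcing the function, it would be called as: job_workdir 1234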


/Ole



Re: [slurm-users] How to get command from a finished job

2020-04-30 Thread Bjørn-Helge Mevik
Gestió Servidors writes:

> For example, with "scontrol show jobid" I can see which command was
> submitted, its workdir, and the stderr and stdout files. I think this
> information cannot be obtained with "sacct" once the job has finished.

The workdir is available with sacct, IIRC.  For other types of
information, I believe you can add code to your job_submit.lua that stores
it in the job's AdminComment field, which sacct can display.
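
On the retrieval side, a sketch like the following could read the field back
once the job has finished. It assumes the site's job_submit.lua (not shown
here) already stores something useful in AdminComment. The function name
job_admin_comment is hypothetical; AdminComment is a field name accepted by
"sacct -o".

```shell
# Hypothetical helper: print whatever was stored in a job's AdminComment field.
# What the field contains depends entirely on the site's job_submit.lua.
job_admin_comment() {
    jobid=$1
    if [ -z "$jobid" ]; then
        echo "usage: job_admin_comment JOBID" >&2
        return 2
    fi
    # -P separates fields with "|"; keep everything after the first separator
    # so that a comment containing "|" is not truncated.
    sacct -j "$jobid" -X -n -P -o JobID,AdminComment | cut -d'|' -f2-
}
```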

-- 
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo




Re: [slurm-users] How to get command from a finished job

2020-04-30 Thread Luca Capello
Hi there,

On 4/30/20 2:52 PM, Paul Edmon wrote:
> No, that data is purged from the scheduler after completion.  So records of 
> the job exist in your job completion log or in the sacct database.  The 
> script that it ran is not saved, though I believe there are several bug 
> requests in to SchedMD to add that feature.  People have come up with various 
> home grown solutions to save that data.

For example, SArchive was presented at this year's FOSDEM:

  

Thx, bye,
Luca

-- 
Dr. Luca Capello
Ingénieur HPC
Division du Système et des Technologies de l'Information et de la Communication
Université de Genève | 24 rue Général-Dufour
Tél +41 22 379 72 42 | Bureau 151
https://hpc-community.unige.ch
mailto:luca.cape...@unige.ch





Re: [slurm-users] How to get command from a finished job

2020-04-30 Thread Paul Edmon
No, that data is purged from the scheduler after completion.  So records 
of the job exist in your job completion log or in the sacct database.  
The script that it ran is not saved, though I believe there are several 
bug requests in to SchedMD to add that feature.  People have come up 
with various home grown solutions to save that data.
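
One home-grown approach, sketched here under assumptions, is to capture the
record while the scheduler still holds it, for instance from the first lines
of the batch script itself. The function name save_job_record and the archive
directory are hypothetical. "scontrol show job" and "scontrol write
batch_script" are existing scontrol subcommands, and SLURM_JOB_ID is set by
Slurm inside a running job. Whether a given user may write out the batch
script depends on the site's permissions.

```shell
# Hypothetical helper: archive the scheduler's view of the current job.
# Intended to be called from inside a running batch job.
save_job_record() {
    archive_dir=$1
    if [ -z "$archive_dir" ] || [ -z "$SLURM_JOB_ID" ]; then
        echo "usage: save_job_record DIR (inside a Slurm job)" >&2
        return 2
    fi
    mkdir -p "$archive_dir" || return 1
    # Full job record, including Command, WorkDir, StdErr and StdOut
    scontrol show job "$SLURM_JOB_ID" > "$archive_dir/$SLURM_JOB_ID.info" || return 1
    # Copy of the batch script as submitted
    scontrol write batch_script "$SLURM_JOB_ID" "$archive_dir/$SLURM_JOB_ID.sh"
}
```
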


You could always increase the length of time the scheduler keeps that 
data after completion by increasing MinJobAge:


*MinJobAge*
   The minimum age of a completed job before its record is purged from
   Slurm's active database. Set the values of *MaxJobCount* and
   *MinJobAge* to ensure the slurmctld daemon does not exhaust its memory
   or other resources. The default value is 300 seconds. A value of zero
   prevents any job record purging. Jobs are not purged during a
   backfill cycle, so it can take longer than MinJobAge seconds to
   purge a job if using the backfill scheduling plugin. In order to
   eliminate some possible race conditions, the recommended minimum
   non-zero value for *MinJobAge* is 2.
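
To see what a cluster currently uses before changing anything, a small check
along these lines may help. It is only a sketch: the function name
current_min_job_age is hypothetical, and it assumes "scontrol show config"
prints the setting on a line of the form "MinJobAge = 300 sec".

```shell
# Hypothetical helper: print the MinJobAge value (in seconds) that the
# controller reports. Assumes output lines look like "MinJobAge = 300 sec".
current_min_job_age() {
    scontrol show config | awk '$1 == "MinJobAge" { print $3 }'
}

# Raising the value is done in slurm.conf, for example (value is illustrative):
#   MinJobAge=3600
```
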


-Paul Edmon-

On 4/30/2020 3:39 AM, Gestió Servidors wrote:

> Hello,
>
> I would like to know whether there is any way to get the same information
> I can get for a running or pending job with “scontrol show jobid=” once
> the job has finished. When it has finished, “scontrol show jobid=” doesn’t
> work and “sacct -j jobid” doesn’t show all the information I need. For
> example, with “scontrol show jobid” I can see which command was submitted,
> its workdir, and the stderr and stdout files. I think this information
> cannot be obtained with “sacct” once the job has finished.
>
> Thanks.