Yes, srun just hangs. Commands like sinfo and squeue run fine.
Oddly, I also have no Slurm logs in /var/log.
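
For reference, the log locations are controlled by SlurmctldLogFile and SlurmdLogFile in slurm.conf; when these are unset the daemons log via syslog, so missing files under /var/log do not necessarily mean nothing was logged. A minimal sketch for checking the configured paths, assuming `scontrol show config` responds on this cluster (the helper name `slurm_log_paths` is made up for illustration):

```shell
# Hypothetical helper: filter the log-file settings out of
# `scontrol show config` output supplied on stdin.
slurm_log_paths() {
    grep -Ei '^(SlurmctldLogFile|SlurmdLogFile)[[:space:]]*='
}

# Usage on a live cluster (needs a reachable slurmctld):
#   scontrol show config | slurm_log_paths
```

If a path is shown, the slurmd log on the compute node that was allocated to the job is probably the first place to look for the failed task launch.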

From: slurm-users On Behalf Of John Hearns
Sent: Tuesday, July 17, 2018 8:57 AM
To: Slurm User Community List
Subject: Re: [slurm-users] 'srun hostname' hangs on the command line

Ronan, sorry to ask but this is a bit unclear.

Are you unable to launch ANY sessions with srun?
In which case you need to look at the logs to see why the job is not being 

Is it only the hostname command which fails?

I would guess you have already logged in to a node over ssh and run the 
hostname command manually.

On 17 July 2018 at 09:50, Buckley, Ronan wrote:
Yes I do.

From: slurm-users On Behalf Of Williams, Gareth (IM&T, Clayton)
Sent: Tuesday, July 17, 2018 12:33 AM
To: Slurm User Community List
Subject: Re: [slurm-users] 'srun hostname' hangs on the command line

Do you get the same problem as a non-root user?

From: slurm-users On Behalf Of Buckley, Ronan
Sent: Tuesday, 17 July 2018 12:53 AM
Subject: [slurm-users] 'srun hostname' hangs on the command line

Hi All,

Verbose mode doesn’t show much.
I have replaced the hostnames with ####### in the output below.
Any ideas or suggestions?

# srun hostname
^Csrun: interrupt (one more within 1 sec to abort)
srun: task 0: unknown
[1]+  Stopped                 srun hostname

# srun -v hostname
srun: defined options for program `srun'
srun: --------------- ---------------------
srun: user           : `root'
srun: uid            : 0
srun: gid            : 0
srun: cwd            : /root
srun: ntasks         : 1 (default)
srun: nodes          : 1 (default)
srun: jobid          : 4294967294 (default)
srun: partition      : default
srun: profile        : `NotSet'
srun: job name       : `(null)'
srun: reservation    : `(null)'
srun: burst_buffer   : `(null)'
srun: wckey          : `(null)'
srun: cpu_freq_min   : 4294967294
srun: cpu_freq_max   : 4294967294
srun: cpu_freq_gov   : 4294967294
srun: switches       : -1
srun: wait-for-switches : -1
srun: distribution   : unknown
srun: cpu_bind       : default (0)
srun: mem_bind       : default (0)
srun: verbose        : 1
srun: slurmd_debug   : 0
srun: immediate      : false
srun: label output   : false
srun: unbuffered IO  : false
srun: overcommit     : false
srun: threads        : 60
srun: checkpoint_dir : /var/slurm/checkpoint
srun: wait           : 0
srun: nice           : -2
srun: account        : (null)
srun: comment        : (null)
srun: dependency     : (null)
srun: exclusive      : false
srun: bcast          : false
srun: qos            : (null)
srun: constraints    :
srun: geometry       : (null)
srun: reboot         : yes
srun: rotate         : no
srun: preserve_env   : false
srun: network        : (null)
srun: propagate      : NONE
srun: prolog         : (null)
srun: epilog         : (null)
srun: mail_type      : NONE
srun: mail_user      : (null)
srun: task_prolog    : (null)
srun: task_epilog    : (null)
srun: multi_prog     : no
srun: sockets-per-node  : -2
srun: cores-per-socket  : -2
srun: threads-per-core  : -2
srun: ntasks-per-node   : -2
srun: ntasks-per-socket : -2
srun: ntasks-per-core   : -2
srun: plane_size        : 4294967294
srun: core-spec         : NA
srun: power             :
srun: remote command    : `hostname'
srun: Waiting for nodes to boot (delay looping 450 times @ 0.100000 secs x 
srun: Nodes ####### are ready for job
srun: jobid 50871: nodes(1):`#######', cpu counts: 64(x1)
srun: launching 50871.0 on host #######, 1 tasks: 0
srun: route default plugin loaded
srun: error: timeout waiting for task launch, started 0 of 1 tasks
srun: Job step 50871.0 aborted before step completely launched.
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
srun: error: Timed out waiting for job step to complete

