Quoting Gianluca Castellani <[email protected]>:
Hello,
I have installed slurm-14.11.1 on Red Hat Enterprise Linux Server release
6.5 (Santiago);
I am trying to run a simple script such as:
#!/bin/bash -l
#SBATCH -p debug
#SBATCH -n 32
#SBATCH -o %j.out
#SBATCH -e %j.err
date
###########################
the job error file shows:
slurmstepd: couldn't do a strtol on str 1(1): Numerical result out of range
slurmstepd: couldn't do a strtol on str 2(2): Numerical result out of range
slurmstepd: couldn't do a strtol on str 3(3): Numerical result out of range
slurmstepd: couldn't do a strtol on str 4(4): Numerical result out of range
....
while the job output file shows:
Thu Dec 11 09:27:53 AST 2014
The job is still running...
19 debug job.sh R 4:56 1 ca098
It looks like this error comes either from:
slurm-14.11.1/src/plugins/proctrack/linuxproc/kill_tree.c: ret_l = strtol(num, &endptr, 10);
slurm-14.11.1/src/plugins/proctrack/linuxproc/kill_tree.c: error("couldn't do a strtol on str %s(%ld): %m",
or slurm-14.11.1/src/plugins/proctrack/pgid/proctrack_pgid.c
Do you have any suggestion?
Cheers,
Gianluca
I would guess this is due to a stale errno value left over from
somewhere else in Slurm, since strtol() does not reset errno on
success. Could you try the attached patch and let me know if that
fixes the problem?
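
For reference, the usual way to rule out a stale errno is to clear it
immediately before calling strtol() and only then test for ERANGE. The
attached patch takes a different route and simply drops the errno test.
A minimal sketch of the clear-then-check approach, assuming a
hypothetical helper parse_pid_name() that is not part of Slurm:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not Slurm code): parse a decimal /proc entry
 * name such as "1234" into a long.
 * Returns 0 on success and stores the value in *out, -1 on failure.
 */
static int parse_pid_name(const char *num, long *out)
{
	char *endptr;
	long ret_l;

	errno = 0;	/* discard any value left by earlier calls */
	ret_l = strtol(num, &endptr, 10);
	if (errno == ERANGE)	/* genuine overflow or underflow */
		return -1;
	if (endptr == num || *endptr != '\0')	/* no digits, or trailing junk */
		return -1;
	*out = ret_l;
	return 0;
}
```

With errno cleared first, an input like "1" parses cleanly even if an
earlier call left errno set to ERANGE, while a string too large for a
long is still rejected.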
--
Morris "Moe" Jette
CTO, SchedMD LLC
Commercial Slurm Development and Support
diff --git a/src/plugins/proctrack/linuxproc/kill_tree.c b/src/plugins/proctrack/linuxproc/kill_tree.c
index e45f13e..ea6f12d 100644
--- a/src/plugins/proctrack/linuxproc/kill_tree.c
+++ b/src/plugins/proctrack/linuxproc/kill_tree.c
@@ -171,8 +171,7 @@ static xppid_t **_build_hashtbl(void)
 		if ((num[0] < '0') || (num[0] > '9'))
 			continue;
 		ret_l = strtol(num, &endptr, 10);
-		if ((ret_l == LONG_MIN) || (ret_l == LONG_MAX) ||
-		    (errno == ERANGE)) {
+		if ((ret_l == LONG_MIN) || (ret_l == LONG_MAX)) {
 			error("couldn't do a strtol on str %s(%ld): %m",
 			      num, ret_l);
 			continue;
diff --git a/src/plugins/proctrack/pgid/proctrack_pgid.c b/src/plugins/proctrack/pgid/proctrack_pgid.c
index 28270be..f5c334b 100644
--- a/src/plugins/proctrack/pgid/proctrack_pgid.c
+++ b/src/plugins/proctrack/pgid/proctrack_pgid.c
@@ -218,8 +218,7 @@ proctrack_p_get_pids(uint64_t cont_id, pid_t **pids, int *npids)
 		if ((num[0] < '0') || (num[0] > '9'))
 			continue;
 		ret_l = strtol(num, &endptr, 10);
-		if ((ret_l == LONG_MIN) || (ret_l == LONG_MAX) ||
-		    (errno == ERANGE)) {
+		if ((ret_l == LONG_MIN) || (ret_l == LONG_MAX)) {
 			error("couldn't do a strtol on str %s(%ld): %m",
 			      num, ret_l);
 			continue;