Quoting Gianluca Castellani <[email protected]>:

Hello,
I have installed slurm-14.11.1 on Red Hat Enterprise Linux Server release
6.5 (Santiago);

I am trying to run a simple script such as:
#!/bin/bash -l
#SBATCH -p debug
#SBATCH -n 32
#SBATCH -o %j.out
#SBATCH -e %j.err

date

###########################

the job error file shows:
slurmstepd: couldn't do a strtol on str 1(1): Numerical result out of range
slurmstepd: couldn't do a strtol on str 2(2): Numerical result out of range
slurmstepd: couldn't do a strtol on str 3(3): Numerical result out of range
slurmstepd: couldn't do a strtol on str 4(4): Numerical result out of range
....

while the job output file shows:
Thu Dec 11 09:27:53 AST 2014

The job is still running...
19     debug   job.sh    R       4:56      1 ca098

It looks like this error comes either from:
slurm-14.11.1/src/plugins/proctrack/linuxproc/kill_tree.c:        ret_l = strtol(num, &endptr, 10);
slurm-14.11.1/src/plugins/proctrack/linuxproc/kill_tree.c:        error("couldn't do a strtol on str %s(%ld): %m",

or slurm-14.11.1/src/plugins/proctrack/pgid/proctrack_pgid.c

Do you have any suggestion?

Cheers,
Gianluca

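I would guess this is due to a stale errno value left over from somewhere else in Slurm. strtol() does not clear errno on success, so a leftover ERANGE makes the check fail even though the string was parsed correctly. Could you try the attached patch and let me know whether it fixes the problem?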
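
For illustration only, here is a minimal sketch of how a stale errno can trigger that message, together with the common idiom of clearing errno before calling strtol(). This is not what the attached patch does (the patch simply drops the errno test). The helper names parse_long() and naive_check_flags_error() are hypothetical and do not exist in Slurm.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper, not part of Slurm: mirrors the shape of the
 * unpatched check. Returns 1 if the check would report an error. */
static int naive_check_flags_error(const char *str)
{
	char *endptr;
	long val = strtol(str, &endptr, 10);

	/* errno is tested without having been cleared first, so a value
	 * left behind by unrelated code is mistaken for a strtol failure */
	return (val == LONG_MIN) || (val == LONG_MAX) || (errno == ERANGE);
}

/* Hypothetical helper, not part of Slurm: clears errno before the call.
 * Returns 0 and stores the value in *out on success, -1 on failure. */
static int parse_long(const char *str, long *out)
{
	char *endptr;
	long val;

	errno = 0;		/* discard any stale value */
	val = strtol(str, &endptr, 10);
	if (errno == ERANGE)	/* genuine overflow or underflow */
		return -1;
	if (endptr == str)	/* no digits were consumed */
		return -1;
	*out = val;
	return 0;
}
```

With errno preset to ERANGE by some earlier call, naive_check_flags_error("1") reports an error although "1" parses fine, which matches the "str 1(1)" lines in the job error file. parse_long("1", &v) succeeds under the same conditions because it resets errno first.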
--
Morris "Moe" Jette
CTO, SchedMD LLC
Commercial Slurm Development and Support
diff --git a/src/plugins/proctrack/linuxproc/kill_tree.c b/src/plugins/proctrack/linuxproc/kill_tree.c
index e45f13e..ea6f12d 100644
--- a/src/plugins/proctrack/linuxproc/kill_tree.c
+++ b/src/plugins/proctrack/linuxproc/kill_tree.c
@@ -171,8 +171,7 @@ static xppid_t **_build_hashtbl(void)
 		if ((num[0] < '0') || (num[0] > '9'))
 			continue;
 		ret_l = strtol(num, &endptr, 10);
-		if ((ret_l == LONG_MIN) || (ret_l == LONG_MAX) ||
-		    (errno == ERANGE)) {
+		if ((ret_l == LONG_MIN) || (ret_l == LONG_MAX)) {
 			error("couldn't do a strtol on str %s(%ld): %m",
 			      num, ret_l);
 			continue;
diff --git a/src/plugins/proctrack/pgid/proctrack_pgid.c b/src/plugins/proctrack/pgid/proctrack_pgid.c
index 28270be..f5c334b 100644
--- a/src/plugins/proctrack/pgid/proctrack_pgid.c
+++ b/src/plugins/proctrack/pgid/proctrack_pgid.c
@@ -218,8 +218,7 @@ proctrack_p_get_pids(uint64_t cont_id, pid_t **pids, int *npids)
 		if ((num[0] < '0') || (num[0] > '9'))
 			continue;
 		ret_l = strtol(num, &endptr, 10);
-		if ((ret_l == LONG_MIN) || (ret_l == LONG_MAX) ||
-		    (errno == ERANGE)) {
+		if ((ret_l == LONG_MIN) || (ret_l == LONG_MAX)) {
 			error("couldn't do a strtol on str %s(%ld): %m",
 			      num, ret_l);
 			continue;
