[slurm-users] maximum size of array jobs

2019-02-26 Thread Marcus Wagner

Hello everyone,

I have another question ;)


Does anyone know why the number of array elements is limited to 1000 
by default?


We have one user who would like to have 100k array elements!

Which is harder on the scheduler: one array job with 100k 
elements or 100k non-array jobs?



Where did you set the limit? Do your users use array jobs at all?


Best
Marcus

--
Marcus Wagner, Dipl.-Inf.

IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wag...@itc.rwth-aachen.de
www.itc.rwth-aachen.de




Re: [slurm-users] maximum size of array jobs

2019-02-26 Thread Ole Holm Nielsen


Google is your friend :-)

https://slurm.schedmd.com/job_array.html


A new configuration parameter has been added to control the maximum job array 
size: MaxArraySize. The smallest index that can be specified by a user is zero 
and the maximum index is MaxArraySize minus one. The default value of 
MaxArraySize is 1001. The maximum MaxArraySize supported in Slurm is 4000001. 
Be mindful about the value of MaxArraySize as job arrays offer an easy way for 
users to submit large numbers of jobs very quickly.
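
As a rough illustration of the index rule quoted above (valid indices run from zero to MaxArraySize minus one), here is a minimal Python sketch. The function names are made up for this example and this is not Slurm's own code; it only restates the documented arithmetic.

```python
def max_valid_index(max_array_size: int) -> int:
    """Highest array index a user may request for a given MaxArraySize."""
    return max_array_size - 1


def index_range_allowed(first: int, last: int, max_array_size: int) -> bool:
    """True if every index in first..last lies within 0..MaxArraySize-1."""
    return 0 <= first <= last <= max_valid_index(max_array_size)
```

With the default MaxArraySize of 1001, a request such as --array=1-1000 fits, while --array=1-1001 does not.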


/Ole



Re: [slurm-users] maximum size of array jobs

2019-02-26 Thread Jeffrey Frey
Also see https://slurm.schedmd.com/slurm.conf.html for 
MaxArraySize/MaxJobCount.

We just went through a user-requested adjustment to MaxArraySize to bump it 
from 1000 to 1; as the documentation states, since each index of an array 
job is essentially "a job," you must be sure to also adjust MaxJobCount (from 
1 to 10 in our case).  Adjusting MaxJobCount requires a restart of 
slurmctld; though the documentation doesn't state it, so does adjustment of 
MaxArraySize (scontrol reconfigure will succeed but leave the previous limit in 
effect, see https://bugs.schedmd.com/show_bug.cgi?id=6553).

The name "MaxArraySize" is a bit of a misnomer, since it is really 1 + the top 
of the valid range of indices; "MaxArrayIndex" would be more apt.  Our users 
were very happy with Grid Engine's allowance of any index range and stride that 
produces no more than "max_aj_tasks" indices.  Since moving to Slurm they have 
at times been forced to come up with their own index-mapping functionality, and 
the relatively low MaxArraySize versus what we had in Grid Engine (75000) has 
been especially frustrating for them.
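
The kind of index mapping mentioned above could look roughly like the following Python sketch: each array task works through a contiguous chunk of a larger logical index range, so the number of array tasks stays under the site limit. SLURM_ARRAY_TASK_ID is the environment variable Slurm sets inside each array task; the function name, chunk size, and total count are illustrative assumptions, not anything taken from our users' scripts.

```python
import os


def logical_indices(task_id: int, chunk_size: int, total: int) -> range:
    """Logical work-item indices handled by one array task.

    Task 0 gets 0..chunk_size-1, task 1 the next chunk, and so on;
    the last chunk is truncated at `total`, and tasks past the end
    get an empty range.
    """
    start = task_id * chunk_size
    stop = min(start + chunk_size, total)
    return range(start, max(start, stop))


if __name__ == "__main__":
    # Default to task 0 so the sketch also runs outside a Slurm job.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    # Hypothetical numbers: 100000 work items in chunks of 100 need 1000 tasks.
    for item in logical_indices(task_id, chunk_size=100, total=100_000):
        pass  # replace with the real per-item work
```

Under these assumed numbers the job would be submitted with --array=0-999, which fits within the default MaxArraySize of 1001.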

So far the 1/10 combo hasn't come close to exhausting resources on our 
slurmctld nodes; but we haven't actually submitted a couple 1-index array 
jobs and enough other jobs to hit 10 active jobs, so current memory usage 
isn't an adequate measure of usage under load.  Since the slurm.conf 
documentation states:


Performance can suffer with more than a few hundred thousand jobs. 


we're reluctant to increase MaxJobCount too much higher.






::
Jeffrey T. Frey, Ph.D.
Systems Programmer V / HPC Management
Network & Systems Services / College of Engineering
University of Delaware, Newark DE  19716
Office: (302) 831-6034  Mobile: (302) 419-4976
::






Re: [slurm-users] maximum size of array jobs

2019-02-26 Thread Merlin Hartley
max_array_tasks
Specify the maximum number of tasks that can be included in a job array. The 
default limit is MaxArraySize, but this option can be used to set a lower 
limit. For example, max_array_tasks=1000 and MaxArraySize=100001 would permit a 
maximum task ID of 100000, but limit the number of tasks in any single job 
array to 1000.
https://slurm.schedmd.com/slurm.conf.html


SchedulerParameters=max_array_tasks=1000

MaxArraySize=10

See commit:
https://github.com/SchedMD/slurm/commit/09c13fb292a4a6a56b4078de840aae0d4db70309
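
To make the interplay of the two settings concrete, here is a small Python sketch of the documented semantics: MaxArraySize bounds the highest index, while max_array_tasks bounds how many tasks one array may contain. The function name and the numbers in the usage note are illustrative assumptions; this is not how slurmctld implements the check.

```python
def array_request_allowed(indices, max_array_size: int, max_array_tasks: int) -> bool:
    """Check a requested set of array indices against both limits.

    - every index must lie within 0..MaxArraySize-1
    - the number of tasks must not exceed max_array_tasks
    """
    indices = list(indices)
    if not indices:
        return False
    index_ok = min(indices) >= 0 and max(indices) <= max_array_size - 1
    count_ok = len(indices) <= max_array_tasks
    return index_ok and count_ok
```

For example, with a hypothetical MaxArraySize of 101 and max_array_tasks of 10, the strided request range(0, 101, 20) passes (six tasks, highest index 100), whereas range(0, 101) fails because it contains 101 tasks.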
 




--
Merlin Hartley
Computer Officer
MRC Mitochondrial Biology Unit
University of Cambridge
Cambridge, CB2 0XY
United Kingdom




Re: [slurm-users] maximum size of array jobs

2019-02-26 Thread Marcus Wagner

Hi Merlin,

thanks for the answer, but our user does not need a high index; what 
they actually need is 100k task IDs.



Best
Marcus





Re: [slurm-users] maximum size of array jobs

2019-02-26 Thread Marcus Wagner

Hi Jeffrey,


thanks for the hint regarding scontrol reconfig. That one drove me nuts 
again.
I changed it to MaxArraySize=10 and restarted slurmctld, since I had also 
changed some features of the nodes.


I soon realized that I could only submit --array=1-9, so I had 
already increased MaxArraySize to 11 myself and did an scontrol 
reconfig.


The behaviour was still the same. Now I know why :)


Best,
Marcus
