Hello,

We have a very peculiar array job scheduling problem.

Cluster:  Very heterogeneous SGE cluster with different types of workstations.

Problem:  One array job can use more job slots than two array jobs together.

Example: User A submitted an array job, and 64 array tasks were running at the same time. Then user B submitted a similar array job, and after a while each of the two jobs was running with only approx. 28 array tasks. According to the users, the problem gets worse with each additional array job. Terminating one of the jobs immediately restores normal usage.

Question: Does anybody know whether the "job_load_adjustments" correction for array jobs depends only on the number of tasks, or whether the number of jobs is taken into account as well?

From my point of view, limits, quotas, etc. cannot be the cause of the problem, because then the total number of job slots used by two jobs could not be less than the number of job slots used by a single job. The scheduler configuration, though, was not designed for short array jobs but for longer-running parallel jobs on interactively used workstations.

Next I will try to observe the "artificial" load as a function of the number of array jobs and tasks, but maybe someone has a good explanation.

Many thanks!

Erik Soyez.


Some details:

[Queue]
load_thresholds       np_load_avg=0.75
suspend_thresholds    NONE
priority              10

[Scheduler]
algorithm                         default
schedule_interval                 0:0:15
maxujobs                          0
queue_sort_method                 seqno
job_load_adjustments              np_load_avg=0.90
load_adjustment_decay_time        0:2:30
load_formula                      np_load_avg
schedd_job_info                   true
flush_submit_sec                  2
flush_finish_sec                  2
params                            none
reprioritize_interval             0:0:0
halftime                          168
usage_weight_list                 cpu=1.000000,mem=0.000000,io=0.000000
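
For reference, a minimal Python sketch of the load correction as sched_conf(5) describes it: every dispatched job adds the job_load_adjustments value to the host's load, and that correction decays linearly to 0 over load_adjustment_decay_time. The constants are the values listed above. The function names, the per_task_scale parameter (a stand-in for any per-host normalization of np_load_avg, which I have not verified), and the ">=" comparison against the threshold are assumptions for illustration, not taken from the SGE source. The sketch does not answer whether the number of array jobs, rather than tasks, enters the calculation.

```python
# Values copied from the configuration above.
JOB_LOAD_ADJUSTMENT = 0.90   # job_load_adjustments       np_load_avg=0.90
DECAY_SECONDS = 150.0        # load_adjustment_decay_time 0:2:30
LOAD_THRESHOLD = 0.75        # load_thresholds            np_load_avg=0.75


def decayed_adjustment(age_seconds, per_task=JOB_LOAD_ADJUSTMENT,
                       decay=DECAY_SECONDS):
    """Correction still charged for one task dispatched age_seconds ago.

    Linear decay: the full value at dispatch time, 0 once the decay time
    has passed.
    """
    if age_seconds < 0 or age_seconds >= decay:
        return 0.0
    return per_task * (1.0 - age_seconds / decay)


def adjusted_load(reported_load, task_ages, per_task_scale=1.0):
    """Reported np_load_avg plus the corrections of recently started tasks.

    per_task_scale is a hypothetical knob (assumption): set it to
    1/number_of_cpus to model a per-processor normalization.
    """
    correction = sum(decayed_adjustment(age) for age in task_ages)
    return reported_load + per_task_scale * correction


def host_is_closed(reported_load, task_ages, per_task_scale=1.0):
    """True if the adjusted load reaches the queue's load threshold.

    The ">=" comparison is an assumption about how the threshold is applied.
    """
    load = adjusted_load(reported_load, task_ages, per_task_scale)
    return load >= LOAD_THRESHOLD


if __name__ == "__main__":
    # One task dispatched just now on a host reporting np_load_avg = 0.10.
    print(adjusted_load(0.10, [0.0]))
    print(host_is_closed(0.10, [0.0]))
    # The same situation if the correction were divided across 4 CPUs.
    print(host_is_closed(0.10, [0.0], per_task_scale=0.25))
```

Under this unscaled model, a single freshly dispatched task on an otherwise idle host already lifts the adjusted np_load_avg to 0.90, above the 0.75 threshold, and the correction only drops below 0.75 again after about 25 seconds. Whether the real scheduler behaves this way for array tasks is exactly what the observation described above would have to confirm.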



