Fernando,

This may simply be by design.
When a job is queued, Maui reserves whatever resources the job needs that are 
currently idle.
So if a job needs 12 cores on a 16-core machine where an 8-core job is already 
running, it reserves the remaining 8 cores while it waits for the other 4 to 
be freed.
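
As a rough illustration of that arithmetic (a sketch only, not Maui's actual 
code; the function name and parameters are made up for this example):

```python
def reserve_idle_cores(node_cores, busy_cores, needed_cores):
    """Return (reserved_now, still_waiting_for) for a queued job on one node.

    Hypothetical helper: it only models the counting described above,
    not how Maui tracks reservations internally.
    """
    idle = node_cores - busy_cores          # cores nobody is using right now
    reserved = min(idle, needed_cores)      # hold as many as the job can use
    return reserved, needed_cores - reserved


# The 16-core node with an 8-core job running and a queued 12-core job:
reserved, waiting = reserve_idle_cores(node_cores=16, busy_cores=8, needed_cores=12)
print(reserved, waiting)
```

For the numbers in the example, the job holds 8 cores and waits for 4 more.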

Now if another 8-core job is submitted, it has to wait, unless it can run on 
those 8 reserved cores and finish before the running 8-core job completes. In 
that case Maui can 'squeeze' it in (backfill) without delaying the start time 
of the 12-core job.

So you may be seeing resources held in reserve for the 12-core job because the 
smaller jobs could not run without pushing back its earliest start time.
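
The backfill decision can be sketched roughly like this. It is a simplified 
model, not Maui's implementation: the function and its parameters are 
hypothetical, and it assumes the scheduler uses each job's requested walltime 
as its runtime estimate.

```python
def can_backfill(job_cores, job_walltime_h, reserved_idle_cores, hours_until_reservation):
    """Decide whether a small job may run on cores reserved for a bigger job.

    Hypothetical sketch: the small job must fit on the idle reserved cores
    and must be expected to finish before the reservation is due to start.
    """
    fits = job_cores <= reserved_idle_cores
    finishes_in_time = job_walltime_h <= hours_until_reservation
    return fits and finishes_in_time


# 8 cores are held for the 12-core job, which can start in about 2 hours.
print(can_backfill(8, 1.0, 8, 2.0))     # 1-hour job
print(can_backfill(8, 4800.0, 8, 2.0))  # job left at a 4800-hour walltime
```

Under these assumptions the 1-hour job is allowed in and the 4800-hour job is 
not. If jobs are submitted without a walltime and inherit the 4800:00:00 
default from the queue definition quoted below, they would probably never 
qualify for backfill, so requesting realistic walltimes may help.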


Brian Andrus
ITACS/Research Computing
Naval Postgraduate School
Monterey, California
voice: 831-656-6238



From: mauiusers-boun...@supercluster.org 
[mailto:mauiusers-boun...@supercluster.org] On Behalf Of Fernando Caba
Sent: Tuesday, June 23, 2015 4:04 PM
To: Torque Users Mailing List; mauiusers
Subject: [Mauiusers] Queueing jobs in inappropriate order


Hi all, in my cluster the users run jobs on a single node with different 
numbers of processors (nodes=1:ppn=4, 8 or 12).

For some reason, jobs are queued even though resources are available. For 
example, a job requiring 12 cores is queued while several nodes have 8 cores 
free (we have 8 nodes and each node has 12 cores).

If the users submit new jobs with 4 or 8 cores, those jobs don't run; they are 
queued in spite of the available resources.

Here is my maui.cfg:


# maui.cfg 3.3.1

SERVERHOST            fe

# primary admin must be first in list
ADMIN1                root

# Resource Manager Definition

RMCFG[FE] TYPE=PBS

# Allocation Manager Definition

AMCFG[bank]  TYPE=NONE

# full parameter docs at http://supercluster.org/mauidocs/a.fparameters.html
# use the 'schedctl -l' command to display current configuration

RMPOLLINTERVAL        00:00:30

SERVERPORT            42559
SERVERMODE            NORMAL

# Admin: http://supercluster.org/mauidocs/a.esecurity.html


LOGFILE               maui.log
LOGFILEMAXSIZE        10000000
LOGLEVEL              3

# Job Priority: http://supercluster.org/mauidocs/5.1jobprioritization.html

QUEUETIMEWEIGHT       1

# FairShare: http://supercluster.org/mauidocs/6.3fairshare.html

#FSPOLICY              PSDEDICATED
#FSDEPTH               7
#FSINTERVAL            86400
#FSDECAY               0.80

# Throttling Policies: http://supercluster.org/mauidocs/6.2throttlingpolicies.html

# NONE SPECIFIED

# Backfill: http://supercluster.org/mauidocs/8.2backfill.html

BACKFILLPOLICY        FIRSTFIT
RESERVATIONPOLICY     CURRENTHIGHEST

# Node Allocation: http://supercluster.org/mauidocs/5.2nodeallocation.html

NODEALLOCATIONPOLICY  MINRESOURCE
#NODEALLOCATIONPOLICY   FIRSTAVAILABLE

# QOS: http://supercluster.org/mauidocs/7.3qos.html

# QOSCFG[hi]  PRIORITY=100 XFTARGET=100 FLAGS=PREEMPTOR:IGNMAXJOB
# QOSCFG[low] PRIORITY=-1000 FLAGS=PREEMPTEE

# Standing Reservations: http://supercluster.org/mauidocs/7.1.3standingreservations.html

# SRSTARTTIME[test] 8:00:00
# SRENDTIME[test]   17:00:00
# SRDAYS[test]      MON TUE WED THU FRI
# SRTASKCOUNT[test] 20
# SRMAXTIME[test]   0:30:00

# Creds: http://supercluster.org/mauidocs/6.1fairnessoverview.html

# USERCFG[DEFAULT]      FSTARGET=25.0
# USERCFG[john]         PRIORITY=100  FSTARGET=10.0-
# GROUPCFG[staff]       PRIORITY=1000 QLIST=hi:low QDEF=hi
# CLASSCFG[batch]       FLAGS=PREEMPTEE
# CLASSCFG[interactive] FLAGS=PREEMPTOR
CLASSCFG[batch] MAXPROCPERUSER=12

JOBNODEMATCHPOLICY EXACTPROC
#JOBNODEMATCHPOLICY      EXACTNODE


and here is my torque configuration:


#
# Create queues and set their attributes.
#
#
# Create and define queue batch
#
create queue batch
set queue batch queue_type = Execution
set queue batch resources_default.nodes = 8
set queue batch resources_default.walltime = 4800:00:00
set queue batch enabled = True
set queue batch started = True
#
# Set server attributes.
#
set server scheduling = True
set server acl_hosts = fe
set server managers = root@fe
set server operators = root@fe
set server default_queue = batch
set server log_events = 511
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
set server log_level = 7
set server mom_job_sync = True
set server keep_completed = 300
set server auto_node_np = True
set server next_job_number = 10422
set server record_job_info = True
set server record_job_script = True

So I was thinking about creating separate queues: one for 4-core jobs, another 
for 8-core jobs and another for 12-core jobs. Is this a reasonable policy, 
forcing the exact number of cores per job in each corresponding queue (4, 8 or 
12 cores per job)?

Thanks in advance!!

Fernando

--

Universidad Nacional del Sur <http://www.uns.edu.ar>

Mg. Fernando Caba
Director General de Telecomunicaciones
Avda. Alem 1253, (B8000CPB) Bahía Blanca - Argentina
Tel/Fax: (54)-291-4595166
Tel: (54)-291-4595101 int. 2050
http://www.dgt.uns.edu.ar


_______________________________________________
mauiusers mailing list
mauiusers@supercluster.org
http://www.supercluster.org/mailman/listinfo/mauiusers
