Hi,
I'm not sure what the issue is here; see my comments in-line.
On 5/11/2012 8:11 AM, iqtcub wrote:
Hi,

Following up on the thread with the same subject ( http://thread.gmane.org/gmane.comp.clustering.opengridengine.user/894/ ).

We're using SGE 6.2u5. Our setup is two machines (it's a testing cluster) with two cores each. If we do:
-qsub -q v20z.q -pe smp 1 script.sub
-wait until the job runs
-qsub -q v20z.q -pe smp 1 script.sub

Each job lands on a different node.
Is this not OK for you?
However, if we do:
-for i in 1 2; do qsub -q v20z.q -pe smp 1 script.sub; done

Then both jobs land on the same node.
Is this OK for you?
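A quick sanity check you could run (my guess, not verified on your cluster): the difference between the two cases may just be load-report timing. In the manual case the execd has reported updated load before the second submission; in the loop both jobs hit the same scheduling run. If that theory holds, inserting a delay longer than the default load report interval (load_report_time, 0:0:40) should make the loop behave like the manual submissions:

```shell
# Hypothetical timing test: sleep past the default load_report_time (0:0:40)
# between submissions, so the scheduler sees updated host state for job 2.
for i in 1 2; do
    qsub -q v20z.q -pe smp 1 script.sub
    sleep 45
done
```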

Our scheduling conf is as follows:

----------------------------
algorithm                         default
schedule_interval                 0:0:15
maxujobs                          0
queue_sort_method                 load
job_load_adjustments              NONE
load_adjustment_decay_time        0:7:30
load_formula                      slots
schedd_job_info                   true
flush_submit_sec                  0
flush_finish_sec                  0
params                            MONITOR=1
reprioritize_interval             0:0:0
halftime                          168
usage_weight_list                 cpu=1.000000,mem=0.000000,io=0.000000
compensation_factor               2.000000
weight_user                       0.250000
weight_project                    0.250000
weight_department                 0.250000
weight_job                        0.250000
weight_tickets_functional         0
weight_tickets_share              1000000
share_override_tickets            TRUE
share_functional_shares           FALSE
max_functional_jobs_to_schedule   200
report_pjob_tickets               TRUE
max_pending_tasks_per_job         50
halflife_decay_list               none
policy_hierarchy                  OS
weight_ticket                     0.890000
weight_waiting_time               0.000000
weight_deadline                   3600000.000000
weight_urgency                    0.100000
weight_priority                   0.010000
max_reservation                   50
default_duration                  9999:00:00
--------------------------------------------
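One alternative you could try (a sketch, assuming the goal is to fill one host before using the next): sort queue instances by sequence number instead of by load, which makes the host fill order explicit rather than dependent on reported load. The second hostname below is a placeholder, since only v20z-03 appears in your output; substitute your real one:

```
# In qconf -msconf, change:
queue_sort_method     seqno

# In qconf -mq v20z.q (v20z-04 is a placeholder for your second host):
seq_no                0,[v20z-03=1],[v20z-04=2]
```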

The smp PE config is:
pe_name            smp
slots              999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $pe_slots
control_slaves     FALSE
job_is_first_task  TRUE
urgency_slots      min
accounting_summary TRUE
-------------------------
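Note that allocation_rule $pe_slots only constrains the slots of a single job to one host; with -pe smp 1 it imposes no constraint at all, so host choice is decided entirely by the queue sort. If what you actually want is both cores on one node for a single job, requesting two slots does that directly (a sketch, reusing your existing queue and PE names):

```shell
# With allocation_rule $pe_slots, all slots of this one job must come
# from a single host, so both cores are guaranteed to be co-located.
qsub -q v20z.q -pe smp 2 script.sub
```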

The config on both nodes is like this:
hostname              v20z-03
load_scaling          NONE
complex_values        mem_free=7891.796875M,slots=2
load_values           arch=lx24-amd64,num_proc=2,mem_total=7935.984375M, \
                      swap_total=4095.992188M,virtual_total=12031.976562M, \
                      h_fsize=9.7G,load_avg=0.180000,load_short=0.080000, \
                      load_medium=0.180000,load_long=0.090000, \
                      mem_free=7830.246094M,swap_free=4095.992188M, \
                      virtual_free=11926.238281M,mem_used=105.738281M, \
                      swap_used=0.000000M,virtual_used=105.738281M, \
                      cpu=0.000000,m_topology=SCSC,m_topology_inuse=SCSC, \
                      m_socket=2,m_core=2,np_load_avg=0.090000, \
                      np_load_short=0.040000,np_load_medium=0.090000, \
                      np_load_long=0.045000
processors            2
user_lists            NONE
xuser_lists           NONE
projects              NONE
xprojects             NONE
usage_scaling         cpu=12.300000
report_variables      NONE
-------------------------------
The queue config:
qname                 v20z.q
hostlist              @v20z
seq_no                0
load_thresholds       np_load_avg=1.75
suspend_thresholds    NONE
nsuspend              1
suspend_interval      00:01:00
priority              0
min_cpu_interval      00:01:00
processors            UNDEFINED
qtype                 BATCH INTERACTIVE
ckpt_list             BLCR
pe_list               make smp
rerun                 FALSE
slots                 2
tmpdir                /scratch
shell                 /bin/csh
prolog                NONE
epilog                NONE
shell_start_mode      posix_compliant
starter_method        NONE
suspend_method        NONE
resume_method         NONE
terminate_method      NONE
notify                00:00:60
owner_list            NONE
user_lists            NONE
xuser_lists           NONE
subordinate_list      NONE
complex_values        split=2
projects              NONE
xprojects             NONE
calendar              NONE
initial_state         default
s_rt                  INFINITY
h_rt                  INFINITY
s_cpu                 INFINITY
h_cpu                 INFINITY
s_fsize               INFINITY
h_fsize               INFINITY
s_data                INFINITY
h_data                INFINITY
s_stack               INFINITY
h_stack               INFINITY
s_core                INFINITY
h_core                INFINITY
s_rss                 INFINITY
h_rss                 INFINITY
s_vmem                INFINITY
h_vmem                INFINITY

-----------------------------------

From what I understood, it's possible that this method is broken. Am I right?

I've also tried the scheduler configurations from the following links, with the same result:
http://article.gmane.org/gmane.comp.clustering.opengridengine.user/1037
http://wiki.gridengine.info/wiki/index.php/StephansBlog

Thanks in advance!
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
