> On 06.12.2018 at 16:59, Dimar Jaime González Soto 
> <dimar.gonzalez.s...@gmail.com> wrote:
> 
> qconf -sconf shows:
> 
> #global:
> execd_spool_dir              /var/spool/gridengine/execd
> ...
> max_aj_tasks                 75000

So, this is fine too. Next place to check: is the overall number of slots limited:

$ qconf -se global

especially the line "complex_values".
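
A global cap would show up there like this (hypothetical values, just to 
illustrate the format):

complex_values        slots=16

This would limit the whole cluster to 16 concurrent slots in total.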

And next: any RQS (resource quota sets)?

$ qconf -srqs
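
If one exists, a rule like the following would cap the slots per host (a 
hypothetical example of how such a rule typically looks, not your actual 
configuration):

{
   name         limit_slots_per_host
   description  "cap slots on every host"
   enabled      TRUE
   limit        hosts {*} to slots=4
}

A rule of this form would explain seeing exactly four running tasks per node.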

-- Reuti


> On Thu., Dec 6, 2018 at 12:55, Reuti (<re...@staff.uni-marburg.de>) 
> wrote:
> 
> > On 06.12.2018 at 15:19, Dimar Jaime González Soto 
> > <dimar.gonzalez.s...@gmail.com> wrote:
> > 
> > qconf -se ubuntu-node2:
> >  
> > hostname              ubuntu-node2
> > load_scaling          NONE
> > complex_values        NONE
> > load_values           arch=lx26-amd64,num_proc=16,mem_total=48201.960938M, \
> >                       swap_total=95746.996094M,virtual_total=143948.957031M, \
> >                       load_avg=3.740000,load_short=4.000000, \
> >                       load_medium=3.740000,load_long=2.360000, \
> >                       mem_free=47376.683594M,swap_free=95746.996094M, \
> 
> Although it's unrelated to the main issue: the swap size can be limited to 2 
> GB nowadays (which is the default in openSUSE). Red Hat suggests a little 
> more, e.g. here:
> 
> https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/storage_administration_guide/ch-swapspace
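> 
> For reference, the current swap setup can be checked on each node with 
> standard tools (a generic check, nothing Grid Engine specific):
> 
> $ free -h
> $ swapon -s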
> 
> 
> 
> >                       virtual_free=143123.679688M,mem_used=825.277344M, \
> >                       swap_used=0.000000M,virtual_used=825.277344M, \
> >                       cpu=25.000000,m_topology=NONE,m_topology_inuse=NONE, \
> >                       m_socket=0,m_core=0,np_load_avg=0.233750, \
> >                       np_load_short=0.250000,np_load_medium=0.233750, \
> >                       np_load_long=0.147500
> > processors            16
> > user_lists            NONE
> > xuser_lists           NONE
> > projects              NONE
> > xprojects             NONE
> > usage_scaling         NONE
> > report_variables      NONE
> > 
> > On Thu., Dec 6, 2018 at 11:17, Dimar Jaime González Soto 
> > (<dimar.gonzalez.s...@gmail.com>) wrote:
> > qhost:
> > 
> > HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
> > -------------------------------------------------------------------------------
> > global                  -               -     -       -       -       -       -
> > ubuntu-frontend         lx26-amd64     16  4.13   31.4G    1.2G     0.0     0.0
> > ubuntu-node11           lx26-amd64     16  4.55   47.1G  397.5M   93.5G     0.0
> > ubuntu-node12           lx26-amd64     16  3.64   47.1G    1.0G   93.5G     0.0
> > ubuntu-node13           lx26-amd64     16  4.54   47.1G  399.9M   93.5G     0.0
> > ubuntu-node2            lx26-amd64     16  3.67   47.1G  818.5M   93.5G     0.0
> 
> This looks fine. So we have other settings to investigate:
> 
> $ qconf -sconf
> #global:
> execd_spool_dir              /var/spool/sge
> ...
> max_aj_tasks                 75000
> 
> Is max_aj_tasks limited in your setup?
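> 
> If it is, it can be raised in the global configuration (a sketch; qconf 
> opens the configuration in an editor, 75000 being the shipped default):
> 
> $ qconf -mconf
> ...
> max_aj_tasks                 75000
> 
> A value of 0 should mean an unlimited number of array tasks here.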
> 
> 
> 
> -- Reuti
> 
> 
> > 
> > On Thu., Dec 6, 2018 at 11:13, Reuti (<re...@staff.uni-marburg.de>) 
> > wrote:
> > 
> > > On 06.12.2018 at 15:07, Dimar Jaime González Soto 
> > > <dimar.gonzalez.s...@gmail.com> wrote:
> > > 
> > > qalter -w p doesn't show anything; qstat shows 16 processes and not 60:
> > > 
> > >     250 0.50000 OMA        cbuach       r     12/06/2018 11:04:15 main.q@ubuntu-node2                1 1
> > >     250 0.50000 OMA        cbuach       r     12/06/2018 11:04:15 main.q@ubuntu-node12               1 2
> > >     250 0.50000 OMA        cbuach       r     12/06/2018 11:04:15 main.q@ubuntu-node13               1 3
> > >     250 0.50000 OMA        cbuach       r     12/06/2018 11:04:15 main.q@ubuntu-node11               1 4
> > >     250 0.50000 OMA        cbuach       r     12/06/2018 11:04:15 main.q@ubuntu-node11               1 5
> > >     250 0.50000 OMA        cbuach       r     12/06/2018 11:04:15 main.q@ubuntu-node13               1 6
> > >     250 0.50000 OMA        cbuach       r     12/06/2018 11:04:15 main.q@ubuntu-node12               1 7
> > >     250 0.50000 OMA        cbuach       r     12/06/2018 11:04:15 main.q@ubuntu-node2                1 8
> > >     250 0.50000 OMA        cbuach       r     12/06/2018 11:04:15 main.q@ubuntu-node2                1 9
> > >     250 0.50000 OMA        cbuach       r     12/06/2018 11:04:15 main.q@ubuntu-node12               1 10
> > >     250 0.50000 OMA        cbuach       r     12/06/2018 11:04:15 main.q@ubuntu-node13               1 11
> > >     250 0.50000 OMA        cbuach       r     12/06/2018 11:04:15 main.q@ubuntu-node11               1 12
> > >     250 0.50000 OMA        cbuach       r     12/06/2018 11:04:15 main.q@ubuntu-node11               1 13
> > >     250 0.50000 OMA        cbuach       r     12/06/2018 11:04:15 main.q@ubuntu-node13               1 14
> > >     250 0.50000 OMA        cbuach       r     12/06/2018 11:04:15 main.q@ubuntu-node12               1 15
> > >     250 0.50000 OMA        cbuach       r     12/06/2018 11:04:15 main.q@ubuntu-node2                1 16
> > >     250 0.50000 OMA        cbuach       qw    12/06/2018 11:04:02                                    1 17-60:1
> > 
> > Aha, so they are already running on remote nodes – fine. As the slots 
> > setting in the queue configuration is per host, this should work and allow 
> > more processes per node instead of only four.
> > 
> > Is there a setting for the exechosts:
> > 
> > qconf -se ubuntu-node2
> > 
> > limiting the slots to 4 in complex_values? Can you please also provide the 
> > `qhost` output.
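> > 
> > Such a host-level cap would look like this (hypothetical output, just to 
> > show the format):
> > 
> > complex_values        slots=4
> > 
> > With complex_values set to NONE, no per-host cap of this kind applies.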
> > 
> > -- Reuti
> > 
> > 
> > 
> > > 
> > > On Thu., Dec 6, 2018 at 10:59, Reuti (<re...@staff.uni-marburg.de>) 
> > > wrote:
> > > 
> > > > On 06.12.2018 at 09:47, Hay, William <w....@ucl.ac.uk> wrote:
> > > >
> > > > On Wed, Dec 05, 2018 at 03:29:23PM -0300, Dimar Jaime González Soto 
> > > > wrote:
> > > >>   the app site is https://omabrowser.org/standalone/ and I tried to
> > > >>   make a parallel environment, but it didn't work.
> > > > The website indicates that an array job should work for this.
> > > > Has the load average spiked to the point where np_load_avg>=1.75?
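> > > >
> > > > Whether a queue instance is currently in load alarm can be checked
> > > > with (the -explain switch should name the threshold that triggered):
> > > >
> > > > $ qstat -f -explain a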
> > > 
> > > Yes, I noticed this too. Hence we need no parallel environment at all, 
> > > as OMA will just start several serial jobs as long as slots are available 
> > > AFAICS.
> > > 
> > > What does `qstat` show for a running job? There should be a line for each 
> > > executing task, while the waiting ones are abbreviated into one line.
> > > 
> > > -- Reuti
> > > 
> > > 
> > > >
> > > > I would try running qalter -w p  against the job id to see what it says.
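> > > >
> > > > For example, with a job id of, say, 250:
> > > >
> > > > $ qalter -w p 250
> > > >
> > > > This asks the scheduler to verify against the current cluster state
> > > > whether the pending job or tasks could be dispatched, and why not.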
> > > >
> > > > William
> > > >
> > > >
> > > >
> > > >>
> > > >>> On 05.12.2018 at 19:10, Dimar Jaime González Soto
> > > >>> <dimar.gonzalez.s...@gmail.com> wrote:
> > > >>>
> > > >>> Hi everyone, I'm trying to run OMA standalone on a Grid Engine setup
> > > >>> with this line:
> > > >>>
> > > >>> qsub -v NR_PROCESSES=60 -b y -j y -t 1-60 -cwd /usr/local/OMA/bin/OMA
> > > >>>
> > > >>> It works, but it only executes 4 processes per node; there are 4 nodes
> > > >>> with 16 logical threads each. My main.q configuration is:
> > > >>>
> > > >>> qname                 main.q
> > > >>> hostlist              @allhosts
> > > >>> seq_no                0
> > > >>> load_thresholds       np_load_avg=1.75
> > > >>> suspend_thresholds    NONE
> > > >>> nsuspend              1
> > > >>> suspend_interval      00:05:00
> > > >>> priority              0
> > > >>> min_cpu_interval      00:05:00
> > > >>> processors            UNDEFINED
> > > >>> qtype                 BATCH INTERACTIVE
> > > >>> ckpt_list             NONE
> > > >>> pe_list               make
> > > >>> rerun                 FALSE
> > > >>> slots                 16
> > > 
> > > 
> > > 
> > > 
> > 
> > 
> > 
> > 
> 
> 
> 
> -- 
> Regards,
> 
> Dimar González Soto
> Civil Engineer in Informatics
> Universidad Austral de Chile
> 
> 


_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
