> On 06.12.2018 at 16:59, Dimar Jaime González Soto <dimar.gonzalez.s...@gmail.com> wrote:
>
> qconf -sconf shows:
>
> #global:
> execd_spool_dir              /var/spool/gridengine/execd
> ...
> max_aj_tasks                 75000
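As an aside for readers following the thread: the quoted `max_aj_tasks` value can be compared against the array size directly. A minimal sketch — the sample line below stands in for live `qconf -sconf` output, so no Grid Engine installation is needed to run it; the numbers are the ones quoted in this thread:

```shell
# Extract max_aj_tasks from (sample) `qconf -sconf` output and compare
# it to the task count of the array job submitted with `-t 1-60`.
conf_line='max_aj_tasks                 75000'
limit=$(printf '%s\n' "$conf_line" | awk '{print $2}')
tasks=60   # qsub ... -t 1-60
if [ "$tasks" -le "$limit" ]; then
    echo "max_aj_tasks=$limit does not limit a $tasks-task array"
fi
```

With the values shown here, `max_aj_tasks` is clearly not the culprit, which is why the discussion moves on to other limits.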
So, this is fine too. Next place: is the amount of overall slots limited:

$ qconf -se global

especially the line "complex_values". And next: any RQS?

$ qconf -srqs

-- Reuti

> On Thu, Dec 6, 2018 at 12:55, Reuti (<re...@staff.uni-marburg.de>) wrote:
>
>> On 06.12.2018 at 15:19, Dimar Jaime González Soto <dimar.gonzalez.s...@gmail.com> wrote:
>>
>>> qconf -se ubuntu-node2:
>>>
>>> hostname              ubuntu-node2
>>> load_scaling          NONE
>>> complex_values        NONE
>>> load_values           arch=lx26-amd64,num_proc=16,mem_total=48201.960938M, \
>>>                       swap_total=95746.996094M,virtual_total=143948.957031M, \
>>>                       load_avg=3.740000,load_short=4.000000, \
>>>                       load_medium=3.740000,load_long=2.360000, \
>>>                       mem_free=47376.683594M,swap_free=95746.996094M, \
>>
>> Although it's unrelated to the main issue: the swap size can be limited to 2 GB nowadays (which is the default in openSUSE). Red Hat suggests a little bit more, e.g. here:
>>
>> https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/storage_administration_guide/ch-swapspace
>>
>>>                       virtual_free=143123.679688M,mem_used=825.277344M, \
>>>                       swap_used=0.000000M,virtual_used=825.277344M, \
>>>                       cpu=25.000000,m_topology=NONE,m_topology_inuse=NONE, \
>>>                       m_socket=0,m_core=0,np_load_avg=0.233750, \
>>>                       np_load_short=0.250000,np_load_medium=0.233750, \
>>>                       np_load_long=0.147500
>>> processors            16
>>> user_lists            NONE
>>> xuser_lists           NONE
>>> projects              NONE
>>> xprojects             NONE
>>> usage_scaling         NONE
>>> report_variables      NONE
>>>
>>> On Thu, Dec 6, 2018 at 11:17, Dimar Jaime González Soto (<dimar.gonzalez.s...@gmail.com>) wrote:
>>>
>>>> qhost:
>>>>
>>>> HOSTNAME         ARCH       NCPU LOAD  MEMTOT  MEMUSE SWAPTO SWAPUS
>>>> -------------------------------------------------------------------------------
>>>> global           -             -    -       -       -      -      -
>>>> ubuntu-frontend  lx26-amd64   16 4.13   31.4G    1.2G    0.0    0.0
>>>> ubuntu-node11    lx26-amd64   16 4.55   47.1G  397.5M  93.5G    0.0
>>>> ubuntu-node12    lx26-amd64   16 3.64   47.1G    1.0G  93.5G    0.0
>>>> ubuntu-node13    lx26-amd64   16 4.54   47.1G  399.9M  93.5G    0.0
>>>> ubuntu-node2     lx26-amd64   16 3.67   47.1G  818.5M  93.5G    0.0
>>
>> This looks fine. So we have other settings to investigate:
>>
>> $ qconf -sconf
>> #global:
>> execd_spool_dir              /var/spool/sge
>> ...
>> max_aj_tasks                 75000
>>
>> Is max_aj_tasks limited in your setup?
>>
>> -- Reuti
>>
>>>> On Thu, Dec 6, 2018 at 11:13, Reuti (<re...@staff.uni-marburg.de>) wrote:
>>>>
>>>>> On 06.12.2018 at 15:07, Dimar Jaime González Soto <dimar.gonzalez.s...@gmail.com> wrote:
>>>>>
>>>>>> qalter -w p doesn't show anything; qstat shows 16 processes, not 60:
>>>>>>
>>>>>> 250 0.50000 OMA cbuach r  12/06/2018 11:04:15 main.q@ubuntu-node2   1  1
>>>>>> 250 0.50000 OMA cbuach r  12/06/2018 11:04:15 main.q@ubuntu-node12  1  2
>>>>>> 250 0.50000 OMA cbuach r  12/06/2018 11:04:15 main.q@ubuntu-node13  1  3
>>>>>> 250 0.50000 OMA cbuach r  12/06/2018 11:04:15 main.q@ubuntu-node11  1  4
>>>>>> 250 0.50000 OMA cbuach r  12/06/2018 11:04:15 main.q@ubuntu-node11  1  5
>>>>>> 250 0.50000 OMA cbuach r  12/06/2018 11:04:15 main.q@ubuntu-node13  1  6
>>>>>> 250 0.50000 OMA cbuach r  12/06/2018 11:04:15 main.q@ubuntu-node12  1  7
>>>>>> 250 0.50000 OMA cbuach r  12/06/2018 11:04:15 main.q@ubuntu-node2   1  8
>>>>>> 250 0.50000 OMA cbuach r  12/06/2018 11:04:15 main.q@ubuntu-node2   1  9
>>>>>> 250 0.50000 OMA cbuach r  12/06/2018 11:04:15 main.q@ubuntu-node12  1 10
>>>>>> 250 0.50000 OMA cbuach r  12/06/2018 11:04:15 main.q@ubuntu-node13  1 11
>>>>>> 250 0.50000 OMA cbuach r  12/06/2018 11:04:15 main.q@ubuntu-node11  1 12
>>>>>> 250 0.50000 OMA cbuach r  12/06/2018 11:04:15 main.q@ubuntu-node11  1 13
>>>>>> 250 0.50000 OMA cbuach r  12/06/2018 11:04:15 main.q@ubuntu-node13  1 14
>>>>>> 250 0.50000 OMA cbuach r  12/06/2018 11:04:15 main.q@ubuntu-node12  1 15
>>>>>> 250 0.50000 OMA cbuach r  12/06/2018 11:04:15 main.q@ubuntu-node2   1 16
>>>>>> 250 0.50000 OMA cbuach qw 12/06/2018 11:04:02                       1 17-60:1
>>>>>
>>>>> Aha, so they are running already on remote nodes – fine. As the setting in the queue configuration is per host, this should work and provide more processes per node instead of four.
>>>>>
>>>>> Is there a setting for the exechosts:
>>>>>
>>>>> qconf -se ubuntu-node2
>>>>>
>>>>> limiting the slots to 4 in complex_values? Can you please also provide the `qhost` output.
>>>>>
>>>>> -- Reuti
>>>>>
>>>>>> On Thu, Dec 6, 2018 at 10:59, Reuti (<re...@staff.uni-marburg.de>) wrote:
>>>>>>
>>>>>>> On 06.12.2018 at 09:47, Hay, William <w....@ucl.ac.uk> wrote:
>>>>>>>
>>>>>>>> On Wed, Dec 05, 2018 at 03:29:23PM -0300, Dimar Jaime González Soto wrote:
>>>>>>>>
>>>>>>>>> the app site is https://omabrowser.org/standalone/ I tried to make a parallel environment but it didn't work.
>>>>>>>>
>>>>>>>> The website indicates that an array job should work for this.
>>>>>>>> Has the load average spiked to the point where np_load_avg >= 1.75?
>>>>>>>
>>>>>>> Yes, I noticed this too. Hence we need no parallel environment at all, as OMA will just start several serial jobs as long as slots are available, AFAICS.
>>>>>>>
>>>>>>> What does `qstat` show for a running job? There should be a line for each executing task, while the waiting ones are abbreviated in one line.
>>>>>>>
>>>>>>> -- Reuti
>>>>>>>
>>>>>>>> I would try running qalter -w p against the job id to see what it says.
>>>>>>>> William
>>>>>>>>
>>>>>>>>> On 05.12.2018 at 19:10, Dimar Jaime González Soto <dimar.gonzalez.s...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi everyone, I'm trying to run OMA standalone on a Grid Engine setup with this line:
>>>>>>>>>>
>>>>>>>>>> qsub -v NR_PROCESSES=60 -b y -j y -t 1-60 -cwd /usr/local/OMA/bin/OMA
>>>>>>>>>>
>>>>>>>>>> It works, but it only executes 4 processes per node; there are 4 nodes with 16 logical threads each. My main.q configuration is:
>>>>>>>>>>
>>>>>>>>>> qname                 main.q
>>>>>>>>>> hostlist              @allhosts
>>>>>>>>>> seq_no                0
>>>>>>>>>> load_thresholds       np_load_avg=1.75
>>>>>>>>>> suspend_thresholds    NONE
>>>>>>>>>> nsuspend              1
>>>>>>>>>> suspend_interval      00:05:00
>>>>>>>>>> priority              0
>>>>>>>>>> min_cpu_interval      00:05:00
>>>>>>>>>> processors            UNDEFINED
>>>>>>>>>> qtype                 BATCH INTERACTIVE
>>>>>>>>>> ckpt_list             NONE
>>>>>>>>>> pe_list               make
>>>>>>>>>> rerun                 FALSE
>>>>>>>>>> slots                 16
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Regards,
>>>>>>>>>> Dimar González Soto
>>>>>>>>>> Ingeniero Civil en Informática
>>>>>>>>>> Universidad Austral de Chile

_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
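A note on reading the `qstat` listing in this thread: the running tasks per execution host can be tallied with awk, which makes the "4 per node" symptom obvious at a glance. A minimal sketch — the here-document reproduces a subset of the rows quoted above, standing in for a live `qstat` call (column 5 is the job state, column 8 is queue@host):

```shell
# Tally running array tasks per execution host from qstat-style rows.
# The sample rows stand in for real `qstat` output on the cluster.
awk '$5 == "r" { split($8, part, "@"); count[part[2]]++ }
     END { for (host in count) print host, count[host] }' <<'EOF' | sort
250 0.50000 OMA cbuach r 12/06/2018 11:04:15 main.q@ubuntu-node2 1 1
250 0.50000 OMA cbuach r 12/06/2018 11:04:15 main.q@ubuntu-node12 1 2
250 0.50000 OMA cbuach r 12/06/2018 11:04:15 main.q@ubuntu-node13 1 3
250 0.50000 OMA cbuach r 12/06/2018 11:04:15 main.q@ubuntu-node2 1 8
250 0.50000 OMA cbuach qw 12/06/2018 11:04:02 1 17-60:1
EOF
```

Run against the full 16-row listing above, this prints exactly four running tasks on each of the four nodes, which is the behaviour being debugged.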
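On the capacity question itself: with the `slots 16` setting in main.q and the four execution nodes shown by `qhost`, the 60-task array should fit entirely, so whatever caps it at 4 tasks per node must come from elsewhere (complex_values on a host, an RQS, or a load threshold). A sketch of the arithmetic, using only numbers quoted in this thread:

```shell
# Capacity check: 4 execution nodes, each with `slots 16` in main.q,
# versus the 60-task array submitted with `-t 1-60`.
nodes=4
slots_per_node=16        # "slots 16" in the main.q configuration
tasks=60                 # qsub ... -t 1-60
total=$((nodes * slots_per_node))
echo "total slots: $total"
if [ "$total" -ge "$tasks" ]; then
    echo "all $tasks tasks could run at once"
fi
```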