qconf -sconf shows:
#global:
execd_spool_dir /var/spool/gridengine/execd
mailer /usr/bin/mail
xterm /usr/bin/xterm
load_sensor none
prolog none
epilog none
shell_start_mode posix_compliant
login_shells bash,sh,ksh,csh,tcsh
min_uid 0
min_gid 0
user_lists none
xuser_lists none
projects none
xprojects none
enforce_project false
enforce_user auto
load_report_time 00:00:40
max_unheard 00:05:00
reschedule_unknown 00:00:00
loglevel log_warning
administrator_mail root
set_token_cmd none
pag_cmd none
token_extend_time none
shepherd_cmd none
qmaster_params none
execd_params none
reporting_params accounting=true reporting=false \
                 flush_time=00:00:15 joblog=false \
                 sharelog=00:00:00
finished_jobs 100
gid_range 65400-65500
max_aj_instances 2000
max_aj_tasks 75000
max_u_jobs 0
max_jobs 0
auto_user_oticket 0
auto_user_fshare 0
auto_user_default_project none
auto_user_delete_time 86400
delegated_file_staging false
reprioritize 0
rlogin_daemon /usr/sbin/sshd -i
rlogin_command /usr/bin/ssh
qlogin_daemon /usr/sbin/sshd -i
qlogin_command /usr/share/gridengine/qlogin-wrapper
rsh_daemon /usr/sbin/sshd -i
rsh_command /usr/bin/ssh
jsv_url none
jsv_allowed_mod ac,h,i,e,o,j,M,N,p,w
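Given the config above, the two array-job limits are far above the 60 tasks submitted, so neither can explain only four tasks per node. A minimal sanity check (a sketch; the variable names are mine, the values are copied from the output above):

```shell
# Values copied from the qconf -sconf output above.
max_aj_instances=2000   # max concurrently running tasks per array job
max_aj_tasks=75000      # max tasks an array job may have
tasks=60                # qsub -t 1-60

[ "$tasks" -le "$max_aj_instances" ] && echo "max_aj_instances is not the limit"
[ "$tasks" -le "$max_aj_tasks" ] && echo "max_aj_tasks is not the limit"
```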
On Thu, Dec 6, 2018 at 12:55, Reuti (<[email protected]>) wrote:
>
> > On 06.12.2018 at 15:19, Dimar Jaime González Soto <[email protected]> wrote:
> >
> > qconf -se ubuntu-node2:
> >
> > hostname        ubuntu-node2
> > load_scaling    NONE
> > complex_values  NONE
> > load_values     arch=lx26-amd64,num_proc=16,mem_total=48201.960938M, \
> >                 swap_total=95746.996094M,virtual_total=143948.957031M, \
> >                 load_avg=3.740000,load_short=4.000000, \
> >                 load_medium=3.740000,load_long=2.360000, \
> >                 mem_free=47376.683594M,swap_free=95746.996094M, \
>
> Although it's unrelated to the main issue: the swap size can be limited to
> 2 GB nowadays (which is the default in openSUSE). Red Hat suggests a little
> more, e.g. here:
>
>
> https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/storage_administration_guide/ch-swapspace
>
>
>
> >                 virtual_free=143123.679688M,mem_used=825.277344M, \
> >                 swap_used=0.000000M,virtual_used=825.277344M, \
> >                 cpu=25.000000,m_topology=NONE,m_topology_inuse=NONE, \
> >                 m_socket=0,m_core=0,np_load_avg=0.233750, \
> >                 np_load_short=0.250000,np_load_medium=0.233750, \
> >                 np_load_long=0.147500
> > processors      16
> > user_lists      NONE
> > xuser_lists     NONE
> > projects        NONE
> > xprojects       NONE
> > usage_scaling   NONE
> > report_variables NONE
> >
> > On Thu, Dec 6, 2018 at 11:17, Dimar Jaime González Soto (<[email protected]>) wrote:
> > qhost:
> >
> > HOSTNAME         ARCH        NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
> > -------------------------------------------------------------------------------
> > global           -              -     -       -       -       -       -
> > ubuntu-frontend  lx26-amd64    16  4.13   31.4G    1.2G     0.0     0.0
> > ubuntu-node11    lx26-amd64    16  4.55   47.1G  397.5M   93.5G     0.0
> > ubuntu-node12    lx26-amd64    16  3.64   47.1G    1.0G   93.5G     0.0
> > ubuntu-node13    lx26-amd64    16  4.54   47.1G  399.9M   93.5G     0.0
> > ubuntu-node2     lx26-amd64    16  3.67   47.1G  818.5M   93.5G     0.0
>
> This looks fine. So we have other settings to investigate:
>
> $ qconf -sconf
> #global:
> execd_spool_dir /var/spool/sge
> ...
> max_aj_tasks 75000
>
> Is max_aj_tasks limited in your setup?
>
>
>
> -- Reuti
>
>
> >
> > On Thu, Dec 6, 2018 at 11:13, Reuti (<[email protected]>) wrote:
> >
> > > On 06.12.2018 at 15:07, Dimar Jaime González Soto <[email protected]> wrote:
> > >
> > > qalter -w p doesn't show anything; qstat shows 16 running tasks, not 60:
> > >
> > > 250 0.50000 OMA cbuach  r  12/06/2018 11:04:15 main.q@ubuntu-node2   1 1
> > > 250 0.50000 OMA cbuach  r  12/06/2018 11:04:15 main.q@ubuntu-node12  1 2
> > > 250 0.50000 OMA cbuach  r  12/06/2018 11:04:15 main.q@ubuntu-node13  1 3
> > > 250 0.50000 OMA cbuach  r  12/06/2018 11:04:15 main.q@ubuntu-node11  1 4
> > > 250 0.50000 OMA cbuach  r  12/06/2018 11:04:15 main.q@ubuntu-node11  1 5
> > > 250 0.50000 OMA cbuach  r  12/06/2018 11:04:15 main.q@ubuntu-node13  1 6
> > > 250 0.50000 OMA cbuach  r  12/06/2018 11:04:15 main.q@ubuntu-node12  1 7
> > > 250 0.50000 OMA cbuach  r  12/06/2018 11:04:15 main.q@ubuntu-node2   1 8
> > > 250 0.50000 OMA cbuach  r  12/06/2018 11:04:15 main.q@ubuntu-node2   1 9
> > > 250 0.50000 OMA cbuach  r  12/06/2018 11:04:15 main.q@ubuntu-node12  1 10
> > > 250 0.50000 OMA cbuach  r  12/06/2018 11:04:15 main.q@ubuntu-node13  1 11
> > > 250 0.50000 OMA cbuach  r  12/06/2018 11:04:15 main.q@ubuntu-node11  1 12
> > > 250 0.50000 OMA cbuach  r  12/06/2018 11:04:15 main.q@ubuntu-node11  1 13
> > > 250 0.50000 OMA cbuach  r  12/06/2018 11:04:15 main.q@ubuntu-node13  1 14
> > > 250 0.50000 OMA cbuach  r  12/06/2018 11:04:15 main.q@ubuntu-node12  1 15
> > > 250 0.50000 OMA cbuach  r  12/06/2018 11:04:15 main.q@ubuntu-node2   1 16
> > > 250 0.50000 OMA cbuach  qw 12/06/2018 11:04:02                       1 17-60:1
> >
> > Aha, so they are already running on remote nodes – fine. As the slots setting
> > in the queue configuration applies per host, it should allow more processes
> > per node than the four you are seeing.
> >
> > Is there a setting for the exec hosts:
> >
> > qconf -se ubuntu-node2
> >
> > limiting the slots to 4 in complex_values? Could you please also provide
> > the `qhost` output?
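A per-host cap like this would show up as `slots=4` in the host's complex_values. A minimal sketch of how to spot it (the saved sample file and its values are hypothetical; on a real cluster you would pipe `qconf -se <host>` straight into awk):

```shell
# Hypothetical saved copy of `qconf -se ubuntu-node2` output.
cat > /tmp/node2.conf <<'EOF'
hostname        ubuntu-node2
load_scaling    NONE
complex_values  slots=4
EOF

# Print any per-host slots cap found in complex_values.
awk '$1 == "complex_values" && $2 ~ /^slots=/ {
    sub(/^slots=/, "", $2); print "per-host slots cap:", $2
}' /tmp/node2.conf
```

In this thread node2's complex_values turns out to be NONE, so no such cap is set.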
> >
> > -- Reuti
> >
> >
> >
> > >
> > > On Thu, Dec 6, 2018 at 10:59, Reuti (<[email protected]>) wrote:
> > >
> > > > On 06.12.2018 at 09:47, Hay, William <[email protected]> wrote:
> > > >
> > > > On Wed, Dec 05, 2018 at 03:29:23PM -0300, Dimar Jaime González Soto wrote:
> > > >> The app's site is https://omabrowser.org/standalone/. I tried to make a
> > > >> parallel environment, but it didn't work.
> > > > The website indicates that an array job should work for this.
> > > > Has the load average spiked to the point where np_load_avg>=1.75?
> > >
> > > Yes, I noticed this too. Hence we need no parallel environment at all:
> > > AFAICS, OMA will just start several serial jobs as long as slots are
> > > available.
> > >
> > > What does `qstat` show for a running job? There should be one line per
> > > executing task, while the waiting ones are abbreviated into a single line.
> > >
> > > -- Reuti
> > >
> > >
> > > >
> > > > I would try running qalter -w p against the job id to see what it says.
> > > >
> > > > William
> > > >
> > > >
> > > >
> > > >>
> > > >>> On 05.12.2018 at 19:10, Dimar Jaime Gonzalez Soto <[email protected]> wrote:
> > > >>>
> > > >>> Hi everyone, I'm trying to run OMA standalone on a Grid Engine setup
> > > >>> with this command:
> > > >>>
> > > >>> qsub -v NR_PROCESSES=60 -b y -j y -t 1-60 -cwd /usr/local/OMA/bin/OMA
> > > >>>
> > > >>> It works, but only 4 processes execute per node; there are 4 nodes with
> > > >>> 16 logical threads each. My main.q configuration is:
> > > >>>
> > > >>> qname main.q
> > > >>> hostlist @allhosts
> > > >>> seq_no 0
> > > >>> load_thresholds np_load_avg=1.75
> > > >>> suspend_thresholds NONE
> > > >>> nsuspend 1
> > > >>> suspend_interval 00:05:00
> > > >>> priority 0
> > > >>> min_cpu_interval 00:05:00
> > > >>> processors UNDEFINED
> > > >>> qtype BATCH INTERACTIVE
> > > >>> ckpt_list NONE
> > > >>> pe_list make
> > > >>> rerun FALSE
> > > >>> slots 16
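With slots 16 per host in main.q and four exec hosts, the cluster's capacity comfortably exceeds the 60 array tasks, so all of them should be able to run at once. A quick sketch of the arithmetic (numbers taken from the thread):

```shell
# Numbers from the thread: 4 exec hosts, slots 16 in main.q, a 60-task array job.
nodes=4
slots_per_host=16
tasks=60

capacity=$((nodes * slots_per_host))
echo "cluster capacity: $capacity slots"    # prints: cluster capacity: 64 slots
[ "$capacity" -ge "$tasks" ] && echo "all $tasks tasks could run concurrently"
```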
> > >
> > >
> > >
> > > --
> > > Regards,
> > >
> > > Dimar González Soto
> > > Civil Engineer in Informatics
> > > Universidad Austral de Chile
> > >
> > >
> >
> >
> >
> >
>
>
--
Regards,
Dimar González Soto
Civil Engineer in Informatics
Universidad Austral de Chile
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users