William,

Thanks for the reply. Unfortunately I have several non-interactive queues,
so I cannot limit slots this way.

About 99% of the messages printed to the system log look like the ones
below, so I believe these are the messages that are being suppressed:

Mar 20 21:55:18 qmaster sh: -------------------------------
Mar 20 21:55:18 qmaster sh: RUE_name             (String)    = thomas///medium.q//
Mar 20 21:55:18 qmaster sh: RUE_utilized_now     (Double)    = 2.000000
Mar 20 21:55:18 qmaster sh: RUE_utilized         (List)      = empty
Mar 20 21:55:18 qmaster sh: RUE_utilized_now_non (Double)    = 0.000000
Mar 20 21:55:18 qmaster sh: RUE_utilized_nonexcl (List)      = empty
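
To put a number on that share, something like the following could be run
over an exported copy of the log. It is only a minimal sketch: the function
name `summarize` and both regular expressions are assumptions derived from
the sample above, and it assumes every dumped field sits on a single log
line tagged `sh:`.

```python
import re
import sys
from collections import Counter

# One dumped field, e.g.
# "Mar 20 21:55:18 qmaster sh: RUE_utilized_now     (Double)    = 2.000000"
FIELD = re.compile(r"\bsh: (RUE_\w+)\s+\((\w+)\)\s+=\s*(.*)$")
# The dashed separator line that starts each dump
SEPARATOR = re.compile(r"\bsh: -+\s*$")


def summarize(lines):
    """Return (dump_lines, total_lines, Counter of RUE_name values)."""
    dump_lines = 0
    total = 0
    names = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # blank lines are not counted at all
        total += 1
        match = FIELD.search(line)
        if match:
            dump_lines += 1
            if match.group(1) == "RUE_name":
                names[match.group(3).strip()] += 1
        elif SEPARATOR.search(line):
            dump_lines += 1
    return dump_lines, total, names


if __name__ == "__main__":
    dump_lines, total, names = summarize(sys.stdin)
    print(f"{dump_lines} of {total} lines are RUE dumps")
    # Show which user/queue combinations are dumped most often
    for name, count in names.most_common(10):
        print(f"{count:8d}  {name}")
```

It could be fed from the journal, for example with
`journalctl --since today | python3 count_rue.py` (the script name is
hypothetical).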


SGE conf:
[root@qmaster ~]# qconf -sconf
#global:
execd_spool_dir              /var/sge/default/spool
mailer                       /bin/mail
xterm                        /usr/bin/xterm
load_sensor                  none
prolog                       none
epilog                       none
shell_start_mode             posix_compliant
login_shells                 sh,bash,ksh,csh,tcsh
min_uid                      100
min_gid                      100
user_lists                   none
xuser_lists                  none
projects                     none
xprojects                    none
enforce_project              false
enforce_user                 auto
load_report_time             00:00:40
max_unheard                  00:05:00
reschedule_unknown           00:00:00
loglevel                     log_err
administrator_mail
set_token_cmd                none
pag_cmd                      none
token_extend_time            none
shepherd_cmd                 none
qmaster_params               none
execd_params                 none
reporting_params             accounting=true reporting=false \
                             flush_time=00:00:15 joblog=false sharelog=00:00:00
finished_jobs                100
gid_range                    20000-20100
qlogin_command               /opt/sge/qlogin_wrapper.sh
qlogin_daemon                /usr/sbin/sshd -i
rlogin_command               /usr/bin/ssh -o StrictHostKeyChecking=no
rlogin_daemon                /usr/sbin/sshd -i
rsh_command                  /usr/bin/ssh -o StrictHostKeyChecking=no
rsh_daemon                   /usr/sbin/sshd -i
max_aj_instances             2000
max_aj_tasks                 75000
max_u_jobs                   0
max_jobs                     0
max_advance_reservations     0
auto_user_oticket            0
auto_user_fshare             0
auto_user_default_project    none
auto_user_delete_time        86400
delegated_file_staging       false
reprioritize                 0
jsv_url                      none
jsv_allowed_mod              ac,h,i,e,o,j,M,N,p,w


Scheduler conf:
[root@qmaster ~]# qconf -ssconf
algorithm                         default
schedule_interval                 0:0:30
maxujobs                          0
queue_sort_method                 load
job_load_adjustments              np_load_avg=0.50
load_adjustment_decay_time        0:7:30
load_formula                      np_load_avg
schedd_job_info                   true
flush_submit_sec                  0
flush_finish_sec                  0
params                            none
reprioritize_interval             0:0:0
halftime                          168
usage_weight_list                 cpu=1.000000,mem=0.000000,io=0.000000
compensation_factor               5.000000
weight_user                       0.250000
weight_project                    0.250000
weight_department                 0.250000
weight_job                        0.250000
weight_tickets_functional         3000000
weight_tickets_share              0
share_override_tickets            TRUE
share_functional_shares           TRUE
max_functional_jobs_to_schedule   200
report_pjob_tickets               TRUE
max_pending_tasks_per_job         50
halflife_decay_list               none
policy_hierarchy                  OFS
weight_ticket                     1.000000
weight_waiting_time               0.000000
weight_deadline                   3600000.000000
weight_urgency                    0.100000
weight_priority                   1.000000
max_reservation                   0
default_duration                  INFINITY

Cheers,
Jakub


2018-03-20 16:59 GMT+01:00 William Hay <[email protected]>:

> On Tue, Mar 20, 2018 at 11:08:02AM +0100, Sms Backup wrote:
> >    Dear all,
> >    We have in our configuration multiple servers assigned to multiple
> >    queues.
> >    To limit slots number per system, I tried to create a quota:
> >    {
> >       name         slots
> >       description  Limit slots usage per node to number of cores
> >       enabled      FALSE
> >       limit        queues !interactive.q hosts {@allhosts} to slots=40
> >    }
> >    Expected result was to limit slots usage per host to 40 for
> >    non-interactive queues, but allow interactive jobs even if all slots
> >    are taken.
> >    It is doing its job, but enabling this quota causes hundreds of
> >    thousands of messages from SGE master to be passed to systemd-journald,
> >    causing systemd-journald to eat 100% CPU and making SGE master almost
> >    unusable, example:
>
> >    Has anyone had a similar issue? Any other ideas how to limit slots
> >    usage without causing this kind of issue? Or maybe it is possible to
> >    force SGE not to send so many messages to systemd-journald?
> >    Regards,
> >    Jakub
> If you only have one non-interactive queue you could just set the number
> of slots for that queue on each host.
>
> It would probably be useful to see what messages grid engine is sending to
> journald.  Hopefully it has retained a few sample messages rather than
> suppressing them all.
>
> A look at your sge_conf and sched_conf might give a clue as to what is
> generating the messages.
>
> William
>