William,
Thanks for the reply. Unfortunately I have a few non-interactive queues, so I
cannot limit slots this way.
99% of the messages printed to the system log look like the ones below, so I
believe these are the messages being suppressed:
Mar 20 21:55:18 qmaster sh: -------------------------------
Mar 20 21:55:18 qmaster sh: RUE_name (String) = thomas///medium.q//
Mar 20 21:55:18 qmaster sh: RUE_utilized_now (Double) = 2.000000
Mar 20 21:55:18 qmaster sh: RUE_utilized (List) = empty
Mar 20 21:55:18 qmaster sh: RUE_utilized_now_non (Double) = 0.000000
Mar 20 21:55:18 qmaster sh: RUE_utilized_nonexcl (List) = empty
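To put a number on that 99%, a rough Python sketch like the one below could tally log lines by program and first message token. It assumes the syslog-style layout of the sample above (timestamp, host, program, colon, message); the names tally and rue_share are made up for illustration and are not part of SGE or systemd.

```python
import re
from collections import Counter

# Assumed line shape, taken from the sample above:
#   "Mar 20 21:55:18 qmaster sh: RUE_utilized_now (Double) = 2.000000"
LINE_RE = re.compile(r"^\w{3}\s+\d+\s+[\d:]{8}\s+(\S+)\s+([^:]+):\s?(.*)$")


def tally(lines):
    """Count log lines per (program, first token of the message)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if not match:
            # Keep lines that do not fit the assumed layout visible in the totals.
            counts[("?", "unparsed")] += 1
            continue
        _host, prog, msg = match.groups()
        tokens = msg.split()
        key = tokens[0] if tokens else "(empty)"
        counts[(prog, key)] += 1
    return counts


def rue_share(counts):
    """Fraction of lines that are RUE_* fields or the dashed separator."""
    total = sum(counts.values())
    rue = sum(
        n for (_prog, key), n in counts.items()
        if key.startswith("RUE_") or set(key) == {"-"}
    )
    return rue / total if total else 0.0
```

Feeding it the lines of a saved log export, e.g. rue_share(tally(open(path))), gives the fraction of lines that belong to these dumps.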
SGE conf:
[root@qmaster ~]# qconf -sconf
#global:
execd_spool_dir /var/sge/default/spool
mailer /bin/mail
xterm /usr/bin/xterm
load_sensor none
prolog none
epilog none
shell_start_mode posix_compliant
login_shells sh,bash,ksh,csh,tcsh
min_uid 100
min_gid 100
user_lists none
xuser_lists none
projects none
xprojects none
enforce_project false
enforce_user auto
load_report_time 00:00:40
max_unheard 00:05:00
reschedule_unknown 00:00:00
loglevel log_err
administrator_mail
set_token_cmd none
pag_cmd none
token_extend_time none
shepherd_cmd none
qmaster_params none
execd_params none
reporting_params accounting=true reporting=false \
flush_time=00:00:15 joblog=false sharelog=00:00:00
finished_jobs 100
gid_range 20000-20100
qlogin_command /opt/sge/qlogin_wrapper.sh
qlogin_daemon /usr/sbin/sshd -i
rlogin_command /usr/bin/ssh -o StrictHostKeyChecking=no
rlogin_daemon /usr/sbin/sshd -i
rsh_command /usr/bin/ssh -o StrictHostKeyChecking=no
rsh_daemon /usr/sbin/sshd -i
max_aj_instances 2000
max_aj_tasks 75000
max_u_jobs 0
max_jobs 0
max_advance_reservations 0
auto_user_oticket 0
auto_user_fshare 0
auto_user_default_project none
auto_user_delete_time 86400
delegated_file_staging false
reprioritize 0
jsv_url none
jsv_allowed_mod ac,h,i,e,o,j,M,N,p,w
Scheduler conf:
[root@qmaster ~]# qconf -ssconf
algorithm default
schedule_interval 0:0:30
maxujobs 0
queue_sort_method load
job_load_adjustments np_load_avg=0.50
load_adjustment_decay_time 0:7:30
load_formula np_load_avg
schedd_job_info true
flush_submit_sec 0
flush_finish_sec 0
params none
reprioritize_interval 0:0:0
halftime 168
usage_weight_list cpu=1.000000,mem=0.000000,io=0.000000
compensation_factor 5.000000
weight_user 0.250000
weight_project 0.250000
weight_department 0.250000
weight_job 0.250000
weight_tickets_functional 3000000
weight_tickets_share 0
share_override_tickets TRUE
share_functional_shares TRUE
max_functional_jobs_to_schedule 200
report_pjob_tickets TRUE
max_pending_tasks_per_job 50
halflife_decay_list none
policy_hierarchy OFS
weight_ticket 1.000000
weight_waiting_time 0.000000
weight_deadline 3600000.000000
weight_urgency 0.100000
weight_priority 1.000000
max_reservation 0
default_duration INFINITY
Cheers,
Jakub
2018-03-20 16:59 GMT+01:00 William Hay <[email protected]>:
> On Tue, Mar 20, 2018 at 11:08:02AM +0100, Sms Backup wrote:
> > Dear all,
> > We have in our configuration multiple servers assigned to multiple queues.
> > To limit the number of slots per system, I tried to create a quota:
> > {
> > name slots
> > description Limit slots usage per node to number of cores
> > enabled FALSE
> > limit queues !interactive.q hosts {@allhosts} to slots=40
> > }
> > The expected result was to limit slot usage per host to 40 for
> > non-interactive queues, but to allow interactive jobs even if all slots are
> > taken.
> > It does its job, but enabling this quota causes hundreds of thousands of
> > messages from the SGE master to be passed to systemd-journald, causing
> > systemd-journald to eat 100% CPU and making the SGE master almost unusable,
> > example:
>
> > Has anyone had a similar issue? Any other ideas on how to limit slot usage
> > without causing this kind of issue? Or maybe it is possible to force SGE
> > not to send so many messages to systemd-journald?
> > Regards,
> > Jakub
> If you only have one non-interactive queue you could just set the number
> of slots for that queue on each host.
>
> It would probably be useful to see what messages grid engine is sending to
> journald. Hopefully it has retained a few sample messages rather than
> suppressing them all.
>
> A look at your sge_conf and sched_conf might give a clue as to what is
> generating the messages.
>
> William
>