> -----Original Message-----
> From: Paul Moore [mailto:p...@paul-moore.com]
> Sent: September 17, 2019 6:52
> To: Li,Rongqing <lirongq...@baidu.com>
> Cc: Eric Paris <epa...@redhat.com>; linux-audit@redhat.com
> Subject: Re: [PATCH][RFC] audit: set wait time to zero when audit failed
>
> On Sun, Sep 15, 2019 at 10:55 PM Li,Rongqing <lirongq...@baidu.com> wrote:
> > > > If audit_log_start fails because the queue is full, kauditd is
> > > > waiting for the receiving queue to empty, but there is no receiver.
> > > > A task is then forced to wait 60 seconds for each audited syscall,
> > > > and it hangs for a very long time.
> > > >
> > > > So in this condition, set the wait time to zero to reduce the wait,
> > > > and restore the wait time when audit works again.
> > > >
> > > > This partially restores the behavior from before commit
> > > > 3197542482df ("audit: rework audit_log_start()").
> > > >
> > > > Signed-off-by: Li RongQing <lirongq...@baidu.com>
> > > > Signed-off-by: Liang ZhiCheng <liangzhich...@baidu.com>
> > > > ---
> > > > Reboot is taking a very long time on my machine (CentOS 6u4 +
> > > > kernel 5.3) since TIF_SYSCALL_AUDIT is set by default. On reboot,
> > > > the userspace process that receives audit messages is killed, so
> > > > nothing drains the audit queue.
> > > >
> > > > git bisect shows it is caused by 3197542482df ("audit: rework
> > > > audit_log_start()")
> > > >
> > > > kernel/audit.c | 9 +++++++--
> > > > 1 file changed, 7 insertions(+), 2 deletions(-)
> > >
> > > This is typically solved by increasing the backlog using the
> > > "audit_backlog_limit" kernel parameter (link to the docs below).
> >
> > That should avoid my issue, but the default behavior does not work
> > for me. And not everyone has enough knowledge about audit; they may
> > spend a lot of effort finding the root cause and estimating how large
> > "audit_backlog_limit" should be.
>
> The pause/sleep behavior is desired behavior and is intended to help
> kauditd/auditd process the audit backlog on a busy system. If we didn't
> sleep the current process and give kauditd/auditd a chance to flush the
> backlog when it was full, a lot of bad things could happen with respect
> to audit. We generally select the backlog limit so that this is not a
> problem for most systems, although there will always be edge cases where
> the default does not work well; it is impossible to pick defaults that
> work well for every case.
>
I just want it to behave as it did before 3197542482df ("audit: rework
audit_log_start()"): wait 60 seconds once if auditd/readahead-collector
has a problem draining the audit backlog, and once auditd/readahead-collector
recovers, restore the wait time to 60 seconds.

> If you are not using audit, you can always disable it via the kernel
> command line, or at runtime (look at what Fedora does).
>
> > > You might also want to investigate
> > > what is generating so many audit records prior to starting the
> > > audit daemon.
> >
> > It is /sbin/readahead-collector. In fact, we stop auditd; we are
> > doing a reboot test, which reboots the machine continuously to test
> > hardware/software.
> >
> > It is the same as below:
> > auditctl -a always,exit -S all -F pid='xxx'
> > kill -s 19 `pidof auditd`
> >
> > Then the audited task will hang.
>
> So you are seeing this problem only when you run a test, or did you
> provide this as a reproducer?

auditctl -a always,exit -S all -F ppid=`pidof sshd`
kill -s 19 `pidof auditd`
ssh root@127.0.0.1

Then ssh hangs forever.

-Li RongQing

> --
> paul moore
> www.paul-moore.com

--
Linux-audit mailing list
Linux-audit@redhat.com
https://www.redhat.com/mailman/listinfo/linux-audit
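
The two wait policies discussed in this thread can be compared with a toy userspace model. This is only a sketch under stated assumptions, not kernel code: the names `AuditModel`, `log_event`, and `simulate` are hypothetical, the 60-second figure is taken from the thread, and the model assumes that every audited syscall arriving while the queue is full sleeps for the whole current wait time.

```python
from dataclasses import dataclass
from typing import Iterable

BACKLOG_WAIT_TIME = 60  # seconds; the wait described in the thread


@dataclass
class AuditModel:
    """Hypothetical model of the backlog wait; it does not mirror kernel/audit.c."""

    zero_wait_on_failure: bool
    wait_time: int = BACKLOG_WAIT_TIME
    total_waited: int = 0

    def log_event(self, backlog_full: bool) -> None:
        """Account for one audited syscall."""
        if backlog_full:
            # Assumption: the task sleeps for the full current wait time.
            self.total_waited += self.wait_time
            if self.zero_wait_on_failure:
                # Policy asked for in the thread: wait once, then stop waiting.
                self.wait_time = 0
        else:
            # The receiver is draining the queue again: restore the wait time.
            self.wait_time = BACKLOG_WAIT_TIME


def simulate(zero_wait_on_failure: bool, backlog_states: Iterable[bool]) -> int:
    """Return the total seconds a task would sleep over a run of syscalls."""
    model = AuditModel(zero_wait_on_failure)
    for full in backlog_states:
        model.log_event(full)
    return model.total_waited


if __name__ == "__main__":
    # Five syscalls hit a full queue, the receiver recovers for one syscall,
    # then stalls again for two more.
    states = [True] * 5 + [False] + [True] * 2
    print("wait on every syscall:", simulate(False, states), "s")
    print("zero wait after first failure:", simulate(True, states), "s")
```

In this model the first policy sleeps on every one of the seven stalled syscalls, while the second sleeps once per stall episode. The model says nothing about what happens to audit records that cannot be queued, which is the concern raised in the quoted reply.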