Hello,

I've been looking into auditd's performance. The first thing I did was 
measure the rate at which it can log events under various settings. For this 
test I had two windows open: one to start auditd from the command line, 
without systemd interference, and one to run the following script:

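auditctl -D                                     # delete any existing rules
auditctl -b 16440                               # set the kernel backlog limit
auditctl -f 0                                   # failure mode 0 (silent)
auditctl --backlog_wait_time 100                # set the backlog wait time
auditctl -a always,exit -F arch=x86_64 -S all   # audit every syscall
sleep 3                                         # generate events for 3 seconds
service auditd stop
auditctl -D                                     # remove the test rule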

The results of various settings are as follows:

FLUSH          FREQ    Events/sec
---------------------------------
SYNC                   45
DATA                   105
INCREMENTAL    20      400
               50      1000
               100     1815
               200     3080
               400     5800
               1000    10100
               2000    15275
               4000    18650
               8000    24075
NONE                   38300
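
As a rough reading of the INCREMENTAL rows: dividing events/sec by FREQ 
approximates how many flushes per second completed, since FREQ is the number 
of records written between explicit flushes. A small Python sketch of that 
arithmetic follows. The figures are copied from the table; the helper name and 
the "share of NONE" column are only illustrative, not something auditd reports.

```python
# (freq, events_per_sec) pairs copied from the INCREMENTAL rows of the table
incremental = [
    (20, 400), (50, 1000), (100, 1815), (200, 3080), (400, 5800),
    (1000, 10100), (2000, 15275), (4000, 18650), (8000, 24075),
]
none_rate = 38300  # events/sec measured with FLUSH = NONE


def flushes_per_sec(freq, events_per_sec):
    """Approximate flushes per second, assuming one flush every `freq` events."""
    return events_per_sec / freq


if __name__ == "__main__":
    for freq, rate in incremental:
        print(f"freq={freq:>5}  events/s={rate:>6}  "
              f"flushes/s={flushes_per_sec(freq, rate):6.2f}  "
              f"share of NONE={rate / none_rate:6.1%}")
```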


Looking further, I found a lot of lock contention and scheduling issues 
caused by pthreads. I mapped out the paths in the code to get a picture of 
where events come from and where they go:

http://people.redhat.com/sgrubb/audit/auditd-data-flow.pdf

The blue boxes are where events come from; the red boxes are where we have 
contention. The gray boxes are the path on the logging thread. The white boxes 
are the main thread.

What I found is that if I make enqueue_event call write_to_log directly, it 
doubles the throughput of the audit daemon. IOW, going from multi-threaded to 
single-threaded makes a huge difference. The audit daemon has been 
multi-threaded since the very first public release back in 2004, before I 
started working on it.
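
To illustrate the two paths, here is a minimal sketch in Python (not auditd's 
C code). Only the names enqueue_event and write_to_log come from the daemon; 
the bodies, the queue, and the sentinel are assumptions made for illustration. 
The threaded variant pays for a lock and a wake-up of the logging thread on 
every event, while the direct variant is a plain function call.

```python
import queue
import threading

_DONE = object()  # hypothetical sentinel telling the logging thread to exit


def write_to_log(log, event):
    """Stand-in for the real log writer: just append to a list."""
    log.append(event)


def run_threaded(events):
    """Main thread enqueues; a separate logging thread dequeues and writes."""
    log, q = [], queue.Queue()

    def logger():
        while True:
            event = q.get()
            if event is _DONE:
                break
            write_to_log(log, event)

    t = threading.Thread(target=logger)
    t.start()
    for event in events:
        q.put(event)          # hand-off: lock plus wake-up per event
    q.put(_DONE)
    t.join()
    return log


def run_direct(events):
    """Single thread: the enqueue step calls write_to_log directly."""
    log = []
    for event in events:
        write_to_log(log, event)
    return log
```

Both functions produce the same log; the sketch makes no claim about timing, 
which depends on the runtime and the real I/O being done.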

So, what I think I am going to do is make it single-threaded, fix the signal 
handlers to set a variable on error so that the main thread picks it up and 
serializes it with other events, move the size check and rotate code, and 
remove the pthreads code.
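
The signal-handler part of that plan can be sketched as follows, again in 
Python rather than C and with hypothetical names throughout (pending_signals, 
on_signal, run). The handler only records that a signal arrived; the main loop 
notices it between events and logs it in order with everything else. It 
assumes a POSIX system where SIGUSR1 exists.

```python
import os
import signal

pending_signals = set()  # written by the handler, drained by the main loop


def on_signal(signum, frame):
    """Handler does no real work: it only records that the signal arrived."""
    pending_signals.add(signum)


def write_to_log(log, event):
    """Stand-in for the real log writer."""
    log.append(event)


def run(events, log):
    """Single-threaded loop: signals are serialized with ordinary events."""
    for event in events:
        while pending_signals:
            signum = pending_signals.pop()
            write_to_log(log, f"signal {signal.Signals(signum).name} received")
        write_to_log(log, event)


if __name__ == "__main__":
    signal.signal(signal.SIGUSR1, on_signal)
    log = []

    def events():
        yield "event 1"
        os.kill(os.getpid(), signal.SIGUSR1)  # signal arrives mid-stream
        yield "event 2"

    run(events(), log)
    print(log)
```

In C the equivalent flag would typically be a volatile sig_atomic_t, since 
very little else is safe to touch from inside a handler.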

That leaves an issue with dispatching events to other programs. What I have 
been thinking about is perhaps using libevfibers to manage switching between 
logging and dispatching.

One other tidbit I found during testing: if I generate so many events that 
they overflow the kernel queue, the default setting for backlog_wait_time 
makes the system unusable. It acts like it's live-locked. So, I would 
recommend that the default setting in the kernel be changed to something more 
livable, and that anyone concerned about this explicitly set the value to 
something low.

-Steve

--
Linux-audit mailing list
Linux-audit@redhat.com
https://www.redhat.com/mailman/listinfo/linux-audit
