On Sat, Dec 16, 2023 at 11:28 AM Haakon Bugge <[email protected]> wrote:
>
> On 14 Dec 2023, at 00:45, Paul Moore <[email protected]> wrote:
>
> > On Tue, Dec 12, 2023 at 5:29 AM Håkon Bugge <[email protected]> wrote:
> >>
> >> For the most time-consuming function, when running a syscall benchmark
> >> with STIG compliant audit rules:
> >>
> >>   Overhead  Command       Shared Object      Symbol
> >>   ........  ............  .................  ........................
> >>
> >>     27.62%  syscall_lat   [kernel.kallsyms]  [k] __audit_filter_op
> >>
> >> we apply codegen optimizations, which speeds up the syscall
> >> performance by around 17% on an Intel Cascade Lake system.
> >>
> >> We run "perf stat -d -r 5 ./syscall_lat", where syscall_lat is a C
> >> application that measures average syscall latency from getpid()
> >> running 100 million rounds.
> >>
> >> Between each perf run, we reboot the system and waits until the last
> >> minute load is less than 1.0.
> >>
> >> We boot the kernel, v6.6-rc4, with "mitigations=off", in order to
> >> amplify the changes in the audit system.
> >>
> >> Let the base kernel be v6.6-rc4 with booted with "audit=1" and
> >> "mitigations=off" and with the commit "audit: Vary struct audit_entry
> >> alignment" on an Intel Cascade Lake system. The following three
> >> metrics are reported, nanoseconds per syscall, L1D misses per syscall,
> >> and finally Intructions Per Cycle, ipc.
> >>
> >> Base vs. base + this commit gives:
> >>
> >> ns per call:
> >>      min    avg    max    pstdev
> >> -    203    203    209    0.954149
> >> +    173    173    178    0.884534
> >>
> >> L1d misses per syscall:
> >>      min    avg    max    pstdev
> >> -  0.012  0.103  0.817    0.238352
> >> +  0.010  0.209  1.235    0.399416
> >>
> >> ipc:
> >>      min    avg    max    pstdev
> >> -  2.320  2.329  2.330    0.003000
> >> +  2.430  2.436  2.440    0.004899
> >>
> >> Signed-off-by: Håkon Bugge <[email protected]>
> >> ---
> >>  kernel/auditsc.c | 2 ++
> >>  1 file changed, 2 insertions(+)
> >>
> >> diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> >> index 6f0d6fb6523fa..84d0dfe75a4ac 100644
> >> --- a/kernel/auditsc.c
> >> +++ b/kernel/auditsc.c
> >> @@ -822,6 +822,7 @@ static int audit_in_mask(const struct audit_krule
> >> *rule, unsigned long val)
> >>  * parameter can be NULL, but all others must be specified.
> >>  * Returns 1/true if the filter finds a match, 0/false if none are found.
> >>  */
> >> +#pragma GCC optimize("unswitch-loops", "align-loops=16", "align-jumps=16")
> >
> > The kernel doesn't really make use of #pragma optimization statements
> > like this, at least not in any of the core areas, and I'm not
> > interested in being the first to do so. I appreciate the time and
> > effort that you have spent profiling the audit subsystem, but this
> > isn't a patch I can accept at this point in time, I'm sorry.
>
> Fair enough. Will a function attribute aka:
>
> __attribute__((optimize("foo=bar")))
>
> be acceptable for you?

Unless you can show me widespread acceptance of these types of
optimizations in core kernel code, my answer is no. I'm sorry.

-- 
paul-moore.com
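
For readers unfamiliar with the two forms discussed above, the following is a minimal user-space sketch of the per-function attribute variant, reusing the option strings from the quoted #pragma. It is not kernel code and not part of the proposed patch: the macro HOT_LOOP_OPTS and the function count_matches() are hypothetical names chosen for illustration, and the guard assumes the optimize attribute is only honored by GCC, so it expands to nothing on other compilers.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumption: only GCC implements __attribute__((optimize(...))).
 * On other compilers the macro expands to nothing and the function
 * is built with the normal command-line optimization flags.
 */
#if defined(__GNUC__) && !defined(__clang__)
#define HOT_LOOP_OPTS \
	__attribute__((optimize("unswitch-loops", "align-loops=16", "align-jumps=16")))
#else
#define HOT_LOOP_OPTS
#endif

/*
 * Hypothetical stand-in for a filter loop; this is NOT the kernel's
 * __audit_filter_op(). The attribute applies to this function only,
 * whereas "#pragma GCC optimize" affects the functions defined after it.
 */
HOT_LOOP_OPTS
size_t count_matches(const unsigned long *vals, size_t n,
		     int use_mask, unsigned long mask)
{
	size_t hits = 0;

	assert(vals != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		/* use_mask never changes inside the loop, which makes the
		 * loop a candidate for unswitching. */
		if (use_mask) {
			if (vals[i] & mask)
				hits++;
		} else {
			if (vals[i] != 0)
				hits++;
		}
	}
	return hits;
}
```

The attribute changes code generation only, so the function returns the same results with or without it; any speedup would have to be measured, as was done for the pragma in the quoted patch.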
