On Sat, Dec 16, 2023 at 11:28 AM Haakon Bugge <[email protected]> wrote:
> > On 14 Dec 2023, at 00:45, Paul Moore <[email protected]> wrote:
> > On Tue, Dec 12, 2023 at 5:29 AM Håkon Bugge <[email protected]> wrote:
> >>
> >> For the most time-consuming function, when running a syscall benchmark
> >> with STIG compliant audit rules:
> >>
> >>  Overhead  Command       Shared Object      Symbol
> >> .........  ............  .................  ........................
> >>
> >>    27.62%  syscall_lat   [kernel.kallsyms]  [k] __audit_filter_op
> >>
> >> we apply codegen optimizations, which speeds up the syscall
> >> performance by around 17% on an Intel Cascade Lake system.
> >>
> >> We run "perf stat -d -r 5 ./syscall_lat", where syscall_lat is a C
> >> application that measures average syscall latency from getpid()
> >> running 100 million rounds.
> >>
> >> Between each perf run, we reboot the system and waits until the last
> >> minute load is less than 1.0.
> >>
> >> We boot the kernel, v6.6-rc4, with "mitigations=off", in order to
> >> amplify the changes in the audit system.
> >>
> >> Let the base kernel be v6.6-rc4 with booted with "audit=1" and
> >> "mitigations=off" and with the commit "audit: Vary struct audit_entry
> >> alignment" on an Intel Cascade Lake system. The following three
> >> metrics are reported, nanoseconds per syscall, L1D misses per syscall,
> >> and finally Intructions Per Cycle, ipc.
> >>
> >> Base vs. base + this commit gives:
> >>
> >> ns per call:
> >>  min avg max   pstdev
> >> - 203 203 209 0.954149
> >> + 173 173 178 0.884534
> >>
> >> L1d misses per syscall:
> >>     min    avg    max   pstdev
> >> -  0.012  0.103  0.817 0.238352
> >> +  0.010  0.209  1.235 0.399416
> >>
> >> ipc:
> >>    min    avg    max   pstdev
> >> - 2.320  2.329  2.330 0.003000
> >> + 2.430  2.436  2.440 0.004899
> >>
> >> Signed-off-by: Håkon Bugge <[email protected]>
> >> ---
> >> kernel/auditsc.c | 2 ++
> >> 1 file changed, 2 insertions(+)
> >>
> >> diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> >> index 6f0d6fb6523fa..84d0dfe75a4ac 100644
> >> --- a/kernel/auditsc.c
> >> +++ b/kernel/auditsc.c
> >> @@ -822,6 +822,7 @@ static int audit_in_mask(const struct audit_krule 
> >> *rule, unsigned long val)
> >>  * parameter can be NULL, but all others must be specified.
> >>  * Returns 1/true if the filter finds a match, 0/false if none are found.
> >>  */
> >> +#pragma GCC optimize("unswitch-loops", "align-loops=16", "align-jumps=16")
> >
> > The kernel doesn't really make use of #pragma optimization statements
> > like this, at least not in any of the core areas, and I'm not
> > interested in being the first to do so.  I appreciate the time and
> > effort that you have spent profiling the audit subsystem, but this
> > isn't a patch I can accept at this point in time, I'm sorry.
>
> Fair enough. Will a function attribute aka:
>
> __attribute__((optimize("foo=bar")))
>
> be acceptable for you?

Unless you can show me widespread acceptance of these types of
optimizations in core kernel code, my answer is no.  I'm sorry.

-- 
paul-moore.com

Reply via email to