On Tue, Dec 12, 2023 at 5:29 AM Håkon Bugge <[email protected]> wrote:
>
> We allocate struct audit_entry using kzalloc(), which serves the
> allocation from the kmalloc-512 slab and thus aligns it at its
> natural (512-byte) boundary.
>
> That means the nine low-order address bits are identical for all of
> these allocations, which, on architectures with a limited number of
> L1D cache sets, causes conflict misses.
>
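
For concreteness (cache geometry assumed for illustration, not taken
from the patch): a 32 KiB, 8-way L1D with 64-byte lines has 64 sets,
indexed by address bits [11:6]. With bits [8:0] pinned by the 512-byte
alignment, only bits [11:9] vary, so a quick user-space sketch shows
these allocations can only ever land in 8 of the 64 sets:

	#include <stdio.h>

	int main(void)
	{
		const unsigned long nsets = 64;      /* assumed: 32K / (8 ways * 64B lines) */
		const unsigned long line_shift = 6;  /* assumed: 64-byte lines */

		/* 512-byte-aligned addresses only ever hit every 8th set. */
		for (unsigned long addr = 0; addr < 8 * 512; addr += 512)
			printf("addr %4lu -> set %2lu\n", addr,
			       (addr >> line_shift) % nsets);
		return 0;
	}
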
> With STIG-compliant audit rules, up to 26 L1D misses per syscall have
> been observed. This, of course, varies from boot to boot.
>
> Fix this by using a kmem_cache aligned to the cacheline width, which
> greatly increases the entropy in the lower-order VA bits. With this
> fix, we observe a maximum of 0.8 L1D misses per syscall using the
> same set of audit rules.
>
> Testing: run "perf stat -d -r 5 ./syscall_lat", where syscall_lat is
> a simple getpid() loop running 100 million rounds. This test is run
> 10 times, with a reboot in between. After each reboot, we wait until
> the one-minute load average drops below 1.0.
>
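
The syscall_lat source is not shown; a minimal sketch of such a
getpid() loop (structure assumed) could look like:

	#include <sys/syscall.h>
	#include <unistd.h>

	int main(void)
	{
		const long rounds = 100 * 1000 * 1000;

		/* Raw syscall to force a kernel entry on every iteration. */
		for (long i = 0; i < rounds; i++)
			syscall(SYS_getpid);
		return 0;
	}
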
> We boot the kernel (v6.6-rc4) with "mitigations=off" in order to
> amplify the impact of changes in the audit system.
>
> Base vs. base + this commit gives:
>
> ns per call:
>            min      avg      max    pstdev
> -          205      210      227  6.402000
> +          203      203      209  0.954149
>
> L1d misses per syscall:
>            min      avg      max    pstdev
> -        3.147   12.017   26.045  6.568284
> +        0.012    0.103    0.817  0.238352
>
> ipc:
>            min      avg      max    pstdev
> -        2.090    2.259    2.300  0.060075
> +        2.320    2.329    2.330  0.003000
>
> All of the above metrics improve, with the L1D misses per syscall
> showing a particularly significant reduction.
>
> Co-developed-by: Ankur Arora <[email protected]>
> Signed-off-by: Ankur Arora <[email protected]>
> Signed-off-by: Håkon Bugge <[email protected]>
> ---
> kernel/auditfilter.c | 14 +++++++++++---
> 1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/auditfilter.c b/kernel/auditfilter.c
> index 8317a37dea0bb..b7597a63a3950 100644
> --- a/kernel/auditfilter.c
> +++ b/kernel/auditfilter.c
> @@ -80,6 +80,7 @@ static void audit_free_lsm_field(struct audit_field *f)
> }
> }
>
> +static struct kmem_cache *entry_cache;
> static inline void audit_free_rule(struct audit_entry *e)
> {
> int i;
> @@ -93,7 +94,7 @@ static inline void audit_free_rule(struct audit_entry *e)
> audit_free_lsm_field(&erule->fields[i]);
> kfree(erule->fields);
> kfree(erule->filterkey);
> - kfree(e);
> + kmem_cache_free(entry_cache, e);
> }
>
> void audit_free_rule_rcu(struct rcu_head *head)
> @@ -108,13 +109,20 @@ static inline struct audit_entry *audit_init_entry(u32 field_count)
> struct audit_entry *entry;
> struct audit_field *fields;
>
> - entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> + if (!entry_cache) {
> +		entry_cache = kmem_cache_create("audit_entry", sizeof(*entry), 0,
> +						SLAB_HWCACHE_ALIGN, NULL);
> + if (!entry_cache)
> +			return NULL;
Two things:
1. If we are going to create a kmem_cache pool we shouldn't create it
here; it should be created in its own audit_filter_init() function,
called from audit_init() (see the sketch below).
2. I'm not sure it makes a lot of sense to create a kmem_cache pool
for audit filter entries, especially given the modest performance
gains. Is there not some way to request cacheline alignment with
kmalloc() or similar?
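
A minimal sketch of suggestion 1, with the initialization hook assumed
(audit_filter_init() does not exist yet); the cache would be created
once from audit_init() instead of lazily in audit_init_entry():

	/* In kernel/auditfilter.c; struct audit_entry comes from kernel/audit.h. */
	static struct kmem_cache *entry_cache;

	/* Assumed: called once from audit_init() in kernel/audit.c. */
	static int __init audit_filter_init(void)
	{
		entry_cache = kmem_cache_create("audit_entry",
						sizeof(struct audit_entry), 0,
						SLAB_HWCACHE_ALIGN, NULL);
		return entry_cache ? 0 : -ENOMEM;
	}
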
> + }
> +
> + entry = kmem_cache_zalloc(entry_cache, GFP_KERNEL);
> if (unlikely(!entry))
> return NULL;
>
> fields = kcalloc(field_count, sizeof(*fields), GFP_KERNEL);
> if (unlikely(!fields)) {
> - kfree(entry);
> + kmem_cache_free(entry_cache, entry);
> return NULL;
> }
> entry->rule.fields = fields;
> --
> 2.39.3
--
paul-moore.com