On Tue, Dec 12, 2023 at 5:29 AM Håkon Bugge <[email protected]> wrote:
>
> We allocate struct audit_entry using kzalloc(), which serves the
> allocation from the kmalloc-512 slab and thus aligns it at its
> natural (512-byte) boundary.
>
> That means the nine low-order address bits are identical for all of
> these allocations, which, on architectures with a limited number of
> L1D cache sets, causes conflict misses.
>
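
For concreteness (cache geometry assumed for illustration, not taken
from the patch): a 32 KiB, 8-way L1D with 64-byte lines has 64 sets,
indexed by address bits [11:6]. With bits [8:0] pinned by the 512-byte
alignment, only bits [11:9] vary, so a quick user-space sketch shows
these allocations can only ever land in 8 of the 64 sets:

	#include <stdio.h>

	int main(void)
	{
		const unsigned long nsets = 64;      /* assumed: 32K / (8 ways * 64B lines) */
		const unsigned long line_shift = 6;  /* assumed: 64-byte lines */

		/* 512-byte-aligned addresses only ever hit every 8th set. */
		for (unsigned long addr = 0; addr < 8 * 512; addr += 512)
			printf("addr %4lu -> set %2lu\n", addr,
			       (addr >> line_shift) % nsets);
		return 0;
	}
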
> With STIG-compliant audit rules, up to 26 L1D misses per syscall have
> been observed. This, of course, varies from boot to boot.
>
> Fix this by using a kmem_cache aligned to the cacheline width, which
> greatly increases the entropy in the lower-order VA bits. With this
> fix, we observe a maximum of 0.8 L1D misses per syscall using the
> same set of audit rules.
>
> Testing: run "perf stat -d -r 5 ./syscall_lat", where syscall_lat is
> a simple getpid() loop running 100 million rounds. This test is run
> 10 times, with a reboot in between. After each reboot, we wait until
> the one-minute load average drops below 1.0.
>
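
The syscall_lat source is not shown; a minimal sketch of such a
getpid() loop (structure assumed) could look like:

	#include <sys/syscall.h>
	#include <unistd.h>

	int main(void)
	{
		const long rounds = 100 * 1000 * 1000;

		/* Raw syscall to force a kernel entry on every iteration. */
		for (long i = 0; i < rounds; i++)
			syscall(SYS_getpid);
		return 0;
	}
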
> We boot the kernel (v6.6-rc4) with "mitigations=off" in order to
> amplify the impact of changes in the audit system.
>
> Base vs. base + this commit gives:
>
> ns per call:
>            min      avg      max    pstdev
> -          205      210      227  6.402000
> +          203      203      209  0.954149
>
> L1d misses per syscall:
>            min      avg      max    pstdev
> -        3.147   12.017   26.045  6.568284
> +        0.012    0.103    0.817  0.238352
>
> ipc:
>            min      avg      max    pstdev
> -        2.090    2.259    2.300  0.060075
> +        2.320    2.329    2.330  0.003000
>
> All of the above metrics improve, with the L1D misses per syscall
> showing a particularly significant reduction.
>
> Co-developed-by: Ankur Arora <[email protected]>
> Signed-off-by: Ankur Arora <[email protected]>
> Signed-off-by: Håkon Bugge <[email protected]>
> ---
> kernel/auditfilter.c | 14 +++++++++++---
> 1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/auditfilter.c b/kernel/auditfilter.c
> index 8317a37dea0bb..b7597a63a3950 100644
> --- a/kernel/auditfilter.c
> +++ b/kernel/auditfilter.c
> @@ -80,6 +80,7 @@ static void audit_free_lsm_field(struct audit_field *f)
> }
> }
>
> +static struct kmem_cache *entry_cache;
> static inline void audit_free_rule(struct audit_entry *e)
> {
> int i;
> @@ -93,7 +94,7 @@ static inline void audit_free_rule(struct audit_entry *e)
> audit_free_lsm_field(&erule->fields[i]);
> kfree(erule->fields);
> kfree(erule->filterkey);
> - kfree(e);
> + kmem_cache_free(entry_cache, e);
> }
>
> void audit_free_rule_rcu(struct rcu_head *head)
> @@ -108,13 +109,20 @@ static inline struct audit_entry *audit_init_entry(u32 field_count)
> struct audit_entry *entry;
> struct audit_field *fields;
>
> - entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> + if (!entry_cache) {
> +		entry_cache = kmem_cache_create("audit_entry", sizeof(*entry), 0,
> +						SLAB_HWCACHE_ALIGN, NULL);
> + if (!entry_cache)
> +			return NULL;
Two things:
1. If we are going to create a kmem_cache pool we shouldn't create it
here; it should be created in its own audit_filter_init() function,
called from audit_init() (see the sketch below).
2. I'm not sure it makes a lot of sense to create a kmem_cache pool
for audit filter entries, especially given the modest performance
gains. Is there not some way to request cacheline alignment with
kmalloc() or similar?
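
A minimal sketch of suggestion 1, with the initialization hook assumed
(audit_filter_init() does not exist yet); the cache would be created
once from audit_init() instead of lazily in audit_init_entry():

	/* In kernel/auditfilter.c; struct audit_entry comes from kernel/audit.h. */
	static struct kmem_cache *entry_cache;

	/* Assumed: called once from audit_init() in kernel/audit.c. */
	static int __init audit_filter_init(void)
	{
		entry_cache = kmem_cache_create("audit_entry",
						sizeof(struct audit_entry), 0,
						SLAB_HWCACHE_ALIGN, NULL);
		return entry_cache ? 0 : -ENOMEM;
	}
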
> + }
> +
> + entry = kmem_cache_zalloc(entry_cache, GFP_KERNEL);
> if (unlikely(!entry))
> return NULL;
>
> fields = kcalloc(field_count, sizeof(*fields), GFP_KERNEL);
> if (unlikely(!fields)) {
> - kfree(entry);
> + kmem_cache_free(entry_cache, entry);
> return NULL;
> }
> entry->rule.fields = fields;
> --
> 2.39.3
--
paul-moore.com