On Mon, Mar 23, 2026 at 11:07 AM Ricardo Robaina <[email protected]> wrote:
>
> Currently, determining the optimal `audit_backlog_limit` relies on
> instantaneous polling of the queue size. This misses transient
> micro-bursts, making it difficult for system administrators to know
> if their queue is adequately sized or if they are at risk of
> dropping events.
>
> This patch introduces `backlog_max_depth`, a high-water mark metric
> that tracks the maximum number of buffers in the audit queue since
> the system was booted or the metric was last reset. To minimize
> performance overhead in the fast-path, the metric is updated using
> a lockless cmpxchg loop in `__audit_log_end()`.
>
> Userspace can read-and-clear this metric by sending an `AUDIT_SET`
> message with the `AUDIT_STATUS_BACKLOG_MAX_DEPTH` mask. To support
> periodic telemetry polling (e.g., statsd, Prometheus), the reset
> operation atomically returns the snapshot of the high-water mark
> right before zeroing it, ensuring no peaks are lost between polls.
>
> Link: https://github.com/linux-audit/audit-kernel/issues/63
> Suggested-by: Steve Grubb <[email protected]>
> Signed-off-by: Ricardo Robaina <[email protected]>
> ---
>  include/linux/audit.h      |  3 ++-
>  include/uapi/linux/audit.h |  2 ++
>  kernel/audit.c             | 32 ++++++++++++++++++++++++++++++++
>  3 files changed, 36 insertions(+), 1 deletion(-)

I sat on this for a bit because I wanted to think it over.

While I agree audit could benefit from better statistics around
queue/backlog status, I'm not sure a single "max" value alone is worth
a bit in the audit_status bitmask. My concern is that the max queue
length only provides a single snapshot of what the queue looked like;
it doesn't give any indication of the average queue length over a span
of time. Some audit users are willing to live with occasional drops,
and the max size doesn't help them arrive at a good balance.

As for the users who can't tolerate any audit record drops? They
shouldn't be running with a backlog limit anyway, so the maximum queue
value will be of limited use.

--
paul-moore.com

