Hi,

A quick check with the Google LLM suggests this is a Ceph bug exposed
by recent changes to the Linux kernel, and not a kernel bug. A
selective quote:

Technical Root Cause: The issue stems from how the kernel handles the
fs_name field. In certain versions (notably regressions starting
around kernel 6.18-rc1), if an MDS namespace is not explicitly
specified during mount, the fs_name variable can remain NULL. When the
kernel later attempts a strict authorization check via
strcmp(auth->match.fs_name, fs_name), it triggers the NULL pointer
dereference.

Fix Implementation: Rework ceph_mdsmap_decode() to ensure m_fs_name is
always populated (using "cephfs" as a default for older systems) to
prevent the NULL comparison.
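
To make the quoted failure mode concrete, here is a minimal userspace
sketch in C. It is not the kernel code and not the actual patch: the
helper names `fs_name_or_default()` and `fs_name_matches()` are
hypothetical, and the "cephfs" default is taken only from the quote
above. It shows why passing a NULL `fs_name` straight to strcmp() is
unsafe, and how substituting a default before the comparison avoids it.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: default name taken from the quoted explanation above. */
#define DEFAULT_FS_NAME "cephfs"

/*
 * Hypothetical helper: return a usable name even when the caller never
 * set one, so later string comparisons never see a NULL pointer.
 */
static const char *fs_name_or_default(const char *fs_name)
{
    return fs_name ? fs_name : DEFAULT_FS_NAME;
}

/*
 * Hypothetical stand-in for the authorization check. Calling
 * strcmp(auth_fs_name, fs_name) with fs_name == NULL is undefined
 * behavior; in the kernel that is a NULL pointer dereference.
 * auth_fs_name is assumed to be non-NULL in this sketch.
 * Returns 1 when the names match, 0 otherwise.
 */
static int fs_name_matches(const char *auth_fs_name, const char *fs_name)
{
    return strcmp(auth_fs_name, fs_name_or_default(fs_name)) == 0;
}
```

With these helpers, `fs_name_matches("cephfs", NULL)` compares against
the default instead of dereferencing NULL, while a non-default name
such as "foobar" only matches when it is passed explicitly.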

The LLM notes the related bug CVE-2026-23189:
https://nvd.nist.gov/vuln/detail/cve-2026-23189

This explanation sounds entirely plausible to me, but without several
days of code review and experimentation, I cannot confirm it.

FWIW, my Ceph cluster has a non-default name (so I have to pass
`--cluster foobar` to many commands). This non-default name has caused
a significant number of "minor" issues, which I had to patch via
systemd config file settings. They were certainly annoyances, not
quite rising to the level of "bugs". I suspect that my non-default
config might lead to a non-default MDS namespace, which, on some of
the machines, results in `fs_name` being left NULL. This is a wild
guess, but it would explain why I see this and few others do.

-- Linas
