On 2026/1/19 17:22, Christoph Hellwig wrote:
On Mon, Jan 19, 2026 at 04:52:54PM +0800, Gao Xiang wrote:
To me this sounds pretty scary, as we have code in the kernel's trust
domain that heavily depends on arbitrary userspace policy decisions.

For example, overlayfs metacopy can also point to
arbitrary files; what's the difference between them?
https://docs.kernel.org/filesystems/overlayfs.html#metadata-only-copy-up

By using metacopy, overlayfs can access arbitrary files
as long as the metacopy has the pointer, so it should
be a privileged operation, which is similar to this feature.

Sounds scary too.  But overlayfs' job is to combine underlying files, so
it is expected.  I think it's the mix of erofs being a disk based file

But you could still point to an arbitrary page cache
if metacopy is used.

system, and reaching out beyond the device(s) assigned to the file system
instance that makes me feel rather uneasy.

You mean the page cache can be shared from other
filesystems even if not backed by these devices/files?

I admit that yes, there could be a difference, but that
is why the new "inode_share" and "domain_id" mount
options are used.

I think they should be regarded as a single super
filesystem if "domain_id" is the same: From the
security perspective much like subvolumes of
a single super filesystem.

And mounting a new filesystem within a "domain_id"
can be regarded as importing data into the super
"domain_id" filesystem, and I think only trusted
data within the single domain can be mounted/shared.



Similarly the sharing of blocks between different file system
instances opens a lot of questions about trust boundaries and life
time rules.  I don't really have good answers, but writing up the

Could you give more details about these? You
raised the questions, but I have no idea where the threats
really come from.

Right now by default we don't allow any unprivileged mounts.  Now
if people think that say erofs is safe enough and opt into that,
it needs to be clear what the boundaries of that are.  For a file
system limited to a single block device the boundaries are
pretty clear.  For file systems reaching out to the entire system
(or some kind of domain), the scope is much wider.

Why would multiple devices differ for immutable
filesystems? No filesystem instance can change the primary
or external devices/blobs. All data is immutable.


As for the lifetime: the blobs themselves are immutable files,
so what do the lifetime rules mean?

What happens if the blob gets removed, intentionally or accidentally?

The extra device/blob reference is held during
the whole mount lifetime, much like the primary
(block) device.

And EROFS is an immutable filesystem, so the
inner blocks within the blob won't be removed
by any fs instance either.


And how do you define trust boundaries?  You mean users
have no right to access the data?

I think it's similar: for blockdevice-based filesystems,
you mount the filesystem with a given source, and the
mounter should have permission to access it.

Yes.

For multiple-blob EROFS filesystems, you mount the
filesystem with multiple data sources, and the mounter
should have permission to access the block devices
and/or backing files too.

And what prevents others from modifying them, or sneaking
unexpected data including unexpected comparison blobs in?

I don't think it's different from filesystems with a single
device.

First, EROFS instances never modify any underlying
devices/blobs:

If you mean some other program modifying the device data, yes,
it can be changed externally, but I think that's just like
trusting FUSE daemons: an untrusted FUSE daemon can return
arbitrary (meta)data at random times too.

Thanks,
Gao Xiang


