Previously, commit d53cd891f0e4 ("erofs: limit the level of fs stacking
for file-backed mounts") bumped `s_stack_depth` by one to avoid kernel
stack overflow, but it breaks composefs mounts, which need erofs+ovl^2
sometimes (and such setups are already used in production for quite long
time) since `s_stack_depth` can be 3 (i.e., FILESYSTEM_MAX_STACK_DEPTH
needs to change from 2 to 3).

After a long discussion on GitHub issues [1] about possible solutions,
it seems there is no need to support nesting file-backed mounts as one
conclusion (especially when increasing FILESYSTEM_MAX_STACK_DEPTH to 3).
So let's disallow this right now, since there is always a way to use
loopback devices as a fallback.

Then, I started to wonder about an alternative EROFS quick fix to
address the composefs mounts directly for this cycle: since EROFS is the
only fs to support file-backed mounts and other stacked fses will just
bump up `FILESYSTEM_MAX_STACK_DEPTH`, just check that `s_stack_depth`
!= 0 and the backing inode is not from EROFS instead.

At least it works for all known file-backed mount use cases (composefs,
containerd, and Android APEX for some Android vendors), and the fix is
self-contained.

Let's defer increasing FILESYSTEM_MAX_STACK_DEPTH for now.

Fixes: d53cd891f0e4 ("erofs: limit the level of fs stacking for file-backed 
mounts")
Closes: https://github.com/coreos/fedora-coreos-tracker/issues/2087 [1]
Closes: 
https://lore.kernel.org/r/CAFHtUiYv4+=+JP_-JjARWjo6OwcvBj1wtYN=z0qxwcpec9s...@mail.gmail.com
Cc: Amir Goldstein <[email protected]>
Cc: Alexander Larsson <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Signed-off-by: Gao Xiang <[email protected]>
---
 fs/erofs/super.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 937a215f626c..0cf41ed7ced8 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -644,14 +644,20 @@ static int erofs_fc_fill_super(struct super_block *sb, 
struct fs_context *fc)
                 * fs contexts (including its own) due to self-controlled RO
                 * accesses/contexts and no side-effect changes that need to
                 * context save & restore so it can reuse the current thread
-                * context.  However, it still needs to bump `s_stack_depth` to
-                * avoid kernel stack overflow from nested filesystems.
+                * context.
+                * However, we still need to prevent kernel stack overflow due
+                * to filesystem nesting: just ensure that s_stack_depth is 0
+                * to disallow mounting EROFS on stacked filesystems.
+                * Note: s_stack_depth is not incremented here for now, since
+                * EROFS is the only fs supporting file-backed mounts for now.
+                * It MUST change if another fs plans to support them, which
+                * may also require adjusting FILESYSTEM_MAX_STACK_DEPTH.
                 */
                if (erofs_is_fileio_mode(sbi)) {
-                       sb->s_stack_depth =
-                               file_inode(sbi->dif0.file)->i_sb->s_stack_depth 
+ 1;
-                       if (sb->s_stack_depth > FILESYSTEM_MAX_STACK_DEPTH) {
-                               erofs_err(sb, "maximum fs stacking depth 
exceeded");
+                       inode = file_inode(sbi->dif0.file);
+                       if (inode->i_sb->s_op == &erofs_sops ||
+                           inode->i_sb->s_stack_depth) {
+                               erofs_err(sb, "file-backed mounts cannot be 
applied to stacked fses");
                                return -ENOTBLK;
                        }
                }
-- 
2.43.5


Reply via email to