On 26/01/09 11:22AM, Joanne Koong wrote:
> On Wed, Jan 7, 2026 at 7:34 AM John Groves <[email protected]> wrote:
> >
> > The shadow path is a (usually in tmpfs) file system area used by the
> > famfs user space to communicate with the famfs fuse server. There is a
> > minor dilemma that the user space tools must be able to resolve from a
> > mount point path to a shadow path. Passing in the 'shadow=<path>'
> > argument at mount time causes the shadow path to be exposed via
> > /proc/mounts, solving this dilemma. The shadow path is not otherwise
> > used in the kernel.
> 
> Instead of using mount options to pass the userspace metadata, could
> /sys/fs be used instead? The client is able to get the connection id
> by stat-ing the famfs mount path. There could be a
> /sys/fs/fuse/connections/{id}/metadata file that the server fills out
> with whatever metadata needs to be read by the client. Having
> something like this would be useful to non-famfs servers as well.

The shadow option isn't the only possible way to get what famfs needs,
but I do like it - I find it to be an elegant solution to the problem.

What's the problem? Well, for that you need to know some implementation 
details of the famfs userspace. For the *structure* of a mounted file 
system, famfs is very passthrough-like. The structure that is being 
passed through is the shadow file system, which is an actual file system 
(usually tmpfs).  Directories are just directories, but shadow files 
contain yaml that describes the file-to-dax map of the *actual* file. 
On lookup, the famfs fuse server (famfs_fused), rather than stat the 
file like passthrough, reads the yaml and decodes the stat and fmap info 
from that.

One other detail. The shadow path must be known or created (usually
as a tmpdir, to guarantee it starts empty) at mount time. The kernel
knows about it through "-o shadow=<path>", but otherwise doesn't use
it. The famfs fuse server receives the path as an input from 
'famfs mount'. The problem is that pretty much every famfs-related
user space command needs the shadow path.

In fact, the structure of the mounted file system is at
<shadow_path>/root. Also located in <shadow_path> (above ./root) is a
unix domain socket for REST communication with famfs_fused. We have
plans for other files in <shadow_path>, above ./root (mount-specific
config options, for example).

Playing the famfs metadata log requires finding the shadow path,
parsing the log, and creating (or potentially modifying) shadow files
in the shadow path for the mount.

So to communicate with the fuse server, we parse the shadow path from
/proc/mounts, which gives us <shadow_path>/socket for talking to
famfs_fused. And we can play the metadata log (accessed via
MPT/.meta/.log, where MPT is the mount point) to <shadow_path>/root/...
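
For illustration only, here is a minimal user-space sketch of that
lookup. It is not the actual famfs code: the function names are made up
for this example, and it assumes the option appears in /proc/mounts
exactly as "shadow=<path>" with no commas in the path (the patch prints
the path with a plain %s, so a comma would be ambiguous).

```c
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy the value of the "shadow=" option out of a comma-separated mount
 * option string (the fourth field of a /proc/mounts line) into buf.
 * Returns 0 on success, -1 if the option is absent, empty, or too long.
 * (Hypothetical helper, named for this example only.)
 */
int famfs_shadow_from_opts(const char *opts, char *buf, size_t buflen)
{
	static const char key[] = "shadow=";
	const size_t keylen = sizeof(key) - 1;
	const char *p = opts;

	while (p && *p) {
		const char *end = strchr(p, ',');
		size_t len = end ? (size_t)(end - p) : strlen(p);

		if (len >= keylen && strncmp(p, key, keylen) == 0) {
			size_t vlen = len - keylen;

			if (vlen == 0 || vlen >= buflen)
				return -1;
			memcpy(buf, p + keylen, vlen);
			buf[vlen] = '\0';
			return 0;
		}
		p = end ? end + 1 : NULL;	/* advance to the next option */
	}
	return -1;
}

/*
 * Walk /proc/mounts and find the shadow path for the mount point mpt.
 * Returns 0 on success, -1 if the mount or the option is not found.
 */
int famfs_shadow_for_mount(const char *mpt, char *buf, size_t buflen)
{
	FILE *fp = setmntent("/proc/mounts", "r");
	struct mntent *ent;
	int rc = -1;

	if (!fp)
		return -1;
	while ((ent = getmntent(fp)) != NULL) {
		if (strcmp(ent->mnt_dir, mpt) == 0 &&
		    famfs_shadow_from_opts(ent->mnt_opts, buf, buflen) == 0) {
			rc = 0;
			break;
		}
	}
	endmntent(fp);
	return rc;
}
```

With the shadow path in hand, a tool would append "/socket" or "/root"
to reach the socket or the shadow tree described above.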

Having something in sysfs would be fine, but unless we pass it into
the kernel somehow (hey, like -o shadow=<shadow path>), the kernel
won't know it and can't reveal it.
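
For comparison, the client side of the sysfs idea might look roughly
like the sketch below. This is only a sketch under assumptions: the
"metadata" file is the one proposed above and does not exist today, the
function names are invented for this example, and it assumes the
connection id equals the minor number of the mount's st_dev (which I
believe holds for non-fuseblk mounts).

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/*
 * Build the path of the proposed (currently nonexistent) per-connection
 * metadata file from a device number.
 * Assumption: the fuse connection id is the minor number of st_dev.
 * Returns 0 on success, -1 if buf is too small.
 */
int fuse_conn_metadata_path(dev_t dev, char *buf, size_t buflen)
{
	int n = snprintf(buf, buflen,
			 "/sys/fs/fuse/connections/%u/metadata",
			 (unsigned int)minor(dev));

	return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/*
 * stat() the mount point and derive the metadata path from its st_dev.
 * Returns 0 on success, -1 on stat failure or a too-small buffer.
 */
int fuse_conn_metadata_path_for_mount(const char *mpt, char *buf,
				      size_t buflen)
{
	struct stat st;

	if (stat(mpt, &st) != 0)
		return -1;
	return fuse_conn_metadata_path(st.st_dev, buf, buflen);
}
```

Either way, something still has to hand the shadow path to the kernel
or to the server-filled file before a client can read it back.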

A big no-go, I think, is trying to parse the shadow path from the
famfs fuse server via 'ps -ef' or 'ps -ax'. The famfs cli etc. might
be running in a container that doesn't have access to that.

Happy to discuss further...

> 
> >
> > Signed-off-by: John Groves <[email protected]>
> > ---
> >  fs/fuse/fuse_i.h | 25 ++++++++++++++++++++++++-
> >  fs/fuse/inode.c  | 28 +++++++++++++++++++++++++++-
> >  2 files changed, 51 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> > index ec2446099010..84d0ee2a501d 100644
> > --- a/fs/fuse/fuse_i.h
> > +++ b/fs/fuse/fuse_i.h
> > @@ -620,9 +620,11 @@ struct fuse_fs_context {
> >         unsigned int blksize;
> >         const char *subtype;
> >
> > -       /* DAX device, may be NULL */
> > +       /* DAX device for virtiofs, may be NULL */
> >         struct dax_device *dax_dev;
> >
> > +       const char *shadow; /* famfs - null if not famfs */
> > +
> >         /* fuse_dev pointer to fill in, should contain NULL on entry */
> >         void **fudptr;
> >  };
> > @@ -998,6 +1000,18 @@ struct fuse_conn {
> >                 /* Request timeout (in jiffies). 0 = no timeout */
> >                 unsigned int req_timeout;
> >         } timeout;
> > +
> > +       /*
> > +        * This is a workaround until fuse uses iomap for reads.
> > +        * For fuseblk servers, this represents the blocksize passed in at
> > +        * mount time and for regular fuse servers, this is equivalent to
> > +        * inode->i_blkbits.
> > +        */
> > +       u8 blkbits;
> > +
> 
> I think you meant to remove these lines?

I was gonna say those are Darrick's lines...but they came in through my patch.
So yes, I will drop them. Oops :D

I'm not sure how this leaked into my patch, but that's one of the reasons why
reviews are good - thanks!

> 
> > +#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX)
> > +       char *shadow;
> 
> Should this be const char * too?
> > +#endif
> >  };
> >
> >  /*
> > @@ -1631,4 +1645,13 @@ extern void fuse_sysctl_unregister(void);
> >  #define fuse_sysctl_unregister()       do { } while (0)
> >  #endif /* CONFIG_SYSCTL */
> >
> > +/* famfs.c */
> > +
> > +static inline void famfs_teardown(struct fuse_conn *fc)
> > +{
> > +#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX)
> > +       kfree(fc->shadow);
> > +#endif
> > +}
> > +
> >  #endif /* _FS_FUSE_I_H */
> > diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> > index acabf92a11f8..2e0844aabbae 100644
> > --- a/fs/fuse/inode.c
> > +++ b/fs/fuse/inode.c
> > @@ -783,6 +783,9 @@ enum {
> >         OPT_ALLOW_OTHER,
> >         OPT_MAX_READ,
> >         OPT_BLKSIZE,
> > +#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX)
> > +       OPT_SHADOW,
> > +#endif
> >         OPT_ERR
> >  };
> >
> > @@ -797,6 +800,9 @@ static const struct fs_parameter_spec fuse_fs_parameters[] = {
> >         fsparam_u32     ("max_read",            OPT_MAX_READ),
> >         fsparam_u32     ("blksize",             OPT_BLKSIZE),
> >         fsparam_string  ("subtype",             OPT_SUBTYPE),
> > +#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX)
> > +       fsparam_string("shadow",                OPT_SHADOW),
> 
> nit: having the spacing for ("shadow", align with the lines above
> would be aesthetically nice

Done, thanks

> 
> > +#endif
> >         {}
> >  };
> >
> > @@ -892,6 +898,15 @@ static int fuse_parse_param(struct fs_context *fsc, struct fs_parameter *param)
> >                 ctx->blksize = result.uint_32;
> >                 break;
> >
> > +#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX)
> > +       case OPT_SHADOW:
> > +               if (ctx->shadow)
> > +                       return invalfc(fsc, "Multiple shadows specified");
> > +               ctx->shadow = param->string;
> > +               param->string = NULL;
> > +               break;
> > +#endif
> > +
> >         default:
> >                 return -EINVAL;
> >         }
> > @@ -905,6 +920,7 @@ static void fuse_free_fsc(struct fs_context *fsc)
> >
> >         if (ctx) {
> >                 kfree(ctx->subtype);
> > +               kfree(ctx->shadow);
> >                 kfree(ctx);
> >         }
> >  }
> > @@ -936,7 +952,10 @@ static int fuse_show_options(struct seq_file *m, struct dentry *root)
> >         else if (fc->dax_mode == FUSE_DAX_INODE_USER)
> >                 seq_puts(m, ",dax=inode");
> >  #endif
> > -
> > +#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX)
> > +       if (fc->shadow)
> > +               seq_printf(m, ",shadow=%s", fc->shadow);
> > +#endif
> >         return 0;
> >  }
> >
> > @@ -1041,6 +1060,8 @@ void fuse_conn_put(struct fuse_conn *fc)
> >                 WARN_ON(atomic_read(&bucket->count) != 1);
> >                 kfree(bucket);
> >         }
> > +       famfs_teardown(fc);
> 
> imo it looks a bit cleaner with
> 
> if (IS_ENABLED(CONFIG_FUSE_FAMFS_DAX))
>      famfs_teardown(fc);
> 
> which also matches the pattern the passthrough config below uses

Done

> 
> > +
> >         if (IS_ENABLED(CONFIG_FUSE_PASSTHROUGH))
> >                 fuse_backing_files_free(fc);
> >         call_rcu(&fc->rcu, delayed_release);
> > @@ -1916,6 +1937,11 @@ int fuse_fill_super_common(struct super_block *sb, struct fuse_fs_context *ctx)
> >                 *ctx->fudptr = fud;
> >                 wake_up_all(&fuse_dev_waitq);
> >         }
> > +
> > +#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX)
> > +       fc->shadow = kstrdup(ctx->shadow, GFP_KERNEL);
> 
> Is a shadow path a must-have for a famfs mount? if so, then should the
> mount fail if the allocation here fails?

Summarized above...

> 
> Thanks,
> Joanne
> > +#endif
> > +
> >         mutex_unlock(&fuse_mutex);
> >         return 0;
> >
> > --
> > 2.49.0
> >

Thanks Joanne!

John

