Re: [OpenAFS-devel] Alternate file systems for disk cache

Derrick Brashear Thu, 21 Oct 2010 08:05:05 -0700

Suddenly, reading this, I fear I may have led you down the garden
path. Apologies. Read on.


On Thu, Oct 21, 2010 at 9:30 AM, Charles M. Hannum <[email protected]> wrote:
> Following my bug report yesterday adding a check for JFS, I wanted to supply
> some additional information.
> The basic problem here is that the dcache code pulls out inode numbers and
> then looks them up later.  In older versions of Linux, this was done with
> iget().  In recent Linux 2.6 kernels, it's done by faking up a file handle
> with type FILEID_INO32_GEN and using the file system's fh_to_dentry()
> function.  The limitation on file systems is now primarily which ones
> support FILEID_INO32_GEN and the generation==0 hack.
> I've done a full audit of the file systems included in the Linux 2.6.35
> source tree, and found:
> 1) uses FILEID_INO32_GEN (should work):
>   efs
>   exofs
>   ext2/3/4
>   jffs2
>   jfs
>   ufs
> 2) uses FILEID_INO32_GEN (no generation==0 hack, but trivial to add):
>   ntfs
>   xfs
> 3) uses custom file handle format:
>   btrfs
>   ceph
>   fat
>   fuse
>   gfs2
>   isofs
>   ocfs2
>   reiserfs
>   udf
> It seems to me that making type 3 FSes work would be as “simple” as making
> the AFS module use encode_fh() and store the file handle actually generated
> by the file system.  This would take slightly more memory, as we'd have to
> store the type and length.  Even in the worst case (btrfs with
> connectable==true, which we don't have to use), the maximum file handle size
> is 40 bytes, so figure 44 bytes extra per dcache file.  If we decide to use
> connectable==false (ceph and fat ignore this, but keep their file handles
> within the NFSv2 limit of 20 bytes anyway), then we only need 24 extra bytes
> per dcache file.
> More importantly, this will require quite a few changes throughout the AFS
> module code, because it likes to pass around inode numbers.

It shouldn't, at least in 1.5.current and head: it passes around a
struct which as it happens contains... what I think you suggest?
#define MAX_FH_LEN 10
typedef union {
#if defined(NEW_EXPORT_OPS)
    struct fid fh;
#endif
    __u32 raw[MAX_FH_LEN];
} afs_ufs_dcache_id_t;
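
As a rough user-space check of the sizes involved (a sketch only, not OpenAFS code): with MAX_FH_LEN at 10, the raw array is 40 bytes, which matches the worst-case btrfs handle mentioned above, and adding a type and a length gives the 44-byte figure. The names demo_fid, demo_dcache_id_t, demo_dcache_fh and demo_fh_fits are hypothetical, and demo_fid is a simplified stand-in for the kernel's struct fid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FH_LEN 10   /* same value as in the definition above */

/* Hypothetical stand-in for the kernel's struct fid (ino + generation only). */
struct demo_fid {
    uint32_t ino;
    uint32_t gen;
};

/* User-space mirror of afs_ufs_dcache_id_t, used for size checking only. */
typedef union {
    struct demo_fid fh;
    uint32_t raw[MAX_FH_LEN];
} demo_dcache_id_t;

/* Hypothetical per-cache-file record: the handle plus its type and length. */
struct demo_dcache_fh {
    uint16_t type;       /* value returned by encode_fh() */
    uint16_t len_words;  /* handle length in 32-bit words */
    demo_dcache_id_t id;
};

/* 10 words of 4 bytes each; type and length add 4 more bytes. */
static_assert(sizeof(demo_dcache_id_t) == 40, "raw handle is 40 bytes");
static_assert(sizeof(struct demo_dcache_fh) == 44, "handle plus type and length is 44 bytes");

/* Returns nonzero if a handle of len_bytes fits in the raw array. */
static int demo_fh_fits(size_t len_bytes)
{
    return len_bytes <= sizeof(((demo_dcache_id_t *)0)->raw);
}
```

Both the 40-byte worst case and the 20-byte NFSv2-sized handles fit in the raw array as it stands; anything larger would not.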

So... looking at the JFS patch in RT, I suddenly got it: you were
looking at 1.4.x.

>  However, other
> systems could also use the change and not be dependent on a single file
> system type for AFS cache any more, so this has potentially widespread
> benefit.
> In any case, I think it would be beneficial to at least do a feature test at
> startup time rather than encode specific file system types in afsd as is
> currently done.  I propose to do this by calling encode_fh(), checking that
> the return type is FILEID_INO32_GEN, setting the generation count to 0, and
> calling fh_to_dentry().  If this does not work, we can punt with an error.
>  This would enable all type 1 FSes to work immediately (which includes at
> least one non-integrated port of ZFS), and type 2 FSes to work if/when
> patches get integrated.
> Any thoughts?

Some of this may still need to be done for things to work properly,
since currently we don't properly report a failure on what you call
type 2 file systems.
What we do:
static inline int
afs_get_fh_from_dentry(struct dentry *dp, afs_ufs_dcache_id_t *ainode, int *max_lenp) {
    if (dp->d_sb->s_export_op->encode_fh)
        return dp->d_sb->s_export_op->encode_fh(dp, &ainode->raw[0], max_lenp, 0);
#if defined(NEW_EXPORT_OPS)
    /* If fs doesn't provide an encode_fh method, assume the default INO32 type */
    *max_lenp = sizeof(struct fid)/4;
    ainode->fh.i32.ino = dp->d_inode->i_ino;
    ainode->fh.i32.gen = dp->d_inode->i_generation;
    return FILEID_INO32_GEN;
#else
    /* or call the default encoding function for the old API */
    return export_op_default.encode_fh(dp, &ainode->raw[0], max_lenp, 0);
#endif
}
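
The startup feature test proposed in the quoted message (encode a handle, check that the type is FILEID_INO32_GEN, zero the generation, decode it again) could look roughly like the user-space model below. This is a sketch under assumptions, not kernel or OpenAFS code: struct demo_fs, demo_probe_cache_fs and the three mock file systems are hypothetical, and the handle layout (word 0 = inode, word 1 = generation) only models the INO32_GEN format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_FILEID_INO32_GEN 1  /* mirrors the kernel's FILEID_INO32_GEN */
#define DEMO_LIVE_GEN 7          /* generation of the probed inode in the mocks */

/* Hypothetical model of a file system's export operations. */
struct demo_fs {
    /* Fills fh[], sets *len_words, returns the handle type. */
    int (*encode_fh)(uint32_t ino, uint32_t gen, uint32_t *fh, int *len_words);
    /* Returns the inode number the handle resolves to, or 0 on failure. */
    uint32_t (*fh_to_ino)(const uint32_t *fh, int len_words, int type);
};

/* Returns 1 if the fs passes the generation==0 probe, 0 otherwise. */
static int demo_probe_cache_fs(const struct demo_fs *fs, uint32_t ino, uint32_t gen)
{
    uint32_t fh[10] = {0};
    int len = 10;
    int type;

    assert(fs != NULL);
    if (!fs->encode_fh || !fs->fh_to_ino)
        return 0;
    type = fs->encode_fh(ino, gen, fh, &len);
    if (type != DEMO_FILEID_INO32_GEN || len < 2)
        return 0;               /* custom handle format ("type 3") */
    fh[1] = 0;                  /* zero the generation count */
    return fs->fh_to_ino(fh, len, type) == ino;
}

/* Mock encoders and decoders for the three categories in the audit. */
static int enc_ino32(uint32_t ino, uint32_t gen, uint32_t *fh, int *len)
{
    fh[0] = ino; fh[1] = gen; *len = 2;
    return DEMO_FILEID_INO32_GEN;
}

static int enc_custom(uint32_t ino, uint32_t gen, uint32_t *fh, int *len)
{
    fh[0] = ino; fh[1] = gen; fh[2] = 0xabcd; *len = 3;
    return 0x4d;                /* arbitrary non-INO32 type for the mock */
}

/* Accepts generation 0 as a wildcard (the "generation==0 hack"). */
static uint32_t dec_lenient(const uint32_t *fh, int len, int type)
{
    if (type != DEMO_FILEID_INO32_GEN || len < 2)
        return 0;
    if (fh[1] != 0 && fh[1] != DEMO_LIVE_GEN)
        return 0;
    return fh[0];
}

/* Requires an exact generation match, so a zeroed generation fails. */
static uint32_t dec_strict(const uint32_t *fh, int len, int type)
{
    if (type != DEMO_FILEID_INO32_GEN || len < 2)
        return 0;
    return fh[1] == DEMO_LIVE_GEN ? fh[0] : 0;
}

static const struct demo_fs demo_type1 = { enc_ino32, dec_lenient };
static const struct demo_fs demo_type2 = { enc_ino32, dec_strict };
static const struct demo_fs demo_type3 = { enc_custom, dec_lenient };
```

In this model only demo_type1 passes the probe; demo_type2 fails at the decode step and demo_type3 at the type check, which is the point at which a real implementation could punt with an error.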






-- 
Derrick
_______________________________________________
OpenAFS-devel mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-devel
