Re: [OpenAFS-devel] Alternate file systems for disk cache

Marc Dionne Thu, 21 Oct 2010 08:43:04 -0700

On Thu, Oct 21, 2010 at 9:30 AM, Charles M. Hannum <[email protected]> wrote:
> Following my bug report yesterday adding a check for JFS, I wanted to supply
> some additional information.
> The basic problem here is that the dcache code pulls out inode numbers and
> then looks them up later.  In older versions of Linux, this was done with
> iget().  In recent Linux 2.6 kernels, it's done by faking up a file handle
> with type FILEID_INO32_GEN and using the file system's fh_to_dentry()
> function.  The limitation on file systems is now primarily which ones
> support FILEID_INO32_GEN and the generation==0 hack.
> I've done a full audit of the file systems included in the Linux 2.6.35
> source tree, and found:
> 1) uses FILEID_INO32_GEN (should work):
>   efs
>   exofs
>   ext2/3/4
>   jffs2
>   jfs
>   ufs
> 2) uses FILEID_INO32_GEN (no generation==0 hack, but trivial to add):
>   ntfs
>   xfs
> 3) uses custom file handle format:
>   btrfs
>   ceph
>   fat
>   fuse
>   gfs2
>   isofs
>   ocfs2
>   reiserfs
>   udf
> It seems to me that making type 3 FSes work would be as “simple” as making
> the AFS module use encode_fh() and store the file handle actually generated
> by the file system.  This would take slightly more memory, as we'd have to
> store the type and length.  Even in the worst case (btrfs with
> connectable==true, which we don't have to use), the maximum file handle size
> is 40 bytes, so figure 44 bytes extra per dcache file.  If we decide to use
> connectable==false (ceph and fat ignore this, but keep their file handles
> within the NFSv2 limit of 20 bytes anyway), then we only need 24 extra bytes
> per dcache file.
> More importantly, this will require quite a few changes throughout the AFS
> module code, because it likes to pass around inode numbers.  However, other
> systems could also use the change and not be dependent on a single file
> system type for AFS cache any more, so this has potentially widespread
> benefit.
> In any case, I think it would be beneficial to at least do a feature test at
> startup time rather than encode specific file system types in afsd as is
> currently done.  I propose to do this by calling encode_fh(), checking that
> the return type is FILEID_INO32_GEN, setting the generation count to 0, and
> calling fh_to_dentry().  If this does not work, we can punt with an error.
> This would enable all type 1 FSes to work immediately (which includes at
> least one non-integrated port of ZFS), and type 2 FSes to work if/when
> patches get integrated.
> Any thoughts?


I would suggest that you have a look at the code in the master branch,
for instance in a recent 1.5 release.  It works pretty much as you
describe, using encode_fh to get a file handle for each cache file and
storing it instead of an inode number.  fh_to_dentry is later used to
open the cache files.  The type and length are also stored, but only
once, globally for the whole cache; they have to be the same for all
files in the cache, and you will get an error if that's not the case.
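
To illustrate the "one type and length for the whole cache" idea, here
is a minimal user-space sketch.  It is not the OpenAFS code: the names
fh_cache, fh_cache_init and fh_cache_store are hypothetical, the fixed
table size is arbitrary, and the 40-byte limit is simply the worst-case
handle size quoted above.  The first handle recorded fixes the format,
and any later handle with a different type or length is rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FH_BYTES    40  /* worst-case handle size quoted in this thread */
#define MAX_CACHE_FILES 8   /* arbitrary small table, for illustration only */

/* Hypothetical cache state: type and length are kept once for the whole
 * cache, while only the raw handle bytes are kept per cache file. */
struct fh_cache {
    int           have_format;  /* 0 until the first handle is recorded */
    int           fh_type;      /* handle type shared by every entry */
    size_t        fh_len;       /* handle length shared by every entry */
    unsigned char handles[MAX_CACHE_FILES][MAX_FH_BYTES];
};

void fh_cache_init(struct fh_cache *c)
{
    assert(c != NULL);
    memset(c, 0, sizeof *c);
}

/* Record the handle for one cache file.  Returns 0 on success, or -1 if
 * the slot or length is out of range, or if the handle's type or length
 * differs from the format fixed by the first handle stored. */
int fh_cache_store(struct fh_cache *c, size_t slot, int type,
                   const void *fh, size_t len)
{
    assert(c != NULL && fh != NULL);

    if (slot >= MAX_CACHE_FILES || len == 0 || len > MAX_FH_BYTES)
        return -1;

    if (!c->have_format) {
        /* The first handle seen decides the format for the whole cache. */
        c->fh_type = type;
        c->fh_len = len;
        c->have_format = 1;
    } else if (type != c->fh_type || len != c->fh_len) {
        /* Mixed handle formats in one cache are treated as an error. */
        return -1;
    }

    memcpy(c->handles[slot], fh, len);
    return 0;
}
```

In this sketch the per-file cost is just the handle bytes; the type and
length add a fixed overhead once per cache rather than once per file.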

The code in 1.4 was meant to be as non-intrusive as possible.  At this
point with 1.6 around the corner, I'm not sure we'd want to make such
a significant change for 1.4.

Marc
_______________________________________________
OpenAFS-devel mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-devel
