This sounds reasonable, and is not like the case of AFS which uses 
special tokens in symbolic links that can expand to other things.

I'm a bit concerned about potential effects on applications.  It *seems* 
like this is done in a manner that is safe, but there are a few items:

    * are applications consistent in their use of pathconf/fpathconf to
      get filesystem limits?
    * presumably archivers and such are not expected to traverse these?
      (they get handled like an ordinary symbolic link)
    * what happens when the referral is archived and then reextracted?
      (is the attribute lost?)
    * as a nit, it's not truly file system independent, since it relies
      on symbolic links (not all filesystems support symlinks, though
      admittedly the ones of interest to this case all do)
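
To make the pathconf/fpathconf item concrete: a minimal sketch of an application that sizes its readlink(2) buffer from pathconf(2) instead of assuming a fixed PATH_MAX-sized buffer, which is the habit that would truncate a 16K link target.  The helper name read_link_alloc, its two-argument shape, and the 1024-byte fallback are assumptions made for illustration; none of them come from the case.

```c
#define _XOPEN_SOURCE 700	/* readlink(), pathconf() under strict -std= modes */
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the case): return the target of the
 * symlink `path`, which lives in directory `dir`, as a malloc'd string.
 * The caller frees the result.  Returns NULL on error.
 */
char *
read_link_alloc(const char *dir, const char *path)
{
	assert(dir != NULL && path != NULL);

	/* Ask the file system for its limit; -1 means error or "no limit". */
	long max = pathconf(dir, _PC_SYMLINK_MAX);
	size_t size = (max > 0) ? (size_t)max + 1 : 1024; /* assumed fallback */

	for (;;) {
		char *buf = malloc(size);
		if (buf == NULL)
			return (NULL);

		ssize_t n = readlink(path, buf, size);
		if (n < 0) {
			free(buf);
			return (NULL);
		}
		if ((size_t)n < size) {
			buf[n] = '\0';	/* readlink() does not NUL-terminate */
			return (buf);
		}

		/* The target may have been truncated: grow and retry. */
		free(buf);
		size *= 2;
	}
}
```

The limit is queried on the containing directory because pathconf() follows symbolic links, and a reparse point's target is not a resolvable path.  The grow-and-retry loop keeps the helper safe even where the reported limit is absent or stale.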

I believe that this case likely exceeds the obviousness test for a fast 
track.  I certainly wouldn't be comfortable having it go through with 
only a single +1 from another member (your own +1 doesn't count, as I 
understand the rules -- case owners don't count).

Given this, I'm going to derail the case, just to force enough members 
to read it to get a meaningful vote.  I'll write any resulting opinion.  
I don't think we need any additional materials apart from answers to the 
questions I've already raised.

Note that I don't think there is anything intrinsically wrong with the 
case (though my archivers question above is, I think, a real potential 
concern) -- the derail here should not be taken as a negative statement 
about the case itself; I just want to make sure it is adequately and 
properly reviewed.

Thanks.

    - Garrett

Glenn Skinner wrote:
> I'm sponsoring the following fast track for Afshin Salek and the CIFS
> i-team.  It times out on Friday, July 17th.
>
> A copy of the specification below appears in the case directory under
> the name "specification".
>
> I've pre-reviewed it and will give it a +1 up front.
>
>               -- Glenn
>
> ----------------
>
> Template Version: @(#)onepager.txt 1.35 07/11/07 SMI
> Copyright 2007 Sun Microsystems
>
> 1. Introduction
>    1.1. Project/Component Working Name:
>         Support for Reparse Points
>
>    1.2. Name of Document Author/Supplier:
>         Author: Afshin Salek
>
>    1.3. Date of This Document:
>         07/08/09
>       
>    1.4. Name of Major Document Customer(s)/Consumer(s):
>         PSARC
>         CIFS team
>
>    1.5. Email Aliases:
>       1.5.1. Responsible Manager: Barry.Greenberg at Sun.COM
>       1.5.2. Responsible Engineer: Afshin.Ardakani at Sun.COM
>       1.5.3. Marketing Manager:
>       1.5.4. Interest List: cifs-team at sun.com
>
>    A patch binding is requested for this change.      
>
> 4. Technical Description:
>     4.1. Details:
>
>        INTRODUCTION
>         
>        There are situations where a mechanism is needed to reflect
>        the concept that data is not present at a particular path, but
>        can be found in some alternate location(s).  Examples include
>        "referrals" used to build unified name spaces in NFSv4.x and
>        SMB, and data relocation in HSM systems.  A "reparse point" is
>        defined as the marker for a namespace redirection and a
>        container for the metadata to specify where the target of this
>        redirection is.
>         
>        Reparse points are intended to be a general mechanism for
>        location redirection and as such the file system that contains
>        them is not cognizant of the reparse point format or content.
>        Services that use reparse points know how to interpret and use
>        the stored data.
>         
>        REPARSE POINT OBJECT
>         
>        After a lot of discussion the consensus is that the best way
>        to represent reparse points in the file system, in order to
>        minimize the effect on existing applications and utilities, is
>        to use symbolic links.  One of the main goals in this context has
>        been the ability to use existing utilities for backup/restore
>        and also ZFS send/receive without having to modify them to
>        know how to deal with reparse points.
>
>        Some of what is envisioned here could be done with extensions
>        to the Solaris automounter capability.  Part of the
>        motivation, though, is to create centrally-administered
>        namespaces served by a group of fileservers to near-zero-admin
>        clients.  It is expected to be easier to keep the namespaces
>        uniform if only a small number of servers need to participate.
>        HSM solutions would also normally be tied closely to a storage
>        server by this mechanism.  Also, for both NFS and SMB
>        referrals, it is the client that chooses the target and not
>        the server.  The server only provides the targets' information
>        and it is up to the client to pick the desirable target to
>        access the data.
>
>        To distinguish a regular symlink from a reparse point, an
>        extensible system attribute will be set on the symlink.  This
>        system attribute is only one bit which indicates whether or
>        not a symlink contains reparse data.
>         
>        The reparse data will be stored as the link target.  The
>        reparse data is not in file system path format, which is the
>        typical format of a link target.  In order to avoid coming up
>        with a totally new format for reparse data as the link target
>        we decided to adopt the format used by magic links in BSD:
>        (http://www.daemon-systems.org/man/symlink.7.html)
>         
>        @{repa...@{service-type1:data} [...@{service-type2:data}]...}
>         
>        Where some examples of service-type are:
>        
>        #define REPARSE_SVC_SMB        "SMB"
>        #define REPARSE_SVC_NFS        "NFS"
>        #define REPARSE_SVC_HSM        "HSM"
>         
>        The data for each service will be in string format, which is
>        expected to be typically a UUID string.
>
>        The pattern above starts with "REPARSE" to distinguish it from
>        other magic links, such as those supported by BSD.  Note
>        that this case is not a proposal to support BSD magic links;
>        the intent is only to avoid precluding the future addition of
>        full BSD magic link support.
>         
>        Multiple service entries can co-exist within the symlink
>        data.  It is expected that normally, all entries would resolve
>        to the same logical location, e.g.  NFS and CIFS clients would
>        find the same files.
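
A minimal sketch of how a consumer might pull one service's data out of a link target that carries several entries.  It relies only on the per-service "@{service-type:data}" form shown above; it does not validate the leading REPARSE tag, and it assumes the data never contains a '}' character.  The function name reparse_find_service is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: scan `target` for an "@{svc:data}" entry and copy
 * its data into `out` (NUL-terminated).  Returns 0 on success, or -1 if
 * the service is absent, the entry is unterminated, or `out` is too small.
 */
int
reparse_find_service(const char *target, const char *svc,
    char *out, size_t outlen)
{
	assert(target != NULL && svc != NULL && out != NULL);

	size_t svclen = strlen(svc);
	const char *p = target;

	while ((p = strstr(p, "@{")) != NULL) {
		p += 2;		/* step past the "@{" opener */

		/* The service type must be followed directly by ':'. */
		if (strncmp(p, svc, svclen) == 0 && p[svclen] == ':') {
			const char *data = p + svclen + 1;
			const char *end = strchr(data, '}');
			if (end == NULL)
				return (-1);

			size_t len = (size_t)(end - data);
			if (len >= outlen)
				return (-1);

			memcpy(out, data, len);
			out[len] = '\0';
			return (0);
		}
	}
	return (-1);
}
```

With this shape an NFS consumer and an SMB consumer can read the same symlink and each ignore the entries meant for the other service.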
>         
>        BASIC INTERFACES
>         
>        There is a need for both userspace and kernel APIs to work
>        with reparse points.
>         
>        Userspace API
>         
>        In userspace the symlink(2) system call will be used to set a
>        reparse point.  The readlink(2) system call will be used in
>        turn to read the reparse data.
>         
>        Kernel API
>         
>        In the kernel, VOP_SYMLINK and VOP_READLINK will be used to
>        set/get reparse data.
>         
>        These interfaces will support all replication, archive and
>        copy operations to preserve reparse points without further
>        changes.
>         
>        fop_symlink() needs to be modified to recognize the reparse
>        @{REPARSE} tag and pass the appropriate attribute (i.e.
>        reparse system attribute) to VOP_SYMLINK to be set on the
>        symlink.
>        
>        IMPLEMENTATION OBSERVATIONS
>         
>        VFS feature registration can be used to determine whether or
>        not a file system supports reparse points.
>         
>        Two things are needed to obtain the reparse point data in the
>        kernel.  First, the consumer needs to know that a reparse
>        point has been encountered and, second, it needs the vnode
>        pointer to the symlink.  The proposal is to enhance VOP_LOOKUP
>        to return the attributes of the looked up vnode.  This way
>        when the vnode is available the caller can check the
>        attributes to determine if the returned vnode is a reparse
>        point or a regular symlink.  Here are the old and revised
>        signatures of VOP_LOOKUP:
>
>        int VOP_LOOKUP(vnode_t *dvp, char *nm, vnode_t **vpp,
>             pathname_t *pnp, int flags, vnode_t *rdir, cred_t *cr,
>             caller_context_t *ct, int *deflags, pathname_t *ppnp)
>
>        int VOP_LOOKUP(vnode_t *dvp, char *nm, vnode_t **vpp,
>             pathname_t *pnp, int flags, vnode_t *rdir, cred_t *cr,
>             caller_context_t *ct, int *deflags, pathname_t *ppnp,
>             vattr_t *vap)
>         
>        A vattr_t pointer argument is added at the end to return the
>        attributes if it is non-NULL.  This is an optimization so that
>        consumers don't have to invoke an extra VOP_GETATTR after
>        lookup for obtaining the attributes.
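
The optional out-parameter pattern described above can be sketched in user space as follows.  This is a mock under stated assumptions, not kernel code: mock_attr_t, mock_node_t, mock_dir and mock_lookup are hypothetical stand-ins for vattr_t, vnode_t, a directory and VOP_LOOKUP, and the single is_reparse field stands in for the reparse system attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are Solaris kernel interfaces. */
typedef struct {
	int is_symlink;
	int is_reparse;		/* stands in for the reparse attribute bit */
} mock_attr_t;

typedef struct {
	const char *name;
	mock_attr_t attr;
} mock_node_t;

static mock_node_t mock_dir[] = {
	{ "plain",    { 1, 0 } },	/* regular symlink */
	{ "link",     { 1, 1 } },	/* symlink marked as a reparse point */
	{ "file.txt", { 0, 0 } },	/* ordinary file */
};

/*
 * Look up `nm`.  On success store the node in *npp and, only when `ap`
 * is non-NULL, also copy out its attributes so that the caller does not
 * need a second "getattr" call.  Returns 0 on success, -1 if not found.
 */
int
mock_lookup(const char *nm, mock_node_t **npp, mock_attr_t *ap)
{
	assert(nm != NULL && npp != NULL);

	for (size_t i = 0; i < sizeof (mock_dir) / sizeof (mock_dir[0]); i++) {
		if (strcmp(mock_dir[i].name, nm) == 0) {
			*npp = &mock_dir[i];
			if (ap != NULL)
				*ap = mock_dir[i].attr;
			return (0);
		}
	}
	return (-1);
}
```

Callers that do not care about reparse points pass NULL and see the old behavior; a referral-aware caller passes an attribute buffer and checks is_reparse on return.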
>
>        The symlink target size should be increased to 16K to
>        accommodate the maximum size supported for MS-DFS referrals by
>        Windows.  Applications are expected to query the PATH_MAX and
>        SYMLINK_MAX values on the local system using
>        pathconf(2)/fpathconf(2).  The value of SYMLINK_MAX would be
>        changed to 16K on ZFS.  The value of PATH_MAX will not be
>        affected.
>             
>        To provide compatibility with other UNIXes (see section 6
>        below), sharemgr(1M) would be enhanced to support a "refer"
>        option for NFS exports.  This option would only result in
>        creation of a reparse point at the specified path and does not
>        actually share the path over NFS.
>             
>        This case is only about the underlying infrastructure and a
>        future case will be presented to deal with details and
>        specifics of handling referrals for the NFSv4 server.
>
>        SECURITY CONSIDERATIONS
>             
>        Referrals are similar to regular symbolic links in that they
>        are only pointers to data that could be discovered in some
>        other way.  The presence of such a pointer does not compromise
>        the security of the target object or data; the target service
>        or file system must still enforce security.
>             
>        OPERATION FLOW
>             
>        Once a kernel service encounters a reparse point, it reads the
>        data using VOP_READLINK and passes the data up to a user space
>        daemon (e.g.  reparsed) along with its desired record type.
>        Depending on the requested record type the daemon could simply
>        extract the information from the passed data and return it to
>        kernel or do any other processing necessary to obtain the
>        actual referral information e.g.  in the case of FedFS,
>        contacting NSDB.  Going through a common user space daemon to
>        get the referral data makes this process generic and easily
>        expandable for possible future use cases.
>             
>        Referral extraction and creation by a userspace daemon can be
>        handled via a library plugin architecture for different
>        service types.
>             
>        Operation Flow Example
>             
>        Here is a simplified example of operation for a CIFS client
>        that tries to access a file where the path contains a DFS
>        link:
>             
>        a) Client tries to access \\srv\root\...\link\...\file.txt
>           where:
>              'root' is a share (namespace root)
>              'link' is a reparse point seen as a folder by client
>         
>        b) CIFS server does a VOP_LOOKUP for 'link', which is
>           recognized as a reparse point by examining the attributes
>           returned by VOP_LOOKUP.  At this point a
>           STATUS_PATH_NOT_COVERED is returned to the client.
>         
>        c) Client sends a "link referral" request to the server.  CIFS
>           server uses VOP_READLINK to get the 'link' data and sends
>           the data to 'reparsed' daemon via a door call and gets back
>           the DFS link targets in a format understandable by the CIFS
>           client.  The targets are sent back to the client in
>           response to its "link referral" request.
>         
>        d) Client picks one of the targets and contacts the target
>           server to access 'file.txt'
>         
>        NFS REFERRAL IN OTHER UNIXES
>             
>        FS referrals have been implemented in other major UNIX
>        distributions such as Linux, AIX and HP-UX but there is no
>        unified approach or implementation.
>
>        Linux, AIX and HP-UX specify referrals as an NFS export
>        option.  The option format is basically the same in all three
>        operating systems (refer=path at host) but the presentation is
>        somewhat different in each case:
>
>        - In Linux a referral is presented as a mount point.
>        - In HP-UX a referral is a file system partition or logical volume.
>        - In AIX a special object is used to represent a referral.
>
>        These are all mechanisms to trigger a change in namespace
>        while resolving a path.
>       
>        This proposal is somewhat aligned with the AIX approach but
>        does not require a new object type to be defined, which has
>        the advantage of not impacting existing applications.  As
>        mentioned previously, an NFS "refer" option will be supported
>        to provide option format compatibility.
>       
>        Additionally, the Solaris requirements include support for
>        both NFS and SMB referrals whereas these other operating
>        systems only support NFS referrals, and they do not provide
>        native SMB support.  For the Solaris operating system, this
>        proposal provides a generic solution to support multiple,
>        disparate referral mechanisms without placing restrictions on
>        the format required by each mechanism.
>     
>        The following links provide a bit more detail about each OS
>        discussed above:
>             
>          http://www.citi.umich.edu/projects/nfsv4/linux/using-referrals.html
>          http://nfsv4.bullopensource.org/doc/migration-and-replication-0.2.pdf
>          http://docs.hp.com/en/5900-0306/ch01s11.html?jumpid=reg_R1002_USEN
>          http://docs.hp.com/en/13578/nfsv4_whitepaper.pdf 
>          http://publib.boulder.ibm.com/infocenter/systems/index.jsp?topic=/com.ibm.aix.commadmn/doc/commadmndita/nfs_referrals.htm
>  
>
>  INTERFACE TABLE
>
>                           |Proposed       |Specified   |
>                           |Stability      |in what     |
>   Interface Name          |Classification |Document?   | Comments
>   ===========================================================================
>    XAT_REPARSE            |Consolidation  |This        |Reparse extensible
>                           |Private        |Document    |attribute
>                           |               |            |
>    VOP_LOOKUP, fop_lookup |Contracted     |This        |Added new argument:
>                           |Consolidation  |Document    |vattr_t *vap 
>                           |Private*       |            |
>                           |               |            |
>    Reparse token syntax   |Committed      |This        |
>                           |Private        |Document    |
>                           |               |            |
>    SYMLINK_MAX            |Committed      |This        |Increased to 16K
>                           |               |Document    |
>
>  * The project's deliverables will all go into the OS/NET
>    Consolidation, so no contracts are required.
>
> 6. Resources and Schedule:
>
>    6.4. Product Approval Committee requested information:
>       6.4.1. Consolidation or Component Name:
>              ON
>
>    6.5. ARC review type:
>         FastTrack
>
>   

