[jira] [Commented] (HDFS-5303) A symlink within a snapshot pointing to a target outside the snapshot root can cause the snapshot contents to appear to change.

2013-10-04 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786811#comment-13786811
 ] 

Colin Patrick McCabe commented on HDFS-5303:


I don't think this is a problem.  Symlinks are just references to things, not 
the things themselves.

Every other snapshotting filesystem (btrfs, ZFS, etc) allows snapshots to 
contain symlinks to paths outside the snapshot.  We currently do the same, and 
that should be fine.

> A symlink within a snapshot pointing to a target outside the snapshot root 
> can cause the snapshot contents to appear to change.
> ---
>
> Key: HDFS-5303
> URL: https://issues.apache.org/jira/browse/HDFS-5303
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Chris Nauroth
>
> A snapshot is supposed to represent the point-in-time state of a directory.  
> However, if the directory contains a symlink that targets a file outside the 
> snapshot root, then the snapshot contents will appear to change if someone 
> changes the target file (i.e. delete or append).



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5303) A symlink within a snapshot pointing to a target outside the snapshot root can cause the snapshot contents to appear to change.

2013-10-05 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787284#comment-13787284
 ] 

Chris Nauroth commented on HDFS-5303:
-

bq. I don't think this is a problem.

I agree, mostly because after thinking about this, I can't come up with a 
reasonable implementation for maintaining state outside the snapshot root while 
still retaining the current performance characteristics of snapshots.

# To implement copy-on-write on symlink targets outside the snapshot root, we'd 
need an O\(n\) scan (n = # inodes) to find if there is a snapshot anywhere 
containing a symlink pointing to that inode.
# We could optimize the above by storing back-references from the target inode 
to the symlink inode, but then we'd cause an O\(n\) increase in storage cost.
# If we choose to reject creation of snapshots containing a symlink outside the 
snapshot root, then we need to scan the subtree under the snapshot root, so 
snapshot creation is no longer O\(1\).  (Again, we could throw storage at this 
problem to optimize, but it seems wasteful.)

bq. Every other snapshotting filesystem (btrfs, ZFS, etc) allows snapshots to 
contain symlinks to paths outside the snapshot.

I think there is a greater chance for user confusion with symlinks in HDFS 
snapshots (can be created for any arbitrary subtree) compared to other 
snapshotting file systems (can be created at the volume).  With a whole-volume 
snapshot, I believe the user still can assume immutability as long as the 
symlinks don't jump out of the volume.  (Full disclosure: I'm making some 
assumptions here.  I don't have much real experience using ZFS.)

At this point, I propose that the scope of this jira is to update the snapshot 
documentation to warn end users about the pitfalls of combining snapshots and 
symlinks.  (No code changes.)

> A symlink within a snapshot pointing to a target outside the snapshot root 
> can cause the snapshot contents to appear to change.
> ---
>
> Key: HDFS-5303
> URL: https://issues.apache.org/jira/browse/HDFS-5303
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Reporter: Chris Nauroth
>
> A snapshot is supposed to represent the point-in-time state of a directory.  
> However, if the directory contains a symlink that targets a file outside the 
> snapshot root, then the snapshot contents will appear to change if someone 
> changes the target file (i.e. delete or append).



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5303) A symlink within a snapshot pointing to a target outside the snapshot root can cause the snapshot contents to appear to change.

2013-10-07 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788413#comment-13788413
 ] 

Colin Patrick McCabe commented on HDFS-5303:


bq. With a whole-volume snapshot, I believe the user still can assume 
immutability as long as the symlinks don't jump out of the volume.

But symlinks can jump out of the volume.  Symlinks on ZFS can point anywhere.

bq. At this point, I propose that the scope of this jira is to update the 
snapshot documentation to warn end users about the pitfalls of combining 
snapshots and symlinks. (No code changes.)

agree

> A symlink within a snapshot pointing to a target outside the snapshot root 
> can cause the snapshot contents to appear to change.
> ---
>
> Key: HDFS-5303
> URL: https://issues.apache.org/jira/browse/HDFS-5303
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Reporter: Chris Nauroth
>
> A snapshot is supposed to represent the point-in-time state of a directory.  
> However, if the directory contains a symlink that targets a file outside the 
> snapshot root, then the snapshot contents will appear to change if someone 
> changes the target file (i.e. delete or append).



--
This message was sent by Atlassian JIRA
(v6.1#6144)