[jira] [Updated] (HADOOP-8040) Add symlink support to FileSystem
[ https://issues.apache.org/jira/browse/HADOOP-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HADOOP-8040: Affects Version/s: (was: 2.0.3-alpha) (was: 3.0.0) (was: 0.23.0) Fix Version/s: (was: 2.3.0) 2.1.0-beta Updating the fix version to reflect that these subtasks (modulo HADOOP-9417) are already in branch-2.1-beta. Add symlink support to FileSystem - Key: HADOOP-8040 URL: https://issues.apache.org/jira/browse/HADOOP-8040 Project: Hadoop Common Issue Type: New Feature Components: fs Reporter: Eli Collins Assignee: Andrew Wang Fix For: 3.0.0, 2.1.0-beta Attachments: hadoop-8040-1.patch, hadoop-8040-2.patch, hadoop-8040-3.patch, hadoop-8040-4.patch, hadoop-8040-5.patch, hadoop-8040-6.patch, hadoop-8040-7.patch HADOOP-6421 added symbolic links to FileContext. Resolving symlinks is done on the client-side, and therefore requires client support. An HDFS symlink (created by FileContext) when accessed by FileSystem will result in an unhandled UnresolvedLinkException. Because not all users will migrate from FileSystem to FileContext in lock step, and we want users of FileSystem to be able to access all paths created by FileContext, we need to support symlink resolution in FileSystem as well, to facilitate migration to FileContext. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HADOOP-8040) Add symlink support to FileSystem
[ https://issues.apache.org/jira/browse/HADOOP-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HADOOP-8040: Fix Version/s: (was: 3.0.0) Add symlink support to FileSystem - Key: HADOOP-8040 URL: https://issues.apache.org/jira/browse/HADOOP-8040 Project: Hadoop Common Issue Type: New Feature Components: fs Reporter: Eli Collins Assignee: Andrew Wang Fix For: 2.1.0-beta Attachments: hadoop-8040-1.patch, hadoop-8040-2.patch, hadoop-8040-3.patch, hadoop-8040-4.patch, hadoop-8040-5.patch, hadoop-8040-6.patch, hadoop-8040-7.patch
[jira] [Updated] (HADOOP-8040) Add symlink support to FileSystem
[ https://issues.apache.org/jira/browse/HADOOP-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HADOOP-8040: Resolution: Fixed Fix Version/s: 2.2.0 3.0.0 Status: Resolved (was: Patch Available) All of the subtasks have been resolved, so I'm marking this as closed. Thanks to everyone who reviewed this, especially Colin. There's still plenty of follow-on work to do, but that's taking place in other JIRAs. I'm hoping this gets into 2.2.0, but we're holding off on committing HADOOP-9417 (the local FS impl) to branch-2.1 until HADOOP-9652 is complete, since it'd be good to fix lstat before putting it in a stable release. Should happen shortly. Add symlink support to FileSystem - Key: HADOOP-8040 URL: https://issues.apache.org/jira/browse/HADOOP-8040 Project: Hadoop Common Issue Type: New Feature Components: fs Affects Versions: 0.23.0, 3.0.0, 2.0.3-alpha Reporter: Eli Collins Assignee: Andrew Wang Fix For: 3.0.0, 2.2.0 Attachments: hadoop-8040-1.patch, hadoop-8040-2.patch, hadoop-8040-3.patch, hadoop-8040-4.patch, hadoop-8040-5.patch, hadoop-8040-6.patch, hadoop-8040-7.patch
[jira] [Updated] (HADOOP-8040) Add symlink support to FileSystem
[ https://issues.apache.org/jira/browse/HADOOP-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HADOOP-8040: Attachment: hadoop-8040-7.patch OK, updated them into Path methods. New consolidated patch attached. I think that addresses all outstanding review comments, so I posted up newly rebased patch splits on HADOOP-9370 and subtasks of this jira. Thanks everyone for taking a look. Add symlink support to FileSystem - Key: HADOOP-8040 URL: https://issues.apache.org/jira/browse/HADOOP-8040 Project: Hadoop Common Issue Type: New Feature Components: fs Affects Versions: 0.23.0, 3.0.0, 2.0.3-alpha Reporter: Eli Collins Assignee: Andrew Wang Attachments: hadoop-8040-1.patch, hadoop-8040-2.patch, hadoop-8040-3.patch, hadoop-8040-4.patch, hadoop-8040-5.patch, hadoop-8040-6.patch, hadoop-8040-7.patch
[jira] [Updated] (HADOOP-8040) Add symlink support to FileSystem
[ https://issues.apache.org/jira/browse/HADOOP-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HADOOP-8040:

Attachment: hadoop-8040-6.patch

Thanks for the review, Colin. I'm attaching just a consolidated patch again since the changes I made were minor; I wanted to follow up on some of your review comments before posting the patch split.

bq. Why are these static methods rather than instance methods?

I refactored these out of FileContext and decided to put them in Path. It seems odd to have a check instance method that throws an exception like this, so I'd mildly prefer to leave them static.

bq. package-private (i.e. no qualifier.).

Fixed.

bq. a function whose name begins with getPath... should return a path.

Fixed. This was also a refactor, but good point.

bq. Debug printout left in.

Fixed.

bq. Symlinks do not necessarily have the same owner as the file they point to. Here's an example:
bq. Symlinks have a file mode (aka permission bits) which are independent of the files they point to...

I copied this bit from the FileContext implementation, RawLocalFS. It just uses the target's status which, as you correctly noted, can be different from the link's status. The semantics for all the local filesystems are somewhat fuzzy, but I agree that this feels incorrect. I'd prefer to fix both of these classes at once in a follow-on JIRA, though (especially if JNI is potentially involved).

{code}
final public FileStatus makeQualified(URI defaultUri, Path path) {
{code}

bq. Are you sure it wouldn't be easier to just have a method to change the path of the object?

Mutating feels wrong to me based on how makeQualified is used. The path is often passed into a method which then tries to qualify it (e.g. Hdfs#getFileStatus). Mutating in place means the caller gets back a qualified path, which is probably not what they want. We could either mutate and make copies of the original path in all these places, or just leave it as it is.

bq. In fact, you seem to have left a field out here yourself -- fileId.

Hmm, interesting comment, good eye. The issue here is that since symlinks can point to other filesystems, we have to return a FileStatus, not an HdfsFileStatus. Plain old FileStatus doesn't have fileId, so we have to leave it out. I think this is okay from a user perspective, since FileSystem methods only return FileStatus, and HdfsFileStatus isn't a subclass of FileStatus anyway.

bq. FSLinkResolver: it seems like you only need one of these per function. Just make a static FSLinkResolverContentSummary and use that...

I think this is as intended. The new inner class sometimes needs to wrap final parameters of the containing method. Since the params are different each time (and different per call), I think it needs to do this at runtime.

bq. should MAX_PATH_LINKS be configurable?

I think the intent here was to just pick a reasonable upper bound. I doubt any real user has 32 links to links, and I don't think there's a reason to tune it down either.

Add symlink support to FileSystem - Key: HADOOP-8040 URL: https://issues.apache.org/jira/browse/HADOOP-8040 Project: Hadoop Common Issue Type: New Feature Components: fs Affects Versions: 0.23.0, 3.0.0, 2.0.3-alpha Reporter: Eli Collins Assignee: Andrew Wang Attachments: hadoop-8040-1.patch, hadoop-8040-2.patch, hadoop-8040-3.patch, hadoop-8040-4.patch, hadoop-8040-5.patch, hadoop-8040-6.patch
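The FSLinkResolver and MAX_PATH_LINKS points in this message can be pictured with a small, self-contained sketch. It is not Hadoop's FSLinkResolver: every name in it (LinkResolverSketch, ToyStore, UnresolvedLink, Resolver) is hypothetical, the link table is an in-memory map, and the bound of 32 is assumed from the "32 links to links" remark. It only illustrates why a fresh anonymous resolver is built on each call (it captures that call's final parameters) and how a fixed bound stops a cycle of links.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout; this is not Hadoop's FSLinkResolver.
class LinkResolverSketch {
    // Assumed upper bound on links followed, taken from the "32 links" remark.
    static final int MAX_PATH_LINKS = 32;

    // Signals that a path is a link and carries the link target.
    static class UnresolvedLink extends Exception {
        final String target;
        UnresolvedLink(String target) { this.target = target; }
    }

    // Toy store: links map to targets, regular paths map to an owner.
    static class ToyStore {
        final Map<String, String> links = new HashMap<>();
        final Map<String, String> owners = new HashMap<>();

        void setOwner(String path, String owner) throws UnresolvedLink {
            String target = links.get(path);
            if (target != null) {
                throw new UnresolvedLink(target); // the client must retry on the target
            }
            owners.put(path, owner);
        }
    }

    // One operation, retried against each link target until it succeeds.
    abstract static class Resolver<T> {
        abstract T doCall(String path) throws UnresolvedLink;

        T resolve(String path) throws IOException {
            String current = path;
            for (int followed = 0; followed <= MAX_PATH_LINKS; followed++) {
                try {
                    return doCall(current);
                } catch (UnresolvedLink e) {
                    current = e.target; // follow one link and try again
                }
            }
            throw new IOException("Too many links while resolving " + path);
        }
    }

    // A new resolver per call, because it captures this call's arguments.
    static void setOwner(final ToyStore store, String path, final String owner)
            throws IOException {
        new Resolver<Void>() {
            @Override
            Void doCall(String p) throws UnresolvedLink {
                store.setOwner(p, owner);
                return null;
            }
        }.resolve(path);
    }

    public static void main(String[] args) throws IOException {
        ToyStore store = new ToyStore();
        store.links.put("/link", "/data");
        setOwner(store, "/link", "andrew");
        System.out.println(store.owners.get("/data")); // prints "andrew"
    }
}
```

In this sketch a two-element cycle such as /x -> /y -> /x exhausts the bound and surfaces as an IOException rather than looping forever.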
[jira] [Updated] (HADOOP-8040) Add symlink support to FileSystem
[ https://issues.apache.org/jira/browse/HADOOP-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HADOOP-8040: Attachment: hadoop-8040-5.patch Attaching just a consolidated patch, the fixes for the Jenkins -1 were minor (move a deprecation ignore and add a static keyword). I'll post new splits after review comments. Add symlink support to FileSystem - Key: HADOOP-8040 URL: https://issues.apache.org/jira/browse/HADOOP-8040 Project: Hadoop Common Issue Type: New Feature Components: fs Affects Versions: 0.23.0, 3.0.0, 2.0.3-alpha Reporter: Eli Collins Assignee: Andrew Wang Attachments: hadoop-8040-1.patch, hadoop-8040-2.patch, hadoop-8040-3.patch, hadoop-8040-4.patch, hadoop-8040-5.patch
[jira] [Updated] (HADOOP-8040) Add symlink support to FileSystem
[ https://issues.apache.org/jira/browse/HADOOP-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HADOOP-8040: Attachment: hadoop-8040-4.patch Hey folks, attaching a newly rebased patch. I had to twiddle things for the HADOOP-9287 changes, so maybe [~cnauroth] could take a look too. I can't find the javac warning from above, so let's do a test-patch run. Add symlink support to FileSystem - Key: HADOOP-8040 URL: https://issues.apache.org/jira/browse/HADOOP-8040 Project: Hadoop Common Issue Type: New Feature Components: fs Affects Versions: 0.23.0, 3.0.0, 2.0.3-alpha Reporter: Eli Collins Assignee: Andrew Wang Attachments: hadoop-8040-1.patch, hadoop-8040-2.patch, hadoop-8040-3.patch, hadoop-8040-4.patch
[jira] [Updated] (HADOOP-8040) Add symlink support to FileSystem
[ https://issues.apache.org/jira/browse/HADOOP-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HADOOP-8040: Attachment: hadoop-8040-3.patch test-patch.sh hit test timeouts, so I bumped them up to 10 seconds. Add symlink support to FileSystem - Key: HADOOP-8040 URL: https://issues.apache.org/jira/browse/HADOOP-8040 Project: Hadoop Common Issue Type: New Feature Components: fs Affects Versions: 0.23.0, 3.0.0, 2.0.3-alpha Reporter: Eli Collins Assignee: Andrew Wang Attachments: hadoop-8040-1.patch, hadoop-8040-2.patch, hadoop-8040-3.patch
[jira] [Updated] (HADOOP-8040) Add symlink support to FileSystem
[ https://issues.apache.org/jira/browse/HADOOP-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HADOOP-8040: Attachment: hadoop-8040-2.patch New consolidated patch, hopefully hit all the test-patch errors. Revved the patches on the sub-JIRAs as well. One further caveat I wanted to ask about: {{FileSystem#create}} always creates parents, and we instead have a separate, deprecated {{createNonRecursive}} method. {{FileSystemTestWrapper}} currently calls plain {{#create}} and I forcibly ignore symlink tests that check the non-recursive behavior. If it's okay to use the deprecated function, I can switch it to use {{createNonRecursive}} and re-enable the ignored test. Add symlink support to FileSystem - Key: HADOOP-8040 URL: https://issues.apache.org/jira/browse/HADOOP-8040 Project: Hadoop Common Issue Type: New Feature Components: fs Affects Versions: 0.23.0, 3.0.0, 2.0.3-alpha Reporter: Eli Collins Assignee: Andrew Wang Attachments: hadoop-8040-1.patch, hadoop-8040-2.patch
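The recursive versus non-recursive create distinction raised in this message can be illustrated by analogy with the standard java.nio.file API. This sketch does not use Hadoop at all: the class and method names are hypothetical, and Path here is java.nio.file.Path, not org.apache.hadoop.fs.Path. It only shows the behavioral difference a test would need to observe: one flavor creates missing parents, the other fails when the parent is absent.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Analogy only: java.nio stands in for the two Hadoop create flavors.
class CreateSemanticsSketch {
    // Recursive flavor: create any missing parent directories, then the file.
    static Path createRecursive(Path file) throws IOException {
        Files.createDirectories(file.getParent());
        return Files.createFile(file);
    }

    // Non-recursive flavor: Files.createFile fails with an IOException
    // (NoSuchFileException) when the parent directory does not exist.
    static Path createNonRecursive(Path file) throws IOException {
        return Files.createFile(file);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("create-sketch");
        try {
            createNonRecursive(root.resolve("missing").resolve("a.txt"));
            System.out.println("unexpected: created without a parent");
        } catch (IOException e) {
            System.out.println("non-recursive create failed: parent is missing");
        }
        createRecursive(root.resolve("made").resolve("b.txt"));
        System.out.println(Files.exists(root.resolve("made").resolve("b.txt")));
    }
}
```

A test that asserts the failure case can only pass against the non-recursive flavor, which is why a wrapper that always calls the recursive one has to skip such tests.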
[jira] [Updated] (HADOOP-8040) Add symlink support to FileSystem
[ https://issues.apache.org/jira/browse/HADOOP-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HADOOP-8040:

Attachment: hadoop-8040-1.patch

I've put patches up at each of the subtasks (as well as HADOOP-9370), as well as a consolidated patch here for Jenkins. It makes sense to review starting with HADOOP-9370, then each of the sub-JIRAs in order.

At a high-level, we're refactoring all the tests to use a wrapper that abstracts FileContext vs. FileSystem, and reusing the FSLinkResolver class from FileContext to do resolution for FileSystem as well. This requires adding some new methods to FileSystem. Implementation and tests are provided for DistributedFileSystem as well as LocalFileSystem.

There are some caveats with the FileSystem implementation, mostly from not having a FileContext/AbstractFileSystem-like split:
- Quite a bit of logic is pushed down into the FileSystem subclasses.
- Symlink cycle detection only works within a single FileSystem.

Please let me know if my patch splitting is appropriate. I'd like to avoid too much re-splitting and reorgs if possible, since it's pretty time consuming to do the git magic.

Add symlink support to FileSystem - Key: HADOOP-8040 URL: https://issues.apache.org/jira/browse/HADOOP-8040 Project: Hadoop Common Issue Type: New Feature Components: fs Affects Versions: 0.23.0 Reporter: Eli Collins Assignee: Andrew Wang Attachments: hadoop-8040-1.patch
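The test refactoring described in this message, a wrapper that abstracts FileContext vs. FileSystem, can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the wrapper classes from the patch: ToyFileSystem and ToyFileContext are hypothetical stand-ins with deliberately different method names, and TestWrapper is an invented interface. The point is only that one test body can run against either API once both sit behind a common interface.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-ins; none of these types come from Hadoop.
class WrapperSketch {
    // Toy API number one.
    static class ToyFileSystem {
        private final Set<String> paths = new HashSet<>();
        void mkdirs(String p) { paths.add(p); }
        boolean exists(String p) { return paths.contains(p); }
    }

    // Toy API number two, with different method names for the same operations.
    static class ToyFileContext {
        private final Set<String> paths = new HashSet<>();
        void mkdir(String p) { paths.add(p); }
        boolean pathExists(String p) { return paths.contains(p); }
    }

    // The common surface that tests are written against.
    interface TestWrapper {
        void mkdir(String p);
        boolean exists(String p);
    }

    static class FileSystemWrapper implements TestWrapper {
        private final ToyFileSystem fs;
        FileSystemWrapper(ToyFileSystem fs) { this.fs = fs; }
        @Override public void mkdir(String p) { fs.mkdirs(p); }
        @Override public boolean exists(String p) { return fs.exists(p); }
    }

    static class FileContextWrapper implements TestWrapper {
        private final ToyFileContext fc;
        FileContextWrapper(ToyFileContext fc) { this.fc = fc; }
        @Override public void mkdir(String p) { fc.mkdir(p); }
        @Override public boolean exists(String p) { return fc.pathExists(p); }
    }

    // A single test body, reusable with either wrapper.
    static boolean mkdirThenExists(TestWrapper wrapper) {
        wrapper.mkdir("/test");
        return wrapper.exists("/test");
    }

    public static void main(String[] args) {
        System.out.println(mkdirThenExists(new FileSystemWrapper(new ToyFileSystem())));
        System.out.println(mkdirThenExists(new FileContextWrapper(new ToyFileContext())));
    }
}
```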
[jira] [Updated] (HADOOP-8040) Add symlink support to FileSystem
[ https://issues.apache.org/jira/browse/HADOOP-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HADOOP-8040: Affects Version/s: 3.0.0 2.0.3-alpha Status: Patch Available (was: Open) Add symlink support to FileSystem - Key: HADOOP-8040 URL: https://issues.apache.org/jira/browse/HADOOP-8040 Project: Hadoop Common Issue Type: New Feature Components: fs Affects Versions: 2.0.3-alpha, 0.23.0, 3.0.0 Reporter: Eli Collins Assignee: Andrew Wang Attachments: hadoop-8040-1.patch
[jira] [Updated] (HADOOP-8040) Add symlink support to FileSystem
[ https://issues.apache.org/jira/browse/HADOOP-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HADOOP-8040: Assignee: (was: Eli Collins) Add symlink support to FileSystem - Key: HADOOP-8040 URL: https://issues.apache.org/jira/browse/HADOOP-8040 Project: Hadoop Common Issue Type: New Feature Components: fs Affects Versions: 0.23.0 Reporter: Eli Collins