[ https://issues.apache.org/jira/browse/HADOOP-8040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Wang updated HADOOP-8040: -------------------------------- Attachment: hadoop-8040-6.patch Thanks for the review Colin. I'm attaching just a consolidated patch again since the changes I made were minor; I wanted to follow up with some of your review comments before posting the patch split. bq. Why are these static methods rather than instance methods? I refactored these out of FileContext and decided to stick them in Path. Seems kind of weird to have a "check" instance method that throws an exception like this, so I'd mildly prefer to leave them static. bq. package-private (i.e. no qualifier.). fixed bq. a function whose name begins with getPath... should return a path. fixed, this was also a refactor but good point bq. Debug printout left in. fixed bq. Symlinks do not necessarily have the same owner as the file they point to. Here's any example: bq. Symlinks have a file mode (aka permission bits) which are independent of the files they point to... I copied this bit from the FileContext implementation, RawLocalFS. It just uses the target's status which, as you correctly noted, can be different from the link's status. Semantics for all the local filesystems is kinda fuzzy, but I agree that this feels incorrect. I'd prefer to fix both of these classes at once though in a follow-on JIRA (especially if JNI is potentially involved). {code} final public FileStatus makeQualified(URI defaultUri, Path path) { {code} bq. Are you sure it wouldn't be easier to just have a method to change the path of the object? Mutating feels wrong to me based on how makeQualified is used. The path is often passed into a method which then tries to qualify it (e.g. Hdfs#getFileStatus). Mutating in place means the caller gets back a qualified path, which is probably not what they want. We could either mutate and make copies of the original path in all these places, or just leave it as it is. bq. In fact, you seem to have left a field out here yourself-- fileId. Hmm, interesting comment, good eye. The issue here is that since symlinks can point to other filesystems, we have to return a FileStatus, not an HdfsFileStatus. Plain old FileStatus doesn't have fileId, so we have to leave it out. I think this is okay from a user perspective, since FileSystem methods only return FileStatus, and HdfsFileStatus isn't a subclass of FileStatus anyway. bq. FSLinkResolver: it seems like you only need one of these per function....Just make a static FSLinkResolver<ContentSummary> and use that... I think this is as intended. The new inner class sometimes needs to wrap final parameters of the containing method. Since the params are different each time (and different per call), I think it needs to do this at runtime. bq. should MAX_PATH_LINKS be configurable? I think the intent here was to just pick a reasonable upper bound. I doubt any real user has >32 links to links, and I don't think there's a reason to tune it down either. > Add symlink support to FileSystem > --------------------------------- > > Key: HADOOP-8040 > URL: https://issues.apache.org/jira/browse/HADOOP-8040 > Project: Hadoop Common > Issue Type: New Feature > Components: fs > Affects Versions: 0.23.0, 3.0.0, 2.0.3-alpha > Reporter: Eli Collins > Assignee: Andrew Wang > Attachments: hadoop-8040-1.patch, hadoop-8040-2.patch, > hadoop-8040-3.patch, hadoop-8040-4.patch, hadoop-8040-5.patch, > hadoop-8040-6.patch > > > HADOOP-6421 added symbolic links to FileContext. Resolving symlinks is done > on the client-side, and therefore requires client support. An HDFS symlink > (created by FileContext) when accessed by FileSystem will result in an > unhandled UnresolvedLinkException. Because not all users will migrate from > FileSystem to FileContext in lock step, and we want users of FileSystem to be > able to access all paths created by FileContext, we need to support symlink > resolution in FileSystem as well, to facilitate migration to FileContext. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira