[jira] [Commented] (HDFS-1788) FsShell ls: Show symlinks properties
[ https://issues.apache.org/jira/browse/HDFS-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052938#comment-13052938 ] Eli Collins commented on HDFS-1788: --- Looks like this is a patch for HADOOP-6424.. how about posting it there? > FsShell ls: Show symlinks properties > > > Key: HDFS-1788 > URL: https://issues.apache.org/jira/browse/HDFS-1788 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Reporter: Jonathan Eagles >Assignee: John George >Priority: Minor > Attachments: HDFS-1788.patch > > > ls FsShell command implementation has been consistent with the linux > implementations of ls \-l. With the addition of symlinks, I would expect the > ability to show file type 'd' for directory, '\-' for file, and 'l' for > symlink. In addition, following the linkname entry for symlinks, I would > expect the ability to show "\-> ". In linux, the default is to > the the properties of the link and not of the link target. In linux, '-L' > option allows for the dereferencing of symlinks to show link target > properties, but it is not the default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1788) FsShell ls: Show symlinks properties
[ https://issues.apache.org/jira/browse/HDFS-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051629#comment-13051629 ] John George commented on HDFS-1788: --- Bochun, It is good that you submitted a patch for this ticket, because the important thing is for this issue to be resolved. I had a patch ready but it needed some cleanup since some of the junit tests were failing. So, feel free to assign this bug to yourself and submit patches :) > FsShell ls: Show symlinks properties > > > Key: HDFS-1788 > URL: https://issues.apache.org/jira/browse/HDFS-1788 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Reporter: Jonathan Eagles >Assignee: John George >Priority: Minor > Attachments: HDFS-1788.patch > > > ls FsShell command implementation has been consistent with the linux > implementations of ls \-l. With the addition of symlinks, I would expect the > ability to show file type 'd' for directory, '\-' for file, and 'l' for > symlink. In addition, following the linkname entry for symlinks, I would > expect the ability to show "\-> ". In linux, the default is to > the the properties of the link and not of the link target. In linux, '-L' > option allows for the dereferencing of symlinks to show link target > properties, but it is not the default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1788) FsShell ls: Show symlinks properties
[ https://issues.apache.org/jira/browse/HDFS-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050906#comment-13050906 ] Bochun Bai commented on HDFS-1788: -- Waiting for HADOOP-7360 is a good idea. And cat/tail commands is not working, this is the motivation I involved. I feel sorry about not knowing the assignee is still working on this. It has been assigned for 50 days, and no more progress published. Next time I will contact before started, thanks daryn. > FsShell ls: Show symlinks properties > > > Key: HDFS-1788 > URL: https://issues.apache.org/jira/browse/HDFS-1788 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Reporter: Jonathan Eagles >Assignee: John George >Priority: Minor > Attachments: HDFS-1788.patch > > > ls FsShell command implementation has been consistent with the linux > implementations of ls \-l. With the addition of symlinks, I would expect the > ability to show file type 'd' for directory, '\-' for file, and 'l' for > symlink. In addition, following the linkname entry for symlinks, I would > expect the ability to show "\-> ". In linux, the default is to > the the properties of the link and not of the link target. In linux, '-L' > option allows for the dereferencing of symlinks to show link target > properties, but it is not the default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1788) FsShell ls: Show symlinks properties
[ https://issues.apache.org/jira/browse/HDFS-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050465#comment-13050465 ] Daryn Sharp commented on HDFS-1788: --- Oh, I'm not disagreeing with adding {{fc}} or {{lstat}}. I'm just saying leave {{fs}} intact until we can remove it, ie. have both {{fc}} and {{fs}}. HDFS appears to lazy-load {{FileStatus}}, but not all {{FileSystem}}s do. Hence why it makes sense for a {{lstat()}} since very few commands (cp/mv/ls), in limited cases, will need a stat of the link. Whereas every command uses {{stat}}. Perhaps I'm misunderstanding: why should commands like cat/tail be symlink aware? In the normal access case, symlinks should be transparent to a client. Ie. The filesystem should be responsible for resolving symlinks. Very few commands should know or care about a symlink. Note that HADOOP-7360 is changing a lot of the {{PathData}} ctors to be private. The fs cannot be passed in, and that change was in part intended to greatly simplify the move to {{FileContext}}. I hope to have this jira approved in the next few days, so I'd recommend waiting to post another patch until 7360 is integrated. On a related note to waiting for my patches, please be mindful of working on jiras that are already assigned. In the future, please contact an assignee for the status of their work. In this case, the assignee of this jira was already working on the change and waiting for a few of my patches to go in. > FsShell ls: Show symlinks properties > > > Key: HDFS-1788 > URL: https://issues.apache.org/jira/browse/HDFS-1788 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Reporter: Jonathan Eagles >Assignee: John George >Priority: Minor > Attachments: HDFS-1788.patch > > > ls FsShell command implementation has been consistent with the linux > implementations of ls \-l. With the addition of symlinks, I would expect the > ability to show file type 'd' for directory, '\-' for file, and 'l' for > symlink. In addition, following the linkname entry for symlinks, I would > expect the ability to show "\-> ". In linux, the default is to > the the properties of the link and not of the link target. In linux, '-L' > option allows for the dereferencing of symlinks to show link target > properties, but it is not the default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1788) FsShell ls: Show symlinks properties
[ https://issues.apache.org/jira/browse/HDFS-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050261#comment-13050261 ] Bochun Bai commented on HDFS-1788: -- Thanks daryn, your reviews encourage me a lot. I will rewrite the patch and follow your advises. But I got some difficult while keeping PathData maximum compatible: Adds lstat, leaving fs, stat, exists. And fc should be added because HADOOP-6446. Further works should use FileContext instead of FileSystem. And lexists is requeired to identify broken links. The problem is fc should be initialized in ctor like fs, whether it is parameter or not, breaks compatible by requiring new parameter or throwing exceptions. If fc is not a field of PathData and initialized when item.lstat() is called. RPC to NN will increased. To satisfies both compatible and feature, I suggest create a object named StatusData: contains fc, lstat and string. represent the link itself, not the real file. when cat/tail/stat is called, lstat.symlink will be resolved recursively and the real target is returned. FileStatus is recursively, and StatusData is independent to link or not. RPC to NN is decreased while listing a directory containing symlink. stat/read/write overhead to the symlink is the same, but lazy-inited. It should also improve GC activities to HADOOP-6732. And I have noticed @eli created a serial issues about porting FileContext. I will submit next patch to HADOOP-6424. > FsShell ls: Show symlinks properties > > > Key: HDFS-1788 > URL: https://issues.apache.org/jira/browse/HDFS-1788 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Reporter: Jonathan Eagles >Assignee: John George >Priority: Minor > Attachments: HDFS-1788.patch > > > ls FsShell command implementation has been consistent with the linux > implementations of ls \-l. With the addition of symlinks, I would expect the > ability to show file type 'd' for directory, '\-' for file, and 'l' for > symlink. In addition, following the linkname entry for symlinks, I would > expect the ability to show "\-> ". In linux, the default is to > the the properties of the link and not of the link target. In linux, '-L' > option allows for the dereferencing of symlinks to show link target > properties, but it is not the default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1788) FsShell ls: Show symlinks properties
[ https://issues.apache.org/jira/browse/HDFS-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050065#comment-13050065 ] Daryn Sharp commented on HDFS-1788: --- Very nice! First off, this isn't a HDFS change. Maybe someone with admin powers can move the jira to HADOOP, or close this and repost on HADOOP-6424. In general, I'd really like to see {{statReal}} revert to {{stat}}. That will greatly reduce the size of the patch. Since I'm a unix stalwart, I'd "prefer" {{statLink}} be called {{lstat}}. In either case, it should be a method to avoid increasing the load on the NN (see more below). +FsShell+ * Make sure that renaming {{getFS()}} to {{getFC()}} didn't break {{DFSAdmin}}, or {{TestMRCLI}}, etc. * I think the addition of {{FileContext.processDeleteOnExit()}} to {{close()}} may cause problems. Ex. What happens if I have temp files open before running a FsShell command? Won't this change cause the files to unexpectedly go "poof"? +Ls+ * The new {{\-L}} flag is really implementing part of {{\-l}}. {{\-L}} is supposed to replace the link with the name & attrs of its target. It would be a nice option, but probably not strictly needed unless you are feeling ambitious. Now might be a good time to make {{\-l}} generate output the way ls is REALLY supposed to look. Otherwise altering the {{\-l}} output in the future will be deemed incompatible. You might consider another jira... +PathData+ * I'd recommend undoing as much as possible since there reasons why it is the way it is, plus it will cause a major merge conflict with my pending PathData changes. * The String ctors need to be restored since Path can mangle the string it is given. * The 3-arg ctor that takes FileStatus must be re-added. I took great effort to reduce the RPC load on the NN, but this patch will undo some of that work and generate *2X the stats to the NN*. * Always doing the equivalent of stat/lstat on every object is causing *2X the stats to the NN*. Combined with the previous point, this patch is causing *4X the stats to the NN* unless there's magic going on deep in the client * {{lstat}} is used so infrequently it should probably be an on-demand {{item.lstat()}} method * {{refreshStatus}} is expected to throw FNF, but this changes it to ignore it * If {{FileSystem}} can't be completely eliminated, please remove {{fs(Configuration)}} and leave the {{fs}} member intact. That will also greatly reduce the size of the patch, and remove errors that may be caused by providing a different config than the one used to originally create the object. +Tail+ * Changing {{refreshStatus}} to not throw FNF will cause a NPE here. This is one of the reasons {{refreshStatus}} needs to retain original behavior. +CopyCommands+ * Are you sure {{copyCrcToLocal}} will work now? The raw fs was needed since the actual fs obscures the crc file. Does {{FileContext}} change that behavior? +Ln+ * {{Ln}} Shouldn't be in {{CopyCommands}}. Please move this class into it's own file. * Should require {{\-s}} to create symbolic links. If {{\-s}} isn't given, it should throw that hardlinks aren't supported -- that way we leave the door open to the possibility of hardlinks someday. * It can't be a {{CommandWithDestination}} or it forces the target of the symlink to exist at creation time. It's completely legit to create a symlink that points to a non-existent path. Also, it should take 1 or 2 args like the unix command. {{processArguments}} is probably the best place to implement it. * Might consider splitting this off to accelerate the rest being integrated. Up to you. +LocalFileSystem+ * Why expose {{NAME}}? It doesn't appear to be used. +FileContext+ * Why add {{fsUri}}? It doesn't appear to be used. Overall, great work! Don't let the length of the review be discouraging. It's a great improvement to the shell! > FsShell ls: Show symlinks properties > > > Key: HDFS-1788 > URL: https://issues.apache.org/jira/browse/HDFS-1788 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Reporter: Jonathan Eagles >Assignee: John George >Priority: Minor > Attachments: HDFS-1788.patch > > > ls FsShell command implementation has been consistent with the linux > implementations of ls \-l. With the addition of symlinks, I would expect the > ability to show file type 'd' for directory, '\-' for file, and 'l' for > symlink. In addition, following the linkname entry for symlinks, I would > expect the ability to show "\-> ". In linux, the default is to > the the properties of the link and not of the link target. In linux, '-L' > option allows for the dereferencing of symlinks to show link target > properties, but it is not the default. -- This message is automatically gen
[jira] [Commented] (HDFS-1788) FsShell ls: Show symlinks properties
[ https://issues.apache.org/jira/browse/HDFS-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13024866#comment-13024866 ] John George commented on HDFS-1788: --- Will go the FileContext route. > FsShell ls: Show symlinks properties > > > Key: HDFS-1788 > URL: https://issues.apache.org/jira/browse/HDFS-1788 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Reporter: Jonathan Eagles >Assignee: John George >Priority: Minor > > ls FsShell command implementation has been consistent with the linux > implementations of ls \-l. With the addition of symlinks, I would expect the > ability to show file type 'd' for directory, '\-' for file, and 'l' for > symlink. In addition, following the linkname entry for symlinks, I would > expect the ability to show "\-> ". In linux, the default is to > the the properties of the link and not of the link target. In linux, '-L' > option allows for the dereferencing of symlinks to show link target > properties, but it is not the default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1788) FsShell ls: Show symlinks properties
[ https://issues.apache.org/jira/browse/HDFS-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022404#comment-13022404 ] Eli Collins commented on HDFS-1788: --- I think it makes sense to move FsShell over to FileContext (HADOOP-6424). That's substantially less work than supporting symlinks in FileSystem and work we need to do anyway. > FsShell ls: Show symlinks properties > > > Key: HDFS-1788 > URL: https://issues.apache.org/jira/browse/HDFS-1788 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Reporter: Jonathan Eagles >Assignee: John George >Priority: Minor > > ls FsShell command implementation has been consistent with the linux > implementations of ls \-l. With the addition of symlinks, I would expect the > ability to show file type 'd' for directory, '\-' for file, and 'l' for > symlink. In addition, following the linkname entry for symlinks, I would > expect the ability to show "\-> ". In linux, the default is to > the the properties of the link and not of the link target. In linux, '-L' > option allows for the dereferencing of symlinks to show link target > properties, but it is not the default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1788) FsShell ls: Show symlinks properties
[ https://issues.apache.org/jira/browse/HDFS-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022362#comment-13022362 ] John George commented on HDFS-1788: --- To me it looks like this bug has two parts: 1. The ability for FsShell to show a symlink as "l" with the link target "link -> " just like in Linux. This seems to be a straight forward change in FsShell. 2. Inorder for FsShell to be able to do this, it needs to know that it is dealing with a symlink. As of now, it looks like FsShell uses FileSystem to check if a given path is a symlink or not. FileSystem class does not entirely support symlink. So, inorder to fix this bug, ls (FsShell) should either a) start using FileContext (HADOOP-6424) or b) FileSystem should be fixed to be able to deal with symlink. Inorder for FileSystem to support symlink, it should either be able to implement getFileLinkStatus() or getFileStatus() should itself be able to handle symlinks. The fastest/easiest way seems like getting getFileStatus() to also return the FileStatus of links. The best solution (but not the fastest) though seems to be to let FsShell use FileContext. Would it even make sense to let getFileStatus() return the status of symlinks as well (incase where the underlying filesystem supports symlinks) so that ls or any other command that uses FileSystem (as of today) can also deal with symlinks? Comments and suggestions welcome. > FsShell ls: Show symlinks properties > > > Key: HDFS-1788 > URL: https://issues.apache.org/jira/browse/HDFS-1788 > Project: Hadoop HDFS > Issue Type: Improvement > Components: tools >Reporter: Jonathan Eagles >Assignee: John George >Priority: Minor > > ls FsShell command implementation has been consistent with the linux > implementations of ls \-l. With the addition of symlinks, I would expect the > ability to show file type 'd' for directory, '\-' for file, and 'l' for > symlink. In addition, following the linkname entry for symlinks, I would > expect the ability to show "\-> ". In linux, the default is to > the the properties of the link and not of the link target. In linux, '-L' > option allows for the dereferencing of symlinks to show link target > properties, but it is not the default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira