[jira] [Commented] (HDFS-1788) FsShell ls: Show symlinks properties

2011-06-21 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052938#comment-13052938
 ] 

Eli Collins commented on HDFS-1788:
---

Looks like this is a patch for HADOOP-6424.. how about posting it there?

> FsShell ls: Show symlinks properties
> 
>
> Key: HDFS-1788
> URL: https://issues.apache.org/jira/browse/HDFS-1788
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Reporter: Jonathan Eagles
>Assignee: John George
>Priority: Minor
> Attachments: HDFS-1788.patch
>
>
> ls FsShell command implementation has been consistent with the linux 
> implementations of ls \-l. With the addition of symlinks, I would expect the 
> ability to show file type 'd' for directory, '\-' for file, and 'l' for 
> symlink. In addition, following the linkname entry for symlinks, I would 
> expect the ability to show "\-> ". In linux, the default is to 
> the the properties of the link and not of the link target. In linux, '-L' 
> option allows for the dereferencing of symlinks to show link target 
> properties, but it is not the default. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-1788) FsShell ls: Show symlinks properties

2011-06-18 Thread John George (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051629#comment-13051629
 ] 

John George commented on HDFS-1788:
---

Bochun,
It is good that you submitted a patch for this ticket, because the important 
thing is for this issue to be resolved. 
I had a patch ready but it needed some cleanup since some of the junit tests 
were failing. So, feel free to assign this bug to yourself and submit patches :)

> FsShell ls: Show symlinks properties
> 
>
> Key: HDFS-1788
> URL: https://issues.apache.org/jira/browse/HDFS-1788
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Reporter: Jonathan Eagles
>Assignee: John George
>Priority: Minor
> Attachments: HDFS-1788.patch
>
>
> ls FsShell command implementation has been consistent with the linux 
> implementations of ls \-l. With the addition of symlinks, I would expect the 
> ability to show file type 'd' for directory, '\-' for file, and 'l' for 
> symlink. In addition, following the linkname entry for symlinks, I would 
> expect the ability to show "\-> ". In linux, the default is to 
> the the properties of the link and not of the link target. In linux, '-L' 
> option allows for the dereferencing of symlinks to show link target 
> properties, but it is not the default. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-1788) FsShell ls: Show symlinks properties

2011-06-16 Thread Bochun Bai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050906#comment-13050906
 ] 

Bochun Bai commented on HDFS-1788:
--

Waiting for HADOOP-7360 is a good idea.
And cat/tail commands is not working, this is the motivation I involved.

I feel sorry about not knowing the assignee is still working on this.
It has been assigned for 50 days, and no more progress published.
Next time I will contact before started, thanks daryn.

> FsShell ls: Show symlinks properties
> 
>
> Key: HDFS-1788
> URL: https://issues.apache.org/jira/browse/HDFS-1788
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Reporter: Jonathan Eagles
>Assignee: John George
>Priority: Minor
> Attachments: HDFS-1788.patch
>
>
> ls FsShell command implementation has been consistent with the linux 
> implementations of ls \-l. With the addition of symlinks, I would expect the 
> ability to show file type 'd' for directory, '\-' for file, and 'l' for 
> symlink. In addition, following the linkname entry for symlinks, I would 
> expect the ability to show "\-> ". In linux, the default is to 
> the the properties of the link and not of the link target. In linux, '-L' 
> option allows for the dereferencing of symlinks to show link target 
> properties, but it is not the default. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-1788) FsShell ls: Show symlinks properties

2011-06-16 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050465#comment-13050465
 ] 

Daryn Sharp commented on HDFS-1788:
---

Oh, I'm not disagreeing with adding {{fc}} or {{lstat}}.  I'm just saying leave 
{{fs}} intact until we can remove it, ie. have both {{fc}} and {{fs}}.  HDFS 
appears to lazy-load {{FileStatus}}, but not all {{FileSystem}}s do.  Hence why 
it makes sense for a {{lstat()}} since very few commands (cp/mv/ls), in limited 
cases, will need a stat of the link.  Whereas every command uses {{stat}}.

Perhaps I'm misunderstanding: why should commands like cat/tail be symlink 
aware?  In the normal access case, symlinks should be transparent to a client.  
Ie. The filesystem should be responsible for resolving symlinks.  Very few 
commands should know or care about a symlink.

Note that HADOOP-7360 is changing a lot of the {{PathData}} ctors to be 
private.  The fs cannot be passed in, and that change was in part intended to 
greatly simplify the move to {{FileContext}}.  I hope to have this jira 
approved in the next few days, so I'd recommend waiting to post another patch 
until 7360 is integrated.

On a related note to waiting for my patches, please be mindful of working on 
jiras that are already assigned.  In the future, please contact an assignee for 
the status of their work.  In this case, the assignee of this jira was already 
working on the change and waiting for a few of my patches to go in.

> FsShell ls: Show symlinks properties
> 
>
> Key: HDFS-1788
> URL: https://issues.apache.org/jira/browse/HDFS-1788
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Reporter: Jonathan Eagles
>Assignee: John George
>Priority: Minor
> Attachments: HDFS-1788.patch
>
>
> ls FsShell command implementation has been consistent with the linux 
> implementations of ls \-l. With the addition of symlinks, I would expect the 
> ability to show file type 'd' for directory, '\-' for file, and 'l' for 
> symlink. In addition, following the linkname entry for symlinks, I would 
> expect the ability to show "\-> ". In linux, the default is to 
> the the properties of the link and not of the link target. In linux, '-L' 
> option allows for the dereferencing of symlinks to show link target 
> properties, but it is not the default. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-1788) FsShell ls: Show symlinks properties

2011-06-16 Thread Bochun Bai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050261#comment-13050261
 ] 

Bochun Bai commented on HDFS-1788:
--

Thanks daryn, your reviews encourage me a lot.
I will rewrite the patch and follow your advises.

But I got some difficult while keeping PathData maximum compatible: 
  Adds lstat, leaving fs, stat, exists.
  And fc should be added because HADOOP-6446. Further works should use 
FileContext instead of FileSystem.
  And lexists is requeired to identify broken links.

  The problem is fc should be initialized in ctor like fs, whether it is 
parameter or not, breaks compatible by requiring new parameter or throwing 
exceptions.
  If fc is not a field of PathData and initialized when item.lstat() is called. 
RPC to NN will increased.

To satisfies both compatible and feature, I suggest create a object named 
StatusData:
  contains fc, lstat and string.
  represent the link itself, not the real file.
  when cat/tail/stat is called, lstat.symlink will be resolved recursively and 
the real target is returned.
  FileStatus is recursively, and StatusData is independent to link or not.
  RPC to NN is decreased while listing a directory containing symlink. 
stat/read/write overhead to the symlink is the same, but lazy-inited. It should 
also improve GC activities to HADOOP-6732.

And I have noticed @eli created a serial issues about porting FileContext. I 
will submit next patch to HADOOP-6424.


> FsShell ls: Show symlinks properties
> 
>
> Key: HDFS-1788
> URL: https://issues.apache.org/jira/browse/HDFS-1788
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Reporter: Jonathan Eagles
>Assignee: John George
>Priority: Minor
> Attachments: HDFS-1788.patch
>
>
> ls FsShell command implementation has been consistent with the linux 
> implementations of ls \-l. With the addition of symlinks, I would expect the 
> ability to show file type 'd' for directory, '\-' for file, and 'l' for 
> symlink. In addition, following the linkname entry for symlinks, I would 
> expect the ability to show "\-> ". In linux, the default is to 
> the the properties of the link and not of the link target. In linux, '-L' 
> option allows for the dereferencing of symlinks to show link target 
> properties, but it is not the default. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-1788) FsShell ls: Show symlinks properties

2011-06-15 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050065#comment-13050065
 ] 

Daryn Sharp commented on HDFS-1788:
---

Very nice! First off, this isn't a HDFS change.  Maybe someone with admin 
powers can move the jira to HADOOP, or close this and repost on HADOOP-6424.  

In general, I'd really like to see {{statReal}} revert to {{stat}}.  That will 
greatly reduce the size of the patch.  Since I'm a unix stalwart, I'd "prefer" 
{{statLink}} be called {{lstat}}.  In either case, it should be a method to 
avoid increasing the load on the NN (see more below).

+FsShell+
* Make sure that renaming {{getFS()}} to {{getFC()}} didn't break {{DFSAdmin}}, 
or {{TestMRCLI}}, etc.
* I think the addition of {{FileContext.processDeleteOnExit()}} to {{close()}} 
may cause problems.  Ex.  What happens if I have temp files open before running 
a FsShell command?  Won't this change cause the files to unexpectedly go "poof"?

+Ls+
* The new {{\-L}} flag is really implementing part of {{\-l}}. {{\-L}} is 
supposed to replace the link with the name & attrs of its target.  It would be 
a nice option, but probably not strictly needed unless you are feeling 
ambitious.  Now might be a good time to make {{\-l}} generate output the way ls 
is REALLY supposed to look.  Otherwise altering the {{\-l}} output in the 
future will be deemed incompatible.  You might consider another jira...

+PathData+
* I'd recommend undoing as much as possible since there reasons why it is the 
way it is, plus it will cause a major merge conflict with my pending PathData 
changes.
* The String ctors need to be restored since Path can mangle the string it is 
given.
* The 3-arg ctor that takes FileStatus must be re-added.  I took great effort 
to reduce the RPC load on the NN, but this patch will undo some of that work 
and generate *2X the stats to the NN*.  
* Always doing the equivalent of stat/lstat on every object is causing *2X the 
stats to the NN*.  Combined with the previous point, this patch is causing *4X 
the stats to the NN* unless there's magic going on deep in the client
* {{lstat}} is used so infrequently it should probably be an on-demand 
{{item.lstat()}} method
* {{refreshStatus}} is expected to throw FNF, but this changes it to ignore it
* If {{FileSystem}} can't be completely eliminated, please remove 
{{fs(Configuration)}} and leave the {{fs}} member intact.  That will also 
greatly reduce the size of the patch, and remove errors that may be caused by 
providing a different config than the one used to originally create the object.

+Tail+
* Changing {{refreshStatus}} to not throw FNF will cause a NPE here.  This is 
one of the reasons {{refreshStatus}} needs to retain original behavior.

+CopyCommands+
* Are you sure {{copyCrcToLocal}} will work now?  The raw fs was needed since 
the actual fs obscures the crc file.  Does {{FileContext}} change that behavior?

+Ln+
* {{Ln}} Shouldn't be in {{CopyCommands}}.  Please move this class into it's 
own file.  
* Should require {{\-s}} to create symbolic links.  If {{\-s}} isn't given, it 
should throw that hardlinks aren't supported -- that way we leave the door open 
to the possibility of hardlinks someday.
* It can't be a {{CommandWithDestination}} or it forces the target of the 
symlink to exist at creation time.  It's completely legit to create a symlink 
that points to a non-existent path.  Also, it should take 1 or 2 args like the 
unix command.  {{processArguments}} is probably the best place to implement it.
* Might consider splitting this off to accelerate the rest being integrated.  
Up to you.

+LocalFileSystem+
* Why expose {{NAME}}?  It doesn't appear to be used.

+FileContext+
* Why add {{fsUri}}?  It doesn't appear to be used.

Overall, great work!  Don't let the length of the review be discouraging.  It's 
a great improvement to the shell!

> FsShell ls: Show symlinks properties
> 
>
> Key: HDFS-1788
> URL: https://issues.apache.org/jira/browse/HDFS-1788
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Reporter: Jonathan Eagles
>Assignee: John George
>Priority: Minor
> Attachments: HDFS-1788.patch
>
>
> ls FsShell command implementation has been consistent with the linux 
> implementations of ls \-l. With the addition of symlinks, I would expect the 
> ability to show file type 'd' for directory, '\-' for file, and 'l' for 
> symlink. In addition, following the linkname entry for symlinks, I would 
> expect the ability to show "\-> ". In linux, the default is to 
> the the properties of the link and not of the link target. In linux, '-L' 
> option allows for the dereferencing of symlinks to show link target 
> properties, but it is not the default. 

--
This message is automatically gen

[jira] [Commented] (HDFS-1788) FsShell ls: Show symlinks properties

2011-04-25 Thread John George (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13024866#comment-13024866
 ] 

John George commented on HDFS-1788:
---

Will go the FileContext route.

> FsShell ls: Show symlinks properties
> 
>
> Key: HDFS-1788
> URL: https://issues.apache.org/jira/browse/HDFS-1788
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Reporter: Jonathan Eagles
>Assignee: John George
>Priority: Minor
>
> ls FsShell command implementation has been consistent with the linux 
> implementations of ls \-l. With the addition of symlinks, I would expect the 
> ability to show file type 'd' for directory, '\-' for file, and 'l' for 
> symlink. In addition, following the linkname entry for symlinks, I would 
> expect the ability to show "\-> ". In linux, the default is to 
> the the properties of the link and not of the link target. In linux, '-L' 
> option allows for the dereferencing of symlinks to show link target 
> properties, but it is not the default. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1788) FsShell ls: Show symlinks properties

2011-04-20 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022404#comment-13022404
 ] 

Eli Collins commented on HDFS-1788:
---

I think it makes sense to move FsShell over to FileContext (HADOOP-6424). 
That's substantially less work than supporting symlinks in FileSystem and work 
we need to do anyway.

> FsShell ls: Show symlinks properties
> 
>
> Key: HDFS-1788
> URL: https://issues.apache.org/jira/browse/HDFS-1788
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Reporter: Jonathan Eagles
>Assignee: John George
>Priority: Minor
>
> ls FsShell command implementation has been consistent with the linux 
> implementations of ls \-l. With the addition of symlinks, I would expect the 
> ability to show file type 'd' for directory, '\-' for file, and 'l' for 
> symlink. In addition, following the linkname entry for symlinks, I would 
> expect the ability to show "\-> ". In linux, the default is to 
> the the properties of the link and not of the link target. In linux, '-L' 
> option allows for the dereferencing of symlinks to show link target 
> properties, but it is not the default. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1788) FsShell ls: Show symlinks properties

2011-04-20 Thread John George (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022362#comment-13022362
 ] 

John George commented on HDFS-1788:
---

To me it looks like this bug has two parts:
1. The ability for FsShell to show a symlink as "l" with the link target "link 
-> " just like in Linux. This seems to be a straight forward change in 
FsShell.

2. Inorder for FsShell to be able to do this, it needs to know that it is 
dealing with a symlink. As of now, it looks like FsShell uses FileSystem to 
check if a given path is a symlink or not. FileSystem class does not entirely 
support symlink. So, inorder to fix this bug, ls (FsShell) should either 
 a) start using FileContext (HADOOP-6424) or 
 b) FileSystem should be fixed to be able to deal with symlink. 
Inorder for FileSystem to support symlink, it should either be able to 
implement getFileLinkStatus() or getFileStatus() should itself be able to 
handle symlinks. The fastest/easiest way seems like getting getFileStatus() to 
also return the FileStatus of links. The best solution (but not the fastest) 
though seems to be to let FsShell use FileContext. Would it even make sense to 
let getFileStatus() return the status of symlinks as well (incase where the 
underlying filesystem supports symlinks) so that ls or any other command that 
uses FileSystem (as of today) can also deal with symlinks? 

Comments and suggestions welcome. 

> FsShell ls: Show symlinks properties
> 
>
> Key: HDFS-1788
> URL: https://issues.apache.org/jira/browse/HDFS-1788
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: tools
>Reporter: Jonathan Eagles
>Assignee: John George
>Priority: Minor
>
> ls FsShell command implementation has been consistent with the linux 
> implementations of ls \-l. With the addition of symlinks, I would expect the 
> ability to show file type 'd' for directory, '\-' for file, and 'l' for 
> symlink. In addition, following the linkname entry for symlinks, I would 
> expect the ability to show "\-> ". In linux, the default is to 
> the the properties of the link and not of the link target. In linux, '-L' 
> option allows for the dereferencing of symlinks to show link target 
> properties, but it is not the default. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira