[ 
https://issues.apache.org/jira/browse/HDFS-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16129986#comment-16129986
 ] 

Yongjun Zhang edited comment on HDFS-12295 at 8/17/17 2:44 PM:
---------------------------------------------------------------

Hi [~daryn],

The solution proposed here targets distcp; your comment made me aware that
"hadoop fs -cp" has the same problem to solve. Thanks again for that.

There are several proposals so far:

1. HDFS-12202: add a new set of interfaces for getFileStatus and listStatus,
and call them where needed to solve the problem (distcp, "hadoop fs -cp",
etc.)
Pros: clear interface, no confusion
Cons: the change is very wide. Dummy implementations have to be introduced
for FileSystems that don't support an attribute provider.

2. HDFS-12294: encode the additional parameter in the path string itself as a
prefix, and extract the prefix from the path string. Add the prefix where
needed to solve the problem (distcp, "hadoop fs -cp", etc.)
Pros: no need to change the FileSystem interface
Cons: path strings are potentially inconsistent in different places, since
the prefix is only relevant to certain operations.

3. Let the external attribute provider fall through to HDFS for a certain
user. This is discussed in an HDFS-12202 comment.
Pros: maybe simpler to implement
Cons: potentially won't work (since the same user may want to get data from
the attribute provider, and other users need to run distcp and
"hadoop fs -cp" too)

[~daryn], [~chris.douglas], [~asuresh], [~andrew.wang], [~manojg], thanks for
your comments earlier. Do you think my summary above is reasonable? Any better
ideas or further thoughts to share?

Really appreciate it.








> NameNode to support file path prefix /.reserved/bypassExtAttr
> -------------------------------------------------------------
>
>                 Key: HDFS-12295
>                 URL: https://issues.apache.org/jira/browse/HDFS-12295
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs, namenode
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-12295.001.patch, HDFS-12295.001.patch
>
>
> Let NameNode support the prefix /.reserved/bypassExtAttr, so a client can add
> this prefix to a path before calling getFileStatus, e.g. /a/b/c becomes
> /.reserved/bypassExtAttr/a/b/c. The NN will parse the path at the very
> beginning and bypass the external attribute provider if the prefix is there.
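
To make the prefix handling described above concrete, here is a minimal sketch
in plain Java. It is not taken from the HDFS-12295 patch: the class name
BypassPrefixSketch and the methods addPrefix, hasPrefix and stripPrefix are
hypothetical, and it works on plain strings rather than Hadoop Path objects.
It only illustrates the idea that the client prepends the prefix, and that the
NameNode side detects and strips it before resolving the real path.

```java
/**
 * Hypothetical sketch of /.reserved/bypassExtAttr prefix handling.
 * All names here are illustrative assumptions, not the actual NameNode code.
 */
final class BypassPrefixSketch {
    static final String BYPASS_PREFIX = "/.reserved/bypassExtAttr";

    /** Client side: prepend the prefix to an absolute path (idempotent). */
    static String addPrefix(String path) {
        if (!path.startsWith("/")) {
            throw new IllegalArgumentException("expected an absolute path: " + path);
        }
        if (hasPrefix(path)) {
            return path;
        }
        // The root "/" maps to the bare prefix, without a trailing slash.
        return path.equals("/") ? BYPASS_PREFIX : BYPASS_PREFIX + path;
    }

    /** Server side: true if the path asks to bypass the attribute provider. */
    static boolean hasPrefix(String path) {
        // Require a component boundary so /.reserved/bypassExtAttrFoo does not match.
        return path.equals(BYPASS_PREFIX) || path.startsWith(BYPASS_PREFIX + "/");
    }

    /** Server side: recover the real path; unprefixed paths are returned unchanged. */
    static String stripPrefix(String path) {
        if (!hasPrefix(path)) {
            return path;
        }
        String rest = path.substring(BYPASS_PREFIX.length());
        return rest.isEmpty() ? "/" : rest;
    }

    public static void main(String[] args) {
        String prefixed = addPrefix("/a/b/c");
        System.out.println(prefixed);              // /.reserved/bypassExtAttr/a/b/c
        System.out.println(hasPrefix(prefixed));   // true
        System.out.println(stripPrefix(prefixed)); // /a/b/c
    }
}
```

In this sketch a caller such as distcp would call addPrefix before
getFileStatus, and the server would check hasPrefix once at the start of
request handling, then continue with the stripped path and skip the external
attribute provider.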



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
