[ https://issues.apache.org/jira/browse/HDFS-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16129986#comment-16129986 ]
Yongjun Zhang edited comment on HDFS-12295 at 8/17/17 2:44 PM:
---------------------------------------------------------------

Hi [~daryn],

The proposed solution here tries to address distcp; your comment made me aware that "hadoop fs -cp" has the same problem to solve. Thanks again for that.

There are several proposals so far:

1. HDFS-12202: add a new set of interfaces for getFileStatus and listStatus, and call them when needed to solve the problem (distcp, "hadoop fs -cp", etc.).
Pros: clear interface, no confusion.
Cons: the change is very wide. We have to introduce dummy implementations for FileSystems that don't support an attribute provider.

2. HDFS-12294: encode the additional parameter in the path string itself as a prefix, and extract the prefix from the path string. Add the prefix when needed to solve the problem (distcp, "hadoop fs -cp", etc.).
Pros: no need to change the FileSystem interface.
Cons: path strings are potentially inconsistent in different places, since the prefix is only relevant to certain operations.

3. Let the external attribute provider fall through to HDFS for a certain user. This is discussed in a comment on HDFS-12202.
Pros: maybe simpler to implement.
Cons: potentially won't work (the same user may want to get data from the attribute provider, and other users need to run distcp and "hadoop fs -cp" too).

[~daryn], [~chris.douglas], [~asuresh], [~andrew.wang], [~manojg], thanks for your earlier comments. Do you think my summary above is reasonable? Any better ideas or further thoughts to share? I'd really appreciate it.


was (Author: yzhangal):
Hi [~daryn],

The proposed solution here tries to address distcp; your comment made me aware that "hadoop fs -cp" has the same problem to solve. Thanks again for that.

There are several proposals so far:

1. HDFS-12202: add a new set of interfaces for getFileStatus and listStatus, and call them when needed to solve the problem (distcp, "hadoop fs -cp", etc.).
Pros: clear interface, no confusion.
Cons: the change is too wide. We have to introduce dummy implementations for FileSystems that don't support an attribute provider.

2. HDFS-12294: encode the additional parameter in the path string itself as a prefix, and extract the prefix from the path string. Add the prefix when needed to solve the problem (distcp, "hadoop fs -cp", etc.).
Pros: no need to change the FileSystem interface.
Cons: path strings are potentially inconsistent in different places, since the prefix is only relevant to certain operations.

3. Let the external attribute provider fall through to HDFS for a certain user. This is discussed in a comment on HDFS-12202.
Pros: maybe simpler to implement.
Cons: potentially won't work (the same user may want to get data from the attribute provider, and other users need to run distcp and "hadoop fs -cp" too).

[~daryn], [~chris.douglas], [~asuresh], [~andrew.wang], [~manojg], thanks for your earlier comments. Do you think my summary above is reasonable? Any better ideas or further thoughts to share? I'd really appreciate it.

> NameNode to support file path prefix /.reserved/bypassExtAttr
> -------------------------------------------------------------
>
>                 Key: HDFS-12295
>                 URL: https://issues.apache.org/jira/browse/HDFS-12295
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs, namenode
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-12295.001.patch, HDFS-12295.001.patch
>
>
> Let NameNode support the prefix /.reserved/bypassExtAttr, so a client can add
> this prefix to a path before calling getFileStatus, e.g. /a/b/c becomes
> /.reserved/bypassExtAttr/a/b/c. NN will parse the path at the very beginning,
> and bypass the external attribute provider if the prefix is there.
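
For illustration only, the prefix handling described in the issue summary above could be sketched roughly as follows. This is not code from the attached patch: the class name BypassPrefixSketch and the methods addPrefix, hasPrefix and stripPrefix are hypothetical, and only the prefix string /.reserved/bypassExtAttr comes from the issue. The sketch assumes absolute paths and plain string handling; the real NameNode path resolution is more involved.

```java
/**
 * Hypothetical sketch of the /.reserved/bypassExtAttr prefix handling.
 * Names here are illustrative assumptions, not the HDFS-12295 patch.
 */
public final class BypassPrefixSketch {
    // The only value taken from the issue description.
    public static final String PREFIX = "/.reserved/bypassExtAttr";

    private BypassPrefixSketch() {}

    /** Client side: prepend the prefix to an absolute path (idempotent). */
    public static String addPrefix(String path) {
        if (!path.startsWith("/")) {
            throw new IllegalArgumentException("absolute path required: " + path);
        }
        return hasPrefix(path) ? path : PREFIX + path;
    }

    /** True only when the prefix is a whole leading path component sequence. */
    public static boolean hasPrefix(String path) {
        return path.equals(PREFIX) || path.startsWith(PREFIX + "/");
    }

    /**
     * Server side: recover the real path. A caller would check hasPrefix()
     * first and, if true, skip the external attribute provider.
     */
    public static String stripPrefix(String path) {
        if (!hasPrefix(path)) {
            return path;
        }
        String rest = path.substring(PREFIX.length());
        return rest.isEmpty() ? "/" : rest;
    }
}
```

In this sketch a client such as distcp would call addPrefix before getFileStatus, and the server side would call hasPrefix and stripPrefix before resolving the path. Matching on a whole path component keeps a directory that merely starts with the same characters (for example /.reserved/bypassExtAttrX) from being treated as prefixed.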