[ 
https://issues.apache.org/jira/browse/HADOOP-17072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17139993#comment-17139993
 ] 

Uma Maheswara Rao G edited comment on HADOOP-17072 at 6/18/20, 9:06 PM:
------------------------------------------------------------------------

Hi [~virajith], thanks for filing this and for the patch!

I took a quick look at the patch. FileSystem.java already exposes two public 
APIs with closely similar functionality:

# public Path getLinkTarget(Path f) throws IOException {
# public FileSystem[] getChildFileSystems() {
Did you get a chance to check whether they cover your use case without adding new APIs?

I see that getClusterRoots already uses getChildFileSystems. 
You also added FileSystem#getRootURI. This looks to me like a simple util 
method; does it need to be in FileSystem.java? Is your plan to override 
getRootURI in specific FileSystem implementations? 
The current implementation in ViewFileSystem#getClusterRoots could also live in 
any util class, using the public getChildFileSystems. Could you please 
elaborate a bit?
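
To make the util-class suggestion concrete, here is a rough sketch of what I have in mind. It is only an illustration: ChildAwareFs is a hypothetical stand-in for the two public methods involved (getUri and getChildFileSystems), not org.apache.hadoop.fs.FileSystem itself, and ClusterRootUtil, clusterRoots and the fs(...) fixture are names made up for this example. The fallback to the file system's own URI when there are no children is also an assumption on my side, not something taken from the patch.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the two public FileSystem methods discussed
// above. This is NOT org.apache.hadoop.fs.FileSystem.
interface ChildAwareFs {
    URI getUri();

    // May return null or an empty array when there are no child file systems.
    ChildAwareFs[] getChildFileSystems();
}

// Hypothetical util class: nothing here needs to live in the base class.
final class ClusterRootUtil {
    private ClusterRootUtil() {}

    // Returns the distinct URIs of the direct child file systems, in the
    // order they are reported. Falls back to the file system's own URI when
    // it has no children (an assumption made for this sketch).
    static List<URI> clusterRoots(ChildAwareFs fs) {
        Set<URI> roots = new LinkedHashSet<>();
        ChildAwareFs[] children = fs.getChildFileSystems();
        if (children != null) {
            for (ChildAwareFs child : children) {
                roots.add(child.getUri());
            }
        }
        if (roots.isEmpty()) {
            roots.add(fs.getUri());
        }
        return new ArrayList<>(roots);
    }

    // Small in-memory fixture so the sketch runs without a cluster.
    static ChildAwareFs fs(String uri, ChildAwareFs... children) {
        URI parsed = URI.create(uri);
        return new ChildAwareFs() {
            @Override
            public URI getUri() {
                return parsed;
            }

            @Override
            public ChildAwareFs[] getChildFileSystems() {
                return children;
            }
        };
    }

    public static void main(String[] args) {
        ChildAwareFs view = fs("viewfs://cluster", fs("hdfs://ns1"), fs("hdfs://ns2"));
        for (URI root : clusterRoots(view)) {
            System.out.println(root);
        }
    }
}
```

If something like this covers the use case, the new method on FileSystem would only be needed where a specific file system has to override the behaviour.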









> Add getClusterRoot and getClusterRoots methods to FileSystem and 
> ViewFilesystem
> -------------------------------------------------------------------------------
>
>                 Key: HADOOP-17072
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17072
>             Project: Hadoop Common
>          Issue Type: Task
>          Components: fs, viewfs
>            Reporter: Virajith Jalaparti
>            Assignee: Virajith Jalaparti
>            Priority: Major
>         Attachments: HADOOP-17072.001.patch
>
>
> In a federated setting (HDFS federation, federation across multiple buckets 
> on S3, multiple containers across Azure storage), certain system 
> tools/pipelines require the ability to map paths to the clusters/accounts.
> Consider the example of GDPR compliance/retention jobs that need to go over 
> various datasets, ingested over a period of T days and remove/quarantine 
> datasets that are not properly annotated/have reached their retention period. 
> Such jobs can rely on renames to a global trash/quarantine directory to 
> accomplish their task. However, in a federated setting, efficient, atomic 
> renames (as those within a single HDFS cluster) are not supported across the 
> different clusters/shards in federation. As a result, such jobs will need to 
> leverage a trash/quarantine directory per cluster/shard. Further, they would 
> need to map from a particular path to the cluster/shard that contains this 
> path.
> To address such cases, this JIRA proposes to add two new methods to 
> {{FileSystem}}: {{getClusterRoot}} and {{getClusterRoots()}}.
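
The path-to-cluster mapping described in the issue can be pictured as a longest-prefix lookup over a mount table. The sketch below is a minimal, standalone illustration under that assumption; it is not the attached patch and does not use any Hadoop API. MountTableLookup, addMount and clusterRootFor are hypothetical names, and the mount points and URIs are made-up sample values.

```java
import java.net.URI;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical mount table: mount point -> root URI of the backing cluster.
final class MountTableLookup {
    private final Map<String, URI> mounts = new TreeMap<>();

    void addMount(String mountPoint, String clusterRoot) {
        mounts.put(normalize(mountPoint), URI.create(clusterRoot));
    }

    // Returns the cluster root of the longest mount point that is a
    // component-wise prefix of the path, or empty if no mount matches.
    Optional<URI> clusterRootFor(String path) {
        String p = normalize(path);
        String best = null;
        for (String mount : mounts.keySet()) {
            boolean matches = mount.equals("/")
                || p.equals(mount)
                || p.startsWith(mount + "/");
            if (matches && (best == null || mount.length() > best.length())) {
                best = mount;
            }
        }
        return best == null ? Optional.empty() : Optional.of(mounts.get(best));
    }

    // Drops trailing slashes so "/data/" and "/data" are the same mount.
    private static String normalize(String path) {
        String p = path;
        while (p.length() > 1 && p.endsWith("/")) {
            p = p.substring(0, p.length() - 1);
        }
        return p;
    }

    public static void main(String[] args) {
        MountTableLookup table = new MountTableLookup();
        table.addMount("/data", "hdfs://ns1");
        table.addMount("/data/logs", "hdfs://ns2");
        System.out.println(table.clusterRootFor("/data/logs/2020/06"));
        System.out.println(table.clusterRootFor("/data/users/a"));
        System.out.println(table.clusterRootFor("/tmp"));
    }
}
```

A retention job could use such a lookup to pick the trash/quarantine directory on the same cluster as the dataset, so that the rename stays within a single cluster.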



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

