[ 
https://issues.apache.org/jira/browse/HDFS-15289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090912#comment-17090912
 ] 

Virajith Jalaparti edited comment on HDFS-15289 at 4/23/20, 8:25 PM:
---------------------------------------------------------------------

cc: [~shv], [~cliang], [~abhishekd], [~aiden_zhang]

Hi [~umamaheswararao], thanks for posting this! At LinkedIn, we are currently 
evaluating the same situation and are implementing a solution along the same 
lines as described in the doc (both problem 1 and 2 as discussed in your 
document). 

Our use cases include:
 * HDFS federation (we are currently working on federating our largest cluster)
 * Accessing data across multiple storage accounts in Azure (as part of 
migration to cloud).
 * The same user code (UDFs etc.) should be able to work in both of the setups 
above (federated HDFS and Azure storage). In many cases, UDFs use {{hdfs:///}}.

Our concerns around overriding {{fs.hdfs.impl}} are:
 # {{saveNamespace}} and other methods in {{FileSystem}} all need to be 
implemented in {{ViewFSOverloadScheme}}. Do you have any specific plans around 
testing this?
 # Admins will not have a way to directly access HDFS unless configs on admin 
machines are deployed separately. Is this something you considered? How do you 
plan to make admin tools work?
 # How do we handle cases where {{DistributedFileSystem}} is used instead of 
{{FileSystem}}? Do you plan to make {{ViewFSOverloadScheme}} extend 
{{DistributedFileSystem}}?

 Any thoughts around 1-3 above?
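
To make concern 3 concrete, here is a minimal, self-contained sketch. The classes below are hypothetical stand-ins that only borrow the names used above; they are not the Hadoop classes, and {{get}} merely models what resolving {{fs.hdfs.impl}} might return. It assumes the overlay class does not extend {{DistributedFileSystem}}.

```java
public class CastConcernSketch {
    // Hypothetical stand-ins for the classes named above; NOT the real Hadoop APIs.
    static class FileSystem { }

    static class DistributedFileSystem extends FileSystem {
        void saveNamespace() { /* models an HDFS-only admin operation */ }
    }

    // Assumption: the overlay extends FileSystem only, not DistributedFileSystem.
    static class ViewFSOverloadScheme extends FileSystem { }

    // Models the lookup for the hdfs scheme once fs.hdfs.impl may be overridden.
    static FileSystem get(boolean overloadEnabled) {
        return overloadEnabled ? new ViewFSOverloadScheme() : new DistributedFileSystem();
    }

    // Caller code that assumes hdfs:// always yields a DistributedFileSystem.
    static boolean adminCallSucceeds(FileSystem fs) {
        try {
            ((DistributedFileSystem) fs).saveNamespace();
            return true;
        } catch (ClassCastException e) {
            return false; // the downcast breaks when the overlay is returned
        }
    }

    public static void main(String[] args) {
        System.out.println("without override: " + adminCallSucceeds(get(false)));
        System.out.println("with override:    " + adminCallSucceeds(get(true)));
    }
}
```

Under these assumptions, any caller that downcasts the returned object would need either the overlay to extend {{DistributedFileSystem}} or a separate way to reach the underlying filesystem.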

 



> Allow viewfs mounts with hdfs scheme and centralized mount table
> ----------------------------------------------------------------
>
>                 Key: HDFS-15289
>                 URL: https://issues.apache.org/jira/browse/HDFS-15289
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: fs
>    Affects Versions: 3.2.0
>            Reporter: Uma Maheswara Rao G
>            Assignee: Uma Maheswara Rao G
>            Priority: Major
>             Fix For: 3.4.0
>
>         Attachments: ViewFSOverloadScheme - V1.0.pdf
>
>
> ViewFS provides the flexibility to mount different filesystem types through a 
> mount point configuration table. Additionally, ViewFS allows any fs scheme 
> (not only HDFS) in the mount table mapping. This approach solves the 
> scalability problems, but users need to reconfigure the filesystem to ViewFS 
> and to its scheme. This is problematic when paths are persisted in meta 
> stores, e.g. Hive, which stores URIs in its meta store. Changing the file 
> system scheme therefore creates a burden to upgrade/recreate meta stores. In 
> our experience many users are not ready to change that.
> Router based federation is another implementation that provides coordinated 
> mount points for HDFS federation clusters. Even though it makes mount points 
> easy to handle, it does not allow other (non-HDFS) file systems to be 
> mounted. So, it does not serve the purpose when users want to mount external 
> (non-HDFS) filesystems.
> So, the problem here is: even though many users want to adopt the scalable 
> fs options available, the technical challenges of changing schemes (e.g. in 
> meta stores) in deployments are obstructing them.
> So, we propose to allow the hdfs scheme in a ViewFS-like client-side mount 
> system and let users create mount links without changing URI paths.
> I will upload a detailed design doc shortly.
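
The idea of client-side mount links that leave the caller's URI untouched can be sketched roughly as follows. This is an illustrative model only, not the ViewFS implementation: the class {{MountTableSketch}}, its methods, and every mount point and target URI shown are hypothetical. It resolves the path component of an hdfs URI by longest matching mount-point prefix.

```java
import java.net.URI;
import java.util.TreeMap;

public class MountTableSketch {
    // mount point -> target URI; all entries are hypothetical examples
    private final TreeMap<String, String> links = new TreeMap<>();

    void addLink(String mountPoint, String targetUri) {
        links.put(mountPoint, targetUri);
    }

    // The caller keeps its hdfs:// URI; only the path is mapped to a target.
    String resolve(URI uri) {
        String path = uri.getPath();
        String best = null;
        for (String mountPoint : links.keySet()) {
            boolean matches = path.equals(mountPoint) || path.startsWith(mountPoint + "/");
            if (matches && (best == null || mountPoint.length() > best.length())) {
                best = mountPoint; // prefer the longest matching mount point
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("No mount point for " + path);
        }
        return links.get(best) + path.substring(best.length());
    }

    public static void main(String[] args) {
        MountTableSketch table = new MountTableSketch();
        table.addLink("/user", "hdfs://ns1/user");          // an HDFS namespace
        table.addLink("/warehouse", "s3a://bucket/warehouse"); // a non-HDFS target
        System.out.println(table.resolve(URI.create("hdfs://cluster/user/alice/file.txt")));
        System.out.println(table.resolve(URI.create("hdfs://cluster/warehouse/t1")));
    }
}
```

In this model a URI persisted in a meta store keeps its hdfs scheme, while the mount table alone decides which namespace or external store serves the path.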


