[ 
https://issues.apache.org/jira/browse/HADOOP-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vishwajeet Dusane reassigned HADOOP-12666:
------------------------------------------

    Assignee: Vishwajeet Dusane

> Support Windows Azure Data Lake - as a file system in Hadoop
> ------------------------------------------------------------
>
>                 Key: HADOOP-12666
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12666
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs, fs/azure, tools
>            Reporter: Vishwajeet Dusane
>            Assignee: Vishwajeet Dusane
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> h2. Description
> This JIRA describes a new file system implementation for accessing Windows 
> Azure Data Lake Store (ADL) from within Hadoop. This would enable existing 
> Hadoop applications such as MR, Hive, HBase, etc., to use the ADL store as 
> input or output.
>  
> ADL is an ultra-high-capacity store, optimized for massive throughput, with 
> rich management and security features. More details are available at 
> https://azure.microsoft.com/en-us/services/data-lake-store/
> h2. High level design
> The ADL file system exposes RESTful interfaces compatible with the WebHDFS 
> specification 2.7.1.
> At a high level, the code here extends the SWebHdfsFileSystem class to 
> provide an implementation for accessing ADL storage; the adl scheme is used 
> for accessing it over HTTPS. We use the URI scheme:
> {code}adl://<URI to account>/path/to/file{code} 
> to address individual files and folders. Tests are implemented mostly using a 
> Contract implementation for the ADL functionality, with an option to test 
> against a real ADL storage if configured.
> h2. Credits and history
> This has been ongoing work for a while, and the early version of this work 
> can be seen in. Credit for this work goes to the team: [~vishwajeet.dusane], 
> [~snayak], [~srevanka], [~kiranch], [~chakrab], [~omkarksa], [~snvijaya], 
> [~ansaiprasanna], [~jsangwan]
> h2. Test
> Besides Contract tests, we have used ADL as the additional file system in the 
> current public preview release. Various customer and test workloads 
> have been run against clusters with such configurations for quite some time. 
> The current version reflects the version of the code tested and used in 
> our production environment.
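
As an illustration of the adl:// addressing described under "High level design", here is a minimal sketch that splits such a URI into its account and path parts using only java.net.URI. It is not code from the patch: the class name AdlUriSketch, the helper split, and the host account.example.net are hypothetical and chosen only for this example.

```java
import java.net.URI;

public class AdlUriSketch {
    // Hypothetical helper (not part of the patch): returns {account host, path}
    // for a URI of the form adl://<URI to account>/path/to/file.
    static String[] split(String raw) {
        URI uri = URI.create(raw);
        if (!"adl".equalsIgnoreCase(uri.getScheme())) {
            throw new IllegalArgumentException("expected adl scheme: " + raw);
        }
        return new String[] { uri.getHost(), uri.getPath() };
    }

    public static void main(String[] args) {
        // account.example.net is a placeholder host, not a real ADL account.
        String[] parts = split("adl://account.example.net/path/to/file");
        System.out.println("account host: " + parts[0]);
        System.out.println("path: " + parts[1]);
    }
}
```

In a Hadoop application the same string would normally be wrapped in org.apache.hadoop.fs.Path and resolved through the FileSystem API rather than parsed by hand.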



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
