[ https://issues.apache.org/jira/browse/HADOOP-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999592#comment-13999592 ]
Hadoop QA commented on HADOOP-9629: ----------------------------------- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12645104/HADOOP-9629.trunk.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 25 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 2 release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-tools/hadoop-azure hadoop-tools/hadoop-tools-dist. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/3945//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/3945//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/3945//console This message is automatically generated. > Support Windows Azure Storage - Blob as a file system in Hadoop > --------------------------------------------------------------- > > Key: HADOOP-9629 > URL: https://issues.apache.org/jira/browse/HADOOP-9629 > Project: Hadoop Common > Issue Type: Improvement > Reporter: Mostafa Elhemali > Assignee: Mostafa Elhemali > Attachments: HADOOP-9629.2.patch, HADOOP-9629.3.patch, > HADOOP-9629.patch, HADOOP-9629.trunk.1.patch > > > h2. Description > This JIRA incorporates adding a new file system implementation for accessing > Windows Azure Storage - Blob from within Hadoop, such as using blobs as input > to MR jobs or configuring MR jobs to put their output directly into blob > storage. > h2. High level design > At a high level, the code here extends the FileSystem class to provide an > implementation for accessing blob storage; the scheme wasb is used for > accessing it over HTTP, and wasbs for accessing over HTTPS. We use the URI > scheme: {code}wasb[s]://<container>@<account>/path/to/file{code} to address > individual blobs. We use the standard Azure Java SDK > (com.microsoft.windowsazure) to do most of the work. In order to map a > hierarchical file system over the flat name-value pair nature of blob > storage, we create a specially tagged blob named path/to/dir whenever we > create a directory called path/to/dir, then files under that are stored as > normal blobs path/to/dir/file. We have many metrics implemented for it using > the Metrics2 interface. Tests are implemented mostly using a mock > implementation for the Azure SDK functionality, with an option to test > against a real blob storage if configured (instructions provided inside in > README.txt). > h2. Credits and history > This has been ongoing work for a while, and the early version of this work > can be seen in HADOOP-8079. This JIRA is a significant revision of that and > we'll post the patch here for Hadoop trunk first, then post a patch for > branch-1 as well for backporting the functionality if accepted. Credit for > this work goes to the early team: [~minwei], [~davidlao], [~lengningliu] and > [~stojanovic] as well as multiple people who have taken over this work since > then (hope I don't forget anyone): [~dexterb], Johannes Klein, [~ivanmi], > Michael Rys, [~mostafae], [~brian_swan], [~mikelid], [~xifang], and > [~chuanliu]. > h2. Test > Besides unit tests, we have used WASB as the default file system in our > service product. (HDFS is also used but not as default file system.) Various > different customer and test workloads have been run against clusters with > such configurations for quite some time. The current version reflects to the > version of the code tested and used in our production environment. -- This message was sent by Atlassian JIRA (v6.2#6252)