[ https://issues.apache.org/jira/browse/HADOOP-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14250776#comment-14250776 ]
Hudson commented on HADOOP-9629:
--------------------------------

FAILURE: Integrated in Hadoop-trunk-Commit #6739 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6739/])
HADOOP-9629. Move attribution in CHANGES.txt to 2.7.0 section. (cnauroth: rev 6ba8fd7e8e038351fc14f6096ea2216ce2abe918)
* hadoop-common-project/hadoop-common/CHANGES.txt

> Support Windows Azure Storage - Blob as a file system in Hadoop
> ---------------------------------------------------------------
>
>                 Key: HADOOP-9629
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9629
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: tools
>            Reporter: Mostafa Elhemali
>            Assignee: Chris Nauroth
>             Fix For: 3.0.0
>
>         Attachments: HADOOP-9629 - Azure Filesystem - Information for
> developers.docx, HADOOP-9629 - Azure Filesystem - Information for
> developers.pdf, HADOOP-9629.2.patch, HADOOP-9629.3.patch, HADOOP-9629.patch,
> HADOOP-9629.trunk.1.patch, HADOOP-9629.trunk.2.patch,
> HADOOP-9629.trunk.3.patch, HADOOP-9629.trunk.4.patch,
> HADOOP-9629.trunk.5.patch
>
>
> h2. Description
> This JIRA incorporates adding a new file system implementation for accessing
> Windows Azure Storage - Blob from within Hadoop, such as using blobs as input
> to MR jobs or configuring MR jobs to put their output directly into blob
> storage.
> h2. High level design
> At a high level, the code here extends the FileSystem class to provide an
> implementation for accessing blob storage; the scheme wasb is used for
> accessing it over HTTP, and wasbs for accessing over HTTPS. We use the URI
> scheme: {code}wasb[s]://<container>@<account>/path/to/file{code} to address
> individual blobs. We use the standard Azure Java SDK
> (com.microsoft.windowsazure) to do most of the work.
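As an illustration of the addressing scheme quoted above, the following is a minimal sketch of how a `wasb[s]://<container>@<account>/path/to/file` URI could be split into its parts with the standard `java.net.URI` class. It is not the code from the patch: the class `WasbUriSketch`, its `Parts` holder, and the sample account host are hypothetical names chosen for this example, and it assumes the container and account are valid URI user-info and host components.

```java
import java.net.URI;

public class WasbUriSketch {
    /** Hypothetical holder for the pieces of a wasb[s] URI. */
    static final class Parts {
        final boolean secure;    // true for wasbs (HTTPS), false for wasb (HTTP)
        final String container;  // the part before '@'
        final String account;    // the part after '@'
        final String path;       // blob path, including the leading '/'

        Parts(boolean secure, String container, String account, String path) {
            this.secure = secure;
            this.container = container;
            this.account = account;
            this.path = path;
        }
    }

    /** Splits wasb[s]://<container>@<account>/path into its components. */
    static Parts parse(String text) {
        URI uri = URI.create(text);
        String scheme = uri.getScheme();
        if (!"wasb".equals(scheme) && !"wasbs".equals(scheme)) {
            throw new IllegalArgumentException("Not a wasb[s] URI: " + text);
        }
        // java.net.URI exposes "<container>@<account>" as user-info and host.
        String container = uri.getUserInfo();
        String account = uri.getHost();
        if (container == null || account == null) {
            throw new IllegalArgumentException("Expected <container>@<account>: " + text);
        }
        return new Parts("wasbs".equals(scheme), container, account, uri.getPath());
    }

    public static void main(String[] args) {
        // The account host below is an illustrative placeholder.
        Parts p = parse("wasbs://data@myaccount.blob.core.windows.net/path/to/file");
        System.out.println("secure=" + p.secure + " container=" + p.container
                + " account=" + p.account + " path=" + p.path);
    }
}
```

The sketch rejects any scheme other than `wasb` or `wasbs` and any URI that lacks the `<container>@<account>` authority, rather than guessing at defaults.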
> In order to map a hierarchical file system over the flat name-value pair
> nature of blob storage, we create a specially tagged blob named path/to/dir
> whenever we create a directory called path/to/dir; files under it are then
> stored as normal blobs named path/to/dir/file. We have implemented many
> metrics for it using the Metrics2 interface. Tests are implemented mostly
> using a mock implementation of the Azure SDK functionality, with an option to
> test against real blob storage if configured (instructions are provided in
> README.txt).
> h2. Credits and history
> This has been ongoing work for a while, and the early version of this work
> can be seen in HADOOP-8079. This JIRA is a significant revision of that, and
> we'll post the patch here for Hadoop trunk first, then post a patch for
> branch-1 as well for backporting the functionality if accepted. Credit for
> this work goes to the early team: [~minwei], [~davidlao], [~lengningliu] and
> [~stojanovic] as well as multiple people who have taken over this work since
> then (hope I don't forget anyone): [~dexterb], Johannes Klein, [~ivanmi],
> Michael Rys, [~mostafae], [~brian_swan], [~mikelid], [~xifang], and
> [~chuanliu].
> h2. Test
> Besides unit tests, we have used WASB as the default file system in our
> service product. (HDFS is also used, but not as the default file system.)
> Various customer and test workloads have been run against clusters with
> such configurations for quite some time. The current version reflects the
> version of the code tested and used in our production environment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)