[ https://issues.apache.org/jira/browse/HADOOP-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15324175#comment-15324175 ]
Steve Loughran commented on HADOOP-12666:
-----------------------------------------

I've only just caught up on this ... I do have some issues which will need correcting in the contract test suite.

I suspect the size of the patch meant everyone was overwhelmed with the project. This is the kind of thing best done as a feature branch: it can be stabilised before being merged with anything. The reason for doing it in a feature branch is that the feature could have been kept out until the contract tests are working, and all those areas where there is a divergence addressed. It can take a few iterations; doing it in a branch will address this.

Certainly I'd like to see backports to branch-2 holding back (briefly) until those tests are passing, where passing is "don't skip things that aren't working unless it's a fundamental feature of the FS". Some of the tests are being skipped with a "BUG" message. Those test failures are, to me, a sign the FS isn't ready for use yet.

Put differently: FS contract tests should not have been postponed until after this went in. Those tests are the closest we have to verifying compliance with our (reverse-engineered) specification of how a filesystem must work. If the tests don't pass, it's not a filesystem according to the Hadoop specification. Sorry.

At a quick glance at the code, I'm disappointed that a loose "throw IOE" policy has been adopted rather than the strict "throw the same exceptions as HDFS" one. It's not that much harder to type {{throw new EOFException}} in a seek, is it?

Speaking of which: seek() is broken. I'll leave it to the reader to guess why.
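For illustration only, here is a minimal sketch of what the strict exception policy could look like in a seek(). It assumes a hypothetical in-memory stream: {{SketchSeekableStream}} and its fields are invented for this example and are not the ADL client code. It is not a diagnosis of the seek() problem mentioned above; it only shows an out-of-range seek raising {{EOFException}} rather than a generic {{IOException}}. Rejecting a past-end-of-file offset at seek time is an assumption of this sketch; a stream could equally defer that failure to the next read.

```java
import java.io.EOFException;
import java.io.IOException;

/** Hypothetical in-memory stream used only to illustrate the exception policy. */
class SketchSeekableStream {
    private final byte[] data;
    private long pos;
    private boolean closed;

    SketchSeekableStream(byte[] data) {
        this.data = data;
    }

    /** Moves the read position, using specific exception types for bad offsets. */
    void seek(long target) throws IOException {
        if (closed) {
            // A closed stream is not an end-of-file condition, so plain IOException.
            throw new IOException("Stream is closed");
        }
        if (target < 0) {
            throw new EOFException("Cannot seek to a negative offset: " + target);
        }
        if (target > data.length) {
            // Assumption of this sketch: fail at seek time rather than on the next read.
            throw new EOFException("Attempted to seek past the end of the file: " + target);
        }
        pos = target;
    }

    long getPos() {
        return pos;
    }

    /** Returns the next byte as 0..255, or -1 at end of data. */
    int read() throws IOException {
        if (closed) {
            throw new IOException("Stream is closed");
        }
        return pos < data.length ? (data[(int) pos++] & 0xff) : -1;
    }

    void close() {
        closed = true;
    }
}
```

Because {{EOFException}} is a subclass of {{IOException}}, callers that catch {{IOException}} keep working, while tests that expect the more specific type can tell an out-of-range seek from other I/O failures.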
> Support Microsoft Azure Data Lake - as a file system in Hadoop
> --------------------------------------------------------------
>
> Key: HADOOP-12666
> URL: https://issues.apache.org/jira/browse/HADOOP-12666
> Project: Hadoop Common
> Issue Type: New Feature
> Components: fs, fs/azure, tools
> Reporter: Vishwajeet Dusane
> Assignee: Vishwajeet Dusane
> Fix For: 3.0.0-alpha1
>
> Attachments: Create_Read_Hadoop_Adl_Store_Semantics.pdf,
> HADOOP-12666-002.patch, HADOOP-12666-003.patch, HADOOP-12666-004.patch,
> HADOOP-12666-005.patch, HADOOP-12666-006.patch, HADOOP-12666-007.patch,
> HADOOP-12666-008.patch, HADOOP-12666-009.patch, HADOOP-12666-010.patch,
> HADOOP-12666-011.patch, HADOOP-12666-012.patch, HADOOP-12666-013.patch,
> HADOOP-12666-014.patch, HADOOP-12666-015.patch, HADOOP-12666-016.patch,
> HADOOP-12666-1.patch
>
> Original Estimate: 336h
> Time Spent: 336h
> Remaining Estimate: 0h
>
> h2. Description
> This JIRA describes a new file system implementation for accessing Microsoft
> Azure Data Lake Store (ADL) from within Hadoop. This would enable existing
> Hadoop applications such as MR, Hive, HBase, etc. to use the ADL store as
> input or output.
>
> ADL is ultra-high capacity, optimized for massive throughput, with rich
> management and security features. More details are available at
> https://azure.microsoft.com/en-us/services/data-lake-store/