This is great work toward evolving HDFS for cloud storage. Thanks Chris and folks! I participated in the design review meeting and it looks good to me. I will take some time to look at the code closely. +1 on the merge.
Regards,
Kai

------------------------------------------------------------------
From: Iñigo Goiri <elgo...@gmail.com>
Sent: Thursday, December 14, 2017, 09:29
To: Sean Mackrory <mackror...@gmail.com>
Cc: Anu Engineer <aengin...@hortonworks.com>; Virajith Jalaparti <virajit...@gmail.com>; Chris Douglas <cdoug...@apache.org>; hdfs-dev@hadoop.apache.org <hdfs-dev@hadoop.apache.org>; viraj...@apache.org <viraj...@apache.org>; ehi...@apache.org <ehi...@apache.org>; thdem...@apache.org <thdem...@apache.org>
Subject: Re: [VOTE] Merge HDFS-9806 to trunk

+1

I have been reviewing some of the latest patches. I skimmed through the patch in HDFS-9806 and it looks good. In addition, we have ported it to 2.7.1 (with minor differences from what would be merged). It has been running in our test cluster for a couple of months. All the issues we found are already resolved and committed to the feature branch. Since then, we have deployed it to three production clusters and it is working as expected so far.

Thanks for the work, Virajith and Chris; I'd like to see this merged into trunk to make maintenance easier.

On Wed, Dec 13, 2017 at 12:01 PM, Sean Mackrory <mackror...@gmail.com> wrote:

> +1 from me. There are some unrelated errors building the branch right now
> due to annotations in some YARN code, etc. but I was able to generate an fs
> image from an S3 bucket and serve the content through HDFS on a
> pseudo-distributed HDFS node this morning. Seems like a good point for a
> merge.
>
> On Wed, Dec 13, 2017 at 11:55 AM, Anu Engineer <aengin...@hortonworks.com>
> wrote:
>
> > Hi Virajith / Chris / Thomas / Ewan,
> >
> > Thanks for developing this feature and getting to merge state.
> > I would like to vote +1 for this merge. Thanks for all the hard work.
> >
> > Thanks
> > Anu
> >
> >
> > On 12/8/17, 7:11 PM, "Virajith Jalaparti" <virajit...@gmail.com> wrote:
> >
> > Hi,
> >
> > We have tested the HDFS-9806 branch in two settings:
> >
> > (i) 26 node bare-metal cluster, with PROVIDED storage configured to point
> > to another instance of HDFS (containing 468 files, total of ~400GB of
> > data). Half of the Datanodes are configured with only DISK volumes and
> > the other half have both DISK and PROVIDED volumes.
> > (ii) 8 VMs on Azure, with PROVIDED storage configured to point to a WASB
> > account (containing 26,074 files and ~1.3TB of data). All Datanodes are
> > configured with DISK and PROVIDED volumes.
> >
> > (i) was tested using both the text-based alias map (TextFileRegionAliasMap)
> > and the in-memory leveldb-based alias map (InMemoryLevelDBAliasMapClient),
> > while (ii) was tested using the text-based alias map only.
> >
> > Steps followed:
> > (0) Build from apache/HDFS-9806. (Note that for the leveldb-based alias
> > map, the patch posted to HDFS-12912
> > <https://issues.apache.org/jira/browse/HDFS-12912> needs to be applied; we
> > will commit this to apache/HDFS-9806 after review).
> > (1) Generate the FSImage using the image generation tool with the
> > appropriate remote location (hdfs:// in (i) and wasb:// in (ii)).
> > (2) Bring up the HDFS cluster.
> > (3) Verify that the remote namespace is reflected correctly and data on
> > remote store can be accessed. Commands run: ls, copyToLocal, fsck, getrep,
> > setrep, getStoragePolicy
> > (4) Run Sort and Gridmix jobs on the data in the remote location with the
> > input paths pointing to the local HDFS.
> > (5) Increase replication of the PROVIDED files and verify that local
> > (DISK) replicas were created for the PROVIDED replicas, using fsck.
> > (6) Verify that Provided storage capacity is shown correctly on the NN and
> > Datanode Web-UI.
> > (7) Bring down Datanodes, one by one.
> > When all are down, verify NN reports
> > all PROVIDED files as missing. Bringing back up any one Datanode makes all
> > the data available.
> > (8) Restart NN and verify data is still accessible.
> > (9) Verify that writes to local HDFS continue to work.
> > (10) Bring down all Datanodes except one. Start decommissioning the
> > remaining Datanode. Verify that the data in the PROVIDED storage is still
> > accessible.
> >
> > Apart from the above, we ported the changes in HDFS-9806 to branch-2.7 and
> > deployed it on a ~800 node cluster as one of the sub-clusters in a
> > Router-based Federated HDFS of nearly 4000 nodes (with help from Inigo
> > Goiri). We mounted about 1000 files, 650TB of remote data (~2.6 million
> > blocks with 256MB block size) in this cluster using the text-based alias
> > map. We verified that the basic commands (ls, copyToLocal, setrep) work.
> > We also ran Spark jobs against this cluster.
> >
> > -Virajith
> >
> >
> > On Fri, Dec 8, 2017 at 3:44 PM, Chris Douglas <cdoug...@apache.org>
> > wrote:
> >
> > > Discussion thread: https://s.apache.org/kxT1
> > >
> > > We're down to the last few issues and are preparing the branch to
> > > merge to trunk. We'll post merge patches to HDFS-9806 [1]. Minor,
> > > "cleanup" tasks (checkstyle, findbugs, naming, etc.) will be tracked
> > > in HDFS-12712 [2].
> > >
> > > We've tried to ensure that when this feature is disabled, HDFS is
> > > unaffected. For those reviewing this, please look for places where
> > > this might add overheads and we'll address them before the merge. The
> > > site documentation [3] and design doc [4] should be up to date and
> > > sufficient to try this out. Again, please point out where it is
> > > unclear and we can address it.
> > >
> > > This has been a long effort and we're grateful for the support we've
> > > received from the community.
> > > In particular, thanks to Íñigo Goiri,
> > > Andrew Wang, Anu Engineer, Steve Loughran, Sean Mackrory, Lukas
> > > Majercak, Uma Gunuganti, Kai Zheng, Rakesh Radhakrishnan, Sriram Rao,
> > > Lei Xu, Zhe Zhang, Jing Zhao, Bharat Viswanadham, ATM, Chris Nauroth,
> > > Sanjay Radia, Atul Sikaria, and Peng Li for all your input into the
> > > design, testing, and review of this feature.
> > >
> > > The vote will close no earlier than one week from today, 12/15. -C
> > >
> > > [1]: https://issues.apache.org/jira/browse/HDFS-9806
> > > [2]: https://issues.apache.org/jira/browse/HDFS-12712
> > > [3]: https://github.com/apache/hadoop/blob/HDFS-9806/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsProvidedStorage.md
> > > [4]: https://issues.apache.org/jira/secure/attachment/12875791/HDFS-9806-design.002.pdf
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> > > For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org