Forgot to update this thread. I branched off 2.8 last week. So, we can now go ahead and do a merge of HDFS-7285 into branch-2 (version 2.9) like we discussed before.
Thanks
+Vinod

> On Nov 3, 2015, at 4:40 PM, Vinod Kumar Vavilapalli <vino...@hortonworks.com> wrote:
>
> That makes sense.
>
> Thanks for the discussion everyone, let’s stick to this tentative plan of EC for 2.9.
>
> I just updated the Roadmap wiki to reflect the same.
>
> +Vinod
>
>
>> On Nov 2, 2015, at 4:26 PM, Zheng, Kai <kai.zh...@intel.com> wrote:
>>
>> Yeah, so for the issues we recently resolved on trunk and are addressing as follow-on tasks in Phase I, we would label them with "erasure coding" and maybe also set the target version as "2.9" for convenience?
>>
>> -----Original Message-----
>> From: Jing Zhao [mailto:ji...@apache.org]
>> Sent: Tuesday, November 03, 2015 8:04 AM
>> To: hdfs-dev@hadoop.apache.org
>> Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]
>>
>> +1 for the plan about Phase I & II.
>>
>> BTW, maybe out of the scope of this thread, just want to mention we should either move the jira under HDFS-8031 or update the jira component as "erasure-coding" when making further improvement or fixing bugs in EC. In this way it will be easier for later backporting EC to 2.9.
>>
>> On Mon, Nov 2, 2015 at 3:48 PM, Vinayakumar B <vinayakumarb.apa...@gmail.com> wrote:
>>
>>> +1 for the idea.
>>> On Nov 3, 2015 07:22, "Zheng, Kai" <kai.zh...@intel.com> wrote:
>>>
>>>> Sounds good to me. When it's determined to include EC in the 2.9 release, it may be good to have a rough release date as Zhe asked, so the scope of EC can be discussed accordingly. We still have quite a few things as Phase I follow-on tasks to do before EC can be deployed in a production system. Phase II to develop non-striping EC for cold data would possibly be started after that. We might consider including only Phase I and leaving Phase II for the next release according to the rough release date.
>>>>
>>>> Regards,
>>>> Kai
>>>>
>>>> -----Original Message-----
>>>> From: Gangumalla, Uma [mailto:uma.ganguma...@intel.com]
>>>> Sent: Tuesday, November 03, 2015 5:41 AM
>>>> To: hdfs-dev@hadoop.apache.org
>>>> Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]
>>>>
>>>> +1 for EC to go into 2.9. Yes, 3.x would be a long way to go when we plan to have 2.8 and 2.9 releases.
>>>>
>>>> Regards,
>>>> Uma
>>>>
>>>> On 11/2/15, 11:49 AM, "Vinod Vavilapalli" <vino...@hortonworks.com> wrote:
>>>>
>>>>> Forking the thread. Started looking at the 2.8 list, various features' status and arrived here.
>>>>>
>>>>> While I understand the pervasive nature of EC and a need for a significant bake-in, moving this to a 3.x release is not a good idea. We will surely get a 2.8 out this year and, as needed, I can even spend time getting started on a 2.9. OTOH, 3.x is a long way off, and given all the incompatibilities there, it would be a while before users can get their hands on EC if it were to be only on 3.x. At best, this may force sites that want EC to backport the entire EC feature to older releases; at worst, this will repeat the mess of the 0.20 security release forks.
>>>>>
>>>>> If we think adding this to 2.8 (even if it is switched off) is too much risk per our original plan, let's move this to 2.9, thereby leaving enough time for stability, integration testing and bake-in, and a realistic chance of having it end up on users' clusters soonish.
>>>>>
>>>>> +Vinod
>>>>>
>>>>>> On Oct 19, 2015, at 1:44 PM, Andrew Wang <andrew.w...@cloudera.com> wrote:
>>>>>>
>>>>>> I think our plan thus far has been to target this for 3.0.
>>>>>> I'm okay with putting it in branch-2 if we've given a hard look at compatibility, but I'll note though that 2.8 is already looking like quite a large release, and our release bandwidth has been focused on the 2.6 and 2.7 maintenance releases. Adding another multi-hundred JIRAs to 2.8 might make it too unwieldy to get out the door. If we bump EC past that, 3.0 might very well be our next release vehicle. I do plan to revive the 3.0 schedule some time next year. With EC and JDK8 in a good spot, the only big feature remaining is classpath isolation.
>>>>>>
>>>>>> EC is also a pretty fundamental change to HDFS. Even if it's compatible, in terms of size and impact it might best belong in a new major release.
>>>>>>
>>>>>> Best,
>>>>>> Andrew
>>>>>>
>>>>>> On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B <vinayakumarb.apa...@gmail.com> wrote:
>>>>>>
>>>>>>> Does anyone else also think that the feature is ready to go to branch-2 as well?
>>>>>>>
>>>>>>> It's been > 2 weeks since EC landed on trunk. IMO it looks quite stable since then and ready to go in branch-2.
>>>>>>>
>>>>>>> -Vinay
>>>>>>> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zhezh...@cloudera.com> wrote:
>>>>>>>
>>>>>>>> Thanks Vinay for capturing the issue and Uma for offering the help.
>>>>>>>>
>>>>>>>> ---
>>>>>>>> Zhe Zhang
>>>>>>>>
>>>>>>>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma <uma.ganguma...@intel.com> wrote:
>>>>>>>>
>>>>>>>>> Vinay,
>>>>>>>>>
>>>>>>>>> I would merge them as part of HDFS-9182.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Uma
>>>>>>>>>
>>>>>>>>> On 10/5/15, 12:48 AM, "Vinayakumar B" <vinayakum...@apache.org> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Andrew,
>>>>>>>>>> I see CHANGES.txt entries not yet merged from CHANGES-HDFS-EC-7285.txt.
>>>>>>>>>>
>>>>>>>>>> Was this intentional?
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Vinay
>>>>>>>>>>
>>>>>>>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang <andrew.w...@cloudera.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Branch has been merged to trunk, thanks again to everyone who worked on the feature!
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang <zhezh...@cloudera.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thanks everyone who has participated in this discussion.
>>>>>>>>>>>>
>>>>>>>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this vote has passed. I will do a final 'git merge' with trunk and work with Andrew to merge the branch to trunk. I'll update on this thread when the merge is done.
>>>>>>>>>>>>
>>>>>>>>>>>> ---
>>>>>>>>>>>> Zhe Zhang
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A <yi.a....@intel.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> (Change it to binding.)
>>>>>>>>>>>>>
>>>>>>>>>>>>> +1
>>>>>>>>>>>>> I have been involved in the development and code review on the feature branch. It's a great feature and I think it's ready to merge it into trunk.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks all for the contribution.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Yi Liu
>>>>>>>>>>>>>
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: Liu, Yi A
>>>>>>>>>>>>> Sent: Friday, September 25, 2015 1:51 PM
>>>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org
>>>>>>>>>>>>> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
>>>>>>>>>>>>>
>>>>>>>>>>>>> +1 (non-binding)
>>>>>>>>>>>>> I have been involved in the development and code review on the feature branch.
>>>>>>>>>>>>> It's a great feature and I think it's ready to merge it into trunk.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks all for the contribution.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Yi Liu
>>>>>>>>>>>>>
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: Vinayakumar B [mailto:vinayakum...@apache.org]
>>>>>>>>>>>>> Sent: Friday, September 25, 2015 12:21 PM
>>>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org
>>>>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
>>>>>>>>>>>>>
>>>>>>>>>>>>> +1,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've been involved starting from design and development of ErasureCoding. I think phase 1 of this development is ready to be merged to trunk. It had come a long way to the current state with significant effort of many Contributors and Reviewers for both design and code.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks Everyone for the efforts.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Vinay
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao <ji...@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've been involved in both development and review on the branch, and I believe it's now ready to get merged into trunk. Many thanks to all the contributors and reviewers!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> -Jing
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <kai.zh...@intel.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Non-binding +1
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> According to our extensive performance tests, striping + ISA-L coder based erasure coding not only can save storage, but also can increase the throughput of a client or a cluster. It will be a great addition to HDFS and its users. Based on the latest branch codes, we also observed it's very reliable in the concurrent tests. We'll provide the perf test report after it's sorted out and hope it helps.
>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>> Kai
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>> From: Gangumalla, Uma [mailto:uma.ganguma...@intel.com]
>>>>>>>>>>>>>>> Sent: Wednesday, September 23, 2015 8:50 AM
>>>>>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org; common-...@hadoop.apache.org
>>>>>>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Great addition to HDFS. Thanks all contributors for the nice work.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>> Uma
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 9/22/15, 3:40 PM, "Zhe Zhang" <zhezh...@cloudera.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'd like to propose a vote to merge the HDFS-7285 feature branch back to trunk.
>>>>>>>>>>>>>>>> Since November 2014 we have been designing and developing this feature under the umbrella JIRAs HDFS-7285 and HADOOP-11264, and have committed approximately 210 patches.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The HDFS-7285 feature branch was created to support the first phase of HDFS erasure coding (HDFS-EC). The objective of HDFS-EC is to significantly reduce storage space usage in HDFS clusters. Instead of always creating 3 replicas of each block with 200% storage space overhead, HDFS-EC provides data durability through parity data blocks. With most EC configurations, the storage overhead is no more than 50%. Based on profiling results of production clusters, we decided to support EC with the striped block layout in the first phase, so that small files can be better handled. This means dividing each logical HDFS file block into smaller units (striping cells) and spreading them on a set of DataNodes in round-robin fashion. Parity cells are generated for each stripe of original data cells. We have made changes to NameNode, client, and DataNode to generalize the block concept and handle the mapping between a logical file block and its internal storage blocks. For further details please see the design doc on HDFS-7285.
>>>>>>>>>>>>>>>> HADOOP-11264 focuses on providing flexible and high-performance codec calculation support.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The nightly Jenkins job of the branch has reported several successful runs, and doesn't show new flaky tests compared with trunk. We have posted several versions of the test plan including both unit testing and cluster testing, and have executed most tests in the plan. The most basic functionalities have been extensively tested and verified in several real clusters with different hardware configurations; results have been very stable. We have created follow-on tasks for more advanced error handling and optimization under the umbrella HDFS-8031. We also plan to implement or harden the integration of EC with existing features such as WebHDFS, snapshot, append, truncate, hflush, hsync, and so forth.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Development of this feature has been a collaboration across many companies and institutions. I'd like to thank J. Andreina, Takanobu Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma Maheswara Rao G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, Gao Rui, Kai Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, Yong Zhang, Jing Zhao, Hui Zheng and Kai Zheng for their code contributions and reviews.
>>>>>>>>>>>>>>>> Andrew and Kai Zheng also made fundamental contributions to the initial design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and many other contributors have made great efforts in system testing. Many thanks go to Weihua Jiang for proposing the JIRA, and ATM, Todd Lipcon, Silvius Rus, Suresh, as well as many others for providing helpful feedback.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Following the community convention, this vote will last for 7 days (ending September 29th). Votes from Hadoop committers are binding but non-binding votes are very welcome as well. And here's my non-binding +1.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>> Zhe Zhang