That makes sense. Thanks for the discussion everyone, let’s stick to this tentative plan of EC for 2.9.
I just updated the Roadmap wiki to reflect the same.

+Vinod

> On Nov 2, 2015, at 4:26 PM, Zheng, Kai <kai.zh...@intel.com> wrote:
>
> Yeah, so for the issues we recently resolved on trunk and are addressing as
> follow-on tasks in Phase I, we would label them with "erasure coding" and
> maybe also set the target version as "2.9" for the convenience?
>
> -----Original Message-----
> From: Jing Zhao [mailto:ji...@apache.org]
> Sent: Tuesday, November 03, 2015 8:04 AM
> To: hdfs-dev@hadoop.apache.org
> Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285
> (erasure coding) branch to trunk]
>
> +1 for the plan about Phase I & II.
>
> BTW, maybe out of the scope of this thread, just want to mention we should
> either move the jira under HDFS-8031 or update the jira component as
> "erasure-coding" when making further improvement or fixing bugs in EC. In
> this way it will be easier for later backporting EC to 2.9.
>
> On Mon, Nov 2, 2015 at 3:48 PM, Vinayakumar B <vinayakumarb.apa...@gmail.com>
> wrote:
>
>> +1 for the idea.
>> On Nov 3, 2015 07:22, "Zheng, Kai" <kai.zh...@intel.com> wrote:
>>
>>> Sounds good to me. When it's determined to include EC in 2.9
>>> release, it may be good to have a rough release date as Zhe asked,
>>> so accordingly the scope of EC can be discussed out. We still have
>>> quite a few of things as Phase I follow-on tasks to do before EC can
>>> be deployed in a production system. Phase II to develop non-striping
>>> EC for cold data would possibly be started after that. We might
>>> consider to include only Phase I and leave Phase II for next release
>>> according to the rough release date.
>>>
>>> Regards,
>>> Kai
>>>
>>> -----Original Message-----
>>> From: Gangumalla, Uma [mailto:uma.ganguma...@intel.com]
>>> Sent: Tuesday, November 03, 2015 5:41 AM
>>> To: hdfs-dev@hadoop.apache.org
>>> Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge
>>> HDFS-7285 (erasure coding) branch to trunk]
>>>
>>> +1 for EC to go into 2.9. Yes, 3.x would be a long way to go when we
>>> plan to have 2.8 and 2.9 releases.
>>>
>>> Regards,
>>> Uma
>>>
>>> On 11/2/15, 11:49 AM, "Vinod Vavilapalli" <vino...@hortonworks.com> wrote:
>>>
>>>> Forking the thread. Started looking at the 2.8 list, various
>>>> features' status and arrived here.
>>>>
>>>> While I understand the pervasive nature of EC and a need for a
>>>> significant bake-in, moving this to a 3.x release is not a good idea.
>>>> We will surely get a 2.8 out this year and, as needed, I can even
>>>> spend time getting started on a 2.9. OTOH, 3.x is a long way off,
>>>> and given all the incompatibilities there, it would be a while
>>>> before users can get their hands on EC if it were to be only on
>>>> 3.x. At best, this may force sites that want EC to backport the
>>>> entire EC feature to older releases, at worst this will repeat
>>>> the mess of 0.20 security release forks.
>>>>
>>>> If we think adding this to 2.8 (even if it is switched off) is too
>>>> much risk per our original plan, let's move this to 2.9, thereby
>>>> leaving enough time for stability, integration testing and bake-in,
>>>> and a realistic chance of having it end up on users' clusters soonish.
>>>>
>>>> +Vinod
>>>>
>>>>> On Oct 19, 2015, at 1:44 PM, Andrew Wang <andrew.w...@cloudera.com>
>>>>> wrote:
>>>>>
>>>>> I think our plan thus far has been to target this for 3.0.
>>>>> I'm okay with putting it in branch-2 if we've given a hard look at
>>>>> compatibility, but I'll note though that 2.8 is already looking
>>>>> like quite a large release, and our release bandwidth has been
>>>>> focused on the 2.6 and 2.7 maintenance releases. Adding another
>>>>> multi-hundred JIRAs to 2.8 might make it too unwieldy to get out
>>>>> the door. If we bump EC past that, 3.0 might very well be our
>>>>> next release vehicle. I do plan to revive the 3.0 schedule some
>>>>> time next year. With EC and JDK8 in a good spot, the only big
>>>>> feature remaining is classpath isolation.
>>>>>
>>>>> EC is also a pretty fundamental change to HDFS. Even if it's
>>>>> compatible, in terms of size and impact it might best belong in a
>>>>> new major release.
>>>>>
>>>>> Best,
>>>>> Andrew
>>>>>
>>>>> On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B
>>>>> <vinayakumarb.apa...@gmail.com> wrote:
>>>>>
>>>>>> Does anyone else also think that the feature is ready to go to
>>>>>> branch-2 as well?
>>>>>>
>>>>>> It's been > 2 weeks since EC landed on trunk. IMO it looks quite
>>>>>> stable since then and ready to go in branch-2.
>>>>>>
>>>>>> -Vinay
>>>>>> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zhezh...@cloudera.com> wrote:
>>>>>>
>>>>>>> Thanks Vinay for capturing the issue and Uma for offering the help.
>>>>>>>
>>>>>>> ---
>>>>>>> Zhe Zhang
>>>>>>>
>>>>>>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma
>>>>>>> <uma.ganguma...@intel.com> wrote:
>>>>>>>
>>>>>>>> Vinay,
>>>>>>>>
>>>>>>>> I would merge them as part of HDFS-9182.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Uma
>>>>>>>>
>>>>>>>> On 10/5/15, 12:48 AM, "Vinayakumar B" <vinayakum...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Andrew,
>>>>>>>>> I see CHANGES.txt entries not yet merged from
>>>>>>>>> CHANGES-HDFS-EC-7285.txt.
>>>>>>>>>
>>>>>>>>> Was this intentional?
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Vinay
>>>>>>>>>
>>>>>>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang
>>>>>>>>> <andrew.w...@cloudera.com> wrote:
>>>>>>>>>
>>>>>>>>>> Branch has been merged to trunk, thanks again to everyone
>>>>>>>>>> who worked on the feature!
>>>>>>>>>>
>>>>>>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang
>>>>>>>>>> <zhezh...@cloudera.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks everyone who has participated in this discussion.
>>>>>>>>>>>
>>>>>>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this
>>>>>>>>>>> vote has passed.
>>>>>>>>>>> I will do a final 'git merge' with trunk and work with
>>>>>>>>>>> Andrew to merge the branch to trunk. I'll update on this
>>>>>>>>>>> thread when the merge is done.
>>>>>>>>>>>
>>>>>>>>>>> ---
>>>>>>>>>>> Zhe Zhang
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A
>>>>>>>>>>> <yi.a....@intel.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> (Change it to binding.)
>>>>>>>>>>>>
>>>>>>>>>>>> +1
>>>>>>>>>>>> I have been involved in the development and code review on
>>>>>>>>>>>> the feature branch. It's a great feature and I think it's
>>>>>>>>>>>> ready to merge it into trunk.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks all for the contribution.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Yi Liu
>>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: Liu, Yi A
>>>>>>>>>>>> Sent: Friday, September 25, 2015 1:51 PM
>>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org
>>>>>>>>>>>> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding)
>>>>>>>>>>>> branch to trunk
>>>>>>>>>>>>
>>>>>>>>>>>> +1 (non-binding)
>>>>>>>>>>>> I have been involved in the development and code review on
>>>>>>>>>>>> the feature branch.
>>>>>>>>>>>> It's a great feature and I think it's ready to merge it
>>>>>>>>>>>> into trunk.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks all for the contribution.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Yi Liu
>>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: Vinayakumar B [mailto:vinayakum...@apache.org]
>>>>>>>>>>>> Sent: Friday, September 25, 2015 12:21 PM
>>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org
>>>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding)
>>>>>>>>>>>> branch to trunk
>>>>>>>>>>>>
>>>>>>>>>>>> +1,
>>>>>>>>>>>>
>>>>>>>>>>>> I've been involved starting from design and development of
>>>>>>>>>>>> ErasureCoding.
>>>>>>>>>>>> I think phase 1 of this development is ready to be merged
>>>>>>>>>>>> to trunk.
>>>>>>>>>>>> It had come a long way to the current state with
>>>>>>>>>>>> significant effort of many Contributors and Reviewers for
>>>>>>>>>>>> both design and code.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks Everyone for the efforts.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Vinay
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao
>>>>>>>>>>>> <ji...@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> +1
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've been involved in both development and review on the
>>>>>>>>>>>>> branch, and I believe it's now ready to get merged into
>>>>>>>>>>>>> trunk. Many thanks to all the contributors and reviewers!
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> -Jing
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai
>>>>>>>>>>>>> <kai.zh...@intel.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Non-binding +1
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> According to our extensive performance tests, striping +
>>>>>>>>>>>>>> ISA-L coder based erasure coding not only can save
>>>>>>>>>>>>>> storage, but also can increase the throughput of a
>>>>>>>>>>>>>> client or a cluster. It will be a great addition to
>>>>>>>>>>>>>> HDFS and its users. Based on the latest branch codes, we
>>>>>>>>>>>>>> also observed it's very reliable in the concurrent
>>>>>>>>>>>>>> tests. We'll provide the perf test report after it's
>>>>>>>>>>>>>> sorted out and hope it helps.
>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>> Kai
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>> From: Gangumalla, Uma [mailto:uma.ganguma...@intel.com]
>>>>>>>>>>>>>> Sent: Wednesday, September 23, 2015 8:50 AM
>>>>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org;
>>>>>>>>>>>>>> common-...@hadoop.apache.org
>>>>>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding)
>>>>>>>>>>>>>> branch to trunk
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Great addition to HDFS. Thanks all contributors for the
>>>>>>>>>>>>>> nice work.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>> Uma
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 9/22/15, 3:40 PM, "Zhe Zhang" <zhezh...@cloudera.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'd like to propose a vote to merge the HDFS-7285
>>>>>>>>>>>>>>> feature branch back to trunk.
>>>>>>>>>>>>>>> Since November 2014 we have been designing and
>>>>>>>>>>>>>>> developing this feature under the umbrella JIRAs
>>>>>>>>>>>>>>> HDFS-7285 and HADOOP-11264, and have committed
>>>>>>>>>>>>>>> approximately 210 patches.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The HDFS-7285 feature branch was created to support the
>>>>>>>>>>>>>>> first phase of HDFS erasure coding (HDFS-EC). The
>>>>>>>>>>>>>>> objective of HDFS-EC is to significantly reduce storage
>>>>>>>>>>>>>>> space usage in HDFS clusters. Instead of always
>>>>>>>>>>>>>>> creating 3 replicas of each block with 200% storage
>>>>>>>>>>>>>>> space overhead, HDFS-EC provides data durability
>>>>>>>>>>>>>>> through parity data blocks.
>>>>>>>>>>>>>>> With most EC configurations, the storage overhead is no
>>>>>>>>>>>>>>> more than 50%.
>>>>>>>>>>>>>>> Based on profiling results of production clusters, we
>>>>>>>>>>>>>>> decided to support EC with the striped block layout in
>>>>>>>>>>>>>>> the first phase, so that small files can be better
>>>>>>>>>>>>>>> handled. This means dividing each logical HDFS file
>>>>>>>>>>>>>>> block into smaller units (striping cells) and spreading
>>>>>>>>>>>>>>> them on a set of DataNodes in round-robin fashion.
>>>>>>>>>>>>>>> Parity cells are generated for each stripe of original
>>>>>>>>>>>>>>> data cells. We have made changes to NameNode, client,
>>>>>>>>>>>>>>> and DataNode to generalize the block concept and handle
>>>>>>>>>>>>>>> the mapping between a logical file block and its
>>>>>>>>>>>>>>> internal storage blocks. For further details please see
>>>>>>>>>>>>>>> the design doc on HDFS-7285.
>>>>>>>>>>>>>>> HADOOP-11264 focuses on providing flexible and
>>>>>>>>>>>>>>> high-performance codec calculation support.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The nightly Jenkins job of the branch has reported
>>>>>>>>>>>>>>> several successful runs, and doesn't show new flaky
>>>>>>>>>>>>>>> tests compared with trunk. We have posted several
>>>>>>>>>>>>>>> versions of the test plan including both unit testing
>>>>>>>>>>>>>>> and cluster testing, and have executed most tests in
>>>>>>>>>>>>>>> the plan. The most basic functionalities have been
>>>>>>>>>>>>>>> extensively tested and verified in several real
>>>>>>>>>>>>>>> clusters with different hardware configurations;
>>>>>>>>>>>>>>> results have been very stable. We have created
>>>>>>>>>>>>>>> follow-on tasks for more advanced error handling and
>>>>>>>>>>>>>>> optimization under the umbrella HDFS-8031.
>>>>>>>>>>>>>>> We also plan to implement or harden the integration of
>>>>>>>>>>>>>>> EC with existing features such as WebHDFS, snapshot,
>>>>>>>>>>>>>>> append, truncate, hflush, hsync, and so forth.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Development of this feature has been a collaboration
>>>>>>>>>>>>>>> across many companies and institutions. I'd like to
>>>>>>>>>>>>>>> thank J. Andreina, Takanobu Asanuma, Vinayakumar B,
>>>>>>>>>>>>>>> Li Bo, Takuya Fukudome, Uma Maheswara Rao G, Rui Li,
>>>>>>>>>>>>>>> Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, Gao Rui,
>>>>>>>>>>>>>>> Kai Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew
>>>>>>>>>>>>>>> Wang, Yong Zhang, Jing Zhao, Hui Zheng and Kai Zheng
>>>>>>>>>>>>>>> for their code contributions and reviews.
>>>>>>>>>>>>>>> Andrew and Kai Zheng also made fundamental
>>>>>>>>>>>>>>> contributions to the initial design. Rui Li, Gao Rui,
>>>>>>>>>>>>>>> Kai Sasaki, Kai Zheng and many other contributors have
>>>>>>>>>>>>>>> made great efforts in system testing. Many thanks go to
>>>>>>>>>>>>>>> Weihua Jiang for proposing the JIRA, and ATM, Todd
>>>>>>>>>>>>>>> Lipcon, Silvius Rus, Suresh, as well as many others for
>>>>>>>>>>>>>>> providing helpful feedback.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Following the community convention, this vote will last
>>>>>>>>>>>>>>> for 7 days (ending September 29th). Votes from Hadoop
>>>>>>>>>>>>>>> committers are binding but non-binding votes are very
>>>>>>>>>>>>>>> welcome as well. And here's my non-binding +1.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>> Zhe Zhang
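
For readers following the numbers in the original vote proposal quoted above, here is a rough sketch of the storage-overhead arithmetic and of round-robin placement of striping cells. It is only an illustration, not the HDFS implementation: the function names are hypothetical, and the 6 data + 3 parity layout and 64 KB cell size are assumptions picked for the example (the proposal itself only says the overhead is no more than 50% with most EC configurations).

```python
from typing import Tuple


def storage_overhead(data_units: int, redundant_units: int) -> float:
    """Extra storage as a fraction of the original data size."""
    return redundant_units / data_units


def cell_location(offset: int, cell_size: int, data_units: int) -> Tuple[int, int, int]:
    """Map a logical byte offset to (stripe index, data unit index, offset in cell).

    Cells are assigned to data units in round-robin order, so consecutive
    cells of the logical block land on consecutive storage blocks.
    """
    cell_index = offset // cell_size
    stripe = cell_index // data_units
    unit = cell_index % data_units
    return stripe, unit, offset % cell_size


# 3-way replication: 1 original copy plus 2 extra copies.
replication_overhead = storage_overhead(1, 2)

# Assumed example layout: 6 data cells plus 3 parity cells per stripe.
ec_overhead = storage_overhead(6, 3)

CELL_SIZE = 64 * 1024  # assumed cell size, for illustration only

if __name__ == "__main__":
    print(f"replication overhead: {replication_overhead:.0%}")
    print(f"EC (6+3) overhead:    {ec_overhead:.0%}")
    print(cell_location(7 * CELL_SIZE + 10, CELL_SIZE, 6))
```

Under these assumptions, the eighth cell of a logical block wraps around to the second data unit of the second stripe, which is the round-robin behaviour the proposal describes.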