On 3/19/12 11:02 PM, "Konstantin Shvachko" <shv.had...@gmail.com> wrote:

><Doug>
>> to prevent such situations in the future might be that if you backport
>> something from branch n to n-2 then you ought to also be required to
>> backport it to branch n-1 and in general to all intervening branches.
>
>This is imo the most important topic in the discussion.
>I support Doug's proposal because it provides forward-moving evolution
>of the project, with releases driven by the need to introduce new
>features, so that we can avoid the back- and forward-porting overhead
>that exhausts the community's resources.

I believe this is untenable.  You cannot guarantee that Hadoop 11.x will
have all the same features as Hadoop 3.x.  As such, a backport from 11.x to
2.x, made for whatever reason, should not imply porting all the way down
the chain.  One cannot know in advance which of those intermediate versions
are still active or alive.
When a major version number changes, all bets are off.  The release may
completely overhaul an API, or it may not.  Assumptions of linear progress
break down.

Perhaps such a rule makes sense for branches within a major release line,
where one can reasonably maintain an expectation of linear progress.
However, not all major versions will carry that assumption, and the same
issues will apply.

The more difficult you make it for an organization to share its work with
the community (i.e., create a branch), the more likely it is that they will
work on it on the side and not in the community.


>
><Arun>
>> This is against the Apache Hadoop release policy on major releases, i.e.
>> only features deprecated for at least one release can be removed.
>
>I am not sure this is the Apache Hadoop release policy, but we as a PMC
>have been inconsistent, allowing new features to be implemented in old
>releases, namely the 0.20 series, instead of creating new releases with
>those features. This is the reason why security and other good features
>are not in 0.22.
>Feature freeze was broken so many times for the 0.20 branch that it
>became the norm for the entire project rather than the exception it had
>been in the past.
>
>I don't understand this constant discrimination against Hadoop 0.22. It
>is a perfectly usable version of Hadoop, and it would be a waste not to
>release it. I am very glad that universities have adopted it. If somebody
>needs security there are a number of choices, Hadoop-1 being the first.
>But if you cannot afford stand-alone HBase clusters, or need to combine
>general Hadoop and HBase loads, there is nothing else but Hadoop 0.22 at
>this point.
>
>When 0.23 is stable I will be glad to use it. But the steady stream of
>feature ports makes it hard to judge how stable it is and to predict
>when it will be ready.
>I am advocating that we stop porting features and start releasing them.
>If 0.23 is Federation + YARN, then 0.23 + HA is 0.24; add PB and that
>becomes 0.25, and so on.
>
>I thought I should clarify what I mean by forward-moving progress.
>I hope it makes sense.
>
>Thanks,
>--Konstantin
>
>
>On Mon, Mar 19, 2012 at 2:56 PM, Doug Cutting <cutt...@apache.org> wrote:
>> On 03/19/2012 02:47 PM, Arun C Murthy wrote:
>>> This is against the Apache Hadoop release policy on major releases,
>>> i.e. only features deprecated for at least one release can be removed.
>>
>> In many cases the reason this happened was that features were backported
>> from trunk to 0.20 but not to 0.22.  In other words, it's no fault of the
>> folks who were working on branch 0.22.  So a related policy we might add
>> to prevent such situations in the future might be that if you backport
>> something from branch n to n-2 then you ought to also be required to
>> backport it to branch n-1 and in general to all intervening branches.
>> Does that seem sensible?
>>
>> Doug
