On Oct 22, 2010, at 10:36 AM, Konstantin Shvachko wrote:

Milind's point is valid: the PMC cannot demand or control what Yahoo,
Facebook, et al. run in production, or what Cloudera sells to its customers, AS LONG AS it is within the Apache licensing requirements.

What Apache Hadoop can and should provide is a *steady* stream of base
A-releases.

I think the single fact that we failed to release Hadoop 0.21 late last
year got us into the state we are in now, as it let different Hadoop
installations diverge drastically from each other, whether for
production or commercial reasons.

Now that we are in that state, it would not be feasible or worthwhile to find the common denominator based on the old 0.20 version, unless we want to spend another year looking for it and letting the individual installations diverge even
more in the process.

So the question imo is not "how we merge the cloudera and yahoo
distributions", but when/how do we make the new 0.22 release.
And how do we provide a steady release cycle after that.

+1

sanjay



--Konstantin

On Thu, Oct 21, 2010 at 9:29 PM, Milind A Bhandarkar
<mili...@yahoo-inc.com> wrote:


right.. the trunk is not for production use. I wasn't suggesting that.

So, what are you suggesting? That the Yahoo distribution of Hadoop should
*not* be the version we run on our production clusters?


but the trunk is what will eventually become the next release.


Then someone at Yahoo will have to decide whether to rebase their production cluster to 0.21, or just continue back-porting
what they need to the version they are running on their clusters.

Yes, that is what we do now. If there are committed patches in trunk that do not scale for our needs, or break existing applications, or are deemed not worth the effort needed to backport, we do not include them in our
deployments, and therefore do not include them in the Yahoo distribution.


and if Yahoo fixes a bug in their version, it would need to be
forward-ported over to the current trunk, which will get harder and
harder as the paths diverge.

Yes, indeed. So, care must be taken that the paths do not diverge too much. I have seen some cases where the bug fixes did not need to be forward-ported,
because that piece of code was completely rewritten in trunk.


I'm sure you've seen it happen on other projects when a major branch
lands on the trunk, and the amount of effort it takes to reconcile them.

Yes. And that results in delayed releases. An unexpected benefit for
application developers was that they could spend time adding features to
their applications, rather than porting the same applications from
release to release, and validating releases. So, it's not always bad.

- Milind


--
Milind Bhandarkar
(mailto:mili...@yahoo-inc.com)
(phone: 408-203-5213 W)



