Ian,
On Oct 21, 2010, at 4:50 PM, Ian Holsman wrote:
> but the other question I have which hopefully you guys can answer is does
> the yahoo distribution have ALL the patches from the trunk on it? because if
> it doesn't I think that is problematic as well for other reasons.
Yahoo added security on top of Apache Hadoop 0.20.
Apache Hadoop trunk has diverged a long way from hadoop-0.20. There are
lots of features in trunk that aren't part of yahoo-hadoop-0.20, simply
because there wasn't a need for them or it wasn't worth our effort to
backport them. I know, since I have a big hand in deciding that.
However, we have been very religious about porting all our changes to
trunk, though we might have missed a couple due to time pressure, human
error, etc.
Thus, it isn't feasible for the Yahoo distribution to be a superset of
trunk, all the more so because it takes a *huge* amount of effort to
qualify trunk. We at Yahoo qualified Apache Hadoop 0.20 and have stuck
with it for over a year now, as have Cloudera, Facebook, etc. Again, I'll
point out that we have been very good about porting nearly 4000 internal
commits to trunk throughout this time.
Hope that helps.
thanks,
Arun