Dhruba,

While I do not think that the releasability of a branch should be determined by the market cap (either on NASDAQ or SecondMarket) of the contributing company, I think a well-tested release is beneficial to the community.
So, I support two releases: 0.20.100 now, which has security, and 0.20.200 later, which incorporates appends (depending on the 0.22+appends timeline). That way, a large percentage of the community is covered in 2011. The reasons are these:

1. The proposed 0.20.100 is perhaps the most tested at scale of all the 0.20 branches; in fact, of *all* Hadoop releases in the last 5 years. I know first-hand that it causes the least disruption for users; the migration from 0.20 to 0.20.10x was the smoothest, while adding a valuable feature.

2. HBase (running on Hadoop 0.20 with append) has also been scale-tested at Y!, but on far fewer than 4000 nodes, and certainly not with varied workloads (where the bugs tend to surface). (To my knowledge, the largest HBase instance in production is at Y!.)

3. Operations folks need to get some experience with raw Hadoop first for any release, before layering other products on top of Hadoop and handing the installation over to users. So there is still time for HBase+0.20.100, and that can be addressed in a separate release.

4. It is not as if the community hasn't had a preview of this mega-patch already. A large portion of the sub-patches are already in cdh3bx, and many of them have already been committed one-by-one to 0.22.

- Milind

On Jan 14, 2011, at 11:24 AM, Dhruba Borthakur wrote:

>> 1) I agree this is not a good precedent. We don't support mega-patches in
>> general. We are doing this as part of discontinuing the "Yahoo distribution
>> of Hadoop". We don't plan to continue doing 30-person-year projects outside
>> Apache and then merging them in!!
>
> I think this is a very dangerous precedent and completely unwarranted.
> Mega-patches are bad and are totally not the Apache way to go. I think if you
> want to contribute it back to Apache, you should avoid the mega-patch
> completely.
>> I think the various 0.20-append patch lines may be fine for specialized
>> HBase clusters, but they don't have the rigor behind them to bet your
>> business on them.
>
> I think you are completely off-track here and jumping to conclusions. Big
> businesses are already betting on it. HBase is becoming a big user of Hadoop
> (dunno whether Y! uses HBase), and I completely agree with Ian that all
> businesses have to test their release themselves before using it anyway;
> otherwise you could end up with data loss like the type you mentioned.
>
> thanks,
> dhruba

---
Milind Bhandarkar
mbhandar...@linkedin.com