On Aug 30, 2012, at 3:12 AM, Konstantin Shvachko wrote:

> 2. From technical (not community) viewpoint your "svn copy" is an ugly
> approach,
> as it creates a lot of code duplication and will result in a
> maintenance nightmare or / and
> will require many man-months to fix. My point is that you cannot
> neglect "technical issues" when you solve community problems.

Agreed, Konstantin. I don't think Chris was being serious here - it was merely
*one* way forward.

There are, easily, better ways to solve this.

The big cross-project dependencies are IPC/RPC, Security and Metrics2. Others
include the network topology APIs. They need to be marked Public/Stable. We
need to maintain compatibility across a major (stable) release anyway. This is
true for every other Public/Stable API.

So, *technically*, the requirements are:
a) Ensure projects only use Public/Stable apis.
b) Maintain compatibility for Public/Stable apis within a major release.
c) Clearly key components like IPC, Metrics2, Security etc. *should* be marked
stable by the time the ersatz hadoop-2 codebase is declared 'stable'.
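
To make (a) concrete, here is a rough sketch of the kind of check a downstream
project could run. It is only an illustration: the nested annotations are local
stand-ins for our audience/stability markers, and the class names (MetricsSink,
RpcClient, InternalScheduler) and the ApiAudit helper are hypothetical, not
real Hadoop code.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.ArrayList;
import java.util.List;

public class ApiAudit {
    // Assumption: local stand-ins for the audience/stability markers,
    // declared here only so the sketch compiles on its own.
    @Retention(RetentionPolicy.RUNTIME) @interface Public {}
    @Retention(RetentionPolicy.RUNTIME) @interface Private {}
    @Retention(RetentionPolicy.RUNTIME) @interface Stable {}
    @Retention(RetentionPolicy.RUNTIME) @interface Evolving {}

    // Hypothetical classes a downstream project might depend on.
    @Public @Stable static class MetricsSink {}
    @Public @Evolving static class RpcClient {}
    @Private @Stable static class InternalScheduler {}

    // Requirement (a): a class is safe to depend on only if it is
    // marked both Public and Stable.
    static boolean isPublicStable(Class<?> c) {
        return c.isAnnotationPresent(Public.class)
            && c.isAnnotationPresent(Stable.class);
    }

    // Returns the simple names of the classes that break requirement (a).
    static List<String> violations(Class<?>... used) {
        List<String> out = new ArrayList<>();
        for (Class<?> c : used) {
            if (!isPublicStable(c)) {
                out.add(c.getSimpleName());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(
            violations(MetricsSink.class, RpcClient.class, InternalScheduler.class));
    }
}
```

In this sketch RpcClient is flagged because it is not yet Stable, which is
the gap (c) is about, and InternalScheduler is flagged because it is not
Public.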

None of these seem like the fashionably *scary* technical issues some people 
are using to justify blocking the way forward.

And, no, YARN/MR aren't the only downstream projects in this mix - HBase,
for example, uses hadoop metrics2 and our security APIs. We need to support
compatibility for HBase anyway. There are several other projects in the same
boat. Pig/Hive need FileSystem, Security & MR APIs. This is just the *reality*
of being at the bottom of the stack.

Yes, there is work left - but that work is something we need to do with or 
without the split.

Furthermore, yes, the previous split/unsplit was painful. However, beyond that, 
we have made progress across several dimensions which should make this one 
smoother:
a) Mavenization has helped a *lot*.
b) Unlike the previous attempt, HDFS2 & YARN (vs. HDFS1 & MR1) no longer share
the same run-time scripts etc.
c) We have been fairly good at following through on our stability/visibility 
guarantees on APIs.

As a result, I don't buy the *this is technically impossible* argument.

As Konstantin suggested, we could spend the next few weeks/months preparing. 
Even after the split we would be in the alpha/beta stage, where we can recover
from mistakes at the cost of a few extra HDFS alpha/beta releases for the sake
of the MR/YARN projects. That seems like an acceptable cost, given that there
are several volunteers to RM releases.

Last but not least, the previous split failed because the overall community did
not invest in ensuring its success. That is clearly *not* the case this time
around. I'm very confident of that.

Arun
