On Fri, Jun 17, 2011 at 7:30 AM, Brian Bockelman <bbock...@cse.unl.edu> wrote:
>
> Hi Ryan, Eric,
>
> Just looked at those two for the first time in a while.
> - HDFS-918 (now 1323?) doesn't seem like it's too controversial, but does 
> seem like there's a bit of validation left.

Yes, 1323 and also 1148 would be "nice to haves", but neither is ready
to go yet. Though I really want to improve HBase performance, I also
tend to be fairly conservative on how much testing these things should
need before getting checked in (unless they can be completely
pluggable).

The good news is we did get 941 in last week, and that's a real nice
improvement.

> - HDFS-347 has a long, contentious history.  However, it seems that most of 
> the strong objections have been cleared up.  Is there anyone left who objects 
> to it, now that it doesn't appear to bypass security?

It still has a way to go to be pushed over the finish line. I don't
foresee it happening for this release.

> Finally, I see Todd has posted HDFS-2080 claiming some sizable performance 
> improvements.  Would it be possible that could finish in time for release?

HDFS-2080 has very good bang-for-the-buck in the gains-per-complexity
ratio, especially compared to 347. It could also be made completely
pluggable, since it's just a new implementation of BlockReader. So it
might be feasible to include but not enabled by default.
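
As a rough illustration of "included but not enabled by default", the sketch below selects a reader implementation from a config flag that is off unless someone sets it. Everything in it is hypothetical: the Reader interface, both class names, and the config key are invented for this example and are not the real BlockReader API or the HDFS-2080 patch.

```java
import java.util.HashMap;
import java.util.Map;

public class PluggableReaderSketch {
    /** Hypothetical stand-in for the reader abstraction; NOT the real HDFS BlockReader. */
    interface Reader {
        String name();
    }

    /** The existing, well-tested code path. */
    static class DefaultReader implements Reader {
        public String name() { return "default"; }
    }

    /** The new, less-tested implementation that ships disabled. */
    static class ExperimentalReader implements Reader {
        public String name() { return "experimental"; }
    }

    /** Hypothetical config key, made up for this sketch. */
    static final String USE_EXPERIMENTAL_KEY = "sketch.reader.experimental.enabled";

    /** Returns the experimental reader only when the flag is explicitly "true". */
    static Reader newReader(Map<String, String> conf) {
        boolean enabled =
            Boolean.parseBoolean(conf.getOrDefault(USE_EXPERIMENTAL_KEY, "false"));
        return enabled ? new ExperimentalReader() : new DefaultReader();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(newReader(conf).name());   // flag unset: default path

        conf.put(USE_EXPERIMENTAL_KEY, "true");
        System.out.println(newReader(conf).name());   // flag set: experimental path
    }
}
```

With that shape, sites that don't opt in never touch the new code path, which is what would make it lower-risk to ship.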

But, I wouldn't block the 0.23 (or any other) release on including
these things. If they're done and look low-risk at an early enough
date, I'll do my best to convince the RM to include them, but if they
haven't had enough testing, then off to the next release with 'em.

-Todd

>
> On Jun 17, 2011, at 2:36 AM, Ryan Rawson wrote:
>
> > HDFS-918 and HDFS-347 are absolutely critical for random read
> > performance.  The smarter sites are already running HDFS-347 (I guess
> > they aren't running "Hadoop" then?), and soon they will be testing and
> > running HDFS-918 as well.  Opening 1 socket for every read just isn't
> > really scalable.
> >
> > -ryan
> >
> > On Fri, Jun 17, 2011 at 12:17 AM, Eric Baldeschwieler
> > <eri...@yahoo-inc.com> wrote:
> >> Hi Folks,
> >>
> >> I'd like to start a conversation on mainline planning and the next release 
> >> of Apache Hadoop beyond 0.22.
> >>
> >> The Yahoo! Hadoop team has been working hard to complete several big 
> >> Hadoop projects, including:
> >>
> >> - HDFS Federation [HDFS-1052]
> >>  - Already merged into trunk
> >>
> >> - Next Generation Map-Reduce [MR-279]
> >>  - Passing most tests now and discussing merging into trunk
> >>
> >> - The merging of our previous work on Hadoop with security into mainline 
> >> [http://yhoo.it/i9Ww8W]
> >>  - This is mostly done, but Owen and others are doing a scrub to close out 
> >> the remaining issues
> >>
> >> All of these projects are now reaching a place where we would like to 
> >> combine them with the good work already in 0.22 and put out a new Apache 
> >> release, perhaps 0.23.  We think the best way to accomplish that is to 
> >> finish the merge in the next few weeks and then cut a release from trunk.
> >>
> >> Yahoo stands ready to help us (the Apache Hadoop Community) turn this new 
> >> release into a stable release by running it through its 9-month test and 
> >> burn-in process.  The result of that will be another stable release such 
> >> as 0.18, 0.20 or 0.20.203 (Hadoop with security).  We have Yahoo!'s support 
> >> for this substantial investment because this new release will have a great 
> >> combination of new features for small and very large sites alike:
> >>  - New Write Pipeline - HBase support [also in 0.21 & 0.22]
> >>  - Federation - Scale up to larger clusters and the ability to experiment 
> >> with new namenode approaches
> >>  - Next Gen MapReduce - Scaleup, performance improvements, ability to 
> >> experiment with new processing frameworks
> >>
> >> I think this effort will produce a great new Apache Hadoop release for the 
> >> community.  I'm starting this thread to collect feedback and hopefully 
> >> folks' endorsement for merging in MR-279 and putting together this new 
> >> release.  Feedback please?
> >>
> >> Thanks,
> >>
> >> E14
> >>
> >>
>



--
Todd Lipcon
Software Engineer, Cloudera
