Does anyone know whether the Hadoop NextGen source code is available now? Thanks!

Yizheng

2011/7/1 Matei Zaharia <[email protected]>

> That's a good question. Right now we were planning to provide a wrapper
> that has the same API as the resource manager in HNG, so that not only
> MapReduce but other apps written against that API will work. We'll see if we
> run into any unforeseen problems with that.
>
> Matei
>
> On Jul 1, 2011, at 3:37 AM, Edward J. Yoon wrote:
>
> > Here's another silly question.
> >
> > Does Mesos plan to add support for HNG, or will it support only pure Map/Reduce?
> >
> > On Fri, Jul 1, 2011 at 2:15 PM, Ted Dunning <[email protected]> wrote:
> >> Also, both projects are changing in terms of what they do and what they
> >> intend to do.
> >>
> >> For instance, support for long running processes and alternative
> >> execution models other than map-reduce is an explicit goal for Yarn.
> >>
> >> This illustrates how hard it is for anybody to compare systems.
> >> Typically, any given person knows much more about one system than the
> >> other leading to many comparison points that are only half true (that
> >> half being the one with better information).  This isn't remediable
> >> without collaborative discussion between (differently) informed speakers.
> >>
> >>
> >> On Thu, Jun 30, 2011 at 10:10 PM, Edward J. Yoon <[email protected]> wrote:
> >>
> >>> Understood.
> >>>
> >>> On Fri, Jul 1, 2011 at 1:59 PM, Matei Zaharia <[email protected]> wrote:
> >>>> I wouldn't say it's designed for Yahoo! only, but it's definitely
> >>>> meant to solve issues they saw with large Hadoop clusters (and provides
> >>>> a lot of value for that).
> >>>>
> >>>> Matei
> >>>>
> >>>> On Jul 1, 2011, at 12:51 AM, Edward J. Yoon wrote:
> >>>>
> >>>>> Hmm, HNG seems designed for their (Y!) own circumstance.
> >>>>>
> >>>>> On Fri, Jul 1, 2011 at 12:47 PM, Matei Zaharia <[email protected]> wrote:
> >>>>>> Ted brought up some superficial differences, but if you want to
> >>>>>> understand technical differences, there are a bunch of those as well.
> >>>>>> Mesos and Hadoop next-gen have similar goals (more efficient resource
> >>>>>> sharing for data centers), but they are coming at it from different
> >>>>>> angles -- HNG is currently mainly focusing on MapReduce and aims to
> >>>>>> support other types of applications too, while Mesos was meant to
> >>>>>> support a very diverse set of applications, including long-running
> >>>>>> services and batch jobs (rather than only multiple instances of
> >>>>>> MapReduce), and is in fact being used for that already. More
> >>>>>> importantly, HNG is really two pieces -- a refactoring of MapReduce to
> >>>>>> allow one instance of MR per application, and a resource manager called
> >>>>>> YARN that lets these instances coordinate. We are going to support
> >>>>>> having the new MR2 application masters run on top of Mesos instead of
> >>>>>> YARN too (and indeed the refactoring is nice because it will enable
> >>>>>> Hadoop MapReduce to run on other cluster scheduling systems in the
> >>>>>> future).
> >>>>>>
> >>>>>> In terms of the technical differences, here are some of the main ones
> >>>>>> currently:
> >>>>>>
> >>>>>> - Mesos is implemented in C++ rather than Java, and has APIs in C++ and
> >>>>>>   Python in addition to Java.
> >>>>>>
> >>>>>> - The resource allocation models are different: HNG has a central
> >>>>>>   scheduler that supports data locality constraints, while Mesos
> >>>>>>   provides "resource offers" to let applications pick the resources
> >>>>>>   they like according to other criteria in addition to requests/filters
> >>>>>>   to describe which resources you want to be offered. Our belief is
> >>>>>>   that resource offers will allow Mesos to support a wider range of
> >>>>>>   application scheduling needs, while simultaneously making the system
> >>>>>>   more scalable and highly available (minimizing the state and work
> >>>>>>   required of the master).
> >>>>>>
> >>>>>> - Mesos can enforce resource isolation through Linux Containers to
> >>>>>>   guard against misbehaving / greedy tasks.
> >>>>>>
> >>>>>> - HNG supports Kerberos authentication for users.
> >>>>>>
> >>>>>> - HNG can run the MR2 version of Hadoop, while Mesos can run Hadoop
> >>>>>>   0.20, Spark and MPI.
> >>>>>>
> >>>>>> - There are some smaller architectural differences that may matter for
> >>>>>>   some applications, such as communication being based on
> >>>>>>   message-passing in Mesos vs periodic heartbeats in HNG, which allows
> >>>>>>   Mesos to provide lower scheduling latencies (e.g. to still be
> >>>>>>   efficient if your tasks take 100ms each).
> >>>>>>
> >>>>>> However, overall, as Ted said, many of these differences will likely go
> >>>>>> away as both projects add features. What will be interesting is whether
> >>>>>> some fundamental differences in the target workloads remain, which I
> >>>>>> think is likely to happen. For example, the main deployment of Mesos is
> >>>>>> currently to run long-running stream processing services at Twitter,
> >>>>>> which is something that typical Hadoop environments just don't do and
> >>>>>> that requires different things from the cluster scheduler. I also
> >>>>>> believe we're going to see a lot of other cluster scheduling systems
> >>>>>> besides Mesos and HNG in the future, as people's requirements for these
> >>>>>> systems grow. There are some very challenging problems in designing a
> >>>>>> general cluster scheduling system that even the Google folks are still
> >>>>>> working hard on.
> >>>>>>
> >>>>>> Matei
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Jun 30, 2011, at 6:26 PM, Edward J. Yoon wrote:
> >>>>>>
> >>>>>>> Thanks for your nice and quick explanation!
> >>>>>>>
> >>>>>>> On Fri, Jul 1, 2011 at 10:21 AM, Ted Dunning <[email protected]> wrote:
> >>>>>>>> Technically speaking, Mesos has a less expressive model for expressing
> >>>>>>>> resource requirements.  The thesis of Mesos is that the negotiation
> >>>>>>>> between application and scheduler can make up for this missing
> >>>>>>>> information.  Mesos was also first to "market", but Hadoop nextGen is
> >>>>>>>> catching up fast.  The MR-279 has code that works, albeit with some
> >>>>>>>> issues in production use.  From all reports, these issues are being
> >>>>>>>> resolved quickly as Yahoo's considerable QA resources come to bear.
> >>>>>>>>
> >>>>>>>> Politically speaking, Mesos has a nearly inactive mailing list which,
> >>>>>>>> to outward appearances, indicates a nearly inactive project.  There is
> >>>>>>>> some evidence that considerable activity is occurring off-list, but
> >>>>>>>> this is a process bug in the Apache model since "if it doesn't happen
> >>>>>>>> on the list, it doesn't happen".
> >>>>>>>>
> >>>>>>>> On the other side, Hadoop nextGen has the Hadoop community pretty much
> >>>>>>>> behind it.  Since HNG has the potential to break down some of the
> >>>>>>>> deadlocks that have plagued the Hadoop community release process,
> >>>>>>>> there is considerable enthusiasm for it.
> >>>>>>>>
> >>>>>>>> Combined, these factors make it much more likely that HNG will be the
> >>>>>>>> dominant force in the Hadoop world.  That is, more likely in my own
> >>>>>>>> estimation.  Others may differ.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Thu, Jun 30, 2011 at 5:16 PM, Edward J. Yoon <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> I'm a newbie, and wonder what the main differences between Hadoop
> >>>>>>>>> nextGen and Mesos are.
> >>>>>>>>>
> >>>>>>>>> Thanks.
> >>>>>>>>> --
> >>>>>>>>> Best Regards, Edward J. Yoon
> >>>>>>>>> @eddieyoon
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Best Regards, Edward J. Yoon
> >>>>>>> @eddieyoon
> >>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Best Regards, Edward J. Yoon
> >>>>> @eddieyoon
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Best Regards, Edward J. Yoon
> >>> @eddieyoon
> >>>
> >>
> >
> >
> >
> > --
> > Best Regards, Edward J. Yoon
> > @eddieyoon
>
>
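
For anyone trying to picture the "resource offers" model Matei describes in the quoted message above, here is a rough sketch in Python. It is only an illustration under my own assumptions, not the Mesos API: the names Offer, LocalityFramework, consider and offer_round are all made up for this example. The point it tries to show is that the master just hands out offers of free resources, and each application decides by its own criteria (here, preferred nodes and a minimum task size) whether to accept or decline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """Free resources on one node (hypothetical structure, not the Mesos type)."""
    node: str
    cpus: int
    mem_mb: int


class LocalityFramework:
    """Hypothetical application scheduler that accepts offers only on nodes it prefers."""

    def __init__(self, name, preferred_nodes, cpus_per_task, mem_per_task):
        self.name = name
        self.preferred_nodes = set(preferred_nodes)
        self.cpus_per_task = cpus_per_task
        self.mem_per_task = mem_per_task
        self.launched = []  # nodes on which this framework accepted an offer

    def consider(self, offer):
        """Return True to accept the offer, False to decline it."""
        if offer.node not in self.preferred_nodes:
            return False  # decline: the application's own placement criterion
        if offer.cpus < self.cpus_per_task or offer.mem_mb < self.mem_per_task:
            return False  # decline: offer is too small for one task
        self.launched.append(offer.node)
        return True


def offer_round(offers, frameworks):
    """One allocation round: offer each node to the frameworks in order until one accepts.

    The master keeps no per-application placement logic; it only records who accepted.
    """
    assignments = {}
    for offer in offers:
        for fw in frameworks:
            if fw.consider(offer):
                assignments[offer.node] = fw.name
                break
    return assignments


offers = [Offer("n1", 4, 8192), Offer("n2", 4, 8192), Offer("n3", 1, 512)]
frameworks = [
    LocalityFramework("stream", {"n2"}, cpus_per_task=2, mem_per_task=1024),
    LocalityFramework("batch", {"n1", "n3"}, cpus_per_task=2, mem_per_task=1024),
]
assignments = offer_round(offers, frameworks)
print(assignments)
```

In this toy round the node that is too small for any task is simply left unassigned. A central scheduler of the kind described for HNG would instead take each application's requirements as input and make the placement decision itself.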
