Hmm, HNG seems designed for Yahoo!'s own circumstances.

On Fri, Jul 1, 2011 at 12:47 PM, Matei Zaharia <[email protected]> wrote:
> Ted brought up some superficial differences, but if you want to understand 
> technical differences, there are a bunch of those as well. Mesos and Hadoop 
> next-gen have similar goals (more efficient resource sharing for data 
> centers), but they are coming at it from different angles -- HNG is currently 
> mainly focusing on MapReduce and aims to support other types of applications 
> too, while Mesos was meant to support a very diverse set of applications, 
> including long-running services and batch jobs (rather than only multiple 
> instances of MapReduce), and is in fact being used for that already. More 
> importantly, HNG is really two pieces -- a refactoring of MapReduce to allow 
> one instance of MR per application, and a resource manager called YARN that 
> lets these instances coordinate. We are going to support having the new MR2 
> application masters run on top of Mesos instead of YARN too (and indeed the 
> refactoring is nice because it will enable Hadoop MapReduce to run on other 
> cluster scheduling systems in the future).
>
> In terms of the technical differences, here are some of the main ones 
> currently:
>
> - Mesos is implemented in C++ rather than Java, and has APIs in C++ and 
> Python in addition to Java.
>
> - The resource allocation models are different: HNG has a central scheduler
> that supports data locality constraints, while Mesos provides "resource
> offers" that let applications pick the resources they like according to
> other criteria, in addition to requests/filters that describe which
> resources you want to be offered. Our belief is that resource offers will
> allow Mesos to support a wider range of application scheduling needs, while
> simultaneously making the system more scalable and highly available
> (minimizing the state and work required of the master).
>
> - Mesos can enforce resource isolation through Linux Containers to guard 
> against misbehaving / greedy tasks.
>
> - HNG supports Kerberos authentication for users.
>
> - HNG can run the MR2 version of Hadoop, while Mesos can run Hadoop 0.20, 
> Spark and MPI.
>
> - There are some smaller architectural differences that may matter for some 
> applications, such as communication being based on message-passing in Mesos 
> vs periodic heartbeats in HNG, which allows Mesos to provide lower scheduling 
> latencies (e.g. to still be efficient if your tasks take 100ms each).
>
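
To make the "resource offers" point above concrete, here is a minimal
sketch in Python. It is not the Mesos API: the Offer class, the
LocalityFramework class, and run_offer_round are hypothetical names made
up for illustration, and the node names and sizes are invented. The idea
it tries to show is that the master only hands out offers, and each
framework applies its own criteria (here, a simple data-locality check)
to accept or decline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """Free resources on one node (hypothetical fields, not Mesos types)."""
    node: str
    cpus: int
    mem_mb: int


class LocalityFramework:
    """Toy framework that only accepts offers on nodes holding its data."""

    def __init__(self, name, preferred_nodes, cpus_per_task, mem_per_task):
        self.name = name
        self.preferred_nodes = set(preferred_nodes)
        self.cpus_per_task = cpus_per_task
        self.mem_per_task = mem_per_task

    def consider(self, offer):
        """Return how many tasks to launch on this offer (0 means decline)."""
        if offer.node not in self.preferred_nodes:
            return 0
        return min(offer.cpus // self.cpus_per_task,
                   offer.mem_mb // self.mem_per_task)


def run_offer_round(offers, frameworks):
    """Offer each node's resources to the frameworks in order.

    The master stores no per-task requirements; it only records which
    framework accepted which offer.
    """
    placements = []
    for offer in offers:
        for fw in frameworks:
            tasks = fw.consider(offer)
            if tasks > 0:
                placements.append((fw.name, offer.node, tasks))
                break  # simplification: leftovers are not re-offered
    return placements


offers = [
    Offer("node1", cpus=4, mem_mb=8192),
    Offer("node2", cpus=2, mem_mb=4096),
    Offer("node3", cpus=8, mem_mb=16384),
]
frameworks = [
    LocalityFramework("hadoop", {"node1", "node3"},
                      cpus_per_task=1, mem_per_task=2048),
    LocalityFramework("mpi", {"node2", "node3"},
                      cpus_per_task=2, mem_per_task=1024),
]
placements = run_offer_round(offers, frameworks)
for framework_name, node, tasks in placements:
    print(framework_name, node, tasks)
```

In this toy run the "hadoop" framework declines node2 because its data is
not there, so the offer falls through to "mpi". A central scheduler would
instead need each application to describe such constraints up front.
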
> However, overall, as Ted said, many of these differences will likely go away 
> as both projects add features. What will be interesting is whether some 
> fundamental differences in the target workloads remain, which I think is 
> likely to happen. For example, the main deployment of Mesos is currently to 
> run long-running stream processing services at Twitter, which is something 
> that typical Hadoop environments just don't do and that requires different 
> things from the cluster scheduler. I also believe we're going to see a lot of 
> other cluster scheduling systems besides Mesos and HNG in the future, as 
> people's requirements for these systems grow. There are some very challenging 
> problems in designing a general cluster scheduling system that even the 
> Google folks are still working hard on.
>
> Matei
>
>
>
> On Jun 30, 2011, at 6:26 PM, Edward J. Yoon wrote:
>
>> Thanks for your nice and quick explanation!
>>
>> On Fri, Jul 1, 2011 at 10:21 AM, Ted Dunning <[email protected]> wrote:
>>> Technically speaking, Mesos has a less expressive model for expressing
>>> resource requirements.  The thesis of Mesos is that the negotiation between
>>> application and scheduler can make up for this missing information.  Mesos
>>> was also first to "market", but Hadoop nextGen is catching up fast.  The
>>> MR-279 has code that works, albeit with some issues in production use.  From
>>> all reports, these issues are being resolved quickly as Yahoo's considerable
>>> QA resources come to bear.
>>>
>>> Politically speaking, Mesos has a nearly inactive mailing list which, to
>>> outward appearances, indicates a nearly inactive project.  There is some
>>> evidence that considerable activity is occurring off-list, but this is a
>>> process bug in the Apache model since "if it doesn't happen on the list, it
>>> doesn't happen".
>>>
>>> On the other side, Hadoop nextGen has the Hadoop community pretty much
>>> behind it.  Since HNG has the potential to break down some of the deadlocks
>>> that have plagued the Hadoop community release process, there is
>>> considerable enthusiasm for it.
>>>
>>> Combined, these factors make it much more likely that HNG will be the
>>> dominant force in the Hadoop world.  That is, more likely in my own
>>> estimation.  Others may differ.
>>>
>>>
>>> On Thu, Jun 30, 2011 at 5:16 PM, Edward J. Yoon 
>>> <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm a newbie, and I wonder what the main differences are between
>>>> Hadoop nextGen and Mesos.
>>>>
>>>> Thanks.
>>>> --
>>>> Best Regards, Edward J. Yoon
>>>> @eddieyoon
>>>>
>>>
>>
>>
>>
>> --
>> Best Regards, Edward J. Yoon
>> @eddieyoon
>
>



-- 
Best Regards, Edward J. Yoon
@eddieyoon
