John, I believe that you are 100% correct. Theoretically we should run MRv2
on Mesos, but the current implementation of MRv2 on YARN seems very complex
and difficult to decouple from the resource manager/negotiator.

It's still something that could be done, I guess, but maybe as a completely
independent Hadoop-compatible map reduce framework for Mesos. You could
write this from scratch as a custom framework inspired by the MRv2 app
master implementation.
On Jul 27, 2014 7:00 PM, "John Omernik" <j...@omernik.com> wrote:

> So excuse my naivety in this space, but my ignorance has never really
> stopped me from asking questions:
>
> I see YARN (Yet Another Resource Negotiator) as very similar to Mesos,
> i.e. something to manage resources on a cluster of machines. So when I hear
> talk of running "YARN" on Mesos it seems very redundant indeed, and I ask
> myself, what are we actually getting out of this setup?
>
> So, going to the Map Reduce question, I see Map Reduce V1 and Map Reduce
> V2 like this:  Map Reduce V2 is an application that runs on YARN. I.e. if
> you run a job, it creates an application master, that application master
> requests resources, and the job gets run.  It differs from Map Reduce V1 in
> that there is no long-running Job Tracker (other than the YARN Resource Manager,
> but that is managing resources for all applications, not just Map Reduce
> Applications).  Ok, so Mesos, why can't there be a Mesos Application that
> is similar to a Map Reduce V2 Application in YARN?  Why do we need to run
> YARN on Mesos? That doesn't really make sense.  Basically, for M/R V2 vs
> M/R V1, the only difference is that to mimic M/R V1 we need task trackers and
> job trackers running as Mesos applications (which we have).  So in M/R v2,
> we just need the equivalent of an application master running on Yarn,
> requesting resources across the cluster.
>
> Fundamentally, YARN is confusing because I think they coupled running Map
> Reduce jobs with the resource manager and called it "Hadoop v2".  By
> coupling the two, people look at YARN as Map Reduce V2, but it's not
> really.  It's a way to run jobs on a cluster of machines (a la Mesos)
> with an "application" that is the equivalent of Map Reduce V1.   The names
> being given seem confusing to me; they make people who have invested
> in Hadoop (Map Reduce V1) very interested in YARN because it's called
> "Hadoop V2", while Mesos is seen as the "Other".
>
>
> Just for my sake I summarized a TL;DR form, so if someone wants to correct
> my understanding they can:
>
> Mesos = Tool to manage resources
>
> YARN = Tool to manage resources; it's also called Hadoopv2
>
> Map Reduce V1 = Job Trackers/Task Trackers; it's what we know. It can run
> on Hadoop clusters, and Mesos.  It's also called Hadoopv1
>
> Map Reduce V2 =  Application that can run on YARN that mimics Map Reduce
> V1 on a YARN Cluster. This + YARN has been called Hadoopv2.
>
> On Sun, Jul 27, 2014 at 4:10 AM, Maxime Brugidou <
> maxime.brugi...@gmail.com> wrote:
>
>> When I said that running yarn over mesos did not make sense I meant that
>> running a resource manager in a resource manager was very sub-optimal. You
>> will eventually do static allocation of resources for the Yarn framework in
>> Mesos or have complex logic to determine how much resource should be given
>> to yarn. You will also have the same burden of managing 2 different
>> clusters instead of one, even if yarn is sort of hidden as a mesos framework.
>>
>> However yes I believe it's easier to run yarn on mesos than to run mrv2 on
>> top of mesos. The solution I was discussing was obviously "ideal" and I
>> have since looked at the MRAppMaster and it discouraged me :)
>> On Jul 27, 2014 12:41 AM, "Rick Richardson" <rick.richard...@gmail.com>
>> wrote:
>>
>>> FWIW I also think the fastest approach here is porting Yarn onto
>>> Mesos.
>>>
>>> In a perfect world, writing an implementation layer for the Yarn
>>> Interface on Mesos would certainly be the optimal approach, but looking at
>>> the MRv2 code, it is very tightly coupled to many Yarn modules.
>>>
>>> If someone wanted to take on the project of making a generic resource
>>> scheduler Interface for MRv2, that would be amazing :)
>>> On Jul 26, 2014 6:19 PM, "Jie Yu" <yujie....@gmail.com> wrote:
>>>
>>>> I am interested in investigating the idea of YARN on top of Mesos. One
>>>> of the benefits I can think of is that we can get rid of the static
>>>> resource allocation between YARN and Mesos clusters. In that way, Mesos can
>>>> allocate those resources that are not used by YARN to other Mesos
>>>> frameworks like Aurora, Marathon, etc, to increase the resource utilization
>>>> of the entire data center. Also, we could avoid running each MRv2 job as a
>>>> framework which I think might cause some maintenance complexity (e.g. for
>>>> framework rate limiting, etc). Finally, YARN currently does not have good
>>>> isolation support. It only supports CPU isolation right now (using
>>>> cgroups). By porting YARN on top of Mesos, we might be able to leverage the
>>>> existing Mesos containerizer strategy to provide better isolation between
>>>> tasks. Maxime, I am curious why you think it does not make sense to run
>>>> YARN over Mesos? Since I am not super familiar with YARN, I might be missing
>>>> something.
>>>>
>>>> I have been thinking of making ResourceManager in YARN a Mesos
>>>> framework and making NodeManager a Mesos executor. The NodeManager will
>>>> launch containers using primitives provided by Mesos so that we have a
>>>> consistent containerizer layer. I haven't fully figured out how this could
>>>> be done yet (e.g., nested containers, communication between NodeManager and
>>>> ResourceManager, etc.), but I would love to explore this direction. I would
>>>> like to hear about any feedback/suggestions you guys have about this
>>>> direction.
>>>>
>>>> Thanks,
>>>> - Jie
>>>>
>>>>
>>>> On Fri, Jul 25, 2014 at 1:39 PM, Maxime Brugidou <
>>>> maxime.brugi...@gmail.com> wrote:
>>>>
>>>>> We run both mesos and yarn in prod and it does not make sense to run
>>>>> yarn over mesos.
>>>>>
>>>>> However it would be interesting to find a way to run MRv2 jobs on
>>>>> mesos with some custom layer to swap yarn with mesos. Not sure how to
>>>>> start though... MRv2 contains a yarn application master that needs to be
>>>>> rewritten as a mesos framework scheduler. This is probably doable. However
>>>>> with MRv2 every map reduce job would be mapped as a new framework in
>>>>> Mesos. Not sure how many frameworks mesos can run and scale up to.
>>>>> Especially short lived frameworks.
>>>>> On Jul 25, 2014 8:54 PM, "Tom Arnfeld" <t...@duedil.com> wrote:
>>>>>
>>>>>> Hey Luyi,
>>>>>>
>>>>>> That's correct, the Hadoop framework currently only supports Hadoop 2
>>>>>> MRv1. It also doesn't have great support for the HA jobtracker available
>>>>>> in newer versions of Hadoop, but I've been working on that the past few
>>>>>> weeks.
>>>>>>
>>>>>> I'm not sure how Hadoop 2 would play with Mesos, but very interested
>>>>>> to find out more. Am I correct in thinking MRv2 will only run on top of
>>>>>> YARN?
>>>>>>
>>>>>> I wonder if anyone else on the mailing list is running YARN on top of
>>>>>> Mesos...
>>>>>>
>>>>>> Tom.
>>>>>>
>>>>>> On Friday, 25 July 2014, Luyi Wang <wangluyi1...@gmail.com> wrote:
>>>>>>
>>>>>>> Checked the mesos github (https://github.com/mesos/hadoop). It
>>>>>>> listed support for MapReduce V1.
>>>>>>>
>>>>>>> How about the MR V2?
>>>>>>>
>>>>>>> Right now we are using Cloudera to manage Hadoop clusters, which use
>>>>>>> MRv2. We are planning to migrate all our services to Mesos (still in the
>>>>>>> initial investigation stage).  Good suggestions, advice and experiences
>>>>>>> are welcome.
>>>>>>>
>>>>>>> Thanks a lot!
>>>>>>>
>>>>>>>
>>>>>>> -Luyi.
