Hey all,

This is a very interesting discussion, and I'm keen to see where it goes. I'm 
particularly interested in the point you made, Adam, about each user launching 
their own framework…

To me this seems like a very decent approach. Take, for example, multiple users 
all executing different pipelines of jobs (a mix of Hadoop and Spark) where 
the running time of a single task is quite short, so the opportunities to 
redistribute the resource share among users are frequent. Imagine if each 
user, before invoking their pipelines, launched a Hadoop JobTracker and a Spark 
Master *locally*, each registering itself as a framework with the Mesos cluster 
under that user. They would then launch their tasks through those local 
frameworks, and Mesos would take care of scheduling across users first and then 
across frameworks.
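As a rough sketch of the idea (the real Mesos bindings build a FrameworkInfo protobuf before registering with the master; the dict stand-in and helper name below are purely illustrative), a per-user wrapper script might construct its registration info like this:

```python
# Illustrative stand-in for building a FrameworkInfo before registering a
# per-user framework. The field names mirror the real FrameworkInfo message,
# but this is just a dict, not the actual protobuf.

import getpass

def make_framework_info(name):
    return {
        # Leaving 'user' empty makes Mesos fill in the OS user running the
        # scheduler; setting it explicitly registers the framework under that
        # user, which is what drives the user-level DRF sorting on the master.
        "user": getpass.getuser(),
        "name": "%s (%s)" % (name, getpass.getuser()),
    }

print(make_framework_info("hadoop-jobtracker"))
```

Each user's wrapper would register one such framework per engine (JobTracker, Spark Master) before submitting their pipeline, so the master sees one framework-user per actual user.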

It seems that this would work quite well with what you're describing. I'm 
curious whether anyone else has investigated the above approach, and whether 
they found any issues with the increased network traffic between the client and 
the cluster. If the network prohibits this, you could even launch the per-user 
Hadoop JobTracker and Spark Master instances on the Mesos cluster itself with a 
fairly simple framework (Marathon could probably do that quite well).

Thanks,

Tom.

--
Tom Arnfeld
Developer // DueDil

t...@duedil.com
(+44) 7525940046

On 2 Apr 2014, at 09:06, Adam Bordelon <a...@mesosphere.io> wrote:

> Hi Li,
> 
> I recently got a chance to dig deeper into the resource allocator code and
> want to share what I learned. Mesos already distributes its resource offers
> primarily by 'user' and secondarily by framework. However, the 'user'
> referred to is the user running/registering the framework, which isn't
> really what you're looking for. But I do think Mesos could be adapted to
> solve your use case.
> 
> Here's how it works now:
> 1. When a framework registers, it specifies a user in its FrameworkInfo
> ("framework-user"). If this is set to the empty string, Mesos will set it to
> the current user running the framework scheduler.
> 2. When the master decides which framework to offer its resources to next,
> it first looks at which framework-user has the lowest share, then which
> framework belonging to that user has the lowest share.
> 3. When a framework receives an offer, it decides whether to accept or
> decline the offer, and what task(s) to run with that offer.
> 4. When a task runs, it runs as the user specified in the FrameworkInfo.
> MESOS-1161 will allow the framework to set a different user per task
> ("task-user") in the CommandInfo when launching each task, getting us one
> step closer to what you want. But even once frameworks can run tasks as
> different users, the master allocator still won't know which users want to
> launch tasks before it allocates a resource offer to a framework.
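For illustration, the two-step selection in step 2 can be sketched in Python. This is not Mesos' actual C++ allocator; the cluster totals, names, and data shapes are all made up:

```python
# Sketch of the current allocation order: DRF over framework-users first,
# then over that user's frameworks. Illustrative only.

# Total resources in the cluster (hypothetical numbers).
TOTAL = {"cpus": 100.0, "mem": 1000.0}

def dominant_share(allocated, total=TOTAL):
    """Dominant resource share: the max fraction across resource types."""
    return max(allocated.get(r, 0.0) / total[r] for r in total)

def next_framework(frameworks):
    """frameworks: list of dicts with 'user', 'name', 'allocated'."""
    # Aggregate each framework-user's allocation across their frameworks.
    user_alloc = {}
    for fw in frameworks:
        acc = user_alloc.setdefault(fw["user"], {})
        for r, v in fw["allocated"].items():
            acc[r] = acc.get(r, 0.0) + v
    # Step 1: pick the framework-user with the lowest dominant share.
    user = min(user_alloc, key=lambda u: dominant_share(user_alloc[u]))
    # Step 2: among that user's frameworks, pick the lowest share.
    own = [fw for fw in frameworks if fw["user"] == user]
    return min(own, key=lambda fw: dominant_share(fw["allocated"]))

frameworks = [
    {"user": "alice", "name": "hadoop", "allocated": {"cpus": 40.0, "mem": 100.0}},
    {"user": "alice", "name": "spark",  "allocated": {"cpus": 10.0, "mem": 50.0}},
    {"user": "bob",   "name": "mpi",    "allocated": {"cpus": 20.0, "mem": 300.0}},
]
# alice's dominant share is 50% (cpus), bob's is 30% (mem),
# so bob's framework gets the next offer.
print(next_framework(frameworks)["name"])  # -> mpi
```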
> 
> This is why Mesos is known as a two-level scheduler: first the master
> allocator makes a global decision, then the framework scheduler decides within
> its own realm.
> Relying on the master allocator alone could work fine for you if each user
> launches a new framework instance for each job they want to run, but does
> not work for long-running services that take task requests from multiple
> users.
> Relying on the framework scheduler could work for you if you only care
> about a single framework on the cluster, but won't help you balance users
> between multiple frameworks.
> 
> If I understand correctly, you want to schedule primarily based on the
> resource shares currently allocated to the various users running the tasks,
> regardless of how many different frameworks there are, or what users are
> running the frameworks themselves. Mesos' DRF sorter/allocator approach
> could still be used to provide dominant resource fairness among task-users,
> but it would need a few additional pieces of information. First, frameworks
> would have to update the master allocator with a list of users wanting to
> launch tasks; this could be added to the existing resource request message.
> Secondly, the master would need to track the resources allocated to each
> user's tasks. Then the master could use the existing DRF algorithm to
> select the task-user furthest below his/her fair share, then select which
> of that user's frameworks to give the offer to, and pass the user selection
> to the selected framework along with the resource offer.
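A hypothetical sketch of that extension, in Python. Nothing like this exists in Mesos today: the message shape, names, and tie-breaking rule below are all assumptions made for illustration:

```python
# Hypothetical extension: frameworks report which task-users have pending
# work, the master tracks per-task-user allocation, and the allocator picks
# the task-user with the lowest dominant share before picking a framework.

TOTAL = {"cpus": 100.0, "mem": 1000.0}  # made-up cluster totals

def dominant_share(allocated, total=TOTAL):
    return max(allocated.get(r, 0.0) / total[r] for r in total)

def next_offer(task_user_alloc, pending):
    """
    task_user_alloc: {task_user: {resource: amount}} tracked by the master.
    pending: {framework_name: set of task-users with queued tasks}, as
             reported by frameworks (e.g. in resource request messages).
    Returns (task_user, framework) for the next offer.
    """
    # Only consider task-users that some framework actually wants to run.
    wanted = set().union(*pending.values())
    # Step 1: task-user furthest below fair share (lowest dominant share).
    user = min(wanted, key=lambda u: dominant_share(task_user_alloc.get(u, {})))
    # Step 2: any framework with pending tasks for that user; a fuller
    # sketch would break ties by the frameworks' own shares.
    fw = min(name for name, users in pending.items() if user in users)
    return user, fw

alloc = {"u1": {"cpus": 60.0}, "u2": {"mem": 100.0}}
pending = {"hadoop": {"u1", "u2"}, "spark": {"u1"}}
# u2's dominant share (10%) is below u1's (60%), so u2 is served next.
print(next_offer(alloc, pending))  # -> ('u2', 'hadoop')
```

The offer would then carry the selected task-user along to the framework, which is the "pass the user selection" step described above.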
> 
> In summary, I do believe that Mesos can be adapted to allocate resource
> shares among task-users, but we will need to flesh out the design and
> implement it. Personally, I find the idea fascinating. Please share more
> information about your use case and requirements, and even file a new JIRA
> for this feature if you like.
> 
> Thank you,
> -Adam-
> mesosphere.io
> 
> 
> On Tue, Mar 25, 2014 at 9:40 PM, Li Jin <ice.xell...@gmail.com> wrote:
> 
>> Thanks again for the reply.
>> 
>> I should clarify that I am considering the case where U1-U10 have unlimited
>> requests. I would like to give freed-up resources to the user (among U1-U10)
>> with the minimum dominant resource share, instead of the framework (among
>> A, B, C) with the minimum dominant resource share.
>> 
>> I am curious how other people think about this, and I am open to discussion.
>> 
>> 
>> On Wed, Mar 26, 2014 at 12:12 AM, Chengwei Yang
>> <chengwei.yang...@gmail.com> wrote:
>> 
>>> On Tue, Mar 25, 2014 at 11:45:18PM -0400, Li Jin wrote:
>>>> Thanks for the reply. Still, it's not clear to me how DRF would help in
>>>> this case; let me elaborate:
>>>> 
>>>> Let's say there are 3 frameworks A, B, C, run by users F1, F2, F3, and
>>>> there are 10 users, U1-U10, running tasks through A, B, C.
>>>> 
>>>> Now, using DRF between frameworks with equal weights, I believe the
>>>> resources will be equally distributed among the 3 frameworks. Is it
>>>> possible for Mesos to equally distribute the resources among the 10 users?
>>> 
>>> The simple answer is *NO*. DRF is not an equal partition of resources; in
>>> fact, no one needs an equal partition, since allocation always depends on
>>> the resource requests. Resource allocation is a dynamic process, not a
>>> static partition of all available resources, considering that only a few
>>> users may be running tasks in any given time slot.
>>> 
>>> For example, suppose more than one user has tasks ready to run. The DRF
>>> sorter for users first selects the user with the minimum dominant resource
>>> share; if that user has tasks pending in more than one framework (say a
>>> framework A task, a framework B task, ...), it then selects the framework
>>> with the minimum dominant resource share.
>>> 
>>> BTW, this is just my understanding from the paper; please correct me if I
>>> have made any mistakes.
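That selection loop can be illustrated with the worked example from the DRF paper (a 9-CPU, 18-GB cluster with two users). The sketch below is a minimal simulation of the loop, not Mesos code:

```python
# Runnable illustration of the DRF loop from the Ghodsi et al. paper:
# repeatedly hand the next task to the user with the lowest dominant
# share, as long as that user's per-task demand still fits.

TOTAL = {"cpus": 9.0, "mem": 18.0}  # the paper's example cluster

def dominant_share(alloc):
    return max(alloc[r] / TOTAL[r] for r in TOTAL)

def drf(demands):
    """demands: {user: per-task demand vector}. Returns tasks per user."""
    used = {r: 0.0 for r in TOTAL}
    alloc = {u: {r: 0.0 for r in TOTAL} for u in demands}
    tasks = {u: 0 for u in demands}
    while True:
        # Users whose next task still fits in the remaining resources.
        runnable = [u for u, d in demands.items()
                    if all(used[r] + d[r] <= TOTAL[r] for r in TOTAL)]
        if not runnable:
            return tasks
        u = min(runnable, key=lambda u: dominant_share(alloc[u]))
        for r in TOTAL:
            used[r] += demands[u][r]
            alloc[u][r] += demands[u][r]
        tasks[u] += 1

# User A's tasks need <1 CPU, 4 GB>; user B's need <3 CPUs, 1 GB>.
print(drf({"A": {"cpus": 1.0, "mem": 4.0},
           "B": {"cpus": 3.0, "mem": 1.0}}))  # -> {'A': 3, 'B': 2}
```

Both users end up with a dominant share of 2/3 (A of memory, B of CPU), which is exactly the equilibrium the paper derives for this example.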
>>> 
>>> --
>>> Thanks,
>>> Chengwei
>>> 
>>>> 
>>>> Thanks,
>>>> Li
>>>> 
>>>> 
>>>> On Tue, Mar 25, 2014 at 10:39 PM, Chengwei Yang
>>>> <chengwei.yang...@gmail.com> wrote:
>>>> 
>>>>> On Tue, Mar 25, 2014 at 06:17:11PM -0400, Li Jin wrote:
>>>>>> Dear Devs,
>>>>>> 
>>>>>> We are seriously investigating using Mesos as the backbone of our
>>> compute
>>>>>> infrastructure. One important question I would like to ask is about
>>> fair
>>>>>> sharing.
>>>>>> 
>>>>>> As I understand it, assuming you have 3 frameworks and 100 users using
>>>>>> those frameworks, the current algorithm gives each framework 33%
>>>>>> (assuming the same weight), no matter how many users each framework
>>>>>> has. In our case,
>>>>> 
>>>>> I don't think so. By default, the DRF allocator is used both among users
>>>>> and among each user's frameworks.
>>>>> 
>>>>> See below options of mesos-master.
>>>>> 
>>>>>   --framework_sorter=VALUE        Policy to use for allocating resources
>>>>>                                   between a given user's frameworks.
>>>>>                                   Options are the same as for
>>>>>                                   user_allocator (default: drf)
>>>>> 
>>>>>   --user_sorter=VALUE             Policy to use for allocating resources
>>>>>                                   between users. May be one of:
>>>>>                                   dominant_resource_fairness (drf)
>>>>>                                   (default: drf)
>>>>> 
>>>>> For DRF, please see this paper.
>>>>> http://people.csail.mit.edu/matei/papers/2011/nsdi_drf.pdf
>>>>> 
>>>>> --
>>>>> Thanks,
>>>>> Chengwei
>>>>> 
>>>>>> actually we would like to give each user 1% of the cluster, no matter
>>>>>> which framework they use. The reasons are:
>>>>>> 
>>>>>> (1) It's much easier for us to decide weights between users than weights
>>>>>> between frameworks.
>>>>>> (2) It makes it much easier to add and remove frameworks, since doing so
>>>>>> won't change the distribution of fair shares.
>>>>>> 
>>>>>> In general, I feel frameworks compute on behalf of users, and thus users
>>>>>> should "pay" for the computation.
>>>>>> 
>>>>>> I am wondering if this makes sense and if it is something that could be
>>>>>> supported by Mesos.
>>>>>> 
>>>>>> Thanks,
>>>>>> Li
>>>>> 
>>> 
>> 
