Re: Mesos 0.27.0 release update

2016-01-22 Thread Shuai Lin
All tests passed on Ubuntu 14.04 with kernel 3.13 and Docker Engine 1.9.1.

On Sat, Jan 23, 2016 at 3:11 AM, Timothy Chen  wrote:

> Hi all,
>
> We (Kapil, MPark, and I) still have 3 blocker issues outstanding
> at this moment:
>
> MESOS-4449: SegFault on agent during executor startup (shepherd: Joris)
> MESOS-4441: Do not allocate non-revocable resources beyond quota
> guarantee. (shepherd: Joris)
> MESOS-4410: Introduce protobuf for quota set request. (shepherd: Joris)
>
> The remaining major tickets are ContainerLogger-related and should be
> committed today, according to Ben.
>
> We've started to test the latest master and will be looking at the test
> failures to see what needs to be addressed.
>
> I encourage everyone to test the latest master on your platform if
> possible to catch issues early, and once the Blocker issues are
> resolved we'll be sending an RC to test and vote.
>
> Thanks,
>
> Tim
>


Re: question about DRF and task allocation to slaves

2016-01-22 Thread Qian Zhang
>
> My question is - if the slaves to execute the tasks are decided by the
> scheduler, how does the DRF get used by the master in balancing the slaves?
> Does it not get used for choosing the right slaves for the tasks? Relevant
> pointers to the code are also welcome.
>

I think the DRF allocator (hierarchical.cpp) is mainly used to balance the
fair share of roles/frameworks rather than slaves, and it does not get used
for choosing the right slaves for tasks (since it has no idea about tasks);
that is the responsibility of the framework scheduler. Instead, the allocator
is responsible for sending slaves to the framework scheduler as resource
offers, and that process can be influenced by the framework scheduler as
well, e.g., by setting a filter to refuse a specific slave for a specific
duration:
https://github.com/apache/mesos/blob/0.26.0/src/master/allocator/mesos/hierarchical.cpp#L761:L806
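
As a minimal sketch of that filtering mechanism (assuming the C++ scheduler
API from mesos/scheduler.hpp, with a hypothetical helper called from a
framework scheduler's resourceOffers() callback; a Java framework such as
Fenzo would do the equivalent via SchedulerDriver.declineOffer):

#include <mesos/mesos.hpp>
#include <mesos/scheduler.hpp>

// Hypothetical helper: decline an offer with a refusal filter so the
// allocator skips this slave for a while.
void declineSlaveForFiveMinutes(
    mesos::SchedulerDriver* driver,
    const mesos::Offer& offer)
{
  mesos::Filters filters;
  // Ask the allocator not to re-offer this slave's resources to our
  // framework for the next 300 seconds (the filter handling lives around
  // the hierarchical.cpp lines linked above).
  filters.set_refuse_seconds(300);
  driver->declineOffer(offer.id(), filters);
}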


question about DRF and task allocation to slaves

2016-01-22 Thread Mehrotra, Ashish
Hi Devs,

I have started looking at the Mesos core code and came upon the following
observations. I have seen a few dots but haven’t been able to connect them.
Please let me know if you can help.

As we know, the DRF (Dominant Resource Fairness) algorithm is used in Mesos to
allocate tasks to slaves so that the dominant share is equalized (or as equal
as possible). An example can be seen in the paper
(https://www.cs.berkeley.edu/~alig/papers/drf.pdf). Thus, it looks like the
Mesos master has control over allocating tasks to slaves (by internally
maintaining the status of all tasks for the various frameworks, slaves, and
their offers).
On the other hand, there are schedulers (I am using Netflix Fenzo) whose job is
to figure out the slaves most suitable for the tasks, and the API exposed
through the Mesos Java API and used by Fenzo is:
MesosSchedulerDriver.launchTasks(Collection<OfferID> offerIds,
Collection<TaskInfo> tasks);

The code that I have looked through in Mesos (master.cpp, hierarchical.cpp,
the DRF sorter, etc.) shows that on various events (methods in master.cpp)
like addFramework(), activateFramework(), addSlave(), etc., the DRF method
calculateShare() gets called in sorter.cpp. But when tasks are submitted and
the launchTasks() method is called on the master (passing the map of tasks
assigned to slaves), the slaves which have been calculated by the scheduler
and are present in the map of tasks and offers get used directly.
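
For reference, my understanding of the dominant-share computation behind
calculateShare() is roughly the following (a hypothetical standalone sketch,
not the actual DRFSorter internals): a framework's share is the maximum, over
all resource kinds, of its allocation divided by the cluster total, and the
allocator offers resources to whoever has the lowest dominant share first.

#include <algorithm>
#include <map>
#include <string>

// Sketch: dominant share of one framework/role. For example, an allocation of
// {"cpus": 2, "mem": 1024} against totals {"cpus": 10, "mem": 4096} yields
// max(0.2, 0.25) = 0.25, i.e. memory is the dominant resource.
double calculateShare(
    const std::map<std::string, double>& allocation,
    const std::map<std::string, double>& total)
{
  double share = 0.0;
  for (const auto& entry : allocation) {
    const auto it = total.find(entry.first);
    if (it != total.end() && it->second > 0.0) {
      share = std::max(share, entry.second / it->second);
    }
  }
  return share;
}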

My question is - if the slaves to execute the tasks are decided by the 
scheduler, how does the DRF get used by the master in balancing the slaves? 
Does it not get used for choosing the right slaves for the tasks? Relevant 
pointers to the code are also welcome.

Thanks,
Ashish



Re: [Discussion] MESOS-4442: `allocated` may have more resources than `total` in allocator

2016-01-22 Thread Klaus Ma
@team, any comments?


Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer
Platform OpenSource Technology, STG, IBM GCG
+86-10-8245 4084 | klaus1982...@gmail.com | http://k82.me

On Thu, Jan 21, 2016 at 9:31 PM, Klaus Ma  wrote:

> Yes, *total*: cpus(*):2 vs. *allocated*: cpus(*):2;cpus(*){REV}:2
>
> 
> Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer
> Platform OpenSource Technology, STG, IBM GCG
> +86-10-8245 4084 | klaus1982...@gmail.com | http://k82.me
>
> On Thu, Jan 21, 2016 at 5:43 PM, Qian Zhang  wrote:
>
>> In the log you posted, it seems total cpus is also 2 rather than 1, but it
>> seems there are 4 allocated cpus (2 non-revocable and 2 revocable)?
>>
>> I0121 17:08:09.303431 4284416 hierarchical.cpp:528] Slave
>> f2d8b550-ed52-44a4-a35a-1fff81d41391-S0 (9.181.90.153) updated with
>> oversubscribed resources  (total: cpus(*):2; mem(*):1024; disk(*):1024;
>> ports(*):[31000-32000], allocated: cpus(*):2; mem(*):1024; disk(*):1024;
>> ports(*):[31000-32000]; *cpus(*){REV}:2*)
>>
>>
>> Thanks,
>> Qian Zhang
>>
>> On Thu, Jan 21, 2016 at 5:25 PM, Klaus Ma  wrote:
>>
>> > Hi team,
>> >
>> > While double-checking the feature interaction between Optimistic Offer
>> > Phase 1 & Oversubscription, I found an issue where `allocated` may have
>> > more resources than `total` in the allocator when Oversubscription is
>> > enabled. I'd like to get your input on whether this is designed
>> > behaviour, although the impact is low: 1) the allocator will not offer
>> > this delta of resources, and 2) the QoS Controller will correct it later
>> > by killing the executor. Personally, I'd like to keep this assumption in
>> > the allocator: slave.total always contains slave.allocated.
>> >
>> > Here are the steps:
>> >
>> > T1: in the cluster, cpus=2: one is revocable and the other is
>> > non-revocable
>> > T2: framework1 gets an offer of cpus=2 and launches a task, but the
>> > estimator reports empty resources before the executor is launched
>> > T3: slave.total is updated to cpus=1 in
>> > HierarchicalAllocatorProcess::updateSlave
>> > T4: in allocate(), slave.total (cpus=1) < slave.allocated (cpus=2)
>> >
>> > Here's the log I got:
>> >
>> > I0121 17:08:09.303431 4284416 hierarchical.cpp:528] Slave
>> > f2d8b550-ed52-44a4-a35a-1fff81d41391-S0 (9.181.90.153) updated with
>> > oversubscribed resources  (total: cpus(*):2; mem(*):1024; disk(*):1024;
>> > ports(*):[31000-32000], allocated: cpus(*):2; mem(*):1024; disk(*):1024;
>> > ports(*):[31000-32000]; *cpus(*){REV}:2*)
>> >
>> > Please refer to MESOS-4442 for more detail.
>> >
>> > 
>> > Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer
>> > Platform OpenSource Technology, STG, IBM GCG
>> > +86-10-8245 4084 | klaus1982...@gmail.com | http://k82.me
>> >
>>
>
>
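
For illustration, a minimal standalone sketch of the broken invariant across
T1-T4, assuming the Mesos C++ Resources API (Resources::parse() and
Resources::contains()); this is not the allocator's actual bookkeeping code:

#include <iostream>

#include <mesos/resources.hpp>

using mesos::Resources;

int main()
{
  // T1/T2: the slave starts with cpus=2, all of which get allocated.
  Resources total = Resources::parse("cpus:2;mem:1024").get();
  Resources allocated = total;
  std::cout << total.contains(allocated) << std::endl;  // 1: invariant holds.

  // T3: the estimator reports empty revocable resources, so updateSlave()
  // effectively shrinks the slave's total to cpus=1.
  total = Resources::parse("cpus:1;mem:1024").get();

  // T4: allocated (cpus=2) is no longer contained in total (cpus=1), i.e.
  // slave.total no longer contains slave.allocated.
  std::cout << total.contains(allocated) << std::endl;  // 0: invariant broken.

  return 0;
}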


Mesos 0.27.0 release update

2016-01-22 Thread Timothy Chen
Hi all,

We (Kapil, MPark, and I) still have 3 blocker issues outstanding
at this moment:

MESOS-4449: SegFault on agent during executor startup (shepherd: Joris)
MESOS-4441: Do not allocate non-revocable resources beyond quota
guarantee. (shepherd: Joris)
MESOS-4410: Introduce protobuf for quota set request. (shepherd: Joris)

The remaining major tickets are ContainerLogger-related and should be
committed today, according to Ben.

We've started to test the latest master and will be looking at the test
failures to see what needs to be addressed.

I encourage everyone to test the latest master on your platform if
possible to catch issues early, and once the Blocker issues are
resolved we'll be sending an RC to test and vote.

Thanks,

Tim


Core affinity in Mesos

2016-01-22 Thread Nielsen, Niklas
Hi everyone,

We have been talking about core affinity in Mesos for a while, and Ian D. has 
recently been giving this topic thought in his ‘exclusive resources’ proposal 
[1].
Without it, latency-critical workloads are at risk; at the same time, we want
to avoid overly conservative placements.
We are interested in the topic through our work on oversubscription in Serenity
[2], as the whole point of oversubscription was to be able to colocate
latency-critical and best-effort batch jobs.
We had an informal meeting yesterday, going over the proposal and trying to get 
some cadence behind the capability.

It is a tricky but exciting topic:
 - How do we avoid making task launch even more complex? How do we express the
topology and acquire parts of it? Do we use hints on the affinity properties
instead?
 - How do we mix pinned with normal ‘floating’ tasks? (See the pinning sketch
below.)
 - How do we convey information to the resource estimator about the task
sensitivity?

Note: the above list is not meant for inline discussion or answers. Let’s
collect feedback on the proposals themselves.
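
For concreteness, a minimal Linux-only sketch of what "pinned" means in the
second bullet above, assuming sched_setaffinity(2); a "floating" task is
simply one that never narrows its affinity mask:

#include <sched.h>

#include <cstdio>

int main()
{
  cpu_set_t mask;
  CPU_ZERO(&mask);
  CPU_SET(0, &mask);  // restrict the task to core 0

  // pid 0 == the calling thread; an isolator would target the executor's pid.
  if (sched_setaffinity(0, sizeof(mask), &mask) != 0) {
    std::perror("sched_setaffinity");
    return 1;
  }

  std::printf("pinned to core 0\n");
  return 0;
}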

Here are our proposed next steps:
 - We are going to use the ‘Isolation Working Group’ as an umbrella for this. I 
will fill in details and members.
 - We will schedule an online meeting in the Wednesday 9 AM PST slot next week
to discuss next steps. I will share a Hangout link when we get closer.
 - The plan is to converge on one or more designs we agree on, and then scope
out and distribute the work that needs to be done.

Whoever is interested, join us. The use cases for this work are critical.
Maybe we can even work on some representative workloads we can verify our 
proposal against.

Cheers,
Niklas

PS: For comments on the proposal itself, please refer to Ian’s thread on the
dev list [3].

[1] https://issues.apache.org/jira/browse/MESOS-4138
[2] https://github.com/mesosphere/serenity
[3] https://www.mail-archive.com/dev%40mesos.apache.org/msg33892.html