Re: [Performance Isolation] Meeting on Thursday May 12 5pm PST

2016-05-05 Thread Niklas Nielsen
Alrighty - 3rd time is a charm. Let's try to aim for next Thursday (May 12)
at 5pm. We are still aiming for a late afternoon slot to accommodate Asian
time zones.
As we have folks dialing in from China, we have previously experienced
problems with Hangouts. So let's try to meet at
https://meet.intel.com/niklas.nielsen/Q3C5MHZ3 instead.

The agenda is the same as last:
 - Talk about progress on performance tiers and ways of marking expected quality
of service for tasks
 - Plan for core isolation taken above
 - Discuss how cache partitioning ties into core isolation
 - If time allows, talk about additional CFS settings for our cpu isolators

I will update the Apache Mesos calendar accordingly.

Thanks!
Niklas



On Thu, Apr 28, 2016 at 12:16 PM, Kevin Klues <klue...@gmail.com> wrote:

> I won't be able to call today.  I am on vacation until the 9th.
>
> On Fri, Apr 29, 2016 at 2:30 AM, Niklas Nielsen <n...@qni.dk> wrote:
> > Ian, Kevin: Would you be able to dial in this afternoon instead?
> >
> > On Mon, Apr 25, 2016 at 9:00 AM, Niklas Nielsen <n...@qni.dk> wrote:
> >
> >> Hi all,
> >>
> >> I apologize! this was on me: I got stuck in traffic and couldn't dial in
> >> (and accept the external calls).
> >> Do you have time to reschedule for the same time slot? That be Thursday
> >> 5pm to 6pm.
> >>
> >> Niklas
> >>
> >>
> >> On Fri, Apr 22, 2016 at 10:06 AM, Ian Downes
> <idow...@twitter.com.invalid>
> >> wrote:
> >>
> >>> Likewise, I tried calling but no one was hosting the meeting.
> >>>
> >>> On Fri, Apr 22, 2016 at 10:04 AM, Kevin Klues <klue...@gmail.com>
> wrote:
> >>>
> >>> > I tried calling into this last night, but no one was there.  Was it
> >>> > post-poned again?
> >>> >
> >>> > On Mon, Apr 18, 2016 at 12:03 PM, Niklas Nielsen <n...@qni.dk> wrote:
> >>> > > Hi everyone,
> >>> > >
> >>> > > Per our conversation about Intel CAT enablement in Mesos, we are
> >>> > scheduling
> >>> > > a Performance Isolation Working Group meeting at Thursday April 21
> >>> 2016
> >>> > > 5pm-6pm PST.
> >>> > >
> >>> > > Feel free to suggest agenda topics here:
> >>> > >
> >>> >
> >>>
> https://docs.google.com/document/d/11mlGPZSABItP47J6VX-zB0fAK6Qr1mCIOI7nhlATMqk/edit#
> >>> > >
> >>> > > Also, let's use this hangout to meet:
> >>> > >
> >>> >
> >>>
> https://hangouts.google.com/hangouts/_/intel.com/mesos-performance-isolation?hl=en=1
> >>> > >
> >>> > > Cheers,
> >>> > > Niklas
> >>> >
> >>> >
> >>> >
> >>> > --
> >>> > ~Kevin
> >>> >
> >>>
> >>
> >>
> >>
> >> --
> >> Niklas
> >>
> >
> >
> >
> > --
> > Niklas
>
>
>
> --
> ~Kevin
>



-- 
Niklas


Re: [Performance Isolation] Meeting on Thursday April 21 2016 5pm PST

2016-04-28 Thread Niklas Nielsen
We will use the same hangout link:
https://hangouts.google.com/hangouts/_/intel.com/isolation?hl=en=1



On Thu, Apr 28, 2016 at 7:30 AM, Niklas Nielsen <n...@qni.dk> wrote:

> Ian, Kevin: Would you be able to dial in this afternoon instead?
>
> On Mon, Apr 25, 2016 at 9:00 AM, Niklas Nielsen <n...@qni.dk> wrote:
>
>> Hi all,
>>
>> I apologize! this was on me: I got stuck in traffic and couldn't dial in
>> (and accept the external calls).
>> Do you have time to reschedule for the same time slot? That be Thursday
>> 5pm to 6pm.
>>
>> Niklas
>>
>>
>> On Fri, Apr 22, 2016 at 10:06 AM, Ian Downes <idow...@twitter.com.invalid
>> > wrote:
>>
>>> Likewise, I tried calling but no one was hosting the meeting.
>>>
>>> On Fri, Apr 22, 2016 at 10:04 AM, Kevin Klues <klue...@gmail.com> wrote:
>>>
>>> > I tried calling into this last night, but no one was there.  Was it
>>> > post-poned again?
>>> >
>>> > On Mon, Apr 18, 2016 at 12:03 PM, Niklas Nielsen <n...@qni.dk> wrote:
>>> > > Hi everyone,
>>> > >
>>> > > Per our conversation about Intel CAT enablement in Mesos, we are
>>> > scheduling
>>> > > a Performance Isolation Working Group meeting at Thursday April 21
>>> 2016
>>> > > 5pm-6pm PST.
>>> > >
>>> > > Feel free to suggest agenda topics here:
>>> > >
>>> >
>>> https://docs.google.com/document/d/11mlGPZSABItP47J6VX-zB0fAK6Qr1mCIOI7nhlATMqk/edit#
>>> > >
>>> > > Also, let's use this hangout to meet:
>>> > >
>>> >
>>> https://hangouts.google.com/hangouts/_/intel.com/mesos-performance-isolation?hl=en=1
>>> > >
>>> > > Cheers,
>>> > > Niklas
>>> >
>>> >
>>> >
>>> > --
>>> > ~Kevin
>>> >
>>>
>>
>>
>>
>> --
>> Niklas
>>
>
>
>
> --
> Niklas
>



-- 
Niklas


Re: [Performance Isolation] Meeting on Thursday April 21 2016 5pm PST

2016-04-28 Thread Niklas Nielsen
Ian, Kevin: Would you be able to dial in this afternoon instead?

On Mon, Apr 25, 2016 at 9:00 AM, Niklas Nielsen <n...@qni.dk> wrote:

> Hi all,
>
> I apologize! this was on me: I got stuck in traffic and couldn't dial in
> (and accept the external calls).
> Do you have time to reschedule for the same time slot? That be Thursday
> 5pm to 6pm.
>
> Niklas
>
>
> On Fri, Apr 22, 2016 at 10:06 AM, Ian Downes <idow...@twitter.com.invalid>
> wrote:
>
>> Likewise, I tried calling but no one was hosting the meeting.
>>
>> On Fri, Apr 22, 2016 at 10:04 AM, Kevin Klues <klue...@gmail.com> wrote:
>>
>> > I tried calling into this last night, but no one was there.  Was it
>> > post-poned again?
>> >
>> > On Mon, Apr 18, 2016 at 12:03 PM, Niklas Nielsen <n...@qni.dk> wrote:
>> > > Hi everyone,
>> > >
>> > > Per our conversation about Intel CAT enablement in Mesos, we are
>> > scheduling
>> > > a Performance Isolation Working Group meeting at Thursday April 21
>> 2016
>> > > 5pm-6pm PST.
>> > >
>> > > Feel free to suggest agenda topics here:
>> > >
>> >
>> https://docs.google.com/document/d/11mlGPZSABItP47J6VX-zB0fAK6Qr1mCIOI7nhlATMqk/edit#
>> > >
>> > > Also, let's use this hangout to meet:
>> > >
>> >
>> https://hangouts.google.com/hangouts/_/intel.com/mesos-performance-isolation?hl=en=1
>> > >
>> > > Cheers,
>> > > Niklas
>> >
>> >
>> >
>> > --
>> > ~Kevin
>> >
>>
>
>
>
> --
> Niklas
>



-- 
Niklas


Re: [Performance Isolation] Meeting on Thursday April 21 2016 5pm PST

2016-04-25 Thread Niklas Nielsen
Hi all,

I apologize! This was on me: I got stuck in traffic and couldn't dial in
(and accept the external calls).
Do you have time to reschedule for the same time slot? That would be Thursday
5pm to 6pm.

Niklas


On Fri, Apr 22, 2016 at 10:06 AM, Ian Downes <idow...@twitter.com.invalid>
wrote:

> Likewise, I tried calling but no one was hosting the meeting.
>
> On Fri, Apr 22, 2016 at 10:04 AM, Kevin Klues <klue...@gmail.com> wrote:
>
> > I tried calling into this last night, but no one was there.  Was it
> > post-poned again?
> >
> > On Mon, Apr 18, 2016 at 12:03 PM, Niklas Nielsen <n...@qni.dk> wrote:
> > > Hi everyone,
> > >
> > > Per our conversation about Intel CAT enablement in Mesos, we are
> > scheduling
> > > a Performance Isolation Working Group meeting at Thursday April 21 2016
> > > 5pm-6pm PST.
> > >
> > > Feel free to suggest agenda topics here:
> > >
> >
> https://docs.google.com/document/d/11mlGPZSABItP47J6VX-zB0fAK6Qr1mCIOI7nhlATMqk/edit#
> > >
> > > Also, let's use this hangout to meet:
> > >
> >
> https://hangouts.google.com/hangouts/_/intel.com/mesos-performance-isolation?hl=en=1
> > >
> > > Cheers,
> > > Niklas
> >
> >
> >
> > --
> > ~Kevin
> >
>



-- 
Niklas


[Performance Isolation] Meeting on Thursday April 21 2016 5pm PST

2016-04-18 Thread Niklas Nielsen
Hi everyone,

Per our conversation about Intel CAT enablement in Mesos, we are scheduling
a Performance Isolation Working Group meeting on Thursday, April 21, 2016,
5pm-6pm PST.

Feel free to suggest agenda topics here:
https://docs.google.com/document/d/11mlGPZSABItP47J6VX-zB0fAK6Qr1mCIOI7nhlATMqk/edit#

Also, let's use this hangout to meet:
https://hangouts.google.com/hangouts/_/intel.com/mesos-performance-isolation?hl=en=1

Cheers,
Niklas


Re: [Isolation][Containerization] - Add Intel Cache Allocation Technology(CAT) Isolator Support

2016-04-18 Thread Niklas Nielsen
Hi everyone,

Moving off this thread but wanted to bring up that Ian can't join today.
Proposing a new time in another thread.

Niklas

On Mon, Apr 11, 2016 at 10:53 AM, Niklas Nielsen <n...@qni.dk> wrote:

> Suggesting Monday April 18 5pm PST - I will send out a hangout link when
> we get closer. In the mean time, feel free to suggest agenda topics here:
> https://docs.google.com/document/d/11mlGPZSABItP47J6VX-zB0fAK6Qr1mCIOI7nhlATMqk/edit#
>
> Niklas
>
> On Sun, Apr 10, 2016 at 8:27 PM, Du, Fan <fan...@intel.com> wrote:
>
>> 5:00 PM PST time is 8:00 AM on my timezone(UTC + 08:00)
>> I think time slot after 5:00 PM PST will be ok for me.
>>
>> Thanks Nik!
>>
>>
>> On 2016/4/9 6:11, Niklas Nielsen wrote:
>>
>>> Alright - with that in mind. Will morning, midday or evening PST be
>>> preferred compared to your time zone?
>>>
>>
>
>
> --
> Niklas
>



-- 
Niklas


Re: [Isolation][Containerization] - Add Intel Cache Allocation Technology(CAT) Isolator Support

2016-04-11 Thread Niklas Nielsen
Suggesting Monday April 18 5pm PST - I will send out a hangout link when we
get closer. In the meantime, feel free to suggest agenda topics here:
https://docs.google.com/document/d/11mlGPZSABItP47J6VX-zB0fAK6Qr1mCIOI7nhlATMqk/edit#

Niklas

On Sun, Apr 10, 2016 at 8:27 PM, Du, Fan <fan...@intel.com> wrote:

> 5:00 PM PST time is 8:00 AM on my timezone(UTC + 08:00)
> I think time slot after 5:00 PM PST will be ok for me.
>
> Thanks Nik!
>
>
> On 2016/4/9 6:11, Niklas Nielsen wrote:
>
>> Alright - with that in mind. Will morning, midday or evening PST be
>> preferred compared to your time zone?
>>
>


-- 
Niklas


Re: [Isolation][Containerization] - Add Intel Cache Allocation Technology(CAT) Isolator Support

2016-04-06 Thread Niklas Nielsen
On Tue, Apr 5, 2016 at 7:51 PM, Du, Fan <fan...@intel.com> wrote:

>
> Thanks for the heads-up, Kevin and Niklas!
>
> Exposing LLC as a resource needs special modifications to the current
> resource management and offer behavior.
> Here are my early thoughts:
> 1) The 'cpu' resource is essentially a cpu share resource, while LLC is a
> per-processor resource.
>   This will require:
>   1a): Resource offers for cpu and LLC have to be NUMA node aware.
>       For an Agent with two NUMA nodes and 2x40 logical cpu cores, suppose
>       LLC has 20 subsets.
>       The Master will make two resource offers:
>       Offer1: cpu 40; LLC 20 with NUMA 1
>       Offer2: cpu 40; LLC 20 with NUMA 2
>
>       From a high-level point of view, all the RDT-related features
>       require Mesos to be aware of the hardware topology when managing
>       resources, e.g. Memory Bandwidth will also be one type of resource.
>       Anyway, it’s a long-term goal to make this happen eventually.
>

We have been discussing this at length: whether or not to expose NUMA
topology to the user, and to what degree we automate by having the user
express intent instead of explicit NUMA settings.
Let's discuss it during the performance isolation meeting. Ian has been
preparing the latest proposal on core pinning.
We would have to have something like that in place before aiming to enable
CAT, IMO. Don't get me wrong, we definitely should have CAT in mind when we
design core isolation.
The thing I would like to avoid is conflicting abstractions and duplicate
effort, when the two isolation mechanisms are so tightly bound.
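
For concreteness, the per-NUMA-node offers in the quoted proposal above (two
offers for a 2x40-core agent with 20 LLC subsets each) could be sketched as
follows. This is only an illustration of that example, not Mesos code: the
`NumaOffer` type and the `offers_for_agent` helper are hypothetical names, and
whether NUMA topology should be exposed in offers at all is exactly the open
question in this thread.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NumaOffer:
    """Hypothetical offer shape: cpus and LLC subsets tied to one NUMA node."""
    numa_node: int
    cpus: int
    llc_subsets: int


def offers_for_agent(numa_nodes: int, cpus_per_node: int,
                     llc_subsets_per_node: int) -> List[NumaOffer]:
    """Build one offer per NUMA node so cpu and LLC stay colocated."""
    return [
        NumaOffer(node, cpus_per_node, llc_subsets_per_node)
        for node in range(1, numa_nodes + 1)
    ]


# The example from the proposal: 2 NUMA nodes, 40 logical cores each,
# 20 LLC subsets per node (assumed to be per node, as in Offer1/Offer2).
offers = offers_for_agent(numa_nodes=2, cpus_per_node=40,
                          llc_subsets_per_node=20)
for offer in offers:
    print(f"Offer{offer.numa_node}: cpu {offer.cpus}; "
          f"LLC {offer.llc_subsets} with NUMA {offer.numa_node}")
```

Grouping cpu and LLC into a single per-node record is one way to express the
"tie one resource to another" constraint; an intent-based design would hide
this structure from frameworks instead.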


>
>   1b): Agent will apply cpu share isolation along with cpuset.
>       We might need to revisit MESOS-314 to support this partially.
>
> Actually, CAT kernel support could/should still support the scenario where
> tasks migrate between NUMA nodes, but right now it does not. This is why I
> filed the ticket and drafted the initial design doc to track this.
>
> 2) All the monitoring support (CMT and MBM) is almost ready in Mesos;
> that’s all perf stuff.
> Check MESOS-4955 and MESOS-4595 for details.


Yep - Bartek had a patch for this


>
>
>
>
>
> On 2016/4/5 23:13, Niklas Nielsen wrote:
>
>> My gut feeling is that it won't be very useful to expose LLC as a first
>> class resource type at this point.
>> It's very hard to pick for the user and requires framework support
>> everywhere.
>> Also, as Kevin mentioned, Mesos doesn't pin your tasks so you don't know
>> which cores your tasks will be running on.
>>
>> We have been talking about QoS isolators in the performance isolation
>> working group, where more low-level decisions are made on the agent
>> itself.
>> Both core pinning and CAT would be controls which those isolators could
>> adjust to uphold higher level notions of task performance tiers.
>>
>> Let's discuss this in the performance isolation working group. We can
>> schedule a call end of this week or start next week.
>>
>> Niklas
>>
>>
>> On Mon, Apr 4, 2016 at 10:39 PM, Kevin Klues <klue...@gmail.com> wrote:
>>
>> Hi Fan,
>>>
>>> Thanks for putting this together. I have been looking into this quite
>>> a bit myself recently, and have been slowly preparing a design doc for
>>> both CAT and CMT support in Mesos. One of the biggest things I have
>>> been trying to figure out (which is why I haven't pushed my design doc
>>> out yet) is how to combine CAT support with the existing resource
>>> model.
>>>
>>> Specifically, Mesos currently gives out fractional cores using the
>>> cgroups cpu.shares mechanism and doesn't allow tasks to choose
>>> specific cores to run on (even more than this, there is no way for a
>>> task to even see which specific cores might be available).
>>> Furthermore, when a resource offer goes out, it's just a collection of
>>> SCALARS, SETS, and RANGES, and there's no way to tie one particular
>>> resource to another (e.g. you can't say give me cores and memory that
>>> are close together to mitigate NUMA effects).
>>>
>>> Given these limitations, it's not clear how to take immediate
>>> advantage of CAT, since it relies on specifying a specific core to
>>> allocate the cache from. That is, some mechanism must exist to ensure
>>> that both the CPU and the cache are colocated.  This is a problem with
>>> the current resource model in general, and applies to properly
>>> supporting NUMA as well.
>>>
>>> You seem to propose simply adding cache partitions as a first class
>>> resource on par with CPUs and memory, with no mentio

Re: [Isolation][Containerization] - Add Intel Cache Allocation Technology(CAT) Isolator Support

2016-04-05 Thread Niklas Nielsen
My gut feeling is that it won't be very useful to expose LLC as a first
class resource type at this point.
It's very hard to pick for the user and requires framework support
everywhere.
Also, as Kevin mentioned, Mesos doesn't pin your tasks so you don't know
which cores your tasks will be running on.
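
To make the shares-versus-pinning point concrete, here is a rough sketch of
how a fractional cpu allocation turns into a relative weight under the cgroups
cpu.shares mechanism Kevin mentions. It assumes one full cpu maps to 1024
shares (the cgroup v1 default weight); the helper names are made up for this
illustration and are not taken from the Mesos code base.

```python
from typing import List

# Assumption for this sketch: one full cpu corresponds to 1024 shares.
CPU_SHARES_PER_CPU = 1024


def cpus_to_shares(cpus: float) -> int:
    """Convert a fractional cpu allocation into a cpu.shares weight."""
    return int(cpus * CPU_SHARES_PER_CPU)


def contended_fraction(task_shares: int, all_shares: List[int]) -> float:
    """Fraction of cpu time a task gets when every task is runnable.

    Shares are only relative weights: they say nothing about *which*
    cores the task runs on, which is why cache allocation cannot be
    tied to a task through shares alone.
    """
    return task_shares / sum(all_shares)


shares = [cpus_to_shares(0.5), cpus_to_shares(1.5)]
fraction = contended_fraction(shares[0], shares)
print(shares, fraction)
```

The weight only matters under contention; an idle machine lets either task
use more than its fraction, on any core the scheduler picks.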

We have been talking about QoS isolators in the performance isolation
working group, where more low-level decisions are made on the agent itself.
Both core pinning and CAT would be controls which those isolators could
adjust to uphold higher level notions of task performance tiers.

Let's discuss this in the performance isolation working group. We can
schedule a call end of this week or start next week.

Niklas


On Mon, Apr 4, 2016 at 10:39 PM, Kevin Klues  wrote:

> Hi Fan,
>
> Thanks for putting this together. I have been looking into this quite
> a bit myself recently, and have been slowly preparing a design doc for
> both CAT and CMT support in Mesos. One of the biggest things I have
> been trying to figure out (which is why I haven't pushed my design doc
> out yet) is how to combine CAT support with the existing resource
> model.
>
> Specifically, Mesos currently gives out fractional cores using the
> cgroups cpu.shares mechanism and doesn't allow tasks to choose
> specific cores to run on (even more than this, there is no way for a
> task to even see which specific cores might be available).
> Furthermore, when a resource offer goes out, it's just a collection of
> SCALARS, SETS, and RANGES, and there's no way to tie one particular
> resource to another (e.g. you can't say give me cores and memory that
> are close together to mitigate NUMA effects).
>
> Given these limitations, it's not clear how to take immediate
> advantage of CAT, since it relies on specifying a specific core to
> allocate the cache from. That is, some mechanism must exist to ensure
> that both the CPU and the cache are colocated.  This is a problem with
> the current resource model in general, and applies to properly
> supporting NUMA as well.
>
> You seem to propose simply adding cache partitions as a first class
> resource on par with CPUs and memory, with no mention of its
> dependence on particular cores.  What are your thoughts on this?
>
> Kevin
>
> On Mon, Apr 4, 2016 at 7:36 PM, Du, Fan  wrote:
> > Hi all,
> >
> > MESOS-5076 is filed to investigate how Intel Cache Allocation
> > Technology (CAT)[1] could be
> > used in Mesos. Some introduction and early thoughts are documented
> > here[2].
> >
> > The motivation is to:
> > a) Add CAT isolation support for Mesos Containerization
> > b) Expose Last Level Cache(LLC) as Scalar Resource
> > c) Bridge the interface gap for Docker Containerization,
> >CAT support for Docker[3] has been submitted to Docker OCI with
> positive
> > feedback.
> >
> > The ultimate goal is to provide operators with a CAT isolator for better
> > colocation of cluster resources.
> > I'm looking forward to any comments from the community to move this forward.
> >
> > Thanks!
> >
> > [1]:
> http://www.intel.com/content/www/us/en/communications/cache-monitoring-cache-allocation-technologies.html
> > [2]:
> https://docs.google.com/document/d/130ay0e2DZ9S61SC3tGcik5wQaC8L40t5tWj3K3GJxTg/edit?usp=sharing
> > [3]:https://github.com/opencontainers/runtime-spec/pull/267
> > https://github.com/opencontainers/runc/pull/447
> >
>
>
>
> --
> ~Kevin
>



-- 
Niklas


Re: RFC: RevocableInfo Changes

2016-03-28 Thread Niklas Nielsen
Echoing Ben Mahler's comment. I still don't find the ThrottleInfo very
intuitive. Did you discuss the general notion of resource quality further?

On Mon, Mar 21, 2016 at 11:50 PM, Klaus Ma  wrote:

> @benm/joris,
>
> here's the user scenario in my mind:
>
> 1. master offers resources to the framework, e.g. 2 cpu
> 2. framework launch a task (2 cpu) and *mark the task/executors as
> throttleable*
> 3. in ResourceEstimator, it should only consider the throttleable
> task/executors:
>   - keep enough resources for the tasks/executors *without* throttleable
> flag/attribute
>   - report allocated but not used resources by task/executor *with*
> throttleable flag/attribute; for example, report 1 cpu as "
> *Revocable.Throttleable"* resources to framework in this case
> 4. it's up to the framework which resources to use; "*Revocable.Throttleable*"
> means it'll share compressible resources with the resource owner, "*Revocable*"
> (without ThrottleableInfo) means it'll be evicted when the resource owner
> reclaims the resources
> 5. QoS Controller makes sure:
>   - enough resources for the tasks/executors *without* throttleable
> flag/attribute
>   - if used resources exceed allocated resources *with* throttleable
> flag/attribute, evict the task/executor on revocable resource
>
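
Step 3 of the scenario above (the estimator only reports slack from tasks
marked throttleable) can be sketched like this. It is a toy model under stated
assumptions: the `Task` record and `estimate_throttleable_slack` are
hypothetical names for illustration, not the actual ResourceEstimator
interface, and the usage numbers are invented to reproduce the "report 1 cpu"
example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    """Hypothetical view of a running task as seen by an estimator."""
    name: str
    allocated_cpus: float
    used_cpus: float
    throttleable: bool


def estimate_throttleable_slack(tasks: List[Task]) -> float:
    """Sum allocated-but-unused cpus, counting only throttleable tasks.

    Tasks without the throttleable flag keep their full allocation, so
    their unused cpus are never reported as revocable.
    """
    return sum(
        max(task.allocated_cpus - task.used_cpus, 0.0)
        for task in tasks
        if task.throttleable
    )


tasks = [
    Task("batch", allocated_cpus=2.0, used_cpus=1.0, throttleable=True),
    Task("service", allocated_cpus=2.0, used_cpus=0.5, throttleable=False),
]
slack = estimate_throttleable_slack(tasks)
print(slack)
```

Only the 1 unused cpu of the throttleable task is reported; the 1.5 unused
cpus of the non-throttleable task stay reserved for it.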
> So to @connor's question, maybe a flag/attribute to task/executor when
> launching it. Regarding the name, both "ScavengeInfo"/
> "BestEffortInfo"/"ThrottleableInfo" are OK for me, maybe "ScavengeInfo" is
> better.
>
> Any comments?
>
> For this scenario, I think there're still open questions:
> 1. Can a framework launch a task with the throttleable flag/attribute on
> revocable resources?
> 2. For ResourceEstimator/QoS Controller, should the Agent double-check what
> it reports?
> 3. What's the behaviour between the two containers: the container on
> original resources & the container on revocable resources?
> 4. Who handles compressible/incompressible resources? Maybe
> ResourceEstimator/QoSController; it should not report incompressible
> resources as Revocable.Throttleable.
>
> Thanks
> Klaus
>
> On Tuesday, March 22, 2016 at 4:13:10 AM UTC+8, Benjamin Mahler wrote:
>>
>> Yeah that's definitely a question I've been asking myself, and we synced
>> on that with Niklas during the last meeting. The thought currently is that
>> we should choose a better name than ThrottleInfo. ThrottleInfo seems to
>> carry too strong of an implication about what the resources will
>> experience. Rather, we could pick a name like "ScavengeInfo" /
>> "BestEffortInfo" / etc that indicates that these resources are running
>> within the un-utilized portion of the machine and _may_ experience
>> degradation.
>>
>> On Mon, Mar 21, 2016 at 1:26 AM, Joris Van Remoortere <
>> jo...@mesosphere.io> wrote:
>>
>>> @klaus:
>>> I think @connor's question is whether we are absolutely sure we never
>>> want to support throttle-able but non-revocable resources.
>>> It's clear from the protos that this is not supported, the question is
>>> whether we are sure that is what we want. If so, can you elaborate as to
>>> *why* we would never want that concept in Mesos.
>>>
>>> —
>>> *Joris Van Remoortere*
>>> Mesosphere
>>>
>>> On Sun, Mar 20, 2016 at 8:33 PM, Klaus Ma 
>>> wrote:
>>>
 Here's some input :).

 If throttling is tolerable but preemption is not, how would that be
 expressed? (Is that supported?)
 [Klaus]: It's not supported; only revocable resources have this
 attribute: non-throttleable or throttleable. The throttleable revocable
 resources are reported by ResourceEstimator, which means the resources may be
 throttled by their original owner.

 How does this work with the QoS controller? Will there be a new
 correction type to indicate throttling, or does throttling happen "behind
 the agent's back"?
 [Klaus]: The QoSController/ResourceEstimator only manages throttleable
 revocable resources; the other resources (regular resources and
 non-throttleable revocable resources) are managed by the allocator. Here,
 "manage" means generation and destruction/eviction. Regarding "throttling
 happens": good question. I think the throttling will depend on the
 containers; let me double check it :).

 If any comments, please let me know.

 
 Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer
 Platform OpenSource Technology, STG, IBM GCG
 +86-10-8245 4084 | klaus1982...@gmail.com | http://k82.me

 On Sat, Mar 19, 2016 at 11:15 PM,  wrote:

> Thanks for the good explanations so far Ben and Klaus.  Apologies if
> you guys already covered these questions in the meeting:
>
> If throttling is tolerable but preemption is not, how would that be
> expressed? (Is that supported?)
>
> How does this work with the QoS controller? Will there be a new
> correction type to indicate throttling, or does throttling happen 

Re: [Proposal] Use dev mailing list for working groups

2016-03-25 Thread Niklas Nielsen
SGTM - let's do it :)

On Fri, Mar 25, 2016 at 12:44 AM, Jay JN Guo  wrote:

> +1
>
> Although we really need to agree on one single style and stick to it.
>
> I thought all mails sent to dev@mesos.apache.org would be [MESOS-dev]
> by
> default.
> So I don't know how much sense it would make to have this first label. Even
> though I strongly
> agree with @guangya on hierarchical labels.
>
> Cheers,
> /J
>
> Zhitao Li  wrote on 03/25/2016 14:56:50:
>
> > From: Zhitao Li 
> > To: dev 
> > Date: 03/25/2016 14:58
> > Subject: Re: [Proposal] Use dev mailing list for working groups
> >
> > +1
> >
> > On Thu, Mar 24, 2016 at 11:47 PM, Timothy Chen 
> wrote:
> >
> > > +1
> > >
> > > Tim
> > >
> > > On Thu, Mar 24, 2016 at 8:17 PM, Chris Lambert  >
> > > wrote:
> > > > Another +1.  These WGs are great, and more visibility (with the
> ability
> > > to
> > > > filter) will be awesome.
> > > >
> > > >
> > > > On Thu, Mar 24, 2016 at 8:13 PM, Klaus Ma 
> > > wrote:
> > > >
> > > >> +1, that's helpful to filter feature/question out :).
> > > >>
> > > >> 
> > > >> Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer
> > > >> Platform OpenSource Technology, STG, IBM GCG
> > > >> +86-10-8245 4084 | klaus1982...@gmail.com | http://k82.me
> > > >>
> > > >> On Fri, Mar 25, 2016 at 11:04 AM, Du, Fan  wrote:
> > > >>
> > > >> > +1
> > > >> >
> > > >> > This will definitely make the new developer easily get to know
> where
> > > each
> > > >> > component
> > > >> > is heading, and which component is most of his interest and then
> > > >> > contribute.
> > > >> >
> > > >> > Thanks for the proposal!
> > > >> >
> > > >> >
> > > >> > On 2016/3/25 6:55, Jie Yu wrote:
> > > >> >
> > > >> >> Hi,
> > > >> >>
> > > >> >> This came up during today's community sync.
> > > >> >>
> > > >> >> Mesos currently has a few working groups for various features:
> > > >> >>
> > > >> >>
> > > >>
> > > >> >> https://cwiki.apache.org/confluence/display/MESOS/Apache+Mesos+Working+Groups
> > > >> >>
> > > >> >> Some of those working groups are using separate mailing lists.
> That
> > > >> limits
> > > >> >> the visibility of some discussions. Also, some people in the
> > > community
> > > >> are
> > > >> >> not aware of those mailing lists (and the wiki page).
> > > >> >>
> > > >> >> Therefore, I am proposing that we consolidate all working groups
> > > mailing
> > > >> >> lists to the dev mailing list. To distinguish discussions from
> > > different
> > > >> >> working groups, please use a special subject format. For
> instance, if
> > > >> you
> > > >> >> want to send an email to "Mesos GPU" working group, please use
> the
> > > >> >> subject:
> > > >> >>
> > > >> >> "[Mesos GPU WG] YOUR SUBJECT HERE"
> > > >> >>
> > > >> >> Let me know if you have any comments/thoughts on this!
> > > >> >>
> > > >> >> - Jie
> > > >> >>
> > > >> >>
> > > >>
> > >
> >
> >
> >
> > --
> > Cheers,
> >
> > Zhitao Li
>



-- 
Niklas


Re: RFC: RevocableInfo Changes

2016-03-14 Thread Niklas Nielsen
Ben, when do you have your next Mesos allocator sync? We don't have our
next performance isolation sync lined up yet, so we could piggyback on
yours if you have it scheduled already.

Niklas

On Mon, Mar 14, 2016 at 9:32 AM, Jie Yu <yujie@gmail.com> wrote:

> >
> > Just a quick note: Ian D. and the performance isolation working group are
> > discussing similar annotations and we should meet and talk about the
> > options.
>
>
> +1
>
> Would love to understand the relationship between this and the
> task/executor level annotations.
>
> - Jie
>
> On Mon, Mar 14, 2016 at 9:29 AM, Niklas Nielsen <n...@qni.dk> wrote:
>
> > Hi Ben,
> >
> > Just a quick note: Ian D. and the performance isolation working group are
> > discussing similar annotations and we should meet and talk about the
> > options.
> >
> > Niklas
> >
> > On Sat, Mar 12, 2016 at 12:05 AM, Klaus Ma <klaus1982...@gmail.com>
> wrote:
> >
> > > Yes, I think that's true for now; so we define `ThrottleInfo` as
> message
> > to
> > > be more flexible. In Optimistic Offer Phase 1, we only use it to
> > > distinguish usage oversubscriptions and allocation oversubscription,
> > > similar to bool :).
> > >
> > > Regarding the resources type, two questions after the discussion:
> > >
> > > 1. should we send different offer to the framework, so when
> > > usage/allocation oversubscription updated, only one type of offer will
> be
> > > rescinded?
> > > 2. should we define framework's capability against `ThrottleInfo`?
> > >
> > > 
> > > Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer
> > > Platform OpenSource Technology, STG, IBM GCG
> > > +86-10-8245 4084 | klaus1982...@gmail.com | http://k82.me
> > >
> > > On Sat, Mar 12, 2016 at 12:03 PM, Guangya Liu <gyliu...@gmail.com>
> > wrote:
> > >
> > > >
> > > > Hi Ben,
> > > >
> > > > I think that currently and even in the near future, the
> > __ThrottleInfo__
> > > > will only be used by the usage oversubscriptions and the
> > oversubscription
> > > > for allocator (Both quota and reservations) will not use this value
> but
> > > > only using __RevocableInfo__ is enough.
> > > >
> > > > I can even think that the __ThrottleInfo__ as a boolean value in
> > > > optimistic offer phase 1 as it is mainly used to distinguish
> resources
> > > > between usage oversubscriptions and allocation oversubscription
> (Quota
> > > and
> > > > Reservations), comments?
> > > >
> > > > Thanks,
> > > >
> > > > Guangya
> > > >
> > > > 在 2016年3月12日星期六 UTC+8上午11:09:46,Benjamin Mahler写道:
> > > >
> > > >> Hey folks,
> > > >>
> > > >> In the resource allocation working group we've been looking into a
> few
> > > >> projects that will make the allocator able to offer out resources as
> > > >> revocable. For example:
> > > >>
> > > >> -We'll want to eventually allocate resources as revocable _by
> > default_,
> > > >> only allowing non-revocable when there are guarantees put in place
> > > (static
> > > >> reservations or quota).
> > > >>
> > > >> -On the path to revocable by default, we can incrementally start to
> > > offer
> > > >> certain resources as revocable. Consider when quota is set but the
> > role
> > > >> isn't using all of the quota. The unallocated quota can be offered
> to
> > > other
> > > >> roles, but it should be revocable because we may revoke them should
> > the
> > > >> quota'ed role want to use the resources. Unused reservations fall
> > into a
> > > >> similar category.
> > > >>
> > > >> -Going revocable by default also allows us to enforce fairness in a
> > > >> dynamically changing cluster by revoking resources as weights are
> > > changed,
> > > >> frameworks are added or removed, etc.
> > > >>
> > > >> In this context, "revocable" means that the resources may be taken
> > away
> > > >> and the container will be destroyed. The meaning of "revocable" in
> the
> > > >> context of usage oversubscription includes this, but also the
> > container
>

Re: RFC: RevocableInfo Changes

2016-03-14 Thread Niklas Nielsen
Hi Ben,

Just a quick note: Ian D. and the performance isolation working group are
discussing similar annotations and we should meet and talk about the
options.

Niklas

On Sat, Mar 12, 2016 at 12:05 AM, Klaus Ma  wrote:

> Yes, I think that's true for now; so we define `ThrottleInfo` as message to
> be more flexible. In Optimistic Offer Phase 1, we only use it to
> distinguish usage oversubscriptions and allocation oversubscription,
> similar to bool :).
>
> Regarding the resources type, two questions after the discussion:
>
> 1. should we send different offer to the framework, so when
> usage/allocation oversubscription updated, only one type of offer will be
> rescinded?
> 2. should we define framework's capability against `ThrottleInfo`?
>
> 
> Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer
> Platform OpenSource Technology, STG, IBM GCG
> +86-10-8245 4084 | klaus1982...@gmail.com | http://k82.me
>
> On Sat, Mar 12, 2016 at 12:03 PM, Guangya Liu  wrote:
>
> >
> > Hi Ben,
> >
> > I think that currently and even in the near future, the __ThrottleInfo__
> > will only be used by the usage oversubscriptions and the oversubscription
> > for allocator (Both quota and reservations) will not use this value but
> > only using __RevocableInfo__ is enough.
> >
> > I can even think that the __ThrottleInfo__ as a boolean value in
> > optimistic offer phase 1 as it is mainly used to distinguish resources
> > between usage oversubscriptions and allocation oversubscription (Quota
> and
> > Reservations), comments?
> >
> > Thanks,
> >
> > Guangya
> >
> > 在 2016年3月12日星期六 UTC+8上午11:09:46,Benjamin Mahler写道:
> >
> >> Hey folks,
> >>
> >> In the resource allocation working group we've been looking into a few
> >> projects that will make the allocator able to offer out resources as
> >> revocable. For example:
> >>
> >> -We'll want to eventually allocate resources as revocable _by default_,
> >> only allowing non-revocable when there are guarantees put in place
> (static
> >> reservations or quota).
> >>
> >> -On the path to revocable by default, we can incrementally start to
> offer
> >> certain resources as revocable. Consider when quota is set but the role
> >> isn't using all of the quota. The unallocated quota can be offered to
> other
> >> roles, but it should be revocable because we may revoke them should the
> >> quota'ed role want to use the resources. Unused reservations fall into a
> >> similar category.
> >>
> >> -Going revocable by default also allows us to enforce fairness in a
> >> dynamically changing cluster by revoking resources as weights are
> changed,
> >> frameworks are added or removed, etc.
> >>
> >> In this context, "revocable" means that the resources may be taken away
> >> and the container will be destroyed. The meaning of "revocable" in the
> >> context of usage oversubscription includes this, but also the container
> may
> >> experience a throttling (e.g. lower cpu shares, less network priority,
> etc).
> >>
> >> For this reason, and because we internally need to distinguish revocable
> >> resources between those that are generated by usage oversubscription
> >> and those that are generated by the allocator, we're thinking of the
> >> following change to the API:
> >>
> >>
> >>
> >> -  message RevocableInfo {}
> >> +  message RevocableInfo {
> >> +    message ThrottleInfo {}
> >> +
> >> +    // If set, indicates that the resources may be throttled at
> >> +    // any time. Throttle-able resources can be used for tasks
> >> +    // that do not have strict performance requirements and are
> >> +    // capable of handling being throttled.
> >> +    optional ThrottleInfo throttle_info;
> >> +  }
> >>
> >>    // If this is set, the resources are revocable, i.e., any tasks or
> >> -  // executors launched using these resources could get preempted or
> >> -  // throttled at any time. This could be used by frameworks to run
> >> -  // best effort tasks that do not need strict uptime or performance
> >> +  // executors launched using these resources could be terminated at
> >> +  // any time. This could be used by frameworks to run
> >> +  // best effort tasks that do not need strict uptime
> >>    // guarantees. Note that if this is set, 'disk' or 'reservation'
> >>    // cannot be set.
> >>    optional RevocableInfo revocable = 9;
> >>
> >>
> >>
> >> Essentially we want to distinguish between revocable and revocable +
> >> throttle-able. This is because usage-oversubscription generates
> >> throttle-able revocable resources, whereas the allocator does not. This
> >> also solves our problem of distinguishing between these two kinds of
> >> revocable resources internally.
> >>
> >> Feedback welcome!
> >>
> >> Ben
> >>
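
To illustrate the distinction Ben describes between plain revocable and revocable + throttle-able resources, here is a minimal Python sketch. It is not Mesos code: the dataclasses are hypothetical stand-ins for the proposed protobuf messages, and the `classify` helper and its return strings are assumptions made only for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ThrottleInfo:
    """Hypothetical stand-in for the proposed (empty) ThrottleInfo message."""


@dataclass(frozen=True)
class RevocableInfo:
    """Hypothetical stand-in for RevocableInfo with the proposed optional field."""
    throttle_info: Optional[ThrottleInfo] = None


@dataclass(frozen=True)
class Resource:
    """Simplified resource: a name, an amount, and optional revocability."""
    name: str
    amount: float
    revocable: Optional[RevocableInfo] = None


def classify(resource: Resource) -> str:
    """Return which of the three kinds of resource this is."""
    if resource.revocable is None:
        # Backed by a guarantee such as a static reservation or quota.
        return "non-revocable"
    if resource.revocable.throttle_info is not None:
        # What usage oversubscription would generate under the proposal.
        return "revocable+throttleable"
    # What the allocator would generate (e.g. unused quota offered to others).
    return "revocable"
```

Under this model a framework could place latency-sensitive tasks only on resources where `classify` does not report throttling, while still accepting revocable resources that can be taken away but not slowed down.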

Re: Performance isolation working group meeting on Friday 10am PST

2016-02-29 Thread Niklas Nielsen
Avinash, we didn't get to add a calendar invite. Apologies for the
inconvenience.

MPark, what is the procedure to get things added to the mesos calendar? Who
owns it?

Niklas

On Fri, Feb 26, 2016 at 10:32 AM, Avinash Sridharan <avin...@mesosphere.io>
wrote:

> Was there a calendar invite for this? I have the Apache Mesos calendar added,
> but I don't see this meeting as part of the calendar.
>
> On Fri, Feb 26, 2016 at 10:14 AM, Ian Downes <idow...@twitter.com.invalid>
> wrote:
>
> > Please use this link:
> > https://plus.google.com/hangouts/_/twitter.com/mesos-cpu
> >
> > On Fri, Feb 26, 2016 at 10:09 AM, Ian Downes <idow...@twitter.com>
> wrote:
> >
> > > we're a little late to get started, it will become accessible shortly
> > >
> > > > On Feb 26, 2016, at 10:03, Kevin Klues <klue...@gmail.com> wrote:
> > > >
> > > > I'm having trouble joining the call. Keeps saying "Requesting to join
> > > > the video call..."
> > > >
> > > >> On Wed, Feb 24, 2016 at 12:12 PM, Niklas Nielsen <n...@qni.dk>
> wrote:
> > > >> Hi all,
> > > >>
> > > >> We will meet and talk performance isolation on Friday 10am PST, with
> > the
> > > >> agenda:
> > > >>
> > > >> 1) New proposal for core affinity
> > > >> 2) CFS configuration status
> > > >> 3) Workload Benchmarking
> > > >> 4) Discussion on actuating isolation for resources that are
> accounted
> > /
> > > not
> > > >> accounted by Mesos
> > > >> For now, this will be the hangout:
> > > >> https://plus.google.com/hangouts/_/qni.dk/nik and we will follow up
> > > with
> > > >> any changes
> > > >>
> > > >> --
> > > >> Niklas
> > > >
> > > >
> > > >
> > > > --
> > > > ~Kevin
> > >
> >
>
>
>
> --
> Avinash Sridharan, Mesosphere
> +1 (323) 702 5245
>



-- 
Niklas


Performance isolation working group meeting on Friday 10am PST

2016-02-24 Thread Niklas Nielsen
Hi all,

We will meet and talk performance isolation on Friday 10am PST, with the
agenda:

1) New proposal for core affinity
2) CFS configuration status
3) Workload Benchmarking
4) Discussion on actuating isolation for resources that are accounted / not
accounted by Mesos
For now, this will be the hangout:
https://plus.google.com/hangouts/_/qni.dk/nik and we will follow up with
any changes

-- 
Niklas


Re: Experimentation harnesses?

2016-02-22 Thread Niklas Nielsen
Hi Michael,

We are working on this in the context of generating workloads (with many
different combinations of latency-critical workloads co-located with
best-effort ones) and testing scenarios for oversubscription, and would love
to chat.

Cheers,
Niklas

On Mon, Feb 22, 2016 at 4:14 PM, James Peach  wrote:

>
> > On Feb 22, 2016, at 11:57 AM, Michael Browning 
> wrote:
> >
> > Hi all,
> >
> > I was curious if anyone with an active Mesos deployment knows of, has
> used,
> > or has developed a harness for integration and exploratory testing
> against
> > your installations. The sort of capabilities I'm after include:
> >
> >   - Sufficient flexibility to allow the launch of multiple frameworks in
> >   test setup -- this could involve deployment logic, e.g. fetching and
> >   installing a package or a provisioning script on a host.
> >   - Sufficient flexibility to allow the orchestration of multiple
> >   frameworks in a test run, to e.g. understand how different
> combinations of
> >   frameworks interact with each other under various usage scenarios, how
> they
> >   interact under quota, etc.
> >   - Sufficient flexibility to allow the gathering of many different
> >   metrics -- depending on what's being investigated, we might want to be
> able
> >   to include various host-level metrics in addition to the metrics and
> gauges
> >   that Mesos itself exposes.
>
> At one point I wrote a small framework similar to mesos-execute to poke
> what I needed at the time. I considered adding Lua bindings to this to make
> it easier to write bespoke schedulers to hit different scenarios but never
> got around to investing the time :-/
>
> >
> > This set of capabilities is something I'd expect from a distributed
> testing
> > framework, but Googling around hasn't yielded any immediately convincing
> > open source offerings -- things like Locust or Tsung are focused on
> > stress-testing and seem to lack the orchestration and provisioning
> > abilities I'm looking for. LinkedIn seems to have their own open source
> > offering for this, called Zopkio, that seems like it could hit the above
> > points, but it doesn't seem to be widely-used and I'm not sure how mature
> > it is.
> >
> > Does anyone have any leads in this area? Have you implemented your own
> > solution? I'd be curious to hear how you've approached this problem.
> >
> > Regards,
> > Michael
>
>


-- 
Niklas


Re: [RESULT][VOTE] Release Apache Mesos 0.27.0 (rc2)

2016-02-04 Thread Niklas Nielsen
Awesome guys!

Kapil, we usually link from the blog post to the user documentation for
new features. Do you have a link to the docs on multiple disk resources?

On Wed, Feb 3, 2016 at 11:27 PM, Kapil Arya  wrote:

> And here is the blog post:
> http://mesos.apache.org/blog/mesos-0-27-0-released.
>
> On Wed, Feb 3, 2016 at 4:48 PM, Michael Park  wrote:
> > Kapil is currently working on it. We'll publish it shortly :)
> >
> > On 3 February 2016 at 13:41, Benjamin Mahler  wrote:
> >>
> >> Great! Is a blog post on the way?
> >>
> >> On Sun, Jan 31, 2016 at 5:39 PM, Michael Park  wrote:
> >>
> >> > Hi all,
> >> >
> >> > The vote for Mesos 0.27.0 (rc2) has passed with the
> >> > following votes.
> >> >
> >> > +1 (Binding)
> >> > --
> >> > Vinod Kone
> >> > Joris Van Remoortere
> >> > Till Toenshoff
> >> >
> >> > +1 (Non-binding)
> >> > --
> >> > Jörg Schad
> >> > Marco Massenzio
> >> > Greg Mann
> >> >
> >> > There were no 0 or -1 votes.
> >> >
> >> > Please find the release at:
> >> > https://dist.apache.org/repos/dist/release/mesos/0.27.0
> >> >
> >> > It is recommended to use a mirror to download the release:
> >> > http://www.apache.org/dyn/closer.cgi
> >> >
> >> > The CHANGELOG for the release is available at:
> >> >
> >> >
> >> >
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.27.0
> >> >
> >> > The mesos-0.27.0.jar has been released to:
> >> > https://repository.apache.org
> >> >
> >> > The website (http://mesos.apache.org) will be updated shortly to
> reflect
> >> > this release.
> >> >
> >> > Thanks,
> >> >
> >> > Tim, Kapil, MPark
> >> >
> >
> >
>



-- 
Niklas


Re: [RESULT][VOTE] Release Apache Mesos 0.27.0 (rc2)

2016-02-04 Thread Niklas Nielsen
Thanks Jie and Kapil! Looking forward to it.

I did read the proposal, but playing the devil's advocate for a bit: how can
we ship a 'major feature' without bundled user documentation? Referring to
tickets and design docs (which may or may not reflect the way things ended up
getting implemented) or to code is not good enough, in my mind.

Should we set some expectations for available user docs, example
framework code, etc. before announcing the availability of a feature?

Niklas

On Thu, Feb 4, 2016 at 5:41 PM, Kapil Arya <ka...@mesosphere.io> wrote:

> And here is the ticket tracking the user doc:
> https://issues.apache.org/jira/browse/MESOS-4531. Will link to the
> blog post once the doc is ready :-).
>
> On Thu, Feb 4, 2016 at 11:38 AM, Jie Yu <yujie@gmail.com> wrote:
> > Niklas,
> >
> > I think Joris is still working on the user doc for multi-disk support in
> > Mesos.
> >
> > - Jie
> >
> > On Thu, Feb 4, 2016 at 1:22 AM, Niklas Nielsen <n...@qni.dk> wrote:
> >
> >> Awesome guys!
> >>
> >> Kapil, we usually link from the blog post to the user documentation for
> >> new features. Do you have a link to the docs on multiple disk resources?
> >>
> >> On Wed, Feb 3, 2016 at 11:27 PM, Kapil Arya <ka...@mesosphere.io>
> wrote:
> >>
> >> > And here is the blog post:
> >> > http://mesos.apache.org/blog/mesos-0-27-0-released.
> >> >
> >> > On Wed, Feb 3, 2016 at 4:48 PM, Michael Park <mp...@apache.org>
> wrote:
> >> > > Kapil is currently working on it. We'll publish it shortly :)
> >> > >
> >> > > On 3 February 2016 at 13:41, Benjamin Mahler <bmah...@apache.org>
> >> wrote:
> >> > >>
> >> > >> Great! Is a blog post on the way?
> >> > >>
> >> > >> On Sun, Jan 31, 2016 at 5:39 PM, Michael Park <mp...@apache.org>
> >> wrote:
> >> > >>
> >> > >> > Hi all,
> >> > >> >
> >> > >> > The vote for Mesos 0.27.0 (rc2) has passed with the
> >> > >> > following votes.
> >> > >> >
> >> > >> > +1 (Binding)
> >> > >> > --
> >> > >> > Vinod Kone
> >> > >> > Joris Van Remoortere
> >> > >> > Till Toenshoff
> >> > >> >
> >> > >> > +1 (Non-binding)
> >> > >> > --
> >> > >> > Jörg Schad
> >> > >> > Marco Massenzio
> >> > >> > Greg Mann
> >> > >> >
> >> > >> > There were no 0 or -1 votes.
> >> > >> >
> >> > >> > Please find the release at:
> >> > >> > https://dist.apache.org/repos/dist/release/mesos/0.27.0
> >> > >> >
> >> > >> > It is recommended to use a mirror to download the release:
> >> > >> > http://www.apache.org/dyn/closer.cgi
> >> > >> >
> >> > >> > The CHANGELOG for the release is available at:
> >> > >> >
> >> > >> >
> >> > >> >
> >> >
> >>
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.27.0
> >> > >> >
> >> > >> > The mesos-0.27.0.jar has been released to:
> >> > >> > https://repository.apache.org
> >> > >> >
> >> > >> > The website (http://mesos.apache.org) will be updated shortly to
> >> > reflect
> >> > >> > this release.
> >> > >> >
> >> > >> > Thanks,
> >> > >> >
> >> > >> > Tim, Kapil, MPark
> >> > >> >
> >> > >
> >> > >
> >> >
> >>
> >>
> >>
> >> --
> >> Niklas
> >>
>



-- 
Niklas


Re: Please review design doc for task resizing

2015-12-09 Thread Niklas Nielsen
(Inlined)

On Mon, Dec 7, 2015 at 6:54 AM, Qian Zhang <zhq527...@gmail.com> wrote:

> Thanks Niklas for your comments :-)
>
> For your first comment, so you prefer option 2 in the design doc (i.e., add
> resize as a new offer operation), right? Actually, after more thinking, I
> think if we want to support the following 2 use cases (especially the
> second one) and are OK with resizing a task with an empty offer list
> (meaning we may need to remove the check at
> https://github.com/apache/mesos/blob/0.25.0/src/master/master.cpp#L2816
> for reducing resources from a task), then I also agree option 2 is the best.
> 1. Framework adds / reduces multiple resources for its task at the same
> time.
> 2. Framework reserves the resources and resizes a task to use those
> resources at the same time.
>

Great!


>
> For your second comment, can you please clarify what you mean by "should be
> dictated by the resource type itself"? I see your comment in the design doc
> "The resource type should dictate the sign of R_2 -R_1.", but I am not sure
> what you meant by "R_2 -R_1". Actually, there is still some discussion
> about whether frameworks need to send the desired resources (e.g., resize my
> task to 8GB) or send the resource delta (e.g., add 2GB to my task) to the
> master. Personally I prefer the latter because the former can cause race
> conditions that we cannot handle; you can refer to my comments in the
> design doc for details.
>

I see. However, that operation is not idempotent. Imagine you issue a
resize request and, for some reason, the request takes a long time to carry
out and you don't have a way to guarantee that the request was received (for
example, during a master failover). In the meantime, you issue another
resize. When both land, the result may not be the action you wanted.
containerizer->update() applies the aggregate size anyway, so you need to
keep track of the 'sign' of the resize all the way down to the slave
process.

Maybe I am completely off. Other folks have some input here?
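
The idempotency concern above can be made concrete with a small Python sketch. It only models the arithmetic of the two request styles discussed in this thread (send a delta vs. send the desired size); the function names and the gigabyte figures are assumptions for illustration and do not correspond to any Mesos API.

```python
def apply_delta(current_gb: float, delta_gb: float) -> float:
    """'Add 2GB to my task': the result depends on the current size."""
    return current_gb + delta_gb


def apply_desired(current_gb: float, desired_gb: float) -> float:
    """'Resize my task to 8GB': the result ignores the current size."""
    return desired_gb


def deliver(apply, start_gb: float, request_gb: float, times: int) -> float:
    """Apply the same request `times` times, as if it had been retried."""
    size = start_gb
    for _ in range(times):
        size = apply(size, request_gb)
    return size


if __name__ == "__main__":
    # A task starts at 6GB and the framework wants it to end up at 8GB.
    # The request is retried once because the first attempt looked lost.
    print(deliver(apply_delta, 6.0, 2.0, times=2))    # both deltas land
    print(deliver(apply_desired, 6.0, 8.0, times=2))  # retry is harmless
```

With the delta style, a retried request that lands twice overshoots the intended size, whereas repeating a desired-state request leaves the task at the same size. That is the sense in which only the second style is idempotent.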


>
> And I have 2 more questions that I want to discuss with you:
> 1. David G raised a user story that a framework should be able to resize its
> executor. I think this is a valid use case, but I would suggest that we
> focus on task resizing in the MVP and handle executor resizing post-MVP.
> What do you think?
> 2. Do you think we need to involve the executor in task resizing? E.g., let
> the slave send a message (e.g., RunTaskMessage) to the executor so that the
> executor can do the actual resizing? The reason I raise this question is
> that I think in some cases the executor needs to be aware of the resized
> resources, e.g., if the framework adds a new port to a task, the executor
> and task should know about the new port so that the task can start to use
> it. And in the Kubernetes on Mesos case, the user may want to resize a pod
> which is actually created and managed by k8sm-executor, so it should be
> involved in resizing the resources of the pod.
>

Maybe we can do that down the line; as an MVP, maybe we can skip it but
have a model that supports it?
Using the task info as a 'desired state', changing the executor info
resources could be used to change its size. However, there are some
details in terms of master failover and slave reregistration, where executor
infos are sent from the slaves, where we need to be careful.


>
> Currently I do not have a PoC implementation for my proposal yet. Do you
> recommend that we have one now? Or after the design is close to being
> finalized, or at least after we decide among those 3 options
> for scheduler API changes in the design doc?
>

Doesn't hurt to experiment and see if there are obvious things that we
failed to address.
If you haven't done any work yet, I'd maybe defer until we at least have
the placement of the 'resize operation' nailed down.


>
> I'd like to have an online sync up with you, can you please let me know
> when you will be online in IRC usually? Or you prefer other ways to sync
> up? I will try to catch you :-)
>

Let's do a joint call; how about Friday or Monday?
I am available in business hours PST.


>
>
> Thanks,
> Qian
>
>
> 2015-12-05 7:03 GMT+08:00 Niklas Nielsen <n...@qni.dk>:
>
> > Hi Qian,
> >
> > Thanks for the update and I apologize for the response time.
> >
> > Do you have a PoC implementation of your proposal?
> >
> > I have trouble understanding the motivation of _not_ adding resizing as a
> > usual operation. It seems much cleaner in my mind. To David G's and Alex
> > R's comment: if you want to resize without an offer (during task
> > shrinking), you could do it with an empty offer list. Giving up combining
> > task resizing with the other operations (which will most likely scale

Re: Please review design doc for task resizing

2015-12-04 Thread Niklas Nielsen
Hi Qian,

Thanks for the update and I apologize for the response time.

Do you have a PoC implementation of your proposal?

I have trouble understanding the motivation of _not_ adding resizing as a
usual operation. It seems much cleaner in my mind. To David G's and Alex
R's comment: if you want to resize without an offer (during task
shrinking), you could do it with an empty offer list. Giving up combining
task resizing with the other operations (which will most likely scale with
upcoming features) is a big loss, but maybe I am missing something.

Secondly, whether the new desired resource shape requires growing or
shrinking should, I think, be dictated by the resource type itself rather
than explicitly set by the framework writer. You have to do that math
anyway to figure out whether the framework's request is valid, no?

We can do an online sync soon, if you want to give a pitch on the design.

Cheers,
Niklas


On Thu, Nov 19, 2015 at 6:34 AM, Qian Zhang  wrote:

> Hi All,
>
> I am currently working on task resizing (MESOS-938), and have drafted a
> design doc (see the link below).
>
> https://docs.google.com/document/d/15rVmS2AXLzTDSEugAVDxWuHFUentp82KhL2yzxBCsi8/edit?usp=sharing
>
>
> Please feel free to review it, any comments are welcome, thanks!
>
>
> Regards,
> Qian
>



-- 
Niklas


Re: Finish Oversubscription before 0.26.0?

2015-11-05 Thread Niklas Nielsen
Nope - go ahead and close

On Thu, Nov 5, 2015 at 10:24 AM, Jie Yu  wrote:

> I would say the MVP is done. Of course, there'll be some follow-up
> improvements to the feature, and all the remaining issues are within that
> category. I am fine resolving this epic. Does anyone have any objections?
>
> - Jie
>
> On Thu, Nov 5, 2015 at 10:18 AM, Bernd Mathiske 
> wrote:
>
>> All who worked on MESOS-354,
>>
>> What’s the status of the Oversubscription epic? Can we already call it a
>> feature in 0.26? Shall we wait a few days to finish it? Will it slip into
>> 0.27?
>>
>> I see only 6 unresolved tickets and lots of resolved ones here:
>>
>> https://issues.apache.org/jira/browse/MESOS-354
>>
>> (Could you please assign someone to this ticket as overall responsible
>> epic master?)
>>
>> Bernd
>>
>>
>


-- 
Niklas


Re: NetworkInfo change - MESOS-3788

2015-10-22 Thread Niklas Nielsen
On 22 October 2015 at 13:09, Spike Curtis (projectcalico.org) <
sp...@projectcalico.org> wrote:

> Hi,
>
> Mesos 0.25.0 introduced a new NetworkInfo protobuf message.  The
> NetworkInfo allows frameworks to request IP-per-container networking for
> the containerized Tasks they launch, and also allows a unified way for
> frameworks to discover the IP address the task will use (both traditional
> and IP-per-container networking).
>
> We've noticed an issue with the way NetworkInfo was specified that means a
> less than ideal experience for framework and application developers.
> MESOS-3788<https://issues.apache.org/jira/browse/MESOS-3788> is our
> attempt to fix NetworkInfo to be easier for network module developers and
> Mesos users to understand.
>
> What went wrong?
>
> In the original NetworkInfo, merged in 0.25.0, the way to specify multiple
> IP addresses was to repeat the entire NetworkInfo object.  If the framework
> repeated the NetworkInfo object with different network groups in each
> NetworkInfo object, it wasn't clear how the overall policy would be applied.
>
> Furthermore, it wasn't clear how multiple IP addresses might or might not
> be assigned to multiple virtual network interfaces.  Would you get one
> interface with several IP addresses, or several interfaces with an IP
> address each?
>
> How do we fix it?
>
> We've collaborated with Ben Hindman, Connor Doyle, Kapil Arya, and Niklas
> Nielsen on the proposed solution. MESOS-3788<
> https://issues.apache.org/jira/browse/MESOS-3788> clarifies that each
> NetworkInfo object represents a single virtual network interface in the
> container.  In order to allow multiple IP addresses on a single interface,
> we've deprecated the original protocol and ip_address fields and moved to a
> repeated IPAddress sub-message.  If you want 3 IPs on an interface, repeat
> the IPAddress sub-message 3 times.
>
> The fields in the IPAddress sub-message are optional.  The mainline use
> will be to include 1 copy of the IPAddress message with no parameters,
> which means "give me a single IP address."  At present, since Mesos itself
> only supports IPv4, this will always be an IPv4 address, but in the future,
> if not specified, the network provider can assign whatever the local
> default is.
>

+1

Thanks for the writeup Spike!
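
The interface model Spike describes (one NetworkInfo per virtual interface, one repeated IPAddress entry per requested address) can be sketched in Python as follows. These dataclasses are illustrative stand-ins rather than the generated protobuf classes; the field names `ip_addresses` and `groups` and the two helper functions are assumptions made for this example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Protocol(Enum):
    IPV4 = 1
    IPV6 = 2


@dataclass
class IPAddress:
    """One requested address; both fields are optional, as in the proposal."""
    protocol: Optional[Protocol] = None
    ip_address: Optional[str] = None


@dataclass
class NetworkInfo:
    """One virtual network interface in the container (illustrative model)."""
    ip_addresses: List[IPAddress] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)


def single_ip_interface() -> NetworkInfo:
    """Mainline use: one empty IPAddress means 'give me a single IP address'."""
    return NetworkInfo(ip_addresses=[IPAddress()])


def multi_ip_interface(count: int) -> NetworkInfo:
    """Several addresses on ONE interface: repeat the IPAddress entry."""
    return NetworkInfo(ip_addresses=[IPAddress() for _ in range(count)])


def interfaces_and_addresses(infos: List[NetworkInfo]) -> List[int]:
    """Number of requested addresses per interface, in order."""
    return [len(info.ip_addresses) for info in infos]
```

In this model, `[multi_ip_interface(3)]` asks for one interface with three addresses, while `[single_ip_interface()] * 3` would ask for three interfaces with one address each, which is the ambiguity the change is meant to remove.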


>
> Please feel free to comment directly on the issue or reply to this message.
>
> Cheers,
> Spike Curtis  (Project Calico)
>


Notes from off-line sync on processes for Mesos road maps

2015-10-22 Thread Niklas Nielsen
Hi everyone,

Yesterday, a few of us met up to discuss processes for Mesos road maps: how
to capture core values of the project, how to prioritize features from
users running different workloads and how to delegate specifics of road
mapping to the working groups.

The meeting notes are available here:
https://docs.google.com/document/d/11LLDKQBZObduNUG7_bayBIU6VMKq2E8CvmaI0tt1U4M/edit#

Ben H will follow up with a write-up with a proposal for a vision statement
and I will create working groups for project strategy for those of you who
are interested in process and the high-level view and interactions between
working groups (similar to yesterday's meeting).

Let us know if you have any questions.

Cheers,
Niklas


Re: Contribution Request for Mesos

2015-10-14 Thread Niklas Nielsen
Hi Yong,

You should be on the list now.

Looking forward to your contributions!

Niklas

On 14 October 2015 at 12:29, Yong Tang  wrote:

> Hi
> After using Mesos and following Mesos project for quite some time, I would
> like to participate and contribute to the Mesos project. Could I be added
> to the list of Contributors?
> My JIRA username is: yongtang
> Thanks!
> Yong


[RESULT][VOTE] Release Apache Mesos 0.25.0 (rc3)

2015-10-12 Thread Niklas Nielsen
Hi all,


The vote for Mesos 0.25.0 (rc3) has passed with the
following votes.

+1 (Binding)
--
Joris Van Remoortere
Michael Park
Brenden Matthews

+1 (Non-binding)
--
Kapil Arya

There were no 0 or -1 votes.

Please find the release at:
https://dist.apache.org/repos/dist/release/mesos/0.25.0

It is recommended to use a mirror to download the release:
http://www.apache.org/dyn/closer.cgi

The CHANGELOG for the release is available at:
https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.25.0

The mesos-0.25.0.jar has been released to:
https://repository.apache.org

The website (http://mesos.apache.org) will be updated shortly to reflect
this release.

Thanks,
Mpark, Joris and Niklas


[VOTE] Release Apache Mesos 0.25.0 (rc3)

2015-10-09 Thread Niklas Nielsen
Hi all,

Following up with an RC with the build fix suggested by Kapil:

Please vote on releasing the following candidate as Apache Mesos 0.25.0.



0.25.0 includes the following:

 * [MESOS-1474] - Experimental support for maintenance primitives.
 * [MESOS-2600] - Added master endpoints /reserve and /unreserve for
   dynamic reservations.
 * [MESOS-2044] - Extended Module APIs to enable IP per container
   assignment, isolation and resolution.

** Bug fixes
  * [MESOS-2635] - Web UI Display Bug when starting lots of tasks with
    small cpu value.
  * [MESOS-2986] - Docker version output is not compatible with Mesos.
  * [MESOS-3046] - Stout's UUID re-seeds a new random generator during each
    call to UUID::random.
  * [MESOS-3051] - performance issues with port ranges comparison.
  * [MESOS-3052] - Allocator performance issue when using a large number of
    filters.
  * [MESOS-3136] - COMMAND health checks with Marathon 0.10.0 are broken.
  * [MESOS-3169] - FrameworkInfo should only be updated if the
    re-registration is valid.
  * [MESOS-3185] - Refactor Subprocess logic in linux/perf.cpp to use
    common subroutine.
  * [MESOS-3239] - Refactor master HTTP endpoints help messages such that
    they cannot be out of sync.
  * [MESOS-3245] - The comments of DRFSorter::dirty is not correct.
  * [MESOS-3254] - Cgroup CHECK fails test harness.
  * [MESOS-3258] - Remove Frameworkinfo capabilities on re-registration.
  * [MESOS-3261] - Move QoS plug-ins to a specified folder like
    resource_estimator.
  * [MESOS-3269] - The comments of Master::updateSlave() is not correct.
  * [MESOS-3282] - Web UI no longer shows Tasks information.
  * [MESOS-3344] - Add more comments for strings::internal::fmt.
  * [MESOS-3351] - duplicated slave id in master after master failover.
  * [MESOS-3387] - Refactor MesosContainerizer to accept namespace
    dynamically.
  * [MESOS-3408] - Labels field of FrameworkInfo should be added into v1
    mesos.proto.
  * [MESOS-3411] - ReservationEndpointsTest.AvailableResources appears to
    be faulty.
  * [MESOS-3423] - Perf event isolator stops performing sampling if a
    single timeout occurs.
  * [MESOS-3426] - process::collect and process::await do not perform
    discard propagation.
  * [MESOS-3430] -
    LinuxFilesystemIsolatorTest.ROOT_PersistentVolumeWithoutRootFilesystem
    fails on CentOS 7.1.
  * [MESOS-3450] - Update Mesos C++ Style Guide for namespace usage.
  * [MESOS-3451] - Failing tests after changes to
    Isolator/MesosContainerizer API.
  * [MESOS-3458] - Segfault when accepting or declining inverse offers.
  * [MESOS-3474] - ExamplesTest.{TestFramework, JavaFramework,
    PythonFramework} failed on CentOS 6.
  * [MESOS-3489] - Add support for exposing Accept/Decline responses for
    inverse offers.
  * [MESOS-3490] - Mesos UI fails to represent JSON entities.
  * [MESOS-3512] - Don't retry close() on EINTR.
  * [MESOS-3513] - Cgroups Test Filters aborts tests on Centos 6.6.
  * [MESOS-3519] - Fix file descriptor leakage / double close in the code
    base.
  * [MESOS-3538] -
    CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy test
    is flaky.
  * [MESOS-3575] - V1 API java/python protos are not generated.

** Improvements
  * [MESOS-2719] - Deprecating '.json' extension in master endpoints urls.
  * [MESOS-2757] - Add -> operator for Option, Try, Result,
    Future.
  * [MESOS-2875] - Add containerId to ResourceUsage to enable QoS
    controller to target a container.
  * [MESOS-2964] - libprocess io does not support peek().
  * [MESOS-2983] - Deprecating '.json' extension in slave endpoints url.
  * [MESOS-2984] - Deprecating '.json' extension in files endpoints url.
  * [MESOS-3037] - Add a SUPPRESS call to the scheduler.
  * [MESOS-3187] - Docker cli option support.
  * [MESOS-3304] - Remove remnants of LIBPROCESS_STATISTICS_WINDOW.
  * [MESOS-3312] - Factor out JSON to repeated protobuf conversion.
  * [MESOS-3340] - Command-line flags should take precedence over OS Env
    variables.
  * [MESOS-3347] - Remove dead code in src/linux/perf.cpp.
  * [MESOS-3377] - mesos docker container with container_name as ENV
    variable.
  * [MESOS-3457] - Add flag to disable hostname lookup.


The CHANGELOG for the release is available at:
https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.25.0-rc3

The candidate for Mesos 0.25.0 release is available at:
https://dist.apache.org/repos/dist/dev/mesos/0.25.0-rc3/mesos-0.25.0.tar.gz

The tag to be voted on is 0.25.0-rc3:
https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.25.0-rc3


The MD5 checksum of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/0.25.0-rc3/mesos-0.25.0.tar.gz.md5
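
For voters who want to check a downloaded tarball against the published MD5, a minimal Python sketch follows. It assumes you have already extracted the hexadecimal digest from the .md5 file by hand (the exact layout of that file is not shown in this thread), and the function names are made up for this example.

```python
import hashlib


def md5_hex(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path: str, published_hex: str) -> bool:
    """Compare a file's MD5 with a published digest.

    Whitespace and letter case in the published value are ignored, since
    checksum files are sometimes formatted in groups of hex digits.
    """
    normalized = "".join(published_hex.split()).lower()
    return md5_hex(path) == normalized
```

A checksum match only shows that the download is intact; the signature mentioned below is what ties the tarball to the release managers.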


The signature of the tarball can be found at:


Re: Proposal: moving Mesos website to project codebase

2015-10-09 Thread Niklas Nielsen
+1

On 9 October 2015 at 09:50, Yan Xu  wrote:

> +1 for making it easier for contributors to understand the website code and
> collaboratively maintain it!
>
> --
> Jiang Yan Xu  @xujyan 
>
> On Fri, Oct 9, 2015 at 5:21 PM, Paul Brett 
> wrote:
>
> > +1
> >
> > On Fri, Oct 9, 2015 at 8:59 AM, haosdent  wrote:
> >
> > > +1!
> > > On Oct 9, 2015 10:37 PM, "Kevin Sweeney"  wrote:
> > >
> > > > +1!
> > > >
> > > > On Fri, Oct 9, 2015 at 3:35 PM Marco Massenzio 
> > > > wrote:
> > > >
> > > > > +1
> > > > >
> > > > > Dave - great stuff!
> > > > >
> > > > > *Marco Massenzio*
> > > > >
> > > > > *Distributed Systems Engineer*
> > > > > http://codetrips.com
> > > > >
> > > > > On Fri, Oct 9, 2015 at 3:05 PM, Dave Lester 
> > > wrote:
> > > > >
> > > > > > As part of the #MesosCon Europe hackathon, my team has been
> making
> > > > > > improvements to the website. Among these changes, we'd like to
> > > propose
> > > > > > changing where the website source files live by moving them to
> the
> > > main
> > > > > > Mesos codebase. Our current progress / working branch of this is
> > > > > > available on GitHub:
> > https://github.com/fayusohenson/mesos/tree/site
> > > > > >
> > > > > > * What does this mean? *
> > > > > > We've added a /site directory to the Mesos codebase, which
> includes
> > > the
> > > > > > website source files. Today, these live in subversion. The rake
> > file
> > > > and
> > > > > > other parts of building the website all work in this new
> > environment,
> > > > > > plus a number of related fixes (image linking, etc).
> > > > > >
> > > > > > For committers that are familiar with the current model for
> pushing
> > > the
> > > > > > site live, this immediate change still requires us `svn commit`
> the
> > > > > > /publish directory for the website (static files that are
> > generated).
> > > > > >
> > > > > > * Why this change? *
> > > > > > 1. Today we do not have an easy process for the community to
> > > contribute
> > > > > > to the project website. By merging this with the Mesos codebase,
> it
> > > > will
> > > > > > be significantly easier to send a review or pull request.
> > > > > > 2. It'll be easier for committers to manage the website, and
> check
> > > that
> > > > > > documentation changes render on the website properly before
> > > committing.
> > > > > > Because it's difficult to do today, this is often not checked. :(
> > > > > > 3. It's a solid step toward an automated deployment of the
> website
> > in
> > > > > > the future: https://issues.apache.org/jira/browse/MESOS-1309
> > > > > >
> > > > > > * Who approves of this change? *
> > > > > > As the Mesos website maintainer, I feel good about this change
> and
> > > its
> > > > > > direction for the project. Before committing this change, I'd
> like
> > > > > > community support that including this in the main Mesos codebase
> > > makes
> > > > > > sense.
> > > > > >
> > > > > > Comments? Questions?
> > > > > >
> > > > > > Dave
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > @paul_b
> >
>


Re: [VOTE] Release Apache Mesos 0.25.0 (rc2)

2015-10-09 Thread Niklas Nielsen
Hi everyone,

Thanks everyone for testing the RC! Due to the -1, we will cherry pick
bfeb070a2aef52f445eb057076d344fd184eb461 and send out an RC3 today.

Stay tuned and thanks for the prompt feedback!

Cheers,
Mpark, Joris and Niklas
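
For readers unfamiliar with the failure quoted below: gtest's equality helper compares an `int` with a `long unsigned int` inside a template, and `-Werror=sign-compare` turns that warning into a build error. Here is a minimal sketch of the pattern. It assumes (this thread does not confirm it) that the fix amounts to making both operands unsigned; `equalLikeGtest` and `hasTwoItems` are hypothetical names, not code from the cherry-picked commit.

```cpp
#include <cstddef>
#include <vector>

// Simplified stand-in for gtest's CmpHelperEQ: a template that compares
// two values whose types are deduced independently.
template <typename T1, typename T2>
bool equalLikeGtest(const T1& expected, const T2& actual)
{
  return expected == actual;
}

bool hasTwoItems(const std::vector<int>& items)
{
  // equalLikeGtest(2, items.size()) would deduce T1 = int and
  // T2 = std::size_t, i.e. a signed/unsigned comparison.
  // An unsigned literal keeps both operands unsigned.
  return equalLikeGtest(2u, items.size());
}
```

In a real test the same idea would read `EXPECT_EQ(2u, container.size())`.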

On 8 October 2015 at 06:40, Bernd Mathiske <be...@mesosphere.io> wrote:

> I suppose this makes my vote +1 binding, assuming the cherry-picking
> happens.
>
> > On Oct 8, 2015, at 3:37 PM, Bernd Mathiske <be...@mesosphere.io> wrote:
> >
> > I have the exact same result as Greg.
> >
> >> On Oct 8, 2015, at 12:14 AM, Greg Mann <g...@mesosphere.io> wrote:
> >>
> >> Successfully built `sudo make distcheck` on CentOS 7.1 and Ubuntu 14.04
> >> with only expected test failures. On our Fedora 22 CI build, however,
> while
> >> the tests are building the following compile-time error is produced:
> >>
> >> [17:18:46][Step 4/6]   CXX
> >> tests/containerizer/mesos_tests-composing_containerizer_tests.o
> >>
> >> [17:18:48][Step 4/6] In file included from
> >> ../../../src/tests/values_tests.cpp:22:0:
> >>
> >> [17:18:48][Step 4/6]
> >>
> ../3rdparty/libprocess/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h: In
> >> instantiation of ‘testing::AssertionResult
> >> testing::internal::CmpHelperEQ(const char*, const char*, const T1&,
> const
> >> T2&) [with T1 = int; T2 = long unsigned int]’:
> >>
> >> [17:18:48][Step 4/6]
> >>
> ../3rdparty/libprocess/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h:1484:23:
> >> required from ‘static testing::AssertionResult
> >> testing::internal::EqHelper::Compare(const char*,
> >> const char*, const T1&, const T2&) [with T1 = int; T2 = long unsigned
> int;
> >> bool lhs_is_null_literal = false]’
> >>
> >> [17:18:48][Step 4/6] ../../../src/tests/values_tests.cpp:149:3:
>  required
> >> from here
> >>
> >> [17:18:48][Step 4/6]
> >>
> ../3rdparty/libprocess/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h:1448:16:
> >> error: comparison between signed and unsigned integer expressions
> >> [-Werror=sign-compare]
> >>
> >> [17:18:48][Step 4/6]    if (expected == actual) {
> >>
> >> [17:18:48][Step 4/6] ^
> >>
> >>
> >> Cherry-picking one commit (bfeb070a2aef52f445e "Fixed compiler warning
> in
> >> values test.") resolves this issue.
> >>
> >>
> >>
> >> On Wed, Oct 7, 2015 at 2:32 AM, Joris Van Remoortere <
> jo...@mesosphere.io>
> >> wrote:
> >>
> >>> +1 (binding)
> >>>
> >>> On Mon, Oct 5, 2015 at 11:12 PM, Niklas Nielsen <nik...@mesosphere.io>
> >>> wrote:
> >>>
> >>>> Hi all,
> >>>>
> >>>> Please vote on releasing the following candidate as Apache Mesos
> 0.25.0.
> >>>>
> >>>>
> >>>>
> >>>> 0.25.0 includes the following:
> >>>>
> >>>>
> >>>>
> 
> >>>>
> >>>> * [MESOS-1474] - Experimental support for maintenance primitives.
> >>>>
> >>>> * [MESOS-2600] - Added master endpoints /reserve and /unreserve for
> >>>> dynamic reservations.
> >>>>
> >>>> * [MESOS-2044] - Extended Module APIs to enable IP per container
> >>>> assignment, isolation and resolution.
> >>>>
> >>>>
> >>>> ** Bug fixes
> >>>>
> >>>> * [MESOS-2635] - Web UI Display Bug when starting lots of tasks with
> >>>> small cpu value.
> >>>>
> >>>> * [MESOS-2986] - Docker version output is not compatible with Mesos.
> >>>>
> >>>> * [MESOS-3046] - Stout's UUID re-seeds a new random generator during
> >>>> each call to UUID::random.
> >>>>
> >>>> * [MESOS-3051] - performance issues with port ranges comparison.
> >>>>
> >>>> * [MESOS-3052] - Allocator performance issue when using a large number
> >>>> of filters.
> >>>>
> >>>> * [MESOS-3136] - COMMAND health checks with Marathon 0.10.0 are
> broken.
> >>>>
> >>>> * [MESOS-3169] - FrameworkInfo should only be updated if the
> >>>> re-registration is valid.
> >>>>
> >>

[VOTE] Release Apache Mesos 0.25.0 (rc2)

2015-10-06 Thread Niklas Nielsen
Hi all,

Please vote on releasing the following candidate as Apache Mesos 0.25.0.



0.25.0 includes the following:



 * [MESOS-1474] - Experimental support for maintenance primitives.

 * [MESOS-2600] - Added master endpoints /reserve and /unreserve for
dynamic reservations.

 * [MESOS-2044] - Extended Module APIs to enable IP per container
assignment, isolation and resolution.


** Bug fixes

  * [MESOS-2635] - Web UI Display Bug when starting lots of tasks with
small cpu value.

  * [MESOS-2986] - Docker version output is not compatible with Mesos.

  * [MESOS-3046] - Stout's UUID re-seeds a new random generator during each
call to UUID::random.

  * [MESOS-3051] - performance issues with port ranges comparison.

  * [MESOS-3052] - Allocator performance issue when using a large number of
filters.

  * [MESOS-3136] - COMMAND health checks with Marathon 0.10.0 are broken.

  * [MESOS-3169] - FrameworkInfo should only be updated if the
re-registration is valid.

  * [MESOS-3185] - Refactor Subprocess logic in linux/perf.cpp to use
common subroutine.

  * [MESOS-3239] - Refactor master HTTP endpoints help messages such that
they cannot be out of sync.

  * [MESOS-3245] - The comments of DRFSorter::dirty is not correct.

  * [MESOS-3254] - Cgroup CHECK fails test harness.

  * [MESOS-3258] - Remove Frameworkinfo capabilities on re-registration.

  * [MESOS-3261] - Move QoS plug-ins to a specified folder like
resource_estimator.

  * [MESOS-3269] - The comments of Master::updateSlave() is not correct.

  * [MESOS-3282] - Web UI no longer shows Tasks information.

  * [MESOS-3344] - Add more comments for strings::internal::fmt.

  * [MESOS-3351] - duplicated slave id in master after master failover.

  * [MESOS-3387] - Refactor MesosContainerizer to accept namespace
dynamically.

  * [MESOS-3408] - Labels field of FrameworkInfo should be added into v1
mesos.proto.

  * [MESOS-3411] - ReservationEndpointsTest.AvailableResources appears to
be faulty.

  * [MESOS-3423] - Perf event isolator stops performing sampling if a
single timeout occurs.

  * [MESOS-3426] - process::collect and process::await do not perform
discard propagation.

  * [MESOS-3430] -
LinuxFilesystemIsolatorTest.ROOT_PersistentVolumeWithoutRootFilesystem
fails on CentOS 7.1.

  * [MESOS-3450] - Update Mesos C++ Style Guide for namespace usage.

  * [MESOS-3451] - Failing tests after changes to
Isolator/MesosContainerizer API.

  * [MESOS-3458] - Segfault when accepting or declining inverse offers.

  * [MESOS-3474] - ExamplesTest.{TestFramework, JavaFramework,
PythonFramework} failed on CentOS 6.

  * [MESOS-3489] - Add support for exposing Accept/Decline responses for
inverse offers.

  * [MESOS-3490] - Mesos UI fails to represent JSON entities.

  * [MESOS-3512] - Don't retry close() on EINTR.

  * [MESOS-3513] - Cgroups Test Filters aborts tests on Centos 6.6.

  * [MESOS-3519] - Fix file descriptor leakage / double close in the code
base.

  * [MESOS-3538] -
CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_MountUnmountHierarchy test
is flaky.

  * [MESOS-3575] - V1 API java/python protos are not generated.


** Improvements

  * [MESOS-2719] - Deprecating '.json' extension in master endpoints urls.

  * [MESOS-2757] - Add -> operator for Option, Try, Result,
Future.

  * [MESOS-2875] - Add containerId to ResourceUsage to enable QoS
controller to target a container.

  * [MESOS-2964] - libprocess io does not support peek().

  * [MESOS-2983] - Deprecating '.json' extension in slave endpoints url.

  * [MESOS-2984] - Deprecating '.json' extension in files endpoints url.

  * [MESOS-3037] - Add a SUPPRESS call to the scheduler.

  * [MESOS-3187] - Docker cli option support.

  * [MESOS-3304] - Remove remnants of LIBPROCESS_STATISTICS_WINDOW.

  * [MESOS-3312] - Factor out JSON to repeated protobuf conversion.

  * [MESOS-3340] - Command-line flags should take precedence over OS Env
variables.

  * [MESOS-3347] - Remove dead code in src/linux/perf.cpp.

  * [MESOS-3377] - mesos docker container with container_name as ENV
variable.

  * [MESOS-3457] - Add flag to disable hostname lookup.


The full CHANGELOG for the release is available at:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.25.0-rc2




The candidate for Mesos 0.25.0 release is available at:

https://dist.apache.org/repos/dist/dev/mesos/0.25.0-rc2/mesos-0.25.0.tar.gz


The tag to be voted on is 0.25.0-rc2:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.25.0-rc2


The MD5 checksum of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/0.25.0-rc2/mesos-0.25.0.tar.gz.md5


The signature of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/0.25.0-rc2/mesos-0.25.0.tar.gz.asc


The PGP key used to sign the release is here:


Re: Patch for the website's Rakefile

2015-10-05 Thread Niklas Nielsen
I can help out; Where can I find the patch?

Niklas

On 4 October 2015 at 12:31, Dave Lester wrote:

> If anyone is attending MesosCon Europe this upcoming week and would like
> to make enhancements here I should be available and could pair. There's
> lots of low-hanging fruit to make large improvements.
>
> Dave
>
> On Fri, Oct 2, 2015, at 11:00 PM, haosdent wrote:
> > I remember we also have other patches that fix the erroneous link
> > generation in the Rakefile. The current way of generating the website
> > also seems a bit hard to maintain.
> > On Oct 3, 2015 10:35 AM, "Jake Farrell" wrote:
> >
> > > I am currently not a Mesos committer so cannot help shepherd this through,
> > > but I did help write the website code and have tested and verified this
> > > patch does take care of the problem outlined. Steps to help any committer
> > > not familiar with the website code are listed below. Ping me if you have
> > > questions or need help.
> > >
> > > -Jake
> > >
> > >
> > > svn co https://svn.apache.org/repos/asf/mesos/site
> > > patch -p0 < rake.patch
> > > bundle install
> > > rake (rake dev if you want to test the site locally)
> > > svn add publish/documentation/latest/images \
> > >   source/documentation/latest/images
> > > svn commit ...
> > >
> > >
> > >
> > > On Fri, Oct 2, 2015 at 12:29 PM, Dave Lester wrote:
> > >
> > > > Unfortunately I don't have time the next few weeks to assist -- but
> all
> > > > committers do have access to the site and should be able to review /
> > > > merge this.
> > > >
> > > > Best,
> > > > Dave
> > > >
> > > > On Fri, Oct 2, 2015, at 09:23 AM, Joseph Wu wrote:
> > > > > Dave,
> > > > >
> > > > > Would it be possible for you to take a look at the patches in
> > > MESOS-3183
> > > > > ?
> > > > >
> > > > > Ideally, we should fix the documentation before 0.25 goes out.
> > > > >
> > > > > Thanks,
> > > > > ~Joseph
> > > > >
> > > > > > On Mon, Sep 28, 2015 at 1:59 PM, Joseph Wu wrote:
> > > > >
> > > > > > + Dev
> > > > > >
> > > > > >
> > > > > > On Mon, Sep 28, 2015 at 1:56 PM, Dave Lester <
> dles...@twitter.com>
> > > > wrote:
> > > > > >
> > > > > >> Can this be discussed on the mailing list? Thanks
> > > > > >>
> > > > > >>
> > > > > > >> On Monday, September 28, 2015, Joseph Wu wrote:
> > > > > >>
> > > > > >>> + Niq, Joris, MPark (so that this doesn't get neglected when
> the
> > > > website
> > > > > >>> is updated for the 0.25 release).
> > > > > >>>
> > > > > >>> Both patches (for the website and for the docs) will be tracked
> > > here:
> > > > > >>> https://issues.apache.org/jira/browse/MESOS-3183
> > > > > >>>
> > > > > >>> Feedback/reviews would be appreciated,
> > > > > >>> ~Joseph
> > > > > >>>
> > > > > >>> On Tue, Sep 22, 2015 at 11:48 AM, Adam Bordelon <
> > > a...@mesosphere.io>
> > > > > >>> wrote:
> > > > > >>>
> > > > >  Unfortunately, we don't have RB synced up to our svn repo.
> > > > Submitting
> > > > >  raw patches has been best practice so far (AFAIK)
> > > > > 
> > > > >  On Tue, Sep 22, 2015 at 11:35 AM, Joseph Wu <
> jos...@mesosphere.io
> > > >
> > > > >  wrote:
> > > > > 
> > > > > > Looks like this has been an issue since the end of July (!).
> > > > > > https://issues.apache.org/jira/browse/MESOS-3183
> > > > > >
> > > > > > Is there a reviewboard equivalent for modifications to the
> > > website
> > > > > > like this patch?
> > > > > > ~Joseph
> > > > > >
> > > > > > On Mon, Sep 21, 2015 at 2:56 PM, Vinod Kone <
> vi...@twitter.com>
> > > > wrote:
> > > > > >
> > > > > >> +dave lester who previously looked into the image loading
> issue.
> > > > > >>
> > > > > >>
> > > > > >> @vinodkone
> > > > > >>
> > > > > >> On Mon, Sep 21, 2015 at 2:50 PM, Joseph Wu <
> > > jos...@mesosphere.io>
> > > > > >> wrote:
> > > > > >>
> > > > > >>> Hi Adam/Vinod,
> > > > > >>>
> > > > > >>> Documentation images aren't being published to the
> website.  A
> > > > few
> > > > > >>> places with image(s):
> > > > > >>> *
> > > > > >>>
> > > > http://mesos.apache.org/documentation/latest/external-containerizer/
> > > > > >>> * Or
> > > > http://mesos.apache.org/documentation/latest/oversubscription/
> > > > > >>> * Or
> http://mesos.apache.org/documentation/latest/maintenance/
> > > > > >>> <-* ;(*
> > > > > >>>
> > > > > >>> I've attached a patch which sort-of fixes this.  It only
> > > sort-of
> > > > > >>> works because the images are copied to the website, *BUT*
> for
> > > > them
> > > > > >>> to show up, you need to remove the trailing slash "/" from
> the
> > > > URL.
> > > > > >>>
> > > > > >>> I'll submit a separate patch on RB for changing all image
> URLs
> > > > from
> > > > > >>> "images/*" to "/documentation/latest/images/*", so that

Re: [VOTE] Release Apache Mesos 0.21.2 (rc1)

2015-10-02 Thread Niklas Nielsen
+1 (binding)

On 1 October 2015 at 20:27, Michael Park wrote:

> +1 (binding)
>
> *make distcheck* passed with the Jenkins Build script with
>
> *OS=ubuntu:15.04 COMPILER=gcc CONFIGURATION="--enable-optimize"
> ./support/jenkins_build.sh*
>
> On Fri, Sep 25, 2015 at 5:29 PM Vinod Kone wrote:
>
>> +1 (binding)
>>
>> Tested on CI for CentOS5/6.
>>
>>
>> On Thu, Sep 24, 2015 at 6:12 PM, Adam Bordelon wrote:
>>
>>> +1 (binding) Tested on CI for CentOS7 and Ubuntu 14.04.
>>>
>>> On Thu, Sep 24, 2015 at 5:44 PM, Adam Bordelon wrote:
>>>
 Hi friends,

 Here's a candidate for the last of the docker patch releases
 (0.21.x-0.24.x).
 Please vote on releasing the following candidate as Apache Mesos 0.21.2.

 0.21.2 is a bug fix release and includes the following:

 
 * [MESOS-2986] - Docker version output is not compatible with Mesos

 The CHANGELOG for the release is available at:

 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.21.2-rc1

 

 The candidate for Mesos 0.21.2 release is available at:

 https://dist.apache.org/repos/dist/dev/mesos/0.21.2-rc1/mesos-0.21.2.tar.gz

 The tag to be voted on is 0.21.2-rc1:

 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.21.2-rc1

 The MD5 checksum of the tarball can be found at:

 https://dist.apache.org/repos/dist/dev/mesos/0.21.2-rc1/mesos-0.21.2.tar.gz.md5

 The signature of the tarball can be found at:

 https://dist.apache.org/repos/dist/dev/mesos/0.21.2-rc1/mesos-0.21.2.tar.gz.asc

 The PGP key used to sign the release is here:
 https://dist.apache.org/repos/dist/release/mesos/KEYS

 The JAR is up in Maven in a staging repository here:
 https://repository.apache.org/content/repositories/orgapachemesos-1074

 Please vote on releasing this package as Apache Mesos 0.21.2!

 The vote is open until Tue Sep 29 18:00 PDT 2015 and passes if a
 majority of at least 3 +1 PMC votes are cast.

 [ ] +1 I tested this package. Release this package as Apache Mesos
 0.21.2
 [ ] -1 Do not release this package because ...

 Thanks,
 -Adam-

>>>
>>>
>>


Re: Problems with deprecation cycles for critical/hard to adapt dependencies

2015-09-30 Thread Niklas Nielsen
@vinod, ben, jie - Any thoughts on this?

I am in favor of time-based deprecation as well and can come up with a
proposal, provided there are no objections.

Niklas

On 28 September 2015 at 21:09, James DeFelice <james.defel...@gmail.com>
wrote:

> +1 for time-based deprecation cycle of O(months)
>
> On Mon, Sep 28, 2015 at 6:16 PM, Zameer Manji <zma...@apache.org> wrote:
>
> > Niklas,
> >
> > Thanks for starting this thread. I think Mesos can best move forward if it
> > switches from a release-based deprecation cycle to a time-based deprecation
> > cycle. This means that APIs would be deprecated after a time period (i.e. 4
> > months) instead of at a specific release. This is the model that Google's
> > Guava library uses and I think it works really well. It ensures that the
> > ecosystem and community have sufficient time to react to deprecations while
> > still allowing them to move forward at a reasonable pace.
> >
> > On Mon, Sep 28, 2015 at 2:19 PM, Niklas Nielsen <nik...@mesosphere.io>
> > wrote:
> >
> > > Hi everyone,
> > >
> > > With a (targeted) release cadence of *one month*, we should revisit our
> > > deprecation cycles of 3 releases (e.g. in version N, we warn. In
> version
> > > N+1, support both old and new API. In Version N+2, we break
> > compatibility).
> > > Sometimes we cannot do the first step, and we deprecate in version N+1
> > and
> > > thus in 2 releases. With the new cadence, that is no longer around two
> > > quarters but two months which is too short for 3rd party tooling to
> > adapt.
> > >
> > > Even though our release cycles have been longer than one month in the
> > past,
> > > we are running into issues with deprecation due to lack of outreach
> (i.e.
> > > our communication to framework and 3rd party tooling communities) or
> > > because we are simply unaware of the internal dependencies they have on
> > > Mesos.
> > >
> > > We/I became aware of this, when we saw a planned deprecation of
> > /state.json
> > > in 0.26.0 (0.25.0 supports both). I suspect that _a lot_ of tools will
> > > break because of this. This, on top of the problems we have run into
> > > recently with the Zookeeper master info change from binary protobuf to
> > > json.
> > >
> > > Even though we document this in our upgrades.md, the visibility/knowledge
> > > of this document seems too low and we probably need to do more.
> > >
> > > Do you guys have thoughts/ideas on how we can address this?
> > >
> > > Cheers,
> > > Niklas
> > >
> > > --
> > > Zameer Manji
> > >
> > >
> >
>
>
>
> --
> James DeFelice
> 585.241.9488 (voice)
> 650.649.6071 (fax)
>


Re: Request : Add to contributor list

2015-09-29 Thread Niklas Nielsen
You should be on the list now :)

Niklas

On 28 September 2015 at 20:04, Mandeep Chadha <
mandeep_cha...@yahoo.com.invalid> wrote:

> Thanks in advance !
> Best,Mandeep


Problems with deprecation cycles for critical/hard to adapt dependencies

2015-09-28 Thread Niklas Nielsen
Hi everyone,

With a (targeted) release cadence of *one month*, we should revisit our
deprecation cycles of 3 releases (e.g. in version N, we warn. In version
N+1, support both old and new API. In Version N+2, we break compatibility).
Sometimes we cannot do the first step, and we deprecate in version N+1 and
thus in 2 releases. With the new cadence, that is no longer around two
quarters but two months which is too short for 3rd party tooling to adapt.

Even though our release cycles have been longer than one month in the past,
we are running into issues with deprecation due to lack of outreach (i.e.
our communication to framework and 3rd party tooling communities) or
because we are simply unaware of the internal dependencies they have on
Mesos.

We/I became aware of this, when we saw a planned deprecation of /state.json
in 0.26.0 (0.25.0 supports both). I suspect that _a lot_ of tools will
break because of this. This, on top of the problems we have run into
recently with the Zookeeper master info change from binary protobuf to json.

Even though we document this in our upgrades.md, the visibility/knowledge of
this document seems too low and we probably need to do more.

Do you guys have thoughts/ideas on how we can address this?

Cheers,
Niklas
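
To make the arithmetic in this thread concrete, here is a minimal sketch comparing the two models. It assumes a quarterly cadence of 3 months for the "two quarters" case and uses the 4-month notice period floated as an example in the replies; both helper names are hypothetical.

```cpp
// Release-based model: notice is a fixed number of releases, so the
// calendar time it buys shrinks as the cadence gets faster.
constexpr int releaseBasedNoticeMonths(int releases, int cadenceMonths)
{
  return releases * cadenceMonths;
}

// Time-based model: notice is a fixed number of months, so the number of
// releases that must ship before removal grows as the cadence gets faster.
constexpr int releasesBeforeRemoval(int noticeMonths, int cadenceMonths)
{
  // Integer ceiling of noticeMonths / cadenceMonths.
  return (noticeMonths + cadenceMonths - 1) / cadenceMonths;
}

// Two releases of notice: two quarters at a quarterly cadence,
// but only two months at a monthly cadence.
static_assert(releaseBasedNoticeMonths(2, 3) == 6, "quarterly cadence");
static_assert(releaseBasedNoticeMonths(2, 1) == 2, "monthly cadence");
```

Under a 4-month time-based policy and a monthly cadence, `releasesBeforeRemoval(4, 1)` gives four releases of overlap instead of two.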


Testing Mesos 0.25.0 RCs

2015-09-25 Thread Niklas Nielsen
Hi everyone,

Mesos 0.25.0 rc1 should now be ready for testing! [1]

While we already know of documentation additions we want in RC2, this is
our chance to test and bake the release bits.

*Please keep us posted with any failures, crashes, flaky tests, unintuitive
upgrade paths or deprecations we didn't catch in upgrades.md.*

If you have patches you need to get into this release, fill out lines in
[2] and we will cherry pick those in the upcoming RCs.

We will be reaching out to as many framework communities as we can and have
them test as well.

Cheers,
Mpark, Joris and Niklas

[1]
https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=shortlog;h=refs/tags/0.25.0-rc1
[2]
https://docs.google.com/spreadsheets/d/1EimZy4mzwbc_RIzzo05xr7vR6EjxPEgEIW0UwCN0x5w/edit#gid=0


Re: Testing Mesos 0.25.0 RCs

2015-09-25 Thread Niklas Nielsen
And here are the release bits:

The candidate for Mesos 0.25.0 release is available at:

https://dist.apache.org/repos/dist/dev/mesos/0.25.0-rc1/mesos-0.25.0.tar.gz


The tag to be voted on is 0.25.0-rc1:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.25.0-rc1


The MD5 checksum of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/0.25.0-rc1/mesos-0.25.0.tar.gz.md5


The signature of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/0.25.0-rc1/mesos-0.25.0.tar.gz.asc


The PGP key used to sign the release is here:

https://dist.apache.org/repos/dist/release/mesos/KEYS


The JAR is up in Maven in a staging repository here:

https://repository.apache.org/content/repositories/orgapachemesos-1076

On 25 September 2015 at 11:14, Niklas Nielsen <nik...@mesosphere.io> wrote:

> Hi everyone,
>
> Mesos 0.25.0 rc1 should now be ready for testing! [1]
>
> While we already know of documentation additions we want in RC2, this is
> our chance to test and bake the release bits.
>
> *Please keep us posted with any failures, crashes, flaky tests, unintuitive
> upgrade paths or deprecations we didn't catch in upgrades.md.*
>
> If you have patches you need to get into this release, fill out lines in
> [2] and we will cherry pick those in the upcoming RCs.
>
> We will be reaching out to as many framework communities as we can and
> have them test as well.
>
> Cheers,
> Mpark, Joris and Niklas
>
> [1]
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=shortlog;h=refs/tags/0.25.0-rc1
> [2]
> https://docs.google.com/spreadsheets/d/1EimZy4mzwbc_RIzzo05xr7vR6EjxPEgEIW0UwCN0x5w/edit#gid=0
>


Re: Mesos 0.25.0

2015-09-25 Thread Niklas Nielsen
Following up;

We tagged 0.25.0 yesterday and will push to ASF repos asap.
If you already know about regressions (and have patches ready); please add
entries in
https://docs.google.com/spreadsheets/d/1EimZy4mzwbc_RIzzo05xr7vR6EjxPEgEIW0UwCN0x5w/edit#gid=0
so we can cherry pick them in the upcoming RCs (again, only regressions,
test fixes and documentation at this point).

Thanks!
Mpark, Joris and Niklas



On 24 September 2015 at 17:03, Niklas Nielsen <nik...@mesosphere.io> wrote:

> Hi everyone!
>
> We promised to have 0.25.0 RC tagged yesterday and it slipped into this
> morning; we are currently working on
> https://issues.apache.org/jira/browse/MESOS-3510 to make sure the API
> versioning is in place before tagging.
>
> We will base the RC off HEAD as of today at 5:30PM and cherry pick the 5
> patches for MESOS-3510 on top.
>
> Cheers,
> Joris, MPark and Niklas
>
>
> On 17 September 2015 at 14:05, Joris Van Remoortere <jo...@mesosphere.io>
> wrote:
>
>> As mentioned by Vinod in the Community Developer Meeting, we will be
>> having
>> a 0.25.0 rc1 triage meeting Friday September 18th at 10:30am Pacific Time.
>> If you are interested in attending or are heavily engaged in the remaining
>> tickets please reach out to me if you would like to join the meeting. I
>> will try my best to accommodate everyone in the hangout.
>> Regardless, your shepherd will reach out to you if you have patches
>> outstanding targeted for 0.25.0.
>>
>> The meeting will move quickly through the remaining issues, their
>> shepherds, and what is critical.
>>
>> Joris
>>
>> On Wed, Sep 16, 2015 at 7:56 PM, Michael Park <mcyp...@gmail.com> wrote:
>>
>> > We have created the new target version for 0.26.0 and have cut out what
>> > we're planning to land for 0.25.0 on the 0.25.0 dashboard.
>> >
>> > Please set your target version to 0.26.0 unless it's a critical issue
>> that
>> > needs to get into 0.25.0!
>> >
>> > Thanks,
>> >
>> > Joris, MPark, and Niklas
>> >
>> > On Mon, Sep 14, 2015, 9:23 AM Joris Van Remoortere <jo...@mesosphere.io
>> >
>> > wrote:
>> >
>> > > Following in the footsteps of the dashboard for the 0.23.0 release,
>> > > here is the Mesos 0.25.0 Release Dashboard
>> > > <
>> > >
>> >
>> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12326859
>> > > >
>> > >
>> > > This dashboard was very successful at reducing the time required to
>> > triage
>> > > the first release candidate. Let's repeat that same success!
>> > >
>> > > A special thanks to Adam for providing us with this great template.
>> > >
>> > > Joris, Mpark, and Niklas
>> > >
>> >
>>
>
>


Re: [VOTE] Release Apache Mesos 0.23.1 (rc1)

2015-09-24 Thread Niklas Nielsen
+1 (binding)

Tested on CentOS 7 and Ubuntu 14.04

On 24 September 2015 at 07:53, Alexander Rojas wrote:

> +1 (non binding)
>
> Tested Ubuntu 14.04, OSX
>
> > On 22 Sep 2015, at 03:06, Adam Bordelon wrote:
> >
> > Hi friends,
> >
> > Please vote on releasing the following candidate as Apache Mesos 0.23.1.
> >
> > 0.23.1 is a bug fix release and includes the following:
> >
> 
> > * [MESOS-2986] - Docker version output is not compatible with Mesos
> > * [MESOS-3136] - COMMAND health checks with Marathon 0.10.0 are broken
> >
> > The CHANGELOG for the release is available at:
> >
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.23.1-rc1
> >
> 
> >
> > The candidate for Mesos 0.23.1 release is available at:
> >
> https://dist.apache.org/repos/dist/dev/mesos/0.23.1-rc1/mesos-0.23.1.tar.gz
> >
> > The tag to be voted on is 0.23.1-rc1:
> >
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.23.1-rc1
> >
> > The MD5 checksum of the tarball can be found at:
> >
> https://dist.apache.org/repos/dist/dev/mesos/0.23.1-rc1/mesos-0.23.1.tar.gz.md5
> >
> > The signature of the tarball can be found at:
> >
> https://dist.apache.org/repos/dist/dev/mesos/0.23.1-rc1/mesos-0.23.1.tar.gz.asc
> >
> > The PGP key used to sign the release is here:
> > https://dist.apache.org/repos/dist/release/mesos/KEYS
> >
> > The JAR is up in Maven in a staging repository here:
> > https://repository.apache.org/content/repositories/orgapachemesos-1070
> >
> > Please vote on releasing this package as Apache Mesos 0.23.1!
> >
> > The vote is open until Thu Sep 24 18:00 PDT 2015 and passes if a
> majority of at least 3 +1 PMC votes are cast.
> >
> > [ ] +1 I tested this package. Release it as Apache Mesos 0.23.1
> > [ ] -1 Do not release this package because ...
> >
> > Thanks,
> > -Adam-
>
>


Re: [2/2] mesos git commit: Added masterSlaveLostHook.

2015-09-23 Thread Niklas Nielsen
On 23 September 2015 at 15:47, Benjamin Mahler <benjamin.mah...@gmail.com>
wrote:

> +dev
>
> Just so the rest of the dev community understands, are these kinds of event
> based hooks going to be subsumed by a mechanism to stream out cluster
> events? Or will these hooks co-exist alongside cluster events?
>

Could definitely be. All modules have to be (re)built for new Mesos
releases; if they can listen to a SlaveLost event, then we can obsolete the
explicit hooks.

Niklas
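
As an illustration of the use case named in the commit message quoted below (networking modules that must not leak IPs), here is a self-contained sketch of a module overriding the new hook. `Nothing`, `SlaveInfo` and `Hook` are simplified stand-ins so the example compiles on its own; real modules would use the types from `include/mesos/hook.hpp`, where the hook returns `Try<Nothing>`. `IpReleasingHook` and its bookkeeping are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Simplified stand-ins for the Mesos types (assumptions, see above).
struct Nothing {};

struct SlaveInfo
{
  std::string id;
};

class Hook
{
public:
  virtual ~Hook() {}

  // Mirrors the default added by the patch: do nothing.
  virtual Nothing masterSlaveLostHook(const SlaveInfo&)
  {
    return Nothing();
  }
};

// Hypothetical networking module that frees a lost agent's IP.
class IpReleasingHook : public Hook
{
public:
  void assign(const std::string& agentId, const std::string& ip)
  {
    assigned[agentId] = ip;
  }

  std::size_t assignedCount() const { return assigned.size(); }

  Nothing masterSlaveLostHook(const SlaveInfo& slaveInfo) override
  {
    assigned.erase(slaveInfo.id);  // No-op if the agent had no IP.
    return Nothing();
  }

private:
  std::map<std::string, std::string> assigned;
};
```

If cluster events later become available to modules, the same cleanup could presumably move into an event handler, which is the trade-off discussed in this thread.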

>
> On Wed, Sep 23, 2015 at 3:38 PM, <nniel...@apache.org> wrote:
>
> > Added masterSlaveLostHook.
> >
> > This patch adds a new masterSlaveLost hook to enable modules to clean up
> > after lost slaves events (as in networking modules, where we want to
> > avoid leaking IPs).
> >
> > Review: https://reviews.apache.org/r/38575
> >
> >
> > Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
> > Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/1d86932c
> > Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/1d86932c
> > Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/1d86932c
> >
> > Branch: refs/heads/master
> > Commit: 1d86932ce424f3e7b921cc6a436051a45f704dd0
> > Parents: f6706e8
> > Author: Niklas Nielsen <n...@qni.dk>
> > Authored: Wed Sep 23 15:37:21 2015 -0700
> > Committer: Niklas Q. Nielsen <nik...@mesosphere.io>
> > Committed: Wed Sep 23 15:37:24 2015 -0700
> >
> > --
> >  include/mesos/hook.hpp| 10 +++
> >  src/examples/test_hook_module.cpp | 26 +
> >  src/hook/manager.cpp  | 12 
> >  src/hook/manager.hpp  |  2 ++
> >  src/master/master.cpp |  5 
> >  src/tests/hook_tests.cpp  | 53
> ++
> >  6 files changed, 108 insertions(+)
> > --
> >
> >
> >
> >
> http://git-wip-us.apache.org/repos/asf/mesos/blob/1d86932c/include/mesos/hook.hpp
> > --
> > diff --git a/include/mesos/hook.hpp b/include/mesos/hook.hpp
> > index 2353602..2fe060e 100644
> > --- a/include/mesos/hook.hpp
> > +++ b/include/mesos/hook.hpp
> > @@ -62,6 +62,16 @@ public:
> >  return None();
> >}
> >
> > +
> > +  // This hook is called when an Agent is removed i.e. deemed lost by
> the
> > +  // master. The hook is invoked after all frameworks have been informed
> > about
> > +  // the loss.
> > +  virtual Try<Nothing> masterSlaveLostHook(const SlaveInfo& slaveInfo)
> > +  {
> > +return Nothing();
> > +  }
> > +
> > +
> >// This environment decorator hook is called from within slave when
> >// launching a new executor. A module implementing the hook creates
> >// and returns a set of environment variables. These environment
> >
> >
> >
> http://git-wip-us.apache.org/repos/asf/mesos/blob/1d86932c/src/examples/test_hook_module.cpp
> > --
> > diff --git a/src/examples/test_hook_module.cpp
> > b/src/examples/test_hook_module.cpp
> > index 5c4d71a..c09d7dd 100644
> > --- a/src/examples/test_hook_module.cpp
> > +++ b/src/examples/test_hook_module.cpp
> > @@ -108,6 +108,32 @@ public:
> >  return labels;
> >}
> >
> > +  virtual Try<Nothing> masterSlaveLostHook(const SlaveInfo& slaveInfo)
> > +  {
> > +LOG(INFO) << "Executing 'masterSlaveLostHook' in slave '"
> > +  << slaveInfo.id() << "'";
> > +
> > +// TODO(nnielsen): Add argument to signal(), so we can filter
> > messages from
> > +// the `masterSlaveLostHook` from `slaveRemoveExecutorHook`.
> > +// NOTE: Will not be a problem **as long as** the test doesn't start
> > any
> > +// tasks.
> > +HookProcess hookProcess;
> > +process::spawn(&hookProcess);
> > +Future<Nothing> future =
> > +  process::dispatch(hookProcess, &HookProcess::await);
> > +
> > +process::dispatch(hookProcess, &HookProcess::signal);
> > +
> > +// Make sure we don't terminate the process before the message
> > self-send has
> > +// completed.
> > +future.await();
> > +
> > +process::terminate(hookProcess);
> > +process::wait(hookProcess);
> > +
> > +return Nothing();
> > +  }
> > +
> >// T

Re: [2/2] mesos git commit: Added masterSlaveLostHook.

2015-09-23 Thread Niklas Nielsen
It applies to all hooks, so I don't think a TODO fits well around
masterSlaveLostHook() as a one-off. Do you want it for each hook or in the
module docs?
As long as the plans around eventing and subscribing to events from modules
haven't been solidified more, I'd suggest we postpone adding it as it
isn't an alternative (yet).

Niklas

On 23 September 2015 at 16:29, Benjamin Mahler <bmah...@twitter.com.invalid>
wrote:

> Ok, any plan to add TODOs with this context?
>
> On Wed, Sep 23, 2015 at 4:11 PM, Niklas Nielsen <nik...@mesosphere.io>
> wrote:
>
> > On 23 September 2015 at 15:47, Benjamin Mahler <
> benjamin.mah...@gmail.com>
> > wrote:
> >
> > > +dev
> > >
> > > Just so the rest of the dev community understands, are these kinds of
> > event
> > > based hooks going to be subsumed by a mechanism to stream out cluster
> > > events? Or will these hooks co-exist alongside cluster events?
> > >
> >
> > Could definitely be. All modules have to be (re)built for new Mesos
> > releases; if they can listen to a SlaveLost event, then we can obsolete
> the
> > explicit hooks.
> >
> > Niklas
> >
> > >
> > > On Wed, Sep 23, 2015 at 3:38 PM, <nniel...@apache.org> wrote:
> > >
> > > > Added masterSlaveLostHook.
> > > >
> > > > This patch adds a new masterSlaveLost hook to enable modules to clean
> > up
> > > > after lost slaves events (as in networking modules, where we want to
> > > > avoid leaking IPs).
> > > >
> > > > Review: https://reviews.apache.org/r/38575
> > > >
> > > >
> > > > Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
> > > > Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/1d86932c
> > > > Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/1d86932c
> > > > Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/1d86932c
> > > >
> > > > Branch: refs/heads/master
> > > > Commit: 1d86932ce424f3e7b921cc6a436051a45f704dd0
> > > > Parents: f6706e8
> > > > Author: Niklas Nielsen <n...@qni.dk>
> > > > Authored: Wed Sep 23 15:37:21 2015 -0700
> > > > Committer: Niklas Q. Nielsen <nik...@mesosphere.io>
> > > > Committed: Wed Sep 23 15:37:24 2015 -0700
> > > >
> > > >
> --
> > > >  include/mesos/hook.hpp| 10 +++
> > > >  src/examples/test_hook_module.cpp | 26 +
> > > >  src/hook/manager.cpp  | 12 
> > > >  src/hook/manager.hpp  |  2 ++
> > > >  src/master/master.cpp |  5 
> > > >  src/tests/hook_tests.cpp  | 53
> > > ++
> > > >  6 files changed, 108 insertions(+)
> > > >
> --
> > > >
> > > >
> > > >
> > > >
> > >
> >
> http://git-wip-us.apache.org/repos/asf/mesos/blob/1d86932c/include/mesos/hook.hpp
> > > >
> --
> > > > diff --git a/include/mesos/hook.hpp b/include/mesos/hook.hpp
> > > > index 2353602..2fe060e 100644
> > > > --- a/include/mesos/hook.hpp
> > > > +++ b/include/mesos/hook.hpp
> > > > @@ -62,6 +62,16 @@ public:
> > > >  return None();
> > > >}
> > > >
> > > > +
> > > > +  // This hook is called when an Agent is removed i.e. deemed lost by the
> > > > +  // master. The hook is invoked after all frameworks have been informed about
> > > > +  // the loss.
> > > > +  virtual Try<Nothing> masterSlaveLostHook(const SlaveInfo& slaveInfo)
> > > > +  {
> > > > +    return Nothing();
> > > > +  }
> > > > +
> > > > +
> > > >// This environment decorator hook is called from within slave
> when
> > > >// launching a new executor. A module implementing the hook
> creates
> > > >// and returns a set of environment variables. These environment
> > > >
> > > >
> > > >
> > >
> >
> http://git-wip-us.apache.org/repos/asf/mesos/blob/1d86932c/src/examples/test_hook_module.cpp
> > > >
> ---

Re: [VOTE] Release Apache Mesos 0.24.1 (rc1)

2015-09-22 Thread Niklas Nielsen
+1 (binding)

On 21 September 2015 at 11:46, Vinod Kone  wrote:

> +1 (binding)
>
> Tested on CI for CentOS5 and CentOS6.
>
> On Fri, Sep 18, 2015 at 6:21 PM, Adam Bordelon  wrote:
>
>> Hi friends,
>>
>> Please vote on releasing the following candidate as Apache Mesos 0.24.1.
>>
>> 0.24.1 includes the following:
>>
>> 
>> * [MESOS-2986] - Docker version output is not compatible with Mesos
>> * [MESOS-3136] - COMMAND health checks with Marathon 0.10.0 are broken
>>
>> The CHANGELOG for the release is available at:
>>
>> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.24.1-rc1
>>
>> 
>>
>> The candidate for Mesos 0.24.1 release is available at:
>>
>> https://dist.apache.org/repos/dist/dev/mesos/0.24.1-rc1/mesos-0.24.1.tar.gz
>>
>> The tag to be voted on is 0.24.1-rc1:
>> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.24.1-rc1
>>
>> The MD5 checksum of the tarball can be found at:
>>
>> https://dist.apache.org/repos/dist/dev/mesos/0.24.1-rc1/mesos-0.24.1.tar.gz.md5
>>
>> The signature of the tarball can be found at:
>>
>> https://dist.apache.org/repos/dist/dev/mesos/0.24.1-rc1/mesos-0.24.1.tar.gz.asc
>>
>> The PGP key used to sign the release is here:
>> https://dist.apache.org/repos/dist/release/mesos/KEYS
>>
>> The JAR is up in Maven in a staging repository here:
>> https://repository.apache.org/content/repositories/orgapachemesos-1068
>>
>> Please vote on releasing this package as Apache Mesos 0.24.1!
>>
>> The vote is open until Wed Sep 23 18:00 PDT 2015 and passes if a majority
>> of at least 3 +1 PMC votes are cast.
>>
>> [ ] +1 I tested this package. Release it as Apache Mesos 0.24.1
>> [ ] -1 Do not release this package because ...
>>
>> Thanks,
>> -Adam-
>>
>
>


Working Group List

2015-09-18 Thread Niklas Nielsen
Hi folks,

Per our last community meeting, I created a list on our Confluence Wiki to
keep track of, and increase visibility into, the running working groups.
We have previously created these groups implicitly and, as the community
grew, we unfortunately ended up with duplicate meetings with different
stakeholders in the case of Networking, and folks have been missing out on
other meetings too.

So, here it is:
https://cwiki.apache.org/confluence/display/MESOS/Apache+Mesos+Working+Groups

I imagined that we could create IRC channels, hangouts etc and note the
channels there.

Of course, the main activity should be on the dev@ list but we have
traditionally met up (online or IRL) for design sessions, higher frequency
feedback during reviews, etc.

Thoughts / ideas?

Cheers,
Niklas


Re: Mesos 0.25.0

2015-09-09 Thread Niklas Nielsen
Hi folks,

With 0.24.0 out the door, we need to start focusing on 0.25.0.

As for planning, we will aim to release by *October 5th (just before
MesosCon EU)*. In order to test, fix and stabilize, we will try to tag a
release by *September 23rd*. That leaves us with two weeks to land the
final patches for this release.

Currently, we have the following issues going into the release:
https://issues.apache.org/jira/issues/?jql=(fixVersion%20%3D%200.25.0%20OR%20%22Target%20Version%2Fs%22%20%3D%200.25.0)%20AND%20project%20%3D%20MESOS%20ORDER%20BY%20status%20DESC

There are currently 7 unresolved issues.

Please tag the tickets you want in this release with a 0.25.0 target or fix
version and we will track their progress up to the deadlines.

Let us know if this doesn't make sense or if you have any objections.

Cheers,
Joris, Mpark and Niklas


On 31 August 2015 at 11:34, Niklas Nielsen <nik...@mesosphere.io> wrote:

>
>
> On 30 August 2015 at 16:00, Dave Lester <d...@davelester.org> wrote:
>
>> Hi Niklas,
>>
>> Could you create a JIRA issue tracking this release? A great example
>> would be what was created for the 0.22.0 release and promotes more
>> transparency IMO about what is going in a release:
>> https://issues.apache.org/jira/browse/MESOS-2248
>>
>> Additionally, can you add a link to that JIRA issue on the release
>> planning wiki?
>> https://cwiki.apache.org/confluence/pages/editpage.action?pageId=51812502
>
>
> Thanks for bringing this up, Dave!
>
> I created a ticket and will track blockers/issues there.
>
> Nik
>
>
>>
>>
>> Thanks,
>> Dave
>>
>> On Fri, Aug 28, 2015, at 11:24 AM, Joris Van Remoortere wrote:
>> > Hi Nik,
>> >
>> > I'd like to co-manage with you as I am invested in Maintenance
>> Primitives
>> > :-)
>> >
>> > Joris
>> >
>> > On Thu, Aug 27, 2015 at 9:25 PM, Michael Park <mcyp...@gmail.com>
>> wrote:
>> >
>> > > Hey Niklas,
>> > >
>> > > As we discussed offline, I'm happy to co-manage or shadow this
>> release.
>> > >
>> > > I would also like to add the beta follow-up features for persistence
>> > > primitives (dynamic reservation + persistent volume):
>> > >
>> > >- Operator API
>> > >- Authorization
>> > >
>> > > Thanks,
>> > >
>> > > MPark.
>> > >
>> > > On Thu, Aug 27, 2015 at 7:24 PM Niklas Nielsen <nik...@mesosphere.io>
>> > > wrote:
>> > >
>> > > > Hi folks,
>> > > >
>> > > > While we have 0.24.0 in flight (voting and fixing is in progress),
>> with a
>> > > > release cadence of 1 month, we should consider to start thinking
>> about
>> > > > 0.25.0.
>> > > >
>> > > > I have cycles to release manage this one (and bring some of the
>> newer
>> > > > committers on for shadowing). If others want to take this on (or if
>> > > people
>> > > > have objections), feel free to jump in.
>> > > >
>> > > > Of larger items for the next release, we want to get (again, feel
>> free to
>> > > > add) are:
>> > > >
>> > > >  - Maintenance primitives in Alpha state
>> > > >  - Networking plug-ability (demoed at MesosCon)
>> > > >
>> > > > We can wait until 0.24.0 lands, but we should be able to work on
>> this in
>> > > > parallel.
>> > > >
>> > > > Niklas
>> > > >
>> > >
>>
>
>


Re: [VOTE] Release Apache Mesos 0.24.0 (rc2)

2015-09-03 Thread Niklas Nielsen
+1 - tested on our CI

On Tuesday, September 1, 2015, Vinod Kone  wrote:

> Hi all,
>
>
> Please vote on releasing the following candidate as Apache Mesos 0.24.0.
>
>
> 0.24.0 includes the following:
>
>
> 
>
> Experimental support for v1 scheduler HTTP API!
>
> This release also wraps up support for fetcher.
>
> The CHANGELOG for the release is available at:
>
>
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.24.0-rc2
>
>
> 
>
>
> The candidate for Mesos 0.24.0 release is available at:
>
> https://dist.apache.org/repos/dist/dev/mesos/0.24.0-rc2/mesos-0.24.0.tar.gz
>
>
> The tag to be voted on is 0.24.0-rc2:
>
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.24.0-rc2
>
>
> The MD5 checksum of the tarball can be found at:
>
>
> https://dist.apache.org/repos/dist/dev/mesos/0.24.0-rc2/mesos-0.24.0.tar.gz.md5
>
>
> The signature of the tarball can be found at:
>
>
> https://dist.apache.org/repos/dist/dev/mesos/0.24.0-rc2/mesos-0.24.0.tar.gz.asc
>
>
> The PGP key used to sign the release is here:
>
> https://dist.apache.org/repos/dist/release/mesos/KEYS
>
>
> The JAR is up in Maven in a staging repository here:
>
> https://repository.apache.org/content/repositories/orgapachemesos-1066
>
>
> Please vote on releasing this package as Apache Mesos 0.24.0!
>
>
> The vote is open until Fri Sep  4 17:33:05 PDT 2015 and passes if a
> majority of at least 3 +1 PMC votes are cast.
>
>
> [ ] +1 Release this package as Apache Mesos 0.24.0
>
> [ ] -1 Do not release this package because ...
>
>
> Thanks,
>
> Vinod
>


Re: Mesos 0.25.0

2015-08-31 Thread Niklas Nielsen
On 30 August 2015 at 16:00, Dave Lester <d...@davelester.org> wrote:

> Hi Niklas,
>
> Could you create a JIRA issue tracking this release? A great example
> would be what was created for the 0.22.0 release and promotes more
> transparency IMO about what is going in a release:
> https://issues.apache.org/jira/browse/MESOS-2248
>
> Additionally, can you add a link to that JIRA issue on the release
> planning wiki?
> https://cwiki.apache.org/confluence/pages/editpage.action?pageId=51812502


Thanks for bringing this up, Dave!

I created a ticket and will track blockers/issues there.

Nik


>
>
> Thanks,
> Dave
>
> On Fri, Aug 28, 2015, at 11:24 AM, Joris Van Remoortere wrote:
> > Hi Nik,
> >
> > I'd like to co-manage with you as I am invested in Maintenance Primitives
> > :-)
> >
> > Joris
> >
> > On Thu, Aug 27, 2015 at 9:25 PM, Michael Park <mcyp...@gmail.com> wrote:
> >
> > > Hey Niklas,
> > >
> > > As we discussed offline, I'm happy to co-manage or shadow this release.
> > >
> > > I would also like to add the beta follow-up features for persistence
> > > primitives (dynamic reservation + persistent volume):
> > >
> > >- Operator API
> > >- Authorization
> > >
> > > Thanks,
> > >
> > > MPark.
> > >
> > > On Thu, Aug 27, 2015 at 7:24 PM Niklas Nielsen <nik...@mesosphere.io>
> > > wrote:
> > >
> > > > Hi folks,
> > > >
> > > > While we have 0.24.0 in flight (voting and fixing is in progress),
> with a
> > > > release cadence of 1 month, we should consider to start thinking
> about
> > > > 0.25.0.
> > > >
> > > > I have cycles to release manage this one (and bring some of the newer
> > > > committers on for shadowing). If others want to take this on (or if
> > > people
> > > > have objections), feel free to jump in.
> > > >
> > > > Of larger items for the next release, we want to get (again, feel
> free to
> > > > add) are:
> > > >
> > > >  - Maintenance primitives in Alpha state
> > > >  - Networking plug-ability (demoed at MesosCon)
> > > >
> > > > We can wait until 0.24.0 lands, but we should be able to work on
> this in
> > > > parallel.
> > > >
> > > > Niklas
> > > >
> > >
>


Mesos 0.25.0

2015-08-27 Thread Niklas Nielsen
Hi folks,

While we have 0.24.0 in flight (voting and fixing are in progress), with a
release cadence of 1 month, we should start thinking about
0.25.0.

I have cycles to release manage this one (and bring some of the newer
committers on for shadowing). If others want to take this on (or if people
have objections), feel free to jump in.

The larger items we want to get into the next release (again, feel free to
add) are:

 - Maintenance primitives in Alpha state
 - Networking plug-ability (demoed at MesosCon)

We can wait until 0.24.0 lands, but we should be able to work on this in
parallel.

Niklas


Re: [VOTE] Release Apache Mesos 0.24.0 (rc1)

2015-08-27 Thread Niklas Nielsen
-1: sudo make check fails on CentOS 7

[----------] Global test environment tear-down
[==========] 793 tests from 121 test cases ran. (606946 ms total)
[  PASSED  ] 786 tests.
[  FAILED  ] 7 tests, listed below:
[  FAILED  ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup, where TypeParam = mesos::internal::slave::CgroupsCpushareIsolatorProcess
[  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem
[  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromSandbox
[  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHost
[  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint
[  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_PersistentVolumeWithRootFilesystem
[  FAILED  ] MesosContainerizerLaunchTest.ROOT_ChangeRootfs

Configured with:

../mesos/configure --prefix=/home/vagrant/releases/0.24.0/ --disable-python

On 26 August 2015 at 17:00, Khanduja, Vaibhav vaibhav.khand...@emc.com
wrote:

 +1

  On Aug 26, 2015, at 4:43 PM, Vinod Kone vinodk...@gmail.com wrote:
 
  Pinging the thread for more (binding) votes. Hopefully people have caught
  up with emails after Mesos madness.
 
  On Wed, Aug 19, 2015 at 1:28 AM, haosdent haosd...@gmail.com wrote:
 
  +1
 
  OS: Ubutnu 14.04
  Verify command: sudo make -j8 check
  Compiler: Both gcc4.8 and clang3.5
  Configuration: default configuration
  Result: all tests(828 tests) pass
 
  MESOS-3053 https://issues.apache.org/jira/browse/MESOS-3053 is
 because
  need update add iptable first.
 
  On Wed, Aug 19, 2015 at 2:39 PM, haosdent haosd...@gmail.com wrote:
 
  Could not
  pass DockerContainerizerTest.ROOT_DOCKER_Launch_Executor_Bridged in
 Ubuntu
  14.04. Already have a issue for this
  https://issues.apache.org/jira/browse/MESOS-3053, it is acceptable?
 
  On Wed, Aug 19, 2015 at 12:55 PM, Marco Massenzio ma...@mesosphere.io
 
  wrote:
 
  +1 (non-binding)
 
  All tests (including ROOT) pass on:
  Ubuntu 14.04 (physical box)
 
  All non-ROOT tests pass on:
  CentOS 7 (VirtualBox VM)
 
  Known issue (MESOS-3050) for ROOT tests on CentOS 7, non-blocker.
 
  Thanks,
 
  *Marco Massenzio*
 
  *Distributed Systems Engineer* http://codetrips.com
 
  On Tue, Aug 18, 2015 at 3:26 PM, Vinod Kone vinodk...@apache.org
  wrote:
 
  0.24.0 includes the following:
 
 
 
 
 
  Experimental support for v1 scheduler HTTP API!
 
  This release also wraps up support for fetcher.
 
 
  The CHANGELOG for the release is available at:
 
 
 
 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.24.0-rc1
 
 
 
 
 
 
  The candidate for Mesos 0.24.0 release is available at:
 
 
 
 https://dist.apache.org/repos/dist/dev/mesos/0.24.0-rc1/mesos-0.24.0.tar.gz
 
 
  The tag to be voted on is 0.24.0-rc1:
 
 
 
 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.24.0-rc1
 
 
  The MD5 checksum of the tarball can be found at:
 
 
 
 https://dist.apache.org/repos/dist/dev/mesos/0.24.0-rc1/mesos-0.24.0.tar.gz.md5
 
 
  The signature of the tarball can be found at:
 
 
 
 https://dist.apache.org/repos/dist/dev/mesos/0.24.0-rc1/mesos-0.24.0.tar.gz.asc
 
 
  The PGP key used to sign the release is here:
 
  https://dist.apache.org/repos/dist/release/mesos/KEYS
 
 
  The JAR is up in Maven in a staging repository here:
 
 
 https://repository.apache.org/content/repositories/orgapachemesos-1064
 
 
  Please vote on releasing this package as Apache Mesos 0.24.0!
 
 
  The vote is open until Fri Aug 21 15:24:05 PDT 2015 and passes if a
  majority of at least 3 +1 PMC votes are cast.
 
 
  [ ] +1 Release this package as Apache Mesos 0.24.0
 
  [ ] -1 Do not release this package because ...
 
 
  Thanks,
 
  Vinod
 
 
  --
  Best Regards,
  Haosdent Huang
 
 
 
  --
  Best Regards,
  Haosdent Huang
 



Re: [VOTE] Release Apache Mesos 0.24.0 (rc1)

2015-08-27 Thread Niklas Nielsen
If it is that easy to fix, why not get it in?

How about https://issues.apache.org/jira/browse/MESOS-3053 (which Haosdent
ran into)?

On 27 August 2015 at 15:36, Jie Yu yujie@gmail.com wrote:

 Niklas,

 This is the known problem reported by Marco. I am OK with both because the
 linux filesystem isolator cannot be used in 0.24.0.

 If you guys prefer to cut another RC, here is the patch that needs to be
 cherry picked:

 commit 3ecd54320397c3a813d555f291b51778372e273b
 Author: Greg Mann g...@mesosphere.io
 Date:   Fri Aug 21 13:21:10 2015 -0700

 Added symlink test for /bin, lib, and /lib64 when preparing test root
 filesystem.

 Review: https://reviews.apache.org/r/37684



 On Thu, Aug 27, 2015 at 3:30 PM, Niklas Nielsen nik...@mesosphere.io
 wrote:

 -1: sudo make check on centos 7

 [--] Global test environment tear-down

 [==] 793 tests from 121 test cases ran. (606946 ms total)

 [  PASSED  ] 786 tests.

 [  FAILED  ] 7 tests, listed below:

 [  FAILED  ] UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup, where
 TypeParam = mesos::internal::slave::CgroupsCpushareIsolatorProcess

 [  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_ChangeRootFilesystem

 [  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromSandbox

 [  FAILED  ] LinuxFilesystemIsolatorTest.ROOT_VolumeFromHost

 [  FAILED  ]
 LinuxFilesystemIsolatorTest.ROOT_VolumeFromHostSandboxMountPoint

 [  FAILED  ]
 LinuxFilesystemIsolatorTest.ROOT_PersistentVolumeWithRootFilesystem

 [  FAILED  ] MesosContainerizerLaunchTest.ROOT_ChangeRootfs

 Configured with:

 ../mesos/configure --prefix=/home/vagrant/releases/0.24.0/
 --disable-python

 On 26 August 2015 at 17:00, Khanduja, Vaibhav vaibhav.khand...@emc.com
 wrote:

 +1

  On Aug 26, 2015, at 4:43 PM, Vinod Kone vinodk...@gmail.com wrote:
 
  Pinging the thread for more (binding) votes. Hopefully people have
 caught
  up with emails after Mesos madness.
 
  On Wed, Aug 19, 2015 at 1:28 AM, haosdent haosd...@gmail.com wrote:
 
  +1
 
  OS: Ubutnu 14.04
  Verify command: sudo make -j8 check
  Compiler: Both gcc4.8 and clang3.5
  Configuration: default configuration
  Result: all tests(828 tests) pass
 
  MESOS-3053 https://issues.apache.org/jira/browse/MESOS-3053 is
 because
  need update add iptable first.
 
  On Wed, Aug 19, 2015 at 2:39 PM, haosdent haosd...@gmail.com
 wrote:
 
  Could not
  pass DockerContainerizerTest.ROOT_DOCKER_Launch_Executor_Bridged in
 Ubuntu
  14.04. Already have a issue for this
  https://issues.apache.org/jira/browse/MESOS-3053, it is acceptable?
 
  On Wed, Aug 19, 2015 at 12:55 PM, Marco Massenzio 
 ma...@mesosphere.io
  wrote:
 
  +1 (non-binding)
 
  All tests (including ROOT) pass on:
  Ubuntu 14.04 (physical box)
 
  All non-ROOT tests pass on:
  CentOS 7 (VirtualBox VM)
 
  Known issue (MESOS-3050) for ROOT tests on CentOS 7, non-blocker.
 
  Thanks,
 
  *Marco Massenzio*
 
  *Distributed Systems Engineer* http://codetrips.com
 
  On Tue, Aug 18, 2015 at 3:26 PM, Vinod Kone vinodk...@apache.org
  wrote:
 
  0.24.0 includes the following:
 
 
 
 
 
  Experimental support for v1 scheduler HTTP API!
 
  This release also wraps up support for fetcher.
 
 
  The CHANGELOG for the release is available at:
 
 
 
 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.24.0-rc1
 
 
 
 
 
 
  The candidate for Mesos 0.24.0 release is available at:
 
 
 
 https://dist.apache.org/repos/dist/dev/mesos/0.24.0-rc1/mesos-0.24.0.tar.gz
 
 
  The tag to be voted on is 0.24.0-rc1:
 
 
 
 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.24.0-rc1
 
 
  The MD5 checksum of the tarball can be found at:
 
 
 
 https://dist.apache.org/repos/dist/dev/mesos/0.24.0-rc1/mesos-0.24.0.tar.gz.md5
 
 
  The signature of the tarball can be found at:
 
 
 
 https://dist.apache.org/repos/dist/dev/mesos/0.24.0-rc1/mesos-0.24.0.tar.gz.asc
 
 
  The PGP key used to sign the release is here:
 
  https://dist.apache.org/repos/dist/release/mesos/KEYS
 
 
  The JAR is up in Maven in a staging repository here:
 
 
 https://repository.apache.org/content/repositories/orgapachemesos-1064
 
 
  Please vote on releasing this package as Apache Mesos 0.24.0!
 
 
  The vote is open until Fri Aug 21 15:24:05 PDT 2015 and passes if a
  majority of at least 3 +1 PMC votes are cast.
 
 
  [ ] +1 Release this package as Apache Mesos 0.24.0
 
  [ ] -1 Do not release this package because ...
 
 
  Thanks,
 
  Vinod
 
 
  --
  Best Regards,
  Haosdent Huang
 
 
 
  --
  Best Regards,
  Haosdent Huang
 






Re: Why does allocator keep track of the revocable resources separately from regular resources

2015-08-24 Thread Niklas Nielsen
On 16 August 2015 at 02:56, Qian AZ Zhang zhang...@cn.ibm.com wrote:

 Thanks Niklas and Christos.

 To Niklas, we'd like to try oversubscription feature with our framework.
 However, I do not quite understand your example below:
For example: cpus: 0.5; cpus{REV}: 1.89; mem: 512.
Here, schedulers would usually only look at the first cpu resource
 and decline the offer.
 Can you please let me know why framework scheduler would usually only look
 at the first cpu resource and decline the offer? I understand when
 launching a task, for one type of resource (e.g., cpus), the task can only
 use either revocable or non-revocable, not both. So for your example, I
 think scheduler can either pick up cpus: 0.5; mem: 512 or cpus{REV}:
 1.89; mem: 512 to launch its task, right?

Correct; the offer will contain both, and the scheduler needs to decide
whether to use regular or revocable resources.



 To Christos, that's exactly what our framework wants: allow frameworks
 to slack resources (allocated but not used) for revocable best effort
 work.. Our use case is, framework1 is allocated with 500MB memory for
 launching its tasks, but actually after its tasks are launched, they only
 use 300MB, i.e., there are 200MB memory allocated to framework1 but not
 used by its tasks. We'd like those 200MB memory reported by resource
 estimator as revocable resources, and let framework2 which has the
 revocable resources capability set use them. And once framework1 launches
 more tasks (i.e., need more memory), the tasks of framework2 can be killed
 by QoS controller so that framework1 can take those 200MB memory back to
 launch more tasks.

We are only oversubscribing compressible resources (cpu shares, I/O and
networking bandwidth) for now (or at least encouraging users to do so). You
need to be careful when oversubscribing disk and memory, but I would love to
help if you want to try it out.



 However, after reading Mesos's code, I found the revocable resources
 reported by resource estimator are actually separately kept track by
 allocator from regular resources:
   HierarchicalAllocatorProcess<RoleSorter, FrameworkSorter>::updateSlave(
       const SlaveID& slaveId,
       const Resources& oversubscribed)
   {
     ...
     *slaves[slaveId].total += oversubscribed;*
     ...
   }
 As you see in the above code, oversubscribed resources will be added on
 top of slave's total resources, that means, slave's original total
 resources is enlarged with these *extra *resources. So when framework2
 launches its tasks, what its tasks use is actually these extra resources
 rather that framework1's unused resources. This is not what we expect, we'd
 like framework2 to use framework1's unused resources. That's why I said in
 my first mail that allocator needs to mark part of the total resources as
 revocable based on what resource estimator returns rather than add revocable
 resources on top of total resources.
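
The difference between the two accounting models discussed here can be sketched with a few lines of arithmetic. This is only an illustration under stated assumptions: `MemAccounting`, `addOnTop` and `markWithinTotal` are hypothetical names, not Mesos types or functions, and the numbers (500 MB detected, 200 MB estimated slack) are chosen for the example.

```cpp
#include <cassert>

// Hypothetical per-slave memory accounting, in MB. Not a Mesos type.
struct MemAccounting {
  double regular;    // non-revocable memory the allocator can offer
  double revocable;  // revocable memory the allocator can offer

  double offerable() const { return regular + revocable; }
};

// Model A, matching the quoted updateSlave(): the estimator's figure is
// added on top of the detected total.
MemAccounting addOnTop(double detected, double oversubscribed) {
  assert(detected >= 0.0 && oversubscribed >= 0.0);
  return MemAccounting{detected, oversubscribed};
}

// Model B, the alternative proposed in this mail: part of the detected
// total is re-labelled as revocable, so the sum never grows.
MemAccounting markWithinTotal(double detected, double oversubscribed) {
  assert(oversubscribed >= 0.0 && oversubscribed <= detected);
  return MemAccounting{detected - oversubscribed, oversubscribed};
}
```

With 500 MB detected and a 200 MB estimate, `addOnTop` makes 700 MB offerable (500 regular plus 200 revocable), while `markWithinTotal` keeps the sum at 500 MB (300 regular plus 200 revocable). In model A the sum can exceed the detected capacity, which is the behaviour questioned above.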


 Regards,
 Qian Zhang

 From: Christos Kozyrakis chris...@mesosphere.io
 To: dev@mesos.apache.org
 Date: 08/16/2015 02:26
 Subject: Re: Why does allocator keep track of the revocable resources
 separately from regular resources
 --




  On Aug 15, 2015, at 8:26 AM, Qian AZ Zhang zhang...@cn.ibm.com wrote:
 
  That
  means frameworks can use more than the auto detected resources which I
  think should be slave's total resources. This seems a bit strange to me,
 I
  think allocator needs to mark part of the auto detected resources as
  revocable based on what resource estimator returns.

 That’s the whole idea of oversubscription Qian, to carefully understand
 the difference between allocated and actually used and allow frameworks to
 slack resources (allocated but not used) for revocable best effort work.

 Revocable resources are clearly marked in the offer. It is up to your
 framework to use them or ignore. You can also opt out as Niklas mentioned.
 Note that if a task uses some regular resources and some revocable
 resources at the same time, it is essentially a best effort task. So
 proceed carefully with your scheduler.





Re: Why does allocator keep track of the revocable resources separately from regular resources

2015-08-15 Thread Niklas Nielsen
Hi Qian,

Yes; frameworks will have to:

1) Register with the revocable resources framework capability set. The
important bit here is that frameworks running on oversubscribed resources
will have to cope with frequent preemptions of their tasks. We wanted to
gain experience with this (experimental) feature and therefore let
frameworks opt out of running on oversubscribed resources.

2) Handle offers that look different. Most frameworks will have to rework
their offer-acceptance logic a bit, because the offer will include both
regular (non-revocable) and revocable resources.
For example: cpus: 0.5; cpus{REV}: 1.89; mem: 512.
Here, schedulers would usually only look at the first cpu resource and
decline the offer.
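
As a rough sketch of the reworked accept logic in 2), a scheduler could sum regular and revocable cpus separately before deciding. Everything below is an assumption made for illustration: `OfferedResource`, `sumOf`, `CpuChoice` and `chooseCpus` are hypothetical and deliberately avoid the real Mesos `Resource` protobuf and scheduler API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for one scalar resource in an offer.
// This is NOT the Mesos `Resource` protobuf.
struct OfferedResource {
  std::string name;  // e.g. "cpus" or "mem"
  double amount;
  bool revocable;    // true for entries shown as `{REV}` in the offer
};

// Sums the amount of `name`, counting only revocable or only regular entries.
double sumOf(const std::vector<OfferedResource>& offer,
             const std::string& name,
             bool revocable) {
  double sum = 0.0;
  for (const OfferedResource& r : offer) {
    if (r.name == name && r.revocable == revocable) {
      sum += r.amount;
    }
  }
  return sum;
}

enum class CpuChoice { Regular, Revocable, Decline };

// Prefers regular cpus, falls back to revocable cpus only when the task
// tolerates preemption, and declines otherwise.
CpuChoice chooseCpus(const std::vector<OfferedResource>& offer,
                     double cpusNeeded,
                     bool toleratesPreemption) {
  assert(cpusNeeded > 0.0);
  if (sumOf(offer, "cpus", false) >= cpusNeeded) {
    return CpuChoice::Regular;
  }
  if (toleratesPreemption && sumOf(offer, "cpus", true) >= cpusNeeded) {
    return CpuChoice::Revocable;
  }
  return CpuChoice::Decline;
}
```

For the example offer above (cpus: 0.5; cpus{REV}: 1.89; mem: 512), a task needing 1 cpu would be declined by a scheduler that only looks at regular cpus, but could be placed on the revocable cpus if it tolerates preemption.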

See https://github.com/apache/mesos/blob/master/docs/oversubscription.md
for more information and feel free to reach out if you need help/assistance
running with oversubscription.

Cheers,
Niklas

On 15 August 2015 at 08:26, Qian AZ Zhang zhang...@cn.ibm.com wrote:



 Hi,

 When I try Mesos oversubscription feature, I found the revocable resources
 returned by resource estimator are actually separately kept track by
 allocator from regular resources, e.g., I started my slave with this
 command:
 ./bin/mesos-slave.sh --master=192.168.122.171:5050
 --resource_estimator=org_apache_mesos_FixedResourceEstimator
 --modules=/home/stack/mesos/build/slave_modules

 The content of /home/stack/mesos/build/slave_modules is:
 {
   "libraries": {
     "file": "/home/stack/mesos/build/src/.libs/libfixed_resource_estimator.so",
     "modules": {
       "name": "org_apache_mesos_FixedResourceEstimator",
       "parameters": {
         "key": "resources",
         "value": "cpus:2;mem:500"
       }
     }
   }
 }

 Then I see the following message in master's output:
 I0815 23:16:34.065404 23218 hierarchical.hpp:600] Slave
 20150815-231543-2876942528-5050-23204-S0 (mesos) updated with
 oversubscribed resources cpus(*){REV}:2; mem(*){REV}:500 (total: cpus(*):4;
 mem(*):2929; disk(*):36813; ports(*):[31000-32000]; cpus(*){REV}:2; mem
 (*){REV}:500, allocated: )

 So as you can see, the slave's total resources are: cpus(*):4; mem(*):2929;
 disk(*):36813; ports(*):[31000-32000]; cpus(*){REV}:2; mem(*){REV}:500, the
 revocable resources (cpus(*){REV}:2; mem(*){REV}:500) are kept separately
 from the regular resources (cpus(*):4; mem(*):2929; disk(*):36813; ports
 (*):[31000-32000];) which are auto detected when slave started up. That
 means frameworks can use more than the auto detected resources which I
 think should be slave's total resources. This seems a bit strange to me, I
 think allocator needs to mark part of the auto detected resources as
 revocable based on what resource estimator returns.


 Regards,
 Qian Zhang


Re: Add klaus1982 as contributor for MESOS-3023

2015-07-14 Thread Niklas Nielsen
You should have been added by now

On 14 July 2015 at 09:12, kl...@cguru.net wrote:

 Hi team,

 I'm working on MESOS-3023 (Factoring out the pattern for URL generation),
 and have a fix for it; would you help to add me into contributor list? so I
 can assign it to myself and trigger review process :).

 My JIRA ID is klaus1982 (kl...@cguru.net)


 Regards,
 
 Klaus Ma, PMP® | http://www.cguru.net
 WhatsApp: +8615011062542



Re: Regarding old frameworks in Mesos repository

2015-06-24 Thread Niklas Nielsen
On 24 June 2015 at 10:55, Yan Xu y...@jxu.me wrote:

 If we anticipated further development we wouldn't have proposed moving them
 out. :)
 So I think it's safe to say that we are just looking for a graveyard for
 them. Do the policies still apply?
 That said, if they cannot be moved out, we should still just delete them
 and point to the commit in which they are deleted.


+1



 Most of the discussions around this are on this thread. Relevant notes from
 the community sync are in this google doc (June 4):
 https://docs.google.com/document/d/153CUCj5LOJCFAVpdDZC7COJDwKh9RDjxaTA0S7lzwDA/edit
 It's not about ec2 maintenance, as the scripts in this repo are not
 up-to-date and people have created ec2 tools not based on these
 scripts. (See this:
 http://mail-archives.apache.org/mod_mbox/mesos-dev/201505.mbox/%3ccak8jagmtmdetskckqmwatyab3dmautsxo5p+6p2f2s2k-zt...@mail.gmail.com%3E )

 Yan

 --
 Jiang Yan Xu y...@jxu.me @xujyan http://twitter.com/xujyan

 On Tue, Jun 23, 2015 at 7:18 PM, Jake Farrell jfarr...@apache.org wrote:

  (infra hat on) This can not be just shifted to a random github org and
  still be maintained under the Apache Mesos project. All commits must
 occur
  to Apache hardware before going to any mirrors such as github, which
  currently does not support anything not under the Apache github org.
 
  (Mesos hat back on) Sorry I missed the last sync up, I did not see any
  notes from this brought back to the dev list. If there is a need for
  maintainers for the ec2 section of the code base I would be happy to step
  in and help either rework to cloudformation or answer any outstanding
  questions
 
  -Jake
 
 
 
  On Tue, Jun 23, 2015 at 9:26 PM, Erik Weathers eweath...@groupon.com
  wrote:
 
   Please maintain the git history for the files when you move them.  They
   should not all appear to have been born into the new repos...
  
   - Erik
  
   On Tuesday, June 23, 2015, Yan Xu y...@jxu.me wrote:
  
So I'd like to resurface this topic. The last attempt
https://reviews.apache.org/r/33090/ to remove things under frameworks/
was put off because scripts under ec2/ still reference these frameworks.

However we seem to have reached the consensus that this unmaintained code
needs to be moved out to avoid confusion (people asking questions /
reporting errors for things we don't maintain anymore). This point was
reiterated during our last community sync and we decided to remove the ec2/
folder as well.

Therefore, if there's no objection, I will delete these files and recreate
them as individual projects under github.com/mesos. Our website will be
updated with the links to them either deleted or replaced by similar
external projects.
   
--
Jiang Yan Xu y...@jxu.me @xujyan
http://twitter.com/xujyan

On Fri, Apr 10, 2015 at 2:39 AM, Alexander Rojas alexan...@mesosphere.io
wrote:
   
 +1 If they are not maintained they should be somewhere else.

  On 06 Apr 2015, at 21:10, Yan Xu y...@jxu.me wrote:
 
  There exist a couple of frameworks in the Mesos codebase under /frameworks:
  deploy_jar haproxy+apache mesos-submit   torque
  (See https://github.com/apache/mesos/tree/master/frameworks)
 
  Does anyone still use them?
 
  These frameworks are not trivial implementations like the ones under
  src/examples to demonstrate/test Mesos features and they rely on external
  programs to run. Since we don't actively maintain them, they may have
  already stopped working with the current versions of these programs.
 
  We'd like to remove these from the Mesos repository. If there are folks who
  still use them and would like to contribute, the ideal place to host them
  is in their own repos. e.g., https://github.com/mesos/hadoop
 
  Any comments?
 
  --
  Jiang Yan Xu y...@jxu.me @xujyan
  http://twitter.com/xujyan


   
  
 



Re: Native code error

2015-06-09 Thread Niklas Nielsen
Thanks Alberto - I finally got to look at the trace and added a ticket to
track it: https://issues.apache.org/jira/browse/MESOS-2839

Thanks again!

On 5 June 2015 at 00:39, Alberto Rodriguez ardl...@gmail.com wrote:

 Hi Niklas,

 Thank you for replying, see attached the full crash.

 Kind regards,

 2015-06-03 19:36 GMT+02:00 Niklas Nielsen nik...@mesosphere.io:

 Hi Alberto,

 Can you share the full crash report with us?

 Niklas

 On 2 June 2015 at 02:37, Alberto Rodriguez ardl...@gmail.com wrote:

  Hi all,
 
  I've got a bunch of tests that are checking whether my spark process is
  able to connect to a mesos cluster. The happy path (mesos up & running)
  is working fine. The problem comes when I'm testing a connection with a
  wrong mesos ip: then I'm getting the following error:
 
  WARNING: Logging before InitGoogleLogging() is written to STDERR
  W0602 11:26:03.549504 13269 sched.cpp:1323]
  **
  Scheduler driver bound to loopback interface! Cannot communicate with
  remote master(s). You might want to set 'LIBPROCESS_IP' environment
  variable to use a routable IP address.
  **
  #
  # A fatal error has been detected by the Java Runtime Environment:
  #
  #  SIGSEGV (0xb) at pc=0x7fcdf32c5660, pid=13125, tid=140517706200832
  #
  # JRE version: Java(TM) SE Runtime Environment (7.0_75-b13) (build
  1.7.0_75-b13)
  # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.75-b04 mixed mode
  linux-amd64 compressed oops)
  # Problematic frame:
  # C  [libc.so.6+0x83660]  cfree+0x220
  #
  # Core dump written. Default location:
  /home/arodriguez/dev/datavis-master/back/core or core.13125 (max size 5
  kB). To ensure a full core dump, try ulimit -c unlimited before starting
  Java again
  #
  # An error report file with more information is saved as:
  # /home/arodriguez/dev/datavis-master/back/hs_err_pid13125.log
  #
  # If you would like to submit a bug report, please visit:
  #   http://bugreport.sun.com/bugreport/crash.jsp
  # The crash happened outside the Java Virtual Machine in native code.
  # See problematic frame for where to report the bug.
  #
  /bin/sh: línea 1: 13125 Abortado(`core' generado)
  /usr/java/jdk1.7.0_75/jre/bin/java
 
 
  -javaagent:/home/arodriguez/.m2/repository/org/jacoco/org.jacoco.agent/0.7.2.201409121644/org.jacoco.agent-0.7.2.201409121644-runtime.jar=destfile=/home/arodriguez/dev/datavis-master/back/target/jacocoIT.exec
  -Xmx1024m -XX:MaxPermSize=256m
  -Dlogback.configurationFile=conf/logger/develop.logger.xml
  -Djava.security.policy=conf/java.policy -XX:+HeapDumpOnOutOfMemoryError
  org.apache.maven.surefire.booter.ForkedBooter
 
  /home/arodriguez/dev/datavis-master/back/target/surefire/surefire6809257750333050101tmp
 
  /home/arodriguez/dev/datavis-master/back/target/surefire/surefire_02320528722071658975tmp
 
 
  Any ideas?
 





Re: Shepherd for MESOS-2637

2015-06-05 Thread Niklas Nielsen
I can help you out.

On 5 June 2015 at 09:51, Colin Williams lack...@gmail.com wrote:

 Can I get a shepherd for https://issues.apache.org/jira/browse/MESOS-2637
 ?
 It's been completed for a while and I think I've addressed all of the
 concerns raised.

 Thanks,
 Colin



Re: Native code error

2015-06-03 Thread Niklas Nielsen
Hi Alberto,

Can you share the full crash report with us?

Niklas

On 2 June 2015 at 02:37, Alberto Rodriguez ardl...@gmail.com wrote:

 Hi all,

 I've got a bunch of tests that are checking whether my spark process is
 able to connect to a mesos cluster. The happy path (mesos up & running)
 is working fine. The problem comes when I'm testing a connection with a
 wrong mesos ip: then I'm getting the following error:

 WARNING: Logging before InitGoogleLogging() is written to STDERR
 W0602 11:26:03.549504 13269 sched.cpp:1323]
 **
 Scheduler driver bound to loopback interface! Cannot communicate with
 remote master(s). You might want to set 'LIBPROCESS_IP' environment
 variable to use a routable IP address.
 **
 #
 # A fatal error has been detected by the Java Runtime Environment:
 #
 #  SIGSEGV (0xb) at pc=0x7fcdf32c5660, pid=13125, tid=140517706200832
 #
 # JRE version: Java(TM) SE Runtime Environment (7.0_75-b13) (build
 1.7.0_75-b13)
 # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.75-b04 mixed mode
 linux-amd64 compressed oops)
 # Problematic frame:
 # C  [libc.so.6+0x83660]  cfree+0x220
 #
 # Core dump written. Default location:
 /home/arodriguez/dev/datavis-master/back/core or core.13125 (max size 5
 kB). To ensure a full core dump, try ulimit -c unlimited before starting
 Java again
 #
 # An error report file with more information is saved as:
 # /home/arodriguez/dev/datavis-master/back/hs_err_pid13125.log
 #
 # If you would like to submit a bug report, please visit:
 #   http://bugreport.sun.com/bugreport/crash.jsp
 # The crash happened outside the Java Virtual Machine in native code.
 # See problematic frame for where to report the bug.
 #
 /bin/sh: línea 1: 13125 Abortado(`core' generado)
 /usr/java/jdk1.7.0_75/jre/bin/java

 -javaagent:/home/arodriguez/.m2/repository/org/jacoco/org.jacoco.agent/0.7.2.201409121644/org.jacoco.agent-0.7.2.201409121644-runtime.jar=destfile=/home/arodriguez/dev/datavis-master/back/target/jacocoIT.exec
 -Xmx1024m -XX:MaxPermSize=256m
 -Dlogback.configurationFile=conf/logger/develop.logger.xml
 -Djava.security.policy=conf/java.policy -XX:+HeapDumpOnOutOfMemoryError
 org.apache.maven.surefire.booter.ForkedBooter

 /home/arodriguez/dev/datavis-master/back/target/surefire/surefire6809257750333050101tmp

 /home/arodriguez/dev/datavis-master/back/target/surefire/surefire_02320528722071658975tmp


 Any ideas?
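
The warning in the log above says the scheduler driver bound to the loopback interface and suggests setting 'LIBPROCESS_IP'. As a minimal sketch (not a fix for the segfault itself), a test harness could refuse to export a loopback address before launching the JVM. The helper name `routable_libprocess_env` and the address 192.0.2.10 are assumptions made for illustration only; they do not come from the thread.

```python
import ipaddress
import os


def routable_libprocess_env(candidate_ip, base_env=None):
    """Return a copy of the environment with LIBPROCESS_IP set.

    Hypothetical helper: rejects loopback and unspecified addresses,
    which are the ones the warning in the log complains about.
    """
    addr = ipaddress.ip_address(candidate_ip)
    if addr.is_loopback or addr.is_unspecified:
        raise ValueError(f"{candidate_ip} is not routable from a remote master")
    env = dict(os.environ if base_env is None else base_env)
    env["LIBPROCESS_IP"] = str(addr)
    return env


# Example: build the environment for a child process (placeholder address).
env = routable_libprocess_env("192.0.2.10", base_env={})
```

The returned dictionary could then be handed to whatever launches the test JVM, for example through the `env` argument of `subprocess.run`.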



Re: Scaling Proposal: MAINTAINERS Files

2015-05-15 Thread Niklas Nielsen
+1 - thanks for taking this on Ben!

On 14 May 2015 at 18:12, Adam Bordelon a...@mesosphere.io wrote:

 SGTM. +1 on no special privileges for maintainers, trusting our committers,
 and associating maintainers with (JIRA) components

 On Thu, May 14, 2015 at 5:23 PM, Vinod Kone vinodk...@apache.org wrote:

  +1
 
  On Thu, May 14, 2015 at 5:07 PM, Benjamin Mahler 
  benjamin.mah...@gmail.com
  wrote:
 
   After stepping back and giving time for everyone to mull it over, and
   having discussed it further with many of you, I wanted to bring the
   discussion back to the list. The motivation remains the same, but I propose
   two changes to the approach:
  
   (1) We use maintainers without any process requirement. That is, when
   contributors are unsure who to send reviews to, maintainers provide a way
   for them to engage with committers who have interest, experience, and a
   long-term perspective on the relevant component. For committers, we will
   trust them to use their judgement for when to engage with maintainers.
   There is no sign off requirement or tooling enforcement related to this.
   Taking our aversion to OWNERS further, many of us discussed how
   privilege-based sign offs go against the Apache Way and can go awry in
   the long term. For those that weren't present, the Spark maintainers
   proposal thread here captures the concerns:
   http://markmail.org/thread/vdqfdnfjwhvd46ry
  
   (2) Rather than adding a hierarchical file structure, we document
   maintainers by component (e.g. framework API, containerization (external,
   docker, mesos (network isolator, ...)), stout, libprocess, master, slave,
   zookeeper, replicated log, webui, ...), ideally having a close relationship
   to JIRA components. The maintainers can be documented in a table alongside
   contribution / committer guidelines on our website. I think this will be
   clearer than splitting it across files in our source filesystem, and isn't
   tied to the file layout of our code.
  
   I will be proposing documentation soon based on this updated approach:
   https://issues.apache.org/jira/browse/MESOS-2737
  
   Please chime in with your feedback!
   Ben
  
   On Wed, Feb 25, 2015 at 9:16 AM, Benjamin Hindman b...@eecs.berkeley.edu
   wrote:
  
I had chatted with BenM and Vinod pretty extensively about this and am a
+1.
   
BenM: can you confirm how this interacts with the Apache by-laws?
   
On Sat, Feb 14, 2015 at 10:25 AM, Till Toenshoff toensh...@me.com
   wrote:
   
 +1 — thanks for this Ben!

  On Feb 10, 2015, at 8:56 PM, Cody Maloney c...@mesosphere.io
   wrote:
 
  +1
 
  It would be nice if there was a way to specify things like build system
  changes which are different than just adding/removing a single file. But
  probably that level of enforcement isn't worth the effort it would take to
  add.
 
  On Tue, Feb 10, 2015 at 8:56 AM, James DeFelice 
 james.defel...@gmail.com
  wrote:
 
  +1 Tom/Adam
 
  --sent from my phone
  On Feb 10, 2015 10:52 AM, Niklas Nielsen 
 nik...@mesosphere.io
 wrote:
 
  +1
  Thanks for the write up Ben!
 
  On Tuesday, February 10, 2015, Dominic Hamon
 dha...@twitter.com.invalid
 
  wrote:
 
  Well, we should probably do that anyway :)
  On Feb 10, 2015 2:25 AM, Adam Bordelon a...@mesosphere.io wrote:
 
  +1 on MAINTAINERS over OWNERS, and the rest of the proposal thus far.
  Also +1 on "Merit is not about quantity of work, it means doing things
  the community values in a way that the community values."
  I will, however, echo Tom's concern that we may need to break up master.cpp
  and slave.cpp if we want fine-grained maintainers of subcomponents of
  either.
 
  On Mon, Feb 9, 2015 at 1:47 PM, Yan Xu y...@jxu.me wrote:
 
  Good point for MAINTAINERS
 
  --
  Jiang Yan Xu y...@jxu.me @xujyan
  http://twitter.com/xujyan
 
  On Mon, Feb 9, 2015 at 12:05 PM, Vinod Kone vinodk...@gmail.com wrote:
 
  I like MAINTAINERS because it sounds less authoritative than OWNERS.
 
  FWIW, maintainers is also a well understood and well used term (e.g:
  https://www.kernel.org/doc/linux/MAINTAINERS,
  https://wiki.jenkins-ci.org/display/JENKINS/Hosting+Plugins#HostingPlugins-AddingMaintainerInformation
  )
 
  On Sun, Feb 8, 2015 at 10:40 AM, Dominic Hamon dha...@twopensource.com
  wrote:
 
  Yes, great.
 
  Why not use OWNERS as it is already in use internally at Twitter, at
  Google, in Chromium, and tooling already supports that as an implicit
  standard?
  On Feb 8, 2015

Re: Enabling 'network' namespace for custom network isolators

2015-05-11 Thread Niklas Nielsen
(inlined)

On 11 May 2015 at 14:30, Kapil Arya ka...@mesosphere.io wrote:

 On Mon, May 11, 2015 at 4:58 PM, Jie Yu yujie@gmail.com wrote:

  
   Yes. The simplest (cleanest?) way that I see would be to refactor the
   launcher to take the desired flags when executing the executor, i.e.,
   (Linux)Launcher::fork() takes the namespace flags. The launcher would be
   directed which namespaces to create by the caller, not inferring them
   itself from any flags: the MesosContainerizer in turn would determine this
   based on the isolators it was using for the container (querying them). This
   also facilitates the MesosContainerizer having different isolators, and
   thus namespaces, for different containers.
 
 
  +1
 
  For instance, the isolator could have an interface 'int namespaces()' which
  specifies the namespaces needed. The launcher can just query that and pass
  them to the linux launcher.
 
  Since currently, the launcher and isolator interfaces are designed for both
  mac and linux and namespace does not make sense on Mac, we probably need a
  LinuxIsolator interface (inherit from Isolator) and a LinuxLauncher
  interface (inherit from Launcher).
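
The 'int namespaces()' idea above can be sketched roughly as follows. This is an illustration in Python rather than Mesos C++, and every class and function name in it (`Isolator`, `PidIsolator`, `CustomNetworkIsolator`, `namespaces_for`, `RecordingLauncher`) is a hypothetical stand-in, not an existing Mesos API. The flag values are the Linux clone(2) constants, copied in so the sketch runs on any platform.

```python
# Linux clone(2) namespace flags, as defined in <sched.h>.
CLONE_NEWPID = 0x20000000
CLONE_NEWNET = 0x40000000


class Isolator:
    """Hypothetical base class: by default an isolator needs no namespaces."""

    def namespaces(self) -> int:
        return 0


class PidIsolator(Isolator):
    """Stand-in for a pid isolator that needs a new pid namespace."""

    def namespaces(self) -> int:
        return CLONE_NEWPID


class CustomNetworkIsolator(Isolator):
    """Stand-in for a custom network isolator module."""

    def namespaces(self) -> int:
        return CLONE_NEWNET


def namespaces_for(isolators) -> int:
    """OR together the namespace flags requested by each isolator."""
    flags = 0
    for isolator in isolators:
        flags |= isolator.namespaces()
    return flags


class RecordingLauncher:
    """Hypothetical launcher: fork() is told which namespaces to create
    instead of inferring them from the '--isolation' flag."""

    def __init__(self):
        self.forked = {}

    def fork(self, container_id: str, namespaces: int) -> None:
        # A real launcher would clone() with these flags; here we only record.
        self.forked[container_id] = namespaces


launcher = RecordingLauncher()
launcher.fork("c1", namespaces_for([PidIsolator(), CustomNetworkIsolator()]))
launcher.fork("c2", namespaces_for([PidIsolator()]))
```

Because the caller computes the flags per container, two containers with different isolator sets end up with different namespaces, which is the flexibility the proposal is after.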
 

 This is a good idea. I think this will keep things pretty clean and
 readable within Mesos and for the Isolators.


  - Jie
 
  On Mon, May 11, 2015 at 1:29 PM, Ian Downes idow...@twitter.com.invalid
 
  wrote:
 
   
TLDR: We want to use a custom network isolator, but there is no way to
enable the 'network' namespace from within an isolator module.
   
   
We are working on creating a custom network isolator as a Mesos module.
However, the way Mesos Slave is set up, there is no way to enable the
'network' namespace for the executor without enabling the 'port-mapping'
isolator. This is due to the fact that the LinuxLauncher looks at the
'--isolation' flag to figure out the list of namespaces to be enabled. The
same problem would exist if one were to write a custom pid or filesystem
isolation module.
   
  
   Curious, are these going to be open source and added to the codebase or
   will they be proprietary modules? What specifically is lacking in the
   existing network and pid isolators? Could we extend those?


We will be bringing in a dependency, experimental work with Calico, and
wanted to be flexible in how we call out to our tools.


  
   So, there are a couple of questions:
   
1. With the current Mesos source code, is there a way to specify the
'network/port_mapping' isolator in a way that it doesn't do the actual work
of mapping the ports (e.g., without specifying any port-mapping specific
flags)? If this works, we can just specify this isolator on the slave
command line and it would force the LinuxLauncher to create a new network
namespace.
  
  
   No, as written they're coupled.
  
   2. Is it reasonable to have a separate mechanism to specify what namespaces
should be created/enabled for an executor if we don't want to use the
in-built isolators such as pid and port-mapping?
  
  
   Yes. The simplest (cleanest?) way that I see would be to refactor the
   launcher to take the desired flags when executing the executor, i.e.,
   (Linux)Launcher::fork() takes the namespace flags. The launcher would be
   directed which namespaces to create by the caller, not inferring them
   itself from any flags: the MesosContainerizer in turn would determine this
   based on the isolators it was using for the container (querying them). This
   also facilitates the MesosContainerizer having different isolators, and
   thus namespaces, for different containers.
  
   WRT (2), one potential mechanism is to introduce a new flag, '--namespace'.
The downside of creating such a low-level flag is that it provides little
to no value to the end users. The end users shouldn't be concerned about
which namespaces to enable and so on

 
  
   That seems unduly onerous to the user and almost as rigid.


No matter if you refactor the way we tell the launcher to set the namespace
flags, we need to refactor the way it is provided by the user to the slave.
(Don't get me wrong - I do love the decoupling of the slave flags from the
launcher :)

I suggested that we create 'namespaces/network' instead of the
--namespaces, which would be equivalent to the namespaces/pid
isolator+launcher code.

Thoughts?


  
  
Another alternative is to create a decorator hook for the LinuxLauncher,
which can force the LinuxLauncher to enable certain namespaces without
having to look at the '--isolation' flag. The downside here is that the
decorator will be literally setting up a few bits and nothing more.
   
  
   I don't think there's a need for a decorator hook, just refactoring to pass
   in through fork() is sufficient?
  
   Are there any other alternatives for a better and cleaner design (both long
term and short term)?
   
  
   Happy to 

Re: [VOTE] Release Apache Mesos 0.22.1 (rc6)

2015-05-04 Thread Niklas Nielsen
+1 Verified with test cluster

On 4 May 2015 at 15:45, Timothy Chen tnac...@gmail.com wrote:

 +1 Verified with test cluster and running make check myself.

 Tim

 On Fri, May 1, 2015 at 2:28 PM, Alexander Rojas alexan...@mesosphere.io
 wrote:
  +1 non binding (Tested on OSX and a 3-VM cluster running Ubuntu 12.10)
 
  On 30 Apr 2015, at 00:48, Adam Bordelon a...@mesosphere.io wrote:
 
  Hi all,
 
  Please vote on releasing the following candidate as Apache Mesos 0.22.1.
 
  0.22.1 is a bug fix release and includes the following:
 
 
   * [MESOS-1795] - Assertion failure in state abstraction crashes JVM.
   * [MESOS-2161] - AbstractState JNI check fails for Marathon framework.
   * [MESOS-2461] - Slave should provide details on processes running in its
  cgroups
   * [MESOS-2583] - Tasks getting stuck in staging.
   * [MESOS-2592] - The sandbox directory is not chown'ed if the fetcher
  doesn't run.
   * [MESOS-2601] - Tasks are not removed after recovery from slave and
  mesos containerizer
   * [MESOS-2614] - Update name, hostname, failover_timeout, and webui_url
  in master on framework re-registration
   * [MESOS-2643] - Python scheduler driver disables implicit
  acknowledgments by default.
   * [MESOS-2668] - Slave fails to recover when there are still processes
  left in its cgroup
 
  The CHANGELOG for the release is available at:
 
 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.22.1-rc6
 
 
 
  The candidate for Mesos 0.22.1 release is available at:
 
 https://dist.apache.org/repos/dist/dev/mesos/0.22.1-rc6/mesos-0.22.1.tar.gz
 
  The tag to be voted on is 0.22.1-rc6:
 
 https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.22.1-rc6
 
  The MD5 checksum of the tarball can be found at:
 
 https://dist.apache.org/repos/dist/dev/mesos/0.22.1-rc6/mesos-0.22.1.tar.gz.md5
 
  The signature of the tarball can be found at:
 
 https://dist.apache.org/repos/dist/dev/mesos/0.22.1-rc6/mesos-0.22.1.tar.gz.asc
 
  The PGP key used to sign the release is here:
  https://dist.apache.org/repos/dist/release/mesos/KEYS
 
  The JAR is up in Maven in a staging repository here:
  https://repository.apache.org/content/repositories/orgapachemesos-1054
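
One way a voter might check the downloaded tarball against the published MD5 value is sketched below. This is only an illustration: the function names are made up here, the expected digest has to be copied by hand from the .md5 file listed above, and it does not replace verifying the PGP signature.

```python
import hashlib


def md5_hex(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected_hex):
    """Compare a file's MD5 digest with an expected hex string
    (surrounding whitespace and letter case are ignored)."""
    return md5_hex(path) == expected_hex.strip().lower()
```

A typical call would be `checksum_matches("mesos-0.22.1.tar.gz", published_digest)`, where `published_digest` is the string taken from the .md5 file.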
 
  Please vote on releasing this package as Apache Mesos 0.22.1!
 
  The vote is open until Mon May 4 18:00:00 PDT 2015 and passes if a majority
  of at least 3 +1 PMC votes are cast.
 
  [ ] +1 Release this package as Apache Mesos 0.22.1
  [ ] -1 Do not release this package because ...
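
The passing rule stated above can be read as a small predicate. The sketch below assumes one particular reading, namely that only PMC votes count, that at least three of them must be +1, and that +1 votes must outnumber -1 votes; the function name and the vote representation are illustrative only.

```python
def vote_passes(votes):
    """votes: iterable of (value, is_pmc) pairs, where value is +1 or -1.

    Assumed reading of the rule: at least three +1 PMC votes, and more
    +1 than -1 among PMC votes. Non-PMC votes are advisory and ignored.
    """
    binding = [value for value, is_pmc in votes if is_pmc]
    plus = sum(1 for value in binding if value == 1)
    minus = sum(1 for value in binding if value == -1)
    return plus >= 3 and plus > minus
```

Under this reading a "+1 non binding" vote, like the one cast earlier in this thread, does not count toward the three required votes.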
 
  Thanks,
  -Adam-
 



Re: Review Request 31265: Provided a factory for allocator in tests.

2015-04-21 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31265/#review81038
---

Ship it!


Ship It!

- Niklas Nielsen


On April 21, 2015, 11:45 a.m., Alexander Rukletsov wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31265/
 ---
 
 (Updated April 21, 2015, 11:45 a.m.)
 
 
 Review request for mesos, Kapil Arya, Michael Park, and Niklas Nielsen.
 
 
 Bugs: MESOS-2160
 https://issues.apache.org/jira/browse/MESOS-2160
 
 
 Repository: mesos
 
 
 Description
 ---
 
 The factory creates allocator instances in a way identical to how instances 
 from modules are created. It allows us to use the same typed tests for 
 built-in and modularized allocators.
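
 The factory idea in this description can be illustrated with a small sketch. It is written in Python rather than the C++ typed tests the review refers to, and all names in it (`SimpleAllocator`, `ModuleAllocator`, the two factory functions, `check_total_cpus`) are hypothetical stand-ins, not code from the patch.

```python
class SimpleAllocator:
    """Stand-in for a built-in allocator that tracks cpus per slave."""

    def __init__(self):
        self.slaves = {}

    def add_slave(self, slave_id, cpus):
        self.slaves[slave_id] = cpus

    def total_cpus(self):
        return sum(self.slaves.values())


class ModuleAllocator(SimpleAllocator):
    """Stand-in for an allocator that would be loaded from a module."""


def create_builtin_allocator():
    return SimpleAllocator()


def create_module_allocator():
    return ModuleAllocator()


# Both kinds of allocator are created through the same factory shape,
# so a single test body can be run against each of them.
ALLOCATOR_FACTORIES = {
    "builtin": create_builtin_allocator,
    "module": create_module_allocator,
}


def check_total_cpus(factory):
    allocator = factory()
    allocator.add_slave("s1", 2.0)
    allocator.add_slave("s2", 4.0)
    return allocator.total_cpus()


results = {name: check_total_cpus(f) for name, f in ALLOCATOR_FACTORIES.items()}
```

 The test body never names a concrete allocator class, which is what lets the same checks cover both the built-in and the module-provided implementation.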
 
 
 Diffs
 -
 
   src/examples/test_allocator_module.cpp PRE-CREATION 
   src/local/local.cpp 289b9bced7bab0d745fe14823efa4e90ec36905e 
   src/master/allocator/mesos/allocator.hpp 
 af27a9bd8299cbff01e04b74db47c86bf247b908 
   src/master/main.cpp 7cce3a0bb808a1cb7bac9acab31eb1c67a15ea9f 
   src/tests/cluster.hpp a56b6541adcdebc5866571bbdbb6828df97b34ec 
   src/tests/hierarchical_allocator_tests.cpp 
 0b564a74d3f04df46fe52fcbe1bf8d4d1e41c53c 
   src/tests/mesos.hpp 7744df55a2a31446327da7bd2b16457e90711d22 
 
 Diff: https://reviews.apache.org/r/31265/diff/
 
 
 Testing
 ---
 
 make check (Mac OS 10.9.5, Ubuntu 14.04)
 
 
 Thanks,
 
 Alexander Rukletsov
 




Re: Review Request 31263: Refactored TestAllocator and allocator text fixture.

2015-04-21 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31263/#review81040
---

Ship it!


Ship It!

- Niklas Nielsen


On April 21, 2015, 12:02 p.m., Alexander Rukletsov wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31263/
 ---
 
 (Updated April 21, 2015, 12:02 p.m.)
 
 
 Review request for mesos, Kapil Arya, Michael Park, and Niklas Nielsen.
 
 
 Bugs: MESOS-2160
 https://issues.apache.org/jira/browse/MESOS-2160
 
 
 Repository: mesos
 
 
 Description
 ---
 
 TestAllocator owns a pointer to a real allocator. Each test should 
 instantiate and destroy Allocator instances explicitly to avoid unexpected 
 calls.
 
 
 Diffs
 -
 
   src/tests/containerizer.cpp 26b87ac6b16dfeaf84888e80296ef540697bd775 
   src/tests/master_allocator_tests.cpp 
 03a1bb8c92b44bc1ad1b5f5cff8d1fb971df2302 
   src/tests/mesos.hpp 7744df55a2a31446327da7bd2b16457e90711d22 
   src/tests/slave_recovery_tests.cpp 87f4a6aab27d142fa8eb7a6571f684a6ce59687e 
 
 Diff: https://reviews.apache.org/r/31263/diff/
 
 
 Testing
 ---
 
 make check (Mac OS 10.9.5, CentOS 7.0)
 
 
 Thanks,
 
 Alexander Rukletsov
 




Re: Review Request 33372: Added decorator documentation and described the semantic change in Mesos 0.23.0

2015-04-21 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33372/
---

(Updated April 21, 2015, 1:13 p.m.)


Review request for mesos and Adam B.


Bugs: MESOS-2622
https://issues.apache.org/jira/browse/MESOS-2622


Repository: mesos


Description
---

See summary.


Diffs (updated)
-

  docs/modules.md a8b471541cdfa584eeb89fbe96643f93c712cfd4 

Diff: https://reviews.apache.org/r/33372/diff/


Testing
---


Thanks,

Niklas Nielsen



Review Request 33372: Added decorator documentation and described the semantic change in Mesos 0.23.0

2015-04-20 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33372/
---

Review request for mesos and Adam B.


Repository: mesos


Description
---

See summary.


Diffs
-

  docs/modules.md a8b471541cdfa584eeb89fbe96643f93c712cfd4 

Diff: https://reviews.apache.org/r/33372/diff/


Testing
---


Thanks,

Niklas Nielsen



Re: Review Request 32948: Refactored VerifyMasterLaunchTaskHook to _not_ use command executor.

2015-04-20 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32948/
---

(Updated April 20, 2015, 2:34 p.m.)


Review request for mesos, Adam B and Kapil Arya.


Bugs: MESOS-2351
https://issues.apache.org/jira/browse/MESOS-2351


Repository: mesos


Description
---

See summary


Diffs (updated)
-

  src/tests/hook_tests.cpp bb9de25bd2c4601d333a3ca1aec13820c7df7378 

Diff: https://reviews.apache.org/r/32948/diff/


Testing
---

make check (test broke previously with an assert as the TestContainerizer 
cannot be used with a command executor)


Thanks,

Niklas Nielsen



Re: Review Request 33372: Added decorator documentation and described the semantic change in Mesos 0.23.0

2015-04-20 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33372/
---

(Updated April 20, 2015, 2:27 p.m.)


Review request for mesos and Adam B.


Changes
---

D'oh. bugs, not branch


Bugs: MESOS-2622
https://issues.apache.org/jira/browse/MESOS-2622


Repository: mesos


Description
---

See summary.


Diffs
-

  docs/modules.md a8b471541cdfa584eeb89fbe96643f93c712cfd4 

Diff: https://reviews.apache.org/r/33372/diff/


Testing
---


Thanks,

Niklas Nielsen



Re: Review Request 33372: Added decorator documentation and described the semantic change in Mesos 0.23.0

2015-04-20 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33372/
---

(Updated April 20, 2015, 2:27 p.m.)


Review request for mesos and Adam B.


Changes
---

added JIRA


Repository: mesos


Description
---

See summary.


Diffs
-

  docs/modules.md a8b471541cdfa584eeb89fbe96643f93c712cfd4 

Diff: https://reviews.apache.org/r/33372/diff/


Testing
---


Thanks,

Niklas Nielsen



Re: Review Request 31028: Added slave run task hook tests.

2015-04-20 Thread Niklas Nielsen


 On April 11, 2015, 3:54 a.m., Adam B wrote:
  src/examples/test_hook_module.cpp, lines 80-85
  https://reviews.apache.org/r/31028/diff/4/?file=920382#file920382line80
 
  Create variables like testLabelKey, etc. above so it's easier to track 
  all these label k/v strings.
 
 Niklas Nielsen wrote:
 I would prefer if we could defer this to a subsequent review; we use foo, 
 bar, baz, qux in quite a few places for label testing. Is that OK with you? I 
 can create a JIRA for it now if you'd like.
 
 Adam B wrote:
 Separate review/JIRA is fine by me. I just have an aversion to hardcoded 
 strings showing up in multiple places.

https://issues.apache.org/jira/browse/MESOS-2622


- Niklas


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31028/#review79804
---


On April 20, 2015, 1:17 p.m., Niklas Nielsen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31028/
 ---
 
 (Updated April 20, 2015, 1:17 p.m.)
 
 
 Review request for mesos, Ben Mahler and Kapil Arya.
 
 
 Bugs: MESOS-2351
 https://issues.apache.org/jira/browse/MESOS-2351
 
 
 Repository: mesos
 
 
 Description
 ---
 
 See summary
 
 
 Diffs
 -
 
   src/examples/test_hook_module.cpp 2f2da1c5ef85af06c7f366d38ce5b64f39d0076f 
   src/tests/hook_tests.cpp bb9de25bd2c4601d333a3ca1aec13820c7df7378 
 
 Diff: https://reviews.apache.org/r/31028/diff/
 
 
 Testing
 ---
 
 make check (with newly added VerifySlaveRunTaskHook test)
 
 
 Thanks,
 
 Niklas Nielsen
 




Re: Review Request 31016: Added slave run task decorator.

2015-04-20 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31016/
---

(Updated April 20, 2015, 1:44 p.m.)


Review request for mesos, Ben Mahler and Kapil Arya.


Bugs: MESOS-2351
https://issues.apache.org/jira/browse/MESOS-2351


Repository: mesos


Description
---

Added decorator which gets invoked on start of runTask() sequence in the slave.


Diffs (updated)
-

  include/mesos/hook.hpp 9ae8b9455a86c7a5cbf4f1d1b1ce88f2811ce35d 
  src/hook/manager.hpp da813492108974a7e26b366845368517589da876 
  src/hook/manager.cpp 7a4cb09bc221af502e867cfb7fff2900b599ff1f 
  src/slave/slave.hpp 9495c704ca4bde4ab283d12efa3ea9b2f1158a4c 
  src/slave/slave.cpp 8ec80ed26f338690e0a1e712065750ab77a724cd 
  src/tests/mesos.hpp 7744df55a2a31446327da7bd2b16457e90711d22 
  src/tests/mesos.cpp fc534e9febed1e293076e00e0f5c3879a78df90f 

Diff: https://reviews.apache.org/r/31016/diff/


Testing
---

make check


Thanks,

Niklas Nielsen



Re: Review Request 32859: Add Camel-case libprocess variable and method names sample.

2015-04-20 Thread Niklas Nielsen


 On April 6, 2015, 8:06 p.m., Joris Van Remoortere wrote:
  Hi haosdent,
  I added some higher level comments in the JIRA ticket for you :-)
 
 Niklas Nielsen wrote:
 haosdent, did you get a chance to take a look at Joris' comments? Do you 
 need to update the review request or what are our next steps?
 
 haosdent huang wrote:
 @nnielsen Yes, I also replied to the comment. So I think the issue is not 
 accepted; should I discard this review?

Let's do that for now. Thanks for understanding!


- Niklas


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32859/#review79117
---


On April 6, 2015, 2:19 p.m., haosdent huang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/32859/
 ---
 
 (Updated April 6, 2015, 2:19 p.m.)
 
 
 Review request for mesos, Joris Van Remoortere and Niklas Nielsen.
 
 
 Bugs: MESOS-2576
 https://issues.apache.org/jira/browse/MESOS-2576
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Add Camel-case libprocess variable and method names sample.
 
 
 Diffs
 -
 
   3rdparty/libprocess/src/libev.hpp e4a403d9e769c13182f26034e0dd1c4db92b04cb 
   3rdparty/libprocess/src/libev.cpp 610dfb896ed8f9c00f9cf8fc8dbfc7d434f7d1e5 
   3rdparty/libprocess/src/libev_poll.cpp 
 324e4dd950989f3717ca73fe48520ca3e518518f 
   3rdparty/libprocess/src/process.cpp 
 cf4e36489be2c6aa01e838c1c71383f248deab5b 
 
 Diff: https://reviews.apache.org/r/32859/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 haosdent huang
 




Re: Review Request 30962: Enabled environment decorator to override.

2015-04-20 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30962/
---

(Updated April 20, 2015, 1:16 p.m.)


Review request for mesos, Ben Mahler and Kapil Arya.


Bugs: MESOS-2351
https://issues.apache.org/jira/browse/MESOS-2351


Repository: mesos


Description
---

See summary


Diffs (updated)
-

  src/examples/test_hook_module.cpp 2f2da1c5ef85af06c7f366d38ce5b64f39d0076f 
  src/hook/manager.cpp 7a4cb09bc221af502e867cfb7fff2900b599ff1f 

Diff: https://reviews.apache.org/r/30962/diff/


Testing
---

make check


Thanks,

Niklas Nielsen



Re: Review Request 30961: Enabled label decorator to override.

2015-04-20 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30961/
---

(Updated April 20, 2015, 1:16 p.m.)


Review request for mesos, Ben Mahler and Kapil Arya.


Bugs: MESOS-2351
https://issues.apache.org/jira/browse/MESOS-2351


Repository: mesos


Description
---

See summary.


Diffs (updated)
-

  include/mesos/hook.hpp 9ae8b9455a86c7a5cbf4f1d1b1ce88f2811ce35d 
  src/examples/test_hook_module.cpp 2f2da1c5ef85af06c7f366d38ce5b64f39d0076f 
  src/hook/manager.cpp 7a4cb09bc221af502e867cfb7fff2900b599ff1f 
  src/master/master.cpp e30b951eda2b3b0d5b2a80716f0b32c6bbe041bc 
  src/tests/hook_tests.cpp bb9de25bd2c4601d333a3ca1aec13820c7df7378 

Diff: https://reviews.apache.org/r/30961/diff/


Testing
---

make check (with modified VerifyMasterLaunchTaskHook test)


Thanks,

Niklas Nielsen



Re: Review Request 31016: Added slave run task decorator.

2015-04-20 Thread Niklas Nielsen


 On April 11, 2015, 3:26 a.m., Adam B wrote:
  src/slave/slave.cpp, lines 1186-1188
  https://reviews.apache.org/r/31016/diff/4/?file=920381#file920381line1186
 
  What makes this the ideal place to do the label decoration? Looks like 
  this is wedged between setting up different unschedule calls. Seems like we 
  should either decorate the labels as early as possible, right after (or 
  before?) the pid/id/state checks; or, if we need to delay it, we could do 
  it after the unschedules.
 
 Niklas Nielsen wrote:
 Great point - I will probably move it up before the unschedule code. Does 
 that sound OK to you?
 
 Adam B wrote:
 Yeah, I'm thinking earlier is better, so we don't early-exit without 
 updating labels.

Took that for a spin and we need the framework info, so we need to have it 
after the unschedule block :/


- Niklas


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31016/#review79803
---


On April 11, 2015, 3:03 a.m., Niklas Nielsen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31016/
 ---
 
 (Updated April 11, 2015, 3:03 a.m.)
 
 
 Review request for mesos, Ben Mahler and Kapil Arya.
 
 
 Bugs: MESOS-2351
 https://issues.apache.org/jira/browse/MESOS-2351
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Added decorator which gets invoked on start of runTask() sequence in the 
 slave.
 
 
 Diffs
 -
 
   include/mesos/hook.hpp 9ae8b9455a86c7a5cbf4f1d1b1ce88f2811ce35d 
   src/hook/manager.hpp da813492108974a7e26b366845368517589da876 
   src/hook/manager.cpp 7a4cb09bc221af502e867cfb7fff2900b599ff1f 
   src/slave/slave.hpp 19e6b44bc344c0ca509674803f401cbb4e1f47ae 
   src/slave/slave.cpp 9fec023b643d410f4d511fa6f80e9835bab95b7e 
 
 Diff: https://reviews.apache.org/r/31016/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Niklas Nielsen
 




Re: Review Request 32948: Refactored VerifyMasterLaunchTaskHook to _not_ use command executor.

2015-04-20 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32948/
---

(Updated April 20, 2015, 1:17 p.m.)


Review request for mesos, Adam B and Kapil Arya.


Bugs: MESOS-2351
https://issues.apache.org/jira/browse/MESOS-2351


Repository: mesos


Description
---

See summary


Diffs (updated)
-

  src/tests/hook_tests.cpp bb9de25bd2c4601d333a3ca1aec13820c7df7378 

Diff: https://reviews.apache.org/r/32948/diff/


Testing
---

make check (test broke previously with an assert as the TestContainerizer 
cannot be used with a command executor)


Thanks,

Niklas Nielsen



Re: Review Request 31028: Added slave run task hook tests.

2015-04-20 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31028/
---

(Updated April 20, 2015, 1:17 p.m.)


Review request for mesos, Ben Mahler and Kapil Arya.


Bugs: MESOS-2351
https://issues.apache.org/jira/browse/MESOS-2351


Repository: mesos


Description
---

See summary


Diffs (updated)
-

  src/examples/test_hook_module.cpp 2f2da1c5ef85af06c7f366d38ce5b64f39d0076f 
  src/tests/hook_tests.cpp bb9de25bd2c4601d333a3ca1aec13820c7df7378 

Diff: https://reviews.apache.org/r/31028/diff/


Testing
---

make check (with newly added VerifySlaveRunTaskHook test)


Thanks,

Niklas Nielsen



Re: Review Request 31017: Fixed comment for remove executor hook.

2015-04-20 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31017/
---

(Updated April 20, 2015, 1:17 p.m.)


Review request for mesos and Kapil Arya.


Bugs: MESOS-2351
https://issues.apache.org/jira/browse/MESOS-2351


Repository: mesos


Description
---

See summary.


Diffs (updated)
-

  include/mesos/hook.hpp 9ae8b9455a86c7a5cbf4f1d1b1ce88f2811ce35d 

Diff: https://reviews.apache.org/r/31017/diff/


Testing
---

make check


Thanks,

Niklas Nielsen



Re: Review Request 31016: Added slave run task decorator.

2015-04-20 Thread Niklas Nielsen


 On April 11, 2015, 3:26 a.m., Adam B wrote:
  src/slave/slave.cpp, lines 1186-1188
  https://reviews.apache.org/r/31016/diff/4/?file=920381#file920381line1186
 
  What makes this the ideal place to do the label decoration? Looks like 
  this is wedged between setting up different unschedule calls. Seems like we 
  should either decorate the labels as early as possible, right after (or 
  before?) the pid/id/state checks; or, if we need to delay it, we could do 
  it after the unschedules.
 
 Niklas Nielsen wrote:
 Great point - I will probably move it up before the unschedule code. Does 
 that sound OK to you?
 
 Adam B wrote:
 Yeah, I'm thinking earlier is better, so we don't early-exit without 
 updating labels.
 
 Niklas Nielsen wrote:
 Took that for a spin and we need the framework info, so we need to have 
 it after the unschedule block :/

Could use the bare info - the hook is moved up in the latest patch.


- Niklas


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31016/#review79803
---


On April 11, 2015, 3:03 a.m., Niklas Nielsen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31016/
 ---
 
 (Updated April 11, 2015, 3:03 a.m.)
 
 
 Review request for mesos, Ben Mahler and Kapil Arya.
 
 
 Bugs: MESOS-2351
 https://issues.apache.org/jira/browse/MESOS-2351
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Added decorator which gets invoked on start of runTask() sequence in the 
 slave.
 
 
 Diffs
 -
 
   include/mesos/hook.hpp 9ae8b9455a86c7a5cbf4f1d1b1ce88f2811ce35d 
   src/hook/manager.hpp da813492108974a7e26b366845368517589da876 
   src/hook/manager.cpp 7a4cb09bc221af502e867cfb7fff2900b599ff1f 
   src/slave/slave.hpp 19e6b44bc344c0ca509674803f401cbb4e1f47ae 
   src/slave/slave.cpp 9fec023b643d410f4d511fa6f80e9835bab95b7e 
 
 Diff: https://reviews.apache.org/r/31016/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Niklas Nielsen
 




Re: Review Request 32198: Added a not equal operator for json objects.

2015-04-20 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32198/#review80559
---



3rdparty/libprocess/3rdparty/stout/tests/json_tests.cpp
https://reviews.apache.org/r/32198/#comment130593

How about also checking for a larger array?


- Niklas Nielsen


On March 23, 2015, 7:19 a.m., Alexander Rojas wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/32198/
 ---
 
 (Updated March 23, 2015, 7:19 a.m.)
 
 
 Review request for mesos, Benjamin Hindman, Bernd Mathiske, Joerg Schad, 
 Niklas Nielsen, and Till Toenshoff.
 
 
 Bugs: MESOS-2510
 https://issues.apache.org/jira/browse/MESOS-2510
 
 
 Repository: mesos
 
 
 Description
 ---
 
 For consistency, adds a not-equal operator to the JSON objects.
 
 It also adds tests for the equality operators.
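
As a rough illustration of the change described here (not the actual stout code), a not-equal operator is usually written as the negation of the equality operator, so the two can never disagree. The `Array` struct below is a hypothetical stand-in for the stout JSON types, and the usage note picks up the suggestion of also checking a larger array:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a JSON array type; this is NOT stout's
// JSON::Array, just enough structure to show the operator pair.
struct Array
{
  std::vector<std::string> values;
};

inline bool operator==(const Array& lhs, const Array& rhs)
{
  return lhs.values == rhs.values;
}

// Defined in terms of operator== so both operators stay consistent.
inline bool operator!=(const Array& lhs, const Array& rhs)
{
  return !(lhs == rhs);
}
```

A test along these lines would compare two equal arrays and then an array with one extra element, asserting `small == same` and `small != larger`.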
 
 
 Diffs
 -
 
   3rdparty/libprocess/3rdparty/stout/include/stout/json.hpp 
 334c898906018be6e663f53815abbe047806b95c 
   3rdparty/libprocess/3rdparty/stout/tests/json_tests.cpp 
 f60d1bbe60f2e2b6460c06bba98e8b85ebb6a3f9 
 
 Diff: https://reviews.apache.org/r/32198/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Alexander Rojas
 




Re: Review Request 32859: Add Camel-case libprocess variable and method names sample.

2015-04-20 Thread Niklas Nielsen


 On April 6, 2015, 8:06 p.m., Joris Van Remoortere wrote:
  Hi haosdent,
  I added some higher level comments in the JIRA ticket for you :-)

haosdent, did you get a chance to take a look at Joris' comments? Do you need 
to update the review request or what are our next steps?


- Niklas


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32859/#review79117
---


On April 6, 2015, 2:19 p.m., haosdent huang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/32859/
 ---
 
 (Updated April 6, 2015, 2:19 p.m.)
 
 
 Review request for mesos, Joris Van Remoortere and Niklas Nielsen.
 
 
 Bugs: MESOS-2576
 https://issues.apache.org/jira/browse/MESOS-2576
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Add Camel-case libprocess variable and method names sample.
 
 
 Diffs
 -
 
   3rdparty/libprocess/src/libev.hpp e4a403d9e769c13182f26034e0dd1c4db92b04cb 
   3rdparty/libprocess/src/libev.cpp 610dfb896ed8f9c00f9cf8fc8dbfc7d434f7d1e5 
   3rdparty/libprocess/src/libev_poll.cpp 
 324e4dd950989f3717ca73fe48520ca3e518518f 
   3rdparty/libprocess/src/process.cpp 
 cf4e36489be2c6aa01e838c1c71383f248deab5b 
 
 Diff: https://reviews.apache.org/r/32859/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 haosdent huang
 




Re: Review Request 32995: Split up documentation for reporting bugs and submitting patches.

2015-04-20 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32995/#review80784
---

Ship it!


Ship It!

- Niklas Nielsen


On April 8, 2015, 5:30 p.m., Ben Mahler wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/32995/
 ---
 
 (Updated April 8, 2015, 5:30 p.m.)
 
 
 Review request for mesos, Benjamin Hindman and Vinod Kone.
 
 
 Repository: mesos
 
 
 Description
 ---
 
 The developer's guide was too generic; this splits up the documents to focus on 
 what the reader is interested in doing.
 
 
 Diffs
 -
 
   docs/home.md 6ab61f85aa7d62e0f19267b886dffb4e0aa826ea 
   docs/mesos-developers-guide.md 6023f2cc4d37ecabc24699bef6bda32068e5b43d 
   docs/reporting-a-bug.md PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/32995/diff/
 
 
 Testing
 ---
 
 N/A
 
 
 Thanks,
 
 Ben Mahler
 




Re: Review Request 32996: Updated roadmap document to link to 'Epic' tickets.

2015-04-20 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32996/#review80785
---



docs/mesos-roadmap.md
https://reviews.apache.org/r/32996/#comment130903

It sounds a bit hostile (it may just be me). How about something like "For 
comments and suggestions for the Mesos roadmap, feel free to reach out to the 
Mesos dev list"?


- Niklas Nielsen


On April 8, 2015, 5:30 p.m., Ben Mahler wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/32996/
 ---
 
 (Updated April 8, 2015, 5:30 p.m.)
 
 
 Review request for mesos and Vinod Kone.
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Epic tickets provide a good view of upcoming and ongoing projects.
 
 
 Diffs
 -
 
   docs/home.md 6ab61f85aa7d62e0f19267b886dffb4e0aa826ea 
   docs/mesos-roadmap.md 150fad84d5ad264929775365fd1b941aefb89ec4 
 
 Diff: https://reviews.apache.org/r/32996/diff/
 
 
 Testing
 ---
 
 N/A
 
 
 Thanks,
 
 Ben Mahler
 




Re: Review Request 32998: Split committer's guide into code reviewing and committing documents.

2015-04-20 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32998/#review80786
---



docs/committing.md
https://reviews.apache.org/r/32998/#comment130904

s/committers/Apache Mesos Committers/?
s/change/changes/?
s/one/a committer/



docs/committing.md
https://reviews.apache.org/r/32998/#comment130905

Very long line. Do we have a style guide for markdown somewhere? I would 
think that 80 cols would work :)



docs/committing.md
https://reviews.apache.org/r/32998/#comment130906

You chose to break the line here, so would it make sense to do a scan and 
wrap?



docs/committing.md
https://reviews.apache.org/r/32998/#comment130907

It would be great to have a todo or somehow refer to a link with a doc 
specifying our supported OS's, compilers etc. :)



docs/effective-code-reviewing.md
https://reviews.apache.org/r/32998/#comment130911

s/, this is not meant to be a sparring match!/./



docs/effective-code-reviewing.md
https://reviews.apache.org/r/32998/#comment130913

s/**//g


- Niklas Nielsen


On April 8, 2015, 5:30 p.m., Ben Mahler wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/32998/
 ---
 
 (Updated April 8, 2015, 5:30 p.m.)
 
 
 Review request for mesos, Adam B, Benjamin Hindman, Jie Yu, Niklas Nielsen, 
 and Vinod Kone.
 
 
 Bugs: MESOS-2581
 https://issues.apache.org/jira/browse/MESOS-2581
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Committer's Guide was too generic. This names the documents after what 
 the reader is looking for: doing effective reviews, and how to commit changes 
 (for committers only).
 
 
 Diffs
 -
 
   docs/committers-guide.md c016ee9cb3290d7788ed258904547b59bbea4f11 
   docs/committing.md PRE-CREATION 
   docs/effective-code-reviewing.md PRE-CREATION 
   docs/home.md 6ab61f85aa7d62e0f19267b886dffb4e0aa826ea 
 
 Diff: https://reviews.apache.org/r/32998/diff/
 
 
 Testing
 ---
 
 N/A
 
 
 Thanks,
 
 Ben Mahler
 




Re: Review Request 32999: Added a document for engineering principles and practices.

2015-04-20 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32999/#review80796
---

Ship it!



docs/engineering-principles-and-practices.md
https://reviews.apache.org/r/32999/#comment130914

Long lines :) Do you think it is worth applying the 80 col style? If so, we 
should do a scan.

s/**high quality**, **robust** code/**high quality** and **robust** code/?


- Niklas Nielsen


On April 8, 2015, 5:30 p.m., Ben Mahler wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/32999/
 ---
 
 (Updated April 8, 2015, 5:30 p.m.)
 
 
 Review request for mesos, Adam B, Benjamin Hindman, Jie Yu, Niklas Nielsen, 
 and Vinod Kone.
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Added a document for engineering principles and practices.
 
 
 Diffs
 -
 
   docs/engineering-principles-and-practices.md PRE-CREATION 
   docs/home.md 6ab61f85aa7d62e0f19267b886dffb4e0aa826ea 
 
 Diff: https://reviews.apache.org/r/32999/diff/
 
 
 Testing
 ---
 
 N/A
 
 
 Thanks,
 
 Ben Mahler
 




Re: Review Request 31645: Added upgrade path test script

2015-04-20 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31645/#review80798
---


Closing this review for now. Kapil will take over this test effort and upload a 
new review request :)

- Niklas Nielsen


On March 2, 2015, 3:44 p.m., Niklas Nielsen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31645/
 ---
 
 (Updated March 2, 2015, 3:44 p.m.)
 
 
 Review request for mesos, Ben Mahler and Vinod Kone.
 
 
 Bugs: MESOS-2372
 https://issues.apache.org/jira/browse/MESOS-2372
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Upgrade path test script runs previous and new framework versions (i.e. 
 different versions of test framework+libmesos) against different versions of 
 mesos slave and master. We can hopefully generate the upgrade paths we want 
 to test systematically (applying some combinatorics) and cover those in this 
 script.
 
 
 Diffs
 -
 
   support/test-upgrade.py PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/31645/diff/
 
 
 Testing
 ---
 
 The output from running the script against 0.21.0 and HEAD (0.23.0):
 
 $ python ./support/run-upgrade.py --prev=../mesos/build-0.21.0/ --next=build
 Running upgrade test from mesos 0.21.0 to mesos 0.23.0
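 +-----------+---------------+---------------+---------------+
 | Test case | Framework     | Master        | Slave         |
 +-----------+---------------+---------------+---------------+
 | #1        | mesos 0.21.0  | mesos 0.21.0  | mesos 0.21.0  |
 | #2 (live) | mesos 0.21.0  | mesos 0.21.0  | mesos 0.23.0  |
 | #3        | mesos 0.23.0  | mesos 0.21.0  | mesos 0.23.0  |
 | #4        | mesos 0.23.0  | mesos 0.23.0  | mesos 0.23.0  |
 +-----------+---------------+---------------+---------------+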
 
 NOTE: live denotes that master process keeps running from previous case.
 
 
 Test case 1 (Run of previous setup)
 # Starting mesos 0.21.0 master #
 Run ['../mesos/build-0.21.0/bin/mesos-master.sh', '--ip=127.0.0.1', 
 '--work_dir=/var/folders/y3/w04yjljd5gbcvbvxvd5bhmp4gn/T/tmp2UMcJv', 
 '--credentials=/var/folders/y3/w04yjljd5gbcvbvxvd5bhmp4gn/T/tmpEMpQQd'], 
 output: /var/folders/y3/w04yjljd5gbcvbvxvd5bhmp4gn/T/tmpEXE5wH
 # Starting mesos 0.21.0 slave #
 Run ['../mesos/build-0.21.0/bin/mesos-slave.sh', '--master=localhost:5050', 
 '--credential=/var/folders/y3/w04yjljd5gbcvbvxvd5bhmp4gn/T/tmpEMpQQd'], 
 output: /var/folders/y3/w04yjljd5gbcvbvxvd5bhmp4gn/T/tmpnGiJfB
 # Starting mesos 0.21.0 framework #
 Waiting for mesos 0.21.0 framework to complete (10 sec max)...
 Run ['../mesos/build-0.21.0/src/test-framework', '--master=localhost:5050'], 
 output: /var/folders/y3/w04yjljd5gbcvbvxvd5bhmp4gn/T/tmp5cfI3A
 
 Test case 2 (Upgrade of slave)
 # Stopping mesos 0.21.0 slave #
 # Starting mesos 0.23.0 slave #
 Run ['build/bin/mesos-slave.sh', '--master=localhost:5050', 
 '--credential=/var/folders/y3/w04yjljd5gbcvbvxvd5bhmp4gn/T/tmpEMpQQd'], 
 output: /var/folders/y3/w04yjljd5gbcvbvxvd5bhmp4gn/T/tmpj9sM2K
 # Starting mesos 0.21.0 framework #
 Waiting for mesos 0.21.0 framework to complete (10 sec max)...
 Run ['../mesos/build-0.21.0/src/test-framework', '--master=localhost:5050'], 
 output: /var/folders/y3/w04yjljd5gbcvbvxvd5bhmp4gn/T/tmpltfC_p
 
 Test case 3 (Upgrade framework)
 # Starting mesos 0.23.0 framework #
 Waiting for mesos 0.23.0 framework to complete (10 sec max)...
 Run ['build/src/test-framework', '--master=localhost:5050'], output: 
 /var/folders/y3/w04yjljd5gbcvbvxvd5bhmp4gn/T/tmpJVWJjW
 
 Test case 4 (Run of next setup)
 # Stopping mesos 0.23.0 slave 
 # Stopping mesos 0.21.0 slave 
 Run ['build/bin/mesos-master.sh', '--ip=127.0.0.1', 
 '--work_dir=/var/folders/y3/w04yjljd5gbcvbvxvd5bhmp4gn/T/tmpALCpmO', 
 '--credentials=/var/folders/y3/w04yjljd5gbcvbvxvd5bhmp4gn/T/tmpEMpQQd'], 
 output: /var/folders/y3/w04yjljd5gbcvbvxvd5bhmp4gn/T/tmp6oLkTt
 # Starting mesos 0.23.0 slave #
 Run ['build/bin/mesos-slave.sh', '--master=localhost:5050', 
 '--credential=/var/folders/y3/w04yjljd5gbcvbvxvd5bhmp4gn/T/tmpEMpQQd'], 
 output: /var/folders/y3/w04yjljd5gbcvbvxvd5bhmp4gn/T/tmpYOCLCM
 # Starting mesos 0.23.0 framework #
 Waiting for mesos 0.23.0 framework to complete (10 sec max)...
 Run ['build/src/test-framework', '--master=localhost:5050'], output: 
 /var/folders/y3/w04yjljd5gbcvbvxvd5bhmp4gn/T/tmpCVjtu3
 
 
 Thanks,
 
 Niklas Nielsen
 




Re: Review Request 31028: Added slave run task hook tests.

2015-04-17 Thread Niklas Nielsen


 On April 11, 2015, 3:54 a.m., Adam B wrote:
  src/tests/hook_tests.cpp, lines 132-140
  https://reviews.apache.org/r/31028/diff/4/?file=920383#file920383line132
 
  Why this change? What's wrong with the TestContainerizer?

I was running into issues as the test containerizer doesn't support using the 
command executor. Let me double check real quick if this is a moot change 
compared to the test refactor later in this patch set.


- Niklas


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31028/#review79804
---


On April 11, 2015, 3:03 a.m., Niklas Nielsen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31028/
 ---
 
 (Updated April 11, 2015, 3:03 a.m.)
 
 
 Review request for mesos, Ben Mahler and Kapil Arya.
 
 
 Bugs: MESOS-2351
 https://issues.apache.org/jira/browse/MESOS-2351
 
 
 Repository: mesos
 
 
 Description
 ---
 
 See summary
 
 
 Diffs
 -
 
   src/examples/test_hook_module.cpp 47409cd4d02e238d1d182571d92019114662cd41 
   src/tests/hook_tests.cpp bb9de25bd2c4601d333a3ca1aec13820c7df7378 
 
 Diff: https://reviews.apache.org/r/31028/diff/
 
 
 Testing
 ---
 
 make check (with newly added VerifySlaveRunTaskHook test)
 
 
 Thanks,
 
 Niklas Nielsen
 




Re: Review Request 29947: Fixed a race condition in hook tests for remove-executor hook.

2015-04-17 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29947/#review80485
---


This patch introduces some technical debt: there is currently no good way to 
synchronize between the test body and the hook code, so we wire a promise 
(owned by the test code). Let's capture this well by creating JIRAs, 
commenting the code, and adding it to the description of this review request, 
so it goes into the commit message.
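
For readers unfamiliar with the pattern, here is a minimal sketch of a test-owned promise that hook code fulfills. It uses the C++ standard library rather than libprocess, and the names `removeExecutorHook` and `waitForHook` are hypothetical, so treat it as an illustration of the idea rather than the code under review:

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Hypothetical stand-in for the hook code living in the module. It signals
// completion through a promise that the test body owns.
inline void removeExecutorHook(std::promise<void>* signal)
{
  // ... the real hook work would happen here ...
  signal->set_value();
}

// Test-body side: returns true if the hook fired within the timeout.
inline bool waitForHook(std::chrono::milliseconds timeout)
{
  std::promise<void> hookInvoked; // Owned by the test, shared with the hook.
  std::future<void> ready = hookInvoked.get_future();

  // Run the hook on another thread, as it would run outside the test body.
  std::thread hook(removeExecutorHook, &hookInvoked);

  const bool fired = ready.wait_for(timeout) == std::future_status::ready;

  // Join before the promise goes out of scope so the hook never sees a
  // dangling pointer.
  hook.join();
  return fired;
}
```

The debt is visible even in this sketch: the hook has to know about an object that belongs to the test, which is why the sharing deserves a comment and a tracking ticket.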


src/examples/test_hook_module.cpp
https://reviews.apache.org/r/29947/#comment130407

Let's add a lot of context here as to why we need to share this between the 
module and the test body :)



src/tests/hook_tests.cpp
https://reviews.apache.org/r/29947/#comment130410

The symbol name doesn't say whether it is a function or just a variable. 
I wonder if we can come up with a more precise name.



src/tests/hook_tests.cpp
https://reviews.apache.org/r/29947/#comment130420

Can you expand on why you need to do this?



src/tests/hook_tests.cpp
https://reviews.apache.org/r/29947/#comment130422

This is the part I hoped we could get around. Can you add a todo with the 
jira referring to the technical debt?



src/tests/hook_tests.cpp
https://reviews.apache.org/r/29947/#comment130423

s/  / /


- Niklas Nielsen


On Jan. 27, 2015, 4:39 p.m., Kapil Arya wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/29947/
 ---
 
 (Updated Jan. 27, 2015, 4:39 p.m.)
 
 
 Review request for mesos and Niklas Nielsen.
 
 
 Bugs: MESOS-2226
 https://issues.apache.org/jira/browse/MESOS-2226
 
 
 Repository: mesos
 
 
 Description
 ---
 
 The task must be killed properly before shutting down the driver. 
 
 
 Diffs
 -
 
   src/examples/test_hook_module.cpp 04fd43eb3eacae0d850dd7f4e191116d20620f10 
   src/tests/hook_tests.cpp 44f73effdce2d03627215418007ccbc3263a0c52 
 
 Diff: https://reviews.apache.org/r/29947/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Kapil Arya
 




Re: Review Request 31028: Added slave run task hook tests.

2015-04-17 Thread Niklas Nielsen


 On April 11, 2015, 3:54 a.m., Adam B wrote:
  src/examples/test_hook_module.cpp, lines 80-85
  https://reviews.apache.org/r/31028/diff/4/?file=920382#file920382line80
 
  Create variables like testLabelKey, etc. above so it's easier to track 
  all these label k/v strings.

I would prefer to defer this to a subsequent review; we use foo, bar, baz, 
and qux in quite a few places for label testing. Is that OK with you? I can 
create a JIRA for it now if you'd like.


- Niklas


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31028/#review79804
---


On April 11, 2015, 3:03 a.m., Niklas Nielsen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31028/
 ---
 
 (Updated April 11, 2015, 3:03 a.m.)
 
 
 Review request for mesos, Ben Mahler and Kapil Arya.
 
 
 Bugs: MESOS-2351
 https://issues.apache.org/jira/browse/MESOS-2351
 
 
 Repository: mesos
 
 
 Description
 ---
 
 See summary
 
 
 Diffs
 -
 
   src/examples/test_hook_module.cpp 47409cd4d02e238d1d182571d92019114662cd41 
   src/tests/hook_tests.cpp bb9de25bd2c4601d333a3ca1aec13820c7df7378 
 
 Diff: https://reviews.apache.org/r/31028/diff/
 
 
 Testing
 ---
 
 make check (with newly added VerifySlaveRunTaskHook test)
 
 
 Thanks,
 
 Niklas Nielsen
 




Re: Review Request 30961: Enabled label decorator to override.

2015-04-17 Thread Niklas Nielsen


 On April 8, 2015, 5:31 p.m., Adam B wrote:
  src/examples/test_hook_module.cpp, line 36
  https://reviews.apache.org/r/30961/diff/7/?file=920371#file920371line36
 
  Unused? Or should you check that the value being removed is what you 
  expect?

It was just to keep the module and test code in sync. Would you prefer I remove 
it?


 On April 8, 2015, 5:31 p.m., Adam B wrote:
  src/examples/test_hook_module.cpp, lines 52-57
  https://reviews.apache.org/r/30961/diff/7/?file=920371#file920371line52
 
  label shadows label? Maybe oldLabel/newLabel? Then you can also reuse 
  the `Label*` in line 52 and 59 (`label_`)

Good point - will get that in.


- Niklas


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30961/#review79454
---


On April 7, 2015, 5:57 p.m., Niklas Nielsen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/30961/
 ---
 
 (Updated April 7, 2015, 5:57 p.m.)
 
 
 Review request for mesos, Ben Mahler and Kapil Arya.
 
 
 Bugs: MESOS-2351
 https://issues.apache.org/jira/browse/MESOS-2351
 
 
 Repository: mesos
 
 
 Description
 ---
 
 See summary.
 
 
 Diffs
 -
 
   include/mesos/hook.hpp 9ae8b9455a86c7a5cbf4f1d1b1ce88f2811ce35d 
   src/examples/test_hook_module.cpp 47409cd4d02e238d1d182571d92019114662cd41 
   src/hook/manager.cpp 7a4cb09bc221af502e867cfb7fff2900b599ff1f 
   src/master/master.cpp 618db68ee4163b06e479cf3413eda4b63c9c5a4b 
   src/tests/hook_tests.cpp bb9de25bd2c4601d333a3ca1aec13820c7df7378 
 
 Diff: https://reviews.apache.org/r/30961/diff/
 
 
 Testing
 ---
 
 make check (with modified VerifyMasterLaunchTaskHook test)
 
 
 Thanks,
 
 Niklas Nielsen
 




Re: Review Request 31016: Added slave run task decorator.

2015-04-17 Thread Niklas Nielsen


 On April 11, 2015, 3:26 a.m., Adam B wrote:
  src/slave/slave.cpp, lines 1186-1188
  https://reviews.apache.org/r/31016/diff/4/?file=920381#file920381line1186
 
  What makes this the ideal place to do the label decoration? Looks like 
  this is wedged between setting up different unschedule calls. Seems like we 
  should either decorate the labels as early as possible, right after (or 
  before?) the pid/id/state checks; or, if we need to delay it, we could do 
  it after the unschedules.

Great point - I will probably move it up before the unschedule code. Does that 
sound OK to you?


- Niklas


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31016/#review79803
---


On April 11, 2015, 3:03 a.m., Niklas Nielsen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/31016/
 ---
 
 (Updated April 11, 2015, 3:03 a.m.)
 
 
 Review request for mesos, Ben Mahler and Kapil Arya.
 
 
 Bugs: MESOS-2351
 https://issues.apache.org/jira/browse/MESOS-2351
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Added decorator which gets invoked on start of runTask() sequence in the 
 slave.
 
 
 Diffs
 -
 
   include/mesos/hook.hpp 9ae8b9455a86c7a5cbf4f1d1b1ce88f2811ce35d 
   src/hook/manager.hpp da813492108974a7e26b366845368517589da876 
   src/hook/manager.cpp 7a4cb09bc221af502e867cfb7fff2900b599ff1f 
   src/slave/slave.hpp 19e6b44bc344c0ca509674803f401cbb4e1f47ae 
   src/slave/slave.cpp 9fec023b643d410f4d511fa6f80e9835bab95b7e 
 
 Diff: https://reviews.apache.org/r/31016/diff/
 
 
 Testing
 ---
 
 make check
 
 
 Thanks,
 
 Niklas Nielsen
 




Re: Review Request 32961: Allow framework re-registration to update master http fields.

2015-04-15 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32961/#review80204
---

Ship it!


Ship It!

- Niklas Nielsen


On April 14, 2015, 9:30 a.m., Joris Van Remoortere wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/32961/
 ---
 
 (Updated April 14, 2015, 9:30 a.m.)
 
 
 Review request for mesos, Benjamin Hindman and Niklas Nielsen.
 
 
 Bugs: MESOS-1218, MESOS-2614 and MESOS-703
 https://issues.apache.org/jira/browse/MESOS-1218
 https://issues.apache.org/jira/browse/MESOS-2614
 https://issues.apache.org/jira/browse/MESOS-703
 
 
 Repository: mesos
 
 
 Description
 ---
 
 Fields: 'name', 'hostname', 'failover_timeout', 'webui_url'
 
 
 Diffs
 -
 
   src/master/master.hpp 6141917644b84edfed9836fa0a005d55a36880e3 
   src/master/master.cpp 44b0a0147f5354824d86332a67b30018634c9a36 
 
 Diff: https://reviews.apache.org/r/32961/diff/
 
 
 Testing
 ---
 
 make check.
 re-registered no_executor_framework with different 'name', 'hostname', 
 'failover_timeout', and 'webui_url'
 
 
 Thanks,
 
 Joris Van Remoortere
 




Oversubscription design doc

2015-04-14 Thread Niklas Nielsen
Hi everyone,

In the context of some of the recent discussions on 'Resource overcommittal',
we are very interested and invested in enabling oversubscription in Mesos
in a safe and efficient way. Some of the community members have been
working on an architecture document and prototypes recently, and we feel we
have something we can start working from:
https://docs.google.com/document/d/1pUnElxHy1uWfHY_FOvvRC73QaOGgdXE0OXN-gbxdXA0/edit#

We will continue meeting up and discussing the approach, as it is a
significant change to how workloads are run on Mesos.
Feel free to jump in, add suggestions, raise concerns and participate in
the development of this feature.

Cheers,
Niklas


Re: Review Request 31028: Added slave run task hook tests.

2015-04-11 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31028/
---

(Updated April 11, 2015, 3:03 a.m.)


Review request for mesos, Ben Mahler and Kapil Arya.


Bugs: MESOS-2351
https://issues.apache.org/jira/browse/MESOS-2351


Repository: mesos


Description
---

See summary


Diffs
-

  src/examples/test_hook_module.cpp 47409cd4d02e238d1d182571d92019114662cd41 
  src/tests/hook_tests.cpp bb9de25bd2c4601d333a3ca1aec13820c7df7378 

Diff: https://reviews.apache.org/r/31028/diff/


Testing
---

make check (with newly added VerifySlaveRunTaskHook test)


Thanks,

Niklas Nielsen



Re: Review Request 31017: Fixed comment for remove executor hook.

2015-04-11 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31017/
---

(Updated April 11, 2015, 3:03 a.m.)


Review request for mesos and Kapil Arya.


Bugs: MESOS-2351
https://issues.apache.org/jira/browse/MESOS-2351


Repository: mesos


Description
---

See summary.


Diffs
-

  include/mesos/hook.hpp 9ae8b9455a86c7a5cbf4f1d1b1ce88f2811ce35d 

Diff: https://reviews.apache.org/r/31017/diff/


Testing
---

make check


Thanks,

Niklas Nielsen



Re: Review Request 31016: Added slave run task decorator.

2015-04-11 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31016/
---

(Updated April 11, 2015, 3:03 a.m.)


Review request for mesos, Ben Mahler and Kapil Arya.


Bugs: MESOS-2351
https://issues.apache.org/jira/browse/MESOS-2351


Repository: mesos


Description
---

Added decorator which gets invoked on start of runTask() sequence in the slave.


Diffs
-

  include/mesos/hook.hpp 9ae8b9455a86c7a5cbf4f1d1b1ce88f2811ce35d 
  src/hook/manager.hpp da813492108974a7e26b366845368517589da876 
  src/hook/manager.cpp 7a4cb09bc221af502e867cfb7fff2900b599ff1f 
  src/slave/slave.hpp 19e6b44bc344c0ca509674803f401cbb4e1f47ae 
  src/slave/slave.cpp 9fec023b643d410f4d511fa6f80e9835bab95b7e 

Diff: https://reviews.apache.org/r/31016/diff/


Testing
---

make check


Thanks,

Niklas Nielsen



Re: Review Request 32948: Refactored VerifyMasterLaunchTaskHook to _not_ use command executor.

2015-04-11 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32948/
---

(Updated April 11, 2015, 3:37 a.m.)


Review request for mesos, Adam B and Kapil Arya.


Bugs: MESOS-2351
https://issues.apache.org/jira/browse/MESOS-2351


Repository: mesos


Description
---

See summary


Diffs
-

  src/tests/hook_tests.cpp bb9de25bd2c4601d333a3ca1aec13820c7df7378 

Diff: https://reviews.apache.org/r/32948/diff/


Testing
---

make check (test broke previously with an assert as the TestContainerizer 
cannot be used with a command executor)


Thanks,

Niklas Nielsen



Re: Review Request 32911: Fixed sandbox ownership bug for executors without URIs.

2015-04-07 Thread Niklas Nielsen


 On April 6, 2015, 7:22 p.m., Benjamin Hindman wrote:
  src/slave/slave.cpp, line 4164
  https://reviews.apache.org/r/32911/diff/1/?file=918555#file918555line4164
 
  Great comment! Can we also add something to the end of the comment that 
  says that the user is validated when the task goes through the master? 
  Thanks!

Are you referring to when authentication and ACL's are enabled?


 On April 6, 2015, 7:22 p.m., Benjamin Hindman wrote:
  src/slave/slave.cpp, line 4166
  https://reviews.apache.org/r/32911/diff/1/?file=918555#file918555line4166
 
  Why don't you need to check the taskInfo? Is it because we should have 
  set the executorInfo's 'user' appropriately? If so, let's comment as much, 
  even going as far as introducing a CHECK!

I don't follow. The user may not be set in task.command.user or 
task.executor.command.user; the master doesn't ensure that it is set, does it?


- Niklas


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32911/#review79113
---


On April 6, 2015, 5:40 p.m., Niklas Nielsen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/32911/
 ---
 
 (Updated April 6, 2015, 5:40 p.m.)
 
 
 Review request for mesos, Benjamin Hindman and Ian Downes.
 
 
 Bugs: MESOS-2592
 https://issues.apache.org/jira/browse/MESOS-2592
 
 
 Repository: mesos
 
 
 Description
 ---
 
 During recent refactorings, executor directory ownership was delegated to the 
 fetcher. However, the fetcher is not invoked if no URIs are present in the 
 executor or task command. This left some of these tasks broken as the 
 directory ownership defaulted to the mesos-slave's (root).
 
 
 Diffs
 ---
 
   src/slave/containerizer/external_containerizer.cpp 
 1bbd61cb096771b7e4a1350079f79a20102e78f9 
   src/slave/paths.hpp 1618439d728ded347ec75317ce8dd998acd7ee94 
   src/slave/paths.cpp 01ea856aa2e628d4aee5fd31f7e49d147f740e8f 
   src/slave/slave.cpp 521624c335b9110e12ee1ff21c3918e5af6a2bde 
 
 Diff: https://reviews.apache.org/r/32911/diff/
 
 
 Testing
 ---
 
 Functional tests with mesos-execute and make check. Have created JIRAs for the 
 introduction of more permission/user tests.
 
 
 Thanks,
 
 Niklas Nielsen
 




Re: Review Request 32911: Fixed sandbox ownership bug for executors without URIs.

2015-04-07 Thread Niklas Nielsen


 On April 7, 2015, 10:13 a.m., Ian Downes wrote:
  src/slave/slave.cpp, line 4166
  https://reviews.apache.org/r/32911/diff/1/?file=918555#file918555line4166
 
  getExecutorInfo() will return the ExecutorInfo if the TaskInfo includes 
  it; otherwise it will construct one (for the command executor), taking the 
  user from the CommandInfo if present. Otherwise this code will fall back to 
  the user from the FrameworkInfo.

What is the issue here?
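
For reference, the fallback order described in the comment above can be sketched roughly as follows. This is a minimal illustration, not the code under review: the structs are simplified, hypothetical stand-ins for the Mesos protobuf messages (an empty string stands for an unset user), and resolveUser() is a name made up for this sketch.

```cpp
#include <string>

// Hypothetical, simplified stand-ins for the Mesos protobuf messages.
// An empty 'user' string means the field is unset.
struct CommandInfo {
  std::string user;
  bool hasUser() const { return !user.empty(); }
};

struct ExecutorInfo {
  CommandInfo command;
};

struct TaskInfo {
  TaskInfo() : hasExecutor(false) {}
  bool hasExecutor;       // True when the task ships its own executor.
  ExecutorInfo executor;  // Only meaningful if hasExecutor is true.
  CommandInfo command;    // Used by the command executor otherwise.
};

struct FrameworkInfo {
  std::string user;
};

// Reuse the task's ExecutorInfo if present; otherwise construct one for
// the command executor, carrying over the user from the task's CommandInfo.
ExecutorInfo getExecutorInfo(const TaskInfo& task)
{
  if (task.hasExecutor) {
    return task.executor;
  }

  ExecutorInfo executor;
  executor.command.user = task.command.user;
  return executor;
}

// Hypothetical helper: fall back to the framework's user when the
// executor's command does not carry one.
std::string resolveUser(const FrameworkInfo& framework, const TaskInfo& task)
{
  const ExecutorInfo executor = getExecutorInfo(task);
  return executor.command.hasUser() ? executor.command.user : framework.user;
}
```

With this shape, a task with no user anywhere resolves to the framework's user, which is the case the question above is about.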


- Niklas


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32911/#review79198
---






Re: Review Request 32911: Fixed sandbox ownership bug for executors without URIs.

2015-04-07 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32911/
---

(Updated April 7, 2015, 11:10 a.m.)


Review request for mesos, Benjamin Hindman and Ian Downes.


Bugs: MESOS-2592
https://issues.apache.org/jira/browse/MESOS-2592


Repository: mesos


Description
---

During recent refactorings, executor directory ownership was delegated to the 
fetcher. However, the fetcher is not invoked if no URIs are present in the 
executor or task command. This left some of these tasks broken as the directory 
ownership defaulted to the mesos-slave's (root).
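
As a rough illustration of the idea (not the actual patch), the sketch below sets the sandbox owner explicitly when there are no URIs and the fetcher therefore never runs. The function name and its parameters are hypothetical; only the POSIX calls getpwnam() and chown() are real. It also assumes the fetcher takes care of ownership whenever URIs are present.

```cpp
#include <pwd.h>
#include <sys/types.h>
#include <unistd.h>

#include <string>
#include <vector>

// Hypothetical helper: returns true if ownership of the sandbox is taken
// care of, either by the fetcher (URIs present) or by an explicit chown.
bool ensureSandboxOwnership(
    const std::string& directory,
    const std::string& user,
    const std::vector<std::string>& uris)
{
  if (!uris.empty()) {
    // Assumption for this sketch: the fetcher is invoked and sets the
    // ownership of the directory itself.
    return true;
  }

  // No URIs: the fetcher is skipped, so change the owner here. Otherwise
  // the directory keeps the owner of the creating process (often root).
  // Note that getpwnam() is not thread-safe.
  const struct passwd* entry = ::getpwnam(user.c_str());
  if (entry == nullptr) {
    return false;  // Unknown user.
  }

  return ::chown(directory.c_str(), entry->pw_uid, entry->pw_gid) == 0;
}
```

Changing the owner to a different user generally requires root privileges, which matches the setup described above where the slave runs as root.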


Diffs (updated)
---

  src/slave/containerizer/external_containerizer.cpp 1bbd61c 
  src/slave/paths.hpp 1618439 
  src/slave/paths.cpp 01ea856 
  src/slave/slave.cpp 521624c 

Diff: https://reviews.apache.org/r/32911/diff/


Testing
---

Functional tests with mesos-execute and make check. Have created JIRAs for the 
introduction of more permission/user tests.


Thanks,

Niklas Nielsen



Re: Review Request 31016: Added slave run task decorator.

2015-04-07 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31016/
---

(Updated April 7, 2015, 5:57 p.m.)


Review request for mesos, Ben Mahler and Kapil Arya.


Changes
---

Addressed Adam's and Kapil's comments.


Repository: mesos


Description
---

Added a decorator that is invoked at the start of the runTask() sequence in the slave.
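
A minimal sketch of the decorator pattern this refers to, under stated assumptions: the types and names below (Labels, Decorator, SketchHookManager, and this runTask()) are hypothetical and are not the interfaces in include/mesos/hook.hpp or src/hook/manager.hpp. The sketch only shows installed decorators being applied, in order, before any other work in the run-task path.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical types for this sketch only.
typedef std::map<std::string, std::string> Labels;
typedef std::function<Labels(const Labels&)> Decorator;

class SketchHookManager
{
public:
  void install(const Decorator& decorator)
  {
    decorators.push_back(decorator);
  }

  // Each decorator receives the labels returned by the previous one,
  // and its result replaces them.
  Labels decorate(Labels labels) const
  {
    for (const Decorator& decorator : decorators) {
      labels = decorator(labels);
    }
    return labels;
  }

private:
  std::vector<Decorator> decorators;
};

// Stand-in for the start of the run-task sequence: decorators run first,
// and the decorated labels are what the rest of the sequence would see.
Labels runTask(const SketchHookManager& hooks, const Labels& taskLabels)
{
  return hooks.decorate(taskLabels);
}
```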


Diffs (updated)
---

  include/mesos/hook.hpp 9ae8b9455a86c7a5cbf4f1d1b1ce88f2811ce35d 
  src/hook/manager.hpp da813492108974a7e26b366845368517589da876 
  src/hook/manager.cpp 7a4cb09bc221af502e867cfb7fff2900b599ff1f 
  src/slave/slave.hpp 19e6b44bc344c0ca509674803f401cbb4e1f47ae 
  src/slave/slave.cpp 9fec023b643d410f4d511fa6f80e9835bab95b7e 

Diff: https://reviews.apache.org/r/31016/diff/


Testing
---

make check


Thanks,

Niklas Nielsen



Re: Review Request 30961: Enabled label decorator to override.

2015-04-07 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30961/
---

(Updated April 7, 2015, 5:57 p.m.)


Review request for mesos, Ben Mahler and Kapil Arya.


Changes
---

Addressed Adam's and Kapil's comments.


Bugs: MESOS-2351
https://issues.apache.org/jira/browse/MESOS-2351


Repository: mesos


Description
---

See summary.


Diffs (updated)
---

  include/mesos/hook.hpp 9ae8b9455a86c7a5cbf4f1d1b1ce88f2811ce35d 
  src/examples/test_hook_module.cpp 47409cd4d02e238d1d182571d92019114662cd41 
  src/hook/manager.cpp 7a4cb09bc221af502e867cfb7fff2900b599ff1f 
  src/master/master.cpp 618db68ee4163b06e479cf3413eda4b63c9c5a4b 
  src/tests/hook_tests.cpp bb9de25bd2c4601d333a3ca1aec13820c7df7378 

Diff: https://reviews.apache.org/r/30961/diff/


Testing
---

make check (with modified VerifyMasterLaunchTaskHook test)


Thanks,

Niklas Nielsen



Re: Review Request 30962: Enabled environment decorator to override.

2015-04-07 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30962/
---

(Updated April 7, 2015, 5:57 p.m.)


Review request for mesos, Ben Mahler and Kapil Arya.


Changes
---

Addressed Adam's and Kapil's comments.


Bugs: MESOS-2351
https://issues.apache.org/jira/browse/MESOS-2351


Repository: mesos


Description
---

See summary


Diffs (updated)
---

  src/examples/test_hook_module.cpp 47409cd4d02e238d1d182571d92019114662cd41 
  src/hook/manager.cpp 7a4cb09bc221af502e867cfb7fff2900b599ff1f 

Diff: https://reviews.apache.org/r/30962/diff/


Testing
---

make check


Thanks,

Niklas Nielsen



Re: Review Request 31028: Added slave run task hook tests.

2015-04-07 Thread Niklas Nielsen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31028/
---

(Updated April 7, 2015, 5:57 p.m.)


Review request for mesos, Ben Mahler and Kapil Arya.


Changes
---

Addressed Adam's and Kapil's comments.


Repository: mesos


Description
---

See summary


Diffs (updated)
---

  src/examples/test_hook_module.cpp 47409cd4d02e238d1d182571d92019114662cd41 
  src/tests/hook_tests.cpp bb9de25bd2c4601d333a3ca1aec13820c7df7378 

Diff: https://reviews.apache.org/r/31028/diff/


Testing
---

make check (with newly added VerifySlaveRunTaskHook test)


Thanks,

Niklas Nielsen


