Re: [VOTE] Move Apache Mesos to Attic

2021-04-06 Thread Kevin Klues
+1 (binding)

While it is of course sad to see things come to an end here, I do encourage
those of you who wish to see Mesos live on to try and breathe new life into
it as a standalone project on github.
In that sense, this is more of a new beginning than an end.
The king is dead, long live the king!

Kevin

On Tue, Apr 6, 2021 at 9:07 PM Benjamin Mahler <bmah...@apache.org> wrote:

> +1 (binding)
>
> Thanks to all who contributed to the project.
>
> On Mon, Apr 5, 2021 at 1:58 PM Vinod Kone  wrote:
>
>> Hi folks,
>>
>> Based on the recent conversations
>> <
>> https://lists.apache.org/thread.html/raed89cc5ab78531c48f56aa1989e1e7eb05f89a6941e38e9bc8803ff%40%3Cuser.mesos.apache.org%3E
>> >
>> on our mailing list, it seems to me that the majority consensus among the
>> existing PMC is to move the project to the attic <
>> https://attic.apache.org/>
>> and let the interested community members collaborate on a fork in Github.
>>
>> I would like to call a vote to dissolve the PMC and move the project to
>> the
>> attic.
>>
>> Please reply to this thread with your vote. Only binding votes from
>> PMC/committers count towards the final tally but everyone in the community
>> is encouraged to vote. See process here
>> .
>>
>> Thanks,
>>
>

-- 
~Kevin


Re: Next Steps

2021-02-18 Thread Kevin Klues
Hello old friends. Long time no hear.

+1 (binding)

Haven't written that in a while...

I also think moving it to the attic (as far as Apache is concerned) makes a
lot of sense.
It can have a life of its own on github (without the overhead of Apache
PMC, requirements for voting, etc.)

Kevin

On Thu, Feb 18, 2021 at 9:27 PM Till Toenshoff wrote:

> +1 to what Renan (and Benjamin) suggested.
>
>

-- 
~Kevin


Re: [External] Re: docker containerizer with nvidia-docker

2019-04-10 Thread Kevin Klues
Adding GPU support to the docker containerizer is not something that is
very hard to do. The choice in the past to *not* build GPU support for the
docker containerizer was a conscious one in order to get people moved over to
the UCR instead. All of the other innovations we work on are prioritised
for the UCR, and we didn't see a compelling reason to make an exception for
GPU support. Building a solution around nvidia-docker would have been a
solution requiring minimal changes to mesos, but then there would have been
yet another dependency in the system that we didn't want to introduce.

However, this was a decision made over 3 years ago, and maybe it's time to
revisit it.

The next docker release will include an integrated `--gpus` flag, bypassing
the need for nvidia-docker entirely:

https://github.com/docker/cli/pull/1714

With this in place it really would be trivial to add support for GPUs to
the docker containerizer, since there would be no requirement for users to
do any external setup for nvidia-docker.

What do people think? Has the landscape changed and does it now make sense
to add GPU support for the docker containerizer given the new upcoming
`--gpus` flag?

Kevin

On Fri, Apr 5, 2019 at 6:58 PM Benjamin Mahler  wrote:

> +Kevin Klues
>
>
> On Fri, Apr 5, 2019 at 1:24 AM Huadong Liu  wrote:
>
>> Hi Ben, thanks for pointing me to the docker containerizer ticket. I do
>> see
>> the value of UCR.
>>
>> Since nvidia-docker already takes care of mounting the driver etc., if we
>> use the "--docker=nvidia-docker" agent option to replace the docker
>> command
>> with the nvidia-docker command, GPU support with the docker containerizer
>> seems trivial. Did I miss anything?
>>
>> On Thu, Apr 4, 2019 at 8:00 PM Benjamin Mahler 
>> wrote:
>>
>> > The "UCR" (aka mesos containerizer) and "Docker containerizer" are two
>> > different containerizers that users tend to choose between. UCR is what
>> > many of our serious users rely on and so we made the investment there
>> > first. GPU support for the docker containerizer was also something that
>> was
>> > planned, but hasn't been prioritized:
>> > https://issues.apache.org/jira/browse/MESOS-5795
>> >
>> > These days, many of our users use Docker images with UCR (i.e. bypassing
>> > the need for the docker daemon).
>> >
>> > Maybe the containerization devs can chime in here if I'm saying anything
>> > inaccurate or to shed some light on where things are headed.
>> >
>> > On Wed, Apr 3, 2019 at 2:21 PM Huadong Liu 
>> wrote:
>> >
>> > > Hi,
>> > >
>> > > Nvidia GPU support in Mesos/Marathon mandates the mesos containerizer
>> > > <
>> > >
>> >
>> https://github.com/mesosphere/marathon/blob/master/src/main/scala/mesosphere/marathon/state/AppDefinition.scala#L557
>> > > >
>> > >  which "mimics" nvidia-docker.
>> > > <http://mesos.apache.org/documentation/latest/gpu-support/> Can
>> someone
>> > > help me understand why docker containerizer with agent option
>> > > "--docker=nvidia-docker" wasn't the choice? Thank you!
>> > >
>> > > --
>> > > Huadong
>> > >
>> >
>>
>


Re: Question ablout "Attach/Exec Support in Mesos"

2017-06-17 Thread Kevin Klues
I've been told the newest master doesn't work with the Mac version of this
tool anymore.  Here is a link to a new version that works:

https://drive.google.com/drive/u/0/folders/0B4qvtaqAh24VZWZqc3ZYUGVTd2c

On Sun, May 21, 2017 at 1:16 PM Mao Geng 
wrote:

> Hi Kevin,
>
> I just tried it on my Mac and it works well with our standalone Mesos 1.2.0
> cluster. Thanks for sharing this!
>
> Cheers,
> Mao
>
> On Sun, May 21, 2017 at 9:16 AM, Kevin Klues  wrote:
>
> > ------ Forwarded message -
> > From: Kevin Klues 
> > Date: Sun, May 21, 2017 at 9:14 AM
> > Subject: Re: question ablout "Attach/Exec Support in Mesos"
> > To: 唐亮 
> >
> > Hi Tangliang,
> >
> > Unfortunately we only have support for `task exec` in the DC/OS CLI at
> the
> > moment. We have been planning to backport it to the Mesos CLI for some
> > time, but haven't managed to do so yet.
> >
> > To complicate things, DC/OS used to allow running against standalone
> mesos,
> > but the latest release of the DC/OS CLI doesn't support this anymore.
> > Version 0.4.16 is the only release that supports *both* standalone mesos
> > and `task exec` (https://github.com/dcos/dcos-cli/releases/tag/0.4.16).
> >
> > However, version 0.4.16 had a bug which didn't allow `task exec` to work
> > with pods (it works with all other containers launched by the universal
> > containerizer, just not pods). A fix for this has been committed upstream
> > and is included in the latest DC/OS CLI, but that version of the CLI
> > doesn't support running against standalone mesos anymore.
> >
> > Bummer
> >
> > Ideally, we would just backport `task exec` support into the mesos CLI
> and
> > not have to worry about this. However, since this hasn't been done yet,
> > I've decided to create a (non-release) version of the DC/OS CLI which can
> > be used against standalone mesos and supports `task exec` for both normal
> > containers and pods.
> >
> > Below are links to both Mac and Linux binaries for this version of the
> CLI:
> > Mac: https://drive.google.com/open?id=0B4qvtaqAh24VVWVIY1RkR2ZMS1U
> > Linux: https://drive.google.com/open?id=0B4qvtaqAh24Vd2JtMTZuUFJSUjg
> >
> > To use this version of the DC/OS CLI with standalone mesos, you first
> need
> > to set core.dcos_url to a dummy value and then set core.mesos_master_url
> to
> > the URL for your mesos master.
> > $ dcos config set core.dcos_url ""
> > $ dcos config set core.mesos_master_url 
> >
> > The format of the mesos_master_url is:
> > "mesos_master_url": {
> > "description": "Mesos master URL. Must be set in format:
> > \"http://host:port\"",
> > "format": "uri",
> > "title": "Mesos Master URL",
> > "type": "string"
> > }
> >
> > I'm not sure how many of the commands in the DC/OS CLI work in a
> standalone
> > mesos cluster, but I've tested at least the following with the binaries
> > attached to this email and they seem to work just fine:
> >
> > $ dcos task
> > NAME      HOST      USER  STATE  ID
> > gpu-test  core-dev  root  R      gpu-test
> >
> > $ dcos task exec -it gpu-test bash
> > [root@core-dev /]# exit
> >
> > $ dcos task log gpu-test
> > Executing pre-exec command
> > '{"arguments":["mesos-containerizer","mount","--
> > help=false","--operation=make-rslave","--path=\/"],"shell":
> > false,"value":"\/home\/klueska\/projects\/mesos\/build\/src\/mesos-
> > containerizer"}'
> > Received SUBSCRIBED event
> > Subscribed executor on core-dev
> > Received LAUNCH event
> > Starting task gpu-test
> > ...
> >
> > Hopefully we will find the time soon to backport all of this to the mesos
> > CLI, so you won't have to do this awkward dance just to use `task exec`.
> >
> > Let me know if you have any other questions.
> >
> > Thanks!
> >
> > Kevin
> >
> > >
> >
>


Re: GPU Users -- Deprecation of GPU_RESOURCES capability

2017-05-26 Thread Kevin Klues
I've added JIRAs to:

1) Add master flag `--filter-gpu-resources={true|false}`
https://issues.apache.org/jira/browse/MESOS-7576

2) Deprecate GPU_RESOURCES capability and master flag
`--filter-gpu-resources={true|false}`
https://issues.apache.org/jira/browse/MESOS-7579

3) Remove GPU_RESOURCES capability and master flag
`--filter-gpu-resources={true|false}`
https://issues.apache.org/jira/browse/MESOS-7577

Kevin

On Fri, May 26, 2017 at 1:49 PM Benjamin Mahler  wrote:

> I filed https://issues.apache.org/jira/browse/MESOS-7574 for reservations
> to multiple roles. We'll find one that captures the deprecation of the
> GPU_RESOURCES capability as well, with reservations to multiple roles as a
> blocker.
>
> On Fri, May 26, 2017 at 8:54 AM, Zhitao Li  wrote:
>
> > Hi Benjamin,
> >
> > Thanks for getting back. Do you have an issue already filed for
> > the "reservations to multiple roles" story, or is it folded under another
> > JIRA story?
> >
> >
> >
> > On Fri, May 26, 2017 at 12:44 AM, Benjamin Mahler 
> > wrote:
> >
> > > Thanks for the feedback!
> > >
> > > There have been some discussions for allowing reservations to multiple
> > > roles (or more generally, role expressions), which is essentially what
> > > you've suggested Zhitao. (However, note that what is provided by the
> GPU
> > > capability filtering is not quite this, it's actually analogous to a
> > > reservation for multiple schedulers, not roles). Reservations to
> multiple
> > > roles seems to be the right replacement for those who rely on the GPU
> > > filtering behavior.
> > >
> > > Since we don't have reservations to multiple roles at this point, we
> > > shouldn't deprecate the GPU_RESOURCES capability until this is in
> place.
> > >
> > > With hierarchical roles, it's possible (although potentially clumsy) to
> > > achieve roughly what is provided by the GPU filtering using sub-roles.
> > > Since reservations made to a "gpu" role would be available to all of
> the
> > > descendant roles within tree, e.g.
> > > "gpu/analytics", "gpu/forecasting/training", etc. This is equivalent
> to a
> > > restricted version of reservations to multiple roles, where the roles
> are
> > > restricted to the descendant roles. This can get clumsy because if
> > > "eng/backend/image-processing" wants to get in on the reserved gpus,
> the
> > > user would have to place a related role underneath the "gpu" role, e.g.
> > > "gpu/eng/backend/image-processing".
> > >
> >
> > The exact reason you mentioned about the "clumsy" part would effectively
> > prevent me from implementing this in our org even if it's already
> available.
> >
> >
> > >
> > > For the addition of the filter, note that this flag would be a
> temporary
> > > measure that would be removed when the deprecation cycle of the
> > capability
> > > is complete. It would be good to independently consider the generalized
> > > filtering idea you brought up.
> > >
> > > On Mon, May 22, 2017 at 9:15 AM, Zhitao Li 
> > wrote:
> > >
> > > > Hi Kevin,
> > > >
> > > > Thanks for engaging with the community on this. My 2 cents:
> > > >
> > > > 1. I feel that this capability has a particularly useful semantic
> which
> > > is
> > > > lacking in the current reservation system: reserving some scarce
> > resource
> > > > for a* dynamic list of multiple roles:*
> > > >
> > > > Right now, any reservation (static or dynamic) can only express the
> > > > semantic of "reserving this resource for the given role R". However,
> > in a
> > > > complex cluster, it is possible that we have [R1, R2, ..., RN] which
> > > wants
> > > > to share the scarce resource among them but there is another set of
> > roles
> > > > which should never see the given resource.
> > > >
> > > > The new hierarchical role (and/or multi-role?) might be able to
> > provide a
> > > > better solution, but until that's widely available and adopted, the
> > > > capabilities based hack is the only thing I know that can solve the
> > > > problem.
> > > >
> > > > In fact, I think if we are going to go with `--filter-gpu-resources`
> > > path,
> > > > I think we should make the filter mo

Fwd: Question ablout "Attach/Exec Support in Mesos"

2017-05-21 Thread Kevin Klues
-- Forwarded message -
From: Kevin Klues 
Date: Sun, May 21, 2017 at 9:14 AM
Subject: Re: question ablout "Attach/Exec Support in Mesos"
To: 唐亮 

Hi Tangliang,

Unfortunately we only have support for `task exec` in the DC/OS CLI at the
moment. We have been planning to backport it to the Mesos CLI for some
time, but haven't managed to do so yet.

To complicate things, DC/OS used to allow running against standalone mesos,
but the latest release of the DC/OS CLI doesn't support this anymore.
Version 0.4.16 is the only release that supports *both* standalone mesos
and `task exec` (https://github.com/dcos/dcos-cli/releases/tag/0.4.16).

However, version 0.4.16 had a bug which didn't allow `task exec` to work
with pods (it works with all other containers launched by the universal
containerizer, just not pods). A fix for this has been committed upstream
and is included in the latest DC/OS CLI, but that version of the CLI
doesn't support running against standalone mesos anymore.

Bummer

Ideally, we would just backport `task exec` support into the mesos CLI and
not have to worry about this. However, since this hasn't been done yet,
I've decided to create a (non-release) version of the DC/OS CLI which can
be used against standalone mesos and supports `task exec` for both normal
containers and pods.

Below are links to both Mac and Linux binaries for this version of the CLI:
Mac: https://drive.google.com/open?id=0B4qvtaqAh24VVWVIY1RkR2ZMS1U
Linux: https://drive.google.com/open?id=0B4qvtaqAh24Vd2JtMTZuUFJSUjg

To use this version of the DC/OS CLI with standalone mesos, you first need
to set core.dcos_url to a dummy value and then set core.mesos_master_url to
the URL for your mesos master.
$ dcos config set core.dcos_url ""
$ dcos config set core.mesos_master_url 

The format of the mesos_master_url is:
"mesos_master_url": {
    "description": "Mesos master URL. Must be set in format: \"http://host:port\"",
    "format": "uri",
    "title": "Mesos Master URL",
    "type": "string"
}
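
As a rough illustration of the "http://host:port" format described above, here is a minimal Python sketch of a check you could run on a value before passing it to `dcos config set`. The helper name `is_valid_master_url` is hypothetical and is not part of the DC/OS CLI. Accepting "https" as well as "http" is an assumption, since the schema text above only mentions "http".

```python
from urllib.parse import urlsplit


def is_valid_master_url(url):
    """Return True if url looks like "http://host:port" (hypothetical helper)."""
    try:
        parts = urlsplit(url)
        port = parts.port  # Raises ValueError for a non-numeric port.
    except ValueError:
        return False
    return (
        parts.scheme in ("http", "https")  # Assumption: https is also allowed.
        and bool(parts.hostname)
        and port is not None
        and parts.path in ("", "/")
    )
```

For example, `is_valid_master_url("http://10.0.0.1:5050")` passes this check, while a bare `10.0.0.1:5050` without a scheme does not.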

I'm not sure how many of the commands in the DC/OS CLI work in a standalone
mesos cluster, but I've tested at least the following with the binaries
attached to this email and they seem to work just fine:

$ dcos task
NAME      HOST      USER  STATE  ID
gpu-test  core-dev  root  R      gpu-test

$ dcos task exec -it gpu-test bash
[root@core-dev /]# exit

$ dcos task log gpu-test
Executing pre-exec command
'{"arguments":["mesos-containerizer","mount","--help=false","--operation=make-rslave","--path=\/"],"shell":false,"value":"\/home\/klueska\/projects\/mesos\/build\/src\/mesos-containerizer"}'
Received SUBSCRIBED event
Subscribed executor on core-dev
Received LAUNCH event
Starting task gpu-test
...

Hopefully we will find the time soon to backport all of this to the mesos
CLI, so you won't have to do this awkward dance just to use `task exec`.

Let me know if you have any other questions.

Thanks!

Kevin

>


GPU Users -- Deprecation of GPU_RESOURCES capability

2017-05-20 Thread Kevin Klues
Hello GPU users,

We are currently considering deprecating the requirement that frameworks
register with the GPU_RESOURCES capability in order to receive offers that
contain GPUs. Going forward, we will recommend that users rely on Mesos's
builtin `reservation` mechanism to achieve similar results.

Before deprecating it, we wanted to get a sense from the community if
anyone is currently relying on this capability and would like to see it
persist. If not, we will begin deprecating it in the next Mesos release and
completely remove it in Mesos 2.0.

As background, the original motivation for this capability was to keep
“legacy” frameworks from inadvertently scheduling jobs that don’t require
GPUs on GPU capable machines and thus starving out other frameworks that
legitimately want to place GPU jobs on those machines. The assumption here
was that most machines in a cluster won't have GPUs installed on them, so
some mechanism was necessary to keep legacy frameworks from scheduling jobs
on those machines. In essence, it provided an implicit reservation of GPU
machines for "GPU aware" frameworks, bypassing the traditional
`reservation` mechanism already built into Mesos.

In such a setup, legacy frameworks would be free to schedule jobs on
non-GPU machines, and "GPU aware" frameworks would be free to schedule GPU
jobs on GPU machines and other types of jobs on other machines (or mix and
match them however they please).

However, the problem comes when *all* machines in a cluster contain GPUs
(or even if most of the machines in a cluster contain them). When this is
the case, we have the opposite problem we were trying to solve by
introducing the GPU_RESOURCES capability in the first place. We end up
starving out jobs from legacy frameworks that *don’t* require GPU resources
because there are not enough machines available that don’t have GPUs on
them to service those jobs. We've actually seen this problem manifest in
the wild at least once.

An alternative to completely deprecating the GPU_RESOURCES capability would be to
add a new flag to the mesos master called `--filter-gpu-resources`. When
set to `true`, this flag will cause the mesos master to continue to
function as it does today. That is, it would filter offers containing GPU
resources and only send them to frameworks that opt into the GPU_RESOURCES
framework capability. When set to `false`, this flag would cause the master
to *not* filter offers containing GPU resources, and indiscriminately send
them to all frameworks whether they set the GPU_RESOURCES capability or not.

For anyone who currently depends on the GPU_RESOURCES capability, this flag
would allow them to keep relying on it without disruption.
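
To make the proposed flag semantics concrete, here is a minimal Python sketch of the filtering decision. It is only a model of the behavior described in this email, not the master's actual allocator code. The `Framework` class and the `should_offer` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Framework:
    """Hypothetical stand-in for a registered framework."""
    name: str
    capabilities: set = field(default_factory=set)


def should_offer(framework, offer_has_gpus, filter_gpu_resources=True):
    """Decide whether a framework may receive an offer (illustrative only)."""
    if not offer_has_gpus:
        return True  # Offers without GPUs are never filtered.
    if not filter_gpu_resources:
        return True  # Flag set to false: GPU offers go to every framework.
    # Flag set to true (today's behavior): only opted-in frameworks see GPUs.
    return "GPU_RESOURCES" in framework.capabilities
```

With the flag set to `true`, a legacy framework never sees GPU machines, which is what causes the starvation described above when every machine in the cluster has GPUs.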

We'd prefer to deprecate the capability completely, but would consider
adding this flag if people are currently relying on the GPU_RESOURCES
capability and would like to see it persist.

We welcome any feedback you have.

Kevin + Ben


Re: Mesos Container Attach/Exec

2016-10-27 Thread Kevin Klues
+user  list

Hello all,

We recently started working on support for `docker attach` and `docker
exec` like functionality in Mesos.

Here is a link to the design doc:
https://docs.google.com/document/d/1nAVr0sSSpbDLrgUlAEB5hKzCl482NSVk8V0D56sFMzU

The design doc is not yet complete, but it is filled out enough to start
eliciting feedback. Please feel free to add comments (or even add
suggestions for content!) as you wish.

Thanks!

Kevin


Mesos Container Attach/Exec

2016-10-27 Thread Kevin Klues
Hello all,

We recently started working on support for `docker attach` and `docker
exec` like functionality in Mesos.


Here is a link to the design doc:
https://docs.google.com/document/d/1nAVr0sSSpbDLrgUlAEB5hKzCl482NSVk8V0D56sFMzU

The design doc is not yet complete, but it is filled out enough to start
eliciting feedback. Please feel free to add comments (or even add
suggestions for content!) as you wish.

Thanks!

Kevin


Re: mesos git commit: Fixed a bug in getRootContainerId due to protobuf copying issue.

2016-09-19 Thread Kevin Klues
Here is another problematic use case that I ran into a few weeks back:

Running the following ends up entering a recursive loop and blowing up the
stack:

```
container.id.mutable_parent()->CopyFrom(container.id);
```

If you look at the body of `MergeFrom()`, which is called from
`CopyFrom()`, you can see why:

```
void ContainerID::MergeFrom(const ContainerID& from) {
  GOOGLE_CHECK_NE(&from, this);
  if (from._has_bits_[0 / 32] & (0xffu << (0 % 32))) {
    if (from.has_value()) {
      set_value(from.value());
    }
    if (from.has_parent()) {
      mutable_parent()->::mesos::ContainerID::MergeFrom(from.parent());
    }
  }
  mutable_unknown_fields()->MergeFrom(from.unknown_fields());
}
```

When we call `CopyFrom()` we pass it the same object whose parent we are
trying to modify. However, once we make the original `mutable_parent()`
call, a parent has been created (albeit an empty one) on the original
object. This allows the `from.has_parent()` call in `MergeFrom()` to
succeed. From there we enter a recursive loop in calls to
`MergeFrom()` since we always operate on the same object.
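
The same failure mode can be reproduced with a toy model that has no protobuf dependency. The sketch below is Python, and its `ContainerID` class is a simplified stand-in written for illustration, not the generated protobuf API. It only mimics the two behaviors that matter here: `mutable_parent()` lazily creates an empty parent, and `merge_from()` recurses into the parent. In Python the runaway recursion surfaces as a `RecursionError` rather than a blown stack.

```python
import copy


class ContainerID:
    """Toy stand-in for the generated protobuf class (not the real API)."""

    def __init__(self):
        self.value = None
        self.parent = None

    def has_parent(self):
        return self.parent is not None

    def mutable_parent(self):
        # Like protobuf, lazily create an empty parent on first access.
        if self.parent is None:
            self.parent = ContainerID()
        return self.parent

    def merge_from(self, other):
        if other.value is not None:
            self.value = other.value
        if other.has_parent():
            self.mutable_parent().merge_from(other.parent)


def set_parent_to_self_unsafe(container_id):
    # Models container.id.mutable_parent()->CopyFrom(container.id):
    # the source gains a parent before the merge starts, so every level
    # of the merge sees has_parent() == True and recurses again.
    container_id.mutable_parent().merge_from(container_id)


def set_parent_to_self_safe(container_id):
    # Take a snapshot first so the source no longer aliases the target.
    snapshot = copy.deepcopy(container_id)
    container_id.mutable_parent().merge_from(snapshot)
```

The safe variant follows the same idea as the temporary-copy fix in the commit under discussion: break the aliasing between source and target before merging.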

On Mon, Sep 19, 2016 at 1:33 AM haosdent  wrote:

> From mesos.pb.h/mesos.pb.cc
>
> ```
>   inline ContainerID& operator=(const ContainerID& from) {
>     CopyFrom(from);
>     return *this;
>   }
>
> void ContainerID::CopyFrom(const ::google::protobuf::Message& from) {
>   if (&from == this) return;
>   Clear();
>   MergeFrom(from);
> }
> ```
>
> On Mon, Sep 19, 2016 at 4:32 PM, haosdent  wrote:
>
> > @Neil, when
> >
> > ```
> > rootContainerId = rootContainerId.parent();
> > ```
> >
> > protobuf would try to call `ContainerID::Clear()` first and then perform
> > `ContainerID::CopyFrom`. Because the parent has been broken after
> > `ContainerID::Clear()`, so the `ContainerID::CopyFrom` would get an empty
> > value. I think this is not a bug of protobuf and we should avoid using
> >
> > ```
> > Message = Message.xx();
> > ```
> >
> > On Mon, Sep 19, 2016 at 3:42 PM, Neil Conway 
> > wrote:
> >
> >> Hi Jie,
> >>
> >> Do you have more details on what exactly the problem is here? If
> >> protobuf is unable to copy/merge nested messages in general, that
> >> seems like something that might crop up elsewhere.
> >>
> >> Perhaps we can (a) file a JIRA (ideally with a self-contained
> >> test-case), and/or (b) report the problem to upstream?
> >>
> >> Neil
> >>
> >> -- Forwarded message --
> >> From:  
> >> Date: Sat, Sep 17, 2016 at 11:27 PM
> >> Subject: mesos git commit: Fixed a bug in getRootContainerId due to
> >> protobuf copying issue.
> >> To: comm...@mesos.apache.org
> >>
> >>
> >> Repository: mesos
> >> Updated Branches:
> >>   refs/heads/master a4fd86bce -> be81a924a
> >>
> >>
> >> Fixed a bug in getRootContainerId due to protobuf copying issue.
> >>
> >> It looks like protobuf is not so great dealing with nesting messages
> >> when doing merge or copy. This patch uses an extra copy to bypass that
> >> issue in the protobuf.
> >>
> >> Review: https://reviews.apache.org/r/51992
> >>
> >>
> >> Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
> >> Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/be81a924
> >> Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/be81a924
> >> Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/be81a924
> >>
> >> Branch: refs/heads/master
> >> Commit: be81a924a9e9414ec98f8c9a87a5391dad865146
> >> Parents: a4fd86b
> >> Author: Jie Yu 
> >> Authored: Sat Sep 17 14:22:31 2016 -0700
> >> Committer: Jie Yu 
> >> Committed: Sat Sep 17 14:25:33 2016 -0700
> >>
> >> --
> >>  src/slave/containerizer/mesos/utils.hpp | 7 ++-
> >>  1 file changed, 6 insertions(+), 1 deletion(-)
> >> --
> >>
> >>
> >> http://git-wip-us.apache.org/repos/asf/mesos/blob/be81a924/s
> >> rc/slave/containerizer/mesos/utils.hpp
> >> --
> >> diff --git a/src/slave/containerizer/mesos/utils.hpp
> >> b/src/slave/containerizer/mesos/utils.hpp
> >> index 2bb55c1..178ebf3 100644
> >> --- a/src/slave/containerizer/mesos/utils.hpp
> >> +++ b/src/slave/containerizer/mesos/utils.hpp
> >> @@ -27,7 +27,12 @@ static ContainerID getRootContainerId(const
> >> ContainerID& containerId)
> >>  {
> >>ContainerID rootContainerId = containerId;
> >>while (rootContainerId.has_parent()) {
> >> -rootContainerId = rootContainerId.parent();
> >> +// NOTE: Looks like protobuf does not handle copying well when
> >> +// nesting message is involved, because the source and the target
> >> +// point to the same object. Therefore, we create a temporary
> >> +// variable and use an extra copy here.
> >> +ContainerID id = rootContainerId.parent();
> >> +rootContainerId = id;
> >>}
> >>
> >>return rootContainerId;
> >>
> >
> >
> >
> > --
> > Best Regards,
> > Haosden

Re: Adding "syntax=proto2" to Mesos public protobuf files

2016-09-07 Thread Kevin Klues
I would promote doing the following as well if it doesn't break anything.
https://issues.apache.org/jira/browse/MESOS-5186

It would unblock tfmesos (a distributed tensorflow framework on Mesos) from
working with an unmodified Mesos.

On Wed, Sep 7, 2016 at 2:13 AM Greg Mann  wrote:

> AFAICT, this shouldn't break anything, since by omitting the syntax
> specification we're effectively setting it to the default, "proto2". As a
> quick test, I added `syntax = "proto2";` to the top of mesos.proto, built
> Mesos, and then ran some tests - it seems to work fine!
>
> Greg
>
>
> On Tue, Sep 6, 2016 at 12:50 PM, Zhitao Li  wrote:
>
> > Hi all,
> >
> > Does anyone see what could break if we add a line "syntax=proto2" to
> every
> > public .proto file in the Mesos codebase?
> >
> > Some of our internal projects compile against Mesos and we are manually
> > adding this line because we also use proto3 for internal files, and this
> > is necessary for protoc to work correctly. It'll be better if this can be
> > added upstream.
> >
> > This is not really urgent or blocker, just something nice to have.
> >
> > Thanks!
> >
> > --
> > Cheers,
> >
> > Zhitao Li
> >
>


Re: 1.0.1 release

2016-08-10 Thread Kevin Klues
Depends on if we want something to enable DC/OS 1.8 to configure easily for
GPU use.

On Tuesday, August 9, 2016, Benjamin Mahler  wrote:

> All of the issues I've been shepherding have been fixed.
>
> The only one I see remaining is this one, but doesn't look like a blocking
> issue: https://issues.apache.org/jira/browse/MESOS-5985
>
> Anything else that needs to go in?
>
> On Mon, Aug 1, 2016 at 4:19 PM, Vinod Kone  > wrote:
>
> > Hi,
> >
> > As discussed on the 1.0 voting thread, we plan to cut a 1.0.1 as early as
> > this week. So if you have anything that needs to absolutely go into the
> > patch release, please work with your shepherd and get it landed on trunk
> > and backported to the 1.0.x branch.
> >
> > Thanks,
> >
>


-- 
~Kevin


Re: GPU channel on slack

2016-06-30 Thread Kevin Klues
https://reviews.apache.org/r/49456/

On Thu, Jun 30, 2016 at 8:55 AM Vinod Kone  wrote:

> Mind updating
> https://github.com/apache/mesos/blob/master/docs/working-groups.md with
> this info?
>
> On Thu, Jun 30, 2016 at 8:44 AM, Kevin Klues  wrote:
>
> > If you are interested in the ongoing GPU work on Mesos, please join the
> > #gpus channel at mesos.slack.com. The big announcements for the GPU work
> > will still happen on this mailing list, but the day to day discussions
> will
> > likely happen on the slack channel going forward.
> >
>


GPU channel on slack

2016-06-30 Thread Kevin Klues
If you are interested in the ongoing GPU work on Mesos, please join the
#gpus channel at mesos.slack.com. The big announcements for the GPU work
will still happen on this mailing list, but the day to day discussions will
likely happen on the slack channel going forward.


Re: Mesos CLI

2016-06-22 Thread Kevin Klues
>
> The best option may still be for it
> to be in Python, this is why I'm asking if there are particular things that
> our helper libraries don't provide which you are leveraging in python.
>

One thing we rely heavily on that is missing is `docopt`. We use docopt for
convenient / standardized command line parsing and help formatting. This
makes it really easy to enforce a standard help format across plugins so
the CLI has a consistent feel throughout all of its subcommands. Supposedly
there is a C++ implementation of this now, but it requires gcc 4.9+ (for
regex).
https://github.com/docopt/docopt.cpp

In addition to this, the plugin architecture we built was very easy to
implement in python, and I'm worried it would be much more complicated (and
less readable) to get the same functionality out of C++. The existing CLI
has some support for "plugins" (by looking for executables in the path with
a "mesos-" prefix and assuming they are an extension to the CLI that can
exist as a subcommand). However, the implementation of this is pretty
ad-hoc and error prone (though it could conceivably be redone to work
better).
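
For reference, the prefix-based discovery described above can be sketched in a few lines of Python. This is an illustration of the general idea, not the existing CLI's actual implementation. The function name `find_subcommands` and its "first match wins" rule are assumptions made for the example.

```python
import os


def find_subcommands(path_dirs, prefix="mesos-"):
    """Map subcommand name -> executable path for files named <prefix><name>."""
    found = {}
    for directory in path_dirs:
        try:
            entries = sorted(os.listdir(directory))
        except OSError:
            continue  # Skip PATH entries that are missing or unreadable.
        for entry in entries:
            full = os.path.join(directory, entry)
            if (entry.startswith(prefix) and len(entry) > len(prefix)
                    and os.path.isfile(full) and os.access(full, os.X_OK)):
                # First match wins, mirroring normal PATH lookup order.
                found.setdefault(entry[len(prefix):], full)
    return found
```

It would typically be called as `find_subcommands(os.environ.get("PATH", "").split(os.pathsep))`. Note that nothing in this scheme gives the CLI help text or argument metadata for a discovered executable, which is part of why it feels ad-hoc.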

To get the equivalent functionality out of C++ for the plugin architecture
we've built for python, each plugin would need to be implemented as a
shared object that we dlopen() from the main program. Each module would
define a set of global variables describing properties of the plugin
(including help information) as well as create an instance of a class that
inherits from a `PluginBase` class to perform the actual functionality of
the plugin. The main program would then load this module, integrate its
help information and other meta data into its own metadata, and begin
invoking functions on the plugin class.

I'm not saying it's impossible to do in C++, just that python lends itself
better to doing this kind of stuff, and is much more readable when doing so.
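
To give a feel for what this looks like on the python side, here is a minimal sketch of the pattern. It is not the code from our reference implementation. The class and function names (`PluginBase`, `ContainerPlugin`, `build_help`, `dispatch`) and the attribute layout are hypothetical and chosen only to mirror the description above: each plugin declares metadata and help text, and the main program merges that metadata and forwards calls to the plugin instance.

```python
class PluginBase:
    """Hypothetical base class; plugins contribute commands and help text."""
    NAME = ""
    SHORT_HELP = ""
    COMMANDS = {}  # Command name -> one-line help string.

    def run(self, command, argv):
        if command not in self.COMMANDS:
            raise ValueError("unknown command: " + command)
        # Each command is implemented as a method with the same name.
        return getattr(self, command)(argv)


class ContainerPlugin(PluginBase):
    NAME = "container"
    SHORT_HELP = "Interact with running containers"
    COMMANDS = {"ps": "List running containers"}

    def ps(self, argv):
        return "no containers"  # Placeholder; a real plugin would query an agent.


def build_help(plugins):
    """Merge each plugin's metadata into the top-level help text."""
    lines = ["Usage: mesos <plugin> <command> [args...]", "", "Plugins:"]
    for plugin in plugins:
        lines.append("  {:<12}{}".format(plugin.NAME, plugin.SHORT_HELP))
    return "\n".join(lines)


def dispatch(plugins, argv):
    """Route argv of the form [plugin, command, *args] to the right plugin."""
    by_name = {plugin.NAME: plugin for plugin in plugins}
    plugin_name, command, rest = argv[0], argv[1], argv[2:]
    return by_name[plugin_name].run(command, rest)
```

Doing the equivalent in C++ would replace the plain class attributes with exported symbols in a shared object and the list of plugin instances with dlopen() calls, which is where most of the extra complexity comes from.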


Re: Mesos CLI

2016-06-22 Thread Kevin Klues
Here is a link to the design doc:
https://docs.google.com/document/d/1r6Iv4Efu8v8IBrcUTjgYkvZ32WVscgYqrD07OyIglsA/edit?ts=57573bba#

It sounds like people would like to see the CLI written in C++.
However, the reference implementation we have is written completely in
python. Please see the section on "Reference Implementation" in the
design doc for a discussion of why we chose this.

Do people consider a C++ implementation of the CLI a requirement or
simply desirable? Is there strong opposition to python for any
reason other than code parity?

On Wed, Jun 22, 2016 at 1:01 PM, Joris Van Remoortere
 wrote:
> +1 for maintaining in repo.
> +1 for C++. Are the tools in libprocess not sufficient to make this easy to
> write? What is missing that would make it easier to write in C++?
>
> —
> *Joris Van Remoortere*
> Mesosphere
>
> On Wed, Jun 22, 2016 at 2:24 AM, Zhou Z Xing  wrote:
>
>> +1 for keeping it in Mesos repo and rewrite it in C++.
>>
>> Also, do we need a CLI work group on this?
>>
>> Thanks & Best Wishes,
>>
>> Tom Xing(邢舟)
>> Emerging Technology Institute, IBM China Software Development Lab
>> --
>> IBM China Software Development Laboratory (CSDL)
>> Notes ID:Zhou Z Xing/China/IBM
>> Phone :86-10-82450442
>> e-Mail :xingz...@cn.ibm.com
>> Address :Building No.28, ZhongGuanCun Software Park, No.8 Dong Bei Wang
>> West Road, Haidian District, Beijing, P.R.China 100193
>> 地址 :中国北京市海淀区东北旺西路8号 中关村软件园28号楼 100193
>>
>>
>>
>> From: Benjamin Mahler 
>> To: dev , jfarr...@apache.org
>> Date: 2016-06-22 06:12 AM
>> Subject: Re: Mesos CLI
>> --
>>
>>
>>
>> +1 for keeping it in the repo.
>>
>> We can establish maintainers for the CLI to ensure that it can maintain a
>> reasonable update cadence. Note that we haven't done this well for the
>> webui and CLI, so we need to make sure we do it better this time around.
>>
>> If the architecture allows for easy integration of custom commands (written
>> in any language) then it should enable users to create their own helpful
>> CLI commands that we can eventually pull in and support in a first class
>> way.
>>
>> On Tue, Jun 21, 2016 at 1:26 PM, Jake Farrell  wrote:
>>
>> > +1 to in repo
>> >
>> > C++ would be nice to maintain language parity. Go would also be a great
>> > choice.
>> >
>> > -Jake
>> >
>> > On Tue, Jun 21, 2016 at 3:15 PM, Vinod Kone  wrote:
>> >
>> > > +1 for keeping it in repo.
>> > >
>> > > Would be nice if the CLI can be written entirely in C++ though, to
>> avoid
>> > > supporting more languages.
>> > >
>> > > On Tue, Jun 21, 2016 at 12:12 PM, Jie Yu  wrote:
>> > >
>> > > > I personally prefer it being part of the Mesos repo so that when
>> people
>> > > > install our package, they'll get the command line tools as well. That
>> > > also
>> > > > avoids the potential version mismatch between Mesos and CLI (as you
>> > > > mentioned).
>> > > >
>> > > > What do others think?
>> > > >
>> > > > - Jie
>> > > >
>> > > > On Tue, Jun 21, 2016 at 10:20 AM, Kevin Klues 
>> > wrote:
>> > > >
>> > > > > I've created an Epic to track this:
>> > > > > https://issues.apache.org/jira/browse/MESOS-5676
>> > > > >
>> > > > > There have been efforts on this that have failed in the past (e.g.
>> > > > > https://github.com/mesosphere/mesos-cli)
>> > > > >
>> > > > > I'm curious what people's thoughts are in terms of keeping the CLI
>> > > > > integrated into mesos itself vs. maintaining it outside in a
>> separate
>> > > > > repo. There are advantages / disadvantages to both.  The primary
>> > > > > advantage of keeping it in is (in theory) it can keep better pace
>> > with
>> > > > > Mesos itself and will be fixed if any new / changed features break
>> > its
>> > > > > unit tests. 

Re: [GPU] [Allocation] "Scarce" Resource Allocation

2016-06-21 Thread Kevin Klues
As an FYI, preliminary support to work around this issue for GPUs will
appear in the 1.0 release
https://reviews.apache.org/r/48914/

This doesn't solve the problem of scarce resources in general, but it
will at least keep non-GPU workloads from starving out GPU-based
workloads on GPU capable machines. The downside of this approach is
that only GPU aware frameworks will be able to launch stuff on GPU
capable machines (meaning some of their resources could go unused
unnecessarily).  We decided this tradeoff is acceptable for now.
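
The rule above can be sketched as follows. This is a minimal illustration in Python, not the allocator's actual C++ code; the names `Agent`, `Framework`, `may_offer`, and the `GPU_AWARE` label (borrowed from the discussion quoted below) are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    gpus: int = 0  # number of GPUs the agent advertises


@dataclass
class Framework:
    name: str
    capabilities: set = field(default_factory=set)


GPU_AWARE = "GPU_AWARE"  # hypothetical capability label for this sketch


def may_offer(agent: Agent, framework: Framework) -> bool:
    """Offer a GPU machine's resources only to GPU-aware frameworks."""
    if agent.gpus > 0:
        return GPU_AWARE in framework.capabilities
    return True  # non-GPU agents are offered to every framework


agents = [Agent("cpu-1"), Agent("gpu-1", gpus=4)]
batch = Framework("batch")
trainer = Framework("trainer", {GPU_AWARE})

# Map each framework to the agents whose resources it may be offered.
offers = {f.name: [a.name for a in agents if may_offer(a, f)]
          for f in (batch, trainer)}
print(offers)
```

Note that in this sketch the non-GPU framework never sees `gpu-1` at all, which is exactly the tradeoff mentioned above: the CPU, memory, and disk on that machine can sit idle.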

Kevin

On Tue, Jun 21, 2016 at 1:40 PM, Elizabeth Lingg
 wrote:
> Thanks, looking forward to discussion and review on your document. The main 
> use case I see here is that some of our frameworks will want to request the 
> GPU resources, and we want to make sure that those frameworks are able to 
> successfully launch tasks on agents with those resources. We want to be 
> certain that other frameworks that do not require GPU’s will not request all 
> other resources on those agents (i.e. cpu, disk, memory) which would mean the 
> GPU resources are not allocated and the frameworks that require them will not 
> receive them. As Ben Mahler mentioned, "(2) Because we do not have revocation 
> yet, if a framework decides to consume the non-GPU resources on a GPU 
> machine, it will prevent the GPU workloads from running!” This will occur for 
> us in clusters where we have higher utilization as well as different types of 
> workloads running. Smart task placement then becomes more relevant (i.e. we 
> want to be able to schedule with scarce resources successfully and we may 
> have considerations like not scheduling too many I/O bound workloads on a 
> single host or more stringent requirements for scheduling persistent tasks).
>
>  Elizabeth Lingg
>
>
>
>> On Jun 20, 2016, at 7:24 PM, Guangya Liu  wrote:
>>
>> I had some discussion with Ben M about the following two solutions:
>>
>> 1) Ben M: Create sub-pools of resources based on machine profile and
>> perform fair sharing / quota within each pool, plus a framework
>> capability GPU_AWARE to let the allocator filter out scarce resources
>> for some frameworks.
>> 2) Guangya: Add new sorters for non-scarce resources, plus a framework
>> capability GPU_AWARE to let the allocator filter out scarce resources for
>> some frameworks.
>>
>> The two solutions above amount to the same thing: creating sub-pools of
>> resources requires introducing a separate sorter for each sub-pool, so I
>> will merge the two solutions into one.
>>
>> I also had some discussion with Ben about AlexR's solution of implementing
>> "requestResource". This API should be treated as an improvement for the
>> problem of allocating resources pessimistically (e.g. we offer the GPUs to
>> 1000 frameworks, which all decline them, before offering them to the GPU
>> framework that wants them). "requestResource" provides *more information*
>> to Mesos. Namely, it gives us awareness of demand.
>>
>> In some cases a framework can use "requestResource" to get all of the
>> scarce resources, and once those scarce resources are in use, the WDRF
>> sorter will sort the non-scarce resources as normal. The problem is that
>> we cannot guarantee that the framework calling "requestResource" will
>> always consume all of the scarce resources before they are allocated to
>> other frameworks.
>>
>> I'm planning to draft a document based on solution 1) "Create sub-pools"
>> for the long term solution, any comments are welcome!
>>
>> Thanks,
>>
>> Guangya
>>
>> On Sat, Jun 18, 2016 at 11:58 AM, Guangya Liu  wrote:
>>
>>> Thanks Du Fan. So you mean that we should have some clear rules in
>>> document or somewhere else to tell or guide cluster admin which resources
>>> should be classified as scarce resources, right?
>>>
>>> On Sat, Jun 18, 2016 at 2:38 AM, Du, Fan  wrote:
>>>


 On 2016/6/17 7:57, Guangya Liu wrote:

> @Fan Du,
>
> Currently, I think that the scarce resources should be defined by cluster
> admin, s/he can specify those scarce resources via a flag when master
> start
> up.
>

 This is not what I mean.
 IMO, it's not the cluster admin's call to decide which resources should be
 marked as scarce. They can carry out the operation, but should be advised
 based on a clear rule: to what extent the resource is scarce compared
 with other resources, and how it will affect wDRF by causing starvation for
 frameworks which hold scarce resources. That's my point.

 To the best of my knowledge, a quantitative study of how wDRF behaves in
 scenarios with one or multiple scarce resources would first help to verify
 the proposed approach, and guide the users of this functionality.



 Regarding to the proposal of generic scarce resources, do you have any
> thoughts on this? I can see that giving framework d

Re: Mesos CLI

2016-06-21 Thread Kevin Klues
I've created an Epic to track this:
https://issues.apache.org/jira/browse/MESOS-5676

There have been efforts on this that have failed in the past (e.g.
https://github.com/mesosphere/mesos-cli)

I'm curious what people's thoughts are in terms of keeping the CLI
integrated into mesos itself vs. maintaining it outside in a separate
repo. There are advantages / disadvantages to both.  The primary
advantage of keeping it in is (in theory) it can keep better pace with
Mesos itself and will be fixed if any new / changed features break its
unit tests.  The advantage of keeping it out is that it can evolve more
easily and is not subject to the limitations of the Mesos build
system.

On Mon, Jun 20, 2016 at 11:05 AM, Haris Choudhary
 wrote:
> Hey All,
>
> We are finalizing a Design Doc for the redesign and hope to send it out in
> the next few days.
>
>
>
> On Mon, Jun 20, 2016 at 9:47 AM, Zhitao Li  wrote:
>
>> +1
>>
>> Very interested in participating on design/feature discussions and making
>> some contributions here.
>>
>> On Sun, Jun 19, 2016 at 10:23 PM, tommy xiao  wrote:
>>
>> > +1
>> >
>> > 2016-06-20 11:23 GMT+08:00 Zhou Z Xing :
>> >
>> > > Hi Haris,
>> > >
>> > > Is there a detailed plan for the Mesos CLI redesign? We also want to
>> > > improve the Mesos CLI, for example by leveraging the new HTTP/Operator
>> > > APIs and adding new commands. We definitely want to join you in the
>> > > discussion and develop a new CLI for Mesos.
>> > >
>> > > Thanks & Best Wishes,
>> > >
>> > > Tom Xing(邢舟)
>> > > Emerging Technology Institute, IBM China Software Development Lab
>> > > --
>> > > IBM China Software Development Laboratory (CSDL)
>> > > Notes ID:Zhou Z Xing/China/IBM
>> > > Phone :86-10-82450442
>> > > e-Mail :xingz...@cn.ibm.com
>> > > Address :Building No.28, ZhongGuanCun Software Park, No.8 Dong Bei Wang
>> > > West Road, Haidian District, Beijing, P.R.China 100193
>> > > 地址 :中国北京市海淀区东北旺西路8号 中关村软件园28号楼 100193
>> > >
>> > >
>> > >
>> > > From: Haris Choudhary 
>> > > To: dev@mesos.apache.org
>> > > Date: 2016-06-18 01:11 AM
>> > > Subject: Mesos CLI
>> > > --
>> > >
>> > >
>> > >
>> > > Hey All,
>> > >
>> > > The Mesos CLI is going through a redesign. We are aware that the
>> > > "mesos-execute" command is used pretty often, so that will be ported
>> > > into the new CLI. However, we're not sure if any of the other current
>> > > CLI commands are being used at all. The remaining commands are as
>> > > follows:
>> > > - cat
>> > > - ps
>> > > - tail
>> > > - scp
>> > >
>> > > If anyone is still using them, please let us know. *If a command is not
>> > > being used it may be removed completely without a deprecation notice. *
>> > >
>> > > Thanks!
>> > >
>> > >
>> > >
>> > >
>> >
>> >
>> > --
>> > Deshi Xiao
>> > Twitter: xds2000
>> > E-mail: xiaods(AT)gmail.com
>> >
>>
>>
>>
>> --
>> Cheers,
>>
>> Zhitao Li
>>



-- 
~Kevin


Re: New external dependency

2016-06-20 Thread Kevin Klues
The goal is to let users leverage the nvidia Docker images
(https://hub.docker.com/r/nvidia/) without any added effort on their
behalf. Using docker they are able to launch containers from these
images by simply running `nvidia-docker run ...` (i.e. they are
unaware that a magic volume is being injected on their behalf). On
Mesos we want the experience to be similar.

In terms of providing an external component to do the library
consolidation instead of building it into Mesos itself -- we
considered this.  We originally planned on building this functionality
as an isolator module (giving us the benefit of external linkage
without having to run a separate Linux process), but there are some
limitations with the current isolator interface that prohibit us from
doing this properly. Moreover, building it as an isolator module would
mean that it couldn't be shared by the docker containerizer (which we
plan to add support for in the future).

On Mon, Jun 20, 2016 at 7:30 PM, Jean Christophe “JC” Martin
 wrote:
> Kevin,
>
> I agree about the need to create the volume, and gather the information. My 
> point was not really clear, sorry.
> My point was that it should not be different than any use case needing 
> special mounts and could either be solved by passing this information at the 
> time of container creation (it doesn’t seem that there are that many 
> libraries, and it would not be harder than say running the mesos slave in a 
> container, purely from a number of volume statements), or it could be solved 
> externally as the docker volume container does with a more generic solution.
>
> Thanks,
>
> JC
>
>> On Jun 20, 2016, at 6:59 PM, Kevin Klues  wrote:
>>
>> For now we've decided to actually remove the hard dependence on libelf
>> for the 1.0 release and spend a bit more time thinking about the right
>> way to pull it in.
>>
>> Jean, to answer your question though -- someone would still need to
>> consolidate these libraries, even if it wasn't left to Mesos to do so.
>> These libraries are spread across the file system, and need to be
>> pulled into a single place for easy injection. The full list of
>> binaries / libraries is here:
>>
>> https://github.com/NVIDIA/nvidia-docker/blob/master/tools/src/nvidia/volumes.go#L109
>>
>> We could put this burden on the operator and trust he gets it right,
>> or we could have Mesos programmatically do it itself. We considered
>> just leveraging the nvidia-docker-plugin itself (instead of
>> duplicating its functionality into mesos), but ultimately decided it
>> was better not to introduce an external dependency on it (since it is
>> a separate running executable, rather than a simple library, like
>> libelf).
>>
>> On Mon, Jun 20, 2016 at 5:12 PM, Jean Christophe “JC” Martin
>>  wrote:
>>> As an operator not using GPUs, I feel that the burden seems misplaced, and 
>>> disproportionate.
>>> I assume that the operator of a GPU cluster knows the location of the 
>>> libraries based on their OS, and could potentially provide this information 
>>> at the time of creating the containers. I am not sure to see why this 
>>> something that mesos is required to do (consolidating the libraries in the 
>>> volume, versus being a configuration/external information).
>>>
>>> Thanks,
>>>
>>> JC
>>>
>>>> On Jun 20, 2016, at 2:30 PM, Kevin Klues  wrote:
>>>>
>>>> Sorry, the ticket just links to the nvidia-docker project without much
>>>> further explanation. The information at the link below should make it
>>>> a bit more clear:
>>>>
>>>> https://github.com/NVIDIA/nvidia-docker/wiki/NVIDIA-driver.
>>>>
>>>> The crux of the issue is that we need to be able to consolidate all of
>>>> the Nvidia binaries/libraries into a single volume that we inject into
>>>> a docker container.  We use libelf to get the canonical names
>>>> of all the Nvidia libraries (i.e. SONAME in their dynamic sections) as
>>>> well as to look up what external dependencies they have (i.e. NEEDED in
>>>> their dynamic sections) in order to build this volume.
>>>>
>>>> NOTE: None of this volume support is actually in Mesos yet -- we just
>>>> added the libelf dependence in anticipation of it.
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Jun 20, 2016 at 12:59 PM, Yan Xu  wrote:
>>>>> It's not immediately clear from the ticket why the change from optional
>>>>> dependency to required dependency though? Could you summarize?
>

Re: New external dependency

2016-06-20 Thread Kevin Klues
For now we've decided to actually remove the hard dependence on libelf
for the 1.0 release and spend a bit more time thinking about the right
way to pull it in.

Jean, to answer your question though -- someone would still need to
consolidate these libraries, even if it wasn't left to Mesos to do so.
These libraries are spread across the file system, and need to be
pulled into a single place for easy injection. The full list of
binaries / libraries is here:

https://github.com/NVIDIA/nvidia-docker/blob/master/tools/src/nvidia/volumes.go#L109

We could put this burden on the operator and trust he gets it right,
or we could have Mesos programmatically do it itself. We considered
just leveraging the nvidia-docker-plugin itself (instead of
duplicating its functionality into mesos), but ultimately decided it
was better not to introduce an external dependency on it (since it is
a separate running executable, rather than a simple library, like
libelf).
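
To make the consolidation step concrete, here is a minimal Python sketch of pulling scattered files into one directory that could then be mounted as a volume. It is not the Mesos or nvidia-docker implementation: the function name `consolidate`, the flat directory layout, and the placeholder file names are all assumptions for illustration.

```python
import os
import shutil
import tempfile


def consolidate(paths, volume_dir):
    """Copy files scattered across the filesystem into one directory."""
    os.makedirs(volume_dir, exist_ok=True)
    copied = []
    for path in paths:
        target = os.path.join(volume_dir, os.path.basename(path))
        shutil.copy2(path, target)  # a real tool might hard-link instead
        copied.append(target)
    return copied


# Demo with stand-in files; real inputs would be the driver binaries/libraries.
with tempfile.TemporaryDirectory() as root:
    sources = []
    for sub, name in [("usr/lib", "libfoo.so.1"), ("usr/bin", "foo-tool")]:
        os.makedirs(os.path.join(root, sub))
        src = os.path.join(root, sub, name)
        with open(src, "w") as handle:
            handle.write("placeholder")
        sources.append(src)
    volume = os.path.join(root, "volume")
    consolidate(sources, volume)
    contents = sorted(os.listdir(volume))
    print(contents)
```

A flat directory is a simplification; a real volume would likely keep binaries and libraries in separate subdirectories.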

On Mon, Jun 20, 2016 at 5:12 PM, Jean Christophe “JC” Martin
 wrote:
> As an operator not using GPUs, I feel that the burden seems misplaced, and 
> disproportionate.
> I assume that the operator of a GPU cluster knows the location of the 
> libraries based on their OS, and could potentially provide this information 
> at the time of creating the containers. I am not sure to see why this 
> something that mesos is required to do (consolidating the libraries in the 
> volume, versus being a configuration/external information).
>
> Thanks,
>
> JC
>
>> On Jun 20, 2016, at 2:30 PM, Kevin Klues  wrote:
>>
>> Sorry, the ticket just links to the nvidia-docker project without much
>> further explanation. The information at the link below should make it
>> a bit more clear:
>>
>> https://github.com/NVIDIA/nvidia-docker/wiki/NVIDIA-driver.
>>
>> The crux of the issue is that we need to be able to consolidate all of
>> the Nvidia binaries/libraries into a single volume that we inject into
>> a docker container.  We use libelf to get the canonical names
>> of all the Nvidia libraries (i.e. SONAME in their dynamic sections) as
>> well as to look up what external dependencies they have (i.e. NEEDED in
>> their dynamic sections) in order to build this volume.
>>
>> NOTE: None of this volume support is actually in Mesos yet -- we just
>> added the libelf dependence in anticipation of it.
>>
>>
>>
>>
>> On Mon, Jun 20, 2016 at 12:59 PM, Yan Xu  wrote:
>>> It's not immediately clear from the ticket why the change from optional
>>> dependency to required dependency though? Could you summarize?
>>>
>>>
>>> On Sun, Jun 19, 2016 at 12:33 PM, Kevin Klues  wrote:
>>>>
>>>> Thanks Zhitao,
>>>>
>>>> I just pushed out a review for upgrades.md and added you as a reviewer.
>>>>
>>>> The new dependence was added in the JIRA that haosdent linked, but the
>>>> actual reason for adding the dependence is more related to:
>>>> https://issues.apache.org/jira/browse/MESOS-5401
>>>>
>>>> On Sun, Jun 19, 2016 at 9:34 AM, haosdent  wrote:
>>>>> The related issue is Change build to always enable Nvidia GPU support
>>>>> for
>>>>> Linux
>>>>> Last time my local build break before Kevin send out the email, and then
>>>>> find this change.
>>>>>
>>>>> On Mon, Jun 20, 2016 at 12:11 AM, Zhitao Li 
>>>>> wrote:
>>>>>>
>>>>>> Hi Kevin,
>>>>>>
>>>>>> Thanks for letting us know. It seems like this is not called out in
>>>>>> upgrades.md, so can you please document this additional dependency
>>>>>> there?
>>>>>>
>>>>>> Also, can you include the link to the JIRA or patch requiring this
>>>>>> dependency so we can have some contexts?
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> On Sat, Jun 18, 2016 at 10:25 AM, Kevin Klues 
>>>>>> wrote:
>>>>>>
>>>>>>> Hello all,
>>>>>>>
>>>>>>> Just an FYI that the newest libmesos now has an external dependence
>>>>>>> on
>>>>>>> libelf on Linux. This dependence can be installed via the following
>>>>>>> packages:
>>>>>>>
>>>>>>> CentOS 6/7: yum install elfutils-libelf.x86_64
>>>>>>> Ubuntu14.04:   apt-get install libelf1
>>>>>>>
>>>>>>> Alternatively you can install from source:
>>>>>>> https://directory.fsf.org/wiki/Libelf
>>>>>>>
>>>>>>> For developers, you will also need to install the libelf headers in
>>>>>>> order to build master. This dependency can be installed via:
>>>>>>>
>>>>>>> CentOS: elfutils-libelf-devel.x86_64
>>>>>>> Ubuntu: libelf-dev
>>>>>>>
>>>>>>> Alternatively, you can install from source:
>>>>>>> https://directory.fsf.org/wiki/Libelf
>>>>>>>
>>>>>>> The getting started guide and the support/docker_build.sh scripts
>>>>>>> have
>>>>>>> been updated appropriately, but you may need to update your local
>>>>>>> environment if you don't yet have these packages installed.
>>>>>>>
>>>>>>> --
>>>>>>> ~Kevin
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Cheers,
>>>>>>
>>>>>> Zhitao Li
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best Regards,
>>>>> Haosdent Huang
>>>>
>>>>
>>>>
>>>> --
>>>> ~Kevin
>>>
>>>
>>
>>
>>
>> --
>> ~Kevin
>



-- 
~Kevin


Re: New external dependency

2016-06-20 Thread Kevin Klues
Sorry, the ticket just links to the nvidia-docker project without much
further explanation. The information at the link below should make it
a bit more clear:

https://github.com/NVIDIA/nvidia-docker/wiki/NVIDIA-driver

The crux of the issue is that we need to be able to consolidate all of
the Nvidia binaries/libraries into a single volume that we inject into
a docker container.  We use libelf to get the canonical names
of all the Nvidia libraries (i.e. SONAME in their dynamic sections) as
well as to look up what external dependencies they have (i.e. NEEDED in
their dynamic sections) in order to build this volume.
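
For readers unfamiliar with the SONAME and NEEDED entries, the sketch below shows what is being extracted. It deliberately swaps in a different technique: instead of linking against libelf, it parses the text output of `readelf -d` (from binutils) in Python. The helper names `parse_dynamic_section` and `inspect_library` and the sample lines are assumptions for illustration only.

```python
import re
import subprocess

# Matches `readelf -d` lines such as:
#  0x000000000000000e (SONAME)   Library soname: [libcuda.so.1]
#  0x0000000000000001 (NEEDED)   Shared library: [libc.so.6]
_DYNAMIC_ENTRY = re.compile(r"\((SONAME|NEEDED)\)\s+.*\[(.+)\]")


def parse_dynamic_section(text):
    """Return (soname, needed) extracted from `readelf -d` output."""
    soname, needed = None, []
    for line in text.splitlines():
        match = _DYNAMIC_ENTRY.search(line)
        if not match:
            continue
        tag, value = match.groups()
        if tag == "SONAME":
            soname = value
        else:
            needed.append(value)
    return soname, needed


def inspect_library(path):
    """Run readelf on a shared library (requires binutils on PATH)."""
    result = subprocess.run(["readelf", "-d", path], check=True,
                            capture_output=True, text=True)
    return parse_dynamic_section(result.stdout)


# Hand-written sample input, not output captured from a real driver library.
sample = """\
 0x0000000000000001 (NEEDED)             Shared library: [libdl.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000e (SONAME)             Library soname: [libcuda.so.1]
"""
print(parse_dynamic_section(sample))
```

The SONAME gives the name under which a library should appear in the volume, and each NEEDED entry names another library that must be resolvable inside the container.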

NOTE: None of this volume support is actually in Mesos yet -- we just
added the libelf dependence in anticipation of it.




On Mon, Jun 20, 2016 at 12:59 PM, Yan Xu  wrote:
> It's not immediately clear from the ticket why the change from optional
> dependency to required dependency though? Could you summarize?
>
>
> On Sun, Jun 19, 2016 at 12:33 PM, Kevin Klues  wrote:
>>
>> Thanks Zhitao,
>>
>> I just pushed out a review for upgrades.md and added you as a reviewer.
>>
>> The new dependence was added in the JIRA that haosdent linked, but the
>> actual reason for adding the dependence is more related to:
>> https://issues.apache.org/jira/browse/MESOS-5401
>>
>> On Sun, Jun 19, 2016 at 9:34 AM, haosdent  wrote:
>> > The related issue is Change build to always enable Nvidia GPU support
>> > for
>> > Linux
>> > Last time my local build break before Kevin send out the email, and then
>> > find this change.
>> >
>> > On Mon, Jun 20, 2016 at 12:11 AM, Zhitao Li 
>> > wrote:
>> >>
>> >> Hi Kevin,
>> >>
>> >> Thanks for letting us know. It seems like this is not called out in
>> >> upgrades.md, so can you please document this additional dependency
>> >> there?
>> >>
>> >> Also, can you include the link to the JIRA or patch requiring this
>> >> dependency so we can have some contexts?
>> >>
>> >> Thanks!
>> >>
>> >> On Sat, Jun 18, 2016 at 10:25 AM, Kevin Klues 
>> >> wrote:
>> >>
>> >> > Hello all,
>> >> >
>> >> > Just an FYI that the newest libmesos now has an external dependence
>> >> > on
>> >> > libelf on Linux. This dependence can be installed via the following
>> >> > packages:
>> >> >
>> >> > CentOS 6/7: yum install elfutils-libelf.x86_64
>> >> > Ubuntu14.04:   apt-get install libelf1
>> >> >
>> >> > Alternatively you can install from source:
>> >> > https://directory.fsf.org/wiki/Libelf
>> >> >
>> >> > For developers, you will also need to install the libelf headers in
>> >> > order to build master. This dependency can be installed via:
>> >> >
>> >> > CentOS: elfutils-libelf-devel.x86_64
>> >> > Ubuntu: libelf-dev
>> >> >
>> >> > Alternatively, you can install from source:
>> >> > https://directory.fsf.org/wiki/Libelf
>> >> >
>> >> > The getting started guide and the support/docker_build.sh scripts
>> >> > have
>> >> > been updated appropriately, but you may need to update your local
>> >> > environment if you don't yet have these packages installed.
>> >> >
>> >> > --
>> >> > ~Kevin
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Cheers,
>> >>
>> >> Zhitao Li
>> >
>> >
>> >
>> >
>> > --
>> > Best Regards,
>> > Haosdent Huang
>>
>>
>>
>> --
>> ~Kevin
>
>



-- 
~Kevin


Re: New external dependency

2016-06-19 Thread Kevin Klues
Thanks Zhitao,

I just pushed out a review for upgrades.md and added you as a reviewer.

The new dependence was added in the JIRA that haosdent linked, but the
actual reason for adding the dependence is more related to:
https://issues.apache.org/jira/browse/MESOS-5401

On Sun, Jun 19, 2016 at 9:34 AM, haosdent  wrote:
> The related issue is "Change build to always enable Nvidia GPU support for
> Linux".
> My local build broke before Kevin sent out the email, and then I found
> this change.
>
> On Mon, Jun 20, 2016 at 12:11 AM, Zhitao Li  wrote:
>>
>> Hi Kevin,
>>
>> Thanks for letting us know. It seems like this is not called out in
>> upgrades.md, so can you please document this additional dependency there?
>>
>> Also, can you include the link to the JIRA or patch requiring this
>> dependency so we can have some contexts?
>>
>> Thanks!
>>
>> On Sat, Jun 18, 2016 at 10:25 AM, Kevin Klues  wrote:
>>
>> > Hello all,
>> >
>> > Just an FYI that the newest libmesos now has an external dependence on
>> > libelf on Linux. This dependence can be installed via the following
>> > packages:
>> >
>> > CentOS 6/7: yum install elfutils-libelf.x86_64
>> > Ubuntu14.04:   apt-get install libelf1
>> >
>> > Alternatively you can install from source:
>> > https://directory.fsf.org/wiki/Libelf
>> >
>> > For developers, you will also need to install the libelf headers in
>> > order to build master. This dependency can be installed via:
>> >
>> > CentOS: elfutils-libelf-devel.x86_64
>> > Ubuntu: libelf-dev
>> >
>> > Alternatively, you can install from source:
>> > https://directory.fsf.org/wiki/Libelf
>> >
>> > The getting started guide and the support/docker_build.sh scripts have
>> > been updated appropriately, but you may need to update your local
>> > environment if you don't yet have these packages installed.
>> >
>> > --
>> > ~Kevin
>> >
>>
>>
>>
>> --
>> Cheers,
>>
>> Zhitao Li
>
>
>
>
> --
> Best Regards,
> Haosdent Huang



-- 
~Kevin


New external dependency

2016-06-18 Thread Kevin Klues
Hello all,

Just an FYI that the newest libmesos now has an external dependence on
libelf on Linux. This dependence can be installed via the following
packages:

CentOS 6/7:   yum install elfutils-libelf.x86_64
Ubuntu 14.04: apt-get install libelf1

Alternatively you can install from source:
https://directory.fsf.org/wiki/Libelf

For developers, you will also need to install the libelf headers in
order to build master. This dependency can be installed via:

CentOS: yum install elfutils-libelf-devel.x86_64
Ubuntu: apt-get install libelf-dev

Alternatively, you can install from source:
https://directory.fsf.org/wiki/Libelf

The getting started guide and the support/docker_build.sh scripts have
been updated appropriately, but you may need to update your local
environment if you don't yet have these packages installed.
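
As a quick way to check a local environment, something like the following Python sketch can report whether the runtime library is visible to the loader. It relies on the standard `ctypes.util.find_library` call; the helper name `libelf_runtime_name` is made up for this example, and the check says nothing about whether the development headers are installed.

```python
import ctypes.util


def libelf_runtime_name():
    """Return the name the loader would use for libelf, or None."""
    return ctypes.util.find_library("elf")


name = libelf_runtime_name()
if name is None:
    print("libelf not found; install elfutils-libelf (CentOS) or libelf1 (Ubuntu)")
else:
    print("found", name)
```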

-- 
~Kevin


Re: [Tech-debt] Introduce regex into Mesos

2016-06-10 Thread Kevin Klues
By compiler errors, I mean "internal compiler errors"

On Fri, Jun 10, 2016 at 11:38 AM, Kevin Klues  wrote:
> I've run into compiler errors using simple regex stuff from the
> standard library on our supported version of gcc.
>
> On Thu, Jun 9, 2016 at 7:30 PM, Klaus Ma  wrote:
>> Hi team,
>>
>>
>> We're discussing to introduce regex into Mesos when investigating 
>> MESOS-4627<https://issues.apache.org/jira/browse/MESOS-4627>; so I'd like to 
>> ask whether anyone has experience on regex after C++11? for example, 
>> supported compiler, compatibility, performance and so on :).
>>
>>
>> 
>>
>> Da (Klaus), Ma (??), PMP®| Advisory Software Engineer
>> Platform DCOS Development & Support, STG, IBM GCG
>> +86-10-8245 4084 | mad...@cn.ibm.com | http://k82.me
>>
>> <http://k82.me/>
>
>
>
> --
> ~Kevin



-- 
~Kevin


Re: [Tech-debt] Introduce regex into Mesos

2016-06-10 Thread Kevin Klues
I've run into compiler errors using simple regex stuff from the
standard library on our supported version of gcc.

On Thu, Jun 9, 2016 at 7:30 PM, Klaus Ma  wrote:
> Hi team,
>
>
> We're discussing introducing regex into Mesos while investigating
> MESOS-4627, so I'd like to ask whether anyone has experience with regex
> in C++11 and later: for example, supported compilers, compatibility,
> performance and so on :).
>
>
> 
>
> Da (Klaus), Ma, PMP® | Advisory Software Engineer
> Platform DCOS Development & Support, STG, IBM GCG
> +86-10-8245 4084 | mad...@cn.ibm.com | http://k82.me
>
> 



-- 
~Kevin


Re: mesos git commit: Updated quota endpoint help.

2016-05-23 Thread Kevin Klues
We had considered what Benjamin proposes (I think I even wrote the
code to do it), but we decided it was better to commit them back since
all of the other documentation is in the docs folder and is checked
in.  I don't have a strong opinion either way though, so I'm fine with
not committing them back in the future.

On Mon, May 23, 2016 at 3:48 PM, Benjamin Mahler  wrote:
> +kevin for context
>
> When we approached this problem there were a few routes we could take. One
> approach we were leaning towards was to allow the help information to be
> specified directly in markdown rather than in C++. Markdown would be the
> only source of help information, and the C++ code calling route() would
> point libprocess to the markdown file. Or it would be "burned in" to the
> C++ source via tooling during the build process. There were some tradeoffs
> with this approach:
>
> + single source of help information, no need for "generating" it
> + easy to discover and read help information by navigating the project,
> it's in markdown only, don't need to read C++ source
> - not as easy to enforce a standard help markdown format
> - need to bundle more "assets" with the code (unlike mesos, libprocess
> currently has no assets that need bundling so this is new territory),
> because at run time we need access to the markdown files to serve them
>
> I'm not sure we considered your approach Benjamin, which sounds like a
> small tweak to the current approach:
>
> (1) Continue writing help information in C++.
> (2) Keep existing tooling we built for obtaining help markdown from the
> binaries.
> (3) Run this tooling during the build and output into the build directory
> (these do not get checked in).
> (4) As part of the manual website update, do a build and publish the
> generated markdown.
>
> I think the only reason we avoided this approach was that we wanted the
> help information to be easily discoverable in the project source (i.e.
> docs/endpoints rather than scattered across C++), but I suppose the
> argument here is that for now the users can just use the website instead.
> Sounds ok to me.
>
> On Wed, May 18, 2016 at 4:01 AM, Alex Rukletsov  wrote:
>
>> Thanks Neil. Pushed 54339eb0e934e120e5cb5e693681679dea24b2d2. It also
>> includes an update induced by slave->agent rename.
>>
>> I think we eventually should do what Benjamin suggests. It's hard for a
>> human to remember, when a certain script should be run (even though having
>> a script is a huge improvement already).
>>
>> On Wed, May 18, 2016 at 11:53 AM, Benjamin Bannier <
>> benjamin.bann...@mesosphere.io> wrote:
>>
>> > Hi,
>> >
>> > the way one currently has to manually regenerate markdown outputs which
>> > should then be checked in together (and ideally: atomically) with the
>> > corresponding source changes seems to be a reoccurring source of
>> friction.
>> >
>> > I understand that being able to e.g., reference the generated markdown
>> > outputs is useful, but believe the fundamentally right thing to do would
>> be
>> > to generate the markdown outputs as part of the build and *not check them
>> > into source control*. If one would need to reference the endpoint help
>> one
>> > could e.g., use links to
>> > https://mesos.apache.org/documentation/latest/endpoints/ and children.
>> >
>> > Any reason this isn’t what we are already doing?
>> >
>> >
>> > Cheers,
>> >
>> > Benjamin
>> >
>> >
>> > > On May 18, 2016, at 11:42 AM, haosdent  wrote:
>> > >
>> > > Is it possible to show a warning in `./support/mesos-style.py` when
>> > commit
>> > > changes contains "src/master/http.cpp" or "src/slave/http.cpp" while
>> > > doesn't contain document changes?
>> > >
>> > > On Wed, May 18, 2016 at 5:06 PM, Neil Conway 
>> > wrote:
>> > >
>> > >> When modifying the endpoint help text, we should remember to update
>> > >> the generated help files (via support/generate-endpoint-help.py) --
>> > >> the changes to both the input text and generated output files should
>> > >> be included as part of the same commit.
>> > >>
>> > >> Neil
>> > >>
>> > >> On Wed, May 18, 2016 at 10:58 AM,   wrote:
>> > >>> Repository: mesos
>> > >>> Updated Branches:
>> > >>>  refs/heads/master a7835f889 -> 9f63d95f3
>> > >>>
>> > >>>
>> > >>> Updated quota endpoint help.
>> > >>>
>> > >>>
>> > >>> Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
>> > >>> Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/9f63d95f
>> > >>> Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/9f63d95f
>> > >>> Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/9f63d95f
>> > >>>
>> > >>> Branch: refs/heads/master
>> > >>> Commit: 9f63d95f3cac17c94a7aff57980478263c78f6ee
>> > >>> Parents: a7835f8
>> > >>> Author: Adam B 
>> > >>> Authored: Wed May 18 01:56:57 2016 -0700
>> > >>> Committer: Adam B 
>> > >>> Committed: Wed May 18 01:57:52 2016 -0700
>> > >>>
>> > >>>
>> --
>> > >>> src/master/http.cpp | 11 ---
>> > >

Re: [REVIEW PROCESS] Proposal for new review process working group

2016-05-20 Thread Kevin Klues
What happened with these proposals?  Do we know how / why they lost momentum?

On Fri, May 20, 2016 at 11:17 AM, Jojy Varghese  wrote:
> Hi Kevin
>  Great initiative. Wanted to point out earlier proposals on the topic:
>
> http://marc.info/?l=mesos-dev&m=144286256512205&w=2
>
>
> -Jojy
>
>
>
>> On May 20, 2016, at 11:11 AM, Shivam Pathak  
>> wrote:
>>
>> Great! please add me to the group
>>
>> On Fri, May 20, 2016 at 11:07 AM, haosdent  wrote:
>>
>>> This sounds great, add me to the group please.
>>>
>>> On Sat, May 21, 2016 at 1:59 AM, Kevin Klues  wrote:
>>>
>>>> Hi all,
>>>>
>>>> I'd like to propose starting a dedicated "review process" working
>>>> group.  The goals of this working group will be to:
>>>>
>>>> 1) Discuss issues around the current review process
>>>> 2) Propose improvements to the current review process
>>>> 3) Implement / Monitor / Enforce the new process we come up with going
>>>> forward
>>>>
>>>> Anyone who'd like to be involved, please respond to this thread so I
>>>> can add you to the working group.  We will likely start actively
>>>> discussing things after MesosCon.
>>>>
>>>> --
>>>> ~Kevin
>>>>
>>>
>>>
>>>
>>> --
>>> Best Regards,
>>> Haosdent Huang
>>>
>>
>>
>>
>> --
>> *Shivam Pathak (Mr)*
>> Software Engineer and Systems Architect
>> Novatap Private Ltd.
>> HP: +65 8543 2297
>



-- 
~Kevin


[REVIEW PROCESS] Proposal for new review process working group

2016-05-20 Thread Kevin Klues
Hi all,

I'd like to propose starting a dedicated "review process" working
group.  The goals of this working group will be to:

1) Discuss issues around the current review process
2) Propose improvements to the current review process
3) Implement / Monitor / Enforce the new process we come up with going forward

Anyone who'd like to be involved, please respond to this thread so I
can add you to the working group.  We will likely start actively
discussing things after MesosCon.

-- 
~Kevin


Re: [WEBSITE] Readme update

2016-05-20 Thread Kevin Klues
Ah, now I see what you are saying. Yes. It would probably be good to add
those steps to the Docker container.

On Friday, May 20, 2016, haosdent  wrote:

> The mesos-website-container works perfectly. But to release a complete
> website, it doesn't include the doxygen and javadoc output.
>
> https://mesos.apache.org/api/latest/java/ and
> https://mesos.apache.org/api/latest/c++
>
> On Sat, May 21, 2016 at 12:22 AM, Kevin Klues wrote:
>
> > What was the error with docker? I use that all the time (even just
> > yesterday)
> >
> > On Friday, May 20, 2016, haosdent >
> wrote:
> >
> > > Let me include this in the patch. ;-)
> > >
> > > On Sat, May 21, 2016 at 12:17 AM, Vinod Kone  
> > > > wrote:
> > >
> > > > On Fri, May 20, 2016 at 9:00 AM, haosdent  
> > > > wrote:
> > > >
> > > > > run ../support/generate-endpoint-help.py, rake doxygen and rake
> > > javadoc.
> > > > >
> > > >
> > > > yes. maybe update the rake target ":default" target to also do
> doxygen
> > > and
> > > > javadoc tasks?
> > > >
> > >
> > >
> > >
> > > --
> > > Best Regards,
> > > Haosdent Huang
> > >
> >
> >
> > --
> > ~Kevin
> >
>
>
>
> --
> Best Regards,
> Haosdent Huang
>


-- 
~Kevin


Re: [WEBSITE] Readme update

2016-05-20 Thread Kevin Klues
What was the error with docker? I use that all the time (even just
yesterday)

On Friday, May 20, 2016, haosdent  wrote:

> Let me include this in the patch. ;-)
>
> On Sat, May 21, 2016 at 12:17 AM, Vinod Kone wrote:
>
> > On Fri, May 20, 2016 at 9:00 AM, haosdent wrote:
> >
> > > run ../support/generate-endpoint-help.py, rake doxygen and rake
> javadoc.
> > >
> >
> > yes. maybe update the rake target ":default" target to also do doxygen
> and
> > javadoc tasks?
> >
>
>
>
> --
> Best Regards,
> Haosdent Huang
>


-- 
~Kevin


Re: [WEBSITE] Readme update

2016-05-20 Thread Kevin Klues
Yes, that is out of date.  The recommended way of generating the
website now is support/site-docker/

On Fri, May 20, 2016 at 7:45 AM, haosdent  wrote:
>> It doesn't mention mesos-website-container
> I think updating it was forgotten in https://reviews.apache.org/r/39194/
>> support/generate-help-site.py
> That is probably a typo introduced in this commit.
>
> I just posted a simple fix here:
> https://reviews.apache.org/r/47645/diff/1#index_header
>
> On Fri, May 20, 2016 at 10:15 PM, Tomek Janiszewski 
> wrote:
>
>> Hi
>>
>> I think the website readme is out of date.
>> 1. It doesn't mention mesos-website-container
>> 2. support/generate-help-site.py does not exist
>> Am I right? How do I generate the full site (with documentation and getting
>> started section)?
>>
>> Thanks
>> Tomek
>>
>
>
>
> --
> Best Regards,
> Haosdent Huang



-- 
~Kevin


Re: Should "read-only" HTTP endpoints allow other request methods than "GET"?

2016-05-10 Thread Kevin Klues
There was some discussion of this between mpark and me in relation to
the v1 operator API.  The idea was to have a base class for endpoints
that implements GET/POST/DELETE/PUT/etc... functions that return an
error by default. You can then override the specific subset of them
that each endpoint supports.
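
To make the base-class idea concrete, here is a rough sketch. It is
written in Python for brevity rather than the C++ that Mesos uses, and
the names Endpoint, StateEndpoint, and dispatch are hypothetical; none
of them exist in Mesos or libprocess.

```python
# Hypothetical illustration only: none of these names exist in Mesos.
from http import HTTPStatus


class Endpoint:
    """Base class: every HTTP method is rejected unless a subclass overrides it."""

    METHODS = ("GET", "POST", "PUT", "DELETE")

    def _not_allowed(self, request):
        return HTTPStatus.METHOD_NOT_ALLOWED, "Method not allowed"

    def get(self, request):
        return self._not_allowed(request)

    def post(self, request):
        return self._not_allowed(request)

    def put(self, request):
        return self._not_allowed(request)

    def delete(self, request):
        return self._not_allowed(request)

    def dispatch(self, method, request=None):
        # Route a request to the handler named after its (lowercased) method.
        if method.upper() not in self.METHODS:
            return self._not_allowed(request)
        return getattr(self, method.lower())(request)


class StateEndpoint(Endpoint):
    """A read-only endpoint overrides only GET."""

    def get(self, request):
        return HTTPStatus.OK, "{}"
```

With this shape, a read-only endpoint such as "/state" would reject a
POST without any per-handler validation code, because only get() is
overridden.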

On Tue, May 10, 2016 at 6:42 AM, Vinod Kone  wrote:
> +1 to only allow GET for read only
>
> @vinodkone
>
>> On May 10, 2016, at 6:37 AM, Jan Schlicht  wrote:
>>
>> Hi guys,
>>
>> while working on HTTP endpoint authorization for Mesos, I found some
>> interesting behavior: It's the responsibility of the HTTP endpoint handlers
>> to validate the HTTP request method they've been called with. Many
>> "read-only" endpoints (e.g. "/flags", "/state") don't do this at the
>> moment. This means that it's possible to send, for example, an HTTP "POST"
>> to the "/state" endpoint and get the same results as if it would have been
>> an HTTP "GET".
>> While this is currently not a problem, it will complicate things when we
>> want to authorize endpoint access. The authorization should take the HTTP
>> request method into account to distinguish between "user wants read access
>> to the endpoint" and "user wants write access to the endpoint". This makes
>> it ambiguous how to handle these "read-only" endpoints that also accept
>> a "POST" request.
>> The solution to that problem would be to add HTTP request method validation
>> to every endpoint, i.e. the read-only endpoints would reject any request
>> method that isn't "GET". I've created MESOS-5346 for that.
>> Because that would change the existing behavior, which allows one to, e.g.,
>> "POST" to a "read-only" endpoint, I'd like to know if anybody relies on that
>> behavior, or if there are any other objections to changing it.
>>
>> Cheers,
>> Jan



-- 
~Kevin


Re: Rename 'include/mesos/slave' to 'include/mesos/agent'

2016-04-29 Thread Kevin Klues
As of last week, such a symlink has been added. The plan is to leave the
symlink in place until the deprecation cycle has ended. At that point, all
references to "slave" in our external APIs will be removed.

This email is just a warning that people should start migrating their
include paths over to "agent" now to avoid complications in the future. For
now, both "slave" and "agent" will work, until the deprecation cycle ends
(i.e. a few months from now).
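
For anyone maintaining a similar compatibility link in their own tree,
here is a small Python sketch of creating it with a relative target.
It assumes the old "slave" path is the link and "agent" is the real
directory; the helper name add_compat_symlink is made up for
illustration and is not part of the Mesos build.

```python
import os


def add_compat_symlink(include_dir, old_name="slave", new_name="agent"):
    """Create include_dir/old_name as a relative symlink to new_name."""
    link = os.path.join(include_dir, old_name)
    # The target is relative to the link's own directory, so the link keeps
    # working if the whole include tree is installed under another prefix.
    os.symlink(new_name, link, target_is_directory=True)
    return link
```

Because the target is just "agent" rather than an absolute path, an
include of the old path resolves to the same header wherever the tree
ends up being installed.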

On Friday, April 29, 2016, zhiwei  wrote:

> Please use relative symlink path in further updates if any.
>
> Thanks.
>
> On Thu, Apr 28, 2016 at 2:06 PM, Zhou Z Xing  > wrote:
>
>>
>> Dear developers and users,
>>
>> While working on ticket MESOS-5230, we found that it is necessary to rename
>> the folder 'include/mesos/slave' to 'include/mesos/agent'. This change has
>> the following effects:
>>
>>1. with the command "make install", the header files installation location
>>will change to '$(DESTDIR)/include/mesos/agent'.
>>2. all files that include headers from this folder need to change
>>their include directives
>>3. protos that use the 'mesos.slave' package need to change to use the
>>'mesos.agent' package
>>
>> If this change affects your program or environment, please let us know.
>> We are planning this change soon and welcome your comments on it!
>>
>> Thanks & Best Wishes,
>>
>> Tom Xing(邢舟)
>> Emerging Technology Institute, IBM China Software Development Lab
>> --
>> IBM China Software Development Laboratory (CSDL)
>> Notes ID:Zhou Z Xing/China/IBM
>> Phone   :86-10-82450442
>> e-Mail  :xingz...@cn.ibm.com
>> 
>> Address :Building No.28, ZhongGuanCun Software Park, No.8 Dong Bei Wang
>> West Road, Haidian District, Beijing, P.R.China 100193
>> 地址:中国北京市海淀区东北旺西路8号 中关村软件园28号楼 100193
>>
>
>

-- 
~Kevin


Re: [Performance Isolation] Meeting on Thursday April 21 2016 5pm PST

2016-04-28 Thread Kevin Klues
I won't be able to call today.  I am on vacation until the 9th.

On Fri, Apr 29, 2016 at 2:30 AM, Niklas Nielsen  wrote:
> Ian, Kevin: Would you be able to dial in this afternoon instead?
>
> On Mon, Apr 25, 2016 at 9:00 AM, Niklas Nielsen  wrote:
>
>> Hi all,
>>
>> I apologize! this was on me: I got stuck in traffic and couldn't dial in
>> (and accept the external calls).
>> Do you have time to reschedule for the same time slot? That would be
>> Thursday 5pm to 6pm.
>>
>> Niklas
>>
>>
>> On Fri, Apr 22, 2016 at 10:06 AM, Ian Downes 
>> wrote:
>>
>>> Likewise, I tried calling but no one was hosting the meeting.
>>>
>>> On Fri, Apr 22, 2016 at 10:04 AM, Kevin Klues  wrote:
>>>
>>> > I tried calling into this last night, but no one was there.  Was it
>>> > postponed again?
>>> >
>>> > On Mon, Apr 18, 2016 at 12:03 PM, Niklas Nielsen  wrote:
>>> > > Hi everyone,
>>> > >
>>> > > Per our conversation about Intel CAT enablement in Mesos, we are
>>> > scheduling
>>> > > a Performance Isolation Working Group meeting at Thursday April 21
>>> 2016
>>> > > 5pm-6pm PST.
>>> > >
>>> > > Feel free to suggest agenda topics here:
>>> > >
>>> >
>>> https://docs.google.com/document/d/11mlGPZSABItP47J6VX-zB0fAK6Qr1mCIOI7nhlATMqk/edit#
>>> > >
>>> > > Also, let's use this hangout to meet:
>>> > >
>>> >
>>> https://hangouts.google.com/hangouts/_/intel.com/mesos-performance-isolation?hl=en&authuser=1
>>> > >
>>> > > Cheers,
>>> > > Niklas
>>> >
>>> >
>>> >
>>> > --
>>> > ~Kevin
>>> >
>>>
>>
>>
>>
>> --
>> Niklas
>>
>
>
>
> --
> Niklas



-- 
~Kevin


Re: [Performance Isolation] Meeting on Thursday April 21 2016 5pm PST

2016-04-22 Thread Kevin Klues
I tried calling into this last night, but no one was there.  Was it
postponed again?

On Mon, Apr 18, 2016 at 12:03 PM, Niklas Nielsen  wrote:
> Hi everyone,
>
> Per our conversation about Intel CAT enablement in Mesos, we are scheduling
> a Performance Isolation Working Group meeting at Thursday April 21 2016
> 5pm-6pm PST.
>
> Feel free to suggest agenda topics here:
> https://docs.google.com/document/d/11mlGPZSABItP47J6VX-zB0fAK6Qr1mCIOI7nhlATMqk/edit#
>
> Also, let's use this hangout to meet:
> https://hangouts.google.com/hangouts/_/intel.com/mesos-performance-isolation?hl=en&authuser=1
>
> Cheers,
> Niklas



-- 
~Kevin


Removing the External Containerizer

2016-04-20 Thread Kevin Klues
Hello all,

The 'external' containerizer has been deprecated since August and we
are now considering removing it permanently before the 0.29 release.
Are there any objections to this?

The following JIRA suggests that Hadoop on Mesos was still using the
External containerizer format.
https://issues.apache.org/jira/browse/MESOS-3370

However, it looks like this has been fixed in:
https://github.com/mesos/hadoop/pull/68

Is anyone else still using the external containerizer and would like
to see it persist a bit longer?

-- 
~Kevin


Re: Typed Error Handling in Mesos

2016-04-06 Thread Kevin Klues
+1

This is also similar to how errors are typed in Go.

On Wednesday, April 6, 2016, Alexander Rojas 
wrote:

> +1
>
> What I like is that it allows for some kind of type safety in error
> management beyond trying to parse error strings.
>
> > On 05 Apr 2016, at 03:48, Michael Park >
> wrote:
> >
> > Contrary to standard C++ practices, Mesos uses return values as the
> > mechanism
> > for error handling rather than exceptions.
> >
> > This proposal is simply an evolution of the current mechanism we have in
> > Mesos today.
> > This direction is consistent with the designs made in Rust, which uses
> > return values as
> > the error handling mechanism at the language level.
> >
> > The first step is to add an additional template parameter to class
> > template *Try<T>*, to get *Try<T, E>*.
> >
> > The proposed design defaults *E* to *Error*, and requires that *E* be, or
> > inherit from, *Error*.
> > The return type of *error()* is *const std::string&* if *E == Error* and
> > *const E&* otherwise,
> > for backwards-compatibility reasons.
> >
> > So in the end, *Try<T>* behaves exactly as before.
> >
> > The work is being tracked in MESOS-5107, and I've written a
> > quick design doc
> > <
> https://docs.google.com/document/d/1tG21sD-ZX64FHAKJwhEPk6JkgsBIv12AmA1Y3J0kCYY/edit#
> >
> > capturing
> > some of the preliminary thoughts on this topic, and a proposal for an
> > immediate use case
> > for the Windows work.
> >
> > If you're interested in how Rust deals with error handling, check out
> > https://doc.rust-lang.org/book/error-handling.html. Our *Option* is
> their
> > *Option*,
> > our *Try* is their *Result*, and they don't have our *Result*.
> >
> > I'm going to be pushing the proposed changes shortly, but the changes are
> > small and do not require any large, sweeping changes or anything like that.
> > So please reach out to me with your concerns and complaints and I will be
> > sure to address them.
> >
> > Thanks,
> >
> > MPark
>
>

-- 
~Kevin


Re: [Isolation][Containerization] - Add Intel Cache Allocation Technology(CAT) Isolator Support

2016-04-04 Thread Kevin Klues
Hi Fan,

Thanks for putting this together. I have been looking into this quite
a bit myself recently, and have been slowly preparing a design doc for
both CAT and CMT support in Mesos. One of the biggest things I have
been trying to figure out (which is why I haven't pushed my design doc
out yet) is how to combine CAT support with the existing resource
model.

Specifically, Mesos currently gives out fractional cores using the
cgroups cpu.shares mechanism and doesn't allow tasks to choose
specific cores to run on (even more than this, there is no way for a
task to even see which specific cores might be available).
Furthermore, when a resource offer goes out, it's just a collection of
SCALARS, SETS, and RANGES, and there's no way to tie one particular
resource to another (e.g. you can't say give me cores and memory that
are close together to mitigate NUMA effects).

Given these limitations, it's not clear how to take immediate
advantage of CAT, since it relies on specifying a specific core to
allocate the cache from. That is, some mechanism must exist to ensure
that both the CPU and the cache are colocated.  This is a problem with
the current resource model in general, and applies to properly
supporting NUMA as well.

You seem to propose simply adding cache partitions as a first class
resource on par with CPUs and memory, with no mention of its
dependence on particular cores.  What are your thoughts on this?

Kevin

On Mon, Apr 4, 2016 at 7:36 PM, Du, Fan  wrote:
> Hi,ALL
>
> MESOS-5076 is filed to investigate how Intel Cache Allocation
> Technology(CAT)[1] could be
> used in Mesos. Some introduction and early thoughts are documented here[2].
>
> The motivation is to:
> a) Add CAT isolation support for Mesos Containerization
> b) Expose Last Level Cache(LLC) as Scalar Resource
> c) Bridge the interface gap for Docker Containerization,
>CAT support for Docker[3] has been submitted to Docker OCI with positive
> feedback.
>
> The ultimate goal is to provide operators with a CAT isolator for better
> colocation of cluster resources.
> I'm looking forward to any comments from the community to move this forward.
>
> Thanks!
>
> [1]:http://www.intel.com/content/www/us/en/communications/cache-monitoring-cache-allocation-technologies.html
> [2]:https://docs.google.com/document/d/130ay0e2DZ9S61SC3tGcik5wQaC8L40t5tWj3K3GJxTg/edit?usp=sharing
> [3]:https://github.com/opencontainers/runtime-spec/pull/267
> https://github.com/opencontainers/runc/pull/447
>



-- 
~Kevin


Re: Looking for Shepherd for MESOS-4033 (commit hook for non-ascii characters)

2016-04-03 Thread Kevin Klues
@Alexr:  Looking at the code now, and finally reading Benjamin's
comment (I somehow missed that before), I agree that it probably makes
sense to defer the docs check to an external validator.
mesos-style.py should be limited to only checking for errors in the
code files.
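
As a rough idea of what a code-only check could look like, here is a
sketch. It is not the actual mesos-style.py implementation or the patch
under review; the function name find_non_ascii and the list of
extensions treated as code are assumptions made for illustration.

```python
# Hypothetical sketch; not the actual mesos-style.py implementation.
CODE_EXTENSIONS = (".cpp", ".hpp", ".py", ".proto")  # assumed set of code files


def find_non_ascii(path, text):
    """Return (line_number, column) pairs for non-ASCII characters in a code file.

    Files that are not code (for example docs/*.md) are skipped entirely.
    """
    if not path.endswith(CODE_EXTENSIONS):
        return []
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for column, char in enumerate(line, start=1):
            if ord(char) > 127:
                hits.append((line_number, column))
    return hits
```

Docs are skipped by extension here, which leaves them free to use UTF-8
and to be validated by a separate tool.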

On Sun, Apr 3, 2016 at 7:36 AM, Kevin Klues  wrote:
> @Alexr: That makes sense. I think we should enforce the check for
> unicode in the docs though. Ascii in the code, unicode in the docs. I
> can review the python.
>
> On Sun, Apr 3, 2016 at 2:42 AM, Alex Rukletsov  wrote:
>> @Vinod: I can take it, but would like someone, more experienced in python
>> than myself, to review it.
>>
>> @Kevin: I think we should not apply the same rules for code and docs. Since
>> docs are meant for people, I believe special characters, umlauts, symbols
>> make them easier to read and digest. I'd suggest pushing UTF-8 for docs,
>> it's the 21st century! Alternatively, we can use numeric character references,
>> if folks think unicode in docs is a bad idea.
>>
>> On Sat, Apr 2, 2016 at 8:48 PM, Kevin Klues  wrote:
>>
>>> @vinod, @alexr.  Neil and I had also suggested not excluding the doc
>>> directories (which Yong's current patch still does).  What are your
>>> thoughts on this?
>>>
>>> On Fri, Apr 1, 2016 at 6:47 PM, Yong Tang 
>>> wrote:
>>> > Hi Vinod,
>>> >
>>> > Thanks for the help. I updated mesos-style.py and added the
>>> > non-ascii check there.
>>> >
>>> > Please let me know if anything else needs to be done to move this
>>> > issue forward.
>>> >
>>> >
>>> > Thanks
>>> > Yong
>>> >
>>> >> From: vinodk...@apache.org
>>> >> Date: Fri, 1 Apr 2016 15:47:29 -0700
>>> >> Subject: Re: Looking for Shepherd for MESOS-4033 (commit hook for
>>> non-ascii characters)
>>> >> To: dev@mesos.apache.org
>>> >>
>>> >> @AleR do you want to be a shepherd for this, since you originally filed
>>> >> this ticket?
>>> >>
>>> >> @Yong: Just took a quick look at the review. It's unfortunate that we
>>> need
>>> >> a whole new script for checking non-ascii characters. Can we update
>>> >> mesos-style.py to catch this?
>>> >>
>>> >> On Fri, Apr 1, 2016 at 7:36 AM, Yong Tang >> >
>>> >> wrote:
>>> >>
>>> >> > Ping again to find a shepherd for MESOS-4033. A review request has
>>> been
>>> >> > created with good discussions from many reviewers:
>>> >> >
>>> >> > https://reviews.apache.org/r/45033/
>>> >> >
>>> >> > It would be really good if a shepherd could provide some guidance so
>>> that
>>> >> > this ticket could move forward.
>>> >> >
>>> >> > Thanks
>>> >> > Yong
>>> >> >
>>> >> > > Date: Sat, 26 Mar 2016 08:13:28 +0800
>>> >> > > Subject: Re: Looking for Shepherd for MESOS-4033 (commit hook for
>>> >> > non-ascii characters)
>>> >> > > From: xia...@gmail.com
>>> >> > > To: dev@mesos.apache.org
>>> >> > >
>>> >> > > +1
>>> >> > >
>>> >> > > 2016-03-25 23:26 GMT+08:00 Yong Tang >> >:
>>> >> > >
>>> >> > > > Hi
>>> >> > > >
>>> >> > > > Just bump the email to look for shepherd for MESOS-4033.
>>> >> > > >
>>> >> > > > This issue (commit hook for non ascii characters) has already been
>>> >> > fairly
>>> >> > > > discussed on the review board. Thanks Deshi, Neil, and haosdent
>>> for the
>>> >> > > > great inputs.
>>> >> > > >
>>> >> > > > It really would be nice if there is anyone that could shepherd so
>>> that
>>> >> > the
>>> >> > > > issue could move forward.
>>> >> > > >
>>> >> > > > Thanks
>>> >> > > > Yong
>>> >> > > >
>>> >> > > >
>>> >> > > > > From: yong.tang.git...@outlook.com
>>> >> > > > > To: dev@mesos.apache.org
>>> >> > > > > Subject: Looking for Shepherd for MESOS-4033 (commit hook for
>>> >> > non-ascii
>>> >> > > > characters)
>>> >> > > > > Date: Tue, 22 Mar 2016 08:31:40 -0700
>>> >> > > > >
>>> >> > > > > Hi All
>>> >> > > > >
>>> >> > > > > Can anyone help shepherd MESOS-4033 - Add a commit hook for
>>> non-ascii
>>> >> > > > characters?
>>> >> > > > >
>>> >> > > > > https://issues.apache.org/jira/browse/MESOS-4033
>>> >> > > > >
>>> >> > > > > This issue is about adding a commit hook to check for non-ascii
>>> >> > > > characters. The issue has been accepted sometime ago.
>>> >> > > > >
>>> >> > > > > Thanks a lot for the help
>>> >> > > > > Yong
>>> >> > > > >
>>> >> > > >
>>> >> > > >
>>> >> > >
>>> >> > >
>>> >> > >
>>> >> > > --
>>> >> > > Deshi Xiao
>>> >> > > Twitter: xds2000
>>> >> > > E-mail: xiaods(AT)gmail.com
>>> >> >
>>> >> >
>>> >
>>>
>>>
>>>
>>> --
>>> ~Kevin
>>>
>
>
>
> --
> ~Kevin



-- 
~Kevin


Re: Looking for Shepherd for MESOS-4033 (commit hook for non-ascii characters)

2016-04-03 Thread Kevin Klues
@Alexr: That makes sense. I think we should enforce the check for
unicode in the docs though. Ascii in the code, unicode in the docs. I
can review the python.

On Sun, Apr 3, 2016 at 2:42 AM, Alex Rukletsov  wrote:
> @Vinod: I can take it, but would like someone, more experienced in python
> than myself, to review it.
>
> @Kevin: I think we should not apply the same rules for code and docs. Since
> docs are meant for people, I believe special characters, umlauts, symbols
> make them easier to read and digest. I'd suggest pushing UTF-8 for docs,
> it's the 21st century! Alternatively, we can use numeric character references,
> if folks think unicode in docs is a bad idea.
>
> On Sat, Apr 2, 2016 at 8:48 PM, Kevin Klues  wrote:
>
>> @vinod, @alexr.  Neil and I had also suggested not excluding the doc
>> directories (which Yong's current patch still does).  What are your
>> thoughts on this?
>>
>> On Fri, Apr 1, 2016 at 6:47 PM, Yong Tang 
>> wrote:
>> > Hi Vinod,
>> >
>> > Thanks for the help. I updated mesos-style.py and added the
>> > non-ascii check there.
>> >
>> > Please let me know if anything else needs to be done to move this
>> > issue forward.
>> >
>> >
>> > Thanks
>> > Yong
>> >
>> >> From: vinodk...@apache.org
>> >> Date: Fri, 1 Apr 2016 15:47:29 -0700
>> >> Subject: Re: Looking for Shepherd for MESOS-4033 (commit hook for
>> non-ascii characters)
>> >> To: dev@mesos.apache.org
>> >>
>> >> @AleR do you want to be a shepherd for this, since you originally filed
>> >> this ticket?
>> >>
>> >> @Yong: Just took a quick look at the review. It's unfortunate that we
>> need
>> >> a whole new script for checking non-ascii characters. Can we update
>> >> mesos-style.py to catch this?
>> >>
>> >> On Fri, Apr 1, 2016 at 7:36 AM, Yong Tang > >
>> >> wrote:
>> >>
>> >> > Ping again to find a shepherd for MESOS-4033. A review request has
>> been
>> >> > created with good discussions from many reviewers:
>> >> >
>> >> > https://reviews.apache.org/r/45033/
>> >> >
>> >> > It would be really good if a shepherd could provide some guidance so
>> that
>> >> > this ticket could move forward.
>> >> >
>> >> > Thanks
>> >> > Yong
>> >> >
>> >> > > Date: Sat, 26 Mar 2016 08:13:28 +0800
>> >> > > Subject: Re: Looking for Shepherd for MESOS-4033 (commit hook for
>> >> > non-ascii characters)
>> >> > > From: xia...@gmail.com
>> >> > > To: dev@mesos.apache.org
>> >> > >
>> >> > > +1
>> >> > >
>> >> > > 2016-03-25 23:26 GMT+08:00 Yong Tang > >:
>> >> > >
>> >> > > > Hi
>> >> > > >
>> >> > > > Just bump the email to look for shepherd for MESOS-4033.
>> >> > > >
>> >> > > > This issue (commit hook for non ascii characters) has already been
>> >> > fairly
>> >> > > > discussed on the review board. Thanks Deshi, Neil, and haosdent
>> for the
>> >> > > > great inputs.
>> >> > > >
>> >> > > > It really would be nice if there is anyone that could shepherd so
>> that
>> >> > the
>> >> > > > issue could move forward.
>> >> > > >
>> >> > > > Thanks
>> >> > > > Yong
>> >> > > >
>> >> > > >
>> >> > > > > From: yong.tang.git...@outlook.com
>> >> > > > > To: dev@mesos.apache.org
>> >> > > > > Subject: Looking for Shepherd for MESOS-4033 (commit hook for
>> >> > non-ascii
>> >> > > > characters)
>> >> > > > > Date: Tue, 22 Mar 2016 08:31:40 -0700
>> >> > > > >
>> >> > > > > Hi All
>> >> > > > >
>> >> > > > > Can anyone help shepherd MESOS-4033 - Add a commit hook for
>> non-ascii
>> >> > > > characters?
>> >> > > > >
>> >> > > > > https://issues.apache.org/jira/browse/MESOS-4033
>> >> > > > >
>> >> > > > > This issue is about adding a commit hook to check for non-ascii
>> >> > > > characters. The issue has been accepted sometime ago.
>> >> > > > >
>> >> > > > > Thanks a lot for the help
>> >> > > > > Yong
>> >> > > > >
>> >> > > >
>> >> > > >
>> >> > >
>> >> > >
>> >> > >
>> >> > > --
>> >> > > Deshi Xiao
>> >> > > Twitter: xds2000
>> >> > > E-mail: xiaods(AT)gmail.com
>> >> >
>> >> >
>> >
>>
>>
>>
>> --
>> ~Kevin
>>



-- 
~Kevin


Re: Looking for Shepherd for MESOS-4033 (commit hook for non-ascii characters)

2016-04-02 Thread Kevin Klues
@vinod, @alexr.  Neil and I had also suggested not excluding the doc
directories (which Yong's current patch still does).  What are your
thoughts on this?

On Fri, Apr 1, 2016 at 6:47 PM, Yong Tang  wrote:
> Hi Vinod,
>
> Thanks for the help. I updated mesos-style.py and added the non-ascii
> check there.
>
> Please let me know if anything else needs to be done to move this issue
> forward.
>
>
> Thanks
> Yong
>
>> From: vinodk...@apache.org
>> Date: Fri, 1 Apr 2016 15:47:29 -0700
>> Subject: Re: Looking for Shepherd for MESOS-4033 (commit hook for non-ascii 
>> characters)
>> To: dev@mesos.apache.org
>>
>> @AleR do you want to be a shepherd for this, since you originally filed
>> this ticket?
>>
>> @Yong: Just took a quick look at the review. It's unfortunate that we need
>> a whole new script for checking non-ascii characters. Can we update
>> mesos-style.py to catch this?
>>
>> On Fri, Apr 1, 2016 at 7:36 AM, Yong Tang 
>> wrote:
>>
>> > Ping again to find a shepherd for MESOS-4033. A review request has been
>> > created with good discussions from many reviewers:
>> >
>> > https://reviews.apache.org/r/45033/
>> >
>> > It would be really good if a shepherd could provide some guidance so that
>> > this ticket could move forward.
>> >
>> > Thanks
>> > Yong
>> >
>> > > Date: Sat, 26 Mar 2016 08:13:28 +0800
>> > > Subject: Re: Looking for Shepherd for MESOS-4033 (commit hook for
>> > non-ascii characters)
>> > > From: xia...@gmail.com
>> > > To: dev@mesos.apache.org
>> > >
>> > > +1
>> > >
>> > > 2016-03-25 23:26 GMT+08:00 Yong Tang :
>> > >
>> > > > Hi
>> > > >
>> > > > Just bump the email to look for shepherd for MESOS-4033.
>> > > >
>> > > > This issue (commit hook for non ascii characters) has already been
>> > fairly
>> > > > discussed on the review board. Thanks Deshi, Neil, and haosdent for the
>> > > > great inputs.
>> > > >
>> > > > It really would be nice if there is anyone that could shepherd so that
>> > the
>> > > > issue could move forward.
>> > > >
>> > > > Thanks
>> > > > Yong
>> > > >
>> > > >
>> > > > > From: yong.tang.git...@outlook.com
>> > > > > To: dev@mesos.apache.org
>> > > > > Subject: Looking for Shepherd for MESOS-4033 (commit hook for
>> > non-ascii
>> > > > characters)
>> > > > > Date: Tue, 22 Mar 2016 08:31:40 -0700
>> > > > >
>> > > > > Hi All
>> > > > >
>> > > > > Can anyone help shepherd MESOS-4033 - Add a commit hook for non-ascii
>> > > > characters?
>> > > > >
>> > > > > https://issues.apache.org/jira/browse/MESOS-4033
>> > > > >
>> > > > > This issue is about adding a commit hook to check for non-ascii
>> > > > characters. The issue has been accepted sometime ago.
>> > > > >
>> > > > > Thanks a lot for the help
>> > > > > Yong
>> > > > >
>> > > >
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > Deshi Xiao
>> > > Twitter: xds2000
>> > > E-mail: xiaods(AT)gmail.com
>> >
>> >
>



-- 
~Kevin


Re: Event bus for Mesos

2016-03-25 Thread Kevin Klues
There is this design doc that was circulating a few months back, but
I'm not sure what the status of it is:

https://docs.google.com/document/d/1b2gheqWPw4V-60RdKu-dGWTy-qLGL5p5xJwmUXteDYE/edit?pli=1#heading=h.86u1r3w05n13

On Fri, Mar 25, 2016 at 10:02 AM, Zhitao Li  wrote:
> Hi,
>
> Has anyone thought about the idea of building some kind of event bus
> (similar to what Marathon provided) directly inside Mesos?
>
> What I roughly have in mind:
>
>1. The HTTP API effort already defined a very rich set of events, which
>Mesos master/agent could send to scheduler/executor;
>2. We could build an HTTP endpoint like `/v1/events`, which utilizes the
>same mechanism of HTTP scheduler/executor API and allows a *subscriber* to
>subscribe to event streams;
>3. Some kind of filter rules as well as authorization is definitely
>needed, both for reducing traffic and for enforcing ACLs.
>
> Possible use cases I could imagine:
>
>1. Pipe event streams to external database for analysis/archiving;
>2. Faster notification to external service discovery systems like
>Mesos-DNS w/o requiring it to poll /state from master;
>3. Notification to other long running daemons for newly launched/killed
>tasks at a Mesos agent
>
> Please let me know what you think.
>
> Thanks!
>
> --
> Cheers,
>
> Zhitao Li



-- 
~Kevin


Re: Compile with CFLAGS=-DWITH_NETWORK_ISOLATOR

2016-03-22 Thread Kevin Klues
As haosdent said, the default libnl-3 that comes with ubuntu is not
new enough.  It will cause the following check in configure.ac to
fail:

AC_CHECK_LIB([nl-3], [nl_has_capability], ...

because the default ubuntu version does not contain the function
nl_has_capability.  You need to install the version from:
https://github.com/thom311/libnl/releases/tag/libnl3_2_27

and then run through the instructions that Guangya linked to:
https://github.com/apache/mesos/blob/master/docs/network-monitoring.md#prerequisites
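
To check an installed libnl before running configure, a rough Python
sketch like the following approximates what the AC_CHECK_LIB test looks
for. The helper name library_has_symbol is made up for illustration. It
only checks that the shared library found by the dynamic loader exports
the symbol, which is not identical to the link test that configure
performs.

```python
import ctypes
import ctypes.util


def library_has_symbol(library, symbol):
    """Return True if the shared library can be loaded and exports the symbol."""
    path = ctypes.util.find_library(library)
    if path is None:
        return False
    try:
        handle = ctypes.CDLL(path)
    except OSError:
        return False
    # Attribute lookup on a CDLL resolves the symbol and raises
    # AttributeError if the library does not export it.
    return hasattr(handle, symbol)
```

Calling library_has_symbol("nl-3", "nl_has_capability") should return
False on a system whose libnl-3 is too old or missing, in which case
the configure check would be expected to fail as well.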

On Tue, Mar 22, 2016 at 6:36 AM, Guangya Liu  wrote:
> I did try this feature before, and you may want to follow here
> https://github.com/apache/mesos/blob/master/docs/network-monitoring.md#prerequisites
> to install the right version prerequisites first.
>
> On Tue, Mar 22, 2016 at 9:21 PM, Jay Guo  wrote:
>
>> Hi,
>>
>> I got an error trying to compile Mesos on Ubuntu with CFLAG
>> WITH_NETWORK_ISOLATOR.
>>
>> Here's what I did:
>> 1. apt-get install libnl-dev
>> 2. ./bootstrap
>> 3. mkdir build && cd build
>> 4. CXXFLAGS=-DWITH_NETWORK_ISOLATOR ../configure --disable-java
>> --disable-python
>> 5. make check
>>
>> However, I got the following error:
>>
>> In file included from ../../src/linux/routing/filter/ip.hpp:35:0,
>>  from
>> ../../src/slave/containerizer/mesos/isolators/network/port_mapping.hpp:44,
>>  from
>> ../../src/slave/containerizer/mesos/containerizer.cpp:82:
>> ../../src/linux/routing/handle.hpp:92:39: error: ‘TC_H_ROOT’ was not
>> declared in this scope
>>  constexpr Handle EGRESS_ROOT = Handle(TC_H_ROOT);
>>^
>> ../../src/linux/routing/handle.hpp:93:40: error: ‘TC_H_INGRESS’ was not
>> declared in this scope
>>  constexpr Handle INGRESS_ROOT = Handle(TC_H_INGRESS);
>>
>> Any ideas?
>>
>> Also, does this work with OSX? Is there an equivalent library to libnl?
>>
>> Cheers,
>> /J
>>



-- 
~Kevin


Re: [RESULT][VOTE] Release Apache Mesos 0.27.2 (rc1)

2016-03-22 Thread Kevin Klues
> > >
> > > As Kevin and Michael mentioned, tags and branches are different
> concepts.
> > > We can use them together. We still want to create immutable tags to
> point
> > > at releases so that we don't accidentally add new patches to releases
> by
> > > updating a branch.
> > >
> > > I think building up the releases in public branches is a good step
> > towards
> > > improved visibility. I hope this will also increase the accountability
> of
> > > the community to ensure the patches they contribute are committed to
> the
> > > right branches.
> > >
> > > On Fri, Mar 18, 2016 at 8:10 PM, Erik Weathers <
> > > eweath...@groupon.com.invalid> wrote:
> > >
> > > > BTW, if the tag is created against a commit that *doesn't* become
> > > > "unreachable" from HEAD [1], then `git pull` is sufficient to also
> pull
> > > > down the tags.
> > > >
> > > > The only time I've needed to do `git fetch --tags` is when the tagged
> > > > commit SHA gets merged away.  So presumably the process being
> followed
> > by
> > > > the core committers / releasers is resulting in these "unreachable"
> > tags.
> > > > Not sure if that is preventable though.
> > > >
> > > > - Erik
> > > >
> > > > [1]
> > > http://eddiemoya.com/2013/02/21/better-git-git-fetch-not-getting-tags/
> > > >
> > > > From the git manual (“git help fetch”): [1]
> > > >
> > > > > -t, --tags Most of the tags are fetched automatically as branch heads
> > are
> > > > downloaded, but tags that do not point at objects reachable from the
> > > branch
> > > > heads that are being tracked will not be fetched by this mechanism.
> > This
> > > > flag lets all tags and their associated objects be downloaded. The
> > > default
> > > > > behavior for a remote may be specified with the remote.<name>.tagopt
> > > > setting. See git-config(1).
> > > >
> > > >
> > > >
> > > > On Fri, Mar 18, 2016 at 6:22 PM, Michael Browning <
> > > invitapri...@gmail.com>
> > > > wrote:
> > > >
> > > > > I agree with Kevin -- tags are immutable, so they're naturally
> suited
> > > > > for labeling releases, which ought to be immutable too.
> > > > >
> > > > > On Fri, Mar 18, 2016 at 4:59 PM, Kevin Klues 
> > > wrote:
> > > > > > I respectfully disagree.
> > > > > >
> > > > > > The whole purpose of tags is to mark permanent things like
> > releases,
> > > > > > whereas branches are designed as temporary lines of development
> > that
> > > > > > come and go (and grow and shrink) dynamically all the time.
> > > > > >
> > > > > > On Fri, Mar 18, 2016 at 4:04 PM, Jie Yu 
> > wrote:
> > > > > >> I like the idea of using branches to manage releases.
> > > > > >>
> > > > > >> We can use that to manage point releases and backports as well.
> > > > > >>
> > > > > >> Say we want to cut 0.29.0 now, we fork a branch 0.29.0 and tag
> RCs
> > > in
> > > > > that
> > > > > >> branch. Once the RC is accepted, the head of that branch will
> > become
> > > > the
> > > > > >> release.
> > > > > >>
> > > > > > >> Then, we immediately fork that branch and create a 0.29.1 branch.
> > > > > >>
> > > > > >> When a new bug fix is committed on the trunk, the committer will
> > > > decide
> > > > > >> whether it'll affect the old releases (a bounded number, we can
> > > decide
> > > > > that
> > > > > >> later). If it does, the committer of that patch should also
> > > > cherry-pick
> > > > > >> that patch to the point releases (e.g., 0.29.1 in this case). We
> > can
> > > > do
> > > > > a
> > > > > >> timely based point releases.
> > > > > >>
> > > > > >> - Jie
> > > > > >>
> > > > > >> On Fri, Mar 18, 2016 at 1:35 PM, Cong Wang <
> > cw...@twopensource.com>
> > > > > wrote:
> > > > > >>
> > > > > >>> On Wed, Mar 16, 2016 at 11:56 AM, Joseph Wu <
> > jos...@mesosphere.io>
> > > > > wrote:
> > > > > >>> > Cong Wang,
> > > > > >>> >
> > > > > >>> > The tags are sync'd.  See:
> > > > https://github.com/apache/mesos/releases
> > > > > >>> >
> > > > > >>> > You might not have done: git pull --tags
> > > > > >>>
> > > > > >>>
> > > > > >>> Yeah, I figured it out by myself too. This is why I hate tags
> > > > > personally,
> > > > > >>> branches are better since they are fetched without additional
> > > > > parameters.
> > > > > >>>
> > > > > >>> Any reason why Mesos maintainers picked tags over branches to
> > > manage
> > > > > >>> releases? Just curious...
> > > > > >>>
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > ~Kevin
> > > > >
> > > >
> > >
> >
>



-- 
~Kevin


Re: [DISCUSS] Fetching Docker Images Requiring User Credentials.

2016-03-18 Thread Kevin Klues
On Tue, Mar 15, 2016 at 6:10 PM, Gilbert Song  wrote:
> @Kevin, thanks for writing it down in detail. It sounds good that a more
> concrete
> schema is designed to generally solve similar auth problem.
>
> Just have two potential issues inlined below:
>
> On Tue, Mar 15, 2016 at 5:39 PM, Kevin Klues  wrote:
>>
>> Yeah, option 2.
>>
>> I was trying to expand on Avinash's suggestion and make it a bit more
>> concrete in terms of what was being proposed. Needing to reload the
>> agent just to update the list of credentials it accepts seems
>> undesirable though.
>>
>> Maybe we could have a way to start the agent with a default config (by
>> iterating on the schema from my previous email), but allow newly
>> launched frameworks to somehow update the config on the fly through a
>
>
> Will it be too expensive to update all agents every time a new framework
> joins (handling consensus problem as well)?

Not sure, I haven't thought about it in depth.  What I was picturing
though was something exactly like what you describe for how the docker
containerizer currently solves this problem, except instead of using
docker/config.json directly, use a new credentials.json file which
follows a schema similar to what I proposed above.

>>
>> file in their sandbox that follows the same schema.
>
>
> Does that mean the file in sandbox should be exposed to each other?
>
>>
>> On Tue, Mar 15, 2016 at 5:25 PM, Jie Yu  wrote:
>> > Kevin, are you suggesting option 2 and having a config file like the
>> > above?
>> >
>> > I think another downside of a per-agent config is that it's hard to
>> > maintain this. What if a new framework joins and has a new credential
>> > for
>> > the docker images. Do we need to restart the agent to reload the config?
>> >
>> > - Jie
>> >
>> > On Tue, Mar 15, 2016 at 1:25 PM, Kevin Klues  wrote:
>> >
>> >> Can we be a bit more concrete here and try to build up a schema for
>> >> this.
>> >> Maybe something like:
>> >>
>> >> [
>> >>   {
>> >>     "service" : "docker",
>> >>     "registries" :
>> >>     [
>> >>       {
>> >>         "uri" : "",
>> >>         "default_credentials" :
>> >>         {
>> >>           "type" : "",
>> >>           "credential" :
>> >>           {
>> >>             // Custom based on type...
>> >>           }
>> >>         },
>> >>         "image_credentials" :
>> >>         [
>> >>           {
>> >>             "image_name" : "",
>> >>             "type" : "",
>> >>             "credential" :
>> >>             {
>> >>               // Custom based on type...
>> >>             }
>> >>           },
>> >>           ...
>> >>         ],
>> >>         ...
>> >>       },
>> >>       ...
>> >>     ]
>> >>   },
>> >>   ...
>> >> ]
>> >>
>> >>
>> >> On Tue, Mar 15, 2016 at 12:57 PM, Jie Yu  wrote:
>> >> >>
>> >> >> Yeah I was thinking having the JSON as a dictionary with keys being
>> >> >> the
>> >> >> registry URI (appc/docker) and the values being credentials (which
>> >> >> will
>> >> be
>> >> >> a dictionary as well I guess).
>> >> >
>> >> >
>> >> > Using registry URI as the key is problematic. Think about the public
>> >> docker
>> >> > hub. Different frameworks might want to use different credentials to
>> >> access
>> >> > their docker images.
>> >> >
>> >> > - Jie
>> >> >
>> >> > On Tue, Mar 15, 2016 at 11:52 AM, Avinash Sridharan <
>> >> avin...@mesosphere.io
>> >> >
>> >> > wrote:
>> >> >
>> >> >> On Tue, Mar 15, 2016 at 11:43 AM, Vinod Kone 
>> >> wrote:
>> >> >>
>> >> >> > moved core@ to *bcc*
>> >> >> >
>> >> >> > On Tue, Mar 15, 2016 at 11:18 AM, Avinash Sridharan <
>> >> >> avin...@mesosphere.io
>> >> >> > > wrote:
>> >> >> >
>> >> >> >> Why not follow option 2, but instead of passing the agent
>> >> credentials,
>> >> >> >> pass a location to the flag where credentials for the registry
>> >> >> >> can be
>> >> >> found
>> >> >> >> (in JSON)? The frameworks can set credentials (maybe registry
>> >> >> >> name or
>> >> >> URL
>> >> >> >> to the registry), and the credentials can be learnt from the JSON
>> >> >> config.
>> >> >> >>
>> >> >> >
>> >> >> > What if we need credentials for multiple-registries? Have a JSON
>> >> >> > with
>> >> one
>> >> >> > credential per registry I guess? But if possible, I would love to
>> >> solve
>> >> >> > this more generally as possible; as Gilbert mentioned, this is not
>> >> >> > a
>> >> >> > problem just for Docker images but any URIs that need AuthN.
>> >> >> >
>> >> >> Yeah I was thinking having the JSON as a dictionary with keys being
>> >> >> the
>> >> >> registry URI (appc/docker) and the values being credentials (which
>> >> >> will
>> >> be
>> >> >> a dictionary as well I guess).
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Avinash Sridharan, Mesosphere
>> >> >> +1 (323) 702 5245
>> >> >>
>> >>
>> >>
>> >>
>> >> --
>> >> ~Kevin
>> >>
>>
>>
>>
>> --
>> ~Kevin
>
>



-- 
~Kevin


Re: [RESULT][VOTE] Release Apache Mesos 0.27.2 (rc1)

2016-03-18 Thread Kevin Klues
I respectfully disagree.

The whole purpose of tags is to mark permanent things like releases,
whereas branches are designed as temporary lines of development that
come and go (and grow and shrink) dynamically all the time.
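
On the fetching complaint quoted below: as far as I understand it, a plain
fetch only brings along tags that point at commits reachable from the branch
heads being fetched; anything else needs `git fetch --tags`. Here is a toy
model of that rule in Python. It is only a sketch, not git's actual
implementation, and the commit names, tag names, and function names are all
made up for illustration.

```python
def reachable(heads, parents):
    """Return every commit reachable from the given heads by following parents."""
    seen = set()
    stack = list(heads)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return seen


def auto_followed_tags(branch_heads, tags, parents):
    """Tags that a plain fetch of these branches would bring along (toy model)."""
    commits = reachable(branch_heads.values(), parents)
    return {name for name, commit in tags.items() if commit in commits}


# Hypothetical history: c1 <- c2 <- c3 is master; r1 branches off c2,
# gets tagged as a release candidate, and is never merged back.
parents = {"c1": [], "c2": ["c1"], "c3": ["c2"], "r1": ["c2"]}
branch_heads = {"master": "c3"}
tags = {"0.27.1": "c2", "0.27.2-rc1": "r1"}

followed = auto_followed_tags(branch_heads, tags, parents)
```

In this model only the tag on c2 is followed automatically. If the commit r1
were also the head of a published release branch, its tag would be reachable
and would come down with a plain fetch too, so tags and branches are not
mutually exclusive.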

On Fri, Mar 18, 2016 at 4:04 PM, Jie Yu  wrote:
> I like the idea of using branches to manage releases.
>
> We can use that to manage point releases and backports as well.
>
> Say we want to cut 0.29.0 now, we fork a branch 0.29.0 and tag RCs in that
> branch. Once the RC is accepted, the head of that branch will become the
> release.
>
> Then, we immediately fork that branch and create a 0.29.1 branch.
>
> When a new bug fix is committed on the trunk, the committer will decide
> whether it'll affect the old releases (a bounded number, we can decide that
> later). If it does, the committer of that patch should also cherry-pick
> that patch to the point releases (e.g., 0.29.1 in this case). We can do a
> timely based point releases.
>
> - Jie
>
> On Fri, Mar 18, 2016 at 1:35 PM, Cong Wang  wrote:
>
>> On Wed, Mar 16, 2016 at 11:56 AM, Joseph Wu  wrote:
>> > Cong Wang,
>> >
>> > The tags are sync'd.  See: https://github.com/apache/mesos/releases
>> >
>> > You might not have done: git pull --tags
>>
>>
>> Yeah, I figured it out by myself too. This is why I hate tags personally,
>> branches are better since they are fetched without additional parameters.
>>
>> Any reason why Mesos maintainers picked tags over branches to manage
>> releases? Just curious...
>>



-- 
~Kevin


Re: [DISCUSS] Fetching Docker Images Requiring User Credentials.

2016-03-15 Thread Kevin Klues
Yeah, option 2.

I was trying to expand on Avinash's suggestion and make it a bit more
concrete in terms of what was being proposed. Needing to reload the
agent just to update the list of credentials it accepts seems
undesirable though.

Maybe we could have a way to start the agent with a default config (by
iterating on the schema from my previous email), but allow newly
launched frameworks to somehow update the config on the fly through a
file in their sandbox that follows the same schema.
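
To make the "default config plus per-framework update" idea a bit more
tangible, here is a rough Python sketch of how the two files might be merged.
It assumes the config is a list of service entries, each holding a list of
registry objects keyed by `uri`, with image entries keyed by `image_name`.
The function name and the merge rules (override wins, nothing is deleted) are
my own assumptions, not anything Mesos implements today.

```python
import copy


def merge_configs(default: list, override: list) -> list:
    """Return a new config in which entries from `override` take precedence.

    Services are matched by name, registries by uri, and image entries by
    image_name. Neither input list is modified.
    """
    merged = copy.deepcopy(default)
    for o_service in override:
        service = next(
            (s for s in merged if s["service"] == o_service["service"]), None
        )
        if service is None:
            # Unknown service: take the framework's entry as-is.
            merged.append(copy.deepcopy(o_service))
            continue
        registries = service.setdefault("registries", [])
        for o_reg in o_service.get("registries", []):
            reg = next((r for r in registries if r["uri"] == o_reg["uri"]), None)
            if reg is None:
                # Unknown registry: add it alongside the defaults.
                registries.append(copy.deepcopy(o_reg))
                continue
            if "default_credentials" in o_reg:
                reg["default_credentials"] = copy.deepcopy(
                    o_reg["default_credentials"]
                )
            # Index existing image entries by name, then apply the overrides.
            images = {i["image_name"]: i for i in reg.get("image_credentials", [])}
            for o_img in o_reg.get("image_credentials", []):
                images[o_img["image_name"]] = copy.deepcopy(o_img)
            reg["image_credentials"] = list(images.values())
    return merged
```

The agent would keep its startup config untouched and compute a merged view
per framework, which sidesteps restarting the agent but leaves open the
question of how sandbox files are isolated from each other.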

On Tue, Mar 15, 2016 at 5:25 PM, Jie Yu  wrote:
> Kevin, are you suggesting option 2 and having a config file like the above?
>
> I think another downside of a per-agent config is that it's hard to
> maintain this. What if a new framework joins and has a new credential for
> the docker images. Do we need to restart the agent to reload the config?
>
> - Jie
>
> On Tue, Mar 15, 2016 at 1:25 PM, Kevin Klues  wrote:
>
>> Can we be a bit more concrete here and try to build up a schema for this.
>> Maybe something like:
>>
>> [
>>   {
>>     "service" : "docker",
>>     "registries" :
>>     [
>>       {
>>         "uri" : "",
>>         "default_credentials" :
>>         {
>>           "type" : "",
>>           "credential" :
>>           {
>>             // Custom based on type...
>>           }
>>         },
>>         "image_credentials" :
>>         [
>>           {
>>             "image_name" : "",
>>             "type" : "",
>>             "credential" :
>>             {
>>               // Custom based on type...
>>             }
>>           },
>>           ...
>>         ],
>>         ...
>>       },
>>       ...
>>     ]
>>   },
>>   ...
>> ]
>>
>>
>> On Tue, Mar 15, 2016 at 12:57 PM, Jie Yu  wrote:
>> >>
>> >> Yeah I was thinking having the JSON as a dictionary with keys being the
>> >> registry URI (appc/docker) and the values being credentials (which will
>> be
>> >> a dictionary as well I guess).
>> >
>> >
>> > Using registry URI as the key is problematic. Think about the public
>> docker
>> > hub. Different frameworks might want to use different credentials to
>> access
>> > their docker images.
>> >
>> > - Jie
>> >
>> > On Tue, Mar 15, 2016 at 11:52 AM, Avinash Sridharan <
>> avin...@mesosphere.io
>> >
>> > wrote:
>> >
>> >> On Tue, Mar 15, 2016 at 11:43 AM, Vinod Kone 
>> wrote:
>> >>
>> >> > moved core@ to *bcc*
>> >> >
>> >> > On Tue, Mar 15, 2016 at 11:18 AM, Avinash Sridharan <
>> >> avin...@mesosphere.io
>> >> > > wrote:
>> >> >
>> >> >> Why not follow option 2, but instead of passing the agent
>> credentials,
>> >> >> pass a location to the flag where credentials for the registry can be
>> >> found
>> >> >> (in JSON)? The frameworks can set credentials (maybe registry name or
>> >> URL
>> >> >> to the registry), and the credentials can be learnt from the JSON
>> >> config.
>> >> >>
>> >> >
>> >> > What if we need credentials for multiple-registries? Have a JSON with
>> one
>> >> > credential per registry I guess? But if possible, I would love to
>> solve
>> >> > this more generally as possible; as Gilbert mentioned, this is not a
>> >> > problem just for Docker images but any URIs that need AuthN.
>> >> >
>> >> Yeah I was thinking having the JSON as a dictionary with keys being the
>> >> registry URI (appc/docker) and the values being credentials (which will
>> be
>> >> a dictionary as well I guess).
>> >>
>> >>
>> >> --
>> >> Avinash Sridharan, Mesosphere
>> >> +1 (323) 702 5245
>> >>
>>
>>
>>
>> --
>> ~Kevin
>>



-- 
~Kevin


Re: [DISCUSS] Fetching Docker Images Requiring User Credentials.

2016-03-15 Thread Kevin Klues
Can we be a bit more concrete here and try to build up a schema for this?
Maybe something like:

[
  {
    "service" : "docker",
    "registries" :
    [
      {
        "uri" : "",
        "default_credentials" :
        {
          "type" : "",
          "credential" :
          {
            // Custom based on type...
          }
        },
        "image_credentials" :
        [
          {
            "image_name" : "",
            "type" : "",
            "credential" :
            {
              // Custom based on type...
            }
          },
          ...
        ],
        ...
      },
      ...
    ]
  },
  ...
]
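
A minimal sketch of how a lookup against a schema like this might work,
written in Python. It assumes each registry entry is its own object with a
`uri`, an optional `default_credentials`, and a list of `image_credentials`.
The example registry, image name, credential fields, and the function name
`find_credential` are all hypothetical.

```python
from typing import Optional

# Hypothetical example data; the registry URI, image name, and credential
# fields are made up for illustration.
CONFIG = [
    {
        "service": "docker",
        "registries": [
            {
                "uri": "registry.example.com",
                "default_credentials": {
                    "type": "basic",
                    "credential": {"username": "default-user", "password": "secret"},
                },
                "image_credentials": [
                    {
                        "image_name": "team-a/app",
                        "type": "token",
                        "credential": {"token": "team-a-token"},
                    },
                ],
            },
        ],
    },
]


def find_credential(
    config: list, service: str, uri: str, image_name: str
) -> Optional[dict]:
    """Return the credential entry to use for an image, or None if none applies.

    An image-specific entry wins; otherwise the registry default is used.
    """
    for entry in config:
        if entry.get("service") != service:
            continue
        for registry in entry.get("registries", []):
            if registry.get("uri") != uri:
                continue
            for image in registry.get("image_credentials", []):
                if image.get("image_name") == image_name:
                    return {"type": image["type"], "credential": image["credential"]}
            # No per-image entry: fall back to the registry default, if any.
            return registry.get("default_credentials")
    return None
```

Keying per-image entries under a registry is what lets two frameworks use
different credentials against the same registry, which a plain
registry-URI-to-credential dictionary cannot express.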


On Tue, Mar 15, 2016 at 12:57 PM, Jie Yu  wrote:
>>
>> Yeah I was thinking having the JSON as a dictionary with keys being the
>> registry URI (appc/docker) and the values being credentials (which will
be
>> a dictionary as well I guess).
>
>
> Using registry URI as the key is problematic. Think about the public
docker
> hub. Different frameworks might want to use different credentials to
access
> their docker images.
>
> - Jie
>
> On Tue, Mar 15, 2016 at 11:52 AM, Avinash Sridharan 
> wrote:
>
>> On Tue, Mar 15, 2016 at 11:43 AM, Vinod Kone 
wrote:
>>
>> > moved core@ to *bcc*
>> >
>> > On Tue, Mar 15, 2016 at 11:18 AM, Avinash Sridharan <
>> avin...@mesosphere.io
>> > > wrote:
>> >
>> >> Why not follow option 2, but instead of passing the agent credentials,
>> >> pass a location to the flag where credentials for the registry can be
>> found
>> >> (in JSON)? The frameworks can set credentials (maybe registry name or
>> URL
>> >> to the registry), and the credentials can be learnt from the JSON
>> config.
>> >>
>> >
>> > What if we need credentials for multiple-registries? Have a JSON with
one
>> > credential per registry I guess? But if possible, I would love to solve
>> > this more generally as possible; as Gilbert mentioned, this is not a
>> > problem just for Docker images but any URIs that need AuthN.
>> >
>> Yeah I was thinking having the JSON as a dictionary with keys being the
>> registry URI (appc/docker) and the values being credentials (which will
be
>> a dictionary as well I guess).
>>
>>
>> --
>> Avinash Sridharan, Mesosphere
>> +1 (323) 702 5245
>>



-- 
~Kevin


Re: [VOTE] Release Apache Mesos 0.28.0 (rc1)

2016-03-10 Thread Kevin Klues
The list of patches to include in 0.28.0-rc2 is now being tracked by a JIRA:

https://issues.apache.org/jira/browse/MESOS-4915

On Thu, Mar 10, 2016 at 3:51 PM, Vinod Kone  wrote:
> I'll cut it first thing tomorrow. Whatever from Kevin's list above gets in
> by tonight will get into rc2.
>
> On Thu, Mar 10, 2016 at 6:28 PM, Daniel Osborne <
> daniel.osbo...@metaswitch.com> wrote:
>
>> Kevin,
>>
>> When are you planning on cutting?
>>
>> I'm very keen to see 4370 get merged. It just needs some final fixes to
>> get past "Fix it, then ship it".
>>
>> Thanks,
>> Dan
>>
>>
>> -Original Message-
>> From: Kevin Klues [mailto:klue...@gmail.com]
>> Sent: Thursday, March 10, 2016 11:46 AM
>> To: user ; dev 
>> Subject: Re: [VOTE] Release Apache Mesos 0.28.0 (rc1)
>>
>> Updated list of patches to include in 0.28.0-rc2.  We are cutting the
>> release candidate today, so make sure your patches land soon if they
>> haven't already.
>>
>> Did I miss any?
>>
>> Committed:
>> Added documentation about container image support.
>> commit 7de8cdd4d8ed1d222fa03ea0d8fa6740c4a9f84b
>> https://reviews.apache.org/r/44414
>>
>> Fixed the logic for default docker cmd case.
>> commit e42f740ccb655c0478a3002c0b6fa90c1144f41c
>> https://reviews.apache.org/r/44468/
>>
>>
>> Still Under Review:
>> MESOS-4370 NetworkSettings.IPAddress field is deprecated in Docker.
>> https://reviews.apache.org/r/43093/
>>
>> Fixed a bug that causes the task stuck in staging state.
>> https://reviews.apache.org/r/44435/
>>
>> Fixed http endpoint trigger two inverse offer calls.
>> https://reviews.apache.org/r/44258/
>>
>> Added support for "overlay" keyword.
>> https://reviews.apache.org/r/44421/
>>
>> Added document for overlayfs backend.
>> https://reviews.apache.org/r/44391/
>>
>> Add support for user-defined networks.
>> https://reviews.apache.org/r/42516/
>>
>> On Wed, Mar 9, 2016 at 5:50 PM, Guangya Liu  wrote:
>> > Tim,
>> >
>> > What about https://reviews.apache.org/r/42516/ for user-defined
>> > network in docker containerizer, the user defined network has been
>> > landed in docker for quite a while and it is better to enable mesos
>> > docker containerizer support this.
>> >
>> > Thanks,
>> >
>> > Guangya
>> >
>> > On Thu, Mar 10, 2016 at 2:00 AM, Kevin Klues  wrote:
>> >>
>> >> Tim,
>> >>
>> >> Is there a review other than the following for MESOS-4370?
>> >>
>> >> Restore Mesos' ability to extract Docker assigned IPs (still under
>> >> review):
>> >> https://reviews.apache.org/r/43093/
>> >>
>> >> If not, it was already on the list, but has not yet landed.
>> >>
>> >> On Wed, Mar 9, 2016 at 9:57 AM, Timothy Chen  wrote:
>> >> > Also like to include MESOS-4370 as it fixes IP Address look up
>> >> > logic and also unblocks users using custom Docker network.
>> >> >
>> >> > Tim
>> >> >
>> >> > On Wed, Mar 9, 2016 at 9:55 AM, Gilbert Song
>> >> > 
>> >> > wrote:
>> >> >> Hi Kevin,
>> >> >>
>> >> >> Please remove the patch below from the list:
>> >> >> Implemented runtime isolator default cmd test (still under review).
>> >> >> https://reviews.apache.org/r/44469/
>> >> >>
>> >> >> Because the bug was fixed by patch #44468, the test should not be
>> >> >> considered as a block. I am updating MESOS-4888 and move the test
>> >> >> to a separate JIRA.
>> >> >>
>> >> >> Thanks,
>> >> >> Gilbert
>> >> >>
>> >> >> On Tue, Mar 8, 2016 at 2:43 PM, Kevin Klues 
>> wrote:
>> >> >>
>> >> >>> Here is the list of reviews/patches that have been called out in
>> >> >>> this thread for inclusion in 0.28.0-rc2.  Some of them are still
>> >> >>> under review and will need to land by Thursday to be included.
>> >> >>>
>> >> >>> Are there others?
>> >> >>>
>> >> >>> Jie's container image documentation (submitted):
>> >> >>> commit 7de8cdd4d8ed1d222fa03e

Re: [VOTE] Release Apache Mesos 0.28.0 (rc1)

2016-03-10 Thread Kevin Klues
Updated list of patches to include in 0.28.0-rc2.  We are cutting the
release candidate today, so make sure your patches land soon if they
haven't already.

Did I miss any?

Committed:
Added documentation about container image support.
commit 7de8cdd4d8ed1d222fa03ea0d8fa6740c4a9f84b
https://reviews.apache.org/r/44414

Fixed the logic for default docker cmd case.
commit e42f740ccb655c0478a3002c0b6fa90c1144f41c
https://reviews.apache.org/r/44468/


Still Under Review:
MESOS-4370 NetworkSettings.IPAddress field is deprecated in Docker.
https://reviews.apache.org/r/43093/

Fixed a bug that causes the task stuck in staging state.
https://reviews.apache.org/r/44435/

Fixed http endpoint trigger two inverse offer calls.
https://reviews.apache.org/r/44258/

Added support for "overlay" keyword.
https://reviews.apache.org/r/44421/

Added document for overlayfs backend.
https://reviews.apache.org/r/44391/

Add support for user-defined networks.
https://reviews.apache.org/r/42516/

On Wed, Mar 9, 2016 at 5:50 PM, Guangya Liu  wrote:
> Tim,
>
> What about https://reviews.apache.org/r/42516/ for user-defined network in
> docker containerizer, the user defined network has been landed in docker for
> quite a while and it is better to enable mesos docker containerizer support
> this.
>
> Thanks,
>
> Guangya
>
> On Thu, Mar 10, 2016 at 2:00 AM, Kevin Klues  wrote:
>>
>> Tim,
>>
>> Is there a review other than the following for MESOS-4370?
>>
>> Restore Mesos' ability to extract Docker assigned IPs (still under
>> review):
>> https://reviews.apache.org/r/43093/
>>
>> If not, it was already on the list, but has not yet landed.
>>
>> On Wed, Mar 9, 2016 at 9:57 AM, Timothy Chen  wrote:
>> > Also like to include MESOS-4370 as it fixes IP Address look up logic
>> > and also unblocks users using custom Docker network.
>> >
>> > Tim
>> >
>> > On Wed, Mar 9, 2016 at 9:55 AM, Gilbert Song 
>> > wrote:
>> >> Hi Kevin,
>> >>
>> >> Please remove the patch below from the list:
>> >> Implemented runtime isolator default cmd test (still under review).
>> >> https://reviews.apache.org/r/44469/
>> >>
>> >> Because the bug was fixed by patch #44468, the test should not be
>> >> considered as a block. I am updating MESOS-4888 and move the test to a
>> >> separate JIRA.
>> >>
>> >> Thanks,
>> >> Gilbert
>> >>
>> >> On Tue, Mar 8, 2016 at 2:43 PM, Kevin Klues  wrote:
>> >>
>> >>> Here is the list of reviews/patches that have been called out in this
>> >>> thread for inclusion in 0.28.0-rc2.  Some of them are still under
>> >>> review and will need to land by Thursday to be included.
>> >>>
>> >>> Are there others?
>> >>>
>> >>> Jie's container image documentation (submitted):
>> >>> commit 7de8cdd4d8ed1d222fa03ea0d8fa6740c4a9f84b
>> >>> https://reviews.apache.org/r/44414
>> >>>
>> >>> Restore Mesos' ability to extract Docker assigned IPs (still under
>> >>> review):
>> >>> https://reviews.apache.org/r/43093/
>> >>>
>> >>> Fixed the logic for default docker cmd case (submitted).
>> >>> commit e42f740ccb655c0478a3002c0b6fa90c1144f41c
>> >>> https://reviews.apache.org/r/44468/
>> >>>
>> >>> Implemented runtime isolator default cmd test (still under review).
>> >>> https://reviews.apache.org/r/44469/
>> >>>
>> >>> Fixed a bug that causes the task stuck in staging state (still under
>> >>> review).
>> >>> https://reviews.apache.org/r/44435/
>> >>>
>> >>> On Tue, Mar 8, 2016 at 10:30 AM, Kevin Klues 
>> >>> wrote:
>> >>> > Yes, will do.
>> >>> >
>> >>> > On Tue, Mar 8, 2016 at 10:26 AM, Vinod Kone 
>> >>> wrote:
>> >>> >> +kevin klues
>> >>> >>
>> >>> >> OK. I'm cancelling this vote since there are some show stopper
>> >>> >> issues
>> >>> that
>> >>> >> we need to cherry-pick. I'll cut another RC on Thursday.
>> >>> >>
>> >>> >> @shepherds: can you please make sure the blocker tickets are marked
>> >>> >> with
>> >>> >> fix version and that they land today or tomorrow?
>> >>>

Re: [VOTE] Release Apache Mesos 0.28.0 (rc1)

2016-03-09 Thread Kevin Klues
Tim,

Is there a review other than the following for MESOS-4370?

Restore Mesos' ability to extract Docker assigned IPs (still under review):
https://reviews.apache.org/r/43093/

If not, it was already on the list, but has not yet landed.

On Wed, Mar 9, 2016 at 9:57 AM, Timothy Chen  wrote:
> Also like to include MESOS-4370 as it fixes IP Address look up logic
> and also unblocks users using custom Docker network.
>
> Tim
>
> On Wed, Mar 9, 2016 at 9:55 AM, Gilbert Song  wrote:
>> Hi Kevin,
>>
>> Please remove the patch below from the list:
>> Implemented runtime isolator default cmd test (still under review).
>> https://reviews.apache.org/r/44469/
>>
>> Because the bug was fixed by patch #44468, the test should not be
>> considered as a block. I am updating MESOS-4888 and move the test to a
>> separate JIRA.
>>
>> Thanks,
>> Gilbert
>>
>> On Tue, Mar 8, 2016 at 2:43 PM, Kevin Klues  wrote:
>>
>>> Here is the list of reviews/patches that have been called out in this
>>> thread for inclusion in 0.28.0-rc2.  Some of them are still under
>>> review and will need to land by Thursday to be included.
>>>
>>> Are there others?
>>>
>>> Jie's container image documentation (submitted):
>>> commit 7de8cdd4d8ed1d222fa03ea0d8fa6740c4a9f84b
>>> https://reviews.apache.org/r/44414
>>>
>>> Restore Mesos' ability to extract Docker assigned IPs (still under review):
>>> https://reviews.apache.org/r/43093/
>>>
>>> Fixed the logic for default docker cmd case (submitted).
>>> commit e42f740ccb655c0478a3002c0b6fa90c1144f41c
>>> https://reviews.apache.org/r/44468/
>>>
>>> Implemented runtime isolator default cmd test (still under review).
>>> https://reviews.apache.org/r/44469/
>>>
>>> Fixed a bug that causes the task stuck in staging state (still under
>>> review).
>>> https://reviews.apache.org/r/44435/
>>>
>>> On Tue, Mar 8, 2016 at 10:30 AM, Kevin Klues  wrote:
>>> > Yes, will do.
>>> >
>>> > On Tue, Mar 8, 2016 at 10:26 AM, Vinod Kone 
>>> wrote:
>>> >> +kevin klues
>>> >>
>>> >> OK. I'm cancelling this vote since there are some show stopper issues
>>> that
>>> >> we need to cherry-pick. I'll cut another RC on Thursday.
>>> >>
>>> >> @shepherds: can you please make sure the blocker tickets are marked with
>>> >> fix version and that they land today or tomorrow?
>>> >>
>>> >> @kevin: since you have volunteered to help with the release, can you
>>> make
>>> >> sure we have a list of commits to cherry pick for rc2?
>>> >>
>>> >> Thanks,
>>> >>
>>> >>
>>> >> On Tue, Mar 8, 2016 at 12:05 AM, Shuai Lin 
>>> wrote:
>>> >>
>>> >>> Maybe also https://issues.apache.org/jira/browse/MESOS-4877 and
>>> >>> https://issues.apache.org/jira/browse/MESOS-4878 ?
>>> >>>
>>> >>>
>>> >>> On Tue, Mar 8, 2016 at 9:13 AM, Jie Yu  wrote:
>>> >>>
>>> >>>> I'd like to fix https://issues.apache.org/jira/browse/MESOS-4888 as
>>> well
>>> >>>> if you guys plan to cut another RC
>>> >>>>
>>> >>>> On Mon, Mar 7, 2016 at 10:16 AM, Daniel Osborne <
>>> >>>> daniel.osbo...@metaswitch.com> wrote:
>>> >>>>
>>> >>>>> -1
>>> >>>>>
>>> >>>>> If it doesn’t cause too much pain, I'm hoping we can squeeze a
>>> >>>>> relatively small patch which restores Mesos' ability to extract
>>> Docker
>>> >>>>> assigned IPs. This has been broken with Docker 1.10's release over
>>> a month
>>> >>>>> ago, and prevents service discovery and DNS from working.
>>> >>>>>
>>> >>>>> Mesos-4370: https://issues.apache.org/jira/browse/MESOS-4370
>>> >>>>> RB# 43093: https://reviews.apache.org/r/43093/
>>> >>>>>
>>> >>>>> I've built 0.28.0-rc1 with this patch and can confirm that it fixes
>>> it
>>> >>>>> as expected.
>>> >>>>>
>>> >>>>> Apologies for not bringing this to attention earlier.
>>

Re: [VOTE] Release Apache Mesos 0.28.0 (rc1)

2016-03-08 Thread Kevin Klues
Here is the list of reviews/patches that have been called out in this
thread for inclusion in 0.28.0-rc2.  Some of them are still under
review and will need to land by Thursday to be included.

Are there others?

Jie's container image documentation (submitted):
commit 7de8cdd4d8ed1d222fa03ea0d8fa6740c4a9f84b
https://reviews.apache.org/r/44414

Restore Mesos' ability to extract Docker assigned IPs (still under review):
https://reviews.apache.org/r/43093/

Fixed the logic for default docker cmd case (submitted).
commit e42f740ccb655c0478a3002c0b6fa90c1144f41c
https://reviews.apache.org/r/44468/

Implemented runtime isolator default cmd test (still under review).
https://reviews.apache.org/r/44469/

Fixed a bug that causes the task stuck in staging state (still under review).
https://reviews.apache.org/r/44435/

On Tue, Mar 8, 2016 at 10:30 AM, Kevin Klues  wrote:
> Yes, will do.
>
> On Tue, Mar 8, 2016 at 10:26 AM, Vinod Kone  wrote:
>> +kevin klues
>>
>> OK. I'm cancelling this vote since there are some show stopper issues that
>> we need to cherry-pick. I'll cut another RC on Thursday.
>>
>> @shepherds: can you please make sure the blocker tickets are marked with
>> fix version and that they land today or tomorrow?
>>
>> @kevin: since you have volunteered to help with the release, can you make
>> sure we have a list of commits to cherry pick for rc2?
>>
>> Thanks,
>>
>>
>> On Tue, Mar 8, 2016 at 12:05 AM, Shuai Lin  wrote:
>>
>>> Maybe also https://issues.apache.org/jira/browse/MESOS-4877 and
>>> https://issues.apache.org/jira/browse/MESOS-4878 ?
>>>
>>>
>>> On Tue, Mar 8, 2016 at 9:13 AM, Jie Yu  wrote:
>>>
>>>> I'd like to fix https://issues.apache.org/jira/browse/MESOS-4888 as well
>>>> if you guys plan to cut another RC
>>>>
>>>> On Mon, Mar 7, 2016 at 10:16 AM, Daniel Osborne <
>>>> daniel.osbo...@metaswitch.com> wrote:
>>>>
>>>>> -1
>>>>>
>>>>> If it doesn’t cause too much pain, I'm hoping we can squeeze a
>>>>> relatively small patch which restores Mesos' ability to extract Docker
>>>>> assigned IPs. This has been broken with Docker 1.10's release over a
>>>>> month
>>>>> ago, and prevents service discovery and DNS from working.
>>>>>
>>>>> Mesos-4370: https://issues.apache.org/jira/browse/MESOS-4370
>>>>> RB# 43093: https://reviews.apache.org/r/43093/
>>>>>
>>>>> I've built 0.28.0-rc1 with this patch and can confirm that it fixes it
>>>>> as expected.
>>>>>
>>>>> Apologies for not bringing this to attention earlier.
>>>>>
>>>>> Thanks all,
>>>>> Dan
>>>>>
>>>>> -Original Message-
>>>>> From: Vinod Kone [mailto:vinodk...@apache.org]
>>>>> Sent: Thursday, March 3, 2016 5:44 PM
>>>>> To: dev ; user 
>>>>> Subject: [VOTE] Release Apache Mesos 0.28.0 (rc1)
>>>>>
>>>>> Hi all,
>>>>>
>>>>>
>>>>> Please vote on releasing the following candidate as Apache Mesos 0.28.0.
>>>>>
>>>>>
>>>>> 0.28.0 includes the following:
>>>>>
>>>>>
>>>>> 
>>>>>
>>>>>   * [MESOS-4343] - A new cgroups isolator for enabling the net_cls
>>>>> subsystem in
>>>>>
>>>>> Linux. The cgroups/net_cls isolator allows operators to provide
>>>>> network
>>>>>
>>>>>
>>>>> performance isolation and network segmentation for containers within
>>>>> a Mesos
>>>>>
>>>>> cluster. To enable the cgroups/net_cls isolator, append
>>>>> `cgroups/net_cls` to
>>>>>
>>>>> the `--isolation` flag when starting the slave. Please refer to
>>>>>
>>>>>
>>>>> docs/mesos-containerizer.md for more details.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>   * [MESOS-4687] - The implementation of scalar resource values (e.g.,
>>>>> "2.5
>>>>>
>>>>>
>>>>> CPUs") has changed. Mesos now reliably supports resources with up to
>>>>> th

Re: [VOTE] Release Apache Mesos 0.28.0 (rc1)

2016-03-08 Thread Kevin Klues
Yes, will do.

On Tue, Mar 8, 2016 at 10:26 AM, Vinod Kone  wrote:
> +kevin klues
>
> OK. I'm cancelling this vote since there are some show stopper issues that
> we need to cherry-pick. I'll cut another RC on Thursday.
>
> @shepherds: can you please make sure the blocker tickets are marked with
> fix version and that they land today or tomorrow?
>
> @kevin: since you have volunteered to help with the release, can you make
> sure we have a list of commits to cherry pick for rc2?
>
> Thanks,
>
>
> On Tue, Mar 8, 2016 at 12:05 AM, Shuai Lin  wrote:
>
>> Maybe also https://issues.apache.org/jira/browse/MESOS-4877 and
>> https://issues.apache.org/jira/browse/MESOS-4878 ?
>>
>>
>> On Tue, Mar 8, 2016 at 9:13 AM, Jie Yu  wrote:
>>
>>> I'd like to fix https://issues.apache.org/jira/browse/MESOS-4888 as well
>>> if you guys plan to cut another RC
>>>
>>> On Mon, Mar 7, 2016 at 10:16 AM, Daniel Osborne <
>>> daniel.osbo...@metaswitch.com> wrote:
>>>
>>>> -1
>>>>
>>>> If it doesn’t cause too much pain, I'm hoping we can squeeze in a
>>>> relatively small patch which restores Mesos' ability to extract
>>>> Docker-assigned IPs. This has been broken since Docker 1.10's release over
>>>> a month ago, and prevents service discovery and DNS from working.
>>>>
>>>> MESOS-4370: https://issues.apache.org/jira/browse/MESOS-4370
>>>> RB# 43093: https://reviews.apache.org/r/43093/
>>>>
>>>> I've built 0.28.0-rc1 with this patch and can confirm that it fixes it
>>>> as expected.
>>>>
>>>> Apologies for not bringing this to attention earlier.
>>>>
>>>> Thanks all,
>>>> Dan
>>>>
>>>> -Original Message-
>>>> From: Vinod Kone [mailto:vinodk...@apache.org]
>>>> Sent: Thursday, March 3, 2016 5:44 PM
>>>> To: dev ; user 
>>>> Subject: [VOTE] Release Apache Mesos 0.28.0 (rc1)
>>>>
>>>> Hi all,
>>>>
>>>>
>>>> Please vote on releasing the following candidate as Apache Mesos 0.28.0.
>>>>
>>>>
>>>> 0.28.0 includes the following:
>>>>
>>>>
>>>> 
>>>>
>>>>   * [MESOS-4343] - A new cgroups isolator for enabling the net_cls
>>>> subsystem in
>>>>
>>>> Linux. The cgroups/net_cls isolator allows operators to provide
>>>> network
>>>>
>>>>
>>>> performance isolation and network segmentation for containers within
>>>> a Mesos
>>>>
>>>> cluster. To enable the cgroups/net_cls isolator, append
>>>> `cgroups/net_cls` to
>>>>
>>>> the `--isolation` flag when starting the slave. Please refer to
>>>>
>>>>
>>>> docs/mesos-containerizer.md for more details.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>   * [MESOS-4687] - The implementation of scalar resource values (e.g.,
>>>> "2.5
>>>>
>>>>
>>>> CPUs") has changed. Mesos now reliably supports resources with up to
>>>> three
>>>>
>>>> decimal digits of precision (e.g., "2.501 CPUs"); resources with
>>>> more than
>>>>
>>>> three decimal digits of precision will be rounded. Internally,
>>>> resource math
>>>>
>>>> is now done using a fixed-point format that supports three decimal
>>>> digits of
>>>>
>>>> precision, and then converted to/from floating point for input and
>>>> output,
>>>>
>>>> respectively. Frameworks that do their own resource math and
>>>> manipulate
>>>>
>>>>
>>>> fractional resources may observe differences in roundoff error and
>>>> numerical
>>>>
>>>> precision.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>   * [MESOS-4479] - Reserved resources can now optionally include
>>>> "labels".
>>>>
>>>>
>>>> Labels are a set of key-value pairs that can be used to associate
>>>> metadata
>>>>
>>>> with a reserved resource. For example

Re: [VOTE] Release Apache Mesos 0.27.2 (rc1)

2016-03-07 Thread Kevin Klues
Sure, that's fine with me. Since the fix for the flaky test was a flaw
in the test itself there shouldn't be any issues.

Thanks for double checking. Especially since my vote was non-binding.

Kevin

On Mon, Mar 7, 2016 at 7:33 PM, Michael Park  wrote:
> Kevin,
>
> Sorry for missing your flaky test patch. It seems like we added 0.27.2
> to the backports list after you had indicated that 0.24.2, 0.25.1, and
> 0.26.1 need to include the patch. We should have asked whether this needs
> to be included in 0.27.2 as well. I think we missed it because there were
> many patches that needed to be included in 0.24.2, 0.25.1 and 0.26.1 but
> not 0.27.2, since they had made it into 0.27.0 or 0.27.1.
>
> Having said that, I'm inclined to agree with Joris and proceed since it
> doesn't have much of an impact in terms of the resulting binary, for
> example.
>
> Could you confirm or deny whether you're ok with this?
>
> Thanks,
>
> MPark
>
> On 4 March 2016 at 15:51, Joris Van Remoortere
>  wrote:
>>
>> +1 (binding)
>> Greg's upgrade scripts & CI results
>>
>> The missing commit is for a flaky test which doesn't influence the
>> production binaries.
>> Unless we need to cut another RC for a bug, I suggest we move ahead.
>>
>> On Wed, Mar 2, 2016 at 10:36 AM, Jörg Schad  wrote:
>>
>> > Except the missing fix for Mesos-4518, if we consider cutting a rc2
>> > for that, maybe we could include the fix for MESOS-4677 as well (see
>> > failing ROOT_CGROUPS_Pids_and_Tids test below).
>> > +1 (non-binding)
>> >
>> > All the failing tests I encountered seem to be known.
>> >
>> > Centos 7
>> > * LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids  (fixed with
>> > MESOS-4677 for 0.28 )
>> > * LinuxFilesystemIsolatorTest.ROOT_MultipleContainers (open ticket
>> > MESOS-4423)
>> >
>> > Centos 7 - SSL
>> > All green
>> >
>> > Centos 6 (+/- SSL)
>> > * MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery (reopened
>> > MESOS-4047)
>> >
>> > Debian 8 (+/- SSL)
>> > * DockerContainerizerTest.ROOT_DOCKER_Kill (seems the same issue as
>> > MESOS-3937)
>> >
>> > Ubuntu 15 (+/- SSL)
>> > Green
>> >
>> > Ubuntu 14 (+/- SSL)
>> > Green
>> >
>> > Ubuntu 12 (+/- SSL)
>> > Green
>> >
>> > On Tue, Mar 1, 2016 at 10:18 PM, Kevin Klues  wrote:
>> > > -1 (non-binding)
>> > >
>> > > This release candidate should have included the backport to resolve
>> > > MESOS-4518 <https://issues.apache.org/jira/browse/MESOS-4518>.
>> > > All of the other release candidates that came out as backports
>> > > recently
>> > > have included this, but somehow this one was overlooked.
>> > >
>> > >
>> > >
>> > >
>> > > On Tue, Mar 1, 2016 at 4:35 PM, Greg Mann  wrote:
>> > >
>> > >> I was able to successfully test a simple upgrade scenario between
>> > >> 0.26.1-rc1 and 0.27.2-rc1 using Niklas's upgrade testing script,
>> > >> which I've modified slightly and reposted here:
>> > >> https://reviews.apache.org/r/44229/
>> > >>
>> > >> On Tue, Mar 1, 2016 at 2:22 PM, Kevin Klues 
>> > >> wrote:
>> > >>
>> > >> > The others all seem to have them though:
>> > >> >
>> > >> > https://github.com/apache/mesos/commits/0.26.1-rc1/src/tests/master_tests.cpp
>> > >> > https://github.com/apache/mesos/commits/0.25.1-rc1/src/tests/master_tests.cpp
>> > >> > https://github.com/apache/mesos/commits/0.24.2-rc1/src/tests/master_tests.cpp
>> > >> >
>> > >> > Just not:
>> > >> > https://github.com/apache/mesos/commits/0.27.2-rc1/src/tests/master_tests.cpp
>> > >> >
>> > >> > On Tue, Mar 1, 2016 at 2:17 PM, Kevin Klues 
>> > wrote:
>> > >> > > Looks like this rc is missing this c

Re: [VOTE] Release Apache Mesos 0.27.2 (rc1)

2016-03-01 Thread Kevin Klues
-1 (non-binding)

This release candidate should have included the backport to resolve
MESOS-4518 <https://issues.apache.org/jira/browse/MESOS-4518>.
All of the other release candidates that came out as backports recently
have included this, but somehow this one was overlooked.




On Tue, Mar 1, 2016 at 4:35 PM, Greg Mann  wrote:

> I was able to successfully test a simple upgrade scenario between
> 0.26.1-rc1 and 0.27.2-rc1 using Niklas's upgrade testing script, which I've
> modified slightly and reposted here: https://reviews.apache.org/r/44229/
>
> On Tue, Mar 1, 2016 at 2:22 PM, Kevin Klues  wrote:
>
> > The others all seem to have them though:
> >
> > https://github.com/apache/mesos/commits/0.26.1-rc1/src/tests/master_tests.cpp
> > https://github.com/apache/mesos/commits/0.25.1-rc1/src/tests/master_tests.cpp
> > https://github.com/apache/mesos/commits/0.24.2-rc1/src/tests/master_tests.cpp
> >
> > Just not:
> > https://github.com/apache/mesos/commits/0.27.2-rc1/src/tests/master_tests.cpp
> >
> > On Tue, Mar 1, 2016 at 2:17 PM, Kevin Klues  wrote:
> > > Looks like this rc is missing this commit:
> > >
> > >
> >
> https://github.com/apache/mesos/commit/d3108d776b6f7121e37176eda686ecc7245be4cd
> > >
> > > On Tue, Mar 1, 2016 at 2:08 PM, Joris Van Remoortere
> > >  wrote:
> > >> @Michael Browning:
> > >>>
> > >>> MasterTest.MaxCompletedTasksPerFrameworkFlag [flaky, tracked in
> > >>> MESOS-4518]
> > >>
> > > >> This is supposed to be fixed in this release. It is concerning that
> > > >> this came up.
> > >> Can you verify this and provide logs to Kevin Klues?
> > >>
> > >>
> > >> —
> > >> Joris Van Remoortere
> > >> Mesosphere
> > >>
> > >> On Tue, Mar 1, 2016 at 2:00 PM, Michael Browning <
> > invitapri...@gmail.com>
> > >> wrote:
> > >>>
> > >>> +1 (non-binding)
> > >>>
> > >>> Fedora 23: `make check` non-root OK
> > >>> OS X: `make check` non-root OK
> > >>> Ubuntu 14.04: `make check` non-root, three failures:
> > >>> ContainerLoggerTest.DefaultToSandbox [flaky, tracked in MESOS-4615]
> > >>> MasterQuotaTest.AvailableResourcesAfterRescinding [flaky, tracked in
> > >>> MESOS-4542]
> > >>> MasterTest.MaxCompletedTasksPerFrameworkFlag [flaky, tracked in
> > >>> MESOS-4518]
> > >>>
> > >>> On Mon, Feb 29, 2016 at 10:40 PM, Greg Mann 
> > wrote:
> > >>>
> > >>> > +1 (non-binding)
> > >>> >
> > >>> > `sudo make check` on Ubuntu 14.04 using gcc, with libevent and SSL
> > >>> > enabled.
> > >>> >
> > >>> > All tests pass except
> > MemoryPressureMesosTest.CGROUPS_ROOT_Statistics,
> > >>> > which seems to be due to the issue found here:
> > >>> > https://issues.apache.org/jira/browse/MESOS-4053
> > >>> >
> > >>> >
> > >>> > On Mon, Feb 29, 2016 at 2:17 PM, Michael Park 
> > wrote:
> > >>> >
> > >>> > > Vinod, we've only committed the CHANGELOGs to the specific tags.
> I
> > >>> > > didn't
> > >>> > > realize that I should commit those to master as well, but it
> makes
> > >>> > > total
> > >>> > > sense to do so. I'll do that. Thanks.
> > >>> > >
> > >>> > > On 29 February 2016 at 13:50, Vinod Kone 
> > wrote:
> > >>> > >
> > >>> > >> I don't see CHANGELOGs for these versions on the master branch?
> > >>> > >>
> > >>> > >> On Mon, Feb 29, 2016 at 1:39 PM, Neil Conway <
> > neil.con...@gmail.com>
> > >>> > >> wrote:
> > >>> > >>
> > >>> > >> > As described (briefly) in the release emails, 0.27.2, 0.26.1,
> > >>> > >> > 0.25.1,
> > >>> > >> > and 0.24.2 contain a new feature: "reliable floating point
> for
> > >>> > >> > scalar
> > >>> > >> > resources" (MESOS-4687).
> > >>> > >> >
> > >>> > >> > To elaborate on th

Re: [VOTE] Release Apache Mesos 0.26.1 (rc1)

2016-03-01 Thread Kevin Klues
I committed a fix for this in:
https://github.com/apache/mesos/commit/42f746937233349660c687ea7a66cc0a78871663

Looks like that's post-0.26 though, so maybe it should be included in the
0.26.1 rc.

On Mon, Feb 29, 2016 at 2:27 PM, Vinod Kone  wrote:

> Looks like the ASF CI builds for CentOS7 are failing because they are
> unable to find JAVA_HOME. Couldn't tell if it's an issue with the docker
> build script or something in the configure script.
>
>
> checking for svn_txdelta in -lsvn_delta-1... yes
> checking for sasl_done in -lsasl2... yes
> checking SASL CRAM-MD5 support... yes
> checking for javac... /usr/bin/javac
> checking for java... /usr/bin/java
> checking value of Java system property 'java.home'... 
> /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.71-2.b15.el7_2.x86_64/jre
> configure: error: could not guess JAVA_HOME
>
>
>
> *Revision*: a05261dbed1c2577676b11235380de95d586aeeb
>
>- refs/tags/0.26.1-rc1
>
> Configuration Matrix                                    gcc      clang
> centos:7      --verbose --enable-libevent --enable-ssl  Failed   Not run
> centos:7      --verbose                                 Failed   Not run
> ubuntu:14.04  --verbose --enable-libevent --enable-ssl  Success  Success
> ubuntu:14.04  --verbose                                 Success  Success
>
> On Mon, Feb 29, 2016 at 11:21 AM, Kapil Arya  wrote:
>
>> +1 (binding)
>>
>> Successful CI builds for the following distros:
>>
>> amd64/centos/6
>> amd64/centos/7
>> amd64/debian/jessie
>> amd64/ubuntu/precise
>> amd64/ubuntu/trusty
>> amd64/ubuntu/vivid
>>
>> Kapil
>>
>> On Sat, Feb 27, 2016 at 12:26 AM, Michael Park  wrote:
>>
>> > Hi all,
>> >
>> > Please vote on releasing the following candidate as Apache Mesos 0.26.1.
>> >
>> >
>> > 0.26.1 includes the following:
>> >
>> >
>> 
>> >
>> >- Improvements
>> >   - `/state` endpoint performance
>> >   - systemd integration
>> >   - GLOG performance
>> >   - Configurable task/framework history
>> >   - Offer filter timeout fix for backlogged allocator
>> >
>> >
>> >- Bugs
>> >   - SSL
>> >   - Libevent
>> >   - Fixed point resources math
>> >   - HDFS
>> >   - Agent upgrade compatibility
>> >
>> > The CHANGELOG for the release is available at:
>> >
>> >
>> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.26.1-rc1
>> >
>> >
>> 
>> >
>> > The candidate for Mesos 0.26.1 release is available at:
>> >
>> https://dist.apache.org/repos/dist/dev/mesos/0.26.1-rc1/mesos-0.26.1.tar.gz
>> >
>> > The tag to be voted on is 0.26.1-rc1:
>> >
>> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=0.26.1-rc1
>> >
>> > The MD5 checksum of the tarball can be found at:
>> >
>> >
>> https://dist.apache.org/repos/dist/dev/mesos/0.26.1-rc1/mesos-0.26.1.tar.gz.md5
>> >
>> > The signature of the tarball can be found at:
>> >
>> >
>> https://dist.apache.org/repos/dist/dev/mesos/0.26.1-rc1/mesos-0.26.1.tar.gz.asc
>> >
>> > The PGP key used to sign the release is here:
>> > https://dist.apache.org/repos/dist/release/mesos/KEYS
>> >
>> > The JAR is up in Maven in a staging repository here:
>> > https://repository.apache.org/content/repositories/orgapachemesos-1106
>> >
>> > Please vote on releasing this package as Apache Mesos 0.26.1!
>> >
>> > The vote is open until Wed Mar 2 23:59:59 PST 2016 and passes if a
>> majority
>> > of at least 3 +1 PMC votes are cast.
>> >
>> > [ ] +1 Release this package as Apache Mesos 0.26.1
>> > [ ] -1 Do not release this package because ...
>> >
>> > Thanks,
>> >
>> > Joris, Kapil, MPark
>> >
>>
>
>


-- 
~Kevin


Re: [VOTE] Release Apache Mesos 0.27.2 (rc1)

2016-03-01 Thread Kevin Klues
Looks like this rc is missing this commit:

https://github.com/apache/mesos/commit/d3108d776b6f7121e37176eda686ecc7245be4cd

On Tue, Mar 1, 2016 at 2:08 PM, Joris Van Remoortere
 wrote:
> @Michael Browning:
>>
>> MasterTest.MaxCompletedTasksPerFrameworkFlag [flaky, tracked in
>> MESOS-4518]
>
> This is supposed to be fixed in this release. It is concerning that this
> came up.
> Can you verify this and provide logs to Kevin Klues?
>
>
> —
> Joris Van Remoortere
> Mesosphere
>
> On Tue, Mar 1, 2016 at 2:00 PM, Michael Browning 
> wrote:
>>
>> +1 (non-binding)
>>
>> Fedora 23: `make check` non-root OK
>> OS X: `make check` non-root OK
>> Ubuntu 14.04: `make check` non-root, three failures:
>> ContainerLoggerTest.DefaultToSandbox [flaky, tracked in MESOS-4615]
>> MasterQuotaTest.AvailableResourcesAfterRescinding [flaky, tracked in
>> MESOS-4542]
>> MasterTest.MaxCompletedTasksPerFrameworkFlag [flaky, tracked in
>> MESOS-4518]
>>
>> On Mon, Feb 29, 2016 at 10:40 PM, Greg Mann  wrote:
>>
>> > +1 (non-binding)
>> >
>> > `sudo make check` on Ubuntu 14.04 using gcc, with libevent and SSL
>> > enabled.
>> >
>> > All tests pass except MemoryPressureMesosTest.CGROUPS_ROOT_Statistics,
>> > which seems to be due to the issue found here:
>> > https://issues.apache.org/jira/browse/MESOS-4053
>> >
>> >
>> > On Mon, Feb 29, 2016 at 2:17 PM, Michael Park  wrote:
>> >
>> > > Vinod, we've only committed the CHANGELOGs to the specific tags. I
>> > > didn't
>> > > realize that I should commit those to master as well, but it makes
>> > > total
>> > > sense to do so. I'll do that. Thanks.
>> > >
>> > > On 29 February 2016 at 13:50, Vinod Kone  wrote:
>> > >
>> > >> I don't see CHANGELOGs for these versions on the master branch?
>> > >>
>> > >> On Mon, Feb 29, 2016 at 1:39 PM, Neil Conway 
>> > >> wrote:
>> > >>
>> > >> > As described (briefly) in the release emails, 0.27.2, 0.26.1,
>> > >> > 0.25.1,
>> > >> > and 0.24.2 contain a new feature: "reliable floating point for
>> > >> > scalar
>> > >> > resources" (MESOS-4687).
>> > >> >
>> > >> > To elaborate on that slightly, Mesos now only supports scalar
>> > >> > resource
>> > >> > values with three decimal digits of precision (e.g., reserving
>> > >> > "5.001
>> > >> > CPUs" for a task). As a result of this change, frameworks that do
>> > >> > their own resource math may see slightly different results;
>> > >> > furthermore, if any frameworks were trying to manage extremely
>> > >> > fine-grained resource values (> 3 decimal digits of precision),
>> > >> > that
>> > >> > will no longer be supported.
>> > >> >
>> > >> > For more information, please see:
>> > >> >
>> > >> >
>> > >> >
>> > >>
>> >
>> > https://mail-archives.apache.org/mod_mbox/mesos-user/201602.mbox/%3CCAOW5sYZJn5caBOwZyPV008JgL1F2FYFxL_bM5CtYA2PF2OG7Bw%40mail.gmail.com%3E
>> > >> >
>> > >> >
>> > >>
>> >
>> > https://docs.google.com/document/d/14qLxjZsfIpfynbx0USLJR0GELSq8hdZJUWw6kaY_DXc/edit?usp=sharing
>> > >> > https://issues.apache.org/jira/browse/MESOS-4687
>> > >> >
>> > >> > Neil
>> > >> >
>> > >> >
>> > >> > On Fri, Feb 26, 2016 at 8:54 PM, Michael Park 
>> > >> wrote:
>> > >> > > Hi all,
>> > >> > >
>> > >> > > Please vote on releasing the following candidate as Apache Mesos
>> > >> 0.27.2.
>> > >> > >
>> > >> > >
>> > >> > > 0.27.2 includes the following:
>> > >> > >
>> > >> >
>> > >>
>> >
>> > 
>> > >> > >
>> > >> > > MESOS-4693 - Variable shadowing in
>> > >> HookManager::slavePreLaunchDockerHook.
>> > >> > > MESOS-4711 - Race condition in libevent poll implementation
>> > >>

Re: [VOTE] Release Apache Mesos 0.27.2 (rc1)

2016-03-01 Thread Kevin Klues
The others all seem to have them though:

https://github.com/apache/mesos/commits/0.26.1-rc1/src/tests/master_tests.cpp
https://github.com/apache/mesos/commits/0.25.1-rc1/src/tests/master_tests.cpp
https://github.com/apache/mesos/commits/0.24.2-rc1/src/tests/master_tests.cpp

Just not:
https://github.com/apache/mesos/commits/0.27.2-rc1/src/tests/master_tests.cpp

On Tue, Mar 1, 2016 at 2:17 PM, Kevin Klues  wrote:
> Looks like this rc is missing this commit:
>
> https://github.com/apache/mesos/commit/d3108d776b6f7121e37176eda686ecc7245be4cd
>
> On Tue, Mar 1, 2016 at 2:08 PM, Joris Van Remoortere
>  wrote:
>> @Michael Browning:
>>>
>>> MasterTest.MaxCompletedTasksPerFrameworkFlag [flaky, tracked in
>>> MESOS-4518]
>>
>> This is supposed to be fixed in this release. It is concerning that this
>> came up.
>> Can you verify this and provide logs to Kevin Klues?
>>
>>
>> —
>> Joris Van Remoortere
>> Mesosphere
>>
>> On Tue, Mar 1, 2016 at 2:00 PM, Michael Browning 
>> wrote:
>>>
>>> +1 (non-binding)
>>>
>>> Fedora 23: `make check` non-root OK
>>> OS X: `make check` non-root OK
>>> Ubuntu 14.04: `make check` non-root, three failures:
>>> ContainerLoggerTest.DefaultToSandbox [flaky, tracked in MESOS-4615]
>>> MasterQuotaTest.AvailableResourcesAfterRescinding [flaky, tracked in
>>> MESOS-4542]
>>> MasterTest.MaxCompletedTasksPerFrameworkFlag [flaky, tracked in
>>> MESOS-4518]
>>>
>>> On Mon, Feb 29, 2016 at 10:40 PM, Greg Mann  wrote:
>>>
>>> > +1 (non-binding)
>>> >
>>> > `sudo make check` on Ubuntu 14.04 using gcc, with libevent and SSL
>>> > enabled.
>>> >
>>> > All tests pass except MemoryPressureMesosTest.CGROUPS_ROOT_Statistics,
>>> > which seems to be due to the issue found here:
>>> > https://issues.apache.org/jira/browse/MESOS-4053
>>> >
>>> >
>>> > On Mon, Feb 29, 2016 at 2:17 PM, Michael Park  wrote:
>>> >
>>> > > Vinod, we've only committed the CHANGELOGs to the specific tags. I
>>> > > didn't
>>> > > realize that I should commit those to master as well, but it makes
>>> > > total
>>> > > sense to do so. I'll do that. Thanks.
>>> > >
>>> > > On 29 February 2016 at 13:50, Vinod Kone  wrote:
>>> > >
>>> > >> I don't see CHANGELOGs for these versions on the master branch?
>>> > >>
>>> > >> On Mon, Feb 29, 2016 at 1:39 PM, Neil Conway 
>>> > >> wrote:
>>> > >>
>>> > >> > As described (briefly) in the release emails, 0.27.2, 0.26.1,
>>> > >> > 0.25.1,
>>> > >> > and 0.24.2 contain a new feature: "reliable floating point for
>>> > >> > scalar
>>> > >> > resources" (MESOS-4687).
>>> > >> >
>>> > >> > To elaborate on that slightly, Mesos now only supports scalar
>>> > >> > resource
>>> > >> > values with three decimal digits of precision (e.g., reserving
>>> > >> > "5.001
>>> > >> > CPUs" for a task). As a result of this change, frameworks that do
>>> > >> > their own resource math may see slightly different results;
>>> > >> > furthermore, if any frameworks were trying to manage extremely
>>> > >> > fine-grained resource values (> 3 decimal digits of precision),
>>> > >> > that
>>> > >> > will no longer be supported.
>>> > >> >
>>> > >> > For more information, please see:
>>> > >> >
>>> > >> >
>>> > >> >
>>> > >>
>>> >
>>> > https://mail-archives.apache.org/mod_mbox/mesos-user/201602.mbox/%3CCAOW5sYZJn5caBOwZyPV008JgL1F2FYFxL_bM5CtYA2PF2OG7Bw%40mail.gmail.com%3E
>>> > >> >
>>> > >> >
>>> > >>
>>> >
>>> > https://docs.google.com/document/d/14qLxjZsfIpfynbx0USLJR0GELSq8hdZJUWw6kaY_DXc/edit?usp=sharing
>>> > >> > https://issues.apache.org/jira/browse/MESOS-4687
>>> > >> >
>>> > >> > Neil
>>> > >> >
>>> > >> >
>>> > >> > On Fri, Feb 26, 2016 at 8:54 PM, Michael Park 
>>> > >> 

Re: Performance isolation working group meeting on Friday 10am PST

2016-02-26 Thread Kevin Klues
I'm having trouble joining the call. Keeps saying "Requesting to join
the video call..."

On Wed, Feb 24, 2016 at 12:12 PM, Niklas Nielsen  wrote:
> Hi all,
>
> We will meet and talk performance isolation on Friday 10am PST, with the
> agenda:
>
> 1) New proposal for core affinity
> 2) CFS configuration status
> 3) Workload Benchmarking
> 4) Discussion on actuating isolation for resources that are accounted / not
> accounted by Mesos
>
> For now, this will be the hangout:
> https://plus.google.com/hangouts/_/qni.dk/nik and we will follow up with
> any changes
>
> --
> Niklas



-- 
~Kevin


Re: Reorganize 3rdparty directory

2016-02-18 Thread Kevin Klues
I am also a fan of git submodules in the long term, but would avoid them
in the short term.  We should get things organized as we want them
first, and then start thinking about pulling libprocess/stout out into
submodules later (while also preserving their history!)

I disagree with moving libprocess and stout up to the same level as
src/. If we want to make sure they don't bleed into Mesos proper, we
really should treat them the same as any other 3rdparty code that we
depend on.  This will become more relevant when/if we move them into
submodules.

Given all that, the only real challenge with flattening our 3rdparty
dependencies into a single folder should be the changes we have to
make to our configure.ac and Makefile.am scripts to know where to look
for their dependencies now.  In the end these should be URLs to
versioned tarballs that we host somewhere (or git repos that we can
have forked and tagged with specific versions).  In the short term
these can just be relative paths in the mesos tree though.

On Tue, Feb 16, 2016 at 1:26 PM, Kapil Arya  wrote:
> Thanks for bringing it up Alexander!
>
> I don't have a strong opinion wrt git submodules since I don't have
> much experience with them personally. Having said that, I would like
> to go conservative on this one (baby steps :-) ).
>
> Further, I do understand that moving libprocess and stout directories
> will be painful for people who already have several branches and will
> have conflicts. But I do think, there are some interim solutions as
> well (for example, move libprocess/stout to wherever we want, but keep
> a symlink from 3rdparty/libprocess, etc, to those new locations for
> some time). I am sure there are better solutions out there, but this
> should work too.
>
> Best,
> Kapil
>
> On Tue, Feb 16, 2016 at 12:38 PM, Erik Weathers
>  wrote:
>> If we go to git submodules, please ensure there are good docs around how to
>> update cloned repos.
>>
>> e.g., From ansible: https://docs.ansible.com/ansible/intro_installation.html
>>
>> Note when updating ansible, be sure to not only update the source tree, but
>> also the “submodules” in git which point at Ansible’s own modules (not the
>> same kind of modules, alas).
>>
>> $ git pull --rebase
>> $ git submodule update --init --recursive
>>
>> Thanks,
>>
>> - Erik
>>
>> On Tue, Feb 16, 2016 at 8:54 AM, Alexander Rojas 
>> wrote:
>>
>>> +1
>>> I am totally in favor of that change. It is not only the directory
>>> layout, but also the structure, which has led to the stout tests (which do
>>> need to be compiled) actually being managed in the libprocess Makefile, on
>>> top of all the things you have already mentioned.
>>>
>>>
>>> > On 09 Feb 2016, at 17:53, Kapil Arya  wrote:
>>> >
>>> > On Tue, Feb 9, 2016 at 8:23 PM, Jie Yu  wrote:
>>> >> Kapil,
>>> >>
>>> >> I guess what I want to understand is why the existing structure makes it
>>> >> hard for you to do the things that you want to do (installing
>>> >> module-specific 3rdparty dependencies into "${pkglibdir}/3rdparty" as
>>> part
>>> >> of "make install").
>>> >
>>> > Let me see if I can answer that :-).
>>> >
>>> > This is somewhat related. For example, if we want to install protobuf
>>> > in 3rdparty/{include,lib} (for module developers to use them without
>>> > doing a proper mesos installation), you need to provide the correct
>>> > "--prefix" flag that points to 3rdparty/. However, due to multiple
>>> > levels of configure.ac, the "--prefix" can at best be generated by
>>> > prepending "../../../" to get to the great-grandparent directory. This
>>> > is because we have a separate configure.ac which manages
>>> > 3rdparty/libprocess/3rdparty/Makefile.am. There are ways around it,
>>> > but they are not clean.
>>> >
>>> > Similar thing holds for system-wide installation of these 3rdparty
>>> > packages. For example, ideally, we would want to use
>>> > "${pkglibdir}/3rdparty" as a prefix for those packages. However, since
>>> > they are part of libprocess package, we don't get the correct
>>> > directory and have to use either hardwired $pkglibdir, or somehow pass
>>> > it from the top-level configure all the way down to
>>> > 3rdparty/libprocess/3rdparty/Makefile.am :-(.
>>> >
>>> >
>>> >> The only reason you mentioned in the original email is that "in the
>>> current
>>> >> code base, we don't strictly follow the 3rdparty structure", which IMO
>>> is
>>> >> not a very convincing reason for such a big change.
>>> >
>>> > How about a not so big change? :-). What if we just move
>>> > 3rdparty/libprocess/3rdparty/* stuff out to 3rdparty/ while leaving
>>> > stout as is? That is not a big change since we are not touching
>>> > libprocess/stout. Just adjusting Makefiles and I am pretty sure it
>>> > will be cleaner and simpler than what we have right now.
>>> >
>>> > At a later time, we can then consider moving stout out to 3rdparty/
>>> > while leaving libprocess as is. But that's something we can decide
>>> > later and leave stout as an excepti

Re: Mesos binaries and shared library dependencies.

2016-02-18 Thread Kevin Klues
That's interesting because autotools actually does the opposite.  It
uses a wrapper script to avoid setting the rpath after a build, but
then sets the rpath (relative to --prefix) during 'make install'.

I know there has been some work towards moving to a CMake build, but
I'm not sure exactly what the status of that is at the moment.

On Thu, Feb 18, 2016 at 6:46 AM, Matthias Bach
 wrote:
> Hi Kevin,
>
> CMake will add rpaths during build and then strip them in `make install`
> which solves both, the development and the deployment use case. In my
> opinion this is the proper way to do it.
>
> Regards,
> Matthias
>
> Am 15.02.2016 um 00:31 schrieb Kevin Klues:
>> I was wrong. This unset the rpath in the binary, but left the rpath
>> set in libmesos.so.  I am going to leave this as an open issue for
>> now, pending a better solution.
>>
>> On Fri, Feb 12, 2016 at 5:03 PM, Kevin Klues  wrote:
>>> Playing with this a bit more, I came up with a solution that bridges
>>> the gap between the two alternatives I proposed before. Basically, I
>>> found a way to get the wrapper script to set the rpath when running
>>> the binary from the build directory, but leave the rpath of the actual
>>> binary unchanged. Best of both worlds.
>>>
>>> Instead of explicitly setting the global LDFLAGS in configure.ac, I now
>>> set an intermediate variable called LIBMESOS_RPATH for all non-bundled
>>> packages:
>>>
>>> LIBMESOS_RPATH="$LIBMESOS_RPATH:${with_zookeeper}/lib"
>>> LIBMESOS_RPATH="$LIBMESOS_RPATH:${with_glog}/lib"
>>> ...
>>>
>>> Then at the bottom of configure.ac, I do:
>>>
>>> AC_SUBST(LIBMESOS_RPATH, ["${LIBMESOS_RPATH//:/-rpath }"])
>>>
>>> This substitutes all colons in LIBMESOS_RPATH with "-rpath " to form a
>>> valid string of rpaths for consumption by Makefile.am. I chose to
>>> format the original LIBMESOS_RPATH string with colons so that it forms
>>> a valid LD_LIBRARY_PATH as well (plus it's easier to tokenize by ':'
>>> than by '-rpath ').
>>>
>>> Then in Makefile.am I do:
>>>
>>> libmesos_la_LDFLAGS += $(LIBMESOS_RPATH)
>>>
>>> This sets the rpath for libmesos using *libtool's* LDFLAGS rather than
>>> setting the global LDFLAGS for all linked objects (as I was doing
>>> before).
>>>
>>> With this change, I can inspect libtool's wrapper scripts to see that indeed
>>> it patches up the binary with the added rpaths, but libmesos itself
>>> has no rpath set.
>>>
>>> Pending any objections, a RR will be forthcoming.
>>>
>>> On Fri, Feb 12, 2016 at 3:31 PM, Kevin Klues  wrote:
>>>> I like that idea.  Don't set rpath by default, but allow people to
>>>> specify that it should be set via a flag. How does
>>>>
>>>> --set-rpath-for-external-libs
>>>>
>>>> sound for a name?  Too long?  I don't like just --with-rpath, because
>>>> it's not descriptive enough in my opinion.
>>>>
>>>> On Fri, Feb 12, 2016 at 2:34 PM, Jojy Varghese  wrote:
>>>>> Maybe have an opt-in (say --with-rpath)?
>>>>>
>>>>> -Jojy
>>>>>
>>>>>> On Feb 12, 2016, at 2:19 PM, Kevin Klues  wrote:
>>>>>>
>>>>>> To be clear, I'm actually a bit torn both ways on this.
>>>>>>
>>>>>> On the one hand, including the rpath makes it easy for those who don't
>>>>>> know anything about LD_LIBRARY_PATH, the ldcache, etc. to simply pass
>>>>>> their paths to their external dependencies at configure time and then
>>>>>> run their binaries without further effort.
>>>>>>
>>>>>> On the other hand, maybe they should be cognizant of the fact that
>>>>>> something is going on under the hood to actually allow their binaries
>>>>>> to link properly (i.e. I can imagine a situation where someone builds,
>>>>>> runs, and tests everything locally, and then is confused as to why
>>>>>> nothing works once deployed).
>>>>>>
>>>>>> In my previous email, I argue that we should include the rpath by
>>>>>> default, and can strip it later if we don't want it for some reason
>>>>>> (i.e. when bundling into debs/rpms).  Conversely, we could leave it
>>>>

Re: Enable compiler optimization by default?

2016-02-17 Thread Kevin Klues
+1

On Wed, Feb 17, 2016 at 4:24 PM, Neil Conway  wrote:
> Hi folks,
>
> At present, Mesos defaults to compiling with "-O0"; to enable compiler
> optimizations, the user needs to specify "--enable-optimize".
>
> I'd like to propose we change the default, for a few reasons:
>
> (1) The autoconf default for CFLAGS/CXXFLAGS is "-O2 -g". Anecdotally,
> I think most software packages compile with a reasonable level of
> optimizations enabled by default.
>
> (2) I think we should make the default configure flags appropriate for
> end-users (rather than Mesos developers): developers will be familiar
> enough with Mesos to tune the configure flags according to their own
> preferences.
>
> (3) The performance consequences of not enabling compiler
> optimizations can be pretty severe: 5x in a benchmark I just ran, and
> we've seen between 2x and 30x (!) performance differences for some
> real-world workloads.
>
> Neil



-- 
~Kevin


Re: Mesos binaries and shared library dependencies.

2016-02-14 Thread Kevin Klues
I was wrong. This unset the rpath in the binary, but left the rpath
set in libmesos.so.  I am going to leave this as an open issue for
now, pending a better solution.
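
For readers following the quoted proposal below, here is a minimal sketch of
the colon-to-flag conversion it describes. The function name to_rpath_flags
and the /opt/... paths are hypothetical, chosen only for illustration; this
is not the code from any review request. One detail worth noting: taken
literally, ${LIBMESOS_RPATH//:/-rpath } turns ":/a/lib:/b/lib" into
"-rpath /a/lib-rpath /b/lib", because nothing separates one directory from
the next flag, so the sketch joins the flags with an explicit space instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Mesos): turn a colon-separated list of
# library directories into a string of "-rpath <dir>" flags.
to_rpath_flags() {
  local paths="$1"
  local flags=""
  local dir
  # Split on ':' only, so directory names are not split on whitespace.
  local IFS=':'
  for dir in $paths; do
    # Skip empty entries, e.g. the one produced by a leading colon.
    [ -n "$dir" ] || continue
    # Separate consecutive flags with a single space.
    flags="${flags:+$flags }-rpath $dir"
  done
  printf '%s\n' "$flags"
}

# Build the list the way the quoted message does: append with a leading colon.
# These install prefixes are assumptions made up for this example.
LIBMESOS_RPATH=""
LIBMESOS_RPATH="$LIBMESOS_RPATH:/opt/zookeeper/lib"
LIBMESOS_RPATH="$LIBMESOS_RPATH:/opt/glog/lib"

to_rpath_flags "$LIBMESOS_RPATH"
```

Run as is, this prints "-rpath /opt/zookeeper/lib -rpath /opt/glog/lib". The
colon-separated variable itself is left untouched, which is what lets it
double as an LD_LIBRARY_PATH-style list as the quoted message points out.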

On Fri, Feb 12, 2016 at 5:03 PM, Kevin Klues  wrote:
> Playing with this a bit more, I came up with a solution that bridges
> the gap between the two alternatives I proposed before. Basically, I
> found a way to get the wrapper script to set the rpath when running
> the binary from the build directory, but leave the rpath of the actual
> binary unchanged. Best of both worlds.
>
> Instead of explicitly setting the global LDFLAGS in configure.ac, I now
> set an intermediate variable called LIBMESOS_RPATH for all non-bundled
> packages:
>
> LIBMESOS_RPATH="$LIBMESOS_RPATH:${with_zookeeper}/lib"
> LIBMESOS_RPATH="$LIBMESOS_RPATH:${with_glog}/lib"
> ...
>
> Then at the bottom of configure.ac, I do:
>
> AC_SUBST(LIBMESOS_RPATH, ["${LIBMESOS_RPATH//:/-rpath }"])
>
> This substitutes all colons in LIBMESOS_RPATH with "-rpath " to form a
> valid string of rpaths for consumption by Makefile.am. I chose to
> format the original LIBMESOS_RPATH string with colons so that it forms
> a valid LD_LIBRARY_PATH as well (plus it's easier to tokenize by ':'
> than by '-rpath ').
>
> Then in Makefile.am I do:
>
> libmesos_la_LDFLAGS += $(LIBMESOS_RPATH)
>
> This sets the rpath for libmesos using *libtool's* LDFLAGS rather than
> setting the global LDFLAGS for all linked objects (as I was doing
> before).
>
> With this change, I can inspect libtool's wrapper scripts to see that indeed
> it patches up the binary with the added rpaths, but libmesos itself
> has no rpath set.
>
> Pending any objections, an RR will be forthcoming.
>
> On Fri, Feb 12, 2016 at 3:31 PM, Kevin Klues  wrote:
>> I like that idea.  Don't set rpath by default, but allow people to
>> specify that it should be set via a flag. How does
>>
>> --set-rpath-for-external-libs
>>
>> sound for a name? Too long?  I don't like just --with-rpath, because
>> it's not descriptive enough in my opinion.
>>
>> On Fri, Feb 12, 2016 at 2:34 PM, Jojy Varghese  wrote:
>>> Maybe have an opt-in (say --with-rpath)?
>>>
>>> -Jojy
>>>
>>>> On Feb 12, 2016, at 2:19 PM, Kevin Klues  wrote:
>>>>
>>>> To be clear, I'm actually a bit torn both ways on this.
>>>>
>>>> On the one hand, including the rpath makes it easy for those who don't
>>>> know anything about LD_LIBRARY_PATH, the ldcache, etc. to simply pass
>>>> their paths to their external dependencies at configure time and then
>>>> run their binaries without further effort.
>>>>
>>>> On the other hand, maybe they should be cognizant of the fact that
>>>> something is going on under the hood to actually allow their binaries
>>>> to link properly (i.e. I can imagine a situation where someone builds,
>>>> runs, and tests everything locally, and then is confused as to why
>>>> nothing works once deployed).
>>>>
>>>> In my previous email, I argue that we should include the rpath by
>>>> default, and can strip it later if we don't want it for some reason
>>>> (i.e. when bundling into debs/rpms).  Conversely, we could leave it
>>>> out by default and only set it as a post-processing step in situations
>>>> where we actually care about it.
>>>>
>>>> I'm curious what other people's thoughts are.
>>>>
>>>>
>>>> On Fri, Feb 12, 2016 at 1:47 PM, Kevin Klues  wrote:
>>>>> Hi all,
>>>>>
>>>>> A discussion came up recently around including rpaths in our mesos
>>>>> binaries to help resolve any shared library dependencies that don't
>>>>> exist in standard library paths (e.g. /lib, /usr/local/lib, etc.).
>>>>>
>>>>> By default, there are no shared library dependencies that exist in
>>>>> non-standard paths, because we bundle all of these dependencies into
>>>>> the mesos source and statically link them into our executables (e.g.
>>>>> glog, zookeeper, etc.)
>>>>>
>>>>> However, if you configure mesos with e.g.
>>>>>
>>>>> ../configure --disable-bundled
>>>>>
>>>>> or the more selective
>>>>>
>>>>> ../configure --with-glog[=DIR] --with-zookeeper[=DIR]  ...
>>>>>
>>>>>

Re: Mesos binaries and shared library dependencies.

2016-02-12 Thread Kevin Klues
Playing with this a bit more, I came up with a solution that bridges
the gap between the two alternatives I proposed before. Basically, I
found a way to get the wrapper script to set the rpath when running
the binary from the build directory, but leave the rpath of the actual
binary unchanged. Best of both worlds.

Instead of explicitly setting the global LDFLAGS in configure.ac, I now
set an intermediate variable called LIBMESOS_RPATH for all non-bundled
packages:

LIBMESOS_RPATH="$LIBMESOS_RPATH:${with_zookeeper}/lib"
LIBMESOS_RPATH="$LIBMESOS_RPATH:${with_glog}/lib"
...

Then at the bottom of configure.ac, I do:

AC_SUBST(LIBMESOS_RPATH, ["${LIBMESOS_RPATH//:/-rpath }"])

This substitutes all colons in LIBMESOS_RPATH with "-rpath " to form a
valid string of rpaths for consumption by Makefile.am. I chose to
format the original LIBMESOS_RPATH string with colons so that it forms
a valid LD_LIBRARY_PATH as well (plus it's easier to tokenize by ':'
than by '-rpath ').
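
A minimal sketch of that substitution in plain POSIX shell, with sed
standing in for the bash-style ${VAR//:/...} expansion. The function name
rpath_flags and the /opt/... prefixes are hypothetical and only serve as
an illustration. One detail worth checking: because each entry is appended
with a leading colon, the replacement text needs a leading space too
(" -rpath "); without it, adjacent entries would run together as
"-rpath /opt/zookeeper/lib-rpath /opt/glog/lib".

```shell
#!/bin/sh
# Hypothetical helper: turn a colon-prefixed path list into "-rpath DIR" flags.
# sed is a portable stand-in for the bash expansion ${VAR//:/ -rpath }.
rpath_flags() {
  printf '%s\n' "$1" | sed 's/:/ -rpath /g'
}

# Build the list the way the configure.ac snippet does: every entry is
# appended with a leading colon (the /opt/... prefixes are made-up examples).
LIBMESOS_RPATH=""
LIBMESOS_RPATH="$LIBMESOS_RPATH:/opt/zookeeper/lib"
LIBMESOS_RPATH="$LIBMESOS_RPATH:/opt/glog/lib"

# Print the flags that would be handed to Makefile.am.
rpath_flags "$LIBMESOS_RPATH"
```

Keeping the colon-separated form as the source of truth means the same
variable (minus the leading colon) can still double as an LD_LIBRARY_PATH.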

Then in Makefile.am I do:

libmesos_la_LDFLAGS += $(LIBMESOS_RPATH)

This sets the rpath for libmesos using *libtool's* LDFLAGS rather than
setting the global LDFLAGS for all linked objects (as I was doing
before).

With this change, I can inspect libtool's wrapper scripts to see that indeed
it patches up the binary with the added rpaths, but libmesos itself
has no rpath set.

Pending any objections, an RR will be forthcoming.

On Fri, Feb 12, 2016 at 3:31 PM, Kevin Klues  wrote:
> I like that idea.  Don't set rpath by default, but allow people to
> specify that it should be set via a flag. How does
>
> --set-rpath-for-external-libs
>
> sound for a name? Too long?  I don't like just --with-rpath, because
> it's not descriptive enough in my opinion.
>
> On Fri, Feb 12, 2016 at 2:34 PM, Jojy Varghese  wrote:
>> Maybe have an opt-in (say --with-rpath)?
>>
>> -Jojy
>>
>>> On Feb 12, 2016, at 2:19 PM, Kevin Klues  wrote:
>>>
>>> To be clear, I'm actually a bit torn both ways on this.
>>>
>>> On the one hand, including the rpath makes it easy for those who don't
>>> know anything about LD_LIBRARY_PATH, the ldcache, etc. to simply pass
>>> their paths to their external dependencies at configure time and then
>>> run their binaries without further effort.
>>>
>>> On the other hand, maybe they should be cognizant of the fact that
>>> something is going on under the hood to actually allow their binaries
>>> to link properly (i.e. I can imagine a situation where someone builds,
>>> runs, and tests everything locally, and then is confused as to why
>>> nothing works once deployed).
>>>
>>> In my previous email, I argue that we should include the rpath by
>>> default, and can strip it later if we don't want it for some reason
>>> (i.e. when bundling into debs/rpms).  Conversely, we could leave it
>>> out by default and only set it as a post-processing step in situations
>>> where we actually care about it.
>>>
>>> I'm curious what other people's thoughts are.
>>>
>>>
>>> On Fri, Feb 12, 2016 at 1:47 PM, Kevin Klues  wrote:
>>>> Hi all,
>>>>
>>>> A discussion came up recently around including rpaths in our mesos
>>>> binaries to help resolve any shared library dependencies that don't
>>>> exist in standard library paths (e.g. /lib, /usr/local/lib, etc.).
>>>>
>>>> By default, there are no shared library dependencies that exist in
>>>> non-standard paths, because we bundle all of these dependencies into
>>>> the mesos source and statically link them into our executables (e.g.
>>>> glog, zookeeper, etc.)
>>>>
>>>> However, if you configure mesos with e.g.
>>>>
>>>> ../configure --disable-bundled
>>>>
>>>> or the more selective
>>>>
>>>> ../configure --with-glog[=DIR] --with-zookeeper[=DIR]  ...
>>>>
>>>> then mesos will be built with an external shared library dependency
>>>> (e.g. glog and zookeeper in this case).
>>>>
>>>> The build system is smart enough to set up LDFLAGS so we can link
>>>> against whatever external libraries are passed in via the --with-*
>>>> flags.
>>>>
>>>> However, when we go to run the binaries that are produced (e.g.
>>>> mesos-master, mesos-slave, mesos-test, etc.), we have to prefix them
>>>> with an LD_LIBRARY_PATH pointing to the location of the shared
>>>> libraries from 

Re: Mesos binaries and shared library dependencies.

2016-02-12 Thread Kevin Klues
I like that idea.  Don't set rpath by default, but allow people to
specify that it should be set via a flag. How does

--set-rpath-for-external-libs

sound for a name? Too long?  I don't like just --with-rpath, because
it's not descriptive enough in my opinion.

On Fri, Feb 12, 2016 at 2:34 PM, Jojy Varghese  wrote:
> Maybe have an opt-in (say --with-rpath)?
>
> -Jojy
>
>> On Feb 12, 2016, at 2:19 PM, Kevin Klues  wrote:
>>
>> To be clear, I'm actually a bit torn both ways on this.
>>
>> On the one hand, including the rpath makes it easy for those who don't
>> know anything about LD_LIBRARY_PATH, the ldcache, etc. to simply pass
>> their paths to their external dependencies at configure time and then
>> run their binaries without further effort.
>>
>> On the other hand, maybe they should be cognizant of the fact that
>> something is going on under the hood to actually allow their binaries
>> to link properly (i.e. I can imagine a situation where someone builds,
>> runs, and tests everything locally, and then is confused as to why
>> nothing works once deployed).
>>
>> In my previous email, I argue that we should include the rpath by
>> default, and can strip it later if we don't want it for some reason
>> (i.e. when bundling into debs/rpms).  Conversely, we could leave it
>> out by default and only set it as a post-processing step in situations
>> where we actually care about it.
>>
>> I'm curious what other people's thoughts are.
>>
>>
>> On Fri, Feb 12, 2016 at 1:47 PM, Kevin Klues  wrote:
>>> Hi all,
>>>
>>> A discussion came up recently around including rpaths in our mesos
>>> binaries to help resolve any shared library dependencies that don't
>>> exist in standard library paths (e.g. /lib, /usr/local/lib, etc.).
>>>
>>> By default, there are no shared library dependencies that exist in
>>> non-standard paths, because we bundle all of these dependencies into
>>> the mesos source and statically link them into our executables (e.g.
>>> glog, zookeeper, etc.)
>>>
>>> However, if you configure mesos with e.g.
>>>
>>> ../configure --disable-bundled
>>>
>>> or the more selective
>>>
>>> ../configure --with-glog[=DIR] --with-zookeeper[=DIR]  ...
>>>
>>> then mesos will be built with an external shared library dependency
>>> (e.g. glog and zookeeper in this case).
>>>
>>> The build system is smart enough to set up LDFLAGS so we can link
>>> against whatever external libraries are passed in via the --with-*
>>> flags.
>>>
>>> However, when we go to run the binaries that are produced (e.g.
>>> mesos-master, mesos-slave, mesos-test, etc.), we have to prefix them
>>> with an LD_LIBRARY_PATH pointing to the location of the shared
>>> libraries from these external dependencies, e.g.
>>>
>>> LD_LIBRARY_PATH="/glog/lib:/zookeeper/lib" ./mesos-master
>>>
>>> It would be nice if we didn't have to explicitly set the
>>> LD_LIBRARY_PATH to launch these binaries when linking against any
>>> external shared library dependencies.
>>>
>>> One way around this would be to make sure that all external library
>>> dependencies were stored in standard search paths for the dynamic
>>> linker. This is typically what happens if you install these
>>> dependencies via a standard package manager (e.g. apt-get, yum, etc.).
>>> Sometimes this is undesirable (or impossible) though, especially if
>>> the external dependencies do not exist as packages or follow a
>>> non-standard directory hierarchy in terms of where they place their
>>> include files, libraries, etc.
>>>
>>> Another option is to install the paths to these external libraries
>>> into the ldcache (e.g. via /etc/ld.so.conf on linux) so that the
>>> dynamic linker will search them at runtime.  This is also unfeasible
>>> at times and has the added disadvantage that these library paths will
>>> now be searched for *all* binaries that get executed (not just the
>>> ones we currently care about).
>>>
>>> The final option (and the one I'm proposing here) is to set the
>>> 'rpath' of the binary to point to the location of the external shared
>>> library on the build machine.  The rpath is embedded into the binary
>>> at link time and is used to give the linker an extra set of paths to
>>> search for shared libraries at runtime.  Th

Re: Mesos binaries and shared library dependencies.

2016-02-12 Thread Kevin Klues
To be clear, I'm actually a bit torn both ways on this.

On the one hand, including the rpath makes it easy for those who don't
know anything about LD_LIBRARY_PATH, the ldcache, etc. to simply pass
their paths to their external dependencies at configure time and then
run their binaries without further effort.

On the other hand, maybe they should be cognizant of the fact that
something is going on under the hood to actually allow their binaries
to link properly (i.e. I can imagine a situation where someone builds,
runs, and tests everything locally, and then is confused as to why
nothing works once deployed).

In my previous email, I argue that we should include the rpath by
default, and can strip it later if we don't want it for some reason
(i.e. when bundling into debs/rpms).  Conversely, we could leave it
out by default and only set it as a post-processing step in situations
where we actually care about it.

I'm curious what other people's thoughts are.


On Fri, Feb 12, 2016 at 1:47 PM, Kevin Klues  wrote:
> Hi all,
>
> A discussion came up recently around including rpaths in our mesos
> binaries to help resolve any shared library dependencies that don't
> exist in standard library paths (e.g. /lib, /usr/local/lib, etc.).
>
> By default, there are no shared library dependencies that exist in
> non-standard paths, because we bundle all of these dependencies into
> the mesos source and statically link them into our executables (e.g.
> glog, zookeeper, etc.)
>
> However, if you configure mesos with e.g.
>
> ../configure --disable-bundled
>
> or the more selective
>
> ../configure --with-glog[=DIR] --with-zookeeper[=DIR]  ...
>
> then mesos will be built with an external shared library dependency
> (e.g. glog and zookeeper in this case).
>
> The build system is smart enough to set up LDFLAGS so we can link
> against whatever external libraries are passed in via the --with-*
> flags.
>
> However, when we go to run the binaries that are produced (e.g.
> mesos-master, mesos-slave, mesos-test, etc.), we have to prefix them
> with an LD_LIBRARY_PATH pointing to the location of the shared
> libraries from these external dependencies, e.g.
>
> LD_LIBRARY_PATH="/glog/lib:/zookeeper/lib" ./mesos-master
>
> It would be nice if we didn't have to explicitly set the
> LD_LIBRARY_PATH to launch these binaries when linking against any
> external shared library dependencies.
>
> One way around this would be to make sure that all external library
> dependencies were stored in standard search paths for the dynamic
> linker. This is typically what happens if you install these
> dependencies via a standard package manager (e.g. apt-get, yum, etc.).
> Sometimes this is undesirable (or impossible) though, especially if
> the external dependencies do not exist as packages or follow a
> non-standard directory hierarchy in terms of where they place their
> include files, libraries, etc.
>
> Another option is to install the paths to these external libraries
> into the ldcache (e.g. via /etc/ld.so.conf on linux) so that the
> dynamic linker will search them at runtime.  This is also unfeasible
> at times and has the added disadvantage that these library paths will
> now be searched for *all* binaries that get executed (not just the
> ones we currently care about).
>
> The final option (and the one I'm proposing here) is to set the
> 'rpath' of the binary to point to the location of the external shared
> library on the build machine.  The rpath is embedded into the binary
> at link time and is used to give the linker an extra set of paths to
> search for shared libraries at runtime.  The obvious advantage here
> is that setting rpath allows us to run our binaries without requiring
> LD_LIBRARY_PATH or any of the other methods mentioned above to tell
> the linker where to find our shared libraries.  However, it has the
> disadvantage of baking a path into the binary that may only exist on
> the specific machine the binary was built on.
>
> That said, the standard search order used by the dynamic linker to
> find shared libraries is:
>
> 1) LD_LIBRARY_PATH
> 2) rpath
> 3) the ldcache (/etc/ld.so.conf on linux)
> 4) default paths (e.g. /lib, /usr/local/lib)
>
> Meaning that we could always override the rpath using LD_LIBRARY_PATH
> if we wanted to.  Moreover, we could even change the rpath at the time
> of deployment (e.g. via chrpath on linux). This may be desirable if
> the shared libraries are installed at different locations on the
> deployment machine.
>
> If there are no objections, I therefore propose we modify the
> following files to add rpaths to all external dependencies set via
> --with-* flags:
>
> ./c

Mesos binaries and shared library dependencies.

2016-02-12 Thread Kevin Klues
Hi all,

A discussion came up recently around including rpaths in our mesos
binaries to help resolve any shared library dependencies that don't
exist in standard library paths (e.g. /lib, /usr/local/lib, etc.).

By default, there are no shared library dependencies that exist in
non-standard paths, because we bundle all of these dependencies into
the mesos source and statically link them into our executables (e.g.
glog, zookeeper, etc.)

However, if you configure mesos with e.g.

../configure --disable-bundled

or the more selective

../configure --with-glog[=DIR] --with-zookeeper[=DIR]  ...

then mesos will be built with an external shared library dependency
(e.g. glog and zookeeper in this case).

The build system is smart enough to set up LDFLAGS so we can link
against whatever external libraries are passed in via the --with-*
flags.

However, when we go to run the binaries that are produced (e.g.
mesos-master, mesos-slave, mesos-test, etc.), we have to prefix them
with an LD_LIBRARY_PATH pointing to the location of the shared
libraries from these external dependencies, e.g.

LD_LIBRARY_PATH="/glog/lib:/zookeeper/lib" ./mesos-master

It would be nice if we didn't have to explicitly set the
LD_LIBRARY_PATH to launch these binaries when linking against any
external shared library dependencies.
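
A minimal sketch of how one might see which dependencies the dynamic
linker cannot resolve without LD_LIBRARY_PATH, by filtering the output of
ldd. The helper name unresolved_libs is hypothetical, and the sketch
assumes glibc-style "=> not found" lines; other platforms format ldd
output differently.

```shell
#!/bin/sh
# Hypothetical helper: read ldd output on stdin and print the name of
# every library reported as "not found" (assumes glibc-style ldd output).
unresolved_libs() {
  awk '/not found/ { print $1 }'
}

# Only run against a real binary if one exists in the current directory.
if [ -x ./mesos-master ]; then
  ldd ./mesos-master | unresolved_libs
fi
```

An empty result means every shared library was found through the
linker's current search paths.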

One way around this would be to make sure that all external library
dependencies were stored in standard search paths for the dynamic
linker. This is typically what happens if you install these
dependencies via a standard package manager (e.g. apt-get, yum, etc.).
Sometimes this is undesirable (or impossible) though, especially if
the external dependencies do not exist as packages or follow a
non-standard directory hierarchy in terms of where they place their
include files, libraries, etc.

Another option is to install the paths to these external libraries
into the ldcache (e.g. via /etc/ld.so.conf on linux) so that the
dynamic linker will search them at runtime.  This is also unfeasible
at times and has the added disadvantage that these library paths will
now be searched for *all* binaries that get executed (not just the
ones we currently care about).

The final option (and the one I'm proposing here) is to set the
'rpath' of the binary to point to the location of the external shared
library on the build machine.  The rpath is embedded into the binary
at link time and is used to give the linker an extra set of paths to
search for shared libraries at runtime.  The obvious advantage here
is that setting rpath allows us to run our binaries without requiring
LD_LIBRARY_PATH or any of the other methods mentioned above to tell
the linker where to find our shared libraries.  However, it has the
disadvantage of baking a path into the binary that may only exist on
the specific machine the binary was built on.

That said, the standard search order used by the dynamic linker to
find shared libraries is:

1) LD_LIBRARY_PATH
2) rpath
3) the ldcache (/etc/ld.so.conf on linux)
4) default paths (e.g. /lib, /usr/local/lib)

Meaning that we could always override the rpath using LD_LIBRARY_PATH
if we wanted to.  Moreover, we could even change the rpath at the time
of deployment (e.g. via chrpath on linux). This may be desirable if
the shared libraries are installed at different locations on the
deployment machine.
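
A minimal sketch of inspecting the embedded path at deployment time,
assuming readelf (from binutils) is installed; the helper name
rpath_entries and the /deploy/... paths are hypothetical. One caveat worth
checking against the ld.so man page: LD_LIBRARY_PATH takes precedence only
when the path is recorded as DT_RUNPATH. A DT_RPATH entry (with no
DT_RUNPATH present) is searched before LD_LIBRARY_PATH, so which tag the
linker emitted matters.

```shell
#!/bin/sh
# Hypothetical helper: read `readelf -d` output on stdin and keep only the
# dynamic-section entries that carry an rpath or runpath.
rpath_entries() {
  grep -E '\((RPATH|RUNPATH)\)'
}

# Only run against a real binary if one exists in the current directory.
if [ -x ./mesos-master ]; then
  readelf -d ./mesos-master | rpath_entries
fi

# Rewriting the path with chrpath would then look like this
# (the /deploy/... locations are made-up examples):
#   chrpath -r /deploy/glog/lib:/deploy/zookeeper/lib ./mesos-master
```

Seeing "(RPATH)" versus "(RUNPATH)" in the output tells you which of the
two lookup behaviors applies to a given binary.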

If there are no objections, I therefore propose we modify the
following files to add rpaths to all external dependencies set via
--with-* flags:

./configure.ac
./3rdparty/libprocess/3rdparty/stout/configure.ac
./3rdparty/libprocess/configure.ac

The pattern would change from:

CPPFLAGS="-I${with_thing}/include $CPPFLAGS"
LDFLAGS="-L${with_thing}/lib $LDFLAGS"

to include an additional line with:

LDFLAGS="-Wl,-rpath,${with_thing}/lib $LDFLAGS"
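
A rough sketch of how the three lines fit together for one dependency.
The function name add_external_dep and the /opt/glog prefix are
hypothetical and are not part of the Mesos build files.

```shell
#!/bin/sh
# Hypothetical helper mirroring the proposed configure.ac pattern:
# prepend include, link, and rpath flags for one external prefix.
add_external_dep() {
  with_thing="$1"
  CPPFLAGS="-I${with_thing}/include $CPPFLAGS"
  LDFLAGS="-L${with_thing}/lib $LDFLAGS"
  # -Wl, hands the comma-separated -rpath option through to the linker.
  LDFLAGS="-Wl,-rpath,${with_thing}/lib $LDFLAGS"
}

CPPFLAGS=""
LDFLAGS=""
add_external_dep /opt/glog   # made-up example prefix
echo "CPPFLAGS=$CPPFLAGS"
echo "LDFLAGS=$LDFLAGS"
```

Each external prefix contributes one -I, one -L, and one -rpath entry, so
the rpath always matches the directory the library was linked from.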

I know there has been some hesitation with this in the past (especially
when it comes to producing rpms or debs, where baking in an rpath
seems really strange), but I'm hoping people will agree that it's
useful enough that it makes sense to include the rpaths as the default
case.  We can always run a post-processing step to strip them in cases
where they are undesirable.

Thanks!

--
~Kevin


Re: Inconsistent naming of support scripts

2016-02-11 Thread Kevin Klues
I typically think of files having dashes as binaries or scripts that
are runnable, whereas files with underscores are meant as source or
otherwise supplementary to the binary produced (e.g. a supplementary
python library that the main python program imports).  I'm not sure
where I inherited this convention from, but it's always been the way
I've done things.

As far as our code base goes, we seem to use this convention as well
with our mesos-master.sh, mesos-slave.sh, etc. binaries.

On Thu, Feb 11, 2016 at 2:17 PM, Vinod Kone  wrote:
> Why hyphens? Most of the files in our repo use underscores. I would like us
> to be consistent on how we name files in the repo.
>
> On Thu, Feb 11, 2016 at 1:40 PM, Kevin Klues  wrote:
>
>> I prefer hyphens as well
>>
>> On Thu, Feb 11, 2016 at 1:28 PM, Jojy Varghese  wrote:
>> > hyphen++. Is google friendly apparently.  Also less keys to press :)
>> >
>> > -Jojy
>> >
>> >
>> >
>> >> On Feb 11, 2016, at 12:43 PM, Greg Mann  wrote:
>> >>
>> >> +1
>> >>
>> >> On Thu, Feb 11, 2016 at 11:41 AM, Vinod Kone wrote:
>> >>
>> >>> Some of the scripts in the "support" directory have dashes ("-") in their
>> >>> names (e.g., apply-review.sh, apply-reviews.py), whereas some have
>> >>> underscores ("_") (e.g., docker_build.sh, mesos_split.py).
>> >>>
>> >>> This is really confusing and we should stick with one style. I propose to
>> >>> change all of them to use underscores. I will make sure the CI jobs are
>> >>> updated accordingly.
>> >>>
>> >>> Any objections?
>> >>>
>> >>> Thanks,
>> >>> Vinod
>> >>>
>> >
>>
>>
>>
>> --
>> ~Kevin
>>



-- 
~Kevin


Re: Inconsistent naming of support scripts

2016-02-11 Thread Kevin Klues
I prefer hyphens as well

On Thu, Feb 11, 2016 at 1:28 PM, Jojy Varghese  wrote:
> hyphen++. Is google friendly apparently.  Also less keys to press :)
>
> -Jojy
>
>
>
>> On Feb 11, 2016, at 12:43 PM, Greg Mann  wrote:
>>
>> +1
>>
>> On Thu, Feb 11, 2016 at 11:41 AM, Vinod Kone  wrote:
>>
>>> Some of the scripts in the "support" directory have dashes ("-") in their
>>> names (e.g., apply-review.sh, apply-reviews.py), whereas some have
>>> underscores ("_") (e.g., docker_build.sh, mesos_split.py).
>>>
>>> This is really confusing and we should stick with one style. I propose to
>>> change all of them to use underscores. I will make sure the CI jobs are
>>> updated accordingly.
>>>
>>> Any objections?
>>>
>>> Thanks,
>>> Vinod
>>>
>



-- 
~Kevin


Re: Core affinity in Mesos

2016-01-29 Thread Kevin Klues
I agree. "Isolation" on its own is too broad a term. However, since
we are talking mostly about reducing interference, which typically
implies performance isolation, my vote for the group name is the
"Performance Isolation Working Group".

On Fri, Jan 29, 2016 at 11:22 AM, Benjamin Mahler  wrote:
> Since "Isolation" applies broadly outside of the context of addressing
> latency sensitive workloads (e.g. user/pid/network namespacing,
> resource limitations (e.g. cpu quota, memory limits, gpu device visibility)), it
> would be great to choose a more specific name. Some suggestions:
> interference, performance-related isolation, colocation, latency
> sensitivity.
>
> Thoughts?
>
> Looking forward to seeing the discussions here!
>
> Ben
>
> On Friday, January 22, 2016, Nielsen, Niklas 
> wrote:
>
>> Hi everyone,
>>
>> We have been talking about core affinity in Mesos for a while, and Ian D.
>> has recently been giving this topic thought in his ‘exclusive resources’
>> proposal [1].
>> If we want to avoid overly conservative placements, latency-critical
>> workloads are at risk without it.
>> We are interested in the topic through our work on oversubscription in
>> Serenity [2], as the point of oversubscription was exactly to be able to
>> colocate latency-critical and best-effort batch jobs.
>> We had an informal meeting yesterday, going over the proposal and trying
>> to get some cadence behind the capability.
>>
>> It is a tricky but exciting topic:
>>  - How do we avoid making task launch even more complex? How do we express
>> the topology and acquire parts of it. Do we use hints on the affinity
>> properties instead?
>>  - How do we mix pinned with normal ‘floating’ tasks.
>>  - How do we convey information to the resource estimator about the task
>> sensitivity.
>>
>> Note, above list not meant for inlined discussion or answers. Let’s
>> collect feedback on the proposals themselves.
>>
>> Here are our proposed next steps:
>>  - We are going to use the ‘Isolation Working Group’ as an umbrella for
>> this. I will fill in details and members.
>>  - We will schedule an online meeting for Wednesday 9AM PST next week to
>> discuss next steps. I will share a hangout link when we get closer.
>>  - Plan being, getting to designs (maybe more than one) we agree on and
>> then scope out and distribute the work needed to be done.
>>
>> Whoever is interested, join us. The use cases for this work are critical.
>> Maybe we can even work on some representative workloads we can verify our
>> proposal against.
>>
>> Cheers,
>> Niklas
>>
>> PS For comments on the proposal itself, please refer to Ian’s thread for
>> the dev list [3].
>>
>> [1] https://issues.apache.org/jira/browse/MESOS-4138
>> [2] https://github.com/mesosphere/serenity
>> [3] https://www.mail-archive.com/dev%40mesos.apache.org/msg33892.html
>>



-- 
~Kevin


Re: Follow up on the proposal for simulation tools for master and allocator

2016-01-21 Thread Kevin Klues
Count me in as well.

On Thu, Jan 21, 2016 at 11:13 AM, Neil Conway  wrote:
> Hi Zhitao,
>
> There's a JIRA here:
>
> https://issues.apache.org/jira/browse/MESOS-3855
>
> A few people who are interested in simulation of Mesos have been
> meeting periodically, although due to the holidays we haven't had a
> meeting in a little bit. I'll make sure you're included in the next
> meeting when we get it scheduled.
>
> Thanks,
> Neil
>
> On Wed, Jan 20, 2016 at 11:14 AM, Zhitao Li  wrote:
>> Hi,
>>
>> I saw a message from last year
>> (http://www.mail-archive.com/dev%40mesos.apache.org/msg33342.html) about a
>> proposal for simulation tools. Has it been formalized as a JIRA issue so
>> interested parties can subscribe and contribute design ideas?
>>
>> Thanks.



-- 
~Kevin


Re: .gitignore-template

2016-01-20 Thread Kevin Klues
+1 for Consistency!

As a side note, I add custom .gitignore stuff in a global .gitignore
file I install at ~/.gitignore.  This is useful for ignoring things
specific to editor temporary files (e.g. *.swo in vim), etc.

You can make git aware of it via:
$ git config --global core.excludesfile ~/.gitignore
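
A minimal sketch of populating such a file. The helper name
append_ignore_patterns and the choice of patterns (vim swap files) are
only examples; the helper appends rather than overwrites, so existing
entries are kept.

```shell
#!/bin/sh
# Hypothetical helper: append editor-specific ignore patterns to the file
# given as $1 (for instance ~/.gitignore).
append_ignore_patterns() {
  printf '%s\n' '*.swo' '*.swp' >> "$1"
}

# Example invocation, left commented out so nothing is modified by accident:
#   append_ignore_patterns "$HOME/.gitignore"
```

Keeping editor-specific patterns in the global file leaves the
repository's own .gitignore free of per-developer entries.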

On Wed, Jan 20, 2016 at 3:45 PM, Michael Park  wrote:
> We have a few other default templates such as `support/clang-format` and
> `support/reviewboardrc`, and `bootstrap` symlinks them to `.clang-format`
> and `.reviewboardrc` respectively.
>
> To keep this pattern consistent, I would like to move the
> `.gitignore-template` template to `support/gitignore` and have `bootstrap`
> symlink it to `.gitignore`.
>
> Please let me know if you're opposed to this change.
>
> Thanks!
>
> MPark.



-- 
~Kevin


Re: Links in documentation

2016-01-14 Thread Kevin Klues
So by 2 you mean relative links to the *.md files, vs. 3, which is
absolute (from the repo's topdir)?
If so, +1

On Thu, Jan 14, 2016 at 11:39 AM, Joris Van Remoortere
 wrote:
>>
>> *In fact it seems that all links ending with .md are interpreted as
>> relative links on the webpage, i.e. [label](https://test.com/foo.md) is
>> rendered into <a href="https://test.com/foo/">label</a>.
>
>
> I think this should be fixed. We shouldn't be restricted from linking to
> external documentation.
>
> —
> *Joris Van Remoortere*
> Mesosphere
>
> On Thu, Jan 14, 2016 at 4:44 AM, Jörg Schad  wrote:
>
>> Hi,
>> just a short note about links in our documentation.
>> In the current documentation we use three different ways to link between
>> different *.md pages:
>> 1. [label](/documentation/latest/foo/)
>> 2. [label](foo.md)
>> 3. [label](https://github.com/apache/mesos/blob/master/docs/foo.md)
>>
>> First of all, option 3 should *not* be used as it is rendered incorrectly
>> on the website* and is rather long.
>>
>> Between option 1 and 2 Neil and myself discussed (MESOS-4295) and are
>> favoring option 2 as it
>> - previews better on github
>> - is shorter
>> - is easier to maintain multiple versions of the same doc.
>>
>> Any comments or objections?
>>
>> Thanks for your feedback!
>>
>> *In fact it seems that all links ending with .md are interpreted as
>> relative links on the webpage, i.e. [label](https://test.com/foo.md) is
>> rendered into <a href="https://test.com/foo/">label</a>.
>>



-- 
~Kevin


Re: Jenkins builds failing for CentOS 7

2015-12-16 Thread Kevin Klues
I filed a JIRA for this:

https://issues.apache.org/jira/browse/MESOS-4184

On Wed, Dec 16, 2015 at 1:59 PM, Kevin Klues  wrote:
> Verified that both solutions work. That is, either install
> java-1.8.0-openjdk-devel instead of
> java-1.7.0-openjdk-devel, or move things around such that maven is
> installed AFTER we install java-1.7.0-openjdk-devel.
>
> Which one is preferred?  I will put together a patch.
>
> Solution 1:
> --- a/support/docker_build.sh
> +++ b/support/docker_build.sh
> @@ -40,7 +40,7 @@ case $OS in
>  append_dockerfile "RUN yum groupinstall -y 'Development Tools'"
>  append_dockerfile "RUN yum install -y epel-release" # Needed for clang.
>  append_dockerfile "RUN yum install -y clang git maven"
> -append_dockerfile "RUN yum install -y java-1.7.0-openjdk-devel
> python-devel zlib-devel libcurl-devel openssl-devel cyrus-sasl-devel
> cyrus-sasl-md5 apr-devel subversion-devel apr-utils-devel
> libevent-devel libev-devel"
> +append_dockerfile "RUN yum install -y java-1.8.0-openjdk-devel
> python-devel zlib-devel libcurl-devel openssl-devel cyrus-sasl-devel
> cyrus-sasl-md5 apr-devel subversion-devel apr-utils-devel
> libevent-devel libev-devel"
>
>  # Add an unprivileged user.
>  append_dockerfile "RUN adduser mesos"
>
>
> Solution 2:
> diff --git a/support/docker_build.sh b/support/docker_build.sh
> index c14370d..7058258 100755
> --- a/support/docker_build.sh
> +++ b/support/docker_build.sh
> @@ -39,8 +39,8 @@ case $OS in
>
>  append_dockerfile "RUN yum groupinstall -y 'Development Tools'"
>  append_dockerfile "RUN yum install -y epel-release" # Needed for clang.
> -append_dockerfile "RUN yum install -y clang git maven"
>  append_dockerfile "RUN yum install -y java-1.7.0-openjdk-devel python-devel zlib-devel libcurl-devel openssl-devel cyrus-sasl-devel cyrus-sasl-md5 apr-devel subversion-devel apr-utils-devel libevent-devel libev-devel"
> +append_dockerfile "RUN yum install -y clang git maven"
>
>  # Add an unprivileged user.
>  append_dockerfile "RUN adduser mesos"
>
> On Wed, Dec 16, 2015 at 1:34 PM, Kevin Klues  wrote:
>> I'm assuming this is the script used for building the image run by Jenkins:
>>
>> support/docker_build.sh
>>
>> If so, I can verify that this is what is causing the failure.  Running
>> manually on my local machine:
>>
>> CONFIGURATION="--verbose --enable-libevent --enable-ssl"
>> COMPILER="gcc" OS="centos:7" support/docker_build.sh
>>
>> results in the same failure.
>>
>> And it stems from the following packages being installed:
>>
>>  java-1.7.0-openjdk           x86_64  1:1.7.0.91-2.6.2.3.el7  base  207 k
>>  java-1.7.0-openjdk-devel     x86_64  1:1.7.0.91-2.6.2.3.el7  base  9.2 M
>>  java-1.7.0-openjdk-headless  x86_64  1:1.7.0.91-2.6.2.3.el7  base   25 M
>>  java-1.8.0-openjdk           x86_64  1:1.8.0.65-3.b17.el7    base  215 k
>>  java-1.8.0-openjdk-headless  x86_64  1:1.8.0.65-3.b17.el7    base   31 M
>>
>> We either need to install java-1.8.0-openjdk-devel instead of
>> java-1.7.0-openjdk-devel, or move things around such that maven is
>> installed AFTER we install java-1.7.0-openjdk-devel so that its
>> dependence on java doesn't automatically pull in java-1.8.0-openjdk.
>>
>> On Wed, Dec 16, 2015 at 1:11 PM, Kevin Klues  wrote:
>>> Hey all,
>>>
>>> Jenkins builds are now consistently failing for CentOS 7, with the failure:
>>>
>>> checking value of Java system property 'java.home'...
>>> /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.65-3.b17.el7.x86_64/jre
>>> configure: error: could not guess JAVA_HOME
>>>
>>> I ran into this problem a few days ago while building a vagrant box
>>> for centos 7.  The problem is summarized in this review request:
>>>
>>> https://reviews.apache.org/r/41371/
>>>
>>> I'm not familiar with how the docker images for the build are
>>> launched, but it seems they need to be updated in a manner similar to
>>> what I propose in the updated getting started guide in the review.
>>>
>>> --
>>> ~Kevin
>>
>>
>>
>> --
>> ~Kevin
>
>
>
> --
> ~Kevin



-- 
~Kevin


Re: Jenkins builds failing for CentOS 7

2015-12-16 Thread Kevin Klues
Verified that both solutions work. That is, either install
java-1.8.0-openjdk-devel instead of
java-1.7.0-openjdk-devel, or move things around such that maven is
installed AFTER we install java-1.7.0-openjdk-devel.

Which one is preferred?  I will put together a patch.

Solution 1:
--- a/support/docker_build.sh
+++ b/support/docker_build.sh
@@ -40,7 +40,7 @@ case $OS in
 append_dockerfile "RUN yum groupinstall -y 'Development Tools'"
 append_dockerfile "RUN yum install -y epel-release" # Needed for clang.
 append_dockerfile "RUN yum install -y clang git maven"
-append_dockerfile "RUN yum install -y java-1.7.0-openjdk-devel python-devel zlib-devel libcurl-devel openssl-devel cyrus-sasl-devel cyrus-sasl-md5 apr-devel subversion-devel apr-utils-devel libevent-devel libev-devel"
+append_dockerfile "RUN yum install -y java-1.8.0-openjdk-devel python-devel zlib-devel libcurl-devel openssl-devel cyrus-sasl-devel cyrus-sasl-md5 apr-devel subversion-devel apr-utils-devel libevent-devel libev-devel"

 # Add an unprivileged user.
 append_dockerfile "RUN adduser mesos"


Solution 2:
diff --git a/support/docker_build.sh b/support/docker_build.sh
index c14370d..7058258 100755
--- a/support/docker_build.sh
+++ b/support/docker_build.sh
@@ -39,8 +39,8 @@ case $OS in

 append_dockerfile "RUN yum groupinstall -y 'Development Tools'"
 append_dockerfile "RUN yum install -y epel-release" # Needed for clang.
-append_dockerfile "RUN yum install -y clang git maven"
 append_dockerfile "RUN yum install -y java-1.7.0-openjdk-devel python-devel zlib-devel libcurl-devel openssl-devel cyrus-sasl-devel cyrus-sasl-md5 apr-devel subversion-devel apr-utils-devel libevent-devel libev-devel"
+append_dockerfile "RUN yum install -y clang git maven"

 # Add an unprivileged user.
 append_dockerfile "RUN adduser mesos"
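
Solution 2 only works while the JDK install line stays ahead of the maven line, so a small ordering check may be worth having. The following is a minimal sketch, not part of support/docker_build.sh: the function name jdk_before_maven is made up for illustration, and it assumes the generated Dockerfile text is available on stdin.

```shell
#!/bin/sh
# Hypothetical helper (not in the Mesos tree): reads Dockerfile text on stdin
# and succeeds only if the first line mentioning a java-*-openjdk-devel
# package appears before the first line mentioning maven.
jdk_before_maven() {
    awk '
        /java-[0-9.]+-openjdk-devel/ && !jdk { jdk = NR }  # first JDK line
        /maven/ && !mvn { mvn = NR }                       # first maven line
        END { exit !(jdk && mvn && jdk < mvn) }
    '
}
```

If both packages ever end up on the same RUN line the check fails, since the ordering between them is then left to yum's dependency resolution.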

On Wed, Dec 16, 2015 at 1:34 PM, Kevin Klues  wrote:
> I'm assuming this is the script used for building the image run by Jenkins:
>
> support/docker_build.sh
>
> If so, I can verify that this is what is causing the failure.  Running
> manually on my local machine:
>
> CONFIGURATION="--verbose --enable-libevent --enable-ssl"
> COMPILER="gcc" OS="centos:7" support/docker_build.sh
>
> results in the same failure.
>
> And it stems from the following packages being installed:
>
>  java-1.7.0-openjdk           x86_64  1:1.7.0.91-2.6.2.3.el7  base  207 k
>  java-1.7.0-openjdk-devel     x86_64  1:1.7.0.91-2.6.2.3.el7  base  9.2 M
>  java-1.7.0-openjdk-headless  x86_64  1:1.7.0.91-2.6.2.3.el7  base   25 M
>  java-1.8.0-openjdk           x86_64  1:1.8.0.65-3.b17.el7    base  215 k
>  java-1.8.0-openjdk-headless  x86_64  1:1.8.0.65-3.b17.el7    base   31 M
>
> We either need to install java-1.8.0-openjdk-devel instead of
> java-1.7.0-openjdk-devel, or move things around such that maven is
> installed AFTER we install java-1.7.0-openjdk-devel so that its
> dependence on java doesn't automatically pull in java-1.8.0-openjdk.
>
> On Wed, Dec 16, 2015 at 1:11 PM, Kevin Klues  wrote:
>> Hey all,
>>
>> Jenkins builds are now consistently failing for CentOS 7, with the failure:
>>
>> checking value of Java system property 'java.home'...
>> /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.65-3.b17.el7.x86_64/jre
>> configure: error: could not guess JAVA_HOME
>>
>> I ran into this problem a few days ago while building a vagrant box
>> for centos 7.  The problem is summarized in this review request:
>>
>> https://reviews.apache.org/r/41371/
>>
>> I'm not familiar with how the docker images for the build are
>> launched, but it seems they need to be updated in a manner similar to
>> what I propose in the updated getting started guide in the review.
>>
>> --
>> ~Kevin
>
>
>
> --
> ~Kevin



-- 
~Kevin


Re: Jenkins builds failing for CentOS 7

2015-12-16 Thread Kevin Klues
I'm assuming this is the script used for building the image run by Jenkins:

support/docker_build.sh

If so, I can verify that this is what is causing the failure.  Running
manually on my local machine:

CONFIGURATION="--verbose --enable-libevent --enable-ssl"
COMPILER="gcc" OS="centos:7" support/docker_build.sh

results in the same failure.

And it stems from the following packages being installed:

 java-1.7.0-openjdk           x86_64  1:1.7.0.91-2.6.2.3.el7  base  207 k
 java-1.7.0-openjdk-devel     x86_64  1:1.7.0.91-2.6.2.3.el7  base  9.2 M
 java-1.7.0-openjdk-headless  x86_64  1:1.7.0.91-2.6.2.3.el7  base   25 M
 java-1.8.0-openjdk           x86_64  1:1.8.0.65-3.b17.el7    base  215 k
 java-1.8.0-openjdk-headless  x86_64  1:1.8.0.65-3.b17.el7    base   31 M

We either need to install java-1.8.0-openjdk-devel instead of
java-1.7.0-openjdk-devel, or move things around such that maven is
installed AFTER we install java-1.7.0-openjdk-devel so that its
dependence on java doesn't automatically pull in java-1.8.0-openjdk.
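
The package list above shows the pattern to look for: a JDK version whose runtime package is installed without the matching -devel package. A minimal sketch of a check for that, with the caveat that jdks_missing_devel is a made-up helper name and the input is assumed to be one package name per line:

```shell
#!/bin/sh
# Hypothetical helper: reads package names (one per line) on stdin and prints
# every OpenJDK version that has a runtime package (java-X-openjdk) but no
# matching java-X-openjdk-devel package.
jdks_missing_devel() {
    pkgs=$(cat)
    for version in $(printf '%s\n' "$pkgs" |
            sed -n 's/^java-\([0-9.]*\)-openjdk$/\1/p' | sort -u); do
        if ! printf '%s\n' "$pkgs" | grep -qxF "java-${version}-openjdk-devel"; then
            printf '%s\n' "$version"
        fi
    done
}

# Possible usage inside the CentOS 7 image (assumes rpm is available there):
#   rpm -qa --qf '%{NAME}\n' 'java-*' | jdks_missing_devel
```

For the five packages listed above this would flag 1.8.0, which is the version maven pulled in.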

On Wed, Dec 16, 2015 at 1:11 PM, Kevin Klues  wrote:
> Hey all,
>
> Jenkins builds are now consistently failing for CentOS 7, with the failure:
>
> checking value of Java system property 'java.home'...
> /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.65-3.b17.el7.x86_64/jre
> configure: error: could not guess JAVA_HOME
>
> I ran into this problem a few days ago while building a vagrant box
> for centos 7.  The problem is summarized in this review request:
>
> https://reviews.apache.org/r/41371/
>
> I'm not familiar with how the docker images for the build are
> launched, but it seems they need to be updated in a manner similar to
> what I propose in the updated getting started guide in the review.
>
> --
> ~Kevin



-- 
~Kevin


Jenkins builds failing for CentOS 7

2015-12-16 Thread Kevin Klues
Hey all,

Jenkins builds are now consistently failing for CentOS 7, with the failure:

checking value of Java system property 'java.home'...
/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.65-3.b17.el7.x86_64/jre
configure: error: could not guess JAVA_HOME

I ran into this problem a few days ago while building a vagrant box
for centos 7.  The problem is summarized in this review request:

https://reviews.apache.org/r/41371/

I'm not familiar with how the docker images for the build are
launched, but it seems they need to be updated in a manner similar to
what I propose in the updated getting started guide in the review.
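
To make the failure mode concrete, here is a rough sketch of the kind of guess that can go wrong when java.home points at a JRE-only install. This is not the logic in Mesos's configure script, which has not been checked here; guess_jdk_home is a made-up name, and the assumption is simply that a usable JAVA_HOME is the parent of the jre directory and contains bin/javac.

```shell
#!/bin/sh
# Hypothetical sketch (NOT the actual configure logic): given the value of the
# java.home system property, print the JDK root if it looks like a full JDK.
guess_jdk_home() {
    java_home_prop=$1
    candidate=${java_home_prop%/jre}      # strip a trailing /jre, if present
    if [ -x "$candidate/bin/javac" ]; then
        printf '%s\n' "$candidate"
        return 0
    fi
    return 1                              # JRE only: nothing sensible to guess
}
```

Under that assumption, the java-1.8.0-openjdk path in the error above would fail the check, because only the runtime package for 1.8.0 is installed and the compiler lives in the -devel package.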

-- 
~Kevin