Re: Official Mesos/Marathon Docker images - please share your thoughts/plans

2016-01-08 Thread Marek Zawadzki

Adam - thank you.
Issue filed (https://github.com/mesosphere/docker-containers/issues/37).

-marek

On 08.01.2016 01:41, Adam Bordelon wrote:

Officially, Apache prefers to distribute source instead of binaries, which
is why Mesosphere volunteered to build and distribute rpm/deb packages as
well as docker images. Apache Mesos could own the Docker files for Mesos
though.
That said, if you have problems with the Mesosphere (or mesoscloud) docker
images for Mesos/Marathon, file an issue with the source repo, and we'd be
happy to address your issues.
https://github.com/mesosphere/docker-containers/issues

On Thu, Jan 7, 2016 at 3:00 AM, haosdent  wrote:


Hmm, mesoscloud does not seem to have been updated since 0.24.1.

On Thu, Jan 7, 2016 at 5:55 PM, Michal Rostecki wrote:


On 01/07/2016 10:17 AM, haosdent wrote:


Mesoscloud Dockerfiles: https://github.com/mesoscloud



Hello haosdent,

These Dockerfiles don't include the newest versions of

- Marathon (the newest is 0.13.0, mesoscloud maintains 0.11.0)
- Mesos (the newest is 0.26.0, mesoscloud maintains 0.24.1)

I also recently discovered a few disadvantages of the mesoscloud images
that are not mentioned in Marek's mail:

- ZooKeeper is installed from a tarball, although it can easily be
installed from the Mesosphere package repository like the other components
- all containers run as root - there is no need for that except for the
mesos-slave repo, and IMO it is good practice to run applications as a
non-root user where possible

Cheers,
Michal




--
Best Regards,
Haosdent Huang





Re: Shepherd for MESOS-4279 (Graceful restart of docker task)

2016-01-08 Thread Timothy Chen
I'll shepherd this, can you add me to the jira?

Thanks,

Tim

> On Jan 8, 2016, at 7:04 AM, Qian Zhang  wrote:
> 
> Hi,
> 
> Can anyone shepherd https://issues.apache.org/jira/browse/MESOS-4279? I
> have posted some findings there, we can do further discussion in the ticket.
> 
> 
> Thanks,
> Qian Zhang


Re: [MESOS-1865] Redirect to the leader master when current master is not a leader.

2016-01-08 Thread Benjamin Mahler
Some feedback on this ticket: it focuses on the solution rather than the
problem. We generally want to avoid this, I guess it's been coined 'The XY
Problem' (thanks Benjamin Bannier). In this case it turns out that there
are actually 2 distinct problems that the user is facing:

(1) Passive masters return information in some endpoints that can be
interpreted as incorrect. A passive master does not know the list of tasks,
for example, and so returning an empty list is less accurate than
expressing that no response is possible.

(2) It is difficult to reliably obtain cluster state through the existing
endpoints. This one is less clear to me than the first problem. Here we
have to think through how we want users to be hitting state endpoints. Do
they hit all the masters and take the first valid response? Do they first
ask for the leader, then query the leader? Both of these have races (the
first case has an issue that the requests are not atomic, you may receive
two valid responses ; the second case the leader information may become
stale before the second request). Do we add redirects? Even redirects have
issues, there may be multiple redirects, there may be a redirect to a
master that is unable to redirect further (and so we haven't really solved
the race difficulties with redirects).

The point is, it looks like we can easily solve (1), but (2) warrants more
thought and will be easier to assess with the problem well understood.

On Wed, Jan 6, 2016 at 12:52 PM, Diogo Gomes  wrote:

> Hi, Adam and Haosdent
>
>
> Resurrecting this issue, https://issues.apache.org/jira/browse/MESOS-1865,
> I would like to make a +1 for this change, which apparently became cold but
> I think is very relevant and we had enough time to be prepared for a change
> like this, right?
>
>
> If necessary, can I help with something?
>
>
> Diogo Gomes
>
>
>
>
>


Re: [MESOS-1865] Redirect to the leader master when current master is not a leader.

2016-01-08 Thread Marco Massenzio
+1
(my two cents: the “correct” approach from an operations viewpoint is to 
first query for the leader, then ask the leader; the shortcoming identified by 
Ben is obvious, but it is possibly the lesser of the two evils - and probably 
unavoidable in a distributed system without atomic transactions - which I 
don’t think anyone on this list would advocate for?)

Thanks to the Benjamin(s) for (finally) giving a name to something I have 
encountered often :)
(I used to informally call it “the A-B problem” - your naming is definitely 
more compelling!)
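
A minimal sketch of the "first query for the leader, then ask the leader"
approach, in Python. The callables find_leader and fetch_state and the
NotLeader exception are hypothetical stand-ins for whatever HTTP requests a
real client would make; they are not Mesos APIs. The sketch only shows one way
to cope with the shortcoming mentioned above: if the leader information has
gone stale by the time of the second request, look the leader up again, within
a bounded number of attempts.

```python
from typing import Callable, Dict, Optional, Sequence


class NotLeader(Exception):
    """Hypothetical error: the queried master reports it is not the leader."""


def query_leader(
    masters: Sequence[str],
    find_leader: Callable[[str], Optional[str]],
    fetch_state: Callable[[str], Dict],
    max_attempts: int = 3,
) -> Dict:
    """Ask any master who the leader is, then query that leader.

    The leader can change between the two requests, so a NotLeader answer
    triggers a fresh lookup instead of being treated as fatal.
    """
    for _ in range(max_attempts):
        leader = None
        for master in masters:
            leader = find_leader(master)
            if leader is not None:
                break
        if leader is None:
            continue  # no master could name a leader; try again
        try:
            return fetch_state(leader)
        except NotLeader:
            continue  # leader information went stale; look it up again
    raise RuntimeError("could not reach a leading master")
```

The state returned can still be stale by the time the caller reads it; the
retry only guarantees that the answer came from a master that considered
itself the leader when it replied.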

> On Jan 8, 2016, at 12:29 PM, Benjamin Mahler  wrote:
> 
> Some feedback on this ticket: it focuses on the solution rather than the
> problem. We generally want to avoid this, I guess it's been coined 'The XY
> Problem' (thanks Benjamin Bannier). In this case it turns out that there
> are actually 2 distinct problems that the user is facing:
> 
> (1) Passive masters return information in some endpoints that can be
> interpreted as incorrect. A passive master does not know the list of tasks,
> for example, and so returning an empty list is less accurate than
> expressing that no response is possible.
> 
> (2) It is difficult to reliably obtain cluster state through the existing
> endpoints. This one is less clear to me than the first problem. Here we
> have to think through how we want users to be hitting state endpoints. Do
> they hit all the masters and take the first valid response? Do they first
> ask for the leader, then query the leader? Both of these have races (the
> first case has an issue that the requests are not atomic, you may receive
> two valid responses ; the second case the leader information may become
> stale before the second request). Do we add redirects? Even redirects have
> issues, there may be multiple redirects, there may be a redirect to a
> master that is unable to redirect further (and so we haven't really solved
> the race difficulties with redirects).
> 
> The point is, it looks like we can easily solve (1), but (2) warrants more
> thought and will be easier to assess with the problem well understood.
> 
> On Wed, Jan 6, 2016 at 12:52 PM, Diogo Gomes  wrote:
> 
>> Hi, Adam and Haosdent
>> 
>> 
>> Resurrecting this issue, https://issues.apache.org/jira/browse/MESOS-1865,
>> I would like to make a +1 for this change, which apparently became cold but
>> I think is very relevant and we had enough time to be prepared for a change
>> like this, right?
>> 
>> 
>> If necessary, can I help with something?
>> 
>> 
>> Diogo Gomes
>> 
>> 
>> 
>> 
>> 



Re: [MESOS-1865] Redirect to the leader master when current master is not a leader.

2016-01-08 Thread Neil Conway
On Fri, Jan 8, 2016 at 12:29 PM, Benjamin Mahler  wrote:
> (2) It is difficult to reliably obtain cluster state through the existing
> endpoints. This one is less clear to me than the first problem. Here we
> have to think through how we want users to be hitting state endpoints. Do
> they hit all the masters and take the first valid response? Do they first
> ask for the leader, then query the leader? Both of these have races (the
> first case has an issue that the requests are not atomic, you may receive
> two valid responses ; the second case the leader information may become
> stale before the second request). Do we add redirects? Even redirects have
> issues, there may be multiple redirects, there may be a redirect to a
> master that is unable to redirect further (and so we haven't really solved
> the race difficulties with redirects).

I believe the proposed behavior is:

* Clients can query any master
* Endpoint queries against a non-leading master result in redirects to
the current leader

If the client follows a redirect to a different master, it may get
redirected one or more times; it might also be unable to reach the
current leader, or the queried master might be unable to determine the
current leader. That seems like quite reasonable behavior to me,
though (and technically I would argue that these situations aren't
really "races" -- the client just needs to recognize that as in any
distributed system, the information it observes might be stale).

We could alternatively introduce a "who-is-the-current-leader"
endpoint (which is something people have asked for [1]). As long as
non-leading masters notify clients that they aren't talking to a
leader (e.g., by returning a 403/503 error), that should also avoid
races.

Neil

[1] https://issues.apache.org/jira/browse/MESOS-3841
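
A rough sketch, in Python, of how a client might handle the redirect behavior
described above. The send callable is a hypothetical transport that returns
(status, Location header, body); it is not part of Mesos, and the status codes
treated as redirects are an assumption. The sketch caps the number of hops,
detects loops, and treats any other status (for example a 503 from a master
that cannot determine the leader) as a failure the caller has to handle.

```python
from typing import Callable, Optional, Tuple

# (status code, Location header or None, response body)
Response = Tuple[int, Optional[str], str]


def get_following_redirects(
    url: str,
    send: Callable[[str], Response],
    max_hops: int = 5,
) -> str:
    """Fetch url, following at most max_hops redirects between masters."""
    visited = set()
    for _ in range(max_hops + 1):
        if url in visited:
            raise RuntimeError("redirect loop at " + url)
        visited.add(url)
        status, location, body = send(url)
        if status == 200:
            return body
        if status in (301, 302, 307, 308) and location:
            url = location  # follow the redirect to the presumed leader
            continue
        # Any other answer: this master can neither serve nor redirect.
        raise RuntimeError(f"{url} answered {status}")
    raise RuntimeError("too many redirects")
```

Even a successful response only reflects what the answering master knew at
that moment, so callers should still treat the result as possibly stale.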


Re: Looking for a shepherd for MESOS-4258

2016-01-08 Thread Benjamin Mahler
This is very helpful, thanks for doing it. I'll shepherd.

On Wed, Jan 6, 2016 at 9:47 PM, Shuai Lin  wrote:

> Hi list,
>
> I'm working on MESOS-4258
> <https://issues.apache.org/jira/browse/MESOS-4258>,
> "Generate xml test reports in the jenkins build".
>
> It's a quite trivial patch, here is the review request:
> https://reviews.apache.org/r/42011
>
> Hope someone could shepherd this, thanks!
>
>
> Shuai
>


Re: [MESOS-1865] Redirect to the leader master when current master is not a leader.

2016-01-08 Thread Benjamin Mahler
We should add the "who-is-the-current-leader" informational endpoint
regardless of whether we do redirection, no?

Will it be clear which endpoints should redirect? Seems the redirection
approach, if we were to do it, needs to be specified explicitly by the
user. Otherwise it may be confusing for users that some endpoints redirect
and some do not.

On Fri, Jan 8, 2016 at 12:47 PM, Neil Conway  wrote:

> On Fri, Jan 8, 2016 at 12:29 PM, Benjamin Mahler wrote:
> > (2) It is difficult to reliably obtain cluster state through the existing
> > endpoints. This one is less clear to me than the first problem. Here we
> > have to think through how we want users to be hitting state endpoints. Do
> > they hit all the masters and take the first valid response? Do they first
> > ask for the leader, then query the leader? Both of these have races (the
> > first case has an issue that the requests are not atomic, you may receive
> > two valid responses ; the second case the leader information may become
> > stale before the second request). Do we add redirects? Even redirects have
> > issues, there may be multiple redirects, there may be a redirect to a
> > master that is unable to redirect further (and so we haven't really solved
> > the race difficulties with redirects).
>
> I believe the proposed behavior is:
>
> * Clients can query any master
> * Endpoint queries against a non-leading master result in redirects to
> the current leader
>
> If the client follows a redirect to a different master, it may get
> redirected one or more times; it might also be unable to reach the
> current leader, or the queried master might be unable to determine the
> current leader. That seems like quite reasonable behavior to me,
> though (and technically I would argue that these situations aren't
> really "races" -- the client just needs to recognize that as in any
> distributed system, the information it observes might be stale).
>
> We could alternatively introduce a "who-is-the-current-leader"
> endpoint (which is something people have asked for [1]). As long as
> non-leading masters notify clients that they aren't talking to a
> leader (e.g., by returning a 403/503 error), that should also avoid
> races.
>
> Neil
>
> [1] https://issues.apache.org/jira/browse/MESOS-3841
>


Re: Shepherd for MESOS-4279 (Graceful restart of docker task)

2016-01-08 Thread Qian Zhang
Sure, I have added you as the shepherd, thanks Tim!

2016-01-09 1:44 GMT+08:00 Timothy Chen :

> I'll shepherd this, can you add me to the jira?
>
> Thanks,
>
> Tim
>
> > On Jan 8, 2016, at 7:04 AM, Qian Zhang  wrote:
> >
> > Hi,
> >
> > Can anyone shepherd https://issues.apache.org/jira/browse/MESOS-4279? I
> > have posted some findings there, we can do further discussion in the ticket.
> >
> >
> > Thanks,
> > Qian Zhang
>


Re: Anonymous Modules "runtime context"

2016-01-08 Thread Marco Massenzio
Hey folks,

Any takers?
I'd really like to have an initial conversation about MESOS-4253. Is anyone
willing to shepherd this one?

Many thanks!

-- 
*Marco Massenzio*
http://codetrips.com

On Mon, Jan 4, 2016 at 12:19 PM, Marco Massenzio wrote:

> Happy New Year, everyone!
>
> During the break, I've been playing with a toy anon module[0] mostly for
> "learning" purposes.
>
> In doing so, I realized it would be useful, as a developer, to get access
> to even a "minimal" runtime context and filed MESOS-4253[1].
> (there was also a TODO from benh in the respective main.cpp of both
> master/slave).
>
> I've submitted a review chain[2], which can be seen as a "proof of
> concept," and would really be grateful if:
>
> - someone volunteered to be a shepherd for MESOS-4253
>   (in particular, I'd like to discuss the approach, and especially whether
> just passing the Flags is sufficient, or there is something else that may
> be of interest);
>
> - someone could cast a quick critical glance at r/41760 and provide
> feedback;
>
> - finally (no pun intended), I'd like to have a conversation around whether
> we should also introduce a `finalize()` method (again, there is a TODO
> about this as well).
> (I think we should, but can be convinced otherwise)
>
> Thanks in advance!
>
> [0] https://github.com/massenz/execute-module
> [1] https://issues.apache.org/jira/browse/MESOS-4253
> [2] https://reviews.apache.org/r/41760/
> --
> *Marco Massenzio*
> http://codetrips.com
>


Re: Using dolt instead of libtool when possible

2016-01-08 Thread Benjamin Bannier
Hi,

> On Jan 5, 2016, at 8:08 PM, James Peach  wrote:
>> On Jan 5, 2016, at 12:59 AM, Benjamin Bannier wrote:
>> dolt is a replacement for libtool which promises to fix some performance 
>> issues of libtool; many of these fixes have landed in some versions of 
>> libtool since dolt’s release.
> 
> Is dolt still maintained?

No, development has stopped, but we are talking about 180 lines of m4 code 
here, most of which are embedded shell script templates.

I have used dolt without issues for other projects in the past (mostly under 
some Linux), and sent this mail around to find out if it breaks builds for some 
systems we don’t test often.

>> I have made some first measurements of dolt under Debian8 (hardly any 
>> improvement) and OS X 10.10.5 (noticeable speed-up)
> 
> Which version of autoconf did you test on OS X?

This is GNU autoconf-2.69 from homebrew.


Cheers,

Benjamin

Shepherd for MESOS-4279 (Graceful restart of docker task)

2016-01-08 Thread Qian Zhang
Hi,

Can anyone shepherd https://issues.apache.org/jira/browse/MESOS-4279? I
have posted some findings there, we can do further discussion in the ticket.


Thanks,
Qian Zhang