Re: Official Mesos/Marathon Docker images - please share your thoughts/plans
Adam - thank you. Issue filed (https://github.com/mesosphere/docker-containers/issues/37).

-marek

On 08.01.2016 01:41, Adam Bordelon wrote:

Officially, Apache prefers to distribute source instead of binaries, which is why Mesosphere volunteered to build and distribute rpm/deb packages as well as docker images. Apache Mesos could own the Dockerfiles for Mesos though.

That said, if you have problems with the Mesosphere (or mesoscloud) docker images for Mesos/Marathon, file an issue with the source repo, and we'd be happy to address your issues.
https://github.com/mesosphere/docker-containers/issues

On Thu, Jan 7, 2016 at 3:00 AM, haosdent wrote:

hmm, mesoscloud does not seem to have been updated since 0.24.1

On Thu, Jan 7, 2016 at 5:55 PM, Michal Rostecki wrote:

On 01/07/2016 10:17 AM, haosdent wrote:

Mesoscloud docker file https://github.com/mesoscloud

Hello haosdent,

These Dockerfiles don't include the newest versions of
- Marathon (the newest is 0.13.0, mesoscloud maintains 0.11.0)
- Mesos (the newest is 0.26.0, mesoscloud maintains 0.24.1)

I also recently discovered a few disadvantages of the mesoscloud images that are not mentioned in Marek's mail:
- ZooKeeper is installed from some tarball, while it can be easily installed from the Mesosphere package repository like the other components
- all containers are running as root; there is no need for that except in the mesos-slave repo, and IMO it's good practice to run applications as a non-root user if possible

Cheers,
Michal

--
Best Regards,
Haosdent Huang
Re: Shepherd for MESOS-4279 (Graceful restart of docker task)
I'll shepherd this, can you add me to the jira?

Thanks,
Tim

> On Jan 8, 2016, at 7:04 AM, Qian Zhang wrote:
>
> Hi,
>
> Can anyone shepherd https://issues.apache.org/jira/browse/MESOS-4279? I
> have posted some findings there, we can do further discussion in the ticket.
>
> Thanks,
> Qian Zhang
Re: [MESOS-1865] Redirect to the leader master when current master is not a leader.
Some feedback on this ticket: it focuses on the solution rather than the problem. We generally want to avoid this, I guess it's been coined 'The XY Problem' (thanks Benjamin Bannier). In this case it turns out that there are actually 2 distinct problems that the user is facing:

(1) Passive masters return information in some endpoints that can be interpreted as incorrect. A passive master does not know the list of tasks, for example, and so returning an empty list is less accurate than expressing that no response is possible.

(2) It is difficult to reliably obtain cluster state through the existing endpoints. This one is less clear to me than the first problem. Here we have to think through how we want users to be hitting state endpoints. Do they hit all the masters and take the first valid response? Do they first ask for the leader, then query the leader? Both of these have races (the first case has an issue that the requests are not atomic, you may receive two valid responses; the second case the leader information may become stale before the second request). Do we add redirects? Even redirects have issues, there may be multiple redirects, there may be a redirect to a master that is unable to redirect further (and so we haven't really solved the race difficulties with redirects).

The point is, it looks like we can easily solve (1), but (2) warrants more thought and will be easier to assess with the problem well understood.

On Wed, Jan 6, 2016 at 12:52 PM, Diogo Gomes wrote:

> Hi, Adam and Haosdent
>
> Resurrecting this issue, https://issues.apache.org/jira/browse/MESOS-1865,
> I would like to make a +1 for this change, which apparently became cold but
> I think is very relevant and we had enough time to be prepared for a change
> like this, right?
>
> If necessary, can I help with something?
>
> Diogo Gomes
Re: [MESOS-1865] Redirect to the leader master when current master is not a leader.
+1

(my two cents is that the “correct” approach from an operations viewpoint is to first query for the leader, then ask the leader; the shortcoming identified by Ben is obvious, but possibly the lesser of the two evils - and probably unavoidable in a distributed system without atomic transactions - which I don’t think anyone on this list would advocate for?)

Thanks to the Benjamin(s) for (finally) giving a name to something I have encountered often :)
(I used to informally call it “the A-B problems” - your naming is definitely more compelling!)

> On Jan 8, 2016, at 12:29 PM, Benjamin Mahler wrote:
>
> Some feedback on this ticket: it focuses on the solution rather than the
> problem. We generally want to avoid this, I guess it's been coined 'The XY
> Problem' (thanks Benjamin Bannier). In this case it turns out that there
> are actually 2 distinct problems that the user is facing:
>
> (1) Passive masters return information in some endpoints that can be
> interpreted as incorrect. A passive master does not know the list of tasks,
> for example, and so returning an empty list is less accurate than
> expressing that no response is possible.
>
> (2) It is difficult to reliably obtain cluster state through the existing
> endpoints. This one is less clear to me than the first problem. Here we
> have to think through how we want users to be hitting state endpoints. Do
> they hit all the masters and take the first valid response? Do they first
> ask for the leader, then query the leader? Both of these have races (the
> first case has an issue that the requests are not atomic, you may receive
> two valid responses; the second case the leader information may become
> stale before the second request). Do we add redirects? Even redirects have
> issues, there may be multiple redirects, there may be a redirect to a
> master that is unable to redirect further (and so we haven't really solved
> the race difficulties with redirects).
>
> The point is, it looks like we can easily solve (1), but (2) warrants more
> thought and will be easier to assess with the problem well understood.
>
> On Wed, Jan 6, 2016 at 12:52 PM, Diogo Gomes wrote:
>
>> Hi, Adam and Haosdent
>>
>> Resurrecting this issue, https://issues.apache.org/jira/browse/MESOS-1865,
>> I would like to make a +1 for this change, which apparently became cold but
>> I think is very relevant and we had enough time to be prepared for a change
>> like this, right?
>>
>> If necessary, can I help with something?
>>
>> Diogo Gomes
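For illustration only, the "first query for the leader, then ask the leader" approach could be sketched roughly as below. This is not Mesos code: `get_leader`, `get_state`, and the `NotLeader` exception are hypothetical stand-ins for whatever calls a client would actually make, and the sketch assumes a non-leading master rejects a state query explicitly instead of returning an empty result (problem (1) in the quoted mail). The retry loop is what absorbs the stale-leader window between the two requests.

```python
from typing import Callable, Optional


class NotLeader(Exception):
    """Hypothetical error: the queried master says it is not the leader."""


def query_leader_state(
    get_leader: Callable[[], Optional[str]],
    get_state: Callable[[str], dict],
    attempts: int = 3,
) -> Optional[dict]:
    """Ask who the leader is, then query that master for state.

    The leader may change between the two calls, so a NotLeader reply
    triggers a fresh leader lookup instead of being treated as an answer.
    Returns None if no leader answered within `attempts` tries.
    """
    for _ in range(attempts):
        leader = get_leader()
        if leader is None:
            continue  # no leader known right now; look it up again
        try:
            return get_state(leader)
        except NotLeader:
            continue  # leader information was stale; look it up again
    return None
```

The client can still observe stale data after a successful reply; the sketch only ensures that an answer from a non-leader is never mistaken for cluster state.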
Re: [MESOS-1865] Redirect to the leader master when current master is not a leader.
On Fri, Jan 8, 2016 at 12:29 PM, Benjamin Mahler wrote:
> (2) It is difficult to reliably obtain cluster state through the existing
> endpoints. This one is less clear to me than the first problem. Here we
> have to think through how we want users to be hitting state endpoints. Do
> they hit all the masters and take the first valid response? Do they first
> ask for the leader, then query the leader? Both of these have races (the
> first case has an issue that the requests are not atomic, you may receive
> two valid responses; the second case the leader information may become
> stale before the second request). Do we add redirects? Even redirects have
> issues, there may be multiple redirects, there may be a redirect to a
> master that is unable to redirect further (and so we haven't really solved
> the race difficulties with redirects).

I believe the proposed behavior is:

* Clients can query any master
* Endpoint queries against a non-leading master result in redirects to the current leader

If the client follows a redirect to a different master, it may get redirected one or more times; it might also be unable to reach the current leader, or the queried master might be unable to determine the current leader. That seems like quite reasonable behavior to me, though (and technically I would argue that these situations aren't really "races" -- the client just needs to recognize that as in any distributed system, the information it observes might be stale).

We could alternatively introduce a "who-is-the-current-leader" endpoint (which is something people have asked for [1]). As long as non-leading masters notify clients that they aren't talking to a leader (e.g., by returning a 403/503 error), that should also avoid races.

Neil

[1] https://issues.apache.org/jira/browse/MESOS-3841
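As a rough illustration of the client side of the redirect behavior described above, here is a small in-memory sketch. Nothing in it is a Mesos API: the `believed_leader` map is a hypothetical model in which each master names the master it would redirect to (itself if it believes it is leading, `None` if it cannot determine a leader), and a master missing from the map is treated as unreachable. The sketch bounds the number of hops and detects redirect loops, which covers the "multiple redirects" and "unable to redirect further" cases.

```python
from typing import Dict, List, Optional, Tuple


def follow_redirects(
    start: str,
    believed_leader: Dict[str, Optional[str]],
    max_hops: int = 3,
) -> Tuple[Optional[str], List[str]]:
    """Follow leader redirects starting at `start`.

    Returns (leader, visited). `leader` is None when the chain ends at a
    master that cannot name a leader, loops back on itself, or needs more
    than `max_hops` redirects.
    """
    visited: List[str] = []
    current = start
    for _ in range(max_hops + 1):
        if current in visited:
            return None, visited  # redirect loop between stale masters
        visited.append(current)
        target = believed_leader.get(current)
        if target is None:
            return None, visited  # unreachable, or leader undetermined
        if target == current:
            return current, visited  # this master claims leadership
        current = target  # follow the redirect
    return None, visited  # gave up: too many redirects
```

A client that gets `None` back would typically back off and retry later, since the information held by any single master may simply be stale.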
Re: Looking for a shepherd for MESOS-4258
This is very helpful, thanks for doing it. I'll shepherd.

On Wed, Jan 6, 2016 at 9:47 PM, Shuai Lin wrote:

> Hi list,
>
> I'm working on MESOS-4258
> <https://issues.apache.org/jira/browse/MESOS-4258>,
> "Generate xml test reports in the jenkins build".
>
> It's quite a trivial patch, here is the review request:
> https://reviews.apache.org/r/42011
>
> Hope someone could shepherd this, thanks!
>
> Shuai
Re: [MESOS-1865] Redirect to the leader master when current master is not a leader.
We should add the "who-is-the-current-leader" informational endpoint regardless of whether we do redirection, no?

Will it be clear which endpoints should redirect? It seems the redirection approach, if we were to do it, needs to be specified explicitly by the user. Otherwise it may be confusing for users that some endpoints redirect and some do not.

On Fri, Jan 8, 2016 at 12:47 PM, Neil Conway wrote:

> On Fri, Jan 8, 2016 at 12:29 PM, Benjamin Mahler wrote:
> > (2) It is difficult to reliably obtain cluster state through the existing
> > endpoints. This one is less clear to me than the first problem. Here we
> > have to think through how we want users to be hitting state endpoints. Do
> > they hit all the masters and take the first valid response? Do they first
> > ask for the leader, then query the leader? Both of these have races (the
> > first case has an issue that the requests are not atomic, you may receive
> > two valid responses; the second case the leader information may become
> > stale before the second request). Do we add redirects? Even redirects have
> > issues, there may be multiple redirects, there may be a redirect to a
> > master that is unable to redirect further (and so we haven't really solved
> > the race difficulties with redirects).
>
> I believe the proposed behavior is:
>
> * Clients can query any master
> * Endpoint queries against a non-leading master result in redirects to
> the current leader
>
> If the client follows a redirect to a different master, it may get
> redirected one or more times; it might also be unable to reach the
> current leader, or the queried master might be unable to determine the
> current leader. That seems like quite reasonable behavior to me,
> though (and technically I would argue that these situations aren't
> really "races" -- the client just needs to recognize that as in any
> distributed system, the information it observes might be stale).
>
> We could alternatively introduce a "who-is-the-current-leader"
> endpoint (which is something people have asked for [1]). As long as
> non-leading masters notify clients that they aren't talking to a
> leader (e.g., by returning a 403/503 error), that should also avoid
> races.
>
> Neil
>
> [1] https://issues.apache.org/jira/browse/MESOS-3841
Re: Shepherd for MESOS-4279 (Graceful restart of docker task)
Sure, I have added you as the shepherd, thanks Tim!

2016-01-09 1:44 GMT+08:00 Timothy Chen:

> I'll shepherd this, can you add me to the jira?
>
> Thanks,
>
> Tim
>
> > On Jan 8, 2016, at 7:04 AM, Qian Zhang wrote:
> >
> > Hi,
> >
> > Can anyone shepherd https://issues.apache.org/jira/browse/MESOS-4279? I
> > have posted some findings there, we can do further discussion in the
> > ticket.
> >
> > Thanks,
> > Qian Zhang
Re: Anonymous Modules "runtime context"
Hey folks, any takers?

I'd really like to have an initial conversation about MESOS-4253, anyone willing to shepherd this one?

Many thanks!

--
*Marco Massenzio*
http://codetrips.com

On Mon, Jan 4, 2016 at 12:19 PM, Marco Massenzio wrote:

> Happy New Year, everyone!
>
> During the break, I've been playing with a toy anon module[0] mostly for
> "learning" purposes.
>
> In doing so, I realized it would be useful, as a developer, to get access
> to even a "minimal" runtime context and filed MESOS-4253[1].
> (there was also a TODO from benh in the respective main.cpp of both
> master/slave).
>
> I've submitted a review chain[2], which can be seen as a "proof of
> concept," and would really be grateful if:
>
> - someone volunteered to be a shepherd for MESOS-4253
> (in particular, I'd like to discuss the approach, and especially whether
> just passing the Flags is sufficient, or there is something else that may
> be of interest);
>
> - someone could cast a quick critical glance on r/41760 and provide
> feedback;
>
> - finally (no pun intended), I'd like to have a conversation around whether
> we should also introduce a `finalize()` method (again, there is a TODO
> about this as well).
> (I think we should, but can be convinced otherwise)
>
> Thanks in advance!
>
> [0] https://github.com/massenz/execute-module
> [1] https://issues.apache.org/jira/browse/MESOS-4253
> [2] https://reviews.apache.org/r/41760/
> --
> *Marco Massenzio*
> http://codetrips.com
Re: Using dolt instead of libtool when possible
Hi,

> On Jan 5, 2016, at 8:08 PM, James Peach wrote:
>> On Jan 5, 2016, at 12:59 AM, Benjamin Bannier wrote:
>> dolt is a replacement for libtool which promises to fix some performance
>> issues of libtool, many of which have since dolt’s release landed in some
>> versions of libtool.
>
> Is dolt still maintained?

No, development has stopped, but we are talking about 180 lines of m4 code here, most of which are embedded shell script templates. I have used dolt without issues for other projects in the past (mostly under some Linux), and sent this mail around to find out if it breaks builds for some systems we don’t test often.

>> I have made some first measurements of dolt under Debian8 (hardly any
>> improvement) and OS X 10.10.5 (noticeable speed-up)
>
> Which version of autoconf did you test on OS X?

This is GNU autoconf-2.69 from homebrew.

Cheers,
Benjamin
Shepherd for MESOS-4279 (Graceful restart of docker task)
Hi,

Can anyone shepherd https://issues.apache.org/jira/browse/MESOS-4279? I have posted some findings there, we can do further discussion in the ticket.

Thanks,
Qian Zhang