Re: GLOG settings

2015-12-10 Thread Steven Schlansker
On Dec 10, 2015, at 11:22 AM, Vinod Kone wrote: > > On Thu, Dec 10, 2015 at 11:12 AM, Zameer Manji wrote: > the native library logs to stderr directly. > > By default the library logs to stderr. But you can set Mesos/GLOG env > variables (e.g., MESOS_LOG_DIR) to make it write to a file inste

Re: [VOTE] Release Apache Mesos 0.27.1 (rc1)

2016-02-16 Thread Steven Schlansker
On Feb 16, 2016, at 4:52 PM, Michael Park wrote: > Hi all, > > Please vote on releasing the following candidate as Apache Mesos 0.27.1. > I filed a bug against 0.27.0 where Mesos can emit totally invalid JSON in response to the /files/read.json endpoint: https://issues.apache.org/jira/browse/

Re: [VOTE] Release Apache Mesos 0.27.1 (rc1)

2016-02-18 Thread Steven Schlansker
Very reasonable, thanks! :) > MPark > > On 16 February 2016 at 17:10, Steven Schlansker > wrote: > On Feb 16, 2016, at 4:52 PM, Michael Park wrote: > > > Hi all, > > > > Please vote on releasing the following candidate as Apache Mesos 0.27.1. > > > >

Re: [VOTE] Release Apache Mesos 0.28.0 (rc1)

2016-03-04 Thread Steven Schlansker
> On Mar 3, 2016, at 5:43 PM, Vinod Kone wrote: > > Hi all, > Please vote on releasing the following candidate as Apache Mesos 0.28.0. > 0.28.0 includes the following: > ... > * [MESOS-2840] - **Experimental** support for container images in Mesos > containerizer (a.k.a. Unified Containeri

Re: What are the invalid-user.log files?

2016-03-18 Thread Steven Schlansker
We are seeing the same thing, using Mesosphere .debs on Ubuntu: mesos-master.mesos1-qa-sf.invalid-user.log.WARNING.20160209-221625.12071 mesos-master.mesos1-qa-sf.invalid-user.log.WARNING.20160223-211310.1456 mesos-master.mesos1-qa-sf.invalid-user.log.WARNING.20160223-211857.5347 What if the fall

Re: Mesos-master url in HA

2016-04-13 Thread Steven Schlansker
I personally believe that this is not a sufficient workaround -- what if the master is failing over, and your autoscaler happens to redirect to a master which just lost leadership? This solution is inherently racy and leads to the end user writing extra code to work around it, and even then can st

Re: mesos docker vs native container

2016-04-26 Thread Steven Schlansker
> On Apr 25, 2016, at 12:37 AM, vincent gromakowski > wrote: > > I am very interested in getting some feedback from people who have moved from > native containers to Docker, especially from a network performance > perspective. > DCOS has been open sourced and I like all the automation it brings wit

Potential serious issue when upgrading OpenJDK8 to >= 8u77 with Debian / Ubuntu packaging

2016-04-27 Thread Steven Schlansker
Hello Mesos fans, I just wanted to alert you to a potentially disastrous incompatibility introduced in the last few OpenJDK packages released for the popular "openjdk-r" Ubuntu PPA. Per Debian bug 815475: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=815475 The OpenJDK packaging changed the

Re: Consequences of health-check timeouts?

2016-05-18 Thread Steven Schlansker
> On May 18, 2016, at 10:44 AM, haosdent wrote: > > >In re executor_shutdown_grace_period: how would this enable the task > >(MongoDB) to terminate gracefully? (BTW: I am fairly certain that the mongo > >STDOUT as captured by Mesos shows that it received signal 15 just before it > >said good-

Re: Monitoring at container level

2016-07-07 Thread Steven Schlansker
We use Graphite and ran into similar problems with huge metric namespaces. We use the Singularity framework which provides both the task "request id" (name) and "instance number" (0..N) to the task. So we set our Graphite namespace to be "request-number" e.g. "myservice-3" This has the downside o
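A minimal sketch of the "request-number" naming scheme described in this message. The function name and the dot-replacement step are assumptions for illustration and are not taken from Singularity or from the original post; the replacement is there only because Graphite treats "." as a path separator.

```python
def graphite_namespace(request_id: str, instance_number: int) -> str:
    """Build a per-instance Graphite path segment such as "myservice-3".

    Hypothetical helper: request_id and instance_number stand in for the
    values the framework hands to the task.
    """
    # Assumption: dots in the request id would split the metric path in
    # Graphite, so they are replaced with underscores.
    safe_id = request_id.replace(".", "_")
    return f"{safe_id}-{instance_number}"


print(graphite_namespace("myservice", 3))
```

Because the instance number is bounded (0..N), the number of distinct namespaces stays fixed no matter how many times tasks are restarted.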

Re: Mesos Docker logs

2016-11-11 Thread Steven Schlansker
There were also known issues with losing docker logs on shutdown prior to 1.0.0: https://issues.apache.org/jira/browse/MESOS-4279 > On Nov 10, 2016, at 9:24 AM, Joseph Wu wrote: > > You can think of Mesos sandbox logs like: > > docker run ... > $MESOS_SANDBOX/stdout 2>&1 $MESOS_SANDBOX/stderr

Re: cron-like scheduling in mesos framework?

2017-01-06 Thread Steven Schlansker
Singularity also supports this well, https://github.com/HubSpot/Singularity (and it is a framework where you can do both "regular" and "scheduled" tasks through one interface) > On Jan 6, 2017, at 8:47 AM, Tomas Barton wrote: > > Hi, > > try Chronos framework https://mesos.github.io/chronos/

Re: Agent Working Directory Best Practices

2017-06-26 Thread Steven Schlansker
> On Jun 25, 2017, at 11:24 PM, Benjamin Mahler wrote: > > As a data point, as far as I'm aware, most users are using a local work > directory, not an NFS mounted one. Would love to hear from anyone on the list > if they are doing this, and if there are any subtleties that should be > documen

Re: Agent Working Directory Best Practices

2017-06-27 Thread Steven Schlansker
> On Jun 26, 2017, at 5:30 PM, James Peach wrote: > > >> On Jun 26, 2017, at 4:05 PM, Steven Schlansker >> wrote: >> >> >>> On Jun 25, 2017, at 11:24 PM, Benjamin Mahler wrote: >>> >>> As a data point, as far as I'm aw

Re: MesosCon attendee introduction thread

2014-08-19 Thread Steven Schlansker
Hi everyone! I’m Steven Schlansker, from OpenTable, and I’ll be at MesosCon. I am working on building out Mesos / Docker related infrastructure to help us better develop and deploy software. Unfortunately my C++ skills are … lacking, so I haven’t contributed to core. But I have a number of

Trying out Docker containerizer: fails with no interesting output

2014-09-04 Thread Steven Schlansker
I am trying to integrate the Docker containerizer into the Singularity framework (https://github.com/HubSpot/Singularity) I have filled in my containerInfo: "containerInfo": { "type": "DOCKER", "docker": { "image": "registry.mesos-vpcqa.otenv.com/demo-server:latest-master" }

Re: Mesos 0.20.0 with Docker registry availability

2014-09-04 Thread Steven Schlansker
Would it be possible to have a mode where it tries to pull, but then does not fail solely because the pull failed? In particular, we use tags to indicate which build should be deployed, e.g. the “foo-server:production” tag vs. the “foo-server:staging” tag. On Sep 4, 2014, at 11:05 PM, Tim Chen wrote

Re: Trying out Docker containerizer: fails with no interesting output

2014-09-05 Thread Steven Schlansker
Thanks, that was definitely a gotcha! On Sep 4, 2014, at 4:27 PM, David Greenberg wrote: > Even though command is blank, you must set shell to false. There's a ticket > for this that I don't have off-hand. > > On Thursday, September 4, 2014, Steven Schlansker >

Re: Multiple disks with Mesos

2014-10-07 Thread Steven Schlansker
On Oct 7, 2014, at 4:06 PM, Arunabha Ghosh wrote: > Hi, > I would like to run Mesos slaves on machines that have multiple disks. > According to the Mesos configuration page I can specify a work_dir argument > to the slaves. > > 1) Can the work_dir argument contain multiple directories ?

Re: Do i really need HDFS?

2014-10-20 Thread Steven Schlansker
On Oct 20, 2014, at 2:09 AM, Ankur Chauhan wrote: > Hi all, > > I am trying to set up a new Mesos cluster and so far I have a set of master > and slave nodes working and I can get everything running. I am able to > install and run a couple of sample apps, hook up Jenkins etc. My main question

Re: Mesos IRC office hours

2014-10-21 Thread Steven Schlansker
Assuming a number of interesting / useful Q&As show up, it might be nice to start a FAQ page documenting any of the questions which are general enough to apply to most users. That way people who are not on IRC at the very moment can benefit too. On Oct 21, 2014, at 11:33 AM, Niklas Nielsen wr

Re: Reconciliation Document

2014-11-03 Thread Steven Schlansker
Hi, I'm the poor end user in question :) I have the Singularity logs from task reconciliation saved here: https://gist.githubusercontent.com/stevenschlansker/50dbe2e068c8156a12de/raw/bd4bee96aab770f0899885d826c5b7bca76225e4/gistfile1.txt The last line in the log file sums it up pretty well - INFO

Re: Reconciliation Document

2014-11-03 Thread Steven Schlansker
t.com/stevenschlansker/1577a1fc269525459571/raw/5cd53f53acc8e3b27490b0ea9af04812d624bc50/gistfile1.txt On Nov 3, 2014, at 10:46 AM, Benjamin Mahler wrote: > Thanks! Do you have the master logs? > > On Mon, Nov 3, 2014 at 10:13 AM, Steven Schlansker > wrote: > Hi, > I'm the poor en

Re: Rocket

2014-12-01 Thread Steven Schlansker
On Dec 1, 2014, at 11:22 AM, Niklas Nielsen wrote: > Huge +1 > > On 1 December 2014 at 11:10, Tim Chen wrote: > Hi all, > > Per the announcement from CoreOS about Rocket > (https://coreos.com/blog/rocket/), it seems to be an exciting containerizer > runtime that has composable isolation/co

Re: Mesos inside Docker

2014-12-02 Thread Steven Schlansker
On Dec 2, 2014, at 8:22 AM, Jeremy Jongsma wrote: > What is the current state of running Mesos inside Docker? I ran across a > thread that indicates it may be working now, but I have not seen any Docker > images for Mesos from authoritative sources. > > https://www.mail-archive.com/user@mesos

Re: Monitoring Mesos slave/master processes

2014-12-09 Thread Steven Schlansker
On Dec 9, 2014, at 3:45 PM, Gary Malouf wrote: > We did this in the past with Nagios, but I was wondering if there was a > recommended way from others using in production. I wrote a Nagios plugin for it https://github.com/opentable/nagios-mesos

Resize Mesos master quorum

2014-12-18 Thread Steven Schlansker
I'm looking to be able to increase (3 -> 5) and decrease (5 -> 3) the number of Mesos masters I run. There doesn't seem to be any documentation on this procedure on the website. What is the correct way to approach this? Thanks, Steven

Accounting for Mesos resource usage

2014-12-18 Thread Steven Schlansker
I am running a corporate Mesos cluster, shared by a number of teams and projects. We are looking to get some insight into our usage of precious computing resources. For example, I'd like to be able to present a report breaking down CPU-hour and RAM GB-hour utilization by service, team, or other
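A rough sketch of the kind of report asked about here, assuming per-task records with allocated CPUs, allocated memory, and runtime are already available from somewhere. The TaskRecord type, its field names, and the sample numbers are all hypothetical and are not part of any Mesos API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TaskRecord:
    """Hypothetical record of one finished task's allocation and runtime."""
    service: str
    cpus: float
    mem_mb: float
    runtime_seconds: float


def usage_by_service(records):
    """Sum CPU-hours and RAM GB-hours per service."""
    totals = defaultdict(lambda: {"cpu_hours": 0.0, "ram_gb_hours": 0.0})
    for r in records:
        hours = r.runtime_seconds / 3600.0
        # CPU-hours: allocated CPUs multiplied by hours of runtime.
        totals[r.service]["cpu_hours"] += r.cpus * hours
        # RAM GB-hours: allocated memory in GB multiplied by hours of runtime.
        totals[r.service]["ram_gb_hours"] += (r.mem_mb / 1024.0) * hours
    return dict(totals)


# Made-up sample data, for illustration only.
records = [
    TaskRecord("web", cpus=2.0, mem_mb=4096, runtime_seconds=7200),
    TaskRecord("web", cpus=1.0, mem_mb=1024, runtime_seconds=3600),
    TaskRecord("batch", cpus=0.5, mem_mb=512, runtime_seconds=1800),
]
report = usage_by_service(records)
for service, usage in sorted(report.items()):
    print(service, usage)
```

Grouping by team or any other label would work the same way, with a different key in place of the service name.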

Re: Accounting for Mesos resource usage

2014-12-19 Thread Steven Schlansker
nteresting problems to solve when it gets to master > fail-over, but let's try to enumerate those in the ticket. > > Thanks, > Niklas > > On Thu, Dec 18, 2014 at 11:56 AM, Steven Schlansker > wrote: > I am running a corporate Mesos cluster, shared by a number o

Re: Accounting for Mesos resource usage

2014-12-19 Thread Steven Schlansker
ks and do the resource math. >> >> There are some interesting problems to solve when it gets to master >> fail-over, but let's try to enumerate those in the ticket. >> >> Thanks, >> Niklas >> >> On Thu, Dec 18, 2014 at 11:56 AM, Steven Schlansker >&

Re: Accounting for Mesos resource usage

2014-12-19 Thread Steven Schlansker
rvices, teams, etc. > > I hope this helps, IMHO this is probably the simplest path forward for you, > as you don't need to have any pluggable functionality built-in to mesos to > get what you need. > > Ben > > [1] https://github.com/apache/mesos/blob/0.21.0/includ

Re: mesos-collectd-plugin

2015-03-10 Thread Steven Schlansker
w would you propose a decent way to namespace master stats that are cluster > wide. I had to lie (in the collectd plugin code) and change the host as a > metric such as: > > collectd.$HOSTNAME.master.* seems to make absolutely no sense when there is a > single elected master per cluster.

Re: mesos-collectd-plugin

2015-03-12 Thread Steven Schlansker
We would use (and probably contribute back to!) such an improved plugin as well. If you do polish it up, be sure to announce it to this list :) On Mar 10, 2015, at 2:05 PM, Dan Dong wrote: > Hi, Jeff, > Thanks, is your plugin working together with collectd? It would be great to > publish it! >

Re: [DISCUSS] Renaming Mesos Slave

2015-06-02 Thread Steven Schlansker
I'm going to stay out of the argument over whether it's an appropriate name or not, but there is definitely a very serious cost to changing it. I'm just imagining what I will have to go through to upgrade my existing clusters. Each one will have to go through a tricky upgrade; removing all the

Re: [DISCUSS] Renaming Mesos Slave

2015-06-08 Thread Steven Schlansker
On Jun 8, 2015, at 1:12 AM, Aaron Carey wrote: > I've been following this thread with interest, it draws a lot of parallels > with similar problems my wife faces as a teacher (and I imagine this happens > in other government/public sector organisations, earlier in this thread James > pointed

Re: Debugging framework registration from inside docker

2015-06-10 Thread Steven Schlansker
On Jun 10, 2015, at 10:10 AM, James Vanns wrote: > Hi. When attempting to run my scheduler inside a docker container in > --net=bridge mode it never receives acknowledgement or a reply to that > request. However, it works fine in --net=host mode. It does not listen on any > port as a service s

Re: Get List of Active Slaves

2015-08-04 Thread Steven Schlansker
Unfortunately this is racy. If you redirect to a master just as it is removed from leadership, you can still get bogus data, with no indication anything went wrong. Some people are reporting that this breaks tools that generate HTTP proxy configurations. I filed this issue a while ago as ht

Re: Get List of Active Slaves

2015-08-04 Thread Steven Schlansker
s to get an a > record for each. I believe it responds to srv requests too. > > On Aug 4, 2015 7:29 PM, "Steven Schlansker" wrote: > Unfortunately this is racy. If you redirect to a master just as it is > removed from leadership, you can still get bogus data, with no

Re: Mesos Modifying User Group

2015-08-13 Thread Steven Schlansker
On Aug 12, 2015, at 3:28 PM, Nastooh Avessta (navesta) wrote: > Having a bit of a strange problem with Mesos 0.22, running Spark 1.4.0, on > Docker 1.6 slaves. Part of my Spark program calls on a script that accesses a > GPU. I am able to run this script: > 1. As Bash > 2. Via Mar

Re: Improvements to container support with custom executor

2015-08-14 Thread Steven Schlansker
To be clear, will this Unified Containerizer give us the ability to run Docker images *without* the Docker daemon? That would be fantastic! This would remove one of the least reliable components of the ecosystem (the Docker daemon) while preserving the illusion of Docker to our end users. On

Re: mesos-slave crashing with CHECK_SOME

2015-08-31 Thread Steven Schlansker
On Aug 31, 2015, at 11:54 AM, Scott Rankin wrote: > > tag=mesos-slave[12858]: F0831 09:37:29.838184 12898 slave.cpp:3354] > CHECK_SOME(os::touch(path)): Failed to open file: No such file or directory I reported a similar bug a while back: https://issues.apache.org/jira/browse/MESOS-2684 T

Re: mesos-slave crashing with CHECK_SOME

2015-09-02 Thread Steven Schlansker
ll, then > all bets are off). > > Having said all that - if there are areas where we have been over-eager with > our CHECKs, we should definitely revisit that and make it more > crash-resistant, absolutely. > > [0] http://research.google.com/pubs/pub43438.html >

Re: API client libraries

2015-09-03 Thread Steven Schlansker
As a Mesos user who wants to be more of a contributor but hates C++, I could volunteer to help work with the Java reference implementation. I totally understand wanting to keep the various client libraries out of the Mesos release cycle and somewhat independent. Maybe a model where the clients