Re: Mesos cluster across a wide area network

2016-12-02 Thread Rodrick Brown
<mailto:agall...@concord.io>> wrote: > On Fri, Dec 2, 2016 at 3:04 PM, Rodrick Brown <rodr...@orchard-app.com> wrote: > How feasible is it to run Mesos across a private WAN? > > I have two data centers and private f

Mesos python bindings

2016-08-17 Thread Rodrick Brown
Where can I find the official Python bindings for Mesos? The latest I see on pip is 0.19, which seems to be over 2 years old. Has this project been discontinued? I’m on Mesos 0.28.2. -- <http://www.orchardplatform.com/> Rodrick Brown / DevOPs Engineer +1 917 445 6839 / rodr...@orchardplatfo

Re: OS X latency issue when run as a plist

2016-07-13 Thread Rodrick Brown
Have you tried using something like supervisord, or one of the slew of other process launchers available for *nix? Check brew. I would look to that as an interim solution if the plist method remains problematic. On Wed, Jul 13, 2016 at 7:44 AM -0400, "Rinaldo Digiorgio"

Re: What's the official pronounce of mesos?

2016-07-13 Thread Rodrick Brown
Mess-O's On Wed, Jul 13, 2016 at 7:56 PM -0400, "zhiwei" wrote: Hi, I saw in some videos that different people pronounce 'mesos' differently. Can someone add the official pronunciation of mesos to Wikipedia?

Unable to execute sparkr jobs with Chronos

2016-06-16 Thread Rodrick Brown
ingJobType": false, "errorsSinceLastSuccess": 0, "uris": [ "file:///data/orchard/R/sparkr_env.R", "file:///data/orchard/R/applepie_loan_detail.R"], "environmentVariables": [ { "name": "SP

Re: Setting constraints

2016-05-22 Thread Rodrick Brown
No side effects at all. I had a similar issue: all my Spark slaves need to run the mesos-shuffle-service, so I just launch N instances of the service, where N is the number of slaves I have, with a unique constraint.
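A minimal sketch of what "N instances with a unique constraint" can look like as a Marathon app definition. The app id, command, and resource figures are hypothetical placeholders and do not come from the thread; only the `["hostname", "UNIQUE"]` constraint reflects the approach described in the reply.

```python
import json


def shuffle_service_app(num_slaves, app_id="/spark-shuffle-service",
                        cmd="./run-shuffle-service.sh"):
    """Build a Marathon app definition that places one instance per slave.

    app_id and cmd are hypothetical placeholders for illustration only.
    """
    return {
        "id": app_id,
        "cmd": cmd,
        "instances": num_slaves,  # N = number of slaves in the cluster
        "cpus": 0.5,              # assumed figure
        "mem": 512,               # assumed figure, in MB
        # At most one task per hostname, so N instances cover N slaves.
        "constraints": [["hostname", "UNIQUE"]],
    }


if __name__ == "__main__":
    print(json.dumps(shuffle_service_app(14), indent=2))
```

The resulting JSON would be posted to Marathon as an app definition; the instance count has to be kept in step with the number of slaves by hand.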

Re: distributed file systems

2016-05-11 Thread Rodrick Brown
Does EFS count? :-) https://aws.amazon.com/efs/

Re: spark.cores.max=20 has no affect in Mesos 0.28.1

2016-04-29 Thread Rodrick Brown
Yeah, this is so Mesos can use multiple executors on different machines and allow dynamic allocation to scale up to 20 cores, but limit the maximum to 20 with spark.cores.max=20? Am I missing something?
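For reference, a small sketch of how a `spark-submit` command line carrying `spark.cores.max=20` might be assembled. The master URL, class name, and jar path are hypothetical; the property names are standard Spark configuration keys, and `spark.cores.max` caps the total cores an application requests across the cluster in coarse-grained mode.

```python
import shlex


def spark_submit_args(master, app_jar, main_class, cores_max=20,
                      dynamic_allocation=True):
    """Assemble a spark-submit command line that caps total cores."""
    conf = {"spark.cores.max": str(cores_max)}
    if dynamic_allocation:
        # Dynamic allocation relies on the external shuffle service.
        conf["spark.dynamicAllocation.enabled"] = "true"
        conf["spark.shuffle.service.enabled"] = "true"
    args = ["spark-submit", "--master", master, "--class", main_class]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app_jar)
    return args


if __name__ == "__main__":
    cmd = spark_submit_args(
        master="mesos://zk://zk1.example.com:2181/mesos",  # hypothetical
        app_jar="/path/to/job-assembly.jar",               # hypothetical
        main_class="com.example.Job",                      # hypothetical
    )
    print(" ".join(shlex.quote(a) for a in cmd))
```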

spark.cores.max=20 has no affect in Mesos 0.28.1

2016-04-29 Thread Rodrick Brown
rs/spark-job-library-3e30539922ff540f4632d1d0745501c48300b89b-assembled.jar

Re: setting roles in mesos 0.28

2016-04-20 Thread Rodrick Brown
rod-mesos-m-3.aws.xxx.com marathon[29617]: [2016-04-20 12:11:42,807] INFO Offer ID: [50ceafa4-f3c1-4738-a9eb-c5d3bf0ff742-O13166461]. Considered resources with roles: [sparkr]. Not all basic resources satisfied: cpu not in offer, disk not in offer, mem not in offer (mesosphere.mesos.ResourceMat

setting roles in mesos 0.28

2016-04-19 Thread Rodrick Brown
not able to get any tasks to run on this server. Do I need to set it on the masters also? Please advise, thanks. --RB

Re: Framework taking default resources even though a role is specified

2016-04-15 Thread Rodrick Brown
You can try setting constraints on tasks in both Chronos and Marathon that will limit deployment to only a certain set of nodes. On Fri, Apr 15, 2016 at 1:35 PM -0700, "June Taylor" wrote: Evan, I'm not sure about it. We're new to the

Re: decline_offer_duration for spark frameworks

2016-04-14 Thread Rodrick Brown
cluster consists of 14 nodes, each with 16 cores and 60GB of memory. I have about 120 or so Spark tasks configured in Chronos, which run very frequently, with about 10 long-running tasks running on Marathon.

decline_offer_duration for spark frameworks

2016-04-14 Thread Rodrick Brown

Re: Mesos-DNS Failed to connect to...

2016-04-14 Thread Rodrick Brown
https://mesosphere.github.io/mesos-dns/docs/naming.html

Re: Executors no longer inherit environment variables from the agent

2016-03-10 Thread Rodrick Brown
This is unfortunate. We are using environment variables that get passed into the executor's context, such as CHRONOS_RESOURCE_MEM and MARATHON_APP_RESOURCE_MEM. What will be the workaround?

Re: Cluster history wiped after master leader reelection

2016-03-10 Thread Rodrick Brown
Yes, this is the intended behavior.

Re: Alternative mesos-ui tasks

2016-03-10 Thread Rodrick Brown
probably deploy this on our cluster; however, as it is right now there is no way to navigate back to the default sandbox dir, which is something my team is already used to :(

Re: Alternative mesos-ui tasks

2016-03-09 Thread Rodrick Brown
Looks good. I did a local install on my MacBook pointing to my Mesos cluster. However, is there any reason why all my running frameworks have a broken gif in the frameworks screen? Marathon's gif seems to be fine.

Unable to receive offers / long delays when starting or restarting.

2016-02-01 Thread Rodrick Brown
My cluster consists of 9 slave servers split in half for two primary applications (Spark | Scala microservices) * Spark (servers 1,2,3,4,8), attributes: "rack:spark" * Long-running microservices (servers 5,6,7,9), attributes: "rack:ms" The Spark jobs run in coarse mode and the majority of
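To make the split concrete, here is a small local model of the layout described in the post, together with the Marathon constraint form (`[attribute, "CLUSTER", value]`) that would pin an app to one group of slaves. This is an illustrative sketch only: the helper names are hypothetical and nothing here calls into Mesos or Marathon.

```python
def rack_constraint(rack):
    """Marathon constraint pinning an app to slaves with attribute rack:<rack>."""
    return ["rack", "CLUSTER", rack]


def eligible_servers(slaves, rack):
    """Filter a {server: attributes} mapping the way the constraint would."""
    return sorted(s for s, attrs in slaves.items() if attrs.get("rack") == rack)


# Layout from the post: servers 1,2,3,4,8 run Spark; 5,6,7,9 run microservices.
SLAVES = {n: {"rack": "spark"} for n in (1, 2, 3, 4, 8)}
SLAVES.update({n: {"rack": "ms"} for n in (5, 6, 7, 9)})

if __name__ == "__main__":
    for rack in ("spark", "ms"):
        print(rack_constraint(rack), eligible_servers(SLAVES, rack))
```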

Re: deploy mesos cluster on aws

2016-01-10 Thread Rodrick Brown
on the ELB that routes to the microservice via marathon-lb generated configs. > On Jan 10 2016, at 10:46 pm, Jeff Schroeder jeffschroe...@computer.org wrote: On Sunday, January 10, 2016, Rodrick Brown [rodrick@orchard-app.com](mailto:rodr...@orchard-app.com) wrote: > >> We run

Re: deploy mesos cluster on aws

2016-01-10 Thread Rodrick Brown
We run 100% on AWS and have been running Mesos in production since version 0.19. Our cluster consists of 3 dedicated ZooKeeper nodes (M3.2xl), 3 dedicated masters (M3.2xl), 8 dedicated slaves (M4.4xl) and 2 haproxy (M4.Medium) instances used in conjunction with marathon-lb for routing requests

Re: Mesos masters and zookeeper running together?

2015-12-24 Thread Rodrick Brown
With our design we end up building out a standalone 3-node ZooKeeper cluster. ZooKeeper seems to be the default dumping ground for many Apache-based products these days. You will eventually see many services and frameworks require a zk instance for leader election, coordination, Kv store

Re: mesos ui best practise - mesos cluster in HA

2015-12-02 Thread Rodrick Brown
There's really no use mesos-dns and just point your browser to http://leader.mesos:5050 to reach the active master. > On Dec 2 2015, at 3:18 pm, Haripriya Ayyalasomayajula aharipriy...@gmail.com wrote: > > > > Hi all, > > > > I am having a mesos cluster (version 0.25.0) running

Re: Team organization around Mesos cluster

2015-11-30 Thread Rodrick Brown
Right now we're using constraints on the slaves to isolate different workloads to ensure one team can't DoS everyone else, and this has been working very well for us so far. > On Nov 30 2015, at 12:41 pm, Harry Metske harry.met...@gmail.com wrote: > > We are in a similar stage and also have

resolving hosts with mesos-dns not working with "/" in the appid

2015-11-17 Thread Rodrick Brown
the following should work: $ dig mess_spark-shuffle-service.marathon.mesos but I don’t get the IP of that service.

Job Constraints from Marathon or Spark

2015-11-07 Thread Rodrick Brown
0, 0 ], "upgradeStrategy": { "minimumHealthCapacity": 0.5, "maximumOverCapacity": 0.5 } } I’ve also tried running spark jobs like this timeout 3600 /opt/spark-1.4.1-bin-hadoop2.4/bin/spark-submit --conf spark.mesos.constraints

spark mesos shuffle service failing under marathon

2015-11-04 Thread Rodrick Brown
Starting the Mesos shuffle service seems to background the process, so whenever Marathon tries to bring up this process it keeps trying to start it and it never registers as started. Is there a fix for this?

spark shuffle service failing to start on slaves

2015-11-03 Thread Rodrick Brown
cutor.java:357) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357) at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) at java.lang.Thread.run(Thread.java:745)

unable to start mesos-slave as non-root user after 0.25 upgrade

2015-10-28 Thread Rodrick Brown
the cluster and everything works as designed. Was something changed in 0.24.1 and 0.25? Thanks.

Re: unable to start mesos-slave as non-root user after 0.25 upgrade

2015-10-28 Thread Rodrick Brown
=0765 recurse=yes with_items: - /var/lib/mesos - /var/log/mesos - /etc/mesos - /tmp/mesos notify: restart mesos service tags: set_perms

Running mesos-slave as a non-root user.

2015-10-08 Thread Rodrick Brown

Re: Files not being copied to all slaves from hdfs w/spark-submit

2015-10-03 Thread Rodrick Brown
problem here. Do you mean your job depends on files on hdfs and it could not download in slaves after you execute spark-submit? On Sat, Oct 3, 2015 at 5:07 AM, Rodrick Brown <rodr...@orchard-app.com> wrote: For some reason my jobs are not being copied to all the slaves when they’re d

Files not being copied to all slaves from hdfs w/spark-submit

2015-10-02 Thread Rodrick Brown
For some reason my jobs are not being copied to all the slaves when they’re downloaded from HDFS. Am I missing something obvious? They only seem to be copied to the node where the job is submitted.