Re: decline_offer_duration for spark frameworks

2016-04-14 Thread haosdent
>I could not find any details on how to set the decline offer timeout in the Spark framework. It seems the Spark framework doesn't support this, while Marathon does support a decline offer duration. Have you tried setting a different role and reserving resources for Marathon? Related documents are [reservation.md]( https:

Re: decline_offer_duration for spark frameworks

2016-04-14 Thread David Greenberg
Hi Rodrick, You should check out Cook (github.com/twosigma/cook). It acts as a server for Spark drivers, except that it can manage preemption and fair-sharing between those drivers, so that you don't encounter starvation. If you'd like to discuss it, I'd be happy to hop on a call. Best, David On

Re: decline_offer_duration for spark frameworks

2016-04-14 Thread Rodrick Brown
Hello Haosdent, I'm running in coarse-grained mode. The issue I'm seeing in the logs is tons of offers going to these frameworks, which are started by Chronos using a simple Python wrapper that calls my Spark task. Once the Spark jobs are running I see offers being declined over and ov

Re: decline_offer_duration for spark frameworks

2016-04-14 Thread haosdent
Hi @Rodrick, which Spark mode are you running: coarse-grained or fine-grained? On Fri, Apr 15, 2016 at 10:05 AM, Rodrick Brown wrote: > I have hundreds of small spark jobs running on my Mesos cluster > causing starvation to other frameworks like Marathon on my cluster. > > Is there a way to pre

decline_offer_duration for spark frameworks

2016-04-14 Thread Rodrick Brown
I have hundreds of small Spark jobs running on my Mesos cluster, causing starvation for other frameworks like Marathon. Is there a way to prevent these frameworks from getting offers so often? Apr 15 02:00:12 prod-mesos-m-3.$SERVER.com mesos-master[10259]: I0415 02:00:12.50373

Re: Mesos Task History

2016-04-14 Thread Adam Bordelon
You might also want to check out TwoSigma's Satellite: https://github.com/twosigma/satellite Sunil's Satellite talk from MesosCon: https://www.youtube.com/watch?v=yLkc17HFEb8 On Thu, Apr 14, 2016 at 9:16 AM, Dick Davies wrote: > We just grab them with collectd's mesos plugin and log to Graphite,

Re: Prometheus Exporters on Marathon

2016-04-14 Thread June Taylor
David, Thanks for the reply. Would you be able to share your configs for starting up the exporters? Thanks, June Taylor System Administrator, Minnesota Population Center University of Minnesota On Thu, Apr 14, 2016 at 11:27 AM, David Keijser wrote: > We run the mesos exporter [1] and the node

Re: Prometheus Exporters on Marathon

2016-04-14 Thread David Keijser
We run the mesos exporter [1] and the node_exporter on each host, directly managed by systemd. For other application-specific exporters we have so far been baking them into the Docker image of the application, which is run by Marathon. [1] https://github.com/mesosphere/mesos_exporter On Thu, 1

Prometheus Exporters on Marathon

2016-04-14 Thread June Taylor
Is anyone else running Prometheus exporters on their cluster? I am stuck because I can't get a working "go build" environment right now. Is anyone else running this directly on their nodes and masters? Or, via Marathon? If so, please share your setup specifics. Thanks, June Taylor System Adminis

Re: Mesos Task History

2016-04-14 Thread Dick Davies
We just grab them with collectd's Mesos plugin and log to Graphite, which gives us long-term trend details. https://github.com/rayrod2030/collectd-mesos Haven't used this one, but it supposedly does per-task metric collection: https://github.com/bobrik/collectd-mesos-tasks On 14 April 2016 at 13:37, Ju

Re: Pyspark Cluster Mode

2016-04-14 Thread June Taylor
Shuai, Thank you for your reply. Are you actually using this docker image in Marathon successfully? If so, please share your JSON for the application, as that would help me understand exactly what you suggest. Thanks, June Taylor System Administrator, Minnesota Population Center University of Mi

Re: Pyspark Cluster Mode

2016-04-14 Thread Shuai Lin
To run the dispatcher in Marathon I would recommend using a Docker image like mesosphere/spark: https://hub.docker.com/r/mesosphere/spark/tags/ One problem is how to access the dispatcher, since it may be launched on any one of the slaves. You can set up a service discovery mechanism like marathon-lb or
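For readers asking what such a Marathon application might look like, here is a minimal sketch that builds an app definition as JSON. It is not the configuration used by anyone in this thread. The app id, resource sizes, ZooKeeper URL, and especially the in-image path to `spark-class` are assumptions that must be checked against the image tag you actually pull; only the `mesosphere/spark` image name and the dispatcher class `org.apache.spark.deploy.mesos.MesosClusterDispatcher` come from the thread and the Spark documentation.

```python
import json


def dispatcher_app(master_url, image="mesosphere/spark", port=7077):
    """Build a Marathon app definition (as a dict) for the Spark dispatcher.

    ASSUMPTIONS (not from the thread): the app id, the cpus/mem values, and
    the location of spark-class inside the image are illustrative only.
    """
    cmd = (
        "/opt/spark/dist/bin/spark-class "  # assumed path inside the image
        "org.apache.spark.deploy.mesos.MesosClusterDispatcher "
        f"--master {master_url} --port {port}"
    )
    return {
        "id": "/spark-dispatcher",  # hypothetical app id
        "cmd": cmd,
        "cpus": 0.5,                # illustrative sizing
        "mem": 1024,
        "instances": 1,
        "container": {
            "type": "DOCKER",
            # Host networking keeps the dispatcher port reachable on the agent.
            "docker": {"image": image, "network": "HOST"},
        },
    }


if __name__ == "__main__":
    # Hypothetical ZooKeeper-backed master URL.
    app = dispatcher_app("mesos://zk://zk1.example.com:2181/mesos")
    print(json.dumps(app, indent=2))
```

The printed JSON could then be posted to Marathon's `/v2/apps` endpoint. Because the dispatcher can land on any agent, a discovery mechanism such as the one mentioned above is still needed to find it.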

How to get the web_ui to use --advertise_ip from the slaves

2016-04-14 Thread Alexander Gallego
Hello, I'm trying to get the web_ui to use the --advertise_ip to connect to masters instead of the --hostname passed into the slaves. Running mesos 28. Slaves run with a command like this: mesos-slave --master=zk://xxx/mesos --advertise_ip=y.y.y.y --containerizers=docker,mesos --hostname=tmp-m

Re: Pyspark Cluster Mode

2016-04-14 Thread June Taylor
Pradeep, Thank you for your reply. I have read that documentation, but it leaves out a lot of key pieces. Have you actually run MesosClusterDispatcher on Marathon? If so, can you please share your JSON configuration for the application? Thanks, June Taylor System Administrator, Minnesota Populat

Re: Mesos-DNS Failed to connect to...

2016-04-14 Thread shakeel
Hi Stefano, In this case you should follow Chris's advice and set the mesos-dns server as your local DNS server. It will allow you to connect to the services using a URL. Kind Regards Shakeel Suffee On 14/04/16 14:29, Stefano Bianchi wrote: > shakeel > > Dig command are perfectly working, i

Re: Mesos-DNS Failed to connect to...

2016-04-14 Thread Stefano Bianchi
Shakeel, the dig commands are working perfectly; I see the correct address on which I have mesos-dns running. Unfortunately, in the OpenStack environment where I am working there is no DNS, and its absence is causing me a lot of issues with other things as well. 2016-04-14 15:22 GMT+02:00 shakeel : > Hi, >

Re: SharedFilesystemIsolator (filesystem/shared)

2016-04-14 Thread Erb, Stephan
Sounds great. Thank you! From: Jie Yu Sent: Wednesday, April 13, 2016 00:56 To: user Subject: Re: SharedFilesystemIsolator (filesystem/shared) Stephan, Thanks for testing! I'll try to address that ticket and will make sure not to remove filesystem/shared befor

Re: Mesos-DNS Failed to connect to...

2016-04-14 Thread shakeel
Hi, Once you have mesos-dns running from Marathon, test that it's working properly with dig. (You might want to add your main DNS servers as resolvers within the mesos-dns config and allow recursion.) Otherwise, configure your slaves to use mesos-dns as their DNS server. I created a subdomai

Re: Mesos-DNS Failed to connect to...

2016-04-14 Thread Chris Baker
Also, make sure that the machine you're trying to launch from has Mesos-DNS as its DNS server :) On Thu, Apr 14, 2016 at 3:33 AM Stefano Bianchi wrote: > I'm correctly running mesos-dns from marathon and it seems to work. > But when I launch: > > http://test.marathon.mesos > > (where test is a fu

Re: Mesos master not joining cluster

2016-04-14 Thread shakeel
Hi Shuai, Please find below an excerpt from the logs of the master in question. It thinks it is the leader. The WebUI does not show any jobs and does not redirect to the current master. The other two masters are working properly, with the correct leader showing the jobs and the other master redirecting.

Re: Mesos-DNS Failed to connect to...

2016-04-14 Thread June Taylor
Stefano, Try inspecting the DNS directly, for example here is an nslookup query to find the port and slave node that contains a running Docker container started by Marathon, and then you can see the curl command touching on that node and the port specified in the SRV record. I am not sure your exp

Re: Mesos master not joining cluster

2016-04-14 Thread Shuai Lin
Hi Shakeel, what do you mean by "one of the masters was not participating in the quorum"? Can you paste the related lines from the logs of that master? On Thu, Apr 14, 2016 at 8:44 PM, shakeel wrote: > Hi, > > I have three mesos masters configured. They have all been working > properly for a while and

Mesos master not joining cluster

2016-04-14 Thread shakeel
Hi, I have three Mesos masters configured. They have all been working properly for a while, and today I noticed one of the masters was not participating in the quorum. A reboot did not resolve the problem. All three of the masters are configured with a quorum of 2. Has anyone experienced this prob
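As background on the numbers above: a quorum of 2 for three masters matches the usual strict-majority rule for a replicated log. The sketch below only illustrates that arithmetic; the function names are made up for this example and are not part of Mesos.

```python
def majority_quorum(num_masters: int) -> int:
    """Smallest number of masters that forms a strict majority."""
    if num_masters < 1:
        raise ValueError("need at least one master")
    return num_masters // 2 + 1


def tolerated_failures(num_masters: int) -> int:
    """How many masters can be down while a majority is still available."""
    return num_masters - majority_quorum(num_masters)


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(f"masters={n} quorum={majority_quorum(n)} "
              f"tolerated_failures={tolerated_failures(n)}")
```

With three masters and a quorum of 2, the cluster should keep working with one master out of the quorum, which fits the report that the other two masters continue to operate normally.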

Re: Mesos Task History

2016-04-14 Thread June Taylor
Adam, Is there a way to keep this history? Thanks, June Taylor System Administrator, Minnesota Population Center University of Minnesota On Wed, Apr 13, 2016 at 4:32 PM, Adam Bordelon wrote: > Yes, these counters are only kept in-memory, so any time a Mesos master > starts, its counters are i

Re: Mesos-DNS Failed to connect to...

2016-04-14 Thread Rodrick Brown
https://mesosphere.github.io/mesos-dns/docs/naming.html -- **Rodrick Brown** / Systems Engineer +1 917 445 6839 / [rodr...@orchardplatform.com](mailto:char...@orchardplatform.com) **Orchard Platform** 101 5th Avenue, 4th Floor, New York, NY 10003 [http://www.orchardplatform.com](

Re: Mesos Masters Leader Keeps Fluctuating

2016-04-14 Thread Adam Bordelon
You could try attaching them to this thread, but ideally we'd want them attached to the JIRA that haosdent created: https://issues.apache.org/jira/browse/MESOS-5207 You may need to create a JIRA account to upload a file. If it still won't let you, ask dev@ to make you a JIRA contributor. On Thu, A

Re: Mesos Masters Leader Keeps Fluctuating

2016-04-14 Thread Stefano Bianchi
OK, please follow me through this strange story. At the beginning I set up 3 Mesos masters on the same cluster using Mesos 0.27. Now I have deleted one of these 3 Mesos 0.27 masters and built a Mesos 0.28 master to join the other 2. I get the problem I described: after 10 seconds I get "failed to connect", but the

Re: Hybrid application deployments (container/VM/bare metal) in Mesos

2016-04-14 Thread haosdent
>Does Marathon still use cgroups to limit the resource usage of these apps even if they are not containerized? Hi, @xiaoning. It depends on how you start the Mesos agent. All tasks in Mesos are always containerized. If you use --isolation="posix/mem,posix/cpu" and use MesosContainerizer to start your

Re: Hybrid application deployments (container/VM/bare metal) in Mesos

2016-04-14 Thread Guangya Liu
Yes, but it depends: if you are using the Mesos containerizer, then cgroups provide the resource isolation; if you are using the Docker containerizer, the Docker daemon provides the resource isolation, etc. On Thu, Apr 14, 2016 at 3:21 PM, Xiaoning Ding wrote: > Thank you. The JIRA is something I

Mesos-DNS Failed to connect to...

2016-04-14 Thread Stefano Bianchi
I'm correctly running mesos-dns from Marathon and it seems to work. But when I launch: http://test.marathon.mesos (where test is a running task on Marathon) I get: curl: (7) Failed connect to test.marathon.mesos:80; Connection refused Where am I going wrong? Il 13/apr/2016 17:46, "June Taylor" ha sc
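One likely reading of the error above: the A record `test.marathon.mesos` only resolves to the host running the task, while the task itself listens on whatever port Marathon assigned, not on port 80. Under the Mesos-DNS naming scheme that port is published in an SRV record of the form `_task._protocol.framework.domain`. The sketch below just builds those names; the helper names and the port 31000 are hypothetical and used for illustration only.

```python
def a_record_name(task, framework="marathon", domain="mesos"):
    """Mesos-DNS A record name: resolves to the host running the task."""
    return f"{task}.{framework}.{domain}"


def srv_record_name(task, protocol="tcp", framework="marathon", domain="mesos"):
    """Mesos-DNS SRV record name: carries both the host and the port."""
    return f"_{task}._{protocol}.{framework}.{domain}"


def service_url(host, port, scheme="http"):
    """URL to use once the port has been read from the SRV answer."""
    return f"{scheme}://{host}:{port}"


if __name__ == "__main__":
    print(a_record_name("test"))
    print(srv_record_name("test"))
    # 31000 is a made-up port; the real one comes from the SRV lookup.
    print(service_url(a_record_name("test"), 31000))
```

Querying the SRV name with a tool such as dig or nslookup should return the port to use in place of the default port 80.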

RE: Hybrid application deployments (container/VM/bare metal) in Mesos

2016-04-14 Thread Xiaoning Ding
Thank you. The JIRA is something I’m looking for. I’m still going through Marathon documents to see how it addresses my scenario. So please forgive me if my question is already covered by Marathon documents. Suppose I run Mesos and Marathon on some bare metal cluster nodes. Without the