Let's say that I run Spark on Mesos in fine-grained mode, and I have 12 cores and 64 GB of memory. I run application A on Spark, and some time later (but before A has finished) I start application B. How many CPUs will each of them get?

*Romi Kuntsman*, *Big Data Engineer*
http://www.totango.com

On Tue, Nov 4, 2014 at 11:33 AM, Akhil Das <ak...@sigmoidanalytics.com> wrote:

> You need to install Mesos on your cluster. Then you will run your Spark
> applications by specifying a Mesos master (mesos://) instead of (spark://).
>
> Spark can run over Mesos in two modes: “*fine-grained*” (default) and
> “*coarse-grained*”.
>
> In “*fine-grained*” mode (default), each Spark task runs as a separate
> Mesos task. This allows multiple instances of Spark (and other frameworks)
> to share machines at a very fine granularity, where each application gets
> more or fewer machines as it ramps up and down, but it comes with an
> additional overhead in launching each task. This mode may be inappropriate
> for low-latency requirements like interactive queries or serving web
> requests.
>
> The “*coarse-grained*” mode will instead launch only one long-running
> Spark task on each Mesos machine, and dynamically schedule its own
> “mini-tasks” within it. The benefit is much lower startup overhead, but at
> the cost of reserving the Mesos resources for the complete duration of the
> application.
>
> To run in coarse-grained mode, set the spark.mesos.coarse property in your
> SparkConf:
> conf.set("spark.mesos.coarse", "true")
>
> In addition, for coarse-grained mode, you can control the maximum number
> of resources Spark will acquire. By default, it will acquire all cores in
> the cluster (that get offered by Mesos), which only makes sense if you run
> just one application at a time. You can cap the maximum number of cores
> using conf.set("spark.cores.max", "10") (for example).
>
> If you run your application in fine-grained mode, then Mesos will take
> care of the resource allocation for you. You just tell Mesos from your
> application that you are running in fine-grained mode, and it is the
> default mode.
>
> Thanks
> Best Regards
>
> On Tue, Nov 4, 2014 at 2:46 PM, Romi Kuntsman <r...@totango.com> wrote:
>
>> I have a single Spark cluster, not multiple frameworks and not multiple
>> versions. Is it relevant for my use-case?
>> Where can I find information about exactly how to make Mesos tell Spark
>> how many resources of the cluster to use? (instead of the default take-all)
>>
>> *Romi Kuntsman*, *Big Data Engineer*
>> http://www.totango.com
>>
>> On Tue, Nov 4, 2014 at 11:00 AM, Akhil Das <ak...@sigmoidanalytics.com>
>> wrote:
>>
>>> You can look at the different modes over here:
>>> http://docs.sigmoidanalytics.com/index.php/Spark_On_Mesos#Mesos_Run_Modes
>>>
>>> These people have a very good tutorial to get you started:
>>> http://mesosphere.com/docs/tutorials/run-spark-on-mesos/#overview
>>>
>>> Thanks
>>> Best Regards
>>>
>>> On Tue, Nov 4, 2014 at 1:44 PM, Romi Kuntsman <r...@totango.com> wrote:
>>>
>>>> How can I configure the Mesos allocation policy to share resources between
>>>> all current Spark applications? I can't seem to find it in the architecture
>>>> docs.
>>>>
>>>> *Romi Kuntsman*, *Big Data Engineer*
>>>> http://www.totango.com
>>>>
>>>> On Tue, Nov 4, 2014 at 9:11 AM, Akhil Das <ak...@sigmoidanalytics.com>
>>>> wrote:
>>>>
>>>>> Yes, I believe Mesos is the right choice for you.
>>>>> http://mesos.apache.org/documentation/latest/mesos-architecture/
>>>>>
>>>>> Thanks
>>>>> Best Regards
>>>>>
>>>>> On Mon, Nov 3, 2014 at 9:33 PM, Romi Kuntsman <r...@totango.com>
>>>>> wrote:
>>>>>
>>>>>> So, as said there, static partitioning is used in "Spark’s standalone
>>>>>> and YARN modes, as well as the coarse-grained Mesos mode".
>>>>>> That leaves us only with Mesos, where there is *dynamic sharing* of
>>>>>> CPU cores.
>>>>>>
>>>>>> It says "when the application is not running tasks on a machine,
>>>>>> other applications may run tasks on those cores".
>>>>>> But my applications are short-lived (seconds to minutes), and they
>>>>>> read a large dataset, process it, and write the results. They are also
>>>>>> IO-bound, meaning most of the time is spent reading input data (from S3)
>>>>>> and writing the results back.
>>>>>>
>>>>>> Is it possible to divide the resources between them, according to how
>>>>>> many are trying to run at the same time?
>>>>>> So for example, if I have 12 cores: if one job is scheduled, it will
>>>>>> get 12 cores, but if 3 are scheduled, then each one will get 4 cores and
>>>>>> then they will all start.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> *Romi Kuntsman*, *Big Data Engineer*
>>>>>> http://www.totango.com
>>>>>>
>>>>>> On Mon, Nov 3, 2014 at 5:46 PM, Akhil Das <ak...@sigmoidanalytics.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Have a look at scheduling pools
>>>>>>> <https://spark.apache.org/docs/latest/job-scheduling.html>. If you
>>>>>>> want more sophisticated resource allocation, then you are better off
>>>>>>> using cluster managers like Mesos or YARN.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Best Regards
>>>>>>>
>>>>>>> On Mon, Nov 3, 2014 at 9:10 PM, Romi Kuntsman <r...@totango.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I have a Spark 1.1.0 standalone cluster, with several nodes, and
>>>>>>>> several jobs (applications) being scheduled at the same time.
>>>>>>>> By default, each Spark job takes up all available CPUs.
>>>>>>>> This way, when more than one job is scheduled, all but the first
>>>>>>>> are stuck in "WAITING".
>>>>>>>> On the other hand, if I tell each job to initially limit itself to
>>>>>>>> a fixed number of CPUs, and that job runs by itself, the cluster is
>>>>>>>> under-utilized and the job runs longer than it could have if it took
>>>>>>>> all the available resources.
>>>>>>>>
>>>>>>>> - How can I give the tasks a fairer resource division, which lets
>>>>>>>> many jobs run together, and together lets them use all the available
>>>>>>>> resources?
>>>>>>>> - How do you divide resources between applications in your use case?
>>>>>>>>
>>>>>>>> P.S. I started reading about Mesos but couldn't figure out if/how
>>>>>>>> it could solve the described issue.
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> *Romi Kuntsman*, *Big Data Engineer*
>>>>>>>> http://www.totango.com
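
For reference, the "12 cores, 3 jobs, 4 cores each" split asked about in the thread can be approximated with the two coarse-grained properties quoted above (spark.mesos.coarse and spark.cores.max). The following is only a minimal sketch in plain Python: the helper names cores_per_app and coarse_grained_settings are hypothetical, and it assumes the number of concurrent applications is known when each one is submitted. It produces a static cap, so it does not rebalance cores when applications start or finish later, which is what fine-grained mode is meant to handle.

```python
def cores_per_app(total_cores: int, concurrent_apps: int) -> int:
    """Evenly split a fixed core budget between applications.

    Hypothetical helper, not part of Spark or Mesos. Integer division
    rounds down; each application is given at least one core, so the
    caps can add up to more than total_cores if there are more
    applications than cores.
    """
    if total_cores < 1 or concurrent_apps < 1:
        raise ValueError("total_cores and concurrent_apps must be positive")
    return max(1, total_cores // concurrent_apps)


def coarse_grained_settings(max_cores: int) -> dict:
    """Build the two properties mentioned in the thread as string pairs.

    Hypothetical helper; only the property names come from the thread.
    """
    return {
        "spark.mesos.coarse": "true",       # one long-running task per machine
        "spark.cores.max": str(max_cores),  # cap on cores the application acquires
    }


if __name__ == "__main__":
    # The example from the thread: 12 cores shared by 3 applications.
    cap = cores_per_app(12, 3)
    for key, value in coarse_grained_settings(cap).items():
        # Each pair would be passed to conf.set(key, value) on a SparkConf.
        print(f'conf.set("{key}", "{value}")')
```

With 12 cores and 3 applications the cap works out to 4 cores each, matching the example in the original question. A single application submitted with this cap would still use only 4 of the 12 cores, which is the under-utilization trade-off described at the start of the thread.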