Hi Michael, That caught my attention...
Could you please elaborate on "elastically grow and shrink CPU usage" and how
it really works under the covers? It seems that CPU usage is just a "label"
for an executor on Mesos. Where's this in the code?

Regards,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

On Mon, Dec 26, 2016 at 6:25 PM, Michael Gummelt <mgumm...@mesosphere.io> wrote:
>> Using 0 for spark.mesos.mesosExecutor.cores is better than dynamic
>> allocation
>
> Maybe for CPU, but definitely not for memory. Executors never shut down in
> fine-grained mode, which means you only elastically grow and shrink CPU
> usage, not memory.
>
> On Sat, Dec 24, 2016 at 10:14 PM, Davies Liu <davies....@gmail.com> wrote:
>>
>> Using 0 for spark.mesos.mesosExecutor.cores is better than dynamic
>> allocation, but you have to pay a little more overhead for launching a
>> task, which should be OK if the task is not trivial.
>>
>> Since the direct result (up to 1M by default) will also go through
>> Mesos, it's better to tune it lower, otherwise Mesos could become the
>> bottleneck.
>>
>> spark.task.maxDirectResultSize
>>
>> On Mon, Dec 19, 2016 at 3:23 PM, Chawla,Sumit <sumitkcha...@gmail.com>
>> wrote:
>> > Tim,
>> >
>> > We will try to run the application in coarse-grained mode, and share
>> > the findings with you.
>> >
>> > Regards
>> > Sumit Chawla
>> >
>> > On Mon, Dec 19, 2016 at 3:11 PM, Timothy Chen <tnac...@gmail.com> wrote:
>> >
>> >> Dynamic allocation works with coarse-grained mode only; we weren't
>> >> aware of a need for fine-grained mode after we enabled dynamic
>> >> allocation support in the coarse-grained mode.
>> >>
>> >> What's the reason you're running fine-grained mode instead of
>> >> coarse-grained + dynamic allocation?
>> >>
>> >> Tim
>> >>
>> >> On Mon, Dec 19, 2016 at 2:45 PM, Mehdi Meziane
>> >> <mehdi.mezi...@ldmobile.net> wrote:
>> >> > We will be interested in the results if you give Dynamic Allocation
>> >> > with Mesos a try!
>> >> >
>> >> > ----- Original Mail -----
>> >> > From: "Michael Gummelt" <mgumm...@mesosphere.io>
>> >> > To: "Sumit Chawla" <sumitkcha...@gmail.com>
>> >> > Cc: user@mesos.apache.org, d...@mesos.apache.org, "User"
>> >> > <u...@spark.apache.org>, d...@spark.apache.org
>> >> > Sent: Monday, December 19, 2016, 22:42:55 GMT +01:00 Amsterdam /
>> >> > Berlin / Berne / Rome / Stockholm / Vienna
>> >> > Subject: Re: Mesos Spark Fine Grained Execution - CPU count
>> >> >
>> >> >> Is this problem of idle executors sticking around solved in Dynamic
>> >> >> Resource Allocation? Is there some timeout after which idle
>> >> >> executors can just shut down and clean up their resources?
>> >> >
>> >> > Yes, that's exactly what dynamic allocation does. But again, I have
>> >> > no idea what the state of dynamic allocation + Mesos is.
>> >> >
>> >> > On Mon, Dec 19, 2016 at 1:32 PM, Chawla,Sumit <sumitkcha...@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> Great. Makes much better sense now. What would be the reason to have
>> >> >> spark.mesos.mesosExecutor.cores more than 1, as this number doesn't
>> >> >> include the number of cores for tasks?
>> >> >>
>> >> >> So in my case it seems like 30 CPUs are allocated to executors, and
>> >> >> there are 48 tasks, so 48 + 30 = 78 CPUs. And I am noticing this gap
>> >> >> of 30 is maintained till the last task exits. This explains the gap.
>> >> >> Thanks everyone. I am still not sure how this number 30 is
>> >> >> calculated. (Is it dynamic based on current resources, or is it some
>> >> >> configuration? I have 32 nodes in my cluster.)
>> >> >>
>> >> >> Is this problem of idle executors sticking around solved in Dynamic
>> >> >> Resource Allocation? Is there some timeout after which idle
>> >> >> executors can just shut down and clean up their resources?
>> >> >>
>> >> >> Regards
>> >> >> Sumit Chawla
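A quick back-of-the-envelope check of those numbers, using the accounting
Michael describes just below (one core per executor plus one core per running
task), and assuming the defaults spark.mesos.mesosExecutor.cores = 1 and
spark.task.cpus = 1. The 30 executors and 48 tasks are the figures Sumit
reports; the variable names are only illustrative, not Spark code:

    // Rough model of the fine-grained CPU accounting discussed in this
    // thread; just the arithmetic the scheduler's behaviour implies.
    val mesosExecutorCores = 1   // spark.mesos.mesosExecutor.cores (default): held for the executor's lifetime
    val taskCpus           = 1   // spark.task.cpus (default): held only while a task runs
    val executors          = 30  // executors launched, at most one per agent (the cluster has 32 nodes)
    val runningTasks       = 48  // one task per partition at the start of the stage

    val peakCpus = executors * mesosExecutorCores + runningTasks * taskCpus  // 30 + 48 = 78
    val idleCpus = executors * mesosExecutorCores                            // 30, held until the executors exit

That reproduces the peak of 78 CPUs and the gap of 30 that persists until the
executors themselves shut down.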
>> >> >> On Mon, Dec 19, 2016 at 12:45 PM, Michael Gummelt
>> >> >> <mgumm...@mesosphere.io> wrote:
>> >> >>>
>> >> >>> > I should presume that the number of executors should be less
>> >> >>> > than the number of tasks.
>> >> >>>
>> >> >>> No. Each executor runs 0 or more tasks.
>> >> >>>
>> >> >>> Each executor consumes 1 CPU, and each task running on that
>> >> >>> executor consumes another CPU. You can customize this via
>> >> >>> spark.mesos.mesosExecutor.cores
>> >> >>> (https://github.com/apache/spark/blob/v1.6.3/docs/running-on-mesos.md)
>> >> >>> and spark.task.cpus
>> >> >>> (https://github.com/apache/spark/blob/v1.6.3/docs/configuration.md)
>> >> >>>
>> >> >>> On Mon, Dec 19, 2016 at 12:09 PM, Chawla,Sumit
>> >> >>> <sumitkcha...@gmail.com> wrote:
>> >> >>>>
>> >> >>>> Ah, thanks. Looks like I skipped reading this: "Neither will
>> >> >>>> executors terminate when they're idle."
>> >> >>>>
>> >> >>>> So in my job scenario, I should presume that the number of
>> >> >>>> executors should be less than the number of tasks; ideally one
>> >> >>>> executor should execute 1 or more tasks. But I am observing
>> >> >>>> something strange instead. I start my job with 48 partitions for a
>> >> >>>> Spark job. In the Mesos UI I see that the number of tasks is 48,
>> >> >>>> but the number of CPUs is 78, which is way more than 48. Here I am
>> >> >>>> assuming that 1 CPU is 1 executor. I am not specifying any
>> >> >>>> configuration to set the number of cores per executor.
>> >> >>>>
>> >> >>>> Regards
>> >> >>>> Sumit Chawla
>> >> >>>>
>> >> >>>> On Mon, Dec 19, 2016 at 11:35 AM, Joris Van Remoortere
>> >> >>>> <jo...@mesosphere.io> wrote:
>> >> >>>>>
>> >> >>>>> That makes sense. From the documentation it looks like the
>> >> >>>>> executors are not supposed to terminate:
>> >> >>>>>
>> >> >>>>> http://spark.apache.org/docs/latest/running-on-mesos.html#fine-grained-deprecated
>> >> >>>>>>
>> >> >>>>>> Note that while Spark tasks in fine-grained will relinquish
>> >> >>>>>> cores as they terminate, they will not relinquish memory, as the
>> >> >>>>>> JVM does not give memory back to the Operating System. Neither
>> >> >>>>>> will executors terminate when they're idle.
>> >> >>>>>
>> >> >>>>> I suppose your task to executor CPU ratio is low enough that it
>> >> >>>>> looks like most of the resources are not being reclaimed. If your
>> >> >>>>> tasks were using significantly more CPU, the amortized cost of
>> >> >>>>> the idle executors would not be such a big deal.
>> >> >>>>>
>> >> >>>>> —
>> >> >>>>> Joris Van Remoortere
>> >> >>>>> Mesosphere
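For reference, the coarse-grained + dynamic allocation route that Tim and
Michael point to above would look roughly like the sketch below. This is only
an illustration, assuming Spark on Mesos with the external shuffle service
running on every agent; the master URL and the idle timeout are placeholders,
and the property names are the standard Spark ones:

    // Sketch of coarse-grained mode with dynamic allocation, so idle
    // executors are torn down after a timeout instead of holding resources.
    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .setMaster("mesos://zk://host:2181/mesos")                 // placeholder master URL
      .set("spark.mesos.coarse", "true")                         // coarse-grained scheduler
      .set("spark.dynamicAllocation.enabled", "true")            // grow and shrink executors with load
      .set("spark.shuffle.service.enabled", "true")              // required by dynamic allocation
      .set("spark.dynamicAllocation.executorIdleTimeout", "60s") // reclaim executors idle this long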
>> >> >>>>> On Mon, Dec 19, 2016 at 11:26 AM, Timothy Chen
>> >> >>>>> <tnac...@gmail.com> wrote:
>> >> >>>>>>
>> >> >>>>>> Hi Chawla,
>> >> >>>>>>
>> >> >>>>>> One possible reason is that Mesos fine-grained mode also takes
>> >> >>>>>> up cores to run the executor per host, so if you have 20 agents
>> >> >>>>>> running the fine-grained executor, it will take up 20 cores
>> >> >>>>>> while it's still running.
>> >> >>>>>>
>> >> >>>>>> Tim
>> >> >>>>>>
>> >> >>>>>> On Fri, Dec 16, 2016 at 8:41 AM, Chawla,Sumit
>> >> >>>>>> <sumitkcha...@gmail.com> wrote:
>> >> >>>>>> > Hi
>> >> >>>>>> >
>> >> >>>>>> > I am using Spark 1.6. I have one question about the
>> >> >>>>>> > fine-grained model in Spark. I have a simple Spark application
>> >> >>>>>> > which transforms A -> B. It's a single-stage application. To
>> >> >>>>>> > begin with, the program starts with 48 partitions. When the
>> >> >>>>>> > program starts running, the Mesos UI shows 48 tasks and 48
>> >> >>>>>> > CPUs allocated to the job. As the tasks get done, the number
>> >> >>>>>> > of active tasks starts decreasing. However, the number of CPUs
>> >> >>>>>> > does not decrease proportionally. When the job was about to
>> >> >>>>>> > finish, there was a single remaining task, yet the CPU count
>> >> >>>>>> > was still 20.
>> >> >>>>>> >
>> >> >>>>>> > My question is: why is there no one-to-one mapping between
>> >> >>>>>> > tasks and CPUs in fine-grained mode? How can these CPUs be
>> >> >>>>>> > released when the job is done, so that other jobs can start?
>> >> >>>>>> >
>> >> >>>>>> > Regards
>> >> >>>>>> > Sumit Chawla
>> >> >>>
>> >> >>> --
>> >> >>> Michael Gummelt
>> >> >>> Software Engineer
>> >> >>> Mesosphere
>> >> >
>> >> > --
>> >> > Michael Gummelt
>> >> > Software Engineer
>> >> > Mesosphere
>>
>> --
>> - Davies
>
> --
> Michael Gummelt
> Software Engineer
> Mesosphere
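And for completeness, a sketch of the fine-grained tuning Davies mentions near
the top of the thread: with spark.mesos.mesosExecutor.cores set to 0 an idle
executor holds no CPU, and spark.task.maxDirectResultSize caps the result size
sent back through Mesos. This is illustrative only; the 256k value is just an
example of "tune it lower" than the 1M default, not a recommendation from the
thread:

    // Sketch of the fine-grained tuning described above. Assumes the
    // (now deprecated) fine-grained Mesos scheduler is still in use.
    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .set("spark.mesos.coarse", "false")              // stay on the fine-grained scheduler
      .set("spark.mesos.mesosExecutor.cores", "0")     // idle executors hold no CPU cores
      .set("spark.task.maxDirectResultSize", "262144") // example: 256k instead of the 1M default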