@Andrew Or I assume you are referring to this ticket [SPARK-5095]: https://issues.apache.org/jira/browse/SPARK-5095

Thank you!
Best Regards,
Jerry

> On Nov 23, 2015, at 2:41 PM, Andrew Or <and...@databricks.com> wrote:
>
> @Jerry Lam
>
>> Can someone confirm if it is true that dynamic allocation on mesos "is
>> designed to run one executor per slave with the configured amount of
>> resources." I copied this sentence from the documentation. Does this mean
>> there is at most 1 executor per node? Therefore, if you have a big machine,
>> you need to allocate a fat executor on this machine in order to fully
>> utilize it?
>
> Mesos inherently does not support multiple executors per slave currently.
> This is actually not related to dynamic allocation. There is, however, an
> outstanding patch to add support for multiple executors per slave. When that
> feature is merged, it will work well with dynamic allocation.
>
> 2015-11-23 9:27 GMT-08:00 Adam McElwee <a...@mcelwee.me>:
>
>> On Mon, Nov 23, 2015 at 7:36 AM, Iulian Dragoș <iulian.dra...@typesafe.com> wrote:
>>
>>> On Sat, Nov 21, 2015 at 3:37 AM, Adam McElwee <a...@mcelwee.me> wrote:
>>>
>>>> I've used fine-grained mode on our mesos spark clusters until this week,
>>>> mostly because it was the default. I started trying coarse-grained because
>>>> of the recent chatter on the mailing list about wanting to move the mesos
>>>> execution path to coarse-grained only. The odd thing is, coarse-grained vs
>>>> fine-grained seems to yield drastically different cluster utilization
>>>> metrics for any of our jobs that I've tried out this week.
>>>>
>>>> If this is best as a new thread, please let me know, and I'll try not to
>>>> derail this conversation. Otherwise, details below:
>>>
>>> I think it's ok to discuss it here.
>>>
>>>> We monitor our spark clusters with ganglia, and historically, we maintain
>>>> at least 90% cpu utilization across the cluster. Making a single
>>>> configuration change to use coarse-grained execution instead of
>>>> fine-grained consistently yields a cpu utilization pattern that starts
>>>> around 90% at the beginning of the job, and then slowly decreases over the
>>>> next 1-1.5 hours to level out around 65% cpu utilization on the cluster.
>>>> Does anyone have a clue why I'd be seeing such a negative effect of
>>>> switching to coarse-grained mode? GC activity is comparable in both cases.
>>>> I've tried 1.5.2, as well as the 1.6.0 preview tag that's on github.
>>>
>>> I'm not very familiar with Ganglia, and how it computes utilization. But
>>> one thing comes to mind: did you enable dynamic allocation
>>> <https://spark.apache.org/docs/latest/running-on-mesos.html#dynamic-resource-allocation-with-mesos>
>>> on coarse-grained mode?
>>
>> Dynamic allocation is definitely not enabled. The only delta between runs
>> is adding --conf "spark.mesos.coarse=true" to the job submission. Ganglia
>> is just pulling stats from the procfs, and I've never seen it report bad
>> results. If I sample any of the 100-200 nodes in the cluster, dstat
>> reflects the same average cpu that I'm seeing reflected in ganglia.
>>
>>> iulian
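For reference, the single-flag delta Adam describes would look roughly like this in a spark-submit invocation. This is a sketch, not Adam's actual command: the master URL, class name, and jar are placeholders; only the spark.mesos.coarse property comes from the thread.

```shell
# Sketch of the one-property delta between the two runs. Everything except
# spark.mesos.coarse is a placeholder; fine-grained was the Mesos default
# in Spark 1.x, so omitting the flag reproduces the fine-grained run.
spark-submit \
  --master mesos://zk://zk1:2181,zk2:2181,zk3:2181/mesos \
  --conf "spark.mesos.coarse=true" \
  --class com.example.MyJob \
  my-job-assembly.jar
```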
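Iulian's dynamic-allocation question refers to configuration along these lines (a sketch based on the Spark-on-Mesos docs of that era; the thread itself sets none of these, and Adam confirms dynamic allocation was off):

```shell
# spark-defaults.conf sketch -- enabling dynamic allocation on Mesos.
# Dynamic allocation also requires the external shuffle service to be
# running on each slave so shuffle files outlive removed executors.
spark.dynamicAllocation.enabled   true
spark.shuffle.service.enabled     true
```

On Mesos, the external shuffle service has to be started separately on each slave (Spark ships a start script for this under sbin/) before executors can be released safely.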