@Andrew Or I assume you are referring to this ticket [SPARK-5095]: https://issues.apache.org/jira/browse/SPARK-5095

Thank you!
Best Regards,
Jerry

> On Nov 23, 2015, at 2:41 PM, Andrew Or <and...@databricks.com> wrote:
>
> @Jerry Lam
>
>> Can someone confirm if it is true that dynamic allocation on mesos "is
>> designed to run one executor per slave with the configured amount of
>> resources." I copied this sentence from the documentation. Does this mean
>> there is at most 1 executor per node? Therefore, if you have a big machine,
>> you need to allocate a fat executor on this machine in order to fully
>> utilize it?
>
> Mesos inherently does not support multiple executors per slave currently.
> This is actually not related to dynamic allocation. There is, however, an
> outstanding patch to add support for multiple executors per slave. When that
> feature is merged, it will work well with dynamic allocation.
>
> 2015-11-23 9:27 GMT-08:00 Adam McElwee <a...@mcelwee.me>:
>
>> On Mon, Nov 23, 2015 at 7:36 AM, Iulian Dragoș <iulian.dra...@typesafe.com> wrote:
>>
>>> On Sat, Nov 21, 2015 at 3:37 AM, Adam McElwee <a...@mcelwee.me> wrote:
>>>
>>>> I've used fine-grained mode on our mesos spark clusters until this week,
>>>> mostly because it was the default. I started trying coarse-grained because
>>>> of the recent chatter on the mailing list about wanting to move the mesos
>>>> execution path to coarse-grained only. The odd thing is, coarse-grained vs
>>>> fine-grained seems to yield drastically different cluster utilization
>>>> metrics for any of our jobs that I've tried out this week.
>>>>
>>>> If this is best as a new thread, please let me know, and I'll try not to
>>>> derail this conversation. Otherwise, details below:
>>>
>>> I think it's ok to discuss it here.
>>>
>>>> We monitor our spark clusters with ganglia, and historically, we maintain
>>>> at least 90% cpu utilization across the cluster. Making a single
>>>> configuration change to use coarse-grained execution instead of
>>>> fine-grained consistently yields a cpu utilization pattern that starts
>>>> around 90% at the beginning of the job, and then slowly decreases over the
>>>> next 1-1.5 hours to level out around 65% cpu utilization on the cluster.
>>>> Does anyone have a clue why I'd be seeing such a negative effect of
>>>> switching to coarse-grained mode? GC activity is comparable in both cases.
>>>> I've tried 1.5.2, as well as the 1.6.0 preview tag that's on github.
>>>
>>> I'm not very familiar with Ganglia, and how it computes utilization. But
>>> one thing comes to mind: did you enable dynamic allocation
>>> <https://spark.apache.org/docs/latest/running-on-mesos.html#dynamic-resource-allocation-with-mesos>
>>> on coarse-grained mode?
>>
>> Dynamic allocation is definitely not enabled. The only delta between runs
>> is adding --conf "spark.mesos.coarse=true" to the job submission. Ganglia
>> is just pulling stats from the procfs, and I've never seen it report bad
>> results. If I sample any of the 100-200 nodes in the cluster, dstat
>> reflects the same average cpu that I'm seeing reflected in ganglia.
>>
>>> iulian
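For reference, the single-flag delta Adam describes would look roughly like this in a spark-submit invocation. This is a sketch, not Adam's actual command: the master URL, class name, and jar are placeholders; only the spark.mesos.coarse property comes from the thread.

```shell
# Sketch of the one-property delta between the two runs. Everything except
# spark.mesos.coarse is a placeholder; fine-grained was the Mesos default
# in Spark 1.x, so omitting the flag reproduces the fine-grained run.
spark-submit \
  --master mesos://zk://zk1:2181,zk2:2181,zk3:2181/mesos \
  --conf "spark.mesos.coarse=true" \
  --class com.example.MyJob \
  my-job-assembly.jar
```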
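Iulian's dynamic-allocation question refers to configuration along these lines (a sketch based on the Spark-on-Mesos docs of that era; the thread itself sets none of these, and Adam confirms dynamic allocation was off):

```shell
# spark-defaults.conf sketch -- enabling dynamic allocation on Mesos.
# Dynamic allocation also requires the external shuffle service to be
# running on each slave so shuffle files outlive removed executors.
spark.dynamicAllocation.enabled   true
spark.shuffle.service.enabled     true
```

On Mesos, the external shuffle service has to be started separately on each slave (Spark ships a start script for this under sbin/) before executors can be released safely.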