Hi Utkarsh,

Just to be sure: did you originally set coarse to false and then switch it to
true, or is it the other way around?

Also, what was the exception/stack trace when the driver crashed?

Coarse-grained mode pre-starts all the Spark executor backends, so it has the
least overhead compared to fine-grained mode. There is no single answer for
which mode you should use; it depends on your use case, otherwise we would
have removed one of the two modes.
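
For reference, a minimal sketch of how the mode is selected when building the
context (the master URL, core cap and memory below are placeholders, not
recommendations):

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    // Sketch only: choose coarse- or fine-grained mode via the conf.
    SparkConf conf = new SparkConf()
        .setAppName("cartesian-job")                  // placeholder name
        .setMaster("mesos://<mesos-master>:5050")     // your Mesos master
        .set("spark.mesos.coarse", "true")            // "false" = fine-grained
        .set("spark.cores.max", "8")                  // caps cores in coarse mode
        .set("spark.executor.memory", "4g");          // placeholder sizing
    JavaSparkContext sc = new JavaSparkContext(conf);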

There are quite a few factors that can cause huge GC pauses, but I don't
think your GC pauses would go away just by switching to standalone.
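
Before changing cluster managers, it might be worth turning on GC logging on
the executors so you can see where the time goes. A minimal sketch (these are
standard HotSpot flags, not Spark-specific settings):

    // Sketch: surface GC activity in the executor stderr logs.
    // Reuses the SparkConf from the snippet above.
    conf.set("spark.executor.extraJavaOptions",
             "-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps");

The output ends up in each executor's stderr, which you can get to from the
Mesos UI.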

Tim

On Mon, Sep 21, 2015 at 5:18 PM, Utkarsh Sengar <utkarsh2...@gmail.com>
wrote:

> I am running Spark 1.4.1 on mesos.
>
> The spark job does a "cartesian" of 4 RDDs (aRdd, bRdd, cRdd, dRdd) of
> size 100, 100, 7 and 1 respectively. Let's call it productRDD.
>
> Creating "aRdd" needs data pulled from multiple data sources and merged
> into a tuple, so aRdd ends up looking something like this:
> JavaRDD<Tuple4<A1, A2>>
> bRdd, cRdd and dRdd are just List<>s of values.
>
> Then I apply a transformation on productRDD and finally call
> "saveAsTextFile" to save the result of the transformation.
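>
> For context, a rough sketch of the chain (the element types A, B, C, D,
> the transformation and the output path are placeholders):
>
>     import org.apache.spark.api.java.JavaPairRDD;
>     import org.apache.spark.api.java.JavaRDD;
>     import scala.Tuple2;
>
>     // aRdd, bRdd, cRdd, dRdd hold 100, 100, 7 and 1 elements respectively.
>     JavaPairRDD<Tuple2<A, B>, Tuple2<C, D>> productRDD =
>         aRdd.cartesian(bRdd)                  // 100 x 100 pairs
>             .cartesian(cRdd.cartesian(dRdd)); // x (7 x 1) pairs
>
>     productRDD.map(t -> transform(t))               // placeholder transformation
>               .saveAsTextFile("/placeholder/path"); // placeholder output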
>
> Problem:
> By setting "spark.mesos.coarse=true", creation of "aRdd" works fine, but
> the driver crashes while doing the cartesian; yet when I set
> "spark.mesos.coarse=true", the job works like a charm. I am running spark
> on mesos.
>
> Comments:
> So I wanted to understand what role "spark.mesos.coarse=true" plays in
> terms of memory and compute performance. My findings look counterintuitive
> since:
>
>    1. "spark.mesos.coarse=true" runs everything in just 1 mesos task, so
>    it should avoid the overhead of spinning up mesos tasks, which should
>    help performance.
>    2. What setting for "spark.mesos.coarse" is recommended for running
>    spark on mesos? Or is there no best answer and it depends on the use case?
>    3. Also, with "spark.mesos.coarse=true", I notice that I get huge GC
>    pauses even with a small dataset on a long-running job (but this can be
>    a separate discussion).
>
> Let me know if I am missing something obvious; we are learning spark tuning
> as we move forward :)
>
> --
> Thanks,
> -Utkarsh
>
