That is precisely my question: what kind of leads can I look at to get a
hint of where the inefficiencies lie?
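
For example, one concrete lead is to measure the gaps themselves. The Spark
UI exposes a REST API with per-job submission and completion times, so you
can list the idle intervals between jobs and see whether the dead time falls
between jobs (driver-side work, I/O outside Spark) or inside them (skew,
scheduling delay). A minimal sketch, assuming the default driver UI at
localhost:4040 and a single running application; the host, port, and the
5-second threshold below are placeholders:

import json
import urllib.request
from datetime import datetime

BASE = "http://localhost:4040/api/v1"  # assumption: default Spark driver UI port

def get(path):
    # Fetch and decode one JSON payload from the Spark monitoring REST API.
    with urllib.request.urlopen(BASE + path) as resp:
        return json.load(resp)

def ts(s):
    # The API reports timestamps like "2018-11-15T14:51:00.000GMT".
    return datetime.strptime(s.replace("GMT", ""), "%Y-%m-%dT%H:%M:%S.%f")

app_id = get("/applications")[0]["id"]
jobs = sorted(get("/applications/" + app_id + "/jobs"),
              key=lambda j: j["submissionTime"])

prev_end = None
for job in jobs:
    start = ts(job["submissionTime"])
    if prev_end and (start - prev_end).total_seconds() > 5:  # placeholder threshold
        gap = (start - prev_end).total_seconds()
        print(f"idle {gap:.1f}s before job {job['jobId']} ({job['name']})")
    if "completionTime" in job:
        prev_end = ts(job["completionTime"])

If the gaps line up with job boundaries, the time is likely going to the
driver or to work outside Spark; if they fall inside a job, the stage pages
in the UI (task counts, durations, skew) are the next place to look.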

On Thu, Nov 15, 2018 at 4:56 PM David Markovitz <
dudu.markov...@microsoft.com> wrote:

> It seems it is almost fully utilized – when it is active.
>
> What happens in the gaps, where there is no Spark activity?
>
>
>
> Best regards,
>
>
>
> David (דודו) Markovitz
>
> Technology Solutions Professional, Data Platform
>
> Microsoft Israel
>
>
>
> Mobile: +972-525-834-304
>
> Office: +972-747-119-274
>
>
> *From:* Vitaliy Pisarev <vitaliy.pisa...@biocatch.com>
> *Sent:* Thursday, November 15, 2018 4:51 PM
> *To:* user <user@spark.apache.org>
> *Cc:* David Markovitz <dudu.markov...@microsoft.com>
> *Subject:* How to address seemingly low core utilization on a spark
> workload?
>
>
>
> I have a workload that runs on a cluster of 300 cores.
>
> Below is a plot of the number of active tasks over time during the
> execution of this workload:
>
>
>
> [image: plot of active tasks over time]
>
>
>
> What I deduce is that there are substantial intervals where the cores are
> heavily under-utilised.
>
>
>
> What actions can I take to:
>
>    - Increase the efficiency (== core utilisation) of the cluster?
>    - Understand the root causes behind the drops in core utilisation?
>
>
