Hi Pedro,

I have also started using AWS EMR with Spark 2.4.0, and I'm looking for
performance-tuning methods as well.

Do you configure dynamic allocation?

FYI:
https://spark.apache.org/docs/latest/job-scheduling.html#dynamic-resource-allocation

I haven't tested it yet. I guess spark-submit still needs to specify the
number of executors, or at least the min/max bounds for dynamic allocation.
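
For reference, this is the kind of spark-submit invocation I am planning to
try first. It is untested on my side; the min/max values are just placeholders
to tune per cluster, and com.example.MyJob / my-job.jar are made-up names:

  spark-submit \
    --conf spark.dynamicAllocation.enabled=true \
    --conf spark.shuffle.service.enabled=true \
    --conf spark.dynamicAllocation.minExecutors=2 \
    --conf spark.dynamicAllocation.maxExecutors=20 \
    --class com.example.MyJob my-job.jar

If I read the docs correctly, with dynamic allocation enabled you don't fix
--num-executors; Spark scales the executor count between the min and max based
on the backlog of pending tasks. An external shuffle service must be running,
hence the second flag.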
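
Regarding the one-core-per-executor behavior you describe below: until someone
confirms what changed in emr-5.20, a workaround might be to pin the values
explicitly. The 2 cores / 10g below only mirror the example in your mail, so
adjust them to your instance type:

  spark-submit \
    --conf spark.executor.cores=2 \
    --conf spark.executor.memory=10g \
    ... (rest of your usual submit arguments)

On EMR you should also be able to set these cluster-wide via the
spark-defaults configuration classification when creating the cluster, so
every step picks them up.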

Regards,
Hiroyuki

On Fri, Feb 1, 2019 at 5:23, Pedro Tuero (tuerope...@gmail.com) wrote:

> Hi guys,
> I usually run Spark jobs on AWS EMR.
> Recently I switched from AWS EMR label 5.16 to 5.20 (which uses Spark 2.4.0).
> I've noticed that a lot of steps are taking longer than before.
> I think it is related to the automatic configuration of cores per executor.
> In version 5.16, some executors took more cores if the instance allowed it.
> Say an instance had 8 cores and 40 GB of RAM, and the RAM configured per
> executor was 10 GB; then AWS EMR automatically assigned 2 cores per executor.
> Now in label 5.20, unless I configure the number of cores manually, only
> one core is assigned per executor.
>
> I don't know if it is related to Spark 2.4.0 or if it is something managed
> by AWS...
> Does anyone know if there is a way to automatically use more cores when it
> is physically possible?
>
> Thanks,
> Peter.
>
