Hi Pedro,
I have also started using AWS EMR, with Spark 2.4.0, and I'm looking into performance-tuning methods. Do you configure dynamic allocation? FYI: https://spark.apache.org/docs/latest/job-scheduling.html#dynamic-resource-allocation

I haven't tested it yet. I guess spark-submit would need to specify the number of executors.

Regards,
Hiroyuki

On Fri, Feb 1, 2019 at 5:23, Pedro Tuero (tuerope...@gmail.com) wrote:
> Hi guys,
> I usually run Spark jobs on AWS EMR.
> Recently I switched from AWS EMR label 5.16 to 5.20 (which uses Spark 2.4.0).
> I've noticed that a lot of steps are taking longer than before.
> I think it is related to the automatic configuration of cores per executor.
> In version 5.16, some executors took more cores if the instance allowed it.
> Say an instance had 8 cores and 40 GB of RAM, and the RAM configured per
> executor was 10 GB; then AWS EMR automatically assigned 2 cores per executor.
> Now with label 5.20, unless I configure the number of cores manually, only
> one core is assigned per executor.
>
> I don't know if it is related to Spark 2.4.0 or if it is something managed
> by AWS...
> Does anyone know if there is a way to automatically use more cores when it
> is physically possible?
>
> Thanks,
> Peter.
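As a rough sketch of the two knobs being discussed (enabling dynamic allocation, and pinning cores per executor instead of relying on the EMR default), a spark-submit invocation might look like the following. The class name, JAR name, and min/max executor values are placeholders, not from the original thread:

```shell
# Sketch only: combines dynamic allocation with an explicit cores-per-executor
# setting, which works around the observed EMR 5.20 default of 1 core per executor.
# Dynamic allocation on YARN (Spark 2.4) also requires the external shuffle service.
spark-submit \
  --master yarn \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.minExecutors=1 \
  --conf spark.dynamicAllocation.maxExecutors=20 \
  --executor-memory 10g \
  --executor-cores 2 \
  --class com.example.MyJob \
  my-job.jar
```

With dynamic allocation enabled, a fixed `--num-executors` is not required; Spark scales the executor count between the configured min and max based on pending tasks.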