Hi,

I recently created a Spark cluster on AWS EMR using an instance fleet
configuration with hybrid instance types. The instance types in this
cluster vary depending on which types are available.

While running the same Spark applications that previously ran on a
homogeneous cluster (some PySpark apps doing DataFrame operations), I've
observed a big spike in the number of tasks (roughly 1000x).

I would really appreciate it if someone could point me in the right
direction. I'm having a hard time understanding how a cluster with hybrid
instance types can affect the number of tasks.

The cluster was running Spark 2.4.6 with:
spark.sql.adaptive.enabled=true
spark.dynamicAllocation.enabled=true

Kind Regards
