I have a 3-node EC2 cluster, with 18G assigned to spark-executor-mem on
each node. After I run my Spark batch job, I get two RDDs from different
forks, both with the exact same format. When I perform a union on them, I
get an executor disassociated error and the whole Spark job fails and
quits. Memory shouldn't be the problem, so my question is: does the union
function cause any data shuffling?
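
To make the question concrete, here is a toy model in plain Python of what I
would expect union to do if it does not shuffle. This is not Spark code: the
Partitions type, union_without_shuffle, rdd_a and rdd_b are all hypothetical
names, and the "just list the partitions together" behavior is my assumption
about union, not something the logs confirm.

```python
from typing import List

# Hypothetical stand-in for an RDD: a list of partitions, each a list of records.
Partitions = List[List[int]]


def union_without_shuffle(left: Partitions, right: Partitions) -> Partitions:
    """Model a shuffle-free union: keep every partition intact, list them together."""
    return [list(p) for p in left] + [list(p) for p in right]


# Two inputs with the same record format but different partition counts.
rdd_a: Partitions = [[1, 2], [3, 4], [5]]
rdd_b: Partitions = [[6, 7], [8]]

combined = union_without_shuffle(rdd_a, rdd_b)

# Under this model the partition count is the sum of the inputs,
# and no record has moved to a different partition.
print(len(combined))
print(combined)
```

If union really behaves like this, the failure would have to come from
whatever runs on the combined data afterwards rather than from the union
itself.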
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/apache-spark-union-function-cause-executors-disassociate-Lost-executor-1-on-172-32-1-12-remote-Akka--tp15442p15444.html
Sent from the Apache Spark User List
19:02:45,963 INFO [org.apache.spark.MapOutputTrackerMaster] (spark-akka.actor.default-dispatcher-14) Size of output statuses for shuffle 1 is 216 bytes
19:02:45,964 INFO [org.apache.spark.scheduler.DAGScheduler] (spark-akka.actor.default-dispatcher-14) Got job 5 (getCallSite at null:-1) with