Re: Out of memory HDFS Multiple Cluster Write

2019-12-21 Thread Chris Teoh
I'm not entirely sure what the behaviour is when writing to a remote cluster. It could be that connections are being established for every element in your DataFrame; using foreachPartition instead may reduce the number of connections. You may have to look at what the executors do when
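A minimal sketch of the foreachPartition pattern Chris suggests, assuming df is the DataFrame being written and hdfs://remote-nn:8020 stands in for the real remote NameNode URI; the point is that one connection is opened per partition rather than per row:

```scala
import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// One connection per partition instead of one per record.
df.rdd.foreachPartition { rows =>
  // Assumed remote NameNode address; replace with the real cluster URI.
  val fs = FileSystem.get(new URI("hdfs://remote-nn:8020"), new Configuration())
  val out = fs.create(new Path(s"/tmp/out-part-${java.util.UUID.randomUUID}"))
  try {
    rows.foreach(row => out.write((row.mkString(",") + "\n").getBytes("UTF-8")))
  } finally {
    out.close() // FileSystem handles are cached internally; close only the stream
  }
}
```

With this shape the connection cost scales with the number of partitions (40 here) rather than the number of records.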

Out of memory HDFS Multiple Cluster Write

2019-12-21 Thread Ruijing Li
I managed to make the failing stage work by increasing memoryOverhead to something ridiculous (> 50%). Our spark.executor.memory = 12GB, and I bumped spark.mesos.executor.memoryOverhead to 8G. *Can someone explain why this solved the issue?* As I understand it, memoryOverhead is for VM overhead
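A sketch of those settings expressed as a SparkSession builder, for reference; note that spark.mesos.executor.memoryOverhead is specified in MiB, so 8G is written as 8192 here. The Mesos container request becomes roughly 12g heap + 8g overhead = 20g per executor:

```scala
import org.apache.spark.sql.SparkSession

// Settings from the thread: 12g JVM heap plus 8g off-heap overhead means the
// Mesos container request is roughly 20g per executor.
val spark = SparkSession.builder()
  .appName("multi-cluster-hdfs-write")
  .config("spark.executor.memory", "12g")
  .config("spark.mesos.executor.memoryOverhead", "8192") // value is in MiB
  .getOrCreate()
```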

Re: Out of memory HDFS Multiple Cluster Write

2019-12-21 Thread Ruijing Li
Not for the stage that fails; all it does is read and write. The number of tasks is # of cores * # of executor instances, which for us is 60 (3 cores, 20 executors). For the failing stage's input, when Spark reads the 20 files of 132MB each, it comes out to 40 partitions. On Fri,
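The 40-partition figure is consistent with Spark's default file-split size: assuming spark.sql.files.maxPartitionBytes at its 128MB default, each 132MB file splits into two partitions, and 20 files × 2 = 40. A rough check of that arithmetic (real split planning also weighs spark.sql.files.openCostInBytes, but the split size dominates here):

```scala
// Rough check of the 40-partition figure, assuming the default
// spark.sql.files.maxPartitionBytes of 128MB (134217728 bytes).
val maxPartitionBytes = 128L * 1024 * 1024
val fileSize          = 132L * 1024 * 1024
val numFiles          = 20
val splitsPerFile   = math.ceil(fileSize.toDouble / maxPartitionBytes).toInt // 2
val totalPartitions = numFiles * splitsPerFile // 20 * 2 = 40
```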