Hello All,
I have a Spark job that fails with "java.lang.OutOfMemoryError: GC overhead limit exceeded". The job processes a single 4.5 GB file. I've tried the following Spark configuration:

--num-executors 6 --executor-memory 6G --executor-cores 6 --driver-memory 3G

Increasing the number of executors and cores sometimes works, but the job then takes over 20 minutes to process the file. Is there anything I can do to improve the performance, or to avoid the Java heap issue?

Thank you.

Best regards,
Raj
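P.S. For context, the full submit command looks roughly like the sketch below. The master, class name, jar, and input path are placeholders, not the real values from my cluster:

spark-submit \
  --master yarn \
  --num-executors 6 \
  --executor-memory 6G \
  --executor-cores 6 \
  --driver-memory 3G \
  --class com.example.ProcessFile \
  process-file.jar /data/input/file-4.5g.dat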