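The JVM will eventually give up with an OutOfMemoryError ("GC overhead limit exceeded") once it is spending nearly all of its time on GC (the HotSpot default threshold is about 98%), but you usually have to wait a long time before that happens, particularly if your job is already at massive scale.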
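If you want a rough number for how much time a JVM has spent in GC, something like the sketch below might help. It uses the standard java.lang.management beans; the class name GcOverhead and the method gcTimeFraction are made up for illustration. Note that it reports the cumulative fraction since JVM start, which is not the same measurement the JVM uses for its own overhead limit, and with concurrent collectors the reported time may not be pure pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of Spark or the JDK.
public class GcOverhead {

    // Cumulative GC time divided by JVM uptime (a rough estimate only).
    static double gcTimeFraction() {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // returns -1 if the collector does not report it
            if (t > 0) {
                gcMillis += t;
            }
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis > 0 ? (double) gcMillis / uptimeMillis : 0.0;
    }

    public static void main(String[] args) {
        System.out.printf("GC time so far: %.1f%% of uptime%n", 100 * gcTimeFraction());
    }
}
```

A value that keeps climbing while the job makes no visible progress would be consistent with the GC thrashing described above, though it is only a hint and not a diagnosis.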
Since it is hard to run profiling online, it may be easier to debug if you split the job into a lot of partitions (so you can watch the progress bar advance) and post the last log lines from before it froze.
