Hi, I ran Spark in standalone mode on a cluster and it went well for approximately an hour, then the driver's output stopped with the following:
14/07/24 08:07:36 INFO MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 36 to spark@worker5.local:47416
14/07/24 08:07:36 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 36 is 265 bytes
14/07/24 08:30:04 INFO MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 39 to spark@worker5.local:47416
14/07/24 08:30:04 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 39 is 265 bytes

Then I checked the Spark UI and found only one active task. I then checked that worker's stderr, and it seemed the worker had fallen into a loop:

14/07/24 09:18:18 INFO BlockManager: Found block rdd_14_3 locally
14/07/24 09:18:18 INFO BlockManager: Found block rdd_14_3 locally
14/07/24 09:18:18 INFO BlockManager: Found block rdd_14_3 locally
14/07/24 09:18:18 INFO BlockFetcherIterator$BasicBlockFetcherIterator: maxBytesInFlight: 50331648, targetRequestSize: 10066329
14/07/24 09:18:18 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Getting 0 non-empty blocks out of 28 blocks
14/07/24 09:18:18 INFO BlockFetcherIterator$BasicBlockFetcherIterator: Started 0 remote fetches in 0 ms

These messages were output repeatedly. What should I do to fix this? I have run the program multiple times, and sooner or later it ends up in this state. I also tried increasing the memory, which didn't help.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-got-stuck-with-a-loop-tp10590.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
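P.S. One workaround I am considering is enabling speculative execution, so that a task stuck on one worker gets relaunched elsewhere. This is only a guess at a mitigation, not a fix for the root cause; the `spark.speculation*` property names are standard Spark configuration options, but the master URL and application jar below are placeholders for my setup:

```shell
# Sketch: ask Spark to speculatively re-launch slow/stuck tasks.
# spark.speculation* are standard Spark config properties; whether
# they help with this particular hang is an assumption on my part.
# master URL and myapp.jar are placeholders, not real paths.
spark-submit \
  --master spark://master.local:7077 \
  --conf spark.speculation=true \
  --conf spark.speculation.interval=1000 \
  --conf spark.speculation.multiplier=1.5 \
  myapp.jar
```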