This is a long-running Spark Streaming job on YARN (Spark v1.0.2 on CDH5).
The job runs for about 34-37 hours and then dies with this
FileNotFoundException. CPU and RAM usage are very low; I'm running 2 cores,
2 executors, 4g memory, in YARN cluster mode.
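For reference, a deployment like the one described above would be submitted roughly as follows; this is a sketch, and the jar name and main class are placeholders, not taken from the original post:

```shell
# Sketch of a spark-submit invocation matching the setup described above
# (Spark 1.0.x on YARN). Jar and class names are hypothetical placeholders.
spark-submit \
  --master yarn-cluster \
  --num-executors 2 \
  --executor-cores 2 \
  --executor-memory 4g \
  --class com.example.StreamingJob \
  my-streaming-job.jar
```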
Here’s the stack trace:
This is likely due to a bug in shuffle file consolidation (which you have
enabled), hopefully fixed in 1.1 by this patch:
https://github.com/apache/spark/commit/78f2af582286b81e6dc9fa9d455ed2b369d933bd
Until 1.0.3 or 1.1 is released, the simplest solution is to disable shuffle
file consolidation (spark.shuffle.consolidateFiles).
Thanks, I’ll go ahead and disable that setting for now.
From: Aaron Davidson <ilike...@gmail.com>
Date: Wednesday, August 20, 2014 at 3:20 PM
To: Silvio Fiorito <silvio.fior...@granturing.com>