To: user@spark.apache.org
Subject: Re: Stage failure in BlockManager due to FileNotFoundException on
long-running streaming job
This is likely due to a bug in shuffle file consolidation (which you have
enabled) that was hopefully fixed in 1.1 with this patch:
https://github.com/apache/spark/commit/78f2af582286b81e6dc9fa9d455ed2b369d933bd
Until 1.0.3 or 1.1 is released, the simplest workaround is to disable
spark.shuffle.consolidateFiles.
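
In case it helps, here's a minimal sketch of what that workaround might look
like in the driver code (the app name and batch interval are placeholders,
not taken from your job; consolidateFiles defaults to false, so this just
makes sure it stays off):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // spark.shuffle.consolidateFiles defaults to false; set it explicitly
    // to keep consolidation off until the fix is released.
    val conf = new SparkConf()
      .setAppName("MyStreamingJob") // placeholder app name
      .set("spark.shuffle.consolidateFiles", "false")

    val ssc = new StreamingContext(conf, Seconds(10)) // placeholder batch interval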
This is a long-running Spark Streaming job on YARN, Spark v1.0.2 on CDH5.
The job runs for about 34-37 hours and then dies with this
FileNotFoundException. There's very little CPU or RAM usage; I'm running 2
cores, 2 executors, 4g memory, in YARN cluster mode.
Here's the stack trace: