Re: SparkStreaming job breaks with shuffle file not found

2018-03-28 Thread Lucas Kacher
I have been running into this as well, but I am using S3 for checkpointing, so I chalked it up to network partitioning with S3 (which isn't HDFS) as my storage location. But it seems that you are indeed using HDFS, so I wonder if there is another underlying issue. On Wed, Mar 28, 2018 at 8:21 AM, Jone

SparkStreaming job breaks with shuffle file not found

2018-03-28 Thread Jone Zhang
The Spark Streaming job ran for a few days, then failed as shown below. What is the possible reason? *18/03/25 07:58:37 ERROR yarn.ApplicationMaster: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 16 in stage 80018.0 failed 4 times, most recent