I think that is a bug. I have seen it a lot, especially with long-running jobs where Spark skips many stages because it already has pre-computed results. Some of these stages are never marked as completed, even though in reality they are. I figured this out because I was using the interactive shell (spark-shell): the shell returned to the prompt, indicating the job had finished, even though the UI still showed many active jobs and tasks. And my output was correct.
Is there a JIRA item tracking this?

From: Kuchekar [mailto:kuchekar.nil...@gmail.com]
Sent: Wednesday, November 16, 2016 10:00 AM
To: spark users <user@spark.apache.org>
Subject: Spark UI shows Jobs are processing, but the files are already written to S3

Hi,

I am running a Spark job which saves the computed data (a large volume of data) to S3. On the Spark UI I see that some jobs are still active, but there is no activity in the logs. Also, on S3 all the data has already been written (I verified each bucket: it has a _SUCCESS file). Am I missing something?

Thanks,
Kuchekar, Nilesh
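
For what it's worth, instead of checking each bucket by hand, one way to confirm the _SUCCESS markers is to probe the output directories from spark-shell with the Hadoop FileSystem API. This is only a rough sketch; the s3a paths below are placeholders, and it assumes the shell's SparkContext (sc) and working S3 credentials:

    import org.apache.hadoop.fs.{FileSystem, Path}

    // Hypothetical output locations; substitute the real job output paths.
    val outputPaths = Seq(
      "s3a://my-bucket/output/part1",
      "s3a://my-bucket/output/part2"
    )

    outputPaths.foreach { dir =>
      val path = new Path(dir)
      // Resolve the filesystem for this path from the active Hadoop configuration.
      val fs = path.getFileSystem(sc.hadoopConfiguration)
      // The output committer writes a _SUCCESS marker once the write has committed.
      val committed = fs.exists(new Path(path, "_SUCCESS"))
      println(s"$dir -> ${if (committed) "committed (_SUCCESS present)" else "no _SUCCESS marker"}")
    }

If every path reports a _SUCCESS marker, the writes have committed regardless of what the UI still shows as active.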