I have encountered the same problem after migrating from 1.2.2 to 1.3.0. After some searching, this appears to be a bug introduced in 1.3. Hopefully it's fixed in 1.4.
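In the meantime, one workaround I've seen suggested is to check whether the event log for the finished application still carries the ".inprogress" suffix in the event log directory, and rename it so the history server will treat the application as completed. The path and application ID below are placeholders; substitute your own spark.eventLog.dir setting and the app ID shown in the Spark GUI.

```shell
# List the event log directory (replace /spark-events with your
# spark.eventLog.dir value).
hdfs dfs -ls /spark-events/

# If the completed job's log still ends in .inprogress, renaming it may
# let the history server pick it up on its next scan. app-20150908-0001
# is a hypothetical application ID.
hdfs dfs -mv /spark-events/app-20150908-0001.inprogress \
             /spark-events/app-20150908-0001
```

This only changes how the history server sees the log; it doesn't recover any events that were lost when the machine was terminated, so the displayed history may still be incomplete.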
Thanks,
Charles

On 9/9/15, 7:30 AM, "David Rosenstrauch" <dar...@darose.net> wrote:

>Standalone.
>
>On 09/08/2015 11:18 PM, Jeff Zhang wrote:
>> What cluster mode do you use? Standalone/Yarn/Mesos?
>>
>> On Wed, Sep 9, 2015 at 11:15 AM, David Rosenstrauch <dar...@darose.net>
>> wrote:
>>
>>> Our Spark cluster is configured to write application history event
>>> logging to a directory on HDFS. This all works fine. (I've tested it
>>> with Spark shell.)
>>>
>>> However, on a large, long-running job that we ran tonight, one of our
>>> machines at the cloud provider had issues and had to be terminated and
>>> replaced in the middle of the job.
>>>
>>> The job completed correctly, and shows in state FINISHED in the
>>> "Completed Applications" section of the Spark GUI. However, when I try
>>> to look at the application's history, the GUI says "Application history
>>> not found" and "Application ... is still in progress".
>>>
>>> The reason appears to be the machine that was terminated. When I click
>>> on the executor list for that job, Spark is showing the executor from
>>> the terminated machine as still in state RUNNING.
>>>
>>> Any solution/workaround for this? BTW, I'm running Spark v1.3.0.
>>>
>>> Thanks,
>>>
>>> DR
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> For additional commands, e-mail: user-h...@spark.apache.org