It looks like the Spark history server should take the lost executors into account by analyzing the output of the 'yarn logs -applicationId' command.
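For anyone who wants to script this in the meantime, here is a rough sketch, not a tested tool. It builds the `yarn logs` command line and splits the aggregated output per container so the lost executors' stderr can be read separately. The helper names (`yarn_logs_command`, `split_by_container`) are made up for illustration, and the parser assumes each container section in the aggregated output starts with a line of the form `Container: <container-id> on <node>`; check that against the output of your Hadoop version. Aggregated logs are only available when log aggregation is enabled on the cluster and the application has finished.

```python
import subprocess


def yarn_logs_command(application_id, container_id=None, node_address=None):
    """Build the argument list for `yarn logs` (hypothetical helper)."""
    if not application_id:
        raise ValueError("application_id is required")
    cmd = ["yarn", "logs", "-applicationId", application_id]
    if container_id:
        cmd += ["-containerId", container_id]
    if node_address:
        # Some Hadoop 2.x releases require -nodeAddress with -containerId.
        cmd += ["-nodeAddress", node_address]
    return cmd


def split_by_container(aggregated_text):
    """Group aggregated log text by container id.

    Assumption: every container section begins with a header line
    "Container: <container-id> on <node>".
    """
    sections = {}
    current = None
    for line in aggregated_text.splitlines():
        if line.startswith("Container: "):
            current = line[len("Container: "):].split(" on ")[0].strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {cid: "\n".join(lines) for cid, lines in sections.items()}


def fetch_container_logs(application_id):
    """Run `yarn logs` and return a dict of container id -> log text."""
    result = subprocess.run(
        yarn_logs_command(application_id),
        capture_output=True, text=True, check=True,
    )
    return split_by_container(result.stdout)
```

Calling `fetch_container_logs("<application-id>")` on a machine with the `yarn` client configured should return one entry per container, including the ones that no longer show up in the "Executors" tab.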
Cheers

On Thu, Oct 1, 2015 at 11:46 AM, Lan Jiang <ljia...@gmail.com> wrote:

> Ted,
>
> Thanks for your reply.
>
> First of all, after sending the email to the mailing list, I used yarn logs
> -applicationId <application-id> to retrieve the aggregated log
> successfully. I found the exceptions I was looking for.
>
> Now, as to your suggestion: when I go to the YARN RM UI, I can only see the
> "Tracking URL" in the application overview section. When I click it, it
> brings me to the Spark history server UI, where I cannot find the lost
> executors. The only logs link I can find on the YARN RM site is the
> ApplicationMaster log, which is not what I need. Did I miss something?
>
> Lan
>
> On Thu, Oct 1, 2015 at 1:30 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> Can you go to the YARN RM UI to find all the attempts for this Spark job?
>>
>> The two lost executors should be found there.
>>
>> On Thu, Oct 1, 2015 at 10:30 AM, Lan Jiang <ljia...@gmail.com> wrote:
>>
>>> Hi there,
>>>
>>> When running a Spark job on YARN, 2 executors somehow got lost during
>>> the execution. The message on the history server GUI is “CANNOT find
>>> address”. Two extra executors were launched by YARN and eventually
>>> finished the job. Usually I go to the “Executors” tab on the UI to check
>>> the executor stdout/stderr for troubleshooting. Now if I go to the
>>> “Executors” tab, I do not see the 2 executors that were lost. I can only
>>> see the remaining executors and the 2 new executors. Thus I cannot check
>>> the stdout/stderr of the lost executors. How can I access the log files
>>> of these lost executors to find out why they were lost?
>>>
>>> Thanks
>>>
>>> Lan