[ 
https://issues.apache.org/jira/browse/SPARK-12755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15282452#comment-15282452
 ] 

Meethu Mathew commented on SPARK-12755:
---------------------------------------

Hi,
I am facing similar issues again in 1.6.1 in standalone mode.
1. My completed applications are listed under the incomplete applications
list. Each application was stopped with sc.stop(), and the log directory
contains app folders without the .inprogress suffix. There are no permission
issues with the log directory.
2. From the incomplete list, I can view the UI only for apps that have an
.inprogress suffix in their folder name in the log directory. For other apps,
it shows the error "Application app-2015xxxxx not found".
Please help me.
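
To double-check the state of the log directory described in point 1, a small
script along these lines may help. It is only a sketch: it assumes the event
log directory is readable from the local filesystem and that unfinished
applications are marked solely by the .inprogress suffix. The function name
classify_event_logs is made up for illustration and is not part of Spark.

```python
import os


def classify_event_logs(log_dir, suffix=".inprogress"):
    """Split the entries of an event log directory by the in-progress suffix.

    Returns (finished, in_progress) as two sorted lists of entry names.
    Assumption: an entry is unfinished if and only if its name ends with suffix.
    """
    finished, in_progress = [], []
    for name in sorted(os.listdir(log_dir)):
        if name.endswith(suffix):
            in_progress.append(name)
        else:
            finished.append(name)
    return finished, in_progress
```

Calling classify_event_logs on the directory configured as spark.eventLog.dir
and comparing the first list with what the UI shows as incomplete would
confirm whether finalized logs are being misreported.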

> Spark may attempt to rebuild application UI before finishing writing the 
> event logs in possible race condition
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-12755
>                 URL: https://issues.apache.org/jira/browse/SPARK-12755
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.5.2
>            Reporter: Michael Allman
>            Assignee: Michael Allman
>            Priority: Minor
>             Fix For: 1.5.3, 1.6.1, 2.0.0
>
>
> As reported in SPARK-6950, it appears that sometimes the standalone master 
> attempts to build an application's historical UI before closing the app's 
> event log. This is still an issue for us in 1.5.2+, and I believe I've found 
> the underlying cause.
> When stopping a {{SparkContext}}, the {{stop}} method stops the DAG scheduler:
> https://github.com/apache/spark/blob/a76cf51ed91d99c88f301ec85f3cda1288bcf346/core/src/main/scala/org/apache/spark/SparkContext.scala#L1722-L1727
> and then stops the event logger:
> https://github.com/apache/spark/blob/a76cf51ed91d99c88f301ec85f3cda1288bcf346/core/src/main/scala/org/apache/spark/SparkContext.scala#L1734-L1736
> Though it is difficult to follow the chain of events, one of the sequelae of 
> stopping the DAG scheduler is that the master's {{rebuildSparkUI}} method is 
> called. This method looks for the application's event logs, and its behavior 
> varies based on the existence of an {{.inprogress}} file suffix. In 
> particular, a warning is logged if this suffix exists:
> https://github.com/apache/spark/blob/a76cf51ed91d99c88f301ec85f3cda1288bcf346/core/src/main/scala/org/apache/spark/deploy/master/Master.scala#L935
> After calling the {{stop}} method on the DAG scheduler, the {{SparkContext}} 
> stops the event logger:
> https://github.com/apache/spark/blob/a76cf51ed91d99c88f301ec85f3cda1288bcf346/core/src/main/scala/org/apache/spark/SparkContext.scala#L1734-L1736
> This renames the event log, dropping the {{.inprogress}} file suffix.
> As such, a race condition exists where the master may attempt to process the 
> application log file before it has been finalized.
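
The ordering problem described above can be illustrated with a small,
deterministic sketch. This is not Spark code: the functions below are
hypothetical stand-ins that only mimic the rename of the event log and a
check that depends on the .inprogress suffix. The application id and the
single-file log layout are assumptions made for the example.

```python
import os
import tempfile

IN_PROGRESS = ".inprogress"  # suffix named in the issue description


def start_event_log(log_dir, app_id):
    """Create an in-progress event log, as a running application would."""
    path = os.path.join(log_dir, app_id + IN_PROGRESS)
    with open(path, "w") as f:
        f.write("{}\n")
    return path


def stop_event_logger(log_dir, app_id):
    """Finalize the log by renaming it without the in-progress suffix."""
    final_path = os.path.join(log_dir, app_id)
    os.rename(final_path + IN_PROGRESS, final_path)
    return final_path


def rebuild_ui(log_dir, app_id):
    """Hypothetical stand-in for the master's check: report which log it sees."""
    final_path = os.path.join(log_dir, app_id)
    if os.path.exists(final_path):
        return "completed"
    if os.path.exists(final_path + IN_PROGRESS):
        return "in progress"  # the case in which a warning would be logged
    return "not found"


def simulate(rebuild_before_rename):
    """Run the two steps in the given order and return what the rebuild saw."""
    with tempfile.TemporaryDirectory() as log_dir:
        app_id = "app-example"  # hypothetical application id
        start_event_log(log_dir, app_id)
        if rebuild_before_rename:
            seen = rebuild_ui(log_dir, app_id)
            stop_event_logger(log_dir, app_id)
        else:
            stop_event_logger(log_dir, app_id)
            seen = rebuild_ui(log_dir, app_id)
        return seen
```

Here simulate(True) models the problematic ordering, in which the rebuild
runs before the rename and still finds the in-progress log, while
simulate(False) models the intended ordering, in which the log is already
finalized when the rebuild looks for it.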



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
