[ https://issues.apache.org/jira/browse/SPARK-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303742#comment-14303742 ]
Marcelo Vanzin commented on SPARK-4705:
---------------------------------------

Hi [~twinkle], a few comments.

I'm not sure it's worth differentiating applications that only have one try from those that allow multiple tries. The former is just a special case of the latter, where the maximum number of tries is 1.

Also note that in the current master there isn't a "folder structure"; all logs for an application go into a single file. You could play with the file names, or create a new folder structure for handling app logs. The former is a tiny bit easier on the HDFS NameNode.

As for the history server UI, I'd like to throw out a different suggestion: list app attempts in the existing listing tables. For example, the row listing the application would have multiple "sub-rows" with the different attempts for that particular application. That way we wouldn't need a separate page to list application attempts, and we could have filters on the listing page to only list successful attempts, for example.

> Driver retries in yarn-cluster mode always fail if event logging is enabled
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-4705
>                 URL: https://issues.apache.org/jira/browse/SPARK-4705
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, YARN
>    Affects Versions: 1.2.0
>            Reporter: Marcelo Vanzin
>
> yarn-cluster mode will retry running the driver in certain failure modes. If
> event logging is enabled, the retry will most probably fail, because:
> {noformat}
> Exception in thread "Driver" java.io.IOException: Log directory
> hdfs://vanzin-krb-1.vpc.cloudera.com:8020/user/spark/applicationHistory/application_1417554558066_0003
> already exists!
>         at org.apache.spark.util.FileLogger.createLogDir(FileLogger.scala:129)
>         at org.apache.spark.util.FileLogger.start(FileLogger.scala:115)
>         at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:74)
>         at org.apache.spark.SparkContext.<init>(SparkContext.scala:353)
> {noformat}
> The event log path should be "more unique". Or perhaps retries of the same app
> should clean up the old logs first.
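
To make the "play with the file names" idea above concrete, here is a minimal sketch that appends the YARN attempt id to the event log name, so each retry of the same application writes to its own path instead of tripping over the existing directory. The object, helper, and path values below are illustrative assumptions, not the actual FileLogger/EventLoggingListener code.

{code:scala}
// Illustrative sketch only (hypothetical names, not Spark's actual API):
// derive a per-attempt event log name so retries of the same YARN application
// do not collide in the shared history directory.
object EventLogNameSketch {

  /** e.g. ("application_1417554558066_0003", Some("2")) => "application_1417554558066_0003_2" */
  def eventLogName(appId: String, attemptId: Option[String]): String =
    attemptId.map(a => s"${appId}_$a").getOrElse(appId)

  def main(args: Array[String]): Unit = {
    val baseDir = "hdfs://namenode:8020/user/spark/applicationHistory" // assumed base dir
    val appId   = "application_1417554558066_0003"

    // The first run and a retry now resolve to different paths, so the
    // "Log directory ... already exists!" failure above would not be hit.
    Seq(Some("1"), Some("2")).foreach { attempt =>
      println(s"$baseDir/${eventLogName(appId, attempt)}")
    }
  }
}
{code}

Encoding the attempt in the file name rather than in a per-attempt folder would also line up with the NameNode point above, since it avoids creating an extra directory for every attempt.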