[ https://issues.apache.org/jira/browse/SPARK-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303742#comment-14303742 ]
Marcelo Vanzin commented on SPARK-4705:
---------------------------------------

Hi [~twinkle], a few comments.

I'm not sure it's worth differentiating applications that only have one try from those that allow multiple tries. The former is just a special case of the latter, where the maximum number of tries is 1.

Also note that in the current master there isn't a "folder structure"; all logs for an application go into a single file. You could play with the file names, or create a new folder structure for handling app logs. The former is a tiny bit easier on the HDFS NameNode.

As for the history server UI, I'd like to throw out a different suggestion: list app attempts in the existing listing tables. For example, the row listing the application would have multiple "sub-rows" with the different attempts for that particular application. That way we wouldn't need a separate page to list application attempts, and we could have filters on the listing page to only list successful attempts, for example.

> Driver retries in yarn-cluster mode always fail if event logging is enabled
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-4705
>                 URL: https://issues.apache.org/jira/browse/SPARK-4705
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, YARN
>    Affects Versions: 1.2.0
>            Reporter: Marcelo Vanzin
>
> yarn-cluster mode will retry running the driver in certain failure modes. If
> event logging is enabled, the retry will most probably fail, because:
> {noformat}
> Exception in thread "Driver" java.io.IOException: Log directory
> hdfs://vanzin-krb-1.vpc.cloudera.com:8020/user/spark/applicationHistory/application_1417554558066_0003
> already exists!
>         at org.apache.spark.util.FileLogger.createLogDir(FileLogger.scala:129)
>         at org.apache.spark.util.FileLogger.start(FileLogger.scala:115)
>         at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:74)
>         at org.apache.spark.SparkContext.<init>(SparkContext.scala:353)
> {noformat}
> The event log path should be "more unique". Or perhaps retries of the same app
> should clean up the old logs first.
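
To make the "play with the file names" idea above concrete, here is a minimal sketch that appends the YARN attempt id to the event log name, so each retry of the same application writes to its own path instead of tripping over the existing directory. The object, helper, and path values below are illustrative assumptions, not the actual FileLogger/EventLoggingListener code.

{code:scala}
// Illustrative sketch only (hypothetical names, not Spark's actual API):
// derive a per-attempt event log name so retries of the same YARN application
// do not collide in the shared history directory.
object EventLogNameSketch {

  /** e.g. ("application_1417554558066_0003", Some("2")) => "application_1417554558066_0003_2" */
  def eventLogName(appId: String, attemptId: Option[String]): String =
    attemptId.map(a => s"${appId}_$a").getOrElse(appId)

  def main(args: Array[String]): Unit = {
    val baseDir = "hdfs://namenode:8020/user/spark/applicationHistory" // assumed base dir
    val appId   = "application_1417554558066_0003"

    // The first run and a retry now resolve to different paths, so the
    // "Log directory ... already exists!" failure above would not be hit.
    Seq(Some("1"), Some("2")).foreach { attempt =>
      println(s"$baseDir/${eventLogName(appId, attempt)}")
    }
  }
}
{code}

Encoding the attempt in the file name rather than in a per-attempt folder would also line up with the NameNode point above, since it avoids creating an extra directory for every attempt.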