[jira] [Commented] (SPARK-18988) Spark "spark.eventLog.dir" dir should create the directory if it is different from "spark.history.fs.logDirectory"

Chen He (JIRA) Fri, 23 Dec 2016 12:06:06 -0800

    [ https://issues.apache.org/jira/browse/SPARK-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15773594#comment-15773594 ]


Chen He commented on SPARK-18988:
---------------------------------

By "automatically", do you mean that the HistoryServer will create the 
directory during startup? Is that correct? We would really appreciate at least 
24 hours to answer your question before the issue is closed.

> Spark "spark.eventLog.dir" dir should create the directory if it is different 
> from "spark.history.fs.logDirectory"
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-18988
>                 URL: https://issues.apache.org/jira/browse/SPARK-18988
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>    Affects Versions: 1.6.1, 2.1.0
>            Reporter: Chen He
>            Priority: Minor
>
> When "spark.history.fs.logDirectory" is set to hdfs:///spark-history but 
> "spark.eventLog.dir" is set to hdfs:///spark-history/eventLog, Spark reports 
> the following error:
> ERROR spark.SparkContext: Error initializing SparkContext.
> java.io.FileNotFoundException: File does not exist: hdfs:/spark-history/eventLog
>       at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1367)
>       at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1359)
>       at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>       at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1359)
>       at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:100)
>       at org.apache.spark.SparkContext.<init>(SparkContext.scala:549)
>       at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
>       at com.oracle.test.logs.Main.main(Main.java:13)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:497)
>       at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:559)
> If the Spark event log directory has to be the same as 
> "spark.history.fs.logDirectory", why does "spark.eventLog.dir" exist at all? 
> If it does not, EventLoggingListener.start() should try to create this 
> directory instead of simply throwing an exception. 
> {code}
>   def start() {
>     if (!fileSystem.getFileStatus(new Path(logBaseDir)).isDir) {
>       throw new IllegalArgumentException(s"Log directory $logBaseDir does not exist.")
>     }
> {code}
> This causes confusion, and the Spark documentation does not make the 
> requirement clear:
> {quote}
>       Base directory in which Spark events are logged, if 
> spark.eventLog.enabled is true. *Within this base directory* (???you must 
> make sure it already exists???), Spark creates a sub-directory for each 
> application, and logs the events specific to the application in this 
> directory. Users may want to set this to a unified location like an HDFS 
> directory so history files can be read by the history server.
> {quote}
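
For illustration only, here is a minimal sketch of the behaviour the issue 
description asks for: create the event log directory when it is missing, and 
only fail when the path exists but is not a directory. The object name 
EventLogDirSketch and the helper ensureLogDir are hypothetical and are not part 
of Spark; the sketch assumes the Hadoop client library is on the classpath and 
that the user has permission to create the directory.

```scala
import java.io.IOException

import org.apache.hadoop.fs.{FileSystem, Path}

// Hypothetical helper, not Spark's actual implementation.
object EventLogDirSketch {

  /** Create logBaseDir if it is missing; fail only if it cannot be used. */
  def ensureLogDir(fileSystem: FileSystem, logBaseDir: Path): Unit = {
    if (!fileSystem.exists(logBaseDir)) {
      // mkdirs also creates missing parents, like "mkdir -p".
      if (!fileSystem.mkdirs(logBaseDir)) {
        throw new IOException(s"Could not create log directory $logBaseDir.")
      }
    } else if (!fileSystem.getFileStatus(logBaseDir).isDirectory) {
      throw new IllegalArgumentException(
        s"Log path $logBaseDir exists but is not a directory.")
    }
  }
}
```

Under this sketch, a call such as 
EventLogDirSketch.ensureLogDir(fileSystem, new Path(logBaseDir)) would stand in 
for the existence check at the top of start(). Whether Spark should create the 
directory at all is the open question of this issue.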



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
