Hi, I am trying to use the new Spark history server in 1.0.0 to view finished applications (launched on YARN), without success so far.
Here are the relevant configuration properties in my spark-defaults.conf:

    spark.yarn.historyServer.address=<server_name>:18080
    spark.ui.killEnabled=false
    spark.eventLog.enabled=true
    spark.eventLog.compress=true
    spark.eventLog.dir=hdfs://<server_name>:9000/user/<user_name>/spark-events

The history server was launched with the following command:

    /opt/spark/sbin/start-history-server.sh hdfs://<server_name>:9000/user/<user_name>/spark-events

However, finished applications do not appear in the history server UI (though the UI itself works correctly).

Apparently, the problem is that the APPLICATION_COMPLETE file is not created:

    hdfs dfs -stat %n spark-events/<application_name>-1403166516102/*
    COMPRESSION_CODEC_org.apache.spark.io.LZFCompressionCodec
    EVENT_LOG_2
    SPARK_VERSION_1.0.0

Indeed, if I manually create an empty APPLICATION_COMPLETE file in the above directory, the application can then be viewed normally in the history server.

Finally, here is the relevant part of the YARN application log, which seems to imply that the DFS filesystem is already closed by the time Spark tries to create the APPLICATION_COMPLETE file:

    (...)
    14/06/19 08:29:29 INFO ApplicationMaster: finishApplicationMaster with SUCCEEDED
    14/06/19 08:29:29 INFO AMRMClientImpl: Waiting for application to be successfully unregistered.
    14/06/19 08:29:29 INFO ApplicationMaster: AppMaster received a signal.
    14/06/19 08:29:29 INFO ApplicationMaster: Deleting staging directory .sparkStaging/application_1397477394591_0798
    14/06/19 08:29:29 INFO ApplicationMaster$$anon$1: Invoking sc stop from shutdown hook
    14/06/19 08:29:29 INFO SparkUI: Stopped Spark web UI at http://dc1-ibd-corp-hadoop-02.corp.dc1.kelkoo.net:54877
    14/06/19 08:29:29 INFO DAGScheduler: Stopping DAGScheduler
    14/06/19 08:29:29 INFO CoarseGrainedSchedulerBackend: Shutting down all executors
    14/06/19 08:29:29 INFO CoarseGrainedSchedulerBackend: Asking each executor to shut down
    14/06/19 08:29:30 INFO MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
    14/06/19 08:29:30 INFO ConnectionManager: Selector thread was interrupted!
    14/06/19 08:29:30 INFO ConnectionManager: ConnectionManager stopped
    14/06/19 08:29:30 INFO MemoryStore: MemoryStore cleared
    14/06/19 08:29:30 INFO BlockManager: BlockManager stopped
    14/06/19 08:29:30 INFO BlockManagerMasterActor: Stopping BlockManagerMaster
    14/06/19 08:29:30 INFO BlockManagerMaster: BlockManagerMaster stopped
    Exception in thread "Thread-44" java.io.IOException: Filesystem closed
        at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:629)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1365)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1307)
        at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:384)
        at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:380)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:380)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:324)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:905)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:886)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:783)
        at org.apache.spark.util.FileLogger.createWriter(FileLogger.scala:117)
        at org.apache.spark.util.FileLogger.newFile(FileLogger.scala:181)
        at org.apache.spark.scheduler.EventLoggingListener.stop(EventLoggingListener.scala:129)
        at org.apache.spark.SparkContext$$anonfun$stop$2.apply(SparkContext.scala:989)
        at org.apache.spark.SparkContext$$anonfun$stop$2.apply(SparkContext.scala:989)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.SparkContext.stop(SparkContext.scala:989)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$1.run(ApplicationMaster.scala:443)
    14/06/19 08:29:30 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
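For reference, the manual workaround (creating the empty APPLICATION_COMPLETE marker by hand) can be scripted roughly as follows. This is only a sketch under assumptions: marker_path, EVENT_LOG_DIR, APP_DIR and RUN are names made up for illustration, and the directory names are the same placeholders as in the hdfs dfs -stat command. By default it only prints the command; set RUN=1 to actually run it.

```shell
#!/bin/sh
# Sketch of the manual workaround: add an empty APPLICATION_COMPLETE marker
# to an application's event log directory on HDFS.

# Hypothetical helper (not part of Spark): build the marker path.
# $1 = event log base directory, $2 = application log directory name
marker_path() {
    printf '%s/%s/APPLICATION_COMPLETE' "$1" "$2"
}

# Placeholder values; replace them with the real directory names.
EVENT_LOG_DIR="spark-events"
APP_DIR="<application_name>-1403166516102"
MARKER=$(marker_path "$EVENT_LOG_DIR" "$APP_DIR")

if [ "${RUN:-0}" = "1" ]; then
    # -touchz creates a zero-length file at the given path.
    hdfs dfs -touchz "$MARKER"
else
    # Dry run by default: only show the command that would be executed.
    echo "hdfs dfs -touchz $MARKER"
fi
```

This only papers over the symptom for one application at a time; it does not explain why the marker is missing in the first place.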
Am I missing something, or is it a bug?

Thanks,
Christophe.