Hi,

I am trying to use the new Spark history server in 1.0.0 to view finished 
applications (launched on YARN), without success so far.

Here are the relevant configuration properties in my spark-defaults.conf:

spark.yarn.historyServer.address=<server_name>:18080
spark.ui.killEnabled=false
spark.eventLog.enabled=true
spark.eventLog.compress=true
spark.eventLog.dir=hdfs://<server_name>:9000/user/<user_name>/spark-events

And the history server has been launched with the command below:

/opt/spark/sbin/start-history-server.sh hdfs://<server_name>:9000/user/<user_name>/spark-events


However, the finished applications do not appear in the history server UI 
(though the UI itself works correctly).
Apparently, the problem is that the APPLICATION_COMPLETE file is not created:

hdfs dfs -stat %n spark-events/<application_name>-1403166516102/*
COMPRESSION_CODEC_org.apache.spark.io.LZFCompressionCodec
EVENT_LOG_2
SPARK_VERSION_1.0.0

Indeed, if I manually create an empty APPLICATION_COMPLETE file in the above 
directory, the application can then be viewed normally in the history server.
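For the record, the workaround is just an empty marker file, created with a standard FsShell command (using the application directory from the listing above; the placeholders are as in my configuration):

```shell
# Create the empty APPLICATION_COMPLETE marker that the shutdown hook failed
# to write, so the history server picks up the finished application.
hdfs dfs -touchz spark-events/<application_name>-1403166516102/APPLICATION_COMPLETE
```

After this, the application shows up in the history server UI on the next refresh.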

Finally, here is the relevant part of the YARN application log, which seems to 
imply that the DFS FileSystem is already closed when the APPLICATION_COMPLETE 
file is created:

(...)
14/06/19 08:29:29 INFO ApplicationMaster: finishApplicationMaster with SUCCEEDED
14/06/19 08:29:29 INFO AMRMClientImpl: Waiting for application to be successfully unregistered.
14/06/19 08:29:29 INFO ApplicationMaster: AppMaster received a signal.
14/06/19 08:29:29 INFO ApplicationMaster: Deleting staging directory .sparkStaging/application_1397477394591_0798
14/06/19 08:29:29 INFO ApplicationMaster$$anon$1: Invoking sc stop from shutdown hook
14/06/19 08:29:29 INFO SparkUI: Stopped Spark web UI at http://dc1-ibd-corp-hadoop-02.corp.dc1.kelkoo.net:54877
14/06/19 08:29:29 INFO DAGScheduler: Stopping DAGScheduler
14/06/19 08:29:29 INFO CoarseGrainedSchedulerBackend: Shutting down all executors
14/06/19 08:29:29 INFO CoarseGrainedSchedulerBackend: Asking each executor to shut down
14/06/19 08:29:30 INFO MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
14/06/19 08:29:30 INFO ConnectionManager: Selector thread was interrupted!
14/06/19 08:29:30 INFO ConnectionManager: ConnectionManager stopped
14/06/19 08:29:30 INFO MemoryStore: MemoryStore cleared
14/06/19 08:29:30 INFO BlockManager: BlockManager stopped
14/06/19 08:29:30 INFO BlockManagerMasterActor: Stopping BlockManagerMaster
14/06/19 08:29:30 INFO BlockManagerMaster: BlockManagerMaster stopped
Exception in thread "Thread-44" java.io.IOException: Filesystem closed
        at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:629)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1365)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1307)
        at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:384)
        at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:380)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:380)
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:324)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:905)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:886)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:783)
        at org.apache.spark.util.FileLogger.createWriter(FileLogger.scala:117)
        at org.apache.spark.util.FileLogger.newFile(FileLogger.scala:181)
        at org.apache.spark.scheduler.EventLoggingListener.stop(EventLoggingListener.scala:129)
        at org.apache.spark.SparkContext$$anonfun$stop$2.apply(SparkContext.scala:989)
        at org.apache.spark.SparkContext$$anonfun$stop$2.apply(SparkContext.scala:989)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.SparkContext.stop(SparkContext.scala:989)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$1.run(ApplicationMaster.scala:443)
14/06/19 08:29:30 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.


Am I missing something, or is it a bug?

Thanks,
Christophe.

