Hi, suddenly our Spark jobs started failing with the following error:
Exception in thread "main" java.io.FileNotFoundException: /user/spark/applicationHistory/application_1432824195832_1275.inprogress (No such file or directory)

Full trace here:

[21:50:04 x...@hadoop-client01.dev:~]$ spark-submit --class org.apache.spark.examples.SparkPi --master yarn /usr/lib/spark/lib/spark-examples.jar 10
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/flume-ng/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/parquet/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/avro/avro-tools-1.7.6-cdh5.4.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/05/28 21:55:21 INFO SparkContext: Running Spark version 1.3.0
15/05/28 21:55:21 WARN SparkConf: In Spark 1.0 and later spark.local.dir will be overridden by the value set by the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone and LOCAL_DIRS in YARN).
15/05/28 21:55:21 WARN SparkConf: SPARK_JAVA_OPTS was detected (set to ' -Dspark.local.dir=/srv/tmp/xyz '). This is deprecated in Spark 1.0+. Please instead use:
 - ./spark-submit with conf/spark-defaults.conf to set defaults for an application
 - ./spark-submit with --driver-java-options to set -X options for a driver
 - spark.executor.extraJavaOptions to set -X options for executors
 - SPARK_DAEMON_JAVA_OPTS to set java options for standalone daemons (master or worker)
15/05/28 21:55:21 WARN SparkConf: Setting 'spark.executor.extraJavaOptions' to ' -Dspark.local.dir=/srv/tmp/xyz ' as a work-around.
15/05/28 21:55:21 WARN SparkConf: Setting 'spark.driver.extraJavaOptions' to ' -Dspark.local.dir=/srv/tmp/xyz ' as a work-around.
15/05/28 21:55:22 INFO SecurityManager: Changing view acls to: xyz
15/05/28 21:55:22 INFO SecurityManager: Changing modify acls to: xyz
15/05/28 21:55:22 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(xyz); users with modify permissions: Set(xyz)
15/05/28 21:55:22 INFO Slf4jLogger: Slf4jLogger started
15/05/28 21:55:22 INFO Remoting: Starting remoting
15/05/28 21:55:22 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkdri...@hadoop-client01.abc.com:51876]
15/05/28 21:55:22 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkdri...@hadoop-client01.abc.com:51876]
15/05/28 21:55:22 INFO Utils: Successfully started service 'sparkDriver' on port 51876.
15/05/28 21:55:22 INFO SparkEnv: Registering MapOutputTracker
15/05/28 21:55:22 INFO SparkEnv: Registering BlockManagerMaster
15/05/28 21:55:22 INFO DiskBlockManager: Created local directory at /srv/tmp/xyz/spark-1e66e6eb-7ad6-4f62-87fc-f0cfaa631e36/blockmgr-61f866b8-6475-4a11-88b2-792d2ba22662
15/05/28 21:55:22 INFO MemoryStore: MemoryStore started with capacity 265.4 MB
15/05/28 21:55:23 INFO HttpFileServer: HTTP File server directory is /srv/tmp/xyz/spark-2b676170-3f88-44bf-87a3-600de1b7ee24/httpd-b84f76d5-26c7-4c63-9223-f6c5aa3899f0
15/05/28 21:55:23 INFO HttpServer: Starting HTTP Server
15/05/28 21:55:23 INFO Server: jetty-8.y.z-SNAPSHOT
15/05/28 21:55:23 INFO AbstractConnector: Started SocketConnector@0.0.0.0:41538
15/05/28 21:55:23 INFO Utils: Successfully started service 'HTTP file server' on port 41538.
15/05/28 21:55:23 INFO SparkEnv: Registering OutputCommitCoordinator
15/05/28 21:55:23 INFO Server: jetty-8.y.z-SNAPSHOT
15/05/28 21:55:23 INFO AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
15/05/28 21:55:23 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/05/28 21:55:23 INFO SparkUI: Started SparkUI at http://hadoop-client01.abc.com:4040
15/05/28 21:55:23 INFO SparkContext: Added JAR file:/usr/lib/spark/lib/spark-examples.jar at http://10.0.3.62:41538/jars/spark-examples.jar with timestamp 1432850123523
15/05/28 21:55:24 INFO Client: Requesting a new application from cluster with 16 NodeManagers
15/05/28 21:55:24 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (61440 MB per container)
15/05/28 21:55:24 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
15/05/28 21:55:24 INFO Client: Setting up container launch context for our AM
15/05/28 21:55:24 INFO Client: Preparing resources for our AM container
15/05/28 21:55:24 INFO Client: Setting up the launch environment for our AM container
15/05/28 21:55:24 INFO SecurityManager: Changing view acls to: xyz
15/05/28 21:55:24 INFO SecurityManager: Changing modify acls to: xyz
15/05/28 21:55:24 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(xyz); users with modify permissions: Set(xyz)
15/05/28 21:55:24 INFO Client: Submitting application 1275 to ResourceManager
15/05/28 21:55:24 INFO YarnClientImpl: Submitted application application_1432824195832_1275
15/05/28 21:55:25 INFO Client: Application report for application_1432824195832_1275 (state: ACCEPTED)
15/05/28 21:55:25 INFO Client:
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: root.xyz
     start time: 1432850126673
     final status: UNDEFINED
     tracking URL: http://ds-hnn001.abc.com:8088/proxy/application_1432824195832_1275/
     user: xyz
15/05/28 21:55:26 INFO Client: Application report for application_1432824195832_1275 (state: ACCEPTED)
15/05/28 21:55:27 INFO Client: Application report for application_1432824195832_1275 (state: ACCEPTED)
15/05/28 21:55:27 INFO YarnClientSchedulerBackend: ApplicationMaster registered as Actor[akka.tcp://sparkYarnAM@ip-10-0-3-154.ec2.internal:41482/user/YarnAM#1960927737]
15/05/28 21:55:27 INFO YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> ds-hnn001.abc.com,ds-hnn002.abc.com, PROXY_URI_BASES -> http://ds-hnn001.abc.com:8088/proxy/application_1432824195832_1275,http://ds-hnn002.abc.com:8088/proxy/application_1432824195832_1275), /proxy/application_1432824195832_1275
15/05/28 21:55:27 INFO JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
15/05/28 21:55:28 INFO Client: Application report for application_1432824195832_1275 (state: RUNNING)
15/05/28 21:55:28 INFO Client:
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: ip-10-0-3-154.ec2.internal
     ApplicationMaster RPC port: 0
     queue: root.xyz
     start time: 1432850126673
     final status: UNDEFINED
     tracking URL: http://ds-hnn001.abc.com:8088/proxy/application_1432824195832_1275/
     user: xyz
15/05/28 21:55:28 INFO YarnClientSchedulerBackend: Application application_1432824195832_1275 has started running.
15/05/28 21:55:28 INFO NettyBlockTransferService: Server created on 24272
15/05/28 21:55:28 INFO BlockManagerMaster: Trying to register BlockManager
15/05/28 21:55:28 INFO BlockManagerMasterActor: Registering block manager hadoop-client01.abc.com:24272 with 265.4 MB RAM, BlockManagerId(<driver>, hadoop-client01.abc.com, 24272)
15/05/28 21:55:28 INFO BlockManagerMaster: Registered BlockManager
Exception in thread "main" java.io.FileNotFoundException: /user/spark/applicationHistory/application_1432824195832_1275.inprogress (No such file or directory)
    at java.io.FileOutputStream.open(Native Method)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:110)
    at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:117)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:399)
    at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:28)
    at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Everything was working fine until recently.
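From the stack trace, the failure happens in org.apache.spark.scheduler.EventLoggingListener.start, and the file is opened with java.io.FileOutputStream, i.e. on the driver's local filesystem. /user/spark/applicationHistory looks like the HDFS event-log directory used by the Spark History Server (the CDH default), so my guess is that the driver is resolving the path against the local filesystem instead of HDFS. A minimal sketch of what I am checking, assuming the directory really is meant to live on HDFS:

    # does the history directory still exist on HDFS, and is it
    # writable by the submitting user?
    hdfs dfs -ls /user/spark/applicationHistory

And in conf/spark-defaults.conf, a fully-qualified URI should remove any ambiguity about which filesystem the event log is written to (the NameNode host/port below is a placeholder, not our actual value):

    spark.eventLog.enabled  true
    spark.eventLog.dir      hdfs://<namenode-host>:8020/user/spark/applicationHistory

If it is just a scheme-less path being resolved locally, that would also explain why nothing in the job itself changed.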
Any help on this would be appreciated. Thanks.