I'm trying to run a Spark script in cluster mode on YARN, but I always get the error below. I read in other similar questions that the cause can be:
"Local" set up hard-coded as a master but I don't have it HADOOP_CONF_DIR environment variable that's wrong inside spark-env.sh but it seems right I've tried with every code, even simple code but it still doesn't work, even though in local mode they work. Here is my log when I try to execute the code: spark/bin/spark-submit --deploy-mode cluster --master yarn ~/prova7.py log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 20/07/16 16:10:27 INFO Client: Requesting a new application from cluster with 2 NodeManagers 20/07/16 16:10:27 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (1536 MB per container) 20/07/16 16:10:27 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead 20/07/16 16:10:27 INFO Client: Setting up container launch context for our AM 20/07/16 16:10:27 INFO Client: Setting up the launch environment for our AM container 20/07/16 16:10:27 INFO Client: Preparing resources for our AM container 20/07/16 16:10:27 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME. 20/07/16 16:10:31 INFO Client: Uploading resource file:/tmp/spark-750fb229-4166-4444-9c69-eb90e9a2318d/__spark_libs__4588035472069967339.zip -> file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/__spark_libs__4588035472069967339.zip 20/07/16 16:10:31 INFO Client: Uploading resource file:/home/ubuntu/prova7.py -> file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/prova7.py 20/07/16 16:10:31 INFO Client: Uploading resource file:/home/ubuntu/spark/python/lib/pyspark.zip -> file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/pyspark.zip 20/07/16 16:10:31 INFO Client: Uploading resource file:/home/ubuntu/spark/python/lib/py4j-0.10.7-src.zip -> file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/py4j-0.10.7-src.zip 20/07/16 16:10:32 INFO Client: Uploading resource file:/tmp/spark-750fb229-4166-4444-9c69-eb90e9a2318d/__spark_conf__1291791519024875749.zip -> file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/__spark_conf__.zip 20/07/16 16:10:32 INFO SecurityManager: Changing view acls to: ubuntu 20/07/16 16:10:32 INFO SecurityManager: Changing modify acls to: ubuntu 20/07/16 16:10:32 INFO SecurityManager: Changing view acls groups to: 20/07/16 16:10:32 INFO SecurityManager: Changing modify acls groups to: 20/07/16 16:10:32 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ubuntu); groups with view permissions: Set(); users with modify permissions: Set(ubuntu); groups with modify permissions: Set() 20/07/16 16:10:33 INFO Client: Submitting application application_1594914119543_0010 to ResourceManager 20/07/16 16:10:33 INFO YarnClientImpl: Submitted application application_1594914119543_0010 20/07/16 16:10:34 INFO Client: Application report for application_1594914119543_0010 (state: FAILED) 20/07/16 16:10:34 INFO Client: client token: N/A diagnostics: Application application_1594914119543_0010 failed 2 times due to AM Container for appattempt_1594914119543_0010_000002 exited with exitCode: -1000 Failing this attempt.Diagnostics: [2020-07-16 16:10:34.391]File 
file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/pyspark.zip does not exist
java.io.FileNotFoundException: File file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/pyspark.zip does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:641)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:930)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:454)
    at org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:269)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:67)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:414)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:411)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:411)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:242)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:235)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:223)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
For more detailed output, check the application tracking page: http://ec2-3-215-190-32.compute-1.amazonaws.com:8088/cluster/app/application_1594914119543_0010 Then click on links to logs of each attempt.
. Failing the application.
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1594915833427
     final status: FAILED
     tracking URL: http://ec2-3-215-190-32.compute-1.amazonaws.com:8088/cluster/app/application_1594914119543_0010
     user: ubuntu
20/07/16 16:10:34 INFO Client: Deleted staging directory file:/home/ubuntu/.sparkStaging/application_1594914119543_0010
20/07/16 16:10:34 ERROR Client: Application diagnostics message: Application application_1594914119543_0010 failed 2 times due to AM Container for appattempt_1594914119543_0010_000002 exited with exitCode: -1000
Failing this attempt.Diagnostics: [2020-07-16 16:10:34.391]File file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/pyspark.zip does not exist
java.io.FileNotFoundException: File file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/pyspark.zip does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:641)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:930)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:454)
    at org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:269)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:67)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:414)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:411)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:411)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:242)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:235)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:223)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
For more detailed output, check the application tracking page: http://ec2-3-215-190-32.compute-1.amazonaws.com:8088/cluster/app/application_1594914119543_0010 Then click on links to logs of each attempt.
. Failing the application.
Exception in thread "main" org.apache.spark.SparkException: Application application_1594914119543_0010 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1150)
    at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1530)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
20/07/16 16:10:34 INFO ShutdownHookManager: Shutdown hook called
20/07/16 16:10:34 INFO ShutdownHookManager: Deleting directory /tmp/spark-750fb229-4166-4444-9c69-eb90e9a2318d
20/07/16 16:10:34 INFO ShutdownHookManager: Deleting directory /tmp/spark-257b390a-3c40-49fd-b285-de35f27e3dfb

Do you have any suggestions about how to solve this problem? For reference, a minimal sketch of the kind of script I'm submitting and my spark-env.sh setting follow below. Thanks in advance, Davide
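This is the kind of "simple code" I mentioned above. It's a representative sketch rather than prova7.py verbatim, but my scripts are along these lines: no master is hard-coded anywhere, so the master should come only from the --master yarn flag on the command line:

from pyspark.sql import SparkSession

# No .master("local[*]") here: the master is taken from spark-submit.
spark = SparkSession.builder.appName("prova").getOrCreate()

# Trivial work that needs no external input files,
# just to check that the job can run at all on the cluster.
print(spark.range(100).count())  # expected output: 100

spark.stop()

This runs fine with --master local[*] but fails in YARN cluster mode as shown in the log.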
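One thing I notice in the log is that the staging files are uploaded to file:/home/ubuntu/.sparkStaging/... rather than an hdfs:// path, which makes me wonder whether my Hadoop configuration is actually being picked up. For context, spark/conf/spark-env.sh contains a line along these lines (the exact path is just representative of my layout; I'm assuming here that Hadoop's core-site.xml and yarn-site.xml live under /home/ubuntu/hadoop/etc/hadoop):

# spark/conf/spark-env.sh -- path is representative of my setup;
# it should point at the directory containing core-site.xml and yarn-site.xml
export HADOOP_CONF_DIR=/home/ubuntu/hadoop/etc/hadoop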