 Sent: Thursday, July 16, 2020 at 6:54 PM
> From: "Davide Curcio" <>
> To: "" <>
> Subject: “ does not exist” using Spark in cluster mode with Yarn
> I'm trying to run some Spark scripts in cluster mode on YARN, but I always 
> get this error. From other similar questions, I understand the cause can 
> be:
> - "local" hard-coded as the master, but I don't have that
> - the HADOOP_CONF_DIR environment variable being set incorrectly, but it 
>   seems right
> I've tried several scripts, even very simple ones, and none of them work in 
> cluster mode, even though they all work in local mode.
> Here is my log when I try to execute the code:
> spark/bin/spark-submit --deploy-mode cluster --master yarn ~/
> log4j:WARN No appenders could be found for logger 
> (org.apache.hadoop.util.Shell).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See for more 
> info.
> Using Spark's default log4j profile: 
> org/apache/spark/
> 20/07/16 16:10:27 INFO Client: Requesting a new application from cluster with 
> 2 NodeManagers
> 20/07/16 16:10:27 INFO Client: Verifying our application has not requested 
> more than the maximum memory capability of the cluster (1536 MB per container)
> 20/07/16 16:10:27 INFO Client: Will allocate AM container, with 896 MB memory 
> including 384 MB overhead
> 20/07/16 16:10:27 INFO Client: Setting up container launch context for our AM
> 20/07/16 16:10:27 INFO Client: Setting up the launch environment for our AM 
> container
> 20/07/16 16:10:27 INFO Client: Preparing resources for our AM container
> 20/07/16 16:10:27 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive 
> is set, falling back to uploading libraries under SPARK_HOME.
> 20/07/16 16:10:31 INFO Client: Uploading resource 
> file:/tmp/spark-750fb229-4166-4444-9c69-eb90e9a2318d/
>  -> 
> file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/
> 20/07/16 16:10:31 INFO Client: Uploading resource file:/home/ubuntu/ 
> -> file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/
> 20/07/16 16:10:31 INFO Client: Uploading resource 
> file:/home/ubuntu/spark/python/lib/ -> 
> file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/
> 20/07/16 16:10:31 INFO Client: Uploading resource 
> file:/home/ubuntu/spark/python/lib/ -> 
> file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/
> 20/07/16 16:10:32 INFO Client: Uploading resource 
> file:/tmp/spark-750fb229-4166-4444-9c69-eb90e9a2318d/
>  -> 
> file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/
> 20/07/16 16:10:32 INFO SecurityManager: Changing view acls to: ubuntu
> 20/07/16 16:10:32 INFO SecurityManager: Changing modify acls to: ubuntu
> 20/07/16 16:10:32 INFO SecurityManager: Changing view acls groups to:
> 20/07/16 16:10:32 INFO SecurityManager: Changing modify acls groups to:
> 20/07/16 16:10:32 INFO SecurityManager: SecurityManager: authentication 
> disabled; ui acls disabled; users  with view permissions: Set(ubuntu); groups 
> with view permissions: Set(); users  with modify permissions: Set(ubuntu); 
> groups with modify permissions: Set()
> 20/07/16 16:10:33 INFO Client: Submitting application 
> application_1594914119543_0010 to ResourceManager
> 20/07/16 16:10:33 INFO YarnClientImpl: Submitted application 
> application_1594914119543_0010
> 20/07/16 16:10:34 INFO Client: Application report for 
> application_1594914119543_0010 (state: FAILED)
> 20/07/16 16:10:34 INFO Client:
>      client token: N/A
>      diagnostics: Application application_1594914119543_0010 failed 2 times 
> due to AM Container for appattempt_1594914119543_0010_000002 exited with  
> exitCode: -1000
> Failing this attempt.Diagnostics: [2020-07-16 16:10:34.391]File 
> file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/ 
> does not exist
> File 
> file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/ 
> does not exist
>     at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(
>     at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(
>     at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(
>     at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(
>     at 
> org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(
>     at org.apache.hadoop.yarn.util.FSDownload.access$000(
>     at org.apache.hadoop.yarn.util.FSDownload$
>     at org.apache.hadoop.yarn.util.FSDownload$
>     at Method)
>     at
>     at 
>     at
>     at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(
>     at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$
>     at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$
>     at
>     at java.util.concurrent.Executors$
>     at
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(
>     at 
> java.util.concurrent.ThreadPoolExecutor$
>     at
> For more detailed output, check the application tracking page: 
>  Then click on links to logs of each attempt.
> . Failing the application.
>      ApplicationMaster host: N/A
>      ApplicationMaster RPC port: -1
>      queue: default
>      start time: 1594915833427
>      final status: FAILED
>      tracking URL: 
>      user: ubuntu
> 20/07/16 16:10:34 INFO Client: Deleted staging directory 
> file:/home/ubuntu/.sparkStaging/application_1594914119543_0010
> 20/07/16 16:10:34 ERROR Client: Application diagnostics message: Application 
> application_1594914119543_0010 failed 2 times due to AM Container for 
> appattempt_1594914119543_0010_000002 exited with  exitCode: -1000
> Failing this attempt.Diagnostics: [2020-07-16 16:10:34.391]File 
> file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/ 
> does not exist
> File 
> file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/ 
> does not exist
>     at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(
>     at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(
>     at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(
>     at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(
>     at 
> org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(
>     at org.apache.hadoop.yarn.util.FSDownload.access$000(
>     at org.apache.hadoop.yarn.util.FSDownload$
>     at org.apache.hadoop.yarn.util.FSDownload$
>     at Method)
>     at
>     at 
>     at
>     at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(
>     at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$
>     at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$
>     at
>     at java.util.concurrent.Executors$
>     at
>     at 
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(
>     at 
> java.util.concurrent.ThreadPoolExecutor$
>     at
> For more detailed output, check the application tracking page: 
>  Then click on links to logs of each attempt.
> . Failing the application.
> Exception in thread "main" org.apache.spark.SparkException: Application 
> application_1594914119543_0010 finished with failed status
>     at
>     at 
> org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1530)
>     at 
>     at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
>     at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
>     at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
>     at 
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> 20/07/16 16:10:34 INFO ShutdownHookManager: Shutdown hook called
> 20/07/16 16:10:34 INFO ShutdownHookManager: Deleting directory 
> /tmp/spark-750fb229-4166-4444-9c69-eb90e9a2318d
> 20/07/16 16:10:34 INFO ShutdownHookManager: Deleting directory 
> /tmp/spark-257b390a-3c40-49fd-b285-de35f27e3dfb
> Do you have any suggestions on how to solve this problem?
> Thanks in advance,
> Davide
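
One detail in the quoted log may help narrow this down: every staging path carries the file: scheme (file:/home/ubuntu/.sparkStaging/...). That suggests the staging directory was written to the local filesystem of the submitting machine, which NodeManagers on other hosts cannot read. A possible cause, stated here as an assumption rather than a diagnosis, is that fs.defaultFS resolves to the local filesystem, for example because the Hadoop configuration under HADOOP_CONF_DIR is not being picked up. The sketch below only checks which URI schemes a spark-submit log uses for .sparkStaging paths. The helper names staging_schemes and staging_is_local are made up for illustration, and the sample lines are copied from the log above with the wrapped lines rejoined.

```python
import re
from urllib.parse import urlparse

# Matches any URI that points into a Spark staging directory.
STAGING_URI = re.compile(r"\b([A-Za-z][A-Za-z0-9+.-]*:/\S*\.sparkStaging/\S*)")


def staging_schemes(log_text):
    """Return the set of URI schemes used for .sparkStaging paths in a log."""
    return {urlparse(uri).scheme for uri in STAGING_URI.findall(log_text)}


def staging_is_local(log_text):
    """True if at least one staging path uses the local file: scheme."""
    return "file" in staging_schemes(log_text)


# Two lines taken from the quoted log (truncated paths left as they are).
sample = """\
20/07/16 16:10:31 INFO Client: Uploading resource file:/home/ubuntu/ -> file:/home/ubuntu/.sparkStaging/application_1594914119543_0010/
20/07/16 16:10:34 INFO Client: Deleted staging directory file:/home/ubuntu/.sparkStaging/application_1594914119543_0010
"""

if __name__ == "__main__":
    print("staging schemes:", staging_schemes(sample))
    if staging_is_local(sample):
        print("Staging directory is on the local filesystem of the submitting host.")
```

If the check reports file:, it is probably worth confirming that the core-site.xml seen by spark-submit defines fs.defaultFS as a shared filesystem such as HDFS before looking elsewhere.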
