[ https://issues.apache.org/jira/browse/OOZIE-2547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15323736#comment-15323736 ]
Robert Kanter commented on OOZIE-2547:
--------------------------------------

I tried the 5 patch with local, yarn-client, and yarn-cluster modes, and they all worked now! Thanks for figuring this out. It's a lot cleaner and faster than the old way.

+1

> Add mapreduce.job.cache.files to spark action
> ---------------------------------------------
>
>                 Key: OOZIE-2547
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2547
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Satish Subhashrao Saley
>            Assignee: Satish Subhashrao Saley
>            Priority: Minor
>         Attachments: OOZIE-2547-1.patch, OOZIE-2547-4.patch, OOZIE-2547-5.patch, yarn-cluster_launcher.txt
>
>
> Currently, we pass jars using --jars option while submitting spark job. Also, we add spark.yarn.dist.files option in case of yarn-client mode.
> Instead of that, we can have only --files option and pass on the files which are present in mapreduce.job.cache.files. While doing so, we make sure that spark won't make another copy of the files if files exist on the hdfs. We saw the issues when files are getting copied multiple times and causing exceptions such as :
> {code}
> Diagnostics: Resource hdfs://localhost/user/saley/.sparkStaging/application_1234_123/oozie-examples.jar changed on src filesystem
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
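
The approach in the quoted issue description (forwarding the entries of mapreduce.job.cache.files through a single --files option) might be sketched roughly as follows. This is only an illustration, not the code in the attached patches: the function name build_files_args, the dict standing in for the job configuration, and the example HDFS paths are all hypothetical. Listing each URI once is meant to mirror the goal of not handing Spark the same file twice.

```python
def build_files_args(job_conf):
    """Build spark-submit arguments that forward cached files via --files.

    job_conf is a plain dict standing in for the Hadoop job configuration
    (an assumption made for this sketch).
    """
    raw = job_conf.get("mapreduce.job.cache.files", "")
    seen = set()
    files = []
    for entry in raw.split(","):
        uri = entry.strip()
        # Skip empty entries and keep only the first occurrence of each URI.
        if uri and uri not in seen:
            seen.add(uri)
            files.append(uri)
    # Emit nothing at all when there are no cached files to forward.
    return ["--files", ",".join(files)] if files else []


# Hypothetical configuration with one duplicated entry.
conf = {
    "mapreduce.job.cache.files": (
        "hdfs://localhost/user/example/lib/app.jar,"
        "hdfs://localhost/user/example/lib/dep.jar,"
        "hdfs://localhost/user/example/lib/app.jar"
    )
}
print(build_files_args(conf))
```

Each URI is passed through unchanged, so a file that already lives on HDFS is referenced by its existing location rather than by a new path.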