[jira] [Commented] (SPARK-21570) File __spark_libs__XXX.zip does not exist on networked file system w/ yarn
[ https://issues.apache.org/jira/browse/SPARK-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16112868#comment-16112868 ] Albert Chu commented on SPARK-21570:

There's no scheme. It's just "file://", treated like a local file system.

> File __spark_libs__XXX.zip does not exist on networked file system w/ yarn
> --------------------------------------------------------------------------
>
>                 Key: SPARK-21570
>                 URL: https://issues.apache.org/jira/browse/SPARK-21570
>             Project: Spark
>          Issue Type: Bug
>          Components: YARN
>    Affects Versions: 2.2.0
>            Reporter: Albert Chu
>
> I have a set of scripts that run Spark with data in a networked file system.
> One of my unit tests to make sure things don't break between Spark releases
> is to simply run a word count (via org.apache.spark.examples.JavaWordCount)
> on a file in the networked file system. This test broke with Spark 2.2.0
> when I use yarn to launch the job (using the spark standalone scheduler
> things still work). I'm currently using Hadoop 2.7.0. I get the following
> error:
> {noformat}
> Diagnostics: File file:/p/lcratery/achu/testing/rawnetworkfs/test/1181015/node-0/spark/node-0/spark-292938be-7ae3-460f-aca7-294083ebb790/__spark_libs__695301535722158702.zip does not exist
> java.io.FileNotFoundException: File file:/p/lcratery/achu/testing/rawnetworkfs/test/1181015/node-0/spark/node-0/spark-292938be-7ae3-460f-aca7-294083ebb790/__spark_libs__695301535722158702.zip does not exist
>         at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:606)
>         at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:819)
>         at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:596)
>         at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
>         at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
>         at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
>         at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
>         at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>         at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
>         at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:748)
> {noformat}
> While debugging, I sat and watched the directory and did see that
> /p/lcratery/achu/testing/rawnetworkfs/test/1181015/node-0/spark/node-0/spark-292938be-7ae3-460f-aca7-294083ebb790/__spark_libs__695301535722158702.zip
> does show up at some point.
> Wondering if it's possible something racy was introduced. Nothing in the
> Spark 2.2.0 release notes suggests any type of configuration change that
> needs to be done.
> Thanks

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[ https://issues.apache.org/jira/browse/SPARK-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16112036#comment-16112036 ] Saisai Shao commented on SPARK-21570:

Sorry, I'm not familiar with NFS/Lustre. Does this kind of networked FS have a special scheme in Hadoop, like "hdfs://" or "wasb://", or is it just represented as "file://" and treated like a local FS?
[ https://issues.apache.org/jira/browse/SPARK-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111701#comment-16111701 ] Albert Chu commented on SPARK-21570:

My setup is unique. The primary unique part is that when configuring Hadoop, I configure it to use only a "local file system", i.e. the "file:" URI. I don't set up HDFS. The defaultFS in Hadoop is configured as such:

{noformat}
<property>
  <name>fs.defaultFS</name>
  <value>file:///</value>
</property>
{noformat}

Because of this, there are no HDFS configs. Other important settings, such as temp dirs, are configured as follows, always pointing to a networked file system accessible on all nodes (note that where I have text like "node-0" below, it is adjusted depending on which node you are on, i.e. "node-1", "node-2", etc. on other nodes).

in core-site.xml

{noformat}
<property>
  <name>hadoop.tmp.dir</name>
  <value>/p/lcratery/achu/testing/rawnetworkfs//test/1181121/node-0</value>
</property>
{noformat}

in mapred-site.xml (I'm excluding mapreduce.cluster.local.dir, mapreduce.jobtracker.system.dir, mapreduce.jobtracker.staging.root.dir, and mapreduce.cluster.temp.dir, which have paths based on ${hadoop.tmp.dir} above)

{noformat}
<property>
  <name>yarn.app.mapreduce.am.staging-dir</name>
  <value>/p/lcratery/achu/testing/rawnetworkfs//test/1181121/node-0/yarn/</value>
</property>
{noformat}

In spark-defaults.conf

{noformat}
spark.local.dir /p/lcratery/achu/testing/rawnetworkfs//test/1181121/node-0/spark/node-0
{noformat}

(I can post full config files if you're interested, but I suspect that's excessive.)

The paths above happen to be on a Lustre file system, but the problem was also exhibited on NFS.

Re-running my tests today, things still work on Spark 2.1.1 but broke on Spark 2.2.0. I re-ran against Spark 1.6.0 too, and it passed there.

The job run itself isn't particularly magical. It's just a simple spark-submit call w/ the Spark wordcount example. The Spark binaries are stored in NFS, where they are available on all nodes. The data itself is also in NFS where all can read it (it's a small 4-line file; this is just a sanity test).

{noformat}
spark-submit --class org.apache.spark.examples.JavaWordCount ${PATH_IN_NFS}/spark-examples_2.11-2.2.0.jar file://
{noformat}
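For context on why the lack of a scheme matters: Hadoop chooses a FileSystem implementation by URI scheme, and "file" maps to the local-filesystem classes (LocalFileSystem wrapping the RawLocalFileSystem seen in the stack trace), which assume the path is ordinary local disk. A minimal sketch of that dispatch, with an illustrative scheme-to-class table rather than Hadoop's actual registry code:

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

public class SchemeDispatchSketch {
    // Illustrative subset of Hadoop's scheme -> FileSystem mapping.
    // A networked mount (NFS, Lustre) has no scheme of its own, so its
    // paths fall under "file" and get local-disk assumptions applied.
    static final Map<String, String> FS_BY_SCHEME = new HashMap<>();
    static {
        FS_BY_SCHEME.put("hdfs", "org.apache.hadoop.hdfs.DistributedFileSystem");
        FS_BY_SCHEME.put("file", "org.apache.hadoop.fs.LocalFileSystem");
    }

    static String fsFor(String path) {
        // Paths with no scheme at all also resolve via fs.defaultFS,
        // which in this setup is file:///.
        String scheme = URI.create(path).getScheme();
        return FS_BY_SCHEME.getOrDefault(scheme == null ? "file" : scheme, "unknown");
    }

    public static void main(String[] args) {
        System.out.println(fsFor("file:///p/lcratery/achu/testing/rawnetworkfs/data.txt"));
        System.out.println(fsFor("hdfs://namenode:8020/data.txt"));
    }
}
```

A Lustre or NFS mount is therefore invisible as such to Hadoop; nothing in the "file" path warns it that visibility on other nodes may lag.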
[ https://issues.apache.org/jira/browse/SPARK-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16110617#comment-16110617 ] Saisai Shao commented on SPARK-21570:

This __spark_libs__XXX.zip is created by Spark on YARN: it zips the Spark dependencies and uploads the archive to HDFS; YARN then downloads it from HDFS into a local dir, and it is used for the Spark AM and executor launch classpath. Usually there's no issue whether you're using HDFS/S3/WASB, as long as this zip file can be reached by the NMs across the cluster.

I'm wondering if you're using some different way to start the Spark on YARN application, or if your cluster is a little different from a normal setup. I think it should not be an issue in Spark; most likely it is a setup issue. Can you please list the steps to reproduce this issue? I'm not quite following your description above.
[ https://issues.apache.org/jira/browse/SPARK-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16106609#comment-16106609 ] Albert Chu commented on SPARK-21570:

Yeah, that's understandable. IIRC, on most networked file systems, closing a file (or equivalent) is enough to ensure that changes are noticed by all other nodes. I skimmed the changes hoping to find something obvious, but didn't see anything.
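The close-to-open guarantee mentioned above can be made concrete. A minimal sketch in plain java.io (not Spark code; it assumes NFS-style close-to-open semantics as described in the comment): write the data, force it to the server, and close the stream before the path is advertised to anyone else.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;

public class CloseToOpenSketch {
    // Write bytes, push them to stable storage, and close the stream
    // *before* the path is handed to another node. Under close-to-open
    // consistency, a client that opens the file after this close() returns
    // is supposed to see the full contents.
    static void writeAndPublish(File f, byte[] data) throws IOException {
        try (FileOutputStream out = new FileOutputStream(f)) {
            out.write(data);
            out.getFD().sync(); // flush through to the server, not just the local page cache
        } // try-with-resources closes the stream, completing the handoff
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("nfs-demo").toFile();
        File f = new File(dir, "__spark_libs__demo.zip");
        writeAndPublish(f, "payload".getBytes());
        System.out.println(f.length()); // 7
    }
}
```

If the writer follows this pattern and a reader on another node still sees FileNotFoundException, the suspicion of a race between creation and localization becomes more plausible.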
[ https://issues.apache.org/jira/browse/SPARK-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16106607#comment-16106607 ] Albert Chu commented on SPARK-21570:

FWIW (and my Scala and Spark code knowledge is not elite), in resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala:

{noformat}
// Subdirectory where Spark libraries will be placed.
val LOCALIZED_LIB_DIR = "__spark_libs__"
{noformat}

{noformat}
val jarsArchive = File.createTempFile(LOCALIZED_LIB_DIR, ".zip",
  new File(Utils.getLocalDir(sparkConf)))
{noformat}

So it looks like Spark does create this file.
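The quoted Client.scala call can be exercised in isolation. A minimal Java sketch (a scratch directory stands in for Utils.getLocalDir(sparkConf)) showing that createTempFile materializes the __spark_libs__*.zip name directly under whatever spark.local.dir resolves to, which in the reported setup is a networked mount:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class JarsArchiveSketch {
    public static void main(String[] args) throws IOException {
        // Stand-in for Utils.getLocalDir(sparkConf); in the reported setup
        // this resolves to spark.local.dir on a Lustre/NFS mount.
        File localDir = Files.createTempDirectory("spark-local").toFile();

        // Mirrors the Client.scala call:
        //   File.createTempFile(LOCALIZED_LIB_DIR, ".zip", new File(Utils.getLocalDir(sparkConf)))
        File jarsArchive = File.createTempFile("__spark_libs__", ".zip", localDir);

        // The archive exists immediately on the node that created it; the
        // bug report is about *other* nodes not seeing it yet.
        System.out.println(jarsArchive.getName());
        System.out.println(jarsArchive.exists()); // true
    }
}
```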
[ https://issues.apache.org/jira/browse/SPARK-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16106591#comment-16106591 ] Sean Owen commented on SPARK-21570:

Got it. I still think that when Spark writes the archive to a local file system, it assumes the file is written and visible for reading elsewhere, which isn't true with this NFS mount. I don't know enough to say whether that is something the file system can control, like blocking writes until they're propagated, or something similar.
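One conceivable mitigation on the consuming side (purely hypothetical; neither Spark nor YARN does this, and the helper name is mine) is to give a lagging mount a short grace period before declaring the file missing:

```java
import java.io.File;
import java.nio.file.Files;

public class WaitForFile {
    // Hypothetical helper: poll until the path becomes visible on this node
    // or the timeout elapses. Sketches the grace period a localizer would
    // need when the file was created on another node of a slow shared mount.
    static boolean waitForFile(File f, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!f.exists()) {
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                return f.exists();
            }
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        File f = Files.createTempFile("visible", ".zip").toFile();
        System.out.println(waitForFile(f, 1000));                  // true: already there
        System.out.println(waitForFile(new File(f, "nope"), 200)); // false: never appears
    }
}
```

The real fix would belong in whichever component assumes instant cross-node visibility, but the sketch shows what "blocking until propagated" would have to look like from the reader's end.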
[ https://issues.apache.org/jira/browse/SPARK-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16106585#comment-16106585 ] Albert Chu commented on SPARK-21570:

My assumption was that the "__spark_libs__695301535722158702.zip" file was created by Spark, something that's done internally. Likewise with the "spark-292938be-7ae3-460f-aca7-294083ebb790" directory. They are deleted after the job dies, which suggests Spark did create them.

Effectively, the test is:

{noformat}
spark-submit --class org.apache.spark.examples.JavaWordCount ${PATHTO}/spark-examples_2.11-2.2.0.jar
{noformat}

Nothing particularly extraordinary.
[jira] [Commented] (SPARK-21570) File __spark_libs__XXX.zip does not exist on networked file system w/ yarn
[ https://issues.apache.org/jira/browse/SPARK-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16106368#comment-16106368 ] Sean Owen commented on SPARK-21570: --- I'm not sure how you're running it or what creates the file -- Spark? but if you write a file to the FS and it doesn't appear for some time across the hosts that mount it, yes I'm not sure how anything can know to expect the file. I don't think that's ever been different, though you can imagine scenarios in which this happens to work. > File __spark_libs__XXX.zip does not exist on networked file system w/ yarn > -- > > Key: SPARK-21570 > URL: https://issues.apache.org/jira/browse/SPARK-21570 > Project: Spark > Issue Type: Bug > Components: YARN >Affects Versions: 2.2.0 >Reporter: Albert Chu > > I have a set of scripts that run Spark with data in a networked file system. > One of my unit tests to make sure things don't break between Spark releases > is to simply run a word count (via org.apache.spark.examples.JavaWordCount) > on a file in the networked file system. This test broke with Spark 2.2.0 > when I use yarn to launch the job (using the spark standalone scheduler > things still work). I'm currently using Hadoop 2.7.0. 
[ https://issues.apache.org/jira/browse/SPARK-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16106157#comment-16106157 ] Albert Chu commented on SPARK-21570:
The file was confirmed to show up, although at the time I did not confirm permissions. I ran a test today and confirmed the permissions were valid (assuming -rw-r--r-- is valid). Fair enough if YARN with a non-HDFS file system is not supported. FWIW, this test has passed since Spark 2.0.
[ https://issues.apache.org/jira/browse/SPARK-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16106051#comment-16106051 ] Sean Owen commented on SPARK-21570:
---
I think either the file indeed doesn't show up in time, or it has permission problems or something similar. This seems to be an issue with what you're trying to test, which is outside Spark and not necessarily supported.
[ https://issues.apache.org/jira/browse/SPARK-21570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105900#comment-16105900 ] Albert Chu commented on SPARK-21570:
Oh, and because it will likely be asked and may well be relevant: in this test setup, HDFS is not used at all.
{noformat}
fs.defaultFS
file:///
{noformat}
All temp dirs, staging dirs, etc. are configured to appropriate locations in /tmp or somewhere in the networked file system.
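For context, the `fs.defaultFS` value quoted above would normally live in Hadoop's core-site.xml; a sketch of the corresponding entry (the surrounding `<configuration>` element and comment are assumptions, only the property name and value come from the comment):

```xml
<configuration>
  <!-- No HDFS: the default file system is the local / network-mounted tree -->
  <property>
    <name>fs.defaultFS</name>
    <value>file:///</value>
  </property>
</configuration>
```

With this setting, Hadoop resolves the default scheme to its local file system implementation, which is why `RawLocalFileSystem` appears in the stack trace above rather than an HDFS client.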