Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2024-03-08 Thread via GitHub
github-actions[bot] closed pull request #43936: [SPARK-46034][CORE] SparkContext add file should also copy file to local root path URL: https://github.com/apache/spark/pull/43936 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2024-03-07 Thread via GitHub
github-actions[bot] commented on PR #43936: URL: https://github.com/apache/spark/pull/43936#issuecomment-1984826472 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-27 Thread via GitHub
HyukjinKwon commented on PR #43936: URL: https://github.com/apache/spark/pull/43936#issuecomment-1829030542 @AngersZh can you fix the PR description, and explain how this issue happens specifically in Yarn cluster mode? I still can't fully follow where the issue is.

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-27 Thread via GitHub
junyi1313 commented on PR #43936: URL: https://github.com/apache/spark/pull/43936#issuecomment-1829023594 > The root cause is that the Spark driver downloads the file to its `driverTempPath`, but doesn't download it to the container's execution root path. So in yarn cluster mode, if we need to use the file in

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-27 Thread via GitHub
tgravescs commented on PR #43936: URL: https://github.com/apache/spark/pull/43936#issuecomment-1828063684 I have not used addFiles on yarn in a long time so can't speak to whether it got broken. Generally speaking it's not recommended and users should pass files on submission. Whatever

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-27 Thread via GitHub
AngersZh commented on PR #43936: URL: https://github.com/apache/spark/pull/43936#issuecomment-1827485987 I don't know why we added a `driverTmpDir`; removing `driverTmpDir` can also resolve this issue.

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-27 Thread via GitHub
AngersZh commented on PR #43936: URL: https://github.com/apache/spark/pull/43936#issuecomment-1827474881 > Could you please update the PR description? It looks overdue and inaccurate for me to catch up with. > > Could you also provide the output for `LIST FILE` both before and

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-26 Thread via GitHub
yaooqinn commented on PR #43936: URL: https://github.com/apache/spark/pull/43936#issuecomment-1827051972 Could you please update the PR description? It looks overdue and inaccurate for me to catch up with. Could you also provide the output for `LIST FILE`? I guess we shall use its

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-26 Thread via GitHub
AngersZh commented on PR #43936: URL: https://github.com/apache/spark/pull/43936#issuecomment-1827039250 gentle ping @yaooqinn @cloud-fan @HyukjinKwon @tgravescs Could you take a look?

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-23 Thread via GitHub
AngersZh commented on PR #43936: URL: https://github.com/apache/spark/pull/43936#issuecomment-1825190497 > Can we fix `SparkFiles.get` at driver side when Yarn cluster is used? `SparkFiles.get()` doesn't have a problem. The problem is that the user uses a relative path to access the added file.
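A rough illustration of the distinction drawn in this comment, as a stdlib-only simulation rather than Spark code. It assumes, as the thread describes, that the driver places the added file in a temporary directory and not in the container's working directory. The helper names `simulate_add_file` and `resolve_like_sparkfiles_get` are hypothetical stand-ins and not part of any Spark API; the file name `feature_map.txt` is borrowed from the log excerpt in this thread.

```python
import os
import tempfile


def simulate_add_file(name: str, content: str) -> str:
    """Hypothetical stand-in for the driver-side download: write the file
    into a fresh temporary directory and return that directory."""
    driver_tmp_dir = tempfile.mkdtemp(prefix="driver-tmp-")
    with open(os.path.join(driver_tmp_dir, name), "w") as f:
        f.write(content)
    return driver_tmp_dir


def resolve_like_sparkfiles_get(root_dir: str, name: str) -> str:
    """Hypothetical stand-in for an absolute lookup: join the known root
    directory with the file name."""
    return os.path.join(root_dir, name)


# A separate directory plays the role of the container's working directory.
container_root = tempfile.mkdtemp(prefix="container-root-")
driver_tmp_dir = simulate_add_file("feature_map.txt", "a=1\n")

# What a bare relative path would resolve to if container_root were the cwd.
relative_path = os.path.join(container_root, "feature_map.txt")
resolved_path = resolve_like_sparkfiles_get(driver_tmp_dir, "feature_map.txt")

relative_lookup_works = os.path.exists(relative_path)
resolved_lookup_works = os.path.exists(resolved_path)

print("relative lookup finds file:", relative_lookup_works)
print("resolved lookup finds file:", resolved_lookup_works)
```

In this simulation only the lookup that goes through the known root directory finds the file, which mirrors the claim that the absolute lookup works while a bare relative path does not.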

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-23 Thread via GitHub
HyukjinKwon commented on PR #43936: URL: https://github.com/apache/spark/pull/43936#issuecomment-1825009537 Can we fix `SparkFiles.get` at driver side when Yarn cluster is used?

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-23 Thread via GitHub
AngersZh commented on PR #43936: URL: https://github.com/apache/spark/pull/43936#issuecomment-1824131573 Any more suggestions? cc @HyukjinKwon @cloud-fan

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-22 Thread via GitHub
AngersZh commented on code in PR #43936: URL: https://github.com/apache/spark/pull/43936#discussion_r1401427353 ## core/src/main/scala/org/apache/spark/SparkContext.scala: ## @@ -1822,7 +1822,7 @@ class SparkContext(config: SparkConf) extends Logging {

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-22 Thread via GitHub
AngersZh commented on PR #43936: URL: https://github.com/apache/spark/pull/43936#issuecomment-1822478004 ``` Fetching hdfs://R2/projects/search_algo/hdfs/dev/typhoon.bo/uploader/ego_config/feature_map.txt to

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-22 Thread via GitHub
AngersZh commented on PR #43936: URL: https://github.com/apache/spark/pull/43936#issuecomment-1822401850 The root cause is that the Spark driver downloads the file to its `driverTempPath`, but doesn't download it to the container's execution root path. So in yarn cluster mode, if we need to use the file

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-22 Thread via GitHub
AngersZh commented on PR #43936: URL: https://github.com/apache/spark/pull/43936#issuecomment-1822369346 > cc @mridulm or @tgravescs have you ever seen such things before?: `SparkContext.addFiles` adds a file with a temporary name, and you cannot get it with the original name from

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-22 Thread via GitHub
HyukjinKwon commented on PR #43936: URL: https://github.com/apache/spark/pull/43936#issuecomment-1822356253 cc @mridulm have you ever seen such things before?

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-22 Thread via GitHub
AngersZh commented on PR #43936: URL: https://github.com/apache/spark/pull/43936#issuecomment-1822339813 I must have made a mistake. On the driver side, the file was downloaded and copied to the path driverTempPath ```

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-21 Thread via GitHub
HyukjinKwon commented on code in PR #43936: URL: https://github.com/apache/spark/pull/43936#discussion_r1401465887 ## core/src/main/scala/org/apache/spark/SparkContext.scala: ## @@ -1822,7 +1822,7 @@ class SparkContext(config: SparkConf) extends Logging { logInfo(s"Added

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-21 Thread via GitHub
AngersZh commented on code in PR #43936: URL: https://github.com/apache/spark/pull/43936#discussion_r1401448409 ## core/src/main/scala/org/apache/spark/SparkContext.scala: ## @@ -1836,7 +1836,7 @@ class SparkContext(config: SparkConf) extends Logging { val uriToUse =

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-21 Thread via GitHub
AngersZh commented on code in PR #43936: URL: https://github.com/apache/spark/pull/43936#discussion_r1401447578 ## core/src/main/scala/org/apache/spark/SparkContext.scala: ## @@ -1822,7 +1822,7 @@ class SparkContext(config: SparkConf) extends Logging {

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-21 Thread via GitHub
AngersZh commented on code in PR #43936: URL: https://github.com/apache/spark/pull/43936#discussion_r1401447858 ## core/src/main/scala/org/apache/spark/SparkContext.scala: ## @@ -1836,7 +1836,7 @@ class SparkContext(config: SparkConf) extends Logging { val uriToUse =

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-21 Thread via GitHub
AngersZh commented on code in PR #43936: URL: https://github.com/apache/spark/pull/43936#discussion_r1401447419 ## core/src/main/scala/org/apache/spark/SparkContext.scala: ## @@ -1822,7 +1822,7 @@ class SparkContext(config: SparkConf) extends Logging {

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-21 Thread via GitHub
HyukjinKwon commented on code in PR #43936: URL: https://github.com/apache/spark/pull/43936#discussion_r1401445511 ## core/src/main/scala/org/apache/spark/SparkContext.scala: ## @@ -1822,7 +1822,7 @@ class SparkContext(config: SparkConf) extends Logging { logInfo(s"Added

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-21 Thread via GitHub
HyukjinKwon commented on code in PR #43936: URL: https://github.com/apache/spark/pull/43936#discussion_r1401443129 ## core/src/main/scala/org/apache/spark/SparkContext.scala: ## @@ -1836,7 +1836,7 @@ class SparkContext(config: SparkConf) extends Logging { val uriToUse =

Re: [PR] [SPARK-46034][CORE] SparkContext add file should also copy file to local root path [spark]

2023-11-21 Thread via GitHub
AngersZh commented on code in PR #43936: URL: https://github.com/apache/spark/pull/43936#discussion_r1401427353 ## core/src/main/scala/org/apache/spark/SparkContext.scala: ## @@ -1822,7 +1822,7 @@ class SparkContext(config: SparkConf) extends Logging {