[jira] [Commented] (HADOOP-18173) S3a copyFromLocalOperation doesn't support single file

2022-06-08 Thread Tayfun (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17551628#comment-17551628
 ] 

Tayfun commented on HADOOP-18173:
-

Greetings. I'm quite new to the project, investigating a similar issue.

If i may ask, can you have a look at `getFinalPath` where that exception is 
thrown? It feels like `relativeSrcUri` must be the relative path between `src` 
and `destination`, instead of `source`

!image-2022-06-08-15-12-17-066.png!

> S3a copyFromLocalOperation doesn't support single file
> --
>
> Key: HADOOP-18173
> URL: https://issues.apache.org/jira/browse/HADOOP-18173
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.2
> Environment: Hadoop version 3.3.2
> Spark version 3.4.0-SNAPSHOT
> use minio:latest to mock S3 filesystem
>  
>Reporter: qian
>Priority: Major
> Attachments: minio.yaml
>
>
> Spark job uses aws s3 as fileSystem and calls 
> {code:java}
> fs.copyFromLocalFile(delSrc, overwrite, src, dest) 
> delSrc = false
> overwrite = true
> src = 
> "/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar"
> dest = 
> "s3a://spark/spark-upload-a703d8e7-8dd2-4e29-beca-b4df2fedefbd/spark-examples_2.12-3.4.0-SNAPSHOT.jar"{code}
> Then throw a PathIOException, message is as follow
> {code:java}
> Exception in thread "main" org.apache.spark.SparkException: Uploading file 
> /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar
>  failed...
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332)
> 
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277)
> 
> at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) 
>
> at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)   
>  
> at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)  
>   
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
> at scala.collection.TraversableLike.map(TraversableLike.scala:286)
> at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
> at scala.collection.AbstractTraversable.map(Traversable.scala:108)
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275)
> 
> at 
> org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187)
>
> at scala.collection.immutable.List.foreach(List.scala:431)
> at 
> org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178)
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86)
> at 
> scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
> 
> at 
> scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)   
>  
> at scala.collection.immutable.List.foldLeft(List.scala:91)
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84)
> 
> at 
> org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242)
>  
> at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738)
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214)
> 
> at 
> org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
> 
> at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) 
>
> at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
> at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
> at 
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)  
>   
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)Caused by: 
> org.apache.spark.SparkException: Error uploading file 
> spark-examples_2.12-3.4.0-SNAPSHOT.jar

[jira] [Commented] (HADOOP-18173) S3a copyFromLocalOperation doesn't support single file

2022-04-18 Thread qian (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17523977#comment-17523977
 ] 

qian commented on HADOOP-18173:
---

CopyFromLocalOperation doesn't support file with file:// schema

> S3a copyFromLocalOperation doesn't support single file
> --
>
> Key: HADOOP-18173
> URL: https://issues.apache.org/jira/browse/HADOOP-18173
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.2
> Environment: Hadoop version 3.3.2
> Spark version 3.4.0-SNAPSHOT
> use minio:latest to mock S3 filesystem
>  
>Reporter: qian
>Priority: Major
> Attachments: minio.yaml
>
>
> Spark job uses aws s3 as fileSystem and calls 
> {code:java}
> fs.copyFromLocalFile(delSrc, overwrite, src, dest) 
> delSrc = false
> overwrite = true
> src = 
> "/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar"
> dest = 
> "s3a://spark/spark-upload-a703d8e7-8dd2-4e29-beca-b4df2fedefbd/spark-examples_2.12-3.4.0-SNAPSHOT.jar"{code}
> Then throw a PathIOException, message is as follow
> {code:java}
> Exception in thread "main" org.apache.spark.SparkException: Uploading file 
> /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar
>  failed...
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332)
> 
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277)
> 
> at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) 
>
> at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)   
>  
> at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)  
>   
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
> at scala.collection.TraversableLike.map(TraversableLike.scala:286)
> at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
> at scala.collection.AbstractTraversable.map(Traversable.scala:108)
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275)
> 
> at 
> org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187)
>
> at scala.collection.immutable.List.foreach(List.scala:431)
> at 
> org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178)
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86)
> at 
> scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
> 
> at 
> scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)   
>  
> at scala.collection.immutable.List.foldLeft(List.scala:91)
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84)
> 
> at 
> org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242)
>  
> at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738)
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214)
> 
> at 
> org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
> 
> at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) 
>
> at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
> at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
> at 
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)  
>   
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)Caused by: 
> org.apache.spark.SparkException: Error uploading file 
> spark-examples_2.12-3.4.0-SNAPSHOT.jar
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileToHadoopCompatibleFS(KubernetesUtils.scala:355)
> 
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:328)
> 
> ... 30 more
> 

[jira] [Commented] (HADOOP-18173) S3a copyFromLocalOperation doesn't support single file

2022-04-18 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17523878#comment-17523878
 ] 

Dongjoon Hyun commented on HADOOP-18173:


Yes, it was a API behavior regression which broke the downstream code. At 
least, Apache Spark fixed two places and I guess other downstream projects or 
user applications will hit the same issues and need to fix the behavior.
- SPARK-38652 uploadFileUri should preserve file scheme
- SPARK-38706 Use URI in FallbackStorage.copy

> S3a copyFromLocalOperation doesn't support single file
> --
>
> Key: HADOOP-18173
> URL: https://issues.apache.org/jira/browse/HADOOP-18173
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.2
> Environment: Hadoop version 3.3.2
> Spark version 3.4.0-SNAPSHOT
> use minio:latest to mock S3 filesystem
>  
>Reporter: qian
>Priority: Major
> Attachments: minio.yaml
>
>
> Spark job uses aws s3 as fileSystem and calls 
> {code:java}
> fs.copyFromLocalFile(delSrc, overwrite, src, dest) 
> delSrc = false
> overwrite = true
> src = 
> "/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar"
> dest = 
> "s3a://spark/spark-upload-a703d8e7-8dd2-4e29-beca-b4df2fedefbd/spark-examples_2.12-3.4.0-SNAPSHOT.jar"{code}
> Then throw a PathIOException, message is as follow
> {code:java}
> Exception in thread "main" org.apache.spark.SparkException: Uploading file 
> /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar
>  failed...
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332)
> 
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277)
> 
> at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) 
>
> at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)   
>  
> at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)  
>   
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
> at scala.collection.TraversableLike.map(TraversableLike.scala:286)
> at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
> at scala.collection.AbstractTraversable.map(Traversable.scala:108)
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275)
> 
> at 
> org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187)
>
> at scala.collection.immutable.List.foreach(List.scala:431)
> at 
> org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178)
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86)
> at 
> scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
> 
> at 
> scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)   
>  
> at scala.collection.immutable.List.foldLeft(List.scala:91)
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84)
> 
> at 
> org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242)
>  
> at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738)
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214)
> 
> at 
> org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
> 
> at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) 
>
> at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
> at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
> at 
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)  
>   
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)Caused by: 
> org.apache.spark.SparkException: Error uploading file 
> 

[jira] [Commented] (HADOOP-18173) S3a copyFromLocalOperation doesn't support single file

2022-04-18 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17523762#comment-17523762
 ] 

Steve Loughran commented on HADOOP-18173:
-

where are with this? is it a regression?

> S3a copyFromLocalOperation doesn't support single file
> --
>
> Key: HADOOP-18173
> URL: https://issues.apache.org/jira/browse/HADOOP-18173
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.2
> Environment: Hadoop version 3.3.2
> Spark version 3.4.0-SNAPSHOT
> use minio:latest to mock S3 filesystem
>  
>Reporter: qian
>Priority: Major
> Attachments: minio.yaml
>
>
> Spark job uses aws s3 as fileSystem and calls 
> {code:java}
> fs.copyFromLocalFile(delSrc, overwrite, src, dest) 
> delSrc = false
> overwrite = true
> src = 
> "/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar"
> dest = 
> "s3a://spark/spark-upload-a703d8e7-8dd2-4e29-beca-b4df2fedefbd/spark-examples_2.12-3.4.0-SNAPSHOT.jar"{code}
> Then throw a PathIOException, message is as follow
> {code:java}
> Exception in thread "main" org.apache.spark.SparkException: Uploading file 
> /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar
>  failed...
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332)
> 
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277)
> 
> at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) 
>
> at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)   
>  
> at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)  
>   
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
> at scala.collection.TraversableLike.map(TraversableLike.scala:286)
> at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
> at scala.collection.AbstractTraversable.map(Traversable.scala:108)
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275)
> 
> at 
> org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187)
>
> at scala.collection.immutable.List.foreach(List.scala:431)
> at 
> org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178)
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86)
> at 
> scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
> 
> at 
> scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)   
>  
> at scala.collection.immutable.List.foldLeft(List.scala:91)
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84)
> 
> at 
> org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242)
>  
> at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738)
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214)
> 
> at 
> org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
> 
> at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) 
>
> at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
> at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
> at 
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)  
>   
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)Caused by: 
> org.apache.spark.SparkException: Error uploading file 
> spark-examples_2.12-3.4.0-SNAPSHOT.jar
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileToHadoopCompatibleFS(KubernetesUtils.scala:355)
> 
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:328)
> 
> ... 30 more
> Caused 

[jira] [Commented] (HADOOP-18173) S3a copyFromLocalOperation doesn't support single file

2022-03-29 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17514433#comment-17514433
 ] 

Dongjoon Hyun commented on HADOOP-18173:


This seems to be a regression of HADOOP-17139

> S3a copyFromLocalOperation doesn't support single file
> --
>
> Key: HADOOP-18173
> URL: https://issues.apache.org/jira/browse/HADOOP-18173
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.2
> Environment: Hadoop version 3.3.2
> Spark version 3.4.0-SNAPSHOT
> use minio:latest to mock S3 filesystem
>  
>Reporter: qian
>Priority: Major
> Attachments: minio.yaml
>
>
> Spark job uses aws s3 as fileSystem and calls 
> {code:java}
> fs.copyFromLocalFile(delSrc, overwrite, src, dest) 
> delSrc = false
> overwrite = true
> src = 
> "/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar"
> dest = 
> "s3a://spark/spark-upload-a703d8e7-8dd2-4e29-beca-b4df2fedefbd/spark-examples_2.12-3.4.0-SNAPSHOT.jar"{code}
> Then throw a PathIOException, message is as follow
> {code:java}
> Exception in thread "main" org.apache.spark.SparkException: Uploading file 
> /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar
>  failed...
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332)
> 
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277)
> 
> at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) 
>
> at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)   
>  
> at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)  
>   
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
> at scala.collection.TraversableLike.map(TraversableLike.scala:286)
> at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
> at scala.collection.AbstractTraversable.map(Traversable.scala:108)
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275)
> 
> at 
> org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187)
>
> at scala.collection.immutable.List.foreach(List.scala:431)
> at 
> org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178)
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86)
> at 
> scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
> 
> at 
> scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)   
>  
> at scala.collection.immutable.List.foldLeft(List.scala:91)
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84)
> 
> at 
> org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242)
>  
> at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738)
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214)
> 
> at 
> org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
> 
> at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) 
>
> at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
> at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
> at 
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)  
>   
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)Caused by: 
> org.apache.spark.SparkException: Error uploading file 
> spark-examples_2.12-3.4.0-SNAPSHOT.jar
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileToHadoopCompatibleFS(KubernetesUtils.scala:355)
> 
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:328)
> 
> ... 30 more
> 

[jira] [Commented] (HADOOP-18173) S3a copyFromLocalOperation doesn't support single file

2022-03-25 Thread qian (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17512656#comment-17512656
 ] 

qian commented on HADOOP-18173:
---

Hi [~ste...@apache.org] 

Thanks for your help. Maybe I confused s3 and s3a. Is there a difference in 
usage between s3 and s3a?

Good idea to check out hadoop trunk source and run the hadoop aws tests.

Minio deployment yaml is here 

[^minio.yaml]

^I see you have a look on SPARK-38652. DepsTestsSuite works normally with 
hadoop-aws-3.3.1 and minio deployment has no change.^

> S3a copyFromLocalOperation doesn't support single file
> --
>
> Key: HADOOP-18173
> URL: https://issues.apache.org/jira/browse/HADOOP-18173
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.2
> Environment: Hadoop version 3.3.2
> Spark version 3.4.0-SNAPSHOT
> use minio:latest to mock S3 filesystem
>  
>Reporter: qian
>Priority: Major
> Attachments: minio.yaml
>
>
> Spark job uses aws s3 as fileSystem and calls 
> {code:java}
> fs.copyFromLocalFile(delSrc, overwrite, src, dest) 
> delSrc = false
> overwrite = true
> src = 
> "/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar"
> dest = 
> "s3a://spark/spark-upload-a703d8e7-8dd2-4e29-beca-b4df2fedefbd/spark-examples_2.12-3.4.0-SNAPSHOT.jar"{code}
> Then throw a PathIOException, message is as follow
> {code:java}
> Exception in thread "main" org.apache.spark.SparkException: Uploading file 
> /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar
>  failed...
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332)
> 
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277)
> 
> at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) 
>
> at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)   
>  
> at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)  
>   
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
> at scala.collection.TraversableLike.map(TraversableLike.scala:286)
> at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
> at scala.collection.AbstractTraversable.map(Traversable.scala:108)
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275)
> 
> at 
> org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187)
>
> at scala.collection.immutable.List.foreach(List.scala:431)
> at 
> org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178)
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86)
> at 
> scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
> 
> at 
> scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)   
>  
> at scala.collection.immutable.List.foldLeft(List.scala:91)
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84)
> 
> at 
> org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242)
>  
> at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738)
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214)
> 
> at 
> org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
> 
> at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) 
>
> at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
> at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
> at 
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)  
>   
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)Caused by: 
> org.apache.spark.SparkException: