[jira] [Commented] (HADOOP-18173) AWS S3 copyFromLocalOperation doesn't support single file

2022-03-25  Steve Loughran (Jira)


[ https://issues.apache.org/jira/browse/HADOOP-18173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512412#comment-17512412 ]

Steve Loughran commented on HADOOP-18173:
-----------------------------------------

Afraid Bogdan has moved from AWS to Meta and won't be fielding this regression.

bq. It looks like copyFromLocalOperation doesn't support single file.

It is meant to, and we have tests for that (AbstractContractCopyFromLocalTest), 
so either the tests are incomplete or there is something up with your 
deployment against minio.

Could you check out the Hadoop trunk source and run the hadoop-aws tests 
against your setup, especially the ITestS3ACopyFromLocalFile one?
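
If you do, something like the following should run just that suite from a trunk checkout; this is a sketch, exact flags can vary by branch, so treat the hadoop-aws testing docs as authoritative. The auth-keys.xml file is the standard hook for pointing the tests at a third-party store:

{code}
# from a hadoop source checkout; create
# hadoop-tools/hadoop-aws/src/test/resources/auth-keys.xml carrying your
# minio endpoint (fs.s3a.endpoint), fs.s3a.path.style.access=true and the
# access/secret keys, then run only the copyFromLocal integration test:
cd hadoop-tools/hadoop-aws
mvn verify -Dit.test=ITestS3ACopyFromLocalFile
{code}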


[jira] [Commented] (HADOOP-18173) AWS S3 copyFromLocalOperation doesn't support single file

2022-03-24  qian (Jira)


[ https://issues.apache.org/jira/browse/HADOOP-18173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512156#comment-17512156 ]

qian commented on HADOOP-18173:
-------------------------------

[~bogthe] Hi, could you help me with this case?

> AWS S3 copyFromLocalOperation doesn't support single file
> ----------------------------------------------------------
>
>                 Key: HADOOP-18173
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18173
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 3.3.2
>         Environment: Hadoop version 3.3.2
>                      Spark version 3.4.0-SNAPSHOT
>                      use minio:latest to mock S3 filesystem
>            Reporter: qian
>            Priority: Major
>
> A Spark job uses AWS S3 as the file system and calls:
> {code:java}
> fs.copyFromLocalFile(delSrc, overwrite, src, dest)
> delSrc = false
> overwrite = true
> src = "/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar"
> dest = "s3a://spark/spark-upload-a703d8e7-8dd2-4e29-beca-b4df2fedefbd/spark-examples_2.12-3.4.0-SNAPSHOT.jar"
> {code}
> It then throws a PathIOException; the message is as follows:
> {code:java}
> Exception in thread "main" org.apache.spark.SparkException: Uploading file /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar failed...
>     at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332)
>     at org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277)
>     at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
>     at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>     at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
>     at scala.collection.TraversableLike.map(TraversableLike.scala:286)
>     at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
>     at scala.collection.AbstractTraversable.map(Traversable.scala:108)
>     at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275)
>     at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187)
>     at scala.collection.immutable.List.foreach(List.scala:431)
>     at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178)
>     at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86)
>     at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
>     at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
>     at scala.collection.immutable.List.foldLeft(List.scala:91)
>     at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84)
>     at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104)
>     at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248)
>     at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242)
>     at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738)
>     at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242)
>     at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214)
>     at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
>     at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>     at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>     at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>     at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: org.apache.spark.SparkException: Error uploading file spark-examples_2.12-3.4.0-SNAPSHOT.jar
>     at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileToHadoopCompatibleFS(KubernetesUtils.scala:355)
>     at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:328)
>     ... 30 more
> Caused by: org.apache.hadoop.fs.PathIOException: `Cannot get
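
For anyone wanting to reproduce the reported failure without going through spark-submit, a minimal standalone sketch of the same call might look like the following; the minio endpoint, credentials, bucket and paths below are placeholders, not values taken from the report:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CopyFromLocalRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // hypothetical minio settings; substitute the real endpoint and keys
    conf.set("fs.s3a.endpoint", "http://127.0.0.1:9000");
    conf.set("fs.s3a.path.style.access", "true");
    conf.set("fs.s3a.access.key", "minioadmin");
    conf.set("fs.s3a.secret.key", "minioadmin");

    // delSrc = false, overwrite = true, single local file -> s3a object:
    // the same argument pattern as in the report above
    Path src = new Path("/tmp/spark-examples_2.12-3.4.0-SNAPSHOT.jar");
    Path dest = new Path("s3a://spark/spark-upload-test/spark-examples_2.12-3.4.0-SNAPSHOT.jar");

    try (FileSystem fs = FileSystem.get(dest.toUri(), conf)) {
      fs.copyFromLocalFile(false, true, src, dest);
      System.out.println("copied: " + fs.getFileStatus(dest));
    }
  }
}
{code}

If a sketch like this fails against minio but passes against AWS S3 itself, the problem more likely lies in the store or its deployment than in the copyFromLocal code path.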