[ https://issues.apache.org/jira/browse/SPARK-42425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17699042#comment-17699042 ]

Sean R. Owen commented on SPARK-42425:
--------------------------------------

The docs don't say it's part of the Spark distro. In fact, they tell you to 
bundle it in your app. It is deliberately left out of the distribution.
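
For illustration only, here is a minimal sketch of what bundling could look 
like in an sbt build. It is not taken from the docs or from this thread. The 
Scala version, the choice of spark-sql as the main Spark module, and the use 
of a fat-jar plugin such as sbt-assembly to package the result are all 
assumptions; 3.3.1 is simply the version named in the issue.

```scala
// build.sbt (sketch; Scala version and module choice are assumptions)
ThisBuild / scalaVersion := "2.12.17"

// The Spark version named in the issue's "Affects Versions" field.
lazy val sparkVersion = "3.3.1"

libraryDependencies ++= Seq(
  // Spark itself is supplied by the installation at runtime,
  // so it is marked Provided and kept out of the application jar.
  "org.apache.spark" %% "spark-sql" % sparkVersion % Provided,
  // spark-hadoop-cloud is not in the default distribution, so it is
  // declared at the default (compile) scope and ships with the app.
  "org.apache.spark" %% "spark-hadoop-cloud" % sparkVersion
)
```

With a setup along these lines, spark-hadoop-cloud and its transitive 
dependencies would end up in the application jar rather than being expected 
from the Spark installation.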

> spark-hadoop-cloud is not provided in the default Spark distribution
> --------------------------------------------------------------------
>
>                 Key: SPARK-42425
>                 URL: https://issues.apache.org/jira/browse/SPARK-42425
>             Project: Spark
>          Issue Type: Bug
>          Components: Input/Output
>    Affects Versions: 3.3.1
>            Reporter: Arseniy Tashoyan
>            Priority: Major
>
> The library spark-hadoop-cloud is absent in the default Spark distribution 
> (as well as its dependencies like hadoop-aws). Therefore the dependency 
> management section described in [Integration with Cloud 
> Infrastructures|https://spark.apache.org/docs/3.3.1/cloud-integration.html#installation]
>  is invalid. Actually the libraries for cloud integration are not provided.
> A naive workaround would be to add the spark-hadoop-cloud library as a 
> compile-scope dependency. However, this does not work because of the Spark 
> classpath hierarchy: the Spark system classloader does not see classes 
> loaded by the application classloader.
> Therefore a proper fix would be to enable the hadoop-cloud build profile by 
> default: -Phadoop-cloud


