[ https://issues.apache.org/jira/browse/SPARK-42425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17699042#comment-17699042 ]
Sean R. Owen commented on SPARK-42425:
--------------------------------------
The docs don't say it's part of the Spark distro. In fact, they tell you to
bundle it in your app. It is intentionally left out of the distribution.
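
As a rough illustration of what bundling the module with an application could look like, here is a minimal sbt sketch. The coordinates (org.apache.spark, spark-hadoop-cloud, Scala 2.12) and the version 3.3.1, taken from the issue's "Affects Versions" field, are assumptions and should be checked against the Spark build actually in use.

```scala
// build.sbt (sketch only; coordinates and versions are assumptions)
ThisBuild / scalaVersion := "2.12.15"

// Assumed to match the Spark runtime on the cluster (see "Affects Versions").
val sparkVersion = "3.3.1"

libraryDependencies ++= Seq(
  // Supplied by the Spark distribution at runtime, so kept out of the app jar.
  "org.apache.spark" %% "spark-sql" % sparkVersion % Provided,
  // Not in the default distribution: default (compile) scope, so a fat-jar
  // build would package it together with the application.
  "org.apache.spark" %% "spark-hadoop-cloud" % sparkVersion
)
```

Producing a single jar from this would additionally need a packaging plugin such as sbt-assembly, which is not shown here.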
> spark-hadoop-cloud is not provided in the default Spark distribution
> --------------------------------------------------------------------
>
> Key: SPARK-42425
> URL: https://issues.apache.org/jira/browse/SPARK-42425
> Project: Spark
> Issue Type: Bug
> Components: Input/Output
> Affects Versions: 3.3.1
> Reporter: Arseniy Tashoyan
> Priority: Major
>
> The library spark-hadoop-cloud is absent from the default Spark distribution
> (as are its dependencies, such as hadoop-aws). Therefore the dependency
> management section described in [Integration with Cloud
> Infrastructures|https://spark.apache.org/docs/3.3.1/cloud-integration.html#installation]
> is invalid: the libraries for cloud integration are not actually provided.
> A naive workaround would be to add the spark-hadoop-cloud library as a
> compile-scope dependency. However, this does not work because of the Spark
> classpath hierarchy: the Spark system classloader does not see classes loaded
> by the application classloader.
> Therefore a proper fix would be to enable the hadoop-cloud build profile by
> default: -Phadoop-cloud