[jira] [Commented] (SPARK-33212) Move to shaded clients for Hadoop 3.x profile

2021-01-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17266282#comment-17266282
 ] 

Apache Spark commented on SPARK-33212:
--

User 'sunchao' has created a pull request for this issue:
https://github.com/apache/spark/pull/30701

> Move to shaded clients for Hadoop 3.x profile
> -
>
> Key: SPARK-33212
> URL: https://issues.apache.org/jira/browse/SPARK-33212
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core, Spark Submit, SQL, YARN
>Affects Versions: 3.0.1
>Reporter: Chao Sun
>Priority: Major
>  Labels: releasenotes
>
> Hadoop 3.x+ offers shaded client jars: hadoop-client-api and 
> hadoop-client-runtime, which shade 3rd party dependencies such as Guava, 
> protobuf, jetty etc. This Jira switches Spark to use these jars instead of 
> hadoop-common, hadoop-client etc. Benefits include:
>  * It unblocks Spark from upgrading to Hadoop 3.2.2/3.3.0+. The newer 
> versions of Hadoop have migrated to Guava 27.0+ and in order to resolve Guava 
> conflicts, Spark depends on Hadoop to not leaking dependencies.
>  * It makes Spark/Hadoop dependency cleaner. Currently Spark uses both 
> client-side and server-side Hadoop APIs from modules such as hadoop-common, 
> hadoop-yarn-server-common etc. Moving to hadoop-client-api allows use to only 
> use public/client API from Hadoop side.
>  * Provides a better isolation from Hadoop dependencies. In future Spark can 
> better evolve without worrying about dependencies pulled from Hadoop side 
> (which used to be a lot).
> *There are some behavior changes introduced with this JIRA, when people use 
> Spark compiled with Hadoop 3.x:*
> - Users now need to make sure class path contains `hadoop-client-api` and 
> `hadoop-client-runtime` jars when they deploy Spark with the 
> `hadoop-provided` option. In addition, it is high recommended that they put 
> these two jars before other Hadoop jars in the class path. Otherwise, 
> conflicts such as from Guava could happen if classes are loaded from the 
> other non-shaded Hadoop jars.
> - Since the new shaded Hadoop clients no longer include 3rd party 
> dependencies. Users who used to depend on these now need to explicitly put 
> the jars in their class path.
> Ideally the above should go to release notes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-33212) Move to shaded clients for Hadoop 3.x profile

2021-01-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17266281#comment-17266281
 ] 

Apache Spark commented on SPARK-33212:
--

User 'sunchao' has created a pull request for this issue:
https://github.com/apache/spark/pull/30701

> Move to shaded clients for Hadoop 3.x profile
> -
>
> Key: SPARK-33212
> URL: https://issues.apache.org/jira/browse/SPARK-33212
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core, Spark Submit, SQL, YARN
>Affects Versions: 3.0.1
>Reporter: Chao Sun
>Priority: Major
>  Labels: releasenotes
>
> Hadoop 3.x+ offers shaded client jars: hadoop-client-api and 
> hadoop-client-runtime, which shade 3rd party dependencies such as Guava, 
> protobuf, jetty etc. This Jira switches Spark to use these jars instead of 
> hadoop-common, hadoop-client etc. Benefits include:
>  * It unblocks Spark from upgrading to Hadoop 3.2.2/3.3.0+. The newer 
> versions of Hadoop have migrated to Guava 27.0+ and in order to resolve Guava 
> conflicts, Spark depends on Hadoop to not leaking dependencies.
>  * It makes Spark/Hadoop dependency cleaner. Currently Spark uses both 
> client-side and server-side Hadoop APIs from modules such as hadoop-common, 
> hadoop-yarn-server-common etc. Moving to hadoop-client-api allows use to only 
> use public/client API from Hadoop side.
>  * Provides a better isolation from Hadoop dependencies. In future Spark can 
> better evolve without worrying about dependencies pulled from Hadoop side 
> (which used to be a lot).
> *There are some behavior changes introduced with this JIRA, when people use 
> Spark compiled with Hadoop 3.x:*
> - Users now need to make sure class path contains `hadoop-client-api` and 
> `hadoop-client-runtime` jars when they deploy Spark with the 
> `hadoop-provided` option. In addition, it is high recommended that they put 
> these two jars before other Hadoop jars in the class path. Otherwise, 
> conflicts such as from Guava could happen if classes are loaded from the 
> other non-shaded Hadoop jars.
> - Since the new shaded Hadoop clients no longer include 3rd party 
> dependencies. Users who used to depend on these now need to explicitly put 
> the jars in their class path.
> Ideally the above should go to release notes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-33212) Move to shaded clients for Hadoop 3.x profile

2020-11-30 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17240983#comment-17240983
 ] 

Apache Spark commented on SPARK-33212:
--

User 'sunchao' has created a pull request for this issue:
https://github.com/apache/spark/pull/30556

> Move to shaded clients for Hadoop 3.x profile
> -
>
> Key: SPARK-33212
> URL: https://issues.apache.org/jira/browse/SPARK-33212
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core, Spark Submit, SQL, YARN
>Affects Versions: 3.0.1
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
>  Labels: releasenotes
> Fix For: 3.1.0
>
>
> Hadoop 3.x+ offers shaded client jars: hadoop-client-api and 
> hadoop-client-runtime, which shade 3rd party dependencies such as Guava, 
> protobuf, jetty etc. This Jira switches Spark to use these jars instead of 
> hadoop-common, hadoop-client etc. Benefits include:
>  * It unblocks Spark from upgrading to Hadoop 3.2.2/3.3.0+. The newer 
> versions of Hadoop have migrated to Guava 27.0+ and in order to resolve Guava 
> conflicts, Spark depends on Hadoop to not leaking dependencies.
>  * It makes Spark/Hadoop dependency cleaner. Currently Spark uses both 
> client-side and server-side Hadoop APIs from modules such as hadoop-common, 
> hadoop-yarn-server-common etc. Moving to hadoop-client-api allows use to only 
> use public/client API from Hadoop side.
>  * Provides a better isolation from Hadoop dependencies. In future Spark can 
> better evolve without worrying about dependencies pulled from Hadoop side 
> (which used to be a lot).
> *There are some behavior changes introduced with this JIRA, when people use 
> Spark compiled with Hadoop 3.x:*
> - Users now need to make sure class path contains `hadoop-client-api` and 
> `hadoop-client-runtime` jars when they deploy Spark with the 
> `hadoop-provided` option. In addition, it is high recommended that they put 
> these two jars before other Hadoop jars in the class path. Otherwise, 
> conflicts such as from Guava could happen if classes are loaded from the 
> other non-shaded Hadoop jars.
> - Since the new shaded Hadoop clients no longer include 3rd party 
> dependencies. Users who used to depend on these now need to explicitly put 
> the jars in their class path.
> Ideally the above should go to release notes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-33212) Move to shaded clients for Hadoop 3.x profile

2020-11-30 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17240982#comment-17240982
 ] 

Apache Spark commented on SPARK-33212:
--

User 'sunchao' has created a pull request for this issue:
https://github.com/apache/spark/pull/30556

> Move to shaded clients for Hadoop 3.x profile
> -
>
> Key: SPARK-33212
> URL: https://issues.apache.org/jira/browse/SPARK-33212
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core, Spark Submit, SQL, YARN
>Affects Versions: 3.0.1
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
>  Labels: releasenotes
> Fix For: 3.1.0
>
>
> Hadoop 3.x+ offers shaded client jars: hadoop-client-api and 
> hadoop-client-runtime, which shade 3rd party dependencies such as Guava, 
> protobuf, jetty etc. This Jira switches Spark to use these jars instead of 
> hadoop-common, hadoop-client etc. Benefits include:
>  * It unblocks Spark from upgrading to Hadoop 3.2.2/3.3.0+. The newer 
> versions of Hadoop have migrated to Guava 27.0+ and in order to resolve Guava 
> conflicts, Spark depends on Hadoop to not leaking dependencies.
>  * It makes Spark/Hadoop dependency cleaner. Currently Spark uses both 
> client-side and server-side Hadoop APIs from modules such as hadoop-common, 
> hadoop-yarn-server-common etc. Moving to hadoop-client-api allows use to only 
> use public/client API from Hadoop side.
>  * Provides a better isolation from Hadoop dependencies. In future Spark can 
> better evolve without worrying about dependencies pulled from Hadoop side 
> (which used to be a lot).
> *There are some behavior changes introduced with this JIRA, when people use 
> Spark compiled with Hadoop 3.x:*
> - Users now need to make sure class path contains `hadoop-client-api` and 
> `hadoop-client-runtime` jars when they deploy Spark with the 
> `hadoop-provided` option. In addition, it is high recommended that they put 
> these two jars before other Hadoop jars in the class path. Otherwise, 
> conflicts such as from Guava could happen if classes are loaded from the 
> other non-shaded Hadoop jars.
> - Since the new shaded Hadoop clients no longer include 3rd party 
> dependencies. Users who used to depend on these now need to explicitly put 
> the jars in their class path.
> Ideally the above should go to release notes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-33212) Move to shaded clients for Hadoop 3.x profile

2020-11-25 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17238974#comment-17238974
 ] 

Apache Spark commented on SPARK-33212:
--

User 'dongjoon-hyun' has created a pull request for this issue:
https://github.com/apache/spark/pull/30508

> Move to shaded clients for Hadoop 3.x profile
> -
>
> Key: SPARK-33212
> URL: https://issues.apache.org/jira/browse/SPARK-33212
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core, Spark Submit, SQL, YARN
>Affects Versions: 3.0.1
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
>  Labels: releasenotes
> Fix For: 3.1.0
>
>
> Hadoop 3.x+ offers shaded client jars: hadoop-client-api and 
> hadoop-client-runtime, which shade 3rd party dependencies such as Guava, 
> protobuf, jetty etc. This Jira switches Spark to use these jars instead of 
> hadoop-common, hadoop-client etc. Benefits include:
>  * It unblocks Spark from upgrading to Hadoop 3.2.2/3.3.0+. The newer 
> versions of Hadoop have migrated to Guava 27.0+ and in order to resolve Guava 
> conflicts, Spark depends on Hadoop to not leaking dependencies.
>  * It makes Spark/Hadoop dependency cleaner. Currently Spark uses both 
> client-side and server-side Hadoop APIs from modules such as hadoop-common, 
> hadoop-yarn-server-common etc. Moving to hadoop-client-api allows use to only 
> use public/client API from Hadoop side.
>  * Provides a better isolation from Hadoop dependencies. In future Spark can 
> better evolve without worrying about dependencies pulled from Hadoop side 
> (which used to be a lot).
> *There are some behavior changes introduced with this JIRA, when people use 
> Spark compiled with Hadoop 3.x:*
> - Users now need to make sure class path contains `hadoop-client-api` and 
> `hadoop-client-runtime` jars when they deploy Spark with the 
> `hadoop-provided` option. In addition, it is high recommended that they put 
> these two jars before other Hadoop jars in the class path. Otherwise, 
> conflicts such as from Guava could happen if classes are loaded from the 
> other non-shaded Hadoop jars.
> - Since the new shaded Hadoop clients no longer include 3rd party 
> dependencies. Users who used to depend on these now need to explicitly put 
> the jars in their class path.
> Ideally the above should go to release notes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-33212) Move to shaded clients for Hadoop 3.x profile

2020-10-21 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17218588#comment-17218588
 ] 

Apache Spark commented on SPARK-33212:
--

User 'sunchao' has created a pull request for this issue:
https://github.com/apache/spark/pull/29843

> Move to shaded clients for Hadoop 3.x profile
> -
>
> Key: SPARK-33212
> URL: https://issues.apache.org/jira/browse/SPARK-33212
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core, Spark Submit, SQL, YARN
>Affects Versions: 3.0.1
>Reporter: Chao Sun
>Priority: Major
>
> Hadoop 3.x+ offers shaded client jars: hadoop-client-api and 
> hadoop-client-runtime, which shade 3rd party dependencies such as Guava, 
> protobuf, jetty etc. This Jira switches Spark to use these jars instead of 
> hadoop-common, hadoop-client etc. Benefits include:
>  * It unblocks Spark from upgrading to Hadoop 3.2.2/3.3.0+. The newer 
> versions of Hadoop have migrated to Guava 27.0+ and in order to resolve Guava 
> conflicts, Spark depends on Hadoop to not leaking dependencies.
>  * It makes Spark/Hadoop dependency cleaner. Currently Spark uses both 
> client-side and server-side Hadoop APIs from modules such as hadoop-common, 
> hadoop-yarn-server-common etc. Moving to hadoop-client-api allows use to only 
> use public/client API from Hadoop side.
>  * Provides a better isolation from Hadoop dependencies. In future Spark can 
> better evolve without worrying about dependencies pulled from Hadoop side 
> (which used to be a lot).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org