[GitHub] [spark] viirya commented on pull request #29326: [WIP][SPARK-32502][BUILD] Upgrade Guava to 27.0-jre

2021-09-14 Thread GitBox


viirya commented on pull request #29326:
URL: https://github.com/apache/spark/pull/29326#issuecomment-919308510


   #33989 seems like a promising direction. Closing this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] viirya commented on pull request #29326: [WIP][SPARK-32502][BUILD] Upgrade Guava to 27.0-jre

2021-07-06 Thread GitBox


viirya commented on pull request #29326:
URL: https://github.com/apache/spark/pull/29326#issuecomment-875177916


   Hmm, I looked at `isSharedClass`; it looks like `commons-lang3`, orc, etc. are 
already non-shared classes.
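
   For readers unfamiliar with the idea, a shared-class check of this kind can be pictured as a prefix test on the class name. The sketch below is only an illustration and is not Spark's `isSharedClass`: the class name `SharedClassSketch` and the prefix list are assumptions made up for this example.

   ```java
   import java.util.Arrays;
   import java.util.List;

   // Illustrative sketch only: NOT Spark's actual isSharedClass implementation.
   final class SharedClassSketch {
       // Hypothetical prefix list, chosen for illustration.
       private static final List<String> SHARED_PREFIXES = Arrays.asList(
           "java.", "scala.", "org.apache.spark.", "com.google.common.");

       // A class counts as "shared" when its name starts with a shared prefix.
       static boolean isSharedClass(String className) {
           for (String prefix : SHARED_PREFIXES) {
               if (className.startsWith(prefix)) {
                   return true;
               }
           }
           return false;
       }

       public static void main(String[] args) {
           System.out.println(isSharedClass("com.google.common.collect.Iterators"));
           System.out.println(isSharedClass("org.apache.commons.lang3.StringUtils"));
       }
   }
   ```

   Under these assumed prefixes, a Guava class would be treated as shared while a commons-lang3 or orc class would not, which is the distinction the comment above refers to.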





[GitHub] [spark] viirya commented on pull request #29326: [WIP][SPARK-32502][BUILD] Upgrade Guava to 27.0-jre

2021-07-06 Thread GitBox


viirya commented on pull request #29326:
URL: https://github.com/apache/spark/pull/29326#issuecomment-875060042


   > Oh I didn't even realize that Spark is using `hive-exec-core` jar. Does 
that mean it doesn't take advantage of the Guava shading from Hive 2.3.8+ at 
all?
   
   Yea, I'm afraid that it is true. If we want to completely isolate 
dependencies from Hive, we may need to relocate all included (but not 
relocated) dependencies in `hive-exec` w/o classifier.
   
   > One idea is to have Spark use 
[`hadoop-shaded-guava`](https://github.com/apache/hadoop-thirdparty) which is 
also 30.1.1-jre. It also makes sure that Spark always use the same Guava 
version as Hadoop.
   
   Even if Spark uses `hadoop-shaded-guava`, `hive-exec` still needs the older 
Guava if we cannot use the version w/o classifier (because of other 
dependencies, e.g. commons-lang3, orc, parquet).





[GitHub] [spark] viirya commented on pull request #29326: [WIP][SPARK-32502][BUILD] Upgrade Guava to 27.0-jre

2021-07-03 Thread GitBox


viirya commented on pull request #29326:
URL: https://github.com/apache/spark/pull/29326#issuecomment-873359259


   Encountered some issues.
   
   Although we can switch to hive-exec without a classifier (the shaded version) 
to get rid of the Guava version issue above, the shaded hive-exec bundles some 
dependencies without relocation, such as commons-lang3 and orc. Their versions 
differ from the ones Spark uses, so they conflict.
   
   Because the shaded hive-exec jar already includes the classes of these 
dependencies, dependency exclusions in the pom apparently cannot remove them.
   
   For now, it seems the only option is to go back to Hive and shade every 
included dependency. Any other thoughts?
   





[GitHub] [spark] viirya commented on pull request #29326: [WIP][SPARK-32502][BUILD] Upgrade Guava to 27.0-jre

2021-07-01 Thread GitBox


viirya commented on pull request #29326:
URL: https://github.com/apache/spark/pull/29326#issuecomment-872735918


   Hmm, from the failed tests below:
   
   org.apache.spark.sql.hive.DataSourceWithHiveMetastoreCatalogSuite
   org.apache.spark.sql.hive.HiveExternalCatalogSuite
   org.apache.spark.sql.hive.StatisticsSuite
   
   Since Guava 20, `com.google.common.collect.Iterators.emptyIterator()` is no 
longer public. But I don't get it, because Hive 2.3.8/2.3.9 shades Guava. Why 
would it use the newer Guava upgraded here?
   
   ```
   java.lang.IllegalAccessError: tried to access method com.google.common.collect.Iterators.emptyIterator()Lcom/google/common/collect/UnmodifiableIterator; from class org.apache.hadoop.hive.ql.exec.FetchOperator
       at org.apache.hadoop.hive.ql.exec.FetchOperator.<init>(FetchOperator.java:108)
       at org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:87)
       at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:541)
       at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
       at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457)
       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
       at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$runHive$1(HiveClientImpl.scala:831)
   ```
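
   An `IllegalAccessError` like this is raised when already-compiled code calls a method that is no longer accessible in the class actually loaded at runtime. A quick way to see which visibility the loaded class exposes is plain reflection. This is a minimal sketch, not something from the PR: the helper class `MethodVisibilityCheck` and its `visibility` method are hypothetical names, and the result depends entirely on which Guava (if any) is on the classpath when it runs.

   ```java
   import java.lang.reflect.Method;
   import java.lang.reflect.Modifier;

   // Hypothetical helper: reports the declared visibility of a no-argument
   // method on whatever version of the class the current classpath provides.
   final class MethodVisibilityCheck {

       // Returns "public", "protected", "private", "package-private",
       // or "missing" when the class or method cannot be found.
       static String visibility(String className, String methodName) {
           try {
               // Load without initializing the class; only metadata is needed.
               Class<?> cls = Class.forName(
                   className, false, MethodVisibilityCheck.class.getClassLoader());
               Method method = cls.getDeclaredMethod(methodName);
               int modifiers = method.getModifiers();
               if (Modifier.isPublic(modifiers)) {
                   return "public";
               }
               if (Modifier.isProtected(modifiers)) {
                   return "protected";
               }
               if (Modifier.isPrivate(modifiers)) {
                   return "private";
               }
               return "package-private";
           } catch (ClassNotFoundException | NoSuchMethodException e) {
               return "missing";
           }
       }

       public static void main(String[] args) {
           // Prints "missing" if Guava is not on the classpath.
           System.out.println(
               visibility("com.google.common.collect.Iterators", "emptyIterator"));
       }
   }
   ```

   Running this inside the same classloader that loads `FetchOperator` would show whether the unshaded `com.google.common` classes being resolved come from the newer Guava.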





[GitHub] [spark] viirya commented on pull request #29326: [WIP][SPARK-32502][BUILD] Upgrade Guava to 27.0-jre

2021-07-01 Thread GitBox


viirya commented on pull request #29326:
URL: https://github.com/apache/spark/pull/29326#issuecomment-871990967


   retest this please





[GitHub] [spark] viirya commented on pull request #29326: [WIP][SPARK-32502][BUILD] Upgrade Guava to 27.0-jre

2021-06-30 Thread GitBox


viirya commented on pull request #29326:
URL: https://github.com/apache/spark/pull/29326#issuecomment-871122684


   I'm not against this point. I can change to the latest Guava and see what CI 
says.





[GitHub] [spark] viirya commented on pull request #29326: [WIP][SPARK-32502][BUILD] Upgrade Guava to 27.0-jre

2021-06-29 Thread GitBox


viirya commented on pull request #29326:
URL: https://github.com/apache/spark/pull/29326#issuecomment-871101530


   retest this please





[GitHub] [spark] viirya commented on pull request #29326: [WIP][SPARK-32502][BUILD] Upgrade Guava to 27.0-jre

2021-06-29 Thread GitBox


viirya commented on pull request #29326:
URL: https://github.com/apache/spark/pull/29326#issuecomment-870850274


   retest this please





[GitHub] [spark] viirya commented on pull request #29326: [WIP][SPARK-32502][BUILD] Upgrade Guava to 27.0-jre and Hadoop to 3.2.1

2021-06-29 Thread GitBox


viirya commented on pull request #29326:
URL: https://github.com/apache/spark/pull/29326#issuecomment-870811211


   try this again.





[GitHub] [spark] viirya commented on pull request #29326: [WIP][SPARK-32502][BUILD] Upgrade Guava to 27.0-jre and Hadoop to 3.2.1

2020-09-17 Thread GitBox


viirya commented on pull request #29326:
URL: https://github.com/apache/spark/pull/29326#issuecomment-694355931


   Isn't HADOOP-14284 resolved as Invalid?






[GitHub] [spark] viirya commented on pull request #29326: [WIP][SPARK-32502][BUILD] Upgrade Guava to 27.0-jre and Hadoop to 3.2.1

2020-08-07 Thread GitBox


viirya commented on pull request #29326:
URL: https://github.com/apache/spark/pull/29326#issuecomment-670608710


   @dongjoon-hyun Thanks for the comment. Yeah, it doesn't make sense to 
upgrade to Hive 4 in the short or mid term. I'm working on upgrading to Guava 27 
and shading Guava in Hive too. I hope it can be part of Hive 2.3.8.
   
   I will close this for now. Once the work on the Hive side makes progress, I 
can reopen this. Thanks.






[GitHub] [spark] viirya commented on pull request #29326: [WIP][SPARK-32502][BUILD] Upgrade Guava to 27.0-jre and Hadoop to 3.2.1

2020-08-04 Thread GitBox


viirya commented on pull request #29326:
URL: https://github.com/apache/spark/pull/29326#issuecomment-668878065


   I did some tests. A few changes are required to pass the failed Hive tests:
   
   1. Shade Guava when packaging hive-exec, and make a few code changes to 
hive-common and hive-exec regarding Guava usage
   2. Don't use the `core` classifier for Hive dependencies in Spark
   
   But this only upgrades the Guava version used in Spark. Hive dependencies 
still use the older Guava with the reported CVE.






[GitHub] [spark] viirya commented on pull request #29326: [WIP][SPARK-32502][BUILD] Upgrade Guava to 27.0-jre and Hadoop to 3.2.1

2020-08-03 Thread GitBox


viirya commented on pull request #29326:
URL: https://github.com/apache/spark/pull/29326#issuecomment-668116815


   Opened https://issues.apache.org/jira/browse/HIVE-23980 to see if the Hive 
folks have some ideas.






[GitHub] [spark] viirya commented on pull request #29326: [WIP][SPARK-32502][BUILD] Upgrade Guava to 27.0-jre and Hadoop to 3.2.1

2020-08-02 Thread GitBox


viirya commented on pull request #29326:
URL: https://github.com/apache/spark/pull/29326#issuecomment-667801138


   The trouble is that hive-exec uses a method that became package-private in 
Guava 20, so it is incompatible with Guava versions > 19.0.
   
   ```
   sbt.ForkMain$ForkError: sbt.ForkMain$ForkError: java.lang.IllegalAccessError: tried to access method com.google.common.collect.Iterators.emptyIterator()Lcom/google/common/collect/UnmodifiableIterator; from class org.apache.hadoop.hive.ql.exec.FetchOperator
       at org.apache.hadoop.hive.ql.exec.FetchOperator.<init>(FetchOperator.java:108)
       at org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:87)
       at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:541)
       at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
       at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457)
       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
   ```
   
   hive-exec doesn't shade Guava until 
https://issues.apache.org/jira/browse/HIVE-22126, which targets 4.0.0.
   
   This seems to be a dead end for upgrading Guava in Spark for now.


