This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new d11d9cf729ab [SPARK-47699][BUILD] Upgrade `gcs-connector` to 2.2.21 and add a note for 3.0.0
d11d9cf729ab is described below

commit d11d9cf729ab699c68770337d35043ebf58195cf
Author: Dongjoon Hyun <dh...@apple.com>
AuthorDate: Tue Apr 2 13:31:18 2024 -0700

[SPARK-47699][BUILD] Upgrade `gcs-connector` to 2.2.21 and add a note for 3.0.0

### What changes were proposed in this pull request?

This PR aims to upgrade `gcs-connector` to 2.2.21 and add a note for 3.0.0.

### Why are the changes needed?

This PR aims to upgrade `gcs-connector` to bring the latest bug fixes.

However, due to the following, we stick with 2.2.21.
- https://github.com/GoogleCloudDataproc/hadoop-connectors/issues/1114

- `gcs-connector` 2.2.21 has shaded Guava 32.1.2-jre.
  - https://github.com/GoogleCloudDataproc/hadoop-connectors/blob/15c8ee41a15d6735442f36333f1d67792c93b9cf/pom.xml#L100

- `gcs-connector` 3.0.0 has shaded Guava 31.1-jre.
  - https://github.com/GoogleCloudDataproc/hadoop-connectors/blob/667bf17291dbaa96a60f06df58c7a528bc4a8f79/pom.xml#L97

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manually.

```
$ dev/make-distribution.sh -Phadoop-cloud
$ cd dist
$ export KEYFILE=~/.ssh/apache-spark.json
$ export EMAIL=$(jq -r '.client_email' < $KEYFILE)
$ export PRIVATE_KEY_ID=$(jq -r '.private_key_id' < $KEYFILE)
$ export PRIVATE_KEY="$(jq -r '.private_key' < $KEYFILE)"
$ bin/spark-shell \
    -c spark.hadoop.fs.gs.auth.service.account.email=$EMAIL \
    -c spark.hadoop.fs.gs.auth.service.account.private.key.id=$PRIVATE_KEY_ID \
    -c spark.hadoop.fs.gs.auth.service.account.private.key="$PRIVATE_KEY"
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 4.0.0-SNAPSHOT
      /_/

Using Scala version 2.13.13 (OpenJDK 64-Bit Server VM, Java 21.0.2)
Type in expressions to have them evaluated.
Type :help for more information.
{"ts":"2024-04-02T13:08:31.513-0700","level":"WARN","msg":"Unable to load native-hadoop library for your platform... using builtin-java classes where applicable","logger":"org.apache.hadoop.util.NativeCodeLoader"}
Spark context Web UI available at http://localhost:4040
Spark context available as 'sc' (master = local[*], app id = local-1712088511841).
Spark session available as 'spark'.

scala> spark.read.text("gs://apache-spark-bucket/README.md").count()
val res0: Long = 124

scala> spark.read.orc("examples/src/main/resources/users.orc").write.mode("overwrite").orc("gs://apache-spark-bucket/users.orc")

scala> spark.read.orc("gs://apache-spark-bucket/users.orc").show()
+------+--------------+----------------+
|  name|favorite_color|favorite_numbers|
+------+--------------+----------------+
|Alyssa|          NULL|  [3, 9, 15, 20]|
|   Ben|           red|              []|
+------+--------------+----------------+
```

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #45824 from dongjoon-hyun/SPARK-47699.

Authored-by: Dongjoon Hyun <dh...@apple.com>
Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 2 +-
 pom.xml                               | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 b/dev/deps/spark-deps-hadoop-3-hive-2.3
index a564ec9f044a..c6913ceeff13 100644
--- a/dev/deps/spark-deps-hadoop-3-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -66,7 +66,7 @@ eclipse-collections-api/11.1.0//eclipse-collections-api-11.1.0.jar
 eclipse-collections/11.1.0//eclipse-collections-11.1.0.jar
 esdk-obs-java/3.20.4.2//esdk-obs-java-3.20.4.2.jar
 flatbuffers-java/23.5.26//flatbuffers-java-23.5.26.jar
-gcs-connector/hadoop3-2.2.20/shaded/gcs-connector-hadoop3-2.2.20-shaded.jar
+gcs-connector/hadoop3-2.2.21/shaded/gcs-connector-hadoop3-2.2.21-shaded.jar
 gmetric4j/1.0.10//gmetric4j-1.0.10.jar
 gson/2.2.4//gson-2.2.4.jar
 guava/14.0.1//guava-14.0.1.jar
diff --git a/pom.xml b/pom.xml
index b70d091796a5..ca949a05c81c 100644
--- a/pom.xml
+++ b/pom.xml
@@ -163,7 +163,8 @@
     <aws.java.sdk.v2.version>2.24.6</aws.java.sdk.v2.version>
     <!-- the producer is used in tests -->
     <aws.kinesis.producer.version>0.12.8</aws.kinesis.producer.version>
-    <gcs-connector.version>hadoop3-2.2.20</gcs-connector.version>
+    <!-- Do not use 3.0.0: https://github.com/GoogleCloudDataproc/hadoop-connectors/issues/1114 -->
+    <gcs-connector.version>hadoop3-2.2.21</gcs-connector.version>
     <!-- org.apache.httpcomponents/httpclient-->
     <commons.httpclient.version>4.5.14</commons.httpclient.version>
     <commons.httpcore.version>4.4.16</commons.httpcore.version>
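
The manifest entries in the diff above follow an `artifact/version/classifier/jar` layout, so the pinned `gcs-connector` version can be read back out of such a file. The following is a minimal sketch of that check, not part of this commit or of Spark's tooling: `dep_version` and `check_gcs_connector` are hypothetical helper names, and the version strings treated as "3.0.0" (`3.0.0` and `hadoop3-3.0.0`) are assumptions about how that release would appear in the manifest.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark): print the version field of one
# artifact from manifest text shaped like artifact/version/classifier/jar.
dep_version() {
  # $1 = manifest text, $2 = artifact name
  printf '%s\n' "$1" | awk -F/ -v name="$2" '$1 == name { print $2 }'
}

# Hypothetical guard: reject the 3.0.0 line that the pom.xml comment warns
# against. The two patterns below are assumed spellings of that version.
check_gcs_connector() {
  version=$(dep_version "$1" gcs-connector)
  case "$version" in
    "")                  echo "gcs-connector not found"; return 2 ;;
    3.0.0|hadoop3-3.0.0) echo "gcs-connector $version is not allowed"; return 1 ;;
    *)                   echo "gcs-connector $version"; return 0 ;;
  esac
}

# Sample manifest text copied from the diff context in this commit.
manifest='flatbuffers-java/23.5.26//flatbuffers-java-23.5.26.jar
gcs-connector/hadoop3-2.2.21/shaded/gcs-connector-hadoop3-2.2.21-shaded.jar
gmetric4j/1.0.10//gmetric4j-1.0.10.jar'

check_gcs_connector "$manifest"
```

Against a real checkout, the same function could presumably be fed the file contents, for example `check_gcs_connector "$(cat dev/deps/spark-deps-hadoop-3-hive-2.3)"`.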