This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new d11d9cf729ab [SPARK-47699][BUILD] Upgrade `gcs-connector` to 2.2.21 and add a note for 3.0.0
d11d9cf729ab is described below

commit d11d9cf729ab699c68770337d35043ebf58195cf
Author: Dongjoon Hyun <dh...@apple.com>
AuthorDate: Tue Apr 2 13:31:18 2024 -0700

    [SPARK-47699][BUILD] Upgrade `gcs-connector` to 2.2.21 and add a note for 3.0.0
    
    ### What changes were proposed in this pull request?
    
    This PR aims to upgrade `gcs-connector` to 2.2.21 and add a note for 3.0.0.
    
    ### Why are the changes needed?
    
    Upgrading `gcs-connector` brings the latest bug fixes.

    However, due to the following, we stick with 2.2.21 rather than moving to 3.0.0 (a quick jar-inspection sketch follows the list below).
    - https://github.com/GoogleCloudDataproc/hadoop-connectors/issues/1114
      - `gcs-connector` 2.2.21 has shaded Guava 32.1.2-jre.
        - https://github.com/GoogleCloudDataproc/hadoop-connectors/blob/15c8ee41a15d6735442f36333f1d67792c93b9cf/pom.xml#L100
    
      - `gcs-connector` 3.0.0 has shaded Guava 31.1-jre.
        - https://github.com/GoogleCloudDataproc/hadoop-connectors/blob/667bf17291dbaa96a60f06df58c7a528bc4a8f79/pom.xml#L97
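
    For context, one way to double-check which Guava classes a given shaded connector release actually bundles (the declared versions are in the pom.xml links above) is to scan the shaded jar's entries. A minimal Scala sketch, assuming a locally downloaded copy of the shaded jar; the path below is hypothetical.
    ```
    import java.util.jar.JarFile
    import scala.jdk.CollectionConverters._

    // Hypothetical local path to the shaded connector jar under inspection.
    val jarPath = "/tmp/gcs-connector-hadoop3-2.2.21-shaded.jar"

    // Relocated (shaded) Guava classes still keep the `com/google/common/` path segment,
    // so this filter catches them whether or not they were relocated.
    val guavaEntries = new JarFile(jarPath).entries().asScala
      .map(_.getName)
      .filter(n => n.endsWith(".class") && n.contains("com/google/common/"))
      .toSeq

    println(s"${guavaEntries.size} bundled Guava class files")
    guavaEntries.take(5).foreach(println)
    ```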
    
    ### Does this PR introduce _any_ user-facing change?
    
    No.
    
    ### How was this patch tested?
    
    Manually.
    ```
    $ dev/make-distribution.sh -Phadoop-cloud
    $ cd dist
    $ export KEYFILE=~/.ssh/apache-spark.json
    $ export EMAIL=$(jq -r '.client_email' < $KEYFILE)
    $ export PRIVATE_KEY_ID=$(jq -r '.private_key_id' < $KEYFILE)
    $ export PRIVATE_KEY="$(jq -r '.private_key' < $KEYFILE)"
    $ bin/spark-shell \
                -c spark.hadoop.fs.gs.auth.service.account.email=$EMAIL \
                -c spark.hadoop.fs.gs.auth.service.account.private.key.id=$PRIVATE_KEY_ID \
                -c spark.hadoop.fs.gs.auth.service.account.private.key="$PRIVATE_KEY"
    Setting default log level to "WARN".
    To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
    Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _\ \/ _ \/ _ `/ __/  '_/
       /___/ .__/\_,_/_/ /_/\_\   version 4.0.0-SNAPSHOT
          /_/
    
    Using Scala version 2.13.13 (OpenJDK 64-Bit Server VM, Java 21.0.2)
    Type in expressions to have them evaluated.
    Type :help for more information.
    {"ts":"2024-04-02T13:08:31.513-0700","level":"WARN","msg":"Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable","logger":"org.apache.hadoop.util.NativeCodeLoader"}
    Spark context Web UI available at http://localhost:4040
    Spark context available as 'sc' (master = local[*], app id = local-1712088511841).
    Spark session available as 'spark'.
    
    scala> spark.read.text("gs://apache-spark-bucket/README.md").count()
    val res0: Long = 124
    
    scala> spark.read.orc("examples/src/main/resources/users.orc").write.mode("overwrite").orc("gs://apache-spark-bucket/users.orc")
    
    scala> spark.read.orc("gs://apache-spark-bucket/users.orc").show()
    +------+--------------+----------------+
    |  name|favorite_color|favorite_numbers|
    +------+--------------+----------------+
    |Alyssa|          NULL|  [3, 9, 15, 20]|
    |   Ben|           red|              []|
    +------+--------------+----------------+
    ```
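
    As an additional cross-check inside the same `spark-shell` session, the jar that actually backs the `gs://` scheme can be located through the connector's Hadoop `FileSystem` class. A minimal sketch; the class name is the connector's public FileSystem implementation, and the printed location is expected to end in the bundled jar's file name (e.g. `gcs-connector-hadoop3-2.2.21-shaded.jar`).
    ```
    // Resolve the jar on the classpath that provides the gs:// FileSystem implementation.
    // Note: getCodeSource can be null under some classloaders; in this session it points at the jar.
    val gcsFs = Class.forName("com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem")
    println(gcsFs.getProtectionDomain.getCodeSource.getLocation)
    ```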
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No.
    
    Closes #45824 from dongjoon-hyun/SPARK-47699.
    
    Authored-by: Dongjoon Hyun <dh...@apple.com>
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 2 +-
 pom.xml                               | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 b/dev/deps/spark-deps-hadoop-3-hive-2.3
index a564ec9f044a..c6913ceeff13 100644
--- a/dev/deps/spark-deps-hadoop-3-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -66,7 +66,7 @@ eclipse-collections-api/11.1.0//eclipse-collections-api-11.1.0.jar
 eclipse-collections/11.1.0//eclipse-collections-11.1.0.jar
 esdk-obs-java/3.20.4.2//esdk-obs-java-3.20.4.2.jar
 flatbuffers-java/23.5.26//flatbuffers-java-23.5.26.jar
-gcs-connector/hadoop3-2.2.20/shaded/gcs-connector-hadoop3-2.2.20-shaded.jar
+gcs-connector/hadoop3-2.2.21/shaded/gcs-connector-hadoop3-2.2.21-shaded.jar
 gmetric4j/1.0.10//gmetric4j-1.0.10.jar
 gson/2.2.4//gson-2.2.4.jar
 guava/14.0.1//guava-14.0.1.jar
diff --git a/pom.xml b/pom.xml
index b70d091796a5..ca949a05c81c 100644
--- a/pom.xml
+++ b/pom.xml
@@ -163,7 +163,8 @@
     <aws.java.sdk.v2.version>2.24.6</aws.java.sdk.v2.version>
     <!-- the producer is used in tests -->
     <aws.kinesis.producer.version>0.12.8</aws.kinesis.producer.version>
-    <gcs-connector.version>hadoop3-2.2.20</gcs-connector.version>
+    <!-- Do not use 3.0.0: https://github.com/GoogleCloudDataproc/hadoop-connectors/issues/1114 -->
+    <gcs-connector.version>hadoop3-2.2.21</gcs-connector.version>
     <!--  org.apache.httpcomponents/httpclient-->
     <commons.httpclient.version>4.5.14</commons.httpclient.version>
     <commons.httpcore.version>4.4.16</commons.httpcore.version>

