(spark) branch master updated: [SPARK-53232][SQL][TESTS] Use Java `Map.copyOf` instead of `ImmutableMap.copyOf`

yangjie01 Sun, 10 Aug 2025 04:41:22 -0700

This is an automated email from the ASF dual-hosted git repository.

yangjie01 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new f4d10122fa22 [SPARK-53232][SQL][TESTS] Use Java `Map.copyOf` instead of `ImmutableMap.copyOf`
f4d10122fa22 is described below

commit f4d10122fa22a9923a619984f12a6149cb52d41e
Author: Dongjoon Hyun <dongj...@apache.org>
AuthorDate: Sun Aug 10 19:40:05 2025 +0800

    [SPARK-53232][SQL][TESTS] Use Java `Map.copyOf` instead of `ImmutableMap.copyOf`
    
    ### What changes were proposed in this pull request?
    
    This PR aims to use Java 10+ `Map.copyOf` instead of `ImmutableMap.copyOf`.
    
    In addition, a new Scalastyle rule is added to prevent future regressions.
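    
    As a rough illustration of what the new rule catches, the sketch below applies the regex from the `ImmutableMapcopyOf` check to a few sample lines. The sample lines and the names `rule`, `samples`, and `flagged` are hypothetical, and the sketch assumes the checker flags any occurrence of the pattern in a file.
    
    ```scala
    import scala.util.matching.Regex

    // The pattern from the new Scalastyle check in scalastyle-config.xml.
    val rule: Regex = """\bImmutableMap\.copyOf\b""".r

    // Hypothetical source lines, used only for illustration.
    val samples = Seq(
      "new CaseInsensitiveStringMap(ImmutableMap.copyOf(optionsMap))",
      "new CaseInsensitiveStringMap(JMap.copyOf(optionsMap))",
      "val sorted = ImmutableSortedMap.copyOf(optionsMap)"
    )

    // A line counts as flagged when the pattern occurs anywhere in it.
    val flagged: Seq[Boolean] = samples.map(line => rule.findFirstIn(line).isDefined)
    ```
    
    Here `flagged` evaluates to `Seq(true, false, false)`: only the Guava `ImmutableMap.copyOf` call matches, while `JMap.copyOf` and `ImmutableSortedMap.copyOf` do not.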
    
    ### Why are the changes needed?
    
    Java's native `Map.copyOf` is **significantly faster and simpler** than Guava's `ImmutableMap.copyOf`.
    
    ```scala
    scala> val m = java.util.Map.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    val m: java.util.Map[Int,Int] = {5=6, 7=8, 9=10, 1=2, 3=4}
    
    scala> spark.time((1 until 100_000_000).foreach(_ => com.google.common.collect.ImmutableMap.copyOf(m)))
    Time taken: 4404 ms
    
    scala> spark.time((1 until 100_000_000).foreach(_ => java.util.Map.copyOf(m)))
    Time taken: 223 ms
    ```
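    
    A minimal sketch of the `Map.copyOf` semantics this change relies on, assuming a Java 10+ runtime. The keys, values, and variable names are illustrative and are not taken from the patch.
    
    ```scala
    import java.util.{HashMap => JHashMap, Map => JMap}
    import scala.util.Try

    // A mutable source map; the entries are illustrative only.
    val source = new JHashMap[String, String]()
    source.put("key", "value")

    // Map.copyOf returns an unmodifiable snapshot of the source.
    val copy: JMap[String, String] = JMap.copyOf(source)

    // Later changes to the source are not visible through the copy.
    source.put("key2", "value2")
    val copySize = copy.size()

    // Mutating the copy fails with UnsupportedOperationException.
    val putFails = Try(copy.put("k", "v")).isFailure

    // For an input that is already unmodifiable, the Javadoc says copyOf
    // "will generally not create a copy". Whether the same instance comes
    // back is implementation-dependent, so no result is asserted here.
    val alreadyUnmodifiable = JMap.of("key", "value")
    val reused = JMap.copyOf(alreadyUnmodifiable) eq alreadyUnmodifiable
    ```
    
    Since the map `m` in the measurement above is created with `Map.of`, this reuse of an already unmodifiable input may account for part of the gap; the numbers should be read with that in mind.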
    
    ### Does this PR introduce _any_ user-facing change?
    
    No behavior change.
    
    ### How was this patch tested?
    
    Pass the CIs.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No.
    
    Closes #51958 from dongjoon-hyun/SPARK-53232.
    
    Authored-by: Dongjoon Hyun <dongj...@apache.org>
    Signed-off-by: yangjie01 <yangji...@baidu.com>
---
 scalastyle-config.xml                                            | 5 +++++
 sql/core/src/test/scala/org/apache/spark/sql/FileScanSuite.scala | 7 +++----
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/scalastyle-config.xml b/scalastyle-config.xml
index 48362ca6f9a2..cd7c86a8bc1b 100644
--- a/scalastyle-config.xml
+++ b/scalastyle-config.xml
@@ -787,6 +787,11 @@ This file is divided into 3 sections:
     <customMessage>Use OutputStream.nullOutputStream instead.</customMessage>
   </check>
 
+  <check customId="ImmutableMapcopyOf" level="error" class="org.scalastyle.file.RegexChecker" enabled="true">
+    <parameters><parameter name="regex">\bImmutableMap\.copyOf\b</parameter></parameters>
+    <customMessage>Use Map.copyOf instead.</customMessage>
+  </check>
+
   <check customId="maputils" level="error" class="org.scalastyle.file.RegexChecker" enabled="true">
     <parameters><parameter name="regex">org\.apache\.commons\.collections4\.MapUtils\b</parameter></parameters>
     <customMessage>Use org.apache.spark.util.collection.Utils instead.</customMessage>
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/FileScanSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/FileScanSuite.scala
index f8a3cfc50a2e..c7ea8eca75ea 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/FileScanSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/FileScanSuite.scala
@@ -21,7 +21,6 @@ import java.util.{Map => JMap}
 
 import scala.collection.mutable
 
-import com.google.common.collect.ImmutableMap
 import org.apache.hadoop.fs.{FileStatus, Path}
 
 import org.apache.spark.sql.catalyst.dsl.expressions._
@@ -84,9 +83,9 @@ trait FileScanSuiteBase extends SharedSparkSession {
     val pushedFiltersNotEqual =
       Array[Filter](sources.And(sources.IsNull("data"), sources.LessThan("data", 1)))
     val optionsMap = JMap.of("key", "value")
-    val options = new CaseInsensitiveStringMap(ImmutableMap.copyOf(optionsMap))
+    val options = new CaseInsensitiveStringMap(JMap.copyOf(optionsMap))
     val optionsNotEqual =
-      new CaseInsensitiveStringMap(ImmutableMap.copyOf(JMap.of("key2", "value2")))
+      new CaseInsensitiveStringMap(JMap.copyOf(JMap.of("key2", "value2")))
     val partitionFilters = Seq(And(IsNull($"data".int), LessThan($"data".int, 0)))
     val partitionFiltersNotEqual = Seq(And(IsNull($"data".int),
       LessThan($"data".int, 1)))
@@ -115,7 +114,7 @@ trait FileScanSuiteBase extends SharedSparkSession {
           readDataSchema.copy(),
           readPartitionSchema.copy(),
           pushedFilters.clone(),
-          new CaseInsensitiveStringMap(ImmutableMap.copyOf(optionsMap)),
+          new CaseInsensitiveStringMap(JMap.copyOf(optionsMap)),
           Seq(partitionFilters: _*),
           Seq(dataFilters: _*))
 


