panbingkun commented on PR #48237:
URL: https://github.com/apache/spark/pull/48237#issuecomment-2373414681

   ## size(map_from_entries(array(...)))
   ### Benchmark code:
   ```scala
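   import org.apache.spark.benchmark.Benchmark
   import org.apache.spark.sql.execution.benchmark.SqlBasedBenchmark
   import org.apache.spark.sql.functions.col
   import org.apache.spark.sql.types.StructType

   object SizeBenchmark extends SqlBasedBenchmark {
     private val N = 1_000_000  // number of rows in the input table
     private val M = 100        // iterations per benchmark case
   
     private val path = "/Users/panbingkun/Developer/spark/spark-community/SizeBenchmark"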
     private val df = spark.range(N).to(new StructType().add("id", "int")).
       withColumn("id1", col("id") + 1).
       withColumn("id2", col("id") + 2).
       withColumn("id3", col("id") + 3).
       withColumn("id4", col("id") + 4).
       withColumn("id5", col("id") + 5)
     df.write.parquet(path)
     private val table = spark.read.parquet(path)
   
     private def doBenchmark(): Unit = {
       table.selectExpr("size(map_from_entries(array(struct(id, id3))))").noop()
     }
   
     override def runBenchmarkSuite(mainArgs: Array[String]): Unit = {
       runBenchmark("size") {
         val benchmark = new Benchmark("size", N, output = output)
         benchmark.addCase("optimize", M) { _ =>
           doBenchmark()
         }
         benchmark.run()
       }
     }
   }
   
   ```
   
   ### Benchmark Result:
   #### Before
   ```shell
   Running benchmark: size
     Running case: optimize
     Stopped after 100 iterations, 12723 ms
   
   OpenJDK 64-Bit Server VM 17.0.10+7-LTS on Mac OS X 15.0
   Apple M2
   size:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   ------------------------------------------------------------------------------------------------------------------------
   optimize                                            105            127          19          9.5         104.9       1.0X
   
   
   Running benchmark: size
     Running case: optimize
     Stopped after 100 iterations, 13554 ms
   
   OpenJDK 64-Bit Server VM 17.0.10+7-LTS on Mac OS X 15.0
   Apple M2
   size:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   ------------------------------------------------------------------------------------------------------------------------
   optimize                                            122            136           9          8.2         121.8       1.0X
   
   
   Running benchmark: size
     Running case: optimize
     Stopped after 100 iterations, 12055 ms
   
   OpenJDK 64-Bit Server VM 17.0.10+7-LTS on Mac OS X 15.0
   Apple M2
   size:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   ------------------------------------------------------------------------------------------------------------------------
   optimize                                            105            121          12          9.5         105.3       1.0X
   ```
   
   #### After
   ```shell
   Running benchmark: size
     Running case: optimize
     Stopped after 100 iterations, 3246 ms
   
   OpenJDK 64-Bit Server VM 17.0.10+7-LTS on Mac OS X 15.0
   Apple M2
   size:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   ------------------------------------------------------------------------------------------------------------------------
   optimize                                             22             32           8         46.1          21.7       1.0X
   
   
   Running benchmark: size
     Running case: optimize
     Stopped after 100 iterations, 3312 ms
   
   OpenJDK 64-Bit Server VM 17.0.10+7-LTS on Mac OS X 15.0
   Apple M2
   size:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   ------------------------------------------------------------------------------------------------------------------------
   optimize                                             23             33          18         42.7          23.4       1.0X
   
   
   Running benchmark: size
     Running case: optimize
     Stopped after 100 iterations, 3236 ms
   
   OpenJDK 64-Bit Server VM 17.0.10+7-LTS on Mac OS X 15.0
   Apple M2
   size:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   ------------------------------------------------------------------------------------------------------------------------
   optimize                                             20             32          15         48.9          20.4       1.0X
   ```
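   
   As a rough summary of the runs above, the following sketch averages the reported times and computes the before/after ratio. The object and method names (`SpeedupSummary`, `mean`, `speedup`) are made up for illustration and are not part of the PR; the numbers are copied from the Best Time(ms) and Avg Time(ms) columns of the six runs.
   
   ```scala
   object SpeedupSummary {
     // Best Time(ms) and Avg Time(ms) copied from the three "Before" runs.
     val beforeBestMs: Seq[Double] = Seq(105.0, 122.0, 105.0)
     val beforeAvgMs: Seq[Double] = Seq(127.0, 136.0, 121.0)
   
     // Best Time(ms) and Avg Time(ms) copied from the three "After" runs.
     val afterBestMs: Seq[Double] = Seq(22.0, 23.0, 20.0)
     val afterAvgMs: Seq[Double] = Seq(32.0, 33.0, 32.0)
   
     // Arithmetic mean of a non-empty sequence.
     def mean(xs: Seq[Double]): Double = xs.sum / xs.size
   
     // Ratio of mean times; values above 1.0 mean the "After" runs are faster.
     def speedup(before: Seq[Double], after: Seq[Double]): Double =
       mean(before) / mean(after)
   
     def main(args: Array[String]): Unit = {
       println(f"best-time speedup: ${speedup(beforeBestMs, afterBestMs)}%.2fx")
       println(f"avg-time speedup:  ${speedup(beforeAvgMs, afterAvgMs)}%.2fx")
     }
   }
   ```
   
   With these inputs the ratio works out to about 5.1x on best time (332 / 65) and about 4.0x on average time (384 / 97). This comes from only three runs per side on a single machine, so it should be read as indicative rather than precise.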


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

