cloud-fan commented on code in PR #45453:
URL: https://github.com/apache/spark/pull/45453#discussion_r1539342857


##########
sql/core/benchmarks/CollationBenchmark-results.txt:
##########
@@ -0,0 +1,26 @@
+OpenJDK 64-Bit Server VM 17.0.10+7-LTS on Linux 6.5.0-1016-azure
+AMD EPYC 7763 64-Core Processor
+filter df column with collation:                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+-----------------------------------------------------------------------------------------------------------------------------------
+filter df column with collation - UNICODE_CI                   403            463          39          0.0    20147470.0       1.0X
+filter df column with collation - UNICODE                      187            223          37          0.0     9339586.0       2.2X
+filter df column with collation - UTF8_BINARY_LCASE            426            434           7          0.0    21300903.4       0.9X
+filter df column with collation - UTF8_BINARY                  188            199           5          0.0     9403169.1       2.1X
+
+OpenJDK 64-Bit Server VM 17.0.10+7-LTS on Linux 6.5.0-1016-azure
+AMD EPYC 7763 64-Core Processor
+collation unit benchmarks:                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+equalsFunction - UTF8_BINARY                          0              0           0         10.4          96.6       1.0X
+collator.compare - UTF8_BINARY                        4              4           0          0.3        3717.9       0.0X
+hashFunction - UTF8_BINARY                            0              0           0         42.0          23.8       4.1X
+equalsFunction - UTF8_BINARY_LCASE                    5              5           0          0.2        5014.7       0.0X

Review Comment:
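   The problem here is that it does not make sense to compare these cases 
against each other, e.g. `hashFunction` vs. `equalsFunction`: the `Relative` 
column is only meaningful when every row measures the same operation. We should 
create more `Benchmark` instances, and each one should test a single operation 
across the different collations.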
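
To illustrate the suggested layout, here is a minimal, self-contained sketch. It is not Spark's `Benchmark` class and not the code in this PR: `PerOperationBenchmarks`, `runGroup`, `bestTimeNs`, and `relative` are hypothetical names, and `java.text.Collator` plus the `String` case-insensitive helpers are only stand-ins for the real collations. The point is the grouping: one group per operation, one case per collation, and the relative speed computed against the first case of the same group.

```java
import java.text.Collator;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class PerOperationBenchmarks {
    // Written to by every case so the JIT cannot discard the measured work.
    static volatile int sink;

    /** Best-of-N wall time in nanoseconds for one case. */
    static long bestTimeNs(Runnable body, int iterations) {
        long best = Long.MAX_VALUE;
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            body.run();
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }

    /** Speed relative to the group's first case; values above 1.0 are faster. */
    static double relative(double baselineNs, double caseNs) {
        return baselineNs / caseNs;
    }

    /** Runs one group: a single operation, with one case per collation. */
    static Map<String, Long> runGroup(String operation, Map<String, Runnable> cases, int iterations) {
        Map<String, Long> best = new LinkedHashMap<>();
        cases.forEach((collation, body) -> best.put(collation, bestTimeNs(body, iterations)));
        long baseline = best.values().iterator().next();
        System.out.println(operation + ":");
        best.forEach((collation, ns) -> System.out.printf(
            "  %-12s %12d ns  %.1fX%n", collation, ns, relative(baseline, Math.max(ns, 1))));
        return best;
    }

    public static void main(String[] args) {
        final String a = "Spark", b = "SPARK";
        final int reps = 100_000;
        // Stand-in for a case-insensitive collation (assumption, not Spark's collator).
        final Collator ci = Collator.getInstance(Locale.ROOT);
        ci.setStrength(Collator.SECONDARY);

        // Group 1: equality only, across collation stand-ins.
        Map<String, Runnable> equalsCases = new LinkedHashMap<>();
        equalsCases.put("binary", () -> { for (int i = 0; i < reps; i++) sink += a.equals(b) ? 1 : 0; });
        equalsCases.put("lowercase", () -> { for (int i = 0; i < reps; i++) sink += a.equalsIgnoreCase(b) ? 1 : 0; });
        equalsCases.put("collator", () -> { for (int i = 0; i < reps; i++) sink += ci.equals(a, b) ? 1 : 0; });
        runGroup("equals", equalsCases, 5);

        // Group 2: comparison only.
        Map<String, Runnable> compareCases = new LinkedHashMap<>();
        compareCases.put("binary", () -> { for (int i = 0; i < reps; i++) sink += a.compareTo(b); });
        compareCases.put("lowercase", () -> { for (int i = 0; i < reps; i++) sink += a.compareToIgnoreCase(b); });
        compareCases.put("collator", () -> { for (int i = 0; i < reps; i++) sink += ci.compare(a, b); });
        runGroup("compare", compareCases, 5);

        // Group 3: hashing only.
        Map<String, Runnable> hashCases = new LinkedHashMap<>();
        hashCases.put("binary", () -> { for (int i = 0; i < reps; i++) sink += a.hashCode(); });
        hashCases.put("lowercase", () -> { for (int i = 0; i < reps; i++) sink += a.toLowerCase(Locale.ROOT).hashCode(); });
        hashCases.put("collator", () -> { for (int i = 0; i < reps; i++) sink += ci.getCollationKey(a).hashCode(); });
        runGroup("hash", hashCases, 5);
    }
}
```

With this grouping, each relative figure answers one question: how much a given collation costs for that operation. The filter benchmark in the hunk above is already organized this way, which is why its 2.2X for `UNICODE` (403 ms / 187 ms) is directly interpretable.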



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


