vladimirg-db commented on code in PR #46082:
URL: https://github.com/apache/spark/pull/46082#discussion_r1570130205


##########
sql/core/benchmarks/CollationNonASCIIBenchmark-jdk21-results.txt:
##########
@@ -1,54 +1,54 @@
-OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1017-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - equalsFunction:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 --------------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE                                   18244          18258          20          0.0      456096.4       1.0X
-UNICODE                                               498            498           0          0.1       12440.3      36.7X
-UTF8_BINARY                                           499            500           1          0.1       12467.7      36.6X
-UNICODE_CI                                          13429          13443          19          0.0      335725.4       1.4X
+UTF8_BINARY_LCASE                                   18412          18491         113          0.0      460288.0       1.0X
+UNICODE                                               500            501           3          0.1       12489.2      36.9X
+UTF8_BINARY                                           500            502           3          0.1       12511.4      36.8X
+UNICODE_CI                                          13663          13673          14          0.0      341564.3       1.3X
 
-OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1017-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - compareFunction:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ---------------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE                                    18377          18399          31          0.0      459430.5       1.0X
-UNICODE                                              14238          14240           3          0.0      355957.4       1.3X
-UTF8_BINARY                                            975            976           1          0.0       24371.3      18.9X
-UNICODE_CI                                           13819          13826          10          0.0      345482.6       1.3X
+UTF8_BINARY_LCASE                                    18578          18582           6          0.0      464453.0       1.0X
+UNICODE                                              13870          13904          47          0.0      346759.1       1.3X
+UTF8_BINARY                                           1029           1030           2          0.0       25714.2      18.1X
+UNICODE_CI                                           14092          14103          16          0.0      352289.2       1.3X
 
-OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1017-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - hashFunction:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE                                  9183           9230          67          0.0      229564.0       1.0X
-UNICODE                                           38937          38952          22          0.0      973421.3       0.2X
-UTF8_BINARY                                        1376           1376           0          0.0       34397.5       6.7X
-UNICODE_CI                                        32881          32882           1          0.0      822027.4       0.3X
+UTF8_BINARY_LCASE                                  9410           9413           5          0.0      235238.0       1.0X
+UNICODE                                           41038          41047          12          0.0     1025949.7       0.2X
+UTF8_BINARY                                        1406           1408           3          0.0       35151.5       6.7X
+UNICODE_CI                                        31829          31829           1          0.0      795717.5       0.3X
 
-OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1017-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
 AMD EPYC 7763 64-Core Processor
 collation unit benchmarks - contains:     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-UTF8_BINARY_LCASE                                 22429          22438          13          0.0      560735.1       1.0X
-UNICODE                                            2900           2901           2          0.0       72503.2       7.7X
-UTF8_BINARY                                        3190           3198          11          0.0       79740.5       7.0X
-UNICODE_CI                                       166847         167278         609          0.0     4171180.3       0.1X
+UTF8_BINARY_LCASE                                 22537          22546          13          0.0      563430.8       1.0X

Review Comment:
   Not drastically (within measurement error, I would say, judging by the
other data). But a small slowdown is expected with the current approach: we
first have to find a non-ASCII character, so that part of the loop adds some
overhead, whereas the previous approach did the allocation right away.
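
   To make the trade-off concrete, here is a minimal Java sketch of the two
styles. It is not the code from this PR: the class, the method names, and the
use of lowercasing as the operation are all hypothetical stand-ins. It only
shows why a scan-first approach pays a little extra on non-ASCII input (the
scan runs, then the slow path runs anyway), while an allocate-first approach
skips the scan.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

/** Hypothetical illustration only; none of these names are Spark APIs. */
final class LazyAllocationSketch {

    /** Index of the first byte with the high bit set, or -1 if all bytes are ASCII. */
    static int firstNonAscii(byte[] utf8) {
        for (int i = 0; i < utf8.length; i++) {
            if (utf8[i] < 0) {
                return i;
            }
        }
        return -1;
    }

    /** Assumed "previous" style: decode into a String right away, then transform. */
    static String eagerLower(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8).toLowerCase(Locale.ROOT);
    }

    /** Assumed "current" style: scan first, take the slow path only when needed. */
    static String scanFirstLower(byte[] utf8) {
        if (firstNonAscii(utf8) >= 0) {
            // Non-ASCII input: the scan above is extra work on top of the slow path.
            return eagerLower(utf8);
        }
        // Pure ASCII: lowercase byte by byte, with no UTF-8 decoding.
        byte[] out = new byte[utf8.length];
        for (int i = 0; i < utf8.length; i++) {
            byte b = utf8[i];
            out[i] = (b >= 'A' && b <= 'Z') ? (byte) (b + 32) : b;
        }
        return new String(out, StandardCharsets.US_ASCII);
    }
}
```

   In this sketch both methods return the same result; only the amount of work
done before reaching the slow path differs.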



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

