GideonPotok commented on code in PR #45453:
URL: https://github.com/apache/spark/pull/45453#discussion_r1539275734

##########
sql/core/benchmarks/CollationBenchmark-results.txt:
##########
@@ -0,0 +1,26 @@
+OpenJDK 64-Bit Server VM 17.0.10+7-LTS on Linux 6.5.0-1016-azure
+AMD EPYC 7763 64-Core Processor
+filter df column with collation:                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+-----------------------------------------------------------------------------------------------------------------------------------
+filter df column with collation - UNICODE_CI                   403            463          39          0.0    20147470.0       1.0X
+filter df column with collation - UNICODE                      187            223          37          0.0     9339586.0       2.2X
+filter df column with collation - UTF8_BINARY_LCASE            426            434           7          0.0    21300903.4       0.9X
+filter df column with collation - UTF8_BINARY                  188            199           5          0.0     9403169.1       2.1X
+
+OpenJDK 64-Bit Server VM 17.0.10+7-LTS on Linux 6.5.0-1016-azure
+AMD EPYC 7763 64-Core Processor
+collation unit benchmarks:                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
+------------------------------------------------------------------------------------------------------------------------
+equalsFunction - UTF8_BINARY                          0              0           0         10.4          96.6       1.0X

Review Comment:
   Ah, I see: numIterations can be hardcoded. I misunderstood and thought you wanted a bigger data frame. I am making that change now.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
For additional commands, e-mail: reviews-h...@spark.apache.org