Re: [PR] [SPARK-48025][SQL][TESTS] Fix org.apache.spark.sql.execution.benchmark.DateTimeBenchmark [spark]
yaooqinn commented on PR #46261:
URL: https://github.com/apache/spark/pull/46261#issuecomment-2081416283

Merged to master.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Re: [PR] [SPARK-48025][SQL][TESTS] Fix org.apache.spark.sql.execution.benchmark.DateTimeBenchmark [spark]
yaooqinn closed pull request #46261: [SPARK-48025][SQL][TESTS] Fix org.apache.spark.sql.execution.benchmark.DateTimeBenchmark
URL: https://github.com/apache/spark/pull/46261
Re: [PR] [SPARK-48025][SQL][TESTS] Fix org.apache.spark.sql.execution.benchmark.DateTimeBenchmark [spark]
yaooqinn commented on PR #46261:
URL: https://github.com/apache/spark/pull/46261#issuecomment-2081402895

Thank you @HyukjinKwon. I have dispatched a CI task to audit the change: https://github.com/yaooqinn/spark/actions/runs/8866497258
Re: [PR] [SPARK-48025][SQL][TESTS] Fix org.apache.spark.sql.execution.benchmark.DateTimeBenchmark [spark]
yaooqinn commented on code in PR #46261:
URL: https://github.com/apache/spark/pull/46261#discussion_r1582051605

## sql/core/benchmarks/DateTimeBenchmark-jdk21-results.txt: ##
@@ -2,460 +2,460 @@
 datetime +/- interval
-OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
 AMD EPYC 7763 64-Core Processor
 datetime +/- interval:                  Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
-date + interval(m)                                850           887         33       11.8         85.0      1.0X
-date + interval(m, d)                             863           864          2       11.6         86.3      1.0X
-date + interval(m, d, ms)                        3507          3511          5        2.9        350.7      0.2X
-date - interval(m)                                841           851          9       11.9         84.1      1.0X
-date - interval(m, d)                             864           870          5       11.6         86.4      1.0X
-date - interval(m, d, ms)                        3518          3519          2        2.8        351.8      0.2X
-timestamp + interval(m)                          1756          1759          5        5.7        175.6      0.5X
-timestamp + interval(m, d)                       1802          1805          4        5.5        180.2      0.5X
-timestamp + interval(m, d, ms)                   1958          1961          4        5.1        195.8      0.4X
-timestamp - interval(m)                          1744          1745          2        5.7        174.4      0.5X
-timestamp - interval(m, d)                       1796          1799          4        5.6        179.6      0.5X
-timestamp - interval(m, d, ms)                   1944          1947          5        5.1        194.4      0.4X
+date + interval(m)                               1149          1158         12        8.7        114.9      1.0X
+date + interval(m, d)                            1136          1137          1        8.8        113.6      1.0X
+date + interval(m, d, ms)                        3779          3799         29        2.6        377.9      0.3X
+date - interval(m)                               1113          1116          4        9.0        111.3      1.0X
+date - interval(m, d)                            1124          1141         25        8.9        112.4      1.0X
+date - interval(m, d, ms)                        3795          3796          1        2.6        379.5      0.3X
+timestamp + interval(m)                          1528          1530          3        6.5        152.8      0.8X
+timestamp + interval(m, d)                       1581          1585          6        6.3        158.1      0.7X
+timestamp + interval(m, d, ms)                   2037          2044         10        4.9        203.7      0.6X
+timestamp - interval(m)                          1786          1790          6        5.6        178.6      0.6X
+timestamp - interval(m, d)                       1865          1872         10        5.4        186.5      0.6X
+timestamp - interval(m, d, ms)                   2038          2054         23        4.9        203.8      0.6X

 Extract components
-OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
 AMD EPYC 7763 64-Core Processor
 cast to timestamp:                      Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
-cast to timestamp wholestage off                  209           209          0       47.9         20.9      1.0X
-cast to timestamp wholestage on                   209           225         15       47.8         20.9      1.0X
+cast to timestamp wholestage off                  192           198          9       52.2         19.2      1.0X
+cast to timestamp wholestage on                   206           213          6       48.5         20.6      0.9X

-OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
+OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure
 AMD EPYC 7763 64-Core Processor
 year
Re: [PR] [SPARK-48025][SQL][TESTS] Fix org.apache.spark.sql.execution.benchmark.DateTimeBenchmark [spark]
yaooqinn commented on code in PR #46261:
URL: https://github.com/apache/spark/pull/46261#discussion_r1582049065

## sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/DateTimeBenchmark.scala: ##
@@ -161,7 +167,7 @@ object DateTimeBenchmark extends SqlBasedBenchmark {
       }
       val dateExpr = "cast(timestamp_seconds(id) as date)"
       Seq("year", "", "yy", "mon", "month", "mm").foreach { level =>
-        run(N, s"trunc $level", s"trunc('$level', $dateExpr)")
+        run(N, s"trunc $level", s"trunc($dateExpr, '$level')")

Review Comment:
The parameter order was wrong: `trunc` takes the date first and the format second.
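For readers following the `trunc` fix, here is a minimal sketch of the argument order, written as plain string builders so it runs without a Spark session. Spark SQL's `trunc` is `trunc(date, fmt)`, while `date_trunc` is `date_trunc(fmt, ts)`, which is probably how the two got mixed up. The object name `TruncExprSketch` and both helper names are hypothetical and not part of the PR.

```scala
object TruncExprSketch {
  // Same expression the benchmark uses to derive a date from the row id.
  val dateExpr: String = "cast(timestamp_seconds(id) as date)"

  // Builds a trunc call in (date, format) order, as in the fixed benchmark line.
  def truncExpr(level: String): String = s"trunc($dateExpr, '$level')"

  // Builds a date_trunc call in (format, timestamp) order, shown only for contrast.
  def dateTruncExpr(level: String, tsExpr: String): String =
    s"date_trunc('$level', $tsExpr)"
}
```

Calling `TruncExprSketch.truncExpr("year")` gives the SQL text that the corrected `run(...)` line passes to the benchmark for that level.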
Re: [PR] [SPARK-48025][SQL][TESTS] Fix org.apache.spark.sql.execution.benchmark.DateTimeBenchmark [spark]
yaooqinn commented on code in PR #46261:
URL: https://github.com/apache/spark/pull/46261#discussion_r1582049184

## sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/DateTimeBenchmark.scala: ##
@@ -171,7 +177,7 @@ object DateTimeBenchmark extends SqlBasedBenchmark {
       run(n, "to timestamp str", timestampStrExpr)
       run(n, "to_timestamp", s"to_timestamp($timestampStrExpr, $pattern)")
       run(n, "to_unix_timestamp", s"to_unix_timestamp($timestampStrExpr, $pattern)")
-      val dateStrExpr = "concat('2019-01-', lpad(mod(id, 25), 2, '0'))"
+      val dateStrExpr = "concat('2019-01-', lpad(mod(id, 25) + 1, 2, '0'))"

Review Comment:
Avoid the invalid date value `2019-01-00`, which the old expression produced whenever `mod(id, 25)` evaluated to 0.
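The off-by-one in the date string can be checked without Spark. The sketch below mirrors the old and new expressions in plain Scala and validates each result with `java.time.LocalDate.parse`. The helper names (`DateStrSketch`, `pad2`, `before`, `after`, `isValid`) are hypothetical, and `pad2` only approximates `lpad(x, 2, '0')` for the small non-negative values used here.

```scala
import java.time.LocalDate
import scala.util.Try

object DateStrSketch {
  // Approximates lpad(x, 2, '0') for small non-negative values.
  def pad2(x: Long): String = f"$x%02d"

  // Old expression: mod(id, 25) ranges over 0..24, so day "00" can appear.
  def before(id: Long): String = s"2019-01-${pad2(id % 25)}"

  // New expression: mod(id, 25) + 1 ranges over 1..25, always a real day in January.
  def after(id: Long): String = s"2019-01-${pad2(id % 25 + 1)}"

  // True when the string parses as an ISO-8601 calendar date.
  def isValid(s: String): Boolean = Try(LocalDate.parse(s)).isSuccess
}
```

With these helpers, `before(0)` is the string `2019-01-00`, which fails to parse, while `after(id)` stays within the 1st to the 25th for any non-negative `id`.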
Re: [PR] [SPARK-48025][SQL][TESTS] Fix org.apache.spark.sql.execution.benchmark.DateTimeBenchmark [spark]
yaooqinn commented on code in PR #46261:
URL: https://github.com/apache/spark/pull/46261#discussion_r1582048982

## sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/DateTimeBenchmark.scala: ##
@@ -75,7 +81,7 @@ object DateTimeBenchmark extends SqlBasedBenchmark {
           doBenchmark(N, s"$dt + interval 1 month 2 day")
         }
         benchmark.addCase("date + interval(m, d, ms)") { _ =>
-          doBenchmark(N, s"$dt + interval 1 month 2 day 5 hour")
+          doBenchmarkAnsiOff(N, s"$dt + interval 1 month 2 day 5 hour")

Review Comment:
The hour portion is illegal for ANSI date add, so this case now uses `doBenchmarkAnsiOff`.
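To illustrate why an hour portion is a problem when adding an interval to a date, here is a toy model in plain Scala. It is not Spark's implementation: `IntervalSketch`, `addStrict`, and `addLenient` are hypothetical names, and the lenient rule (widen to a timestamp at midnight, add, truncate back to a date) is an assumption about what a non-ANSI path might do, included only for contrast.

```scala
import java.time.LocalDate
import scala.util.Try

// Hypothetical stand-in for a calendar interval: months, days, and a sub-day part.
final case class IntervalSketch(months: Int, days: Int, micros: Long)

object DateAddSketch {
  // Strict (ANSI-like) rule: a date has no time of day, so a sub-day part is rejected.
  def addStrict(d: LocalDate, iv: IntervalSketch): LocalDate = {
    require(iv.micros == 0L, "cannot add hours, minutes or seconds to a date")
    d.plusMonths(iv.months).plusDays(iv.days)
  }

  // Lenient rule (assumed): widen to midnight, add every part, truncate back to a date.
  def addLenient(d: LocalDate, iv: IntervalSketch): LocalDate =
    d.atStartOfDay()
      .plusMonths(iv.months)
      .plusDays(iv.days)
      .plusNanos(iv.micros * 1000L)
      .toLocalDate
}

object DateAddSketchDemo {
  // The benchmark's interval: 1 month, 2 days, 5 hours (expressed in microseconds).
  val oneMonthTwoDaysFiveHours: IntervalSketch =
    IntervalSketch(1, 2, 5L * 3600L * 1000000L)

  // Fails under the strict rule because of the 5 hour part.
  val strictResult: Try[LocalDate] =
    Try(DateAddSketch.addStrict(LocalDate.of(2019, 1, 1), oneMonthTwoDaysFiveHours))

  // Succeeds under the lenient rule; the hours are dropped on truncation.
  val lenientResult: LocalDate =
    DateAddSketch.addLenient(LocalDate.of(2019, 1, 1), oneMonthTwoDaysFiveHours)
}
```

In this model the strict path throws for the benchmark's interval while the lenient path returns a date, which matches the reason given for moving only the `(m, d, ms)` case to `doBenchmarkAnsiOff`.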