[ https://issues.apache.org/jira/browse/SPARK-31443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Maxim Gekk updated SPARK-31443:
-------------------------------
    Description:
DateTimeBenchmark shows a performance regression of toJavaDate in master relative to Spark 2.4.6-SNAPSHOT.

Spark 2.4.6-SNAPSHOT at the PR [https://github.com/MaxGekk/spark/pull/27]
{code:java}
OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux 4.15.0-1063-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
To/from Java's date-time:                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
From java.sql.Date                                  559            603          38          8.9         111.8       1.0X
Collect dates                                      2306           3221        1558          2.2         461.1       0.2X
{code}
Current master:
{code:java}
OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux 4.15.0-1063-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
To/from Java's date-time:                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
From java.sql.Date                                 1052           1130          73          4.8         210.3       1.0X
Collect dates                                      3251           4943        1624          1.5         650.2       0.3X
{code}
If we subtract the cost of preparing the DATE column:
* Spark 2.4.6-SNAPSHOT is (461.1 - 111.8) = 349.3 ns/row
* master is (650.2 - 210.3) = 439.9 ns/row

The regression of toJavaDate in master against Spark 2.4.6-SNAPSHOT is (439.9 - 349.3)/349.3 = ~26%

  was:
DateTimeBenchmark shows the regression

Spark 2.4.6-SNAPSHOT at the PR https://github.com/MaxGekk/spark/pull/27
{code}
================================================================================================
Conversion from/to external types
================================================================================================

OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux 4.15.0-1063-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
To/from java.sql.Timestamp:               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
From java.sql.Date                                  614            655          43          8.1         122.8       1.0X
{code}
Current master:
{code}
================================================================================================
Conversion from/to external types
================================================================================================

OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux 4.15.0-1063-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
To/from java.sql.Timestamp:               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
From java.sql.Date                                 1154           1206          46          4.3         230.9       1.0X
{code}
The regression is ~x2.

> Perf regression of toJavaDate
> -----------------------------
>
>                 Key: SPARK-31443
>                 URL: https://issues.apache.org/jira/browse/SPARK-31443
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Maxim Gekk
>            Priority: Major
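The per-row arithmetic in the description (taking the "Collect dates" cost and subtracting the cost of preparing the DATE column, measured by "From java.sql.Date") can be re-checked with a small script. This is only a sketch: the timings are copied from the two benchmark tables in the description, and the names `PER_ROW_NS`, `collect_only_ns` and `relative_regression` are hypothetical helpers, not part of Spark or DateTimeBenchmark.

```python
# Per-row timings in ns, copied from the "Per Row(ns)" column of the
# two benchmark tables in the issue description.
PER_ROW_NS = {
    "Spark 2.4.6-SNAPSHOT": {"From java.sql.Date": 111.8, "Collect dates": 461.1},
    "master": {"From java.sql.Date": 210.3, "Collect dates": 650.2},
}


def collect_only_ns(rows):
    """Collect cost per row with the DATE column preparation cost subtracted."""
    return rows["Collect dates"] - rows["From java.sql.Date"]


def relative_regression(baseline_ns, current_ns):
    """Slowdown of current_ns relative to baseline_ns, as a fraction."""
    return (current_ns - baseline_ns) / baseline_ns


baseline = collect_only_ns(PER_ROW_NS["Spark 2.4.6-SNAPSHOT"])
current = collect_only_ns(PER_ROW_NS["master"])
regression = relative_regression(baseline, current)

print(f"Spark 2.4.6-SNAPSHOT: {baseline:.1f} ns/row")
print(f"master:               {current:.1f} ns/row")
print(f"regression:           {regression:.1%}")
```

With the figures from the tables this gives 349.3 ns/row for Spark 2.4.6-SNAPSHOT and 439.9 ns/row for master, a slowdown of roughly 26%.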