[spark] branch master updated (b87a342 -> 78f9043)
This is an automated email from the ASF dual-hosted git repository.

yumwang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from b87a342  [SPARK-31916][SQL] StringConcat can lead to StringIndexOutOfBoundsException
     add 78f9043  [SPARK-31912][SQL][TESTS] Normalize all binary comparison expressions

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/catalyst/plans/PlanTest.scala | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)
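The PlanTest change canonicalizes binary comparisons so that logically equivalent plans compare equal in tests. A sketch of the general idea, assuming spark-catalyst on the classpath — this is not the actual PlanTest code, just an illustration of the normalization:

```scala
import org.apache.spark.sql.catalyst.expressions.{EqualNullSafe, EqualTo, Expression}

// Order the children of a commutative comparison deterministically, so that
// `a = b` and `b = a` normalize to the same expression tree before two plans
// are compared for equality.
def normalizeComparisons(e: Expression): Expression = e.transform {
  case EqualTo(l, r) if l.hashCode() > r.hashCode() => EqualTo(r, l)
  case EqualNullSafe(l, r) if l.hashCode() > r.hashCode() => EqualNullSafe(r, l)
}
```

Ordering by hashCode is arbitrary but deterministic, which is all a test-only normalization needs.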
[spark] branch branch-3.0 updated: [SPARK-31939][SQL][TEST-JAVA11][3.0] Fix Parsing day of year when year field pattern is missing
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 265bf04  [SPARK-31939][SQL][TEST-JAVA11][3.0] Fix Parsing day of year when year field pattern is missing

265bf04 is described below

commit 265bf04d7afa12ea6312721003f834e9538c9101
Author: Kent Yao
AuthorDate: Fri Jun 12 11:52:05 2020 +0900

    [SPARK-31939][SQL][TEST-JAVA11][3.0] Fix Parsing day of year when year field pattern is missing

    ### What changes were proposed in this pull request?

    This PR brings back 4638402d4775acf0d7ff190ef5eabb6f2dc85af5, which was reverted by
    dad163f0f4a70b84f560de4f1f0bb89354e39217, and fixes the test failure for branch-3.0.

    Diffs made: re-generate SQL test results to match 3.0's schema; the RuntimeReplaceable
    expression has a new pretty schema in the master branch.

    ### Why are the changes needed?

    Bring back the reverted PR and fix its tests.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Passing Jenkins.

    Closes #28800 from yaooqinn/SPARK-31939-30.

    Authored-by: Kent Yao
    Signed-off-by: HyukjinKwon
---
 .../catalyst/util/DateTimeFormatterHelper.scala    |  26 ++++++++++++-----
 .../sql/catalyst/util/DateFormatterSuite.scala     |   2 +
 .../sql/catalyst/util/DatetimeFormatterSuite.scala |  78 ++++++++++++++++++
 .../catalyst/util/TimestampFormatterSuite.scala    |   2 +
 .../sql-tests/inputs/datetime-parsing-invalid.sql  |  20 ++++
 .../sql-tests/inputs/datetime-parsing-legacy.sql   |   2 +
 .../sql-tests/inputs/datetime-parsing.sql          |  16 +++
 .../results/datetime-parsing-invalid.sql.out       | 110 +++++++++++++++++++++
 .../results/datetime-parsing-legacy.sql.out        | 106 ++++++++++++++++++++
 .../sql-tests/results/datetime-parsing.sql.out     | 106 ++++++++++++++++++++
 10 files changed, 465 insertions(+), 3 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala
index 992a2b1..5de06af 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala
@@ -39,6 +39,18 @@ trait DateTimeFormatterHelper {
     }
   }
 
+  private def verifyLocalDate(
+      accessor: TemporalAccessor, field: ChronoField, candidate: LocalDate): Unit = {
+    if (accessor.isSupported(field)) {
+      val actual = accessor.get(field)
+      val expected = candidate.get(field)
+      if (actual != expected) {
+        throw new DateTimeException(s"Conflict found: Field $field $actual differs from" +
+          s" $field $expected derived from $candidate")
+      }
+    }
+  }
+
   protected def toLocalDate(accessor: TemporalAccessor): LocalDate = {
     val localDate = accessor.query(TemporalQueries.localDate())
     // If all the date fields are specified, return the local date directly.
@@ -48,9 +60,17 @@ trait DateTimeFormatterHelper {
     // later, and we should provide default values for missing fields.
     // To be compatible with Spark 2.4, we pick 1970 as the default value of year.
     val year = getOrDefault(accessor, ChronoField.YEAR, 1970)
-    val month = getOrDefault(accessor, ChronoField.MONTH_OF_YEAR, 1)
-    val day = getOrDefault(accessor, ChronoField.DAY_OF_MONTH, 1)
-    LocalDate.of(year, month, day)
+    if (accessor.isSupported(ChronoField.DAY_OF_YEAR)) {
+      val dayOfYear = accessor.get(ChronoField.DAY_OF_YEAR)
+      val date = LocalDate.ofYearDay(year, dayOfYear)
+      verifyLocalDate(accessor, ChronoField.MONTH_OF_YEAR, date)
+      verifyLocalDate(accessor, ChronoField.DAY_OF_MONTH, date)
+      date
+    } else {
+      val month = getOrDefault(accessor, ChronoField.MONTH_OF_YEAR, 1)
+      val day = getOrDefault(accessor, ChronoField.DAY_OF_MONTH, 1)
+      LocalDate.of(year, month, day)
+    }
   }
 
   private def toLocalTime(accessor: TemporalAccessor): LocalTime = {

diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateFormatterSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateFormatterSuite.scala
index 4892dea..0a29d94 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateFormatterSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateFormatterSuite.scala
@@ -31,6 +31,8 @@ class DateFormatterSuite extends DatetimeFormatterSuite {
     DateFormatter(pattern, UTC, isParsing)
   }
 
+  override protected def useDateFormatter: Boolean = true
+
   test("parsing dates") {
     outstandingTimezonesIds.foreach { timeZone =>
       withSQLConf(SQLConf.SESSION_LOCAL_TIMEZONE.key
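The behavioral change in toLocalDate is easiest to see outside Spark. Below is a minimal standalone sketch of the new fallback, assuming only java.time; the constant 1970 mirrors the default year in the patch above:

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatterBuilder
import java.time.temporal.{ChronoField, TemporalAccessor}

object DayOfYearParsingSketch {
  def main(args: Array[String]): Unit = {
    // Parse a pattern that has day-of-year but no year field.
    val fmt = new DateTimeFormatterBuilder().appendPattern("DDD").toFormatter
    val accessor: TemporalAccessor = fmt.parse("100")

    // Default the year to 1970, as the patched helper does for Spark 2.4 compatibility.
    val year =
      if (accessor.isSupported(ChronoField.YEAR)) accessor.get(ChronoField.YEAR) else 1970

    // New behavior: when DAY_OF_YEAR is present, resolve through ofYearDay
    // instead of falling back to month=1, day=1.
    val date =
      if (accessor.isSupported(ChronoField.DAY_OF_YEAR)) {
        LocalDate.ofYearDay(year, accessor.get(ChronoField.DAY_OF_YEAR))
      } else {
        LocalDate.of(year, 1, 1)
      }

    println(date) // 1970-04-10
  }
}
```

Before the fix, the parsed DAY_OF_YEAR field was simply ignored and the date silently resolved to 1970-01-01; the patch resolves it via LocalDate.ofYearDay and cross-checks any explicit month or day-of-month fields against the result, raising a DateTimeException on conflict.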
[spark] branch branch-3.0 updated: [SPARK-31916][SQL] StringConcat can lead to StringIndexOutOfBoundsException
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new f61b31a  [SPARK-31916][SQL] StringConcat can lead to StringIndexOutOfBoundsException

f61b31a is described below

commit f61b31a5a484c7e90920ec36c456594ce92cdf73
Author: Dilip Biswal
AuthorDate: Fri Jun 12 09:19:29 2020 +0900

    [SPARK-31916][SQL] StringConcat can lead to StringIndexOutOfBoundsException

    ### What changes were proposed in this pull request?
    A minor fix to the append method of StringConcat: cap the length at
    MAX_ROUNDED_ARRAY_LENGTH so it does not overflow and cause a
    StringIndexOutOfBoundsException.

    Thanks to **Jeffrey Stokes** for reporting the issue and explaining the
    underlying problem in detail in the JIRA.

    ### Why are the changes needed?
    This fixes a StringIndexOutOfBoundsException on overflow.

    ### Does this PR introduce any user-facing change?
    No.

    ### How was this patch tested?
    Added a test in StringUtilsSuite.

    Closes #28750 from dilipbiswal/SPARK-31916.

    Authored-by: Dilip Biswal
    Signed-off-by: Takeshi Yamamuro
    (cherry picked from commit b87a342c7dd51046fcbe323db640c825646fb8d4)
    Signed-off-by: Takeshi Yamamuro
---
 .../spark/sql/catalyst/util/StringUtils.scala      |  6 +++++-
 .../spark/sql/catalyst/util/StringUtilsSuite.scala | 32 ++++++++++++++++++++--
 2 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala
index b42ae4e..2a416d6 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala
@@ -123,7 +123,11 @@ object StringUtils extends Logging {
       val stringToAppend = if (available >= sLen) s else s.substring(0, available)
       strings.append(stringToAppend)
     }
-    length += sLen
+
+    // Keeps the total length of appended strings. Note that we need to cap the length at
+    // `ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH`; otherwise, we will overflow
+    // length causing StringIndexOutOfBoundsException in the substring call above.
+    length = Math.min(length.toLong + sLen, ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH).toInt
   }
 }

diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/StringUtilsSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/StringUtilsSuite.scala
index 67bc4bc..c68e89fc 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/StringUtilsSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/StringUtilsSuite.scala
@@ -18,9 +18,11 @@
 package org.apache.spark.sql.catalyst.util
 
 import org.apache.spark.SparkFunSuite
+import org.apache.spark.sql.catalyst.plans.SQLHelper
 import org.apache.spark.sql.catalyst.util.StringUtils._
+import org.apache.spark.sql.internal.SQLConf
 
-class StringUtilsSuite extends SparkFunSuite {
+class StringUtilsSuite extends SparkFunSuite with SQLHelper {
 
   test("escapeLikeRegex") {
     val expectedEscapedStrOne = "(?s)\\Qa\\E\\Qb\\E\\Qd\\E\\Qe\\E\\Qf\\E"
@@ -98,4 +100,32 @@ class StringUtilsSuite extends SparkFunSuite {
     assert(checkLimit("1234567"))
     assert(checkLimit("1234567890"))
   }
+
+  test("SPARK-31916: StringConcat doesn't overflow on many inputs") {
+    val concat = new StringConcat(maxLength = 100)
+    val stringToAppend = "Test internal index of StringConcat does not overflow with many " +
+      "append calls"
+    0.to((Integer.MAX_VALUE / stringToAppend.length) + 1).foreach { _ =>
+      concat.append(stringToAppend)
+    }
+    assert(concat.toString.length === 100)
+  }
+
+  test("SPARK-31916: verify that PlanStringConcat's output shows the actual length of the plan") {
+    withSQLConf(SQLConf.MAX_PLAN_STRING_LENGTH.key -> "0") {
+      val concat = new PlanStringConcat()
+      0.to(3).foreach { i =>
+        concat.append(s"plan fragment $i")
+      }
+      assert(concat.toString === "Truncated plan of 60 characters")
+    }
+
+    withSQLConf(SQLConf.MAX_PLAN_STRING_LENGTH.key -> "60") {
+      val concat = new PlanStringConcat()
+      0.to(2).foreach { i =>
+        concat.append(s"plan fragment $i")
+      }
+      assert(concat.toString === "plan fragment 0plan fragment 1... 15 more characters")
+    }
+  }
 }
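Why the cap matters: the old bookkeeping added an Int to an Int on every call, so after enough appends the counter wrapped negative and the `substring(0, available)` call above received a bogus bound. A minimal standalone sketch of the wrap-around and the fix; the cap value Int.MaxValue - 15 is the value of Spark's ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH, inlined here so the snippet runs without Spark:

```scala
object StringConcatOverflowSketch {
  def main(args: Array[String]): Unit = {
    val sLen = 100

    // Pre-fix bookkeeping: the Int counter keeps growing even after appends
    // start being truncated, so it eventually wraps around to a negative value.
    var length: Int = Int.MaxValue - 10
    length += sLen
    println(length) // -2147483559: Int overflow

    // Post-fix bookkeeping: widen to Long before adding, then cap.
    val MaxRoundedArrayLength = Int.MaxValue - 15 // ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH
    var capped: Int = Int.MaxValue - 10
    capped = math.min(capped.toLong + sLen, MaxRoundedArrayLength.toLong).toInt
    println(capped) // 2147483632: pinned at the cap, never negative
  }
}
```

The widening-to-Long trick is the whole fix: the addition can no longer overflow, and min() keeps the stored Int within the legal array-length range.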
[spark] branch master updated (88a4e55 -> b87a342)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 88a4e55  [SPARK-31765][WEBUI][TEST-MAVEN] Upgrade HtmlUnit >= 2.37.0
     add b87a342  [SPARK-31916][SQL] StringConcat can lead to StringIndexOutOfBoundsException

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/util/StringUtils.scala      |  6 +++++-
 .../spark/sql/catalyst/util/StringUtilsSuite.scala | 32 ++++++++++++++++++++--
 2 files changed, 36 insertions(+), 2 deletions(-)
[spark] branch master updated (b1adc3d -> 88a4e55)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from b1adc3d  [SPARK-21117][SQL] Built-in SQL Function Support - WIDTH_BUCKET
     add 88a4e55  [SPARK-31765][WEBUI][TEST-MAVEN] Upgrade HtmlUnit >= 2.37.0

No new revisions were added by this update.

Summary of changes:
 core/pom.xml                                             |  2 +-
 core/src/main/scala/org/apache/spark/ui/JettyUtils.scala |  7 ++++++-
 .../test/scala/org/apache/spark/ui/UISeleniumSuite.scala |  2 +-
 pom.xml                                                  | 14 +++++++++---
 sql/core/pom.xml                                         |  2 +-
 sql/hive-thriftserver/pom.xml                            |  2 +-
 streaming/pom.xml                                        |  2 +-
 7 files changed, 20 insertions(+), 11 deletions(-)
[spark] branch branch-3.0 updated: [SPARK-31935][SQL][3.0] Hadoop file system config should be effective in data source options
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new fe298c3  [SPARK-31935][SQL][3.0] Hadoop file system config should be effective in data source options

fe298c3 is described below

commit fe298c34e11823cb1371db1bff425ce8874fface
Author: Gengliang Wang
AuthorDate: Thu Jun 11 14:18:19 2020 -0700

    [SPARK-31935][SQL][3.0] Hadoop file system config should be effective in data source options

    ### What changes were proposed in this pull request?

    Make Hadoop file system config effective in data source options.

    From `org.apache.hadoop.fs.FileSystem.java`:
    ```
    public static FileSystem get(URI uri, Configuration conf) throws IOException {
      String scheme = uri.getScheme();
      String authority = uri.getAuthority();
      if (scheme == null && authority == null) {     // use default FS
        return get(conf);
      }
      if (scheme != null && authority == null) {     // no authority
        URI defaultUri = getDefaultUri(conf);
        if (scheme.equals(defaultUri.getScheme())    // if scheme matches default
            && defaultUri.getAuthority() != null) {  // & default has authority
          return get(defaultUri, conf);              // return default
        }
      }
      String disableCacheName = String.format("fs.%s.impl.disable.cache", scheme);
      if (conf.getBoolean(disableCacheName, false)) {
        return createFileSystem(uri, conf);
      }
      return CACHE.get(uri, conf);
    }
    ```

    Before this change, the file system configurations in data source options were not
    propagated in `DataSource.scala`. After it, we can specify authority- and
    URI-scheme-related configurations for scanning file systems.

    This problem only exists in data source V1. In V2, we already use
    `sparkSession.sessionState.newHadoopConfWithOptions(options)` in `FileTable`.

    ### Why are the changes needed?

    Allow users to specify authority- and URI-scheme-related Hadoop configurations for
    file source reading.

    ### Does this PR introduce _any_ user-facing change?

    Yes, file-system-related Hadoop configuration in data source options will be
    effective on reading.

    ### How was this patch tested?

    Unit test.

    Closes #28776 from gengliangwang/SPARK-31935-3.0.

    Lead-authored-by: Gengliang Wang
    Co-authored-by: Dongjoon Hyun
    Signed-off-by: Dongjoon Hyun
---
 .../spark/sql/execution/datasources/DataSource.scala | 13 ++++++++---
 .../apache/spark/sql/FileBasedDataSourceSuite.scala  | 20 +++++++++++++++++
 .../spark/sql/streaming/FileStreamSourceSuite.scala  | 12 ++++++++++
 3 files changed, 39 insertions(+), 6 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
index 3615afc..588a9b4 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
@@ -109,6 +109,9 @@ case class DataSource(
 
   private def providingInstance() = providingClass.getConstructor().newInstance()
 
+  private def newHadoopConfiguration(): Configuration =
+    sparkSession.sessionState.newHadoopConfWithOptions(options)
+
   lazy val sourceInfo: SourceInfo = sourceSchema()
   private val caseInsensitiveOptions = CaseInsensitiveMap(options)
   private val equality = sparkSession.sessionState.conf.resolver
@@ -230,7 +233,7 @@ case class DataSource(
         // once the streaming job starts and some upstream source starts dropping data.
         val hdfsPath = new Path(path)
         if (!SparkHadoopUtil.get.isGlobPath(hdfsPath)) {
-          val fs = hdfsPath.getFileSystem(sparkSession.sessionState.newHadoopConf())
+          val fs = hdfsPath.getFileSystem(newHadoopConfiguration())
          if (!fs.exists(hdfsPath)) {
             throw new AnalysisException(s"Path does not exist: $path")
           }
@@ -357,7 +360,7 @@ case class DataSource(
       case (format: FileFormat, _)
           if FileStreamSink.hasMetadata(
             caseInsensitiveOptions.get("path").toSeq ++ paths,
-            sparkSession.sessionState.newHadoopConf(),
+            newHadoopConfiguration(),
             sparkSession.sessionState.conf) =>
         val basePath = new Path((caseInsensitiveOptions.get("path").toSeq ++ paths).head)
         val fileCatalog = new MetadataLogFileIndex(sparkSession, basePath,
@@ -449,7 +452,7 @@ case class DataSource(
     val allPaths = paths ++ caseInsensitiveOptions.get("path")
     val outputPath = if (allPaths.length == 1) {
       val path = new
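With this in place, per-read Hadoop settings can travel through DataFrameReader options instead of the session-wide Hadoop configuration. A usage sketch under stated assumptions: the bucket name and credential keys below are placeholders, and the fs.s3a.* settings only take effect with the hadoop-aws module on the classpath:

```scala
import org.apache.spark.sql.SparkSession

object PerReadHadoopConfSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("per-read-hadoop-conf")
      .getOrCreate()

    // Hadoop file-system settings passed as data source options now reach the
    // FileSystem lookup inside DataSource (V1), matching V2's FileTable behavior,
    // so different reads can target differently configured file systems.
    val df = spark.read
      .option("fs.s3a.access.key", sys.env.getOrElse("AWS_ACCESS_KEY_ID", ""))
      .option("fs.s3a.secret.key", sys.env.getOrElse("AWS_SECRET_ACCESS_KEY", ""))
      .parquet("s3a://some-bucket/some/path") // placeholder path

    df.show()
  }
}
```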
[spark] branch branch-3.0 updated: [SPARK-31935][SQL][3.0] Hadoop file system config should be effective in data source options
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new fe298c3 [SPARK-31935][SQL][3.0] Hadoop file system config should be effective in data source options fe298c3 is described below commit fe298c34e11823cb1371db1bff425ce8874fface Author: Gengliang Wang AuthorDate: Thu Jun 11 14:18:19 2020 -0700 [SPARK-31935][SQL][3.0] Hadoop file system config should be effective in data source options ### What changes were proposed in this pull request? Mkae Hadoop file system config effective in data source options. From `org.apache.hadoop.fs.FileSystem.java`: ``` public static FileSystem get(URI uri, Configuration conf) throws IOException { String scheme = uri.getScheme(); String authority = uri.getAuthority(); if (scheme == null && authority == null) { // use default FS return get(conf); } if (scheme != null && authority == null) { // no authority URI defaultUri = getDefaultUri(conf); if (scheme.equals(defaultUri.getScheme())// if scheme matches default && defaultUri.getAuthority() != null) { // & default has authority return get(defaultUri, conf); // return default } } String disableCacheName = String.format("fs.%s.impl.disable.cache", scheme); if (conf.getBoolean(disableCacheName, false)) { return createFileSystem(uri, conf); } return CACHE.get(uri, conf); } ``` Before changes, the file system configurations in data source options are not propagated in `DataSource.scala`. After changes, we can specify authority and URI schema related configurations for scanning file systems. This problem only exists in data source V1. In V2, we already use `sparkSession.sessionState.newHadoopConfWithOptions(options)` in `FileTable`. ### Why are the changes needed? Allow users to specify authority and URI schema related Hadoop configurations for file source reading. ### Does this PR introduce _any_ user-facing change? Yes, the file system related Hadoop configuration in data source option will be effective on reading. ### How was this patch tested? Unit test Closes #28776 from gengliangwang/SPARK-31935-3.0. Lead-authored-by: Gengliang Wang Co-authored-by: Dongjoon Hyun Signed-off-by: Dongjoon Hyun --- .../spark/sql/execution/datasources/DataSource.scala | 13 +++-- .../apache/spark/sql/FileBasedDataSourceSuite.scala | 20 .../spark/sql/streaming/FileStreamSourceSuite.scala | 12 3 files changed, 39 insertions(+), 6 deletions(-) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala index 3615afc..588a9b4 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala @@ -109,6 +109,9 @@ case class DataSource( private def providingInstance() = providingClass.getConstructor().newInstance() + private def newHadoopConfiguration(): Configuration = +sparkSession.sessionState.newHadoopConfWithOptions(options) + lazy val sourceInfo: SourceInfo = sourceSchema() private val caseInsensitiveOptions = CaseInsensitiveMap(options) private val equality = sparkSession.sessionState.conf.resolver @@ -230,7 +233,7 @@ case class DataSource( // once the streaming job starts and some upstream source starts dropping data. 
val hdfsPath = new Path(path) if (!SparkHadoopUtil.get.isGlobPath(hdfsPath)) { - val fs = hdfsPath.getFileSystem(sparkSession.sessionState.newHadoopConf()) + val fs = hdfsPath.getFileSystem(newHadoopConfiguration()) if (!fs.exists(hdfsPath)) { throw new AnalysisException(s"Path does not exist: $path") } @@ -357,7 +360,7 @@ case class DataSource( case (format: FileFormat, _) if FileStreamSink.hasMetadata( caseInsensitiveOptions.get("path").toSeq ++ paths, -sparkSession.sessionState.newHadoopConf(), +newHadoopConfiguration(), sparkSession.sessionState.conf) => val basePath = new Path((caseInsensitiveOptions.get("path").toSeq ++ paths).head) val fileCatalog = new MetadataLogFileIndex(sparkSession, basePath, @@ -449,7 +452,7 @@ case class DataSource( val allPaths = paths ++ caseInsensitiveOptions.get("path") val outputPath = if (allPaths.length == 1) { val path = new
[spark] branch master updated (11d3a74 -> b1adc3d)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 11d3a74  [SPARK-31705][SQL] Push more possible predicates through Join via CNF conversion
     add b1adc3d  [SPARK-21117][SQL] Built-in SQL Function Support - WIDTH_BUCKET

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/analysis/FunctionRegistry.scala   |   1 +
 .../sql/catalyst/expressions/mathExpressions.scala |  96
 .../sql-functions/sql-expression-schema.md         |   3 +-
 .../test/resources/sql-tests/inputs/operators.sql  |  14 ++
 .../sql-tests/inputs/postgreSQL/numeric.sql        |  76 +
 .../resources/sql-tests/results/operators.sql.out  |  98 +++-
 .../sql-tests/results/postgreSQL/numeric.sql.out   | 173 -
 7 files changed, 431 insertions(+), 30 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-21117][SQL] Built-in SQL Function Support - WIDTH_BUCKET
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new b1adc3d  [SPARK-21117][SQL] Built-in SQL Function Support - WIDTH_BUCKET
b1adc3d is described below

commit b1adc3deee00058cba669534aee156dc7af243dc
Author: Takeshi Yamamuro
AuthorDate: Thu Jun 11 14:15:28 2020 -0700

    [SPARK-21117][SQL] Built-in SQL Function Support - WIDTH_BUCKET

    ### What changes were proposed in this pull request?

    This PR intends to add a built-in SQL function - `WIDTH_BUCKET`. It is the rework of #18323. Closes #18323

    The other RDBMS references for `WIDTH_BUCKET`:
    - Oracle: https://docs.oracle.com/cd/B28359_01/olap.111/b28126/dml_functions_2137.htm#OLADM717
    - PostgreSQL: https://www.postgresql.org/docs/current/functions-math.html
    - Snowflake: https://docs.snowflake.com/en/sql-reference/functions/width_bucket.html
    - Prestodb: https://prestodb.io/docs/current/functions/math.html
    - Teradata: https://docs.teradata.com/reader/kmuOwjp1zEYg98JsB8fu_A/Wa8vw69cGzoRyNULHZeudg
    - DB2: https://www.ibm.com/support/producthub/db2/docs/content/SSEPGG_11.5.0/com.ibm.db2.luw.sql.ref.doc/doc/r0061483.html?pos=2

    ### Why are the changes needed?

    For better usability.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Added unit tests.

    Closes #28764 from maropu/SPARK-21117.

    Lead-authored-by: Takeshi Yamamuro
    Co-authored-by: Yuming Wang
    Signed-off-by: Dongjoon Hyun
---
 .../sql/catalyst/analysis/FunctionRegistry.scala   |   1 +
 .../sql/catalyst/expressions/mathExpressions.scala |  96
 .../sql-functions/sql-expression-schema.md         |   3 +-
 .../test/resources/sql-tests/inputs/operators.sql  |  14 ++
 .../sql-tests/inputs/postgreSQL/numeric.sql        |  76 +
 .../resources/sql-tests/results/operators.sql.out  |  98 +++-
 .../sql-tests/results/postgreSQL/numeric.sql.out   | 173 -
 7 files changed, 431 insertions(+), 30 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
index e2559d4..3989df5 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
@@ -274,6 +274,7 @@ object FunctionRegistry {
     expression[Tan]("tan"),
     expression[Cot]("cot"),
     expression[Tanh]("tanh"),
+    expression[WidthBucket]("width_bucket"),

     expression[Add]("+"),
     expression[Subtract]("-"),
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala
index fe8ea2a..5c76495 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala
@@ -1325,3 +1325,99 @@ case class BRound(child: Expression, scale: Expression)
   with Serializable with ImplicitCastInputTypes {
   def this(child: Expression) = this(child, Literal(0))
 }
+
+object WidthBucket {
+
+  def computeBucketNumber(value: Double, min: Double, max: Double, numBucket: Long): jl.Long = {
+    if (numBucket <= 0 || numBucket == Long.MaxValue || jl.Double.isNaN(value) || min == max ||
+        jl.Double.isNaN(min) || jl.Double.isInfinite(min) ||
+        jl.Double.isNaN(max) || jl.Double.isInfinite(max)) {
+      return null
+    }
+
+    val lower = Math.min(min, max)
+    val upper = Math.max(min, max)
+
+    if (min < max) {
+      if (value < lower) {
+        0L
+      } else if (value >= upper) {
+        numBucket + 1L
+      } else {
+        (numBucket.toDouble * (value - lower) / (upper - lower)).toLong + 1L
+      }
+    } else { // `min > max` case
+      if (value > upper) {
+        0L
+      } else if (value <= lower) {
+        numBucket + 1L
+      } else {
+        (numBucket.toDouble * (upper - value) / (upper - lower)).toLong + 1L
+      }
+    }
+  }
+}
+
+/**
+ * Returns the bucket number into which the value of this expression would fall
+ * after being evaluated. Note that input arguments must follow conditions listed below;
+ * otherwise, the method will return null.
+ *  - `numBucket` must be greater than zero and be less than Long.MaxValue
+ *  - `value`, `min`, and `max` cannot be NaN
+ *  - `min` bound cannot equal `max`
+ *  - `min` and `max` must be finite
+ *
+ * Note: If `minValue` > `maxValue`, a return value is as follows;
+ *  if `value` >
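A quick, hedged illustration of the semantics documented above, worked by hand from `computeBucketNumber` (assuming a build that contains b1adc3d and an existing `SparkSession` named `spark`):

```scala
// width_bucket(value, min, max, numBucket) splits [min, max) into numBucket
// equal-width buckets; bucket 0 and numBucket + 1 catch under- and overflow.
spark.sql("SELECT width_bucket(5.3, 0.2, 10.6, 5)").show()
// expected 3: (5 * (5.3 - 0.2) / (10.6 - 0.2)).toLong + 1 = 2 + 1 = 3
spark.sql("SELECT width_bucket(CAST('NaN' AS DOUBLE), 0.2, 10.6, 5)").show()
// expected NULL: NaN inputs are rejected by computeBucketNumber above
```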
[spark] branch master updated (91cd06b -> 11d3a74)
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 91cd06b  [SPARK-8981][CORE][FOLLOW-UP] Clean up MDC properties after running a task
     add 11d3a74  [SPARK-31705][SQL] Push more possible predicates through Join via CNF conversion

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/expressions/predicates.scala      |  96 +++-
 .../spark/sql/catalyst/optimizer/Optimizer.scala   |   9 +-
 .../optimizer/PushCNFPredicateThroughJoin.scala    |  62
 .../org/apache/spark/sql/internal/SQLConf.scala    |  15 ++
 .../ConjunctiveNormalFormPredicateSuite.scala      | 128
 .../catalyst/optimizer/FilterPushdownSuite.scala   | 162 -
 6 files changed, 468 insertions(+), 4 deletions(-)
 create mode 100644 sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/PushCNFPredicateThroughJoin.scala
 create mode 100644 sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ConjunctiveNormalFormPredicateSuite.scala

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
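For context, a hedged sketch of the kind of query the new rule helps (`t1`, `t2`, and their columns are made-up names; `spark` is an existing SparkSession):

```scala
// The disjunctive filter below cannot be pushed through the join as a whole:
//   (t1.a = 1 AND t2.b = 2) OR (t1.a = 3 AND t2.b = 4)
// Converted to conjunctive normal form it yields, among others, the conjunct
//   (t1.a = 1 OR t1.a = 3)
// which references only t1 and can therefore be pushed below the join by
// PushCNFPredicateThroughJoin, pruning t1 rows before the join runs.
spark.sql(
  """SELECT *
    |FROM t1 JOIN t2 ON t1.id = t2.id
    |WHERE (t1.a = 1 AND t2.b = 2) OR (t1.a = 3 AND t2.b = 4)
    |""".stripMargin).explain(true)
```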
[spark] branch master updated (912d45d -> 91cd06b)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 912d45d  [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite
     add 91cd06b  [SPARK-8981][CORE][FOLLOW-UP] Clean up MDC properties after running a task

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/executor/Executor.scala | 12
 1 file changed, 4 insertions(+), 8 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
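The follow-up makes Executor remove a task's MDC entries once the task finishes, so they cannot leak into log lines of the next task on the same thread. A hedged sketch of that pattern (the helper and the MDC key below are illustrative, not Spark's actual API):

```scala
import org.slf4j.MDC

// Illustrative helper: install MDC entries for the duration of a task body
// and always remove them afterwards, mirroring the set-up/clean-up pairing
// the follow-up enforces in Executor.scala.
def withTaskMdc[T](props: Map[String, String])(body: => T): T = {
  props.foreach { case (k, v) => MDC.put(k, v) }
  try body
  finally props.keys.foreach(MDC.remove) // clean up MDC after running the task
}

// Usage: log lines emitted inside the block carry the task's MDC context.
withTaskMdc(Map("taskName" -> "task 0.0 in stage 0.0")) {
  // run the task body here
}
```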
[spark] branch branch-3.0 updated (250c6fd -> dad163f)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 250c6fd  [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite
     add dad163f  Revert "[SPARK-31939][SQL][TEST-JAVA11] Fix Parsing day of year when year field pattern is missing"

No new revisions were added by this update.

Summary of changes:
 .../catalyst/util/DateTimeFormatterHelper.scala    |  26 +
 .../sql/catalyst/util/DateFormatterSuite.scala     |   2 -
 .../sql/catalyst/util/DatetimeFormatterSuite.scala |  78 ---
 .../catalyst/util/TimestampFormatterSuite.scala    |   2 -
 .../sql-tests/inputs/datetime-parsing-invalid.sql  |  20
 .../sql-tests/inputs/datetime-parsing-legacy.sql   |   2 -
 .../sql-tests/inputs/datetime-parsing.sql          |  16 ---
 .../results/datetime-parsing-invalid.sql.out       | 110 -
 .../results/datetime-parsing-legacy.sql.out        | 106
 .../sql-tests/results/datetime-parsing.sql.out     | 106
 10 files changed, 3 insertions(+), 465 deletions(-)
 delete mode 100644 sql/core/src/test/resources/sql-tests/inputs/datetime-parsing-invalid.sql
 delete mode 100644 sql/core/src/test/resources/sql-tests/inputs/datetime-parsing-legacy.sql
 delete mode 100644 sql/core/src/test/resources/sql-tests/inputs/datetime-parsing.sql
 delete mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-parsing-invalid.sql.out
 delete mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-parsing-legacy.sql.out
 delete mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-parsing.sql.out

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: Revert "[SPARK-31939][SQL][TEST-JAVA11] Fix Parsing day of year when year field pattern is missing"
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new dad163f  Revert "[SPARK-31939][SQL][TEST-JAVA11] Fix Parsing day of year when year field pattern is missing"
dad163f is described below

commit dad163f0f4a70b84f560de4f1f0bb89354e39217
Author: HyukjinKwon
AuthorDate: Thu Jun 11 22:59:43 2020 +0900

    Revert "[SPARK-31939][SQL][TEST-JAVA11] Fix Parsing day of year when year field pattern is missing"

    This reverts commit 4638402d4775acf0d7ff190ef5eabb6f2dc85af5.
---
 .../catalyst/util/DateTimeFormatterHelper.scala    |  26 +
 .../sql/catalyst/util/DateFormatterSuite.scala     |   2 -
 .../sql/catalyst/util/DatetimeFormatterSuite.scala |  78 ---
 .../catalyst/util/TimestampFormatterSuite.scala    |   2 -
 .../sql-tests/inputs/datetime-parsing-invalid.sql  |  20
 .../sql-tests/inputs/datetime-parsing-legacy.sql   |   2 -
 .../sql-tests/inputs/datetime-parsing.sql          |  16 ---
 .../results/datetime-parsing-invalid.sql.out       | 110 -
 .../results/datetime-parsing-legacy.sql.out        | 106
 .../sql-tests/results/datetime-parsing.sql.out     | 106
 10 files changed, 3 insertions(+), 465 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala
index 5de06af..992a2b1 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala
@@ -39,18 +39,6 @@ trait DateTimeFormatterHelper {
     }
   }

-  private def verifyLocalDate(
-      accessor: TemporalAccessor, field: ChronoField, candidate: LocalDate): Unit = {
-    if (accessor.isSupported(field)) {
-      val actual = accessor.get(field)
-      val expected = candidate.get(field)
-      if (actual != expected) {
-        throw new DateTimeException(s"Conflict found: Field $field $actual differs from" +
-          s" $field $expected derived from $candidate")
-      }
-    }
-  }
-
   protected def toLocalDate(accessor: TemporalAccessor): LocalDate = {
     val localDate = accessor.query(TemporalQueries.localDate())
     // If all the date fields are specified, return the local date directly.
@@ -60,17 +48,9 @@ trait DateTimeFormatterHelper {
     // later, and we should provide default values for missing fields.
     // To be compatible with Spark 2.4, we pick 1970 as the default value of year.
     val year = getOrDefault(accessor, ChronoField.YEAR, 1970)
-    if (accessor.isSupported(ChronoField.DAY_OF_YEAR)) {
-      val dayOfYear = accessor.get(ChronoField.DAY_OF_YEAR)
-      val date = LocalDate.ofYearDay(year, dayOfYear)
-      verifyLocalDate(accessor, ChronoField.MONTH_OF_YEAR, date)
-      verifyLocalDate(accessor, ChronoField.DAY_OF_MONTH, date)
-      date
-    } else {
-      val month = getOrDefault(accessor, ChronoField.MONTH_OF_YEAR, 1)
-      val day = getOrDefault(accessor, ChronoField.DAY_OF_MONTH, 1)
-      LocalDate.of(year, month, day)
-    }
+    val month = getOrDefault(accessor, ChronoField.MONTH_OF_YEAR, 1)
+    val day = getOrDefault(accessor, ChronoField.DAY_OF_MONTH, 1)
+    LocalDate.of(year, month, day)
   }

   private def toLocalTime(accessor: TemporalAccessor): LocalTime = {
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateFormatterSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateFormatterSuite.scala
index 0a29d94..4892dea 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateFormatterSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateFormatterSuite.scala
@@ -31,8 +31,6 @@ class DateFormatterSuite extends DatetimeFormatterSuite {
     DateFormatter(pattern, UTC, isParsing)
   }

-  override protected def useDateFormatter: Boolean = true
-
   test("parsing dates") {
     outstandingTimezonesIds.foreach { timeZone =>
       withSQLConf(SQLConf.SESSION_LOCAL_TIMEZONE.key -> timeZone) {
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DatetimeFormatterSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DatetimeFormatterSuite.scala
index b78facd..31ff50f 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DatetimeFormatterSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DatetimeFormatterSuite.scala
@@ -17,61 +17,15 @@

 package org.apache.spark.sql.catalyst.util

-import java.time.DateTimeException
-
 import org.scalatest.Matchers

 import org.apache.spark.{SparkFunSuite,
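For reference, a hedged example of the behavior the reverted patch had introduced, reasoned from the `toLocalDate` diff above (day 102 of 1970 falls on April 12):

```scala
// With the patch, a day-of-year pattern and no year field defaulted the year
// to 1970 and resolved the date via LocalDate.ofYearDay; it also raised a
// DateTimeException when an explicit month/day conflicted with day-of-year.
// After this revert, branch-3.0 goes back to defaulting month and day to 1.
spark.sql("SELECT to_date('102', 'DDD')").show() // patched behavior: 1970-04-12
```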
[spark] branch branch-2.4 updated (53f1349 -> e51eb3a)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 53f1349  [SPARK-31941][CORE] Replace SparkException to NoSuchElementException for applicationInfo in AppStatusStore
     add e51eb3a  [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite

No new revisions were added by this update.

Summary of changes:
 ...tamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 |  1 -
 ...amp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} |  0
 ...amp cast #4-0-6d2da5cfada03605834e38bc4075bc79} |  0
 ...tamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 |  1 -
 .../spark/sql/hive/execution/HiveQuerySuite.scala  | 26 +-
 5 files changed, 10 insertions(+), 18 deletions(-)
 delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2
 rename sql/hive/src/test/resources/golden/{timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 => timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} (100%)
 rename sql/hive/src/test/resources/golden/{timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 => timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79} (100%)
 delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated (593b423 -> 250c6fd)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 593b423  [SPARK-31958][SQL] normalize special floating numbers in subquery
     add 250c6fd  [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite

No new revisions were added by this update.

Summary of changes:
 ...tamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 |  1 -
 ...amp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} |  0
 ...amp cast #4-0-6d2da5cfada03605834e38bc4075bc79} |  0
 ...tamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 |  1 -
 .../spark/sql/hive/execution/HiveQuerySuite.scala  | 26 +-
 5 files changed, 10 insertions(+), 18 deletions(-)
 delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2
 rename sql/hive/src/test/resources/golden/{timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 => timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} (100%)
 rename sql/hive/src/test/resources/golden/{timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 => timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79} (100%)
 delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 250c6fd  [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite
250c6fd is described below

commit 250c6fdecf169b2a6fa162af5034c3b67ea3f9bc
Author: GuoPhilipse <46367746+guophili...@users.noreply.github.com>
AuthorDate: Thu Jun 11 22:03:40 2020 +0900

    [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite

    ### What changes were proposed in this pull request?

    remove duplicate test cases

    ### Why are the changes needed?

    improve test quality

    ### Does this PR introduce _any_ user-facing change?

    NO

    ### How was this patch tested?

    No test

    Closes #28782 from GuoPhilipse/31954-delete-duplicate-testcase.

    Lead-authored-by: GuoPhilipse <46367746+guophili...@users.noreply.github.com>
    Co-authored-by: GuoPhilipse
    Signed-off-by: HyukjinKwon
    (cherry picked from commit 912d45df7c6535336f72c971c90fecd11cfe87e9)
    Signed-off-by: HyukjinKwon
---
 ...tamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 |  1 -
 ...amp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} |  0
 ...amp cast #4-0-6d2da5cfada03605834e38bc4075bc79} |  0
 ...tamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 |  1 -
 .../spark/sql/hive/execution/HiveQuerySuite.scala  | 26 +-
 5 files changed, 10 insertions(+), 18 deletions(-)

diff --git a/sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 b/sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2
deleted file mode 100644
index 5625e59..000
--- a/sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2
+++ /dev/null
@@ -1 +0,0 @@
-1.2
diff --git a/sql/hive/src/test/resources/golden/timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 b/sql/hive/src/test/resources/golden/timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2
similarity index 100%
rename from sql/hive/src/test/resources/golden/timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2
rename to sql/hive/src/test/resources/golden/timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2
diff --git a/sql/hive/src/test/resources/golden/timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 b/sql/hive/src/test/resources/golden/timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79
similarity index 100%
rename from sql/hive/src/test/resources/golden/timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79
rename to sql/hive/src/test/resources/golden/timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79
diff --git a/sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 b/sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79
deleted file mode 100644
index 1d94c8a..000
--- a/sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79
+++ /dev/null
@@ -1 +0,0 @@
--1.2
diff --git a/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala b/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala
index 63b985f..b10a8cb 100644
--- a/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala
+++ b/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala
@@ -556,33 +556,27 @@ class HiveQuerySuite extends HiveComparisonTest with SQLTestUtils with BeforeAnd
     assert(1 == res.getDouble(0))
   }

-  createQueryTest("timestamp cast #2",
-    "SELECT CAST(CAST(1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1")
-
-  test("timestamp cast #3") {
-    val res = sql("SELECT CAST(CAST(1200 AS TIMESTAMP) AS INT) FROM src LIMIT 1").collect().head
-    assert(1200 == res.getInt(0))
+  test("timestamp cast #2") {
+    val res = sql("SELECT CAST(CAST(-1 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1").collect().head
+    assert(-1 == res.get(0))
   }

-  createQueryTest("timestamp cast #4",
+  createQueryTest("timestamp cast #3",
     "SELECT CAST(CAST(1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1")

+  createQueryTest("timestamp cast #4",
+    "SELECT CAST(CAST(-1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1")
+
   test("timestamp cast #5") {
-    val res = sql("SELECT CAST(CAST(-1 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1").collect().head
-    assert(-1 == res.get(0))
+    val res = sql("SELECT CAST(CAST(1200 AS TIMESTAMP) AS INT) FROM src LIMIT 1").collect().head
+    assert(1200 == res.getInt(0))
   }

-  createQueryTest("timestamp cast #6",
-    "SELECT CAST(CAST(-1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1")
-
-  test("timestamp cast #7") {
+  test("timestamp cast #6") {
     val res = sql("SELECT CAST(CAST(-1200 AS TIMESTAMP) AS INT) FROM src LIMIT 1").collect().head
     assert(-1200 ==
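The surviving tests round-trip numeric values through TIMESTAMP. A hedged sketch of what the renumbered `timestamp cast #2` asserts, taken from the diff above (assuming the suite's `sql` helper and the Hive test table `src`):

```scala
// Casting a numeric value to TIMESTAMP and back to DOUBLE should recover the
// original value (seconds since the epoch, including the sign).
val res = sql("SELECT CAST(CAST(-1 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1").collect().head
assert(-1 == res.get(0)) // matches the assertion in the renumbered test
```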
[spark] branch master updated (6fb9c80 -> 912d45d)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 6fb9c80  [SPARK-31958][SQL] normalize special floating numbers in subquery
     add 912d45d  [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite

No new revisions were added by this update.

Summary of changes:
 ...tamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 |  1 -
 ...amp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} |  0
 ...amp cast #4-0-6d2da5cfada03605834e38bc4075bc79} |  0
 ...tamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 |  1 -
 .../spark/sql/hive/execution/HiveQuerySuite.scala  | 26 +-
 5 files changed, 10 insertions(+), 18 deletions(-)
 delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2
 rename sql/hive/src/test/resources/golden/{timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 => timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} (100%)
 rename sql/hive/src/test/resources/golden/{timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 => timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79} (100%)
 delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated (53f1349 -> e51eb3a)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git. from 53f1349 [SPARK-31941][CORE] Replace SparkException to NoSuchElementException for applicationInfo in AppStatusStore add e51eb3a [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite No new revisions were added by this update. Summary of changes: ...tamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 | 1 - ...amp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} | 0 ...amp cast #4-0-6d2da5cfada03605834e38bc4075bc79} | 0 ...tamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 | 1 - .../spark/sql/hive/execution/HiveQuerySuite.scala | 26 +- 5 files changed, 10 insertions(+), 18 deletions(-) delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 rename sql/hive/src/test/resources/golden/{timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 => timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} (100%) rename sql/hive/src/test/resources/golden/{timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 => timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79} (100%) delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 250c6fd [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite 250c6fd is described below commit 250c6fdecf169b2a6fa162af5034c3b67ea3f9bc Author: GuoPhilipse <46367746+guophili...@users.noreply.github.com> AuthorDate: Thu Jun 11 22:03:40 2020 +0900 [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite ### What changes were proposed in this pull request? remove duplicate test cases ### Why are the changes needed? improve test quality ### Does this PR introduce _any_ user-facing change? NO ### How was this patch tested? No test Closes #28782 from GuoPhilipse/31954-delete-duplicate-testcase. Lead-authored-by: GuoPhilipse <46367746+guophili...@users.noreply.github.com> Co-authored-by: GuoPhilipse Signed-off-by: HyukjinKwon (cherry picked from commit 912d45df7c6535336f72c971c90fecd11cfe87e9) Signed-off-by: HyukjinKwon --- ...tamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 | 1 - ...amp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} | 0 ...amp cast #4-0-6d2da5cfada03605834e38bc4075bc79} | 0 ...tamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 | 1 - .../spark/sql/hive/execution/HiveQuerySuite.scala | 26 +- 5 files changed, 10 insertions(+), 18 deletions(-) diff --git a/sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 b/sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 deleted file mode 100644 index 5625e59..000 --- a/sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 +++ /dev/null @@ -1 +0,0 @@ -1.2 diff --git a/sql/hive/src/test/resources/golden/timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 b/sql/hive/src/test/resources/golden/timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2 similarity index 100% rename from sql/hive/src/test/resources/golden/timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 rename to sql/hive/src/test/resources/golden/timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2 diff --git a/sql/hive/src/test/resources/golden/timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 b/sql/hive/src/test/resources/golden/timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79 similarity index 100% rename from sql/hive/src/test/resources/golden/timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 rename to sql/hive/src/test/resources/golden/timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79 diff --git a/sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 b/sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 deleted file mode 100644 index 1d94c8a..000 --- a/sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 +++ /dev/null @@ -1 +0,0 @@ --1.2 diff --git a/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala b/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala index 63b985f..b10a8cb 100644 --- a/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala +++ b/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala @@ -556,33 +556,27 @@ class HiveQuerySuite extends HiveComparisonTest with SQLTestUtils with BeforeAnd assert(1 == res.getDouble(0)) } - createQueryTest("timestamp cast #2", -"SELECT 
CAST(CAST(1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1") - - test("timestamp cast #3") { -val res = sql("SELECT CAST(CAST(1200 AS TIMESTAMP) AS INT) FROM src LIMIT 1").collect().head -assert(1200 == res.getInt(0)) + test("timestamp cast #2") { +val res = sql("SELECT CAST(CAST(-1 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1").collect().head +assert(-1 == res.get(0)) } - createQueryTest("timestamp cast #4", + createQueryTest("timestamp cast #3", "SELECT CAST(CAST(1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1") + createQueryTest("timestamp cast #4", +"SELECT CAST(CAST(-1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1") + test("timestamp cast #5") { -val res = sql("SELECT CAST(CAST(-1 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1").collect().head -assert(-1 == res.get(0)) +val res = sql("SELECT CAST(CAST(1200 AS TIMESTAMP) AS INT) FROM src LIMIT 1").collect().head +assert(1200 == res.getInt(0)) } - createQueryTest("timestamp cast #6", -"SELECT CAST(CAST(-1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1") - - test("timestamp cast #7") { + test("timestamp cast #6") { val res = sql("SELECT CAST(CAST(-1200 AS TIMESTAMP) AS INT) FROM src LIMIT 1").collect().head assert(-1200 ==
[spark] branch master updated (6fb9c80 -> 912d45d)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 6fb9c80 [SPARK-31958][SQL] normalize special floating numbers in subquery add 912d45d [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite No new revisions were added by this update. Summary of changes: ...tamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 | 1 - ...amp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} | 0 ...amp cast #4-0-6d2da5cfada03605834e38bc4075bc79} | 0 ...tamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 | 1 - .../spark/sql/hive/execution/HiveQuerySuite.scala | 26 +- 5 files changed, 10 insertions(+), 18 deletions(-) delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 rename sql/hive/src/test/resources/golden/{timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 => timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} (100%) rename sql/hive/src/test/resources/golden/{timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 => timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79} (100%) delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated (53f1349 -> e51eb3a)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git. from 53f1349 [SPARK-31941][CORE] Replace SparkException to NoSuchElementException for applicationInfo in AppStatusStore add e51eb3a [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite No new revisions were added by this update. Summary of changes: ...tamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 | 1 - ...amp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} | 0 ...amp cast #4-0-6d2da5cfada03605834e38bc4075bc79} | 0 ...tamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 | 1 - .../spark/sql/hive/execution/HiveQuerySuite.scala | 26 +- 5 files changed, 10 insertions(+), 18 deletions(-) delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 rename sql/hive/src/test/resources/golden/{timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 => timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} (100%) rename sql/hive/src/test/resources/golden/{timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 => timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79} (100%) delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 250c6fd  [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite
250c6fd is described below

commit 250c6fdecf169b2a6fa162af5034c3b67ea3f9bc
Author: GuoPhilipse <46367746+guophili...@users.noreply.github.com>
AuthorDate: Thu Jun 11 22:03:40 2020 +0900

    [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite

    ### What changes were proposed in this pull request?
    remove duplicate test cases

    ### Why are the changes needed?
    improve test quality

    ### Does this PR introduce _any_ user-facing change?
    NO

    ### How was this patch tested?
    No test

    Closes #28782 from GuoPhilipse/31954-delete-duplicate-testcase.

    Lead-authored-by: GuoPhilipse <46367746+guophili...@users.noreply.github.com>
    Co-authored-by: GuoPhilipse
    Signed-off-by: HyukjinKwon
    (cherry picked from commit 912d45df7c6535336f72c971c90fecd11cfe87e9)
    Signed-off-by: HyukjinKwon
---
 ...tamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 |  1 -
 ...amp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} |  0
 ...amp cast #4-0-6d2da5cfada03605834e38bc4075bc79} |  0
 ...tamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 |  1 -
 .../spark/sql/hive/execution/HiveQuerySuite.scala  | 26 +-
 5 files changed, 10 insertions(+), 18 deletions(-)

diff --git a/sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 b/sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2
deleted file mode 100644
index 5625e59..000
--- a/sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2
+++ /dev/null
@@ -1 +0,0 @@
-1.2
diff --git a/sql/hive/src/test/resources/golden/timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 b/sql/hive/src/test/resources/golden/timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2
similarity index 100%
rename from sql/hive/src/test/resources/golden/timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2
rename to sql/hive/src/test/resources/golden/timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2
diff --git a/sql/hive/src/test/resources/golden/timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 b/sql/hive/src/test/resources/golden/timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79
similarity index 100%
rename from sql/hive/src/test/resources/golden/timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79
rename to sql/hive/src/test/resources/golden/timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79
diff --git a/sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 b/sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79
deleted file mode 100644
index 1d94c8a..000
--- a/sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79
+++ /dev/null
@@ -1 +0,0 @@
--1.2
diff --git a/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala b/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala
index 63b985f..b10a8cb 100644
--- a/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala
+++ b/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala
@@ -556,33 +556,27 @@ class HiveQuerySuite extends HiveComparisonTest with SQLTestUtils with BeforeAnd
     assert(1 == res.getDouble(0))
   }

-  createQueryTest("timestamp cast #2",
-    "SELECT CAST(CAST(1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1")
-
-  test("timestamp cast #3") {
-    val res = sql("SELECT CAST(CAST(1200 AS TIMESTAMP) AS INT) FROM src LIMIT 1").collect().head
-    assert(1200 == res.getInt(0))
+  test("timestamp cast #2") {
+    val res = sql("SELECT CAST(CAST(-1 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1").collect().head
+    assert(-1 == res.get(0))
   }

-  createQueryTest("timestamp cast #4",
+  createQueryTest("timestamp cast #3",
     "SELECT CAST(CAST(1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1")

+  createQueryTest("timestamp cast #4",
+    "SELECT CAST(CAST(-1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1")
+
   test("timestamp cast #5") {
-    val res = sql("SELECT CAST(CAST(-1 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1").collect().head
-    assert(-1 == res.get(0))
+    val res = sql("SELECT CAST(CAST(1200 AS TIMESTAMP) AS INT) FROM src LIMIT 1").collect().head
+    assert(1200 == res.getInt(0))
   }

-  createQueryTest("timestamp cast #6",
-    "SELECT CAST(CAST(-1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1")
-
-  test("timestamp cast #7") {
+  test("timestamp cast #6") {
     val res = sql("SELECT CAST(CAST(-1200 AS TIMESTAMP) AS INT) FROM src LIMIT 1").collect().head
-    assert(-1200 ==
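For context, the renumbered tests all pin down the same behavior: round-tripping a numeric literal through TIMESTAMP. A minimal sketch of what they assert, runnable outside the Hive test harness (assumptions: a local SparkSession built here for illustration; the suite's `FROM src LIMIT 1` is dropped because `src` is a Hive test table that only exists inside the harness):

```scala
import org.apache.spark.sql.SparkSession

object TimestampCastSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("timestamp-cast-sketch")
      .getOrCreate()

    // Numeric -> TIMESTAMP interprets the value as seconds since the epoch,
    // and TIMESTAMP -> DOUBLE recovers it, fractional part included
    // (matches the golden-file output "1.2" referenced in the diff).
    val d = spark.sql("SELECT CAST(CAST(1.2 AS TIMESTAMP) AS DOUBLE)")
      .collect().head.getDouble(0)
    assert(d == 1.2)

    // The same round-trip through INT keeps whole seconds, negatives included,
    // mirroring the suite's "timestamp cast #6" assertion.
    val i = spark.sql("SELECT CAST(CAST(-1200 AS TIMESTAMP) AS INT)")
      .collect().head.getInt(0)
    assert(i == -1200)

    spark.stop()
  }
}
```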
[spark] branch master updated (6fb9c80 -> 912d45d)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 6fb9c80  [SPARK-31958][SQL] normalize special floating numbers in subquery
     add 912d45d  [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite

No new revisions were added by this update.

Summary of changes:
 ...tamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 |  1 -
 ...amp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} |  0
 ...amp cast #4-0-6d2da5cfada03605834e38bc4075bc79} |  0
 ...tamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 |  1 -
 .../spark/sql/hive/execution/HiveQuerySuite.scala  | 26 +-
 5 files changed, 10 insertions(+), 18 deletions(-)
 delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2
 rename sql/hive/src/test/resources/golden/{timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 => timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} (100%)
 rename sql/hive/src/test/resources/golden/{timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 => timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79} (100%)
 delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated (53f1349 -> e51eb3a)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 53f1349  [SPARK-31941][CORE] Replace SparkException to NoSuchElementException for applicationInfo in AppStatusStore
     add e51eb3a  [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite

No new revisions were added by this update.

Summary of changes:
 ...tamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 |  1 -
 ...amp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} |  0
 ...amp cast #4-0-6d2da5cfada03605834e38bc4075bc79} |  0
 ...tamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 |  1 -
 .../spark/sql/hive/execution/HiveQuerySuite.scala  | 26 +-
 5 files changed, 10 insertions(+), 18 deletions(-)
 delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2
 rename sql/hive/src/test/resources/golden/{timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 => timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} (100%)
 rename sql/hive/src/test/resources/golden/{timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 => timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79} (100%)
 delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31958][SQL] normalize special floating numbers in subquery
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 593b423  [SPARK-31958][SQL] normalize special floating numbers in subquery
593b423 is described below

commit 593b42323255f44b268ca5148f1fa817f3c01de7
Author: Wenchen Fan
AuthorDate: Thu Jun 11 06:39:14 2020 +0000

    [SPARK-31958][SQL] normalize special floating numbers in subquery

    ### What changes were proposed in this pull request?

    This is a followup of https://github.com/apache/spark/pull/23388 .

    https://github.com/apache/spark/pull/23388 has an issue: it doesn't handle subquery expressions and assumes they will be turned into joins. However, this is not true for non-correlated subquery expressions.

    This PR fixes this issue. It now doesn't skip `Subquery`, and subquery expressions will be handled by `OptimizeSubqueries`, which runs the optimizer with the subquery.

    Note that, correlated subquery expressions will be handled twice: once in `OptimizeSubqueries`, once later when it becomes join. This is OK as `NormalizeFloatingNumbers` is idempotent now.

    ### Why are the changes needed?

    fix a bug

    ### Does this PR introduce _any_ user-facing change?

    yes, see the newly added test.

    ### How was this patch tested?

    new test

    Closes #28785 from cloud-fan/normalize.

    Authored-by: Wenchen Fan
    Signed-off-by: Wenchen Fan
    (cherry picked from commit 6fb9c80da129d0b43f9ff5b8be6ce8bad992a4ed)
    Signed-off-by: Wenchen Fan
---
 .../catalyst/optimizer/NormalizeFloatingNumbers.scala |  4
 .../scala/org/apache/spark/sql/SQLQuerySuite.scala    | 18 ++
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NormalizeFloatingNumbers.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NormalizeFloatingNumbers.scala
index 5f94af5..4373820 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NormalizeFloatingNumbers.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NormalizeFloatingNumbers.scala
@@ -56,10 +56,6 @@ import org.apache.spark.sql.types._
 object NormalizeFloatingNumbers extends Rule[LogicalPlan] {

   def apply(plan: LogicalPlan): LogicalPlan = plan match {
-    // A subquery will be rewritten into join later, and will go through this rule
-    // eventually. Here we skip subquery, as we only need to run this rule once.
-    case _: Subquery => plan
-
     case _ => plan transform {
       case w: Window if w.partitionSpec.exists(p => needNormalize(p)) =>
         // Although the `windowExpressions` may refer to `partitionSpec` expressions, we don't need
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
index a23e583..093f2db 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
@@ -3449,6 +3449,24 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
     checkAnswer(sql("select CAST(-32768 as short) DIV CAST (-1 as short)"),
       Seq(Row(Short.MinValue.toLong * -1)))
   }
+
+  test("normalize special floating numbers in subquery") {
+    withTempView("v1", "v2", "v3") {
+      Seq(-0.0).toDF("d").createTempView("v1")
+      Seq(0.0).toDF("d").createTempView("v2")
+      spark.range(2).createTempView("v3")
+
+      // non-correlated subquery
+      checkAnswer(sql("SELECT (SELECT v1.d FROM v1 JOIN v2 ON v1.d = v2.d)"), Row(-0.0))
+      // correlated subquery
+      checkAnswer(
+        sql(
+          """
+            |SELECT id FROM v3 WHERE EXISTS
+            |  (SELECT v1.d FROM v1 JOIN v2 ON v1.d = v2.d WHERE id > 0)
+            |""".stripMargin), Row(1))
+    }
+  }
 }

 case class Foo(bar: Option[String])

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
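The failure mode the new test guards against is easiest to see standalone. A minimal sketch under stated assumptions (a local SparkSession built here for illustration; the view names `v1`/`v2` follow the test above, and nothing in this snippet is part of the patch itself):

```scala
import org.apache.spark.sql.SparkSession

object NormalizeSubquerySketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("normalize-subquery-sketch")
      .getOrCreate()
    import spark.implicits._

    // -0.0 and 0.0 are equal under SQL comparison but have different bit
    // patterns, so an equi-join on the raw doubles can miss the match
    // unless the join keys are normalized first.
    Seq(-0.0).toDF("d").createTempView("v1")
    Seq(0.0).toDF("d").createTempView("v2")

    // The join sits inside a non-correlated scalar subquery, which is exactly
    // the plan shape the removed `case _: Subquery => plan` short-circuit
    // used to skip (non-correlated subqueries are never rewritten to joins).
    val res = spark.sql("SELECT (SELECT v1.d FROM v1 JOIN v2 ON v1.d = v2.d)")

    // With the fix, the subquery is optimized by OptimizeSubqueries, the keys
    // are normalized, and the single matching row comes back. The primitive
    // comparison -0.0 == 0.0 holds, so this assertion accepts either zero.
    assert(res.collect().head.getDouble(0) == 0.0)

    spark.stop()
  }
}
```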
[spark] branch master updated (56d4f27 -> 6fb9c80)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 56d4f27  [SPARK-31966][ML][TESTS][PYTHON] Increase the timeout for StreamingLogisticRegressionWithSGDTests.test_training_and_prediction
     add 6fb9c80  [SPARK-31958][SQL] normalize special floating numbers in subquery

No new revisions were added by this update.

Summary of changes:
 .../catalyst/optimizer/NormalizeFloatingNumbers.scala |  4
 .../scala/org/apache/spark/sql/SQLQuerySuite.scala    | 18 ++
 2 files changed, 18 insertions(+), 4 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org