[spark] branch master updated (b87a342 -> 78f9043)
This is an automated email from the ASF dual-hosted git repository.

yumwang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from b87a342  [SPARK-31916][SQL] StringConcat can lead to StringIndexOutOfBoundsException
     add 78f9043  [SPARK-31912][SQL][TESTS] Normalize all binary comparison expressions

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/catalyst/plans/PlanTest.scala | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)
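The PlanTest change canonicalizes binary comparisons so that logically equivalent plans compare equal in tests. A sketch of the general idea, assuming spark-catalyst on the classpath — this is not the actual PlanTest code, just an illustration of the normalization:

```scala
import org.apache.spark.sql.catalyst.expressions.{EqualNullSafe, EqualTo, Expression}

// Order the children of a commutative comparison deterministically, so that
// `a = b` and `b = a` normalize to the same expression tree before two plans
// are compared for equality.
def normalizeComparisons(e: Expression): Expression = e.transform {
  case EqualTo(l, r) if l.hashCode() > r.hashCode() => EqualTo(r, l)
  case EqualNullSafe(l, r) if l.hashCode() > r.hashCode() => EqualNullSafe(r, l)
}
```

Ordering by hashCode is arbitrary but deterministic, which is all a test-only normalization needs.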
[spark] branch branch-3.0 updated: [SPARK-31939][SQL][TEST-JAVA11][3.0] Fix Parsing day of year when year field pattern is missing
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 265bf04  [SPARK-31939][SQL][TEST-JAVA11][3.0] Fix Parsing day of year when year field pattern is missing

265bf04 is described below

commit 265bf04d7afa12ea6312721003f834e9538c9101
Author: Kent Yao
AuthorDate: Fri Jun 12 11:52:05 2020 +0900

    [SPARK-31939][SQL][TEST-JAVA11][3.0] Fix Parsing day of year when year field pattern is missing

    ### What changes were proposed in this pull request?

    This PR brings back 4638402d4775acf0d7ff190ef5eabb6f2dc85af5, which was reverted by
    dad163f0f4a70b84f560de4f1f0bb89354e39217, and fixes the test failure for branch-3.0.

    Diffs made: re-generate SQL test results to match 3.0's schema; the RuntimeReplaceable
    expression has a new pretty schema in the master branch.

    ### Why are the changes needed?

    Bring back the reverted PR and fix its tests.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Passing Jenkins.

    Closes #28800 from yaooqinn/SPARK-31939-30.

    Authored-by: Kent Yao
    Signed-off-by: HyukjinKwon
---
 .../catalyst/util/DateTimeFormatterHelper.scala    |  26 ++++++++++++-----
 .../sql/catalyst/util/DateFormatterSuite.scala     |   2 +
 .../sql/catalyst/util/DatetimeFormatterSuite.scala |  78 ++++++++++++++++++
 .../catalyst/util/TimestampFormatterSuite.scala    |   2 +
 .../sql-tests/inputs/datetime-parsing-invalid.sql  |  20 ++++
 .../sql-tests/inputs/datetime-parsing-legacy.sql   |   2 +
 .../sql-tests/inputs/datetime-parsing.sql          |  16 +++
 .../results/datetime-parsing-invalid.sql.out       | 110 +++++++++++++++++++++
 .../results/datetime-parsing-legacy.sql.out        | 106 ++++++++++++++++++++
 .../sql-tests/results/datetime-parsing.sql.out     | 106 ++++++++++++++++++++
 10 files changed, 465 insertions(+), 3 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala
index 992a2b1..5de06af 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala
@@ -39,6 +39,18 @@ trait DateTimeFormatterHelper {
     }
   }
 
+  private def verifyLocalDate(
+      accessor: TemporalAccessor, field: ChronoField, candidate: LocalDate): Unit = {
+    if (accessor.isSupported(field)) {
+      val actual = accessor.get(field)
+      val expected = candidate.get(field)
+      if (actual != expected) {
+        throw new DateTimeException(s"Conflict found: Field $field $actual differs from" +
+          s" $field $expected derived from $candidate")
+      }
+    }
+  }
+
   protected def toLocalDate(accessor: TemporalAccessor): LocalDate = {
     val localDate = accessor.query(TemporalQueries.localDate())
     // If all the date fields are specified, return the local date directly.
@@ -48,9 +60,17 @@ trait DateTimeFormatterHelper {
     // later, and we should provide default values for missing fields.
     // To be compatible with Spark 2.4, we pick 1970 as the default value of year.
     val year = getOrDefault(accessor, ChronoField.YEAR, 1970)
-    val month = getOrDefault(accessor, ChronoField.MONTH_OF_YEAR, 1)
-    val day = getOrDefault(accessor, ChronoField.DAY_OF_MONTH, 1)
-    LocalDate.of(year, month, day)
+    if (accessor.isSupported(ChronoField.DAY_OF_YEAR)) {
+      val dayOfYear = accessor.get(ChronoField.DAY_OF_YEAR)
+      val date = LocalDate.ofYearDay(year, dayOfYear)
+      verifyLocalDate(accessor, ChronoField.MONTH_OF_YEAR, date)
+      verifyLocalDate(accessor, ChronoField.DAY_OF_MONTH, date)
+      date
+    } else {
+      val month = getOrDefault(accessor, ChronoField.MONTH_OF_YEAR, 1)
+      val day = getOrDefault(accessor, ChronoField.DAY_OF_MONTH, 1)
+      LocalDate.of(year, month, day)
+    }
   }
 
   private def toLocalTime(accessor: TemporalAccessor): LocalTime = {

diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateFormatterSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateFormatterSuite.scala
index 4892dea..0a29d94 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateFormatterSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateFormatterSuite.scala
@@ -31,6 +31,8 @@ class DateFormatterSuite extends DatetimeFormatterSuite {
     DateFormatter(pattern, UTC, isParsing)
   }
 
+  override protected def useDateFormatter: Boolean = true
+
   test("parsing dates") {
     outstandingTimezonesIds.foreach { timeZone =>
       withSQLConf(SQLConf.SESSION_LOCAL_TIMEZONE.key
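The behavioral change in toLocalDate is easiest to see outside Spark. Below is a minimal standalone sketch of the new fallback, assuming only java.time; the constant 1970 mirrors the default year in the patch above:

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatterBuilder
import java.time.temporal.{ChronoField, TemporalAccessor}

object DayOfYearParsingSketch {
  def main(args: Array[String]): Unit = {
    // Parse a pattern that has day-of-year but no year field.
    val fmt = new DateTimeFormatterBuilder().appendPattern("DDD").toFormatter
    val accessor: TemporalAccessor = fmt.parse("100")

    // Default the year to 1970, as the patched helper does for Spark 2.4 compatibility.
    val year =
      if (accessor.isSupported(ChronoField.YEAR)) accessor.get(ChronoField.YEAR) else 1970

    // New behavior: when DAY_OF_YEAR is present, resolve through ofYearDay
    // instead of falling back to month=1, day=1.
    val date =
      if (accessor.isSupported(ChronoField.DAY_OF_YEAR)) {
        LocalDate.ofYearDay(year, accessor.get(ChronoField.DAY_OF_YEAR))
      } else {
        LocalDate.of(year, 1, 1)
      }

    println(date) // 1970-04-10
  }
}
```

Before the fix, the parsed DAY_OF_YEAR field was simply ignored and the date silently resolved to 1970-01-01; the patch resolves it via LocalDate.ofYearDay and cross-checks any explicit month or day-of-month fields against the result, raising a DateTimeException on conflict.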
[spark] branch branch-3.0 updated: [SPARK-31916][SQL] StringConcat can lead to StringIndexOutOfBoundsException
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new f61b31a  [SPARK-31916][SQL] StringConcat can lead to StringIndexOutOfBoundsException

f61b31a is described below

commit f61b31a5a484c7e90920ec36c456594ce92cdf73
Author: Dilip Biswal
AuthorDate: Fri Jun 12 09:19:29 2020 +0900

    [SPARK-31916][SQL] StringConcat can lead to StringIndexOutOfBoundsException

    ### What changes were proposed in this pull request?
    A minor fix to the append method of StringConcat: cap the length at
    MAX_ROUNDED_ARRAY_LENGTH so it does not overflow and cause a
    StringIndexOutOfBoundsException.

    Thanks to **Jeffrey Stokes** for reporting the issue and explaining the
    underlying problem in detail in the JIRA.

    ### Why are the changes needed?
    This fixes a StringIndexOutOfBoundsException on overflow.

    ### Does this PR introduce any user-facing change?
    No.

    ### How was this patch tested?
    Added a test in StringUtilsSuite.

    Closes #28750 from dilipbiswal/SPARK-31916.

    Authored-by: Dilip Biswal
    Signed-off-by: Takeshi Yamamuro
    (cherry picked from commit b87a342c7dd51046fcbe323db640c825646fb8d4)
    Signed-off-by: Takeshi Yamamuro
---
 .../spark/sql/catalyst/util/StringUtils.scala      |  6 +++++-
 .../spark/sql/catalyst/util/StringUtilsSuite.scala | 32 ++++++++++++++++++++--
 2 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala
index b42ae4e..2a416d6 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala
@@ -123,7 +123,11 @@ object StringUtils extends Logging {
       val stringToAppend = if (available >= sLen) s else s.substring(0, available)
       strings.append(stringToAppend)
     }
-    length += sLen
+
+    // Keeps the total length of appended strings. Note that we need to cap the length at
+    // `ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH`; otherwise, we will overflow
+    // length causing StringIndexOutOfBoundsException in the substring call above.
+    length = Math.min(length.toLong + sLen, ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH).toInt
   }
 }

diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/StringUtilsSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/StringUtilsSuite.scala
index 67bc4bc..c68e89fc 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/StringUtilsSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/StringUtilsSuite.scala
@@ -18,9 +18,11 @@
 package org.apache.spark.sql.catalyst.util
 
 import org.apache.spark.SparkFunSuite
+import org.apache.spark.sql.catalyst.plans.SQLHelper
 import org.apache.spark.sql.catalyst.util.StringUtils._
+import org.apache.spark.sql.internal.SQLConf
 
-class StringUtilsSuite extends SparkFunSuite {
+class StringUtilsSuite extends SparkFunSuite with SQLHelper {
 
   test("escapeLikeRegex") {
     val expectedEscapedStrOne = "(?s)\\Qa\\E\\Qb\\E\\Qd\\E\\Qe\\E\\Qf\\E"
@@ -98,4 +100,32 @@ class StringUtilsSuite extends SparkFunSuite {
     assert(checkLimit("1234567"))
     assert(checkLimit("1234567890"))
   }
+
+  test("SPARK-31916: StringConcat doesn't overflow on many inputs") {
+    val concat = new StringConcat(maxLength = 100)
+    val stringToAppend = "Test internal index of StringConcat does not overflow with many " +
+      "append calls"
+    0.to((Integer.MAX_VALUE / stringToAppend.length) + 1).foreach { _ =>
+      concat.append(stringToAppend)
+    }
+    assert(concat.toString.length === 100)
+  }
+
+  test("SPARK-31916: verify that PlanStringConcat's output shows the actual length of the plan") {
+    withSQLConf(SQLConf.MAX_PLAN_STRING_LENGTH.key -> "0") {
+      val concat = new PlanStringConcat()
+      0.to(3).foreach { i =>
+        concat.append(s"plan fragment $i")
+      }
+      assert(concat.toString === "Truncated plan of 60 characters")
+    }
+
+    withSQLConf(SQLConf.MAX_PLAN_STRING_LENGTH.key -> "60") {
+      val concat = new PlanStringConcat()
+      0.to(2).foreach { i =>
+        concat.append(s"plan fragment $i")
+      }
+      assert(concat.toString === "plan fragment 0plan fragment 1... 15 more characters")
+    }
+  }
 }
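Why the cap matters: the old bookkeeping added an Int to an Int on every call, so after enough appends the counter wrapped negative and the `substring(0, available)` call above received a bogus bound. A minimal standalone sketch of the wrap-around and the fix; the cap value Int.MaxValue - 15 is the value of Spark's ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH, inlined here so the snippet runs without Spark:

```scala
object StringConcatOverflowSketch {
  def main(args: Array[String]): Unit = {
    val sLen = 100

    // Pre-fix bookkeeping: the Int counter keeps growing even after appends
    // start being truncated, so it eventually wraps around to a negative value.
    var length: Int = Int.MaxValue - 10
    length += sLen
    println(length) // -2147483559: Int overflow

    // Post-fix bookkeeping: widen to Long before adding, then cap.
    val MaxRoundedArrayLength = Int.MaxValue - 15 // ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH
    var capped: Int = Int.MaxValue - 10
    capped = math.min(capped.toLong + sLen, MaxRoundedArrayLength.toLong).toInt
    println(capped) // 2147483632: pinned at the cap, never negative
  }
}
```

The widening-to-Long trick is the whole fix: the addition can no longer overflow, and min() keeps the stored Int within the legal array-length range.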
[spark] branch master updated (88a4e55 -> b87a342)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 88a4e55  [SPARK-31765][WEBUI][TEST-MAVEN] Upgrade HtmlUnit >= 2.37.0
     add b87a342  [SPARK-31916][SQL] StringConcat can lead to StringIndexOutOfBoundsException

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/util/StringUtils.scala      |  6 +++++-
 .../spark/sql/catalyst/util/StringUtilsSuite.scala | 32 ++++++++++++++++++++--
 2 files changed, 36 insertions(+), 2 deletions(-)
[spark] branch master updated (b1adc3d -> 88a4e55)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from b1adc3d  [SPARK-21117][SQL] Built-in SQL Function Support - WIDTH_BUCKET
     add 88a4e55  [SPARK-31765][WEBUI][TEST-MAVEN] Upgrade HtmlUnit >= 2.37.0

No new revisions were added by this update.

Summary of changes:
 core/pom.xml                                             |  2 +-
 core/src/main/scala/org/apache/spark/ui/JettyUtils.scala |  7 ++++++-
 .../test/scala/org/apache/spark/ui/UISeleniumSuite.scala |  2 +-
 pom.xml                                                  | 14 +++++++++---
 sql/core/pom.xml                                         |  2 +-
 sql/hive-thriftserver/pom.xml                            |  2 +-
 streaming/pom.xml                                        |  2 +-
 7 files changed, 20 insertions(+), 11 deletions(-)
[spark] branch branch-3.0 updated: [SPARK-31935][SQL][3.0] Hadoop file system config should be effective in data source options
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new fe298c3  [SPARK-31935][SQL][3.0] Hadoop file system config should be effective in data source options

fe298c3 is described below

commit fe298c34e11823cb1371db1bff425ce8874fface
Author: Gengliang Wang
AuthorDate: Thu Jun 11 14:18:19 2020 -0700

    [SPARK-31935][SQL][3.0] Hadoop file system config should be effective in data source options

    ### What changes were proposed in this pull request?

    Make Hadoop file system config effective in data source options.

    From `org.apache.hadoop.fs.FileSystem.java`:
    ```
    public static FileSystem get(URI uri, Configuration conf) throws IOException {
      String scheme = uri.getScheme();
      String authority = uri.getAuthority();
      if (scheme == null && authority == null) {     // use default FS
        return get(conf);
      }
      if (scheme != null && authority == null) {     // no authority
        URI defaultUri = getDefaultUri(conf);
        if (scheme.equals(defaultUri.getScheme())    // if scheme matches default
            && defaultUri.getAuthority() != null) {  // & default has authority
          return get(defaultUri, conf);              // return default
        }
      }
      String disableCacheName = String.format("fs.%s.impl.disable.cache", scheme);
      if (conf.getBoolean(disableCacheName, false)) {
        return createFileSystem(uri, conf);
      }
      return CACHE.get(uri, conf);
    }
    ```

    Before this change, the file system configurations in data source options were not
    propagated in `DataSource.scala`. After it, we can specify authority- and
    URI-scheme-related configurations for scanning file systems.

    This problem only exists in data source V1. In V2, we already use
    `sparkSession.sessionState.newHadoopConfWithOptions(options)` in `FileTable`.

    ### Why are the changes needed?

    Allow users to specify authority- and URI-scheme-related Hadoop configurations for
    file source reading.

    ### Does this PR introduce _any_ user-facing change?

    Yes, file-system-related Hadoop configuration in data source options will be
    effective on reading.

    ### How was this patch tested?

    Unit test.

    Closes #28776 from gengliangwang/SPARK-31935-3.0.

    Lead-authored-by: Gengliang Wang
    Co-authored-by: Dongjoon Hyun
    Signed-off-by: Dongjoon Hyun
---
 .../spark/sql/execution/datasources/DataSource.scala | 13 ++++++++---
 .../apache/spark/sql/FileBasedDataSourceSuite.scala  | 20 +++++++++++++++++
 .../spark/sql/streaming/FileStreamSourceSuite.scala  | 12 ++++++++++
 3 files changed, 39 insertions(+), 6 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
index 3615afc..588a9b4 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
@@ -109,6 +109,9 @@ case class DataSource(
 
   private def providingInstance() = providingClass.getConstructor().newInstance()
 
+  private def newHadoopConfiguration(): Configuration =
+    sparkSession.sessionState.newHadoopConfWithOptions(options)
+
   lazy val sourceInfo: SourceInfo = sourceSchema()
   private val caseInsensitiveOptions = CaseInsensitiveMap(options)
   private val equality = sparkSession.sessionState.conf.resolver
@@ -230,7 +233,7 @@ case class DataSource(
         // once the streaming job starts and some upstream source starts dropping data.
         val hdfsPath = new Path(path)
         if (!SparkHadoopUtil.get.isGlobPath(hdfsPath)) {
-          val fs = hdfsPath.getFileSystem(sparkSession.sessionState.newHadoopConf())
+          val fs = hdfsPath.getFileSystem(newHadoopConfiguration())
          if (!fs.exists(hdfsPath)) {
             throw new AnalysisException(s"Path does not exist: $path")
           }
@@ -357,7 +360,7 @@ case class DataSource(
       case (format: FileFormat, _)
           if FileStreamSink.hasMetadata(
             caseInsensitiveOptions.get("path").toSeq ++ paths,
-            sparkSession.sessionState.newHadoopConf(),
+            newHadoopConfiguration(),
             sparkSession.sessionState.conf) =>
         val basePath = new Path((caseInsensitiveOptions.get("path").toSeq ++ paths).head)
         val fileCatalog = new MetadataLogFileIndex(sparkSession, basePath,
@@ -449,7 +452,7 @@ case class DataSource(
     val allPaths = paths ++ caseInsensitiveOptions.get("path")
     val outputPath = if (allPaths.length == 1) {
       val path = new
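With this in place, per-read Hadoop settings can travel through DataFrameReader options instead of the session-wide Hadoop configuration. A usage sketch under stated assumptions: the bucket name and credential keys below are placeholders, and the fs.s3a.* settings only take effect with the hadoop-aws module on the classpath:

```scala
import org.apache.spark.sql.SparkSession

object PerReadHadoopConfSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("per-read-hadoop-conf")
      .getOrCreate()

    // Hadoop file-system settings passed as data source options now reach the
    // FileSystem lookup inside DataSource (V1), matching V2's FileTable behavior,
    // so different reads can target differently configured file systems.
    val df = spark.read
      .option("fs.s3a.access.key", sys.env.getOrElse("AWS_ACCESS_KEY_ID", ""))
      .option("fs.s3a.secret.key", sys.env.getOrElse("AWS_SECRET_ACCESS_KEY", ""))
      .parquet("s3a://some-bucket/some/path") // placeholder path

    df.show()
  }
}
```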
[spark] branch branch-3.0 updated: [SPARK-31935][SQL][3.0] Hadoop file system config should be effective in data source options
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new fe298c3 [SPARK-31935][SQL][3.0] Hadoop file system config should be effective in data source options fe298c3 is described below commit fe298c34e11823cb1371db1bff425ce8874fface Author: Gengliang Wang AuthorDate: Thu Jun 11 14:18:19 2020 -0700 [SPARK-31935][SQL][3.0] Hadoop file system config should be effective in data source options ### What changes were proposed in this pull request? Mkae Hadoop file system config effective in data source options. From `org.apache.hadoop.fs.FileSystem.java`: ``` public static FileSystem get(URI uri, Configuration conf) throws IOException { String scheme = uri.getScheme(); String authority = uri.getAuthority(); if (scheme == null && authority == null) { // use default FS return get(conf); } if (scheme != null && authority == null) { // no authority URI defaultUri = getDefaultUri(conf); if (scheme.equals(defaultUri.getScheme())// if scheme matches default && defaultUri.getAuthority() != null) { // & default has authority return get(defaultUri, conf); // return default } } String disableCacheName = String.format("fs.%s.impl.disable.cache", scheme); if (conf.getBoolean(disableCacheName, false)) { return createFileSystem(uri, conf); } return CACHE.get(uri, conf); } ``` Before changes, the file system configurations in data source options are not propagated in `DataSource.scala`. After changes, we can specify authority and URI schema related configurations for scanning file systems. This problem only exists in data source V1. In V2, we already use `sparkSession.sessionState.newHadoopConfWithOptions(options)` in `FileTable`. ### Why are the changes needed? Allow users to specify authority and URI schema related Hadoop configurations for file source reading. ### Does this PR introduce _any_ user-facing change? Yes, the file system related Hadoop configuration in data source option will be effective on reading. ### How was this patch tested? Unit test Closes #28776 from gengliangwang/SPARK-31935-3.0. Lead-authored-by: Gengliang Wang Co-authored-by: Dongjoon Hyun Signed-off-by: Dongjoon Hyun --- .../spark/sql/execution/datasources/DataSource.scala | 13 +++-- .../apache/spark/sql/FileBasedDataSourceSuite.scala | 20 .../spark/sql/streaming/FileStreamSourceSuite.scala | 12 3 files changed, 39 insertions(+), 6 deletions(-) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala index 3615afc..588a9b4 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala @@ -109,6 +109,9 @@ case class DataSource( private def providingInstance() = providingClass.getConstructor().newInstance() + private def newHadoopConfiguration(): Configuration = +sparkSession.sessionState.newHadoopConfWithOptions(options) + lazy val sourceInfo: SourceInfo = sourceSchema() private val caseInsensitiveOptions = CaseInsensitiveMap(options) private val equality = sparkSession.sessionState.conf.resolver @@ -230,7 +233,7 @@ case class DataSource( // once the streaming job starts and some upstream source starts dropping data. 
val hdfsPath = new Path(path) if (!SparkHadoopUtil.get.isGlobPath(hdfsPath)) { - val fs = hdfsPath.getFileSystem(sparkSession.sessionState.newHadoopConf()) + val fs = hdfsPath.getFileSystem(newHadoopConfiguration()) if (!fs.exists(hdfsPath)) { throw new AnalysisException(s"Path does not exist: $path") } @@ -357,7 +360,7 @@ case class DataSource( case (format: FileFormat, _) if FileStreamSink.hasMetadata( caseInsensitiveOptions.get("path").toSeq ++ paths, -sparkSession.sessionState.newHadoopConf(), +newHadoopConfiguration(), sparkSession.sessionState.conf) => val basePath = new Path((caseInsensitiveOptions.get("path").toSeq ++ paths).head) val fileCatalog = new MetadataLogFileIndex(sparkSession, basePath, @@ -449,7 +452,7 @@ case class DataSource( val allPaths = paths ++ caseInsensitiveOptions.get("path") val outputPath = if (allPaths.length == 1) { val path = new
[spark] branch master updated (11d3a74 -> b1adc3d)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 11d3a74  [SPARK-31705][SQL] Push more possible predicates through Join via CNF conversion
     add b1adc3d  [SPARK-21117][SQL] Built-in SQL Function Support - WIDTH_BUCKET

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/analysis/FunctionRegistry.scala   |   1 +
 .../sql/catalyst/expressions/mathExpressions.scala |  96
 .../sql-functions/sql-expression-schema.md         |   3 +-
 .../test/resources/sql-tests/inputs/operators.sql  |  14 ++
 .../sql-tests/inputs/postgreSQL/numeric.sql        |  76 +
 .../resources/sql-tests/results/operators.sql.out  |  98 +++-
 .../sql-tests/results/postgreSQL/numeric.sql.out   | 173 -
 7 files changed, 431 insertions(+), 30 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-21117][SQL] Built-in SQL Function Support - WIDTH_BUCKET
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new b1adc3d  [SPARK-21117][SQL] Built-in SQL Function Support - WIDTH_BUCKET
b1adc3d is described below

commit b1adc3deee00058cba669534aee156dc7af243dc
Author: Takeshi Yamamuro
AuthorDate: Thu Jun 11 14:15:28 2020 -0700

    [SPARK-21117][SQL] Built-in SQL Function Support - WIDTH_BUCKET

    ### What changes were proposed in this pull request?

    This PR intends to add a built-in SQL function - `WIDTH_BUCKET`. It is the rework of #18323. Closes #18323

    The other RDBMS references for `WIDTH_BUCKET`:
    - Oracle: https://docs.oracle.com/cd/B28359_01/olap.111/b28126/dml_functions_2137.htm#OLADM717
    - PostgreSQL: https://www.postgresql.org/docs/current/functions-math.html
    - Snowflake: https://docs.snowflake.com/en/sql-reference/functions/width_bucket.html
    - Prestodb: https://prestodb.io/docs/current/functions/math.html
    - Teradata: https://docs.teradata.com/reader/kmuOwjp1zEYg98JsB8fu_A/Wa8vw69cGzoRyNULHZeudg
    - DB2: https://www.ibm.com/support/producthub/db2/docs/content/SSEPGG_11.5.0/com.ibm.db2.luw.sql.ref.doc/doc/r0061483.html?pos=2

    ### Why are the changes needed?

    For better usability.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Added unit tests.

    Closes #28764 from maropu/SPARK-21117.

    Lead-authored-by: Takeshi Yamamuro
    Co-authored-by: Yuming Wang
    Signed-off-by: Dongjoon Hyun
---
 .../sql/catalyst/analysis/FunctionRegistry.scala   |   1 +
 .../sql/catalyst/expressions/mathExpressions.scala |  96
 .../sql-functions/sql-expression-schema.md         |   3 +-
 .../test/resources/sql-tests/inputs/operators.sql  |  14 ++
 .../sql-tests/inputs/postgreSQL/numeric.sql        |  76 +
 .../resources/sql-tests/results/operators.sql.out  |  98 +++-
 .../sql-tests/results/postgreSQL/numeric.sql.out   | 173 -
 7 files changed, 431 insertions(+), 30 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
index e2559d4..3989df5 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
@@ -274,6 +274,7 @@ object FunctionRegistry {
     expression[Tan]("tan"),
     expression[Cot]("cot"),
     expression[Tanh]("tanh"),
+    expression[WidthBucket]("width_bucket"),

     expression[Add]("+"),
     expression[Subtract]("-"),
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala
index fe8ea2a..5c76495 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala
@@ -1325,3 +1325,99 @@ case class BRound(child: Expression, scale: Expression)
   with Serializable with ImplicitCastInputTypes {
   def this(child: Expression) = this(child, Literal(0))
 }
+
+object WidthBucket {
+
+  def computeBucketNumber(value: Double, min: Double, max: Double, numBucket: Long): jl.Long = {
+    if (numBucket <= 0 || numBucket == Long.MaxValue || jl.Double.isNaN(value) || min == max ||
+        jl.Double.isNaN(min) || jl.Double.isInfinite(min) ||
+        jl.Double.isNaN(max) || jl.Double.isInfinite(max)) {
+      return null
+    }
+
+    val lower = Math.min(min, max)
+    val upper = Math.max(min, max)
+
+    if (min < max) {
+      if (value < lower) {
+        0L
+      } else if (value >= upper) {
+        numBucket + 1L
+      } else {
+        (numBucket.toDouble * (value - lower) / (upper - lower)).toLong + 1L
+      }
+    } else { // `min > max` case
+      if (value > upper) {
+        0L
+      } else if (value <= lower) {
+        numBucket + 1L
+      } else {
+        (numBucket.toDouble * (upper - value) / (upper - lower)).toLong + 1L
+      }
+    }
+  }
+}
+
+/**
+ * Returns the bucket number into which the value of this expression would fall
+ * after being evaluated. Note that input arguments must follow conditions listed below;
+ * otherwise, the method will return null.
+ *  - `numBucket` must be greater than zero and be less than Long.MaxValue
+ *  - `value`, `min`, and `max` cannot be NaN
+ *  - `min` bound cannot equal `max`
+ *  - `min` and `max` must be finite
+ *
+ * Note: If `minValue` > `maxValue`, a return value is as follows;
+ *  if `value` >
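A quick, hedged illustration of the semantics documented above, worked by hand from `computeBucketNumber` (assuming a build that contains b1adc3d and an existing `SparkSession` named `spark`):

```scala
// width_bucket(value, min, max, numBucket) splits [min, max) into numBucket
// equal-width buckets; bucket 0 and numBucket + 1 catch under- and overflow.
spark.sql("SELECT width_bucket(5.3, 0.2, 10.6, 5)").show()
// expected 3: (5 * (5.3 - 0.2) / (10.6 - 0.2)).toLong + 1 = 2 + 1 = 3
spark.sql("SELECT width_bucket(CAST('NaN' AS DOUBLE), 0.2, 10.6, 5)").show()
// expected NULL: NaN inputs are rejected by computeBucketNumber above
```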
[spark] branch master updated (91cd06b -> 11d3a74)
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 91cd06b  [SPARK-8981][CORE][FOLLOW-UP] Clean up MDC properties after running a task
     add 11d3a74  [SPARK-31705][SQL] Push more possible predicates through Join via CNF conversion

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/expressions/predicates.scala      |  96 +++-
 .../spark/sql/catalyst/optimizer/Optimizer.scala   |   9 +-
 .../optimizer/PushCNFPredicateThroughJoin.scala    |  62
 .../org/apache/spark/sql/internal/SQLConf.scala    |  15 ++
 .../ConjunctiveNormalFormPredicateSuite.scala      | 128
 .../catalyst/optimizer/FilterPushdownSuite.scala   | 162 -
 6 files changed, 468 insertions(+), 4 deletions(-)
 create mode 100644 sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/PushCNFPredicateThroughJoin.scala
 create mode 100644 sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ConjunctiveNormalFormPredicateSuite.scala

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
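For context, a hedged sketch of the kind of query the new rule helps (`t1`, `t2`, and their columns are made-up names; `spark` is an existing SparkSession):

```scala
// The disjunctive filter below cannot be pushed through the join as a whole:
//   (t1.a = 1 AND t2.b = 2) OR (t1.a = 3 AND t2.b = 4)
// Converted to conjunctive normal form it yields, among others, the conjunct
//   (t1.a = 1 OR t1.a = 3)
// which references only t1 and can therefore be pushed below the join by
// PushCNFPredicateThroughJoin, pruning t1 rows before the join runs.
spark.sql(
  """SELECT *
    |FROM t1 JOIN t2 ON t1.id = t2.id
    |WHERE (t1.a = 1 AND t2.b = 2) OR (t1.a = 3 AND t2.b = 4)
    |""".stripMargin).explain(true)
```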
[spark] branch master updated (912d45d -> 91cd06b)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 912d45d  [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite
     add 91cd06b  [SPARK-8981][CORE][FOLLOW-UP] Clean up MDC properties after running a task

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/executor/Executor.scala | 12
 1 file changed, 4 insertions(+), 8 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
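The follow-up makes Executor remove a task's MDC entries once the task finishes, so they cannot leak into log lines of the next task on the same thread. A hedged sketch of that pattern (the helper and the MDC key below are illustrative, not Spark's actual API):

```scala
import org.slf4j.MDC

// Illustrative helper: install MDC entries for the duration of a task body
// and always remove them afterwards, mirroring the set-up/clean-up pairing
// the follow-up enforces in Executor.scala.
def withTaskMdc[T](props: Map[String, String])(body: => T): T = {
  props.foreach { case (k, v) => MDC.put(k, v) }
  try body
  finally props.keys.foreach(MDC.remove) // clean up MDC after running the task
}

// Usage: log lines emitted inside the block carry the task's MDC context.
withTaskMdc(Map("taskName" -> "task 0.0 in stage 0.0")) {
  // run the task body here
}
```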
[spark] branch branch-3.0 updated (250c6fd -> dad163f)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 250c6fd  [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite
     add dad163f  Revert "[SPARK-31939][SQL][TEST-JAVA11] Fix Parsing day of year when year field pattern is missing"

No new revisions were added by this update.

Summary of changes:
 .../catalyst/util/DateTimeFormatterHelper.scala    |  26 +
 .../sql/catalyst/util/DateFormatterSuite.scala     |   2 -
 .../sql/catalyst/util/DatetimeFormatterSuite.scala |  78 ---
 .../catalyst/util/TimestampFormatterSuite.scala    |   2 -
 .../sql-tests/inputs/datetime-parsing-invalid.sql  |  20
 .../sql-tests/inputs/datetime-parsing-legacy.sql   |   2 -
 .../sql-tests/inputs/datetime-parsing.sql          |  16 ---
 .../results/datetime-parsing-invalid.sql.out       | 110 -
 .../results/datetime-parsing-legacy.sql.out        | 106
 .../sql-tests/results/datetime-parsing.sql.out     | 106
 10 files changed, 3 insertions(+), 465 deletions(-)
 delete mode 100644 sql/core/src/test/resources/sql-tests/inputs/datetime-parsing-invalid.sql
 delete mode 100644 sql/core/src/test/resources/sql-tests/inputs/datetime-parsing-legacy.sql
 delete mode 100644 sql/core/src/test/resources/sql-tests/inputs/datetime-parsing.sql
 delete mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-parsing-invalid.sql.out
 delete mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-parsing-legacy.sql.out
 delete mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-parsing.sql.out

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: Revert "[SPARK-31939][SQL][TEST-JAVA11] Fix Parsing day of year when year field pattern is missing"
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new dad163f  Revert "[SPARK-31939][SQL][TEST-JAVA11] Fix Parsing day of year when year field pattern is missing"
dad163f is described below

commit dad163f0f4a70b84f560de4f1f0bb89354e39217
Author: HyukjinKwon
AuthorDate: Thu Jun 11 22:59:43 2020 +0900

    Revert "[SPARK-31939][SQL][TEST-JAVA11] Fix Parsing day of year when year field pattern is missing"

    This reverts commit 4638402d4775acf0d7ff190ef5eabb6f2dc85af5.
---
 .../catalyst/util/DateTimeFormatterHelper.scala    |  26 +
 .../sql/catalyst/util/DateFormatterSuite.scala     |   2 -
 .../sql/catalyst/util/DatetimeFormatterSuite.scala |  78 ---
 .../catalyst/util/TimestampFormatterSuite.scala    |   2 -
 .../sql-tests/inputs/datetime-parsing-invalid.sql  |  20
 .../sql-tests/inputs/datetime-parsing-legacy.sql   |   2 -
 .../sql-tests/inputs/datetime-parsing.sql          |  16 ---
 .../results/datetime-parsing-invalid.sql.out       | 110 -
 .../results/datetime-parsing-legacy.sql.out        | 106
 .../sql-tests/results/datetime-parsing.sql.out     | 106
 10 files changed, 3 insertions(+), 465 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala
index 5de06af..992a2b1 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala
@@ -39,18 +39,6 @@ trait DateTimeFormatterHelper {
     }
   }

-  private def verifyLocalDate(
-      accessor: TemporalAccessor, field: ChronoField, candidate: LocalDate): Unit = {
-    if (accessor.isSupported(field)) {
-      val actual = accessor.get(field)
-      val expected = candidate.get(field)
-      if (actual != expected) {
-        throw new DateTimeException(s"Conflict found: Field $field $actual differs from" +
-          s" $field $expected derived from $candidate")
-      }
-    }
-  }
-
   protected def toLocalDate(accessor: TemporalAccessor): LocalDate = {
     val localDate = accessor.query(TemporalQueries.localDate())
     // If all the date fields are specified, return the local date directly.
@@ -60,17 +48,9 @@ trait DateTimeFormatterHelper {
     // later, and we should provide default values for missing fields.
     // To be compatible with Spark 2.4, we pick 1970 as the default value of year.
     val year = getOrDefault(accessor, ChronoField.YEAR, 1970)
-    if (accessor.isSupported(ChronoField.DAY_OF_YEAR)) {
-      val dayOfYear = accessor.get(ChronoField.DAY_OF_YEAR)
-      val date = LocalDate.ofYearDay(year, dayOfYear)
-      verifyLocalDate(accessor, ChronoField.MONTH_OF_YEAR, date)
-      verifyLocalDate(accessor, ChronoField.DAY_OF_MONTH, date)
-      date
-    } else {
-      val month = getOrDefault(accessor, ChronoField.MONTH_OF_YEAR, 1)
-      val day = getOrDefault(accessor, ChronoField.DAY_OF_MONTH, 1)
-      LocalDate.of(year, month, day)
-    }
+    val month = getOrDefault(accessor, ChronoField.MONTH_OF_YEAR, 1)
+    val day = getOrDefault(accessor, ChronoField.DAY_OF_MONTH, 1)
+    LocalDate.of(year, month, day)
   }

   private def toLocalTime(accessor: TemporalAccessor): LocalTime = {
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateFormatterSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateFormatterSuite.scala
index 0a29d94..4892dea 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateFormatterSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateFormatterSuite.scala
@@ -31,8 +31,6 @@ class DateFormatterSuite extends DatetimeFormatterSuite {
     DateFormatter(pattern, UTC, isParsing)
   }

-  override protected def useDateFormatter: Boolean = true
-
   test("parsing dates") {
     outstandingTimezonesIds.foreach { timeZone =>
       withSQLConf(SQLConf.SESSION_LOCAL_TIMEZONE.key -> timeZone) {
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DatetimeFormatterSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DatetimeFormatterSuite.scala
index b78facd..31ff50f 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DatetimeFormatterSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DatetimeFormatterSuite.scala
@@ -17,61 +17,15 @@

 package org.apache.spark.sql.catalyst.util

-import java.time.DateTimeException
-
 import org.scalatest.Matchers

 import org.apache.spark.{SparkFunSuite,
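For reference, a hedged example of the behavior the reverted patch had introduced, reasoned from the `toLocalDate` diff above (day 102 of 1970 falls on April 12):

```scala
// With the patch, a day-of-year pattern and no year field defaulted the year
// to 1970 and resolved the date via LocalDate.ofYearDay; it also raised a
// DateTimeException when an explicit month/day conflicted with day-of-year.
// After this revert, branch-3.0 goes back to defaulting month and day to 1.
spark.sql("SELECT to_date('102', 'DDD')").show() // patched behavior: 1970-04-12
```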
[spark] branch branch-2.4 updated (53f1349 -> e51eb3a)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 53f1349  [SPARK-31941][CORE] Replace SparkException to NoSuchElementException for applicationInfo in AppStatusStore
     add e51eb3a  [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite

No new revisions were added by this update.

Summary of changes:
 ...tamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 |  1 -
 ...amp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} |  0
 ...amp cast #4-0-6d2da5cfada03605834e38bc4075bc79} |  0
 ...tamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 |  1 -
 .../spark/sql/hive/execution/HiveQuerySuite.scala  | 26 +-
 5 files changed, 10 insertions(+), 18 deletions(-)
 delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2
 rename sql/hive/src/test/resources/golden/{timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 => timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} (100%)
 rename sql/hive/src/test/resources/golden/{timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 => timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79} (100%)
 delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated (593b423 -> 250c6fd)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 593b423  [SPARK-31958][SQL] normalize special floating numbers in subquery
     add 250c6fd  [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite

No new revisions were added by this update.

Summary of changes:
 ...tamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 |  1 -
 ...amp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} |  0
 ...amp cast #4-0-6d2da5cfada03605834e38bc4075bc79} |  0
 ...tamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 |  1 -
 .../spark/sql/hive/execution/HiveQuerySuite.scala  | 26 +-
 5 files changed, 10 insertions(+), 18 deletions(-)
 delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2
 rename sql/hive/src/test/resources/golden/{timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 => timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} (100%)
 rename sql/hive/src/test/resources/golden/{timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 => timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79} (100%)
 delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 250c6fd  [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite
250c6fd is described below

commit 250c6fdecf169b2a6fa162af5034c3b67ea3f9bc
Author: GuoPhilipse <46367746+guophili...@users.noreply.github.com>
AuthorDate: Thu Jun 11 22:03:40 2020 +0900

    [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite

    ### What changes were proposed in this pull request?

    remove duplicate test cases

    ### Why are the changes needed?

    improve test quality

    ### Does this PR introduce _any_ user-facing change?

    NO

    ### How was this patch tested?

    No test

    Closes #28782 from GuoPhilipse/31954-delete-duplicate-testcase.

    Lead-authored-by: GuoPhilipse <46367746+guophili...@users.noreply.github.com>
    Co-authored-by: GuoPhilipse
    Signed-off-by: HyukjinKwon
    (cherry picked from commit 912d45df7c6535336f72c971c90fecd11cfe87e9)
    Signed-off-by: HyukjinKwon
---
 ...tamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 |  1 -
 ...amp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} |  0
 ...amp cast #4-0-6d2da5cfada03605834e38bc4075bc79} |  0
 ...tamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 |  1 -
 .../spark/sql/hive/execution/HiveQuerySuite.scala  | 26 +-
 5 files changed, 10 insertions(+), 18 deletions(-)

diff --git a/sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 b/sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2
deleted file mode 100644
index 5625e59..000
--- a/sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2
+++ /dev/null
@@ -1 +0,0 @@
-1.2
diff --git a/sql/hive/src/test/resources/golden/timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 b/sql/hive/src/test/resources/golden/timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2
similarity index 100%
rename from sql/hive/src/test/resources/golden/timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2
rename to sql/hive/src/test/resources/golden/timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2
diff --git a/sql/hive/src/test/resources/golden/timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 b/sql/hive/src/test/resources/golden/timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79
similarity index 100%
rename from sql/hive/src/test/resources/golden/timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79
rename to sql/hive/src/test/resources/golden/timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79
diff --git a/sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 b/sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79
deleted file mode 100644
index 1d94c8a..000
--- a/sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79
+++ /dev/null
@@ -1 +0,0 @@
--1.2
diff --git a/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala b/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala
index 63b985f..b10a8cb 100644
--- a/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala
+++ b/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala
@@ -556,33 +556,27 @@ class HiveQuerySuite extends HiveComparisonTest with SQLTestUtils with BeforeAnd
     assert(1 == res.getDouble(0))
   }

-  createQueryTest("timestamp cast #2",
-    "SELECT CAST(CAST(1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1")
-
-  test("timestamp cast #3") {
-    val res = sql("SELECT CAST(CAST(1200 AS TIMESTAMP) AS INT) FROM src LIMIT 1").collect().head
-    assert(1200 == res.getInt(0))
+  test("timestamp cast #2") {
+    val res = sql("SELECT CAST(CAST(-1 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1").collect().head
+    assert(-1 == res.get(0))
   }

-  createQueryTest("timestamp cast #4",
+  createQueryTest("timestamp cast #3",
     "SELECT CAST(CAST(1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1")

+  createQueryTest("timestamp cast #4",
+    "SELECT CAST(CAST(-1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1")
+
   test("timestamp cast #5") {
-    val res = sql("SELECT CAST(CAST(-1 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1").collect().head
-    assert(-1 == res.get(0))
+    val res = sql("SELECT CAST(CAST(1200 AS TIMESTAMP) AS INT) FROM src LIMIT 1").collect().head
+    assert(1200 == res.getInt(0))
   }

-  createQueryTest("timestamp cast #6",
-    "SELECT CAST(CAST(-1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1")
-
-  test("timestamp cast #7") {
+  test("timestamp cast #6") {
     val res = sql("SELECT CAST(CAST(-1200 AS TIMESTAMP) AS INT) FROM src LIMIT 1").collect().head
     assert(-1200 ==
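The surviving tests round-trip numeric values through TIMESTAMP. A hedged sketch of what the renumbered `timestamp cast #2` asserts, taken from the diff above (assuming the suite's `sql` helper and the Hive test table `src`):

```scala
// Casting a numeric value to TIMESTAMP and back to DOUBLE should recover the
// original value (seconds since the epoch, including the sign).
val res = sql("SELECT CAST(CAST(-1 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1").collect().head
assert(-1 == res.get(0)) // matches the assertion in the renumbered test
```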
[spark] branch master updated (6fb9c80 -> 912d45d)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 6fb9c80  [SPARK-31958][SQL] normalize special floating numbers in subquery
     add 912d45d  [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite

No new revisions were added by this update.

Summary of changes:
 ...tamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 |  1 -
 ...amp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} |  0
 ...amp cast #4-0-6d2da5cfada03605834e38bc4075bc79} |  0
 ...tamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 |  1 -
 .../spark/sql/hive/execution/HiveQuerySuite.scala  | 26 +-
 5 files changed, 10 insertions(+), 18 deletions(-)
 delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2
 rename sql/hive/src/test/resources/golden/{timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 => timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} (100%)
 rename sql/hive/src/test/resources/golden/{timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 => timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79} (100%)
 delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated (53f1349 -> e51eb3a)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git. from 53f1349 [SPARK-31941][CORE] Replace SparkException to NoSuchElementException for applicationInfo in AppStatusStore add e51eb3a [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite No new revisions were added by this update. Summary of changes: ...tamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 | 1 - ...amp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} | 0 ...amp cast #4-0-6d2da5cfada03605834e38bc4075bc79} | 0 ...tamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 | 1 - .../spark/sql/hive/execution/HiveQuerySuite.scala | 26 +- 5 files changed, 10 insertions(+), 18 deletions(-) delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 rename sql/hive/src/test/resources/golden/{timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 => timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} (100%) rename sql/hive/src/test/resources/golden/{timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 => timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79} (100%) delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 250c6fd [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite 250c6fd is described below commit 250c6fdecf169b2a6fa162af5034c3b67ea3f9bc Author: GuoPhilipse <46367746+guophili...@users.noreply.github.com> AuthorDate: Thu Jun 11 22:03:40 2020 +0900 [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite ### What changes were proposed in this pull request? remove duplicate test cases ### Why are the changes needed? improve test quality ### Does this PR introduce _any_ user-facing change? NO ### How was this patch tested? No test Closes #28782 from GuoPhilipse/31954-delete-duplicate-testcase. Lead-authored-by: GuoPhilipse <46367746+guophili...@users.noreply.github.com> Co-authored-by: GuoPhilipse Signed-off-by: HyukjinKwon (cherry picked from commit 912d45df7c6535336f72c971c90fecd11cfe87e9) Signed-off-by: HyukjinKwon --- ...tamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 | 1 - ...amp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} | 0 ...amp cast #4-0-6d2da5cfada03605834e38bc4075bc79} | 0 ...tamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 | 1 - .../spark/sql/hive/execution/HiveQuerySuite.scala | 26 +- 5 files changed, 10 insertions(+), 18 deletions(-) diff --git a/sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 b/sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 deleted file mode 100644 index 5625e59..000 --- a/sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 +++ /dev/null @@ -1 +0,0 @@ -1.2 diff --git a/sql/hive/src/test/resources/golden/timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 b/sql/hive/src/test/resources/golden/timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2 similarity index 100% rename from sql/hive/src/test/resources/golden/timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 rename to sql/hive/src/test/resources/golden/timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2 diff --git a/sql/hive/src/test/resources/golden/timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 b/sql/hive/src/test/resources/golden/timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79 similarity index 100% rename from sql/hive/src/test/resources/golden/timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 rename to sql/hive/src/test/resources/golden/timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79 diff --git a/sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 b/sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 deleted file mode 100644 index 1d94c8a..000 --- a/sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 +++ /dev/null @@ -1 +0,0 @@ --1.2 diff --git a/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala b/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala index 63b985f..b10a8cb 100644 --- a/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala +++ b/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala @@ -556,33 +556,27 @@ class HiveQuerySuite extends HiveComparisonTest with SQLTestUtils with BeforeAnd assert(1 == res.getDouble(0)) } - createQueryTest("timestamp cast #2", -"SELECT 
CAST(CAST(1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1") - - test("timestamp cast #3") { -val res = sql("SELECT CAST(CAST(1200 AS TIMESTAMP) AS INT) FROM src LIMIT 1").collect().head -assert(1200 == res.getInt(0)) + test("timestamp cast #2") { +val res = sql("SELECT CAST(CAST(-1 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1").collect().head +assert(-1 == res.get(0)) } - createQueryTest("timestamp cast #4", + createQueryTest("timestamp cast #3", "SELECT CAST(CAST(1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1") + createQueryTest("timestamp cast #4", +"SELECT CAST(CAST(-1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1") + test("timestamp cast #5") { -val res = sql("SELECT CAST(CAST(-1 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1").collect().head -assert(-1 == res.get(0)) +val res = sql("SELECT CAST(CAST(1200 AS TIMESTAMP) AS INT) FROM src LIMIT 1").collect().head +assert(1200 == res.getInt(0)) } - createQueryTest("timestamp cast #6", -"SELECT CAST(CAST(-1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1") - - test("timestamp cast #7") { + test("timestamp cast #6") { val res = sql("SELECT CAST(CAST(-1200 AS TIMESTAMP) AS INT) FROM src LIMIT 1").collect().head assert(-1200 ==
[spark] branch master updated (6fb9c80 -> 912d45d)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 6fb9c80 [SPARK-31958][SQL] normalize special floating numbers in subquery add 912d45d [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite No new revisions were added by this update. Summary of changes: ...tamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 | 1 - ...amp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} | 0 ...amp cast #4-0-6d2da5cfada03605834e38bc4075bc79} | 0 ...tamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 | 1 - .../spark/sql/hive/execution/HiveQuerySuite.scala | 26 +- 5 files changed, 10 insertions(+), 18 deletions(-) delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 rename sql/hive/src/test/resources/golden/{timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 => timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} (100%) rename sql/hive/src/test/resources/golden/{timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 => timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79} (100%) delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated (53f1349 -> e51eb3a)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git. from 53f1349 [SPARK-31941][CORE] Replace SparkException to NoSuchElementException for applicationInfo in AppStatusStore add e51eb3a [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite No new revisions were added by this update. Summary of changes: ...tamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 | 1 - ...amp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} | 0 ...amp cast #4-0-6d2da5cfada03605834e38bc4075bc79} | 0 ...tamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 | 1 - .../spark/sql/hive/execution/HiveQuerySuite.scala | 26 +- 5 files changed, 10 insertions(+), 18 deletions(-) delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 rename sql/hive/src/test/resources/golden/{timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 => timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} (100%) rename sql/hive/src/test/resources/golden/{timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 => timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79} (100%) delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 250c6fd  [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite
250c6fd is described below

commit 250c6fdecf169b2a6fa162af5034c3b67ea3f9bc
Author: GuoPhilipse <46367746+guophili...@users.noreply.github.com>
AuthorDate: Thu Jun 11 22:03:40 2020 +0900

    [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite

    ### What changes were proposed in this pull request?
    remove duplicate test cases

    ### Why are the changes needed?
    improve test quality

    ### Does this PR introduce _any_ user-facing change?
    NO

    ### How was this patch tested?
    No test

    Closes #28782 from GuoPhilipse/31954-delete-duplicate-testcase.

    Lead-authored-by: GuoPhilipse <46367746+guophili...@users.noreply.github.com>
    Co-authored-by: GuoPhilipse
    Signed-off-by: HyukjinKwon
    (cherry picked from commit 912d45df7c6535336f72c971c90fecd11cfe87e9)
    Signed-off-by: HyukjinKwon
---
 ...tamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 |  1 -
 ...amp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} |  0
 ...amp cast #4-0-6d2da5cfada03605834e38bc4075bc79} |  0
 ...tamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 |  1 -
 .../spark/sql/hive/execution/HiveQuerySuite.scala  | 26 +-
 5 files changed, 10 insertions(+), 18 deletions(-)

diff --git a/sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 b/sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2
deleted file mode 100644
index 5625e59..000
--- a/sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2
+++ /dev/null
@@ -1 +0,0 @@
-1.2
diff --git a/sql/hive/src/test/resources/golden/timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 b/sql/hive/src/test/resources/golden/timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2
similarity index 100%
rename from sql/hive/src/test/resources/golden/timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2
rename to sql/hive/src/test/resources/golden/timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2
diff --git a/sql/hive/src/test/resources/golden/timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 b/sql/hive/src/test/resources/golden/timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79
similarity index 100%
rename from sql/hive/src/test/resources/golden/timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79
rename to sql/hive/src/test/resources/golden/timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79
diff --git a/sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 b/sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79
deleted file mode 100644
index 1d94c8a..000
--- a/sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79
+++ /dev/null
@@ -1 +0,0 @@
--1.2
diff --git a/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala b/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala
index 63b985f..b10a8cb 100644
--- a/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala
+++ b/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala
@@ -556,33 +556,27 @@ class HiveQuerySuite extends HiveComparisonTest with SQLTestUtils with BeforeAnd
     assert(1 == res.getDouble(0))
   }

-  createQueryTest("timestamp cast #2",
-    "SELECT CAST(CAST(1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1")
-
-  test("timestamp cast #3") {
-    val res = sql("SELECT CAST(CAST(1200 AS TIMESTAMP) AS INT) FROM src LIMIT 1").collect().head
-    assert(1200 == res.getInt(0))
+  test("timestamp cast #2") {
+    val res = sql("SELECT CAST(CAST(-1 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1").collect().head
+    assert(-1 == res.get(0))
   }

-  createQueryTest("timestamp cast #4",
+  createQueryTest("timestamp cast #3",
     "SELECT CAST(CAST(1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1")

+  createQueryTest("timestamp cast #4",
+    "SELECT CAST(CAST(-1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1")
+
   test("timestamp cast #5") {
-    val res = sql("SELECT CAST(CAST(-1 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1").collect().head
-    assert(-1 == res.get(0))
+    val res = sql("SELECT CAST(CAST(1200 AS TIMESTAMP) AS INT) FROM src LIMIT 1").collect().head
+    assert(1200 == res.getInt(0))
   }

-  createQueryTest("timestamp cast #6",
-    "SELECT CAST(CAST(-1.2 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1")
-
-  test("timestamp cast #7") {
+  test("timestamp cast #6") {
     val res = sql("SELECT CAST(CAST(-1200 AS TIMESTAMP) AS INT) FROM src LIMIT 1").collect().head
-    assert(-1200 ==
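For context, the renumbered tests all pin down the same behavior: round-tripping a numeric literal through TIMESTAMP. A minimal sketch of what they assert, runnable outside the Hive test harness (assumptions: a local SparkSession built here for illustration; the suite's `FROM src LIMIT 1` is dropped because `src` is a Hive test table that only exists inside the harness):

```scala
import org.apache.spark.sql.SparkSession

object TimestampCastSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("timestamp-cast-sketch")
      .getOrCreate()

    // Numeric -> TIMESTAMP interprets the value as seconds since the epoch,
    // and TIMESTAMP -> DOUBLE recovers it, fractional part included
    // (matches the golden-file output "1.2" referenced in the diff).
    val d = spark.sql("SELECT CAST(CAST(1.2 AS TIMESTAMP) AS DOUBLE)")
      .collect().head.getDouble(0)
    assert(d == 1.2)

    // The same round-trip through INT keeps whole seconds, negatives included,
    // mirroring the suite's "timestamp cast #6" assertion.
    val i = spark.sql("SELECT CAST(CAST(-1200 AS TIMESTAMP) AS INT)")
      .collect().head.getInt(0)
    assert(i == -1200)

    spark.stop()
  }
}
```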
[spark] branch master updated (6fb9c80 -> 912d45d)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 6fb9c80  [SPARK-31958][SQL] normalize special floating numbers in subquery
     add 912d45d  [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite

No new revisions were added by this update.

Summary of changes:
 ...tamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 |  1 -
 ...amp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} |  0
 ...amp cast #4-0-6d2da5cfada03605834e38bc4075bc79} |  0
 ...tamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 |  1 -
 .../spark/sql/hive/execution/HiveQuerySuite.scala  | 26 +-
 5 files changed, 10 insertions(+), 18 deletions(-)
 delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2
 rename sql/hive/src/test/resources/golden/{timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 => timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} (100%)
 rename sql/hive/src/test/resources/golden/{timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 => timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79} (100%)
 delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated (53f1349 -> e51eb3a)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 53f1349  [SPARK-31941][CORE] Replace SparkException to NoSuchElementException for applicationInfo in AppStatusStore
     add e51eb3a  [SPARK-31954][SQL] Delete duplicate testcase in HiveQuerySuite

No new revisions were added by this update.

Summary of changes:
 ...tamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2 |  1 -
 ...amp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} |  0
 ...amp cast #4-0-6d2da5cfada03605834e38bc4075bc79} |  0
 ...tamp cast #6-0-6d2da5cfada03605834e38bc4075bc79 |  1 -
 .../spark/sql/hive/execution/HiveQuerySuite.scala  | 26 +-
 5 files changed, 10 insertions(+), 18 deletions(-)
 delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #2-0-732ed232ac592c5e7f7c913a88874fd2
 rename sql/hive/src/test/resources/golden/{timestamp cast #4-0-732ed232ac592c5e7f7c913a88874fd2 => timestamp cast #3-0-732ed232ac592c5e7f7c913a88874fd2} (100%)
 rename sql/hive/src/test/resources/golden/{timestamp cast #8-0-6d2da5cfada03605834e38bc4075bc79 => timestamp cast #4-0-6d2da5cfada03605834e38bc4075bc79} (100%)
 delete mode 100644 sql/hive/src/test/resources/golden/timestamp cast #6-0-6d2da5cfada03605834e38bc4075bc79

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31958][SQL] normalize special floating numbers in subquery
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 593b423  [SPARK-31958][SQL] normalize special floating numbers in subquery
593b423 is described below

commit 593b42323255f44b268ca5148f1fa817f3c01de7
Author: Wenchen Fan
AuthorDate: Thu Jun 11 06:39:14 2020 +0000

    [SPARK-31958][SQL] normalize special floating numbers in subquery

    ### What changes were proposed in this pull request?

    This is a followup of https://github.com/apache/spark/pull/23388 .

    https://github.com/apache/spark/pull/23388 has an issue: it doesn't handle subquery expressions and assumes they will be turned into joins. However, this is not true for non-correlated subquery expressions.

    This PR fixes this issue. It now doesn't skip `Subquery`, and subquery expressions will be handled by `OptimizeSubqueries`, which runs the optimizer with the subquery.

    Note that, correlated subquery expressions will be handled twice: once in `OptimizeSubqueries`, once later when it becomes join. This is OK as `NormalizeFloatingNumbers` is idempotent now.

    ### Why are the changes needed?

    fix a bug

    ### Does this PR introduce _any_ user-facing change?

    yes, see the newly added test.

    ### How was this patch tested?

    new test

    Closes #28785 from cloud-fan/normalize.

    Authored-by: Wenchen Fan
    Signed-off-by: Wenchen Fan
    (cherry picked from commit 6fb9c80da129d0b43f9ff5b8be6ce8bad992a4ed)
    Signed-off-by: Wenchen Fan
---
 .../catalyst/optimizer/NormalizeFloatingNumbers.scala |  4
 .../scala/org/apache/spark/sql/SQLQuerySuite.scala    | 18 ++
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NormalizeFloatingNumbers.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NormalizeFloatingNumbers.scala
index 5f94af5..4373820 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NormalizeFloatingNumbers.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NormalizeFloatingNumbers.scala
@@ -56,10 +56,6 @@ import org.apache.spark.sql.types._
 object NormalizeFloatingNumbers extends Rule[LogicalPlan] {

   def apply(plan: LogicalPlan): LogicalPlan = plan match {
-    // A subquery will be rewritten into join later, and will go through this rule
-    // eventually. Here we skip subquery, as we only need to run this rule once.
-    case _: Subquery => plan
-
     case _ => plan transform {
       case w: Window if w.partitionSpec.exists(p => needNormalize(p)) =>
         // Although the `windowExpressions` may refer to `partitionSpec` expressions, we don't need
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
index a23e583..093f2db 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
@@ -3449,6 +3449,24 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
     checkAnswer(sql("select CAST(-32768 as short) DIV CAST (-1 as short)"),
       Seq(Row(Short.MinValue.toLong * -1)))
   }
+
+  test("normalize special floating numbers in subquery") {
+    withTempView("v1", "v2", "v3") {
+      Seq(-0.0).toDF("d").createTempView("v1")
+      Seq(0.0).toDF("d").createTempView("v2")
+      spark.range(2).createTempView("v3")
+
+      // non-correlated subquery
+      checkAnswer(sql("SELECT (SELECT v1.d FROM v1 JOIN v2 ON v1.d = v2.d)"), Row(-0.0))
+      // correlated subquery
+      checkAnswer(
+        sql(
+          """
+            |SELECT id FROM v3 WHERE EXISTS
+            |  (SELECT v1.d FROM v1 JOIN v2 ON v1.d = v2.d WHERE id > 0)
+            |""".stripMargin), Row(1))
+    }
+  }
 }

 case class Foo(bar: Option[String])

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
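The failure mode the new test guards against is easiest to see standalone. A minimal sketch under stated assumptions (a local SparkSession built here for illustration; the view names `v1`/`v2` follow the test above, and nothing in this snippet is part of the patch itself):

```scala
import org.apache.spark.sql.SparkSession

object NormalizeSubquerySketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("normalize-subquery-sketch")
      .getOrCreate()
    import spark.implicits._

    // -0.0 and 0.0 are equal under SQL comparison but have different bit
    // patterns, so an equi-join on the raw doubles can miss the match
    // unless the join keys are normalized first.
    Seq(-0.0).toDF("d").createTempView("v1")
    Seq(0.0).toDF("d").createTempView("v2")

    // The join sits inside a non-correlated scalar subquery, which is exactly
    // the plan shape the removed `case _: Subquery => plan` short-circuit
    // used to skip (non-correlated subqueries are never rewritten to joins).
    val res = spark.sql("SELECT (SELECT v1.d FROM v1 JOIN v2 ON v1.d = v2.d)")

    // With the fix, the subquery is optimized by OptimizeSubqueries, the keys
    // are normalized, and the single matching row comes back. The primitive
    // comparison -0.0 == 0.0 holds, so this assertion accepts either zero.
    assert(res.collect().head.getDouble(0) == 0.0)

    spark.stop()
  }
}
```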
[spark] branch master updated (56d4f27 -> 6fb9c80)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 56d4f27  [SPARK-31966][ML][TESTS][PYTHON] Increase the timeout for StreamingLogisticRegressionWithSGDTests.test_training_and_prediction
     add 6fb9c80  [SPARK-31958][SQL] normalize special floating numbers in subquery

No new revisions were added by this update.

Summary of changes:
 .../catalyst/optimizer/NormalizeFloatingNumbers.scala |  4
 .../scala/org/apache/spark/sql/SQLQuerySuite.scala    | 18 ++
 2 files changed, 18 insertions(+), 4 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org