[spark] branch master updated (439e94c -> 82af318)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 439e94c [SPARK-35737][SQL] Parse day-time interval literals to tightest types
 add 82af318 [SPARK-35748][SS][SQL] Fix StreamingJoinHelper to be able to handle day-time interval

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/analysis/StreamingJoinHelper.scala |  3 +++
 .../sql/catalyst/analysis/StreamingJoinHelperSuite.scala  | 12
 2 files changed, 15 insertions(+)

---
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-35771][SQL] Format year-month intervals using type fields
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 184f65e [SPARK-35771][SQL] Format year-month intervals using type fields 184f65e is described below commit 184f65e7c7359d4ff68ce5b558b186bca6e8efb1 Author: Kousuke Saruta AuthorDate: Wed Jun 16 11:08:02 2021 +0300 [SPARK-35771][SQL] Format year-month intervals using type fields ### What changes were proposed in this pull request? This PR proposes to format year-month interval to strings using the start and end fields of `YearMonthIntervalType`. ### Why are the changes needed? Currently, they are ignored, and any `YearMonthIntervalType` is formatted as `INTERVAL YEAR TO MONTH`. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? New test. Closes #32924 from sarutak/year-month-interval-format. Authored-by: Kousuke Saruta Signed-off-by: Max Gekk --- .../spark/sql/catalyst/util/IntervalUtils.scala| 21 - .../sql/catalyst/util/IntervalUtilsSuite.scala | 26 ++ 2 files changed, 42 insertions(+), 5 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala index aca3d15..e9be3de 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala @@ -29,7 +29,7 @@ import org.apache.spark.sql.catalyst.util.DateTimeUtils.millisToMicros import org.apache.spark.sql.catalyst.util.IntervalStringStyles.{ANSI_STYLE, HIVE_STYLE, IntervalStyle} import org.apache.spark.sql.errors.QueryExecutionErrors import org.apache.spark.sql.internal.SQLConf -import org.apache.spark.sql.types.{DayTimeIntervalType, Decimal} +import org.apache.spark.sql.types.{DayTimeIntervalType, 
Decimal, YearMonthIntervalType} import org.apache.spark.unsafe.types.{CalendarInterval, UTF8String} // The style of textual representation of intervals @@ -945,7 +945,6 @@ object IntervalUtils { def toYearMonthIntervalString( months: Int, style: IntervalStyle, - // TODO(SPARK-35771): Format year-month intervals using type fields startField: Byte, endField: Byte): String = { var sign = "" @@ -954,10 +953,22 @@ object IntervalUtils { sign = "-" absMonths = -absMonths } -val payload = s"$sign${absMonths / MONTHS_PER_YEAR}-${absMonths % MONTHS_PER_YEAR}" +val year = s"$sign${absMonths / MONTHS_PER_YEAR}" +val month = s"${absMonths % MONTHS_PER_YEAR}" +val yearAndMonth = s"$year-$month" style match { - case ANSI_STYLE => s"INTERVAL '$payload' YEAR TO MONTH" - case HIVE_STYLE => payload + case ANSI_STYLE => +val formatBuilder = new StringBuilder("INTERVAL '") +if (startField == endField) { + startField match { +case YearMonthIntervalType.YEAR => formatBuilder.append(s"$year' YEAR") +case YearMonthIntervalType.MONTH => formatBuilder.append(s"$month' MONTH") + } +} else { + formatBuilder.append(s"$yearAndMonth' YEAR TO MONTH") +} +formatBuilder.toString + case HIVE_STYLE => s"$yearAndMonth" } } diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/IntervalUtilsSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/IntervalUtilsSuite.scala index c2ece95..b6cf152 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/IntervalUtilsSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/IntervalUtilsSuite.scala @@ -619,4 +619,30 @@ class IntervalUtilsSuite extends SparkFunSuite with SQLHelper { assert(toDayTimeIntervalString(micros, ANSI_STYLE, SECOND, SECOND) === sec) } } + + test("SPARK-35771: Format year-month intervals using type fields") { +import org.apache.spark.sql.types.YearMonthIntervalType._ +Seq( + 0 -> +("INTERVAL '0-0' YEAR TO MONTH", "INTERVAL '0' YEAR", "INTERVAL '0' 
MONTH"), + -11 -> ("INTERVAL '-0-11' YEAR TO MONTH", "INTERVAL '-0' YEAR", "INTERVAL '11' MONTH"), + 11 -> ("INTERVAL '0-11' YEAR TO MONTH", "INTERVAL '0' YEAR", "INTERVAL '11' MONTH"), + -13 -> ("INTERVAL '-1-1' YEAR TO MONTH", "INTERVAL '-1' YEAR", "INTERVAL '1' MONTH&
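The test expectations above follow directly from the new branch in `toYearMonthIntervalString`. As an illustration, here is a minimal Java sketch of that logic (ANSI style only; this is not Spark's actual Scala implementation). Note that the sign attaches only to the year part, which is why `-11` months renders as `INTERVAL '-0' YEAR` but `INTERVAL '11' MONTH`:

```java
public class YearMonthFormat {
    // Sketch of the committed toYearMonthIntervalString logic (ANSI style only).
    // Field codes follow YearMonthIntervalType: 0 = YEAR, 1 = MONTH.
    static String toYearMonthIntervalString(int months, int startField, int endField) {
        String sign = "";
        long absMonths = months;            // long so -Integer.MIN_VALUE cannot overflow
        if (absMonths < 0) {
            sign = "-";
            absMonths = -absMonths;
        }
        String year = sign + (absMonths / 12);          // sign attaches to the year part
        String month = String.valueOf(absMonths % 12);  // month part carries no sign
        if (startField == endField) {
            return startField == 0
                ? "INTERVAL '" + year + "' YEAR"
                : "INTERVAL '" + month + "' MONTH";
        }
        return "INTERVAL '" + year + "-" + month + "' YEAR TO MONTH";
    }
}
```

The -13 case reproduces the suite's expectation: one year and one month, both under the leading sign.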
[spark] branch master updated (aaa8a80 -> 4530760)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from aaa8a80 [SPARK-35613][CORE][SQL] Cache commonly occurring strings in SQLMetrics, JSONProtocol and AccumulatorV2 classes
 add 4530760 [SPARK-35774][SQL] Parse any year-month interval types in SQL

No new revisions were added by this update.

Summary of changes:
 .../main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 | 2 +-
 .../scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala | 9 +++--
 .../test/scala/org/apache/spark/sql/types/DataTypeSuite.scala   | 2 +-
 3 files changed, 9 insertions(+), 4 deletions(-)
[spark] branch master updated (82af318 -> aab0c2b)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 82af318 [SPARK-35748][SS][SQL] Fix StreamingJoinHelper to be able to handle day-time interval
 add aab0c2b [SPARK-35736][SPARK-35737][SQL][FOLLOWUP] Move a common logic to DayTimeIntervalType

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/catalyst/parser/AstBuilder.scala | 12
 .../org/apache/spark/sql/types/DayTimeIntervalType.scala  |  2 ++
 2 files changed, 6 insertions(+), 8 deletions(-)
[spark] branch master updated: [SPARK-35680][SQL] Add fields to `YearMonthIntervalType`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 61ce8f7 [SPARK-35680][SQL] Add fields to `YearMonthIntervalType` 61ce8f7 is described below commit 61ce8f764982306f2c7a8b2b3dfe22963b49f2d5 Author: Max Gekk AuthorDate: Tue Jun 15 23:08:12 2021 +0300 [SPARK-35680][SQL] Add fields to `YearMonthIntervalType` ### What changes were proposed in this pull request? Extend `YearMonthIntervalType` to support interval fields. Valid interval field values: - 0 (YEAR) - 1 (MONTH) After the changes, the following year-month interval types are supported: 1. `YearMonthIntervalType(0, 0)` or `YearMonthIntervalType(YEAR, YEAR)` 2. `YearMonthIntervalType(0, 1)` or `YearMonthIntervalType(YEAR, MONTH)`. **It is the default one**. 3. `YearMonthIntervalType(1, 1)` or `YearMonthIntervalType(MONTH, MONTH)` Closes #32825 ### Why are the changes needed? In the current implementation, Spark supports only `interval year to month` but the SQL standard allows to specify the start and end fields. The changes will allow to follow ANSI SQL standard more precisely. ### Does this PR introduce _any_ user-facing change? Yes but `YearMonthIntervalType` has not been released yet. ### How was this patch tested? By existing test suites. Closes #32909 from MaxGekk/add-fields-to-YearMonthIntervalType. 
Authored-by: Max Gekk Signed-off-by: Max Gekk --- .../spark/sql/catalyst/expressions/UnsafeRow.java | 11 ++--- .../java/org/apache/spark/sql/types/DataTypes.java | 19 +--- .../sql/catalyst/CatalystTypeConverters.scala | 3 +- .../apache/spark/sql/catalyst/InternalRow.scala| 4 +- .../spark/sql/catalyst/JavaTypeInference.scala | 2 +- .../spark/sql/catalyst/ScalaReflection.scala | 10 ++--- .../spark/sql/catalyst/SerializerBuildHelper.scala | 2 +- .../spark/sql/catalyst/analysis/Analyzer.scala | 18 .../apache/spark/sql/catalyst/dsl/package.scala| 7 ++- .../spark/sql/catalyst/encoders/RowEncoder.scala | 6 +-- .../spark/sql/catalyst/expressions/Cast.scala | 35 +-- .../expressions/InterpretedUnsafeProjection.scala | 2 +- .../catalyst/expressions/SpecificInternalRow.scala | 2 +- .../catalyst/expressions/aggregate/Average.scala | 6 +-- .../sql/catalyst/expressions/aggregate/Sum.scala | 2 +- .../sql/catalyst/expressions/arithmetic.scala | 10 ++--- .../expressions/codegen/CodeGenerator.scala| 4 +- .../expressions/collectionOperations.scala | 8 ++-- .../catalyst/expressions/datetimeExpressions.scala | 2 +- .../spark/sql/catalyst/expressions/hash.scala | 2 +- .../catalyst/expressions/intervalExpressions.scala | 10 ++--- .../spark/sql/catalyst/expressions/literals.scala | 16 --- .../catalyst/expressions/windowExpressions.scala | 4 +- .../spark/sql/catalyst/parser/AstBuilder.scala | 6 ++- .../spark/sql/catalyst/util/IntervalUtils.scala| 17 ++- .../apache/spark/sql/catalyst/util/TypeUtils.scala | 4 +- .../spark/sql/errors/QueryCompilationErrors.scala | 11 + .../org/apache/spark/sql/types/DataType.scala | 4 +- .../spark/sql/types/YearMonthIntervalType.scala| 52 ++ .../org/apache/spark/sql/util/ArrowUtils.scala | 4 +- .../org/apache/spark/sql/RandomDataGenerator.scala | 2 +- .../spark/sql/RandomDataGeneratorSuite.scala | 4 +- .../sql/catalyst/CatalystTypeConvertersSuite.scala | 2 +- .../sql/catalyst/encoders/RowEncoderSuite.scala| 18 .../expressions/ArithmeticExpressionSuite.scala| 
10 ++--- .../spark/sql/catalyst/expressions/CastSuite.scala | 30 ++--- .../sql/catalyst/expressions/CastSuiteBase.scala | 8 ++-- .../expressions/DateExpressionsSuite.scala | 19 .../expressions/HashExpressionsSuite.scala | 2 +- .../expressions/IntervalExpressionsSuite.scala | 24 ++ .../expressions/LiteralExpressionSuite.scala | 6 +-- .../catalyst/expressions/LiteralGenerator.scala| 4 +- .../expressions/MutableProjectionSuite.scala | 14 +++--- .../optimizer/PushFoldableIntoBranchesSuite.scala | 16 +++ .../sql/catalyst/parser/DataTypeParserSuite.scala | 2 +- .../sql/catalyst/util/IntervalUtilsSuite.scala | 5 ++- .../org/apache/spark/sql/types/DataTypeSuite.scala | 6 +-- .../apache/spark/sql/types/DataTypeTestUtils.scala | 18 .../apache/spark/sql/util/ArrowUtilsSuite.scala| 2 +- .../apache/spark/sql/execution/HiveResult.scala| 4 +- .../sql/execution/aggregate/HashMapGenerator.scala | 3 +- .../spark/sql/execution/aggregate/udaf.scala | 4 +- .../spark/sql/execution/arrow/ArrowWriter.scala| 2 +- .../sql/execution/columnar
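The valid field combinations described above can be modeled compactly. The following is a simplified Java stand-in for the extended type, assuming the `interval year` / `interval year to month` type-name scheme; it is illustrative only and is not Spark's actual Scala class:

```java
public class YearMonthIntervalTypeSketch {
    // Simplified stand-in for Spark's YearMonthIntervalType after SPARK-35680:
    // valid field codes are 0 (YEAR) and 1 (MONTH), and the start field
    // must not come after the end field.
    static final byte YEAR = 0;
    static final byte MONTH = 1;
    final byte startField;
    final byte endField;

    YearMonthIntervalTypeSketch(byte startField, byte endField) {
        if (startField < YEAR || endField > MONTH || startField > endField) {
            throw new IllegalArgumentException(
                "invalid fields: " + startField + ", " + endField);
        }
        this.startField = startField;
        this.endField = endField;
    }

    static String fieldToString(byte field) {
        return field == YEAR ? "year" : "month";
    }

    // Derives a type name such as "interval year to month" (the default type,
    // YearMonthIntervalType(YEAR, MONTH)) or "interval month".
    String typeName() {
        if (startField == endField) {
            return "interval " + fieldToString(startField);
        }
        return "interval " + fieldToString(startField) + " to " + fieldToString(endField);
    }
}
```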
[spark] branch branch-3.1 updated: [SPARK-35679][SQL] instantToMicros overflow
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.1 by this push:
     new ff83105 [SPARK-35679][SQL] instantToMicros overflow
ff83105 is described below

commit ff831054d9a3b0fbd58b532bf6c527276d7994c6
Author: dgd-contributor
AuthorDate: Thu Jun 10 08:08:51 2021 +0300

    [SPARK-35679][SQL] instantToMicros overflow

    ### Why are the changes needed?
    With `Long.MinValue` cast to an instant, `secs` is floored in `microsToInstant` and overflows when multiplied by `MICROS_PER_SECOND`:

    ```
    def microsToInstant(micros: Long): Instant = {
      val secs = Math.floorDiv(micros, MICROS_PER_SECOND)
      // Unfolded Math.floorMod(us, MICROS_PER_SECOND) to reuse the result of
      // the above calculation of `secs` via `floorDiv`.
      val mos = micros - secs * MICROS_PER_SECOND  // <- it will overflow here
      Instant.ofEpochSecond(secs, mos * NANOS_PER_MICROS)
    }
    ```

    That overflow is acceptable because it does not change the result. However, converting the instant back to a micros value raises an overflow error:

    ```
    def instantToMicros(instant: Instant): Long = {
      val us = Math.multiplyExact(instant.getEpochSecond, MICROS_PER_SECOND)  // <- it overflows here
      val result = Math.addExact(us, NANOSECONDS.toMicros(instant.getNano))
      result
    }
    ```

    Code to reproduce this error:

    ```
    instantToMicros(microsToInstant(Long.MinValue))
    ```

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    Test added

    Closes #32839 from dgd-contributor/SPARK-35679_instantToMicro.
Authored-by: dgd-contributor Signed-off-by: Max Gekk (cherry picked from commit aa3de4077302fe7e0b23b01a338c7feab0e5974e) Signed-off-by: Max Gekk --- .../org/apache/spark/sql/catalyst/util/DateTimeUtils.scala | 14 +++--- .../spark/sql/catalyst/util/DateTimeUtilsSuite.scala | 5 + 2 files changed, 16 insertions(+), 3 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala index 89cb67c..a4c34e1 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala @@ -375,6 +375,9 @@ object DateTimeUtils { timestamp.get } } + // See issue SPARK-35679 + // min second cause overflow in instant to micro + private val MIN_SECONDS = Math.floorDiv(Long.MinValue, MICROS_PER_SECOND) /** * Gets the number of microseconds since the epoch of 1970-01-01 00:00:00Z from the given @@ -382,9 +385,14 @@ object DateTimeUtils { * microseconds where microsecond 0 is 1970-01-01 00:00:00Z. 
*/ def instantToMicros(instant: Instant): Long = { -val us = Math.multiplyExact(instant.getEpochSecond, MICROS_PER_SECOND) -val result = Math.addExact(us, NANOSECONDS.toMicros(instant.getNano)) -result +val secs = instant.getEpochSecond +if (secs == MIN_SECONDS) { + val us = Math.multiplyExact(secs + 1, MICROS_PER_SECOND) + Math.addExact(us, NANOSECONDS.toMicros(instant.getNano) - MICROS_PER_SECOND) +} else { + val us = Math.multiplyExact(secs, MICROS_PER_SECOND) + Math.addExact(us, NANOSECONDS.toMicros(instant.getNano)) +} } /** diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala index fb2d511..8cc6bf2 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala @@ -688,4 +688,9 @@ class DateTimeUtilsSuite extends SparkFunSuite with Matchers with SQLHelper { assert(toDate("tomorrow CET ", zoneId).get === today + 1) } } + + test("SPARK-35679: instantToMicros should be able to return microseconds of Long.MinValue") { +assert(instantToMicros(microsToInstant(Long.MaxValue)) === Long.MaxValue) +assert(instantToMicros(microsToInstant(Long.MinValue)) === Long.MinValue) + } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
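The patch above can be exercised directly against `java.time`, since the Scala code only uses JVM APIs. Below is a self-contained Java sketch of the fixed round trip (constant names mirror Spark's; the silent wrap inside `microsToInstant` is deliberate and benign, exactly as the commit message explains):

```java
import java.time.Instant;

public class InstantMicros {
    static final long MICROS_PER_SECOND = 1_000_000L;
    static final long NANOS_PER_MICROS = 1_000L;
    // The single epoch-second value whose direct multiplication by
    // MICROS_PER_SECOND overflows a long (see SPARK-35679).
    static final long MIN_SECONDS = Math.floorDiv(Long.MIN_VALUE, MICROS_PER_SECOND);

    static Instant microsToInstant(long micros) {
        long secs = Math.floorDiv(micros, MICROS_PER_SECOND);
        // May wrap for Long.MIN_VALUE, but modular arithmetic still yields
        // the correct non-negative remainder in [0, 999999].
        long mos = micros - secs * MICROS_PER_SECOND;
        return Instant.ofEpochSecond(secs, mos * NANOS_PER_MICROS);
    }

    static long instantToMicros(Instant instant) {
        long secs = instant.getEpochSecond();
        if (secs == MIN_SECONDS) {
            // Shift up by one second before multiplying, then compensate,
            // so neither step leaves the long range.
            long us = Math.multiplyExact(secs + 1, MICROS_PER_SECOND);
            return Math.addExact(us, instant.getNano() / NANOS_PER_MICROS - MICROS_PER_SECOND);
        } else {
            long us = Math.multiplyExact(secs, MICROS_PER_SECOND);
            return Math.addExact(us, instant.getNano() / NANOS_PER_MICROS);
        }
    }
}
```

Without the `MIN_SECONDS` branch, `instantToMicros(microsToInstant(Long.MIN_VALUE))` throws `ArithmeticException` from `Math.multiplyExact`; with it, the round trip is exact at both ends of the range.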
[spark] branch master updated (8dde20a -> aa3de40)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 8dde20a [SPARK-35675][SQL] EnsureRequirements remove shuffle should respect PartitioningCollection
 add aa3de40 [SPARK-35679][SQL] instantToMicros overflow

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/catalyst/util/DateTimeUtils.scala | 14 +++---
 .../spark/sql/catalyst/util/DateTimeUtilsSuite.scala       |  5 +
 2 files changed, 16 insertions(+), 3 deletions(-)
[spark] branch branch-3.0 updated: [SPARK-35679][SQL] instantToMicros overflow
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new fe18354 [SPARK-35679][SQL] instantToMicros overflow
fe18354 is described below

commit fe1835406ac46eca2b147e59a92a91ba04f61121
Author: dgd-contributor
AuthorDate: Thu Jun 10 08:08:51 2021 +0300

    [SPARK-35679][SQL] instantToMicros overflow

    With `Long.MinValue` cast to an instant, `secs` is floored in `microsToInstant` and overflows when multiplied by `MICROS_PER_SECOND`:

    ```
    def microsToInstant(micros: Long): Instant = {
      val secs = Math.floorDiv(micros, MICROS_PER_SECOND)
      // Unfolded Math.floorMod(us, MICROS_PER_SECOND) to reuse the result of
      // the above calculation of `secs` via `floorDiv`.
      val mos = micros - secs * MICROS_PER_SECOND  // <- it will overflow here
      Instant.ofEpochSecond(secs, mos * NANOS_PER_MICROS)
    }
    ```

    That overflow is acceptable because it does not change the result. However, converting the instant back to a micros value raises an overflow error:

    ```
    def instantToMicros(instant: Instant): Long = {
      val us = Math.multiplyExact(instant.getEpochSecond, MICROS_PER_SECOND)  // <- it overflows here
      val result = Math.addExact(us, NANOSECONDS.toMicros(instant.getNano))
      result
    }
    ```

    Code to reproduce this error:

    ```
    instantToMicros(microsToInstant(Long.MinValue))
    ```

    No user-facing change. Test added.

    Closes #32839 from dgd-contributor/SPARK-35679_instantToMicro.
Authored-by: dgd-contributor Signed-off-by: Max Gekk (cherry picked from commit aa3de4077302fe7e0b23b01a338c7feab0e5974e) Signed-off-by: Max Gekk --- .../spark/sql/catalyst/util/DateTimeUtils.scala | 20 +--- .../spark/sql/catalyst/util/DateTimeUtilsSuite.scala | 5 + 2 files changed, 22 insertions(+), 3 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala index 52df227..2e242f2 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala @@ -413,10 +413,24 @@ object DateTimeUtils { } } + // See issue SPARK-35679 + // min second cause overflow in instant to micro + private val MIN_SECONDS = Math.floorDiv(Long.MinValue, MICROS_PER_SECOND) + + /** + * Gets the number of microseconds since the epoch of 1970-01-01 00:00:00Z from the given + * instance of `java.time.Instant`. The epoch microsecond count is a simple incrementing count of + * microseconds where microsecond 0 is 1970-01-01 00:00:00Z. 
+ */ def instantToMicros(instant: Instant): Long = { -val us = Math.multiplyExact(instant.getEpochSecond, MICROS_PER_SECOND) -val result = Math.addExact(us, NANOSECONDS.toMicros(instant.getNano)) -result +val secs = instant.getEpochSecond +if (secs == MIN_SECONDS) { + val us = Math.multiplyExact(secs + 1, MICROS_PER_SECOND) + Math.addExact(us, NANOSECONDS.toMicros(instant.getNano) - MICROS_PER_SECOND) +} else { + val us = Math.multiplyExact(secs, MICROS_PER_SECOND) + Math.addExact(us, NANOSECONDS.toMicros(instant.getNano)) +} } def microsToInstant(us: Long): Instant = { diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala index d457c88..d451aff 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala @@ -686,4 +686,9 @@ class DateTimeUtilsSuite extends SparkFunSuite with Matchers with SQLHelper { assert(toDate("tomorrow CET ", zoneId).get === today + 1) } } + + test("SPARK-35679: instantToMicros should be able to return microseconds of Long.MinValue") { +assert(instantToMicros(microsToInstant(Long.MaxValue)) === Long.MaxValue) +assert(instantToMicros(microsToInstant(Long.MinValue)) === Long.MinValue) + } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-35734][SQL] Format day-time intervals using type fields
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 80f7989 [SPARK-35734][SQL] Format day-time intervals using type fields
80f7989 is described below

commit 80f7989d9a761eff1a0b3c64ec3aabb81506953d
Author: Kousuke Saruta
AuthorDate: Sat Jun 12 21:45:12 2021 +0300

    [SPARK-35734][SQL] Format day-time intervals using type fields

    ### What changes were proposed in this pull request?
    This PR adds a feature that formats day-time intervals to strings using the start and end fields of `DayTimeIntervalType`.

    ### Why are the changes needed?
    Currently, those fields are ignored, and any `DayTimeIntervalType` is formatted as `INTERVAL DAY TO SECOND`.

    ### Does this PR introduce _any_ user-facing change?
    Yes. The format of day-time intervals is determined by the start and end fields.

    ### How was this patch tested?
    New test.

    Closes #32891 from sarutak/interval-format.
Authored-by: Kousuke Saruta Signed-off-by: Max Gekk --- .../spark/sql/catalyst/util/IntervalUtils.scala| 58 +-- .../sql/catalyst/util/IntervalUtilsSuite.scala | 84 ++ 2 files changed, 138 insertions(+), 4 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala index c18cca9..dda5581 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala @@ -21,6 +21,7 @@ import java.time.{Duration, Period} import java.time.temporal.ChronoUnit import java.util.concurrent.TimeUnit +import scala.collection.mutable import scala.util.control.NonFatal import org.apache.spark.sql.catalyst.util.DateTimeConstants._ @@ -28,7 +29,7 @@ import org.apache.spark.sql.catalyst.util.DateTimeUtils.millisToMicros import org.apache.spark.sql.catalyst.util.IntervalStringStyles.{ANSI_STYLE, HIVE_STYLE, IntervalStyle} import org.apache.spark.sql.errors.QueryExecutionErrors import org.apache.spark.sql.internal.SQLConf -import org.apache.spark.sql.types.Decimal +import org.apache.spark.sql.types.{DayTimeIntervalType, Decimal} import org.apache.spark.unsafe.types.{CalendarInterval, UTF8String} // The style of textual representation of intervals @@ -960,18 +961,34 @@ object IntervalUtils { def toDayTimeIntervalString( micros: Long, style: IntervalStyle, - // TODO(SPARK-35734): Format day-time intervals using type fields startField: Byte, endField: Byte): String = { var sign = "" var rest = micros +val from = DayTimeIntervalType.fieldToString(startField).toUpperCase +val to = DayTimeIntervalType.fieldToString(endField).toUpperCase if (micros < 0) { if (micros == Long.MinValue) { // Especial handling of minimum `Long` value because negate op overflows `Long`. 
// seconds = 106751991 * (24 * 60 * 60) + 4 * 60 * 60 + 54 = 9223372036854 // microseconds = -922337203685400L-775808 == Long.MinValue val minIntervalString = style match { - case ANSI_STYLE => "INTERVAL '-106751991 04:00:54.775808' DAY TO SECOND" + case ANSI_STYLE => +val baseStr = "-106751991 04:00:54.775808" +val fromPos = startField match { + case DayTimeIntervalType.DAY => 0 + case DayTimeIntervalType.HOUR => 11 + case DayTimeIntervalType.MINUTE => 14 + case DayTimeIntervalType.SECOND => 17 +} +val toPos = endField match { + case DayTimeIntervalType.DAY => 10 + case DayTimeIntervalType.HOUR => 13 + case DayTimeIntervalType.MINUTE => 16 + case DayTimeIntervalType.SECOND => baseStr.length +} +val postfix = if (startField == endField) from else s"$from TO $to" +s"INTERVAL '${baseStr.substring(fromPos, toPos)}' $postfix" case HIVE_STYLE => "-106751991 04:00:54.775808000" } return minIntervalString @@ -992,7 +1009,40 @@ object IntervalUtils { val secStr = java.math.BigDecimal.valueOf(secondsWithFraction, 6) .stripTrailingZeros() .toPlainString() -f"INTERVAL '$sign$days $hours%02d:$minutes%02d:$leadSecZero$secStr' DAY TO SECOND" +val formatBuilder = new StringBuilder("INTERVAL '") +if (startField == endField) { + startField match { +case DayTimeIntervalType.DAY => formatBuilder.append(s
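The position tables above handle the `Long.MinValue` special case by slicing a fixed base string according to the start and end fields. A minimal Java sketch of that branch (illustrative only, not Spark's Scala code):

```java
public class MinDayTimeInterval {
    // Sketch of the Long.MinValue special case: slice the fixed base string
    // "-106751991 04:00:54.775808" by the start and end fields.
    // Field codes per DayTimeIntervalType: 0 = DAY, 1 = HOUR, 2 = MINUTE, 3 = SECOND.
    static String minIntervalString(int startField, int endField) {
        String baseStr = "-106751991 04:00:54.775808";
        String[] names = {"DAY", "HOUR", "MINUTE", "SECOND"};
        int[] fromPos = {0, 11, 14, 17};               // where each field starts in baseStr
        int[] toPos = {10, 13, 16, baseStr.length()};  // where each field ends in baseStr
        String postfix = startField == endField
            ? names[startField]
            : names[startField] + " TO " + names[endField];
        return "INTERVAL '" + baseStr.substring(fromPos[startField], toPos[endField])
            + "' " + postfix;
    }
}
```

For example, `(HOUR, MINUTE)` slices characters 11..16 of the base string, yielding only the hour and minute fields.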
[spark] branch master updated: [SPARK-35716][SQL] Support casting of timestamp without time zone to date type
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new d21ff13 [SPARK-35716][SQL] Support casting of timestamp without time zone to date type d21ff13 is described below commit d21ff1318f614fc207d9cd3c485e4337faa8e878 Author: Gengliang Wang AuthorDate: Thu Jun 10 23:37:02 2021 +0300 [SPARK-35716][SQL] Support casting of timestamp without time zone to date type ### What changes were proposed in this pull request? Extend the Cast expression and support TimestampWithoutTZType in casting to DateType. ### Why are the changes needed? To conform the ANSI SQL standard which requires to support such casting. ### Does this PR introduce _any_ user-facing change? No, the new timestamp type is not released yet. ### How was this patch tested? Unit test Closes #32869 from gengliangwang/castToDate. Authored-by: Gengliang Wang Signed-off-by: Max Gekk --- .../org/apache/spark/sql/catalyst/expressions/Cast.scala | 9 + .../org/apache/spark/sql/catalyst/expressions/CastSuite.scala | 11 +-- 2 files changed, 18 insertions(+), 2 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala index 8de19ba..fba17d3 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala @@ -72,6 +72,7 @@ object Cast { case (StringType, DateType) => true case (TimestampType, DateType) => true +case (TimestampWithoutTZType, DateType) => true case (StringType, CalendarIntervalType) => true case (StringType, DayTimeIntervalType) => true @@ -534,6 +535,8 @@ abstract class CastBase extends UnaryExpression with TimeZoneAwareExpression wit // throw valid precision more than seconds, 
according to Hive. // Timestamp.nanos is in 0 to 999,999,999, no more than a second. buildCast[Long](_, t => microsToDays(t, zoneId)) +case TimestampWithoutTZType => + buildCast[Long](_, t => microsToDays(t, ZoneOffset.UTC)) } // IntervalConverter @@ -1204,6 +1207,11 @@ abstract class CastBase extends UnaryExpression with TimeZoneAwareExpression wit (c, evPrim, evNull) => code"""$evPrim = org.apache.spark.sql.catalyst.util.DateTimeUtils.microsToDays($c, $zid);""" + case TimestampWithoutTZType => +(c, evPrim, evNull) => + // scalastyle:off line.size.limit + code"$evPrim = org.apache.spark.sql.catalyst.util.DateTimeUtils.microsToDays($c, java.time.ZoneOffset.UTC);" + // scalastyle:on line.size.limit case _ => (c, evPrim, evNull) => code"$evNull = true;" } @@ -1953,6 +1961,7 @@ object AnsiCast { case (StringType, DateType) => true case (TimestampType, DateType) => true +case (TimestampWithoutTZType, DateType) => true case (_: NumericType, _: NumericType) => true case (StringType, _: NumericType) => true diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala index a4e4257..51a7740 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala @@ -18,7 +18,7 @@ package org.apache.spark.sql.catalyst.expressions import java.sql.{Date, Timestamp} -import java.time.{DateTimeException, Duration, LocalDateTime, Period} +import java.time.{DateTimeException, Duration, LocalDate, LocalDateTime, Period} import java.time.temporal.ChronoUnit import java.util.{Calendar, TimeZone} @@ -1256,10 +1256,17 @@ abstract class AnsiCastSuiteBase extends CastSuiteBase { test("SPARK-35711: cast timestamp without time zone to timestamp with local time zone") { specialTs.foreach { s => - val dt = LocalDateTime.parse(s.replace(" ", "T")) + 
val dt = LocalDateTime.parse(s) checkEvaluation(cast(dt, TimestampType), DateTimeUtils.localDateTimeToMicros(dt)) } } + + test("SPARK-35716: cast timestamp without time zone to date type") { +specialTs.foreach { s => + val dt = LocalDateTime.parse(s) + checkEvaluation(cast(dt, DateType), LocalDate.parse(s.split("T")(0))) +} + } } /** - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
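Casting a timestamp-without-time-zone value to a date reduces to `microsToDays` with a fixed UTC zone, because the stored microseconds encode a local date-time as if it were UTC. A hedged Java sketch of that conversion (helper names here are illustrative, not Spark's API):

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

public class TimestampNtzToDate {
    static final long MICROS_PER_SECOND = 1_000_000L;

    // Mirrors the effect of microsToDays(t, ZoneOffset.UTC): interpret the
    // micros as a UTC instant and return days since the epoch (1970-01-01).
    static int microsToDays(long micros) {
        long secs = Math.floorDiv(micros, MICROS_PER_SECOND);
        int nanoAdjust = (int) ((micros - secs * MICROS_PER_SECOND) * 1_000L);
        LocalDateTime ldt = LocalDateTime.ofEpochSecond(secs, nanoAdjust, ZoneOffset.UTC);
        return (int) ldt.toLocalDate().toEpochDay();
    }

    // Inverse helper: encode a local date-time as timestamp-without-tz micros.
    static long localDateTimeToMicros(LocalDateTime ldt) {
        return Math.addExact(
            Math.multiplyExact(ldt.toEpochSecond(ZoneOffset.UTC), MICROS_PER_SECOND),
            ldt.getNano() / 1_000L);
    }
}
```

Because the zone is pinned to UTC, the resulting date never shifts with the session time zone; `floorDiv` keeps pre-1970 values on the correct day.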
[spark] branch master updated (0ba1d38 -> 6272222)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 0ba1d38 [SPARK-35701][SQL] Use copy-on-write semantics for SQLConf registered configurations
 add 6272222 [SPARK-35719][SQL] Support type conversion between timestamp and timestamp without time zone type

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/expressions/Cast.scala      | 27 ++
 .../spark/sql/catalyst/expressions/CastSuite.scala | 24 +++
 2 files changed, 43 insertions(+), 8 deletions(-)
[spark] branch master updated (69aa7ad -> 7978fdc)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 69aa7ad [SPARK-35714][CORE] Bug fix for deadlock during the executor shutdown
 add 7978fdc [SPARK-35736][SQL] Parse any day-time interval types in SQL

No new revisions were added by this update.

Summary of changes:
 docs/sql-ref-ansi-compliance.md                               |  2 ++
 .../antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4    |  9 -
 .../org/apache/spark/sql/catalyst/parser/AstBuilder.scala     | 11 +--
 .../test/scala/org/apache/spark/sql/types/DataTypeSuite.scala |  2 +-
 4 files changed, 20 insertions(+), 4 deletions(-)
[spark] branch master updated (7fcb127 -> 45b7f76)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 7fcb127 [SPARK-35670][BUILD] Upgrade ZSTD-JNI to 1.5.0-2
 add 45b7f76 [SPARK-35095][SS][TESTS] Use ANSI intervals in streaming join tests

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/streaming/StreamingJoinSuite.scala | 41 +++---
 1 file changed, 37 insertions(+), 4 deletions(-)
[spark] branch master updated: [SPARK-35726][SQL] Truncate java.time.Duration by fields of day-time interval type
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 2ebad72 [SPARK-35726][SQL] Truncate java.time.Duration by fields of day-time interval type 2ebad72 is described below commit 2ebad727587e25b8bf4a8439593e7402ea4e2827 Author: Angerszh AuthorDate: Sat Jun 19 13:51:21 2021 +0300 [SPARK-35726][SQL] Truncate java.time.Duration by fields of day-time interval type ### What changes were proposed in this pull request? Support truncate java.time.Duration by fields of day-time interval type. ### Why are the changes needed? To respect fields of the target day-time interval types. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Added UT Closes #32950 from AngersZh/SPARK-35726. Authored-by: Angerszh Signed-off-by: Max Gekk --- .../spark/sql/catalyst/CatalystTypeConverters.scala | 11 ++- .../spark/sql/catalyst/util/IntervalUtils.scala | 15 +-- .../org/apache/spark/sql/RandomDataGenerator.scala | 13 - .../sql/catalyst/CatalystTypeConvertersSuite.scala | 20 4 files changed, 51 insertions(+), 8 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystTypeConverters.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystTypeConverters.scala index 08a5fd5..1742524 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystTypeConverters.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystTypeConverters.scala @@ -32,6 +32,7 @@ import org.apache.spark.sql.catalyst.expressions._ import org.apache.spark.sql.catalyst.util._ import org.apache.spark.sql.internal.SQLConf import org.apache.spark.sql.types._ +import org.apache.spark.sql.types.DayTimeIntervalType._ import org.apache.spark.sql.types.YearMonthIntervalType._ import org.apache.spark.unsafe.types.UTF8String 
@@ -76,8 +77,7 @@ object CatalystTypeConverters { case LongType => LongConverter case FloatType => FloatConverter case DoubleType => DoubleConverter - // TODO(SPARK-35726): Truncate java.time.Duration by fields of day-time interval type - case _: DayTimeIntervalType => DurationConverter + case DayTimeIntervalType(_, endField) => DurationConverter(endField) case YearMonthIntervalType(_, endField) => PeriodConverter(endField) case dataType: DataType => IdentityConverter(dataType) } @@ -432,9 +432,10 @@ object CatalystTypeConverters { override def toScalaImpl(row: InternalRow, column: Int): Double = row.getDouble(column) } - private object DurationConverter extends CatalystTypeConverter[Duration, Duration, Any] { + private case class DurationConverter(endField: Byte) + extends CatalystTypeConverter[Duration, Duration, Any] { override def toCatalystImpl(scalaValue: Duration): Long = { - IntervalUtils.durationToMicros(scalaValue) + IntervalUtils.durationToMicros(scalaValue, endField) } override def toScala(catalystValue: Any): Duration = { if (catalystValue == null) null @@ -523,7 +524,7 @@ object CatalystTypeConverters { map, (key: Any) => convertToCatalyst(key), (value: Any) => convertToCatalyst(value)) -case d: Duration => DurationConverter.toCatalyst(d) +case d: Duration => DurationConverter(SECOND).toCatalyst(d) case p: Period => PeriodConverter(MONTH).toCatalyst(p) case other => other } diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala index e87ea51..3f56d2f 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala @@ -892,18 +892,29 @@ object IntervalUtils { * @throws ArithmeticException If numeric overflow occurs */ def durationToMicros(duration: Duration): Long = { +durationToMicros(duration, 
DayTimeIntervalType.SECOND) + } + + def durationToMicros(duration: Duration, endField: Byte): Long = { val seconds = duration.getSeconds -if (seconds == minDurationSeconds) { +val micros = if (seconds == minDurationSeconds) { val microsInSeconds = (minDurationSeconds + 1) * MICROS_PER_SECOND val nanoAdjustment = duration.getNano assert(0 <= nanoAdjustment && nanoAdjustment < NANOS_PER_SECOND, "Duration.getNano() must return the adjustment to the seconds field " + -"in the range from 0 to 9 nanoseconds, inclusive."
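The diff above adds an `endField` parameter to `durationToMicros` so that a `java.time.Duration` is truncated to the target day-time interval's end field. The truncation itself can be sketched outside of Spark as below — a Python sketch of the arithmetic, not Spark's actual Scala; the field codes (`DAY`..`SECOND`) follow `DayTimeIntervalType`, but the function name and exact rounding are assumptions inferred from the diff and the analogous `periodToMonths` change:

```python
# Microsecond conversion constants, as in Spark's DateTimeConstants.
MICROS_PER_SECOND = 1_000_000
MICROS_PER_MINUTE = 60 * MICROS_PER_SECOND
MICROS_PER_HOUR = 60 * MICROS_PER_MINUTE
MICROS_PER_DAY = 24 * MICROS_PER_HOUR

# End-field codes of DayTimeIntervalType: DAY=0, HOUR=1, MINUTE=2, SECOND=3.
DAY, HOUR, MINUTE, SECOND = 0, 1, 2, 3

def duration_to_micros(total_micros: int, end_field: int) -> int:
    """Drop precision finer than end_field; SECOND keeps microseconds,
    since the SECOND field of a day-time interval carries the fraction."""
    unit = {DAY: MICROS_PER_DAY, HOUR: MICROS_PER_HOUR,
            MINUTE: MICROS_PER_MINUTE, SECOND: 1}[end_field]
    return total_micros - total_micros % unit
```

For example, a duration of 1 day, 1 hour, 1 minute, 1.000001 seconds (90,061,000,001 µs) truncated with `end_field=DAY` keeps only the whole day, 86,400,000,000 µs.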
[spark] branch master updated: [SPARK-35819][SQL] Support Cast between different field YearMonthIntervalType
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 86bcd1f [SPARK-35819][SQL] Support Cast between different field YearMonthIntervalType 86bcd1f is described below commit 86bcd1fba09d9b5e4d36a48824354aaae769fa21 Author: Angerszh AuthorDate: Sat Jun 19 21:43:06 2021 +0300 [SPARK-35819][SQL] Support Cast between different field YearMonthIntervalType ### What changes were proposed in this pull request? Support Cast between different field YearMonthIntervalType ### Why are the changes needed? Make user convenient to get different field YearMonthIntervalType ### Does this PR introduce _any_ user-facing change? User can call cast YearMonthIntervalType(YEAR, MONTH) to YearMonthIntervalType(YEAR, YEAR) etc ### How was this patch tested? Added UT Closes #32974 from AngersZh/SPARK-35819. Authored-by: Angerszh Signed-off-by: Max Gekk --- .../org/apache/spark/sql/catalyst/expressions/Cast.scala | 12 .../apache/spark/sql/catalyst/expressions/CastSuite.scala| 11 +++ 2 files changed, 23 insertions(+) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala index 52801ec..cdf0753 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala @@ -82,6 +82,8 @@ object Cast { case (StringType, _: DayTimeIntervalType) => true case (StringType, _: YearMonthIntervalType) => true +case (_: YearMonthIntervalType, _: YearMonthIntervalType) => true + case (StringType, _: NumericType) => true case (BooleanType, _: NumericType) => true case (DateType, _: NumericType) => true @@ -580,6 +582,8 @@ abstract class CastBase extends UnaryExpression with TimeZoneAwareExpression 
wit it: YearMonthIntervalType): Any => Any = from match { case StringType => buildCast[UTF8String](_, s => IntervalUtils.castStringToYMInterval(s, it.startField, it.endField)) +case _: YearMonthIntervalType => buildCast[Int](_, s => + IntervalUtils.periodToMonths(IntervalUtils.monthsToPeriod(s), it.endField)) } // LongConverter @@ -1481,6 +1485,12 @@ abstract class CastBase extends UnaryExpression with TimeZoneAwareExpression wit code""" $evPrim = $util.castStringToYMInterval($c, (byte)${it.startField}, (byte)${it.endField}); """ +case _: YearMonthIntervalType => + val util = IntervalUtils.getClass.getCanonicalName.stripSuffix("$") + (c, evPrim, _) => +code""" + $evPrim = $util.periodToMonths($util.monthsToPeriod($c), (byte)${it.endField}); +""" } private[this] def decimalToTimestampCode(d: ExprValue): Block = { @@ -2051,6 +2061,8 @@ object AnsiCast { case (StringType, _: DayTimeIntervalType) => true case (StringType, _: YearMonthIntervalType) => true +case (_: YearMonthIntervalType, _: YearMonthIntervalType) => true + case (StringType, DateType) => true case (TimestampType, DateType) => true case (TimestampWithoutTZType, DateType) => true diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala index d114968..51c3681 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala @@ -30,6 +30,7 @@ import org.apache.spark.sql.catalyst.util.DateTimeTestUtils._ import org.apache.spark.sql.catalyst.util.DateTimeUtils._ import org.apache.spark.sql.internal.SQLConf import org.apache.spark.sql.types._ +import org.apache.spark.sql.types.YearMonthIntervalType._ import org.apache.spark.unsafe.types.UTF8String /** @@ -662,4 +663,14 @@ class CastSuite extends CastSuiteBase { checkEvaluation(cast(invalidInput, 
TimestampWithoutTZType), null) } } + + test("SPARK-35819: Support cast YearMonthIntervalType in different fields") { +val ym = cast(Literal.create("1-1"), YearMonthIntervalType(YEAR, MONTH)) +Seq(YearMonthIntervalType(YEAR, YEAR) -> 12, + YearMonthIntervalType(YEAR, MONTH) -> 13, + YearMonthIntervalType(MONTH, MONTH) -> 13) + .foreach { case (dt, value) => +checkEvaluation(cast(ym, dt), value) + } + } }
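The new cast goes through `periodToMonths(monthsToPeriod(s), it.endField)`: the stored month count is re-encoded for the target end field, dropping the months that don't fit whole years when the end field is `YEAR`. A minimal Python sketch of that value-level behavior (illustrative only; `cast_ym_months` is a hypothetical name, and the field codes are assumed from `YearMonthIntervalType`):

```python
MONTHS_PER_YEAR = 12
YEAR, MONTH = 0, 1  # assumed YearMonthIntervalType field codes

def cast_ym_months(months: int, end_field: int) -> int:
    """Re-encode a year-month interval's month count for a target end field."""
    if end_field != YEAR:
        return months
    # Truncate toward zero, as Scala's % does (Python's % rounds toward
    # negative infinity, so handle negative intervals explicitly).
    rem = abs(months) % MONTHS_PER_YEAR
    return months + rem if months < 0 else months - rem
```

This matches the expectations in the CastSuite test above: `INTERVAL '1-1'` (13 months) becomes 12 when cast to `YEAR TO YEAR` and stays 13 for `YEAR TO MONTH` or `MONTH TO MONTH`.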
[spark] branch master updated (6d30991 -> 4758dc7)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 6d30991 [SPARK-35303][SPARK-35498][PYTHON][FOLLOW-UP] Copy local properties when starting the thread, and use inheritable thread in the current codebase add 4758dc7 [SPARK-35771][SQL][FOLLOWUP] IntervalUtils.toYearMonthIntervalString should consider the case year-month type is casted as month type No new revisions were added by this update. Summary of changes: .../org/apache/spark/sql/catalyst/util/IntervalUtils.scala | 5 ++--- .../spark/sql/catalyst/util/IntervalUtilsSuite.scala | 14 +++--- 2 files changed, 9 insertions(+), 10 deletions(-)
[spark] branch master updated: [SPARK-35769][SQL] Truncate java.time.Period by fields of year-month interval type
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 071566c [SPARK-35769][SQL] Truncate java.time.Period by fields of year-month interval type 071566c is described below commit 071566caf3d1efc752006afaec974c6c4cfdc679 Author: Angerszh AuthorDate: Fri Jun 18 11:55:57 2021 +0300 [SPARK-35769][SQL] Truncate java.time.Period by fields of year-month interval type ### What changes were proposed in this pull request? Support truncate java.time.Period by fields of year-month interval type ### Why are the changes needed? To follow the SQL standard and respect the field restriction of the target year-month type. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Added UT Closes #32945 from AngersZh/SPARK-35769. Authored-by: Angerszh Signed-off-by: Max Gekk --- .../apache/spark/sql/catalyst/CatalystTypeConverters.scala| 11 ++- .../org/apache/spark/sql/catalyst/util/IntervalUtils.scala| 11 ++- .../test/scala/org/apache/spark/sql/RandomDataGenerator.scala | 7 +-- .../spark/sql/catalyst/CatalystTypeConvertersSuite.scala | 10 ++ 4 files changed, 31 insertions(+), 8 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystTypeConverters.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystTypeConverters.scala index 38790e0..08a5fd5 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystTypeConverters.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystTypeConverters.scala @@ -32,6 +32,7 @@ import org.apache.spark.sql.catalyst.expressions._ import org.apache.spark.sql.catalyst.util._ import org.apache.spark.sql.internal.SQLConf import org.apache.spark.sql.types._ +import org.apache.spark.sql.types.YearMonthIntervalType._ import 
org.apache.spark.unsafe.types.UTF8String /** @@ -77,8 +78,7 @@ object CatalystTypeConverters { case DoubleType => DoubleConverter // TODO(SPARK-35726): Truncate java.time.Duration by fields of day-time interval type case _: DayTimeIntervalType => DurationConverter - // TODO(SPARK-35769): Truncate java.time.Period by fields of year-month interval type - case _: YearMonthIntervalType => PeriodConverter + case YearMonthIntervalType(_, endField) => PeriodConverter(endField) case dataType: DataType => IdentityConverter(dataType) } converter.asInstanceOf[CatalystTypeConverter[Any, Any, Any]] @@ -444,9 +444,10 @@ object CatalystTypeConverters { IntervalUtils.microsToDuration(row.getLong(column)) } - private object PeriodConverter extends CatalystTypeConverter[Period, Period, Any] { + private case class PeriodConverter(endField: Byte) + extends CatalystTypeConverter[Period, Period, Any] { override def toCatalystImpl(scalaValue: Period): Int = { - IntervalUtils.periodToMonths(scalaValue) + IntervalUtils.periodToMonths(scalaValue, endField) } override def toScala(catalystValue: Any): Period = { if (catalystValue == null) null @@ -523,7 +524,7 @@ object CatalystTypeConverters { (key: Any) => convertToCatalyst(key), (value: Any) => convertToCatalyst(value)) case d: Duration => DurationConverter.toCatalyst(d) -case p: Period => PeriodConverter.toCatalyst(p) +case p: Period => PeriodConverter(MONTH).toCatalyst(p) case other => other } diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala index 9e67004..e87ea51 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala @@ -925,8 +925,17 @@ object IntervalUtils { * @throws ArithmeticException If numeric overflow occurs */ def periodToMonths(period: Period): Int = { 
+periodToMonths(period, YearMonthIntervalType.MONTH) + } + + def periodToMonths(period: Period, endField: Byte): Int = { val monthsInYears = Math.multiplyExact(period.getYears, MONTHS_PER_YEAR) -Math.addExact(monthsInYears, period.getMonths) +val months = Math.addExact(monthsInYears, period.getMonths) +if (endField == YearMonthIntervalType.YEAR) { + months - months % MONTHS_PER_YEAR +} else { + months +} } /** diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala index 603c88d..6201f12 100644 --- a/sql/catalys
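The `periodToMonths` change above is small enough to restate: the period's years and months are combined into a total month count, and when the target end field is `YEAR` the trailing months are dropped. A Python sketch mirroring the diff (the real implementation is `IntervalUtils.periodToMonths` in Scala, which also uses `Math.multiplyExact`/`addExact` for overflow checking — omitted here):

```python
MONTHS_PER_YEAR = 12
YEAR, MONTH = 0, 1  # assumed YearMonthIntervalType field codes

def period_to_months(years: int, months: int, end_field: int) -> int:
    """Total months of a year-month period, truncated to whole years
    when the end field is YEAR (mirrors the diff; no overflow checks)."""
    total = years * MONTHS_PER_YEAR + months
    if end_field == YEAR:
        return total - total % MONTHS_PER_YEAR
    return total
```

So a `Period` of 1 year 7 months converted with `end_field=YEAR` yields 12 months, while `end_field=MONTH` keeps all 19.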
[spark] branch master updated: [SPARK-35827][SQL] Show proper error message when update column types to year-month/day-time interval
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new af20474 [SPARK-35827][SQL] Show proper error message when update column types to year-month/day-time interval af20474 is described below commit af20474c67a61190829fa100a17ded58cb9a2102 Author: Kousuke Saruta AuthorDate: Sun Jun 20 23:39:46 2021 +0300 [SPARK-35827][SQL] Show proper error message when update column types to year-month/day-time interval ### What changes were proposed in this pull request? This PR fixes the error message shown when changing a column type to a year-month/day-time interval type is attempted. ### Why are the changes needed? It's for consistent behavior. Updating column types to interval types is prohibited for V2 source tables. So, if we attempt to update the type of a column to the conventional interval type, an error message like `Error in query: Cannot update field to interval type;` is shown. But, for year-month/day-time interval types, a different error message like `Error in query: Cannot update field : cannot be cast to interval year;` is shown. You can reproduce it with the following procedure. ``` $ bin/spark-sql spark-sql> SET spark.sql.catalog.mycatalog=; spark-sql> CREATE TABLE mycatalog.t1(c1 int) USING ; spark-sql> ALTER TABLE mycatalog.t1 ALTER COLUMN c1 TYPE interval year to month; ``` ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Modified an existing test. Closes #32978 from sarutak/err-msg-interval.
Authored-by: Kousuke Saruta Signed-off-by: Max Gekk --- .../org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala | 3 ++- .../scala/org/apache/spark/sql/connector/AlterTableTests.scala | 9 +++-- 2 files changed, 9 insertions(+), 3 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala index a351a10..28b0f1f 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala @@ -533,7 +533,8 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog { case u: UserDefinedType[_] => alter.failAnalysis(s"Cannot update ${table.name} field $fieldName type: " + s"update a UserDefinedType[${u.sql}] by updating its fields") - case _: CalendarIntervalType => + case _: CalendarIntervalType | _: YearMonthIntervalType | + _: DayTimeIntervalType => alter.failAnalysis(s"Cannot update ${table.name} field $fieldName to " + s"interval type") case _ => diff --git a/sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTableTests.scala b/sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTableTests.scala index afc51f4..165735f 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTableTests.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTableTests.scala @@ -403,8 +403,13 @@ trait AlterTableTests extends SharedSparkSession { val t = s"${catalogAndNamespace}table_name" withTable(t) { sql(s"CREATE TABLE $t (id int) USING $v2Format") - val e = intercept[AnalysisException](sql(s"ALTER TABLE $t ALTER COLUMN id TYPE interval")) - assert(e.getMessage.contains("id to interval type")) + (DataTypeTestUtils.dayTimeIntervalTypes ++ DataTypeTestUtils.yearMonthIntervalTypes) +.foreach { + case d: DataType => d.typeName +val e = intercept[AnalysisException]( + 
sql(s"ALTER TABLE $t ALTER COLUMN id TYPE ${d.typeName}")) +assert(e.getMessage.contains("id to interval type")) +} } }
[spark] branch master updated (682e7f2 -> 974d127)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 682e7f2 [SPARK-29375][SPARK-28940][SPARK-32041][SQL] Whole plan exchange and subquery reuse add 974d127 [SPARK-35545][FOLLOW-UP][TEST][SQL] Add a regression test for the SubqueryExpression refactor No new revisions were added by this update. Summary of changes: .../scala/org/apache/spark/sql/SQLQuerySuite.scala | 24 ++ 1 file changed, 24 insertions(+)
[spark] branch master updated (844f10c -> 2c91672)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 844f10c [SPARK-35391] Fix memory leak in ExecutorAllocationListener add 2c91672 [SPARK-35775][SQL][TESTS] Check all year-month interval types in aggregate expressions No new revisions were added by this update. Summary of changes: .../apache/spark/sql/DataFrameAggregateSuite.scala | 84 ++ 1 file changed, 56 insertions(+), 28 deletions(-)
[spark] branch master updated: [SPARK-35840][SQL] Add `apply()` for a single field to `YearMonthIntervalType` and `DayTimeIntervalType`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 37ef7bb [SPARK-35840][SQL] Add `apply()` for a single field to `YearMonthIntervalType` and `DayTimeIntervalType` 37ef7bb is described below commit 37ef7bb98cdb1a8eefa06677f119a4d97e242097 Author: Max Gekk AuthorDate: Mon Jun 21 14:15:33 2021 +0300 [SPARK-35840][SQL] Add `apply()` for a single field to `YearMonthIntervalType` and `DayTimeIntervalType` ### What changes were proposed in this pull request? In the PR, I propose to add 2 new methods that accept one field and produce either `YearMonthIntervalType` or `DayTimeIntervalType`. ### Why are the changes needed? To improve code maintenance. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? By existing test suites. Closes #32997 from MaxGekk/ansi-interval-types-single-field. 
Authored-by: Max Gekk Signed-off-by: Max Gekk --- .../src/main/scala/org/apache/spark/sql/types/DataType.scala | 12 ++-- .../org/apache/spark/sql/types/DayTimeIntervalType.scala | 1 + .../org/apache/spark/sql/types/YearMonthIntervalType.scala | 1 + .../spark/sql/catalyst/CatalystTypeConvertersSuite.scala | 12 ++-- .../apache/spark/sql/catalyst/expressions/CastSuite.scala| 4 ++-- .../scala/org/apache/spark/sql/types/DataTypeTestUtils.scala | 12 ++-- 6 files changed, 22 insertions(+), 20 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/types/DataType.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/types/DataType.scala index d781b05..22c9428 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/types/DataType.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/types/DataType.scala @@ -173,18 +173,18 @@ object DataType { private val otherTypes = { Seq(NullType, DateType, TimestampType, BinaryType, IntegerType, BooleanType, LongType, DoubleType, FloatType, ShortType, ByteType, StringType, CalendarIntervalType, - DayTimeIntervalType(DAY, DAY), + DayTimeIntervalType(DAY), DayTimeIntervalType(DAY, HOUR), DayTimeIntervalType(DAY, MINUTE), DayTimeIntervalType(DAY, SECOND), - DayTimeIntervalType(HOUR, HOUR), + DayTimeIntervalType(HOUR), DayTimeIntervalType(HOUR, MINUTE), DayTimeIntervalType(HOUR, SECOND), - DayTimeIntervalType(MINUTE, MINUTE), + DayTimeIntervalType(MINUTE), DayTimeIntervalType(MINUTE, SECOND), - DayTimeIntervalType(SECOND, SECOND), - YearMonthIntervalType(YEAR, YEAR), - YearMonthIntervalType(MONTH, MONTH), + DayTimeIntervalType(SECOND), + YearMonthIntervalType(YEAR), + YearMonthIntervalType(MONTH), YearMonthIntervalType(YEAR, MONTH), TimestampWithoutTZType) .map(t => t.typeName -> t).toMap diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/types/DayTimeIntervalType.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/types/DayTimeIntervalType.scala index c6d2537..99aa5f1 100644 --- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/types/DayTimeIntervalType.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/types/DayTimeIntervalType.scala @@ -101,6 +101,7 @@ case object DayTimeIntervalType extends AbstractDataType { val DEFAULT = DayTimeIntervalType(DAY, SECOND) def apply(): DayTimeIntervalType = DEFAULT + def apply(field: Byte): DayTimeIntervalType = DayTimeIntervalType(field, field) override private[sql] def defaultConcreteType: DataType = DEFAULT diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/types/YearMonthIntervalType.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/types/YearMonthIntervalType.scala index e6e2643..04902e3 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/types/YearMonthIntervalType.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/types/YearMonthIntervalType.scala @@ -95,6 +95,7 @@ case object YearMonthIntervalType extends AbstractDataType { val DEFAULT = YearMonthIntervalType(YEAR, MONTH) def apply(): YearMonthIntervalType = DEFAULT + def apply(field: Byte): YearMonthIntervalType = YearMonthIntervalType(field, field) override private[sql] def defaultConcreteType: DataType = DEFAULT diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/CatalystTypeConvertersSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/CatalystTypeConvertersSuite.scala index f08dc5f..f850a5a 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/CatalystTypeConvertersSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/CatalystTypeConvertersSuite.scala @@ -277,16 +277,16 @@
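The new single-field `apply()` is pure constructor sugar: `DayTimeIntervalType(DAY)` stands for `DayTimeIntervalType(DAY, DAY)`, and likewise for `YearMonthIntervalType`. A Python analogue of that factory (illustrative only; the class and method names here are hypothetical stand-ins for the Scala `apply` overloads):

```python
from dataclasses import dataclass

# Assumed DayTimeIntervalType field codes.
DAY, HOUR, MINUTE, SECOND = 0, 1, 2, 3

@dataclass(frozen=True)
class DayTimeIntervalTypeSketch:
    start_field: int
    end_field: int

    @classmethod
    def of(cls, field: int) -> "DayTimeIntervalTypeSketch":
        # Single-field sugar: of(f) == DayTimeIntervalTypeSketch(f, f),
        # mirroring the new apply(field: Byte) in the diff.
        return cls(field, field)
```

This is why the diff can shorten `DayTimeIntervalType(DAY, DAY)` to `DayTimeIntervalType(DAY)` throughout `DataType.otherTypes` without changing behavior.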
[spark] branch master updated (79e3d0d -> 7c1a9dd)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 79e3d0d [SPARK-35855][SQL] Unify reuse map data structures in non-AQE and AQE rules add 7c1a9dd [SPARK-35776][SQL][TESTS] Check all year-month interval types in arrow No new revisions were added by this update. Summary of changes: .../spark/sql/execution/arrow/ArrowWriterSuite.scala | 18 +- 1 file changed, 13 insertions(+), 5 deletions(-)
[spark] branch master updated (960a7e5 -> 4416b4b)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 960a7e5 [SPARK-35856][SQL][TESTS] Move new interval type test cases from CastSuite to CastBaseSuite add 4416b4b [SPARK-35734][SQL][FOLLOWUP] IntervalUtils.toDayTimeIntervalString should consider the case a day-time type is casted as another day-time type No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/util/IntervalUtils.scala| 148 +++-- .../sql/catalyst/util/IntervalUtilsSuite.scala | 58 +--- 2 files changed, 119 insertions(+), 87 deletions(-)
[spark] branch master updated: [SPARK-35772][SQL][TESTS] Check all year-month interval types in `HiveInspectors` tests
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new df55945 [SPARK-35772][SQL][TESTS] Check all year-month interval types in `HiveInspectors` tests df55945 is described below commit df55945804918f4d147dcef7a9b5f18bff4cabc9 Author: Angerszh AuthorDate: Wed Jun 23 08:54:07 2021 +0300 [SPARK-35772][SQL][TESTS] Check all year-month interval types in `HiveInspectors` tests ### What changes were proposed in this pull request? Check all year-month interval types in HiveInspectors tests. ### Why are the changes needed? To improve test coverage. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Added UT. Closes #32970 from AngersZh/SPARK-35772. Authored-by: Angerszh Signed-off-by: Max Gekk --- .../execution/HiveScriptTransformationSuite.scala | 46 -- 1 file changed, 35 insertions(+), 11 deletions(-) diff --git a/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveScriptTransformationSuite.scala b/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveScriptTransformationSuite.scala index 8cea781..d84a766 100644 --- a/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveScriptTransformationSuite.scala +++ b/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveScriptTransformationSuite.scala @@ -32,6 +32,7 @@ import org.apache.spark.sql.execution._ import org.apache.spark.sql.functions._ import org.apache.spark.sql.hive.test.TestHiveSingleton import org.apache.spark.sql.types._ +import org.apache.spark.sql.types.YearMonthIntervalType._ import org.apache.spark.unsafe.types.CalendarInterval class HiveScriptTransformationSuite extends BaseScriptTransformationSuite with TestHiveSingleton { @@ -521,22 +522,20 @@ class HiveScriptTransformationSuite extends BaseScriptTransformationSuite with T } - test("SPARK-34879: 
HiveInspectors supports DayTimeIntervalType and YearMonthIntervalType") { + test("SPARK-34879: HiveInspectors supports DayTimeIntervalType") { assume(TestUtils.testCommandAvailable("/bin/bash")) withTempView("v") { val df = Seq( (Duration.ofDays(1), Duration.ofSeconds(100).plusNanos(123456), - Duration.of(Long.MaxValue, ChronoUnit.MICROS), - Period.ofMonths(10)), + Duration.of(Long.MaxValue, ChronoUnit.MICROS)), (Duration.ofDays(1), Duration.ofSeconds(100).plusNanos(1123456789), - Duration.ofSeconds(Long.MaxValue / DateTimeConstants.MICROS_PER_SECOND), - Period.ofMonths(10)) - ).toDF("a", "b", "c", "d") + Duration.ofSeconds(Long.MaxValue / DateTimeConstants.MICROS_PER_SECOND)) + ).toDF("a", "b", "c") df.createTempView("v") - // Hive serde supports DayTimeIntervalType/YearMonthIntervalType as input and output data type + // Hive serde supports DayTimeIntervalType as input and output data type checkAnswer( df, (child: SparkPlan) => createScriptTransformationExec( @@ -545,12 +544,37 @@ class HiveScriptTransformationSuite extends BaseScriptTransformationSuite with T // TODO(SPARK-35733): Check all day-time interval types in HiveInspectors tests AttributeReference("a", DayTimeIntervalType())(), AttributeReference("b", DayTimeIntervalType())(), -AttributeReference("c", DayTimeIntervalType())(), -// TODO(SPARK-35772): Check all year-month interval types in HiveInspectors tests -AttributeReference("d", YearMonthIntervalType())()), +AttributeReference("c", DayTimeIntervalType())()), + child = child, + ioschema = hiveIOSchema), +df.select($"a", $"b", $"c").collect()) +} + } + + test("SPARK-35722: HiveInspectors supports all type of YearMonthIntervalType") { +assume(TestUtils.testCommandAvailable("/bin/bash")) +withTempView("v") { + val schema = StructType(Seq( +StructField("a", YearMonthIntervalType(YEAR)), +StructField("b", YearMonthIntervalType(YEAR, MONTH)), +StructField("c", YearMonthIntervalType(MONTH)) + )) + val df = spark.createDataFrame(sparkContext.parallelize(Seq( 
+Row(Period.ofMonths(13), Period.ofMonths(13), Period.ofMonths(13)) + )), schema) + + // Hive serde supports YearMonthIntervalType as input and output data type + checkAnswer( +df, +(child: SparkPlan) =>
[spark] branch master updated (7c1a9dd -> 758b423)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 7c1a9dd [SPARK-35776][SQL][TESTS] Check all year-month interval types in arrow add 758b423 [SPARK-35860][SQL] Support UpCast between different field of YearMonthIntervalType/DayTimeIntervalType No new revisions were added by this update. Summary of changes: .../org/apache/spark/sql/catalyst/expressions/Cast.scala | 3 +++ .../spark/sql/catalyst/expressions/CastSuiteBase.scala | 12 2 files changed, 15 insertions(+)
[spark] branch master updated (66d5a00 -> 077cf2a)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 66d5a00 [SPARK-35817][SQL] Restore performance of queries against wide Avro tables add 077cf2a [SPARK-35733][SQL][TESTS] Check all day-time interval types in `HiveInspectors` tests No new revisions were added by this update. Summary of changes: .../execution/HiveScriptTransformationSuite.scala | 53 +++--- 1 file changed, 37 insertions(+), 16 deletions(-)
[spark] branch master updated (2d3fa04 -> ad18722)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 2d3fa04 [SPARK-35729][SQL][TESTS] Check all day-time interval types in aggregate expressions add ad18722 [SPARK-35731][SQL][TESTS] Check all day-time interval types in arrow No new revisions were added by this update. Summary of changes: .../org/apache/spark/sql/execution/arrow/ArrowWriterSuite.scala | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (4761977 -> 2d3fa04)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 4761977 [SPARK-34889][SS] Introduce MergingSessionsIterator merging elements directly which belong to the same session add 2d3fa04 [SPARK-35729][SQL][TESTS] Check all day-time interval types in aggregate expressions No new revisions were added by this update. Summary of changes: .../apache/spark/sql/DataFrameAggregateSuite.scala | 285 - .../org/apache/spark/sql/test/SQLTestData.scala| 106 2 files changed, 327 insertions(+), 64 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (43cd6ca -> 5a510cf)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 43cd6ca [SPARK-35378][SQL][FOLLOWUP] isLocal should consider CommandResult add 5a510cf [SPARK-35726][SPARK-35769][SQL][FOLLOWUP] Call periodToMonths and durationToMicros in HiveResult should add endField No new revisions were added by this update. Summary of changes: .../scala/org/apache/spark/sql/execution/HiveResult.scala| 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (4a6d90e -> 1488ea9)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 4a6d90e [SPARK-35611][SS] Introduce the strategy on mismatched offset for start offset timestamp on Kafka data source add 1488ea9 [SPARK-35820][SQL] Support Cast between different field DayTimeIntervalType No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/expressions/Cast.scala | 10 ++ .../spark/sql/catalyst/expressions/CastSuite.scala | 21 + 2 files changed, 31 insertions(+) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (cfcfbca -> 345d3db)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from cfcfbca [SPARK-35476][PYTHON] Fix disallow_untyped_defs mypy checks for pyspark.pandas.series add 345d3db Revert "[SPARK-35778][SQL][TESTS] Check multiply/divide of year month interval of any fields by numeric" No new revisions were added by this update. Summary of changes: .../sql/catalyst/expressions/IntervalExpressionsSuite.scala| 10 -- 1 file changed, 4 insertions(+), 6 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-35871][SQL] Literal.create(value, dataType) should support fields
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new de35675 [SPARK-35871][SQL] Literal.create(value, dataType) should support fields de35675 is described below commit de35675c6191c05195a3c7ef4c11889469e9e192 Author: Angerszh AuthorDate: Thu Jun 24 17:36:48 2021 +0300 [SPARK-35871][SQL] Literal.create(value, dataType) should support fields ### What changes were proposed in this pull request? Currently, Literal.create(value, dataType) for Period to YearMonthIntervalType and Duration to DayTimeIntervalType is not correct: if the value is a Period/Duration, it creates a converter for the default YearMonthIntervalType/DayTimeIntervalType, so the result is not correct. This PR fixes that bug. ### Why are the changes needed? Fix a bug when using Literal.create(). ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Added UT Closes #33056 from AngersZh/SPARK-35871. 
Authored-by: Angerszh Signed-off-by: Max Gekk --- .../spark/sql/catalyst/expressions/literals.scala | 8 +++- .../expressions/LiteralExpressionSuite.scala | 24 ++ 2 files changed, 31 insertions(+), 1 deletion(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala index d31634c..94052a2 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala @@ -153,7 +153,13 @@ object Literal { def fromObject(obj: Any): Literal = new Literal(obj, ObjectType(obj.getClass)) def create(v: Any, dataType: DataType): Literal = { -Literal(CatalystTypeConverters.convertToCatalyst(v), dataType) +dataType match { + case _: YearMonthIntervalType if v.isInstanceOf[Period] => +Literal(CatalystTypeConverters.createToCatalystConverter(dataType)(v), dataType) + case _: DayTimeIntervalType if v.isInstanceOf[Duration] => +Literal(CatalystTypeConverters.createToCatalystConverter(dataType)(v), dataType) + case _ => Literal(CatalystTypeConverters.convertToCatalyst(v), dataType) +} } def create[T : TypeTag](v: T): Literal = Try { diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralExpressionSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralExpressionSuite.scala index 6410651..50b7263 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralExpressionSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralExpressionSuite.scala @@ -32,6 +32,8 @@ import org.apache.spark.sql.catalyst.util.DateTimeConstants._ import org.apache.spark.sql.catalyst.util.DateTimeUtils import org.apache.spark.sql.internal.SQLConf import org.apache.spark.sql.types._ +import org.apache.spark.sql.types.DayTimeIntervalType._ 
+import org.apache.spark.sql.types.YearMonthIntervalType._ import org.apache.spark.unsafe.types.CalendarInterval class LiteralExpressionSuite extends SparkFunSuite with ExpressionEvalHelper { @@ -432,4 +434,26 @@ class LiteralExpressionSuite extends SparkFunSuite with ExpressionEvalHelper { assert(literal.toString === expected) } } + + test("SPARK-35871: Literal.create(value, dataType) should support fields") { +val period = Period.ofMonths(13) +DataTypeTestUtils.yearMonthIntervalTypes.foreach { dt => + val result = dt.endField match { +case YEAR => 12 +case MONTH => 13 + } + checkEvaluation(Literal.create(period, dt), result) +} + +val duration = Duration.ofSeconds(86400 + 3600 + 60 + 1) +DataTypeTestUtils.dayTimeIntervalTypes.foreach { dt => + val result = dt.endField match { +case DAY => 86400000000L +case HOUR => 90000000000L +case MINUTE => 90060000000L +case SECOND => 90061000000L + } + checkEvaluation(Literal.create(duration, dt), result) +} + } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
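The expected values in the new test come from truncating the interval to the type's end field: 13 months becomes 12 when the end field is YEAR, and 90061 seconds becomes whole days, hours, or minutes. A minimal plain-Scala sketch of that truncation (illustrative only; Spark's real logic lives in `IntervalUtils.periodToMonths`/`durationToMicros`, and negative-value rounding is not handled here):

```scala
import java.time.{Duration, Period}

// Field ordinals as in Spark's interval types: YEAR=0, MONTH=1; DAY=0, HOUR=1, MINUTE=2, SECOND=3.
def periodToMonths(p: Period, endField: Int): Int = {
  val months = p.toTotalMonths.toInt
  // End field YEAR: drop the partial months below a whole year.
  if (endField == 0) months - months % 12 else months
}

def durationToMicros(d: Duration, endField: Int): Long = {
  val micros = d.getSeconds * 1000000L + d.getNano / 1000L
  val unit = endField match {
    case 0 => 86400L * 1000000L // DAY
    case 1 => 3600L * 1000000L  // HOUR
    case 2 => 60L * 1000000L    // MINUTE
    case 3 => 1L                // SECOND: microseconds kept as-is
  }
  micros - micros % unit // truncate everything below the end field
}
```

With these helpers, `durationToMicros(Duration.ofSeconds(86400 + 3600 + 60 + 1), 0)` yields 86400000000L, matching the DAY case in the test above.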
[spark] branch master updated (345d3db -> d40a1a2)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 345d3db Revert "[SPARK-35778][SQL][TESTS] Check multiply/divide of year month interval of any fields by numeric" add d40a1a2 Revert "[SPARK-35728][SQL][TESTS] Check multiply/divide of day-time intervals of any fields by numeric" No new revisions were added by this update. Summary of changes: .../sql/catalyst/expressions/IntervalExpressionsSuite.scala | 12 +--- 1 file changed, 5 insertions(+), 7 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-35736][SPARK-35774][SQL][FOLLOWUP] Prohibit to specify the same units for FROM and TO with unit-to-unit interval syntax
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 156b9b5 [SPARK-35736][SPARK-35774][SQL][FOLLOWUP] Prohibit to specify the same units for FROM and TO with unit-to-unit interval syntax 156b9b5 is described below commit 156b9b5d14d9f759ae2beef46f163a541878f53c Author: Kousuke Saruta AuthorDate: Thu Jun 24 23:13:31 2021 +0300 [SPARK-35736][SPARK-35774][SQL][FOLLOWUP] Prohibit to specify the same units for FROM and TO with unit-to-unit interval syntax ### What changes were proposed in this pull request? This PR changes the behavior of the unit-to-unit interval syntax to prohibit specifying the same unit for FROM and TO. ### Why are the changes needed? For ANSI compliance. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? New test. Closes #33057 from sarutak/prohibit-unit-pattern. 
Authored-by: Kousuke Saruta Signed-off-by: Max Gekk --- .../spark/sql/catalyst/parser/AstBuilder.scala | 30 ++ .../spark/sql/errors/QueryParsingErrors.scala | 2 +- .../apache/spark/sql/types/StructTypeSuite.scala | 24 + 3 files changed, 45 insertions(+), 11 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala index 6a373ab..f82b9be 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala @@ -2521,23 +2521,33 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with SQLConfHelper with Logg } override def visitYearMonthIntervalDataType(ctx: YearMonthIntervalDataTypeContext): DataType = { -val start = YearMonthIntervalType.stringToField(ctx.from.getText.toLowerCase(Locale.ROOT)) -val end = if (ctx.to != null) { - YearMonthIntervalType.stringToField(ctx.to.getText.toLowerCase(Locale.ROOT)) +val startStr = ctx.from.getText.toLowerCase(Locale.ROOT) +val start = YearMonthIntervalType.stringToField(startStr) +if (ctx.to != null) { + val endStr = ctx.to.getText.toLowerCase(Locale.ROOT) + val end = YearMonthIntervalType.stringToField(endStr) + if (end <= start) { +throw QueryParsingErrors.fromToIntervalUnsupportedError(startStr, endStr, ctx) + } + YearMonthIntervalType(start, end) } else { - start + YearMonthIntervalType(start) } -YearMonthIntervalType(start, end) } override def visitDayTimeIntervalDataType(ctx: DayTimeIntervalDataTypeContext): DataType = { -val start = DayTimeIntervalType.stringToField(ctx.from.getText.toLowerCase(Locale.ROOT)) -val end = if (ctx.to != null ) { - DayTimeIntervalType.stringToField(ctx.to.getText.toLowerCase(Locale.ROOT)) +val startStr = ctx.from.getText.toLowerCase(Locale.ROOT) +val start = DayTimeIntervalType.stringToField(startStr) +if (ctx.to != null ) { + val endStr = 
ctx.to.getText.toLowerCase(Locale.ROOT) + val end = DayTimeIntervalType.stringToField(endStr) + if (end <= start) { +throw QueryParsingErrors.fromToIntervalUnsupportedError(startStr, endStr, ctx) + } + DayTimeIntervalType(start, end) } else { - start + DayTimeIntervalType(start) } -DayTimeIntervalType(start, end) } /** diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala index 9002728..cab1c17 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala @@ -206,7 +206,7 @@ object QueryParsingErrors { } def fromToIntervalUnsupportedError( - from: String, to: String, ctx: UnitToUnitIntervalContext): Throwable = { + from: String, to: String, ctx: ParserRuleContext): Throwable = { new ParseException(s"Intervals FROM $from TO $to are not supported.", ctx) } diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/types/StructTypeSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/types/StructTypeSuite.scala index 820f326..18821b8 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/types/StructTypeSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/types/StructTypeSuite.scala @@ -18,8 +18,11 @@ package org.apache.spark.sql.types import org.apache.spark.SparkFunSuite +import org.apache.spark.sql.catalyst.parser.ParseE
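The guard added above rejects both reversed and identical unit pairs because it uses `end <= start`. A standalone sketch mirroring that check from `visitYearMonthIntervalDataType` (the field map and error string here are simplified assumptions, not Spark's parser API):

```scala
// Year-month field ordinals, as in YearMonthIntervalType: YEAR=0, MONTH=1.
val yearMonthFields = Map("year" -> 0, "month" -> 1)

// Mirrors the `if (end <= start) throw ...` guard in the AstBuilder change above.
def checkUnitToUnit(from: String, to: String): Either[String, (Int, Int)] = {
  val start = yearMonthFields(from)
  val end = yearMonthFields(to)
  if (end <= start) Left(s"Intervals FROM $from TO $to are not supported.")
  else Right((start, end))
}
```

So `YEAR TO MONTH` stays valid, while `YEAR TO YEAR` and `MONTH TO YEAR` now fail at parse time.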
[spark] branch master updated: [SPARK-35764][SQL] Assign pretty names to TimestampWithoutTZType
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 195090a [SPARK-35764][SQL] Assign pretty names to TimestampWithoutTZType 195090a is described below commit 195090afcc8ed138336b353edc0a4db6f0f5f168 Author: Gengliang Wang AuthorDate: Tue Jun 15 12:15:13 2021 +0300 [SPARK-35764][SQL] Assign pretty names to TimestampWithoutTZType ### What changes were proposed in this pull request? In the PR, I propose to override the typeName() method in TimestampWithoutTZType, and assign it a name according to the ANSI SQL standard ![image](https://user-images.githubusercontent.com/1097932/122013859-2cf50680-cdf1-11eb-9fcd-0ec1b59fb5c0.png) ### Why are the changes needed? To improve Spark SQL user experience, and have readable types in error messages. ### Does this PR introduce _any_ user-facing change? No, the new timestamp type is not released yet. ### How was this patch tested? Unit test Closes #32915 from gengliangwang/typename. 
Authored-by: Gengliang Wang Signed-off-by: Max Gekk --- .../org/apache/spark/sql/types/TimestampWithoutTZType.scala | 2 ++ .../apache/spark/sql/catalyst/expressions/CastSuite.scala | 13 + 2 files changed, 15 insertions(+) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/types/TimestampWithoutTZType.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/types/TimestampWithoutTZType.scala index 558f5ee..856d549 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/types/TimestampWithoutTZType.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/types/TimestampWithoutTZType.scala @@ -48,6 +48,8 @@ class TimestampWithoutTZType private() extends AtomicType { */ override def defaultSize: Int = 8 + override def typeName: String = "timestamp without time zone" + private[spark] override def asNullable: TimestampWithoutTZType = this } diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala index c268d52..910c757 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuite.scala @@ -1295,6 +1295,19 @@ abstract class AnsiCastSuiteBase extends CastSuiteBase { } } } + + test("disallow type conversions between Numeric types and Timestamp without time zone type") { +import DataTypeTestUtils.numericTypes +checkInvalidCastFromNumericType(TimestampWithoutTZType) +var errorMsg = "cannot cast bigint to timestamp without time zone" +verifyCastFailure(cast(Literal(0L), TimestampWithoutTZType), Some(errorMsg)) + +val timestampWithoutTZLiteral = Literal.create(LocalDateTime.now(), TimestampWithoutTZType) +errorMsg = "cannot cast timestamp without time zone to" +numericTypes.foreach { numericType => + verifyCastFailure(cast(timestampWithoutTZLiteral, numericType), Some(errorMsg)) +} + } } /** - To unsubscribe, 
e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (195090a -> b191d72)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 195090a [SPARK-35764][SQL] Assign pretty names to TimestampWithoutTZType add b191d72 [SPARK-35056][SQL] Group exception messages in execution/streaming No new revisions were added by this update. Summary of changes: .../spark/sql/errors/QueryExecutionErrors.scala| 95 +- .../streaming/CheckpointFileManager.scala | 16 ++-- .../sql/execution/streaming/FileStreamSink.scala | 21 + .../sql/execution/streaming/GroupStateImpl.scala | 15 ++-- .../sql/execution/streaming/HDFSMetadataLog.scala | 7 +- .../streaming/ManifestFileCommitProtocol.scala | 4 +- .../execution/streaming/MicroBatchExecution.scala | 4 +- .../execution/streaming/StreamingRelation.scala| 3 +- .../execution/streaming/statefulOperators.scala| 3 +- 9 files changed, 121 insertions(+), 47 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (c382d40 -> 8a02f3a)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from c382d40 [SPARK-35766][SQL][TESTS] Break down CastSuite/AnsiCastSuite into multiple files add 8a02f3a [SPARK-35129][SQL] Construct year-month interval column from integral fields No new revisions were added by this update. Summary of changes: .../sql/catalyst/analysis/FunctionRegistry.scala | 1 + .../catalyst/expressions/intervalExpressions.scala | 53 +++ .../expressions/IntervalExpressionsSuite.scala | 28 ++ .../sql-functions/sql-expression-schema.md | 3 +- .../test/resources/sql-tests/inputs/interval.sql | 9 .../sql-tests/results/ansi/interval.sql.out| 60 +- .../resources/sql-tests/results/interval.sql.out | 60 +- 7 files changed, 211 insertions(+), 3 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
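SPARK-35129 constructs a year-month interval column from integral fields; such a value is internally a single month count. A hedged sketch of the packing arithmetic (the helper name is hypothetical; the real Spark expression also handles null inputs and ANSI overflow semantics):

```scala
// Year-month intervals are stored as an Int number of months.
// addExact/multiplyExact raise ArithmeticException on Int overflow, as interval math should.
def makeYearMonthInterval(years: Int, months: Int): Int =
  Math.addExact(Math.multiplyExact(years, 12), months)
```

For example, 2 years and 3 months packs to 27 months, and an overflowing input raises an error instead of wrapping silently.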
[spark] branch master updated: [SPARK-35737][SQL] Parse day-time interval literals to tightest types
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 439e94c [SPARK-35737][SQL] Parse day-time interval literals to tightest types 439e94c is described below commit 439e94c1712366ff267183d3946f2507ebf3a98e Author: Kousuke Saruta AuthorDate: Mon Jun 14 10:06:19 2021 +0300 [SPARK-35737][SQL] Parse day-time interval literals to tightest types ### What changes were proposed in this pull request? This PR adds a feature which parses day-time interval literals to the tightest type. ### Why are the changes needed? To comply with the ANSI behavior. For example, `INTERVAL '10 20:30' DAY TO MINUTE` should be parsed as `DayTimeIntervalType(DAY, MINUTE)` but not as `DayTimeIntervalType(DAY, SECOND)`. ### Does this PR introduce _any_ user-facing change? No because `DayTimeIntervalType` will be introduced in `3.2.0`. ### How was this patch tested? New tests. Closes #32892 from sarutak/tight-daytime-interval. 
Authored-by: Kousuke Saruta Signed-off-by: Max Gekk --- .../org/apache/spark/sql/catalyst/parser/AstBuilder.scala | 9 +++-- .../resources/sql-tests/results/ansi/interval.sql.out | 12 ++-- .../src/test/resources/sql-tests/results/interval.sql.out | 12 ++-- .../test/scala/org/apache/spark/sql/SQLQuerySuite.scala | 15 +++ 4 files changed, 34 insertions(+), 14 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala index 9f1d665..4bbd9bd 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala @@ -2357,9 +2357,14 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with SQLConfHelper with Logg Literal(calendarInterval.months, YearMonthIntervalType) } else { assert(calendarInterval.months == 0) +val strToFieldIndex = DayTimeIntervalType.dayTimeFields.map(i => + DayTimeIntervalType.fieldToString(i) -> i).toMap +val fromUnit = + ctx.errorCapturingUnitToUnitInterval.body.from.getText.toLowerCase(Locale.ROOT) val micros = IntervalUtils.getDuration(calendarInterval, TimeUnit.MICROSECONDS) -// TODO(SPARK-35737): Parse day-time interval literals to tightest types -Literal(micros, DayTimeIntervalType()) +val start = strToFieldIndex(fromUnit) +val end = strToFieldIndex(toUnit) +Literal(micros, DayTimeIntervalType(start, end)) } } else { Literal(calendarInterval, CalendarIntervalType) diff --git a/sql/core/src/test/resources/sql-tests/results/ansi/interval.sql.out b/sql/core/src/test/resources/sql-tests/results/ansi/interval.sql.out index 3205259..f2f5d5c 100644 --- a/sql/core/src/test/resources/sql-tests/results/ansi/interval.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/ansi/interval.sql.out @@ -363,7 +363,7 @@ struct -- !query select interval '20 15' day to hour -- !query schema -struct +struct -- !query 
output 20 15:00:00.0 @@ -371,7 +371,7 @@ struct -- !query select interval '20 15:40' day to minute -- !query schema -struct +struct -- !query output 20 15:40:00.0 @@ -387,7 +387,7 @@ struct -- !query select interval '15:40' hour to minute -- !query schema -struct +struct -- !query output 0 15:40:00.0 @@ -395,7 +395,7 @@ struct -- !query select interval '15:40:32.9989' hour to second -- !query schema -struct +struct -- !query output 0 15:40:32.998999000 @@ -403,7 +403,7 @@ struct -- !query select interval '40:32.9989' minute to second -- !query schema -struct +struct -- !query output 0 00:40:32.998999000 @@ -411,7 +411,7 @@ struct -- !query select interval '40:32' minute to second -- !query schema -struct +struct -- !query output 0 00:40:32.0 diff --git a/sql/core/src/test/resources/sql-tests/results/interval.sql.out b/sql/core/src/test/resources/sql-tests/results/interval.sql.out index ef9ef8f..9b44960 100644 --- a/sql/core/src/test/resources/sql-tests/results/interval.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/interval.sql.out @@ -357,7 +357,7 @@ struct -- !query select interval '20 15' day to hour -- !query schema -struct +struct -- !query output 20 15:00:00.0 @@ -365,7 +365,7 @@ struct -- !query select interval '20 15:40' day to minute -- !query schema -struct +struct -- !query output 20 15:40:00.0 @@ -381,7 +381,7 @@ struct -- !query select interval '15:40' hour to minute -- !query schema -struct +struct -- !query output 0 15:40:00.000
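With tightest-type parsing, the literal's type fields come straight from the FROM/TO units rather than defaulting to DAY TO SECOND. A small sketch of the lookup used in the `AstBuilder` change above (ordinals mirror `DayTimeIntervalType`: DAY=0, HOUR=1, MINUTE=2, SECOND=3):

```scala
// Same mapping the parser builds via DayTimeIntervalType.dayTimeFields/fieldToString.
val strToFieldIndex = Map("day" -> 0, "hour" -> 1, "minute" -> 2, "second" -> 3)

// INTERVAL '20 15' DAY TO HOUR should now get fields (DAY, HOUR), not (DAY, SECOND).
def tightestFields(fromUnit: String, toUnit: String): (Int, Int) =
  (strToFieldIndex(fromUnit), strToFieldIndex(toUnit))
```

This is why the result schemas above change from a blanket day-to-second type to types such as day-to-hour and minute-to-second.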
[spark] branch master updated (0e554d4 -> 939ae91)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 0e554d4 [SPARK-35770][SQL] Parse YearMonthIntervalType from JSON add 939ae91 [SPARK-35130][SQL] Add make_dt_interval function to construct DayTimeIntervalType value No new revisions were added by this update. Summary of changes: .../sql/catalyst/analysis/FunctionRegistry.scala | 1 + .../catalyst/expressions/intervalExpressions.scala | 77 ++ .../spark/sql/catalyst/util/IntervalUtils.scala| 13 .../expressions/IntervalExpressionsSuite.scala | 45 + .../sql-functions/sql-expression-schema.md | 5 +- .../test/resources/sql-tests/inputs/interval.sql | 8 +++ .../sql-tests/results/ansi/interval.sql.out| 49 +- .../resources/sql-tests/results/interval.sql.out | 50 +- 8 files changed, 244 insertions(+), 4 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
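`make_dt_interval` builds a `DayTimeIntervalType` value, which Spark stores as a microsecond count. A simplified sketch of the arithmetic (an assumption-laden sketch: the real function also accepts fractional seconds as a decimal and reports overflow through ANSI error paths):

```scala
val MicrosPerSecond = 1000000L

// days -> hours -> minutes -> seconds, then to microseconds, with an overflow check.
def makeDayTimeIntervalMicros(days: Int, hours: Int, mins: Int, secs: Long): Long = {
  val totalSeconds = ((days * 24L + hours) * 60L + mins) * 60L + secs
  Math.multiplyExact(totalSeconds, MicrosPerSecond)
}
```

For instance, 1 day, 1 hour, 1 minute, and 1 second is 90061 seconds, i.e. 90061000000 microseconds.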
[spark] branch master updated (8aeed08 -> 0e554d4)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 8aeed08 [SPARK-35757][CORE] Add bitwise AND operation and functionality for intersecting bloom filters add 0e554d4 [SPARK-35770][SQL] Parse YearMonthIntervalType from JSON No new revisions were added by this update. Summary of changes: .../src/main/scala/org/apache/spark/sql/types/DataType.scala | 7 +-- .../src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala | 3 +++ 2 files changed, 8 insertions(+), 2 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-35732][SQL] Parse DayTimeIntervalType from JSON
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 234163f [SPARK-35732][SQL] Parse DayTimeIntervalType from JSON 234163f is described below commit 234163fbe08c8ac51b2ce52094c38baaa4e06709 Author: Angerszh AuthorDate: Thu Jun 17 12:54:34 2021 +0300 [SPARK-35732][SQL] Parse DayTimeIntervalType from JSON ### What changes were proposed in this pull request? Support Parse DayTimeIntervalType from JSON ### Why are the changes needed? this will allow to store day-second intervals as table columns into Hive external catalog. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Added UT Closes #32930 from AngersZh/SPARK-35732. Lead-authored-by: Angerszh Co-authored-by: AngersZh Signed-off-by: Max Gekk --- .../main/scala/org/apache/spark/sql/types/DataType.scala| 13 +++-- .../scala/org/apache/spark/sql/types/DataTypeSuite.scala| 2 +- 2 files changed, 12 insertions(+), 3 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/types/DataType.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/types/DataType.scala index 1c4ad88..d781b05 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/types/DataType.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/types/DataType.scala @@ -37,6 +37,7 @@ import org.apache.spark.sql.catalyst.util.StringUtils.StringConcat import org.apache.spark.sql.internal.SQLConf import org.apache.spark.sql.internal.SQLConf.StoreAssignmentPolicy import org.apache.spark.sql.internal.SQLConf.StoreAssignmentPolicy.{ANSI, STRICT} +import org.apache.spark.sql.types.DayTimeIntervalType._ import org.apache.spark.sql.types.YearMonthIntervalType._ import org.apache.spark.util.Utils @@ -172,8 +173,16 @@ object DataType { private val otherTypes = { Seq(NullType, DateType, TimestampType, BinaryType, 
IntegerType, BooleanType, LongType, DoubleType, FloatType, ShortType, ByteType, StringType, CalendarIntervalType, - // TODO(SPARK-35732): Parse DayTimeIntervalType from JSON - DayTimeIntervalType(), + DayTimeIntervalType(DAY, DAY), + DayTimeIntervalType(DAY, HOUR), + DayTimeIntervalType(DAY, MINUTE), + DayTimeIntervalType(DAY, SECOND), + DayTimeIntervalType(HOUR, HOUR), + DayTimeIntervalType(HOUR, MINUTE), + DayTimeIntervalType(HOUR, SECOND), + DayTimeIntervalType(MINUTE, MINUTE), + DayTimeIntervalType(MINUTE, SECOND), + DayTimeIntervalType(SECOND, SECOND), YearMonthIntervalType(YEAR, YEAR), YearMonthIntervalType(MONTH, MONTH), YearMonthIntervalType(YEAR, MONTH), diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala index d620415..3761833 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DataTypeSuite.scala @@ -256,7 +256,7 @@ class DataTypeSuite extends SparkFunSuite { checkDataTypeFromJson(VarcharType(10)) checkDataTypeFromDDL(VarcharType(11)) - + dayTimeIntervalTypes.foreach(checkDataTypeFromJson) yearMonthIntervalTypes.foreach(checkDataTypeFromJson) yearMonthIntervalTypes.foreach(checkDataTypeFromDDL) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
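Registering one entry per field combination lets `DataType.fromJson` resolve type-name strings for every day-time variant. A sketch of the naming scheme (the exact strings, such as `interval day to second`, are an assumption modeled on the ANSI-style names used elsewhere in this series):

```scala
val dayTimeFieldNames = Vector("day", "hour", "minute", "second")

// Single-field types render as "interval day"; ranges as "interval day to second".
def dayTimeTypeName(start: Int, end: Int): String =
  if (start == end) s"interval ${dayTimeFieldNames(start)}"
  else s"interval ${dayTimeFieldNames(start)} to ${dayTimeFieldNames(end)}"

// Map every valid (start, end) pair to its name for JSON lookup, as otherTypes does above.
val nameToFields: Map[String, (Int, Int)] =
  (for (s <- 0 to 3; e <- s to 3) yield dayTimeTypeName(s, e) -> (s, e)).toMap
```

That yields exactly the ten entries added to `otherTypes` in the diff above.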
[spark] branch master updated: [SPARK-35704][SQL] Add fields to `DayTimeIntervalType`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new d53831f [SPARK-35704][SQL] Add fields to `DayTimeIntervalType` d53831f is described below commit d53831ff5cef618cfbc08dcea84ca9bb61ed9903 Author: Max Gekk AuthorDate: Fri Jun 11 16:16:33 2021 +0300 [SPARK-35704][SQL] Add fields to `DayTimeIntervalType` ### What changes were proposed in this pull request? Extend DayTimeIntervalType to support interval fields. Valid interval field values: - 0 (DAY) - 1 (HOUR) - 2 (MINUTE) - 3 (SECOND) After the changes, the following day-time interval types are supported: 1. `DayTimeIntervalType(0, 0)` or `DayTimeIntervalType(DAY, DAY)` 2. `DayTimeIntervalType(0, 1)` or `DayTimeIntervalType(DAY, HOUR)` 3. `DayTimeIntervalType(0, 2)` or `DayTimeIntervalType(DAY, MINUTE)` 4. `DayTimeIntervalType(0, 3)` or `DayTimeIntervalType(DAY, SECOND)`. **It is the default one**. The second fraction precision is microseconds. 5. `DayTimeIntervalType(1, 1)` or `DayTimeIntervalType(HOUR, HOUR)` 6. `DayTimeIntervalType(1, 2)` or `DayTimeIntervalType(HOUR, MINUTE)` 7. `DayTimeIntervalType(1, 3)` or `DayTimeIntervalType(HOUR, SECOND)` 8. `DayTimeIntervalType(2, 2)` or `DayTimeIntervalType(MINUTE, MINUTE)` 9. `DayTimeIntervalType(2, 3)` or `DayTimeIntervalType(MINUTE, SECOND)` 10. `DayTimeIntervalType(3, 3)` or `DayTimeIntervalType(SECOND, SECOND)` ### Why are the changes needed? In the current implementation, Spark supports only `interval day to second` but the SQL standard allows to specify the start and end fields. The changes will allow to follow ANSI SQL standard more precisely. ### Does this PR introduce _any_ user-facing change? Yes but `DayTimeIntervalType` has not been released yet. ### How was this patch tested? By existing test suites. Closes #32849 from MaxGekk/day-time-interval-type-units. 
Authored-by: Max Gekk Signed-off-by: Max Gekk --- .../spark/sql/catalyst/expressions/UnsafeRow.java | 9 ++-- .../java/org/apache/spark/sql/types/DataTypes.java | 19 ++-- .../sql/catalyst/CatalystTypeConverters.scala | 3 +- .../apache/spark/sql/catalyst/InternalRow.scala| 4 +- .../spark/sql/catalyst/JavaTypeInference.scala | 2 +- .../spark/sql/catalyst/ScalaReflection.scala | 6 +-- .../spark/sql/catalyst/SerializerBuildHelper.scala | 2 +- .../spark/sql/catalyst/analysis/Analyzer.scala | 22 - .../apache/spark/sql/catalyst/dsl/package.scala| 7 ++- .../spark/sql/catalyst/encoders/RowEncoder.scala | 6 +-- .../spark/sql/catalyst/expressions/Cast.scala | 46 -- .../expressions/InterpretedUnsafeProjection.scala | 2 +- .../catalyst/expressions/SpecificInternalRow.scala | 3 +- .../catalyst/expressions/aggregate/Average.scala | 6 +-- .../sql/catalyst/expressions/aggregate/Sum.scala | 2 +- .../sql/catalyst/expressions/arithmetic.scala | 10 ++-- .../expressions/codegen/CodeGenerator.scala| 4 +- .../catalyst/expressions/codegen/javaCode.scala| 2 +- .../expressions/collectionOperations.scala | 8 ++-- .../catalyst/expressions/datetimeExpressions.scala | 14 +++--- .../spark/sql/catalyst/expressions/hash.scala | 2 +- .../catalyst/expressions/intervalExpressions.scala | 12 ++--- .../spark/sql/catalyst/expressions/literals.scala | 16 --- .../catalyst/expressions/windowExpressions.scala | 2 +- .../spark/sql/catalyst/parser/AstBuilder.scala | 6 ++- .../spark/sql/catalyst/util/IntervalUtils.scala| 13 +- .../apache/spark/sql/catalyst/util/TypeUtils.scala | 5 +- .../spark/sql/errors/QueryCompilationErrors.scala | 11 + .../org/apache/spark/sql/types/DataType.scala | 3 +- .../spark/sql/types/DayTimeIntervalType.scala | 54 ++ .../org/apache/spark/sql/util/ArrowUtils.scala | 4 +- .../org/apache/spark/sql/RandomDataGenerator.scala | 2 +- .../spark/sql/RandomDataGeneratorSuite.scala | 3 +- .../sql/catalyst/CatalystTypeConvertersSuite.scala | 3 +- .../sql/catalyst/encoders/RowEncoderSuite.scala| 
17 --- .../expressions/ArithmeticExpressionSuite.scala| 8 ++-- .../spark/sql/catalyst/expressions/CastSuite.scala | 39 +--- .../expressions/DateExpressionsSuite.scala | 12 +++-- .../expressions/HashExpressionsSuite.scala | 2 +- .../expressions/IntervalExpressionsSuite.scala | 32 - .../expressions/LiteralExpressionSuite.scala | 4 +- .../catalyst/expressions/LiteralGenerator.scala| 4 +- .../expressions/MutableProjectionSuite.scala | 9 ++-- .../sql/catalyst/parser/DataTypeParserSuite.scala | 2 +- .../sql/catalyst/util
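The commit message above enumerates ten valid `DayTimeIntervalType(start, end)` combinations out of the sixteen possible field pairs. As an illustration only (a Python sketch, not Spark's Scala implementation), the validity rule and resulting SQL type names can be modeled like this:

```python
# Sketch of the day-time interval fields added by SPARK-35704:
# field codes 0..3 stand for DAY, HOUR, MINUTE, SECOND, and a type is
# valid exactly when start <= end. Names below are illustrative, not Spark API.
DAY, HOUR, MINUTE, SECOND = 0, 1, 2, 3
FIELD_NAMES = {DAY: "DAY", HOUR: "HOUR", MINUTE: "MINUTE", SECOND: "SECOND"}

def day_time_interval_type_name(start: int, end: int) -> str:
    """Render the SQL name of a hypothetical DayTimeIntervalType(start, end)."""
    if start not in FIELD_NAMES or end not in FIELD_NAMES or start > end:
        raise ValueError(f"invalid day-time interval fields: {start}, {end}")
    if start == end:
        return f"interval {FIELD_NAMES[start].lower()}"
    return f"interval {FIELD_NAMES[start].lower()} to {FIELD_NAMES[end].lower()}"

# All valid (start, end) combinations -- the ten listed in the commit message:
valid = [(s, e) for s in range(4) for e in range(4) if s <= e]
```

Under this rule, the default `DayTimeIntervalType(0, 3)` renders as `interval day to second`.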
[spark] branch master updated (38eb5a6 -> 2c8ced9)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 38eb5a6 [SPARK-35354][SQL] Replace BaseJoinExec with ShuffledJoin in CoalesceBucketsInJoin add 2c8ced9 [SPARK-35111][SPARK-35112][SQL][FOLLOWUP] Rename ANSI interval patterns and regexps No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/util/IntervalUtils.scala| 31 +++--- 1 file changed, 16 insertions(+), 15 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (d808956 -> 7182f8c)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from d808956 [MINOR][INFRA] Add python/.idea into git ignore add 7182f8c [SPARK-35360][SQL] RepairTableCommand respects `spark.sql.addPartitionInBatch.size` too No new revisions were added by this update. Summary of changes: .../src/main/scala/org/apache/spark/sql/internal/SQLConf.scala | 5 +++-- .../src/main/scala/org/apache/spark/sql/execution/command/ddl.scala | 2 +- 2 files changed, 4 insertions(+), 3 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (af1dba7 -> e3c6907)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from af1dba7 [SPARK-35440][SQL] Add function type to `ExpressionInfo` for UDF add e3c6907 [SPARK-35490][BUILD] Update json4s to 3.7.0-M11 No new revisions were added by this update. Summary of changes: .../main/scala/org/apache/spark/deploy/FaultToleranceTest.scala | 1 - .../org/apache/spark/deploy/history/HistoryServerSuite.scala | 1 - dev/deps/spark-deps-hadoop-2.7-hive-2.3 | 8 dev/deps/spark-deps-hadoop-3.2-hive-2.3 | 8 pom.xml | 2 +- .../sql/streaming/StreamingQueryStatusAndProgressSuite.scala | 1 - 6 files changed, 9 insertions(+), 12 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-35581][SQL] Support special datetime values in typed literals only
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new a59063d [SPARK-35581][SQL] Support special datetime values in typed literals only a59063d is described below commit a59063d5446a78d713c19a0d43f86c9f72ffd77d Author: Max Gekk AuthorDate: Tue Jun 1 15:29:05 2021 +0300 [SPARK-35581][SQL] Support special datetime values in typed literals only ### What changes were proposed in this pull request? In the PR, I propose to support the special datetime values introduced by #25708 and by #25716 only in typed literals, and to stop recognizing them when parsing strings to dates/timestamps. The following string values are supported only in typed timestamp literals: - `epoch [zoneId]` - `1970-01-01 00:00:00+00 (Unix system time zero)` - `today [zoneId]` - midnight today. - `yesterday [zoneId]` - midnight yesterday - `tomorrow [zoneId]` - midnight tomorrow - `now` - current query start time. For example: ```sql spark-sql> SELECT timestamp 'tomorrow'; 2019-09-07 00:00:00 ``` Similarly, the following special date values are supported only in typed date literals: - `epoch [zoneId]` - `1970-01-01` - `today [zoneId]` - the current date in the time zone specified by `spark.sql.session.timeZone`. - `yesterday [zoneId]` - the current date -1 - `tomorrow [zoneId]` - the current date + 1 - `now` - the date of running the current query. It has the same notion as `today`. For example: ```sql spark-sql> SELECT date 'tomorrow' - date 'yesterday'; 2 ``` ### Why are the changes needed? In the current implementation, Spark supports the special date/timestamp values in any input string cast to dates/timestamps, which leads to the following problems: - If executors have different system time, the result is inconsistent, and random. Column values depend on where the conversions were performed. 
- The special values play the role of distributed non-deterministic functions though users might think of the values as constants. ### Does this PR introduce _any_ user-facing change? Yes but the probability should be small. ### How was this patch tested? By running existing test suites: ``` $ build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite -- -z interval.sql" $ build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite -- -z date.sql" $ build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite -- -z timestamp.sql" $ build/sbt "test:testOnly *DateTimeUtilsSuite" ``` Closes #32714 from MaxGekk/remove-datetime-special-values. Lead-authored-by: Max Gekk Co-authored-by: Maxim Gekk Signed-off-by: Max Gekk --- docs/sql-migration-guide.md| 2 ++ .../src/main/scala/org/apache/spark/sql/Row.scala | 2 +- .../spark/sql/catalyst/catalog/interface.scala | 4 +-- .../sql/catalyst/csv/UnivocityGenerator.scala | 1 - .../spark/sql/catalyst/csv/UnivocityParser.scala | 3 +- .../spark/sql/catalyst/expressions/Cast.scala | 20 + .../spark/sql/catalyst/expressions/literals.scala | 2 +- .../spark/sql/catalyst/json/JacksonGenerator.scala | 1 - .../spark/sql/catalyst/json/JacksonParser.scala| 3 +- .../spark/sql/catalyst/parser/AstBuilder.scala | 9 -- .../spark/sql/catalyst/util/DateFormatter.scala| 34 - .../spark/sql/catalyst/util/DateTimeUtils.scala| 35 +- .../sql/catalyst/util/TimestampFormatter.scala | 23 +++--- .../expressions/HashExpressionsSuite.scala | 2 +- .../sql/catalyst/util/DateFormatterSuite.scala | 34 ++--- .../sql/catalyst/util/DateTimeUtilsSuite.scala | 32 ++-- .../sql/catalyst/util/DatetimeFormatterSuite.scala | 2 +- .../catalyst/util/TimestampFormatterSuite.scala| 24 +-- .../spark/sql/catalyst/util/UnsafeArraySuite.scala | 6 ++-- .../execution/BaseScriptTransformationExec.scala | 11 +++ .../apache/spark/sql/execution/HiveResult.scala| 12 ++-- .../execution/datasources/PartitioningUtils.scala | 2 +- .../execution/datasources/jdbc/JDBCRelation.scala | 5 
++-- .../org/apache/spark/sql/jdbc/JdbcDialects.scala | 4 +-- .../sql-tests/inputs/postgreSQL/timestamp.sql | 16 +- .../sql-tests/results/postgreSQL/timestamp.sql.out | 16 +- .../org/apache/spark/sql/CsvFunctionsSuite.scala | 19 .../org/apache/spark/sql/JsonFunctionsSuite.scala | 19 .../parquet/ParquetPartitionDiscoverySuite.scala | 2 +- .../spark/sql/hive/HiveExternalCatalog.scala | 8 ++--- .../apache/spark/sql/hive/c
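The special date values listed in the commit message above all resolve relative to the query's start date. A minimal Python sketch of that resolution rule (a hypothetical illustration that ignores the optional `[zoneId]` suffix; the real logic lives in Spark's `DateTimeUtils`/`DateFormatter`):

```python
from datetime import date, timedelta

def resolve_special_date(value: str, today: date):
    """Resolve a special date string against a given 'today' (query start date).

    Returns None for non-special strings, which a cast would then try to
    parse with the ordinary date formats."""
    v = value.strip().lower()
    if v == "epoch":
        return date(1970, 1, 1)
    if v in ("today", "now"):          # 'now' has the same notion as 'today'
        return today
    if v == "yesterday":
        return today - timedelta(days=1)
    if v == "tomorrow":
        return today + timedelta(days=1)
    return None
```

This also makes the determinism concern concrete: the result depends entirely on which `today` the resolving host observes, which is why the values are now confined to typed literals resolved at query start.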
[spark] branch branch-3.2 updated: [SPARK-35982][SQL] Allow from_json/to_json for map types where value types are year-month intervals
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 26bcf02 [SPARK-35982][SQL] Allow from_json/to_json for map types where value types are year-month intervals 26bcf02 is described below commit 26bcf028339c02ca75af31ab8105f7dbe58c49a9 Author: Kousuke Saruta AuthorDate: Mon Jul 5 10:35:50 2021 +0300 [SPARK-35982][SQL] Allow from_json/to_json for map types where value types are year-month intervals ### What changes were proposed in this pull request? This PR fixes two issues. One is that `to_json` doesn't support `map` types where value types are `year-month` interval types like: ``` spark-sql> select to_json(map('a', interval '1-2' year to month)); 21/07/02 11:38:15 ERROR SparkSQLDriver: Failed in [select to_json(map('a', interval '1-2' year to month))] java.lang.RuntimeException: Failed to convert value 14 (class of class java.lang.Integer) with the type of YearMonthIntervalType(0,1) to JSON. ``` The other issue is that even if the issue of `to_json` is resolved, `from_json` doesn't support converting a `year-month` interval string back from JSON. So the result of the following query will be `null`. ``` spark-sql> select from_json(to_json(map('a', interval '1-2' year to month)), 'a interval year to month'); {"a":null} ``` ### Why are the changes needed? There should be no reason why year-month intervals cannot be used as map value types. `CalendarIntervalType` can do it. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? New tests. Closes #33181 from sarutak/map-json-yminterval. 
Authored-by: Kousuke Saruta Signed-off-by: Max Gekk (cherry picked from commit 647422685292cd1a46766afa9b07b6fcfc181bbd) Signed-off-by: Max Gekk --- .../spark/sql/catalyst/json/JacksonGenerator.scala | 9 .../spark/sql/catalyst/json/JacksonParser.scala| 7 ++ .../org/apache/spark/sql/JsonFunctionsSuite.scala | 26 ++ 3 files changed, 42 insertions(+) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonGenerator.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonGenerator.scala index 2567438..9777d56 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonGenerator.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonGenerator.scala @@ -147,6 +147,15 @@ private[sql] class JacksonGenerator( (row: SpecializedGetters, ordinal: Int) => gen.writeString(row.getInterval(ordinal).toString) +case YearMonthIntervalType(start, end) => + (row: SpecializedGetters, ordinal: Int) => +val ymString = IntervalUtils.toYearMonthIntervalString( + row.getInt(ordinal), + IntervalStringStyles.ANSI_STYLE, + start, + end) +gen.writeString(ymString) + case BinaryType => (row: SpecializedGetters, ordinal: Int) => gen.writeBinary(row.getBinary(ordinal)) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala index 27e1411..2aa735d 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala @@ -295,6 +295,13 @@ class JacksonParser( IntervalUtils.safeStringToInterval(UTF8String.fromString(parser.getText)) } +case ym: YearMonthIntervalType => (parser: JsonParser) => + parseJsonToken[Integer](parser, dataType) { +case VALUE_STRING => + val expr = Cast(Literal(parser.getText), ym) + Integer.valueOf(expr.eval(EmptyRow).asInstanceOf[Int]) + } + 
case st: StructType => val fieldConverters = st.map(_.dataType).map(makeConverter).toArray (parser: JsonParser) => parseJsonToken[InternalRow](parser, dataType) { diff --git a/sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala index 5485cc1..c2bea8c 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala @@ -18,6 +18,7 @@ package org.apache.spark.sql import java.text.SimpleDateFormat +import java.time.Period import java.util.Locale import collection.JavaConverters._ @@ -27,6 +28,7 @@ import org.apache.spark.sql.functions._ import org.apache.spark.sql.internal.SQLConf import org.ap
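The `14` in the error message above is the interval's internal representation: a `YearMonthIntervalType` value is stored as a number of months, and `IntervalUtils.toYearMonthIntervalString` renders it in the ANSI style used by the fix. A rough Python sketch of that formatting for the full `YEAR TO MONTH` case (an illustration under that assumption, not Spark's code, which also handles narrower start/end fields and a Hive style):

```python
def to_year_month_interval_string(months: int) -> str:
    """Render a month count as an ANSI YEAR TO MONTH interval string."""
    sign = "-" if months < 0 else ""
    m = abs(months)
    # 14 months -> 1 year, 2 months -> INTERVAL '1-2' YEAR TO MONTH
    return f"INTERVAL '{sign}{m // 12}-{m % 12}' YEAR TO MONTH"
```

This is exactly the string the patched `JacksonGenerator` writes instead of failing on the raw integer.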
[spark] branch master updated: [SPARK-35982][SQL] Allow from_json/to_json for map types where value types are year-month intervals
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 6474226 [SPARK-35982][SQL] Allow from_json/to_json for map types where value types are year-month intervals 6474226 is described below commit 647422685292cd1a46766afa9b07b6fcfc181bbd Author: Kousuke Saruta AuthorDate: Mon Jul 5 10:35:50 2021 +0300 [SPARK-35982][SQL] Allow from_json/to_json for map types where value types are year-month intervals ### What changes were proposed in this pull request? This PR fixes two issues. One is that `to_json` doesn't support `map` types where value types are `year-month` interval types like: ``` spark-sql> select to_json(map('a', interval '1-2' year to month)); 21/07/02 11:38:15 ERROR SparkSQLDriver: Failed in [select to_json(map('a', interval '1-2' year to month))] java.lang.RuntimeException: Failed to convert value 14 (class of class java.lang.Integer) with the type of YearMonthIntervalType(0,1) to JSON. ``` The other issue is that even if the issue of `to_json` is resolved, `from_json` doesn't support converting a `year-month` interval string back from JSON. So the result of the following query will be `null`. ``` spark-sql> select from_json(to_json(map('a', interval '1-2' year to month)), 'a interval year to month'); {"a":null} ``` ### Why are the changes needed? There should be no reason why year-month intervals cannot be used as map value types. `CalendarIntervalType` can do it. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? New tests. Closes #33181 from sarutak/map-json-yminterval. 
Authored-by: Kousuke Saruta Signed-off-by: Max Gekk --- .../spark/sql/catalyst/json/JacksonGenerator.scala | 9 .../spark/sql/catalyst/json/JacksonParser.scala| 7 ++ .../org/apache/spark/sql/JsonFunctionsSuite.scala | 26 ++ 3 files changed, 42 insertions(+) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonGenerator.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonGenerator.scala index 2567438..9777d56 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonGenerator.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonGenerator.scala @@ -147,6 +147,15 @@ private[sql] class JacksonGenerator( (row: SpecializedGetters, ordinal: Int) => gen.writeString(row.getInterval(ordinal).toString) +case YearMonthIntervalType(start, end) => + (row: SpecializedGetters, ordinal: Int) => +val ymString = IntervalUtils.toYearMonthIntervalString( + row.getInt(ordinal), + IntervalStringStyles.ANSI_STYLE, + start, + end) +gen.writeString(ymString) + case BinaryType => (row: SpecializedGetters, ordinal: Int) => gen.writeBinary(row.getBinary(ordinal)) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala index 27e1411..2aa735d 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala @@ -295,6 +295,13 @@ class JacksonParser( IntervalUtils.safeStringToInterval(UTF8String.fromString(parser.getText)) } +case ym: YearMonthIntervalType => (parser: JsonParser) => + parseJsonToken[Integer](parser, dataType) { +case VALUE_STRING => + val expr = Cast(Literal(parser.getText), ym) + Integer.valueOf(expr.eval(EmptyRow).asInstanceOf[Int]) + } + case st: StructType => val fieldConverters = st.map(_.dataType).map(makeConverter).toArray 
(parser: JsonParser) => parseJsonToken[InternalRow](parser, dataType) { diff --git a/sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala index 5485cc1..c2bea8c 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala @@ -18,6 +18,7 @@ package org.apache.spark.sql import java.text.SimpleDateFormat +import java.time.Period import java.util.Locale import collection.JavaConverters._ @@ -27,6 +28,7 @@ import org.apache.spark.sql.functions._ import org.apache.spark.sql.internal.SQLConf import org.apache.spark.sql.test.SharedSparkSession import org.apache.spark.sql.types._ +import org.apache.spark.sql.types.YearMonthIntervalType.{MO
[spark] branch master updated (2fe6c94 -> f4237af)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 2fe6c94 [SPARK-33996][BUILD][FOLLOW-UP] Match SBT's plugin checkstyle version to Maven's add f4237af [SPARK-35998][SQL] Make from_csv/to_csv to handle year-month intervals properly No new revisions were added by this update. Summary of changes: .../sql/catalyst/csv/UnivocityGenerator.scala | 7 +- .../spark/sql/catalyst/csv/UnivocityParser.scala | 7 +- .../org/apache/spark/sql/CsvFunctionsSuite.scala | 29 ++ 3 files changed, 41 insertions(+), 2 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.2 updated: [SPARK-35998][SQL] Make from_csv/to_csv to handle year-month intervals properly
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 544b7e1 [SPARK-35998][SQL] Make from_csv/to_csv to handle year-month intervals properly 544b7e1 is described below commit 544b7e16acf51b7c2a9555fb4ebe7b19a00e Author: Kousuke Saruta AuthorDate: Mon Jul 5 13:10:50 2021 +0300 [SPARK-35998][SQL] Make from_csv/to_csv to handle year-month intervals properly ### What changes were proposed in this pull request? This PR fixes an issue where `from_csv`/`to_csv` don't handle year-month intervals properly. `from_csv` throws an exception if year-month interval types are given. ``` spark-sql> select from_csv("interval '1-2' year to month", "a interval year to month"); 21/07/03 04:32:24 ERROR SparkSQLDriver: Failed in [select from_csv("interval '1-2' year to month", "a interval year to month")] java.lang.Exception: Unsupported type: interval year to month at org.apache.spark.sql.errors.QueryExecutionErrors$.unsupportedTypeError(QueryExecutionErrors.scala:775) at org.apache.spark.sql.catalyst.csv.UnivocityParser.makeConverter(UnivocityParser.scala:224) at org.apache.spark.sql.catalyst.csv.UnivocityParser.$anonfun$valueConverters$1(UnivocityParser.scala:134) ``` Also, `to_csv` doesn't handle year-month interval types properly even though no exception is thrown. The result of `to_csv` for year-month interval types is not in the ANSI-compliant interval form. ``` spark-sql> select to_csv(named_struct("a", interval '1-2' year to month)); 14 ``` The result above should be `INTERVAL '1-2' YEAR TO MONTH`. ### Why are the changes needed? Bug fix. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? New tests. Closes #33210 from sarutak/csv-yminterval. 
Authored-by: Kousuke Saruta Signed-off-by: Max Gekk (cherry picked from commit f4237aff7ebece0b8d61e680ecbe759850f25af8) Signed-off-by: Max Gekk --- .../sql/catalyst/csv/UnivocityGenerator.scala | 7 +- .../spark/sql/catalyst/csv/UnivocityParser.scala | 7 +- .../org/apache/spark/sql/CsvFunctionsSuite.scala | 29 ++ 3 files changed, 41 insertions(+), 2 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityGenerator.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityGenerator.scala index 11b31ce..5d70ccb 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityGenerator.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityGenerator.scala @@ -22,7 +22,7 @@ import java.io.Writer import com.univocity.parsers.csv.CsvWriter import org.apache.spark.sql.catalyst.InternalRow -import org.apache.spark.sql.catalyst.util.{DateFormatter, TimestampFormatter} +import org.apache.spark.sql.catalyst.util.{DateFormatter, IntervalStringStyles, IntervalUtils, TimestampFormatter} import org.apache.spark.sql.catalyst.util.LegacyDateFormats.FAST_DATE_FORMAT import org.apache.spark.sql.types._ @@ -61,6 +61,11 @@ class UnivocityGenerator( case TimestampType => (row: InternalRow, ordinal: Int) => timestampFormatter.format(row.getLong(ordinal)) +case YearMonthIntervalType(start, end) => + (row: InternalRow, ordinal: Int) => +IntervalUtils.toYearMonthIntervalString( + row.getInt(ordinal), IntervalStringStyles.ANSI_STYLE, start, end) + case udt: UserDefinedType[_] => makeConverter(udt.sqlType) case dt: DataType => diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala index 672d133..3ec1ea0 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala +++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala @@ -26,7 +26,7 @@ import com.univocity.parsers.csv.CsvParser import org.apache.spark.SparkUpgradeException import org.apache.spark.internal.Logging import org.apache.spark.sql.catalyst.{InternalRow, NoopFilters, OrderedFilters} -import org.apache.spark.sql.catalyst.expressions.{ExprUtils, GenericInternalRow} +import org.apache.spark.sql.catalyst.expressions.{Cast, EmptyRow, ExprUtils, GenericInternalRow, Literal} import org.apache.spark.sql.catalyst.util._ import org.apache.spark.sql.catalyst.util.LegacyDateFormats.FAST_DATE_FORMAT import org.apache.spark.sql.errors.QueryExecutionErrors @@ -217,6 +217,11 @@ class UnivocityParser( IntervalUtils.
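For the parsing direction, the diff above shows Spark reusing `Cast(Literal(parser.getText), ...)` to turn an ANSI year-month string back into its month count. A hedged Python approximation of what such a parse accepts, for both the full `INTERVAL '1-2' YEAR TO MONTH` form and a bare `1-2` (the regex and its leniency are assumptions for illustration, not Spark's cast rules):

```python
import re

def parse_year_month_interval(text: str) -> int:
    """Parse an ANSI year-month interval string into a month count."""
    m = re.fullmatch(
        r"(?i)\s*(?:interval\s+')?([+-]?)(\d+)-(\d+)'?(?:\s+year\s+to\s+month)?\s*",
        text)
    if m is None:
        raise ValueError(f"cannot parse year-month interval: {text!r}")
    sign = -1 if m.group(1) == "-" else 1
    return sign * (int(m.group(2)) * 12 + int(m.group(3)))
```

Round-tripping through such a parser is what makes `from_csv(to_csv(...))` on a year-month interval return the original value rather than `null`.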
[spark] branch branch-3.2 updated: [SPARK-36021][SQL] Parse interval literals should support more than 2 digits
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 74bfbcd [SPARK-36021][SQL] Parse interval literals should support more than 2 digits 74bfbcd is described below commit 74bfbcd6430b1a5d274c660224851ce1562813e4 Author: Angerszh AuthorDate: Wed Jul 7 20:31:29 2021 +0300 [SPARK-36021][SQL] Parse interval literals should support more than 2 digits ### What changes were proposed in this pull request? For the case ``` spark-sql> select interval '123456:12' minute to second; Error in query: requirement failed: Interval string must match day-time format of '^(?<sign>[+|-])?(?<minute>\d{1,2}):(?<second>(\d{1,2})(\.(\d{1,9}))?)$': 123456:12, set spark.sql.legacy.fromDayTimeString.enabled to true to restore the behavior before Spark 3.0.(line 1, pos 16) == SQL == select interval '123456:12' minute to second ^^^ ``` we should support more than 2 digits for the hour/minute/second fields when parsing interval literal strings. ### Why are the changes needed? Keep consistency. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Added UT Closes #33231 from AngersZh/SPARK-36021. 
Authored-by: Angerszh Signed-off-by: Max Gekk (cherry picked from commit eaa200e586043e29e2f3566b95b6943b811f) Signed-off-by: Max Gekk --- .../spark/sql/catalyst/util/IntervalUtils.scala| 144 +-- .../sql/catalyst/util/IntervalUtilsSuite.scala | 25 +-- .../test/resources/sql-tests/inputs/interval.sql | 21 +++ .../sql-tests/results/ansi/interval.sql.out| 198 - .../resources/sql-tests/results/interval.sql.out | 198 - .../sql-tests/results/postgreSQL/interval.sql.out | 20 +-- 6 files changed, 467 insertions(+), 139 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala index ad87f2a..24bcad8 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala @@ -57,6 +57,12 @@ object IntervalUtils { } import IntervalUnit._ + private val MAX_DAY = Long.MaxValue / MICROS_PER_DAY + private val MAX_HOUR = Long.MaxValue / MICROS_PER_HOUR + private val MAX_MINUTE = Long.MaxValue / MICROS_PER_MINUTE + private val MAX_SECOND = Long.MaxValue / MICROS_PER_SECOND + private val MIN_SECOND = Long.MinValue / MICROS_PER_SECOND + def getYears(months: Int): Int = months / MONTHS_PER_YEAR def getYears(interval: CalendarInterval): Int = getYears(interval.months) @@ -213,19 +219,25 @@ object IntervalUtils { } /** - * Parse YearMonth string in form: [+|-]-MM + * Parse year-month interval in form: [+|-]-MM * * adapted from HiveIntervalYearMonth.valueOf */ def fromYearMonthString(input: String): CalendarInterval = { +fromYearMonthString(input, YM.YEAR, YM.MONTH) + } + + /** + * Parse year-month interval in form: [+|-]-MM + * + * adapted from HiveIntervalYearMonth.valueOf + * Below interval conversion patterns are supported: + * - YEAR TO (YEAR|MONTH) + */ + def fromYearMonthString(input: String, startField: Byte, endField: Byte): 
CalendarInterval = { require(input != null, "Interval year-month string must be not null") -input.trim match { - case yearMonthRegex(sign, yearStr, monthStr) => -new CalendarInterval(toYMInterval(yearStr, monthStr, finalSign(sign)), 0, 0) - case _ => -throw new IllegalArgumentException( - s"Interval string does not match year-month format of 'y-m': $input") -} +val months = castStringToYMInterval(UTF8String.fromString(input), startField, endField) +new CalendarInterval(months, 0, 0) } private def safeToInterval[T](interval: String)(f: => T): T = { @@ -397,7 +409,7 @@ object IntervalUtils { secondStr: String, sign: Int): Long = { var micros = 0L -val days = toLongWithRange(DAY, dayStr, 0, Int.MaxValue).toInt +val days = toLongWithRange(DAY, dayStr, 0, MAX_DAY).toInt micros = Math.addExact(micros, sign * days * MICROS_PER_DAY) val hours = toLongWithRange(HOUR, hourStr, 0, 23) micros = Math.addExact(micros, sign * hours * MICROS_PER_HOUR) @@ -413,7 +425,7 @@ object IntervalUtils { secondStr: String, sign: Int): Long = { var micros = 0L -val hours = toLongWithRange(HOUR, hourStr, 0, 2562047788L) +val hours = toLongWithRange(HOUR, hourStr, 0, MAX_HOUR) micros = Math.addExact(micros, sign * h
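The new `MAX_DAY`/`MAX_HOUR`/`MAX_MINUTE` bounds in the diff above are simply `Long.MaxValue` divided by the number of microseconds in the corresponding unit, replacing hard-coded literals such as the old `2562047788L` for hours. A quick Python check of those Scala constants:

```python
# Reproduce the Scala bounds: Long.MaxValue integer-divided by micros per unit.
LONG_MAX = (1 << 63) - 1            # java Long.MaxValue = 9223372036854775807
MICROS_PER_SECOND = 1_000_000
MICROS_PER_MINUTE = 60 * MICROS_PER_SECOND
MICROS_PER_HOUR = 60 * MICROS_PER_MINUTE
MICROS_PER_DAY = 24 * MICROS_PER_HOUR

MAX_DAY = LONG_MAX // MICROS_PER_DAY       # largest day count fitting in micros
MAX_HOUR = LONG_MAX // MICROS_PER_HOUR     # matches the old 2562047788L literal
MAX_MINUTE = LONG_MAX // MICROS_PER_MINUTE
MAX_SECOND = LONG_MAX // MICROS_PER_SECOND
```

These are the largest field values whose microsecond equivalents still fit in a signed 64-bit long, which is why `toLongWithRange` now uses them as upper bounds.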
[spark] branch master updated (62ff2ad -> ea3333a)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 62ff2ad [SPARK-36015][SQL] Support TimestampNTZType in the Window spec definition add ea3333a [SPARK-36021][SQL] Parse interval literals should support more than 2 digits No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/util/IntervalUtils.scala| 144 +-- .../sql/catalyst/util/IntervalUtilsSuite.scala | 25 +-- .../test/resources/sql-tests/inputs/interval.sql | 21 +++ .../sql-tests/results/ansi/interval.sql.out| 198 - .../resources/sql-tests/results/interval.sql.out | 198 - .../sql-tests/results/postgreSQL/interval.sql.out | 20 +-- 6 files changed, 467 insertions(+), 139 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-36015][SQL] Support TimestampNTZType in the Window spec definition
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 62ff2ad [SPARK-36015][SQL] Support TimestampNTZType in the Window spec definition 62ff2ad is described below commit 62ff2add9444fbd54802548b3daf7cde5820feef Author: gengjiaan AuthorDate: Wed Jul 7 20:27:05 2021 +0300 [SPARK-36015][SQL] Support TimestampNTZType in the Window spec definition ### What changes were proposed in this pull request? The method `WindowSpecDefinition.isValidFrameType` doesn't consider `TimestampNTZType`. We should support it just as we do `TimestampType`. ### Why are the changes needed? Support `TimestampNTZType` in the Window spec definition. ### Does this PR introduce _any_ user-facing change? 'Yes'. This PR allows users to use `TimestampNTZType` as the sort spec in a window spec definition. ### How was this patch tested? New tests. Closes #33246 from beliefer/SPARK-36015. 
Authored-by: gengjiaan Signed-off-by: Max Gekk --- .../catalyst/expressions/windowExpressions.scala | 6 +-- .../sql/execution/window/WindowExecBase.scala | 10 ++-- .../src/test/resources/sql-tests/inputs/window.sql | 9 .../resources/sql-tests/results/window.sql.out | 56 +- 4 files changed, 73 insertions(+), 8 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala index 2555c6a..fc2e449 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala @@ -102,9 +102,9 @@ case class WindowSpecDefinition( private def isValidFrameType(ft: DataType): Boolean = (orderSpec.head.dataType, ft) match { case (DateType, IntegerType) => true case (DateType, _: YearMonthIntervalType) => true -case (TimestampType, CalendarIntervalType) => true -case (TimestampType, _: YearMonthIntervalType) => true -case (TimestampType, _: DayTimeIntervalType) => true +case (TimestampType | TimestampNTZType, CalendarIntervalType) => true +case (TimestampType | TimestampNTZType, _: YearMonthIntervalType) => true +case (TimestampType | TimestampNTZType, _: DayTimeIntervalType) => true case (a, b) => a == b } } diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExecBase.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExecBase.scala index 2aa0b02..f3b3b34 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExecBase.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExecBase.scala @@ -24,7 +24,7 @@ import org.apache.spark.sql.catalyst.InternalRow import org.apache.spark.sql.catalyst.expressions._ import org.apache.spark.sql.catalyst.expressions.aggregate.AggregateExpression import 
org.apache.spark.sql.execution.UnaryExecNode -import org.apache.spark.sql.types.{CalendarIntervalType, DateType, DayTimeIntervalType, IntegerType, TimestampType, YearMonthIntervalType} +import org.apache.spark.sql.types._ trait WindowExecBase extends UnaryExecNode { def windowExpression: Seq[NamedExpression] @@ -96,10 +96,12 @@ trait WindowExecBase extends UnaryExecNode { val boundExpr = (expr.dataType, boundOffset.dataType) match { case (DateType, IntegerType) => DateAdd(expr, boundOffset) case (DateType, _: YearMonthIntervalType) => DateAddYMInterval(expr, boundOffset) - case (TimestampType, CalendarIntervalType) => TimeAdd(expr, boundOffset, Some(timeZone)) - case (TimestampType, _: YearMonthIntervalType) => + case (TimestampType | TimestampNTZType, CalendarIntervalType) => +TimeAdd(expr, boundOffset, Some(timeZone)) + case (TimestampType | TimestampNTZType, _: YearMonthIntervalType) => TimestampAddYMInterval(expr, boundOffset, Some(timeZone)) - case (TimestampType, _: DayTimeIntervalType) => TimeAdd(expr, boundOffset, Some(timeZone)) + case (TimestampType | TimestampNTZType, _: DayTimeIntervalType) => +TimeAdd(expr, boundOffset, Some(timeZone)) case (a, b) if a == b => Add(expr, boundOffset) } val bound = MutableProjection.create(boundExpr :: Nil, child.output) diff --git a/sql/core/src/test/resources/sql-tests/inputs/window.sql b/sql/core/src/test/resources/sql-tests/inputs/window.sql index 46d3629..9766aaf 100644 --- a/sql/core/src/test/resources/sql-tests/inputs/window.sql +++ b/sql/core/src/test/resources/sql-tes
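The `isValidFrameType` change in the diff above simply lets `TimestampNTZType` accept the same frame-boundary types as `TimestampType`, with everything else falling through to exact type equality. A Python sketch of the resulting rule (type names are illustrative strings, not Spark's `DataType` objects):

```python
# Mirror of the Scala pattern match in WindowSpecDefinition.isValidFrameType:
# date order columns accept integer or year-month interval boundaries;
# timestamp and timestamp_ntz accept calendar/year-month/day-time intervals;
# anything else is valid only when order and frame types are equal.
def is_valid_frame_type(order_type: str, frame_type: str) -> bool:
    if order_type == "date" and frame_type in ("integer", "year-month interval"):
        return True
    if order_type in ("timestamp", "timestamp_ntz") and frame_type in (
            "calendar interval", "year-month interval", "day-time interval"):
        return True
    return order_type == frame_type   # the (a, b) => a == b fallthrough
```

Note the fallthrough keeps degenerate cases like a date-ordered frame with date boundaries valid, matching the Scala `case (a, b) => a == b`.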
[spark] branch branch-3.2 updated: [SPARK-36015][SQL] Support TimestampNTZType in the Window spec definition
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 2fc57bba [SPARK-36015][SQL] Support TimestampNTZType in the Window spec definition 2fc57bba is described below commit 2fc57bba31a9bea3b97582751a693acd268158e9 Author: gengjiaan AuthorDate: Wed Jul 7 20:27:05 2021 +0300 [SPARK-36015][SQL] Support TimestampNTZType in the Window spec definition ### What changes were proposed in this pull request? The method `WindowSpecDefinition.isValidFrameType` doesn't consider `TimestampNTZType`. We should support it in the same way as `TimestampType`. ### Why are the changes needed? Support `TimestampNTZType` in the Window spec definition. ### Does this PR introduce _any_ user-facing change? 'Yes'. This PR allows users to use `TimestampNTZType` as the sort spec in a window spec definition. ### How was this patch tested? New tests. Closes #33246 from beliefer/SPARK-36015.
Authored-by: gengjiaan Signed-off-by: Max Gekk (cherry picked from commit 62ff2add9444fbd54802548b3daf7cde5820feef) Signed-off-by: Max Gekk --- .../catalyst/expressions/windowExpressions.scala | 6 +-- .../sql/execution/window/WindowExecBase.scala | 10 ++-- .../src/test/resources/sql-tests/inputs/window.sql | 9 .../resources/sql-tests/results/window.sql.out | 56 +- 4 files changed, 73 insertions(+), 8 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala index 2555c6a..fc2e449 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala @@ -102,9 +102,9 @@ case class WindowSpecDefinition( private def isValidFrameType(ft: DataType): Boolean = (orderSpec.head.dataType, ft) match { case (DateType, IntegerType) => true case (DateType, _: YearMonthIntervalType) => true -case (TimestampType, CalendarIntervalType) => true -case (TimestampType, _: YearMonthIntervalType) => true -case (TimestampType, _: DayTimeIntervalType) => true +case (TimestampType | TimestampNTZType, CalendarIntervalType) => true +case (TimestampType | TimestampNTZType, _: YearMonthIntervalType) => true +case (TimestampType | TimestampNTZType, _: DayTimeIntervalType) => true case (a, b) => a == b } } diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExecBase.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExecBase.scala index 2aa0b02..f3b3b34 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExecBase.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExecBase.scala @@ -24,7 +24,7 @@ import org.apache.spark.sql.catalyst.InternalRow import org.apache.spark.sql.catalyst.expressions._ 
import org.apache.spark.sql.catalyst.expressions.aggregate.AggregateExpression import org.apache.spark.sql.execution.UnaryExecNode -import org.apache.spark.sql.types.{CalendarIntervalType, DateType, DayTimeIntervalType, IntegerType, TimestampType, YearMonthIntervalType} +import org.apache.spark.sql.types._ trait WindowExecBase extends UnaryExecNode { def windowExpression: Seq[NamedExpression] @@ -96,10 +96,12 @@ trait WindowExecBase extends UnaryExecNode { val boundExpr = (expr.dataType, boundOffset.dataType) match { case (DateType, IntegerType) => DateAdd(expr, boundOffset) case (DateType, _: YearMonthIntervalType) => DateAddYMInterval(expr, boundOffset) - case (TimestampType, CalendarIntervalType) => TimeAdd(expr, boundOffset, Some(timeZone)) - case (TimestampType, _: YearMonthIntervalType) => + case (TimestampType | TimestampNTZType, CalendarIntervalType) => +TimeAdd(expr, boundOffset, Some(timeZone)) + case (TimestampType | TimestampNTZType, _: YearMonthIntervalType) => TimestampAddYMInterval(expr, boundOffset, Some(timeZone)) - case (TimestampType, _: DayTimeIntervalType) => TimeAdd(expr, boundOffset, Some(timeZone)) + case (TimestampType | TimestampNTZType, _: DayTimeIntervalType) => +TimeAdd(expr, boundOffset, Some(timeZone)) case (a, b) if a == b => Add(expr, boundOffset) } val bound = MutableProjection.create(boundExpr :: Nil, child.output) diff --git a/sql/core/src/test/resources/sql-tests/inputs/window.sql b/sql/core/src/test/resources/sql-tests/inputs/window.sql index 46d3629..9766aaf 100
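The change to `isValidFrameType` in the diffs above can be sketched in a minimal, self-contained way. This is an illustrative Java model (hypothetical enum and method names, not the real Spark classes): `TIMESTAMP` and `TIMESTAMP_NTZ` now share one branch, so a `TimestampNTZType` sort column accepts exactly the same frame-boundary types as `TimestampType`.

```java
// Minimal sketch of the rule changed in WindowSpecDefinition.isValidFrameType.
public class FrameTypeCheck {
    enum DataType { DATE, INTEGER, TIMESTAMP, TIMESTAMP_NTZ,
                    CALENDAR_INTERVAL, YEAR_MONTH_INTERVAL, DAY_TIME_INTERVAL }

    static boolean isValidFrameType(DataType order, DataType ft) {
        if (order == DataType.DATE && ft == DataType.INTEGER) return true;
        if (order == DataType.DATE && ft == DataType.YEAR_MONTH_INTERVAL) return true;
        // The change: TIMESTAMP and TIMESTAMP_NTZ are handled by the same branch.
        boolean ts = order == DataType.TIMESTAMP || order == DataType.TIMESTAMP_NTZ;
        if (ts && (ft == DataType.CALENDAR_INTERVAL
                || ft == DataType.YEAR_MONTH_INTERVAL
                || ft == DataType.DAY_TIME_INTERVAL)) return true;
        return order == ft;  // fallback: frame type must equal the sort column type
    }

    public static void main(String[] args) {
        // A TIMESTAMP_NTZ ORDER BY with a day-time interval frame is now valid.
        System.out.println(isValidFrameType(DataType.TIMESTAMP_NTZ, DataType.DAY_TIME_INTERVAL)); // true
        System.out.println(isValidFrameType(DataType.DATE, DataType.CALENDAR_INTERVAL)); // false
    }
}
```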
[spark] branch branch-3.2 updated: [SPARK-36016][SQL] Support TimestampNTZType in expression ApproxCountDistinctForIntervals
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 0c7972b [SPARK-36016][SQL] Support TimestampNTZType in expression ApproxCountDistinctForIntervals 0c7972b is described below commit 0c7972ba5f9c4ab991b8ade00df409fb4392788a Author: gengjiaan AuthorDate: Wed Jul 7 20:22:46 2021 +0300 [SPARK-36016][SQL] Support TimestampNTZType in expression ApproxCountDistinctForIntervals ### What changes were proposed in this pull request? The current `ApproxCountDistinctForIntervals` supports `TimestampType` but does not yet support timestamp without time zone. This PR adds that support. ### Why are the changes needed? `ApproxCountDistinctForIntervals` needs to support `TimestampNTZType`. ### Does this PR introduce _any_ user-facing change? 'Yes'. `ApproxCountDistinctForIntervals` accepts `TimestampNTZType`. ### How was this patch tested? New tests. Closes #33243 from beliefer/SPARK-36016.
Authored-by: gengjiaan Signed-off-by: Max Gekk (cherry picked from commit be382a6285fcbf374faeec1298371952a7bf1193) Signed-off-by: Max Gekk --- .../expressions/aggregate/ApproxCountDistinctForIntervals.scala | 6 +++--- .../aggregate/ApproxCountDistinctForIntervalsSuite.scala | 8 ++-- 2 files changed, 9 insertions(+), 5 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproxCountDistinctForIntervals.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproxCountDistinctForIntervals.scala index 19e212d..a7e9a22 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproxCountDistinctForIntervals.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproxCountDistinctForIntervals.scala @@ -61,7 +61,7 @@ case class ApproxCountDistinctForIntervals( } override def inputTypes: Seq[AbstractDataType] = { -Seq(TypeCollection(NumericType, TimestampType, DateType), ArrayType) +Seq(TypeCollection(NumericType, TimestampType, DateType, TimestampNTZType), ArrayType) } // Mark as lazy so that endpointsExpression is not evaluated during tree transformation. 
@@ -79,7 +79,7 @@ case class ApproxCountDistinctForIntervals( TypeCheckFailure("The endpoints provided must be constant literals") } else { endpointsExpression.dataType match { -case ArrayType(_: NumericType | DateType | TimestampType, _) => +case ArrayType(_: NumericType | DateType | TimestampType | TimestampNTZType, _) => if (endpoints.length < 2) { TypeCheckFailure("The number of endpoints must be >= 2 to construct intervals") } else { @@ -122,7 +122,7 @@ case class ApproxCountDistinctForIntervals( n.numeric.toDouble(value.asInstanceOf[n.InternalType]) case _: DateType => value.asInstanceOf[Int].toDouble -case _: TimestampType => +case TimestampType | TimestampNTZType => value.asInstanceOf[Long].toDouble } diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproxCountDistinctForIntervalsSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproxCountDistinctForIntervalsSuite.scala index 73f18d4..9d53673 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproxCountDistinctForIntervalsSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproxCountDistinctForIntervalsSuite.scala @@ -18,6 +18,7 @@ package org.apache.spark.sql.catalyst.expressions.aggregate import java.sql.{Date, Timestamp} +import java.time.LocalDateTime import org.apache.spark.SparkFunSuite import org.apache.spark.sql.catalyst.InternalRow @@ -38,7 +39,7 @@ class ApproxCountDistinctForIntervalsSuite extends SparkFunSuite { assert( wrongColumn.checkInputDataTypes() match { case TypeCheckFailure(msg) -if msg.contains("requires (numeric or timestamp or date) type") => true +if msg.contains("requires (numeric or timestamp or date or timestamp_ntz) type") => true case _ => false }) } @@ -199,7 +200,9 @@ class ApproxCountDistinctForIntervalsSuite extends SparkFunSuite { (intRecords.map(DateTimeUtils.toJavaDate), 
intEndpoints.map(DateTimeUtils.toJavaDate), DateType), (intRecords.map(DateTimeUtils.toJavaTimestamp(_)), - intEndpoints.map(DateTimeUtils.toJavaTimestamp(_)), TimestampType) + intEndpoints.map(Dat
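The diff above is small because `TimestampType` and `TimestampNTZType` share the same physical representation: a long of microseconds since the epoch. A hedged Java sketch (hypothetical names, not the real Spark code) of the value-to-double conversion that `ApproxCountDistinctForIntervals` performs on endpoints:

```java
// Sketch: dates are int days since 1970-01-01; both timestamp types are a
// long of microseconds, so they map onto the same double endpoint space.
public class EndpointValue {
    enum DataType { DATE, TIMESTAMP, TIMESTAMP_NTZ }

    static double toDoubleValue(Object value, DataType dt) {
        switch (dt) {
            case DATE:          return (Integer) value;  // days since the epoch
            case TIMESTAMP:
            case TIMESTAMP_NTZ: return (Long) value;     // microseconds since the epoch
            default: throw new IllegalArgumentException(dt.toString());
        }
    }

    public static void main(String[] args) {
        System.out.println(toDoubleValue(86_400_000_000L, DataType.TIMESTAMP_NTZ)); // 8.64E10
        System.out.println(toDoubleValue(18_628, DataType.DATE)); // 18628.0
    }
}
```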
[spark] branch master updated (55373b1 -> be382a6)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 55373b1 [SPARK-35907][CORE] Instead of File#mkdirs, Files#createDirectories is expected add be382a6 [SPARK-36016][SQL] Support TimestampNTZType in expression ApproxCountDistinctForIntervals No new revisions were added by this update. Summary of changes: .../expressions/aggregate/ApproxCountDistinctForIntervals.scala | 6 +++--- .../aggregate/ApproxCountDistinctForIntervalsSuite.scala | 8 ++-- 2 files changed, 9 insertions(+), 5 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.2 updated: [SPARK-36022][SQL] Respect interval fields in extract
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 429d178 [SPARK-36022][SQL] Respect interval fields in extract 429d178 is described below commit 429d1780b3ab3267a5e22a857cf51458713bc208 Author: Kousuke Saruta AuthorDate: Thu Jul 8 09:40:57 2021 +0300 [SPARK-36022][SQL] Respect interval fields in extract ### What changes were proposed in this pull request? This PR fixes an issue with `extract`: `Extract` should process only the fields that exist in the interval type. For example: ``` spark-sql> SELECT EXTRACT(MONTH FROM INTERVAL '2021-11' YEAR TO MONTH); 11 spark-sql> SELECT EXTRACT(MONTH FROM INTERVAL '2021' YEAR); 0 ``` The last command should fail because the month field isn't present in INTERVAL YEAR. ### Why are the changes needed? Bug fix. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? New tests. Closes #33247 from sarutak/fix-extract-interval.
Authored-by: Kousuke Saruta Signed-off-by: Max Gekk (cherry picked from commit 39002cb99514010f6d6cc2e575b9eab1694f04ef) Signed-off-by: Max Gekk --- .../catalyst/expressions/intervalExpressions.scala | 24 ++-- .../apache/spark/sql/IntervalFunctionsSuite.scala | 64 ++ 2 files changed, 82 insertions(+), 6 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/intervalExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/intervalExpressions.scala index 5d49007..5b111d1 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/intervalExpressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/intervalExpressions.scala @@ -29,6 +29,8 @@ import org.apache.spark.sql.catalyst.util.IntervalUtils._ import org.apache.spark.sql.errors.QueryExecutionErrors import org.apache.spark.sql.internal.SQLConf import org.apache.spark.sql.types._ +import org.apache.spark.sql.types.DayTimeIntervalType.{DAY, HOUR, MINUTE, SECOND} +import org.apache.spark.sql.types.YearMonthIntervalType.{MONTH, YEAR} import org.apache.spark.unsafe.types.CalendarInterval abstract class ExtractIntervalPart[T]( @@ -125,33 +127,43 @@ object ExtractIntervalPart { source: Expression, errorHandleFunc: => Nothing): Expression = { (extractField.toUpperCase(Locale.ROOT), source.dataType) match { - case ("YEAR" | "Y" | "YEARS" | "YR" | "YRS", _: YearMonthIntervalType) => + case ("YEAR" | "Y" | "YEARS" | "YR" | "YRS", YearMonthIntervalType(start, end)) +if isUnitInIntervalRange(YEAR, start, end) => ExtractANSIIntervalYears(source) case ("YEAR" | "Y" | "YEARS" | "YR" | "YRS", CalendarIntervalType) => ExtractIntervalYears(source) - case ("MONTH" | "MON" | "MONS" | "MONTHS", _: YearMonthIntervalType) => + case ("MONTH" | "MON" | "MONS" | "MONTHS", YearMonthIntervalType(start, end)) +if isUnitInIntervalRange(MONTH, start, end) => ExtractANSIIntervalMonths(source) case ("MONTH" | "MON" | 
"MONS" | "MONTHS", CalendarIntervalType) => ExtractIntervalMonths(source) - case ("DAY" | "D" | "DAYS", _: DayTimeIntervalType) => + case ("DAY" | "D" | "DAYS", DayTimeIntervalType(start, end)) +if isUnitInIntervalRange(DAY, start, end) => ExtractANSIIntervalDays(source) case ("DAY" | "D" | "DAYS", CalendarIntervalType) => ExtractIntervalDays(source) - case ("HOUR" | "H" | "HOURS" | "HR" | "HRS", _: DayTimeIntervalType) => + case ("HOUR" | "H" | "HOURS" | "HR" | "HRS", DayTimeIntervalType(start, end)) +if isUnitInIntervalRange(HOUR, start, end) => ExtractANSIIntervalHours(source) case ("HOUR" | "H" | "HOURS" | "HR" | "HRS", CalendarIntervalType) => ExtractIntervalHours(source) - case ("MINUTE" | "M" | "MIN" | "MINS" | "MINUTES", _: DayTimeIntervalType) => + case ("MINUTE" | "M" | "MIN" | "MINS" | "MINUTES", DayTimeInterva
[spark] branch master updated (23943e5 -> 39002cb)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 23943e5 [SPARK-32577][SQL][TEST][FOLLOWUP] Fix the config value of shuffled hash join for all other test queries add 39002cb [SPARK-36022][SQL] Respect interval fields in extract No new revisions were added by this update. Summary of changes: .../catalyst/expressions/intervalExpressions.scala | 24 ++-- .../apache/spark/sql/IntervalFunctionsSuite.scala | 64 ++ 2 files changed, 82 insertions(+), 6 deletions(-) create mode 100644 sql/core/src/test/scala/org/apache/spark/sql/IntervalFunctionsSuite.scala
[spark] branch master updated (39002cb -> 89aa16b)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 39002cb [SPARK-36022][SQL] Respect interval fields in extract add 89aa16b [SPARK-36021][SQL][FOLLOWUP] DT/YM func use field byte to keep consistence No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/parser/AstBuilder.scala | 17 --- .../spark/sql/catalyst/util/IntervalUtils.scala| 24 ++ .../catalyst/parser/ExpressionParserSuite.scala| 3 ++- .../sql/catalyst/util/IntervalUtilsSuite.scala | 13 4 files changed, 21 insertions(+), 36 deletions(-)
[spark] branch branch-3.2 updated: [SPARK-36021][SQL][FOLLOWUP] DT/YM func use field byte to keep consistence
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 2776e8a [SPARK-36021][SQL][FOLLOWUP] DT/YM func use field byte to keep consistence 2776e8a is described below commit 2776e8aa4792a7b2e95d5bbbd76cea0f9554503c Author: Angerszh AuthorDate: Thu Jul 8 12:22:04 2021 +0300 [SPARK-36021][SQL][FOLLOWUP] DT/YM func use field byte to keep consistence ### What changes were proposed in this pull request? On further thought, it is better for all DT/YM functions to use field bytes, for consistency. ### Why are the changes needed? Keep the code consistent. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Not needed. Closes #33252 from AngersZh/SPARK-36021-FOLLOWUP. Authored-by: Angerszh Signed-off-by: Max Gekk (cherry picked from commit 89aa16b4a838246cfb7bdc9318461485016f6252) Signed-off-by: Max Gekk --- .../spark/sql/catalyst/parser/AstBuilder.scala | 17 --- .../spark/sql/catalyst/util/IntervalUtils.scala| 24 ++ .../catalyst/parser/ExpressionParserSuite.scala| 3 ++- .../sql/catalyst/util/IntervalUtilsSuite.scala | 13 4 files changed, 21 insertions(+), 36 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala index d6363b5..4f1e53f 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala @@ -40,7 +40,6 @@ import org.apache.spark.sql.catalyst.plans.logical._ import org.apache.spark.sql.catalyst.trees.CurrentOrigin import org.apache.spark.sql.catalyst.util.{CharVarcharUtils, DateTimeUtils, IntervalUtils} import org.apache.spark.sql.catalyst.util.DateTimeUtils.{convertSpecialDate, convertSpecialTimestamp,
convertSpecialTimestampNTZ, getZoneId, stringToDate, stringToTimestamp, stringToTimestampWithoutTimeZone} -import org.apache.spark.sql.catalyst.util.IntervalUtils.IntervalUnit import org.apache.spark.sql.connector.catalog.{SupportsNamespaces, TableCatalog} import org.apache.spark.sql.connector.catalog.TableChange.ColumnPosition import org.apache.spark.sql.connector.expressions.{ApplyTransform, BucketTransform, DaysTransform, Expression => V2Expression, FieldReference, HoursTransform, IdentityTransform, LiteralValue, MonthsTransform, Transform, YearsTransform} @@ -2487,18 +2486,10 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with SQLConfHelper with Logg (from, to) match { case ("year", "month") => IntervalUtils.fromYearMonthString(value) - case ("day", "hour") => -IntervalUtils.fromDayTimeString(value, IntervalUnit.DAY, IntervalUnit.HOUR) - case ("day", "minute") => -IntervalUtils.fromDayTimeString(value, IntervalUnit.DAY, IntervalUnit.MINUTE) - case ("day", "second") => -IntervalUtils.fromDayTimeString(value, IntervalUnit.DAY, IntervalUnit.SECOND) - case ("hour", "minute") => -IntervalUtils.fromDayTimeString(value, IntervalUnit.HOUR, IntervalUnit.MINUTE) - case ("hour", "second") => -IntervalUtils.fromDayTimeString(value, IntervalUnit.HOUR, IntervalUnit.SECOND) - case ("minute", "second") => -IntervalUtils.fromDayTimeString(value, IntervalUnit.MINUTE, IntervalUnit.SECOND) + case ("day", "hour") | ("day", "minute") | ("day", "second") | ("hour", "minute") | + ("hour", "second") | ("minute", "second") => +IntervalUtils.fromDayTimeString(value, + DayTimeIntervalType.stringToField(from), DayTimeIntervalType.stringToField(to)) case _ => throw QueryParsingErrors.fromToIntervalUnsupportedError(from, to, ctx) } diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala index 24bcad8..7579a28 100644 --- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala @@ -458,7 +458,7 @@ object IntervalUtils { * adapted from HiveIntervalDayTime.valueOf */ def fromDayTimeS
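The consolidation in the AstBuilder diff above replaces six hand-written case arms with one lookup from the textual from/to unit of an `INTERVAL ... TO ...` literal to a field byte. A hedged Java sketch modeled on `DayTimeIntervalType.stringToField` (the real Spark helper also raises a proper parse error with error-class support):

```java
// Sketch: one unit-name-to-field-byte mapping instead of six case arms.
import java.util.Map;

public class UnitToField {
    static final Map<String, Byte> FIELDS =
        Map.of("day", (byte) 0, "hour", (byte) 1, "minute", (byte) 2, "second", (byte) 3);

    static byte stringToField(String unit) {
        Byte f = FIELDS.get(unit);
        if (f == null) throw new IllegalArgumentException("unknown unit: " + unit);
        return f;
    }

    public static void main(String[] args) {
        // INTERVAL '10 11:22:33' DAY TO SECOND -> fromDayTimeString(value, DAY, SECOND)
        System.out.println(stringToField("day"));    // 0
        System.out.println(stringToField("second")); // 3
    }
}
```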
[spark] branch branch-3.2 updated: [SPARK-36055][SQL] Assign pretty SQL string to TimestampNTZ literals
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 9103c1f [SPARK-36055][SQL] Assign pretty SQL string to TimestampNTZ literals 9103c1f is described below commit 9103c1fe2332a60424077ca9ecffb24afa440c55 Author: Gengliang Wang AuthorDate: Thu Jul 8 21:42:50 2021 +0300 [SPARK-36055][SQL] Assign pretty SQL string to TimestampNTZ literals ### What changes were proposed in this pull request? Currently, TimestampNTZ literals show only the long value instead of a timestamp string in their SQL string and toString results. Before changes (with default timestamp type as TIMESTAMP_NTZ) ``` -- !query select timestamp '2019-01-01\t' -- !query schema struct<15463008:timestamp_ntz> ``` After changes: ``` -- !query select timestamp '2019-01-01\t' -- !query schema struct ``` ### Why are the changes needed? Make the schema of TimestampNTZ literals readable. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Unit test Closes #33269 from gengliangwang/ntzLiteralString.
Authored-by: Gengliang Wang Signed-off-by: Max Gekk (cherry picked from commit ee945e99cc1d3979a2c24077a9ae786ce50bbe81) Signed-off-by: Max Gekk --- .../spark/sql/catalyst/expressions/literals.scala | 6 +- .../catalyst/expressions/LiteralExpressionSuite.scala | 9 + .../sql-tests/results/timestampNTZ/datetime.sql.out| 18 +- 3 files changed, 23 insertions(+), 10 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala index a2270eb..ee40909 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala @@ -28,7 +28,7 @@ import java.lang.{Short => JavaShort} import java.math.{BigDecimal => JavaBigDecimal} import java.nio.charset.StandardCharsets import java.sql.{Date, Timestamp} -import java.time.{Duration, Instant, LocalDate, LocalDateTime, Period} +import java.time.{Duration, Instant, LocalDate, LocalDateTime, Period, ZoneOffset} import java.util import java.util.Objects import javax.xml.bind.DatatypeConverter @@ -352,6 +352,8 @@ case class Literal (value: Any, dataType: DataType) extends LeafExpression { DateFormatter().format(value.asInstanceOf[Int]) case TimestampType => TimestampFormatter.getFractionFormatter(timeZoneId).format(value.asInstanceOf[Long]) +case TimestampNTZType => + TimestampFormatter.getFractionFormatter(ZoneOffset.UTC).format(value.asInstanceOf[Long]) case DayTimeIntervalType(startField, endField) => toDayTimeIntervalString(value.asInstanceOf[Long], ANSI_STYLE, startField, endField) case YearMonthIntervalType(startField, endField) => @@ -473,6 +475,8 @@ case class Literal (value: Any, dataType: DataType) extends LeafExpression { s"DATE '$toString'" case (v: Long, TimestampType) => s"TIMESTAMP '$toString'" +case (v: Long, TimestampNTZType) => + s"TIMESTAMP_NTZ '$toString'" case (i: 
CalendarInterval, CalendarIntervalType) => s"INTERVAL '${i.toString}'" case (v: Array[Byte], BinaryType) => s"X'${DatatypeConverter.printHexBinary(v)}'" diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralExpressionSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralExpressionSuite.scala index 50b7263..4081e13 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralExpressionSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralExpressionSuite.scala @@ -362,6 +362,15 @@ class LiteralExpressionSuite extends SparkFunSuite with ExpressionEvalHelper { } } + test("SPARK-36055: TimestampNTZ toString") { +assert(Literal.default(TimestampNTZType).toString === "1970-01-01 00:00:00") +withTimeZones(sessionTimeZone = "GMT+01:00", systemTimeZone = "GMT-08:00") { + val timestamp = LocalDateTime.of(2021, 2, 3, 16, 50, 3, 45600) + val literalStr = Literal.create(timestamp).toString + assert(literalStr === "2021-02-03 16:50:03.456") +} + } + test("SPARK-35664: construct literals from java.time.LocalDateTime") { Seq( LocalDateTime.of(1, 1, 1, 0, 0, 0, 0), diff
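The diff above formats TimestampNTZ literals through `ZoneOffset.UTC`. The reason this is correct: the stored long is microseconds interpreted as a zone-less wall-clock time, so rendering it through UTC reconstructs the original `LocalDateTime` regardless of the session or JVM time zone. A self-contained java.time sketch (helper names here are hypothetical, not Spark's):

```java
// Sketch of the UTC round trip behind TimestampNTZ literal formatting.
import java.time.LocalDateTime;
import java.time.ZoneOffset;

public class NtzFormat {
    // Encode a zone-less wall-clock time as microseconds, via UTC.
    static long toMicros(LocalDateTime ldt) {
        return ldt.toEpochSecond(ZoneOffset.UTC) * 1_000_000L + ldt.getNano() / 1_000L;
    }

    // Decode through UTC again: the original wall-clock time comes back.
    static String format(long micros) {
        long secs = Math.floorDiv(micros, 1_000_000L);
        int nanos = (int) (Math.floorMod(micros, 1_000_000L) * 1_000L);
        return LocalDateTime.ofEpochSecond(secs, nanos, ZoneOffset.UTC).toString();
    }

    public static void main(String[] args) {
        long micros = toMicros(LocalDateTime.of(2021, 2, 3, 16, 50, 3, 456_000_000));
        System.out.println(format(micros)); // 2021-02-03T16:50:03.456
    }
}
```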
[spark] branch master updated (e071721 -> ee945e9)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from e071721 [SPARK-36012][SQL] Add null flag in SHOW CREATE TABLE add ee945e9 [SPARK-36055][SQL] Assign pretty SQL string to TimestampNTZ literals No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/expressions/literals.scala | 6 +- .../catalyst/expressions/LiteralExpressionSuite.scala | 9 + .../sql-tests/results/timestampNTZ/datetime.sql.out| 18 +- 3 files changed, 23 insertions(+), 10 deletions(-)
[spark] branch master updated: [SPARK-35777][SQL][TESTS] Check all year-month interval types in UDF
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 490ae8f [SPARK-35777][SQL][TESTS] Check all year-month interval types in UDF 490ae8f is described below commit 490ae8f4d6c32274144e22789fc184150302c7f7 Author: Angerszh AuthorDate: Thu Jun 24 08:56:08 2021 +0300 [SPARK-35777][SQL][TESTS] Check all year-month interval types in UDF ### What changes were proposed in this pull request? Check all year-month interval types in UDF. ### Why are the changes needed? New checks should improve test coverage. ### Does this PR introduce _any_ user-facing change? Yes but `YearMonthIntervalType` has not been released yet. ### How was this patch tested? Existing UTs. Closes #32985 from AngersZh/SPARK-35777. Authored-by: Angerszh Signed-off-by: Max Gekk --- .../test/scala/org/apache/spark/sql/UDFSuite.scala | 47 +++--- 1 file changed, 33 insertions(+), 14 deletions(-) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/UDFSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/UDFSuite.scala index 65cbaf5..a341c87 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/UDFSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/UDFSuite.scala @@ -42,6 +42,7 @@ import org.apache.spark.sql.internal.SQLConf import org.apache.spark.sql.test.SharedSparkSession import org.apache.spark.sql.test.SQLTestData._ import org.apache.spark.sql.types._ +import org.apache.spark.sql.types.YearMonthIntervalType._ import org.apache.spark.sql.util.QueryExecutionListener private case class FunctionResult(f1: String, f2: String) @@ -902,27 +903,45 @@ class UDFSuite extends QueryTest with SharedSparkSession { assert(e.isInstanceOf[java.lang.ArithmeticException]) } - test("SPARK-34663: using java.time.Period in UDF") { + test("SPARK-34663, SPARK-35777: using java.time.Period in UDF") { // Regular case
-val input = Seq(java.time.Period.ofMonths(11)).toDF("p") +val input = Seq(java.time.Period.ofMonths(13)).toDF("p") + .select($"p", $"p".cast(YearMonthIntervalType(YEAR)).as("p_y"), +$"p".cast(YearMonthIntervalType(MONTH)).as("p_m")) val incMonth = udf((p: java.time.Period) => p.plusMonths(1)) -val result = input.select(incMonth($"p").as("new_p")) -checkAnswer(result, Row(java.time.Period.ofYears(1)) :: Nil) -// TODO(SPARK-35777): Check all year-month interval types in UDF -assert(result.schema === new StructType().add("new_p", YearMonthIntervalType())) +val result = input.select(incMonth($"p").as("new_p"), + incMonth($"p_y").as("new_p_y"), + incMonth($"p_m").as("new_p_m")) +checkAnswer(result, Row(java.time.Period.ofMonths(14).normalized(), + java.time.Period.ofMonths(13).normalized(), + java.time.Period.ofMonths(14).normalized()) :: Nil) +assert(result.schema === new StructType() + .add("new_p", YearMonthIntervalType()) + .add("new_p_y", YearMonthIntervalType()) + .add("new_p_m", YearMonthIntervalType())) // UDF produces `null` val nullFunc = udf((_: java.time.Period) => null.asInstanceOf[java.time.Period]) -val nullResult = input.select(nullFunc($"p").as("null_p")) -checkAnswer(nullResult, Row(null) :: Nil) -// TODO(SPARK-35777): Check all year-month interval types in UDF -assert(nullResult.schema === new StructType().add("null_p", YearMonthIntervalType())) +val nullResult = input.select(nullFunc($"p").as("null_p"), + nullFunc($"p_y").as("null_p_y"), nullFunc($"p_m").as("null_p_m")) +checkAnswer(nullResult, Row(null, null, null) :: Nil) +assert(nullResult.schema === new StructType() + .add("null_p", YearMonthIntervalType()) + .add("null_p_y", YearMonthIntervalType()) + .add("null_p_m", YearMonthIntervalType())) // Input parameter of UDF is null val nullInput = Seq(null.asInstanceOf[java.time.Period]).toDF("null_p") + .select($"null_p", +$"null_p".cast(YearMonthIntervalType(YEAR)).as("null_p_y"), +$"null_p".cast(YearMonthIntervalType(MONTH)).as("null_p_m")) val 
constPeriod = udf((_: java.time.Period) => java.time.Period.ofYears(10)) -val constResult = nullInput.select(constPeriod($"null_p").as("10_years")) -checkAnswer(constResult, Row(java.time.Period.ofYears(10)) :: Nil) -// TODO(SPARK-3
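The expanded test above compares UDF results via `java.time.Period.ofMonths(13).normalized()`. A small standalone illustration of what `normalized()` does (year-month interval values carry whole years out of the month count, keeping results comparable):

```java
// Period.normalized() splits a month count into years and leftover months.
import java.time.Period;

public class PeriodNormalize {
    public static void main(String[] args) {
        Period p = Period.ofMonths(13).plusMonths(1).normalized();
        System.out.println(p); // P1Y2M
        System.out.println(Period.ofMonths(13).normalized()); // P1Y1M
    }
}
```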
[spark] branch master updated: [SPARK-35852][SQL] Use DateAdd instead of TimeAdd for DateType +/- INTERVAL DAY
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 61bd036 [SPARK-35852][SQL] Use DateAdd instead of TimeAdd for DateType +/- INTERVAL DAY 61bd036 is described below commit 61bd036cb96fd3a81fc4d131a29d956407956c94 Author: PengLei <18066542...@189.cn> AuthorDate: Thu Jun 24 08:47:29 2021 +0300 [SPARK-35852][SQL] Use DateAdd instead of TimeAdd for DateType +/- INTERVAL DAY ### What changes were proposed in this pull request? We use `DateAdd` to implement `DateType` `+`/`-` `INTERVAL DAY` ### Why are the changes needed? To improve the implementation of `DateType` `+`/`-` `INTERVAL DAY` ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Added a unit test. Closes #33033 from Peng-Lei/SPARK-35852. Authored-by: PengLei <18066542...@189.cn> Signed-off-by: Max Gekk --- .../spark/sql/catalyst/analysis/Analyzer.scala | 5 + .../apache/spark/sql/ColumnExpressionSuite.scala | 23 ++ 2 files changed, 28 insertions(+) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala index 2f142bf..7cb270c 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala @@ -53,6 +53,7 @@ import org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation import org.apache.spark.sql.internal.SQLConf import org.apache.spark.sql.internal.SQLConf.{PartitionOverwriteMode, StoreAssignmentPolicy} import org.apache.spark.sql.types._ +import org.apache.spark.sql.types.DayTimeIntervalType.DAY import org.apache.spark.sql.util.{CaseInsensitiveStringMap, SchemaUtils} import org.apache.spark.unsafe.types.UTF8String import org.apache.spark.util.Utils @@ -349,7
+350,9 @@ class Analyzer(override val catalogManager: CatalogManager) case p: LogicalPlan => p.transformExpressionsUpWithPruning( _.containsPattern(BINARY_ARITHMETIC), ruleId) { case a @ Add(l, r, f) if a.childrenResolved => (l.dataType, r.dataType) match { + case (DateType, DayTimeIntervalType(DAY, DAY)) => DateAdd(l, ExtractANSIIntervalDays(r)) case (DateType, _: DayTimeIntervalType) => TimeAdd(Cast(l, TimestampType), r) + case (DayTimeIntervalType(DAY, DAY), DateType) => DateAdd(r, ExtractANSIIntervalDays(l)) case (_: DayTimeIntervalType, DateType) => TimeAdd(Cast(r, TimestampType), l) case (DateType, _: YearMonthIntervalType) => DateAddYMInterval(l, r) case (_: YearMonthIntervalType, DateType) => DateAddYMInterval(r, l) @@ -366,6 +369,8 @@ class Analyzer(override val catalogManager: CatalogManager) case _ => a } case s @ Subtract(l, r, f) if s.childrenResolved => (l.dataType, r.dataType) match { + case (DateType, DayTimeIntervalType(DAY, DAY)) => +DateAdd(l, UnaryMinus(ExtractANSIIntervalDays(r), f)) case (DateType, _: DayTimeIntervalType) => DatetimeSub(l, r, TimeAdd(Cast(l, TimestampType), UnaryMinus(r, f))) case (DateType, _: YearMonthIntervalType) => diff --git a/sql/core/src/test/scala/org/apache/spark/sql/ColumnExpressionSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/ColumnExpressionSuite.scala index 13fdbf2..b0cd613 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/ColumnExpressionSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/ColumnExpressionSuite.scala @@ -2881,4 +2881,27 @@ class ColumnExpressionSuite extends QueryTest with SharedSparkSession { } } } + + test("SPARK-35852: add/subtract a interval day to/from a date") { +withSQLConf(SQLConf.DATETIME_JAVA8API_ENABLED.key -> "true") { + Seq( +(LocalDate.of(1, 1, 1), Duration.ofDays(31)), +(LocalDate.of(1582, 9, 15), Duration.ofDays(30)), +(LocalDate.of(1900, 1, 1), Duration.ofDays(0)), +(LocalDate.of(1970, 1, 1), Duration.ofDays(-1)), +(LocalDate.of(2021, 3, 14), 
+        Duration.ofDays(1)),
+        (LocalDate.of(2020, 12, 31), Duration.ofDays(4 * 30)),
+        (LocalDate.of(2020, 2, 29), Duration.ofDays(365)),
+        (LocalDate.of(1, 1, 1), Duration.ofDays(-2))
+      ).foreach { case (date, duration) =>
+        val days = duration.toDays
+        val add = date.plusDays(days)
+        val sub = date.minusDays(days)
+        val df = Seq((date, duration)).toDF("start", "diff")
+          .select($"start", $"diff" cast DayTimeIntervalType(DAY) as "diff")
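The analyzer rewrite above maps `DateType` plus or minus a day-only interval (`DayTimeIntervalType(DAY, DAY)`) to plain date arithmetic on a whole-day count, instead of the previous round-trip through `TimestampType`. A minimal Python sketch of that logic (illustrative helper names, not the Catalyst implementation):

```python
from datetime import date, timedelta

def date_plus_day_interval(d: date, interval_days: int) -> date:
    # Analogous to DateAdd(l, ExtractANSIIntervalDays(r)): the whole-day
    # count is extracted from the interval and added directly to the date,
    # so the result stays a date (no timestamp conversion involved).
    return d + timedelta(days=interval_days)

def date_minus_day_interval(d: date, interval_days: int) -> date:
    # Subtraction is rewritten as addition of the negated day count,
    # mirroring DateAdd(l, UnaryMinus(ExtractANSIIntervalDays(r), f)).
    return d + timedelta(days=-interval_days)

print(date_plus_day_interval(date(2021, 3, 14), 1))   # 2021-03-15
print(date_minus_day_interval(date(1970, 1, 1), -1))  # 1970-01-02
```

Keeping the operation in the date domain also matches the new test cases above, which check the result against `LocalDate.plusDays`/`minusDays` directly.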
[spark] branch master updated (1295e88 -> 8a1995f)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 1295e88 [SPARK-35786][SQL] Add a new operator to distingush if AQE can optimize safely add 8a1995f [SPARK-35728][SQL][TESTS] Check multiply/divide of day-time intervals of any fields by numeric No new revisions were added by this update. Summary of changes: .../sql/catalyst/expressions/IntervalExpressionsSuite.scala | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-35778][SQL][TESTS] Check multiply/divide of year month interval of any fields by numeric
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 3904c0e [SPARK-35778][SQL][TESTS] Check multiply/divide of year month interval of any fields by numeric 3904c0e is described below commit 3904c0edbad1838d70cbde0aec0318dea1ecdf4a Author: PengLei <18066542...@189.cn> AuthorDate: Thu Jun 24 12:25:06 2021 +0300 [SPARK-35778][SQL][TESTS] Check multiply/divide of year month interval of any fields by numeric ### What changes were proposed in this pull request? Check multiply/divide of year-month intervals of any fields by numeric. ### Why are the changes needed? To improve test coverage. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Expanded existed test cases. Closes #33051 from Peng-Lei/SPARK-35778. Authored-by: PengLei <18066542...@189.cn> Signed-off-by: Max Gekk --- .../sql/catalyst/expressions/IntervalExpressionsSuite.scala| 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/IntervalExpressionsSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/IntervalExpressionsSuite.scala index 3d423dd..b83aeab 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/IntervalExpressionsSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/IntervalExpressionsSuite.scala @@ -325,7 +325,6 @@ class IntervalExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper { checkException(days = Int.MaxValue) } - // TODO(SPARK-35778): Check multiply/divide of year-month intervals of any fields by numeric test("SPARK-34824: multiply year-month interval by numeric") { Seq( (Period.ofYears(-123), Literal(null, DecimalType.USER_DEFAULT)) -> null, @@ -337,7 +336,9 @@ class 
IntervalExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
       (Period.ofYears(10000), 0.0001d) -> Period.ofYears(1),
       (Period.ofYears(10000), BigDecimal(0.0001)) -> Period.ofYears(1)
     ).foreach { case ((period, num), expected) =>
-      checkEvaluation(MultiplyYMInterval(Literal(period), Literal(num)), expected)
+      DataTypeTestUtils.yearMonthIntervalTypes.foreach { dt =>
+        checkEvaluation(MultiplyYMInterval(Literal.create(period, dt), Literal(num)), expected)
+      }
     }
 
     Seq(
@@ -398,7 +399,6 @@ class IntervalExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
     }
   }
 
-  // TODO(SPARK-35778): Check multiply/divide of year-month intervals of any fields by numeric
   test("SPARK-34868: divide year-month interval by numeric") {
     Seq(
       (Period.ofYears(-123), Literal(null, DecimalType.USER_DEFAULT)) -> null,
@@ -412,7 +412,9 @@ class IntervalExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper {
       (Period.ofYears(1000), 100d) -> Period.ofYears(10),
       (Period.ofMonths(2), BigDecimal(0.1)) -> Period.ofMonths(20)
     ).foreach { case ((period, num), expected) =>
-      checkEvaluation(DivideYMInterval(Literal(period), Literal(num)), expected)
+      DataTypeTestUtils.yearMonthIntervalTypes.foreach { dt =>
+        checkEvaluation(DivideYMInterval(Literal.create(period, dt), Literal(num)), expected)
+      }
     }
 
     Seq(
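The expanded tests lean on the fact that a year-month interval is physically a single month count, whatever fields (`YEAR`, `YEAR TO MONTH`, `MONTH`) its type declares — which is why the same expected results hold for every field combination. A hedged Python sketch of the multiply/divide arithmetic (Spark's exact rounding and overflow semantics may differ):

```python
def multiply_ym_interval(months: int, num: float) -> int:
    # A year-month interval is stored as one int count of months;
    # multiplication scales that count and rounds to the nearest month.
    return round(months * num)

def divide_ym_interval(months: int, num: float) -> int:
    return round(months / num)

# 10000 years * 0.0001 == 1 year (12 months), as in the test data above
print(multiply_ym_interval(10000 * 12, 0.0001))  # 12
# 1000 years / 100 == 10 years (120 months)
print(divide_ym_interval(1000 * 12, 100))        # 120
```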
[spark] branch master updated (92ddef7 -> 7d0786f)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 92ddef7  [SPARK-35696][PYTHON][DOCS][FOLLOW-UP] Fix underline for title in FAQ to remove warnings
  add 7d0786f  [SPARK-35730][SQL][TESTS] Check all day-time interval types in UDF

No new revisions were added by this update.

Summary of changes:
 .../test/scala/org/apache/spark/sql/UDFSuite.scala | 112 ++---
 1 file changed, 99 insertions(+), 13 deletions(-)
[spark] branch master updated: [SPARK-35975][SQL] New configuration `spark.sql.timestampType` for the default timestamp type
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new a643076 [SPARK-35975][SQL] New configuration `spark.sql.timestampType` for the default timestamp type a643076 is described below commit a643076d4ef622eac505ebf22c9aa2cc909320ac Author: Gengliang Wang AuthorDate: Thu Jul 1 23:25:18 2021 +0300 [SPARK-35975][SQL] New configuration `spark.sql.timestampType` for the default timestamp type ### What changes were proposed in this pull request? Add a new configuration `spark.sql.timestampType`, which configures the default timestamp type of Spark SQL, including SQL DDL and Cast clause. Setting the configuration as `TIMESTAMP_NTZ` will use `TIMESTAMP WITHOUT TIME ZONE` as the default type while putting it as `TIMESTAMP_LTZ` will use `TIMESTAMP WITH LOCAL TIME ZONE`. The default value of the new configuration is TIMESTAMP_LTZ, which is consistent with previous Spark releases. ### Why are the changes needed? A new configuration for switching the default timestamp type as timestamp without time zone. ### Does this PR introduce _any_ user-facing change? No, it's a new feature. ### How was this patch tested? Unit test Closes #33176 from gengliangwang/newTsTypeConf. 
Authored-by: Gengliang Wang Signed-off-by: Max Gekk --- .../spark/sql/catalyst/parser/AstBuilder.scala | 2 +- .../org/apache/spark/sql/internal/SQLConf.scala| 28 ++ .../sql/catalyst/parser/DataTypeParserSuite.scala | 14 ++- 3 files changed, 42 insertions(+), 2 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala index 224c2d0..361ecc1 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala @@ -2502,7 +2502,7 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with SQLConfHelper with Logg case ("float" | "real", Nil) => FloatType case ("double", Nil) => DoubleType case ("date", Nil) => DateType - case ("timestamp", Nil) => TimestampType + case ("timestamp", Nil) => SQLConf.get.timestampType case ("string", Nil) => StringType case ("character" | "char", length :: Nil) => CharType(length.getText.toInt) case ("varchar", length :: Nil) => VarcharType(length.getText.toInt) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala index 30e5a16..3aed3c2 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala @@ -44,6 +44,7 @@ import org.apache.spark.sql.catalyst.plans.logical.HintErrorHandler import org.apache.spark.sql.catalyst.util.DateTimeUtils import org.apache.spark.sql.connector.catalog.CatalogManager.SESSION_CATALOG_NAME import org.apache.spark.sql.errors.{QueryCompilationErrors, QueryExecutionErrors} +import org.apache.spark.sql.types.{AtomicType, TimestampNTZType, TimestampType} import org.apache.spark.unsafe.array.ByteArrayMethods import org.apache.spark.util.Utils @@ -2820,6 +2821,24 @@ object 
SQLConf {
     .booleanConf
     .createWithDefault(true)
 
+  object TimestampTypes extends Enumeration {
+    val TIMESTAMP_NTZ, TIMESTAMP_LTZ = Value
+  }
+
+  val TIMESTAMP_TYPE =
+    buildConf("spark.sql.timestampType")
+      .doc("Configures the default timestamp type of Spark SQL, including SQL DDL and Cast " +
+        s"clause. Setting the configuration as ${TimestampTypes.TIMESTAMP_NTZ.toString} will " +
+        "use TIMESTAMP WITHOUT TIME ZONE as the default type while putting it as " +
+        s"${TimestampTypes.TIMESTAMP_LTZ.toString} will use TIMESTAMP WITH LOCAL TIME ZONE. " +
+        "Before the 3.2.0 release, Spark only supports the TIMESTAMP WITH " +
+        "LOCAL TIME ZONE type.")
+      .version("3.2.0")
+      .stringConf
+      .transform(_.toUpperCase(Locale.ROOT))
+      .checkValues(TimestampTypes.values.map(_.toString))
+      .createWithDefault(TimestampTypes.TIMESTAMP_LTZ.toString)
+
   val DATETIME_JAVA8API_ENABLED = buildConf("spark.sql.datetime.java8API.enabled")
     .doc("If the configuration property is set to true, java.time.Instant and " +
       "java.time.LocalDate classes of Java 8 API
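The new conf entry normalizes its value to upper case before validating it against the enumeration, so `timestamp_ntz` and `TIMESTAMP_NTZ` are both accepted. A small Python sketch of the same transform-then-check pattern (`set_timestamp_type` is a hypothetical helper, not a Spark API):

```python
ALLOWED_TIMESTAMP_TYPES = {"TIMESTAMP_NTZ", "TIMESTAMP_LTZ"}
DEFAULT_TIMESTAMP_TYPE = "TIMESTAMP_LTZ"  # matches .createWithDefault(...)

def set_timestamp_type(value: str) -> str:
    # Mirrors .transform(_.toUpperCase(Locale.ROOT)) followed by
    # .checkValues(...): normalize case first, then validate.
    normalized = value.upper()
    if normalized not in ALLOWED_TIMESTAMP_TYPES:
        raise ValueError(f"invalid value for spark.sql.timestampType: {value}")
    return normalized

print(set_timestamp_type("timestamp_ntz"))  # TIMESTAMP_NTZ
```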
[spark] branch master updated: [SPARK-35935][SQL] Prevent failure of `MSCK REPAIR TABLE` on table refreshing
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new d28ca9c [SPARK-35935][SQL] Prevent failure of `MSCK REPAIR TABLE` on table refreshing d28ca9c is described below commit d28ca9cc9808828118be64a545c3478160fdc170 Author: Max Gekk AuthorDate: Wed Jun 30 09:44:52 2021 +0300 [SPARK-35935][SQL] Prevent failure of `MSCK REPAIR TABLE` on table refreshing ### What changes were proposed in this pull request? In the PR, I propose to catch all non-fatal exceptions coming `refreshTable()` at the final stage of table repairing, and output an error message instead of failing with an exception. ### Why are the changes needed? 1. The uncaught exceptions from table refreshing might be considered as regression comparing to previous Spark versions. Table refreshing was introduced by https://github.com/apache/spark/pull/31066. 2. This should improve user experience with Spark SQL. For instance, when the `MSCK REPAIR TABLE` is performed in a chain of command in SQL where catching exception is difficult or even impossible. ### Does this PR introduce _any_ user-facing change? Yes. Before the changes the `MSCK REPAIR TABLE` command can fail with the exception portrayed in SPARK-35935. After the changes, the same command outputs error message, and completes successfully. ### How was this patch tested? By existing test suites. Closes #33137 from MaxGekk/msck-repair-catch-except. 
Authored-by: Max Gekk
Signed-off-by: Max Gekk
---
 .../scala/org/apache/spark/sql/execution/command/ddl.scala | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala
index 0876b5f..06c6847 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala
@@ -675,7 +675,15 @@ case class RepairTableCommand(
     // This is always the case for Hive format tables, but is not true for Datasource tables created
     // before Spark 2.1 unless they are converted via `msck repair table`.
     spark.sessionState.catalog.alterTable(table.copy(tracksPartitionsInCatalog = true))
-    spark.catalog.refreshTable(tableIdentWithDB)
+    try {
+      spark.catalog.refreshTable(tableIdentWithDB)
+    } catch {
+      case NonFatal(e) =>
+        logError(s"Cannot refresh the table '$tableIdentWithDB'. A query of the table " +
+          "might return wrong result if the table was cached. To avoid such issue, you should " +
+          "uncache the table manually via the UNCACHE TABLE command after table recovering will " +
+          "complete fully.", e)
+    }
     logInfo(s"Recovered all partitions: added ($addedAmount), dropped ($droppedAmount).")
     Seq.empty[Row]
   }
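The fix wraps only the final cache-refresh step: the repair work has already succeeded by then, so a failure there is logged instead of propagated and the whole `MSCK REPAIR TABLE` command still completes. The same pattern as a Python sketch (Python's broad `Exception` stands in loosely for Scala's `NonFatal`, which still lets fatal errors propagate):

```python
import logging

def finish_repair(refresh_table, table_ident: str) -> None:
    # The partitions have already been recovered at this point; a failure
    # in the best-effort refresh must not fail the whole command.
    try:
        refresh_table()
    except Exception as e:  # rough analogue of Scala's NonFatal(e)
        logging.error(
            "Cannot refresh the table '%s'. A query of the table might return "
            "wrong result if the table was cached.", table_ident, exc_info=e)
```

The design choice is deliberate: in a chained SQL workload where catching exceptions is hard, a degraded-but-successful command plus an actionable error message beats a hard failure.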
[spark] branch master updated: Revert "[SPARK-33995][SQL] Expose make_interval as a Scala function"
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 7668226 Revert "[SPARK-33995][SQL] Expose make_interval as a Scala function" 7668226 is described below commit 76682268d746e72f0e8aa4cc64860e0bfd90f1ed Author: Max Gekk AuthorDate: Wed Jun 30 09:26:35 2021 +0300 Revert "[SPARK-33995][SQL] Expose make_interval as a Scala function" ### What changes were proposed in this pull request? This reverts commit e6753c9402b5c40d9e2af662f28bd4f07a0bae17. ### Why are the changes needed? The `make_interval` function aims to construct values of the legacy interval type `CalendarIntervalType` which will be substituted by ANSI interval types (see SPARK-27790). Since the function has not been released yet, it would be better to don't expose it via public API at all. ### Does this PR introduce _any_ user-facing change? Should not since the `make_interval` function has not been released yet. ### How was this patch tested? By existing test suites, and GA/jenkins builds. Closes #33143 from MaxGekk/revert-make_interval. 
Authored-by: Max Gekk Signed-off-by: Max Gekk --- .../scala/org/apache/spark/sql/functions.scala | 25 .../apache/spark/sql/JavaDateFunctionsSuite.java | 68 -- .../org/apache/spark/sql/DateFunctionsSuite.scala | 40 - 3 files changed, 133 deletions(-) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/functions.scala b/sql/core/src/main/scala/org/apache/spark/sql/functions.scala index c446d6b..ecd60ff 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/functions.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/functions.scala @@ -2929,31 +2929,6 @@ object functions { // /** - * (Scala-specific) Creates a datetime interval - * - * @param years Number of years - * @param months Number of months - * @param weeks Number of weeks - * @param days Number of days - * @param hours Number of hours - * @param mins Number of mins - * @param secs Number of secs - * @return A datetime interval - * @group datetime_funcs - * @since 3.2.0 - */ - def make_interval( - years: Column = lit(0), - months: Column = lit(0), - weeks: Column = lit(0), - days: Column = lit(0), - hours: Column = lit(0), - mins: Column = lit(0), - secs: Column = lit(0)): Column = withExpr { -MakeInterval(years.expr, months.expr, weeks.expr, days.expr, hours.expr, mins.expr, secs.expr) - } - - /** * Returns the date that is `numMonths` after `startDate`. * * @param startDate A date, timestamp or string. If a string, the data must be in a format that diff --git a/sql/core/src/test/java/test/org/apache/spark/sql/JavaDateFunctionsSuite.java b/sql/core/src/test/java/test/org/apache/spark/sql/JavaDateFunctionsSuite.java deleted file mode 100644 index 2d1de77..000 --- a/sql/core/src/test/java/test/org/apache/spark/sql/JavaDateFunctionsSuite.java +++ /dev/null @@ -1,68 +0,0 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. 
- * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. You may obtain a copy of the License at - * - *http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ - -package test.org.apache.spark.sql; - -import org.apache.spark.sql.Column; -import org.apache.spark.sql.Dataset; -import org.apache.spark.sql.Row; -import org.apache.spark.sql.RowFactory; -import org.apache.spark.sql.test.TestSparkSession; -import org.apache.spark.sql.types.StructType; -import org.junit.After; -import org.junit.Assert; -import org.junit.Before; -import org.junit.Test; - -import java.sql.Date; -import java.util.*; - -import static org.apache.spark.sql.types.DataTypes.*; -import static org.apache.spark.sql.functions.*; - -public class JavaDateFunctionsSuite { - private transient TestSparkSession spark; - - @Before - public void setUp() { -spark = new TestSparkSession(); -} - - @After - public void tearDown() { -spark.stop(); -spark = null; - } - - @Test - public void m
[spark] branch branch-3.0 updated (ab46045 -> 6a1361c)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from ab46045  [SPARK-35886][SQL][3.0] PromotePrecision should not overwrite genCode
  add 6a1361c  [SPARK-35935][SQL][3.1][3.0] Prevent failure of `MSCK REPAIR TABLE` on table refreshing

No new revisions were added by this update.

Summary of changes:
 .../scala/org/apache/spark/sql/execution/command/ddl.scala | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)
[spark] branch branch-3.1 updated (fe412b6 -> b6e8fab)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from fe412b6  [SPARK-35898][SQL] Fix arrays and maps in RowToColumnConverter
  add b6e8fab  [SPARK-35935][SQL][3.1][3.0] Prevent failure of `MSCK REPAIR TABLE` on table refreshing

No new revisions were added by this update.

Summary of changes:
 .../scala/org/apache/spark/sql/execution/command/ddl.scala | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)
[spark] branch master updated: [SPARK-35735][SQL] Take into account day-time interval fields in cast
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 2febd5c [SPARK-35735][SQL] Take into account day-time interval fields in cast 2febd5c is described below commit 2febd5c3f0c3a0c6660cfb340eb65316a1ca4acd Author: Angerszh AuthorDate: Wed Jun 30 16:05:04 2021 +0300 [SPARK-35735][SQL] Take into account day-time interval fields in cast ### What changes were proposed in this pull request? Support take into account day-time interval field in cast. ### Why are the changes needed? To conform to the SQL standard. ### Does this PR introduce _any_ user-facing change? An user can use `cast(str, DayTimeInterval(DAY, HOUR))`, for instance. ### How was this patch tested? Added UT. Closes #32943 from AngersZh/SPARK-35735. Authored-by: Angerszh Signed-off-by: Max Gekk --- .../spark/sql/catalyst/util/IntervalUtils.scala| 203 +++-- .../sql/catalyst/expressions/CastSuiteBase.scala | 145 ++- 2 files changed, 324 insertions(+), 24 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala index 7a6de7f..30a2fa5 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala @@ -119,18 +119,27 @@ object IntervalUtils { } } + val supportedFormat = Map( +(YM.YEAR, YM.MONTH) -> Seq("[+|-]y-m", "INTERVAL [+|-]'[+|-]y-m' YEAR TO MONTH"), +(YM.YEAR, YM.YEAR) -> Seq("[+|-]y", "INTERVAL [+|-]'[+|-]y' YEAR"), +(YM.MONTH, YM.MONTH) -> Seq("[+|-]m", "INTERVAL [+|-]'[+|-]m' MONTH"), +(DT.DAY, DT.DAY) -> Seq("[+|-]d", "INTERVAL [+|-]'[+|-]d' DAY"), +(DT.DAY, DT.HOUR) -> Seq("[+|-]d h", "INTERVAL [+|-]'[+|-]d h' DAY TO HOUR"), +(DT.DAY, DT.MINUTE) -> Seq("[+|-]d 
h:m", "INTERVAL [+|-]'[+|-]d h:m' DAY TO MINUTE"), +(DT.DAY, DT.SECOND) -> Seq("[+|-]d h:m:s.n", "INTERVAL [+|-]'[+|-]d h:m:s.n' DAY TO SECOND"), +(DT.HOUR, DT.HOUR) -> Seq("[+|-]h", "INTERVAL [+|-]'[+|-]h' HOUR"), +(DT.HOUR, DT.MINUTE) -> Seq("[+|-]h:m", "INTERVAL [+|-]'[+|-]h:m' HOUR TO MINUTE"), +(DT.HOUR, DT.SECOND) -> Seq("[+|-]h:m:s.n", "INTERVAL [+|-]'[+|-]h:m:s.n' HOUR TO SECOND"), +(DT.MINUTE, DT.MINUTE) -> Seq("[+|-]m", "INTERVAL [+|-]'[+|-]m' MINUTE"), +(DT.MINUTE, DT.SECOND) -> Seq("[+|-]m:s.n", "INTERVAL [+|-]'[+|-]m:s.n' MINUTE TO SECOND"), +(DT.SECOND, DT.SECOND) -> Seq("[+|-]s.n", "INTERVAL [+|-]'[+|-]s.n' SECOND") + ) + def castStringToYMInterval( input: UTF8String, startField: Byte, endField: Byte): Int = { -val supportedFormat = Map( - (YM.YEAR, YM.MONTH) -> -Seq("[+|-]y-m", "INTERVAL [+|-]'[+|-]y-m' YEAR TO MONTH"), - (YM.YEAR, YM.YEAR) -> Seq("[+|-]y", "INTERVAL [+|-]'[+|-]y' YEAR"), - (YM.MONTH, YM.MONTH) -> Seq("[+|-]m", "INTERVAL [+|-]'[+|-]m' MONTH") -) - def checkStringIntervalType(targetStartField: Byte, targetEndField: Byte): Unit = { if (startField != targetStartField || endField != targetEndField) { throw new IllegalArgumentException(s"Interval string does not match year-month format of " + @@ -151,7 +160,7 @@ object IntervalUtils { checkStringIntervalType(YM.YEAR, YM.MONTH) toYMInterval(year, month, getSign(firstSign, secondSign)) case yearMonthIndividualRegex(secondSign, value) => -safeToYMInterval { +safeToInterval { val sign = getSign("+", secondSign) if (endField == YM.YEAR) { sign * Math.toIntExact(value.toLong * MONTHS_PER_YEAR) @@ -166,7 +175,7 @@ object IntervalUtils { } } case yearMonthIndividualLiteralRegex(firstSign, secondSign, value, suffix) => -safeToYMInterval { +safeToInterval { val sign = getSign(firstSign, secondSign) if ("YEAR".equalsIgnoreCase(suffix)) { checkStringIntervalType(YM.YEAR, YM.YEAR) @@ -202,7 +211,7 @@ object IntervalUtils { } } - private def safeToYMInterval(f: => Int): Int = { + private def 
safeToInterval[T](f: => T): T = { try { f } catch { @@ -213,24 +222,72 @@ object IntervalUtils { }
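Each entry in the new `supportedFormat` table pairs a field range with the string shapes it accepts. A Python sketch of parsing one of them — the `[+|-]d h:m:s.n` DAY TO SECOND form — into microseconds, the physical representation of a day-time interval (a simplification of Spark's actual regexes, which also handle the `INTERVAL '...' DAY TO SECOND` literal form and overflow checks):

```python
import re

# One pattern from the supportedFormat table: "[+|-]d h:m:s.n"
DAY_TO_SECOND = re.compile(r"^([+-])?(\d+) (\d+):(\d+):(\d+)(?:\.(\d{1,9}))?$")

def parse_day_to_second(s: str) -> int:
    # Returns the interval as a total count of microseconds.
    m = DAY_TO_SECOND.match(s.strip())
    if m is None:
        raise ValueError(f"Interval string does not match day-time format: {s!r}")
    sign = -1 if m.group(1) == "-" else 1
    d, h, mi, sec = (int(m.group(i)) for i in range(2, 6))
    # Fractional part is up to 9 digits (nanos); keep microsecond precision.
    frac = int((m.group(6) or "0").ljust(6, "0")[:6])
    micros = ((d * 24 + h) * 60 + mi) * 60 * 1_000_000 + sec * 1_000_000 + frac
    return sign * micros

print(parse_day_to_second("1 0:0:1.5"))  # 86401500000
```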
[spark] branch master updated (108635a -> 356aef4)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 108635a  Revert "[SPARK-35904][SQL] Collapse above RebalancePartitions"
  add 356aef4  [SPARK-35728][SPARK-35778][SQL][TESTS] Check multiply/divide of day-time and year-month interval of any fields by a numeric

No new revisions were added by this update.

Summary of changes:
 .../expressions/IntervalExpressionsSuite.scala | 136 -
 1 file changed, 131 insertions(+), 5 deletions(-)
[spark] branch master updated: [SPARK-35895][SQL] Support subtracting Intervals from TimestampWithoutTZ
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 645fb59 [SPARK-35895][SQL] Support subtracting Intervals from TimestampWithoutTZ 645fb59 is described below commit 645fb59652fad5a7b84a691e2446b396cf81048a Author: Gengliang Wang AuthorDate: Sat Jun 26 13:19:00 2021 +0300 [SPARK-35895][SQL] Support subtracting Intervals from TimestampWithoutTZ ### What changes were proposed in this pull request? Support the following operation: - TimestampWithoutTZ - Year-Month interval The following operation is actually supported in https://github.com/apache/spark/pull/33076/. This PR is to add end-to-end tests for them: - TimestampWithoutTZ - Calendar interval - TimestampWithoutTZ - Daytime interval ### Why are the changes needed? Support subtracting all 3 interval types from a timestamp without time zone ### Does this PR introduce _any_ user-facing change? No, the timestamp without time zone type is not release yet. ### How was this patch tested? Unit tests Closes #33086 from gengliangwang/subtract. 
Authored-by: Gengliang Wang Signed-off-by: Max Gekk --- .../spark/sql/catalyst/analysis/Analyzer.scala | 2 +- .../test/resources/sql-tests/inputs/datetime.sql | 11 .../sql-tests/results/ansi/datetime.sql.out| 74 +- .../sql-tests/results/datetime-legacy.sql.out | 74 +- .../resources/sql-tests/results/datetime.sql.out | 74 +- 5 files changed, 231 insertions(+), 4 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala index 6737ed5..5de228b 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala @@ -378,7 +378,7 @@ class Analyzer(override val catalogManager: CatalogManager) DatetimeSub(l, r, TimeAdd(Cast(l, TimestampType), UnaryMinus(r, f))) case (DateType, _: YearMonthIntervalType) => DatetimeSub(l, r, DateAddYMInterval(l, UnaryMinus(r, f))) - case (TimestampType, _: YearMonthIntervalType) => + case (TimestampType | TimestampWithoutTZType, _: YearMonthIntervalType) => DatetimeSub(l, r, TimestampAddYMInterval(l, UnaryMinus(r, f))) case (CalendarIntervalType, CalendarIntervalType) | (_: DayTimeIntervalType, _: DayTimeIntervalType) => s diff --git a/sql/core/src/test/resources/sql-tests/inputs/datetime.sql b/sql/core/src/test/resources/sql-tests/inputs/datetime.sql index 819bf4b..d68c9ff 100644 --- a/sql/core/src/test/resources/sql-tests/inputs/datetime.sql +++ b/sql/core/src/test/resources/sql-tests/inputs/datetime.sql @@ -236,3 +236,14 @@ select to_timestamp_ntz('2021-06-25 10:11:12') + interval '10-9' year to month; select to_timestamp_ntz('2021-06-25 10:11:12') + interval '20 15' day to hour; select to_timestamp_ntz('2021-06-25 10:11:12') + interval '20 15:40' day to minute; select to_timestamp_ntz('2021-06-25 10:11:12') + interval '20 15:40:32.9989' day to second; + +-- TimestampWithoutTZ - Intervals 
+select to_timestamp_ntz('2021-06-25 10:11:12') - interval 2 day; +select to_timestamp_ntz('2021-06-25 10:11:12') - interval '0-0' year to month; +select to_timestamp_ntz('2021-06-25 10:11:12') - interval '1-2' year to month; +select to_timestamp_ntz('2021-06-25 10:11:12') - interval '0 0:0:0' day to second; +select to_timestamp_ntz('2021-06-25 10:11:12') - interval '0 0:0:0.1' day to second; +select to_timestamp_ntz('2021-06-25 10:11:12') - interval '10-9' year to month; +select to_timestamp_ntz('2021-06-25 10:11:12') - interval '20 15' day to hour; +select to_timestamp_ntz('2021-06-25 10:11:12') - interval '20 15:40' day to minute; +select to_timestamp_ntz('2021-06-25 10:11:12') - interval '20 15:40:32.9989' day to second; diff --git a/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out b/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out index 33d041b..08b01ca 100644 --- a/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/ansi/datetime.sql.out @@ -1,5 +1,5 @@ -- Automatically generated by SQLQueryTestSuite --- Number of queries: 176 +-- Number of queries: 185 -- !query @@ -1506,3 +1506,75 @@ select to_timestamp_ntz('2021-06-25 10:11:12') + interval '20 15:40:32.9989' struct -- !query output 2021-07-16 01:51:44.998999 + + +-- !query +select to_timestamp_ntz('2021-06-25 10:11:12') - interval 2 day +-- !
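The subtraction cases above are rewritten as addition of the negated interval (`TimestampAddYMInterval` over `UnaryMinus`). A Python sketch of the year-month case against a naive `datetime`, which plays the role of TIMESTAMP WITHOUT TIME ZONE — no time zone is consulted at any point (this sketch does not clamp days at month ends, so behavior may differ from Spark for dates like the 31st):

```python
from datetime import datetime

def subtract_ym_interval(ts: datetime, months: int) -> datetime:
    # Subtracting INTERVAL '1-2' YEAR TO MONTH means months = 14.
    # Work in a flat month index so year borrows fall out naturally.
    total = ts.year * 12 + (ts.month - 1) - months
    return ts.replace(year=total // 12, month=total % 12 + 1)

print(subtract_ym_interval(datetime(2021, 6, 25, 10, 11, 12), 14))
# 2020-04-25 10:11:12
```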
[spark] branch branch-3.2 updated: [SPARK-36083][SQL] make_timestamp: return different result based on the default timestamp type
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 09e5bbd [SPARK-36083][SQL] make_timestamp: return different result based on the default timestamp type 09e5bbd is described below commit 09e5bbdfbe53ccb4efd85f5774b7bee9e731a14f Author: Gengliang Wang AuthorDate: Sun Jul 11 20:47:49 2021 +0300 [SPARK-36083][SQL] make_timestamp: return different result based on the default timestamp type ### What changes were proposed in this pull request? The SQL function MAKE_TIMESTAMP should return different results based on the default timestamp type: * when "spark.sql.timestampType" is TIMESTAMP_NTZ, return TimestampNTZType literal * when "spark.sql.timestampType" is TIMESTAMP_LTZ, return TimestampType literal ### Why are the changes needed? As "spark.sql.timestampType" sets the default timestamp type, the make_timestamp function should behave consistently with it. ### Does this PR introduce _any_ user-facing change? Yes, when the value of "spark.sql.timestampType" is TIMESTAMP_NTZ, the result type of `MAKE_TIMESTAMP` is of TIMESTAMP_NTZ type. ### How was this patch tested? Unit test Closes #33290 from gengliangwang/mkTS. 
Authored-by: Gengliang Wang Signed-off-by: Max Gekk (cherry picked from commit 17ddcc9e8273a098b63984b950bfa6cd36b58b99) Signed-off-by: Max Gekk --- .../catalyst/expressions/datetimeExpressions.scala | 30 -- .../expressions/DateExpressionsSuite.scala | 120 +++-- .../test/resources/sql-tests/inputs/datetime.sql | 4 + .../sql-tests/results/ansi/datetime.sql.out| 19 +++- .../sql-tests/results/datetime-legacy.sql.out | 18 +++- .../resources/sql-tests/results/datetime.sql.out | 18 +++- .../results/timestampNTZ/datetime.sql.out | 18 +++- 7 files changed, 161 insertions(+), 66 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala index be527ce..979eeba 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala @@ -2286,7 +2286,8 @@ case class MakeDate( // scalastyle:off line.size.limit @ExpressionDescription( - usage = "_FUNC_(year, month, day, hour, min, sec[, timezone]) - Create timestamp from year, month, day, hour, min, sec and timezone fields.", + usage = "_FUNC_(year, month, day, hour, min, sec[, timezone]) - Create timestamp from year, month, day, hour, min, sec and timezone fields. 
" + +"The result data type is consistent with the value of configuration `spark.sql.timestampType`", arguments = """ Arguments: * year - the year to represent, from 1 to @@ -2324,7 +2325,8 @@ case class MakeTimestamp( sec: Expression, timezone: Option[Expression] = None, timeZoneId: Option[String] = None, -failOnError: Boolean = SQLConf.get.ansiEnabled) +failOnError: Boolean = SQLConf.get.ansiEnabled, +override val dataType: DataType = SQLConf.get.timestampType) extends SeptenaryExpression with TimeZoneAwareExpression with ImplicitCastInputTypes with NullIntolerant { @@ -2335,7 +2337,8 @@ case class MakeTimestamp( hour: Expression, min: Expression, sec: Expression) = { -this(year, month, day, hour, min, sec, None, None, SQLConf.get.ansiEnabled) +this(year, month, day, hour, min, sec, None, None, SQLConf.get.ansiEnabled, + SQLConf.get.timestampType) } def this( @@ -2346,7 +2349,8 @@ case class MakeTimestamp( min: Expression, sec: Expression, timezone: Expression) = { -this(year, month, day, hour, min, sec, Some(timezone), None, SQLConf.get.ansiEnabled) +this(year, month, day, hour, min, sec, Some(timezone), None, SQLConf.get.ansiEnabled, + SQLConf.get.timestampType) } override def children: Seq[Expression] = Seq(year, month, day, hour, min, sec) ++ timezone @@ -2355,7 +2359,6 @@ case class MakeTimestamp( override def inputTypes: Seq[AbstractDataType] = Seq(IntegerType, IntegerType, IntegerType, IntegerType, IntegerType, DecimalType(8, 6)) ++ timezone.map(_ => StringType) - override def dataType: DataType = TimestampType override def nullable: Boolean = if (failOnError) children.exists(_.nullable) else true override def withTimeZone(timeZoneId: String): TimeZoneAwareExpression = @@ -2388,7 +2391,11 @@ case class MakeTimestam
[spark] branch master updated (cfcd094 -> 17ddcc9)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from cfcd094 [SPARK-36036][CORE] Fix cleanup of DownloadFile resources add 17ddcc9 [SPARK-36083][SQL] make_timestamp: return different result based on the default timestamp type No new revisions were added by this update. Summary of changes: .../catalyst/expressions/datetimeExpressions.scala | 30 -- .../expressions/DateExpressionsSuite.scala | 120 +++-- .../test/resources/sql-tests/inputs/datetime.sql | 4 + .../sql-tests/results/ansi/datetime.sql.out| 19 +++- .../sql-tests/results/datetime-legacy.sql.out | 18 +++- .../resources/sql-tests/results/datetime.sql.out | 18 +++- .../results/timestampNTZ/datetime.sql.out | 18 +++- 7 files changed, 161 insertions(+), 66 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
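The change above makes `make_timestamp`'s result type follow the session default (`spark.sql.timestampType`) instead of being hard-wired to `TimestampType`. A minimal plain-Python sketch of that dispatch idea (illustrative only — names and the use of UTC as the stand-in "local" zone are assumptions, not Spark's Scala implementation):

```python
from datetime import datetime, timezone

# Illustrative sketch: a make_timestamp-like constructor whose result kind
# follows a session default, mirroring how MakeTimestamp now takes its
# dataType from spark.sql.timestampType instead of always TimestampType.
def make_timestamp(year, month, day, hour, minute, sec, *, default_type="TIMESTAMP_LTZ"):
    ts = datetime(year, month, day, hour, minute, sec)
    if default_type == "TIMESTAMP_NTZ":
        return ts  # naive datetime: no zone attached, like TIMESTAMP_NTZ
    # UTC stands in for the session time zone in this sketch.
    return ts.replace(tzinfo=timezone.utc)

ltz = make_timestamp(2021, 7, 11, 6, 30, 45)
ntz = make_timestamp(2021, 7, 11, 6, 30, 45, default_type="TIMESTAMP_NTZ")
```

The same wall-clock fields are produced either way; only whether a zone is attached depends on the configured default.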
[spark] branch branch-3.2 updated: [SPARK-36044][SQL] Support TimestampNTZ in functions unix_timestamp/to_unix_timestamp
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 5816482 [SPARK-36044][SQL] Support TimestampNTZ in functions unix_timestamp/to_unix_timestamp 5816482 is described below commit 58164828683db50fb0e1698b679ffc0a773da847 Author: gengjiaan AuthorDate: Mon Jul 12 09:55:43 2021 +0300 [SPARK-36044][SQL] Support TimestampNTZ in functions unix_timestamp/to_unix_timestamp ### What changes were proposed in this pull request? The functions `unix_timestamp`/`to_unix_timestamp` should be able to accept input of `TimestampNTZType`. ### Why are the changes needed? The functions `unix_timestamp`/`to_unix_timestamp` should be able to accept input of `TimestampNTZType`. ### Does this PR introduce _any_ user-facing change? 'Yes'. ### How was this patch tested? New tests. Closes #33278 from beliefer/SPARK-36044. 
Authored-by: gengjiaan Signed-off-by: Max Gekk (cherry picked from commit 8738682f6a36436da0e9fc332d58b2f41309e2c2) Signed-off-by: Max Gekk --- .../spark/sql/catalyst/expressions/datetimeExpressions.scala | 8 +--- .../spark/sql/catalyst/expressions/DateExpressionsSuite.scala | 9 + .../test/scala/org/apache/spark/sql/DateFunctionsSuite.scala | 11 ++- 3 files changed, 24 insertions(+), 4 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala index 979eeba..f0ed89e 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala @@ -1091,8 +1091,10 @@ abstract class ToTimestamp override protected def formatString: Expression = right override protected def isParsing = true + override def forTimestampNTZ: Boolean = left.dataType == TimestampNTZType + override def inputTypes: Seq[AbstractDataType] = -Seq(TypeCollection(StringType, DateType, TimestampType), StringType) +Seq(TypeCollection(StringType, DateType, TimestampType, TimestampNTZType), StringType) override def dataType: DataType = LongType override def nullable: Boolean = if (failOnError) children.exists(_.nullable) else true @@ -1112,7 +1114,7 @@ abstract class ToTimestamp left.dataType match { case DateType => daysToMicros(t.asInstanceOf[Int], zoneId) / downScaleFactor -case TimestampType => +case TimestampType | TimestampNTZType => t.asInstanceOf[Long] / downScaleFactor case StringType => val fmt = right.eval(input) @@ -1192,7 +1194,7 @@ abstract class ToTimestamp |} |""".stripMargin) } - case TimestampType => + case TimestampType | TimestampNTZType => val eval1 = left.genCode(ctx) ev.copy(code = code""" ${eval1.code} diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/DateExpressionsSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/DateExpressionsSuite.scala index 5f071c3..02d6d95 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/DateExpressionsSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/DateExpressionsSuite.scala @@ -916,6 +916,11 @@ class DateExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper { Literal(new Timestamp(100)), Literal("-MM-dd HH:mm:ss"), timeZoneId), 1000L) checkEvaluation( + UnixTimestamp( + Literal(DateTimeUtils.microsToLocalDateTime(DateTimeUtils.millisToMicros(100))), +Literal("-MM-dd HH:mm:ss"), timeZoneId), + 1000L) +checkEvaluation( UnixTimestamp(Literal(date1), Literal("-MM-dd HH:mm:ss"), timeZoneId), MICROSECONDS.toSeconds( DateTimeUtils.daysToMicros(DateTimeUtils.fromJavaDate(date1), tz.toZoneId))) @@ -981,6 +986,10 @@ class DateExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper { checkEvaluation(ToUnixTimestamp( Literal(new Timestamp(100)), Literal(fmt1)), 1000L) +checkEvaluation(ToUnixTimestamp( + Literal(DateTimeUtils.microsToLocalDateTime(DateTimeUtils.millisToMicros(100))), + Literal(fmt1)), + 1000L) checkEvaluation( ToUnixTimestamp(Literal(date1), Literal(fmt
[spark] branch master updated (badb039 -> 8738682)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from badb039 [SPARK-36003][PYTHON] Implement unary operator `invert` of integral ps.Series/Index add 8738682 [SPARK-36044][SQL] Support TimestampNTZ in functions unix_timestamp/to_unix_timestamp No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/expressions/datetimeExpressions.scala | 8 +--- .../spark/sql/catalyst/expressions/DateExpressionsSuite.scala | 9 + .../test/scala/org/apache/spark/sql/DateFunctionsSuite.scala | 11 ++- 3 files changed, 24 insertions(+), 4 deletions(-)
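The diff above handles `TimestampType | TimestampNTZType` with the same branch, because internally both are microsecond counts and `unix_timestamp`/`to_unix_timestamp` only need to down-scale them. A small sketch of that down-scaling (illustrative Python; `to_unix_seconds` is a made-up name, not Spark API):

```python
MICROS_PER_SECOND = 1_000_000  # ToTimestamp's downScaleFactor for seconds

# Both TIMESTAMP and TIMESTAMP_NTZ values are stored as microseconds, so the
# conversion to unix seconds is the same integer division for either type.
def to_unix_seconds(micros: int) -> int:
    # Floor division; for non-negative microsecond counts this matches
    # Scala's Long division used in the expression above.
    return micros // MICROS_PER_SECOND

seconds = to_unix_seconds(1_625_000_000_123_456)
```

This is why supporting `TimestampNTZType` only required widening the accepted input types rather than adding new evaluation logic.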
[spark] branch master updated (8738682 -> 32720dd)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 8738682 [SPARK-36044][SQL] Support TimestampNTZ in functions unix_timestamp/to_unix_timestamp add 32720dd [SPARK-36072][SQL] TO_TIMESTAMP: return different results based on the default timestamp type No new revisions were added by this update. Summary of changes: .../catalyst/expressions/datetimeExpressions.scala | 54 +- .../expressions/DateExpressionsSuite.scala | 23 +++--- .../results/timestampNTZ/datetime.sql.out | 84 +++--- 3 files changed, 76 insertions(+), 85 deletions(-)
[spark] branch branch-3.2 updated: [SPARK-36072][SQL] TO_TIMESTAMP: return different results based on the default timestamp type
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 4e9e2f3 [SPARK-36072][SQL] TO_TIMESTAMP: return different results based on the default timestamp type 4e9e2f3 is described below commit 4e9e2f32e84cd1d317e25b85eb18f2c662bc641f Author: Gengliang Wang AuthorDate: Mon Jul 12 10:12:30 2021 +0300 [SPARK-36072][SQL] TO_TIMESTAMP: return different results based on the default timestamp type ### What changes were proposed in this pull request? The SQL function TO_TIMESTAMP should return different results based on the default timestamp type: * when "spark.sql.timestampType" is TIMESTAMP_NTZ, return TimestampNTZType literal * when "spark.sql.timestampType" is TIMESTAMP_LTZ, return TimestampType literal This PR also refactor the class GetTimestamp and GetTimestampNTZ to reduce duplicated code. ### Why are the changes needed? As "spark.sql.timestampType" sets the default timestamp type, the to_timestamp function should behave consistently with it. ### Does this PR introduce _any_ user-facing change? Yes, when the value of "spark.sql.timestampType" is TIMESTAMP_NTZ, the result type of `TO_TIMESTAMP` is of TIMESTAMP_NTZ type. ### How was this patch tested? Unit test Closes #33280 from gengliangwang/to_timestamp. 
Authored-by: Gengliang Wang Signed-off-by: Max Gekk (cherry picked from commit 32720dd3e18ea43c6d88125a52356f40f808b300) Signed-off-by: Max Gekk --- .../catalyst/expressions/datetimeExpressions.scala | 54 +- .../expressions/DateExpressionsSuite.scala | 23 +++--- .../results/timestampNTZ/datetime.sql.out | 84 +++--- 3 files changed, 76 insertions(+), 85 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala index f0ed89e..0ebeacb 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala @@ -1008,17 +1008,17 @@ case class UnixTimestamp( copy(timeExp = newLeft, format = newRight) } -case class GetTimestampNTZ( +/** + * Gets a timestamp from a string or a date. + */ +case class GetTimestamp( left: Expression, right: Expression, +override val dataType: DataType, timeZoneId: Option[String] = None, failOnError: Boolean = SQLConf.get.ansiEnabled) extends ToTimestamp { - override val forTimestampNTZ: Boolean = true - - override def inputTypes: Seq[AbstractDataType] = Seq(StringType, StringType) - - override def dataType: DataType = TimestampNTZType + override val forTimestampNTZ: Boolean = dataType == TimestampNTZType override protected def downScaleFactor: Long = 1 @@ -1064,7 +1064,7 @@ case class ParseToTimestampNTZ( child: Expression) extends RuntimeReplaceable { def this(left: Expression, format: Expression) = { -this(left, Option(format), GetTimestampNTZ(left, format)) +this(left, Option(format), GetTimestamp(left, format, TimestampNTZType)) } def this(left: Expression) = this(left, None, Cast(left, TimestampNTZType)) @@ -1886,7 +1886,7 @@ case class ParseToDate(left: Expression, format: Option[Expression], child: Expr extends RuntimeReplaceable { def 
this(left: Expression, format: Expression) = { -this(left, Option(format), Cast(GetTimestamp(left, format), DateType)) +this(left, Option(format), Cast(GetTimestamp(left, format, TimestampType), DateType)) } def this(left: Expression) = { @@ -1911,7 +1911,8 @@ case class ParseToDate(left: Expression, format: Option[Expression], child: Expr usage = """ _FUNC_(timestamp_str[, fmt]) - Parses the `timestamp_str` expression with the `fmt` expression to a timestamp. Returns null with invalid input. By default, it follows casting rules to - a timestamp if the `fmt` is omitted. + a timestamp if the `fmt` is omitted. The result data type is consistent with the value of + configuration `spark.sql.timestampType`. """, arguments = """ Arguments: @@ -1929,20 +1930,24 @@ case class ParseToDate(left: Expression, format: Option[Expression], child: Expr group = "datetime_funcs", since = "2.2.0") // scalastyle:on line.size.limit -case class ParseToTimestamp(left: Expression, format: Option[Expression], child: Expression) - extends RuntimeReplaceable { +case class ParseToTimestamp( +left: Expression, +format: Opt
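The refactor in the diff above collapses `GetTimestamp` and `GetTimestampNTZ` into one class that carries its result `dataType` and derives the NTZ flag from it. A sketch of the shape of that refactor (illustrative Python with string stand-ins for expressions and types; not Spark's Scala API):

```python
from dataclasses import dataclass

# One class parameterized by its result data type, instead of two near-duplicate
# classes; the NTZ behavior flag is derived from the type rather than hard-coded.
@dataclass
class GetTimestamp:
    left: str        # stand-in for the input expression
    fmt: str         # stand-in for the format expression
    data_type: str   # "TIMESTAMP" (with local time zone) or "TIMESTAMP_NTZ"

    @property
    def for_timestamp_ntz(self) -> bool:
        return self.data_type == "TIMESTAMP_NTZ"

ntz = GetTimestamp("col", "yyyy-MM-dd", "TIMESTAMP_NTZ")
ltz = GetTimestamp("col", "yyyy-MM-dd", "TIMESTAMP")
```

Callers such as `ParseToDate` then pass the desired type explicitly (`GetTimestamp(left, format, TimestampType)`), which is exactly the duplicated-code reduction the PR description mentions.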
[spark] branch master updated: [SPARK-36046][SQL] Support new functions make_timestamp_ntz and make_timestamp_ltz
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 92bf83e [SPARK-36046][SQL] Support new functions make_timestamp_ntz and make_timestamp_ltz 92bf83e is described below commit 92bf83ed0aba2c399debb1db5fb88bad3961ab06 Author: Gengliang Wang AuthorDate: Mon Jul 12 22:44:26 2021 +0300 [SPARK-36046][SQL] Support new functions make_timestamp_ntz and make_timestamp_ltz ### What changes were proposed in this pull request? Support new functions make_timestamp_ntz and make_timestamp_ltz Syntax: * `make_timestamp_ntz(year, month, day, hour, min, sec)`: Create local date-time from year, month, day, hour, min, sec fields * `make_timestamp_ltz(year, month, day, hour, min, sec[, timezone])`: Create current timestamp with local time zone from year, month, day, hour, min, sec and timezone fields ### Why are the changes needed? As the result of `make_timestamp` becomes consistent with the SQL configuration `spark.sql.timestampType`, we need these two new functions to construct timestamp literals. They align to the functions [`make_timestamp` and `make_timestamptz`](https://www.postgresql.org/docs/9.4/functions-datetime.html) in PostgreSQL ### Does this PR introduce _any_ user-facing change? Yes, two new datetime functions: make_timestamp_ntz and make_timestamp_ltz. ### How was this patch tested? End-to-end tests. Closes #33299 from gengliangwang/make_timestamp_ntz_ltz. 
Authored-by: Gengliang Wang Signed-off-by: Max Gekk --- .../sql/catalyst/analysis/FunctionRegistry.scala | 2 + .../catalyst/expressions/datetimeExpressions.scala | 122 + .../sql-functions/sql-expression-schema.md | 6 +- .../test/resources/sql-tests/inputs/datetime.sql | 12 ++ .../sql-tests/results/ansi/datetime.sql.out| 61 ++- .../sql-tests/results/datetime-legacy.sql.out | 59 +- .../resources/sql-tests/results/datetime.sql.out | 59 +- .../results/timestampNTZ/datetime.sql.out | 59 +- 8 files changed, 374 insertions(+), 6 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala index fcc0220..4fd871d 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala @@ -552,6 +552,8 @@ object FunctionRegistry { expression[TimeWindow]("window"), expression[MakeDate]("make_date"), expression[MakeTimestamp]("make_timestamp"), +expression[MakeTimestampNTZ]("make_timestamp_ntz", true), +expression[MakeTimestampLTZ]("make_timestamp_ltz", true), expression[MakeInterval]("make_interval"), expression[MakeDTInterval]("make_dt_interval"), expression[MakeYMInterval]("make_ym_interval"), diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala index 0ebeacb..2840b18 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala @@ -2272,6 +2272,128 @@ case class MakeDate( // scalastyle:off line.size.limit @ExpressionDescription( + usage = "_FUNC_(year, month, day, hour, min, sec) - Create local date-time 
from year, month, day, hour, min, sec fields. ", + arguments = """ +Arguments: + * year - the year to represent, from 1 to + * month - the month-of-year to represent, from 1 (January) to 12 (December) + * day - the day-of-month to represent, from 1 to 31 + * hour - the hour-of-day to represent, from 0 to 23 + * min - the minute-of-hour to represent, from 0 to 59 + * sec - the second-of-minute and its micro-fraction to represent, from + 0 to 60. If the sec argument equals to 60, the seconds field is set + to 0 and 1 minute is added to the final timestamp. + """, + examples = """ +Examples: + > SELECT _FUNC_(2014, 12, 28, 6, 30, 45.887); + 2014-12-28 06:30:45.887 + > SELECT _FUNC_(2019, 6, 30, 23, 59, 60); + 2019-07-01 00:00:00 + > SELECT _FUNC_(null, 7, 22, 15, 30, 0); + NULL + """, + group = "datetime_funcs", + since = "3.2.0") +// sca
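The argument documentation above specifies an unusual edge case: `sec` may be 60, in which case the seconds field is set to 0 and one minute is added, as the `make_timestamp_ntz(2019, 6, 30, 23, 59, 60)` → `2019-07-01 00:00:00` example shows. A sketch of that documented behavior (illustrative Python, not Spark's implementation):

```python
from datetime import datetime, timedelta

# Sketch of the documented sec-argument semantics: fractional seconds become
# microseconds, and sec == 60 rolls over into the next minute.
def make_timestamp_ntz(year, month, day, hour, minute, sec):
    if sec == 60:
        # "the seconds field is set to 0 and 1 minute is added"
        return datetime(year, month, day, hour, minute, 0) + timedelta(minutes=1)
    whole = int(sec)
    micros = round((sec - whole) * 1_000_000)
    return datetime(year, month, day, hour, minute, whole, micros)

print(make_timestamp_ntz(2019, 6, 30, 23, 59, 60))  # 2019-07-01 00:00:00
```

Note the rollover can cross a day, month, or year boundary, which is why the example lands on July 1st.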
[spark] branch branch-3.2 updated: [SPARK-36046][SQL] Support new functions make_timestamp_ntz and make_timestamp_ltz
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new fba3e90 [SPARK-36046][SQL] Support new functions make_timestamp_ntz and make_timestamp_ltz fba3e90 is described below commit fba3e90863e584361b2f61828a500c96d89c35de Author: Gengliang Wang AuthorDate: Mon Jul 12 22:44:26 2021 +0300 [SPARK-36046][SQL] Support new functions make_timestamp_ntz and make_timestamp_ltz ### What changes were proposed in this pull request? Support new functions make_timestamp_ntz and make_timestamp_ltz Syntax: * `make_timestamp_ntz(year, month, day, hour, min, sec)`: Create local date-time from year, month, day, hour, min, sec fields * `make_timestamp_ltz(year, month, day, hour, min, sec[, timezone])`: Create current timestamp with local time zone from year, month, day, hour, min, sec and timezone fields ### Why are the changes needed? As the result of `make_timestamp` becomes consistent with the SQL configuration `spark.sql.timestampType`, we need these two new functions to construct timestamp literals. They align to the functions [`make_timestamp` and `make_timestamptz`](https://www.postgresql.org/docs/9.4/functions-datetime.html) in PostgreSQL ### Does this PR introduce _any_ user-facing change? Yes, two new datetime functions: make_timestamp_ntz and make_timestamp_ltz. ### How was this patch tested? End-to-end tests. Closes #33299 from gengliangwang/make_timestamp_ntz_ltz. 
Authored-by: Gengliang Wang Signed-off-by: Max Gekk (cherry picked from commit 92bf83ed0aba2c399debb1db5fb88bad3961ab06) Signed-off-by: Max Gekk --- .../sql/catalyst/analysis/FunctionRegistry.scala | 2 + .../catalyst/expressions/datetimeExpressions.scala | 122 + .../sql-functions/sql-expression-schema.md | 6 +- .../test/resources/sql-tests/inputs/datetime.sql | 12 ++ .../sql-tests/results/ansi/datetime.sql.out| 61 ++- .../sql-tests/results/datetime-legacy.sql.out | 59 +- .../resources/sql-tests/results/datetime.sql.out | 59 +- .../results/timestampNTZ/datetime.sql.out | 59 +- 8 files changed, 374 insertions(+), 6 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala index fcc0220..4fd871d 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala @@ -552,6 +552,8 @@ object FunctionRegistry { expression[TimeWindow]("window"), expression[MakeDate]("make_date"), expression[MakeTimestamp]("make_timestamp"), +expression[MakeTimestampNTZ]("make_timestamp_ntz", true), +expression[MakeTimestampLTZ]("make_timestamp_ltz", true), expression[MakeInterval]("make_interval"), expression[MakeDTInterval]("make_dt_interval"), expression[MakeYMInterval]("make_ym_interval"), diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala index 0ebeacb..2840b18 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala @@ -2272,6 +2272,128 @@ case class MakeDate( // scalastyle:off line.size.limit 
@ExpressionDescription( + usage = "_FUNC_(year, month, day, hour, min, sec) - Create local date-time from year, month, day, hour, min, sec fields. ", + arguments = """ +Arguments: + * year - the year to represent, from 1 to + * month - the month-of-year to represent, from 1 (January) to 12 (December) + * day - the day-of-month to represent, from 1 to 31 + * hour - the hour-of-day to represent, from 0 to 23 + * min - the minute-of-hour to represent, from 0 to 59 + * sec - the second-of-minute and its micro-fraction to represent, from + 0 to 60. If the sec argument equals to 60, the seconds field is set + to 0 and 1 minute is added to the final timestamp. + """, + examples = """ +Examples: + > SELECT _FUNC_(2014, 12, 28, 6, 30, 45.887); + 2014-12-28 06:30:45.887 + > SELECT _FUNC_(2019, 6, 30, 23, 59, 60); + 2019-07-01 00:00:00 + > SELECT _FUNC_(null, 7, 22, 15, 30, 0);
[spark] branch master updated: [SPARK-35977][SQL] Support non-reserved keyword TIMESTAMP_NTZ
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 5f44acf [SPARK-35977][SQL] Support non-reserved keyword TIMESTAMP_NTZ 5f44acf is described below commit 5f44acff3df51721fe891ea50c0a5bcf3a37a719 Author: Gengliang Wang AuthorDate: Mon Jul 5 22:30:44 2021 +0300 [SPARK-35977][SQL] Support non-reserved keyword TIMESTAMP_NTZ ### What changes were proposed in this pull request? Support new keyword TIMESTAMP_NTZ, which can be used for: - timestamp without time zone data type in DDL - timestamp without time zone data type in Cast clause. - timestamp without time zone data type literal ### Why are the changes needed? Users can use `TIMESTAMP_NTZ` in DDL/Cast/Literals for the timestamp without time zone type directly. ### Does this PR introduce _any_ user-facing change? No, the new timestamp type is not released yet. ### How was this patch tested? Unit test Closes #33221 from gengliangwang/timstamp_ntz. 
Authored-by: Gengliang Wang Signed-off-by: Max Gekk --- .../main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala | 4 .../org/apache/spark/sql/catalyst/parser/DataTypeParserSuite.scala| 1 + .../org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala | 3 +++ 3 files changed, 8 insertions(+) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala index 5b9107f..c650cf0 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala @@ -2125,6 +2125,9 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with SQLConfHelper with Logg val zoneId = getZoneId(conf.sessionLocalTimeZone) val specialDate = convertSpecialDate(value, zoneId).map(Literal(_, DateType)) specialDate.getOrElse(toLiteral(stringToDate, DateType)) +case "TIMESTAMP_NTZ" => + val specialTs = convertSpecialTimestampNTZ(value).map(Literal(_, TimestampNTZType)) + specialTs.getOrElse(toLiteral(stringToTimestampWithoutTimeZone, TimestampNTZType)) case "TIMESTAMP" => def constructTimestampLTZLiteral(value: String): Literal = { val zoneId = getZoneId(conf.sessionLocalTimeZone) @@ -2525,6 +2528,7 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with SQLConfHelper with Logg case ("double", Nil) => DoubleType case ("date", Nil) => DateType case ("timestamp", Nil) => SQLConf.get.timestampType + case ("timestamp_ntz", Nil) => TimestampNTZType case ("string", Nil) => StringType case ("character" | "char", length :: Nil) => CharType(length.getText.toInt) case ("varchar", length :: Nil) => VarcharType(length.getText.toInt) diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DataTypeParserSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DataTypeParserSuite.scala index a6b78e0..d34 100644 --- 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DataTypeParserSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DataTypeParserSuite.scala @@ -58,6 +58,7 @@ class DataTypeParserSuite extends SparkFunSuite with SQLHelper { checkDataType("deC", DecimalType.USER_DEFAULT) checkDataType("DATE", DateType) checkDataType("timestamp", TimestampType) + checkDataType("timestamp_ntz", TimestampNTZType) checkDataType("string", StringType) checkDataType("ChaR(5)", CharType(5)) checkDataType("ChaRacter(5)", CharType(5)) diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala index 37e2d9b..7b13fa9 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala @@ -465,6 +465,9 @@ class ExpressionParserSuite extends AnalysisTest { intercept("timestamP '2016-33-11 20:54:00.000'", "Cannot parse the TIMESTAMP value") // Timestamp without time zone +assertEqual("tImEstAmp_Ntz '2016-03-11 20:54:00.000'", + Literal(LocalDateTime.parse("2016-03-11T20:54:00.000"))) +intercept("tImE
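The parser test above (`tImEstAmp_Ntz '2016-03-11 20:54:00.000'` → `Literal(LocalDateTime.parse(...))`) shows what a `TIMESTAMP_NTZ` literal denotes: a wall-clock date-time with no zone attached, so parsing never consults the session time zone. A sketch of that semantics (illustrative Python; Spark's parser actually builds a `LocalDateTime`-backed literal in Scala):

```python
from datetime import datetime

# A TIMESTAMP_NTZ literal is just the wall-clock fields; the result carries
# no tzinfo, matching java.time.LocalDateTime on the Scala side.
def parse_timestamp_ntz_literal(value: str) -> datetime:
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S.%f")

ntz = parse_timestamp_ntz_literal("2016-03-11 20:54:00.000")
```

Because the keyword is non-reserved, `timestamp_ntz` also remains usable as an identifier elsewhere in SQL.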
[spark] branch branch-3.2 updated: [SPARK-35977][SQL] Support non-reserved keyword TIMESTAMP_NTZ
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 1ec37dd [SPARK-35977][SQL] Support non-reserved keyword TIMESTAMP_NTZ 1ec37dd is described below commit 1ec37dd164ae64c78bdccd8c0604b4013a692015 Author: Gengliang Wang AuthorDate: Mon Jul 5 22:30:44 2021 +0300 [SPARK-35977][SQL] Support non-reserved keyword TIMESTAMP_NTZ ### What changes were proposed in this pull request? Support new keyword TIMESTAMP_NTZ, which can be used for: - timestamp without time zone data type in DDL - timestamp without time zone data type in Cast clause. - timestamp without time zone data type literal ### Why are the changes needed? Users can use `TIMESTAMP_NTZ` in DDL/Cast/Literals for the timestamp without time zone type directly. ### Does this PR introduce _any_ user-facing change? No, the new timestamp type is not released yet. ### How was this patch tested? Unit test Closes #33221 from gengliangwang/timstamp_ntz. 
Authored-by: Gengliang Wang Signed-off-by: Max Gekk (cherry picked from commit 5f44acff3df51721fe891ea50c0a5bcf3a37a719) Signed-off-by: Max Gekk --- .../main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala | 4 .../org/apache/spark/sql/catalyst/parser/DataTypeParserSuite.scala| 1 + .../org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala | 3 +++ 3 files changed, 8 insertions(+) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala index 5b9107f..c650cf0 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala @@ -2125,6 +2125,9 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with SQLConfHelper with Logg val zoneId = getZoneId(conf.sessionLocalTimeZone) val specialDate = convertSpecialDate(value, zoneId).map(Literal(_, DateType)) specialDate.getOrElse(toLiteral(stringToDate, DateType)) +case "TIMESTAMP_NTZ" => + val specialTs = convertSpecialTimestampNTZ(value).map(Literal(_, TimestampNTZType)) + specialTs.getOrElse(toLiteral(stringToTimestampWithoutTimeZone, TimestampNTZType)) case "TIMESTAMP" => def constructTimestampLTZLiteral(value: String): Literal = { val zoneId = getZoneId(conf.sessionLocalTimeZone) @@ -2525,6 +2528,7 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with SQLConfHelper with Logg case ("double", Nil) => DoubleType case ("date", Nil) => DateType case ("timestamp", Nil) => SQLConf.get.timestampType + case ("timestamp_ntz", Nil) => TimestampNTZType case ("string", Nil) => StringType case ("character" | "char", length :: Nil) => CharType(length.getText.toInt) case ("varchar", length :: Nil) => VarcharType(length.getText.toInt) diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DataTypeParserSuite.scala 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DataTypeParserSuite.scala index a6b78e0..d34 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DataTypeParserSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DataTypeParserSuite.scala @@ -58,6 +58,7 @@ class DataTypeParserSuite extends SparkFunSuite with SQLHelper { checkDataType("deC", DecimalType.USER_DEFAULT) checkDataType("DATE", DateType) checkDataType("timestamp", TimestampType) + checkDataType("timestamp_ntz", TimestampNTZType) checkDataType("string", StringType) checkDataType("ChaR(5)", CharType(5)) checkDataType("ChaRacter(5)", CharType(5)) diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala index 37e2d9b..7b13fa9 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala @@ -465,6 +465,9 @@ class ExpressionParserSuite extends AnalysisTest { intercept("timestamP '2016-33-11 20:54:00.000'", "Cannot parse the TIMESTAMP value") // Timestamp without time zone +assertEqual("tImEstAmp_Ntz '2016-03-11
[spark] branch branch-3.2 updated: [SPARK-35735][SQL][FOLLOWUP] Fix case minute to second regex can cover by hour to minute and unit case-sensitive issue
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new dd038aa [SPARK-35735][SQL][FOLLOWUP] Fix case minute to second regex can cover by hour to minute and unit case-sensitive issue dd038aa is described below commit dd038aacd454e1a70ef2b2598fb504d8ad04b82a Author: Angerszh AuthorDate: Wed Jul 7 12:37:19 2021 +0300 [SPARK-35735][SQL][FOLLOWUP] Fix case minute to second regex can cover by hour to minute and unit case-sensitive issue ### What changes were proposed in this pull request? When cast `10:10` to interval minute to second, it can be catch by hour to minute regex, here to fix this. ### Why are the changes needed? Fix bug ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Added UT Closes #33242 from AngersZh/SPARK-35735-FOLLOWUP. Authored-by: Angerszh Signed-off-by: Max Gekk (cherry picked from commit 3953754f36656e1a0bee16b89fae0142f172a91a) Signed-off-by: Max Gekk --- .../spark/sql/catalyst/util/IntervalUtils.scala| 92 +++--- .../sql/catalyst/expressions/CastSuiteBase.scala | 4 + 2 files changed, 48 insertions(+), 48 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala index b174165..ad87f2a 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala @@ -19,6 +19,7 @@ package org.apache.spark.sql.catalyst.util import java.time.{Duration, Period} import java.time.temporal.ChronoUnit +import java.util.Locale import java.util.concurrent.TimeUnit import scala.collection.mutable @@ -173,15 +174,14 @@ object IntervalUtils { startField: Byte, endField: Byte): Int = { -def 
checkYMIntervalStringDataType(ym: YM): Unit = - checkIntervalStringDataType(input, startField, endField, ym) +def checkTargetType(targetStartField: Byte, targetEndField: Byte): Boolean = + startField == targetStartField && endField == targetEndField input.trimAll().toString match { - case yearMonthRegex(sign, year, month) => -checkYMIntervalStringDataType(YM(YM.YEAR, YM.MONTH)) + case yearMonthRegex(sign, year, month) if checkTargetType(YM.YEAR, YM.MONTH) => toYMInterval(year, month, finalSign(sign)) - case yearMonthLiteralRegex(firstSign, secondSign, year, month) => -checkYMIntervalStringDataType(YM(YM.YEAR, YM.MONTH)) + case yearMonthLiteralRegex(firstSign, secondSign, year, month) +if checkTargetType(YM.YEAR, YM.MONTH) => toYMInterval(year, month, finalSign(firstSign, secondSign)) case yearMonthIndividualRegex(firstSign, value) => safeToInterval("year-month") { @@ -195,15 +195,16 @@ object IntervalUtils { input, startField, endField, "year-month", YM(startField, endField).typeName) } } - case yearMonthIndividualLiteralRegex(firstSign, secondSign, value, suffix) => + case yearMonthIndividualLiteralRegex(firstSign, secondSign, value, unit) => safeToInterval("year-month") { val sign = finalSign(firstSign, secondSign) - if ("YEAR".equalsIgnoreCase(suffix)) { -checkYMIntervalStringDataType(YM(YM.YEAR, YM.YEAR)) -sign * Math.toIntExact(value.toLong * MONTHS_PER_YEAR) - } else { -checkYMIntervalStringDataType(YM(YM.MONTH, YM.MONTH)) -Math.toIntExact(sign * value.toLong) + unit.toUpperCase(Locale.ROOT) match { +case "YEAR" if checkTargetType(YM.YEAR, YM.YEAR) => + sign * Math.toIntExact(value.toLong * MONTHS_PER_YEAR) +case "MONTH" if checkTargetType(YM.MONTH, YM.MONTH) => + Math.toIntExact(sign * value.toLong) +case _ => throwIllegalIntervalFormatException(input, startField, endField, + "year-month", YM(startField, endField).typeName) } } case _ => throwIllegalIntervalFormatException(input, startField, endField, @@ -303,48 +304,45 @@ object IntervalUtils { } } -def 
checkDTIntervalStringDataType(dt: DT): Unit = - checkIntervalStringDataType(input, startField, endField, dt, Some(fallbackNotice)) +def checkTargetType(targetStartField: Byte, targetEndField: Byte): Boolean = + startField == targetStartField && endField == targetEndField input.trimAll().toString match { - case day
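The shape of the bug above is easy to reproduce outside Spark: an `n:n` input matches both the hour-to-minute and the minute-to-second patterns, so correctness requires dispatching on the requested target fields rather than on whichever regex is tried first. A hypothetical Python miniature of the guarded dispatch (all names invented for illustration, not Spark's API):

```python
import re

# Both HOUR TO MINUTE and MINUTE TO SECOND inputs look like "n:n",
# so the regex alone cannot decide which interval was meant.
PAIR = re.compile(r"([+-])?(\d+):(\d+)$")

def parse_interval(text, start_field, end_field):
    """Return total seconds for an 'n:n' input, guarded by the target fields."""
    m = PAIR.match(text.strip())
    if not m:
        raise ValueError(f"cannot parse {text!r}")
    sign = -1 if m.group(1) == "-" else 1
    a, b = int(m.group(2)), int(m.group(3))
    # The fix: dispatch on the requested target type, not on match order.
    if (start_field, end_field) == ("HOUR", "MINUTE"):
        return sign * (a * 3600 + b * 60)
    if (start_field, end_field) == ("MINUTE", "SECOND"):
        return sign * (a * 60 + b)
    raise ValueError(f"interval {text!r} does not match {start_field} TO {end_field}")

# "10:10" as MINUTE TO SECOND must not be interpreted as hours:minutes.
print(parse_interval("10:10", "MINUTE", "SECOND"))  # 610
print(parse_interval("10:10", "HOUR", "MINUTE"))    # 36600
```

This mirrors the `checkTargetType` guard the patch adds to each `case` arm.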
[spark] branch branch-3.2 updated: [SPARK-36017][SQL] Support TimestampNTZType in expression ApproximatePercentile
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 25ea296 [SPARK-36017][SQL] Support TimestampNTZType in expression ApproximatePercentile 25ea296 is described below commit 25ea296c3c4295d78a573d98edbdd8f1de0e0447 Author: gengjiaan AuthorDate: Wed Jul 7 12:41:11 2021 +0300 [SPARK-36017][SQL] Support TimestampNTZType in expression ApproximatePercentile ### What changes were proposed in this pull request? The current `ApproximatePercentile` supports `TimestampType`, but does not yet support timestamp without time zone. This PR adds that support. ### Why are the changes needed? `ApproximatePercentile` needs to support `TimestampNTZType`. ### Does this PR introduce _any_ user-facing change? Yes. `ApproximatePercentile` accepts `TimestampNTZType`. ### How was this patch tested? New tests. Closes #33241 from beliefer/SPARK-36017.
Authored-by: gengjiaan Signed-off-by: Max Gekk (cherry picked from commit cc4463e818749faaf648ec71699d1e2fd3828c3f) Signed-off-by: Max Gekk --- .../expressions/aggregate/ApproximatePercentile.scala | 10 +- .../apache/spark/sql/ApproximatePercentileQuerySuite.scala | 14 +- 2 files changed, 14 insertions(+), 10 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala index 78e64bf..8cce79c 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala @@ -92,9 +92,9 @@ case class ApproximatePercentile( private lazy val accuracy: Long = accuracyExpression.eval().asInstanceOf[Number].longValue override def inputTypes: Seq[AbstractDataType] = { -// Support NumericType, DateType and TimestampType since their internal types are all numeric, -// and can be easily cast to double for processing. -Seq(TypeCollection(NumericType, DateType, TimestampType), +// Support NumericType, DateType, TimestampType and TimestampNTZType since their internal types +// are all numeric, and can be easily cast to double for processing. 
+Seq(TypeCollection(NumericType, DateType, TimestampType, TimestampNTZType), TypeCollection(DoubleType, ArrayType(DoubleType, containsNull = false)), IntegralType) } @@ -139,7 +139,7 @@ case class ApproximatePercentile( // Convert the value to a double value val doubleValue = child.dataType match { case DateType => value.asInstanceOf[Int].toDouble -case TimestampType => value.asInstanceOf[Long].toDouble +case TimestampType | TimestampNTZType => value.asInstanceOf[Long].toDouble case n: NumericType => n.numeric.toDouble(value.asInstanceOf[n.InternalType]) case other: DataType => throw QueryExecutionErrors.dataTypeUnexpectedError(other) @@ -158,7 +158,7 @@ case class ApproximatePercentile( val doubleResult = buffer.getPercentiles(percentages) val result = child.dataType match { case DateType => doubleResult.map(_.toInt) - case TimestampType => doubleResult.map(_.toLong) + case TimestampType | TimestampNTZType => doubleResult.map(_.toLong) case ByteType => doubleResult.map(_.toByte) case ShortType => doubleResult.map(_.toShort) case IntegerType => doubleResult.map(_.toInt) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/ApproximatePercentileQuerySuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/ApproximatePercentileQuerySuite.scala index 4991e39..5ff15c9 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/ApproximatePercentileQuerySuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/ApproximatePercentileQuerySuite.scala @@ -18,6 +18,7 @@ package org.apache.spark.sql import java.sql.{Date, Timestamp} +import java.time.LocalDateTime import org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile import org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile.DEFAULT_PERCENTILE_ACCURACY @@ -89,23 +90,26 @@ class ApproximatePercentileQuerySuite extends QueryTest with SharedSparkSession test("percentile_approx, different column types") { withTempView(table) { val intSeq = 1 to 1000 - val data: 
Seq[(java.math.BigDecimal, Date, Timestamp)] = intSeq.map { i => -(new java.math.BigDecimal(i), DateTimeUtils.toJavaDate(i), DateTimeUtils.toJavaTimestamp(i)) + val data: Seq[(java.m
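Conceptually, the change works because `TimestampNTZType`, like `TimestampType`, is stored internally as a long count of microseconds, so values can be fed to the percentile digest as doubles and the double result cast back at the end. The sketch below computes an exact percentile over epoch microseconds; Spark actually uses an approximate QuantileSummaries sketch, and all names here are illustrative:

```python
from datetime import datetime, timezone

MICROS = 1_000_000

def percentile_over_timestamps(timestamps, p):
    """Illustrative exact percentile; Spark's version is approximate."""
    # Both timestamp types are internally a long count of microseconds,
    # so they can be fed to the digest as doubles ...
    micros = sorted(int(ts.timestamp() * MICROS) for ts in timestamps)
    k = int(round(p * (len(micros) - 1)))
    # ... and the double result is cast back to a long / timestamp at the end.
    return datetime.fromtimestamp(micros[k] / MICROS, tz=timezone.utc)

data = [datetime(2021, 7, 7, h, tzinfo=timezone.utc) for h in range(24)]
print(percentile_over_timestamps(data, 0.5))  # 2021-07-07 12:00:00+00:00
```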
[spark] branch master updated: [SPARK-35735][SQL][FOLLOWUP] Fix case minute to second regex can cover by hour to minute and unit case-sensitive issue
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 3953754 [SPARK-35735][SQL][FOLLOWUP] Fix case minute to second regex can cover by hour to minute and unit case-sensitive issue 3953754 is described below commit 3953754f36656e1a0bee16b89fae0142f172a91a Author: Angerszh AuthorDate: Wed Jul 7 12:37:19 2021 +0300 [SPARK-35735][SQL][FOLLOWUP] Fix case minute to second regex can cover by hour to minute and unit case-sensitive issue ### What changes were proposed in this pull request? When casting `10:10` to an interval minute to second, it can be caught by the hour-to-minute regex; this PR fixes that. ### Why are the changes needed? Fixes a bug. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Added UT Closes #33242 from AngersZh/SPARK-35735-FOLLOWUP. Authored-by: Angerszh Signed-off-by: Max Gekk --- .../spark/sql/catalyst/util/IntervalUtils.scala| 92 +++--- .../sql/catalyst/expressions/CastSuiteBase.scala | 4 + 2 files changed, 48 insertions(+), 48 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala index b174165..ad87f2a 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala @@ -19,6 +19,7 @@ package org.apache.spark.sql.catalyst.util import java.time.{Duration, Period} import java.time.temporal.ChronoUnit +import java.util.Locale import java.util.concurrent.TimeUnit import scala.collection.mutable @@ -173,15 +174,14 @@ object IntervalUtils { startField: Byte, endField: Byte): Int = { -def checkYMIntervalStringDataType(ym: YM): Unit = - checkIntervalStringDataType(input, startField, endField,
ym) +def checkTargetType(targetStartField: Byte, targetEndField: Byte): Boolean = + startField == targetStartField && endField == targetEndField input.trimAll().toString match { - case yearMonthRegex(sign, year, month) => -checkYMIntervalStringDataType(YM(YM.YEAR, YM.MONTH)) + case yearMonthRegex(sign, year, month) if checkTargetType(YM.YEAR, YM.MONTH) => toYMInterval(year, month, finalSign(sign)) - case yearMonthLiteralRegex(firstSign, secondSign, year, month) => -checkYMIntervalStringDataType(YM(YM.YEAR, YM.MONTH)) + case yearMonthLiteralRegex(firstSign, secondSign, year, month) +if checkTargetType(YM.YEAR, YM.MONTH) => toYMInterval(year, month, finalSign(firstSign, secondSign)) case yearMonthIndividualRegex(firstSign, value) => safeToInterval("year-month") { @@ -195,15 +195,16 @@ object IntervalUtils { input, startField, endField, "year-month", YM(startField, endField).typeName) } } - case yearMonthIndividualLiteralRegex(firstSign, secondSign, value, suffix) => + case yearMonthIndividualLiteralRegex(firstSign, secondSign, value, unit) => safeToInterval("year-month") { val sign = finalSign(firstSign, secondSign) - if ("YEAR".equalsIgnoreCase(suffix)) { -checkYMIntervalStringDataType(YM(YM.YEAR, YM.YEAR)) -sign * Math.toIntExact(value.toLong * MONTHS_PER_YEAR) - } else { -checkYMIntervalStringDataType(YM(YM.MONTH, YM.MONTH)) -Math.toIntExact(sign * value.toLong) + unit.toUpperCase(Locale.ROOT) match { +case "YEAR" if checkTargetType(YM.YEAR, YM.YEAR) => + sign * Math.toIntExact(value.toLong * MONTHS_PER_YEAR) +case "MONTH" if checkTargetType(YM.MONTH, YM.MONTH) => + Math.toIntExact(sign * value.toLong) +case _ => throwIllegalIntervalFormatException(input, startField, endField, + "year-month", YM(startField, endField).typeName) } } case _ => throwIllegalIntervalFormatException(input, startField, endField, @@ -303,48 +304,45 @@ object IntervalUtils { } } -def checkDTIntervalStringDataType(dt: DT): Unit = - checkIntervalStringDataType(input, startField, endField, dt, 
Some(fallbackNotice)) +def checkTargetType(targetStartField: Byte, targetEndField: Byte): Boolean = + startField == targetStartField && endField == targetEndField input.trimAll().toString match { - case dayHourRegex(sign, day, hour) => -checkDTIntervalStringDataType(DT(DT.DAY, DT.HOUR)) + case dayH
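The second half of the fix normalizes the unit string with `toUpperCase(Locale.ROOT)` before matching, again guarded by the target type. A small Python analogue of the year/month unit branch (hypothetical names, for illustration only):

```python
MONTHS_PER_YEAR = 12

def parse_unit_interval(value, unit, start_field, end_field):
    """Case-insensitive unit match, guarded by the target (start, end) fields."""
    u = unit.upper()  # analogue of unit.toUpperCase(Locale.ROOT)
    if u == "YEAR" and (start_field, end_field) == ("YEAR", "YEAR"):
        return int(value) * MONTHS_PER_YEAR   # total months
    if u == "MONTH" and (start_field, end_field) == ("MONTH", "MONTH"):
        return int(value)
    raise ValueError(
        f"interval '{value} {unit}' does not match {start_field} TO {end_field}")

print(parse_unit_interval("3", "year", "YEAR", "YEAR"))  # 36
```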
[spark] branch master updated: [SPARK-36017][SQL] Support TimestampNTZType in expression ApproximatePercentile
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new cc4463e [SPARK-36017][SQL] Support TimestampNTZType in expression ApproximatePercentile cc4463e is described below commit cc4463e818749faaf648ec71699d1e2fd3828c3f Author: gengjiaan AuthorDate: Wed Jul 7 12:41:11 2021 +0300 [SPARK-36017][SQL] Support TimestampNTZType in expression ApproximatePercentile ### What changes were proposed in this pull request? The current `ApproximatePercentile` supports `TimestampType`, but does not yet support timestamp without time zone. This PR adds that support. ### Why are the changes needed? `ApproximatePercentile` needs to support `TimestampNTZType`. ### Does this PR introduce _any_ user-facing change? Yes. `ApproximatePercentile` accepts `TimestampNTZType`. ### How was this patch tested? New tests. Closes #33241 from beliefer/SPARK-36017.
Authored-by: gengjiaan Signed-off-by: Max Gekk --- .../expressions/aggregate/ApproximatePercentile.scala | 10 +- .../apache/spark/sql/ApproximatePercentileQuerySuite.scala | 14 +- 2 files changed, 14 insertions(+), 10 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala index 78e64bf..8cce79c 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala @@ -92,9 +92,9 @@ case class ApproximatePercentile( private lazy val accuracy: Long = accuracyExpression.eval().asInstanceOf[Number].longValue override def inputTypes: Seq[AbstractDataType] = { -// Support NumericType, DateType and TimestampType since their internal types are all numeric, -// and can be easily cast to double for processing. -Seq(TypeCollection(NumericType, DateType, TimestampType), +// Support NumericType, DateType, TimestampType and TimestampNTZType since their internal types +// are all numeric, and can be easily cast to double for processing. 
+Seq(TypeCollection(NumericType, DateType, TimestampType, TimestampNTZType), TypeCollection(DoubleType, ArrayType(DoubleType, containsNull = false)), IntegralType) } @@ -139,7 +139,7 @@ case class ApproximatePercentile( // Convert the value to a double value val doubleValue = child.dataType match { case DateType => value.asInstanceOf[Int].toDouble -case TimestampType => value.asInstanceOf[Long].toDouble +case TimestampType | TimestampNTZType => value.asInstanceOf[Long].toDouble case n: NumericType => n.numeric.toDouble(value.asInstanceOf[n.InternalType]) case other: DataType => throw QueryExecutionErrors.dataTypeUnexpectedError(other) @@ -158,7 +158,7 @@ case class ApproximatePercentile( val doubleResult = buffer.getPercentiles(percentages) val result = child.dataType match { case DateType => doubleResult.map(_.toInt) - case TimestampType => doubleResult.map(_.toLong) + case TimestampType | TimestampNTZType => doubleResult.map(_.toLong) case ByteType => doubleResult.map(_.toByte) case ShortType => doubleResult.map(_.toShort) case IntegerType => doubleResult.map(_.toInt) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/ApproximatePercentileQuerySuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/ApproximatePercentileQuerySuite.scala index 4991e39..5ff15c9 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/ApproximatePercentileQuerySuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/ApproximatePercentileQuerySuite.scala @@ -18,6 +18,7 @@ package org.apache.spark.sql import java.sql.{Date, Timestamp} +import java.time.LocalDateTime import org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile import org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile.DEFAULT_PERCENTILE_ACCURACY @@ -89,23 +90,26 @@ class ApproximatePercentileQuerySuite extends QueryTest with SharedSparkSession test("percentile_approx, different column types") { withTempView(table) { val intSeq = 1 to 1000 - val data: 
Seq[(java.math.BigDecimal, Date, Timestamp)] = intSeq.map { i => -(new java.math.BigDecimal(i), DateTimeUtils.toJavaDate(i), DateTimeUtils.toJavaTimestamp(i)) + val data: Seq[(java.math.BigDecimal, Date, Timestamp, LocalDateTime)] = intSeq.map { i => +(new ja
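The tail of the diff above casts the digest's double output back to the column's type (`toLong` for both timestamp types). A toy Python dispatcher showing that cast-back step (illustrative only, not Spark code):

```python
def cast_result(double_result, data_type):
    # Sketch of the eval tail: the digest yields doubles, which are
    # cast back per column type; both timestamp types go back to a long.
    casts = {
        "DateType": int,
        "TimestampType": int,      # back to epoch microseconds (a long in Spark)
        "TimestampNTZType": int,   # same internal long representation
        "DoubleType": float,
    }
    return [casts[data_type](v) for v in double_result]

print(cast_result([1.0, 2.5], "TimestampNTZType"))  # [1, 2]
```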
[spark] branch branch-3.2 updated: [SPARK-36054][SQL] Support group by TimestampNTZ type column
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new ae62c9d [SPARK-36054][SQL] Support group by TimestampNTZ type column ae62c9d is described below commit ae62c9d7726e2b05897f7e807bb9cdcc2748e3fa Author: Gengliang Wang AuthorDate: Thu Jul 8 22:33:25 2021 +0300 [SPARK-36054][SQL] Support group by TimestampNTZ type column ### What changes were proposed in this pull request? Support group by TimestampNTZ type column ### Why are the changes needed? It's a basic SQL operation. ### Does this PR introduce _any_ user-facing change? No, the new timestamp type is not released yet. ### How was this patch tested? Unit test Closes #33268 from gengliangwang/agg. Authored-by: Gengliang Wang Signed-off-by: Max Gekk (cherry picked from commit 382b66e26725e4667607303ebb9803b05e8076bc) Signed-off-by: Max Gekk --- .../org/apache/spark/sql/catalyst/expressions/hash.scala| 2 +- .../spark/sql/execution/vectorized/OffHeapColumnVector.java | 3 ++- .../spark/sql/execution/vectorized/OnHeapColumnVector.java | 3 ++- .../spark/sql/execution/aggregate/HashMapGenerator.scala| 2 +- .../org/apache/spark/sql/DataFrameAggregateSuite.scala | 13 - 5 files changed, 18 insertions(+), 5 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala index d730586..3785262 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala @@ -490,7 +490,7 @@ abstract class HashExpression[E] extends Expression { case BooleanType => genHashBoolean(input, result) case ByteType | ShortType | IntegerType | DateType => genHashInt(input, result) case LongType => genHashLong(input,
result) -case TimestampType => genHashTimestamp(input, result) +case TimestampType | TimestampNTZType => genHashTimestamp(input, result) case FloatType => genHashFloat(input, result) case DoubleType => genHashDouble(input, result) case d: DecimalType => genHashDecimal(ctx, d, input, result) diff --git a/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OffHeapColumnVector.java b/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OffHeapColumnVector.java index 7da5a28..b4b6903 100644 --- a/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OffHeapColumnVector.java +++ b/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OffHeapColumnVector.java @@ -553,7 +553,8 @@ public final class OffHeapColumnVector extends WritableColumnVector { type instanceof DateType || DecimalType.is32BitDecimalType(type)) { this.data = Platform.reallocateMemory(data, oldCapacity * 4L, newCapacity * 4L); } else if (type instanceof LongType || type instanceof DoubleType || -DecimalType.is64BitDecimalType(type) || type instanceof TimestampType) { +DecimalType.is64BitDecimalType(type) || type instanceof TimestampType || +type instanceof TimestampNTZType) { this.data = Platform.reallocateMemory(data, oldCapacity * 8L, newCapacity * 8L); } else if (childColumns != null) { // Nothing to store. 
diff --git a/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java b/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java index 5942c5f..3fb96d8 100644 --- a/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java +++ b/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java @@ -547,7 +547,8 @@ public final class OnHeapColumnVector extends WritableColumnVector { if (intData != null) System.arraycopy(intData, 0, newData, 0, capacity); intData = newData; } -} else if (type instanceof LongType || type instanceof TimestampType || +} else if (type instanceof LongType || +type instanceof TimestampType ||type instanceof TimestampNTZType || DecimalType.is64BitDecimalType(type) || type instanceof DayTimeIntervalType) { if (longData == null || longData.length < newCapacity) { long[] newData = new long[newCapacity]; diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashMapGenerator.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashMapGenerator.scala index b3f5e34..713e7db 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashMapGenerator.scala +++ b/
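The hash.scala hunk above is the heart of the change: `TimestampNTZType` shares `TimestampType`'s 8-byte long representation, so grouping can reuse the existing long hash. A toy Python dispatcher (the hash itself is a made-up stand-in, not Spark's Murmur3/xxHash64) shows the shape of the dispatch:

```python
def hash_long(v, seed=42):
    # Toy stand-in for a 64-bit hash; Spark actually uses Murmur3 or xxHash64.
    return (v * 0x9E3779B97F4A7C15 + seed) & 0xFFFFFFFFFFFFFFFF

def hash_value(value, data_type):
    # TimestampType and TimestampNTZType share the long (microseconds)
    # representation, so both route to the same long hash -- the gist of
    # the genHashTimestamp dispatch change above.
    if data_type in ("LongType", "TimestampType", "TimestampNTZType"):
        return hash_long(value)
    if data_type in ("IntegerType", "DateType"):
        return hash_long(value)  # widened to long in this toy version
    raise TypeError(f"unsupported type: {data_type}")

print(hash_value(42, "TimestampNTZType") == hash_value(42, "TimestampType"))  # True
```

Equal long values therefore land in the same group regardless of which timestamp type produced them.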
[spark] branch master updated (819c482 -> 382b66e)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 819c482 [SPARK-35340][PYTHON] Standardize TypeError messages for unsupported basic operations add 382b66e [SPARK-36054][SQL] Support group by TimestampNTZ type column No new revisions were added by this update. Summary of changes: .../org/apache/spark/sql/catalyst/expressions/hash.scala| 2 +- .../spark/sql/execution/vectorized/OffHeapColumnVector.java | 3 ++- .../spark/sql/execution/vectorized/OnHeapColumnVector.java | 3 ++- .../spark/sql/execution/aggregate/HashMapGenerator.scala| 2 +- .../org/apache/spark/sql/DataFrameAggregateSuite.scala | 13 - 5 files changed, 18 insertions(+), 5 deletions(-)
[spark] branch branch-3.2 updated: [SPARK-36049][SQL] Remove IntervalUnit
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 2f54d9e [SPARK-36049][SQL] Remove IntervalUnit 2f54d9e is described below commit 2f54d9eed6dce5e9fc853ef4dceb1abb2b338a34 Author: Angerszh AuthorDate: Thu Jul 8 23:02:21 2021 +0300 [SPARK-36049][SQL] Remove IntervalUnit ### What changes were proposed in this pull request? Remove IntervalUnit ### Why are the changes needed? Clean code ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Not need Closes #33265 from AngersZh/SPARK-36049. Lead-authored-by: Angerszh Co-authored-by: Maxim Gekk Signed-off-by: Max Gekk (cherry picked from commit fef7e1703c342165000f89b01112a8a28a936436) Signed-off-by: Max Gekk --- .../spark/sql/catalyst/util/IntervalUtils.scala| 120 - .../catalyst/parser/ExpressionParserSuite.scala| 33 ++ 2 files changed, 58 insertions(+), 95 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala index 7579a28..e026266 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala @@ -41,22 +41,6 @@ object IntervalStringStyles extends Enumeration { object IntervalUtils { - object IntervalUnit extends Enumeration { -type IntervalUnit = Value - -val NANOSECOND = Value(0, "nanosecond") -val MICROSECOND = Value(1, "microsecond") -val MILLISECOND = Value(2, "millisecond") -val SECOND = Value(3, "second") -val MINUTE = Value(4, "minute") -val HOUR = Value(5, "hour") -val DAY = Value(6, "day") -val WEEK = Value(7, "week") -val MONTH = Value(8, "month") -val YEAR = Value(9, "year") - } - import IntervalUnit._ - private val 
MAX_DAY = Long.MaxValue / MICROS_PER_DAY private val MAX_HOUR = Long.MaxValue / MICROS_PER_HOUR private val MAX_MINUTE = Long.MaxValue / MICROS_PER_MINUTE @@ -97,7 +81,7 @@ object IntervalUtils { def getSeconds(interval: CalendarInterval): Decimal = getSeconds(interval.microseconds) private def toLongWithRange( - fieldName: IntervalUnit, + fieldName: UTF8String, s: String, minValue: Long, maxValue: Long): Long = { @@ -250,10 +234,11 @@ object IntervalUtils { } } - private def toYMInterval(yearStr: String, monthStr: String, sign: Int): Int = { + private def toYMInterval(year: String, month: String, sign: Int): Int = { safeToInterval("year-month") { - val years = toLongWithRange(YEAR, yearStr, 0, Integer.MAX_VALUE / MONTHS_PER_YEAR) - val totalMonths = sign * (years * MONTHS_PER_YEAR + toLongWithRange(MONTH, monthStr, 0, 11)) + val years = toLongWithRange(yearStr, year, 0, Integer.MAX_VALUE / MONTHS_PER_YEAR) + val totalMonths = +sign * (years * MONTHS_PER_YEAR + toLongWithRange(monthStr, month, 0, 11)) Math.toIntExact(totalMonths) } } @@ -402,45 +387,33 @@ object IntervalUtils { } } - def toDTInterval( - dayStr: String, - hourStr: String, - minuteStr: String, - secondStr: String, - sign: Int): Long = { + def toDTInterval(day: String, hour: String, minute: String, second: String, sign: Int): Long = { var micros = 0L -val days = toLongWithRange(DAY, dayStr, 0, MAX_DAY).toInt +val days = toLongWithRange(dayStr, day, 0, MAX_DAY).toInt micros = Math.addExact(micros, sign * days * MICROS_PER_DAY) -val hours = toLongWithRange(HOUR, hourStr, 0, 23) +val hours = toLongWithRange(hourStr, hour, 0, 23) micros = Math.addExact(micros, sign * hours * MICROS_PER_HOUR) -val minutes = toLongWithRange(MINUTE, minuteStr, 0, 59) +val minutes = toLongWithRange(minuteStr, minute, 0, 59) micros = Math.addExact(micros, sign * minutes * MICROS_PER_MINUTE) -micros = Math.addExact(micros, sign * parseSecondNano(secondStr)) +micros = Math.addExact(micros, sign * parseSecondNano(second)) micros 
} - def toDTInterval( - hourStr: String, - minuteStr: String, - secondStr: String, - sign: Int): Long = { + def toDTInterval(hour: String, minute: String, second: String, sign: Int): Long = { var micros = 0L -val hours = toLongWithRange(HOUR, hourStr, 0, MAX_HOUR) +val hours = toLongWithRange(hourStr, hour, 0, MAX_HOUR) micros = Math.addExact(micros, sign * hours * MICROS_PER_HOUR) -val minutes = toLongWithRange(MINU
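As a rough illustration of what the refactored `toDTInterval` does after the enum removal: each field is parsed with an explicit range check (days up to `Long.MaxValue / MICROS_PER_DAY`, hours 0–23, minutes and seconds 0–59) and accumulated as microseconds. This Python sketch simplifies Spark's version, which also handles fractional seconds via `parseSecondNano`:

```python
MICROS_PER_SECOND = 1_000_000
MICROS_PER_MINUTE = 60 * MICROS_PER_SECOND
MICROS_PER_HOUR = 60 * MICROS_PER_MINUTE
MICROS_PER_DAY = 24 * MICROS_PER_HOUR

def to_long_with_range(field_name, s, min_value, max_value):
    """Analogue of toLongWithRange: parse and enforce the field's bounds."""
    v = int(s)
    if not (min_value <= v <= max_value):
        raise ValueError(f"{field_name} {v} outside range [{min_value}, {max_value}]")
    return v

def to_dt_interval(day, hour, minute, second, sign=1):
    """Simplified day-hour-minute-second overload; whole seconds only."""
    micros = sign * to_long_with_range("day", day, 0, 106_751_991) * MICROS_PER_DAY
    micros += sign * to_long_with_range("hour", hour, 0, 23) * MICROS_PER_HOUR
    micros += sign * to_long_with_range("minute", minute, 0, 59) * MICROS_PER_MINUTE
    micros += sign * to_long_with_range("second", second, 0, 59) * MICROS_PER_SECOND
    return micros

print(to_dt_interval("1", "2", "3", "4"))  # 93784000000
```

(The day bound 106751991 is `Long.MaxValue / MICROS_PER_DAY`, matching `MAX_DAY` in the diff.)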
[spark] branch master updated (382b66e -> fef7e170)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 382b66e [SPARK-36054][SQL] Support group by TimestampNTZ type column add fef7e170 [SPARK-36049][SQL] Remove IntervalUnit No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/util/IntervalUtils.scala| 120 - .../catalyst/parser/ExpressionParserSuite.scala| 33 ++ 2 files changed, 58 insertions(+), 95 deletions(-)
[spark] branch master updated (87282f0 -> c8ff613)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 87282f0 [SPARK-35972][SQL] When replace ExtractValue in NestedColumnAliasing we should use semanticEquals add c8ff613 [SPARK-35983][SQL] Allow from_json/to_json for map types where value types are day-time intervals No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/json/JacksonGenerator.scala | 9 + .../spark/sql/catalyst/json/JacksonParser.scala| 7 .../org/apache/spark/sql/JsonFunctionsSuite.scala | 44 +- 3 files changed, 59 insertions(+), 1 deletion(-)
[spark] branch branch-3.2 updated: [SPARK-35983][SQL] Allow from_json/to_json for map types where value types are day-time intervals
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 634b2e2 [SPARK-35983][SQL] Allow from_json/to_json for map types where value types are day-time intervals 634b2e2 is described below commit 634b2e265c1a95af2f2df35f8ecb9d05d4fe752f Author: Kousuke Saruta AuthorDate: Tue Jul 6 11:06:56 2021 +0300 [SPARK-35983][SQL] Allow from_json/to_json for map types where value types are day-time intervals ### What changes were proposed in this pull request? This PR fixes two issues. One is that `to_json` doesn't support `map` types where value types are `day-time` interval types like: ``` spark-sql> select to_json(map('a', interval '1 2:3:4' day to second)); 21/07/06 14:53:58 ERROR SparkSQLDriver: Failed in [select to_json(map('a', interval '1 2:3:4' day to second))] java.lang.RuntimeException: Failed to convert value 9378400 (class of class java.lang.Long) with the type of DayTimeIntervalType(0,3) to JSON. ``` The other issue is that, even once the `to_json` issue is resolved, `from_json` doesn't support converting a `day-time` interval string back from JSON. So the result of the following query will be `null`. ``` spark-sql> select from_json(to_json(map('a', interval '1 2:3:4' day to second)), 'a interval day to second'); {"a":null} ``` ### Why are the changes needed? There should be no reason why day-time intervals cannot be used as map value types. `CalendarIntervalTypes` can do it. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? New tests. Closes #33225 from sarutak/json-dtinterval.
Authored-by: Kousuke Saruta Signed-off-by: Max Gekk (cherry picked from commit c8ff613c3cd0d04ebfaf57feeb67e21b3c8410a2) Signed-off-by: Max Gekk --- .../spark/sql/catalyst/json/JacksonGenerator.scala | 9 + .../spark/sql/catalyst/json/JacksonParser.scala| 7 .../org/apache/spark/sql/JsonFunctionsSuite.scala | 44 +- 3 files changed, 59 insertions(+), 1 deletion(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonGenerator.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonGenerator.scala index 9777d56..9bd0546 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonGenerator.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonGenerator.scala @@ -156,6 +156,15 @@ private[sql] class JacksonGenerator( end) gen.writeString(ymString) +case DayTimeIntervalType(start, end) => + (row: SpecializedGetters, ordinal: Int) => +val dtString = IntervalUtils.toDayTimeIntervalString( + row.getLong(ordinal), + IntervalStringStyles.ANSI_STYLE, + start, + end) +gen.writeString(dtString) + case BinaryType => (row: SpecializedGetters, ordinal: Int) => gen.writeBinary(row.getBinary(ordinal)) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala index 2aa735d..8a1191c 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala @@ -302,6 +302,13 @@ class JacksonParser( Integer.valueOf(expr.eval(EmptyRow).asInstanceOf[Int]) } +case dt: DayTimeIntervalType => (parser: JsonParser) => + parseJsonToken[java.lang.Long](parser, dataType) { +case VALUE_STRING => + val expr = Cast(Literal(parser.getText), dt) + java.lang.Long.valueOf(expr.eval(EmptyRow).asInstanceOf[Long]) + } + case st: StructType => val fieldConverters = 
st.map(_.dataType).map(makeConverter).toArray (parser: JsonParser) => parseJsonToken[InternalRow](parser, dataType) { diff --git a/sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala index c2bea8c..82cca2b 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala @@ -18,7 +18,7 @@ package org.apache.spark.sql import java.text.SimpleDateFormat -import java.time.Period +import java.time.{Duration, Period} import java.util.Locale import collection.JavaConverters._ @@ -28,6 +28,7 @@ import org.apache.spark.sql.functions._ import org.apache.spark.sql.internal.SQLConf import org.apache.spark.sql.test.SharedSparkSession
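A minimal model of the round trip the new code enables: `to_json` formats the interval's long microsecond value as an ANSI-style `d hh:mm:ss` string, and `from_json` casts that string back to microseconds. The helpers below are illustrative Python, simplified to whole seconds (Spark's formatter also emits fractional seconds when present):

```python
import json

MICROS_PER_SECOND = 1_000_000

def to_day_time_string(micros):
    """Format microseconds as a 'd hh:mm:ss' day-time interval string."""
    sign = "-" if micros < 0 else ""
    total_seconds, _ = divmod(abs(micros), MICROS_PER_SECOND)
    days, rem = divmod(total_seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{sign}{days} {hours:02d}:{minutes:02d}:{seconds:02d}"

def from_day_time_string(s):
    """Parse the string form back to microseconds (whole seconds only)."""
    sign = -1 if s.startswith("-") else 1
    days, hms = s.lstrip("+-").split(" ")
    h, m, sec = (int(x) for x in hms.split(":"))
    return sign * (int(days) * 86400 + h * 3600 + m * 60 + sec) * MICROS_PER_SECOND

# Round-trip through JSON, as to_json/from_json now do for map values.
payload = json.dumps({"a": to_day_time_string(93_784_000_000)})
print(payload)  # {"a": "1 02:03:04"}
assert from_day_time_string(json.loads(payload)["a"]) == 93_784_000_000
```

Note that 93784000000 microseconds is exactly `interval '1 2:3:4' day to second`, the value used in the PR description's example.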
[spark] branch branch-3.2 updated: [SPARK-36023][SPARK-35735][SPARK-35768][SQL] Refactor code about parse string to DT/YM
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.2 by this push:
     new b53d285  [SPARK-36023][SPARK-35735][SPARK-35768][SQL] Refactor code about parse string to DT/YM
b53d285 is described below

commit b53d285f72a918abeafaf7517281d08cf57beb64
Author: Angerszh
AuthorDate: Tue Jul 6 13:51:06 2021 +0300

    [SPARK-36023][SPARK-35735][SPARK-35768][SQL] Refactor code about parse string to DT/YM

    ### What changes were proposed in this pull request?
    Refactor code about parse string to DT/YM intervals.

    ### Why are the changes needed?
    Extracting the common code about parse string to DT/YM should improve code maintenance.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Existed UT.

    Closes #33217 from AngersZh/SPARK-35735-35768.

    Authored-by: Angerszh
    Signed-off-by: Max Gekk
    (cherry picked from commit 26d1bb16bc565dbcb1a3f536dc78cd87be6c2468)
    Signed-off-by: Max Gekk
---
 .../spark/sql/catalyst/util/IntervalUtils.scala  | 201 ++---
 .../sql/catalyst/expressions/CastSuiteBase.scala |  28 ++-
 2 files changed, 123 insertions(+), 106 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
index 30a2fa5..b174165 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
@@ -29,7 +29,7 @@ import org.apache.spark.sql.catalyst.util.DateTimeUtils.millisToMicros
 import org.apache.spark.sql.catalyst.util.IntervalStringStyles.{ANSI_STYLE, HIVE_STYLE, IntervalStyle}
 import org.apache.spark.sql.errors.QueryExecutionErrors
 import org.apache.spark.sql.internal.SQLConf
-import org.apache.spark.sql.types.{DayTimeIntervalType => DT, Decimal, YearMonthIntervalType => YM}
+import org.apache.spark.sql.types.{DataType, DayTimeIntervalType => DT, Decimal, YearMonthIntervalType => YM}
 import org.apache.spark.unsafe.types.{CalendarInterval, UTF8String}

 // The style of textual representation of intervals
@@ -110,7 +110,7 @@ object IntervalUtils {
   private val yearMonthIndividualLiteralRegex =
     (s"(?i)^INTERVAL\\s+([+|-])?'$yearMonthIndividualPatternString'\\s+(YEAR|MONTH)$$").r

-  private def getSign(firstSign: String, secondSign: String): Int = {
+  private def finalSign(firstSign: String, secondSign: String = null): Int = {
     (firstSign, secondSign) match {
       case ("-", "-") => 1
       case ("-", _) => -1
@@ -119,6 +119,39 @@ object IntervalUtils {
     }
   }

+  private def throwIllegalIntervalFormatException(
+      input: UTF8String,
+      startFiled: Byte,
+      endField: Byte,
+      intervalStr: String,
+      typeName: String,
+      fallBackNotice: Option[String] = None) = {
+    throw new IllegalArgumentException(
+      s"Interval string does not match $intervalStr format of " +
+        s"${supportedFormat((startFiled, endField)).map(format => s"`$format`").mkString(", ")} " +
+        s"when cast to $typeName: ${input.toString}" +
+        s"${fallBackNotice.map(s => s", $s").getOrElse("")}")
+  }
+
+  private def checkIntervalStringDataType(
+      input: UTF8String,
+      targetStartField: Byte,
+      targetEndField: Byte,
+      inputIntervalType: DataType,
+      fallBackNotice: Option[String] = None): Unit = {
+    val (intervalStr, typeName, inputStartField, inputEndField) = inputIntervalType match {
+      case DT(startField, endField) =>
+        ("day-time", DT(targetStartField, targetEndField).typeName, startField, endField)
+      case YM(startField, endField) =>
+        ("year-month", YM(targetStartField, targetEndField).typeName, startField, endField)
+    }
+    if (targetStartField != inputStartField || targetEndField != inputEndField) {
+      throwIllegalIntervalFormatException(
+        input, targetStartField, targetEndField, intervalStr, typeName, fallBackNotice)
+    }
+  }
+
+  val supportedFormat = Map(
+    (YM.YEAR, YM.MONTH) -> Seq("[+|-]y-m", "INTERVAL [+|-]'[+|-]y-m' YEAR TO MONTH"),
+    (YM.YEAR, YM.YEAR) -> Seq("[+|-]y", "INTERVAL [+|-]'[+|-]y' YEAR"),
@@ -140,56 +173,41 @@ object IntervalUtils {
       startField: Byte,
       endField: Byte): Int = {
-    def checkStringIntervalType(targetStartField: Byte, targetEndField: Byte): Unit = {
-      if (startField != targetStartField || endField != targetEndField) {
-
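The renamed `finalSign` helper above collapses the two possible sign markers of an interval literal (the optional sign outside the quotes and the one inside, as in `INTERVAL -'-1-2' YEAR TO MONTH`) into a single multiplier. The diff is truncated after the first two cases; the remaining two below are filled in as an assumption about the full truth table:

```scala
// Combine the outer and inner signs of an interval literal:
// two minuses cancel, a single minus negates, otherwise positive.
def finalSign(firstSign: String, secondSign: String = null): Int =
  (firstSign, secondSign) match {
    case ("-", "-") => 1   // both negative: positive overall
    case ("-", _)   => -1  // only the outer sign is negative
    case (_, "-")   => -1  // only the inner sign is negative (assumed case)
    case _          => 1   // no sign, or explicit '+' (assumed case)
  }

// finalSign("-", "-") == 1; finalSign("-") == -1; finalSign(null) == 1
```

The default `secondSign = null` lets the same helper serve the individual-field literals (e.g. `INTERVAL '1' YEAR`), which carry only one sign position.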
[spark] branch master updated: [SPARK-36023][SPARK-35735][SPARK-35768][SQL] Refactor code about parse string to DT/YM
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 26d1bb1  [SPARK-36023][SPARK-35735][SPARK-35768][SQL] Refactor code about parse string to DT/YM
26d1bb1 is described below

commit 26d1bb16bc565dbcb1a3f536dc78cd87be6c2468
Author: Angerszh
AuthorDate: Tue Jul 6 13:51:06 2021 +0300

    [SPARK-36023][SPARK-35735][SPARK-35768][SQL] Refactor code about parse string to DT/YM

    ### What changes were proposed in this pull request?
    Refactor code about parse string to DT/YM intervals.

    ### Why are the changes needed?
    Extracting the common code about parse string to DT/YM should improve code maintenance.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Existed UT.

    Closes #33217 from AngersZh/SPARK-35735-35768.

    Authored-by: Angerszh
    Signed-off-by: Max Gekk
---
 .../spark/sql/catalyst/util/IntervalUtils.scala  | 201 ++---
 .../sql/catalyst/expressions/CastSuiteBase.scala |  28 ++-
 2 files changed, 123 insertions(+), 106 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
index 30a2fa5..b174165 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
@@ -29,7 +29,7 @@ import org.apache.spark.sql.catalyst.util.DateTimeUtils.millisToMicros
 import org.apache.spark.sql.catalyst.util.IntervalStringStyles.{ANSI_STYLE, HIVE_STYLE, IntervalStyle}
 import org.apache.spark.sql.errors.QueryExecutionErrors
 import org.apache.spark.sql.internal.SQLConf
-import org.apache.spark.sql.types.{DayTimeIntervalType => DT, Decimal, YearMonthIntervalType => YM}
+import org.apache.spark.sql.types.{DataType, DayTimeIntervalType => DT, Decimal, YearMonthIntervalType => YM}
 import org.apache.spark.unsafe.types.{CalendarInterval, UTF8String}

 // The style of textual representation of intervals
@@ -110,7 +110,7 @@ object IntervalUtils {
   private val yearMonthIndividualLiteralRegex =
     (s"(?i)^INTERVAL\\s+([+|-])?'$yearMonthIndividualPatternString'\\s+(YEAR|MONTH)$$").r

-  private def getSign(firstSign: String, secondSign: String): Int = {
+  private def finalSign(firstSign: String, secondSign: String = null): Int = {
     (firstSign, secondSign) match {
       case ("-", "-") => 1
       case ("-", _) => -1
@@ -119,6 +119,39 @@ object IntervalUtils {
     }
   }

+  private def throwIllegalIntervalFormatException(
+      input: UTF8String,
+      startFiled: Byte,
+      endField: Byte,
+      intervalStr: String,
+      typeName: String,
+      fallBackNotice: Option[String] = None) = {
+    throw new IllegalArgumentException(
+      s"Interval string does not match $intervalStr format of " +
+        s"${supportedFormat((startFiled, endField)).map(format => s"`$format`").mkString(", ")} " +
+        s"when cast to $typeName: ${input.toString}" +
+        s"${fallBackNotice.map(s => s", $s").getOrElse("")}")
+  }
+
+  private def checkIntervalStringDataType(
+      input: UTF8String,
+      targetStartField: Byte,
+      targetEndField: Byte,
+      inputIntervalType: DataType,
+      fallBackNotice: Option[String] = None): Unit = {
+    val (intervalStr, typeName, inputStartField, inputEndField) = inputIntervalType match {
+      case DT(startField, endField) =>
+        ("day-time", DT(targetStartField, targetEndField).typeName, startField, endField)
+      case YM(startField, endField) =>
+        ("year-month", YM(targetStartField, targetEndField).typeName, startField, endField)
+    }
+    if (targetStartField != inputStartField || targetEndField != inputEndField) {
+      throwIllegalIntervalFormatException(
+        input, targetStartField, targetEndField, intervalStr, typeName, fallBackNotice)
+    }
+  }
+
+  val supportedFormat = Map(
+    (YM.YEAR, YM.MONTH) -> Seq("[+|-]y-m", "INTERVAL [+|-]'[+|-]y-m' YEAR TO MONTH"),
+    (YM.YEAR, YM.YEAR) -> Seq("[+|-]y", "INTERVAL [+|-]'[+|-]y' YEAR"),
@@ -140,56 +173,41 @@ object IntervalUtils {
       startField: Byte,
       endField: Byte): Int = {
-    def checkStringIntervalType(targetStartField: Byte, targetEndField: Byte): Unit = {
-      if (startField != targetStartField || endField != targetEndField) {
-        throw new IllegalArgumentException(s"Interval string does not match year-month format of " +
-          s"
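The `supportedFormat` map introduced by this commit keys the accepted textual patterns by a `(startField, endField)` pair, and `throwIllegalIntervalFormatException` joins them into the error text. A self-contained sketch of that lookup-and-format step (only the two year-month entries visible in the truncated diff are reproduced; the literal bytes 0 and 1 are an assumption standing in for `YM.YEAR` and `YM.MONTH`, and `formatError` hardcodes the "year-month" label that the real helper receives as a parameter):

```scala
// Accepted textual formats per (startField, endField), as in the diff;
// 0 and 1 mirror YearMonthIntervalType's YEAR and MONTH field bytes.
val supportedFormat: Map[(Byte, Byte), Seq[String]] = Map(
  (0: Byte, 1: Byte) -> Seq("[+|-]y-m", "INTERVAL [+|-]'[+|-]y-m' YEAR TO MONTH"),
  (0: Byte, 0: Byte) -> Seq("[+|-]y", "INTERVAL [+|-]'[+|-]y' YEAR"))

// Build the message the way throwIllegalIntervalFormatException does:
// backtick each supported pattern and join with ", ".
def formatError(input: String, start: Byte, end: Byte, typeName: String): String =
  s"Interval string does not match year-month format of " +
    s"${supportedFormat((start, end)).map(f => s"`$f`").mkString(", ")} " +
    s"when cast to $typeName: $input"

// formatError("1-2-3", 0, 1, "interval year to month") produces:
//   Interval string does not match year-month format of `[+|-]y-m`,
//   `INTERVAL [+|-]'[+|-]y-m' YEAR TO MONTH` when cast to
//   interval year to month: 1-2-3
```

Keying the formats by field pair is what lets a single error path cover every `YEAR`/`MONTH`/`DAY`/.../`SECOND` combination instead of the per-type `checkStringIntervalType` checks the commit removes.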
[spark] branch master updated (38ef477 -> d572a85)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 38ef477  [SPARK-32921][SHUFFLE] MapOutputTracker extensions to support push-based shuffle
     add d572a85  [SPARK-35224][SQL][TESTS] Fix buffer overflow in `MutableProjectionSuite`

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/expressions/MutableProjectionSuite.scala | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (74afc68 -> c0a3c0c)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 74afc68  [SPARK-35213][SQL] Keep the correct ordering of nested structs in chained withField operations
     add c0a3c0c  [SPARK-35088][SQL] Accept ANSI intervals by the Sequence expression

No new revisions were added by this update.

Summary of changes:
 .../expressions/collectionOperations.scala       | 194 -
 .../expressions/CollectionExpressionsSuite.scala | 176 ++-
 2 files changed, 326 insertions(+), 44 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
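SPARK-35088 lets the `Sequence` expression step dates and timestamps by ANSI interval values, e.g. `sequence(date'2021-01-01', date'2021-04-01', interval '1' month)` in SQL. The year-month semantics can be sketched outside Spark with `java.time` (`dateSequence` is a hypothetical helper, not part of the Spark API, and assumes a positive step with start on or before stop):

```scala
import java.time.{LocalDate, Period}

// Enumerate dates from start to stop (inclusive), advancing by a
// year-month interval, as sequence(start, stop, step) does for DateType.
def dateSequence(start: LocalDate, stop: LocalDate, step: Period): Seq[LocalDate] =
  Iterator.iterate(start)(_.plus(step)).takeWhile(!_.isAfter(stop)).toSeq

// dateSequence(LocalDate.of(2021, 1, 1), LocalDate.of(2021, 4, 1), Period.ofMonths(1))
//   => 2021-01-01, 2021-02-01, 2021-03-01, 2021-04-01
```

Day-time intervals would step by a fixed `java.time.Duration` instead; the real expression also handles negative steps and descending sequences, which this sketch omits.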
[spark] branch branch-3.1 updated (eaceb40 -> c59db3d)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from eaceb40  [SPARK-35213][SQL] Keep the correct ordering of nested structs in chained withField operations
     add c59db3d  [SPARK-35224][SQL][TESTS][3.1][3.0] Fix buffer overflow in `MutableProjectionSuite`

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/expressions/MutableProjectionSuite.scala | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org