[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-24 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r476090110



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
##
@@ -474,6 +480,54 @@ case class Lag(input: Expression, offset: Expression, 
default: Expression)
   override val direction = Descending
 }
 
+/**
+ * The NthValue function returns the value of `input` at the `offset`th row 
from beginning of the
+ * window frame. Offsets start at 1. When the value of `input` is null at the 
`offset`th row or

Review comment:
   OK





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-24 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r476096152



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
##
@@ -474,6 +480,54 @@ case class Lag(input: Expression, offset: Expression, 
default: Expression)
   override val direction = Descending
 }
 
+/**
+ * The NthValue function returns the value of `input` at the `offset`th row 
from beginning of the
+ * window frame. Offsets start at 1. When the value of `input` is null at the 
`offset`th row or
+ * there is no such an `offset`th row, null is returned.
+ *
+ * @param input expression to evaluate `offset`th row of the window frame.
+ * @param offset rows to jump ahead in the partition.
+ */
+@ExpressionDescription(
+  usage = """
+_FUNC_(input[, offset]) - Returns the value of `input` at the row that is 
the `offset`th row
+  from beginning of the window frame. Offsets start at 1. If the value of 
`input` at the
+  `offset`th row is null, null is returned. If there is no such an offset 
row (e.g., when the
+  offset is 10, size of the window frame less than 10), null is returned.
+  """,
+  since = "3.1.0")
+case class NthValue(input: Expression, offset: Expression)
+extends OffsetWindowFunction {
+
+  override val default = Literal(null)
+
+  override val startWithCurrentRow = false
+
+  override val direction = Ascending
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+val check = super.checkInputDataTypes()
+if (check.isFailure) {
+  check
+} else if (offset.foldable) {
+  offset.dataType match {
+case IntegerType =>
+  offset.eval().asInstanceOf[Int] match {
+case i: Int if i <= 0 => TypeCheckFailure(
+  s"The 'offset' argument of nth_value must be greater than zero 
but it is $i.")
+case _ => TypeCheckSuccess
+  }
+case _ => TypeCheckFailure(
+  s"The 'offset' parameter must be a int literal but it is 
${offset.dataType}.")

Review comment:
   postgresql: integer
   oracle:integer
   redshift:integer
   vertica:integer





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-24 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r476152901



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
##
@@ -363,6 +363,12 @@ abstract class OffsetWindowFunction
*/
   val direction: SortDirection
 
+  /**
+   * Whether the offset is start with the current row. If 
`startWithCurrentRow` is false, `offset`
+   * means the offset is start with the first row of the entire window 
partition.
+   */

Review comment:
   OK.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-24 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r476153597



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala
##
@@ -77,31 +77,30 @@ object WindowFunctionFrame {
  * @param newMutableProjection function used to create the projection.
  * @param offset by which rows get moved within a partition.
  */
-final class OffsetWindowFunctionFrame(

Review comment:
   We can't uniform the behavior.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-24 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r476159993



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExecBase.scala
##
@@ -137,7 +137,11 @@ trait WindowExecBase extends UnaryExecNode {
   function match {
 case AggregateExpression(f, _, _, _, _) => collect("AGGREGATE", 
frame, e, f)
 case f: AggregateWindowFunction => collect("AGGREGATE", frame, e, 
f)
-case f: OffsetWindowFunction => collect("OFFSET", frame, e, f)
+case f: OffsetWindowFunction => if (f.startWithCurrentRow) {
+  collect("OFFSET", frame, e, f)
+} else {
+  collect("WHOLE_OFFSET", frame, e, f)

Review comment:
   OK





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-24 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r476195407



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExecBase.scala
##
@@ -137,7 +137,11 @@ trait WindowExecBase extends UnaryExecNode {
   function match {
 case AggregateExpression(f, _, _, _, _) => collect("AGGREGATE", 
frame, e, f)
 case f: AggregateWindowFunction => collect("AGGREGATE", frame, e, 
f)
-case f: OffsetWindowFunction => collect("OFFSET", frame, e, f)
+case f: OffsetWindowFunction => if (f.startWithCurrentRow) {
+  collect("OFFSET", frame, e, f)
+} else {
+  collect("WHOLE_OFFSET", frame, e, f)

Review comment:
   ok





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-25 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r476291241



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
##
@@ -342,7 +342,8 @@ abstract class OffsetWindowFunction
   extends Expression with WindowFunction with Unevaluable with 
ImplicitCastInputTypes {
   /**
* Input expression to evaluate against a row which a number of rows below 
or above (depending on
-   * the value and sign of the offset) the current row.
+   * the value and sign of the offset) the current row or the first row of the 
entire window

Review comment:
   OK





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-25 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r476307262



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
##
@@ -352,17 +353,24 @@ abstract class OffsetWindowFunction
   val default: Expression
 
   /**
-   * (Foldable) expression that contains the number of rows between the 
current row and the row
-   * where the input expression is evaluated.
+   * (Foldable) expression that contains the number of rows between the 
current row or the first

Review comment:
   OK





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-25 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r476307662



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
##
@@ -352,17 +353,24 @@ abstract class OffsetWindowFunction
   val default: Expression
 
   /**
-   * (Foldable) expression that contains the number of rows between the 
current row and the row
-   * where the input expression is evaluated.
+   * (Foldable) expression that contains the number of rows between the 
current row or the first
+   * row of the entire window partition and the row where the input expression 
is evaluated.
*/
   val offset: Expression
 
   /**
-   * Direction of the number of rows between the current row and the row where 
the input expression
-   * is evaluated.
+   * Direction of the number of rows between the current row or the first row 
of the entire
+   * window partition and the row where the input expression is evaluated.
*/
   val direction: SortDirection
 
+  /**
+   * Whether the offset is start with the current row. If `isRelative` is 
true, `offset` means

Review comment:
   OK





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-25 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r476307341



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
##
@@ -352,17 +353,24 @@ abstract class OffsetWindowFunction
   val default: Expression
 
   /**
-   * (Foldable) expression that contains the number of rows between the 
current row and the row
-   * where the input expression is evaluated.
+   * (Foldable) expression that contains the number of rows between the 
current row or the first
+   * row of the entire window partition and the row where the input expression 
is evaluated.
*/
   val offset: Expression
 
   /**
-   * Direction of the number of rows between the current row and the row where 
the input expression
-   * is evaluated.
+   * Direction of the number of rows between the current row or the first row 
of the entire
+   * window partition and the row where the input expression is evaluated.
*/
   val direction: SortDirection
 
+  /**
+   * Whether the offset is start with the current row. If `isRelative` is 
true, `offset` means
+   * the offset is start with the current row. otherwise, the offset is start 
with the first
+   * row of the entire window partition.

Review comment:
   OK





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-25 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r476308921



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala
##
@@ -151,10 +168,46 @@ final class OffsetWindowFunctionFrame(
 }
 inputIndex += 1
   }
+}
 
-  override def currentLowerBound(): Int = throw new 
UnsupportedOperationException()
+/**
+ * The absolute offset window frame based on the first row calculates frames 
containing
+ * NTH_VALUE statements. The absolute offset window frame based on the first 
row return
+ * the same value for all rows in the window partition.

Review comment:
   I will update it.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-12 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r469670262



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
##
@@ -474,6 +479,52 @@ case class Lag(input: Expression, offset: Expression, 
default: Expression)
   override val direction = Descending
 }
 
+/**
+ * The NthValue function returns the value of `input` at the row that is the 
`offset`th row of
+ * the window frame (counting from 1). Offsets start at 0, which is the 
current row. When the
+ * value of `input` is null at the `offset`th row or there is no such an 
`offset`th row, null
+ * is returned.
+ *
+ * @param input expression to evaluate `offset`th row of the window frame.
+ * @param offset rows to jump ahead in the partition.
+ */
+@ExpressionDescription(
+  usage = """
+_FUNC_(input[, offset]) - Returns the value of `input` at the row that is 
the`offset`th row

Review comment:
   OK





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-12 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r469673892



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
##
@@ -474,6 +479,52 @@ case class Lag(input: Expression, offset: Expression, 
default: Expression)
   override val direction = Descending
 }
 
+/**
+ * The NthValue function returns the value of `input` at the row that is the 
`offset`th row of
+ * the window frame (counting from 1). Offsets start at 0, which is the 
current row. When the
+ * value of `input` is null at the `offset`th row or there is no such an 
`offset`th row, null
+ * is returned.
+ *
+ * @param input expression to evaluate `offset`th row of the window frame.
+ * @param offset rows to jump ahead in the partition.
+ */
+@ExpressionDescription(
+  usage = """
+_FUNC_(input[, offset]) - Returns the value of `input` at the row that is 
the`offset`th row
+  of the window frame (counting from 1). If the value of `input` at the 
`offset`th row is
+  null, null is returned. If there is no such an offset row (e.g., when 
the offset is 10,
+  size of the window frame less than 10), null is returned.

Review comment:
   `If there is no such an offset row, null is returned.` means `If the 
offset greater than the number of values in the window, null is returned.`
   `PostgreSQL`,`Vertica`,`Oracle` and `Presto` all clearly stated this 
requirement.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-12 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r469674334



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
##
@@ -474,6 +479,52 @@ case class Lag(input: Expression, offset: Expression, 
default: Expression)
   override val direction = Descending
 }
 
+/**
+ * The NthValue function returns the value of `input` at the row that is the 
`offset`th row of
+ * the window frame (counting from 1). Offsets start at 0, which is the 
current row. When the
+ * value of `input` is null at the `offset`th row or there is no such an 
`offset`th row, null
+ * is returned.
+ *
+ * @param input expression to evaluate `offset`th row of the window frame.
+ * @param offset rows to jump ahead in the partition.
+ */
+@ExpressionDescription(
+  usage = """
+_FUNC_(input[, offset]) - Returns the value of `input` at the row that is 
the`offset`th row
+  of the window frame (counting from 1). If the value of `input` at the 
`offset`th row is
+  null, null is returned. If there is no such an offset row (e.g., when 
the offset is 10,
+  size of the window frame less than 10), null is returned.
+  """,
+  since = "3.1.0")
+case class NthValue(input: Expression, offset: Expression)
+extends OffsetWindowFunction {
+
+  override val default = Literal(null)
+
+  override val isWholeBased = true
+
+  override val direction = Ascending
+
+  override def checkInputDataTypes(): TypeCheckResult = {
+val check = super.checkInputDataTypes()
+if (check.isFailure) {
+  check
+} else if (offset.foldable) {
+  offset.eval() match {
+case i: Int if i <= 0 => TypeCheckFailure(
+  s"The 'offset' argument of nth_value must be greater than zero but 
it is $i.")
+case i: Int => TypeCheckSuccess
+case other => TypeCheckFailure(
+  s"The 'offset' parameter must be a int literal but it is $other.")

Review comment:
   OK





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-13 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r469835363



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala
##
@@ -151,10 +168,41 @@ final class OffsetWindowFunctionFrame(
 }
 inputIndex += 1
   }
+}
 
-  override def currentLowerBound(): Int = throw new 
UnsupportedOperationException()
+/**
+ * The fixed offset window frame calculates frames containing
+ * NTH_VALUE statements.
+ * The fixed offset window frame return the same value for all rows in the 
window partition.
+ */
+class FixedOffsetWindowFunctionFrame(
+target: InternalRow,
+ordinal: Int,
+expressions: Array[OffsetWindowFunction],
+inputSchema: Seq[Attribute],
+newMutableProjection: (Seq[Expression], Seq[Attribute]) => 
MutableProjection,
+offset: Int)
+  extends OffsetWindowFunctionFrameBase(
+target, ordinal, expressions, inputSchema, newMutableProjection, offset) {
 
-  override def currentUpperBound(): Int = throw new 
UnsupportedOperationException()
+  var maybeRow: Option[UnsafeRow] = None
+
+  override def prepare(rows: ExternalAppendOnlyUnsafeRowArray): Unit = {
+super.prepare(rows)
+if (inputIndex >= 0 && inputIndex < input.length) {
+  val r = WindowFunctionFrame.getNextOrNull(inputIterator)
+  maybeRow = Some(r)
+}
+  }
+
+  override def write(index: Int, current: InternalRow): Unit = {

Review comment:
   OK





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-13 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r470375984



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
##
@@ -474,6 +479,55 @@ case class Lag(input: Expression, offset: Expression, 
default: Expression)
   override val direction = Descending
 }
 
+/**
+ * The NthValue function returns the value of `input` at the row that is the 
`offset`th row of
+ * the window frame (counting from 1). Offsets start at 0, which is the 
current row. When the

Review comment:
   Sorry! I made a mistake.
   `Offsets start at 1`





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-13 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r470377487



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
##
@@ -474,6 +479,55 @@ case class Lag(input: Expression, offset: Expression, 
default: Expression)
   override val direction = Descending
 }
 
+/**
+ * The NthValue function returns the value of `input` at the row that is the 
`offset`th row of
+ * the window frame (counting from 1). Offsets start at 0, which is the 
current row. When the
+ * value of `input` is null at the `offset`th row or there is no such an 
`offset`th row, null
+ * is returned.
+ *
+ * @param input expression to evaluate `offset`th row of the window frame.
+ * @param offset rows to jump ahead in the partition.
+ */
+@ExpressionDescription(
+  usage = """
+_FUNC_(input[, offset]) - Returns the value of `input` at the row that is 
the `offset`th row
+  of the window frame (counting from 1). If the value of `input` at the 
`offset`th row is

Review comment:
   ``offset`th row of the window` ?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-13 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r470379655



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExecBase.scala
##
@@ -137,7 +137,11 @@ trait WindowExecBase extends UnaryExecNode {
   function match {
 case AggregateExpression(f, _, _, _, _) => collect("AGGREGATE", 
frame, e, f)
 case f: AggregateWindowFunction => collect("AGGREGATE", frame, e, 
f)
-case f: OffsetWindowFunction => collect("OFFSET", frame, e, f)
+case f: OffsetWindowFunction => if (f.isWholeBased) {

Review comment:
   According to the plan, `first_value` and `last_value` need to be 
realized.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-14 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r470473071



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
##
@@ -363,6 +363,11 @@ abstract class OffsetWindowFunction
*/
   val direction: SortDirection
 
+  /**
+   * Whether the offset is based on the entire frame.

Review comment:
   OK





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-14 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r470474638



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
##
@@ -474,6 +479,54 @@ case class Lag(input: Expression, offset: Expression, 
default: Expression)
   override val direction = Descending
 }
 
+/**
+ * The NthValue function returns the value of `input` at the `offset`th row of 
beginning

Review comment:
   OK

##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
##
@@ -474,6 +479,54 @@ case class Lag(input: Expression, offset: Expression, 
default: Expression)
   override val direction = Descending
 }
 
+/**
+ * The NthValue function returns the value of `input` at the `offset`th row of 
beginning
+ * the window partition (counting from 1). Offsets start at 1. When the value 
of `input` is null

Review comment:
   OK





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-20 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r474380349



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExecBase.scala
##
@@ -184,6 +188,18 @@ trait WindowExecBase extends UnaryExecNode {
   MutableProjection.create(expressions, schema),
 offset)
 
+  case ("WHOLE_OFFSET", _, IntegerLiteral(offset), _) =>
+target: InternalRow =>
+  new FixedOffsetWindowFunctionFrame(

Review comment:
   OK





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-06-10 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r437999658



##
File path: sql/core/src/main/scala/org/apache/spark/sql/functions.scala
##
@@ -993,6 +993,30 @@ object functions {
 Lead(e.expr, Literal(offset), Literal(defaultValue))
   }
 
+  /**
+   * Window function: returns the value that is the `offset`th row of the 
window frame
+   * (counting from 1), and `null` if the size of window frame is less than 
`offset` rows.
+   *
+   * This is equivalent to the nth_value function in SQL.
+   *
+   * @group window_funcs
+   * @since 3.0.0
+   */
+  def nth_value(columnName: String, offset: Int): Column = {

Review comment:
   They are existing two differences.
   The first is:
   The offset of `lag` based on the current row in the window. 
   The offset of `nth_value` based on the first row in the window. 
   The second is.
   The offset of `lag` increases from back to front.
   The offset of `nth_value` increases from front to back.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-06-10 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r438556665



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
##
@@ -474,6 +479,52 @@ case class Lag(input: Expression, offset: Expression, 
default: Expression)
   override val direction = Descending
 }
 
+/**
+ * The NthValue function returns the value of `input` at the row that is the 
`offset`th row of
+ * the window frame (counting from 1). Offsets start at 0, which is the 
current row. When the
+ * value of `input` is null at the `offset`th row or there is no such an 
`offset`th row, null
+ * is returned.
+ *
+ * @param input expression to evaluate `offset`th row of the window frame.
+ * @param offset rows to jump ahead in the partition.
+ */
+@ExpressionDescription(
+  usage = """
+_FUNC_(input[, offset]) - Returns the value of `input` at the row that is 
the`offset`th row
+  of the window frame (counting from 1). If the value of `input` at the 
`offset`th row is
+  null, null is returned. If there is no such an offset row (e.g., when 
the offset is 10,
+  size of the window frame less than 10), null is returned.
+  """,
+  since = "3.0.0")

Review comment:
   OK.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-06-11 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r439184634



##
File path: sql/core/src/main/scala/org/apache/spark/sql/functions.scala
##
@@ -993,6 +993,30 @@ object functions {
 Lead(e.expr, Literal(offset), Literal(defaultValue))
   }
 
+  /**
+   * Window function: returns the value that is the `offset`th row of the 
window frame
+   * (counting from 1), and `null` if the size of window frame is less than 
`offset` rows.
+   *
+   * This is equivalent to the nth_value function in SQL.
+   *
+   * @group window_funcs
+   * @since 3.0.0

Review comment:
   @HyukjinKwon Thanks for your reminder. I'm wating @hvanhovell here.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-06-27 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r446590900



##
File path: 
sql/core/src/test/resources/sql-tests/inputs/postgreSQL/window_part1.sql
##
@@ -301,7 +301,7 @@ FROM tenk1 WHERE unique1 < 10;
 -- unique1, four
 -- FROM tenk1 WHERE unique1 < 10 WINDOW w AS (order by four);
 
--- [SPARK-27951] ANSI SQL: NTH_VALUE function
+-- [SPARK-30708] first_value/last_value window function throws ParseException

Review comment:
   Because #25082 has reverted, SPARK-30708 not need.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-06-27 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r446591593



##
File path: sql/core/src/main/scala/org/apache/spark/sql/functions.scala
##
@@ -993,6 +993,30 @@ object functions {
 Lead(e.expr, Literal(offset), Literal(defaultValue))
   }
 
+  /**
+   * Window function: returns the value that is the `offset`th row of the 
window frame
+   * (counting from 1), and `null` if the size of window frame is less than 
`offset` rows.
+   *
+   * This is equivalent to the nth_value function in SQL.
+   *
+   * @group window_funcs
+   * @since 3.1.0
+   */
+  def nth_value(columnName: String, offset: Int): Column = {
+nth_value(Column(columnName), offset)
+  }
+
+  /**
+   * Window function: returns the value that is the `offset`th row of the 
window frame
+   * (counting from 1), and `null` if the size of window frame is less than 
`offset` rows.
+   *
+   * This is equivalent to the nth_value function in SQL.
+   *
+   * @group window_funcs
+   * @since 3.0.0

Review comment:
   OK.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-06-27 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r446591681



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala
##
@@ -151,10 +168,41 @@ final class OffsetWindowFunctionFrame(
 }
 inputIndex += 1
   }
+}
 
-  override def currentLowerBound(): Int = throw new 
UnsupportedOperationException()
+/**
+ * The fixed offset window frame calculates frames containing
+ * NTH_VALUE/FIRST_VALUE/LAST_VALUE statements.
+ * The fixed offset windwo frame return the same value for all rows in the 
window partition.

Review comment:
   Thanks.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-06-27 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r446591742



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala
##
@@ -151,10 +168,41 @@ final class OffsetWindowFunctionFrame(
 }
 inputIndex += 1
   }
+}
 
-  override def currentLowerBound(): Int = throw new 
UnsupportedOperationException()
+/**
+ * The fixed offset window frame calculates frames containing
+ * NTH_VALUE/FIRST_VALUE/LAST_VALUE statements.

Review comment:
   I will open other PR to support first_value and last_value.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-06-27 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r446591742



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala
##
@@ -151,10 +168,41 @@ final class OffsetWindowFunctionFrame(
 }
 inputIndex += 1
   }
+}
 
-  override def currentLowerBound(): Int = throw new 
UnsupportedOperationException()
+/**
+ * The fixed offset window frame calculates frames containing
+ * NTH_VALUE/FIRST_VALUE/LAST_VALUE statements.

Review comment:
   I will open other PR to support first_value and last_value.
   I will delete first_value and last_value temporarily, until the other PR 
support them.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-06-27 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r446591904



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowFunctionFrame.scala
##
@@ -151,10 +168,41 @@ final class OffsetWindowFunctionFrame(
 }
 inputIndex += 1
   }
+}
 
-  override def currentLowerBound(): Int = throw new 
UnsupportedOperationException()
+/**
+ * The fixed offset window frame calculates frames containing
+ * NTH_VALUE/FIRST_VALUE/LAST_VALUE statements.
+ * The fixed offset windwo frame return the same value for all rows in the 
window partition.
+ */
+class FixedOffsetWindowFunctionFrame(
+target: InternalRow,
+ordinal: Int,
+expressions: Array[OffsetWindowFunction],
+inputSchema: Seq[Attribute],
+newMutableProjection: (Seq[Expression], Seq[Attribute]) => 
MutableProjection,
+offset: Int)
+  extends OffsetWindowFunctionFrameBase(
+target, ordinal, expressions, inputSchema, newMutableProjection, offset) {
 
-  override def currentUpperBound(): Int = throw new 
UnsupportedOperationException()
+  var rowOption: Option[UnsafeRow] = None

Review comment:
   OK.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-06-27 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r446593095



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
##
@@ -363,6 +363,11 @@ abstract class OffsetWindowFunction
*/
   val direction: SortDirection
 
+  /**
+   * Whether the offset is based on the entire frame.
+   */
+  val isWholeBased: Boolean = false

Review comment:
   I added this flag used to distinguish `OffsetWindowFunctionFrame` and 
`FixedOffsetWindowFunctionFrame`.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-06-27 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r446590900



##
File path: 
sql/core/src/test/resources/sql-tests/inputs/postgreSQL/window_part1.sql
##
@@ -301,7 +301,7 @@ FROM tenk1 WHERE unique1 < 10;
 -- unique1, four
 -- FROM tenk1 WHERE unique1 < 10 WINDOW w AS (order by four);
 
--- [SPARK-27951] ANSI SQL: NTH_VALUE function
+-- [SPARK-30708] first_value/last_value window function throws ParseException

Review comment:
   Because #25082 has reverted, SPARK-30708 not need.
   I will delete this line.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-06-27 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r446590900



##
File path: 
sql/core/src/test/resources/sql-tests/inputs/postgreSQL/window_part1.sql
##
@@ -301,7 +301,7 @@ FROM tenk1 WHERE unique1 < 10;
 -- unique1, four
 -- FROM tenk1 WHERE unique1 < 10 WINDOW w AS (order by four);
 
--- [SPARK-27951] ANSI SQL: NTH_VALUE function
+-- [SPARK-30708] first_value/last_value window function throws ParseException

Review comment:
   Because #25082 has reverted, SPARK-30708 not need.
   I updated this comments with SPARK-28310





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] beliefer commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-06-30 Thread GitBox


beliefer commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r447462884



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
##
@@ -363,6 +363,11 @@ abstract class OffsetWindowFunction
*/
   val direction: SortDirection
 
+  /**
+   * Whether the offset is based on the entire frame.
+   */
+  val isWholeBased: Boolean = false

Review comment:
   @HyukjinKwon Thanks for your review. But I can't understand your means. 
Could you tell me more?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org