[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-29 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/18540


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-29 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130224334
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 ---
@@ -1179,32 +1179,26 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
   }
 
   /**
-   * Create or resolve a [[FrameBoundary]]. Simple math expressions are 
allowed for Value
-   * Preceding/Following boundaries. These expressions must be constant 
(foldable) and return an
-   * integer value.
+   * Create or resolve a frame boundary expressions.
*/
-  override def visitFrameBound(ctx: FrameBoundContext): FrameBoundary = 
withOrigin(ctx) {
-// We currently only allow foldable integers.
-def value: Int = {
+  override def visitFrameBound(ctx: FrameBoundContext): Expression = 
withOrigin(ctx) {
+def value: Expression = {
   val e = expression(ctx.expression)
-  validate(e.resolved && e.foldable && e.dataType == IntegerType,
-"Frame bound value must be a constant integer.",
-ctx)
-  e.eval().asInstanceOf[Int]
+  validate(e.resolved && e.foldable, "Frame bound value must be a 
literal.", ctx)
+  e
 }
 
-// Create the FrameBoundary
 ctx.boundType.getType match {
   case SqlBaseParser.PRECEDING if ctx.UNBOUNDED != null =>
 UnboundedPreceding
   case SqlBaseParser.PRECEDING =>
-ValuePreceding(value)
+UnaryMinus(value)
   case SqlBaseParser.CURRENT =>
 CurrentRow
   case SqlBaseParser.FOLLOWING if ctx.UNBOUNDED != null =>
 UnboundedFollowing
   case SqlBaseParser.FOLLOWING =>
-ValueFollowing(value)
+value
--- End diff --

It sounds like we already allowed this in the previous release, so we need to 
keep the existing behavior.



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-29 Thread jiangxb1987

Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130217250
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala ---
@@ -174,28 +191,22 @@ class WindowSpec private[sql](
*/
   // Note: when updating the doc for this method, also update 
Window.rangeBetween.
   def rangeBetween(start: Long, end: Long): WindowSpec = {
--- End diff --

We'll get and compute empty frames.
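
To illustrate the empty-frame case, here is a minimal sketch in plain Scala. It is a simplified model, not Spark's implementation: `RangeFrameSketch` and `rangeFrame` are hypothetical names, and the frame is modelled as the values lying in `[current + start, current + end]`.

```scala
// Hypothetical model of a RANGE frame; not Spark's implementation.
object RangeFrameSketch {
  // Returns the values that fall inside [current + start, current + end].
  def rangeFrame(values: Seq[Long], current: Long, start: Long, end: Long): Seq[Long] =
    values.filter(v => v >= current + start && v <= current + end)

  def main(args: Array[String]): Unit = {
    val values = Seq(1L, 2L, 3L, 4L, 5L)
    // start <= end: the frame around 3 covers 2, 3 and 4.
    println(rangeFrame(values, 3L, -1L, 1L))
    // start > end: no value can satisfy both bounds, so the frame is empty.
    println(rangeFrame(values, 3L, 1L, -1L))
  }
}
```

Under this model, a `start` larger than `end` is not an error; an aggregate would simply be computed over zero rows.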



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-29 Thread jiangxb1987

Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130217244
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 ---
@@ -1179,32 +1179,26 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
   }
 
   /**
-   * Create or resolve a [[FrameBoundary]]. Simple math expressions are 
allowed for Value
-   * Preceding/Following boundaries. These expressions must be constant 
(foldable) and return an
-   * integer value.
+   * Create or resolve a frame boundary expressions.
*/
-  override def visitFrameBound(ctx: FrameBoundContext): FrameBoundary = 
withOrigin(ctx) {
-// We currently only allow foldable integers.
-def value: Int = {
+  override def visitFrameBound(ctx: FrameBoundContext): Expression = 
withOrigin(ctx) {
+def value: Expression = {
   val e = expression(ctx.expression)
-  validate(e.resolved && e.foldable && e.dataType == IntegerType,
-"Frame bound value must be a constant integer.",
-ctx)
-  e.eval().asInstanceOf[Int]
+  validate(e.resolved && e.foldable, "Frame bound value must be a 
literal.", ctx)
+  e
 }
 
-// Create the FrameBoundary
 ctx.boundType.getType match {
   case SqlBaseParser.PRECEDING if ctx.UNBOUNDED != null =>
 UnboundedPreceding
   case SqlBaseParser.PRECEDING =>
-ValuePreceding(value)
+UnaryMinus(value)
   case SqlBaseParser.CURRENT =>
 CurrentRow
   case SqlBaseParser.FOLLOWING if ctx.UNBOUNDED != null =>
 UnboundedFollowing
   case SqlBaseParser.FOLLOWING =>
-ValueFollowing(value)
+value
--- End diff --

May I ask how we should parse it into an `unsigned-integer`?



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-29 Thread jiangxb1987

Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130217194
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 ---
@@ -1179,32 +1179,26 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
   }
 
   /**
-   * Create or resolve a [[FrameBoundary]]. Simple math expressions are 
allowed for Value
-   * Preceding/Following boundaries. These expressions must be constant 
(foldable) and return an
-   * integer value.
+   * Create or resolve a frame boundary expressions.
*/
-  override def visitFrameBound(ctx: FrameBoundContext): FrameBoundary = 
withOrigin(ctx) {
-// We currently only allow foldable integers.
-def value: Int = {
+  override def visitFrameBound(ctx: FrameBoundContext): Expression = 
withOrigin(ctx) {
+def value: Expression = {
   val e = expression(ctx.expression)
-  validate(e.resolved && e.foldable && e.dataType == IntegerType,
-"Frame bound value must be a constant integer.",
-ctx)
-  e.eval().asInstanceOf[Int]
+  validate(e.resolved && e.foldable, "Frame bound value must be a 
literal.", ctx)
--- End diff --

added



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-29 Thread jiangxb1987

Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130217171
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -351,20 +356,25 @@ abstract class OffsetWindowFunction
 
   override def nullable: Boolean = default == null || default.nullable || 
input.nullable
 
-  override lazy val frame = {
-// This will be triggered by the Analyzer.
-val offsetValue = offset.eval() match {
-  case o: Int => o
-  case x => throw new AnalysisException(
-s"Offset expression must be a foldable integer expression: $x")
-}
+  override lazy val frame: WindowFrame = {
 val boundary = direction match {
-  case Ascending => ValueFollowing(offsetValue)
-  case Descending => ValuePreceding(offsetValue)
+  case Ascending => offset
+  case Descending => UnaryMinus(offset)
 }
 SpecifiedWindowFrame(RowFrame, boundary, boundary)
   }
 
+  override def checkInputDataTypes(): TypeCheckResult = {
+val check = super.checkInputDataTypes()
+if (check.isFailure) {
+  check
+} else if (!offset.foldable) {
+  TypeCheckFailure(s"Offset expression '$offset' must be a literal.")
--- End diff --

Currently it's only used by the lead()/lag() functions, both of which already 
check their input types, so we're not able to test this from SQL.



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-29 Thread jiangxb1987

Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130217065
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -106,161 +101,171 @@ case class WindowSpecReference(name: String) 
extends WindowSpec
 /**
  * The trait used to represent the type of a Window Frame.
  */
-sealed trait FrameType
+sealed trait FrameType {
+  def inputType: AbstractDataType
+  def sql: String
+}
 
 /**
- * RowFrame treats rows in a partition individually. When a 
[[ValuePreceding]]
- * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is 
considered
- * as a physical offset.
+ * RowFrame treats rows in a partition individually. Values used in a row 
frame are considered
+ * to be physical offsets.
  * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 
3-row frame,
  * from the row that precedes the current row to the row that follows the 
current row.
  */
-case object RowFrame extends FrameType
+case object RowFrame extends FrameType {
+  override def inputType: AbstractDataType = IntegerType
+  override def sql: String = "ROWS"
+}
 
 /**
- * RangeFrame treats rows in a partition as groups of peers.
- * All rows having the same `ORDER BY` ordering are considered as peers.
- * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its 
[[FrameBoundary]],
- * the value is considered as a logical offset.
+ * RangeFrame treats rows in a partition as groups of peers. All rows 
having the same `ORDER BY`
+ * ordering are considered as peers. Values used in a range frame are 
considered to be logical
+ * offsets.
  * For example, assuming the value of the current row's `ORDER BY` 
expression `expr` is `v`,
  * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame 
containing rows whose values
  * `expr` are in the range of [v-1, v+1].
  *
  * If `ORDER BY` clause is not defined, all rows in the partition are 
considered as peers
  * of the current row.
  */
-case object RangeFrame extends FrameType
-
-/**
- * The trait used to represent the type of a Window Frame Boundary.
- */
-sealed trait FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean
+case object RangeFrame extends FrameType {
+  override def inputType: AbstractDataType = 
TypeCollection.NumericAndInterval
+  override def sql: String = "RANGE"
 }
 
 /**
- * Extractor for making working with frame boundaries easier.
+ * The trait used to represent special boundaries used in a window frame.
  */
-object FrameBoundary {
-  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
-  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
-case CurrentRow => Some(0)
-case ValuePreceding(offset) => Some(-offset)
-case ValueFollowing(offset) => Some(offset)
-case _ => None
-  }
+sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
+  override def children: Seq[Expression] = Nil
+  override def dataType: DataType = NullType
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
 }
 
-/** UNBOUNDED PRECEDING boundary. */
-case object UnboundedPreceding extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => true
-case vp: ValuePreceding => true
-case CurrentRow => true
-case vf: ValueFollowing => true
-case UnboundedFollowing => true
-  }
-
-  override def toString: String = "UNBOUNDED PRECEDING"
+/** UNBOUNDED boundary. */
+case object UnboundedPreceding extends SpecialFrameBoundary {
+  override def sql: String = "UNBOUNDED PRECEDING"
 }
 
-/**  PRECEDING boundary. */
-case class ValuePreceding(value: Int) extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => false
-case ValuePreceding(anotherValue) => value >= anotherValue
-case CurrentRow => true
-case vf: ValueFollowing => true
-case UnboundedFollowing => true
-  }
-
-  override def toString: String = s"$value PRECEDING"
+case object UnboundedFollowing extends SpecialFrameBoundary {
+  override def sql: String = "UNBOUNDED FOLLOWING"
 }
 
 /** CURRENT ROW boundary. */
-case object CurrentRow extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => false
-case vp: ValuePreceding => false
-case CurrentRow => true
-case vf: ValueFollowing => true
-case UnboundedFollowing

[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-29 Thread jiangxb1987

Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130216990
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -43,58 +42,54 @@ case class WindowSpecDefinition(
 orderSpec: Seq[SortOrder],
 frameSpecification: WindowFrame) extends Expression with WindowSpec 
with Unevaluable {
 
-  def validate: Option[String] = frameSpecification match {
-case UnspecifiedFrame =>
-  Some("Found a UnspecifiedFrame. It should be converted to a 
SpecifiedWindowFrame " +
-"during analysis. Please file a bug report.")
-case frame: SpecifiedWindowFrame => frame.validate.orElse {
-  def checkValueBasedBoundaryForRangeFrame(): Option[String] = {
-if (orderSpec.length > 1)  {
-  // It is not allowed to have a value-based PRECEDING and 
FOLLOWING
-  // as the boundary of a Range Window Frame.
-  Some("This Range Window Frame only accepts at most one ORDER BY 
expression.")
-} else if (orderSpec.nonEmpty && 
!orderSpec.head.dataType.isInstanceOf[NumericType]) {
-  Some("The data type of the expression in the ORDER BY clause 
should be a numeric type.")
-} else {
-  None
-}
-  }
-
-  (frame.frameType, frame.frameStart, frame.frameEnd) match {
-case (RangeFrame, vp: ValuePreceding, _) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, vf: ValueFollowing, _) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, _, vp: ValuePreceding) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, _, vf: ValueFollowing) => 
checkValueBasedBoundaryForRangeFrame()
-case (_, _, _) => None
-  }
-}
-  }
-
-  override def children: Seq[Expression] = partitionSpec ++ orderSpec
+  override def children: Seq[Expression] = partitionSpec ++ orderSpec :+ 
frameSpecification
 
   override lazy val resolved: Boolean =
 childrenResolved && checkInputDataTypes().isSuccess &&
   frameSpecification.isInstanceOf[SpecifiedWindowFrame]
 
   override def nullable: Boolean = true
   override def foldable: Boolean = false
-  override def dataType: DataType = throw new UnsupportedOperationException
+  override def dataType: DataType = throw new 
UnsupportedOperationException("dataType")
 
-  override def sql: String = {
-val partition = if (partitionSpec.isEmpty) {
-  ""
-} else {
-  "PARTITION BY " + partitionSpec.map(_.sql).mkString(", ") + " "
+  override def checkInputDataTypes(): TypeCheckResult = {
+frameSpecification match {
+  case UnspecifiedFrame =>
+TypeCheckFailure(
+  "Cannot use an UnspecifiedFrame. This should have been converted 
during analysis. " +
+"Please file a bug report.")
+  case f: SpecifiedWindowFrame if f.frameType == RangeFrame && 
!f.isUnbounded &&
+orderSpec.isEmpty =>
+TypeCheckFailure(
+  "A range window frame cannot be used in an unordered window 
specification.")
+  case f: SpecifiedWindowFrame if f.frameType == RangeFrame && 
f.isValueBound &&
+orderSpec.size > 1 =>
+TypeCheckFailure(
+  s"A range window frame with value boundaries cannot be used in a 
window specification " +
+s"with multiple order by expressions: 
${orderSpec.mkString(",")}")
+  case f: SpecifiedWindowFrame if f.frameType == RangeFrame && 
f.isValueBound &&
+!isValidFrameType(f.valueBoundary.head.dataType) =>
+TypeCheckFailure(
+  s"The data type '${orderSpec.head.dataType}' used in the order 
specification does " +
+s"not match the data type '${f.valueBoundary.head.dataType}' 
which is used in the " +
+"range frame.")
--- End diff --

The first case is a defensive guard. I'll add SQL tests for the two 
negative cases related to orderBy in RangeFrame.
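
As a rough illustration of the negative cases discussed here, the sketch below models the range-frame checks in plain Scala. Everything in it is an assumption made for illustration: `RangeFrameCheckSketch`, `checkRangeFrame` and the string-typed data types are hypothetical, and the type check is reduced to simple equality, which is stricter than the `isValidFrameType` call in the diff.

```scala
// Hypothetical, simplified model of the range-frame checks; not Spark's classes.
object RangeFrameCheckSketch {
  sealed trait Check
  case object Ok extends Check
  final case class Failure(message: String) extends Check

  // orderByTypes: data types of the ORDER BY expressions, as plain strings.
  // valueBoundType: type of the value boundary, or None when no value boundary is used.
  def checkRangeFrame(
      orderByTypes: Seq[String],
      isUnbounded: Boolean,
      valueBoundType: Option[String]): Check = {
    if (!isUnbounded && orderByTypes.isEmpty) {
      Failure("A range window frame cannot be used in an unordered window specification.")
    } else if (valueBoundType.isDefined && orderByTypes.size > 1) {
      Failure("A range window frame with value boundaries needs a single order by expression.")
    } else if (valueBoundType.exists(t => !orderByTypes.headOption.contains(t))) {
      // Assumption: types must match exactly; the real check is more permissive.
      Failure("The order by data type does not match the range frame data type.")
    } else {
      Ok
    }
  }

  def main(args: Array[String]): Unit = {
    println(checkRangeFrame(Seq.empty, isUnbounded = false, valueBoundType = Some("int")))
    println(checkRangeFrame(Seq("int", "int"), isUnbounded = false, valueBoundType = Some("int")))
    println(checkRangeFrame(Seq("int"), isUnbounded = false, valueBoundType = Some("int")))
  }
}
```

The first two calls correspond to the two orderBy-related negative cases; the third is a valid specification.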



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-29 Thread jiangxb1987

Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130216635
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
 ---
@@ -121,15 +119,10 @@ trait CheckAnalysis extends PredicateHelper {
 // function.
 e match {
   case _: AggregateExpression | _: OffsetWindowFunction | _: 
AggregateWindowFunction =>
+w
   case _ =>
 failAnalysis(s"Expression '$e' not supported within a 
window function.")
 }
-// Make sure the window specification is valid.
-s.validate match {
--- End diff --

yea



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-29 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130215803
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala ---
@@ -174,28 +191,22 @@ class WindowSpec private[sql](
*/
   // Note: when updating the doc for this method, also update 
Window.rangeBetween.
   def rangeBetween(start: Long, end: Long): WindowSpec = {
--- End diff --

What happens if `start` is larger than `end`?



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-29 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130215695
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 ---
@@ -1179,32 +1179,26 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
   }
 
   /**
-   * Create or resolve a [[FrameBoundary]]. Simple math expressions are 
allowed for Value
-   * Preceding/Following boundaries. These expressions must be constant 
(foldable) and return an
-   * integer value.
+   * Create or resolve a frame boundary expressions.
*/
-  override def visitFrameBound(ctx: FrameBoundContext): FrameBoundary = 
withOrigin(ctx) {
-// We currently only allow foldable integers.
-def value: Int = {
+  override def visitFrameBound(ctx: FrameBoundContext): Expression = 
withOrigin(ctx) {
+def value: Expression = {
   val e = expression(ctx.expression)
-  validate(e.resolved && e.foldable && e.dataType == IntegerType,
-"Frame bound value must be a constant integer.",
-ctx)
-  e.eval().asInstanceOf[Int]
+  validate(e.resolved && e.foldable, "Frame bound value must be a 
literal.", ctx)
+  e
 }
 
-// Create the FrameBoundary
 ctx.boundType.getType match {
   case SqlBaseParser.PRECEDING if ctx.UNBOUNDED != null =>
 UnboundedPreceding
   case SqlBaseParser.PRECEDING =>
-ValuePreceding(value)
+UnaryMinus(value)
--- End diff --

The same here. Do we allow users to assign a negative value?
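
To make the concern concrete, here is a minimal sketch in plain Scala of how a negative value interacts with the `UnaryMinus` mapping in the diff. It is only a model: `FrameBoundSketch`, `BoundType` and `toOffset` are hypothetical names, and offsets are plain `Int` values rather than Catalyst expressions.

```scala
// Hypothetical model of frame-bound parsing; the names below are illustrative only.
object FrameBoundSketch {
  sealed trait BoundType
  case object Preceding extends BoundType
  case object Following extends BoundType

  // Mirrors the idea in the diff: PRECEDING negates the value, FOLLOWING keeps it.
  def toOffset(value: Int, boundType: BoundType): Int = boundType match {
    case Preceding => -value
    case Following => value
  }

  def main(args: Array[String]): Unit = {
    println(toOffset(1, Preceding))  // one row/unit before the current row
    println(toOffset(1, Following))  // one row/unit after the current row
    println(toOffset(-1, Preceding)) // a negative PRECEDING value ends up as a positive offset
  }
}
```

In this model, `-1 PRECEDING` produces the same offset as `1 FOLLOWING`, which is why the sign of the user-supplied value matters.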



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-29 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130215692
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 ---
@@ -1179,32 +1179,26 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
   }
 
   /**
-   * Create or resolve a [[FrameBoundary]]. Simple math expressions are 
allowed for Value
-   * Preceding/Following boundaries. These expressions must be constant 
(foldable) and return an
-   * integer value.
+   * Create or resolve a frame boundary expressions.
*/
-  override def visitFrameBound(ctx: FrameBoundContext): FrameBoundary = 
withOrigin(ctx) {
-// We currently only allow foldable integers.
-def value: Int = {
+  override def visitFrameBound(ctx: FrameBoundContext): Expression = 
withOrigin(ctx) {
+def value: Expression = {
   val e = expression(ctx.expression)
-  validate(e.resolved && e.foldable && e.dataType == IntegerType,
-"Frame bound value must be a constant integer.",
-ctx)
-  e.eval().asInstanceOf[Int]
+  validate(e.resolved && e.foldable, "Frame bound value must be a 
literal.", ctx)
+  e
 }
 
-// Create the FrameBoundary
 ctx.boundType.getType match {
   case SqlBaseParser.PRECEDING if ctx.UNBOUNDED != null =>
 UnboundedPreceding
   case SqlBaseParser.PRECEDING =>
-ValuePreceding(value)
+UnaryMinus(value)
   case SqlBaseParser.CURRENT =>
 CurrentRow
   case SqlBaseParser.FOLLOWING if ctx.UNBOUNDED != null =>
 UnboundedFollowing
   case SqlBaseParser.FOLLOWING =>
-ValueFollowing(value)
+value
--- End diff --

According to ANSI SQL, it should be an `unsigned-integer`.



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-29 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130215611
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 ---
@@ -1179,32 +1179,26 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
   }
 
   /**
-   * Create or resolve a [[FrameBoundary]]. Simple math expressions are 
allowed for Value
-   * Preceding/Following boundaries. These expressions must be constant 
(foldable) and return an
-   * integer value.
+   * Create or resolve a frame boundary expressions.
*/
-  override def visitFrameBound(ctx: FrameBoundContext): FrameBoundary = 
withOrigin(ctx) {
-// We currently only allow foldable integers.
-def value: Int = {
+  override def visitFrameBound(ctx: FrameBoundContext): Expression = 
withOrigin(ctx) {
+def value: Expression = {
   val e = expression(ctx.expression)
-  validate(e.resolved && e.foldable && e.dataType == IntegerType,
-"Frame bound value must be a constant integer.",
-ctx)
-  e.eval().asInstanceOf[Int]
+  validate(e.resolved && e.foldable, "Frame bound value must be a 
literal.", ctx)
--- End diff --

Any test case?



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-29 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130215526
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -351,20 +356,25 @@ abstract class OffsetWindowFunction
 
   override def nullable: Boolean = default == null || default.nullable || 
input.nullable
 
-  override lazy val frame = {
-// This will be triggered by the Analyzer.
-val offsetValue = offset.eval() match {
-  case o: Int => o
-  case x => throw new AnalysisException(
-s"Offset expression must be a foldable integer expression: $x")
-}
+  override lazy val frame: WindowFrame = {
 val boundary = direction match {
-  case Ascending => ValueFollowing(offsetValue)
-  case Descending => ValuePreceding(offsetValue)
+  case Ascending => offset
+  case Descending => UnaryMinus(offset)
 }
 SpecifiedWindowFrame(RowFrame, boundary, boundary)
   }
 
+  override def checkInputDataTypes(): TypeCheckResult = {
+val check = super.checkInputDataTypes()
+if (check.isFailure) {
+  check
+} else if (!offset.foldable) {
+  TypeCheckFailure(s"Offset expression '$offset' must be a literal.")
--- End diff --

Is there a test case for this?



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-29 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130215469
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -106,161 +101,171 @@ case class WindowSpecReference(name: String) 
extends WindowSpec
 /**
  * The trait used to represent the type of a Window Frame.
  */
-sealed trait FrameType
+sealed trait FrameType {
+  def inputType: AbstractDataType
+  def sql: String
+}
 
 /**
- * RowFrame treats rows in a partition individually. When a 
[[ValuePreceding]]
- * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is 
considered
- * as a physical offset.
+ * RowFrame treats rows in a partition individually. Values used in a row 
frame are considered
+ * to be physical offsets.
  * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 
3-row frame,
  * from the row that precedes the current row to the row that follows the 
current row.
  */
-case object RowFrame extends FrameType
+case object RowFrame extends FrameType {
+  override def inputType: AbstractDataType = IntegerType
+  override def sql: String = "ROWS"
+}
 
 /**
- * RangeFrame treats rows in a partition as groups of peers.
- * All rows having the same `ORDER BY` ordering are considered as peers.
- * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its 
[[FrameBoundary]],
- * the value is considered as a logical offset.
+ * RangeFrame treats rows in a partition as groups of peers. All rows 
having the same `ORDER BY`
+ * ordering are considered as peers. Values used in a range frame are 
considered to be logical
+ * offsets.
  * For example, assuming the value of the current row's `ORDER BY` 
expression `expr` is `v`,
  * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame 
containing rows whose values
  * `expr` are in the range of [v-1, v+1].
  *
  * If `ORDER BY` clause is not defined, all rows in the partition are 
considered as peers
  * of the current row.
  */
-case object RangeFrame extends FrameType
-
-/**
- * The trait used to represent the type of a Window Frame Boundary.
- */
-sealed trait FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean
+case object RangeFrame extends FrameType {
+  override def inputType: AbstractDataType = 
TypeCollection.NumericAndInterval
+  override def sql: String = "RANGE"
 }
 
 /**
- * Extractor for making working with frame boundaries easier.
+ * The trait used to represent special boundaries used in a window frame.
  */
-object FrameBoundary {
-  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
-  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
-case CurrentRow => Some(0)
-case ValuePreceding(offset) => Some(-offset)
-case ValueFollowing(offset) => Some(offset)
-case _ => None
-  }
+sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
+  override def children: Seq[Expression] = Nil
+  override def dataType: DataType = NullType
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
 }
 
-/** UNBOUNDED PRECEDING boundary. */
-case object UnboundedPreceding extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => true
-case vp: ValuePreceding => true
-case CurrentRow => true
-case vf: ValueFollowing => true
-case UnboundedFollowing => true
-  }
-
-  override def toString: String = "UNBOUNDED PRECEDING"
+/** UNBOUNDED boundary. */
+case object UnboundedPreceding extends SpecialFrameBoundary {
+  override def sql: String = "UNBOUNDED PRECEDING"
 }
 
-/**  PRECEDING boundary. */
-case class ValuePreceding(value: Int) extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => false
-case ValuePreceding(anotherValue) => value >= anotherValue
-case CurrentRow => true
-case vf: ValueFollowing => true
-case UnboundedFollowing => true
-  }
-
-  override def toString: String = s"$value PRECEDING"
+case object UnboundedFollowing extends SpecialFrameBoundary {
+  override def sql: String = "UNBOUNDED FOLLOWING"
 }
 
 /** CURRENT ROW boundary. */
-case object CurrentRow extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => false
-case vp: ValuePreceding => false
-case CurrentRow => true
-case vf: ValueFollowing => true
-case UnboundedFollowing =>

[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-29 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130215418
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -106,161 +101,171 @@ case class WindowSpecReference(name: String) 
extends WindowSpec
 /**
  * The trait used to represent the type of a Window Frame.
  */
-sealed trait FrameType
+sealed trait FrameType {
+  def inputType: AbstractDataType
+  def sql: String
+}
 
 /**
- * RowFrame treats rows in a partition individually. When a 
[[ValuePreceding]]
- * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is 
considered
- * as a physical offset.
+ * RowFrame treats rows in a partition individually. Values used in a row 
frame are considered
+ * to be physical offsets.
  * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 
3-row frame,
  * from the row that precedes the current row to the row that follows the 
current row.
  */
-case object RowFrame extends FrameType
+case object RowFrame extends FrameType {
+  override def inputType: AbstractDataType = IntegerType
+  override def sql: String = "ROWS"
+}
 
 /**
- * RangeFrame treats rows in a partition as groups of peers.
- * All rows having the same `ORDER BY` ordering are considered as peers.
- * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its 
[[FrameBoundary]],
- * the value is considered as a logical offset.
+ * RangeFrame treats rows in a partition as groups of peers. All rows 
having the same `ORDER BY`
+ * ordering are considered as peers. Values used in a range frame are 
considered to be logical
+ * offsets.
  * For example, assuming the value of the current row's `ORDER BY` 
expression `expr` is `v`,
  * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame 
containing rows whose values
  * `expr` are in the range of [v-1, v+1].
  *
  * If `ORDER BY` clause is not defined, all rows in the partition are 
considered as peers
  * of the current row.
  */
-case object RangeFrame extends FrameType
-
-/**
- * The trait used to represent the type of a Window Frame Boundary.
- */
-sealed trait FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean
+case object RangeFrame extends FrameType {
+  override def inputType: AbstractDataType = 
TypeCollection.NumericAndInterval
--- End diff --

uh, we also support `CalendarInterval`. Do we have a test case to verify it 
works on `CalendarInterval`?
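
For readers following the thread, the difference between the physical offsets of a row frame and the logical offsets of a range frame, as described in the doc comments quoted above, can be sketched in plain Scala. This is a simplified model over an already sorted vector of `ORDER BY` values, not Spark's implementation; `FrameSketch`, `rowsFrame`, and `rangeFrame` are hypothetical names used only for illustration.

```scala
// Simplified model, NOT Spark's implementation: frames over a sorted
// vector of ORDER BY values. All names here are hypothetical.
object FrameSketch {
  // ROWS BETWEEN p PRECEDING AND f FOLLOWING: offsets count physical rows.
  def rowsFrame(values: Vector[Int], idx: Int, preceding: Int, following: Int): Vector[Int] = {
    val from = math.max(0, idx - preceding)
    val until = math.min(values.length, idx + following + 1)
    values.slice(from, until)
  }

  // RANGE BETWEEN p PRECEDING AND f FOLLOWING: offsets are applied to the
  // current row's ORDER BY value v, giving the value range [v - p, v + f].
  def rangeFrame(values: Vector[Int], idx: Int, preceding: Int, following: Int): Vector[Int] = {
    val v = values(idx)
    values.filter(x => x >= v - preceding && x <= v + following)
  }
}
```

With the values `1, 2, 2, 5, 6` and the current row at the value `5`, the row frame picks the neighbouring rows regardless of their values, while the range frame keeps only rows whose value lies in `[4, 6]`.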



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-29 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130215342
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -43,58 +42,54 @@ case class WindowSpecDefinition(
 orderSpec: Seq[SortOrder],
 frameSpecification: WindowFrame) extends Expression with WindowSpec 
with Unevaluable {
 
-  def validate: Option[String] = frameSpecification match {
-case UnspecifiedFrame =>
-  Some("Found a UnspecifiedFrame. It should be converted to a 
SpecifiedWindowFrame " +
-"during analysis. Please file a bug report.")
-case frame: SpecifiedWindowFrame => frame.validate.orElse {
-  def checkValueBasedBoundaryForRangeFrame(): Option[String] = {
-if (orderSpec.length > 1)  {
-  // It is not allowed to have a value-based PRECEDING and 
FOLLOWING
-  // as the boundary of a Range Window Frame.
-  Some("This Range Window Frame only accepts at most one ORDER BY 
expression.")
-} else if (orderSpec.nonEmpty && 
!orderSpec.head.dataType.isInstanceOf[NumericType]) {
-  Some("The data type of the expression in the ORDER BY clause 
should be a numeric type.")
-} else {
-  None
-}
-  }
-
-  (frame.frameType, frame.frameStart, frame.frameEnd) match {
-case (RangeFrame, vp: ValuePreceding, _) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, vf: ValueFollowing, _) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, _, vp: ValuePreceding) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, _, vf: ValueFollowing) => 
checkValueBasedBoundaryForRangeFrame()
-case (_, _, _) => None
-  }
-}
-  }
-
-  override def children: Seq[Expression] = partitionSpec ++ orderSpec
+  override def children: Seq[Expression] = partitionSpec ++ orderSpec :+ 
frameSpecification
 
   override lazy val resolved: Boolean =
 childrenResolved && checkInputDataTypes().isSuccess &&
   frameSpecification.isInstanceOf[SpecifiedWindowFrame]
 
   override def nullable: Boolean = true
   override def foldable: Boolean = false
-  override def dataType: DataType = throw new UnsupportedOperationException
+  override def dataType: DataType = throw new 
UnsupportedOperationException("dataType")
 
-  override def sql: String = {
-val partition = if (partitionSpec.isEmpty) {
-  ""
-} else {
-  "PARTITION BY " + partitionSpec.map(_.sql).mkString(", ") + " "
+  override def checkInputDataTypes(): TypeCheckResult = {
+frameSpecification match {
+  case UnspecifiedFrame =>
+TypeCheckFailure(
+  "Cannot use an UnspecifiedFrame. This should have been converted 
during analysis. " +
+"Please file a bug report.")
+  case f: SpecifiedWindowFrame if f.frameType == RangeFrame && 
!f.isUnbounded &&
+orderSpec.isEmpty =>
+TypeCheckFailure(
+  "A range window frame cannot be used in an unordered window 
specification.")
+  case f: SpecifiedWindowFrame if f.frameType == RangeFrame && 
f.isValueBound &&
+orderSpec.size > 1 =>
+TypeCheckFailure(
+  s"A range window frame with value boundaries cannot be used in a 
window specification " +
+s"with multiple order by expressions: 
${orderSpec.mkString(",")}")
+  case f: SpecifiedWindowFrame if f.frameType == RangeFrame && 
f.isValueBound &&
+!isValidFrameType(f.valueBoundary.head.dataType) =>
+TypeCheckFailure(
+  s"The data type '${orderSpec.head.dataType}' used in the order 
specification does " +
+s"not match the data type '${f.valueBoundary.head.dataType}' 
which is used in the " +
+"range frame.")
--- End diff --

Just want to confirm: do we have at least four negative test cases, one 
covering each of these cases?
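
To make the four failure branches easier to enumerate, here is a rough, self-contained model of the checks in the quoted `checkInputDataTypes`. It is only a sketch: data types are plain strings, the type-compatibility rule is assumed to be simple equality (the diff's `isValidFrameType` is not shown in full), and `FrameCheckSketch`, `Bound`, `Value`, and `check` are hypothetical names, not Spark classes.

```scala
// Simplified model, NOT Spark's classes. Data types are plain strings and
// type compatibility is assumed to be equality (an assumption).
object FrameCheckSketch {
  sealed trait Bound
  case object UnboundedPreceding extends Bound
  case object UnboundedFollowing extends Bound
  case object CurrentRow extends Bound
  final case class Value(dataType: String) extends Bound

  sealed trait Frame
  case object UnspecifiedFrame extends Frame
  final case class RangeFrame(lower: Bound, upper: Bound) extends Frame {
    def isUnbounded: Boolean =
      lower == UnboundedPreceding && upper == UnboundedFollowing
    // Boundaries that carry a value, as opposed to the special boundaries.
    def valueBounds: Seq[Value] = Seq(lower, upper).collect { case v: Value => v }
  }

  // Returns Some(error) for each of the four negative cases, else None.
  def check(orderTypes: Seq[String], frame: Frame): Option[String] = frame match {
    case UnspecifiedFrame =>
      Some("unspecified frame")
    case f: RangeFrame if !f.isUnbounded && orderTypes.isEmpty =>
      Some("range frame in unordered window")
    case f: RangeFrame if f.valueBounds.nonEmpty && orderTypes.size > 1 =>
      Some("value boundaries with multiple order by expressions")
    case f: RangeFrame
        if f.valueBounds.nonEmpty && f.valueBounds.head.dataType != orderTypes.head =>
      Some("order type does not match boundary type")
    case _ =>
      None
  }
}
```

Each `Some(...)` branch corresponds to one negative test; a fifth, positive test would cover the `None` branch.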



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-29 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130215242
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -43,58 +42,54 @@ case class WindowSpecDefinition(
 orderSpec: Seq[SortOrder],
 frameSpecification: WindowFrame) extends Expression with WindowSpec 
with Unevaluable {
 
-  def validate: Option[String] = frameSpecification match {
-case UnspecifiedFrame =>
-  Some("Found a UnspecifiedFrame. It should be converted to a 
SpecifiedWindowFrame " +
-"during analysis. Please file a bug report.")
-case frame: SpecifiedWindowFrame => frame.validate.orElse {
-  def checkValueBasedBoundaryForRangeFrame(): Option[String] = {
-if (orderSpec.length > 1)  {
-  // It is not allowed to have a value-based PRECEDING and 
FOLLOWING
-  // as the boundary of a Range Window Frame.
-  Some("This Range Window Frame only accepts at most one ORDER BY 
expression.")
-} else if (orderSpec.nonEmpty && 
!orderSpec.head.dataType.isInstanceOf[NumericType]) {
-  Some("The data type of the expression in the ORDER BY clause 
should be a numeric type.")
-} else {
-  None
-}
-  }
-
-  (frame.frameType, frame.frameStart, frame.frameEnd) match {
-case (RangeFrame, vp: ValuePreceding, _) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, vf: ValueFollowing, _) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, _, vp: ValuePreceding) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, _, vf: ValueFollowing) => 
checkValueBasedBoundaryForRangeFrame()
-case (_, _, _) => None
-  }
-}
-  }
-
-  override def children: Seq[Expression] = partitionSpec ++ orderSpec
+  override def children: Seq[Expression] = partitionSpec ++ orderSpec :+ 
frameSpecification
 
   override lazy val resolved: Boolean =
 childrenResolved && checkInputDataTypes().isSuccess &&
   frameSpecification.isInstanceOf[SpecifiedWindowFrame]
 
   override def nullable: Boolean = true
   override def foldable: Boolean = false
-  override def dataType: DataType = throw new UnsupportedOperationException
+  override def dataType: DataType = throw new 
UnsupportedOperationException("dataType")
 
-  override def sql: String = {
-val partition = if (partitionSpec.isEmpty) {
-  ""
-} else {
-  "PARTITION BY " + partitionSpec.map(_.sql).mkString(", ") + " "
+  override def checkInputDataTypes(): TypeCheckResult = {
+frameSpecification match {
+  case UnspecifiedFrame =>
+TypeCheckFailure(
+  "Cannot use an UnspecifiedFrame. This should have been converted 
during analysis. " +
+"Please file a bug report.")
+  case f: SpecifiedWindowFrame if f.frameType == RangeFrame && 
!f.isUnbounded &&
+orderSpec.isEmpty =>
+TypeCheckFailure(
+  "A range window frame cannot be used in an unordered window 
specification.")
+  case f: SpecifiedWindowFrame if f.frameType == RangeFrame && 
f.isValueBound &&
+orderSpec.size > 1 =>
+TypeCheckFailure(
+  s"A range window frame with value boundaries cannot be used in a 
window specification " +
+s"with multiple order by expressions: 
${orderSpec.mkString(",")}")
+  case f: SpecifiedWindowFrame if f.frameType == RangeFrame && 
f.isValueBound &&
+!isValidFrameType(f.valueBoundary.head.dataType) =>
--- End diff --

Personally, I do not like many long `if` conditions. Let us add two extra 
spaces on lines 71, 66, and 62.



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-29 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130214990
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
 ---
@@ -805,4 +806,27 @@ object TypeCoercion {
   Option(ret)
 }
   }
+
+  /**
+   * Cast WindowFrame boundaries to the type they operate upon.
+   */
+  object WindowFrameCoercion extends Rule[LogicalPlan] {
+def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
+  case s @ WindowSpecDefinition(_, Seq(order), 
SpecifiedWindowFrame(RangeFrame, lower, upper))
+if order.resolved =>
--- End diff --

Nit: add two more spaces before `if`



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-29 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130214960
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
 ---
@@ -805,4 +806,27 @@ object TypeCoercion {
   Option(ret)
 }
   }
+
+  /**
+   * Cast WindowFrame boundaries to the type they operate upon.
+   */
+  object WindowFrameCoercion extends Rule[LogicalPlan] {
+def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
+  case s @ WindowSpecDefinition(_, Seq(order), 
SpecifiedWindowFrame(RangeFrame, lower, upper))
+if order.resolved =>
+s.copy(frameSpecification = SpecifiedWindowFrame(
+  RangeFrame,
+  createBoundaryCast(lower, order.dataType),
+  createBoundaryCast(upper, order.dataType)))
+}
+
+private def createBoundaryCast(boundary: Expression, dt: DataType): 
Expression = {
+  boundary match {
+case e: Expression if e.dataType != dt && Cast.canCast(e.dataType, 
dt) &&
+  !e.isInstanceOf[SpecialFrameBoundary] =>
--- End diff --

Splitting the `if` guard in the middle looks weird. How about this?
```Scala
case e: SpecialFrameBoundary => e
case e: Expression if e.dataType != dt && Cast.canCast(e.dataType, dt) => Cast(e, dt)
case _ => boundary
```
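
The effect of matching the special boundaries first can be tried out with a small stand-alone model. This is a sketch under stated assumptions: data types are strings, `canCast` is a hypothetical stand-in for `Cast.canCast` that (as an assumption about Spark's behaviour) accepts any cast from the null type, and `BoundaryCastSketch` with its members is not part of Spark.

```scala
// Simplified model, NOT Spark's classes. Illustrates the suggested case
// ordering: special boundaries are returned before any cast is considered.
object BoundaryCastSketch {
  sealed trait Expr { def dataType: String }
  sealed trait SpecialBoundary extends Expr
  case object UnboundedPreceding extends SpecialBoundary { val dataType = "null" }
  final case class Literal(value: Any, dataType: String) extends Expr
  final case class Cast(child: Expr, dataType: String) extends Expr

  // Hypothetical stand-in for Cast.canCast; the null rule is an assumption.
  def canCast(from: String, to: String): Boolean =
    from == "null" || Set("int" -> "long", "int" -> "double", "long" -> "double").contains((from, to))

  def createBoundaryCast(boundary: Expr, dt: String): Expr = boundary match {
    case e: SpecialBoundary => e
    case e if e.dataType != dt && canCast(e.dataType, dt) => Cast(e, dt)
    case _ => boundary
  }
}
```

Because the first case returns early, the guard on the second case no longer needs the `!e.isInstanceOf[...]` clause, which is what allowed it to fit on one line.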



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-29 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130214846
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
 ---
@@ -121,15 +119,10 @@ trait CheckAnalysis extends PredicateHelper {
 // function.
 e match {
   case _: AggregateExpression | _: OffsetWindowFunction | _: 
AggregateWindowFunction =>
+w
   case _ =>
 failAnalysis(s"Expression '$e' not supported within a 
window function.")
 }
-// Make sure the window specification is valid.
-s.validate match {
--- End diff --

The verification is moved to `checkInputDataTypes`, right? 



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-29 Thread gatorsmile

Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130214797
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
 ---
@@ -109,10 +109,8 @@ trait CheckAnalysis extends PredicateHelper {
 failAnalysis(s"Distinct window functions are not supported: 
$w")
 
   case w @ WindowExpression(_: OffsetWindowFunction, 
WindowSpecDefinition(_, order,
-   SpecifiedWindowFrame(frame,
- FrameBoundary(l),
- FrameBoundary(h
- if order.isEmpty || frame != RowFrame || l != h =>
+   frame: SpecifiedWindowFrame))
--- End diff --

Nit: Cutting in the middle looks weird. Please keep them on the same line:
`WindowSpecDefinition(_, order, frame: SpecifiedWindowFrame))`



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-28 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130078786
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 ---
@@ -1179,32 +1179,26 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
   }
 
   /**
-   * Create or resolve a [[FrameBoundary]]. Simple math expressions are 
allowed for Value
-   * Preceding/Following boundaries. These expressions must be constant 
(foldable) and return an
-   * integer value.
+   * Create or resolve a frame boundary expressions.
*/
-  override def visitFrameBound(ctx: FrameBoundContext): FrameBoundary = 
withOrigin(ctx) {
-// We currently only allow foldable integers.
-def value: Int = {
+  override def visitFrameBound(ctx: FrameBoundContext): Expression = 
withOrigin(ctx) {
+def value: Expression = {
   val e = expression(ctx.expression)
-  validate(e.resolved && e.foldable && e.dataType == IntegerType,
-"Frame bound value must be a constant integer.",
-ctx)
-  e.eval().asInstanceOf[Int]
+  validate(e.resolved && e.foldable, "Frame bound value must be a 
literal.", ctx)
--- End diff --

sure



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-28 Thread jiangxb1987

Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130076316
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 ---
@@ -1179,32 +1179,26 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
   }
 
   /**
-   * Create or resolve a [[FrameBoundary]]. Simple math expressions are 
allowed for Value
-   * Preceding/Following boundaries. These expressions must be constant 
(foldable) and return an
-   * integer value.
+   * Create or resolve a frame boundary expressions.
*/
-  override def visitFrameBound(ctx: FrameBoundContext): FrameBoundary = 
withOrigin(ctx) {
-// We currently only allow foldable integers.
-def value: Int = {
+  override def visitFrameBound(ctx: FrameBoundContext): Expression = 
withOrigin(ctx) {
+def value: Expression = {
   val e = expression(ctx.expression)
-  validate(e.resolved && e.foldable && e.dataType == IntegerType,
-"Frame bound value must be a constant integer.",
-ctx)
-  e.eval().asInstanceOf[Int]
+  validate(e.resolved && e.foldable, "Frame bound value must be a 
literal.", ctx)
--- End diff --

How about keeping this so we can fail earlier?
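
As an illustration of what failing early at parse time would look like, the `e.foldable` part of the quoted `validate` call can be modelled as below. This is a sketch only: `FoldableSketch`, `Column`, and `validateFrameBound` are hypothetical names, and the rule that an expression is foldable when all of its children are is a simplification.

```scala
// Simplified model, NOT Spark's classes: a frame bound is accepted only if
// it is a constant (foldable) expression, so `1 + 2` passes and `1 + a` fails.
object FoldableSketch {
  sealed trait Expr { def foldable: Boolean }
  final case class Literal(value: Int) extends Expr { val foldable = true }
  final case class Column(name: String) extends Expr { val foldable = false }
  final case class Add(left: Expr, right: Expr) extends Expr {
    def foldable: Boolean = left.foldable && right.foldable
  }

  // Mirrors the message in the quoted diff; returns Left on rejection.
  def validateFrameBound(e: Expr): Either[String, Expr] =
    if (e.foldable) Right(e) else Left("Frame bound value must be a literal.")
}
```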



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-28 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130074867
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 ---
@@ -1179,32 +1179,26 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
   }
 
   /**
-   * Create or resolve a [[FrameBoundary]]. Simple math expressions are 
allowed for Value
-   * Preceding/Following boundaries. These expressions must be constant 
(foldable) and return an
-   * integer value.
+   * Create or resolve a frame boundary expressions.
*/
-  override def visitFrameBound(ctx: FrameBoundContext): FrameBoundary = 
withOrigin(ctx) {
-// We currently only allow foldable integers.
-def value: Int = {
+  override def visitFrameBound(ctx: FrameBoundContext): Expression = 
withOrigin(ctx) {
+def value: Expression = {
   val e = expression(ctx.expression)
-  validate(e.resolved && e.foldable && e.dataType == IntegerType,
-"Frame bound value must be a constant integer.",
-ctx)
-  e.eval().asInstanceOf[Int]
+  validate(e.resolved && e.foldable, "Frame bound value must be a 
literal.", ctx)
--- End diff --

Is it necessary? I think the analyzer can detect and report this failure too.



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-28 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130073540
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -106,161 +113,176 @@ case class WindowSpecReference(name: String) 
extends WindowSpec
 /**
  * The trait used to represent the type of a Window Frame.
  */
-sealed trait FrameType
+sealed trait FrameType {
+  def inputType: AbstractDataType
+  def sql: String
+}
 
 /**
- * RowFrame treats rows in a partition individually. When a 
[[ValuePreceding]]
- * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is 
considered
- * as a physical offset.
+ * RowFrame treats rows in a partition individually. Values used in a row 
frame are considered
+ * to be physical offsets.
  * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 
3-row frame,
  * from the row that precedes the current row to the row that follows the 
current row.
  */
-case object RowFrame extends FrameType
+case object RowFrame extends FrameType {
+  override def inputType: AbstractDataType = IntegerType
+  override def sql: String = "ROWS"
+}
 
 /**
- * RangeFrame treats rows in a partition as groups of peers.
- * All rows having the same `ORDER BY` ordering are considered as peers.
- * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its 
[[FrameBoundary]],
- * the value is considered as a logical offset.
+ * RangeFrame treats rows in a partition as groups of peers. All rows 
having the same `ORDER BY`
+ * ordering are considered as peers. Values used in a range frame are 
considered to be logical
+ * offsets.
  * For example, assuming the value of the current row's `ORDER BY` 
expression `expr` is `v`,
  * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame 
containing rows whose values
  * `expr` are in the range of [v-1, v+1].
  *
  * If `ORDER BY` clause is not defined, all rows in the partition are 
considered as peers
  * of the current row.
  */
-case object RangeFrame extends FrameType
-
-/**
- * The trait used to represent the type of a Window Frame Boundary.
- */
-sealed trait FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean
+case object RangeFrame extends FrameType {
+  override def inputType: AbstractDataType = 
TypeCollection.NumericAndInterval
+  override def sql: String = "RANGE"
 }
 
 /**
- * Extractor for making working with frame boundaries easier.
+ * The trait used to represent special boundaries used in a window frame.
  */
-object FrameBoundary {
-  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
-  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
-case CurrentRow => Some(0)
-case ValuePreceding(offset) => Some(-offset)
-case ValueFollowing(offset) => Some(offset)
-case _ => None
-  }
+sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = NullType
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
+
+  def notFollows(other: Expression): Boolean
 }
 
-/** UNBOUNDED PRECEDING boundary. */
-case object UnboundedPreceding extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => true
-case vp: ValuePreceding => true
-case CurrentRow => true
-case vf: ValueFollowing => true
-case UnboundedFollowing => true
-  }
+/** UNBOUNDED boundary. */
+case object UnboundedPreceding extends SpecialFrameBoundary {
+  override def sql: String = "UNBOUNDED PRECEDING"
 
-  override def toString: String = "UNBOUNDED PRECEDING"
+  override def notFollows(other: Expression): Boolean = true
 }
 
-/**  PRECEDING boundary. */
-case class ValuePreceding(value: Int) extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => false
-case ValuePreceding(anotherValue) => value >= anotherValue
-case CurrentRow => true
-case vf: ValueFollowing => true
+case object UnboundedFollowing extends SpecialFrameBoundary {
+  override def sql: String = "UNBOUNDED FOLLOWING"
+
+  override def notFollows(other: Expression): Boolean = other match {
 case UnboundedFollowing => true
+case _ => false
   }
-
-  override def toString: String = s"$value PRECEDING"
 }
 
 /** CURRENT ROW boundary. */
-case object CurrentRow extends FrameBoundary {
-  def notFollows(other:

[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-28 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130073180
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -106,161 +113,176 @@ case class WindowSpecReference(name: String) 
extends WindowSpec
 /**
  * The trait used to represent the type of a Window Frame.
  */
-sealed trait FrameType
+sealed trait FrameType {
+  def inputType: AbstractDataType
+  def sql: String
+}
 
 /**
- * RowFrame treats rows in a partition individually. When a 
[[ValuePreceding]]
- * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is 
considered
- * as a physical offset.
+ * RowFrame treats rows in a partition individually. Values used in a row 
frame are considered
+ * to be physical offsets.
  * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 
3-row frame,
  * from the row that precedes the current row to the row that follows the 
current row.
  */
-case object RowFrame extends FrameType
+case object RowFrame extends FrameType {
+  override def inputType: AbstractDataType = IntegerType
+  override def sql: String = "ROWS"
+}
 
 /**
- * RangeFrame treats rows in a partition as groups of peers.
- * All rows having the same `ORDER BY` ordering are considered as peers.
- * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its 
[[FrameBoundary]],
- * the value is considered as a logical offset.
+ * RangeFrame treats rows in a partition as groups of peers. All rows 
having the same `ORDER BY`
+ * ordering are considered as peers. Values used in a range frame are 
considered to be logical
+ * offsets.
  * For example, assuming the value of the current row's `ORDER BY` 
expression `expr` is `v`,
  * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame 
containing rows whose values
  * `expr` are in the range of [v-1, v+1].
  *
  * If `ORDER BY` clause is not defined, all rows in the partition are 
considered as peers
  * of the current row.
  */
-case object RangeFrame extends FrameType
-
-/**
- * The trait used to represent the type of a Window Frame Boundary.
- */
-sealed trait FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean
+case object RangeFrame extends FrameType {
+  override def inputType: AbstractDataType = 
TypeCollection.NumericAndInterval
+  override def sql: String = "RANGE"
 }
 
 /**
- * Extractor for making working with frame boundaries easier.
+ * The trait used to represent special boundaries used in a window frame.
  */
-object FrameBoundary {
-  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
-  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
-case CurrentRow => Some(0)
-case ValuePreceding(offset) => Some(-offset)
-case ValueFollowing(offset) => Some(offset)
-case _ => None
-  }
+sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = NullType
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
+
+  def notFollows(other: Expression): Boolean
 }
 
-/** UNBOUNDED PRECEDING boundary. */
-case object UnboundedPreceding extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => true
-case vp: ValuePreceding => true
-case CurrentRow => true
-case vf: ValueFollowing => true
-case UnboundedFollowing => true
-  }
+/** UNBOUNDED boundary. */
+case object UnboundedPreceding extends SpecialFrameBoundary {
+  override def sql: String = "UNBOUNDED PRECEDING"
 
-  override def toString: String = "UNBOUNDED PRECEDING"
+  override def notFollows(other: Expression): Boolean = true
 }
 
-/**  PRECEDING boundary. */
-case class ValuePreceding(value: Int) extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => false
-case ValuePreceding(anotherValue) => value >= anotherValue
-case CurrentRow => true
-case vf: ValueFollowing => true
+case object UnboundedFollowing extends SpecialFrameBoundary {
+  override def sql: String = "UNBOUNDED FOLLOWING"
+
+  override def notFollows(other: Expression): Boolean = other match {
 case UnboundedFollowing => true
+case _ => false
   }
-
-  override def toString: String = s"$value PRECEDING"
 }
 
 /** CURRENT ROW boundary. */
-case object CurrentRow extends FrameBoundary {
-  def notFollows(other:

[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-28 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130073049
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -106,161 +113,176 @@ case class WindowSpecReference(name: String) 
extends WindowSpec
 /**
  * The trait used to represent the type of a Window Frame.
  */
-sealed trait FrameType
+sealed trait FrameType {
+  def inputType: AbstractDataType
+  def sql: String
+}
 
 /**
- * RowFrame treats rows in a partition individually. When a 
[[ValuePreceding]]
- * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is 
considered
- * as a physical offset.
+ * RowFrame treats rows in a partition individually. Values used in a row 
frame are considered
+ * to be physical offsets.
  * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 
3-row frame,
  * from the row that precedes the current row to the row that follows the 
current row.
  */
-case object RowFrame extends FrameType
+case object RowFrame extends FrameType {
+  override def inputType: AbstractDataType = IntegerType
+  override def sql: String = "ROWS"
+}
 
 /**
- * RangeFrame treats rows in a partition as groups of peers.
- * All rows having the same `ORDER BY` ordering are considered as peers.
- * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its 
[[FrameBoundary]],
- * the value is considered as a logical offset.
+ * RangeFrame treats rows in a partition as groups of peers. All rows 
having the same `ORDER BY`
+ * ordering are considered as peers. Values used in a range frame are 
considered to be logical
+ * offsets.
  * For example, assuming the value of the current row's `ORDER BY` 
expression `expr` is `v`,
  * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame 
containing rows whose values
  * `expr` are in the range of [v-1, v+1].
  *
  * If `ORDER BY` clause is not defined, all rows in the partition are 
considered as peers
  * of the current row.
  */
-case object RangeFrame extends FrameType
-
-/**
- * The trait used to represent the type of a Window Frame Boundary.
- */
-sealed trait FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean
+case object RangeFrame extends FrameType {
+  override def inputType: AbstractDataType = 
TypeCollection.NumericAndInterval
+  override def sql: String = "RANGE"
 }
 
 /**
- * Extractor for making working with frame boundaries easier.
+ * The trait used to represent special boundaries used in a window frame.
  */
-object FrameBoundary {
-  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
-  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
-case CurrentRow => Some(0)
-case ValuePreceding(offset) => Some(-offset)
-case ValueFollowing(offset) => Some(offset)
-case _ => None
-  }
+sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
--- End diff --

nit: `def` instead of `lazy val`
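
A minimal sketch of the difference, using a hypothetical `Node` trait as a stand-in (this is not Spark's `Expression`). A `lazy val` adds a cached field and an initialization check per instance, which buys nothing when the value is the constant `Nil`; a `def` simply returns the shared `Nil`.

```scala
// Hypothetical stand-in for an expression tree node; not Spark's Expression.
trait Node {
  def children: Seq[Node]
}

// Caches Nil in a field that is guarded by lazy initialization.
case object LeafWithLazyVal extends Node {
  override lazy val children: Seq[Node] = Nil
}

// Returns the shared Nil singleton on every call; no field, no init check.
case object LeafWithDef extends Node {
  override def children: Seq[Node] = Nil
}
```

Both forms are observably equivalent to callers; the `def` form just avoids the extra state.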


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-28 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130073011
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -106,161 +113,176 @@ case class WindowSpecReference(name: String) 
extends WindowSpec
 /**
  * The trait used to represent the type of a Window Frame.
  */
-sealed trait FrameType
+sealed trait FrameType {
+  def inputType: AbstractDataType
+  def sql: String
+}
 
 /**
- * RowFrame treats rows in a partition individually. When a 
[[ValuePreceding]]
- * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is 
considered
- * as a physical offset.
+ * RowFrame treats rows in a partition individually. Values used in a row 
frame are considered
+ * to be physical offsets.
  * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 
3-row frame,
  * from the row that precedes the current row to the row that follows the 
current row.
  */
-case object RowFrame extends FrameType
+case object RowFrame extends FrameType {
+  override def inputType: AbstractDataType = IntegerType
+  override def sql: String = "ROWS"
+}
 
 /**
- * RangeFrame treats rows in a partition as groups of peers.
- * All rows having the same `ORDER BY` ordering are considered as peers.
- * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its 
[[FrameBoundary]],
- * the value is considered as a logical offset.
+ * RangeFrame treats rows in a partition as groups of peers. All rows 
having the same `ORDER BY`
+ * ordering are considered as peers. Values used in a range frame are 
considered to be logical
+ * offsets.
  * For example, assuming the value of the current row's `ORDER BY` 
expression `expr` is `v`,
  * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame 
containing rows whose values
  * `expr` are in the range of [v-1, v+1].
  *
  * If `ORDER BY` clause is not defined, all rows in the partition are 
considered as peers
  * of the current row.
  */
-case object RangeFrame extends FrameType
-
-/**
- * The trait used to represent the type of a Window Frame Boundary.
- */
-sealed trait FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean
+case object RangeFrame extends FrameType {
+  override def inputType: AbstractDataType = 
TypeCollection.NumericAndInterval
+  override def sql: String = "RANGE"
 }
 
 /**
- * Extractor for making working with frame boundaries easier.
+ * The trait used to represent special boundaries used in a window frame.
  */
-object FrameBoundary {
-  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
-  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
-case CurrentRow => Some(0)
-case ValuePreceding(offset) => Some(-offset)
-case ValueFollowing(offset) => Some(offset)
-case _ => None
-  }
+sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = NullType
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
+
+  def notFollows(other: Expression): Boolean
 }
 
-/** UNBOUNDED PRECEDING boundary. */
-case object UnboundedPreceding extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => true
-case vp: ValuePreceding => true
-case CurrentRow => true
-case vf: ValueFollowing => true
-case UnboundedFollowing => true
-  }
+/** UNBOUNDED boundary. */
+case object UnboundedPreceding extends SpecialFrameBoundary {
+  override def sql: String = "UNBOUNDED PRECEDING"
 
-  override def toString: String = "UNBOUNDED PRECEDING"
+  override def notFollows(other: Expression): Boolean = true
 }
 
-/**  PRECEDING boundary. */
-case class ValuePreceding(value: Int) extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => false
-case ValuePreceding(anotherValue) => value >= anotherValue
-case CurrentRow => true
-case vf: ValueFollowing => true
+case object UnboundedFollowing extends SpecialFrameBoundary {
+  override def sql: String = "UNBOUNDED FOLLOWING"
+
+  override def notFollows(other: Expression): Boolean = other match {
 case UnboundedFollowing => true
+case _ => false
   }
-
-  override def toString: String = s"$value PRECEDING"
 }
 
 /** CURRENT ROW boundary. */
-case object CurrentRow extends FrameBoundary {
-  def notFollows(other:

[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-28 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130036535
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -106,161 +113,176 @@ case class WindowSpecReference(name: String) 
extends WindowSpec
 /**
  * The trait used to represent the type of a Window Frame.
  */
-sealed trait FrameType
+sealed trait FrameType {
+  def inputType: AbstractDataType
+  def sql: String
+}
 
 /**
- * RowFrame treats rows in a partition individually. When a 
[[ValuePreceding]]
- * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is 
considered
- * as a physical offset.
+ * RowFrame treats rows in a partition individually. Values used in a row 
frame are considered
+ * to be physical offsets.
  * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 
3-row frame,
  * from the row that precedes the current row to the row that follows the 
current row.
  */
-case object RowFrame extends FrameType
+case object RowFrame extends FrameType {
+  override def inputType: AbstractDataType = IntegerType
+  override def sql: String = "ROWS"
+}
 
 /**
- * RangeFrame treats rows in a partition as groups of peers.
- * All rows having the same `ORDER BY` ordering are considered as peers.
- * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its 
[[FrameBoundary]],
- * the value is considered as a logical offset.
+ * RangeFrame treats rows in a partition as groups of peers. All rows 
having the same `ORDER BY`
+ * ordering are considered as peers. Values used in a range frame are 
considered to be logical
+ * offsets.
  * For example, assuming the value of the current row's `ORDER BY` 
expression `expr` is `v`,
  * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame 
containing rows whose values
  * `expr` are in the range of [v-1, v+1].
  *
  * If `ORDER BY` clause is not defined, all rows in the partition are 
considered as peers
  * of the current row.
  */
-case object RangeFrame extends FrameType
-
-/**
- * The trait used to represent the type of a Window Frame Boundary.
- */
-sealed trait FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean
+case object RangeFrame extends FrameType {
+  override def inputType: AbstractDataType = 
TypeCollection.NumericAndInterval
+  override def sql: String = "RANGE"
 }
 
 /**
- * Extractor for making working with frame boundaries easier.
+ * The trait used to represent special boundaries used in a window frame.
  */
-object FrameBoundary {
-  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
-  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
-case CurrentRow => Some(0)
-case ValuePreceding(offset) => Some(-offset)
-case ValueFollowing(offset) => Some(offset)
-case _ => None
-  }
+sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = NullType
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
+
+  def notFollows(other: Expression): Boolean
 }
 
-/** UNBOUNDED PRECEDING boundary. */
-case object UnboundedPreceding extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => true
-case vp: ValuePreceding => true
-case CurrentRow => true
-case vf: ValueFollowing => true
-case UnboundedFollowing => true
-  }
+/** UNBOUNDED boundary. */
+case object UnboundedPreceding extends SpecialFrameBoundary {
+  override def sql: String = "UNBOUNDED PRECEDING"
 
-  override def toString: String = "UNBOUNDED PRECEDING"
+  override def notFollows(other: Expression): Boolean = true
 }
 
-/**  PRECEDING boundary. */
-case class ValuePreceding(value: Int) extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => false
-case ValuePreceding(anotherValue) => value >= anotherValue
-case CurrentRow => true
-case vf: ValueFollowing => true
+case object UnboundedFollowing extends SpecialFrameBoundary {
+  override def sql: String = "UNBOUNDED FOLLOWING"
+
+  override def notFollows(other: Expression): Boolean = other match {
 case UnboundedFollowing => true
+case _ => false
   }
-
-  override def toString: String = s"$value PRECEDING"
 }
 
 /** CURRENT ROW boundary. */
-case object CurrentRow extends FrameBoundary {
-  def notFollows(other:

[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-28 Thread jiangxb1987

Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130036232
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -43,57 +42,65 @@ case class WindowSpecDefinition(
 orderSpec: Seq[SortOrder],
 frameSpecification: WindowFrame) extends Expression with WindowSpec 
with Unevaluable {
 
-  def validate: Option[String] = frameSpecification match {
-case UnspecifiedFrame =>
-  Some("Found a UnspecifiedFrame. It should be converted to a 
SpecifiedWindowFrame " +
-"during analysis. Please file a bug report.")
-case frame: SpecifiedWindowFrame => frame.validate.orElse {
-  def checkValueBasedBoundaryForRangeFrame(): Option[String] = {
-if (orderSpec.length > 1)  {
-  // It is not allowed to have a value-based PRECEDING and 
FOLLOWING
-  // as the boundary of a Range Window Frame.
-  Some("This Range Window Frame only accepts at most one ORDER BY 
expression.")
-} else if (orderSpec.nonEmpty && 
!orderSpec.head.dataType.isInstanceOf[NumericType]) {
-  Some("The data type of the expression in the ORDER BY clause 
should be a numeric type.")
-} else {
-  None
-}
-  }
-
-  (frame.frameType, frame.frameStart, frame.frameEnd) match {
-case (RangeFrame, vp: ValuePreceding, _) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, vf: ValueFollowing, _) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, _, vp: ValuePreceding) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, _, vf: ValueFollowing) => 
checkValueBasedBoundaryForRangeFrame()
-case (_, _, _) => None
-  }
-}
-  }
-
-  override def children: Seq[Expression] = partitionSpec ++ orderSpec
+  override def children: Seq[Expression] = partitionSpec ++ orderSpec :+ 
frameSpecification
 
   override lazy val resolved: Boolean =
 childrenResolved && checkInputDataTypes().isSuccess &&
   frameSpecification.isInstanceOf[SpecifiedWindowFrame]
 
   override def nullable: Boolean = true
   override def foldable: Boolean = false
-  override def dataType: DataType = throw new UnsupportedOperationException
+  override def dataType: DataType = throw new 
UnsupportedOperationException("dataType")
 
-  override def sql: String = {
-val partition = if (partitionSpec.isEmpty) {
-  ""
-} else {
-  "PARTITION BY " + partitionSpec.map(_.sql).mkString(", ") + " "
+  override def checkInputDataTypes(): TypeCheckResult = {
+frameSpecification match {
+  case UnspecifiedFrame =>
+TypeCheckFailure(
+  "Cannot use an UnspecifiedFrame. This should have been converted 
during analysis. " +
+"Please file a bug report.")
+  case f: SpecifiedWindowFrame if f.frameType == RangeFrame && 
!f.isUnbounded &&
+orderSpec.isEmpty =>
+TypeCheckFailure(
+  "A range window frame cannot be used in an unordered window 
specification.")
+  case f: SpecifiedWindowFrame if f.frameType == RangeFrame && 
f.isValueBound &&
+orderSpec.size > 1 =>
+TypeCheckFailure(
+  s"A range window frame with value boundaries cannot be used in a 
window specification " +
+s"with multiple order by expressions: 
${orderSpec.mkString(",")}")
+  case f: SpecifiedWindowFrame if f.frameType == RangeFrame && 
f.isValueBound &&
+!isValidFrameType(f.valueBoundary.head.dataType) =>
+TypeCheckFailure(
+  s"The data type '${orderSpec.head.dataType}' used in the order 
specification does " +
+s"not match the data type '${f.valueBoundary.head.dataType}' 
which is used in the " +
+"range frame.")
+  case f: SpecifiedWindowFrame if !isValidFrameBoundary(f.lower, 
f.upper) =>
+TypeCheckFailure(s"The upper bound of the window frame is 
'${f.upper.sql}', which is " +
+  s"smaller than the lower bound '${f.lower.sql}'.")
+  case _ => TypeCheckSuccess
 }
+  }
 
-val order = if (orderSpec.isEmpty) {
-  ""
-} else {
-  "ORDER BY " + orderSpec.map(_.sql).mkString(", ") + " "
+  override def sql: String = {
+def toSql(exprs: Seq[Expression], prefix: String): Seq[String] = {
+  Seq(exprs).filter(_.nonEmpty).map(_.map(_.sql).mkString(prefix, ", 
", ""))
 }
 
-s"($partition$order${frameSpecification.toString})"
+val elements =
+  toSql(partitionSpec, "PARTITION BY ") ++
+toSql(orderSpec, "ORDER BY ") ++
+

[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-28 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130035488
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -43,58 +42,54 @@ case class WindowSpecDefinition(
 orderSpec: Seq[SortOrder],
 frameSpecification: WindowFrame) extends Expression with WindowSpec 
with Unevaluable {
 
-  def validate: Option[String] = frameSpecification match {
-case UnspecifiedFrame =>
-  Some("Found a UnspecifiedFrame. It should be converted to a 
SpecifiedWindowFrame " +
-"during analysis. Please file a bug report.")
-case frame: SpecifiedWindowFrame => frame.validate.orElse {
-  def checkValueBasedBoundaryForRangeFrame(): Option[String] = {
-if (orderSpec.length > 1)  {
-  // It is not allowed to have a value-based PRECEDING and 
FOLLOWING
-  // as the boundary of a Range Window Frame.
-  Some("This Range Window Frame only accepts at most one ORDER BY 
expression.")
-} else if (orderSpec.nonEmpty && 
!orderSpec.head.dataType.isInstanceOf[NumericType]) {
-  Some("The data type of the expression in the ORDER BY clause 
should be a numeric type.")
-} else {
-  None
-}
-  }
-
-  (frame.frameType, frame.frameStart, frame.frameEnd) match {
-case (RangeFrame, vp: ValuePreceding, _) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, vf: ValueFollowing, _) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, _, vp: ValuePreceding) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, _, vf: ValueFollowing) => 
checkValueBasedBoundaryForRangeFrame()
-case (_, _, _) => None
-  }
-}
-  }
-
-  override def children: Seq[Expression] = partitionSpec ++ orderSpec
+  override def children: Seq[Expression] = partitionSpec ++ orderSpec :+ 
frameSpecification
 
   override lazy val resolved: Boolean =
 childrenResolved && checkInputDataTypes().isSuccess &&
   frameSpecification.isInstanceOf[SpecifiedWindowFrame]
 
   override def nullable: Boolean = true
   override def foldable: Boolean = false
-  override def dataType: DataType = throw new UnsupportedOperationException
+  override def dataType: DataType = throw new 
UnsupportedOperationException("dataType")
 
-  override def sql: String = {
-val partition = if (partitionSpec.isEmpty) {
-  ""
-} else {
-  "PARTITION BY " + partitionSpec.map(_.sql).mkString(", ") + " "
+  override def checkInputDataTypes(): TypeCheckResult = {
+frameSpecification match {
+  case UnspecifiedFrame =>
+TypeCheckFailure(
+  "Cannot use an UnspecifiedFrame. This should have been converted 
during analysis. " +
+"Please file a bug report.")
+  case f: SpecifiedWindowFrame if f.frameType == RangeFrame && 
!f.isUnbounded &&
--- End diff --

Sorry I was wrong



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-28 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r130035366
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -43,57 +42,65 @@ case class WindowSpecDefinition(
 orderSpec: Seq[SortOrder],
 frameSpecification: WindowFrame) extends Expression with WindowSpec 
with Unevaluable {
 
-  def validate: Option[String] = frameSpecification match {
-case UnspecifiedFrame =>
-  Some("Found a UnspecifiedFrame. It should be converted to a 
SpecifiedWindowFrame " +
-"during analysis. Please file a bug report.")
-case frame: SpecifiedWindowFrame => frame.validate.orElse {
-  def checkValueBasedBoundaryForRangeFrame(): Option[String] = {
-if (orderSpec.length > 1)  {
-  // It is not allowed to have a value-based PRECEDING and 
FOLLOWING
-  // as the boundary of a Range Window Frame.
-  Some("This Range Window Frame only accepts at most one ORDER BY 
expression.")
-} else if (orderSpec.nonEmpty && 
!orderSpec.head.dataType.isInstanceOf[NumericType]) {
-  Some("The data type of the expression in the ORDER BY clause 
should be a numeric type.")
-} else {
-  None
-}
-  }
-
-  (frame.frameType, frame.frameStart, frame.frameEnd) match {
-case (RangeFrame, vp: ValuePreceding, _) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, vf: ValueFollowing, _) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, _, vp: ValuePreceding) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, _, vf: ValueFollowing) => 
checkValueBasedBoundaryForRangeFrame()
-case (_, _, _) => None
-  }
-}
-  }
-
-  override def children: Seq[Expression] = partitionSpec ++ orderSpec
+  override def children: Seq[Expression] = partitionSpec ++ orderSpec :+ 
frameSpecification
 
   override lazy val resolved: Boolean =
 childrenResolved && checkInputDataTypes().isSuccess &&
   frameSpecification.isInstanceOf[SpecifiedWindowFrame]
 
   override def nullable: Boolean = true
   override def foldable: Boolean = false
-  override def dataType: DataType = throw new UnsupportedOperationException
+  override def dataType: DataType = throw new 
UnsupportedOperationException("dataType")
 
-  override def sql: String = {
-val partition = if (partitionSpec.isEmpty) {
-  ""
-} else {
-  "PARTITION BY " + partitionSpec.map(_.sql).mkString(", ") + " "
+  override def checkInputDataTypes(): TypeCheckResult = {
+frameSpecification match {
+  case UnspecifiedFrame =>
+TypeCheckFailure(
+  "Cannot use an UnspecifiedFrame. This should have been converted 
during analysis. " +
+"Please file a bug report.")
+  case f: SpecifiedWindowFrame if f.frameType == RangeFrame && 
!f.isUnbounded &&
+orderSpec.isEmpty =>
+TypeCheckFailure(
+  "A range window frame cannot be used in an unordered window 
specification.")
+  case f: SpecifiedWindowFrame if f.frameType == RangeFrame && 
f.isValueBound &&
+orderSpec.size > 1 =>
+TypeCheckFailure(
+  s"A range window frame with value boundaries cannot be used in a 
window specification " +
+s"with multiple order by expressions: 
${orderSpec.mkString(",")}")
+  case f: SpecifiedWindowFrame if f.frameType == RangeFrame && 
f.isValueBound &&
+!isValidFrameType(f.valueBoundary.head.dataType) =>
+TypeCheckFailure(
+  s"The data type '${orderSpec.head.dataType}' used in the order 
specification does " +
+s"not match the data type '${f.valueBoundary.head.dataType}' 
which is used in the " +
+"range frame.")
+  case f: SpecifiedWindowFrame if !isValidFrameBoundary(f.lower, 
f.upper) =>
+TypeCheckFailure(s"The upper bound of the window frame is 
'${f.upper.sql}', which is " +
+  s"smaller than the lower bound '${f.lower.sql}'.")
+  case _ => TypeCheckSuccess
 }
+  }
 
-val order = if (orderSpec.isEmpty) {
-  ""
-} else {
-  "ORDER BY " + orderSpec.map(_.sql).mkString(", ") + " "
+  override def sql: String = {
+def toSql(exprs: Seq[Expression], prefix: String): Seq[String] = {
+  Seq(exprs).filter(_.nonEmpty).map(_.map(_.sql).mkString(prefix, ", 
", ""))
 }
 
-s"($partition$order${frameSpecification.toString})"
+val elements =
+  toSql(partitionSpec, "PARTITION BY ") ++
+toSql(orderSpec, "ORDER BY ") ++
+

[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-24 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r129209847
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -43,58 +42,54 @@ case class WindowSpecDefinition(
 orderSpec: Seq[SortOrder],
 frameSpecification: WindowFrame) extends Expression with WindowSpec 
with Unevaluable {
 
-  def validate: Option[String] = frameSpecification match {
-case UnspecifiedFrame =>
-  Some("Found a UnspecifiedFrame. It should be converted to a 
SpecifiedWindowFrame " +
-"during analysis. Please file a bug report.")
-case frame: SpecifiedWindowFrame => frame.validate.orElse {
-  def checkValueBasedBoundaryForRangeFrame(): Option[String] = {
-if (orderSpec.length > 1)  {
-  // It is not allowed to have a value-based PRECEDING and 
FOLLOWING
-  // as the boundary of a Range Window Frame.
-  Some("This Range Window Frame only accepts at most one ORDER BY 
expression.")
-} else if (orderSpec.nonEmpty && 
!orderSpec.head.dataType.isInstanceOf[NumericType]) {
-  Some("The data type of the expression in the ORDER BY clause 
should be a numeric type.")
-} else {
-  None
-}
-  }
-
-  (frame.frameType, frame.frameStart, frame.frameEnd) match {
-case (RangeFrame, vp: ValuePreceding, _) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, vf: ValueFollowing, _) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, _, vp: ValuePreceding) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, _, vf: ValueFollowing) => 
checkValueBasedBoundaryForRangeFrame()
-case (_, _, _) => None
-  }
-}
-  }
-
-  override def children: Seq[Expression] = partitionSpec ++ orderSpec
+  override def children: Seq[Expression] = partitionSpec ++ orderSpec :+ 
frameSpecification
 
   override lazy val resolved: Boolean =
 childrenResolved && checkInputDataTypes().isSuccess &&
   frameSpecification.isInstanceOf[SpecifiedWindowFrame]
 
   override def nullable: Boolean = true
   override def foldable: Boolean = false
-  override def dataType: DataType = throw new UnsupportedOperationException
+  override def dataType: DataType = throw new 
UnsupportedOperationException("dataType")
 
-  override def sql: String = {
-val partition = if (partitionSpec.isEmpty) {
-  ""
-} else {
-  "PARTITION BY " + partitionSpec.map(_.sql).mkString(", ") + " "
+  override def checkInputDataTypes(): TypeCheckResult = {
+frameSpecification match {
+  case UnspecifiedFrame =>
+TypeCheckFailure(
+  "Cannot use an UnspecifiedFrame. This should have been converted 
during analysis. " +
+"Please file a bug report.")
+  case f: SpecifiedWindowFrame if f.frameType == RangeFrame && 
!f.isUnbounded &&
--- End diff --

I think this should be `!f.isValueBound`? Basically, `current row and 
current row` is not unbounded but should be allowed here.



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-24 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r129078561
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -106,173 +101,167 @@ case class WindowSpecReference(name: String) 
extends WindowSpec
 /**
  * The trait used to represent the type of a Window Frame.
  */
-sealed trait FrameType
+sealed trait FrameType {
+  def inputType: AbstractDataType
+  def sql: String
+}
 
 /**
- * RowFrame treats rows in a partition individually. When a 
[[ValuePreceding]]
- * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is 
considered
- * as a physical offset.
+ * RowFrame treats rows in a partition individually. Values used in a row 
frame are considered
+ * to be physical offsets.
  * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 
3-row frame,
  * from the row that precedes the current row to the row that follows the 
current row.
  */
-case object RowFrame extends FrameType
+case object RowFrame extends FrameType {
+  override def inputType: AbstractDataType = IntegerType
+  override def sql: String = "ROWS"
+}
 
 /**
- * RangeFrame treats rows in a partition as groups of peers.
- * All rows having the same `ORDER BY` ordering are considered as peers.
- * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its 
[[FrameBoundary]],
- * the value is considered as a logical offset.
+ * RangeFrame treats rows in a partition as groups of peers. All rows 
having the same `ORDER BY`
+ * ordering are considered as peers. Values used in a range frame are 
considered to be logical
+ * offsets.
  * For example, assuming the value of the current row's `ORDER BY` 
expression `expr` is `v`,
  * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame 
containing rows whose values
  * `expr` are in the range of [v-1, v+1].
  *
  * If `ORDER BY` clause is not defined, all rows in the partition are 
considered as peers
  * of the current row.
  */
-case object RangeFrame extends FrameType
+case object RangeFrame extends FrameType {
+  override def inputType: AbstractDataType = 
TypeCollection.NumericAndInterval
+  override def sql: String = "RANGE"
+}
 
 /**
- * The trait used to represent the type of a Window Frame Boundary.
+ * The trait used to represent special boundaries used in a window frame.
  */
-sealed trait FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean
+sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = NullType
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
 }
 
+/** UNBOUNDED boundary. */
+case object Unbounded extends SpecialFrameBoundary
+
+/** CURRENT ROW boundary. */
+case object CurrentRow extends SpecialFrameBoundary
+
 /**
- * Extractor for making working with frame boundaries easier.
+ * Represents a window frame.
  */
-object FrameBoundary {
-  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
-  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
-case CurrentRow => Some(0)
-case ValuePreceding(offset) => Some(-offset)
-case ValueFollowing(offset) => Some(offset)
-case _ => None
-  }
+sealed trait WindowFrame extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = throw new 
UnsupportedOperationException("dataType")
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
 }
 
-/** UNBOUNDED PRECEDING boundary. */
-case object UnboundedPreceding extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => true
-case vp: ValuePreceding => true
-case CurrentRow => true
-case vf: ValueFollowing => true
-case UnboundedFollowing => true
-  }
+/** Used as a placeholder when a frame specification is not defined. */
+case object UnspecifiedFrame extends WindowFrame
 
-  override def toString: String = "UNBOUNDED PRECEDING"
-}
+/**
+ * A specified Window Frame. The val lower/uppper can be either a foldable 
[[Expression]] or a
+ * [[SpecialFrameBoundary]].
+ */
+case class SpecifiedWindowFrame(
+frameType: FrameType,
+lower: Expression,
+upper: Expression)
+  extends WindowFrame {
 
-/**  PRECEDING boundary. */
-case class ValuePreceding(value: Int) extends FrameBoundary {
-  def
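
The `RowFrame` and `RangeFrame` doc comments in the diff above distinguish physical offsets from logical offsets. A minimal sketch of that difference, assuming a partition already sorted by an integer `ORDER BY` value (`FrameSemanticsSketch`, `rowsFrame` and `rangeFrame` are hypothetical names for illustration, not Spark APIs):

```scala
// Hypothetical illustration only; this is not how Spark evaluates window frames.
object FrameSemanticsSketch {
  // ROWS: physical offsets, counted in row positions within the sorted partition.
  def rowsFrame(sorted: Vector[Int], idx: Int, preceding: Int, following: Int): Vector[Int] =
    sorted.slice(math.max(0, idx - preceding), math.min(sorted.length, idx + following + 1))

  // RANGE: logical offsets, compared against the current row's ORDER BY value v,
  // so the frame holds every row whose value lies in [v - preceding, v + following].
  def rangeFrame(sorted: Vector[Int], idx: Int, preceding: Int, following: Int): Vector[Int] = {
    val v = sorted(idx)
    sorted.filter(x => x >= v - preceding && x <= v + following)
  }
}
```

For the values `1, 2, 2, 3, 7` and the current row at index 1 (value 2), `1 PRECEDING AND 1 FOLLOWING` selects the three rows `1, 2, 2` as a row frame, but the four rows `1, 2, 2, 3` as a range frame, because peers and all values in [1, 3] are included.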

[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-24 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r129005758
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala
 ---
@@ -267,16 +267,17 @@ class ExpressionParserSuite extends PlanTest {
 // Range/Row
 val frameTypes = Seq(("rows", RowFrame), ("range", RangeFrame))
 val boundaries = Seq(
-  ("10 preceding", ValuePreceding(10), CurrentRow),
-  ("3 + 1 following", ValueFollowing(4), CurrentRow), // Will fail 
during analysis
-  ("unbounded preceding", UnboundedPreceding, CurrentRow),
-  ("unbounded following", UnboundedFollowing, CurrentRow), // Will 
fail during analysis
-  ("between unbounded preceding and current row", UnboundedPreceding, 
CurrentRow),
+  ("10 preceding", -Literal(10), CurrentRow),
+  ("2147483648 preceding", -Literal(2147483648L), CurrentRow),
+  ("3 + 1 following", Add(Literal(3), Literal(1)), CurrentRow), // 
Will fail during analysis
+  ("unbounded preceding", Unbounded, CurrentRow),
+  ("unbounded following", Unbounded, CurrentRow), // Will fail during 
analysis
--- End diff --

The idea was that unboundedness is tied to the position in which it is used,
so, for example, unbounded in the first position would mean unbounded
preceding. However, this is the complete opposite of how we interpret literal
bounds, so it might be better to reintroduce special boundaries for unbounded
preceding and unbounded following.
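
A minimal sketch of what separate special boundaries could look like, assuming a simplified boundary model in which literal offsets are signed (negative for PRECEDING, as in the `-Literal(10)` test case). `FrameBoundarySketch`, `Boundary` and `Offset` are hypothetical names, not the classes in this PR:

```scala
// Hypothetical, simplified model; these are not Spark's actual classes.
object FrameBoundarySketch {
  sealed trait Boundary
  // The direction is part of the boundary itself, not implied by its position.
  case object UnboundedPreceding extends Boundary
  case object UnboundedFollowing extends Boundary
  case object CurrentRow extends Boundary
  // Signed offset: negative means PRECEDING, otherwise FOLLOWING.
  final case class Offset(value: Long) extends Boundary

  def sql(b: Boundary): String = b match {
    case UnboundedPreceding => "UNBOUNDED PRECEDING"
    case UnboundedFollowing => "UNBOUNDED FOLLOWING"
    case CurrentRow => "CURRENT ROW"
    case Offset(v) if v < 0 => s"${-v} PRECEDING"
    case Offset(v) => s"$v FOLLOWING"
  }
}
```

With this shape, a boundary can be rendered or validated without knowing whether it sits in the lower or the upper slot of the frame.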



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-24 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r129005233
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala
 ---
@@ -267,16 +267,17 @@ class ExpressionParserSuite extends PlanTest {
 // Range/Row
 val frameTypes = Seq(("rows", RowFrame), ("range", RangeFrame))
 val boundaries = Seq(
-  ("10 preceding", ValuePreceding(10), CurrentRow),
-  ("3 + 1 following", ValueFollowing(4), CurrentRow), // Will fail 
during analysis
-  ("unbounded preceding", UnboundedPreceding, CurrentRow),
-  ("unbounded following", UnboundedFollowing, CurrentRow), // Will 
fail during analysis
-  ("between unbounded preceding and current row", UnboundedPreceding, 
CurrentRow),
+  ("10 preceding", -Literal(10), CurrentRow),
+  ("2147483648 preceding", -Literal(2147483648L), CurrentRow),
+  ("3 + 1 following", Add(Literal(3), Literal(1)), CurrentRow), // 
Will fail during analysis
--- End diff --

Do you think that will improve the UX?



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-24 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r129005122
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -106,173 +101,167 @@ case class WindowSpecReference(name: String) 
extends WindowSpec
 /**
  * The trait used to represent the type of a Window Frame.
  */
-sealed trait FrameType
+sealed trait FrameType {
+  def inputType: AbstractDataType
+  def sql: String
+}
 
 /**
- * RowFrame treats rows in a partition individually. When a 
[[ValuePreceding]]
- * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is 
considered
- * as a physical offset.
+ * RowFrame treats rows in a partition individually. Values used in a row 
frame are considered
+ * to be physical offsets.
  * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 
3-row frame,
  * from the row that precedes the current row to the row that follows the 
current row.
  */
-case object RowFrame extends FrameType
+case object RowFrame extends FrameType {
+  override def inputType: AbstractDataType = IntegerType
+  override def sql: String = "ROWS"
+}
 
 /**
- * RangeFrame treats rows in a partition as groups of peers.
- * All rows having the same `ORDER BY` ordering are considered as peers.
- * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its 
[[FrameBoundary]],
- * the value is considered as a logical offset.
+ * RangeFrame treats rows in a partition as groups of peers. All rows 
having the same `ORDER BY`
+ * ordering are considered as peers. Values used in a range frame are 
considered to be logical
+ * offsets.
  * For example, assuming the value of the current row's `ORDER BY` 
expression `expr` is `v`,
  * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame 
containing rows whose values
  * `expr` are in the range of [v-1, v+1].
  *
  * If `ORDER BY` clause is not defined, all rows in the partition are 
considered as peers
  * of the current row.
  */
-case object RangeFrame extends FrameType
+case object RangeFrame extends FrameType {
+  override def inputType: AbstractDataType = 
TypeCollection.NumericAndInterval
+  override def sql: String = "RANGE"
+}
 
 /**
- * The trait used to represent the type of a Window Frame Boundary.
+ * The trait used to represent special boundaries used in a window frame.
  */
-sealed trait FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean
+sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = NullType
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
 }
 
+/** UNBOUNDED boundary. */
+case object Unbounded extends SpecialFrameBoundary
+
+/** CURRENT ROW boundary. */
+case object CurrentRow extends SpecialFrameBoundary
+
 /**
- * Extractor for making working with frame boundaries easier.
+ * Represents a window frame.
  */
-object FrameBoundary {
-  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
-  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
-case CurrentRow => Some(0)
-case ValuePreceding(offset) => Some(-offset)
-case ValueFollowing(offset) => Some(offset)
-case _ => None
-  }
+sealed trait WindowFrame extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = throw new 
UnsupportedOperationException("dataType")
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
 }
 
-/** UNBOUNDED PRECEDING boundary. */
-case object UnboundedPreceding extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => true
-case vp: ValuePreceding => true
-case CurrentRow => true
-case vf: ValueFollowing => true
-case UnboundedFollowing => true
-  }
+/** Used as a placeholder when a frame specification is not defined. */
+case object UnspecifiedFrame extends WindowFrame
 
-  override def toString: String = "UNBOUNDED PRECEDING"
-}
+/**
+ * A specified Window Frame. The val lower/uppper can be either a foldable 
[[Expression]] or a
+ * [[SpecialFrameBoundary]].
+ */
+case class SpecifiedWindowFrame(
+frameType: FrameType,
+lower: Expression,
+upper: Expression)
+  extends WindowFrame {
 
-/**  PRECEDING boundary. */
-case class ValuePreceding(value: Int) extends FrameBoundary {
-  def

[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-24 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r129004992
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -106,173 +101,167 @@ case class WindowSpecReference(name: String) 
extends WindowSpec
 /**
  * The trait used to represent the type of a Window Frame.
  */
-sealed trait FrameType
+sealed trait FrameType {
+  def inputType: AbstractDataType
+  def sql: String
+}
 
 /**
- * RowFrame treats rows in a partition individually. When a 
[[ValuePreceding]]
- * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is 
considered
- * as a physical offset.
+ * RowFrame treats rows in a partition individually. Values used in a row 
frame are considered
+ * to be physical offsets.
  * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 
3-row frame,
  * from the row that precedes the current row to the row that follows the 
current row.
  */
-case object RowFrame extends FrameType
+case object RowFrame extends FrameType {
+  override def inputType: AbstractDataType = IntegerType
+  override def sql: String = "ROWS"
+}
 
 /**
- * RangeFrame treats rows in a partition as groups of peers.
- * All rows having the same `ORDER BY` ordering are considered as peers.
- * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its 
[[FrameBoundary]],
- * the value is considered as a logical offset.
+ * RangeFrame treats rows in a partition as groups of peers. All rows 
having the same `ORDER BY`
+ * ordering are considered as peers. Values used in a range frame are 
considered to be logical
+ * offsets.
  * For example, assuming the value of the current row's `ORDER BY` 
expression `expr` is `v`,
  * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame 
containing rows whose values
  * `expr` are in the range of [v-1, v+1].
  *
  * If `ORDER BY` clause is not defined, all rows in the partition are 
considered as peers
  * of the current row.
  */
-case object RangeFrame extends FrameType
+case object RangeFrame extends FrameType {
+  override def inputType: AbstractDataType = 
TypeCollection.NumericAndInterval
+  override def sql: String = "RANGE"
+}
 
 /**
- * The trait used to represent the type of a Window Frame Boundary.
+ * The trait used to represent special boundaries used in a window frame.
  */
-sealed trait FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean
+sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = NullType
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
 }
 
+/** UNBOUNDED boundary. */
+case object Unbounded extends SpecialFrameBoundary
+
+/** CURRENT ROW boundary. */
+case object CurrentRow extends SpecialFrameBoundary
+
 /**
- * Extractor for making working with frame boundaries easier.
+ * Represents a window frame.
  */
-object FrameBoundary {
-  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
-  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
-case CurrentRow => Some(0)
-case ValuePreceding(offset) => Some(-offset)
-case ValueFollowing(offset) => Some(offset)
-case _ => None
-  }
+sealed trait WindowFrame extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = throw new 
UnsupportedOperationException("dataType")
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
 }
 
-/** UNBOUNDED PRECEDING boundary. */
-case object UnboundedPreceding extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => true
-case vp: ValuePreceding => true
-case CurrentRow => true
-case vf: ValueFollowing => true
-case UnboundedFollowing => true
-  }
+/** Used as a placeholder when a frame specification is not defined. */
+case object UnspecifiedFrame extends WindowFrame
 
-  override def toString: String = "UNBOUNDED PRECEDING"
-}
+/**
+ * A specified Window Frame. The val lower/uppper can be either a foldable 
[[Expression]] or a
+ * [[SpecialFrameBoundary]].
+ */
+case class SpecifiedWindowFrame(
+frameType: FrameType,
+lower: Expression,
+upper: Expression)
+  extends WindowFrame {
 
-/**  PRECEDING boundary. */
-case class ValuePreceding(value: Int) extends FrameBoundary {
-  def

[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-24 Thread jiangxb1987

Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r129002630
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala
 ---
@@ -267,16 +267,17 @@ class ExpressionParserSuite extends PlanTest {
 // Range/Row
 val frameTypes = Seq(("rows", RowFrame), ("range", RangeFrame))
 val boundaries = Seq(
-  ("10 preceding", ValuePreceding(10), CurrentRow),
-  ("3 + 1 following", ValueFollowing(4), CurrentRow), // Will fail 
during analysis
-  ("unbounded preceding", UnboundedPreceding, CurrentRow),
-  ("unbounded following", UnboundedFollowing, CurrentRow), // Will 
fail during analysis
-  ("between unbounded preceding and current row", UnboundedPreceding, 
CurrentRow),
+  ("10 preceding", -Literal(10), CurrentRow),
+  ("2147483648 preceding", -Literal(2147483648L), CurrentRow),
+  ("3 + 1 following", Add(Literal(3), Literal(1)), CurrentRow), // 
Will fail during analysis
+  ("unbounded preceding", Unbounded, CurrentRow),
+  ("unbounded following", Unbounded, CurrentRow), // Will fail during 
analysis
--- End diff --

In fact this is problematic: we would generate the same result for both
`unbounded preceding` and `unbounded following`. @hvanhovell, any idea how to
resolve this?
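
The ambiguity can be reproduced with a toy stand-in for the parser. This is a sketch under the assumption that a single bound is paired with `CurrentRow`, as in the test table above; `UnboundedAmbiguitySketch` and `parseSingleBound` are hypothetical names:

```scala
// Hypothetical stand-in for the parser; only the two cases discussed are modelled.
object UnboundedAmbiguitySketch {
  sealed trait Bound
  case object Unbounded extends Bound
  case object CurrentRow extends Bound

  // Both inputs map to the same pair, so the direction is lost after parsing.
  def parseSingleBound(text: String): (Bound, Bound) = text.trim.toLowerCase match {
    case "unbounded preceding" => (Unbounded, CurrentRow)
    case "unbounded following" => (Unbounded, CurrentRow)
    case other => throw new IllegalArgumentException(s"unsupported bound: $other")
  }
}
```

Because the two results compare equal, no later phase can tell which direction the user wrote.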



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-24 Thread jiangxb1987

Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r129002412
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala
 ---
@@ -267,16 +267,17 @@ class ExpressionParserSuite extends PlanTest {
 // Range/Row
 val frameTypes = Seq(("rows", RowFrame), ("range", RangeFrame))
 val boundaries = Seq(
-  ("10 preceding", ValuePreceding(10), CurrentRow),
-  ("3 + 1 following", ValueFollowing(4), CurrentRow), // Will fail 
during analysis
-  ("unbounded preceding", UnboundedPreceding, CurrentRow),
-  ("unbounded following", UnboundedFollowing, CurrentRow), // Will 
fail during analysis
-  ("between unbounded preceding and current row", UnboundedPreceding, 
CurrentRow),
+  ("10 preceding", -Literal(10), CurrentRow),
+  ("2147483648 preceding", -Literal(2147483648L), CurrentRow),
+  ("3 + 1 following", Add(Literal(3), Literal(1)), CurrentRow), // 
Will fail during analysis
--- End diff --

The lower boundary would be higher than the upper boundary. Previously this
would fail, but we have removed that check; we should add it back.
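
A minimal sketch of such a check, assuming both boundaries have already been folded to signed offsets relative to the current row, with `CURRENT ROW` treated as 0 as the removed `FrameBoundary` extractor did. `FrameOrderCheckSketch` and `checkFrame` are hypothetical names, not part of the PR:

```scala
// Hypothetical validation helper; Spark's real check lives in the analyzer.
object FrameOrderCheckSketch {
  // Signed offsets: negative = PRECEDING, 0 = CURRENT ROW, positive = FOLLOWING.
  def checkFrame(lower: Long, upper: Long): Either[String, (Long, Long)] =
    if (lower > upper) {
      Left(s"Lower bound ($lower) must not be greater than upper bound ($upper)")
    } else {
      Right((lower, upper))
    }
}
```

Under these assumptions `3 + 1 following` paired with `CurrentRow` becomes `checkFrame(4, 0)`, which is rejected, while `10 preceding` becomes `checkFrame(-10, 0)`, which is accepted.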



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-24 Thread jiangxb1987

Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r129000692
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/trees/TreeNodeSuite.scala
 ---
@@ -436,21 +436,22 @@ class TreeNodeSuite extends SparkFunSuite {
 "bucketColumnNames" -> "[bucket]",
 "sortColumnNames" -> "[sort]"))
 
-// Converts FrameBoundary to JSON
-assertJSON(
-  ValueFollowing(3),
-  JObject(
-"product-class" -> classOf[ValueFollowing].getName,
-"value" -> 3))
-
 // Converts WindowFrame to JSON
 assertJSON(
-  SpecifiedWindowFrame(RowFrame, UnboundedFollowing, CurrentRow),
-  JObject(
-"product-class" -> classOf[SpecifiedWindowFrame].getName,
-"frameType" -> JObject("object" -> 
JString(RowFrame.getClass.getName)),
-"frameStart" -> JObject("object" -> 
JString(UnboundedFollowing.getClass.getName)),
-"frameEnd" -> JObject("object" -> 
JString(CurrentRow.getClass.getName))))
+  SpecifiedWindowFrame(RowFrame, Unbounded, CurrentRow),
+  List(
+JObject(
+  "class" -> classOf[SpecifiedWindowFrame].getName,
+  "num-children" -> 2,
+  "frameType" -> JObject("object" -> 
JString(RowFrame.getClass.getName)),
+  "lower" -> 0,
+  "upper" -> 1),
--- End diff --

After this PR, `SpecialFrameBoundary` and `WindowFrame` are `Expression`s and
therefore `TreeNode`s, so the `lower` and `upper` field values are serialized
as their indices in `TreeNode.children`.
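
The effect can be sketched with a miniature tree. This is an illustration only: `ChildIndexSketch`, `Node`, `Frame` and `fieldsAsIndices` are hypothetical, and Spark's `TreeNode` JSON serialization is considerably richer:

```scala
// Hypothetical miniature tree node, used only to show the indexing idea.
object ChildIndexSketch {
  sealed trait Node { def children: Seq[Node] }
  case object Unbounded extends Node { val children: Seq[Node] = Nil }
  case object CurrentRow extends Node { val children: Seq[Node] = Nil }
  final case class Frame(lower: Node, upper: Node) extends Node {
    val children: Seq[Node] = Seq(lower, upper)
  }

  // Fields that are also children are written as their position in `children`
  // instead of being serialized inline.
  def fieldsAsIndices(f: Frame): Map[String, Int] = Map(
    "num-children" -> f.children.size,
    "lower" -> f.children.indexOf(f.lower),
    "upper" -> f.children.indexOf(f.upper))
}
```

For `Frame(Unbounded, CurrentRow)` this yields `num-children` 2, `lower` 0 and `upper` 1, matching the values expected in the test above.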



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-23 Thread jiangxb1987

Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r128947347
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -106,173 +101,167 @@ case class WindowSpecReference(name: String) 
extends WindowSpec
 /**
  * The trait used to represent the type of a Window Frame.
  */
-sealed trait FrameType
+sealed trait FrameType {
+  def inputType: AbstractDataType
+  def sql: String
+}
 
 /**
- * RowFrame treats rows in a partition individually. When a 
[[ValuePreceding]]
- * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is 
considered
- * as a physical offset.
+ * RowFrame treats rows in a partition individually. Values used in a row 
frame are considered
+ * to be physical offsets.
  * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 
3-row frame,
  * from the row that precedes the current row to the row that follows the 
current row.
  */
-case object RowFrame extends FrameType
+case object RowFrame extends FrameType {
+  override def inputType: AbstractDataType = IntegerType
+  override def sql: String = "ROWS"
+}
 
 /**
- * RangeFrame treats rows in a partition as groups of peers.
- * All rows having the same `ORDER BY` ordering are considered as peers.
- * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its 
[[FrameBoundary]],
- * the value is considered as a logical offset.
+ * RangeFrame treats rows in a partition as groups of peers. All rows 
having the same `ORDER BY`
+ * ordering are considered as peers. Values used in a range frame are 
considered to be logical
+ * offsets.
  * For example, assuming the value of the current row's `ORDER BY` 
expression `expr` is `v`,
  * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame 
containing rows whose values
  * `expr` are in the range of [v-1, v+1].
  *
  * If `ORDER BY` clause is not defined, all rows in the partition are 
considered as peers
  * of the current row.
  */
-case object RangeFrame extends FrameType
+case object RangeFrame extends FrameType {
+  override def inputType: AbstractDataType = 
TypeCollection.NumericAndInterval
+  override def sql: String = "RANGE"
+}
 
 /**
- * The trait used to represent the type of a Window Frame Boundary.
+ * The trait used to represent special boundaries used in a window frame.
  */
-sealed trait FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean
+sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = NullType
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
 }
 
+/** UNBOUNDED boundary. */
+case object Unbounded extends SpecialFrameBoundary
+
+/** CURRENT ROW boundary. */
+case object CurrentRow extends SpecialFrameBoundary
+
 /**
- * Extractor for making working with frame boundaries easier.
+ * Represents a window frame.
  */
-object FrameBoundary {
-  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
-  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
-case CurrentRow => Some(0)
-case ValuePreceding(offset) => Some(-offset)
-case ValueFollowing(offset) => Some(offset)
-case _ => None
-  }
+sealed trait WindowFrame extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = throw new 
UnsupportedOperationException("dataType")
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
 }
 
-/** UNBOUNDED PRECEDING boundary. */
-case object UnboundedPreceding extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => true
-case vp: ValuePreceding => true
-case CurrentRow => true
-case vf: ValueFollowing => true
-case UnboundedFollowing => true
-  }
+/** Used as a placeholder when a frame specification is not defined. */
+case object UnspecifiedFrame extends WindowFrame
 
-  override def toString: String = "UNBOUNDED PRECEDING"
-}
+/**
+ * A specified Window Frame. The val lower/uppper can be either a foldable 
[[Expression]] or a
+ * [[SpecialFrameBoundary]].
+ */
+case class SpecifiedWindowFrame(
+frameType: FrameType,
+lower: Expression,
+upper: Expression)
+  extends WindowFrame {
 
-/**  PRECEDING boundary. */
-case class ValuePreceding(value: Int) extends FrameBoundary {
-  def

[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-22 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r128894965
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/trees/TreeNodeSuite.scala
 ---
@@ -436,21 +436,22 @@ class TreeNodeSuite extends SparkFunSuite {
 "bucketColumnNames" -> "[bucket]",
 "sortColumnNames" -> "[sort]"))
 
-// Converts FrameBoundary to JSON
-assertJSON(
-  ValueFollowing(3),
-  JObject(
-"product-class" -> classOf[ValueFollowing].getName,
-"value" -> 3))
-
 // Converts WindowFrame to JSON
 assertJSON(
-  SpecifiedWindowFrame(RowFrame, UnboundedFollowing, CurrentRow),
-  JObject(
-"product-class" -> classOf[SpecifiedWindowFrame].getName,
-"frameType" -> JObject("object" -> 
JString(RowFrame.getClass.getName)),
-"frameStart" -> JObject("object" -> 
JString(UnboundedFollowing.getClass.getName)),
-"frameEnd" -> JObject("object" -> 
JString(CurrentRow.getClass.getName))))
+  SpecifiedWindowFrame(RowFrame, Unbounded, CurrentRow),
+  List(
+JObject(
+  "class" -> classOf[SpecifiedWindowFrame].getName,
+  "num-children" -> 2,
+  "frameType" -> JObject("object" -> 
JString(RowFrame.getClass.getName)),
+  "lower" -> 0,
+  "upper" -> 1),
--- End diff --

Why are lower and upper 0 and 1?



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-22 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r128894924
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala
 ---
@@ -267,16 +267,17 @@ class ExpressionParserSuite extends PlanTest {
 // Range/Row
 val frameTypes = Seq(("rows", RowFrame), ("range", RangeFrame))
 val boundaries = Seq(
-  ("10 preceding", ValuePreceding(10), CurrentRow),
-  ("3 + 1 following", ValueFollowing(4), CurrentRow), // Will fail 
during analysis
-  ("unbounded preceding", UnboundedPreceding, CurrentRow),
-  ("unbounded following", UnboundedFollowing, CurrentRow), // Will 
fail during analysis
-  ("between unbounded preceding and current row", UnboundedPreceding, 
CurrentRow),
+  ("10 preceding", -Literal(10), CurrentRow),
+  ("2147483648 preceding", -Literal(2147483648L), CurrentRow),
+  ("3 + 1 following", Add(Literal(3), Literal(1)), CurrentRow), // 
Will fail during analysis
+  ("unbounded preceding", Unbounded, CurrentRow),
+  ("unbounded following", Unbounded, CurrentRow), // Will fail during 
analysis
--- End diff --

ditto, why?



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-22 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r128894916
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala
 ---
@@ -267,16 +267,17 @@ class ExpressionParserSuite extends PlanTest {
 // Range/Row
 val frameTypes = Seq(("rows", RowFrame), ("range", RangeFrame))
 val boundaries = Seq(
-  ("10 preceding", ValuePreceding(10), CurrentRow),
-  ("3 + 1 following", ValueFollowing(4), CurrentRow), // Will fail 
during analysis
-  ("unbounded preceding", UnboundedPreceding, CurrentRow),
-  ("unbounded following", UnboundedFollowing, CurrentRow), // Will 
fail during analysis
-  ("between unbounded preceding and current row", UnboundedPreceding, 
CurrentRow),
+  ("10 preceding", -Literal(10), CurrentRow),
+  ("2147483648 preceding", -Literal(2147483648L), CurrentRow),
+  ("3 + 1 following", Add(Literal(3), Literal(1)), CurrentRow), // 
Will fail during analysis
--- End diff --

Why will this fail during analysis?



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-22 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r128894905
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala
 ---
@@ -1109,6 +1109,31 @@ class TypeCoercionSuite extends AnalysisTest {
   EqualTo(Literal(Array(1, 2)), Literal("123")),
   EqualTo(Literal(Array(1, 2)), Literal("123")))
   }
+
+  test("cast WindowFrame boundaries to the type they operate upon") {
+// Can cast frame boundaries to order dataType.
+ruleTest(WindowFrameCoercion,
+  windowSpec(
+Seq(UnresolvedAttribute("a")),
+Seq(SortOrder(Literal(1L), Ascending)),
+SpecifiedWindowFrame(RangeFrame, Literal(3), 
Literal(2147483648L))),
+  windowSpec(
+Seq(UnresolvedAttribute("a")),
+Seq(SortOrder(Literal(1L), Ascending)),
+SpecifiedWindowFrame(RangeFrame, Cast(3, LongType), 
Literal(2147483648L)))
+)
+// Cannot cast frame boundaries to order dataType.
+ruleTest(WindowFrameCoercion,
+  windowSpec(
+Seq(UnresolvedAttribute("a")),
+Seq(SortOrder(Literal.default(DateType), Ascending)),
+SpecifiedWindowFrame(RangeFrame, Literal(10.0), 
Literal(2147483648L))),
+  windowSpec(
+Seq(UnresolvedAttribute("a")),
+Seq(SortOrder(Literal.default(DateType), Ascending)),
+SpecifiedWindowFrame(RangeFrame, Literal(10.0), 
Literal(2147483648L)))
+)
--- End diff --

Can we add some more test cases with special window frame boundaries?
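
A toy sketch of the behaviour such tests might pin down. It assumes that the coercion rule casts literal boundaries to the `ORDER BY` type and leaves special boundaries untouched; that second part is an assumption, not something the diff confirms. All names here (`FrameCoercionSketch`, `Lit`, `CastTo`, `coerce`) are hypothetical:

```scala
// Hypothetical model of the coercion idea; it is not Spark's WindowFrameCoercion rule
// and it ignores the "cannot cast" case covered by the second test above.
object FrameCoercionSketch {
  sealed trait DType
  case object IntT extends DType
  case object LongT extends DType

  sealed trait Bound
  case object CurrentRow extends Bound // special boundary, carries no value to cast
  final case class Lit(value: Long, dtype: DType) extends Bound
  final case class CastTo(child: Bound, dtype: DType) extends Bound

  // Wrap literal bounds whose type differs from the ORDER BY type;
  // leave everything else, including special boundaries, as it is.
  def coerce(b: Bound, orderType: DType): Bound = b match {
    case l @ Lit(_, t) if t != orderType => CastTo(l, orderType)
    case other => other
  }
}
```

In this model an `Int` literal ordered by a `Long` column gains a cast, a `Long` literal is unchanged, and `CurrentRow` passes through as is.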



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-22 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r128894802
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -106,173 +101,167 @@ case class WindowSpecReference(name: String) 
extends WindowSpec
 /**
  * The trait used to represent the type of a Window Frame.
  */
-sealed trait FrameType
+sealed trait FrameType {
+  def inputType: AbstractDataType
+  def sql: String
+}
 
 /**
- * RowFrame treats rows in a partition individually. When a 
[[ValuePreceding]]
- * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is 
considered
- * as a physical offset.
+ * RowFrame treats rows in a partition individually. Values used in a row 
frame are considered
+ * to be physical offsets.
  * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 
3-row frame,
  * from the row that precedes the current row to the row that follows the 
current row.
  */
-case object RowFrame extends FrameType
+case object RowFrame extends FrameType {
+  override def inputType: AbstractDataType = IntegerType
+  override def sql: String = "ROWS"
+}
 
 /**
- * RangeFrame treats rows in a partition as groups of peers.
- * All rows having the same `ORDER BY` ordering are considered as peers.
- * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its 
[[FrameBoundary]],
- * the value is considered as a logical offset.
+ * RangeFrame treats rows in a partition as groups of peers. All rows 
having the same `ORDER BY`
+ * ordering are considered as peers. Values used in a range frame are 
considered to be logical
+ * offsets.
  * For example, assuming the value of the current row's `ORDER BY` 
expression `expr` is `v`,
  * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame 
containing rows whose values
  * `expr` are in the range of [v-1, v+1].
  *
  * If `ORDER BY` clause is not defined, all rows in the partition are 
considered as peers
  * of the current row.
  */
-case object RangeFrame extends FrameType
+case object RangeFrame extends FrameType {
+  override def inputType: AbstractDataType = 
TypeCollection.NumericAndInterval
+  override def sql: String = "RANGE"
+}
 
 /**
- * The trait used to represent the type of a Window Frame Boundary.
+ * The trait used to represent special boundaries used in a window frame.
  */
-sealed trait FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean
+sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = NullType
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
 }
 
+/** UNBOUNDED boundary. */
+case object Unbounded extends SpecialFrameBoundary
+
+/** CURRENT ROW boundary. */
+case object CurrentRow extends SpecialFrameBoundary
+
 /**
- * Extractor for making working with frame boundaries easier.
+ * Represents a window frame.
  */
-object FrameBoundary {
-  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
-  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
-case CurrentRow => Some(0)
-case ValuePreceding(offset) => Some(-offset)
-case ValueFollowing(offset) => Some(offset)
-case _ => None
-  }
+sealed trait WindowFrame extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = throw new 
UnsupportedOperationException("dataType")
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
 }
 
-/** UNBOUNDED PRECEDING boundary. */
-case object UnboundedPreceding extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => true
-case vp: ValuePreceding => true
-case CurrentRow => true
-case vf: ValueFollowing => true
-case UnboundedFollowing => true
-  }
+/** Used as a placeholder when a frame specification is not defined. */
+case object UnspecifiedFrame extends WindowFrame
 
-  override def toString: String = "UNBOUNDED PRECEDING"
-}
+/**
+ * A specified Window Frame. The val lower/uppper can be either a foldable 
[[Expression]] or a
+ * [[SpecialFrameBoundary]].
+ */
+case class SpecifiedWindowFrame(
+frameType: FrameType,
+lower: Expression,
+upper: Expression)
+  extends WindowFrame {
 
-/**  PRECEDING boundary. */
-case class ValuePreceding(value: Int) extends FrameBoundary {
-  def
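
The physical versus logical offset distinction described in the `RowFrame` and `RangeFrame` doc comments above can be sketched on a plain collection. This is only an illustration over a sorted `Vector[Int]`; the object and method names are made up for the example and are not part of Spark.

```scala
object FrameSketch {
  // Values of the ORDER BY expression, already sorted ascending.
  val ordered: Vector[Int] = Vector(1, 2, 2, 5, 6)

  // ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING: physical offsets around index i.
  def rowsFrame(i: Int): Vector[Int] =
    ordered.slice(math.max(i - 1, 0), math.min(i + 2, ordered.length))

  // RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING: every row whose value is in [v-1, v+1],
  // where v is the ORDER BY value of the current row.
  def rangeFrame(i: Int): Vector[Int] = {
    val v = ordered(i)
    ordered.filter(x => x >= v - 1 && x <= v + 1)
  }
}
```

For the row at index 3 (value 5), the row frame holds the three neighbouring rows while the range frame holds only the rows whose values fall in [4, 6].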

[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-22 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r128894791
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -106,173 +101,167 @@ case class WindowSpecReference(name: String) 
extends WindowSpec
 /**
  * The trait used to represent the type of a Window Frame.
  */
-sealed trait FrameType
+sealed trait FrameType {
+  def inputType: AbstractDataType
+  def sql: String
+}
 
 /**
- * RowFrame treats rows in a partition individually. When a 
[[ValuePreceding]]
- * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is 
considered
- * as a physical offset.
+ * RowFrame treats rows in a partition individually. Values used in a row 
frame are considered
+ * to be physical offsets.
  * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 
3-row frame,
  * from the row that precedes the current row to the row that follows the 
current row.
  */
-case object RowFrame extends FrameType
+case object RowFrame extends FrameType {
+  override def inputType: AbstractDataType = IntegerType
+  override def sql: String = "ROWS"
+}
 
 /**
- * RangeFrame treats rows in a partition as groups of peers.
- * All rows having the same `ORDER BY` ordering are considered as peers.
- * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its 
[[FrameBoundary]],
- * the value is considered as a logical offset.
+ * RangeFrame treats rows in a partition as groups of peers. All rows 
having the same `ORDER BY`
+ * ordering are considered as peers. Values used in a range frame are 
considered to be logical
+ * offsets.
  * For example, assuming the value of the current row's `ORDER BY` 
expression `expr` is `v`,
  * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame 
containing rows whose values
  * `expr` are in the range of [v-1, v+1].
  *
  * If `ORDER BY` clause is not defined, all rows in the partition are 
considered as peers
  * of the current row.
  */
-case object RangeFrame extends FrameType
+case object RangeFrame extends FrameType {
+  override def inputType: AbstractDataType = 
TypeCollection.NumericAndInterval
+  override def sql: String = "RANGE"
+}
 
 /**
- * The trait used to represent the type of a Window Frame Boundary.
+ * The trait used to represent special boundaries used in a window frame.
  */
-sealed trait FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean
+sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = NullType
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
 }
 
+/** UNBOUNDED boundary. */
+case object Unbounded extends SpecialFrameBoundary
+
+/** CURRENT ROW boundary. */
+case object CurrentRow extends SpecialFrameBoundary
+
 /**
- * Extractor for making working with frame boundaries easier.
+ * Represents a window frame.
  */
-object FrameBoundary {
-  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
-  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
-case CurrentRow => Some(0)
-case ValuePreceding(offset) => Some(-offset)
-case ValueFollowing(offset) => Some(offset)
-case _ => None
-  }
+sealed trait WindowFrame extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = throw new 
UnsupportedOperationException("dataType")
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
 }
 
-/** UNBOUNDED PRECEDING boundary. */
-case object UnboundedPreceding extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => true
-case vp: ValuePreceding => true
-case CurrentRow => true
-case vf: ValueFollowing => true
-case UnboundedFollowing => true
-  }
+/** Used as a placeholder when a frame specification is not defined. */
+case object UnspecifiedFrame extends WindowFrame
 
-  override def toString: String = "UNBOUNDED PRECEDING"
-}
+/**
+ * A specified Window Frame. The val lower/uppper can be either a foldable 
[[Expression]] or a
+ * [[SpecialFrameBoundary]].
+ */
+case class SpecifiedWindowFrame(
+frameType: FrameType,
+lower: Expression,
+upper: Expression)
+  extends WindowFrame {
 
-/**  PRECEDING boundary. */
-case class ValuePreceding(value: Int) extends FrameBoundary {
-  def

[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-22 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r128894779
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -106,173 +101,167 @@ case class WindowSpecReference(name: String) 
extends WindowSpec
 /**
  * The trait used to represent the type of a Window Frame.
  */
-sealed trait FrameType
+sealed trait FrameType {
+  def inputType: AbstractDataType
+  def sql: String
+}
 
 /**
- * RowFrame treats rows in a partition individually. When a 
[[ValuePreceding]]
- * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is 
considered
- * as a physical offset.
+ * RowFrame treats rows in a partition individually. Values used in a row 
frame are considered
+ * to be physical offsets.
  * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 
3-row frame,
  * from the row that precedes the current row to the row that follows the 
current row.
  */
-case object RowFrame extends FrameType
+case object RowFrame extends FrameType {
+  override def inputType: AbstractDataType = IntegerType
+  override def sql: String = "ROWS"
+}
 
 /**
- * RangeFrame treats rows in a partition as groups of peers.
- * All rows having the same `ORDER BY` ordering are considered as peers.
- * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its 
[[FrameBoundary]],
- * the value is considered as a logical offset.
+ * RangeFrame treats rows in a partition as groups of peers. All rows 
having the same `ORDER BY`
+ * ordering are considered as peers. Values used in a range frame are 
considered to be logical
+ * offsets.
  * For example, assuming the value of the current row's `ORDER BY` 
expression `expr` is `v`,
  * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame 
containing rows whose values
  * `expr` are in the range of [v-1, v+1].
  *
  * If `ORDER BY` clause is not defined, all rows in the partition are 
considered as peers
  * of the current row.
  */
-case object RangeFrame extends FrameType
+case object RangeFrame extends FrameType {
+  override def inputType: AbstractDataType = 
TypeCollection.NumericAndInterval
+  override def sql: String = "RANGE"
+}
 
 /**
- * The trait used to represent the type of a Window Frame Boundary.
+ * The trait used to represent special boundaries used in a window frame.
  */
-sealed trait FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean
+sealed trait SpecialFrameBoundary extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = NullType
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
 }
 
+/** UNBOUNDED boundary. */
+case object Unbounded extends SpecialFrameBoundary
+
+/** CURRENT ROW boundary. */
+case object CurrentRow extends SpecialFrameBoundary
+
 /**
- * Extractor for making working with frame boundaries easier.
+ * Represents a window frame.
  */
-object FrameBoundary {
-  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
-  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
-case CurrentRow => Some(0)
-case ValuePreceding(offset) => Some(-offset)
-case ValueFollowing(offset) => Some(offset)
-case _ => None
-  }
+sealed trait WindowFrame extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = throw new 
UnsupportedOperationException("dataType")
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
 }
 
-/** UNBOUNDED PRECEDING boundary. */
-case object UnboundedPreceding extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => true
-case vp: ValuePreceding => true
-case CurrentRow => true
-case vf: ValueFollowing => true
-case UnboundedFollowing => true
-  }
+/** Used as a placeholder when a frame specification is not defined. */
+case object UnspecifiedFrame extends WindowFrame
 
-  override def toString: String = "UNBOUNDED PRECEDING"
-}
+/**
+ * A specified Window Frame. The val lower/uppper can be either a foldable 
[[Expression]] or a
+ * [[SpecialFrameBoundary]].
+ */
+case class SpecifiedWindowFrame(
+frameType: FrameType,
+lower: Expression,
+upper: Expression)
+  extends WindowFrame {
 
-/**  PRECEDING boundary. */
-case class ValuePreceding(value: Int) extends FrameBoundary {
-  def

[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-22 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r128894061
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -43,58 +42,54 @@ case class WindowSpecDefinition(
 orderSpec: Seq[SortOrder],
 frameSpecification: WindowFrame) extends Expression with WindowSpec 
with Unevaluable {
 
-  def validate: Option[String] = frameSpecification match {
-case UnspecifiedFrame =>
-  Some("Found a UnspecifiedFrame. It should be converted to a 
SpecifiedWindowFrame " +
-"during analysis. Please file a bug report.")
-case frame: SpecifiedWindowFrame => frame.validate.orElse {
-  def checkValueBasedBoundaryForRangeFrame(): Option[String] = {
-if (orderSpec.length > 1)  {
-  // It is not allowed to have a value-based PRECEDING and 
FOLLOWING
-  // as the boundary of a Range Window Frame.
-  Some("This Range Window Frame only accepts at most one ORDER BY 
expression.")
-} else if (orderSpec.nonEmpty && 
!orderSpec.head.dataType.isInstanceOf[NumericType]) {
-  Some("The data type of the expression in the ORDER BY clause 
should be a numeric type.")
-} else {
-  None
-}
-  }
-
-  (frame.frameType, frame.frameStart, frame.frameEnd) match {
-case (RangeFrame, vp: ValuePreceding, _) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, vf: ValueFollowing, _) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, _, vp: ValuePreceding) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, _, vf: ValueFollowing) => 
checkValueBasedBoundaryForRangeFrame()
-case (_, _, _) => None
-  }
-}
-  }
-
-  override def children: Seq[Expression] = partitionSpec ++ orderSpec
+  override def children: Seq[Expression] = partitionSpec ++ orderSpec ++ Seq(frameSpecification)
--- End diff --

nit: `partitionSpec ++ orderSpec :+ frameSpecification`
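
For reference, a small sketch showing that the two spellings build the same sequence. Plain strings stand in for Catalyst expressions here, and the object name is hypothetical. Scala ranks infix operators by their first character, and `+` binds tighter than `:`, so the suggested form parses as `(partitionSpec ++ orderSpec) :+ frameSpecification`.

```scala
object ChildrenSketch {
  // Plain strings stand in for Catalyst expressions in this sketch.
  val partitionSpec: Seq[String] = Seq("p1", "p2")
  val orderSpec: Seq[String] = Seq("o1")
  val frameSpecification: String = "frame"

  // Form used in the diff: wrap the single element in a Seq and concatenate.
  val viaConcat: Seq[String] = partitionSpec ++ orderSpec ++ Seq(frameSpecification)

  // Form suggested in the nit: append the single element directly with :+.
  val viaAppend: Seq[String] = partitionSpec ++ orderSpec :+ frameSpecification
}
```

The `:+` form avoids allocating a one-element `Seq` just to concatenate it.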



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-14 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r127453702
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -106,173 +105,164 @@ case class WindowSpecReference(name: String) 
extends WindowSpec
 /**
  * The trait used to represent the type of a Window Frame.
  */
-sealed trait FrameType
+sealed trait FrameType {
+  def inputType: AbstractDataType
+  def sql: String
+}
 
 /**
- * RowFrame treats rows in a partition individually. When a 
[[ValuePreceding]]
- * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is 
considered
- * as a physical offset.
+ * RowFrame treats rows in a partition individually. Values used in a row 
frame are considered
+ * to be physical offsets.
  * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 
3-row frame,
  * from the row that precedes the current row to the row that follows the 
current row.
  */
-case object RowFrame extends FrameType
+case object RowFrame extends FrameType {
+  override def inputType: AbstractDataType = IntegerType
+  override def sql: String = "ROWS"
+}
 
 /**
- * RangeFrame treats rows in a partition as groups of peers.
- * All rows having the same `ORDER BY` ordering are considered as peers.
- * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its 
[[FrameBoundary]],
- * the value is considered as a logical offset.
+ * RangeFrame treats rows in a partition as groups of peers. All rows 
having the same `ORDER BY`
+ * ordering are considered as peers. Values used in a range frame are 
considered to be logical
+ * offsets.
  * For example, assuming the value of the current row's `ORDER BY` 
expression `expr` is `v`,
  * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame 
containing rows whose values
  * `expr` are in the range of [v-1, v+1].
  *
  * If `ORDER BY` clause is not defined, all rows in the partition are 
considered as peers
  * of the current row.
  */
-case object RangeFrame extends FrameType
+case object RangeFrame extends FrameType {
+  override def inputType: AbstractDataType = 
TypeCollection.NumericAndInterval
+  override def sql: String = "RANGE"
+}
 
 /**
- * The trait used to represent the type of a Window Frame Boundary.
+ * The trait used to represent special boundaries used in a window frame.
  */
-sealed trait FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean
-}
+sealed trait SpecialFrameBoundary
+
+/** UNBOUNDED boundary. */
+case object Unbounded extends SpecialFrameBoundary
+
+/** CURRENT ROW boundary. */
+case object CurrentRow extends SpecialFrameBoundary
 
 /**
- * Extractor for making working with frame boundaries easier.
+ * Represents a window frame.
  */
-object FrameBoundary {
-  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
-  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
-case CurrentRow => Some(0)
-case ValuePreceding(offset) => Some(-offset)
-case ValueFollowing(offset) => Some(offset)
-case _ => None
-  }
+sealed trait WindowFrame extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = throw new 
UnsupportedOperationException("dataType")
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
 }
 
-/** UNBOUNDED PRECEDING boundary. */
-case object UnboundedPreceding extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => true
-case vp: ValuePreceding => true
-case CurrentRow => true
-case vf: ValueFollowing => true
-case UnboundedFollowing => true
-  }
+/** Used as a placeholder when a frame specification is not defined. */
+case object UnspecifiedFrame extends WindowFrame
 
-  override def toString: String = "UNBOUNDED PRECEDING"
-}
+/**
+ * A specified Window Frame. The val lower/uppper can be either a foldable 
[[Expression]] or a
+ * [[SpecialFrameBoundary]].
--- End diff --

Sure.



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-14 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r127451425
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -106,173 +105,164 @@ case class WindowSpecReference(name: String) 
extends WindowSpec
 /**
  * The trait used to represent the type of a Window Frame.
  */
-sealed trait FrameType
+sealed trait FrameType {
+  def inputType: AbstractDataType
+  def sql: String
+}
 
 /**
- * RowFrame treats rows in a partition individually. When a 
[[ValuePreceding]]
- * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is 
considered
- * as a physical offset.
+ * RowFrame treats rows in a partition individually. Values used in a row 
frame are considered
+ * to be physical offsets.
  * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 
3-row frame,
  * from the row that precedes the current row to the row that follows the 
current row.
  */
-case object RowFrame extends FrameType
+case object RowFrame extends FrameType {
+  override def inputType: AbstractDataType = IntegerType
+  override def sql: String = "ROWS"
+}
 
 /**
- * RangeFrame treats rows in a partition as groups of peers.
- * All rows having the same `ORDER BY` ordering are considered as peers.
- * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its 
[[FrameBoundary]],
- * the value is considered as a logical offset.
+ * RangeFrame treats rows in a partition as groups of peers. All rows 
having the same `ORDER BY`
+ * ordering are considered as peers. Values used in a range frame are 
considered to be logical
+ * offsets.
  * For example, assuming the value of the current row's `ORDER BY` 
expression `expr` is `v`,
  * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame 
containing rows whose values
  * `expr` are in the range of [v-1, v+1].
  *
  * If `ORDER BY` clause is not defined, all rows in the partition are 
considered as peers
  * of the current row.
  */
-case object RangeFrame extends FrameType
+case object RangeFrame extends FrameType {
+  override def inputType: AbstractDataType = 
TypeCollection.NumericAndInterval
+  override def sql: String = "RANGE"
+}
 
 /**
- * The trait used to represent the type of a Window Frame Boundary.
+ * The trait used to represent special boundaries used in a window frame.
  */
-sealed trait FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean
-}
+sealed trait SpecialFrameBoundary
+
+/** UNBOUNDED boundary. */
+case object Unbounded extends SpecialFrameBoundary
+
+/** CURRENT ROW boundary. */
+case object CurrentRow extends SpecialFrameBoundary
 
 /**
- * Extractor for making working with frame boundaries easier.
+ * Represents a window frame.
  */
-object FrameBoundary {
-  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
-  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
-case CurrentRow => Some(0)
-case ValuePreceding(offset) => Some(-offset)
-case ValueFollowing(offset) => Some(offset)
-case _ => None
-  }
+sealed trait WindowFrame extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = throw new 
UnsupportedOperationException("dataType")
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
 }
 
-/** UNBOUNDED PRECEDING boundary. */
-case object UnboundedPreceding extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => true
-case vp: ValuePreceding => true
-case CurrentRow => true
-case vf: ValueFollowing => true
-case UnboundedFollowing => true
-  }
+/** Used as a placeholder when a frame specification is not defined. */
+case object UnspecifiedFrame extends WindowFrame
 
-  override def toString: String = "UNBOUNDED PRECEDING"
-}
+/**
+ * A specified Window Frame. The val lower/uppper can be either a foldable 
[[Expression]] or a
+ * [[SpecialFrameBoundary]].
--- End diff --

Can we do this in `WindowFrameCoercion`?



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-14 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r127449331
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -106,173 +105,164 @@ case class WindowSpecReference(name: String) 
extends WindowSpec
 /**
  * The trait used to represent the type of a Window Frame.
  */
-sealed trait FrameType
+sealed trait FrameType {
+  def inputType: AbstractDataType
+  def sql: String
+}
 
 /**
- * RowFrame treats rows in a partition individually. When a 
[[ValuePreceding]]
- * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is 
considered
- * as a physical offset.
+ * RowFrame treats rows in a partition individually. Values used in a row 
frame are considered
+ * to be physical offsets.
  * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 
3-row frame,
  * from the row that precedes the current row to the row that follows the 
current row.
  */
-case object RowFrame extends FrameType
+case object RowFrame extends FrameType {
+  override def inputType: AbstractDataType = IntegerType
+  override def sql: String = "ROWS"
+}
 
 /**
- * RangeFrame treats rows in a partition as groups of peers.
- * All rows having the same `ORDER BY` ordering are considered as peers.
- * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its 
[[FrameBoundary]],
- * the value is considered as a logical offset.
+ * RangeFrame treats rows in a partition as groups of peers. All rows 
having the same `ORDER BY`
+ * ordering are considered as peers. Values used in a range frame are 
considered to be logical
+ * offsets.
  * For example, assuming the value of the current row's `ORDER BY` 
expression `expr` is `v`,
  * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame 
containing rows whose values
  * `expr` are in the range of [v-1, v+1].
  *
  * If `ORDER BY` clause is not defined, all rows in the partition are 
considered as peers
  * of the current row.
  */
-case object RangeFrame extends FrameType
+case object RangeFrame extends FrameType {
+  override def inputType: AbstractDataType = 
TypeCollection.NumericAndInterval
+  override def sql: String = "RANGE"
+}
 
 /**
- * The trait used to represent the type of a Window Frame Boundary.
+ * The trait used to represent special boundaries used in a window frame.
  */
-sealed trait FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean
-}
+sealed trait SpecialFrameBoundary
+
+/** UNBOUNDED boundary. */
+case object Unbounded extends SpecialFrameBoundary
+
+/** CURRENT ROW boundary. */
+case object CurrentRow extends SpecialFrameBoundary
 
 /**
- * Extractor for making working with frame boundaries easier.
+ * Represents a window frame.
  */
-object FrameBoundary {
-  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
-  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
-case CurrentRow => Some(0)
-case ValuePreceding(offset) => Some(-offset)
-case ValueFollowing(offset) => Some(offset)
-case _ => None
-  }
+sealed trait WindowFrame extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = throw new 
UnsupportedOperationException("dataType")
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
 }
 
-/** UNBOUNDED PRECEDING boundary. */
-case object UnboundedPreceding extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => true
-case vp: ValuePreceding => true
-case CurrentRow => true
-case vf: ValueFollowing => true
-case UnboundedFollowing => true
-  }
+/** Used as a placeholder when a frame specification is not defined. */
+case object UnspecifiedFrame extends WindowFrame
 
-  override def toString: String = "UNBOUNDED PRECEDING"
-}
+/**
+ * A specified Window Frame. The val lower/uppper can be either a foldable 
[[Expression]] or a
+ * [[SpecialFrameBoundary]].
--- End diff --

I tried that. The problem is that the special boundaries then need a proper
data type. I tried making them `case object .. {}` with a null data type, but
they ended up being replaced with a null literal.

All I am saying is that this will require a bit more coding, since you need
to resolve the data type of the boundary.



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-13 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r127143764
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -43,57 +42,57 @@ case class WindowSpecDefinition(
 orderSpec: Seq[SortOrder],
 frameSpecification: WindowFrame) extends Expression with WindowSpec 
with Unevaluable {
 
-  def validate: Option[String] = frameSpecification match {
-case UnspecifiedFrame =>
-  Some("Found a UnspecifiedFrame. It should be converted to a 
SpecifiedWindowFrame " +
-"during analysis. Please file a bug report.")
-case frame: SpecifiedWindowFrame => frame.validate.orElse {
-  def checkValueBasedBoundaryForRangeFrame(): Option[String] = {
-if (orderSpec.length > 1)  {
-  // It is not allowed to have a value-based PRECEDING and 
FOLLOWING
-  // as the boundary of a Range Window Frame.
-  Some("This Range Window Frame only accepts at most one ORDER BY 
expression.")
-} else if (orderSpec.nonEmpty && 
!orderSpec.head.dataType.isInstanceOf[NumericType]) {
-  Some("The data type of the expression in the ORDER BY clause 
should be a numeric type.")
-} else {
-  None
-}
-  }
-
-  (frame.frameType, frame.frameStart, frame.frameEnd) match {
-case (RangeFrame, vp: ValuePreceding, _) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, vf: ValueFollowing, _) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, _, vp: ValuePreceding) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, _, vf: ValueFollowing) => 
checkValueBasedBoundaryForRangeFrame()
-case (_, _, _) => None
-  }
-}
-  }
-
-  override def children: Seq[Expression] = partitionSpec ++ orderSpec
+  override def children: Seq[Expression] = partitionSpec ++ orderSpec ++ 
Seq(frameSpecification)
 
   override lazy val resolved: Boolean =
 childrenResolved && checkInputDataTypes().isSuccess &&
   frameSpecification.isInstanceOf[SpecifiedWindowFrame]
 
   override def nullable: Boolean = true
   override def foldable: Boolean = false
-  override def dataType: DataType = throw new UnsupportedOperationException
+  override def dataType: DataType = throw new 
UnsupportedOperationException("dataType")
 
-  override def sql: String = {
-val partition = if (partitionSpec.isEmpty) {
-  ""
-} else {
-  "PARTITION BY " + partitionSpec.map(_.sql).mkString(", ") + " "
+  override def checkInputDataTypes(): TypeCheckResult = {
+frameSpecification match {
+  case UnspecifiedFrame =>
+TypeCheckFailure(
+  "Cannot use an UnspecifiedFrame. This should have been converted 
during analysis. " +
+"Please file a bug report.")
+  case f: SpecifiedWindowFrame if f.frameType == RangeFrame && 
!f.isUnbounded
+&& orderSpec.isEmpty =>
+TypeCheckFailure(
+  "A range window frame cannot be used in an unordered window 
specification.")
+  case f: SpecifiedWindowFrame if f.frameType == RangeFrame && 
f.isValueBound
+&& orderSpec.size > 1 =>
+TypeCheckFailure(
+  s"A range window frame with value boundaries cannot be used in a 
window specification " +
+s"with multiple order by expressions: 
${orderSpec.mkString(",")}")
+  case f: SpecifiedWindowFrame if f.frameType == RangeFrame && 
f.isValueBound
+&& !isValidFrameType(f.children.head.dataType) =>
+TypeCheckFailure(
+  s"The data type '${orderSpec.head.dataType}' used in the order 
specification does " +
+s"not match the data type '${f.children.head.dataType}' which 
is used in the " +
+"range frame.")
+  case _ => TypeCheckSuccess
 }
+  }
 
-val order = if (orderSpec.isEmpty) {
-  ""
-} else {
-  "ORDER BY " + orderSpec.map(_.sql).mkString(", ") + " "
+  override def sql: String = {
+def toSql(exprs: Seq[Expression], prefix: String): Seq[String] = {
+  Seq(exprs).filter(_.nonEmpty).map(_.map(_.sql).mkString(prefix, ", 
", ""))
 }
 
-s"($partition$order${frameSpecification.toString})"
+val elements =
+  toSql(partitionSpec, "PARTITION BY ") ++
+toSql(orderSpec, "ORDER BY ") ++
+Seq(frameSpecification.sql)
+elements.mkString("(", " ", ")")
+  }
+
+  private def isValidFrameType(ft: DataType): Boolean = 
(orderSpec.head.dataType, ft) match {
+case (DateType, IntegerType) => true
+case (TimestampType,

[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-13 Thread cloud-fan

Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r127142649
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -106,173 +105,164 @@ case class WindowSpecReference(name: String) 
extends WindowSpec
 /**
  * The trait used to represent the type of a Window Frame.
  */
-sealed trait FrameType
+sealed trait FrameType {
+  def inputType: AbstractDataType
+  def sql: String
+}
 
 /**
- * RowFrame treats rows in a partition individually. When a 
[[ValuePreceding]]
- * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is 
considered
- * as a physical offset.
+ * RowFrame treats rows in a partition individually. Values used in a row 
frame are considered
+ * to be physical offsets.
  * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 
3-row frame,
  * from the row that precedes the current row to the row that follows the 
current row.
  */
-case object RowFrame extends FrameType
+case object RowFrame extends FrameType {
+  override def inputType: AbstractDataType = IntegerType
+  override def sql: String = "ROWS"
+}
 
 /**
- * RangeFrame treats rows in a partition as groups of peers.
- * All rows having the same `ORDER BY` ordering are considered as peers.
- * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its 
[[FrameBoundary]],
- * the value is considered as a logical offset.
+ * RangeFrame treats rows in a partition as groups of peers. All rows 
having the same `ORDER BY`
+ * ordering are considered as peers. Values used in a range frame are 
considered to be logical
+ * offsets.
  * For example, assuming the value of the current row's `ORDER BY` 
expression `expr` is `v`,
  * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame 
containing rows whose values
  * `expr` are in the range of [v-1, v+1].
  *
  * If `ORDER BY` clause is not defined, all rows in the partition are 
considered as peers
  * of the current row.
  */
-case object RangeFrame extends FrameType
+case object RangeFrame extends FrameType {
+  override def inputType: AbstractDataType = 
TypeCollection.NumericAndInterval
+  override def sql: String = "RANGE"
+}
 
 /**
- * The trait used to represent the type of a Window Frame Boundary.
+ * The trait used to represent special boundaries used in a window frame.
  */
-sealed trait FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean
-}
+sealed trait SpecialFrameBoundary
+
+/** UNBOUNDED boundary. */
+case object Unbounded extends SpecialFrameBoundary
+
+/** CURRENT ROW boundary. */
+case object CurrentRow extends SpecialFrameBoundary
 
 /**
- * Extractor for making working with frame boundaries easier.
+ * Represents a window frame.
  */
-object FrameBoundary {
-  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
-  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
-case CurrentRow => Some(0)
-case ValuePreceding(offset) => Some(-offset)
-case ValueFollowing(offset) => Some(offset)
-case _ => None
-  }
+sealed trait WindowFrame extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = throw new 
UnsupportedOperationException("dataType")
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
 }
 
-/** UNBOUNDED PRECEDING boundary. */
-case object UnboundedPreceding extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => true
-case vp: ValuePreceding => true
-case CurrentRow => true
-case vf: ValueFollowing => true
-case UnboundedFollowing => true
-  }
+/** Used as a placeholder when a frame specification is not defined. */
+case object UnspecifiedFrame extends WindowFrame
 
-  override def toString: String = "UNBOUNDED PRECEDING"
-}
+/**
+ * A specified Window Frame. The val lower/uppper can be either a foldable 
[[Expression]] or a
+ * [[SpecialFrameBoundary]].
--- End diff --

Can we make `SpecialFrameBoundary` an expression?
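
The Scaladoc quoted in this diff distinguishes physical offsets (`ROWS`) from logical offsets (`RANGE`). A minimal plain-Scala sketch of that difference, under the assumption of a single ascending integer ordering column, is shown here. The object and method names (`RowsVsRangeSketch`, `rowsFrame`, `rangeFrame`) are hypothetical and are not Spark APIs.

```scala
object RowsVsRangeSketch {
  // `sorted` must be in ascending order; `i` is the index of the current row.

  // ROWS frame: a physical offset, counted in rows around position `i`.
  def rowsFrame(sorted: Vector[Int], i: Int, preceding: Int, following: Int): Vector[Int] =
    sorted.slice(math.max(0, i - preceding), math.min(sorted.length, i + following + 1))

  // RANGE frame: a logical offset, applied to the ordering value `v` of row `i`,
  // so the frame holds every row whose value lies in [v - preceding, v + following].
  def rangeFrame(sorted: Vector[Int], i: Int, preceding: Int, following: Int): Vector[Int] = {
    val v = sorted(i)
    sorted.filter(x => x >= v - preceding && x <= v + following)
  }
}
```

For the values `Vector(1, 2, 2, 4, 7)` and the current row at index 3 (value 4), the row frame with one preceding and one following row contains `2, 4, 7`, while the range frame over [3, 5] contains only `4`.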


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail:

[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-11 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r126680805
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
 ---
@@ -805,4 +806,24 @@ object TypeCoercion {
   Option(ret)
 }
   }
+
+  /**
+   * Cast WindowFrame boundaries to the type they operate upon.
+   */
+  object WindowFrameCoercion extends Rule[LogicalPlan] {
--- End diff --

Hmm, that would be kind of weird: a user would get type coercion in their 
select but not in the range clause.



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-07 Thread jiangxb1987

Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r126187356
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala ---
@@ -174,28 +191,22 @@ class WindowSpec private[sql](
*/
   // Note: when updating the doc for this method, also update 
Window.rangeBetween.
   def rangeBetween(start: Long, end: Long): WindowSpec = {
--- End diff --

Let's keep it out of scope for this PR and address it in a follow-up.



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-06 Thread rxin

Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r126016128
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala ---
@@ -174,28 +191,22 @@ class WindowSpec private[sql](
*/
   // Note: when updating the doc for this method, also update 
Window.rangeBetween.
   def rangeBetween(start: Long, end: Long): WindowSpec = {
--- End diff --

I was trying to avoid introducing a special value, but maybe you can do 
that.

How important is it to fix this?




[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-06 Thread rxin

Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r126016260
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala
 ---
@@ -805,4 +806,24 @@ object TypeCoercion {
   Option(ret)
 }
   }
+
+  /**
+   * Cast WindowFrame boundaries to the type they operate upon.
+   */
+  object WindowFrameCoercion extends Rule[LogicalPlan] {
--- End diff --

Do we really need this? Can we just require strict types?




[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-06 Thread jiangxb1987

Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r125901045
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala
 ---
@@ -151,6 +151,48 @@ class DataFrameWindowFunctionsSuite extends QueryTest 
with SharedSQLContext {
 Row(2.0d), Row(2.0d)))
   }
 
+  test("row between should accept integer values as boundary") {
--- End diff --

We can only add the test cases after we have finalized the API change.



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-06 Thread jiangxb1987

Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r125900833
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala ---
@@ -174,28 +191,22 @@ class WindowSpec private[sql](
*/
   // Note: when updating the doc for this method, also update 
Window.rangeBetween.
   def rangeBetween(start: Long, end: Long): WindowSpec = {
--- End diff --

Maybe we can use `Literal(0)` to represent `CurrentRow`, and a sufficiently 
large number (like `Literal(Long.MaxValue)`) for `Unbounded`?
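
One way to picture the sentinel idea in this comment is the plain-Scala sketch here. It is only an illustration: `Boundary`, `Offset`, and `fromLong` are hypothetical names, and reserving `Long.MinValue` for the preceding side is an added assumption, since the comment only mentions `0` and `Long.MaxValue`.

```scala
object SentinelBoundaries {
  sealed trait Boundary
  case object UnboundedPreceding extends Boundary
  case object UnboundedFollowing extends Boundary
  case object CurrentRow extends Boundary
  final case class Offset(value: Long) extends Boundary

  // Interpret a raw Long boundary, treating a few reserved values as special.
  def fromLong(value: Long): Boundary = value match {
    case 0L            => CurrentRow
    case Long.MinValue => UnboundedPreceding // assumption: not stated in the comment
    case Long.MaxValue => UnboundedFollowing
    case other         => Offset(other)
  }
}
```

The trade-off of this design is visible in `fromLong`: a genuine offset equal to one of the reserved values can no longer be expressed, which is the kind of special value the earlier comments were trying to avoid.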



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-05 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r125735576
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala ---
@@ -174,28 +191,22 @@ class WindowSpec private[sql](
*/
   // Note: when updating the doc for this method, also update 
Window.rangeBetween.
   def rangeBetween(start: Long, end: Long): WindowSpec = {
--- End diff --

It might be easiest to make it take a `Column`, i.e. `rangeBetween(begin: 
Column, end: Column)`. The only downside is that we need some way to express 
the special boundaries (`current row`, `unbounded`). Also cc @rxin



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-05 Thread jiangxb1987

Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r125688475
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala
 ---
@@ -151,6 +151,48 @@ class DataFrameWindowFunctionsSuite extends QueryTest 
with SharedSQLContext {
 Row(2.0d), Row(2.0d)))
   }
 
+  test("row between should accept integer values as boundary") {
--- End diff --

Will do this tomorrow.



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-05 Thread jiangxb1987

Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r125688053
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala ---
@@ -174,28 +191,22 @@ class WindowSpec private[sql](
*/
   // Note: when updating the doc for this method, also update 
Window.rangeBetween.
   def rangeBetween(start: Long, end: Long): WindowSpec = {
--- End diff --

I totally agree that's what we should do, but I'd suggest we address it in 
follow-up work and focus this PR on resolving the overflow issue with Long 
frame boundaries in `rangeBetween`.

One major concern is that the `WindowSpec` API is marked `Stable`, so I'm 
wondering what the proper procedure is for changing this interface.



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-05 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r125602833
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExec.scala 
---
@@ -109,46 +109,54 @@ case class WindowExec(
*
* This method uses Code Generation. It can only be used on the executor 
side.
*
-   * @param frameType to evaluate. This can either be Row or Range based.
-   * @param offset with respect to the row.
+   * @param frame to evaluate. This can either be a Row or Range frame.
+   * @param bound with respect to the row.
* @return a bound ordering object.
*/
-  private[this] def createBoundOrdering(frameType: FrameType, offset: 
Int): BoundOrdering = {
-frameType match {
-  case RangeFrame =>
-val (exprs, current, bound) = if (offset == 0) {
-  // Use the entire order expression when the offset is 0.
-  val exprs = orderSpec.map(_.child)
-  val buildProjection = () => newMutableProjection(exprs, 
child.output)
-  (orderSpec, buildProjection(), buildProjection())
-} else if (orderSpec.size == 1) {
-  // Use only the first order expression when the offset is 
non-null.
-  val sortExpr = orderSpec.head
-  val expr = sortExpr.child
-  // Create the projection which returns the current 'value'.
-  val current = newMutableProjection(expr :: Nil, child.output)
-  // Flip the sign of the offset when processing the order is 
descending
-  val boundOffset = sortExpr.direction match {
-case Descending => -offset
-case Ascending => offset
-  }
-  // Create the projection which returns the current 'value' 
modified by adding the offset.
-  val boundExpr = Add(expr, Cast(Literal.create(boundOffset, 
IntegerType), expr.dataType))
-  val bound = newMutableProjection(boundExpr :: Nil, child.output)
-  (sortExpr :: Nil, current, bound)
-} else {
-  sys.error("Non-Zero range offsets are not supported for windows 
" +
-"with multiple order expressions.")
+  private[this] def createBoundOrdering(frame: FrameType, bound: AnyRef): 
BoundOrdering = {
+(frame, bound) match {
+  case (RowFrame, CurrentRow) =>
+RowBoundOrdering(0)
+
+  case (RowFrame, IntegerLiteral(offset)) =>
+RowBoundOrdering(offset)
+
+  case (RangeFrame, CurrentRow) =>
+val ordering = newOrdering(orderSpec, child.output)
+RangeBoundOrdering(ordering, IdentityProjection, 
IdentityProjection)
+
+  case (RangeFrame, offset: Expression) if orderSpec.size == 1 =>
+// Use only the first order expression when the offset is non-null.
+val sortExpr = orderSpec.head
+val expr = sortExpr.child
+
+// Create the projection which returns the current 'value'.
+val current = newMutableProjection(expr :: Nil, child.output)
+
+// Flip the sign of the offset when processing the order is 
descending
+val boundOffset = sortExpr.direction match {
+  case Descending => UnaryMinus(offset)
+  case Ascending => offset
+}
+
+// Create the projection which returns the current 'value' 
modified by adding the offset.
+val boundExpr = (expr.dataType, boundOffset.dataType) match {
+  case (DateType, IntegerType) => DateAdd(expr, boundOffset)
--- End diff --

Both `DateType` and `TimestampType` expressions are going to need a time 
zone. I was wondering if we can use `GMT`, because these are just offset 
calculations? cc @ueshin 

If we can't, then we either need to thread through the session local 
timezone, or it might be easier to put the entire offset calculation in the 
frame.
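
As background for the time zone question, the `java.time` sketch here shows that a calendar-based offset can depend on the zone it is evaluated in. It does not use Spark's `DateAdd` or any Catalyst code; `TimeZoneOffsetSketch` and `hoursInOneDay` are hypothetical names, and whether the same effect applies to the expressions in this diff is exactly the open question.

```scala
import java.time.{Duration, LocalDateTime, ZoneId}

object TimeZoneOffsetSketch {
  // Elapsed hours when one calendar day is added to a local timestamp in `zone`.
  def hoursInOneDay(local: LocalDateTime, zone: ZoneId): Long = {
    val start = local.atZone(zone)
    val end = start.plusDays(1) // keeps the local wall-clock time
    Duration.between(start.toInstant, end.toInstant).toHours
  }
}
```

Starting from noon on 2017-03-11, adding one day spans 24 hours in `UTC` but only 23 hours in `America/Los_Angeles`, because daylight saving time began there on 2017-03-12.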



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-05 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r125603187
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala
 ---
@@ -151,6 +151,48 @@ class DataFrameWindowFunctionsSuite extends QueryTest 
with SharedSQLContext {
 Row(2.0d), Row(2.0d)))
   }
 
+  test("row between should accept integer values as boundary") {
--- End diff --

We should also add tests for dates/doubles/timestamps.



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-05 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r125598577
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -43,57 +42,57 @@ case class WindowSpecDefinition(
 orderSpec: Seq[SortOrder],
 frameSpecification: WindowFrame) extends Expression with WindowSpec 
with Unevaluable {
 
-  def validate: Option[String] = frameSpecification match {
-case UnspecifiedFrame =>
-  Some("Found a UnspecifiedFrame. It should be converted to a 
SpecifiedWindowFrame " +
-"during analysis. Please file a bug report.")
-case frame: SpecifiedWindowFrame => frame.validate.orElse {
-  def checkValueBasedBoundaryForRangeFrame(): Option[String] = {
-if (orderSpec.length > 1)  {
-  // It is not allowed to have a value-based PRECEDING and 
FOLLOWING
-  // as the boundary of a Range Window Frame.
-  Some("This Range Window Frame only accepts at most one ORDER BY 
expression.")
-} else if (orderSpec.nonEmpty && 
!orderSpec.head.dataType.isInstanceOf[NumericType]) {
-  Some("The data type of the expression in the ORDER BY clause 
should be a numeric type.")
-} else {
-  None
-}
-  }
-
-  (frame.frameType, frame.frameStart, frame.frameEnd) match {
-case (RangeFrame, vp: ValuePreceding, _) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, vf: ValueFollowing, _) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, _, vp: ValuePreceding) => 
checkValueBasedBoundaryForRangeFrame()
-case (RangeFrame, _, vf: ValueFollowing) => 
checkValueBasedBoundaryForRangeFrame()
-case (_, _, _) => None
-  }
-}
-  }
-
-  override def children: Seq[Expression] = partitionSpec ++ orderSpec
+  override def children: Seq[Expression] = partitionSpec ++ orderSpec ++ 
Seq(frameSpecification)
 
   override lazy val resolved: Boolean =
 childrenResolved && checkInputDataTypes().isSuccess &&
   frameSpecification.isInstanceOf[SpecifiedWindowFrame]
 
   override def nullable: Boolean = true
   override def foldable: Boolean = false
-  override def dataType: DataType = throw new UnsupportedOperationException
+  override def dataType: DataType = throw new 
UnsupportedOperationException("dataType")
 
-  override def sql: String = {
-val partition = if (partitionSpec.isEmpty) {
-  ""
-} else {
-  "PARTITION BY " + partitionSpec.map(_.sql).mkString(", ") + " "
+  override def checkInputDataTypes(): TypeCheckResult = {
+frameSpecification match {
+  case UnspecifiedFrame =>
+TypeCheckFailure(
+  "Cann't use an UnspecifiedFrame. This should have been converted 
during analysis. " +
--- End diff --

NIT: typo `Cannot`



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-05 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r125599886
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
 ---
@@ -106,173 +105,161 @@ case class WindowSpecReference(name: String) 
extends WindowSpec
 /**
  * The trait used to represent the type of a Window Frame.
  */
-sealed trait FrameType
+sealed trait FrameType {
+  def inputType: AbstractDataType
+  def sql: String
+}
 
 /**
- * RowFrame treats rows in a partition individually. When a 
[[ValuePreceding]]
- * or a [[ValueFollowing]] is used as its [[FrameBoundary]], the value is 
considered
- * as a physical offset.
+ * RowFrame treats rows in a partition individually. Values used in a row 
frame are considered
+ * to be physical offsets.
  * For example, `ROW BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a 
3-row frame,
  * from the row that precedes the current row to the row that follows the 
current row.
  */
-case object RowFrame extends FrameType
+case object RowFrame extends FrameType {
+  override def inputType: AbstractDataType = IntegerType
+  override def sql: String = "ROWS"
+}
 
 /**
- * RangeFrame treats rows in a partition as groups of peers.
- * All rows having the same `ORDER BY` ordering are considered as peers.
- * When a [[ValuePreceding]] or a [[ValueFollowing]] is used as its 
[[FrameBoundary]],
- * the value is considered as a logical offset.
+ * RangeFrame treats rows in a partition as groups of peers. All rows 
having the same `ORDER BY`
+ * ordering are considered as peers. Values used in a range frame are 
considered to be logical
+ * offsets.
  * For example, assuming the value of the current row's `ORDER BY` 
expression `expr` is `v`,
  * `RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING` represents a frame 
containing rows whose values
  * `expr` are in the range of [v-1, v+1].
  *
  * If `ORDER BY` clause is not defined, all rows in the partition are 
considered as peers
  * of the current row.
  */
-case object RangeFrame extends FrameType
+case object RangeFrame extends FrameType {
+  override def inputType: AbstractDataType = 
TypeCollection.NumericAndInterval
+  override def sql: String = "RANGE"
+}
 
 /**
- * The trait used to represent the type of a Window Frame Boundary.
+ * The trait used to represent special boundaries used in a window frame.
  */
-sealed trait FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean
-}
+sealed trait SpecialFrameBoundary
+
+/** UNBOUNDED boundary. */
+case object Unbounded extends SpecialFrameBoundary
+
+/** CURRENT ROW boundary. */
+case object CurrentRow extends SpecialFrameBoundary
 
 /**
- * Extractor for making working with frame boundaries easier.
+ * Represents a window frame.
  */
-object FrameBoundary {
-  def apply(boundary: FrameBoundary): Option[Int] = unapply(boundary)
-  def unapply(boundary: FrameBoundary): Option[Int] = boundary match {
-case CurrentRow => Some(0)
-case ValuePreceding(offset) => Some(-offset)
-case ValueFollowing(offset) => Some(offset)
-case _ => None
-  }
+sealed trait WindowFrame extends Expression with Unevaluable {
+  override lazy val children: Seq[Expression] = Nil
+  override def dataType: DataType = throw new 
UnsupportedOperationException("dataType")
+  override def foldable: Boolean = false
+  override def nullable: Boolean = false
 }
 
-/** UNBOUNDED PRECEDING boundary. */
-case object UnboundedPreceding extends FrameBoundary {
-  def notFollows(other: FrameBoundary): Boolean = other match {
-case UnboundedPreceding => true
-case vp: ValuePreceding => true
-case CurrentRow => true
-case vf: ValueFollowing => true
-case UnboundedFollowing => true
-  }
+/** Used as a placeholder when a frame specification is not defined. */
+case object UnspecifiedFrame extends WindowFrame
 
-  override def toString: String = "UNBOUNDED PRECEDING"
-}
+/** A specified Window Frame. */
+case class SpecifiedWindowFrame(
+frameType: FrameType,
+lower: AnyRef,
--- End diff --

I made `lower` and `upper` `AnyRef` to allow the use of both (foldable) 
`Expression`s and `SpecialFrameBoundary`s. This works rather well with things 
like constant folding. The reason for not making `SpecialFrameBoundary` an 
`Expression` is that it cannot have a type (unless you make it a case class, I 
suppose) and that it showed some weird behavior during analysis/optimization.
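
The `AnyRef` design described here can be sketched with toy stand-ins. This is a simplified model, not the Catalyst classes: `Expr`, `Lit`, `Add`, `fold`, and `describe` are hypothetical, and only `SpecialFrameBoundary`, `Unbounded`, and `CurrentRow` mirror names from the diff.

```scala
object FrameBoundSketch {
  // Special boundaries carry no data type, so they are not expressions.
  sealed trait SpecialFrameBoundary
  case object Unbounded extends SpecialFrameBoundary
  case object CurrentRow extends SpecialFrameBoundary

  // Toy expression tree standing in for foldable expressions.
  sealed trait Expr
  final case class Lit(value: Long) extends Expr
  final case class Add(left: Expr, right: Expr) extends Expr

  // Constant folding only ever sees expressions, never special boundaries.
  def fold(e: Expr): Expr = e match {
    case Add(l, r) =>
      (fold(l), fold(r)) match {
        case (Lit(a), Lit(b)) => Lit(a + b)
        case (fl, fr)         => Add(fl, fr)
      }
    case lit: Lit => lit
  }

  // A bound typed as AnyRef can hold either kind; consumers pattern match.
  def describe(bound: AnyRef): String = bound match {
    case Unbounded  => "UNBOUNDED"
    case CurrentRow => "CURRENT ROW"
    case e: Expr =>
      fold(e) match {
        case Lit(v) => s"offset $v"
        case other  => s"unfolded $other"
      }
    case other => throw new IllegalArgumentException(s"Unsupported bound: $other")
  }
}
```

The cost of this choice is the final `case other` branch: with `AnyRef` the compiler cannot check that only the two intended kinds of bound are passed in.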



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-05 Thread hvanhovell

Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/18540#discussion_r125602934
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala ---
@@ -174,28 +191,22 @@ class WindowSpec private[sql](
*/
   // Note: when updating the doc for this method, also update 
Window.rangeBetween.
   def rangeBetween(start: Long, end: Long): WindowSpec = {
--- End diff --

We should also create APIs that allow for other types of literals, at 
least one for `CalendarInterval`s.



[GitHub] spark pull request #18540: [SPARK-19451][SQL] rangeBetween method should acc...

2017-07-05 Thread jiangxb1987

GitHub user jiangxb1987 opened a pull request:

https://github.com/apache/spark/pull/18540

[SPARK-19451][SQL] rangeBetween method should accept Long value as boundary

## What changes were proposed in this pull request?

Long values can be passed to `rangeBetween` as range frame boundaries, but 
we silently convert them to Int values. This can produce wrong results, and we 
should fix it.

Furthermore, we should accept any legal literal value as a range frame 
boundary. In this PR, we make this possible for Long values and make support 
for other DataTypes easy to add.
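
To illustrate why a silent Long-to-Int conversion can give wrong results, here is a small plain-Scala sketch. It does not call Spark; `LongBoundaryTruncation` and `narrowed` are hypothetical names standing in for whatever narrowing the old code path performed.

```scala
object LongBoundaryTruncation {
  // Narrowing a Long to an Int keeps only the low 32 bits, so large
  // boundaries silently change value (and possibly sign).
  def narrowed(boundary: Long): Int = boundary.toInt

  def main(args: Array[String]): Unit = {
    val justOverIntMax = Int.MaxValue.toLong + 1L // 2147483648
    println(narrowed(justOverIntMax))             // wraps around to Int.MinValue
    println(narrowed(1L << 32))                   // low 32 bits are all zero
    println(narrowed(100L))                       // small values survive unchanged
  }
}
```

A boundary that the caller meant as "about 2.1 billion following" would therefore become a large negative offset, and a boundary of 2^32 would collapse to 0.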

## How was this patch tested?

Add new tests in `DataFrameWindowFunctionsSuite` and `TypeCoercionSuite`.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jiangxb1987/spark rangeFrame

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/18540.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #18540


commit 52c52895cfe3cae58f8b029a350949cc153971a7
Author: Xingbo Jiang 
Date:   2017-07-05T09:02:51Z

rangeBetween accept literal values.




