[GitHub] spark pull request: [WIP][SPARK-13332][SQL] Decimal datatype suppo...

2016-02-18 Thread yucai
Github user yucai commented on the pull request:

https://github.com/apache/spark/pull/11212#issuecomment-186077126
  
@rxin I tried your suggestion of creating a dedicated PowDecimal for Decimal, like below:
 
```
case class PowDecimal(left: Expression, right: Expression)
  extends BinaryMathExpression(math.pow, "POWER") {
  override def inputTypes: Seq[AbstractDataType] = Seq(DecimalType, IntegerType)
  ...
}

case class Pow(left: Expression, right: Expression)
  extends BinaryMathExpression(math.pow, "POWER") {
  ...
} 
``` 

But one concern is: given `select pow(cast(2 as decimal(5,2)), 3)`, how do we get a `PowDecimal` node created? The current resolution path will create a `Pow` node anyway.
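
To make the concern concrete, producing such a node would seem to need an extra resolution rule on top of the class itself. A rough, hypothetical sketch (the rule name is made up here and not part of this patch; `PowDecimal` is the class sketched above):

```
object DecimalPowRewrite extends Rule[LogicalPlan] {
  def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
    case e if !e.childrenResolved => e
    // Rewrite Pow(decimal, integral) into the dedicated PowDecimal node;
    // everything else keeps the existing double-based Pow.
    case Pow(left, right) if left.dataType.isInstanceOf[DecimalType] &&
        Seq(ByteType, ShortType, IntegerType).contains(right.dataType) =>
      PowDecimal(left, Cast(right, IntegerType))
  }
}
```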

So instead we are thinking that maybe we can keep the Decimal handling in `Pow` itself, and coerce byte/short/etc. to integer in type coercion, like below:

```
case class Pow(left: Expression, right: Expression)
  extends BinaryMathExpression(math.pow, "POWER") {

  override def inputTypes: Seq[AbstractDataType] = Seq(NumericType, NumericType)

  override def dataType: DataType = (left.dataType, right.dataType) match {
    case (dt: DecimalType, ByteType | ShortType | IntegerType) => dt
    case _ => DoubleType
  }

  protected override def nullSafeEval(input1: Any, input2: Any): Any =
    (left.dataType, right.dataType) match {
      case (dt: DecimalType, _) =>
        input1.asInstanceOf[Decimal].pow(input2.asInstanceOf[Int])
      case _ =>
        math.pow(input1.asInstanceOf[Double], input2.asInstanceOf[Double])
    }

  override def genCode(ctx: CodegenContext, ev: ExprCode): String = ...
}
```

In HiveTypeCoercion:
```
object PowCoercion extends Rule[LogicalPlan] {
  def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
    case e if !e.childrenResolved => e
    case e @ Pow(left, right) =>
      (left.dataType, right.dataType) match {
        case (dt: DecimalType, IntegerType) => e
        case (DoubleType, DoubleType) => e
        case (dt: DecimalType, ByteType | ShortType) =>
          Pow(left, Cast(right, IntegerType))
        case _ => Pow(Cast(left, DoubleType), Cast(right, DoubleType))
      }
  }
}
```
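
For reference, this is how I read the combined rule playing out on a few concrete calls (my own worked reading, not verified against the patch):

```
// pow(cast(2 as decimal(5,2)), 3)                   -> (decimal, int):  expression untouched, result type decimal(5,2)
// pow(cast(2 as decimal(5,2)), cast(3 as tinyint))  -> (decimal, byte): right cast to int,    result type decimal(5,2)
// pow(cast(2 as decimal(5,2)), 3L)                  -> (decimal, long): both cast to double,  result type double
```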
What do you think of this approach?





[GitHub] spark pull request: [WIP][SPARK-13332][SQL] Decimal datatype suppo...

2016-02-18 Thread yucai
Github user yucai commented on the pull request:

https://github.com/apache/spark/pull/11212#issuecomment-185717402
  
OK, let me try this implementation.





[GitHub] spark pull request: [WIP][SPARK-13332][SQL] Decimal datatype suppo...

2016-02-17 Thread yucai
Github user yucai commented on a diff in the pull request:

https://github.com/apache/spark/pull/11212#discussion_r53275413
  
--- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UnsafeRowWriter.java ---
@@ -170,6 +170,7 @@ public void write(int ordinal, double value) {
   }
 
   public void write(int ordinal, Decimal input, int precision, int scale) {
+input = input.clone();
--- End diff --

As Adrian mentioned, we need a copy of `input`; otherwise `changePrecision` would mutate the original input.
In our case that means `catalystValue` (the expected value) gets changed when `checkEvalutionWithUnsafeProjection` is invoked, and then every test run after `checkEvalutionWithUnsafeProjection` fails.
```
  protected def checkEvaluation(
      expression: => Expression, expected: Any, inputRow: InternalRow = EmptyRow): Unit = {
    val catalystValue = CatalystTypeConverters.convertToCatalyst(expected)
    checkEvaluationWithoutCodegen(expression, catalystValue, inputRow)
    checkEvaluationWithGeneratedMutableProjection(expression, catalystValue, inputRow)
    if (GenerateUnsafeProjection.canSupport(expression.dataType)) {
      checkEvalutionWithUnsafeProjection(expression, catalystValue, inputRow)
    }
    checkEvaluationWithOptimization(expression, catalystValue, inputRow)
  }
```
Does that make sense? Any suggestion would be greatly appreciated.
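
For reference, a minimal sketch of the aliasing problem (assuming `org.apache.spark.sql.types.Decimal`, whose `changePrecision` mutates the receiver in place):

```
import org.apache.spark.sql.types.Decimal

val catalystValue = Decimal("3.14159")  // plays the role of the expected value above
val input = catalystValue               // the very same object reaches UnsafeRowWriter.write
input.changePrecision(5, 2)             // rounds in place to 3.14
// catalystValue now also reads 3.14, so every later comparison against 3.14159 fails;
// copying first (which is what the `input.clone()` in this diff does) avoids that.
println(catalystValue)                  // 3.14
```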





[GitHub] spark pull request: [WIP][SPARK-13332][SQL] Decimal datatype suppo...

2016-02-16 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/11212#discussion_r53115148
  
--- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UnsafeRowWriter.java ---
@@ -170,6 +170,7 @@ public void write(int ordinal, double value) {
   }
 
   public void write(int ordinal, Decimal input, int precision, int scale) {
+input = input.clone();
--- End diff --

We'll call `changePrecision` on `input` here, which would affect the original data. I agree that this is a bad idea; maybe we need to propose a separate PR to work around this.





[GitHub] spark pull request: [WIP][SPARK-13332][SQL] Decimal datatype suppo...

2016-02-16 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/11212#issuecomment-184860431
  
I think it'd be a lot simpler if we create a separate Pow for Decimal, and handle the byte/short/etc. to integer conversion in type coercion rather than in the PowDecimal class.






[GitHub] spark pull request: [WIP][SPARK-13332][SQL] Decimal datatype suppo...

2016-02-16 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/11212#discussion_r53071927
  
--- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UnsafeRowWriter.java ---
@@ -170,6 +170,7 @@ public void write(int ordinal, double value) {
   }
 
   public void write(int ordinal, Decimal input, int precision, int scale) {
+input = input.clone();
--- End diff --

Why is this necessary? Seems like a really bad idea.






[GitHub] spark pull request: [WIP][SPARK-13332][SQL] Decimal datatype suppo...

2016-02-15 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/11212#discussion_r52968764
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala ---
@@ -523,11 +523,45 @@ case class Atan2(left: Expression, right: Expression)
 
 case class Pow(left: Expression, right: Expression)
   extends BinaryMathExpression(math.pow, "POWER") {
-  override def genCode(ctx: CodegenContext, ev: ExprCode): String = {
-    defineCodeGen(ctx, ev, (c1, c2) => s"java.lang.Math.pow($c1, $c2)")
-  }
-}
+  override def inputTypes: Seq[AbstractDataType] = Seq(NumericType, NumericType)
+
+  override def dataType: DataType = (left.dataType, right.dataType) match {
+    case (dt: DecimalType, ByteType | ShortType | IntegerType) => dt
+    case _ => DoubleType
+  }
+
+  protected override def nullSafeEval(input1: Any, input2: Any): Any =
+    (left.dataType, right.dataType) match {
+      case (dt: DecimalType, ByteType) =>
+        input1.asInstanceOf[Decimal].pow(input2.asInstanceOf[Byte])
+      case (dt: DecimalType, ShortType) =>
+        input1.asInstanceOf[Decimal].pow(input2.asInstanceOf[Short])
+      case (dt: DecimalType, IntegerType) =>
+        input1.asInstanceOf[Decimal].pow(input2.asInstanceOf[Int])
+      case (dt: DecimalType, FloatType) =>
+        math.pow(input1.asInstanceOf[Decimal].toDouble, input2.asInstanceOf[Float])
+      case (dt: DecimalType, DoubleType) =>
+        math.pow(input1.asInstanceOf[Decimal].toDouble, input2.asInstanceOf[Double])
+      case (dt1: DecimalType, dt2: DecimalType) =>
+        math.pow(input1.asInstanceOf[Decimal].toDouble, input2.asInstanceOf[Decimal].toDouble)
--- End diff --

Shall we cast the result of `math.pow` back to `DecimalType` for these 
three cases?
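
If we do, a rough sketch of the cast-back for the `DoubleType` case could look like the following (just an illustration, not code from this PR; `dataType` would also need to return `dt` for these cases instead of `DoubleType`):

```
case (dt: DecimalType, DoubleType) =>
  // Compute via double, then convert back to the left operand's decimal type;
  // the precision/rounding handling here is an assumption.
  val result = Decimal(math.pow(input1.asInstanceOf[Decimal].toDouble,
                                input2.asInstanceOf[Double]))
  if (result.changePrecision(dt.precision, dt.scale)) result else null
```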





[GitHub] spark pull request: [WIP][SPARK-13332][SQL] Decimal datatype suppo...

2016-02-15 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/11212#discussion_r52968376
  
--- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UnsafeRowWriter.java ---
@@ -170,6 +170,7 @@ public void write(int ordinal, double value) {
   }
 
   public void write(int ordinal, Decimal input, int precision, int scale) {
+input = input.clone();
--- End diff --

Better add a comment that explains why we need to clone before write.





[GitHub] spark pull request: [WIP][SPARK-13332][SQL] Decimal datatype suppo...

2016-02-15 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/11212#discussion_r52968287
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/MathFunctionsSuite.scala ---
@@ -351,6 +350,20 @@ class MathFunctionsSuite extends SparkFunSuite with ExpressionEvalHelper {
   }
 
   test("pow") {
+    testBinary(Pow, (d: Decimal, n: Byte) => d.pow(n),
+      (-5 to 5).map(v => (Decimal(v * 1.0), v.toByte)))
--- End diff --

maybe `v.toDouble` is better
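
i.e. the same test data with the suggested change:

```
(-5 to 5).map(v => (Decimal(v.toDouble), v.toByte))
```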

