[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-10-11 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22375


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-10-11 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r224660195
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala
 ---
@@ -69,11 +69,22 @@ trait ExpressionEvalHelper extends 
GeneratorDrivenPropertyChecks with PlanTestBa
 
   /**
* Check the equality between result of expression and expected value, 
it will handle
-   * Array[Byte], Spread[Double], MapData and Row.
+   * Array[Byte], Spread[Double], MapData and Row. Also check whether 
nullable in expression is
+   * true if result is null
*/
-  protected def checkResult(result: Any, expected: Any, exprDataType: 
DataType): Boolean = {
+  protected def checkResult(result: Any, expected: Any, expression: 
Expression): Boolean = {
+checkResult(result, expected, expression.dataType, expression.nullable)
+  }
+
+  protected def checkResult(
+  result: Any,
+  expected: Any,
+  exprDataType: DataType,
+  exprNullable: Boolean): Boolean = {
 val dataType = UserDefinedType.sqlType(exprDataType)
 
+// The result is null for a non-nullable expression
+assert(result != null || exprNullable, "exprNullable should be true if 
result is null")
--- End diff --

nit: how about "result cannot be null since it's not nullable."


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-10-11 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r224414454
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala
 ---
@@ -69,11 +69,17 @@ trait ExpressionEvalHelper extends 
GeneratorDrivenPropertyChecks with PlanTestBa
 
   /**
* Check the equality between result of expression and expected value, 
it will handle
-   * Array[Byte], Spread[Double], MapData and Row.
+   * Array[Byte], Spread[Double], MapData and Row. Also check whether 
exprNullable is true
+   * if result of expression is null
*/
-  protected def checkResult(result: Any, expected: Any, exprDataType: 
DataType): Boolean = {
+  protected def checkResult(
+  result: Any,
+  expected: Any,
+  exprDataType: DataType,
+  exprNullable: Boolean): Boolean = {
 val dataType = UserDefinedType.sqlType(exprDataType)
 
+assert(result != null || exprNullable)
--- End diff --

we should add message to the assert, e.g. `assert(result != null || 
exprNullable, "xxx")`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-10-11 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r224414284
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala
 ---
@@ -69,11 +69,22 @@ trait ExpressionEvalHelper extends 
GeneratorDrivenPropertyChecks with PlanTestBa
 
   /**
* Check the equality between result of expression and expected value, 
it will handle
-   * Array[Byte], Spread[Double], MapData and Row.
+   * Array[Byte], Spread[Double], MapData and Row. Also check whether 
exprNullable is true
--- End diff --

the comment doesn't match the method now


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-10-09 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r223783598
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala
 ---
@@ -221,6 +227,12 @@ trait ExpressionEvalHelper extends 
GeneratorDrivenPropertyChecks with PlanTestBa
 val unsafeRow = evaluateWithUnsafeProjection(expression, inputRow)
 val input = if (inputRow == EmptyRow) "" else s", input: $inputRow"
 
+val dataType = expression.dataType
+if (!checkResult(unsafeRow.get(0, dataType), expected, dataType, 
expression.nullable)) {
--- End diff --

I think `checkResult` already validates `expression` according to the 
nullability of the given expression at 1. Thus, if the `expected` is not 
correct, 2. will detect an incorrect point.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-10-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r223245508
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala
 ---
@@ -113,7 +113,7 @@ class CodeGenerationSuite extends SparkFunSuite with 
ExpressionEvalHelper {
 assert(actual.length == 1)
 val expected = UTF8String.fromString("abc")
 
-if (!checkResult(actual.head, expected, expressions.head.dataType)) {
+if (!checkResult(actual.head, expected, expressions.head.dataType, 
expressions.head.nullable)) {
--- End diff --

maybe we should provide an overload of `checkResult` that takes 
`Expression`.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-10-07 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r223214388
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala
 ---
@@ -221,6 +227,12 @@ trait ExpressionEvalHelper extends 
GeneratorDrivenPropertyChecks with PlanTestBa
 val unsafeRow = evaluateWithUnsafeProjection(expression, inputRow)
 val input = if (inputRow == EmptyRow) "" else s", input: $inputRow"
 
+val dataType = expression.dataType
+if (!checkResult(unsafeRow.get(0, dataType), expected, dataType, 
expression.nullable)) {
--- End diff --

> does not perform checks recursively

good point, I was not considering it. Then, do we need the check at 
https://github.com/apache/spark/pull/22375/files/9ef335d6e43a6ef7d253d0ed3564f95bd0278f71#diff-41747ec3f56901eb7bfb95d2a217e94dL231?
 Isn't it performed in `checkResult`?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-10-07 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r223214130
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala
 ---
@@ -221,6 +227,12 @@ trait ExpressionEvalHelper extends 
GeneratorDrivenPropertyChecks with PlanTestBa
 val unsafeRow = evaluateWithUnsafeProjection(expression, inputRow)
 val input = if (inputRow == EmptyRow) "" else s", input: $inputRow"
 
+val dataType = expression.dataType
+if (!checkResult(unsafeRow.get(0, dataType), expected, dataType, 
expression.nullable)) {
--- End diff --

1. sees only `expected`. 2. sees `expected` and `expression`. Thus, we are 
doing different.

At 1, as we discussed, we need to check the consistency recursively. IIUC, 
`unsafeRow.get(0, dataType) != null || expression.nullable` does not perform 
checks recursively. Do I make a misunderstanding?



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-10-07 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r223205359
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala
 ---
@@ -221,6 +227,12 @@ trait ExpressionEvalHelper extends 
GeneratorDrivenPropertyChecks with PlanTestBa
 val unsafeRow = evaluateWithUnsafeProjection(expression, inputRow)
 val input = if (inputRow == EmptyRow) "" else s", input: $inputRow"
 
+val dataType = expression.dataType
+if (!checkResult(unsafeRow.get(0, dataType), expected, dataType, 
expression.nullable)) {
--- End diff --

yes, I just meant that here we are checking the result and we are doing the 
same after too. Shouldn't we just add an assert for `unsafeRow.get(0, dataType) 
!= null || expression.nullable` here instead?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-10-05 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r223169695
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala
 ---
@@ -221,6 +227,12 @@ trait ExpressionEvalHelper extends 
GeneratorDrivenPropertyChecks with PlanTestBa
 val unsafeRow = evaluateWithUnsafeProjection(expression, inputRow)
 val input = if (inputRow == EmptyRow) "" else s", input: $inputRow"
 
+val dataType = expression.dataType
+if (!checkResult(unsafeRow.get(0, dataType), expected, dataType, 
expression.nullable)) {
--- End diff --

We check different properties in these two `if` statements.
1. Line 231 checks consistency between value and `nullable` in `expected`
1. Line 245 checks bit-wise value between `expected` and `expression`


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-10-05 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r223169637
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala
 ---
@@ -113,7 +113,7 @@ class CodeGenerationSuite extends SparkFunSuite with 
ExpressionEvalHelper {
 assert(actual.length == 1)
 val expected = UTF8String.fromString("abc")
 
-if (!checkResult(actual.head, expected, expressions.head.dataType)) {
+if (!checkResult(actual.head, expected, expressions.head.dataType, 
expressions.head.nullable)) {
--- End diff --

That is another option that I thought. On the other hand, to set default 
has a risk to overlook a possible incosistency between value and `nullable` at 
top level of `expected`.

Do we use the default value at the all of callers of `checkResult`? 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-10-05 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r222954573
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala
 ---
@@ -221,6 +227,12 @@ trait ExpressionEvalHelper extends 
GeneratorDrivenPropertyChecks with PlanTestBa
 val unsafeRow = evaluateWithUnsafeProjection(expression, inputRow)
 val input = if (inputRow == EmptyRow) "" else s", input: $inputRow"
 
+val dataType = expression.dataType
+if (!checkResult(unsafeRow.get(0, dataType), expected, dataType, 
expression.nullable)) {
--- End diff --

mmmh, I am not sure about this. Do we then still need the code below? Seems 
to me we are checking the same thing twice, please correct me if I am wrong.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-10-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r222920691
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala
 ---
@@ -113,7 +113,7 @@ class CodeGenerationSuite extends SparkFunSuite with 
ExpressionEvalHelper {
 assert(actual.length == 1)
 val expected = UTF8String.fromString("abc")
 
-if (!checkResult(actual.head, expected, expressions.head.dataType)) {
+if (!checkResult(actual.head, expected, expressions.head.dataType, 
expressions.head.nullable)) {
--- End diff --

It's a little weird to ask the caller to provide both `expected` and 
`exprNullable `, and then use `exprNullable ` to validate `expected.` Can we 
set a default value for `exprNullable` in `checkResult`?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-10-04 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r222741323
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala
 ---
@@ -221,6 +227,12 @@ trait ExpressionEvalHelper extends 
GeneratorDrivenPropertyChecks with PlanTestBa
 val unsafeRow = evaluateWithUnsafeProjection(expression, inputRow)
 val input = if (inputRow == EmptyRow) "" else s", input: $inputRow"
 
+val dataType = expression.dataType
+if (!checkResult(unsafeRow.get(0, dataType), expected, dataType, 
expression.nullable)) {
--- End diff --

This is because this statement checks consistency between `expression` and 
its `nullable`, as you proposed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-10-04 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r222740882
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala
 ---
@@ -69,11 +69,17 @@ trait ExpressionEvalHelper extends 
GeneratorDrivenPropertyChecks with PlanTestBa
 
   /**
* Check the equality between result of expression and expected value, 
it will handle
-   * Array[Byte], Spread[Double], MapData and Row.
+   * Array[Byte], Spread[Double], MapData and Row. Also check whether 
exprNullable is true
+   * if result of expression is null
*/
-  protected def checkResult(result: Any, expected: Any, exprDataType: 
DataType): Boolean = {
+  protected def checkResult(
+  result: Any,
+  expected: Any,
+  exprDataType: DataType,
+  exprNullable: Boolean): Boolean = {
 val dataType = UserDefinedType.sqlType(exprDataType)
 
+assert(result != null || exprNullable)
--- End diff --

Sure, I will add the description


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-10-04 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r222604522
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala
 ---
@@ -221,6 +227,12 @@ trait ExpressionEvalHelper extends 
GeneratorDrivenPropertyChecks with PlanTestBa
 val unsafeRow = evaluateWithUnsafeProjection(expression, inputRow)
 val input = if (inputRow == EmptyRow) "" else s", input: $inputRow"
 
+val dataType = expression.dataType
+if (!checkResult(unsafeRow.get(0, dataType), expected, dataType, 
expression.nullable)) {
--- End diff --

why did you add this?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-10-04 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r222604455
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala
 ---
@@ -69,11 +69,17 @@ trait ExpressionEvalHelper extends 
GeneratorDrivenPropertyChecks with PlanTestBa
 
   /**
* Check the equality between result of expression and expected value, 
it will handle
-   * Array[Byte], Spread[Double], MapData and Row.
+   * Array[Byte], Spread[Double], MapData and Row. Also check whether 
exprNullable is true
+   * if result of expression is null
*/
-  protected def checkResult(result: Any, expected: Any, exprDataType: 
DataType): Boolean = {
+  protected def checkResult(
+  result: Any,
+  expected: Any,
+  exprDataType: DataType,
+  exprNullable: Boolean): Boolean = {
 val dataType = UserDefinedType.sqlType(exprDataType)
 
+assert(result != null || exprNullable)
--- End diff --

Can we add a description which is more clear about the issue? Something 
like: `The result is null for a non-nullable expression`?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-09-28 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r221194592
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelperSuite.scala
 ---
@@ -35,6 +36,13 @@ class ExpressionEvalHelperSuite extends SparkFunSuite 
with ExpressionEvalHelper
 val e = intercept[RuntimeException] { 
checkEvaluation(BadCodegenExpression(), 10) }
 assert(e.getMessage.contains("some_variable"))
   }
+
+  test("SPARK-25388: checkEvaluation should fail if nullable in DataType 
is incorrect") {
+val e = intercept[RuntimeException] {
+  checkEvaluation(MapIncorrectDataTypeExpression(), Map(3 -> 7, 6 -> 
null))
--- End diff --

With some hints from @ueshin, this PR implemented the check of `null` value 
with nullable bit in `checkResult()`.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-09-21 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r219454650
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelperSuite.scala
 ---
@@ -35,6 +36,13 @@ class ExpressionEvalHelperSuite extends SparkFunSuite 
with ExpressionEvalHelper
 val e = intercept[RuntimeException] { 
checkEvaluation(BadCodegenExpression(), 10) }
 assert(e.getMessage.contains("some_variable"))
   }
+
+  test("SPARK-25388: checkEvaluation should fail if nullable in DataType 
is incorrect") {
+val e = intercept[RuntimeException] {
+  checkEvaluation(MapIncorrectDataTypeExpression(), Map(3 -> 7, 6 -> 
null))
--- End diff --

The motivations are the 2 mentioned above. Basically, I am proposing the 
same suggestion @cloud-fan has just commented 
[here](https://github.com/apache/spark/pull/22375#discussion_r219452615)


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-09-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r219452615
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala
 ---
@@ -223,8 +223,16 @@ trait ExpressionEvalHelper extends 
GeneratorDrivenPropertyChecks with PlanTestBa
   }
 } else {
   val lit = InternalRow(expected, expected)
--- End diff --

I think a more straightforward approach is, validate the `expected` 
according to the nullability of the given expression.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-09-21 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r219448432
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelperSuite.scala
 ---
@@ -35,6 +36,13 @@ class ExpressionEvalHelperSuite extends SparkFunSuite 
with ExpressionEvalHelper
 val e = intercept[RuntimeException] { 
checkEvaluation(BadCodegenExpression(), 10) }
 assert(e.getMessage.contains("some_variable"))
   }
+
+  test("SPARK-25388: checkEvaluation should fail if nullable in DataType 
is incorrect") {
+val e = intercept[RuntimeException] {
+  checkEvaluation(MapIncorrectDataTypeExpression(), Map(3 -> 7, 6 -> 
null))
--- End diff --

I may not still understand your motivation correctly. What is the 
motivation to introduce this assertion?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-09-21 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r219435279
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelperSuite.scala
 ---
@@ -35,6 +36,13 @@ class ExpressionEvalHelperSuite extends SparkFunSuite 
with ExpressionEvalHelper
 val e = intercept[RuntimeException] { 
checkEvaluation(BadCodegenExpression(), 10) }
 assert(e.getMessage.contains("some_variable"))
   }
+
+  test("SPARK-25388: checkEvaluation should fail if nullable in DataType 
is incorrect") {
+val e = intercept[RuntimeException] {
+  checkEvaluation(MapIncorrectDataTypeExpression(), Map(3 -> 7, 6 -> 
null))
--- End diff --

Yes, I said that the suggestion above is wrong and needs to be rewritten in 
a recursive way. Sorry for the bad suggestion, I just meant to show my idea. So 
it should be something like:

```
assert(!containsNullWhereNotNullable(expected, expression.dataType))
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-09-21 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r219432959
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelperSuite.scala
 ---
@@ -35,6 +36,13 @@ class ExpressionEvalHelperSuite extends SparkFunSuite 
with ExpressionEvalHelper
 val e = intercept[RuntimeException] { 
checkEvaluation(BadCodegenExpression(), 10) }
 assert(e.getMessage.contains("some_variable"))
   }
+
+  test("SPARK-25388: checkEvaluation should fail if nullable in DataType 
is incorrect") {
+val e = intercept[RuntimeException] {
+  checkEvaluation(MapIncorrectDataTypeExpression(), Map(3 -> 7, 6 -> 
null))
--- End diff --

Even if we make it checking recursively, I think that this case cannot be 
detected. This is because the mismatch occurs in the different recursive path.

Would it be possible to share the case where we distingished a wrong output 
from a bad written UT in other places, as you proposed?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-09-21 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r219406170
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelperSuite.scala
 ---
@@ -35,6 +36,13 @@ class ExpressionEvalHelperSuite extends SparkFunSuite 
with ExpressionEvalHelper
 val e = intercept[RuntimeException] { 
checkEvaluation(BadCodegenExpression(), 10) }
 assert(e.getMessage.contains("some_variable"))
   }
+
+  test("SPARK-25388: checkEvaluation should fail if nullable in DataType 
is incorrect") {
+val e = intercept[RuntimeException] {
+  checkEvaluation(MapIncorrectDataTypeExpression(), Map(3 -> 7, 6 -> 
null))
--- End diff --

Yes, you're right, my suggestion doesn't work in a case like that, sorry. 
We would need to make it checking recursively. But I think you got the idea of 
what I am proposing here. Thanks.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-09-21 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r219397495
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelperSuite.scala
 ---
@@ -35,6 +36,13 @@ class ExpressionEvalHelperSuite extends SparkFunSuite 
with ExpressionEvalHelper
 val e = intercept[RuntimeException] { 
checkEvaluation(BadCodegenExpression(), 10) }
 assert(e.getMessage.contains("some_variable"))
   }
+
+  test("SPARK-25388: checkEvaluation should fail if nullable in DataType 
is incorrect") {
+val e = intercept[RuntimeException] {
+  checkEvaluation(MapIncorrectDataTypeExpression(), Map(3 -> 7, 6 -> 
null))
--- End diff --

The your first is correct since this patch addresses only codegen-on case. 
We can add another code to address codegen-off case.
Regarding the your second point, have we ever distingished a wrong output 
from a bad written UT when we defect the difference between `expression` and 
`expected`. I think that the distinguishment is nice to have, but not mandatory 
to have.


I have one question about your approach:
```
assert(containsNull(expected) && isNullable(expression.dataType))
```
Since the above two conditions evaluates `expected` and `expression` 
independently, how this works for the following case? I think that the 
assertion would be passed
```
expression:
dataType = StructType(ArrayType(IntegerType, false), ArrayType(IntegerType, 
true))
Struct(Array(0, null), Array(1, 0))

expected:
Struct(Array(0, 0), Array(1, null))
```




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-09-19 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r218686614
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelperSuite.scala
 ---
@@ -35,6 +36,13 @@ class ExpressionEvalHelperSuite extends SparkFunSuite 
with ExpressionEvalHelper
 val e = intercept[RuntimeException] { 
checkEvaluation(BadCodegenExpression(), 10) }
 assert(e.getMessage.contains("some_variable"))
   }
+
+  test("SPARK-25388: checkEvaluation should fail if nullable in DataType 
is incorrect") {
+val e = intercept[RuntimeException] {
+  checkEvaluation(MapIncorrectDataTypeExpression(), Map(3 -> 7, 6 -> 
null))
--- End diff --

Sorry, you're right, I have not been precise in my previous statements so 
they are confusing. I'll try to elaborate more clearly.

IIUC what you are proposing now is for detecting the case when the output 
datatype of an expression is not nullable and the expected output instead 
contains a null. The way you're trying to achieve this is to make always 
nullable the expected output, so that we get a null as output both for codegen 
on and codegen off (while before this change for codegen on we would get as 
expected output a wrong value, which is the default value for that data type).

The problems I see on the approach above are:
 - if you test only codegen on or codegen off, you might still be missing 
to cover some cases (eg. see 
https://github.com/apache/spark/pull/22375#discussion_r218085246, with codegen 
off we can have null as output for a non-nullable expression...);
 - The failure you get is that the expected output is different from the 
the evaluated one, so this case is treated as any other failure in the 
expression evaluation.

With something like this:
```
assert(containsNull(expected) && isNullable(expression.dataType))
```
I think we could solve the 2 problems above as:
 - this check would fail in both modes;
 - we can differentiate between a test failing because of a wrong output 
and this new case which covers: a bad written UT; an expression returning 
wrongly a non-nullable datatype instead of a nullable one when returning null.

Hope now it is more clear. Thanks.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-09-18 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r218629945
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelperSuite.scala
 ---
@@ -35,6 +36,13 @@ class ExpressionEvalHelperSuite extends SparkFunSuite 
with ExpressionEvalHelper
 val e = intercept[RuntimeException] { 
checkEvaluation(BadCodegenExpression(), 10) }
 assert(e.getMessage.contains("some_variable"))
   }
+
+  test("SPARK-25388: checkEvaluation should fail if nullable in DataType 
is incorrect") {
+val e = intercept[RuntimeException] {
+  checkEvaluation(MapIncorrectDataTypeExpression(), Map(3 -> 7, 6 -> 
null))
--- End diff --

I meant how your proposed assert can distinguish the two cases that you 
want to distinguish at 
[here](https://github.com/apache/spark/pull/22375#discussion_r217495874).
1. a failure caused by a bad test, ie. a test written wrongly
2. a UT failure caused by a bug in what we are testing

As you said 
[here](https://github.com/apache/spark/pull/22375#discussion_r218362156), the 
assert may detect both cases 1. and 2..



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-09-18 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r218362156
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelperSuite.scala
 ---
@@ -35,6 +36,13 @@ class ExpressionEvalHelperSuite extends SparkFunSuite 
with ExpressionEvalHelper
 val e = intercept[RuntimeException] { 
checkEvaluation(BadCodegenExpression(), 10) }
 assert(e.getMessage.contains("some_variable"))
   }
+
+  test("SPARK-25388: checkEvaluation should fail if nullable in DataType 
is incorrect") {
+val e = intercept[RuntimeException] {
+  checkEvaluation(MapIncorrectDataTypeExpression(), Map(3 -> 7, 6 -> 
null))
--- End diff --

Well, actually I was not clear about this, sorry for that. I meant we can 
separate from:
 - a failure caused by a wrong result returned;
 - a failure caused by a wrong result type returned (ie. the expected 
values contains NULL while the expression return data type is not nullable - 
this case can happen both in case the UT is bad and in case the return type of 
the expression is wrong);


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-09-17 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r218106457
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelperSuite.scala
 ---
@@ -35,6 +36,13 @@ class ExpressionEvalHelperSuite extends SparkFunSuite 
with ExpressionEvalHelper
 val e = intercept[RuntimeException] { 
checkEvaluation(BadCodegenExpression(), 10) }
 assert(e.getMessage.contains("some_variable"))
   }
+
+  test("SPARK-25388: checkEvaluation should fail if nullable in DataType 
is incorrect") {
+val e = intercept[RuntimeException] {
+  checkEvaluation(MapIncorrectDataTypeExpression(), Map(3 -> 7, 6 -> 
null))
--- End diff --

Would it be possible to let us know how can this assertion distinguish the 
following two cases?
* a failure caused by a bad test, ie. a test written wrongly
* a UT failure caused by a bug in what we are testing


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-09-17 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r218094296
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelperSuite.scala
 ---
@@ -35,6 +36,13 @@ class ExpressionEvalHelperSuite extends SparkFunSuite 
with ExpressionEvalHelper
 val e = intercept[RuntimeException] { 
checkEvaluation(BadCodegenExpression(), 10) }
 assert(e.getMessage.contains("some_variable"))
   }
+
+  test("SPARK-25388: checkEvaluation should fail if nullable in DataType 
is incorrect") {
+val e = intercept[RuntimeException] {
+  checkEvaluation(MapIncorrectDataTypeExpression(), Map(3 -> 7, 6 -> 
null))
--- End diff --

yes, thanks for the summary, it states more clearly what I thought.

My point is that this fix works properly only when we test both codegen on 
and off, but it would fail to detect the error condition it claims to fix if 
only one of them (for any reason) is tested. So I am wondering if it is 
possible to perform a check on the expected value, instead of this fix. 
Something like:
```
assert(containsNull(expected) && isNullable(expression.dataType))
```
where `containsNull` and `isNullable` have to be defined properly. In this 
way we should fail properly independently from whether codegen is on or not. 
And we can also give a more clear hint in the error message about the problem 
being most likely a bad UT.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-09-17 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r218085246
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelperSuite.scala
 ---
@@ -35,6 +36,13 @@ class ExpressionEvalHelperSuite extends SparkFunSuite 
with ExpressionEvalHelper
 val e = intercept[RuntimeException] { 
checkEvaluation(BadCodegenExpression(), 10) }
 assert(e.getMessage.contains("some_variable"))
   }
+
+  test("SPARK-25388: checkEvaluation should fail if nullable in DataType 
is incorrect") {
+val e = intercept[RuntimeException] {
+  checkEvaluation(MapIncorrectDataTypeExpression(), Map(3 -> 7, 6 -> 
null))
--- End diff --

Here is summary.
* `Map(3 -> 7, 6 -> null)` passes in codegen off and fails in codegen on 
with this fix
* `Map(3 -> 7, 6 -> -1)` fails in codegen off and fails in codegen on 

Would it be possible to share examples of two cases that you think we would 
be able to distinguish?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-09-13 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r217495874
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelperSuite.scala
 ---
@@ -35,6 +36,13 @@ class ExpressionEvalHelperSuite extends SparkFunSuite 
with ExpressionEvalHelper
 val e = intercept[RuntimeException] { 
checkEvaluation(BadCodegenExpression(), 10) }
 assert(e.getMessage.contains("some_variable"))
   }
+
+  test("SPARK-25388: checkEvaluation should fail if nullable in DataType 
is incorrect") {
+val e = intercept[RuntimeException] {
+  checkEvaluation(MapIncorrectDataTypeExpression(), Map(3 -> 7, 6 -> 
null))
--- End diff --

I see, so the example above passes in codegen off and fails with codegen on 
with this fix, while using `Map(3 -> 7, 6 -> -1)` passes codegen on and fails 
codegen off, am I right?

What I am thinking about (but I have not yet found a working 
implementation) is: since the problem arise when we say we expect `null` in a 
non-nullable datatype, can we add such a check? I mean, instead of pretending 
the expected value to be nullable, can't we add a check in case it is not 
nullable for being sure that it does not contain `null`? I think it would be 
better, because we would be able to distinguish a failure caused by a bad test, 
ie. a test written wrongly, from a UT failure caused by a bug in what we are 
testing. What do you think?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-09-13 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r217454434
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelperSuite.scala
 ---
@@ -35,6 +36,13 @@ class ExpressionEvalHelperSuite extends SparkFunSuite 
with ExpressionEvalHelper
 val e = intercept[RuntimeException] { 
checkEvaluation(BadCodegenExpression(), 10) }
 assert(e.getMessage.contains("some_variable"))
   }
+
+  test("SPARK-25388: checkEvaluation should fail if nullable in DataType 
is incorrect") {
+val e = intercept[RuntimeException] {
+  checkEvaluation(MapIncorrectDataTypeExpression(), Map(3 -> 7, 6 -> 
null))
--- End diff --

Here is an output. This is because the test correctly detects a failure 
even in codegen-off mode, too.

```
"Incorrect evaluation (codegen off): mapincorrectdatatypeexpression(), 
actual: keys: [3,6], values: [7,null], expected: keys: [3,6], values: [7,-1]" 
did not contain "Incorrect evaluation in unsafe mode"
ScalaTestFailureLocation: 
org.apache.spark.sql.catalyst.expressions.ExpressionEvalHelperSuite$$anonfun$4 
at (ExpressionEvalHelperSuite.scala:44)
org.scalatest.exceptions.TestFailedException: "Incorrect evaluation 
(codegen off): mapincorrectdatatypeexpression(), actual: keys: [3,6], values: 
[7,null], expected: keys: [3,6], values: [7,-1]" did not contain "Incorrect 
evaluation in unsafe mode"
at 
org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:528)
...
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-09-13 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r217433074
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelper.scala
 ---
@@ -223,8 +223,8 @@ trait ExpressionEvalHelper extends 
GeneratorDrivenPropertyChecks with PlanTestBa
   }
 } else {
   val lit = InternalRow(expected, expected)
-  val expectedRow =
-UnsafeProjection.create(Array(expression.dataType, 
expression.dataType)).apply(lit)
+  val dtAsNullable = expression.dataType.asNullable
+  val expectedRow = UnsafeProjection.create(Array(dtAsNullable, 
dtAsNullable)).apply(lit)
--- End diff --

oh, yes. this is going to detect such failure.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22375: [SPARK-25388][Test][SQL] Detect incorrect nullabl...

2018-09-13 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/22375#discussion_r217381027
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvalHelperSuite.scala
 ---
@@ -35,6 +36,13 @@ class ExpressionEvalHelperSuite extends SparkFunSuite 
with ExpressionEvalHelper
 val e = intercept[RuntimeException] { 
checkEvaluation(BadCodegenExpression(), 10) }
 assert(e.getMessage.contains("some_variable"))
   }
+
+  test("SPARK-25388: checkEvaluation should fail if nullable in DataType 
is incorrect") {
+val e = intercept[RuntimeException] {
+  checkEvaluation(MapIncorrectDataTypeExpression(), Map(3 -> 7, 6 -> 
null))
--- End diff --

what happens if here you put `Map(3 -> 7, 6 -> -1)`?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org