[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-24 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21848
  
LGTM

Thanks! Merged to master.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21848
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93523/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21848
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21848
  
**[Test build #93523 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93523/testReport)**
 for PR 21848 at commit 
[`b4f1431`](https://github.com/apache/spark/commit/b4f143180adc0196aa16650efc399226b463699f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-24 Thread dbtsai
Github user dbtsai commented on the issue:

https://github.com/apache/spark/pull/21848
  
Here is a followup PR for making `AssertTrue` and `AssertNotNull` 
`non-deterministic` https://issues.apache.org/jira/browse/SPARK-24913


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21848
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21848
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1292/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-24 Thread dbtsai
Github user dbtsai commented on the issue:

https://github.com/apache/spark/pull/21848
  
@kiszk `trait Stateful extends Nondeterministic`, and this rule will not be 
invoked when an expression is nondeterministic.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21848
  
**[Test build #93523 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93523/testReport)**
 for PR 21848 at commit 
[`b4f1431`](https://github.com/apache/spark/commit/b4f143180adc0196aa16650efc399226b463699f).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-24 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/21848
  
@dbtsai I have a question. How does the current code check the following 
condition?
> Stateful expression must have a side effect.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-23 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/21848
  
Hmm, seems we have limitation on where non deterministic expressions can be 
in.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21848
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93471/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21848
  
**[Test build #93471 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93471/testReport)**
 for PR 21848 at commit 
[`bf0b2d9`](https://github.com/apache/spark/commit/bf0b2d91a89ec7913db2f86b632e6950ac490f70).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21848
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21848
  
**[Test build #93471 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93471/testReport)**
 for PR 21848 at commit 
[`bf0b2d9`](https://github.com/apache/spark/commit/bf0b2d91a89ec7913db2f86b632e6950ac490f70).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21848
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21848
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1253/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-23 Thread dbtsai
Github user dbtsai commented on the issue:

https://github.com/apache/spark/pull/21848
  
This will simply the scope of this PR a lot by just having both 
`AssertTrue` and `AssertNotNull` as `non-deterministic` expression. My concern 
is the more  `non-deterministic` expressions we have, the less optimization we 
can do. Luckily, both of them are not used in general expressions.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-23 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21848
  
Currently, we are setting the expressions `deterministic ` to false when 
they are either having side effect or non-deterministic. We already did it for 
Hive UDFs who have stateful tags. 

We should change `deterministic` to `false` for both `AssertTrue` and 
`AssertNotNull`.  


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-23 Thread dbtsai
Github user dbtsai commented on the issue:

https://github.com/apache/spark/pull/21848
  
@gatorsmile this can remove some of the expensive condition expressions, so 
I would like to find a way to properly implement this. 

Thank you all for chiming in with many good points. Let me summary here.

1. The cond expression can only be removed when it doesn't have a side 
effect.
2. Stateful expression must have a side effect.
3. Some of the non-stateful expressions such as AssertTrue have a side 
effect.
4. Determinstic expression could have a side effect. 
5. Nondeterministic expressions always have a side effect.

This means `determinstic` is not enough, and we need another variable to 
check if an expression has a side effect.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-23 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/21848
  
For now, seems we don't have a good way to know if an expression has side 
effect. Some expressions like `AssertTrue` should be marked as one with side 
effect. Maybe we should create a trait for this purpose.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21848
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93454/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21848
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21848
  
**[Test build #93454 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93454/testReport)**
 for PR 21848 at commit 
[`30ca8cf`](https://github.com/apache/spark/commit/30ca8cf8aad2b3c030902bd00c4e8b0dd9acc430).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-23 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/21848
  
This is a good point.

On Mon, Jul 23, 2018, 12:03 PM Dongjoon Hyun 
wrote:

> Since this skips the evaluation of if condition, this will cause the
> following difference.
>
> *This PR*
>
> scala> sql("select * from t").show
> ++
> |   a|
> ++
> |   1|
> |null|
> ++
>
> scala> sql("select if(assert_true(a is null),a,a) from t").show
> +-+
> |(IF(CAST(assert_true((a IS NULL)) AS BOOLEAN), a, a))|
> +-+
> |1|
> | null|
> +-+
>
> *Spark 2.3.1*
>
> scala> sql("select * from t").show
> ++
> |   a|
> ++
> |   1|
> |null|
> ++
>
> scala> sql("select if(assert_true(a is null),a,a) from t").show
> 18/07/23 11:59:11 ERROR Executor: Exception in task 0.0 in stage 20.0 
(TID 20)
> java.lang.RuntimeException: 'isnull(input[0, int, true])' is not true!
>
> —
> You are receiving this because you commented.
> Reply to this email directly, view it on GitHub
> , or 
mute
> the thread
> 

> .
>



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-23 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/21848
  
Since this skips the evaluation of `if` condition, this will cause the 
following difference.

**This PR**
```
scala> sql("select * from t").show
++
|   a|
++
|   1|
|null|
++

scala> sql("select if(assert_true(a is null),a,a) from t").show
+-+
|(IF(CAST(assert_true((a IS NULL)) AS BOOLEAN), a, a))|
+-+
|1|
| null|
+-+
```

**Spark 2.3.1**
```
scala> sql("select * from t").show
++
|   a|
++
|   1|
|null|
++

scala> sql("select if(assert_true(a is null),a,a) from t").show
18/07/23 11:59:11 ERROR Executor: Exception in task 0.0 in stage 20.0 (TID 
20)
java.lang.RuntimeException: 'isnull(input[0, int, true])' is not true!
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-23 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/21848
  
LGTM


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21848
  
**[Test build #93454 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93454/testReport)**
 for PR 21848 at commit 
[`30ca8cf`](https://github.com/apache/spark/commit/30ca8cf8aad2b3c030902bd00c4e8b0dd9acc430).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21848
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1241/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21848: [SPARK-24890] [SQL] Short circuiting the `if` condition ...

2018-07-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21848
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org