[GitHub] spark issue #16055: [SPARK-17897] [SQL] Attribute is not NullIntolerant

2016-11-29 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/16055
  
The above fix does not cover all the cases. Found the root cause.

The `constraints` of an operator is the expressions that evaluate to `true` 
for all the rows produced. That means, the expression result should be neither 
`false` nor `unknown` (NULL). Thus, we can conclude that `IsNotNull` on all the 
constraints, which are generated by its own predicates or propagated from the 
children. The constraint can be a complex expression. For better usage of these 
constraints, we try to push down `IsNotNull` to the lowest-level expressions. 
`IsNotNull` can be pushed through an expression when it is null intolerant. 
(When the input is NULL, the null-intolerant expression always evaluates to 
null.)

Below is the code we have for `IsNotNull` pushdown.
```Scala
  private def scanNullIntolerantExpr(expr: Expression): Seq[Attribute] = 
expr match {
case a: Attribute => Seq(a)
case _: NullIntolerant | IsNotNull(_: NullIntolerant) =>
  expr.children.flatMap(scanNullIntolerantExpr)
case _ => Seq.empty[Attribute]
  }
```

`IsNotNull` is not null-intolerant. It converts `null` to `false`. If there 
does not exist any `Not`-like expression, it works; otherwise, it could 
generate a wrong result. The above function needs to be corrected to

```Scala
  private def scanNullIntolerantExpr(expr: Expression): Seq[Attribute] = 
expr match {
case a: Attribute => Seq(a)
case _: NullIntolerant => expr.children.flatMap(scanNullIntolerantExpr)
case _ => Seq.empty[Attribute]
  }
```
This fixes the problem, but we need a smarter fix for avoiding regressions. 
Now, working on a better fix.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16055: [SPARK-17897] [SQL] Attribute is not NullIntolerant

2016-11-29 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/16055
  
This does not resolve all the cases. Will submit a better fix today. : )


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16055: [SPARK-17897] [SQL] Attribute is not NullIntolerant

2016-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16055
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16055: [SPARK-17897] [SQL] Attribute is not NullIntolerant

2016-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16055
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69310/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16055: [SPARK-17897] [SQL] Attribute is not NullIntolerant

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16055
  
**[Test build #69310 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69310/consoleFull)**
 for PR 16055 at commit 
[`33c10a0`](https://github.com/apache/spark/commit/33c10a0994c9802df901f211e1f28c52e34df27f).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `abstract class Attribute extends LeafExpression with NamedExpression `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16055: [SPARK-17897] [SQL] Attribute is not NullIntolerant

2016-11-29 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/16055
  
Sure, will update the PR description tomorrow. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16055: [SPARK-17897] [SQL] Attribute is not NullIntolerant

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16055
  
**[Test build #69310 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69310/consoleFull)**
 for PR 16055 at commit 
[`33c10a0`](https://github.com/apache/spark/commit/33c10a0994c9802df901f211e1f28c52e34df27f).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16055: [SPARK-17897] [SQL] Attribute is not NullIntolerant

2016-11-29 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/16055
  
Can you explain how did nullintolerant impact the case?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org