[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16998
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73316/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16998
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16998
  
**[Test build #73316 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73316/testReport)**
 for PR 16998 at commit 
[`5be21b3`](https://github.com/apache/spark/commit/5be21b32d5b4e3e36e50317a385a554206967668).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16998
  
**[Test build #73316 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73316/testReport)**
 for PR 16998 at commit 
[`5be21b3`](https://github.com/apache/spark/commit/5be21b32d5b4e3e36e50317a385a554206967668).





[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-21 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/16998
  
@sameeragarwal That's correct.

> By the way, as an aside, we should probably allow constraint
inference/propagation to be turned off via a conf flag to provide a quick
workaround for these kinds of problems.

Since we use constraints in optimization, wouldn't turning off constraint
inference/propagation make query plans miss optimization opportunities?






[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-21 Thread sameeragarwal
Github user sameeragarwal commented on the issue:

https://github.com/apache/spark/pull/16998
  
By the way, as an aside, we should probably allow constraint
inference/propagation to be turned off via a conf flag to provide a quick
workaround for these kinds of problems.





[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-21 Thread sameeragarwal
Github user sameeragarwal commented on the issue:

https://github.com/apache/spark/pull/16998
  
@viirya please correct me if I'm wrong, but scanning through this patch, it
appears that the underlying problem is that duplicating and tracking aliased
constraints using a `Set` tends to blow up quickly (causing regressions), and
this patch proposes an alternate data structure
(`aliasedExpressionsInConstraints`) to keep track of aliases? For instance, in
your example where `a > b` and `a` is aliased to `c` and `d`, we currently
track constraints as `Set(a > b, c > b, d > b)`, whereas you'd like them to be
tracked as `Set(a > b)` and `Map(a, Set(c, d))`? Is that correct?
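
To make the two representations concrete, here is a minimal Python sketch (not Spark's Catalyst code). It assumes constraints can be modelled as `(left, op, right)` tuples of attribute names; `expand` is a hypothetical helper that rebuilds the fully expanded set from the compact constraint set plus the alias map.

```python
from itertools import product


def expand(constraints, aliases):
    """Rebuild the fully expanded constraint set.

    constraints: set of (left, op, right) tuples over original attributes.
    aliases: dict mapping an original attribute to the set of its aliases.
    Both shapes are assumptions made for this illustration.
    """
    expanded = set()
    for left, op, right in constraints:
        # Each side may appear as itself or as any of its aliases.
        lefts = {left} | aliases.get(left, set())
        rights = {right} | aliases.get(right, set())
        for l, r in product(lefts, rights):
            expanded.add((l, op, r))
    return expanded


compact = {("a", ">", "b")}      # analogous to Set(a > b)
aliases = {"a": {"c", "d"}}      # analogous to Map(a, Set(c, d))

print(sorted(expand(compact, aliases)))
# [('a', '>', 'b'), ('c', '>', 'b'), ('d', '>', 'b')]
```

With one aliased attribute the expanded set has three entries. If `b` were also aliased to two more names, the same single constraint would expand to nine entries, which illustrates how quickly the expanded form can grow.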





[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-20 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/16998
  
> I do not get your point. What does this mean? Constraint propagation is
a bottom-up mechanism for inferring constraints. Can you elaborate on your
idea in a more formal way?

We currently fully expand the constraints with aliased attributes. For
example, if there is a constraint "a > b" and the current query plan aliases
"a" to "c" and "d", the final constraints of this plan are "a > b", "c > b",
and "d > b".

These constraints all evaluate to the same value: they are either all true or
all false. So when inferring filters from the constraints, we only need
"a > b"; the aliased constraints "c > b" and "d > b" are not necessary.

> I did not read the code. Just wondering if we could miss a chance for
plan optimization after this PR? What is the negative impact, if any?

The only optimization I think would be affected is `PruneFilters`.
`PruneFilters` prunes a condition if its child's constraints already contain
that condition. To elaborate using the example above, suppose there is a
`Filter` above the query plan and its condition is "c > b". As we only have
"a > b" in the query plan's constraints, we can't prune the condition and the
`Filter`.
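
A simplified Python model of this effect, assuming constraints are plain `(left, op, right)` tuples. `prune_filter` is a hypothetical stand-in for the membership check described here, not the actual `PruneFilters` rule:

```python
def prune_filter(condition, child_constraints):
    """Return None if the filter is redundant, otherwise the condition to keep."""
    if condition in child_constraints:
        return None
    return condition


expanded = {("a", ">", "b"), ("c", ">", "b"), ("d", ">", "b")}
compact = {("a", ">", "b")}
condition = ("c", ">", "b")

print(prune_filter(condition, expanded))  # None: the Filter can be pruned
print(prune_filter(condition, compact))   # ('c', '>', 'b'): the Filter is kept
```

A plain membership test against the compact set misses the aliased condition, which is the regression being described.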

However, the impact is small and can be easily resolved. We can use a simple
method to check whether a given condition like "c > b" is contained in the
fully expanded constraints of a query plan, without actually expanding the
constraints.
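
One way such a lookup could work, as a sketch: map every alias back to its original attribute and test the rewritten condition against the compact set. This assumes the stored constraints mention only original (un-aliased) attributes. The names `canonicalize` and `contains_constraint` are hypothetical and are not taken from the patch.

```python
def canonicalize(constraint, alias_to_original):
    """Rewrite aliased attribute names back to the attribute they alias."""
    left, op, right = constraint
    return (
        alias_to_original.get(left, left),
        op,
        alias_to_original.get(right, right),
    )


def contains_constraint(condition, constraints, aliases):
    """Check membership in the fully expanded set without building it.

    constraints: compact set of (left, op, right) tuples (assumed shape).
    aliases: dict from an original attribute to the set of its aliases.
    """
    alias_to_original = {
        alias: original
        for original, names in aliases.items()
        for alias in names
    }
    return canonicalize(condition, alias_to_original) in constraints


constraints = {("a", ">", "b")}
aliases = {"a": {"c", "d"}}

print(contains_constraint(("c", ">", "b"), constraints, aliases))  # True
print(contains_constraint(("e", ">", "b"), constraints, aliases))  # False
```

The check costs one dictionary lookup per attribute and one set lookup, regardless of how many aliases exist.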







[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-20 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/16998
  
> Another issue is that we actually don't need the additional constraints
most of the time. For example, if there is a constraint "a > b" and "a" is
aliased to "c" and "d", then when we use this constraint in filtering, we
don't need all of the constraints "a > b", "c > b", "d > b". We only need
"a > b" because if it is false, all the other constraints are guaranteed to be
false too.

I do not get your point. What does this mean? Constraint propagation is a
bottom-up mechanism for inferring constraints. Can you elaborate on your idea
in a more formal way?

I did not read the code. Just wondering if we could miss a chance for plan
optimization after this PR? What is the negative impact, if any?








[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16998
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16998
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73174/
Test PASSed.





[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16998
  
**[Test build #73174 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73174/testReport)**
 for PR 16998 at commit 
[`6cb896f`](https://github.com/apache/spark/commit/6cb896ff062c0a0c46f6f6ac4b88fad165eeaac0).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16998
  
**[Test build #73174 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73174/testReport)**
 for PR 16998 at commit 
[`6cb896f`](https://github.com/apache/spark/commit/6cb896ff062c0a0c46f6f6ac4b88fad165eeaac0).





[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16998
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73163/
Test FAILed.





[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16998
  
Merged build finished. Test FAILed.





[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16998
  
**[Test build #73163 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73163/testReport)**
 for PR 16998 at commit 
[`d691c66`](https://github.com/apache/spark/commit/d691c66dd0092ad99751964a6b079193706b953c).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16998
  
**[Test build #73163 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73163/testReport)**
 for PR 16998 at commit 
[`d691c66`](https://github.com/apache/spark/commit/d691c66dd0092ad99751964a6b079193706b953c).





[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16998
  
**[Test build #73158 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73158/testReport)**
 for PR 16998 at commit 
[`24fb723`](https://github.com/apache/spark/commit/24fb723207d80a7c6068fd113430488b89ed9d0b).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-20 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/16998
  
@hvanhovell Yes. #16785 makes only a limited improvement. Both #16785 and
this PR are non-parallel approaches.





[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16998
  
Merged build finished. Test FAILed.





[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16998
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73159/
Test FAILed.





[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16998
  
**[Test build #73159 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73159/testReport)**
 for PR 16998 at commit 
[`917de74`](https://github.com/apache/spark/commit/917de74db0066f015ac814125f5cb2d85b7a5b85).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16998
  
**[Test build #73159 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73159/testReport)**
 for PR 16998 at commit 
[`917de74`](https://github.com/apache/spark/commit/917de74db0066f015ac814125f5cb2d85b7a5b85).





[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-20 Thread hvanhovell
Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/16998
  
@viirya does this PR supersede #16785? I do like the non-parallel approach. 
I will try to take a more in-depth look at the end of the week (beginning of 
the next sprint).  





[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16998
  
**[Test build #73158 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73158/testReport)**
 for PR 16998 at commit 
[`24fb723`](https://github.com/apache/spark/commit/24fb723207d80a7c6068fd113430488b89ed9d0b).





[GitHub] spark issue #16998: [SPARK-19665][SQL][WIP] Improve constraint propagation

2017-02-20 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/16998
  
cc @cloud-fan @hvanhovell @sameeragarwal

