[GitHub] spark issue #17520: [WIP][SPARK-19712][SQL] Move PullupCorrelatedPredicates ...

2017-04-11 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/17520 @cloud-fan: would you be interested in reviewing this PR since I have not heard from @hvanhovell for a while? Note this is a WIP and I want to hear your feedback on the issues I put in the comments

[GitHub] spark issue #17491: [SPARK-20175][SQL] Exists should not be evaluated in Joi...

2017-04-11 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/17491 @cloud-fan wrote: "How useful is this optimization? It only works when Exists has no condition, is that a common case?" One of the common cases of this usage is an application of

[GitHub] spark issue #17521: [SPARK-20204][SQL] remove SimpleCatalystConf and Catalys...

2017-04-04 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/17521 Just to add another viewpoint on this incident: if those failing test cases had been written with the time zone fixed to a region other than the Pacific time zone, we would have failed fast in the first run

[GitHub] spark issue #17521: [SPARK-20204][SQL] remove SimpleCatalystConf and Catalys...

2017-04-04 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/17521 @dilipbiswal has narrowed down that this PR is changing the behaviour. He will continue to investigate and will post an update in the next hour or so before he calls it a day. --- If your project

[GitHub] spark issue #17521: [SPARK-20204][SQL] remove SimpleCatalystConf and Catalys...

2017-04-04 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/17521 I will investigate. I am searching from the last good point I merged my private branch with the master trunk and will go from there.

[GitHub] spark issue #17521: [SPARK-20204][SQL] remove SimpleCatalystConf and Catalys...

2017-04-04 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/17521 After merging my local branch up to this PR, I ran some of the regression tests from a machine in the Eastern time zone and observed the following failures: [info] *** 35 TESTS

[GitHub] spark issue #17520: [WIP][SPARK-19712][SQL] Move PullupCorrelatedPredicates ...

2017-04-03 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/17520 cc: @hvanhovell

[GitHub] spark pull request #17520: [WIP][SPARK-19712][SQL] Move PullupCorrelatedPred...

2017-04-03 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17520#discussion_r109521220 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -848,11 +967,84 @@ object PushDownPredicate extends

[GitHub] spark pull request #17520: [WIP][SPARK-19712][SQL] Move PullupCorrelatedPred...

2017-04-03 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17520#discussion_r109521143 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -823,9 +890,61 @@ object PushDownPredicate extends Rule

[GitHub] spark pull request #17520: [WIP][SPARK-19712][SQL] Move PullupCorrelatedPred...

2017-04-03 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17520#discussion_r109522976 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -474,9 +478,42 @@ case class EliminateOuterJoin(conf

[GitHub] spark pull request #17520: [WIP][SPARK-19712][SQL] Move PullupCorrelatedPred...

2017-04-03 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17520#discussion_r109519783 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -792,6 +824,39 @@ object PushDownPredicate extends Rule

[GitHub] spark pull request #17520: [WIP][SPARK-19712][SQL] Move PullupCorrelatedPred...

2017-04-03 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17520#discussion_r109519728 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -766,6 +771,31 @@ object PushDownPredicate extends Rule

[GitHub] spark pull request #17520: [WIP][SPARK-19712][SQL] Move PullupCorrelatedPred...

2017-04-03 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17520#discussion_r109522412 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1032,6 +1248,109 @@ object PushPredicateThroughJoin

[GitHub] spark pull request #17520: [WIP][SPARK-19712][SQL] Move PullupCorrelatedPred...

2017-04-03 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17520#discussion_r109521667 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -848,11 +967,84 @@ object PushDownPredicate extends

[GitHub] spark pull request #17520: [WIP][SPARK-19712][SQL] Move PullupCorrelatedPred...

2017-04-03 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17520#discussion_r109521264 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -848,11 +967,84 @@ object PushDownPredicate extends

[GitHub] spark pull request #17520: [WIP][SPARK-19712][SQL] Move PullupCorrelatedPred...

2017-04-03 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17520#discussion_r109522050 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -792,6 +824,39 @@ object PushDownPredicate extends Rule

[GitHub] spark issue #17520: [WIP][SPARK-19712][SQL] Move PullupCorrelatedPredicates ...

2017-04-03 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/17520 Commit https://github.com/apache/spark/pull/17520/commits/4aaab02b6fa384c51aef8484255f7a51097842be has the complete functionality and new test cases.

[GitHub] spark issue #17520: [WIP][SPARK-19712][SQL] Move PullupCorrelatedPredicates ...

2017-04-03 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/17520 Commit https://github.com/apache/spark/pull/17520/commits/bc4fe9326e3c33954d223746ec36fb990fb8d994 is initial work to demonstrate the idea of merging the 2-stage transformation of [NOT] Exists/IN

[GitHub] spark pull request #17520: [WIP][SPARK-19712][SQL] Move PullupCorrelatedPred...

2017-04-03 Thread nsyca
GitHub user nsyca opened a pull request: https://github.com/apache/spark/pull/17520 [WIP][SPARK-19712][SQL] Move PullupCorrelatedPredicates and RewritePredicateSubquery after OptimizeSubqueries ## What changes were proposed in this pull request? This commit moves two rules

[GitHub] spark pull request #17491: [SPARK-20175][SQL] Exists should not be evaluated...

2017-03-31 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17491#discussion_r109222591 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -90,11 +90,12 @@ trait PredicateHelper

[GitHub] spark pull request #17491: [SPARK-20175][SQL] Exists should not be evaluated...

2017-03-31 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17491#discussion_r109198895 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala --- @@ -498,3 +498,31 @@ object RewriteCorrelatedScalarSubquery

[GitHub] spark pull request #17491: [SPARK-20175][SQL] Exists should not be evaluated...

2017-03-31 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17491#discussion_r109192716 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -90,11 +90,12 @@ trait PredicateHelper

[GitHub] spark pull request #17491: [SPARK-20175][SQL] Exists should not be evaluated...

2017-03-31 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17491#discussion_r109183735 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -90,11 +90,12 @@ trait PredicateHelper

[GitHub] spark pull request #17491: [SPARK-20175][SQL] Exists should not be evaluated...

2017-03-31 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17491#discussion_r109181709 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala --- @@ -498,3 +498,31 @@ object RewriteCorrelatedScalarSubquery

[GitHub] spark pull request #17450: [SPARK-20121][SQL] simplify NullPropagation with ...

2017-03-29 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17450#discussion_r108778378 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -1122,7 +1119,7 @@ case class StringSpace

[GitHub] spark pull request #17450: [SPARK-20121][SQL] simplify NullPropagation with ...

2017-03-29 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17450#discussion_r108720888 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -1122,7 +1119,7 @@ case class StringSpace

[GitHub] spark pull request #17450: [SPARK-20121][SQL] simplify NullPropagation with ...

2017-03-29 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17450#discussion_r108720512 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -384,33 +379,13 @@ case class NullPropagation(conf

[GitHub] spark pull request #17450: [SPARK-20121][SQL] simplify NullPropagation with ...

2017-03-28 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17450#discussion_r108519047 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala --- @@ -1122,7 +1119,7 @@ case class StringSpace

[GitHub] spark pull request #17428: [SPARK-20094][SQL] Don't put predicate with IN su...

2017-03-27 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17428#discussion_r108224172 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/JoinOptimizationSuite.scala --- @@ -123,6 +136,31 @@ class

[GitHub] spark pull request #17428: [SPARK-20094][SQL] Don't put predicate with IN su...

2017-03-27 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17428#discussion_r108222730 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/JoinOptimizationSuite.scala --- @@ -123,6 +136,31 @@ class

[GitHub] spark pull request #17428: [SPARK-20094][SQL] Don't put predicate with IN su...

2017-03-27 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17428#discussion_r108218188 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/JoinOptimizationSuite.scala --- @@ -123,6 +136,31 @@ class

[GitHub] spark pull request #17428: [SPARK-20094][SQL] Don't put predicate with IN su...

2017-03-27 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17428#discussion_r108217534 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala --- @@ -289,6 +289,9 @@ abstract class QueryPlan[PlanType

[GitHub] spark pull request #17428: [SPARK-20094][SQL] Don't put predicate with IN su...

2017-03-27 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17428#discussion_r108195342 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -90,6 +90,10 @@ trait PredicateHelper

[GitHub] spark pull request #17428: [SPARK-20094][SQL] Don't put predicate with IN su...

2017-03-27 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17428#discussion_r108196677 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/JoinOptimizationSuite.scala --- @@ -123,6 +136,31 @@ class

[GitHub] spark pull request #17428: [SPARK-20094][SQL] Don't put predicate with IN su...

2017-03-27 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17428#discussion_r108185240 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala --- @@ -289,6 +289,9 @@ abstract class QueryPlan[PlanType

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-18 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/17286 Right. I misread it. If there is no join predicate between a table and any cluster of tables, we should not consider that table in the join enumeration at all. We can simply push that table to be the

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-18 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/17286 @gatorsmile An equality join in most cases has a better filtering than an inequality join. This can be used heuristically. However, this is not always true. An equality join can be a lookup join from

[GitHub] spark pull request #17191: [SPARK-14471][SQL] Aliases in SELECT could be use...

2017-03-17 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17191#discussion_r106621654 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -836,17 +836,29 @@ class Analyzer

[GitHub] spark pull request #17191: [SPARK-14471][SQL] Aliases in SELECT could be use...

2017-03-16 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17191#discussion_r106540691 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -836,17 +836,29 @@ class Analyzer

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-15 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106290042 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -51,6 +51,11 @@ case class

[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

2017-03-15 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17186#discussion_r106169573 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/PruneFiltersSuite.scala --- @@ -133,4 +146,28 @@ class PruneFiltersSuite

[GitHub] spark pull request #17186: [SPARK-19846][SQL] Add a flag to disable constrai...

2017-03-15 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17186#discussion_r106166444 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/PruneFiltersSuite.scala --- @@ -133,4 +146,28 @@ class PruneFiltersSuite

[GitHub] spark issue #17294: [SPARK-18966][SQL] NOT IN subquery with correlated expre...

2017-03-14 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/17294 @hvanhovell It would not be trivial to backport to 2.1/2.0. The correlated predicates in the NOT IN subquery have been pulled up in the Analyzer phase and combined with the columns in the LHS and RHS

[GitHub] spark pull request #17294: [SPARK-18966][SQL] NOT IN subquery with correlate...

2017-03-14 Thread nsyca
GitHub user nsyca opened a pull request: https://github.com/apache/spark/pull/17294 [SPARK-18966][SQL] NOT IN subquery with correlated expressions may return incorrect result ## What changes were proposed in this pull request? This PR fixes the following problem

[GitHub] spark issue #17240: [SPARK-19915][SQL] Improve join reorder: simplify cost e...

2017-03-14 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/17240 @wzhfy wrote: *"usually big tables (fact table) have more columns than small tables, so cardinality and size is positively correlated"* I am aware this may not be a forum

[GitHub] spark pull request #17240: [SPARK-19915][SQL] Improve join reorder: simplify...

2017-03-13 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17240#discussion_r105748347 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -204,63 +206,37 @@ object JoinReorderDP

[GitHub] spark issue #17240: [SPARK-19915][SQL] Improve join reorder: simplify cost e...

2017-03-13 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/17240 In the PR description: *"Usually cardinality is more important than size, we can simplify cost evaluation by using only cardinality. Note that this also enables us to not care about column

[GitHub] spark issue #17191: [SPARK-14471][SQL] Aliases in SELECT could be used in GR...

2017-03-13 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/17191 @maropu: I have made a few comments in the code and test cases. IMO, we should go back to define what we want to fix before jumping into the code. I am sorry if my previous statement gave you

[GitHub] spark pull request #17191: [SPARK-14471][SQL] Aliases in SELECT could be use...

2017-03-13 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17191#discussion_r105659975 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2586,4 +2586,15 @@ class SQLQuerySuite extends QueryTest with

[GitHub] spark pull request #17191: [SPARK-14471][SQL] Aliases in SELECT could be use...

2017-03-13 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17191#discussion_r105679116 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -775,11 +775,21 @@ class Analyzer( case

[GitHub] spark issue #17138: [SPARK-17080] [SQL] join reorder

2017-03-09 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/17138 I'd start with my definition of a cost-based optimizer (CBO). Cost-based optimizer is a process where an optimal execution plan is chosen based on its estimated cost of execution. The estimated

[GitHub] spark issue #17138: [SPARK-17080] [SQL] join reorder

2017-03-08 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/17138 You are right. The plans generated at the n-join level come from joining the plans at the (n-1)-join level as well as the (n-2)-join level and so on. So it should be able to generate {A,B} join {C,D} plan
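
The level-wise enumeration described in the comment above can be sketched as follows. This is a minimal, hypothetical illustration in Python and not Spark's join reorder code: the function name `enumerate_join_plans`, the nested-tuple plan representation, the cost model (sum of intermediate result cardinalities), and the toy cardinality function `toy_card` are all assumptions made for the example. It only shows why combining plans from every lower level, not just level n-1, lets a bushy plan such as {A,B} join {C,D} appear.

```python
from itertools import combinations


def enumerate_join_plans(tables, join_card):
    """Level-wise dynamic programming over sets of tables.

    tables:    iterable of table names (hypothetical inputs).
    join_card: function(frozenset) -> estimated row count of joining that set
               (an assumed, caller-supplied estimate).
    Returns a dict mapping each table set to (cost, plan), where plan is a
    table name or a nested (left, right) tuple.
    """
    tables = sorted(tables)
    # Level 1: a single table costs nothing to "join".
    best = {frozenset([t]): (0, t) for t in tables}
    for level in range(2, len(tables) + 1):
        for subset in combinations(tables, level):
            whole = frozenset(subset)
            # Split into a k-table side and a (level - k)-table side, so
            # plans from every lower level are combined, not only level - 1.
            for k in range(1, level // 2 + 1):
                for left_tables in combinations(subset, k):
                    left = frozenset(left_tables)
                    right = whole - left
                    cost = best[left][0] + best[right][0] + join_card(whole)
                    if whole not in best or cost < best[whole][0]:
                        best[whole] = (cost, (best[left][1], best[right][1]))
    return best


def toy_card(table_set):
    # Made-up estimates: A-B and C-D are selective joins, others are not.
    if table_set in (frozenset("AB"), frozenset("CD")):
        return 10
    if len(table_set) == 4:
        return 5
    return 1000


result = enumerate_join_plans("ABCD", toy_card)
print(result[frozenset("ABCD")])  # (25, (('A', 'B'), ('C', 'D')))
```

With these made-up estimates, every left-deep plan has to pay for a 1000-row three-table intermediate result, so the cheapest plan found is the bushy one.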

[GitHub] spark issue #17138: [SPARK-17080] [SQL] join reorder

2017-03-08 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/17138 @wzhfy: If the lowest cost plan is the join between {A,B} and {C,D}, can this join reorder algorithm produce this plan? I assume Spark can process bushy joins.

[GitHub] spark issue #17191: [SPARK-14471][SQL] Aliases in SELECT could be used in GR...

2017-03-07 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/17191 @maropu Would your code work for the problem in the PR description? sql("create table t(a int) using parquet") sql("select a a1, a1 + 1 as b, count(1) fro

[GitHub] spark pull request #17152: [SPARK-18389][SQL] Disallow cyclic view reference

2017-03-07 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17152#discussion_r104707409 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala --- @@ -358,4 +366,51 @@ object ViewHelper

[GitHub] spark issue #17152: [SPARK-18389][SQL] Disallow cyclic view reference

2017-03-06 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/17152 Going back to @gatorsmile 's question, does this fix cover the scenario below? `sql("create or replace view v1 as select * from v2")` If this is an existing problem a
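
As a rough illustration of the kind of check the PR title refers to (disallowing cyclic view references), the sketch below walks a view dependency map to see whether redefining a view would make it reachable from itself. It is a hypothetical Python example, not the PR's implementation: the function name `creates_cycle`, the dictionary-based dependency map, and the view names are assumptions, and the exact scenario in the truncated comment above is not reproduced here.

```python
def creates_cycle(view_deps, view, new_refs):
    """Return True if defining `view` over `new_refs` makes `view` depend on itself.

    view_deps: dict mapping an existing view name to the names it references
               (hypothetical catalog snapshot).
    view:      name of the view being created or replaced.
    new_refs:  names referenced directly by the new definition.
    """
    stack = list(new_refs)
    seen = set()
    while stack:
        current = stack.pop()
        if current == view:
            return True  # the new definition reaches the view itself
        if current in seen:
            continue
        seen.add(current)
        # Follow references of existing views; base tables have no entry.
        stack.extend(view_deps.get(current, ()))
    return False


# v2 already selects from v1, so replacing v1 with a query over v2 is cyclic.
existing = {"v1": ["t"], "v2": ["v1"]}
print(creates_cycle(existing, "v1", ["v2"]))
```

The check runs against the catalog state before the replacement is applied, which is why the old definition of `v1` plays no role in the traversal.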

[GitHub] spark pull request #17152: [SPARK-18389][SQL] Disallow cyclic view reference

2017-03-06 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17152#discussion_r104462108 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/SQLViewSuite.scala --- @@ -609,12 +609,25 @@ abstract class SQLViewSuite extends QueryTest

[GitHub] spark pull request #17152: [SPARK-18389][SQL] Disallow cyclic view reference

2017-03-06 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/17152#discussion_r104459690 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/SQLViewSuite.scala --- @@ -609,12 +609,25 @@ abstract class SQLViewSuite extends QueryTest

[GitHub] spark pull request #16954: [SPARK-18874][SQL] First phase: Deferring the cor...

2017-02-28 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16954#discussion_r103461455 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -707,13 +709,85 @@ class Analyzer

[GitHub] spark pull request #16954: [SPARK-18874][SQL] First phase: Deferring the cor...

2017-02-16 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16954#discussion_r101594561 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala --- @@ -40,19 +42,179 @@ abstract class PlanExpression[T

[GitHub] spark pull request #16954: [SPARK-18874][SQL] First phase: Deferring the cor...

2017-02-16 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16954#discussion_r101596895 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala --- @@ -83,29 +95,150 @@ object RewritePredicateSubquery

[GitHub] spark pull request #16954: [SPARK-18874][SQL] First phase: Deferring the cor...

2017-02-16 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16954#discussion_r101590145 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala --- @@ -40,19 +42,179 @@ abstract class PlanExpression[T

[GitHub] spark pull request #16954: [SPARK-18874][SQL] First phase: Deferring the cor...

2017-02-16 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16954#discussion_r101590658 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala --- @@ -40,19 +42,179 @@ abstract class PlanExpression[T

[GitHub] spark pull request #16954: [SPARK-18874][SQL] First phase: Deferring the cor...

2017-02-16 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16954#discussion_r101593305 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala --- @@ -40,19 +42,179 @@ abstract class PlanExpression[T

[GitHub] spark issue #16915: [SPARK-18871][SQL][TESTS] New test cases for IN/NOT IN s...

2017-02-15 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/16915 It's larger than the typical test PRs we submitted for the subquery JIRA, but since it's the last test PR, we wanted to avoid an additional round of administrative work.

[GitHub] spark pull request #16915: [SPARK-18871][SQL][TESTS] New test cases for IN/N...

2017-02-15 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16915#discussion_r101375513 --- Diff: sql/core/src/test/resources/sql-tests/results/subquery/in-subquery/not-in-joins.sql.out --- @@ -0,0 +1,229 @@ +-- Automatically generated by

[GitHub] spark pull request #16915: [SPARK-18871][SQL][TESTS] New test cases for IN/N...

2017-02-15 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16915#discussion_r101374593 --- Diff: sql/core/src/test/resources/sql-tests/results/subquery/in-subquery/in-set-operations.sql.out --- @@ -0,0 +1,595 @@ +-- Automatically generated

[GitHub] spark pull request #16915: [SPARK-18871][SQL][TESTS] New test cases for IN/N...

2017-02-15 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16915#discussion_r101375044 --- Diff: sql/core/src/test/resources/sql-tests/results/subquery/in-subquery/in-with-cte.sql.out --- @@ -0,0 +1,364 @@ +-- Automatically generated by

[GitHub] spark pull request #16841: [SPARK-18871][SQL][TESTS] New test cases for IN/N...

2017-02-08 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16841#discussion_r100076749 --- Diff: sql/core/src/test/resources/sql-tests/results/subquery/in-subquery/in-having.sql.out --- @@ -0,0 +1,217 @@ +-- Automatically generated by

[GitHub] spark pull request #16841: [SPARK-18871][SQL][TESTS] New test cases for IN/N...

2017-02-08 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16841#discussion_r100077204 --- Diff: sql/core/src/test/resources/sql-tests/results/subquery/in-subquery/in-joins.sql.out --- @@ -0,0 +1,353 @@ +-- Automatically generated by

[GitHub] spark pull request #16841: [SPARK-18871][SQL][TESTS] New test cases for IN/N...

2017-02-08 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16841#discussion_r100077423 --- Diff: sql/core/src/test/resources/sql-tests/results/subquery/in-subquery/in-multiple-columns.sql.out --- @@ -0,0 +1,178 @@ +-- Automatically

[GitHub] spark issue #16760: [SPARK-18872][SQL][TESTS] New test cases for EXISTS subq...

2017-02-08 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/16760 retest this please

[GitHub] spark pull request #16798: [SPARK-18873][SQL][TEST] New test cases for scala...

2017-02-05 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16798#discussion_r99499561 --- Diff: sql/core/src/test/resources/sql-tests/inputs/subquery/scalar-subquery/scalar-subquery-predicate.sql --- @@ -0,0 +1,255 @@ +-- A test suite for

[GitHub] spark pull request #16802: [SPARK-18872][SQL][TESTS] New test cases for EXIS...

2017-02-04 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16802#discussion_r99466405 --- Diff: sql/core/src/test/resources/sql-tests/results/subquery/exists-subquery/exists-cte.sql.out --- @@ -0,0 +1,200 @@ +-- Automatically generated by

[GitHub] spark pull request #16802: [SPARK-18872][SQL][TESTS] New test cases for EXIS...

2017-02-04 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16802#discussion_r99466460 --- Diff: sql/core/src/test/resources/sql-tests/results/subquery/exists-subquery/exists-joins-and-set-ops.sql.out --- @@ -0,0 +1,330 @@ +-- Automatically

[GitHub] spark issue #16798: [SPARK-18873][SQL][TEST] New test cases for scalar subqu...

2017-02-03 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/16798 @dilipbiswal Could you please cross-check the results from both sources? @gatorsmile, @hvanhovell Could you please review?

[GitHub] spark pull request #16798: [SPARK-18873][SQL][TEST] New test cases for scala...

2017-02-03 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16798#discussion_r99453763 --- Diff: sql/core/src/test/resources/sql-tests/inputs/scalar-subquery.sql --- @@ -1,20 +0,0 @@ -CREATE OR REPLACE TEMPORARY VIEW p AS VALUES (1, 1) AS T

[GitHub] spark issue #16798: [SPARK-18873][SQL][TEST] New test cases for scalar subqu...

2017-02-03 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/16798 Below are a modified version of the test cases to run on DB2 and the result from DB2, as a second source to compare to the result from Spark. [Modified test file to run on DB2](https://github.com

[GitHub] spark pull request #16798: [SPARK-18873][SQL][TEST] New test cases for scala...

2017-02-03 Thread nsyca
GitHub user nsyca opened a pull request: https://github.com/apache/spark/pull/16798 [SPARK-18873][SQL][TEST] New test cases for scalar subquery (part 2 of 2) - scalar subquery in predicate context ## What changes were proposed in this pull request? This PR adds new test cases

[GitHub] spark pull request #16760: [SPARK-18872][SQL][TESTS] New test cases for EXIS...

2017-01-31 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16760#discussion_r98794587 --- Diff: sql/core/src/test/resources/sql-tests/results/subquery/exists-subquery/exists-aggregate.sql.out --- @@ -0,0 +1,183 @@ +-- Automatically

[GitHub] spark pull request #16760: [SPARK-18872][SQL][TESTS] New test cases for EXIS...

2017-01-31 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16760#discussion_r98794624 --- Diff: sql/core/src/test/resources/sql-tests/results/subquery/exists-subquery/exists-having.sql.out --- @@ -0,0 +1,153 @@ +-- Automatically generated

[GitHub] spark pull request #16760: [SPARK-18872][SQL][TESTS] New test cases for EXIS...

2017-01-31 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16760#discussion_r98794661 --- Diff: sql/core/src/test/resources/sql-tests/results/subquery/exists-subquery/exists-orderby-limit.sql.out --- @@ -0,0 +1,222 @@ +-- Automatically

[GitHub] spark issue #16712: [SPARK-18873][SQL][TEST] New test cases for scalar subqu...

2017-01-30 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/16712 These two commands should give you the delta of the changes I made to address your comments. https://github.com/apache/spark/pull/16712/commits/0db0bc3a1896c6187b42e04ac2fd11a67769007c

[GitHub] spark issue #16712: [SPARK-18873][SQL][TEST] New test cases for scalar subqu...

2017-01-30 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/16712 Thank you, @gatorsmile, for your time reviewing this test PR. I will wait for your suggestion on the pattern of the literals in the first columns of the tables if you do need to have them changed

[GitHub] spark pull request #16712: [SPARK-18873][SQL][TEST] New test cases for scala...

2017-01-30 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16712#discussion_r98453251 --- Diff: sql/core/src/test/resources/sql-tests/inputs/subquery/scalar-subquery/scalar-subquery-select.sql --- @@ -0,0 +1,139 @@ +-- A test suite for

[GitHub] spark pull request #16712: [SPARK-18873][SQL][TEST] New test cases for scala...

2017-01-30 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16712#discussion_r98453216 --- Diff: sql/core/src/test/resources/sql-tests/results/subquery/scalar-subquery/scalar-subquery-select.sql.out --- @@ -0,0 +1,198 @@ +-- Automatically

[GitHub] spark issue #16712: [SPARK-18873][SQL][TEST] New test cases for scalar subqu...

2017-01-30 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/16712 The part-2 is for scalar subquery in predicates.

[GitHub] spark pull request #16712: [SPARK-18873][SQL][TEST] New test cases for scala...

2017-01-30 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16712#discussion_r98452724 --- Diff: sql/core/src/test/resources/sql-tests/inputs/subquery/scalar-subquery/scalar-subquery-select.sql --- @@ -0,0 +1,139 @@ +-- A test suite for

[GitHub] spark pull request #16712: [SPARK-18873][SQL][TEST] New test cases for scala...

2017-01-30 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16712#discussion_r98452451 --- Diff: sql/core/src/test/resources/sql-tests/inputs/subquery/scalar-subquery/scalar-subquery-select.sql --- @@ -0,0 +1,139 @@ +-- A test suite for

[GitHub] spark issue #16712: [SPARK-18873][SQL][TEST] New test cases for scalar subqu...

2017-01-26 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/16712 @kevinyu, @gatorsmile. Also FYI to @hvanhovell.

[GitHub] spark issue #16712: [SPARK-18873][SQL][TEST] New test cases for scalar subqu...

2017-01-26 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/16712 Attached are a slightly modified version of the submitted test file to adapt to IBM DB2 syntax, and the result of the run. [Modified version of the test file](https://github.com/apache/spark

[GitHub] spark pull request #16712: [SPARK-18873][SQL][TEST] New test cases for scala...

2017-01-26 Thread nsyca
GitHub user nsyca opened a pull request: https://github.com/apache/spark/pull/16712 [SPARK-18873][SQL][TEST] New test cases for scalar subquery (part 1 of 2) ## What changes were proposed in this pull request? This PR adds new test cases for scalar subquery in SELECT clause

[GitHub] spark pull request #16710: [SPARK-18872][SQL][TESTS] New test cases for EXIS...

2017-01-26 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16710#discussion_r98048551 --- Diff: sql/core/src/test/resources/sql-tests/results/subquery/exists-subquery/exists-within-and-or.sql.out --- @@ -0,0 +1,156 @@ +-- Automatically

[GitHub] spark pull request #16710: [SPARK-18872][SQL][TESTS] New test cases for EXIS...

2017-01-26 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16710#discussion_r98048517 --- Diff: sql/core/src/test/resources/sql-tests/results/subquery/exists-subquery/exists-basic.sql.out --- @@ -0,0 +1,201 @@ +-- Automatically generated

[GitHub] spark issue #16572: [SPARK-18863][SQL] Output non-aggregate expressions with...

2017-01-25 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/16572 @hvanhovell, I agree it does look risky with this approach. There are a lot of dependencies here. I am pitching in the idea to get your initial thought. Let me do some background and I will share

[GitHub] spark issue #16572: [SPARK-18863][SQL] Output non-aggregate expressions with...

2017-01-24 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/16572 Thank you for your time reviewing this PR.

[GitHub] spark pull request #16572: [SPARK-18863][SQL] Output non-aggregate expressio...

2017-01-24 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16572#discussion_r97694229 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala --- @@ -223,7 +228,10 @@ class SQLQueryTestSuite extends QueryTest with

[GitHub] spark pull request #16572: [SPARK-18863][SQL] Output non-aggregate expressio...

2017-01-24 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16572#discussion_r97694210 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala --- @@ -117,66 +117,72 @@ trait CheckAnalysis extends

[GitHub] spark issue #16572: [SPARK-18863][SQL] Output non-aggregate expressions with...

2017-01-24 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/16572 Note the way the plans inside subqueries are not treated as part of the tree traversal is a common problem. Besides this problem, another was reported in SPARK-19093. Also the way Spark needs to

[GitHub] spark issue #16467: [SPARK-19017][SQL] NOT IN subquery with more than one co...

2017-01-16 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/16467 @hvanhovell Would there be anything left that I have not addressed in this PR?
