Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/17520
@cloud-fan: would you be interested in reviewing this PR, since I have not
heard from @hvanhovell for a while? Note this is a WIP, and I want to hear
your feedback on the issues I raised in the comments.
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/17491
@cloud-fan wrote: "How useful is this optimization? It only works when
Exists has no condition, is that a common case?"
One of the common cases of this usage is an application of
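For background, an EXISTS predicate carrying no extra condition can be
rewritten directly as a left semi-join on the correlated columns. A minimal
Python sketch of that equivalence (table contents and the `k` key column are
hypothetical, not taken from the PR):

```python
# Simulate rewriting:
#   SELECT * FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.k = t1.k)
# as a left semi-join: keep each t1 row whose key appears in t2,
# emitting it at most once regardless of duplicates in t2.

def semi_join(left, right, key):
    # Build the set of join keys present on the right side.
    right_keys = {row[key] for row in right}
    # Left semi-join: keep left rows that have at least one match.
    return [row for row in left if row[key] in right_keys]

t1 = [{"k": 1, "v": "a"}, {"k": 2, "v": "b"}, {"k": 3, "v": "c"}]
t2 = [{"k": 1}, {"k": 3}, {"k": 3}]

print(semi_join(t1, t2, "k"))  # keeps the rows with k = 1 and k = 3
```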
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/17521
Just add another view point to this incident.
If those failing test cases had been written with the time zone pinned to a
specific region rather than the Pacific time zone, we would have failed fast
in the first run.
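Pinning a test's time zone, as suggested above, can be illustrated in plain
Python; a sketch assuming a Unix-like system where `time.tzset` is available
(the helper name is made up for this example):

```python
import os
import time

def run_with_timezone(tz_name, fn):
    # Temporarily pin the process time zone so the test behaves the same
    # on any machine, then restore the original setting afterwards.
    old_tz = os.environ.get("TZ")
    os.environ["TZ"] = tz_name
    time.tzset()
    try:
        return fn()
    finally:
        if old_tz is None:
            del os.environ["TZ"]
        else:
            os.environ["TZ"] = old_tz
        time.tzset()

# Under UTC, the epoch is 00:00 local time regardless of the host's region.
hour = run_with_timezone("UTC", lambda: time.localtime(0).tm_hour)
print(hour)  # 0
```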
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/17521
@dilipbiswal has narrowed the failures down to a behaviour change introduced
by this PR. He will continue to investigate and will post an update in the
next hour or so, before he calls it a day.
---
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/17521
I will investigate. I am searching from the last known-good point where I
merged my private branch with the master trunk, and will go from there.
---
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/17521
After merging my local branch up to this PR, I ran some of the regression
tests from a machine in the Eastern time zone and observed the following
failures:
[info] *** 35 TESTS
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/17520
cc: @hvanhovell
---
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17520#discussion_r109521220
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -848,11 +967,84 @@ object PushDownPredicate extends
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17520#discussion_r109521143
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -823,9 +890,61 @@ object PushDownPredicate extends Rule
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17520#discussion_r109522976
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -474,9 +478,42 @@ case class EliminateOuterJoin(conf
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17520#discussion_r109519783
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -792,6 +824,39 @@ object PushDownPredicate extends Rule
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17520#discussion_r109519728
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -766,6 +771,31 @@ object PushDownPredicate extends Rule
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17520#discussion_r109522412
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -1032,6 +1248,109 @@ object PushPredicateThroughJoin
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17520#discussion_r109521667
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -848,11 +967,84 @@ object PushDownPredicate extends
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17520#discussion_r109521264
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -848,11 +967,84 @@ object PushDownPredicate extends
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17520#discussion_r109522050
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -792,6 +824,39 @@ object PushDownPredicate extends Rule
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/17520
Commit
https://github.com/apache/spark/pull/17520/commits/4aaab02b6fa384c51aef8484255f7a51097842be
has the complete functionality and new test cases.
---
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/17520
Commit
https://github.com/apache/spark/pull/17520/commits/bc4fe9326e3c33954d223746ec36fb990fb8d994
is initial work to demonstrate the idea of merging the 2-stage
transformation of [NOT] Exists/IN
GitHub user nsyca opened a pull request:
https://github.com/apache/spark/pull/17520
[WIP][SPARK-19712][SQL] Move PullupCorrelatedPredicates and
RewritePredicateSubquery after OptimizeSubqueries
## What changes were proposed in this pull request?
This commit moves two rules
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17491#discussion_r109222591
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
---
@@ -90,11 +90,12 @@ trait PredicateHelper
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17491#discussion_r109198895
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala
---
@@ -498,3 +498,31 @@ object RewriteCorrelatedScalarSubquery
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17491#discussion_r109192716
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
---
@@ -90,11 +90,12 @@ trait PredicateHelper
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17491#discussion_r109183735
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
---
@@ -90,11 +90,12 @@ trait PredicateHelper
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17491#discussion_r109181709
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala
---
@@ -498,3 +498,31 @@ object RewriteCorrelatedScalarSubquery
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17450#discussion_r108778378
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
---
@@ -1122,7 +1119,7 @@ case class StringSpace
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17450#discussion_r108720888
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
---
@@ -1122,7 +1119,7 @@ case class StringSpace
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17450#discussion_r108720512
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
---
@@ -384,33 +379,13 @@ case class NullPropagation(conf
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17450#discussion_r108519047
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
---
@@ -1122,7 +1119,7 @@ case class StringSpace
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17428#discussion_r108224172
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/JoinOptimizationSuite.scala
---
@@ -123,6 +136,31 @@ class
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17428#discussion_r108222730
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/JoinOptimizationSuite.scala
---
@@ -123,6 +136,31 @@ class
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17428#discussion_r108218188
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/JoinOptimizationSuite.scala
---
@@ -123,6 +136,31 @@ class
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17428#discussion_r108217534
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala
---
@@ -289,6 +289,9 @@ abstract class QueryPlan[PlanType
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17428#discussion_r108195342
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
---
@@ -90,6 +90,10 @@ trait PredicateHelper
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17428#discussion_r108196677
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/JoinOptimizationSuite.scala
---
@@ -123,6 +136,31 @@ class
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17428#discussion_r108185240
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala
---
@@ -289,6 +289,9 @@ abstract class QueryPlan[PlanType
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/17286
Right. I misread it. If there is no join predicate between a table and any
cluster of tables, we should not consider that table in the join enumeration at
all. We can simply push that table to be the
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/17286
@gatorsmile An equality join in most cases provides better filtering than an
inequality join, so this can be used as a heuristic. However, it is not always
true. An equality join can be a lookup join from
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17191#discussion_r106621654
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -836,17 +836,29 @@ class Analyzer
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17191#discussion_r106540691
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -836,17 +836,29 @@ class Analyzer
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106290042
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -51,6 +51,11 @@ case class
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17186#discussion_r106169573
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/PruneFiltersSuite.scala
---
@@ -133,4 +146,28 @@ class PruneFiltersSuite
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17186#discussion_r106166444
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/PruneFiltersSuite.scala
---
@@ -133,4 +146,28 @@ class PruneFiltersSuite
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/17294
@hvanhovell It would not be trivial to backport to 2.1/2.0. The correlated
predicates in the NOT IN subquery have been pulled up in the Analyzer phase and
combined with the columns in the LHS and RHS
GitHub user nsyca opened a pull request:
https://github.com/apache/spark/pull/17294
[SPARK-18966][SQL] NOT IN subquery with correlated expressions may return
incorrect result
## What changes were proposed in this pull request?
This PR fixes the following problem
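As general background on why NOT IN subqueries are error-prone (a generic
illustration of SQL's three-valued logic, not necessarily the exact bug this
PR fixes): `x NOT IN (subquery)` yields UNKNOWN rather than TRUE whenever the
subquery result contains a NULL, so the outer row is filtered out. A Python
sketch using `None` for SQL NULL:

```python
def sql_not_in(x, values):
    # Three-valued logic for: x NOT IN (values).
    # A NULL on either side can only yield UNKNOWN (None), never TRUE,
    # unless the value list is empty (NOT IN over an empty set is TRUE).
    if x is None and values:
        return None
    saw_null = False
    for v in values:
        if v is None:
            saw_null = True
        elif v == x:
            return False  # definite match -> NOT IN is FALSE
    return None if saw_null else True

# A WHERE clause keeps a row only when the predicate evaluates to TRUE.
print(sql_not_in(1, [2, 3]))     # True  -> row kept
print(sql_not_in(1, [1, 2]))     # False -> row dropped
print(sql_not_in(1, [2, None]))  # None (UNKNOWN) -> row also dropped
```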
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/17240
@wzhfy wrote:
*"usually big tables (fact table) have more columns than small tables, so
cardinality and size is positively correlated"*
I am aware this may not be a forum
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17240#discussion_r105748347
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -204,63 +206,37 @@ object JoinReorderDP
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/17240
In the PR description:
*"Usually cardinality is more important than size, we can simplify cost
evaluation by using only cardinality. Note that this also enables us to not
care about column
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/17191
@maropu: I have made a few comments on the code and test cases. IMO, we
should go back and define what we want to fix before jumping into the code.
I am sorry if my previous statement gave you
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17191#discussion_r105659975
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
---
@@ -2586,4 +2586,15 @@ class SQLQuerySuite extends QueryTest with
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17191#discussion_r105679116
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -775,11 +775,21 @@ class Analyzer(
case
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/17138
I'd start with my definition of a cost-based optimizer (CBO). A cost-based
optimizer is a process in which an optimal execution plan is chosen based on
its estimated cost of execution. The estimated
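That definition reduces to picking the minimum estimated cost among candidate
plans; a toy sketch where the plan names and costs are entirely made up:

```python
# Candidate plans with hypothetical estimated costs (e.g. rows processed).
candidates = {
    "hash_join(A, B)":       1_000,
    "broadcast_join(A, B)":    200,
    "sort_merge_join(A, B)":   900,
}

# A cost-based optimizer picks the plan with the lowest estimated cost;
# the quality of the choice is only as good as the estimates themselves.
chosen = min(candidates, key=candidates.get)
print(chosen)  # broadcast_join(A, B)
```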
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/17138
You are right. The plans generated at the n-join level come from joining the
plans at the (n-1)-join level, the (n-2)-join level, and so on. So it should
be able to generate the {A,B} join {C,D} plan
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/17138
@wzhfy: If the lowest-cost plan is the join between {A,B} and {C,D}, can
this join reorder algorithm produce that plan? I assume Spark can process
bushy joins.
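The point about bushy plans can be made concrete with a tiny enumerator; a
sketch of DP-style join enumeration over table subsets (not Spark's actual
`JoinReorderDP` code), showing that the splits considered for a 4-table set
include the bushy ({A,B}, {C,D}) shape:

```python
from itertools import combinations

def two_way_splits(tables):
    # All unordered 2-way splits a subset-based DP would consider when
    # building a plan for this table set; each split joins the best plans
    # of its two halves, so bushy shapes are reachable.
    items = sorted(tables)
    splits = set()
    for k in range(1, len(items)):
        for left in combinations(items, k):
            right = frozenset(items) - frozenset(left)
            splits.add(frozenset([frozenset(left), right]))
    return splits

splits = two_way_splits(["A", "B", "C", "D"])
bushy = frozenset([frozenset({"A", "B"}), frozenset({"C", "D"})])
print(bushy in splits)  # True: {A,B} x {C,D} is among the enumerated splits
```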
---
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/17191
@maropu Would your code work for the problem in the PR description?
sql("create table t(a int) using parquet")
sql("select a a1, a1 + 1 as b, count(1) fro
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17152#discussion_r104707409
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ---
@@ -358,4 +366,51 @@ object ViewHelper
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/17152
Going back to @gatorsmile's question, does this fix cover the scenario
below?
`sql("create or replace view v1 as select * from v2")`
If this is an existing problem a
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17152#discussion_r104462108
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/SQLViewSuite.scala ---
@@ -609,12 +609,25 @@ abstract class SQLViewSuite extends QueryTest
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/17152#discussion_r104459690
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/SQLViewSuite.scala ---
@@ -609,12 +609,25 @@ abstract class SQLViewSuite extends QueryTest
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16954#discussion_r103461455
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -707,13 +709,85 @@ class Analyzer
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16954#discussion_r101594561
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala
---
@@ -40,19 +42,179 @@ abstract class PlanExpression[T
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16954#discussion_r101596895
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala
---
@@ -83,29 +95,150 @@ object RewritePredicateSubquery
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16954#discussion_r101590145
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala
---
@@ -40,19 +42,179 @@ abstract class PlanExpression[T
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16954#discussion_r101590658
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala
---
@@ -40,19 +42,179 @@ abstract class PlanExpression[T
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16954#discussion_r101593305
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala
---
@@ -40,19 +42,179 @@ abstract class PlanExpression[T
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/16915
It's larger than the typical test PRs we submitted for the subquery JIRA,
but since it's the last test PR, we wanted to avoid an additional round of
administrative work.
---
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16915#discussion_r101375513
--- Diff:
sql/core/src/test/resources/sql-tests/results/subquery/in-subquery/not-in-joins.sql.out
---
@@ -0,0 +1,229 @@
+-- Automatically generated by
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16915#discussion_r101374593
--- Diff:
sql/core/src/test/resources/sql-tests/results/subquery/in-subquery/in-set-operations.sql.out
---
@@ -0,0 +1,595 @@
+-- Automatically generated
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16915#discussion_r101375044
--- Diff:
sql/core/src/test/resources/sql-tests/results/subquery/in-subquery/in-with-cte.sql.out
---
@@ -0,0 +1,364 @@
+-- Automatically generated by
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16841#discussion_r100076749
--- Diff:
sql/core/src/test/resources/sql-tests/results/subquery/in-subquery/in-having.sql.out
---
@@ -0,0 +1,217 @@
+-- Automatically generated by
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16841#discussion_r100077204
--- Diff:
sql/core/src/test/resources/sql-tests/results/subquery/in-subquery/in-joins.sql.out
---
@@ -0,0 +1,353 @@
+-- Automatically generated by
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16841#discussion_r100077423
--- Diff:
sql/core/src/test/resources/sql-tests/results/subquery/in-subquery/in-multiple-columns.sql.out
---
@@ -0,0 +1,178 @@
+-- Automatically
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/16760
retest this please
---
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16798#discussion_r99499561
--- Diff:
sql/core/src/test/resources/sql-tests/inputs/subquery/scalar-subquery/scalar-subquery-predicate.sql
---
@@ -0,0 +1,255 @@
+-- A test suite for
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16802#discussion_r99466405
--- Diff:
sql/core/src/test/resources/sql-tests/results/subquery/exists-subquery/exists-cte.sql.out
---
@@ -0,0 +1,200 @@
+-- Automatically generated by
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16802#discussion_r99466460
--- Diff:
sql/core/src/test/resources/sql-tests/results/subquery/exists-subquery/exists-joins-and-set-ops.sql.out
---
@@ -0,0 +1,330 @@
+-- Automatically
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/16798
@dilipbiswal Could you please cross-check the results from both sources?
@gatorsmile, @hvanhovell Could you please review?
---
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16798#discussion_r99453763
--- Diff: sql/core/src/test/resources/sql-tests/inputs/scalar-subquery.sql
---
@@ -1,20 +0,0 @@
-CREATE OR REPLACE TEMPORARY VIEW p AS VALUES (1, 1) AS T
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/16798
Below is a modified version of the test cases, adapted to run on DB2, along
with the result from DB2, as a second source to compare against the result
from Spark.
[Modified test file to run on
DB2](https://github.com
GitHub user nsyca opened a pull request:
https://github.com/apache/spark/pull/16798
[SPARK-18873][SQL][TEST] New test cases for scalar subquery (part 2 of 2) -
scalar subquery in predicate context
## What changes were proposed in this pull request?
This PR adds new test cases
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16760#discussion_r98794587
--- Diff:
sql/core/src/test/resources/sql-tests/results/subquery/exists-subquery/exists-aggregate.sql.out
---
@@ -0,0 +1,183 @@
+-- Automatically
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16760#discussion_r98794624
--- Diff:
sql/core/src/test/resources/sql-tests/results/subquery/exists-subquery/exists-having.sql.out
---
@@ -0,0 +1,153 @@
+-- Automatically generated
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16760#discussion_r98794661
--- Diff:
sql/core/src/test/resources/sql-tests/results/subquery/exists-subquery/exists-orderby-limit.sql.out
---
@@ -0,0 +1,222 @@
+-- Automatically
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/16712
These two commits should give you the delta of the changes I made to
address your comments.
https://github.com/apache/spark/pull/16712/commits/0db0bc3a1896c6187b42e04ac2fd11a67769007c
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/16712
Thank you, @gatorsmile, for your time reviewing this test PR. I will wait
for your suggestion on the pattern of the literals in the first columns of the
tables, in case you need them changed.
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16712#discussion_r98453251
--- Diff:
sql/core/src/test/resources/sql-tests/inputs/subquery/scalar-subquery/scalar-subquery-select.sql
---
@@ -0,0 +1,139 @@
+-- A test suite for
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16712#discussion_r98453216
--- Diff:
sql/core/src/test/resources/sql-tests/results/subquery/scalar-subquery/scalar-subquery-select.sql.out
---
@@ -0,0 +1,198 @@
+-- Automatically
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/16712
Part 2 is for scalar subqueries in predicates.
---
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16712#discussion_r98452724
--- Diff:
sql/core/src/test/resources/sql-tests/inputs/subquery/scalar-subquery/scalar-subquery-select.sql
---
@@ -0,0 +1,139 @@
+-- A test suite for
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16712#discussion_r98452451
--- Diff:
sql/core/src/test/resources/sql-tests/inputs/subquery/scalar-subquery/scalar-subquery-select.sql
---
@@ -0,0 +1,139 @@
+-- A test suite for
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/16712
@kevinyu, @gatorsmile. Also FYI to @hvanhovell.
---
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/16712
Attached are a slightly modified version of the submitted test file,
adapted to IBM DB2 syntax, and the result of the run.
[Modified version of the test
file](https://github.com/apache/spark
GitHub user nsyca opened a pull request:
https://github.com/apache/spark/pull/16712
[SPARK-18873][SQL][TEST] New test cases for scalar subquery (part 1 of 2)
## What changes were proposed in this pull request?
This PR adds new test cases for scalar subquery in SELECT clause
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16710#discussion_r98048551
--- Diff:
sql/core/src/test/resources/sql-tests/results/subquery/exists-subquery/exists-within-and-or.sql.out
---
@@ -0,0 +1,156 @@
+-- Automatically
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16710#discussion_r98048517
--- Diff:
sql/core/src/test/resources/sql-tests/results/subquery/exists-subquery/exists-basic.sql.out
---
@@ -0,0 +1,201 @@
+-- Automatically generated
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/16572
@hvanhovell, I agree it does look risky with this approach. There are a lot
of dependencies here. I am pitching the idea to get your initial thoughts.
Let me do some background work and I will share
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/16572
Thank you for your time reviewing this PR.
---
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16572#discussion_r97694229
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
@@ -223,7 +228,10 @@ class SQLQueryTestSuite extends QueryTest with
Github user nsyca commented on a diff in the pull request:
https://github.com/apache/spark/pull/16572#discussion_r97694210
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
---
@@ -117,66 +117,72 @@ trait CheckAnalysis extends
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/16572
Note that the way plans inside subqueries are excluded from the tree
traversal is a common problem. Besides this problem, another was reported
in SPARK-19093. Also, the way Spark needs to
Github user nsyca commented on the issue:
https://github.com/apache/spark/pull/16467
@hvanhovell Is there anything left that I have not addressed in this
PR?
---