Dimitris Tsirogiannis has posted comments on this change. ( http://gerrit.cloudera.org:8080/8322 )
Change subject: IMPALA-1422: support a constant on LHS of IN predicates. ...................................................................... Patch Set 4: (20 comments) First pass. I think you may want to add a few planner tests to show the resulting plans. http://gerrit.cloudera.org:8080/#/c/8322/4/fe/src/main/java/org/apache/impala/analysis/SlotRef.java File fe/src/main/java/org/apache/impala/analysis/SlotRef.java: http://gerrit.cloudera.org:8080/#/c/8322/4/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@96 PS4, Line 96: Preconditions.checkState(rawPath_ != null); Just by looking at some of the constructors, this precondition doesn't seem particularly correct. http://gerrit.cloudera.org:8080/#/c/8322/4/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java File fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java: http://gerrit.cloudera.org:8080/#/c/8322/4/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@225 PS4, Line 225: if (conjunct instanceof InPredicate) { I think it's better if you extract the check from L347 and merge it here. Then you can also remove the comment. http://gerrit.cloudera.org:8080/#/c/8322/4/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@288 PS4, Line 288: the resulting expression : * is not analyzed Out of curiosity, why not analyzing the rewritten expression here? http://gerrit.cloudera.org:8080/#/c/8322/4/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@296 PS4, Line 296: 1) For each case, you may want to mention if it works for correlated/uncorrelated subqueries. http://gerrit.cloudera.org:8080/#/c/8322/4/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@302 PS4, Line 302: Example: Case with "limit 1"? http://gerrit.cloudera.org:8080/#/c/8322/4/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@313 PS4, Line 313: NOT EXISTS (RHS AND C = e OR e IS NULL) Not sure I understand what this means. Is C = e OR e is NULL injected into the WHERE clause of RHS or something else? http://gerrit.cloudera.org:8080/#/c/8322/4/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@332 PS4, Line 332: NOT EXISTS (S ? F (RHS) t WHERE C = e or e IS NOT NULL) :) I am not following this notation. Let's talk, maybe I am missing something obvious. Alternatively, simple examples might just work for all these cases. http://gerrit.cloudera.org:8080/#/c/8322/4/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@340 PS4, Line 340: * @param inPred : * @return rewritten expression or null : * @throws AnalysisException Remove, we don't use javadoc :) http://gerrit.cloudera.org:8080/#/c/8322/4/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@349 PS4, Line 349: Expr rhs = inPred.getChild(1); : Preconditions.checkArgument(rhs.contains(Subquery.class)); I believe this check is redundant. Subqueries can only be in the rhs of the in predicate. http://gerrit.cloudera.org:8080/#/c/8322/4/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@353 PS4, Line 353: nit: empty lines (remove) http://gerrit.cloudera.org:8080/#/c/8322/4/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@381 PS4, Line 381: new ArrayList<String>() nit: we use guava's static functions. i.e. Lists.newArrayList(); even if you use the standard java classes, you don't need to specify the type after the class name, it can be inferred by path's type (e.g. ArrayList<String> path = new ArrayList<>();). http://gerrit.cloudera.org:8080/#/c/8322/4/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@408 PS4, Line 408: Subquery newSubquery = new Subquery(rewriteQuery); : newSubquery.analyze(rhsQuery.getAnalyzer()); Hm, rewriteQuery is already analyzed, so I am wondering if the analyze() call will have any affect at all. http://gerrit.cloudera.org:8080/#/c/8322/4/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@414 PS4, Line 414: aggregation "or analytic function". http://gerrit.cloudera.org:8080/#/c/8322/4/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@430 PS4, Line 430: new ArrayList<>(); use guava's static functions http://gerrit.cloudera.org:8080/#/c/8322/4/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@679 PS4, Line 679: remove empty lines http://gerrit.cloudera.org:8080/#/c/8322/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java: http://gerrit.cloudera.org:8080/#/c/8322/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java@341 PS4, Line 341: TODO for 2.3 I know it's not your fault but can you remove the reference to 2.3 from all the TODOs in this file? http://gerrit.cloudera.org:8080/#/c/8322/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java@477 PS4, Line 477: on;y only http://gerrit.cloudera.org:8080/#/c/8322/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java@484 PS4, Line 484: it's only for correlated subqueries, no? http://gerrit.cloudera.org:8080/#/c/8322/4/testdata/workloads/functional-query/queries/QueryTest/subquery.test File testdata/workloads/functional-query/queries/QueryTest/subquery.test: http://gerrit.cloudera.org:8080/#/c/8322/4/testdata/workloads/functional-query/queries/QueryTest/subquery.test@928 PS4, Line 928: SELECT a.id, a.int_col, a.string_col FROM alltypessmall a : WHERE NULL IN (SELECT int_col from alltypessmall where int_col < 0); Add a case (if not there) with: null not in (subquery) where subquery returns some results. http://gerrit.cloudera.org:8080/#/c/8322/4/testdata/workloads/functional-query/queries/QueryTest/subquery.test@962 PS4, Line 962: # LHS not-null, Predicate IN, RHS not-empty with match. Correlated. Expect 4 results, where int_col = 1. nit: fix long lines -- To view, visit http://gerrit.cloudera.org:8080/8322 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0d69889a3c72e90be9d4ccf47d2816819ae32acb Gerrit-Change-Number: 8322 Gerrit-PatchSet: 4 Gerrit-Owner: Vuk Ercegovac <vercego...@cloudera.com> Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com> Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com> Gerrit-Comment-Date: Mon, 23 Oct 2017 21:43:34 +0000 Gerrit-HasComments: Yes