[ https://issues.apache.org/jira/browse/IMPALA-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17066124#comment-17066124 ]
ASF subversion and git services commented on IMPALA-9539:
---------------------------------------------------------

Commit 1411ca6a00d408956bd63d20995d13c3e6ded1b1 in impala's branch refs/heads/master from Aman Sinha
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=1411ca6 ]

IMPALA-9183: Convert disjunctive predicates to conjunctive normal form

Added an expression rewrite rule to convert a disjunctive predicate to
conjunctive normal form (CNF). Converting to CNF enables multi-table
predicates that were only evaluated by a Join operator to be converted
into either single-table conjuncts that are eligible for predicate
pushdown to the scan operator or other multi-table conjuncts that are
eligible to be pushed to a Join below. This helps improve performance
for such queries.

Since converting to CNF expands the number of expressions, we place a
limit on the maximum number of CNF exprs (each AND is counted as 1 CNF
expr) that are considered. Once the MAX_CNF_EXPRS limit (default is
unlimited) is exceeded, whatever expression was supplied to the rule
is returned without further transformation. A setting of -1 or 0 allows
an unlimited number of CNF exprs to be created, up to int32 max.

Another option, ENABLE_CNF_REWRITES, enables or disables the entire
rewrite. This is False by default until we have done more thorough
functional testing (tracking JIRA IMPALA-9539).

Examples of rewrites:
  original: (a AND b) OR c
  rewritten: (a OR c) AND (b OR c)

  original: (a AND b) OR (c AND d)
  rewritten: (a OR c) AND (a OR d) AND (b OR c) AND (b OR d)

  original: NOT(a OR b)
  rewritten: NOT(a) AND NOT(b)

Testing:
 - Added new unit tests with variations of disjunctive predicates
   and verified their Explain plans
 - Manually tested the result correctness on impala shell by running
   these queries with ENABLE_CNF_REWRITES enabled and disabled
 - Added TPC-H q7, q19 and TPC-DS q13 with the CNF rewrite enabled
 - Preliminary performance testing of TPC-DS q13 on a 10TB scale factor
   shows almost 5x improvement:
     Original baseline: 47.5 sec
     With this patch and CNF rewrite enabled: 9.4 sec

Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072
Reviewed-on: http://gerrit.cloudera.org:8080/15462
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>

> Enable the conjunctive normal form rewrites by default
> ------------------------------------------------------
>
>                 Key: IMPALA-9539
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9539
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>            Reporter: Aman Sinha
>            Assignee: Aman Sinha
>            Priority: Major
>
> IMPALA-9183 adds the functionality to convert disjunctive predicates into
> conjunctive normal form (CNF). Currently, the rewrite is disabled by default
> and can be enabled by setting enable_cnf_rewrites to true. Since it can
> cause several plan changes, we should ensure it does not cause regressions
> (both in terms of result correctness and performance) and then enable it by
> default. This JIRA is a follow-up to IMPALA-9183.
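
For readers who want to see the distribution step from the commit message in action, here is a minimal Python sketch. It is not Impala's rewrite rule (that lives in the Java frontend); the tuple representation, the names to_cnf, push_not and show, and the way the limit is counted (one unit per AND node created while distributing OR over AND) are all assumptions made for illustration. Removing double negations is also an assumption of this sketch. The max_cnf_exprs parameter only mirrors the described behavior of MAX_CNF_EXPRS: a value of 0 or less means no limit, and exceeding the limit returns the expression that was supplied.

```python
# Illustrative sketch only; not Impala's implementation.
# Expressions are atoms (str) or tuples: ("and", l, r), ("or", l, r), ("not", x).


class _LimitExceeded(Exception):
    """Raised internally when the rewrite would create too many AND nodes."""


def push_not(e, negate=False):
    """Push NOT down to the atoms using De Morgan's laws."""
    if isinstance(e, str):
        return ("not", e) if negate else e
    op = e[0]
    if op == "not":
        return push_not(e[1], not negate)
    left, right = push_not(e[1], negate), push_not(e[2], negate)
    if negate:  # NOT(a OR b) -> NOT(a) AND NOT(b), and vice versa
        op = "or" if op == "and" else "and"
    return (op, left, right)


def to_cnf(expr, max_cnf_exprs=0):
    """Return expr in CNF, or expr unchanged if the (assumed) limit is hit."""
    created = 0

    def spend():
        nonlocal created
        created += 1
        if max_cnf_exprs > 0 and created > max_cnf_exprs:
            raise _LimitExceeded

    def is_and(e):
        return isinstance(e, tuple) and e[0] == "and"

    def distribute(left, right):
        # Build CNF for (left OR right), where both sides are already in CNF.
        if is_and(left):
            spend()
            return ("and", distribute(left[1], right), distribute(left[2], right))
        if is_and(right):
            spend()
            return ("and", distribute(left, right[1]), distribute(left, right[2]))
        return ("or", left, right)

    def convert(e):
        if isinstance(e, str) or e[0] == "not":
            return e  # literal
        left, right = convert(e[1]), convert(e[2])
        return ("and", left, right) if e[0] == "and" else distribute(left, right)

    try:
        return convert(push_not(expr))
    except _LimitExceeded:
        return expr  # give back whatever was supplied, untransformed


def show(e):
    """Render an expression in the style used by the examples above."""
    if isinstance(e, str):
        return e
    op = e[0]
    if op == "not":
        return f"NOT({show(e[1])})"
    inner = "and" if op == "or" else "or"
    parts = [
        f"({show(x)})" if isinstance(x, tuple) and x[0] == inner else show(x)
        for x in e[1:]
    ]
    return f" {op.upper()} ".join(parts)


print(show(to_cnf(("or", ("and", "a", "b"), "c"))))
print(show(to_cnf(("or", ("and", "a", "b"), ("and", "c", "d")))))
print(show(to_cnf(("not", ("or", "a", "b")))))
```

Run as written, the three print calls reproduce the three "rewritten" forms listed under "Examples of rewrites". Calling to_cnf on the second example with max_cnf_exprs=2 returns the input unchanged under this sketch's counting, because distributing it creates three AND nodes.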