Abhishek Rawat has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13645 )

Change subject: IMPALA-6590: Disable expr rewrites and codegen for VALUES() statements
......................................................................

IMPALA-6590: Disable expr rewrites and codegen for VALUES() statements

Expression rewrites for VALUES() can cause a performance regression,
because a rewrite has virtually no benefit when the expression is only
ever evaluated once. The overhead of rewrites can be huge in some cases,
especially when there are many constant expressions. The regression also
appears to grow non-linearly as the number of columns increases.
Similarly, there is no value in doing codegen for such const
expressions.

rewriteExprs() in the ValuesStmt class is overridden with an empty
function body. As a result, rewrites for VALUES() are a no-op.

Codegen is disabled for const expressions within a UNION node if the
UNION node is not within a subplan. This applies to all UNION nodes with
const expressions, not only to UNION nodes associated with a VALUES
clause.

The decision whether to enable codegen for const expressions in a UNION
is made in the planner when a UnionNode is initialized. A new member
'is_codegen_disabled' was added to the thrift struct TExprNode to
communicate this decision to the backend. The optimizer should make the
decisions it can, so the planner seemed like the right place to
disable/enable codegen. The infrastructure is generic and could be
extended in the future to selectively disable codegen for any given
expression, if needed.

Testing:
- Added a new e2e test case in tests/query_test/test_codegen.py, which
  tests the different scenarios involving UNION with const expressions.
- Ran manual tests to validate that the non-linear regression in the
  VALUES clause with an increasing number of columns is no longer seen.
  Results below.

for i in 256 512 1024 2048 4096 8192 16384 32768; do
  (echo 'VALUES ('; for x in $(seq $i); do echo "cast($x as string),"; done;
   echo "NULL); profile;") |
    time impala-shell.sh -f /dev/stdin |& grep Analysis;
done

Base:
 - Analysis finished: 14.533ms (13.881ms)
 - Analysis finished: 36.736ms (35.478ms)
 - Analysis finished: 112.932ms (108.913ms)
 - Analysis finished: 357.739ms (352.843ms)
 - Analysis finished: 1s242ms (1s234ms)
 - Analysis finished: 5s832ms (5s815ms)
 - Analysis finished: 28s994ms (28s960ms)
 - Analysis finished: 2m28s (2m28s)

Test:
 - Analysis finished: 2.107ms (1.380ms)
 - Analysis finished: 6.176ms (4.887ms)
 - Analysis finished: 20.043ms (17.569ms)
 - Analysis finished: 58.013ms (53.620ms)
 - Analysis finished: 241.455ms (232.775ms)
 - Analysis finished: 1s084ms (1s067ms)
 - Analysis finished: 5s718ms (5s674ms)
 - Analysis finished: 45s177ms (45s107ms)

Change-Id: I229d67b821968321abd8f97f7c89cf2617000d8d
---
M be/src/exec/union-node.cc
M be/src/exec/union-node.h
M be/src/exprs/literal.cc
M be/src/exprs/null-literal.h
M be/src/exprs/scalar-expr.cc
M be/src/exprs/scalar-expr.h
M be/src/exprs/slot-ref.cc
M be/src/runtime/runtime-state.h
M common/thrift/Exprs.thrift
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/ValuesStmt.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
A testdata/workloads/functional-query/queries/QueryTest/union-const-scalar-expr-codegen.test
M tests/query_test/test_codegen.py
14 files changed, 176 insertions(+), 39 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/13645/1
--
To view, visit http://gerrit.cloudera.org:8080/13645
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I229d67b821968321abd8f97f7c89cf2617000d8d
Gerrit-Change-Number: 13645
Gerrit-PatchSet: 1
Gerrit-Owner: Abhishek Rawat <ara...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
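
To make the Base and Test "Analysis finished" timings above easier to compare, here is a minimal Python sketch that converts the duration strings to milliseconds and computes the per-run speedup. It is not part of the change. The names parse_ms, speedups, COLUMNS, BASE and TEST are hypothetical, the set of duration units is an assumption based only on the values shown above, and the sketch uses the first figure on each line (not the parenthesized one).

```python
import re

# Milliseconds per unit. Assumption: only these units occur; the timings
# quoted in this message use m, s and ms.
_UNIT_MS = {"h": 3_600_000.0, "m": 60_000.0, "s": 1_000.0, "ms": 1.0}

# "ms" is listed before "m" so that "242ms" is not read as 242 minutes.
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")


def parse_ms(text):
    """Convert a duration such as '14.533ms', '1s242ms' or '2m28s' to ms."""
    tokens = _TOKEN.findall(text)
    # Reject input that the tokens do not cover completely.
    if not tokens or "".join(value + unit for value, unit in tokens) != text:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(value) * _UNIT_MS[unit] for value, unit in tokens)


# Number of cast() columns per run, copied from the shell loop above.
COLUMNS = [256, 512, 1024, 2048, 4096, 8192, 16384, 32768]

# First figure of each "Analysis finished" line, copied from the results.
BASE = ["14.533ms", "36.736ms", "112.932ms", "357.739ms",
        "1s242ms", "5s832ms", "28s994ms", "2m28s"]
TEST = ["2.107ms", "6.176ms", "20.043ms", "58.013ms",
        "241.455ms", "1s084ms", "5s718ms", "45s177ms"]


def speedups(base, test):
    """Return base/test time ratios, one per run."""
    return [parse_ms(b) / parse_ms(t) for b, t in zip(base, test)]


if __name__ == "__main__":
    for n, b, t, ratio in zip(COLUMNS, BASE, TEST, speedups(BASE, TEST)):
        print(f"{n:>6} columns: base {parse_ms(b):>10.3f} ms, "
              f"test {parse_ms(t):>10.3f} ms, speedup {ratio:.2f}x")
```

Under these assumptions the ratios fall between roughly 3x and 7x, with the smallest ratio at 32768 columns.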