Michael Ho has uploaded a new change for review. http://gerrit.cloudera.org:8080/5105
Change subject: IMPALA-4432: Handle internal codegen disabling properly
......................................................................

IMPALA-4432: Handle internal codegen disabling properly

There are some conditions in which codegen is disabled internally even
if it's enabled in the query option: for instance, the single node
optimization, or the expression evaluation requests sent from the FE to
the BE. These internal disablings of codegen are advisory, as their
purpose is to reduce the latency for tables with no or very few rows.

The internal disabling of codegen doesn't interact well with UDFs which
cannot be interpreted (e.g. IR UDF) because it is conflated with the
'disable_codegen' query option set by the user. As a result, it's hard
to differentiate between codegen being disabled explicitly by users and
codegen being disabled internally.

This change fixes the problem above by adding an explicit flag in
TQueryCtx to indicate that codegen is disabled internally. This flag is
only advisory. For cases in which codegen is needed to function, this
internal flag is ignored, and if codegen is disabled via the query
option, an error is thrown.

For this new flag to work with ScalarFnCall, codegen needs to happen
after ScalarFnCall::Prepare(), because it's hard to tell if a fragment
contains any UDF that cannot be interpreted until after
ScalarFnCall::Prepare() is called. However, Prepare() needs the codegen
object in order to codegen, so that object needs to be created before
Prepare(). We can either always create the codegen module or defer
codegen to a point after ScalarFnCall::Prepare(). The former has the
downside of introducing unnecessary latency for, say, the single-node
optimization, so the latter is implemented. It is needed as part of
IMPALA-4192 anyway.

After this change, ScalarFnCall expressions which need to be codegen'd
are inserted into a vector in RuntimeState in ScalarFnCall::Prepare().
Later in the codegen phase, these expressions' GetCodegendComputeFn()
will be called after codegen for operators is done. If any of these
expressions are already codegen'd indirectly by the operators,
GetCodegendComputeFn() will be a no-op. This preserves the behavior
that ScalarFnCall will always be codegen'd even if the fragment doesn't
contain any codegen enabled operators.

Change-Id: I0b6a9ed723c64ba21b861608583cc9b6607d3397
---
M be/src/exec/aggregation-node.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exec-node.cc
M be/src/exec/hash-join-node.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/sort-node.cc
M be/src/exec/topn-node.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/scalar-fn-call.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/service/fe-support.cc
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/planner/Planner.java
M tests/query_test/test_udfs.py
23 files changed, 151 insertions(+), 90 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/5105/1
--
To view, visit http://gerrit.cloudera.org:8080/5105
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I0b6a9ed723c64ba21b861608583cc9b6607d3397
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho <k...@cloudera.com>
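
For readers skimming the commit message above, the following is a minimal sketch of the two ideas it describes: an advisory internal flag that is ignored when codegen is required, and ScalarFnCall-style expressions that register themselves during Prepare() and are codegen'd later. It is not the code from this change. All names here (SketchQueryCtx, codegen_disabled_internally, DecideCodegen, SketchExpr, SketchRuntimeState, AddExprToCodegen, CodegenDeferredExprs) are hypothetical stand-ins, and only GetCodegendComputeFn() borrows its name from the message.

```cpp
#include <vector>

// Hypothetical stand-ins; none of these types are taken from the Impala source.
enum class CodegenDecision { kRun, kSkip, kError };

struct SketchQueryCtx {
  bool disable_codegen_option = false;       // set explicitly by the user
  bool codegen_disabled_internally = false;  // advisory hint (assumed field name)
};

// Decide what to do once it is known whether the fragment contains an
// expression that cannot be interpreted (e.g. an IR UDF).
inline CodegenDecision DecideCodegen(const SketchQueryCtx& ctx,
                                     bool codegen_required) {
  if (codegen_required) {
    // The advisory flag is ignored here. Only the user's query option can
    // block codegen, and in that case the request has to fail.
    return ctx.disable_codegen_option ? CodegenDecision::kError
                                      : CodegenDecision::kRun;
  }
  if (ctx.disable_codegen_option || ctx.codegen_disabled_internally) {
    return CodegenDecision::kSkip;
  }
  return CodegenDecision::kRun;
}

struct SketchExpr {
  bool codegend = false;
  // Idempotent: returns true only if this call actually did the work, so a
  // second call (or a call after an operator already codegen'd it) is a no-op.
  bool GetCodegendComputeFn() {
    if (codegend) return false;
    codegend = true;
    return true;
  }
};

struct SketchRuntimeState {
  std::vector<SketchExpr*> exprs_to_codegen;

  // Would be called from the expression's Prepare() step.
  void AddExprToCodegen(SketchExpr* expr) { exprs_to_codegen.push_back(expr); }

  // Would be called after operator codegen is done. Returns how many
  // expressions were codegen'd by this pass.
  int CodegenDeferredExprs() {
    int num_codegend = 0;
    for (SketchExpr* expr : exprs_to_codegen) {
      if (expr->GetCodegendComputeFn()) ++num_codegend;
    }
    return num_codegend;
  }
};
```

In this sketch, a context with only the internal flag set yields kSkip for an ordinary fragment but kRun for one that requires codegen, while the user option combined with a required codegen yields kError. Keeping the deferred pass idempotent is what allows it to run unconditionally after operator codegen.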