Daniel Becker has uploaded a new patch set (#41). ( http://gerrit.cloudera.org:8080/15105 )
Change subject: IMPALA-5444: Asynchronous code generation
......................................................................

IMPALA-5444: Asynchronous code generation

This commit introduces optional asynchronous code generation.
Asynchronous code generation means that instead of waiting for codegen
to finish, the query starts in interpreted mode while codegen is done on
another thread.

All the function pointers that point to codegen'd functions are changed
to be atomic, wrapped in a CodegenFnPtr. These are initialised to
nullptr, and as long as they are nullptr, the corresponding interpreted
functions are used (as before). When code generation is ready, the
function pointers are set by the codegen thread. No further
synchronisation is needed because the function pointers are atomic and
it is not a problem if, at a given moment, only a subset of the
codegen'd function pointers are set and the rest of the functions still
run interpreted.

Asynchronous code generation can be turned on using the ASYNC_CODEGEN
boolean query option.

Testing:
 - In exhaustive mode, a limited number of end-to-end tests are run in
   async mode, with debug actions randomly delaying the codegen thread
   and the main thread after starting codegen, to test various
   scenarios of relative timing. The number of such tests is kept small
   to avoid increasing the running time of the tests by too much.
 - Added a new end-to-end test,
   tests/query_test/test_async_codegen.py, which tests three relative
   timings:
     1. Async codegen finishes before query execution starts (only
        codegen'd code runs).
     2. Query execution finishes before async codegen finishes (only
        interpreted code runs).
     3. Async codegen finishes during query execution (both interpreted
        and codegen'd code run, switching from interpreted mode to
        codegen).

TODO: The default should be synchronous codegen for now. I only want to
change it back to that at the very end, so that in the meantime tests
run with async codegen turned on.

TODO: Benchmarks.
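For readers who want to picture the fallback mechanism described in the commit message, here is a minimal sketch of an atomic function pointer that is nullptr until compiled code is published. It is not the code from this patch: AsyncFnPtr, AddFn, InterpretedAdd, CodegendAdd and Add are hypothetical names chosen for illustration, and the real CodegenFnPtr in be/src/codegen/codegen-fn-ptr.h may have a different interface and memory ordering.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the CodegenFnPtr wrapper named in the commit
// message. The actual class in the patch may look different.
template <typename FnPtr>
class AsyncFnPtr {
 public:
  // Returns nullptr until a compiled function has been published.
  FnPtr load() const { return ptr_.load(std::memory_order_acquire); }

  // Intended to be called by the codegen thread once compilation is done.
  void store(FnPtr fn) { ptr_.store(fn, std::memory_order_release); }

 private:
  std::atomic<FnPtr> ptr_{nullptr};
};

using AddFn = int (*)(int, int);

// Illustrative "interpreted" and "codegen'd" versions of the same function.
int InterpretedAdd(int a, int b) { return a + b; }
int CodegendAdd(int a, int b) { return a + b; }

// Calls the compiled version if it has been published, otherwise falls back
// to the interpreted one. The pointer is loaded once per call, so a store
// that happens mid-call cannot change which version this call runs.
// used_codegen reports which path was taken (for demonstration only).
int Add(const AsyncFnPtr<AddFn>& fn, int a, int b, bool* used_codegen) {
  AddFn compiled = fn.load();
  *used_codegen = (compiled != nullptr);
  return compiled != nullptr ? compiled(a, b) : InterpretedAdd(a, b);
}
```

In this sketch each call site checks its own pointer independently, which is why a state where some pointers are already set and others are still nullptr is harmless: every call runs either the interpreted or the compiled version of that one function, and both are assumed to compute the same result.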
Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b
---
M be/src/benchmarks/hash-benchmark.cc
A be/src/codegen/codegen-fn-ptr.h
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/grouping-aggregator.cc
M be/src/exec/grouping-aggregator.h
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M be/src/exec/hdfs-columnar-scanner.cc
M be/src/exec/hdfs-columnar-scanner.h
M be/src/exec/hdfs-orc-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/non-grouping-aggregator.cc
M be/src/exec/non-grouping-aggregator.h
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/partitioned-hash-join-builder-ir.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node-ir.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/select-node.cc
M be/src/exec/select-node.h
M be/src/exec/topn-node.cc
M be/src/exec/topn-node.h
M be/src/exec/union-node.cc
M be/src/exec/union-node.h
M be/src/exprs/expr-codegen-test.cc
M be/src/exprs/scalar-expr.cc
M be/src/exprs/scalar-expr.h
M be/src/exprs/scalar-expr.inline.h
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/scalar-fn-call.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-state.cc
M be/src/runtime/fragment-state.h
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/krpc-data-stream-sender.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/util/tuple-row-compare.cc
M be/src/util/tuple-row-compare.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
A tests/query_test/test_async_codegen.py
M tests/query_test/test_queries.py
M tests/query_test/test_query_mem_limit.py
54 files changed, 928 insertions(+), 438 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/41
--
To view, visit http://gerrit.cloudera.org:8080/15105
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b
Gerrit-Change-Number: 15105
Gerrit-PatchSet: 41
Gerrit-Owner: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>