[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 31: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5778/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 31 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 10 Apr 2020 00:14:58 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 30: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5777/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 30 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 09 Apr 2020 23:35:34 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 31: (2 comments) http://gerrit.cloudera.org:8080/#/c/15105/31/tests/query_test/test_queries.py File tests/query_test/test_queries.py: http://gerrit.cloudera.org:8080/#/c/15105/31/tests/query_test/test_queries.py@77 PS31, Line 77: flake8: W291 trailing whitespace http://gerrit.cloudera.org:8080/#/c/15105/31/tests/query_test/test_queries.py@77 PS31, Line 77: # enabled and we always run debug actions. We also filter out line has trailing whitespace -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 31 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 09 Apr 2020 23:32:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#31). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/partitioned-hash-join-builder-ir.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node-ir.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-state.cc M be/src/runtime/fragment-state.h M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift A tests/query_test/test_async_codegen.py M tests/query_test/test_queries.py M tests/query_test/test_query_mem_limit.py 54 files changed, 927 insertions(+), 432 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/31 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 31 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 30: (2 comments) http://gerrit.cloudera.org:8080/#/c/15105/30/tests/query_test/test_queries.py File tests/query_test/test_queries.py: http://gerrit.cloudera.org:8080/#/c/15105/30/tests/query_test/test_queries.py@79 PS30, Line 79: flake8: W291 trailing whitespace http://gerrit.cloudera.org:8080/#/c/15105/30/tests/query_test/test_queries.py@79 PS30, Line 79: # enabled and we always run debug actions. We also filter out line has trailing whitespace -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 30 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 09 Apr 2020 22:54:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#30). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/partitioned-hash-join-builder-ir.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node-ir.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-state.cc M be/src/runtime/fragment-state.h M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift A tests/query_test/test_async_codegen.py M tests/query_test/test_queries.py M tests/query_test/test_query_mem_limit.py 54 files changed, 930 insertions(+), 432 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/30 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 30 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 29: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5755/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 29 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 08 Apr 2020 13:52:54 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 28: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5754/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 28 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 08 Apr 2020 13:46:06 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#29). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/partitioned-hash-join-builder-ir.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node-ir.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-state.cc M be/src/runtime/fragment-state.h M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift A tests/query_test/test_async_codegen.py M tests/query_test/test_queries.py M tests/query_test/test_query_mem_limit.py 54 files changed, 922 insertions(+), 432 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/29 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 29 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#28). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/partitioned-hash-join-builder-ir.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node-ir.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-state.cc M be/src/runtime/fragment-state.h M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift A tests/query_test/test_async_codegen.py M tests/query_test/test_queries.py M tests/query_test/test_query_mem_limit.py 54 files changed, 894 insertions(+), 432 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/28 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 28 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 26: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5745/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 26 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 08 Apr 2020 03:18:29 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 25: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5744/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 25 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 08 Apr 2020 03:07:24 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#26). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/partitioned-hash-join-builder-ir.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node-ir.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-state.cc M be/src/runtime/fragment-state.h M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift A tests/query_test/test_async_codegen.py M tests/query_test/test_queries.py M tests/query_test/test_query_mem_limit.py 54 files changed, 911 insertions(+), 432 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/26 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 26 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 26: (21 comments) http://gerrit.cloudera.org:8080/#/c/15105/26/tests/query_test/test_async_codegen.py File tests/query_test/test_async_codegen.py: http://gerrit.cloudera.org:8080/#/c/15105/26/tests/query_test/test_async_codegen.py@19 PS26, Line 19: from copy import copy flake8: F401 'copy.copy' imported but unused http://gerrit.cloudera.org:8080/#/c/15105/26/tests/query_test/test_async_codegen.py@21 PS26, Line 21: from tests.beeswax.impala_beeswax import ImpalaBeeswaxException flake8: F401 'tests.beeswax.impala_beeswax.ImpalaBeeswaxException' imported but unused http://gerrit.cloudera.org:8080/#/c/15105/26/tests/query_test/test_async_codegen.py@22 PS26, Line 22: from tests.common.test_dimensions import ( flake8: F401 'tests.common.test_dimensions.create_parquet_dimension' imported but unused http://gerrit.cloudera.org:8080/#/c/15105/26/tests/query_test/test_async_codegen.py@22 PS26, Line 22: from tests.common.test_dimensions import ( flake8: F401 'tests.common.test_dimensions.create_avro_snappy_dimension' imported but unused http://gerrit.cloudera.org:8080/#/c/15105/26/tests/query_test/test_async_codegen.py@26 PS26, Line 26: from tests.common.impala_cluster import ImpalaCluster flake8: F401 'tests.common.impala_cluster.ImpalaCluster' imported but unused http://gerrit.cloudera.org:8080/#/c/15105/26/tests/query_test/test_async_codegen.py@28 PS26, Line 28: from tests.common.skip import SkipIfNotHdfsMinicluster flake8: F401 'tests.common.skip.SkipIfNotHdfsMinicluster' imported but unused http://gerrit.cloudera.org:8080/#/c/15105/26/tests/query_test/test_async_codegen.py@29 PS26, Line 29: from tests.common.test_dimensions import ( flake8: F401 'tests.common.test_dimensions.extend_exec_option_dimension' imported but unused http://gerrit.cloudera.org:8080/#/c/15105/26/tests/query_test/test_async_codegen.py@32 PS26, Line 32: from tests.common.test_vector import ImpalaTestDimension flake8: F401 'tests.common.test_vector.ImpalaTestDimension' imported but unused http://gerrit.cloudera.org:8080/#/c/15105/26/tests/query_test/test_async_codegen.py@33 PS26, Line 33: from tests.verifiers.metric_verifier import MetricVerifier flake8: F401 'tests.verifiers.metric_verifier.MetricVerifier' imported but unused http://gerrit.cloudera.org:8080/#/c/15105/26/tests/query_test/test_async_codegen.py@36 PS26, Line 36: from RuntimeProfile.ttypes import TRuntimeProfileFormat flake8: F401 'RuntimeProfile.ttypes.TRuntimeProfileFormat' imported but unused http://gerrit.cloudera.org:8080/#/c/15105/26/tests/query_test/test_async_codegen.py@38 PS26, Line 38: class TestAsyncCodegen(ImpalaTestSuite): flake8: E302 expected 2 blank lines, found 1 http://gerrit.cloudera.org:8080/#/c/15105/26/tests/query_test/test_async_codegen.py@79 PS26, Line 79: ) flake8: E123 closing bracket does not match indentation of opening bracket's line http://gerrit.cloudera.org:8080/#/c/15105/26/tests/query_test/test_async_codegen.py@87 PS26, Line 87: = flake8: E712 comparison to False should be 'if cond is False:' or 'if not cond:' http://gerrit.cloudera.org:8080/#/c/15105/26/tests/query_test/test_async_codegen.py@93 PS26, Line 93: ) flake8: E123 closing bracket does not match indentation of opening bracket's line http://gerrit.cloudera.org:8080/#/c/15105/26/tests/query_test/test_async_codegen.py@106 PS26, Line 106: flake8: E261 at least two spaces before inline comment http://gerrit.cloudera.org:8080/#/c/15105/26/tests/query_test/test_async_codegen.py@106 PS26, Line 106: n flake8: E501 line too long (93 > 90 characters) http://gerrit.cloudera.org:8080/#/c/15105/26/tests/query_test/test_async_codegen.py@108 PS26, Line 108: c flake8: F841 local variable 'codegen_start' is assigned to but never used http://gerrit.cloudera.org:8080/#/c/15105/26/tests/query_test/test_async_codegen.py@120 PS26, Line 120: a flake8: F631 assertion is always true, perhaps remove parentheses? http://gerrit.cloudera.org:8080/#/c/15105/26/tests/query_test/test_async_codegen.py@155 PS26, Line 155: flake8: E261 at least two spaces before inline comment http://gerrit.cloudera.org:8080/#/c/15105/26/tests/query_test/test_async_codegen.py@157 PS26, Line 157: flake8: E203 whitespace before ':' http://gerrit.cloudera.org:8080/#/c/15105/26/tests/query_test/test_async_codegen.py@158 PS26, Line 158: flake8: W391 blank line at end of file -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 26 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Rin
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#25). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/partitioned-hash-join-builder-ir.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node-ir.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-state.cc M be/src/runtime/fragment-state.h M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M tests/query_test/test_queries.py M tests/query_test/test_query_mem_limit.py 53 files changed, 745 insertions(+), 429 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/25 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 25 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 24: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5699/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 24 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 03 Apr 2020 18:25:25 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 24: (3 comments) http://gerrit.cloudera.org:8080/#/c/15105/24/be/src/exec/partitioned-hash-join-builder.h File be/src/exec/partitioned-hash-join-builder.h: http://gerrit.cloudera.org:8080/#/c/15105/24/be/src/exec/partitioned-hash-join-builder.h@674 PS24, Line 674: std::deque>* output_partitions, RowBatch* batch); line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/15105/24/be/src/exec/partitioned-hash-join-builder.h@679 PS24, Line 679: std::deque>* output_partitions, RowBatch* batch); line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/15105/24/be/src/exec/partitioned-hash-join-builder.h@860 PS24, Line 860: const CodegenFnPtr& process_build_batch_fn_level0_; line too long (92 > 90) -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 24 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 03 Apr 2020 17:43:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#24). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/partitioned-hash-join-builder-ir.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node-ir.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-state.cc M be/src/runtime/fragment-state.h M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M tests/query_test/test_queries.py M tests/query_test/test_query_mem_limit.py 53 files changed, 742 insertions(+), 429 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/24 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 24 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 23: Rebase and conflict resolution. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 23 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 01 Apr 2020 16:55:14 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 23: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5674/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 23 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 01 Apr 2020 00:26:14 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#23). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/partitioned-hash-join-builder-ir.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node-ir.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M tests/query_test/test_queries.py M tests/query_test/test_query_mem_limit.py 52 files changed, 734 insertions(+), 423 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/23 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 23 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 23: (3 comments) http://gerrit.cloudera.org:8080/#/c/15105/23/be/src/exec/partitioned-hash-join-builder.h File be/src/exec/partitioned-hash-join-builder.h: http://gerrit.cloudera.org:8080/#/c/15105/23/be/src/exec/partitioned-hash-join-builder.h@680 PS23, Line 680: std::deque>* output_partitions, RowBatch* batch); line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/15105/23/be/src/exec/partitioned-hash-join-builder.h@685 PS23, Line 685: std::deque>* output_partitions, RowBatch* batch); line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/15105/23/be/src/exec/partitioned-hash-join-builder.h@866 PS23, Line 866: const CodegenFnPtr& process_build_batch_fn_level0_; line too long (92 > 90) -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 23 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 31 Mar 2020 23:44:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 22: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5456/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 22 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 09 Mar 2020 15:30:26 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#22). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/partitioned-hash-join-builder-ir.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node-ir.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M tests/query_test/test_queries.py M tests/query_test/test_query_mem_limit.py 52 files changed, 725 insertions(+), 414 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/22 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 22 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 21: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5442/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 21 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 06 Mar 2020 14:31:53 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 21: When unnesting PhjBuilder::Partition and thereby also PhjBuilder::Partition::InsertBatch (the function, not the typedef) I should have also changed be/src/codegen/gen_ir_descriptions.py. I'm doing that now together with rebasing and conflict resolution. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 21 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 06 Mar 2020 13:48:21 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#21). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/partitioned-hash-join-builder-ir.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node-ir.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M tests/query_test/test_queries.py M tests/query_test/test_query_mem_limit.py 52 files changed, 725 insertions(+), 414 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/21 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 21 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 20: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5441/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 20 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 06 Mar 2020 11:34:12 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 19: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5440/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 19 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 06 Mar 2020 11:14:54 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#20). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/partitioned-hash-join-builder-ir.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node-ir.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M tests/query_test/test_queries.py M tests/query_test/test_query_mem_limit.py 51 files changed, 723 insertions(+), 410 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/20 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 20 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 19: (2 comments) http://gerrit.cloudera.org:8080/#/c/15105/19/be/src/exec/partitioned-hash-join-builder.cc File be/src/exec/partitioned-hash-join-builder.cc: http://gerrit.cloudera.org:8080/#/c/15105/19/be/src/exec/partitioned-hash-join-builder.cc@659 PS19, Line 659: int PhjBuilder::GetNumSpilledPartitions(const vector>& partitions) { line too long (100 > 90) http://gerrit.cloudera.org:8080/#/c/15105/19/be/src/exec/partitioned-hash-join-builder.cc@979 PS19, Line 979: PhjBuilderPartition::PhjBuilderPartition(RuntimeState* state, PhjBuilder* parent, int level) line too long (92 > 90) -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 19 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 06 Mar 2020 10:31:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#19). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/partitioned-hash-join-builder-ir.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node-ir.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M tests/query_test/test_queries.py M tests/query_test/test_query_mem_limit.py 50 files changed, 720 insertions(+), 408 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/19 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 19 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 18: Thanks Tim. I still can't unnest the typedef of InsertBatchFn before the definition of PhjBuilder::Partition because it takes it as a parameter. I think I'll unnest PhjBuilder::Partition to a top level PhjBuilderPartition class. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 18 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 06 Mar 2020 09:32:55 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 18: I would suggest just unnesting the typedef and putting it at the top of the file. Nested definitions cause a lot of problems like this, I think they're mostly not a good idea *unless* they're only used internally within the class that they're defined in. I originally used nested definitions fairly liberally myself but I changed my mind on this after trying to reorganise DiskIoMgr headers to reduce the number of header dependencies - it was impossible without unnesting all of the classes. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 18 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 05 Mar 2020 15:58:55 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 18: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5430/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 18 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 05 Mar 2020 11:54:43 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 17: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5429/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 17 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 05 Mar 2020 11:43:35 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#18). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M tests/query_test/test_queries.py M tests/query_test/test_query_mem_limit.py 48 files changed, 653 insertions(+), 349 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/18 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 18 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 18: (4 comments) http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/codegen/codegen-fn-ptr.h File be/src/codegen/codegen-fn-ptr.h: http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/codegen/codegen-fn-ptr.h@38 PS16, Line 38: stexpr std::memory_order mem_order_load_ = std::memory_order_acquire; : static constexpr std::memory_order mem_order_store_ = std::memory_order_release; > nit: I think you can delete this part Done http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/exec/hdfs-scanner.h File be/src/exec/hdfs-scanner.h: http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/exec/hdfs-scanner.h@465 PS16, Line 465: i > nit: we usually use /// for doc comments. Done http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/runtime/fragment-instance-state.cc File be/src/runtime/fragment-instance-state.cc: http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/runtime/fragment-instance-state.cc@358 PS16, Line 358: > nit: too many / Done http://gerrit.cloudera.org:8080/#/c/15105/17/tests/query_test/test_query_mem_limit.py File tests/query_test/test_query_mem_limit.py: http://gerrit.cloudera.org:8080/#/c/15105/17/tests/query_test/test_query_mem_limit.py@130 PS17, Line 130: > flake8: E501 line too long (92 > 90 characters) Done -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 18 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 05 Mar 2020 11:08:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 17: (1 comment) http://gerrit.cloudera.org:8080/#/c/15105/17/tests/query_test/test_query_mem_limit.py File tests/query_test/test_query_mem_limit.py: http://gerrit.cloudera.org:8080/#/c/15105/17/tests/query_test/test_query_mem_limit.py@130 PS17, Line 130: h flake8: E501 line too long (92 > 90 characters) -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 17 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 05 Mar 2020 10:59:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#17). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M tests/query_test/test_queries.py M tests/query_test/test_query_mem_limit.py 48 files changed, 655 insertions(+), 349 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/17 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 17 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 16: (3 comments) The code LGTM, only found a few nits. http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/codegen/codegen-fn-ptr.h File be/src/codegen/codegen-fn-ptr.h: http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/codegen/codegen-fn-ptr.h@38 PS16, Line 38: Decide memory order. I think mem_order_relaxed should be enough as we : /// only need atomicity and the pointers can be set independently of each other. nit: I think you can delete this part http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/exec/hdfs-scanner.h File be/src/exec/hdfs-scanner.h: http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/exec/hdfs-scanner.h@465 PS16, Line 465: / nit: we usually use /// for doc comments. http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/runtime/fragment-instance-state.cc File be/src/runtime/fragment-instance-state.cc: http://gerrit.cloudera.org:8080/#/c/15105/16/be/src/runtime/fragment-instance-state.cc@358 PS16, Line 358: / nit: too many / -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 16 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 27 Feb 2020 14:02:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 16: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5356/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 16 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 27 Feb 2020 09:54:02 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 16: Rebased and conflicts resolved. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 16 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 27 Feb 2020 09:08:16 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#16). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M tests/query_test/test_queries.py 46 files changed, 521 insertions(+), 229 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/16 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 16 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 15: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5349/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 15 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 26 Feb 2020 18:39:33 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#15). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M tests/query_test/test_queries.py 46 files changed, 517 insertions(+), 228 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/15 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 15 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 14: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5327/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 14 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 25 Feb 2020 13:06:19 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#14). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M tests/query_test/test_queries.py 46 files changed, 517 insertions(+), 228 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/14 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 14 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 13: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5285/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 13 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 20 Feb 2020 10:00:51 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#13). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M tests/query_test/test_queries.py 46 files changed, 517 insertions(+), 228 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/13 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 13 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 12: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5224/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 12 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 14 Feb 2020 17:43:03 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#12). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M tests/query_test/test_queries.py 46 files changed, 517 insertions(+), 228 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/12 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 12 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5223/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 11 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 14 Feb 2020 16:44:21 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 9: (2 comments) http://gerrit.cloudera.org:8080/#/c/15105/9/be/src/codegen/codegen-fn-ptr.h File be/src/codegen/codegen-fn-ptr.h: http://gerrit.cloudera.org:8080/#/c/15105/9/be/src/codegen/codegen-fn-ptr.h@41 PS9, Line 41: static constexpr std::memory_order mem_order_ = std::memory_order_acq_rel; > Also I think I can't use memory_order_acq_rel, I need to use acquire for th Yeah, I just checked it, seems like memory_order_acq_rel is only for read-modify-write operations. http://gerrit.cloudera.org:8080/#/c/15105/9/be/src/codegen/codegen-fn-ptr.h@41 PS9, Line 41: static constexpr std::memory_order mem_order_ = std::memory_order_acq_rel; > I don't quite understand the beginning: "From the perspective of the C++ me To have synchronizes-with you need two atomic operations (e.g. store and load, mutex unlock-lock) with proper memory orderings, e.g. rel/acq, or seq_cst on both sides. Synchronizes-with gives you "happens before" semantics between the operations. "Happens before" is a transitive relationship: A B and B C ==> A C. That's why the following works: int A=0; === T1 T2 === A=1 B.store(42, release) if (B.load(acquire) == 42) { // see 42, the load synchronizes with the store cout << A; } T2 can only print 1. But if you do a store(42, release) in T1 and in T2 you do a load(relaxed) even if the load sees the value 42 there will be no "synchronizes with" relationship between the two operations. And if you don't have synchronizes-with then you don't have "happens before" either. E.g.: int A=0; === T1 T2 === A=1 B.store(42, release) if (B.load(relaxed) == 42) { // see 42, but there is no synchronizes-with cout << A; } T2 is allowed to print 0 or 1. On top of that, based on the C++ memory model even dependent loads are not guaranteed to work. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 9 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 14 Feb 2020 16:32:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 11: (2 comments) http://gerrit.cloudera.org:8080/#/c/15105/11/tests/query_test/test_queries.py File tests/query_test/test_queries.py: http://gerrit.cloudera.org:8080/#/c/15105/11/tests/query_test/test_queries.py@30 PS11, Line 30: from tests.common.test_vector import ImpalaTestDimension, ImpalaTestVector flake8: F401 'tests.common.test_vector.ImpalaTestVector' imported but unused http://gerrit.cloudera.org:8080/#/c/15105/11/tests/query_test/test_queries.py@30 PS11, Line 30: from tests.common.test_vector import ImpalaTestDimension, ImpalaTestVector flake8: F401 'tests.common.test_vector.ImpalaTestDimension' imported but unused -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 11 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 14 Feb 2020 15:58:59 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#11). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M tests/query_test/test_queries.py 46 files changed, 513 insertions(+), 228 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/11 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 11 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/15105/9/be/src/codegen/codegen-fn-ptr.h File be/src/codegen/codegen-fn-ptr.h: http://gerrit.cloudera.org:8080/#/c/15105/9/be/src/codegen/codegen-fn-ptr.h@41 PS9, Line 41: static constexpr std::memory_order mem_order_ = std::memory_order_acq_rel; > I don't quite understand the beginning: "From the perspective of the C++ me Also I think I can't use memory_order_acq_rel, I need to use acquire for the load and release for the store... -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 10 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 14 Feb 2020 15:41:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/15105/9/be/src/codegen/codegen-fn-ptr.h File be/src/codegen/codegen-fn-ptr.h: http://gerrit.cloudera.org:8080/#/c/15105/9/be/src/codegen/codegen-fn-ptr.h@41 PS9, Line 41: static constexpr std::memory_order mem_order_ = std::memory_order_acq_rel; > From the perspective of the C++ memory model, release and relaxed doesn't h I don't quite understand the beginning: "From the perspective of the C++ memory model, release and relaxed doesn't have a synchronizes-with relationship". Does it mean that release is enough either or that release is ok (but not relaxed)? -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 10 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 14 Feb 2020 15:40:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 10: (2 comments) http://gerrit.cloudera.org:8080/#/c/15105/9/be/src/codegen/codegen-fn-ptr.h File be/src/codegen/codegen-fn-ptr.h: http://gerrit.cloudera.org:8080/#/c/15105/9/be/src/codegen/codegen-fn-ptr.h@41 PS9, Line 41: static constexpr std::memory_order mem_order_ = std::memory_order_acq_rel; > Thanks, I didn't think of that. >From the perspective of the C++ memory model, release and relaxed doesn't have >a synchronizes-with relationship, so you don't have "happens-before" semantics >either. I.e. "write code to allocated memory" does not necessarily happen >before executing the machine code instructions of that memory. You can relax a bit with release/consume semantics, that'll mean "write code..." will be dependency-ordered-before executing the machine code instructions, because reading the machine code is dependent on the load of the pointer. This might allow the compiler to do more reorderings. That said, AFAICT release/relaxed will work on x86 because it has a strong memory model so the stores won't be reordered, also the reading of the function machine code is dependent on the read of the pointer, so it won't be reordered either. But for portability we want code that is "theoretically correct", not just "practically correct" because we might shoot our foot on a weird platform (when we'll compile Impala to DEC Alpha :) ). And release/acquire or release/consume is not a high price to pay for correctness. http://gerrit.cloudera.org:8080/#/c/15105/10/be/src/exec/hdfs-scanner.h File be/src/exec/hdfs-scanner.h: http://gerrit.cloudera.org:8080/#/c/15105/10/be/src/exec/hdfs-scanner.h@464 PS10, Line 464: using CodegendRetT = std::result_of_t; : using InterpretedRetT = std::result_of_t; : static_assert(std::is_same::value, : "The return types of the codegen'd and " : "interpreted functions must be the same."); Now it's not necessary because it's enforced by decltype(auto). -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 10 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 14 Feb 2020 11:35:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5221/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 10 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 14 Feb 2020 10:46:49 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/15105/9/be/src/codegen/codegen-fn-ptr.h File be/src/codegen/codegen-fn-ptr.h: http://gerrit.cloudera.org:8080/#/c/15105/9/be/src/codegen/codegen-fn-ptr.h@41 PS9, Line 41: static constexpr std::memory_order mem_order_ = std::memory_order_acq_rel; > just passing through: I think you have it correct here. Relaxed is insuffii Thanks, I didn't think of that. It seems to me that in theory we could store the pointer using release semantics and read using relaxed but I agree that we are not likely to gain much performance by using relaxed and acq_rel is easier. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 9 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 14 Feb 2020 10:24:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift 45 files changed, 487 insertions(+), 227 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/10 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 10 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/15105/9/be/src/codegen/codegen-fn-ptr.h File be/src/codegen/codegen-fn-ptr.h: http://gerrit.cloudera.org:8080/#/c/15105/9/be/src/codegen/codegen-fn-ptr.h@41 PS9, Line 41: static constexpr std::memory_order mem_order_ = std::memory_order_acq_rel; just passing through: I think you have it correct here. Relaxed is insuffiicent because it would permit reordering of the writing of the code and the storing of the pointer. In other words, your compilation thread does things like: 1. allocate memory for function machine code 2. write code to allocated memory 3. publish pointer and without "release" semantics on the publishing of the pointer, there is no guarantee that #2 and #3 won't be reordered. In practice, this is really unlikely to happen, because there are probably some other locking operations in between #2 and #3 and because x86 is a TSO architecture, but worth doing it right regardless. Also worth noting that acq/rel semantics on x86 isn't expensive, they are just simple load/store operations. It only prevents compiler reordering. So, there is little to gain by using relaxed. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 9 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 13 Feb 2020 21:18:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5216/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 9 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 13 Feb 2020 11:50:47 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/exec/hdfs-scanner.h File be/src/exec/hdfs-scanner.h: http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/exec/hdfs-scanner.h@462 PS6, Line 462: decltype(auto) invoke(ThisT This, CodegenFnPt > Since this is a wrapper function I think decltype(auto) should be preferred Thanks, I didn't know that. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 9 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 13 Feb 2020 11:04:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift 45 files changed, 487 insertions(+), 227 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/9 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 9 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/exec/hdfs-scanner.h File be/src/exec/hdfs-scanner.h: http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/exec/hdfs-scanner.h@462 PS6, Line 462: auto invoke(ThisT This, CodegenFnPtrBase* ato > Also auto works, I'll use that. Since this is a wrapper function I think decltype(auto) should be preferred because you want the exact same return type as your delegated functions. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 8 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 12 Feb 2020 11:47:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5202/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 8 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 11 Feb 2020 17:26:39 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5201/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 7 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 11 Feb 2020 16:46:51 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift 45 files changed, 486 insertions(+), 227 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/8 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 8 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 7: Sorry, the previous patch set didn't contain my changes... -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 7 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 11 Feb 2020 16:42:33 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/runtime/runtime-state.h File be/src/runtime/runtime-state.h: http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/runtime/runtime-state.h@148 PS6, Line 148: if the expression is not interpretable > I think udf's with lots of parameters are an example. Also IR UDFs -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 6 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 11 Feb 2020 16:30:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift 45 files changed, 485 insertions(+), 227 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/7 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 7 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] WIP: Asynchronous code generation
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 5: (14 comments) http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/benchmarks/hash-benchmark.cc File be/src/benchmarks/hash-benchmark.cc: http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/benchmarks/hash-benchmark.cc@530 PS6, Line 530: .load() std::atomics have user-defined conversions to T, maybe you could also add to make it more convenient. https://en.cppreference.com/w/cpp/atomic/atomic/operator_T http://gerrit.cloudera.org:8080/#/c/15105/1/be/src/codegen/codegen-fn-ptr.h File be/src/codegen/codegen-fn-ptr.h: http://gerrit.cloudera.org:8080/#/c/15105/1/be/src/codegen/codegen-fn-ptr.h@30 PS1, Line 30: rn ptr_.load(mem_order_); > I think relaxed is correct for the reasons you mention. +1 for acq/rel because I think it has the perfect balance between correctness and performance. It provides quite strong guarantees and x86 supports it without any additional cpu operations unlike sequential consistency. http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/codegen/llvm-codegen.h File be/src/codegen/llvm-codegen.h: http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/codegen/llvm-codegen.h@345 PS6, Line 345: ` We usually use ' and not `. http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/codegen/llvm-codegen.h@352 PS6, Line 352: correspinging corresponding http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/codegen/llvm-codegen.cc File be/src/codegen/llvm-codegen.cc: http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/codegen/llvm-codegen.cc@477 PS6, Line 477: if (async_compile_thread_ != nullptr) { : async_compile_thread_->Join(); : } nit: fits single line http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/exec/grouping-aggregator.h File be/src/exec/grouping-aggregator.h: http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/exec/grouping-aggregator.h@167 PS6, Line 167: Function type: AddBatchStreamingImplFn. nit: Unnecessary comment. http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/exec/hdfs-scanner.h File be/src/exec/hdfs-scanner.h: http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/exec/hdfs-scanner.h@462 PS6, Line 462: odegen is either disabled or not ready yet. Would decltype(auto) work? You have multiple return stmts, but the return types must be the same. http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/exec/hdfs-scanner.h@465 PS6, Line 465: You need 'Args&&' if you want perfect forwarding. http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/exec/hdfs-scanner.h@487 PS6, Line 487: Status ScannerDebugAction() WARN_UNUSED_RESULT { nit: please add short comment http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/exec/partitioned-hash-join-builder.h File be/src/exec/partitioned-hash-join-builder.h: http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/exec/partitioned-hash-join-builder.h@721 PS6, Line 721: /// Function type: ProcessBuildBatchFn nit: unnecessary comment http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/exec/partitioned-hash-join-builder.h@728 PS6, Line 728: /// Function type: InsertBatchFn. unnecessary comment http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/exec/topn-node.h File be/src/exec/topn-node.h: http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/exec/topn-node.h@140 PS6, Line 140: Function type: InsertBatchFn unnecessary comment http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/exprs/scalar-expr.inline.h File be/src/exprs/scalar-expr.inline.h: http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/exprs/scalar-expr.inline.h@48 PS6, Line 48: nit: missing indentation, and maybe put the = sign to the prev line http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/runtime/runtime-state.h File be/src/runtime/runtime-state.h: http://gerrit.cloudera.org:8080/#/c/15105/6/be/src/runtime/runtime-state.h@148 PS6, Line 148: if the expression is not interpretable I'm just wondering what expressions are not interpretable but can be codegen'd? -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 5 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 10 Feb 2020 18:00:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5155/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 6 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 07 Feb 2020 15:44:33 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift 45 files changed, 484 insertions(+), 227 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/6 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 6 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] WIP: Asynchronous code generation
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/15105/1/be/src/runtime/fragment-instance-state.cc File be/src/runtime/fragment-instance-state.cc: http://gerrit.cloudera.org:8080/#/c/15105/1/be/src/runtime/fragment-instance-state.cc@360 PS1, Line 360: if (async_enabled && runtime_state_->is_interpretable()) { > I'm a little torn on this. On one hand, I really dislike random performance It is also possible to start the codegen async, and call a function like BeforeUsingCodegendFunctionsForTheFirstTime() in ExecNodes, and wait for codegen to be finished if there is a non-interpretable function. This would allow some overlap between codegen and regular ExecNode work. The way I see the benefits of async codegen is that Scanners always do some work before starting processing rows, e.g. read the initial scan ranges, process the metadata of files, read and process dictionaries/column indexes, start the scan ranges for columns. (hmm, just realized that dictionary filtering may actually need codegend functions). So in many cases async codegen will be actually finished before the first call to these functions. This seems a clear win scenario, while if we process a bunch of rows on the interpreted path then the CPU cost of the whole query is actually increased (of course finishing a bit earlier should usually worth it). -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 5 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Feb 2020 14:36:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5128/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 5 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Feb 2020 14:06:19 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/15105/5/be/src/exec/hdfs-avro-scanner.cc File be/src/exec/hdfs-avro-scanner.cc: http://gerrit.cloudera.org:8080/#/c/15105/5/be/src/exec/hdfs-avro-scanner.cc@1149 PS5, Line 1149: /// return fn(this, max_tuples, pool, &data_block_, data_block_end_, tuple, tuple_row); line too long (93 > 90) http://gerrit.cloudera.org:8080/#/c/15105/5/be/src/exec/hdfs-scanner.h File be/src/exec/hdfs-scanner.h: http://gerrit.cloudera.org:8080/#/c/15105/5/be/src/exec/hdfs-scanner.h@448 PS5, Line 448: using InterpretedRetT = decltype((This->*interpreted_fn)(std::forward(args)...)); line too long (93 > 90) http://gerrit.cloudera.org:8080/#/c/15105/5/be/src/exec/hdfs-scanner.h@450 PS5, Line 450: "The return types of the codegen'd and interpreted functions must be the same."); line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 5 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Feb 2020 13:18:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift 45 files changed, 476 insertions(+), 227 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/5 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 5 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5127/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 4 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Feb 2020 09:41:54 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/15105/4/be/src/exec/union-node.cc File be/src/exec/union-node.cc: http://gerrit.cloudera.org:8080/#/c/15105/4/be/src/exec/union-node.cc@239 PS4, Line 239: UnionMaterializeBatchFn fn = line has trailing whitespace -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 4 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Feb 2020 08:46:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift 45 files changed, 462 insertions(+), 226 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/4 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 4 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 3: The build was aborted. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 3 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Feb 2020 07:55:24 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 3: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/5122/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 3 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 04 Feb 2020 20:29:30 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 3: (4 comments) http://gerrit.cloudera.org:8080/#/c/15105/3/be/src/codegen/llvm-codegen.cc File be/src/codegen/llvm-codegen.cc: http://gerrit.cloudera.org:8080/#/c/15105/3/be/src/codegen/llvm-codegen.cc@1155 PS3, Line 1155: for (const std::pair& fn_pair : fns_to_jit_compile_) { line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/15105/3/be/src/exec/union-node.h File be/src/exec/union-node.h: http://gerrit.cloudera.org:8080/#/c/15105/3/be/src/exec/union-node.h@113 PS3, Line 113: std::vector> line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/15105/3/be/src/exec/union-node.cc File be/src/exec/union-node.cc: http://gerrit.cloudera.org:8080/#/c/15105/3/be/src/exec/union-node.cc@239 PS3, Line 239: UnionMaterializeBatchFn fn = line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/15105/3/be/src/exprs/scalar-expr.h File be/src/exprs/scalar-expr.h: http://gerrit.cloudera.org:8080/#/c/15105/3/be/src/exprs/scalar-expr.h@402 PS3, Line 402: /// calling the functions. line has trailing whitespace -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 3 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 04 Feb 2020 16:27:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 3: (5 comments) http://gerrit.cloudera.org:8080/#/c/15105/1/be/src/codegen/codegen-fn-ptr.h File be/src/codegen/codegen-fn-ptr.h: http://gerrit.cloudera.org:8080/#/c/15105/1/be/src/codegen/codegen-fn-ptr.h@30 PS1, Line 30: rn ptr_.load(mem_order_); > Yeah I think we should either just hardcode the memory order as a static co Yes, probably the best solution is a constexpr memory order, I just wasn't sure in the beginning and thought we may want to test two different versions in the same build. What do you think would be the best memory order? Do you think relaxed should be enough? http://gerrit.cloudera.org:8080/#/c/15105/1/be/src/codegen/codegen-fn-ptr.h@45 PS1, Line 45: inline CodegenFnPtrBase::~CodegenFnPtrBase() = default; > Would it make sense to make the function pointer type a template parameter I don't think we can do it type-safely because we store pointers to instances in containers where the instances have different underlying function pointers, for example in LlvmCodegen. But I have implemented a version where there is a general base class and the templated classes with concrete function types are subclasses of the base class. This way the function type is included in the type name of the atomic function pointers (good for code clarity) and the reinterpret casts can be collected to one place in the majority of the cases (however not all, for example 'codegend_fn_map_' in hdfs-scan-node.h still has to use void* as the function type because there are different kinds of functions in the collection). http://gerrit.cloudera.org:8080/#/c/15105/1/be/src/exec/hdfs-sequence-scanner.cc File be/src/exec/hdfs-sequence-scanner.cc: http://gerrit.cloudera.org:8080/#/c/15105/1/be/src/exec/hdfs-sequence-scanner.cc@310 PS1, Line 310: int tuples_returned = WriteAlignedTuplesCodegenOrInterpret(row_batch->tuple_data_pool(), > Maybe remove this comment? Not sure it is that informative. Done http://gerrit.cloudera.org:8080/#/c/15105/1/be/src/exec/hdfs-text-scanner.cc File be/src/exec/hdfs-text-scanner.cc: http://gerrit.cloudera.org:8080/#/c/15105/1/be/src/exec/hdfs-text-scanner.cc@869 PS1, Line 869: int tuples_returned = WriteAlignedTuplesCodegenOrInterpret(pool, row, fields, > Maybe remove this comment? Not sure it is that informative. Done http://gerrit.cloudera.org:8080/#/c/15105/1/be/src/runtime/runtime-state.h File be/src/runtime/runtime-state.h: http://gerrit.cloudera.org:8080/#/c/15105/1/be/src/runtime/runtime-state.h@200 PS1, Line 200: return !CodegenDisabledByQueryOption() && !CodegenDisabledByHint(); > nit: I think the member should be is_interpretable_ and the accessor method Done -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 3 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 04 Feb 2020 16:26:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift 45 files changed, 459 insertions(+), 226 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/3 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 3 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5590/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 2 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 03 Feb 2020 14:52:46 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: Asynchronous code generation
Daniel Becker has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15105 Change subject: WIP: Asynchronous code generation .. WIP: Asynchronous code generation This commit introduces optional asynchronous code generation. Asynchronous code generation means that instead of waiting for codegen to finish, the query starts in interpreted mode while codegen is done on another thread. All the function pointers that point to codegen'd functions are changed to be atomic, wrapped in a CodegenFnPtr. These are initialised to nullptr and as long as they are nullptr, the corresponding interpreted functions are used (as before). When code generation is ready, the funtion pointers are set by the codegen thread. No synchronisation is needed as the function pointers are atomic and it is not a problem if, at a given moment, only a subset of the codegen'd function pointers are set and the rest are interpreted. Asynchronous code generation can be turned on using the ASYNC_CODEGEN boolean query option. TODO: The default should be synchronous codegen for now. TODO: Testing. TODO: Benchmarks. Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b --- M be/src/benchmarks/hash-benchmark.cc A be/src/codegen/codegen-fn-ptr.h M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/grouping-aggregator.cc M be/src/exec/grouping-aggregator.h M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/non-grouping-aggregator.cc M be/src/exec/non-grouping-aggregator.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/select-node.cc M be/src/exec/select-node.h M be/src/exec/topn-node.cc M be/src/exec/topn-node.h M be/src/exec/union-node.cc M be/src/exec/union-node.h M be/src/exprs/expr-codegen-test.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/scalar-expr.inline.h M be/src/exprs/scalar-fn-call.cc M be/src/exprs/scalar-fn-call.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/runtime-state.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.cc M be/src/util/tuple-row-compare.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift 45 files changed, 453 insertions(+), 214 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/15105/2 -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 2 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] WIP: Asynchronous code generation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15105 ) Change subject: WIP: Asynchronous code generation .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/15105/2/be/src/exec/hdfs-avro-scanner.cc File be/src/exec/hdfs-avro-scanner.cc: http://gerrit.cloudera.org:8080/#/c/15105/2/be/src/exec/hdfs-avro-scanner.cc@559 PS2, Line 559: line has trailing whitespace -- To view, visit http://gerrit.cloudera.org:8080/15105 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b Gerrit-Change-Number: 15105 Gerrit-PatchSet: 2 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 03 Feb 2020 14:05:10 + Gerrit-HasComments: Yes