[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool Add a copy of BufferedTupleStream that allocates memory from BufferPool. This will replace the original implementation when IMPALA-3200 is completed. The major changes are: * Terminology is updated from "blocks" to "pages" * No small buffer support is needed (hooray!). * BufferedTupleStream needs to do its own tracking of # of rows per page, etc instead of relying on BufferedBlockMgr to do it. A wrapper around PageHandle is used. * Profile counters (unpin, pin, get new block time) are removed - similar counters in the BufferPool client are more useful. * Changed the tuple null indicators so that they are allocated before each tuple, rather than in a block at the start of the page. This slightly reduces the memory density, but greatly simplifies much logic. In particular, it avoids problems with RowIdx and larger pages with offsets that don't fit in 16 bits. * Buffer management of read/write streams uses the new pin-counting support to separate pinning of the read and write pages. This means that the reservation usage of an unpinned read/write stream does not fluctuate and the read/write iterators can always advance without requiring additional reservation. Testing this required some further changes. TestEnv was refactored so it can set up either BufferPool or BufferedBlockMgr. Some BufferPool-related state is added to ExecEnv, QueryState and RuntimeState, but is only created for backend tests that explicitly create a BufferPool. The following is left for future work: * IMPALA-3808 (large row support) is not added. I've added TODOs to the code to places that will require changes. * IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since it requires global changes to operators that accumulate memory. Testing: All of the BufferedTupleStream unit tests are ported to the new implementation, except for ones specifically testing the small buffer functionality. Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Reviewed-on: http://gerrit.cloudera.org:8080/5811 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins --- M be/src/benchmarks/hash-benchmark.cc M be/src/codegen/llvm-codegen-test.cc M be/src/exec/hash-table-test.cc M be/src/exprs/expr-codegen-test.cc M be/src/runtime/CMakeLists.txt M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-tuple-stream-test.cc A be/src/runtime/buffered-tuple-stream-v2-test.cc A be/src/runtime/buffered-tuple-stream-v2.cc A be/src/runtime/buffered-tuple-stream-v2.h A be/src/runtime/buffered-tuple-stream-v2.inline.h M be/src/runtime/bufferpool/buffer-pool-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/runtime/tmp-file-mgr-test.cc 23 files changed, 3,012 insertions(+), 105 deletions(-) Approvals: Impala Public Jenkins: Verified Tim Armstrong: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 18 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 17: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 17 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 17: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/386/ -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 17 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Hello Impala Public Jenkins, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5811 to look at the new patch set (#17). Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool Add a copy of BufferedTupleStream that allocates memory from BufferPool. This will replace the original implementation when IMPALA-3200 is completed. The major changes are: * Terminology is updated from "blocks" to "pages" * No small buffer support is needed (hooray!). * BufferedTupleStream needs to do its own tracking of # of rows per page, etc instead of relying on BufferedBlockMgr to do it. A wrapper around PageHandle is used. * Profile counters (unpin, pin, get new block time) are removed - similar counters in the BufferPool client are more useful. * Changed the tuple null indicators so that they are allocated before each tuple, rather than in a block at the start of the page. This slightly reduces the memory density, but greatly simplifies much logic. In particular, it avoids problems with RowIdx and larger pages with offsets that don't fit in 16 bits. * Buffer management of read/write streams uses the new pin-counting support to separate pinning of the read and write pages. This means that the reservation usage of an unpinned read/write stream does not fluctuate and the read/write iterators can always advance without requiring additional reservation. Testing this required some further changes. TestEnv was refactored so it can set up either BufferPool or BufferedBlockMgr. Some BufferPool-related state is added to ExecEnv, QueryState and RuntimeState, but is only created for backend tests that explicitly create a BufferPool. The following is left for future work: * IMPALA-3808 (large row support) is not added. I've added TODOs to the code to places that will require changes. * IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since it requires global changes to operators that accumulate memory. Testing: All of the BufferedTupleStream unit tests are ported to the new implementation, except for ones specifically testing the small buffer functionality. Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 --- M be/src/benchmarks/hash-benchmark.cc M be/src/codegen/llvm-codegen-test.cc M be/src/exec/hash-table-test.cc M be/src/exprs/expr-codegen-test.cc M be/src/runtime/CMakeLists.txt M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-tuple-stream-test.cc A be/src/runtime/buffered-tuple-stream-v2-test.cc A be/src/runtime/buffered-tuple-stream-v2.cc A be/src/runtime/buffered-tuple-stream-v2.h A be/src/runtime/buffered-tuple-stream-v2.inline.h M be/src/runtime/bufferpool/buffer-pool-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/runtime/tmp-file-mgr-test.cc 23 files changed, 3,012 insertions(+), 105 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/17 -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 17 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 17: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 17 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 16: Verified-1 Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/385/ -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 16 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 16: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/385/ -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 16 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 16: Code-Review+2 Fix a couple of clang-tidy issues -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 16 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Hello Impala Public Jenkins, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5811 to look at the new patch set (#16). Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool Add a copy of BufferedTupleStream that allocates memory from BufferPool. This will replace the original implementation when IMPALA-3200 is completed. The major changes are: * Terminology is updated from "blocks" to "pages" * No small buffer support is needed (hooray!). * BufferedTupleStream needs to do its own tracking of # of rows per page, etc instead of relying on BufferedBlockMgr to do it. A wrapper around PageHandle is used. * Profile counters (unpin, pin, get new block time) are removed - similar counters in the BufferPool client are more useful. * Changed the tuple null indicators so that they are allocated before each tuple, rather than in a block at the start of the page. This slightly reduces the memory density, but greatly simplifies much logic. In particular, it avoids problems with RowIdx and larger pages with offsets that don't fit in 16 bits. * Buffer management of read/write streams uses the new pin-counting support to separate pinning of the read and write pages. This means that the reservation usage of an unpinned read/write stream does not fluctuate and the read/write iterators can always advance without requiring additional reservation. Testing this required some further changes. TestEnv was refactored so it can set up either BufferPool or BufferedBlockMgr. Some BufferPool-related state is added to ExecEnv, QueryState and RuntimeState, but is only created for backend tests that explicitly create a BufferPool. The following is left for future work: * IMPALA-3808 (large row support) is not added. I've added TODOs to the code to places that will require changes. * IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since it requires global changes to operators that accumulate memory. Testing: All of the BufferedTupleStream unit tests are ported to the new implementation, except for ones specifically testing the small buffer functionality. Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 --- M be/src/benchmarks/hash-benchmark.cc M be/src/codegen/llvm-codegen-test.cc M be/src/exec/hash-table-test.cc M be/src/exprs/expr-codegen-test.cc M be/src/runtime/CMakeLists.txt M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-tuple-stream-test.cc A be/src/runtime/buffered-tuple-stream-v2-test.cc A be/src/runtime/buffered-tuple-stream-v2.cc A be/src/runtime/buffered-tuple-stream-v2.h A be/src/runtime/buffered-tuple-stream-v2.inline.h M be/src/runtime/bufferpool/buffer-pool-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/runtime/tmp-file-mgr-test.cc 23 files changed, 3,012 insertions(+), 105 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/16 -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 16 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 15: Verified-1 Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/384/ -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 15 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 15: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/384/ -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 15 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Hello Impala Public Jenkins, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5811 to look at the new patch set (#15). Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool Add a copy of BufferedTupleStream that allocates memory from BufferPool. This will replace the original implementation when IMPALA-3200 is completed. The major changes are: * Terminology is updated from "blocks" to "pages" * No small buffer support is needed (hooray!). * BufferedTupleStream needs to do its own tracking of # of rows per page, etc instead of relying on BufferedBlockMgr to do it. A wrapper around PageHandle is used. * Profile counters (unpin, pin, get new block time) are removed - similar counters in the BufferPool client are more useful. * Changed the tuple null indicators so that they are allocated before each tuple, rather than in a block at the start of the page. This slightly reduces the memory density, but greatly simplifies much logic. In particular, it avoids problems with RowIdx and larger pages with offsets that don't fit in 16 bits. * Buffer management of read/write streams uses the new pin-counting support to separate pinning of the read and write pages. This means that the reservation usage of an unpinned read/write stream does not fluctuate and the read/write iterators can always advance without requiring additional reservation. Testing this required some further changes. TestEnv was refactored so it can set up either BufferPool or BufferedBlockMgr. Some BufferPool-related state is added to ExecEnv, QueryState and RuntimeState, but is only created for backend tests that explicitly create a BufferPool. The following is left for future work: * IMPALA-3808 (large row support) is not added. I've added TODOs to the code to places that will require changes. * IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since it requires global changes to operators that accumulate memory. Testing: All of the BufferedTupleStream unit tests are ported to the new implementation, except for ones specifically testing the small buffer functionality. Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 --- M be/src/benchmarks/hash-benchmark.cc M be/src/codegen/llvm-codegen-test.cc M be/src/exec/hash-table-test.cc M be/src/exprs/expr-codegen-test.cc M be/src/runtime/CMakeLists.txt M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-tuple-stream-test.cc A be/src/runtime/buffered-tuple-stream-v2-test.cc A be/src/runtime/buffered-tuple-stream-v2.cc A be/src/runtime/buffered-tuple-stream-v2.h A be/src/runtime/buffered-tuple-stream-v2.inline.h M be/src/runtime/bufferpool/buffer-pool-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/runtime/tmp-file-mgr-test.cc 23 files changed, 3,012 insertions(+), 105 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/15 -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 15 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 15: Code-Review+2 Fix some test failures from the rebase -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 15 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 14: Verified-1 Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/382/ -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 14 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 14: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/382/ -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 14 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 14: Code-Review+2 Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 14 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 13: Code-Review+2 Rebased onto https://gerrit.cloudera.org/#/c/6313/ -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Hello Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5811 to look at the new patch set (#13). Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool Add a copy of BufferedTupleStream that allocates memory from BufferPool. This will replace the original implementation when IMPALA-3200 is completed. The major changes are: * Terminology is updated from "blocks" to "pages" * No small buffer support is needed (hooray!). * BufferedTupleStream needs to do its own tracking of # of rows per page, etc instead of relying on BufferedBlockMgr to do it. A wrapper around PageHandle is used. * Profile counters (unpin, pin, get new block time) are removed - similar counters in the BufferPool client are more useful. * Changed the tuple null indicators so that they are allocated before each tuple, rather than in a block at the start of the page. This slightly reduces the memory density, but greatly simplifies much logic. In particular, it avoids problems with RowIdx and larger pages with offsets that don't fit in 16 bits. * Buffer management of read/write streams uses the new pin-counting support to separate pinning of the read and write pages. This means that the reservation usage of an unpinned read/write stream does not fluctuate and the read/write iterators can always advance without requiring additional reservation. Testing this required some further changes. TestEnv was refactored so it can set up either BufferPool or BufferedBlockMgr. Some BufferPool-related state is added to ExecEnv, QueryState and RuntimeState, but is only created for backend tests that explicitly create a BufferPool. The following is left for future work: * IMPALA-3808 (large row support) is not added. I've added TODOs to the code to places that will require changes. * IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since it requires global changes to operators that accumulate memory. Testing: All of the BufferedTupleStream unit tests are ported to the new implementation, except for ones specifically testing the small buffer functionality. Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 --- M be/src/exec/hash-table-test.cc M be/src/runtime/CMakeLists.txt M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-tuple-stream-test.cc A be/src/runtime/buffered-tuple-stream-v2-test.cc A be/src/runtime/buffered-tuple-stream-v2.cc A be/src/runtime/buffered-tuple-stream-v2.h A be/src/runtime/buffered-tuple-stream-v2.inline.h M be/src/runtime/bufferpool/buffer-pool-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/runtime/tmp-file-mgr-test.cc 20 files changed, 3,002 insertions(+), 100 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/13 -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 12: Code-Review+2 Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 11: (7 comments) http://gerrit.cloudera.org:8080/#/c/5811/11/be/src/runtime/buffered-tuple-stream-v2.cc File be/src/runtime/buffered-tuple-stream-v2.cc: Line 176: CHECK_CONSISTENCY(); > maybe move these CHECK_CONSISTECY() to the end of the functions so they val We don't check at the exit points of all functions so I wanted to at least check at the entry to some of these functions before they start changing state. Line 270: RETURN_IF_ERROR(PinPageIfNeeded(write_page_, pinned_)); > wouldn't it be better to move this to AdvanceWritePage()? The other caller Done Line 299: // May need to pin the new page for both reading and writing. > Add: "See ExpectedPinCount()." Done Line 374: pinned_ || buffer_pool_client_->reservation()->IncreaseReservationToFit(page_len_); > let's add a comment, something like: Done Line 443: MemTracker* tracker, scoped_ptr* batch, bool* got_rows) { > you don't have to do it now, but I kinda think this function should be in P Yeah, I think we should really remove it: IMPALA-2758 http://gerrit.cloudera.org:8080/#/c/5811/11/be/src/runtime/buffered-tuple-stream-v2.h File be/src/runtime/buffered-tuple-stream-v2.h: PS11, Line 498: and to pin it for reading if the stream's current : /// state requires it. > see comment in cc file on this function. if you take that suggestion, then Done PS11, Line 518: write iterator is > iterators are Done -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Hello Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5811 to look at the new patch set (#12). Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool Add a copy of BufferedTupleStream that allocates memory from BufferPool. This will replace the original implementation when IMPALA-3200 is completed. The major changes are: * Terminology is updated from "blocks" to "pages" * No small buffer support is needed (hooray!). * BufferedTupleStream needs to do its own tracking of # of rows per page, etc instead of relying on BufferedBlockMgr to do it. A wrapper around PageHandle is used. * Profile counters (unpin, pin, get new block time) are removed - similar counters in the BufferPool client are more useful. * Changed the tuple null indicators so that they are allocated before each tuple, rather than in a block at the start of the page. This slightly reduces the memory density, but greatly simplifies much logic. In particular, it avoids problems with RowIdx and larger pages with offsets that don't fit in 16 bits. * Buffer management of read/write streams uses the new pin-counting support to separate pinning of the read and write pages. This means that the reservation usage of an unpinned read/write stream does not fluctuate and the read/write iterators can always advance without requiring additional reservation. Testing this required some further changes. TestEnv was refactored so it can set up either BufferPool or BufferedBlockMgr. Some BufferPool-related state is added to ExecEnv, QueryState and RuntimeState, but is only created for backend tests that explicitly create a BufferPool. The following is left for future work: * IMPALA-3808 (large row support) is not added. I've added TODOs to the code to places that will require changes. * IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since it requires global changes to operators that accumulate memory. Testing: All of the BufferedTupleStream unit tests are ported to the new implementation, except for ones specifically testing the small buffer functionality. Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 --- M be/src/exec/hash-table-test.cc M be/src/runtime/CMakeLists.txt M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-tuple-stream-test.cc A be/src/runtime/buffered-tuple-stream-v2-test.cc A be/src/runtime/buffered-tuple-stream-v2.cc A be/src/runtime/buffered-tuple-stream-v2.h A be/src/runtime/buffered-tuple-stream-v2.inline.h M be/src/runtime/bufferpool/buffer-pool-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/runtime/tmp-file-mgr-test.cc 20 files changed, 3,012 insertions(+), 101 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/12 -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Dan Hecht has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 11: Code-Review+2 (7 comments) http://gerrit.cloudera.org:8080/#/c/5811/11/be/src/runtime/buffered-tuple-stream-v2.cc File be/src/runtime/buffered-tuple-stream-v2.cc: Line 176: CHECK_CONSISTENCY(); maybe move these CHECK_CONSISTECY() to the end of the functions so they validate the resulting state, but feel free to ignore. Line 270: RETURN_IF_ERROR(PinPageIfNeeded(write_page_, pinned_)); wouldn't it be better to move this to AdvanceWritePage()? The other callers handle it for their specific case (i.e. by calling PrepareForReadInternal(), or not needing it), so this is really just for when we're adding a new write page to an existing stream, no? Line 299: // May need to pin the new page for both reading and writing. Add: "See ExpectedPinCount()." since that's where we really document how we manage reservations/pins. Line 374: pinned_ || buffer_pool_client_->reservation()->IncreaseReservationToFit(page_len_); let's add a comment, something like: // If already pinned, no additional pin is needed (see ExpectedPinCount()). Line 443: MemTracker* tracker, scoped_ptr* batch, bool* got_rows) { you don't have to do it now, but I kinda think this function should be in PHJ. It's really just a composition of buffered-tuple-stream operations, and it's not something that we should be doing generally. http://gerrit.cloudera.org:8080/#/c/5811/11/be/src/runtime/buffered-tuple-stream-v2.h File be/src/runtime/buffered-tuple-stream-v2.h: PS11, Line 498: and to pin it for reading if the stream's current : /// state requires it. see comment in cc file on this function. if you take that suggestion, then update this. PS11, Line 518: write iterator is iterators are -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 10: (19 comments) http://gerrit.cloudera.org:8080/#/c/5811/9/be/src/runtime/buffered-tuple-stream-v2.cc File be/src/runtime/buffered-tuple-stream-v2.cc: PS9, Line 322: > !has_read_iterator() ? Done http://gerrit.cloudera.org:8080/#/c/5811/10/be/src/runtime/buffered-tuple-stream-v2.cc File be/src/runtime/buffered-tuple-stream-v2.cc: Line 101: for (const Page& page : pages_) { > will this be too expensive, even in debug builds (and we should make sure t We were already doing a pass for the list in one of the Calc* methods, so it's not likely to be significantly worse. I removed the consistency check from GetNextInternal(), which is likely to be the most commonly-called. I added a CHECK_CONSISTENCY macro to guarantee the calls get removed in release builds. PS10, Line 125: read_page_ == pages_.end() > has_read_iterator() and switch clauses? Done Line 129: } > does it make sense to print the write page as well? It's printed on line 123 PS10, Line 149: // This must be the first call to PrepareForWrite(). > Maybe say, this must be the first iterator? i.e. you can't call this functi Done PS10, Line 158: // This must be the first call to PrepareForWrite(). > same Done PS10, Line 205: if > when Done PS10, Line 211: If > When Done PS10, Line 258: get_extra_reservation > it's unfortunate we need this. i wonder if moving the reservation logic (an Done PS10, Line 272: bool pin_for_read = has_read_iterator() && pinned_; : get_extra_reservation |= pin_for_read; > this is still pretty confusing. if you don't move the reservation logic com I restructured this code and move the reservation logic into the callers. It adds a bit more boilerplate but seems less magical. Line 283: if (pin_for_read) RETURN_IF_ERROR(PinPage(&new_page)); > see comment above. Done Line 298: int64_t row_size, bool get_extra_reservation, bool* got_page) noexcept { > this wrapper now seems unnecessary and not helpful for readability because Done PS10, Line 339: read_page_ == pages_.end() > !has_read_iterator()? Done Line 379: ResetReadPage(); > this seems to make more sense to do in PrepareForRead() since that's the on Done http://gerrit.cloudera.org:8080/#/c/5811/10/be/src/runtime/buffered-tuple-stream-v2.h File be/src/runtime/buffered-tuple-stream-v2.h: PS10, Line 217: got_buffer > maybe this should be 'got_reservation' now, and explained in terms of reser Works for me - done. PS10, Line 228: got_buffer > same Done PS10, Line 496: get_extra_reservation > this needs explanation, but first see comments in the cc file about it. Reworked this. Line 503: bool* got_page) noexcept WARN_UNUSED_RESULT; > seems like we can put this code in NewWritePage() now. This ended up being restructured a bit - some of the code is in a new function AdvanceWritePage(). PS10, Line 509: got_buffer > got_page? Done -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has uploaded a new patch set (#11). Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool Add a copy of BufferedTupleStream that allocates memory from BufferPool. This will replace the original implementation when IMPALA-3200 is completed. The major changes are: * Terminology is updated from "blocks" to "pages" * No small buffer support is needed (hooray!). * BufferedTupleStream needs to do its own tracking of # of rows per page, etc instead of relying on BufferedBlockMgr to do it. A wrapper around PageHandle is used. * Profile counters (unpin, pin, get new block time) are removed - similar counters in the BufferPool client are more useful. * Changed the tuple null indicators so that they are allocated before each tuple, rather than in a block at the start of the page. This slightly reduces the memory density, but greatly simplifies much logic. In particular, it avoids problems with RowIdx and larger pages with offsets that don't fit in 16 bits. * Buffer management of read/write streams uses the new pin-counting support to separate pinning of the read and write pages. This means that the reservation usage of an unpinned read/write stream does not fluctuate and the read/write iterators can always advance without requiring additional reservation. Testing this required some further changes. TestEnv was refactored so it can set up either BufferPool or BufferedBlockMgr. Some BufferPool-related state is added to ExecEnv, QueryState and RuntimeState, but is only created for backend tests that explicitly create a BufferPool. The following is left for future work: * IMPALA-3808 (large row support) is not added. I've added TODOs to the code to places that will require changes. * IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since it requires global changes to operators that accumulate memory. Testing: All of the BufferedTupleStream unit tests are ported to the new implementation, except for ones specifically testing the small buffer functionality. Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 --- M be/src/exec/hash-table-test.cc M be/src/runtime/CMakeLists.txt M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-tuple-stream-test.cc A be/src/runtime/buffered-tuple-stream-v2-test.cc A be/src/runtime/buffered-tuple-stream-v2.cc A be/src/runtime/buffered-tuple-stream-v2.h A be/src/runtime/buffered-tuple-stream-v2.inline.h M be/src/runtime/bufferpool/buffer-pool-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/runtime/tmp-file-mgr-test.cc 20 files changed, 3,010 insertions(+), 101 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/11 -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Dan Hecht has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 10: (19 comments) http://gerrit.cloudera.org:8080/#/c/5811/9/be/src/runtime/buffered-tuple-stream-v2.cc File be/src/runtime/buffered-tuple-stream-v2.cc: PS9, Line 322: !has_read_iterator() ? http://gerrit.cloudera.org:8080/#/c/5811/10/be/src/runtime/buffered-tuple-stream-v2.cc File be/src/runtime/buffered-tuple-stream-v2.cc: Line 101: for (const Page& page : pages_) { will this be too expensive, even in debug builds (and we should make sure this loop is compiled out of release builds)? PS10, Line 125: read_page_ == pages_.end() has_read_iterator() and switch clauses? Line 129: } does it make sense to print the write page as well? PS10, Line 149: // This must be the first call to PrepareForWrite(). Maybe say, this must be the first iterator? i.e. you can't call this function after starting a read iteration or read/write iteration, either, right? PS10, Line 158: // This must be the first call to PrepareForWrite(). same PS10, Line 205: if when (sorry, i wrote the comment but 'if' sounds confusing since it's not currently pinned) PS10, Line 211: If When PS10, Line 258: get_extra_reservation it's unfortunate we need this. i wonder if moving the reservation logic (and the ResetWritePage() call) to the callers would be simpler? In any case, if we leave it here, how about calling this parameter 'get_read_reservation'? also see next comment. PS10, Line 272: bool pin_for_read = has_read_iterator() && pinned_; : get_extra_reservation |= pin_for_read; this is still pretty confusing. if you don't move the reservation logic completely into the callers, it may still make sense to move the calculation 'pinned_ && has_read_iterator()' into AddRowSlow(). The other callsites are known to be either static true or false. THen you would also move the PinPage() call into the caller (using PinPageIfNeeded()), which seems to make sense since in the case of PrepareForReadWrite(), you really want PrepareForReadInternal() to do that anyway, no? Or at least you could move this after the write iterator is set up and still use PinPageIfNeeded(). Line 283: if (pin_for_read) RETURN_IF_ERROR(PinPage(&new_page)); see comment above. Line 298: int64_t row_size, bool get_extra_reservation, bool* got_page) noexcept { this wrapper now seems unnecessary and not helpful for readability because NewWritePage() has no other callers. PS10, Line 339: read_page_ == pages_.end() !has_read_iterator()? Line 379: ResetReadPage(); this seems to make more sense to do in PrepareForRead() since that's the only case you can possible have a read iterator, right? and that assumption is built into the reservation management code. http://gerrit.cloudera.org:8080/#/c/5811/10/be/src/runtime/buffered-tuple-stream-v2.h File be/src/runtime/buffered-tuple-stream-v2.h: PS10, Line 217: got_buffer maybe this should be 'got_reservation' now, and explained in terms of reservation. or at least got_page? PS10, Line 228: got_buffer same PS10, Line 496: get_extra_reservation this needs explanation, but first see comments in the cc file about it. Line 503: bool* got_page) noexcept WARN_UNUSED_RESULT; seems like we can put this code in NewWritePage() now. PS10, Line 509: got_buffer got_page? -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 10: Made changes discussed offline - PrepareForReadWrite() and re-oriented the pin counting logic to be driven by an ExpectedPinCount() method. -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has uploaded a new patch set (#10). Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool Add a copy of BufferedTupleStream that allocates memory from BufferPool. This will replace the original implementation when IMPALA-3200 is completed. The major changes are: * Terminology is updated from "blocks" to "pages" * No small buffer support is needed (hooray!). * BufferedTupleStream needs to do its own tracking of # of rows per page, etc instead of relying on BufferedBlockMgr to do it. A wrapper around PageHandle is used. * Profile counters (unpin, pin, get new block time) are removed - similar counters in the BufferPool client are more useful. * Changed the tuple null indicators so that they are allocated before each tuple, rather than in a block at the start of the page. This slightly reduces the memory density, but greatly simplifies much logic. In particular, it avoids problems with RowIdx and larger pages with offsets that don't fit in 16 bits. * Buffer management of read/write streams uses the new pin-counting support to separate pinning of the read and write pages. This means that the reservation usage of an unpinned read/write stream does not fluctuate and the read/write iterators can always advance without requiring additional reservation. Testing this required some further changes. TestEnv was refactored so it can set up either BufferPool or BufferedBlockMgr. Some BufferPool-related state is added to ExecEnv, QueryState and RuntimeState, but is only created for backend tests that explicitly create a BufferPool. The following is left for future work: * IMPALA-3808 (large row support) is not added. I've added TODOs to the code to places that will require changes. * IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since it requires global changes to operators that accumulate memory. Testing: All of the BufferedTupleStream unit tests are ported to the new implementation, except for ones specifically testing the small buffer functionality. Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 --- M be/src/exec/hash-table-test.cc M be/src/runtime/CMakeLists.txt M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-tuple-stream-test.cc A be/src/runtime/buffered-tuple-stream-v2-test.cc A be/src/runtime/buffered-tuple-stream-v2.cc A be/src/runtime/buffered-tuple-stream-v2.h A be/src/runtime/buffered-tuple-stream-v2.inline.h M be/src/runtime/bufferpool/buffer-pool-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/runtime/tmp-file-mgr-test.cc 20 files changed, 3,002 insertions(+), 101 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/10 -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has uploaded a new patch set (#9). Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool Add a copy of BufferedTupleStream that allocates memory from BufferPool. This will replace the original implementation when IMPALA-3200 is completed. The major changes are: * Terminology is updated from "blocks" to "pages" * No small buffer support is needed (hooray!). * BufferedTupleStream needs to do its own tracking of # of rows per page, etc instead of relying on BufferedBlockMgr to do it. A wrapper around PageHandle is used. * Profile counters (unpin, pin, get new block time) are removed - similar counters in the BufferPool client are more useful. * Changed the tuple null indicators so that they are allocated before each tuple, rather than in a block at the start of the page. This slightly reduces the memory density, but greatly simplifies much logic. In particular, it avoids problems with RowIdx and larger pages with offsets that don't fit in 16 bits. * Buffer management of read/write streams uses the new pin-counting support to separate pinning of the read and write pages. This means that the reservation usage of an unpinned read/write stream does not fluctuate and the read/write iterators can always advance without requiring additional reservation. Testing this required some further changes. TestEnv was refactored so it can set up either BufferPool or BufferedBlockMgr. Some BufferPool-related state is added to ExecEnv, QueryState and RuntimeState, but is only created for backend tests that explicitly create a BufferPool. The following is left for future work: * IMPALA-3808 (large row support) is not added. I've added TODOs to the code to places that will require changes. * IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since it requires global changes to operators that accumulate memory. Testing: All of the BufferedTupleStream unit tests are ported to the new implementation, except for ones specifically testing the small buffer functionality. Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 --- M be/src/exec/hash-table-test.cc M be/src/runtime/CMakeLists.txt M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-tuple-stream-test.cc A be/src/runtime/buffered-tuple-stream-v2-test.cc A be/src/runtime/buffered-tuple-stream-v2.cc A be/src/runtime/buffered-tuple-stream-v2.h A be/src/runtime/buffered-tuple-stream-v2.inline.h M be/src/runtime/bufferpool/buffer-pool-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/runtime/tmp-file-mgr-test.cc 20 files changed, 2,941 insertions(+), 101 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/9 -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 8: (19 comments) http://gerrit.cloudera.org:8080/#/c/5811/8/be/src/runtime/buffered-tuple-stream-v2.cc File be/src/runtime/buffered-tuple-stream-v2.cc: PS8, Line 108: if > iff, just to be super clear. Done PS8, Line 110: read_page_ != pages_.end() > why is this condition needed? what does it mean to have read_page_ == pages This means that there's a read iterator active. read_page_ == pages_.end() && pinned_ means the stream is pinned but there is no read iterator. I clarified in the comment of read_page_ and restructured this condition so the meanings of subexpressions were more explicit. Line 116: } > fine with me to keep, but isn't this the same as line 109? Done Line 141: ss << &*read_page_; > will it be possible to identify which page in the list printed below this i Yes, you can match up the addresses. I made this change to reduce the amount of redundancy in the debug output. Line 166: DCHECK(read_page_ == pages_.end()); > DCHECK_EQ x2 We don't generally use DCHECK_EQ for null pointer comparison (I did switch this to nullptr for consistency though). We also can't use it for iterators since they're not printable. PS8, Line 232: read_page_ != pages_.end() > what does this condition mean? why isn't it just: if the stream is pinned f It means there's an active read iterator. I actually updated the has_read_page() method to be called has_read_iterator() and used that instead - I think it's clear. If there's only a write iterator we don't need to double-pin (because it's not really a read/write stream). I rewrote things in terms of the presence of read/write iterators instead of the read_write mode flag. Now read_write_ mode only directly influences whether PrepareForRead() invalidates the write iterator. Line 241: RETURN_IF_ERROR(buffer_pool_->Pin(buffer_pool_client_, &new_page.handle)); > does this purposely not go through PinPage()? I was experimenting a bit with different flows here so that related operations were grouped together. Seems like it was confusing so I switched back to PinPage(). PS8, Line 250: (double_pin ? 2 : 1) > i guess because of this. maybe move this to be after the CreatePage() call Done Line 348: *pinned = true; > why was this needed? Previously PinStream() was idempotent - I wanted to preserve that behaviour. PS8, Line 356: We need to double-pin the current write page if we also have a read iterator. > why? or maybe you mean just that we need to take a pin-count for reading ev Tried to reword to be clearer. PS8, Line 358: write_page_ == &page && read_page_ == pages_.end() > why? i thought we now do take a separate pin count for read and for write? This essentially implements that idea. Some of the complication is that on a non-read-write stream where there is only ever one iterator active, we don't need to do the double-pinning of the current write page. Line 387: if (pinned_) UnpinPage(&page); > do we allow you to call UnpinStream() on an already unpinned stream? why? I restructured this code to hopefully be clearer. As a bonus, calling UnpinStream() on an already-unpinned stream is now O(1) instead of O(n) PS8, Line 388: continue > i think this would be clearer as: I think the new version is easier to follow. http://gerrit.cloudera.org:8080/#/c/5811/8/be/src/runtime/buffered-tuple-stream-v2.h File be/src/runtime/buffered-tuple-stream-v2.h: PS8, Line 68: . > maybe add: "for reading" Done PS8, Line 69: access > read access? We do update some tuples in place in the agg so that might be more misleading. I changed this so that it's clearer that it means that pointers to rows can be saved (since we don't really support random access via the stream interface). PS8, Line 75: event > even Done PS8, Line 319: bytes_pinned > i guess should be BytesPinned() since not a simple accessor Done PS8, Line 385: including any pages already deleted in 'delete_on_read' : /// mode. > that's not intuitive. is there a good reason why we do that rather than mak Currently those kind of invariants aren't maintained when reading in delete_on_read mode - that has to be the last pass over the stream. num_rows_ also isn't updated. I don't see a use case for why client code would depend on these values being updated while the stream is destructively read so the value of adding additional logic to keep this in sync would be limited. I already put some effort into updating the documentation and DCHECKs so that it was clearer that delete_on_read was destructive and you couldn't re-read a stream afterwards. PS8, Line 392: num_rows_returned_ > rows_returned_? (or rename that member, which might be better). Updated the comment. I would rename the member in other circumstances but I don't want to make t
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Dan Hecht has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/5811/8/be/src/runtime/buffered-tuple-stream-v2.cc File be/src/runtime/buffered-tuple-stream-v2.cc: PS8, Line 110: read_page_ != pages_.end() why is this condition needed? what does it mean to have read_page_ == pages_.end() && pinned_? -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Dan Hecht has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 8: (20 comments) http://gerrit.cloudera.org:8080/#/c/5811/8/be/src/runtime/buffered-tuple-stream-v2.cc File be/src/runtime/buffered-tuple-stream-v2.cc: PS8, Line 108: if iff, just to be super clear. Line 116: } fine with me to keep, but isn't this the same as line 109? Line 141: ss << &*read_page_; will it be possible to identify which page in the list printed below this is? Line 166: DCHECK(read_page_ == pages_.end()); DCHECK_EQ x2 PS8, Line 219: If the stream is pinned, the previous write page will remain pinned : // with pin count 1 as a read page. If a read/write stream is pinned, the write : // page has pin count 2, so we need to unpin once these two sentences seem to both be talking about pinned streams, but i'm not sure what they are trying to conclude. i also don't understand this if-stmt that follows. Why isn't it simply: if (write_page_ != nullptr) UnpinPage(write_page_); ? i.e. we need to unpin *for write* the page regardless of the state of the read-iterator, no? i.e. can't we separate out reasoning of the read and write iterators? PS8, Line 232: read_page_ != pages_.end() what does this condition mean? why isn't it just: if the stream is pinned for reading, need to take a pin count for both read and write, otherwise need a pin count for only write? Line 241: RETURN_IF_ERROR(buffer_pool_->Pin(buffer_pool_client_, &new_page.handle)); does this purposely not go through PinPage()? PS8, Line 250: (double_pin ? 2 : 1) i guess because of this. maybe move this to be after the CreatePage() call and just add 1 in there and then let PinPage() do its job? Line 348: *pinned = true; why was this needed? PS8, Line 356: We need to double-pin the current write page if we also have a read iterator. why? or maybe you mean just that we need to take a pin-count for reading even if it's already pinned for write? if that's the case, I think we should just talk about "read pinning" and "write pinning" separately (and this routine shouldn't have to muck with write pinning, right?) PS8, Line 358: write_page_ == &page && read_page_ == pages_.end() why? i thought we now do take a separate pin count for read and for write? Line 387: if (pinned_) UnpinPage(&page); do we allow you to call UnpinStream() on an already unpinned stream? why? PS8, Line 388: continue i think this would be clearer as: if (is_write_page || is_read_page) { } else { } rather than the "continue" flow. http://gerrit.cloudera.org:8080/#/c/5811/8/be/src/runtime/buffered-tuple-stream-v2.h File be/src/runtime/buffered-tuple-stream-v2.h: PS8, Line 68: . maybe add: "for reading" PS8, Line 69: access read access? Line 71: /// unpinned and therefore do not use the client's reservation and can be spilled to disk. comments in this paragraph are meant to further clarify that pinned vs. unpinned stream really only applies to the read-iterator. PS8, Line 75: event even PS8, Line 319: bytes_pinned i guess should be BytesPinned() since not a simple accessor PS8, Line 385: including any pages already deleted in 'delete_on_read' : /// mode. that's not intuitive. is there a good reason why we do that rather than make it always be the size of the stream? and if so, seems like it should be called highwater_byte_size or something. PS8, Line 392: num_rows_returned_ rows_returned_? (or rename that member, which might be better). -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has uploaded a new patch set (#8). Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool Add a copy of BufferedTupleStream that allocates memory from BufferPool. This will replace the original implementation when IMPALA-3200 is completed. The major changes are: * Terminology is updated from "blocks" to "pages" * No small buffer support is needed (hooray!). * BufferedTupleStream needs to do its own tracking of # of rows per page, etc instead of relying on BufferedBlockMgr to do it. A wrapper around PageHandle is used. * Profile counters (unpin, pin, get new block time) are removed - similar counters in the BufferPool client are more useful. * Changed the tuple null indicators so that they are allocated before each tuple, rather than in a block at the start of the page. This slightly reduces the memory density, but greatly simplifies much logic. In particular, it avoids problems with RowIdx and larger pages with offsets that don't fit in 16 bits. * Buffer management of read/write streams uses the new pin-counting support to separate pinning of the read and write pages. This means that the reservation usage of an unpinned read/write stream does not fluctuate and the read/write iterators can always advance without requiring additional reservation. Testing this required some further changes. TestEnv was refactored so it can set up either BufferPool or BufferedBlockMgr. Some BufferPool-related state is added to ExecEnv, QueryState and RuntimeState, but is only created for backend tests that explicitly create a BufferPool. The following is left for future work: * IMPALA-3808 (large row support) is not added. I've added TODOs to the code to places that will require changes. * IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since it requires global changes to operators that accumulate memory. Testing: All of the BufferedTupleStream unit tests are ported to the new implementation, except for ones specifically testing the small buffer functionality. Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 --- M be/src/exec/hash-table-test.cc M be/src/runtime/CMakeLists.txt M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-tuple-stream-test.cc A be/src/runtime/buffered-tuple-stream-v2-test.cc A be/src/runtime/buffered-tuple-stream-v2.cc A be/src/runtime/buffered-tuple-stream-v2.h A be/src/runtime/buffered-tuple-stream-v2.inline.h M be/src/runtime/bufferpool/buffer-pool-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/runtime/tmp-file-mgr-test.cc 20 files changed, 2,931 insertions(+), 101 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/8 -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 6: (5 comments) http://gerrit.cloudera.org:8080/#/c/5811/6/be/src/runtime/buffered-tuple-stream-v2.cc File be/src/runtime/buffered-tuple-stream-v2.cc: PS6, Line 112: DCHECK_LE(read_ptr_, read_end_ptr); > would be nice to write this and line 106 the same to make it a little faste Done Line 252: if (!read_page_->is_pinned()) { > why does this not have to check if read and write page are the same? I ended up reworking the buffer management as discussed to take advantage of the new pin-counting support. There are still some tricky corner cases but I think the invariants are clearer now, and we can provide stronger guarantees about being able to iterate through unpinned streams. Line 276: if (!pinned_ && write_page_ != &pages_.front()) { > this could use a comment: Done Line 332: && (&page == write_page_ || (read_write_ && &page == &*read_page_))) { > why is this right to unpin the read_page_ if !read_write_? This got reworked with the other changes http://gerrit.cloudera.org:8080/#/c/5811/6/be/src/runtime/buffered-tuple-stream-v2.h File be/src/runtime/buffered-tuple-stream-v2.h: PS6, Line 517: CalcNumPinned > this looks dead Done -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has uploaded a new patch set (#7). Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool Add a copy of BufferedTupleStream that allocates memory from BufferPool. This will replace the original implementation when IMPALA-3200 is completed. The major changes are: * Terminology is updated from "blocks" to "pages" * No small buffer support is needed (hooray!). * BufferedTupleStream needs to do its own tracking of # of rows per page, etc instead of relying on BufferedBlockMgr to do it. A wrapper around PageHandle is used. * Profile counters (unpin, pin, get new block time) are removed - similar counters in the BufferPool client are more useful. * Changed the tuple null indicators so that they are allocated before each tuple, rather than in a block at the start of the page. This slightly reduces the memory density, but greatly simplifies much logic. In particular, it avoids problems with RowIdx and larger pages with offsets that don't fit in 16 bits. * Buffer management of read/write streams uses the new pin-counting support to separate pinning of the read and write pages. This means that the reservation usage of an unpinned read/write stream does not fluctuate and the read/write iterators can always advance without requiring additional reservation. Testing this required some further changes. TestEnv was refactored so it can set up either BufferPool or BufferedBlockMgr. Some BufferPool-related state is added to ExecEnv, QueryState and RuntimeState, but is only created for backend tests that explicitly create a BufferPool. The following is left for future work: * IMPALA-3808 (large row support) is not added. I've added TODOs to the code to places that will require changes. * IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since it requires global changes to operators that accumulate memory. Testing: All of the BufferedTupleStream unit tests are ported to the new implementation, except for ones specifically testing the small buffer functionality. Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 --- M be/src/exec/hash-table-test.cc M be/src/runtime/CMakeLists.txt M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-tuple-stream-test.cc A be/src/runtime/buffered-tuple-stream-v2-test.cc A be/src/runtime/buffered-tuple-stream-v2.cc A be/src/runtime/buffered-tuple-stream-v2.h A be/src/runtime/buffered-tuple-stream-v2.inline.h M be/src/runtime/bufferpool/buffer-pool-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/runtime/tmp-file-mgr-test.cc 20 files changed, 2,935 insertions(+), 101 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/7 -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Dan Hecht has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 6: (6 comments) http://gerrit.cloudera.org:8080/#/c/5811/5/be/src/runtime/buffered-tuple-stream-v2.cc File be/src/runtime/buffered-tuple-stream-v2.cc: Line 256: // actually fail once we have variable-length pages. > I just kept this logic from the old BufferedTupleStream. How would the stream use more reservation after UnpinStream()? Wouldn't we spend a reservation on the read and write pages both when the stream is pinned and then unpinned? I need to read the code through some more, but it seems to that a fair amount of complexity comes from sharing the reservation when they overlap. If we do end up sticking with this scheme, how about verifying the pin count is exactly 1 for the read/write page(s)? http://gerrit.cloudera.org:8080/#/c/5811/6/be/src/runtime/buffered-tuple-stream-v2.cc File be/src/runtime/buffered-tuple-stream-v2.cc: PS6, Line 112: DCHECK_LE(read_ptr_, read_end_ptr); would be nice to write this and line 106 the same to make it a little faster to read Line 252: if (!read_page_->is_pinned()) { why does this not have to check if read and write page are the same? Line 276: if (!pinned_ && write_page_ != &pages_.front()) { this could use a comment: // pages_.front() will become the read_page_. Line 332: && (&page == write_page_ || (read_write_ && &page == &*read_page_))) { why is this right to unpin the read_page_ if !read_write_? http://gerrit.cloudera.org:8080/#/c/5811/6/be/src/runtime/buffered-tuple-stream-v2.h File be/src/runtime/buffered-tuple-stream-v2.h: PS6, Line 517: CalcNumPinned this looks dead -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has uploaded a new patch set (#6). Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool Add a copy of BufferedTupleStream that allocates memory from BufferPool. This will replace the original implementation when IMPALA-3200 is completed. The major changes are: * Terminology is updated from "blocks" to "pages" * No small buffer support is needed (hooray!). * BufferedTupleStream needs to do its own tracking of # of rows per page, etc instead of relying on BufferedBlockMgr to do it. A wrapper around PageHandle is used. * Profile counters (unpin, pin, get new block time) are removed - similar counters in the BufferPool client are more useful. * Changed the tuple null indicators so that they are allocated before each tuple, rather than in a block at the start of the page. This slightly reduces the memory density, but greatly simplifies much logic. In particular, it avoids problems with RowIdx and larger pages with offsets that don't fit in 16 bits. Testing this required some further changes. TestEnv was refactored so it can set up either BufferPool or BufferedBlockMgr. Some BufferPool-related state is added to ExecEnv, QueryState and RuntimeState, but is only created for backend tests that explicitly create a BufferPool. The following is left for future work: * IMPALA-3808 (large row support) is not added. I've added TODOs to the code to places that will require changes. * IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since it requires global changes to operators that accumulate memory. Testing: All of the BufferedTupleStream unit tests are ported to the new implementation, except for ones specifically testing the small buffer functionality. Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 --- M be/src/exec/hash-table-test.cc M be/src/runtime/CMakeLists.txt M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-tuple-stream-test.cc A be/src/runtime/buffered-tuple-stream-v2-test.cc A be/src/runtime/buffered-tuple-stream-v2.cc A be/src/runtime/buffered-tuple-stream-v2.h A be/src/runtime/buffered-tuple-stream-v2.inline.h M be/src/runtime/bufferpool/buffer-pool-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/runtime/tmp-file-mgr-test.cc 20 files changed, 2,822 insertions(+), 101 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/6 -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 5: (39 comments) http://gerrit.cloudera.org:8080/#/c/5811/5/be/src/runtime/buffered-tuple-stream-v2.cc File be/src/runtime/buffered-tuple-stream-v2.cc: PS5, Line 181: write_page_->is_pinned() > when do we have an unpinned write page? what does that mean? I documented that they have to be pinned and added some consistency checks. If I set write_page_ = null earlier that keeps it in a consistent state. Line 189: *got_page = buffer_pool_client_->reservation()->IncreaseReservationToFit(page_len); > shouldn't this come with IMPALA-3208? or do we need this already? We need it already for the first page or if got_page was false on an earlier call and we're retrying. Line 198: write_end_ptr_ = new_page.data() + page_len; > would be slightly clearer to compute this based on write_page_ Done Line 234: DCHECK(&*read_page_ != write_page_); > here and above: DCHECK_NE/EQ It doesn't work for the first one since the values of iterators aren't printable. Line 246: bytes_in_mem_ -= read_page_->len(); > UnpinPage()? Done Line 256: if (!read_page_->is_pinned()) { > is this check sufficient? couldn't the page by pinned due to it also being I just kept this logic from the old BufferedTupleStream. Added a comment to 'write_page_' to explain. I think it's easiest to leave as-is for now. Having a pin per iterator makes a lot of sense when we have a separate iterator abstraction that manages its own reservation, but it interacts strangely with the current UnpinStream() API - if you UnpinStream() a read-write stream with a single page then it could actually use more reservation. The only current user -AnalyticEvalNode - shouldn't do that but it adds a weird edge case to be wary of. I'm open to changing it but it wasn't a strict improvement in my eyes. Line 300: bytes_in_mem_ += pages_.front().len(); > maybe factor this into PinPage() to match UnpinPage() Done PS5, Line 324: is_pinned > i guess this use is probably okay I think we'd need to change it if we started pinned pages multiple times - we might need to unpin a page here if it was pinned multiple times. http://gerrit.cloudera.org:8080/#/c/5811/5/be/src/runtime/buffered-tuple-stream-v2.h File be/src/runtime/buffered-tuple-stream-v2.h: Line 65: // > /// Done PS5, Line 66: In the pinned mode all pages are : /// pinned and the data is fully in memory. > seems redundant with the next sentence Done PS5, Line 74: PrepareForWrite > shouldn't that be PrepareForRead()? Done PS5, Line 74: the > to Done Line 75: /// pass over the stream. > this should go away soon after switching to BufferPool, right? i.e. buffers I think the concept of a destructive read pass will continue to exist, but the pages would be destroyed and buffers attached instead of the current dance with NeedsDeepCopy(). There will probably be other related changes though, e.g. around buffer management for non-destructive read passes. To support non-destructive unpinned read passes we might actually need a new BufferPool API to extract a buffer from a page without destroying the page. Line 121: /// <---12b---> <---12b---> <-5b> > i guess these examples all assume non-nullable? Added a nulllable example Line 127: /// caller does not need to copy if it is streaming. > are these listed in order of precedence? i.e. if a stream is both delete on Reworked this and the memory lifetime section to avoid ambiguity. Line 139: /// or free up other memory before retrying. > nit:indentation of some of these paragraphs is inconsistent Done Line 152: /// layout described above. > maybe a TODO/JIRA for what we had discussed with unifying AddRow() and Allo Done Line 199: ///Returns false and sets 'status' to an error. > nit: indentations of these bullets are inconsistent Done Line 262: /// rows will remain valid until the stream is unpinned, destroyed, etc. > let's also put TODO IMPALA-4179 to make it easier to keep straight of where Done Line 267: /// stream remains pinned. > does this interface require the stream to be pinned? if so, let's be explic Done PS5, Line 370: offset > this isn't really the offset. Done Line 374: uint8_t* read_end_ptr_; > given we store number of rows in the page, what is this used for? oh, i see Done Line 380: uint8_t* write_end_ptr_; > same, and this one looks dead anyway. This one is used in AllocateRow() where the perf matters - it likely saves a few cycles per row. I ended up removing write_page_bytes_remaining() and just inlining the calculations. Line 382: /// Number of rows returned to the caller from GetNext(). > ... since PrepareForRead() was called? Done Line 386: int read_page_idx_; > this looks dead Done Line 393: int num_pinned_; > this doesn't looked
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Dan Hecht has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 5: (39 comments) Will finish buffered-tuple-stream-v2.cc after you reply to the pin count management question. http://gerrit.cloudera.org:8080/#/c/5811/5/be/src/runtime/buffered-tuple-stream-v2.cc File be/src/runtime/buffered-tuple-stream-v2.cc: PS5, Line 181: write_page_->is_pinned() when do we have an unpinned write page? what does that mean? oh, i guess we can get into that state if we return at 190? any way to simplify these possible states? Line 189: *got_page = buffer_pool_client_->reservation()->IncreaseReservationToFit(page_len); shouldn't this come with IMPALA-3208? or do we need this already? Line 198: write_end_ptr_ = new_page.data() + page_len; would be slightly clearer to compute this based on write_page_ Line 234: DCHECK(&*read_page_ != write_page_); here and above: DCHECK_NE/EQ Line 246: bytes_in_mem_ -= read_page_->len(); UnpinPage()? Line 256: if (!read_page_->is_pinned()) { is this check sufficient? couldn't the page by pinned due to it also being the write page. in which case shouldn't we increment the pin count? i.e. shouldn't this check by !is_pinned_? in fact, is using page->is_pinned() ever the right thing given that we don't know whether it is pinned as the read or write page? oh, i guess we're careful to handle the read_page==write_page case in all pinning/unpinning places? i.e the stream only takes one pin count per page even if used as both read and write page, is that right? is this explained anywhere? (wasn't the plan to just take a pin for each iterator? does that not work?) Line 300: bytes_in_mem_ += pages_.front().len(); maybe factor this into PinPage() to match UnpinPage() PS5, Line 324: is_pinned i guess this use is probably okay http://gerrit.cloudera.org:8080/#/c/5811/5/be/src/runtime/buffered-tuple-stream-v2.h File be/src/runtime/buffered-tuple-stream-v2.h: Line 65: // /// PS5, Line 66: In the pinned mode all pages are : /// pinned and the data is fully in memory. seems redundant with the next sentence PS5, Line 74: the to PS5, Line 74: PrepareForWrite shouldn't that be PrepareForRead()? Line 75: /// pass over the stream. this should go away soon after switching to BufferPool, right? i.e. buffers can be attached to the row-batches rather than using needs-deep-copy mechanism. if so, how about a TODO (and reference IMPALA-4179)? Line 121: /// <---12b---> <---12b---> <-5b> i guess these examples all assume non-nullable? Line 127: /// caller does not need to copy if it is streaming. are these listed in order of precedence? i.e. if a stream is both delete on read and pinned, the page is deleted, right? Line 139: /// or free up other memory before retrying. nit:indentation of some of these paragraphs is inconsistent Line 152: /// layout described above. maybe a TODO/JIRA for what we had discussed with unifying AddRow() and AllocateRow() by evaluating expression? Line 199: ///Returns false and sets 'status' to an error. nit: indentations of these bullets are inconsistent Line 262: /// rows will remain valid until the stream is unpinned, destroyed, etc. let's also put TODO IMPALA-4179 to make it easier to keep straight of where we're headed. Line 267: /// stream remains pinned. does this interface require the stream to be pinned? if so, let's be explicit about that. if not, why not? PS5, Line 370: offset this isn't really the offset. Line 374: uint8_t* read_end_ptr_; given we store number of rows in the page, what is this used for? oh, i see this is the physical end, but only used in dchecks. how about removing this and just use data() + len() in those dchecks? Line 380: uint8_t* write_end_ptr_; same, and this one looks dead anyway. Line 382: /// Number of rows returned to the caller from GetNext(). ... since PrepareForRead() was called? i.e. this doesn't accumulate over passes, right? Line 386: int read_page_idx_; this looks dead Line 393: int num_pinned_; this doesn't looked used either, except for DebugString() which iterates the list anyway, so how about removing. or do the exec nodes really need to know that? Line 407: /// to AddRow() must occur before calling PrepareForRead() and subsequent calls to or AllocateRow() PS5, Line 414: nit extra space PS5, Line 419: This changes the : /// memory management of the stream this sentence isn't very helpful PS5, Line 454: vailable available? but does this try to increase reservation? etc PS5, Line 463: blocks is this talking about blocking on disk in order to repin a page? or something else? Line 517: int CalcNumPinned(); won't need this if we get rid of num_pinned_ http://gerrit.cloudera.org:8080/#/c/5811/5/be/src/runtime/buffered-tuple-stream-v2.inlin
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/5811/5//COMMIT_MSG Commit Message: Line 25: RowIdx and larger pages with offsets that don't fit in 16 bits. > how much is the memory density reduced? in the worse case, does this use 2 The impact is mixed. In general density is hurt if tuples are mostly non-null. I think the worst case is 8x, if the row has no materialised slots and therefore all memory is devoted to the bitstring. My feeling is that adding an extra byte per row is acceptable, even if it's an 8x regression. In other cases, where tuples are mostly null, memory density can improve more than 2x. E.g. if you have a very wide nullable tuple, with many nulls, then the previous encoding is very inefficient because it runs out of space in the fixed-size bit vector before it runs out of space in the page. -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Dan Hecht has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/5811/5//COMMIT_MSG Commit Message: Line 25: RowIdx and larger pages with offsets that don't fit in 16 bits. how much is the memory density reduced? in the worse case, does this use 2x the memory (single bool slot)? -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has uploaded a new patch set (#5). Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool Add a copy of BufferedTupleStream that allocates memory from BufferPool. This will replace the original implementation when IMPALA-3200 is completed. The major changes are: * Terminology is updated from "blocks" to "pages" * No small buffer support is needed (hooray!). * BufferedTupleStream needs to do its own tracking of # of rows per page, etc instead of relying on BufferedBlockMgr to do it. A wrapper around PageHandle is used. * Profile counters (unpin, pin, get new block time) are removed - similar counters in the BufferPool client are more useful. * Changed the tuple null indicators so that they are allocated before each tuple, rather than in a block at the start of the page. This slightly reduces the memory density, but greatly simplifies much logic. In particular, it avoids problems with RowIdx and larger pages with offsets that don't fit in 16 bits. Testing this required some further changes. TestEnv was refactored so it can set up either BufferPool or BufferedBlockMgr. Some BufferPool-related state is added to ExecEnv, QueryState and RuntimeState, but is only created for backend tests that explicitly create a BufferPool. The following is left for future work: * IMPALA-3808 (large row support) is not added. I've added TODOs to the code to places that will require changes. * IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since it requires global changes to operators that accumulate memory. Testing: All of the BufferedTupleStream unit tests are ported to the new implementation, except for ones specifically testing the small buffer functionality. Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 --- M be/src/exec/hash-table-test.cc M be/src/runtime/CMakeLists.txt M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-tuple-stream-test.cc A be/src/runtime/buffered-tuple-stream-v2-test.cc A be/src/runtime/buffered-tuple-stream-v2.cc A be/src/runtime/buffered-tuple-stream-v2.h A be/src/runtime/buffered-tuple-stream-v2.inline.h M be/src/runtime/bufferpool/buffer-pool-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/runtime/tmp-file-mgr-test.cc 20 files changed, 2,831 insertions(+), 99 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/5 -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 4: (37 comments) http://gerrit.cloudera.org:8080/#/c/5811/4/be/src/runtime/buffered-tuple-stream-v2.cc File be/src/runtime/buffered-tuple-stream-v2.cc: Line 326: LOG(INFO) << "Failed to get reservation for " << pages_.front().len(); > I need to remove this log message. Done http://gerrit.cloudera.org:8080/#/c/5811/4/be/src/runtime/buffered-tuple-stream-v2.h File be/src/runtime/buffered-tuple-stream-v2.h: PS4, Line 41: or scratch disk storage > what does this mean? isn't it always backed by BufferPool (which may spill Reworded to reflect that. PS4, Line 45: the APIs are called from a single thread > this is saying per BufferTupleStream object, right? maybe clarify that Done Line 46: /// > it'd be nice to explain the point of this abstraction a bit more explicitly Done PS4, Line 49: pinned or unpinned > I think these terms need to be defined relative to the stream as far as wha Done PS4, Line 59: page size > wouldn't it also be a function of tuple size (i.e. tuples per page)? I don't think this added much - the previous sentence better describes the number of null indicators. PS4, Line 77: pages > pages' buffers. Done PS4, Line 80: location in memory > or you could say "buffer". Done PS4, Line 110: in pages_ > in the stream Done PS4, Line 111: in pages_ > same Done PS4, Line 114: (if read_write is : /// true, then two pages need to be in memory). > rather than saying this in parenthesis, i think we should pull it up to say Reworded this part to say that we only need enough reservation for a single read/write buffer. PS4, Line 117: caller's reservation is insufficient > does it first try to increase the reservation at this point? Done Line 124: /// returned so far from the stream may be freed on the next call to GetNext(). > do we have a jira we can reference as a TODO here to simplify this? Done Line 126: /// Manual construction of rows with AllocateRow(): > not for now, but is it possible to unify things so we only have one way to Updated to make it clearer that it's called instead of AddRow(). Line 138: /// this array as new rows are inserted in the page. If we do so, then there will be > is that still something we want to do? No longer relevant with the changes I made to the null indicator layout. PS4, Line 146: id > what is 'id'? is this index? Done PS4, Line 147: With 8MB pages that is 512GB per stream. > where does this limit come into play? is this something that will cause tr I believe we only use RowIdx currently if the right side of a join has multiple tuples per row. I reworked this to drastically simplify it. Instead of RowIdx we just return a pointer to the start of the data. This avoids all the problems with limits here. PS4, Line 186: page_len: the size of pages to use in the stream > what does that mean for large rows? We need something like a default and max page length. Added a TODO. PS4, Line 198: of > off Done PS4, Line 203: AddRow > is it also used before AllocateRow()? Done Line 207: /// if an error status is returned. > broken line break Done Line 212: /// sets 'status' to an error if an error occurred. > hard to parse that sentence. but, can we simplify this dance now that we h Updated this comment to be clearer about the possible scenarios. It's guaranteed to succeed for now. It's actually a bit more complicated for variable-length pages since that would rely on the client leaving enough slack in the reservation to read the largest-possible page. Line 223: /// tuples. > same questions. let's think of this interface can be simplified now. Updated to refer back to the AddRow() comment. Line 228: /// been allocated with the stream's row desc. > is the row valid only as long as the stream remains pinned? Done PS4, Line 232: begin reading > before the first GetNext()? Done Line 236: /// false if the page could not be pinned and no error was encountered. > can we explain that in terms of reservations? and like questions above, doe Done PS4, Line 240: enough memory > what does that mean in terms of reservations? and does it still make sense Done Line 253: }; > this seems like a wrinkle to the class comment. can it be succinctly descri I think it now covers it (it mentions that it only pins the pages being read and written in unpinned mode) PS4, Line 259: must be copied out by the caller > this seems to contradict the class comment about "needs_deep_copy" Updated the comment. PS4, Line 267: *got_rows is false if the stream could not be pinne > does that still make sense with reservations? if so, it should be expresse Reworded PS4, Line 271: date > data? Done PS4, Line 271: buffers containing page > are there buffers that don't contain page data? Reworded, t
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Dan Hecht has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 4: (32 comments) First set of comments focusing on the public interface. http://gerrit.cloudera.org:8080/#/c/5811/4/be/src/runtime/buffered-tuple-stream-v2.h File be/src/runtime/buffered-tuple-stream-v2.h: PS4, Line 41: or scratch disk storage what does this mean? isn't it always backed by BufferPool (which may spill to disk internally)? PS4, Line 45: the APIs are called from a single thread this is saying per BufferTupleStream object, right? maybe clarify that Line 46: /// it'd be nice to explain the point of this abstraction a bit more explicitly. like when we talk about BTS, we talk in terms of iterators and the iteration patterns, but this only alludes to that. and the random access pattern. etc. PS4, Line 49: pinned or unpinned I think these terms need to be defined relative to the stream as far as what it means to a client and why they would change between states. PS4, Line 59: page size wouldn't it also be a function of tuple size (i.e. tuples per page)? PS4, Line 77: pages pages' buffers. PS4, Line 80: location in memory or you could say "buffer". PS4, Line 110: in pages_ in the stream PS4, Line 111: in pages_ same PS4, Line 114: (if read_write is : /// true, then two pages need to be in memory). rather than saying this in parenthesis, i think we should pull it up to say that there can be both a read and write iterator simultaneously. PS4, Line 117: caller's reservation is insufficient does it first try to increase the reservation at this point? Line 124: /// returned so far from the stream may be freed on the next call to GetNext(). do we have a jira we can reference as a TODO here to simplify this? Line 126: /// Manual construction of rows with AllocateRow(): not for now, but is it possible to unify things so we only have one way to write into the stream? In the mean time, how does this mode relate to the "write" mode above? is it a sub-mode for pinned? Line 138: /// this array as new rows are inserted in the page. If we do so, then there will be is that still something we want to do? PS4, Line 146: id what is 'id'? is this index? PS4, Line 147: With 8MB pages that is 512GB per stream. where does this limit come into play? is this something that will cause trouble with going to 2MB pages? PS4, Line 186: page_len: the size of pages to use in the stream what does that mean for large rows? PS4, Line 198: of off PS4, Line 203: AddRow is it also used before AllocateRow()? Line 207: /// if an error status is returned. broken line break Line 212: /// sets 'status' to an error if an error occurred. hard to parse that sentence. but, can we simplify this dance now that we have real reservations? also, does this operation try to increase reservation before failing to append? is this guaranteed to succeed (other than runtime errors) for unpinned streams (maybe after you add support for large rows)? Line 223: /// tuples. same questions. let's think of this interface can be simplified now. Line 228: /// been allocated with the stream's row desc. is the row valid only as long as the stream remains pinned? PS4, Line 232: begin reading before the first GetNext()? Line 236: /// false if the page could not be pinned and no error was encountered. can we explain that in terms of reservations? and like questions above, does this got_buffer==false interface still make sense with reservations? PS4, Line 240: enough memory what does that mean in terms of reservations? and does it still make sense with reservations? Line 253: }; this seems like a wrinkle to the class comment. can it be succinctly described as part of the iterator discussion? PS4, Line 259: must be copied out by the caller this seems to contradict the class comment about "needs_deep_copy" PS4, Line 267: *got_rows is false if the stream could not be pinne does that still make sense with reservations? if so, it should be expressed in terms of reservation. PS4, Line 271: date data? PS4, Line 271: buffers containing page are there buffers that don't contain page data? PS4, Line 286: Returns the byte size of the stream that is currently pinned in memory. Is this trying to say: Returns the number of bytes currently pinned in memory by the stream? -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. Patch Set 4: (5 comments) http://gerrit.cloudera.org:8080/#/c/5811/4/be/src/runtime/buffered-tuple-stream-v2.cc File be/src/runtime/buffered-tuple-stream-v2.cc: Line 326: LOG(INFO) << "Failed to get reservation for " << pages_.front().len(); I need to remove this log message. http://gerrit.cloudera.org:8080/#/c/5811/4/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: Line 100: if (ExecEnv::GetInstance()->buffer_pool() != nullptr) { Use exec-env directly. http://gerrit.cloudera.org:8080/#/c/5811/4/be/src/runtime/query-state.h File be/src/runtime/query-state.h: Line 29: #include "runtime/tmp-file-mgr.h" // TODO: can we avoid this? Remove TODO Line 121: Status CheckSpillingEnabled(); This is currently dead code - can remove it. http://gerrit.cloudera.org:8080/#/c/5811/4/be/src/runtime/row-batch.h File be/src/runtime/row-batch.h: Line 442: /// TODO: ownership semantics Need to fix this TODO -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has uploaded a new patch set (#4). Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool Add a copy of BufferedTupleStream that allocates memory from BufferPool. This will replace the original implementation when IMPALA-3200 is completed. The major changes are: * Terminology is updated from "blocks" to "pages" * No small buffer support is needed (hooray!). * BufferedTupleStream needs to do its own tracking of # of rows per page, etc instead of relying on BufferedBlockMgr to do it. A wrapper around PageHandle is used. * Profile counters (unpin, pin, get new block time) are removed - similar counters in the BufferPool client are more useful. The logic for serialising and deserialising rows has not changed, since that is implemented in terms of read and write pointers into the buffers. Testing this required some further changes. TestEnv was refactored so it can set up either BufferPool or BufferedBlockMgr. Some BufferPool-related state is added to ExecEnv, QueryState and RuntimeState, but is only created for backend tests that explicitly create a BufferPool. The following is left for future work: * IMPALA-3808 (large row support) is not added. I've added TODOs to the code to places that will require changes. * IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since it requires global changes to operators that accumulate memory. Testing: All of the BufferedTupleStream unit tests are ported to the new implementation, except for ones specifically testing the small buffer functionality. Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 --- M be/src/exec/hash-table-test.cc M be/src/runtime/CMakeLists.txt M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-tuple-stream-test.cc A be/src/runtime/buffered-tuple-stream-v2-test.cc A be/src/runtime/buffered-tuple-stream-v2.cc A be/src/runtime/buffered-tuple-stream-v2.h A be/src/runtime/buffered-tuple-stream-v2.inline.h M be/src/runtime/bufferpool/buffer-pool-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/runtime/tmp-file-mgr-test.cc 20 files changed, 2,942 insertions(+), 99 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/4 -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht
[Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Tim Armstrong has uploaded a new patch set (#3). Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool .. IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool Add a copy of BufferedTupleStream that allocates memory from BufferPool. This will replace the original implementation when IMPALA-3200 is completed. The major changes are: * Terminology is updated from "blocks" to "pages" * No small buffer support is needed (hooray!). * BufferedTupleStream needs to do its own tracking of # of rows per page, etc instead of relying on BufferedBlockMgr to do it. A wrapper around PageHandle is used. * Profile counters (unpin, pin, get new block time) are removed - similar counters in the BufferPool client are more useful. The logic for serialising and deserialising rows has not changed, since that is implemented in terms of read and write pointers into the buffers. Testing this required some further changes. TestEnv was refactored so it can set up either BufferPool or BufferedBlockMgr. Some BufferPool-related state is added to ExecEnv, QueryState and RuntimeState, but is only created for backend tests that explicitly create a BufferPool. The following is left for future work: * IMPALA-3808 (large row support) is not added. I've added TODOs to the code to places that will require changes. * IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since it requires global changes to operators that accumulate memory. Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 --- M be/src/exec/hash-table-test.cc M be/src/runtime/CMakeLists.txt M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-tuple-stream-test.cc A be/src/runtime/buffered-tuple-stream-v2-test.cc A be/src/runtime/buffered-tuple-stream-v2.cc A be/src/runtime/buffered-tuple-stream-v2.h A be/src/runtime/buffered-tuple-stream-v2.inline.h M be/src/runtime/bufferpool/buffer-pool-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/runtime/tmp-file-mgr-test.cc 20 files changed, 2,942 insertions(+), 99 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/3 -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong