Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/18219 )
Change subject: IMPALA-11064 Optimizing Temporary File Structure for Batch Reading ...................................................................... Patch Set 11: (16 comments) http://gerrit.cloudera.org:8080/#/c/18219/10//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/18219/10//COMMIT_MSG@131 PS10, Line 131: Added testcases in TmpFileMgrTest. > Yes, I think so. Will add them in the next patch. Done http://gerrit.cloudera.org:8080/#/c/18219/5/be/src/exec/partitioned-hash-join-builder.h File be/src/exec/partitioned-hash-join-builder.h: http://gerrit.cloudera.org:8080/#/c/18219/5/be/src/exec/partitioned-hash-join-builder.h@82 PS5, Line 82: static uint64_t GenerateBaseSpillId(int64_t level, uint64_t default_id) { > Will add. I think this was overlooked. http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/exec/partitioned-hash-join-builder.h File be/src/exec/partitioned-hash-join-builder.h: http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/exec/partitioned-hash-join-builder.h@547 PS4, Line 547: return null_aware_partition_.get(); > Done Done http://gerrit.cloudera.org:8080/#/c/18219/10/be/src/exec/partitioned-hash-join-builder.cc File be/src/exec/partitioned-hash-join-builder.cc: http://gerrit.cloudera.org:8080/#/c/18219/10/be/src/exec/partitioned-hash-join-builder.cc@181 PS10, Line 181: base_spill_id_(PhjBuilderConfig::GenerateBaseSpillId( > Done Done http://gerrit.cloudera.org:8080/#/c/18219/10/be/src/exec/partitioned-hash-join-builder.cc@182 PS10, Line 182: state->query_options().base_spill_id_level, (uint64_t)this)) { > Done Done http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/runtime/bufferpool/buffer-pool.h File be/src/runtime/bufferpool/buffer-pool.h: http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/runtime/bufferpool/buffer-pool.h@412 PS4, Line 412: DISALLOW_COPY_AND_ASSIGN(ClientHandle); > Removed. Done http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/runtime/bufferpool/buffer-pool.h@538 PS4, Line 538: uint64_t spill_id() const { return spill_id_; } > Done Done http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/runtime/io/disk-file.h File be/src/runtime/io/disk-file.h: http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/runtime/io/disk-file.h@20 PS4, Line 20: #include <string> > Removed. Done http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/runtime/io/disk-file.h@569 PS4, Line 569: > I introduced these atomic variables several patches ago, but on second thou Done http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/runtime/io/local-file-writer.cc File be/src/runtime/io/local-file-writer.cc: http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/runtime/io/local-file-writer.cc@92 PS4, Line 92: > Done Done http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/runtime/tmp-file-mgr.cc File be/src/runtime/tmp-file-mgr.cc: http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/runtime/tmp-file-mgr.cc@123 PS4, Line 123: "sequential reading, which may improve the query performance."); > Done Done http://gerrit.cloudera.org:8080/#/c/18219/5/be/src/runtime/tmp-file-mgr.cc File be/src/runtime/tmp-file-mgr.cc: http://gerrit.cloudera.org:8080/#/c/18219/5/be/src/runtime/tmp-file-mgr.cc@1639 PS5, Line 1639: return new_file_path.string(); > During the test, the upper layer could always pin these pages in a reverse Done http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/tmp-file-mgr.cc File be/src/runtime/tmp-file-mgr.cc: http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/tmp-file-mgr.cc@457 PS8, Line 457: metrics->RegisterMetric(new HistogramMetric( > I guess the function is helping the turn the buffer size into a form of fil Done http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/tmp-file-mgr.cc@545 PS8, Line 545: return tmp_dirs_remote_ctrl_.tmp_file_pool_->DequeueTmpFilesPool( > On second thought, I moved the function to TmpFileRemote, and name it as Re Done http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/tmp-file-mgr.cc@549 PS8, Line 549: void TmpFileMgr::IncUnuploadedBatchReadFileCnt(TmpFileGroup* file_group) { > Removed the check of the caller, and keep the check inside the implementati Done http://gerrit.cloudera.org:8080/#/c/18219/5/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/18219/5/be/src/service/query-options.cc@1259 PS5, Line 1259: > I was mostly trying to understand the rationale for how it's set. If it wer Done -- To view, visit http://gerrit.cloudera.org:8080/18219 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If913785cac9e2dafa20013b6600c87fcaf3e2018 Gerrit-Change-Number: 18219 Gerrit-PatchSet: 11 Gerrit-Owner: Yida Wu <[email protected]> Gerrit-Reviewer: Abhishek Rawat <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Qifan Chen <[email protected]> Gerrit-Reviewer: Yida Wu <[email protected]> Gerrit-Comment-Date: Thu, 11 May 2023 17:57:16 +0000 Gerrit-HasComments: Yes
