Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18219 )

Change subject: IMPALA-11064 Optimizing Temporary File Structure for Batch 
Reading
......................................................................


Patch Set 11:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/18219/10//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18219/10//COMMIT_MSG@131
PS10, Line 131: Added testcases in TmpFileMgrTest.
> Yes, I think so. Will add them in the next patch.
Done


http://gerrit.cloudera.org:8080/#/c/18219/5/be/src/exec/partitioned-hash-join-builder.h
File be/src/exec/partitioned-hash-join-builder.h:

http://gerrit.cloudera.org:8080/#/c/18219/5/be/src/exec/partitioned-hash-join-builder.h@82
PS5, Line 82:   static uint64_t GenerateBaseSpillId(int64_t level, uint64_t 
default_id) {
> Will add.
I think this was overlooked.


http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/exec/partitioned-hash-join-builder.h
File be/src/exec/partitioned-hash-join-builder.h:

http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/exec/partitioned-hash-join-builder.h@547
PS4, Line 547:     return null_aware_partition_.get();
> Done
Done


http://gerrit.cloudera.org:8080/#/c/18219/10/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/18219/10/be/src/exec/partitioned-hash-join-builder.cc@181
PS10, Line 181:     base_spill_id_(PhjBuilderConfig::GenerateBaseSpillId(
> Done
Done


http://gerrit.cloudera.org:8080/#/c/18219/10/be/src/exec/partitioned-hash-join-builder.cc@182
PS10, Line 182:         state->query_options().base_spill_id_level, 
(uint64_t)this)) {
> Done
Done


http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/runtime/bufferpool/buffer-pool.h
File be/src/runtime/bufferpool/buffer-pool.h:

http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/runtime/bufferpool/buffer-pool.h@412
PS4, Line 412:   DISALLOW_COPY_AND_ASSIGN(ClientHandle);
> Removed.
Done


http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/runtime/bufferpool/buffer-pool.h@538
PS4, Line 538:   uint64_t spill_id() const { return spill_id_; }
> Done
Done


http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/runtime/io/disk-file.h
File be/src/runtime/io/disk-file.h:

http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/runtime/io/disk-file.h@20
PS4, Line 20: #include <string>
> Removed.
Done


http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/runtime/io/disk-file.h@569
PS4, Line 569:
> I introduced these atomic variables several patches ago, but on second thou
Done


http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/runtime/io/local-file-writer.cc
File be/src/runtime/io/local-file-writer.cc:

http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/runtime/io/local-file-writer.cc@92
PS4, Line 92:
> Done
Done


http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/18219/4/be/src/runtime/tmp-file-mgr.cc@123
PS4, Line 123:     "sequential reading, which may improve the query 
performance.");
> Done
Done


http://gerrit.cloudera.org:8080/#/c/18219/5/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/18219/5/be/src/runtime/tmp-file-mgr.cc@1639
PS5, Line 1639:   return new_file_path.string();
> During the test, the upper layer could always pin these pages in a reverse
Done


http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/tmp-file-mgr.cc@457
PS8, Line 457:       metrics->RegisterMetric(new HistogramMetric(
> I guess the function is helping the turn the buffer size into a form of fil
Done


http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/tmp-file-mgr.cc@545
PS8, Line 545:   return 
tmp_dirs_remote_ctrl_.tmp_file_pool_->DequeueTmpFilesPool(
> On second thought, I moved the function to TmpFileRemote, and name it as Re
Done


http://gerrit.cloudera.org:8080/#/c/18219/8/be/src/runtime/tmp-file-mgr.cc@549
PS8, Line 549: void TmpFileMgr::IncUnuploadedBatchReadFileCnt(TmpFileGroup* 
file_group) {
> Removed the check of the caller, and keep the check inside the implementati
Done


http://gerrit.cloudera.org:8080/#/c/18219/5/be/src/service/query-options.cc
File be/src/service/query-options.cc:

http://gerrit.cloudera.org:8080/#/c/18219/5/be/src/service/query-options.cc@1259
PS5, Line 1259:
> I was mostly trying to understand the rationale for how it's set. If it wer
Done



--
To view, visit http://gerrit.cloudera.org:8080/18219
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If913785cac9e2dafa20013b6600c87fcaf3e2018
Gerrit-Change-Number: 18219
Gerrit-PatchSet: 11
Gerrit-Owner: Yida Wu <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Thu, 11 May 2023 17:57:16 +0000
Gerrit-HasComments: Yes

Reply via email to