Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17979 )

Change subject: IMPALA-10791 Add batching reading for remote temporary files
......................................................................


Patch Set 7: Code-Review+1

(6 comments)

Looks good!

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-io-mgr.cc@424
PS5, Line 424:           read_buffer_bloc, MemBlockStatus::DISABLED, dstl, 
&read_buffer_lock)) {
> I think the mechanism is like this, any read(pin) who wants the block memor
Done


http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-io-mgr.cc@430
PS5, Line 430: RN_IF_ERROR(
> Tried a way like closing the file handle if two blocks have used. The perfo
Looking into it in near future sounds like a good plan and it probably should 
be given a high priority.

In realty, say with the HJ builder use case, the overflow is very much instance 
specific and so is the tmp file itself. Therefore, only the reader(s) of that 
tmp file contend for the file handle.

Per this doc, hdfs lib API is thread safe: 
https://hadoop.apache.org/docs/r1.2.1/libhdfs.html.


http://gerrit.cloudera.org:8080/#/c/17979/6/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/17979/6/be/src/runtime/tmp-file-mgr.cc@412
PS6, Line 412: ist
should be int64_t?


http://gerrit.cloudera.org:8080/#/c/17979/6/be/src/runtime/tmp-file-mgr.cc@411
PS6, Line 411:  tmp_dirs_remote_ctrl_.tmp_file_pool_->dequeue_timer_metric_ =
             :       metrics->RegisterMetric(new HistogramMetric(
             :           MetricDefs::Get(TMP_FILE_BUFF_POOL_DEQUEUE_DURATI
> We will go "read by page" instead in this case. Because the "read by page"
Okay I see. Done.


http://gerrit.cloudera.org:8080/#/c/17979/6/be/src/runtime/tmp-file-mgr.cc@426
PS6, Line 426: nique_i
nit: reduced


http://gerrit.cloudera.org:8080/#/c/17979/7/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/17979/7/be/src/runtime/tmp-file-mgr.cc@1014
PS7, Line 1014: if (read_buffer_file->IsReadBufferBlockStatus(read_buffer_block,
              :             io::MemBlockStatus::UNINIT, read_file_lock, 
&mem_bloc_lock)) {
nit. Please add comment here this block of code which does several things.



--
To view, visit http://gerrit.cloudera.org:8080/17979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dcc5d0881ffaeff09c5c514306cd668373ad31b
Gerrit-Change-Number: 17979
Gerrit-PatchSet: 7
Gerrit-Owner: Yida Wu <wydbaggio...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Yida Wu <wydbaggio...@gmail.com>
Gerrit-Comment-Date: Sat, 13 Nov 2021 00:31:15 +0000
Gerrit-HasComments: Yes

Reply via email to