Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17979 )

Change subject: IMPALA-10791 Add batching reading for remote temporary files
......................................................................


Patch Set 6:

(6 comments)

We have at most 10% of the process memory for the remote batching reading, 
theoretically this memory consumption should be okay, but maybe later we can 
test more fierce cases

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-io-mgr.cc@424
PS5, Line 424:           read_buffer_bloc, MemBlockStatus::DISABLED, dstl, 
&read_buffer_lock)) {
> Well, if we are here, it means we want to a memory block to hold some data,
I think there are two types of memory may be involved.

For the memory of the block content, we don't need to allocate all of them of a 
file during construction.

Each block will reserve or allocate its own memory separately when a page in 
the block is going to be read(pinned).

Right now the memory reuse process is, we will free the memory of a DISABLED 
block and return the reservation to the global, then other block can get the 
reservation and allocate itself. Maybe later we can have some memory pool, but 
compared to the rate of network transmission, it should be not that bad to have 
one system free and malloc during reuse.

For the memory of block structures(MemBlock), we will have all of them since 
construction of the TmpFile till destruction. For example, if the file is 256M, 
block is 16M, we will have 16 block structures created since construction of 
the TmpFile. But consider that, the size of each MemBlock structure is 128 
bytes, if we are going to spill 1TB data, the total memory needs for the block 
structures is 1T/16M*128B=8M, the number is not that bad.


http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-io-mgr.cc@430
PS5, Line 430: RN_IF_ERROR(
> This could be a performance hit to open/close a file to read one data block
It is a good point, could be a potential performance improvement.

I modified the code and did some tests, looks like the reuse of the hdfs file 
handle doesn't work that good. Firstly, the hdfsFile seems not allow 
multithreading use, therefore, there needs to be an exclusive lock among 
multiple blocks in the same file, the performance is affected. Secondly, it 
could require more memory because probably we need to keep all the file handle 
till query ends, during the tests, I have met hanging on the the calls to hdfs 
interface after lots of file handle have been created, I assume it could be 
related.

Therefore, to keep the file open, may need more efforts, could be good to put 
it in the next round of improvement.


http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-io-mgr.cc@445
PS5, Line 445: queue->read_size()->Update(local_file_size);
             :       disk_file_dst_->SetReadBufferBlockStatus(
> See my comment above. Basically, I have a good mem block and there is a rea
It depends on how the caller deal with the error.
Right now, if fails to fetch the block, we simply return the block memory to 
global(the block structure is still there, but content memory is released), 
therefore some other block can use. The one who fails to fetch a block will 
change the policy to read by page for this page. If read by page fails again, 
the query will fail.


http://gerrit.cloudera.org:8080/#/c/17979/5/be/src/runtime/io/disk-io-mgr.cc@452
PS5, Line 452:
> This code could mask out a read error set at line 445.  Better to return th
Done


http://gerrit.cloudera.org:8080/#/c/17979/6/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/17979/6/be/src/runtime/tmp-file-mgr.cc@402
PS6, Line 402: TmpFileMgr::SetUpReadBufferParams
> Since the work in this method is done for tmp_dirs_remote_ctrl_, suggest to
Done


http://gerrit.cloudera.org:8080/#/c/17979/6/be/src/runtime/tmp-file-mgr.cc@411
PS6, Line 411:  tmp_dirs_remote_ctrl_.num_read_buffer_blocks_per_file_ =
             :       
static_cast<int>(tmp_dirs_remote_ctrl_.remote_tmp_file_size_
             :           / tmp_dirs_remote_ctrl_.read_buffer_block_size_);
> This assumes that there is enough memory to hold the entire file. When it i
It doesn't need the memory for the entire file.

During the initialization, we only create the structure of the memory blocks, 
not reserve or allocate the memory for the block content.
Each block will reserve or allocate its own memory separately.

We have a global limitation on the sum of the size of all the blocks, which is 
the tmp_dirs_remote_ctrl_.max_read_buffer_size_. The global usage of the read 
buffer memory is tracked by current_read_memory_allocated_ in 
https://gerrit.cloudera.org/#/c/17979/6/be/src/runtime/tmp-file-mgr.h@400.

If the current global usage is below the limitation, the block will reserve the 
memory successfully. The memory reservation for a block is done in 
https://gerrit.cloudera.org/#/c/17979/6/be/src/runtime/tmp-file-mgr.cc@1024, 
when someone wants to pin a page.

Then with the reservation, the block can allocate the memory in 
https://gerrit.cloudera.org/#/c/17979/6/be/src/runtime/io/disk-io-mgr.cc@430. 
The data will be read from the remote file to the block memory which was just 
allocated.

If error happens or query ends, the memory of the block will be released 
according to the status of the block in 
https://gerrit.cloudera.org/#/c/17979/6/be/src/runtime/tmp-file-mgr.cc@1107. By 
decreasing the reservation tracker current_read_memory_allocated_ by one block 
size, other blocks could be able to reserve the memory then allocate.



--
To view, visit http://gerrit.cloudera.org:8080/17979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dcc5d0881ffaeff09c5c514306cd668373ad31b
Gerrit-Change-Number: 17979
Gerrit-PatchSet: 6
Gerrit-Owner: Yida Wu <wydbaggio...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Yida Wu <wydbaggio...@gmail.com>
Gerrit-Comment-Date: Fri, 12 Nov 2021 21:14:41 +0000
Gerrit-HasComments: Yes

Reply via email to