Yida Wu has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/17979 )
Change subject: IMPALA-10791 Add batching reading for remote temporary files
......................................................................

IMPALA-10791 Add batching reading for remote temporary files

The patch adds batched reads from remote temporary files to improve
read performance for spilled remote data.

The original design used the local disk file as the buffer for batched
reads from the remote file, but in practice that did not improve
performance. The design was therefore changed to use memory as the
read buffer.

Currently, each TmpFileRemote has two DiskFiles: one for the remote
file and one for the local buffer. The patch adds MemBlocks to the
local buffer file. Each local buffer file is divided evenly into
several MemBlocks. To guarantee that a single page is never split
across two blocks, the block sizes can differ slightly from one
another in practice. The default block size is the minimum of the
default file size and MAX_REMOTE_READ_MEM_BLOCK_THRESHOLD_MB, which
is 16MB.

When pinning a page, the system checks whether there is enough memory
for the block that holds the page. If there is not, it reads the page
directly and disables the block, because that may help avoid
duplicated reads of the same content from the remote filesystem. If
the system decides to fetch a block, the block stays in memory until
all of the pages in the block have been read or the query ends.

One challenge of using memory as the buffer is that the system is
short of memory precisely when it needs to spill data. We therefore
limit the memory for the read buffer to 10% of the total. The Impala
process currently reserves 20% of memory as unused memory by default,
so using 10% for an emergency case such as spilling is reasonable.

Two startup options have been added for the new feature.
1. remote_batching_read.
Default is false. If set to true, batched reads are enabled.
2. remote_read_memory_buffer_size.
Default is 1G. The maximum memory that can be used by the read buffer.
The value is also capped by the total system memory: it cannot exceed
10% of the total memory.

Added the metrics ScratchReadsUseMem/ScratchBytesReadUseMem/
ScratchBytesReadUseLocalDisk to the query profile.

The patch also increases MAX_REMOTE_TMPFILE_SIZE_THRESHOLD_MB from
256 to 512.

Tests:
Ran core and exhaustive tests.
Added and ran TmpFileMgrTest::TestBatchingReadFromRemote.
Added e2e test test_scratch_dirs_batch_reading.

Change-Id: I1dcc5d0881ffaeff09c5c514306cd668373ad31b
---
M be/src/runtime/io/disk-file.cc
M be/src/runtime/io/disk-file.h
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr-internal.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M common/thrift/metrics.json
M tests/custom_cluster/test_scratch_disk.py
13 files changed, 1,304 insertions(+), 149 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/17979/8
--
To view, visit http://gerrit.cloudera.org:8080/17979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1dcc5d0881ffaeff09c5c514306cd668373ad31b
Gerrit-Change-Number: 17979
Gerrit-PatchSet: 8
Gerrit-Owner: Yida Wu <wydbaggio...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Yida Wu <wydbaggio...@gmail.com>
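
For readers of the commit message, the block sizing and the memory cap it describes can be sketched roughly as follows. This is an illustration only, not code from the patch. All names here (kMaxBlockBytes, BlockRange, DefaultBlockSize, PartitionIntoBlocks, ReadBufferLimit) are hypothetical. The rule for keeping a page inside one block, which closes a block just before a page that would cross the target size, is an assumption; the commit message only says block sizes may differ slightly.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical constant mirroring MAX_REMOTE_READ_MEM_BLOCK_THRESHOLD_MB (16MB).
constexpr int64_t kMaxBlockBytes = 16LL * 1024 * 1024;

// Hypothetical description of one in-memory block of the local buffer file.
struct BlockRange {
  int64_t offset;  // Start offset of the block within the file.
  int64_t length;  // Bytes covered by the block.
};

// Default block size: the smaller of the file size and the 16MB threshold.
int64_t DefaultBlockSize(int64_t file_size) {
  return std::min(file_size, kMaxBlockBytes);
}

// Assigns back-to-back pages to blocks so that no page straddles two blocks.
// Assumed rule: close the current block before a page that would push it
// past the target size, so blocks may end up slightly smaller than the target.
// A single page larger than the target gets a block of its own.
std::vector<BlockRange> PartitionIntoBlocks(
    const std::vector<int64_t>& page_sizes, int64_t target_block_size) {
  assert(target_block_size > 0);
  std::vector<BlockRange> blocks;
  int64_t block_start = 0;
  int64_t block_len = 0;
  for (int64_t page : page_sizes) {
    assert(page > 0);
    if (block_len > 0 && block_len + page > target_block_size) {
      blocks.push_back({block_start, block_len});
      block_start += block_len;
      block_len = 0;
    }
    block_len += page;
  }
  if (block_len > 0) blocks.push_back({block_start, block_len});
  return blocks;
}

// Read buffer cap: the configured size (remote_read_memory_buffer_size),
// further limited to 10% of total memory.
int64_t ReadBufferLimit(int64_t configured_bytes, int64_t total_mem_bytes) {
  return std::min(configured_bytes, total_mem_bytes / 10);
}
```

With four 6-byte pages and a 16-byte target, this sketch yields two 12-byte blocks rather than cutting the third page at byte 16. With the default 1G setting, a host with 5G of total memory would be capped at about 512MB of read buffer.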