Yida Wu has posted comments on this change. ( http://gerrit.cloudera.org:8080/17979 )
Change subject: IMPALA-10791 Add batch reading for remote temporary files ...................................................................... Patch Set 14: (11 comments) http://gerrit.cloudera.org:8080/#/c/17979/13/be/src/runtime/io/disk-io-mgr-test.cc File be/src/runtime/io/disk-io-mgr-test.cc: http://gerrit.cloudera.org:8080/#/c/17979/13/be/src/runtime/io/disk-io-mgr-test.cc@191 PS13, Line 191: protected: > Naming this "ValidateMemBlockStatus" seems more consistent with naming test Done. Moved to DiskFileTest. http://gerrit.cloudera.org:8080/#/c/17979/13/be/src/runtime/io/disk-io-mgr-test.cc@2494 PS13, Line 2494: > ObjectPool is useful when you need to generate a bunch or variable number o Done. Moved to DiskFileTest. http://gerrit.cloudera.org:8080/#/c/17979/13/be/src/runtime/io/disk-io-mgr-test.cc@2525 PS13, Line 2525: ASSERT_EQ(0, status.code()); > Does disk-io-mgr-test contain a bunch of unit tests for DiskFile too? This Added some negative tests. Moved to DiskFileTest. http://gerrit.cloudera.org:8080/#/c/17979/13/be/src/runtime/io/disk-io-mgr.cc File be/src/runtime/io/disk-io-mgr.cc: http://gerrit.cloudera.org:8080/#/c/17979/13/be/src/runtime/io/disk-io-mgr.cc@264 PS13, Line 264: disk_file_->UpdateReadBufferMetaDataIfNeeded(written_bytes - len_); > The condition check is unnecessary because UpdateReadBufferMetaData returns Done http://gerrit.cloudera.org:8080/#/c/17979/13/be/src/runtime/io/disk-io-mgr.cc@353 PS13, Line 353: if (!status.ok()) goto end; > nit: line length limit is 90, this doesn't need to wrap. Done http://gerrit.cloudera.org:8080/#/c/17979/13/be/src/runtime/io/disk-io-mgr.cc@407 PS13, Line 407: // Get the shared lock to prevent the physical files from deletion during the fetching. > Storing the result of `c_str()` is usually a bad idea, since it has no guar Done http://gerrit.cloudera.org:8080/#/c/17979/13/be/src/runtime/io/disk-io-mgr.cc@411 PS13, Line 411: shared_lock<shared_mutex> srcl(disk_file_src_->physical_file_lock_); > Initializing this up-front is a very old-school C idiom. Compilers can ofte Done http://gerrit.cloudera.org:8080/#/c/17979/13/be/src/runtime/io/disk-io-mgr.cc@447 PS13, Line 447: read_buffer_bloc, MemBlockStatus::WRITTEN, dstl, &read_buffer_lock); > If we wanted to get fancy we could use a lambda to return the read result. Done http://gerrit.cloudera.org:8080/#/c/17979/13/be/src/runtime/tmp-file-mgr-test.cc File be/src/runtime/tmp-file-mgr-test.cc: http://gerrit.cloudera.org:8080/#/c/17979/13/be/src/runtime/tmp-file-mgr-test.cc@2101 PS13, Line 2101: TEST_F(TmpFileMgrTest, TestBatchReadingFromRemote) { > Can we add a test case where the remote file is deleted while DoUpload is i Added a testcase DiskIoMgrTest::WriteToRemoteFileDeleted, but it is hard to delete the remote physical file in the middle of the uploading because the detection using hdfsExists seems only after the file is closed. http://gerrit.cloudera.org:8080/#/c/17979/13/be/src/runtime/tmp-file-mgr.cc File be/src/runtime/tmp-file-mgr.cc: http://gerrit.cloudera.org:8080/#/c/17979/13/be/src/runtime/tmp-file-mgr.cc@1019 PS13, Line 1019: // waiting for the fetch, so that it won't block the current page reading. > So if MemBlock is reserved/allocated, that means another process has alread Added comments. http://gerrit.cloudera.org:8080/#/c/17979/13/be/src/runtime/tmp-file-mgr.cc@1022 PS13, Line 1022: if (read_buffer_file->IsReadBufferBlockStatus(read_buffer_block, > Ok, took me a sec to see why these checks make sense even though the only c This part is changed a lot in part 2. Maybe we can optimize the structure in part 2. -- To view, visit http://gerrit.cloudera.org:8080/17979 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1dcc5d0881ffaeff09c5c514306cd668373ad31b Gerrit-Change-Number: 17979 Gerrit-PatchSet: 14 Gerrit-Owner: Yida Wu <wydbaggio...@gmail.com> Gerrit-Reviewer: Abhishek Rawat <ara...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com> Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com> Gerrit-Reviewer: Yida Wu <wydbaggio...@gmail.com> Gerrit-Comment-Date: Mon, 11 Jul 2022 15:53:18 +0000 Gerrit-HasComments: Yes