Yida Wu has posted comments on this change. ( http://gerrit.cloudera.org:8080/16318 )
Change subject: IMPALA-9867: Add Support for Spilling to S3: Milestone 1 ...................................................................... Patch Set 30: (9 comments) http://gerrit.cloudera.org:8080/#/c/16318/30//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16318/30//COMMIT_MSG@112 PS30, Line 112: * Ran pre-review-test > Have you tested in combination with disk spill compression? Added TmpFileMgrTest::TestCompressBufferManagement*Remote and TmpFileMgrTest::TestCompressBufferManagement*RemoteUpload test cases. http://gerrit.cloudera.org:8080/#/c/16318/30/be/src/runtime/io/disk-io-mgr.cc File be/src/runtime/io/disk-io-mgr.cc: http://gerrit.cloudera.org:8080/#/c/16318/30/be/src/runtime/io/disk-io-mgr.cc@328 PS30, Line 328: 0 > I think we should probably set the replication to 1 for this use case - tha Done http://gerrit.cloudera.org:8080/#/c/16318/30/be/src/runtime/io/local-file-system.h File be/src/runtime/io/local-file-system.h: http://gerrit.cloudera.org:8080/#/c/16318/30/be/src/runtime/io/local-file-system.h@42 PS30, Line 42: uint8 > nit: uint8_t Done http://gerrit.cloudera.org:8080/#/c/16318/30/be/src/runtime/io/local-file-system.cc File be/src/runtime/io/local-file-system.cc: http://gerrit.cloudera.org:8080/#/c/16318/30/be/src/runtime/io/local-file-system.cc@98 PS30, Line 98: uint8 > nit: uint8_t Done http://gerrit.cloudera.org:8080/#/c/16318/30/be/src/runtime/io/local-file-writer.cc File be/src/runtime/io/local-file-writer.cc: http://gerrit.cloudera.org:8080/#/c/16318/30/be/src/runtime/io/local-file-writer.cc@45 PS30, Line 45: int written = write(fileno(file_), range->data(), range->len()); > I missed this, I think we should use LocalFileSystem::Fwrite which converts Use write() instead of fwrite(), because the file handle is shared among multiple write ranges before the file is full and closed, during the period, if using fwrite(), it may not be safe to read the file because the buffer fwrite() is using may not flush to the disk. But write() is safe to read immediately because it writes without a buffer. If we don’t share the file handle for ranges of the same file, fopen() and fclose() for every write ranges to guarantee the safety of read, it would be slower than to share the file handle and use write() to do the sequential writes. Added LocalFileSystem::Write() to handle errors for calling write() and some comments. http://gerrit.cloudera.org:8080/#/c/16318/30/be/src/runtime/tmp-file-mgr.h File be/src/runtime/tmp-file-mgr.h: http://gerrit.cloudera.org:8080/#/c/16318/30/be/src/runtime/tmp-file-mgr.h@420 PS30, Line 420: /// > This is the main public API for TmpFileMgr writing. I notice that it doesn' Added. http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/tmp-file-mgr.h File be/src/runtime/tmp-file-mgr.h: http://gerrit.cloudera.org:8080/#/c/16318/27/be/src/runtime/tmp-file-mgr.h@194 PS27, Line 194: std::unique_ptr<TmpFileBufferPool> tmp_file_pool_; > Can you also put this in the comment for tmp_file_pool_? Done http://gerrit.cloudera.org:8080/#/c/16318/30/be/src/runtime/tmp-file-mgr.cc File be/src/runtime/tmp-file-mgr.cc: http://gerrit.cloudera.org:8080/#/c/16318/30/be/src/runtime/tmp-file-mgr.cc@517 PS30, Line 517: if (HasRemoteDir()) { > This doesn't seem right if write_range is just writing to a local scratch d Done http://gerrit.cloudera.org:8080/#/c/16318/30/be/src/runtime/tmp-file-mgr.cc@1820 PS30, Line 1820: DCHECK(status.ok()); > nit: use DCHECK_OK - we have a custom version that automatically prints the Done -- To view, visit http://gerrit.cloudera.org:8080/16318 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I419b1d5dbbfe35334d9f964c4b65e553579fdc89 Gerrit-Change-Number: 16318 Gerrit-PatchSet: 30 Gerrit-Owner: Yida Wu <wydbaggio...@gmail.com> Gerrit-Reviewer: Abhishek Rawat <ara...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Sahil Takiar <stak...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Yida Wu <wydbaggio...@gmail.com> Gerrit-Comment-Date: Thu, 21 Jan 2021 17:52:28 +0000 Gerrit-HasComments: Yes