[Impala-ASF-CR] IMPALA-3202,IMPALA-2298: rework scratch file I/O
Dan Hecht has posted comments on this change. Change subject: IMPALA-3202,IMPALA-2298: rework scratch file I/O .. Patch Set 15: (15 comments) Haven't made it all the way through the code, but had some comments about the header to get out there. One high level comment is that at some point we had thought that the exec nodes (or buffered tuple stream equivalent layer) would need to have some control over how the buffers are laid out in scratch files, to get good sequential read patterns. This new abstraction seems to preclude that possibility, no? http://gerrit.cloudera.org:8080/#/c/5141/15//COMMIT_MSG Commit Message: PS15, Line 7: IMPALA-2298 this one is already marked as resolved. is it the right jira? Line 54: the remaining data (72M) to /tmp. this might be a good py.test. what do you think? http://gerrit.cloudera.org:8080/#/c/5141/15/be/src/runtime/buffered-block-mgr.h File be/src/runtime/buffered-block-mgr.h: Line 177: this could use a comment. http://gerrit.cloudera.org:8080/#/c/5141/15/be/src/runtime/tmp-file-mgr.h File be/src/runtime/tmp-file-mgr.h: PS15, Line 45: File given that File is no longer exposed, make it "file". Or reword this now that files are really implementation details. PS15, Line 48: a delete or make singular Line 51: /// data once the write finishes. does the read on WriteHandle block if the write wasn't complete yet, or how is this synchronized? PS15, Line 53: The files are : /// allocated lazily upon write. lazily with respect to what? this also seems redundant with the first paragraph. Line 56: /// WriteHandle. these two paragraphs kind of mix together the class abstraction and the implementation (i.e. files and directories are only implementation details now, right?). Separating it would make the abstraction clearer. Something else I wasn't sure about when reading this comment was the relationship with DiskIoMgr. With the old interface, it wasn't relevant since the user of TmpFileMgr() determined how to do the actually reads/writes. Now that this abstraction does the read/writes, it is relevant. Line 57: /// also, reading through the code (though I haven't got through it all yet), the block_size concept seems pretty integral to the abstraction, but isn't explained here. Line 62: class File; is this public declaration needed? Line 69: typedef boost::functionWriteDoneCallback; how does the callback know which write was completed? Line 96: /// data). this last sentence could use clarification that the rewrite is by the callee (i.e. it's not saying the caller may rewrite the data in place). Also, at what point can the caller reclaim the buffer for another use? The buffer ownership is not transferred, right? PS15, Line 99: be PS15, Line 102: CancelAndRestoreData CancelWriteAndRestoreData http://gerrit.cloudera.org:8080/#/c/5141/15/be/src/util/buffer-ref.h File be/src/util/buffer-ref.h: Line 26: /// a separate pointer and length. does this go away once "everything" is switched to using BufferPool BufferHandles? -- To view, visit http://gerrit.cloudera.org:8080/5141 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8c9c587df006d2f09d72dd636adafbd295fcdc17 Gerrit-PatchSet: 15 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3202,IMPALA-2298: rework scratch file I/O
Tim Armstrong has uploaded a new patch set (#15). Change subject: IMPALA-3202,IMPALA-2298: rework scratch file I/O .. IMPALA-3202,IMPALA-2298: rework scratch file I/O Refactor BufferedBlockMgr/TmpFileMgr to push more I/O logic into TmpFileMgr, in anticipation of it being shared with BufferPool. TmpFileMgr now handles: * Scratch space allocation and recycling * Read and write I/O The interface is also greatly changed so that it is built around Write() and Read() calls, abstracting away the details of temporary file allocation from clients. This means the TmpFileMgr::File class can be hidden from clients. Write error recovery: Also implement write error recovery in TmpFileMgr. If an error occurs while writing to scratch and we have multiple scratch directories, we will try one of the other directories before cancelling the query. File-level blacklisting is used to prevent excessive repeated attempts to resize a scratch file during a single query. Device-level blacklisting is still disabled because it is problematic to permanently take a scratch directory out of use. To reduce the number of error paths, all I/O errors are now handled asynchronously. Previously errors creating or extending the file were returned synchronously from WriteUnpinnedBlock(). This required modifying DiskIoMgr to create the file if not present when opened. Future Work: * Support for recycling variable-length scratch file ranges. I omitted this to avoid making the patch even large. Testing: Updated BufferedBlockMgr unit test to reflect changes in behaviour: * Scratch space is no longer permanently associated with a block, and is remapped every time a new block is written to disk . * Files are now blacklisted - updated existing tests and enable the disable blacklisting test. Added some basic testing of recycling of scratch file ranges in the TmpFileMgr unit test. I also manually tested the code in two ways. First by removing permissions for /tmp/impala-scratch and ensuring that a spilling query fails cleanly. Second, by creating a tiny ramdisk (16M) and running with two scratch directories: one on /tmp and one on the tiny ramdisk. When spilling, an out of space error is encountered for the tiny ramdisk and impala spills the remaining data (72M) to /tmp. Change-Id: I8c9c587df006d2f09d72dd636adafbd295fcdc17 --- M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-block-mgr.cc M be/src/runtime/buffered-block-mgr.h M be/src/runtime/disk-io-mgr-test.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/disk-io-mgr.h M be/src/runtime/query-state.cc A be/src/runtime/tmp-file-mgr-internal.h M be/src/runtime/tmp-file-mgr-test.cc M be/src/runtime/tmp-file-mgr.cc M be/src/runtime/tmp-file-mgr.h A be/src/util/buffer-ref.h M be/src/util/filesystem-util.cc M be/src/util/filesystem-util.h M common/thrift/ImpalaInternalService.thrift 15 files changed, 1,151 insertions(+), 506 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/5141/15 -- To view, visit http://gerrit.cloudera.org:8080/5141 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8c9c587df006d2f09d72dd636adafbd295fcdc17 Gerrit-PatchSet: 15 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong