Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3202,IMPALA-2079: rework scratch file I/O
......................................................................


Patch Set 20:

(25 comments)

Added an end-to-end test. This required adding a flag to allow multiple scratch 
dirs per device.

http://gerrit.cloudera.org:8080/#/c/5141/20/be/src/runtime/buffered-block-mgr.cc
File be/src/runtime/buffered-block-mgr.cc:

Line 633:       // The block's buffer is still in memory with the original data.
> ... but we couldn't get additional reservation. However, we can use 'releas
Done


Line 656: 
> // FindBufferForBlock() wasn't able to find a buffer so transfer the one fr
Done


Line 845:     if (block->write_handle_ != NULL) {
> how could write_handle_ be NULL here?  anyway, okay with leaving this if-st
Good point. Removed the unnecessary if stmt.


http://gerrit.cloudera.org:8080/#/c/5141/20/be/src/runtime/buffered-block-mgr.h
File be/src/runtime/buffered-block-mgr.h:

Line 363:   int64_t GetNumWritesOutstanding();
> since the test class is a friend, can this be made private?
Done


PS20, Line 403: pinned state
> but this method doesn't actually set the pin bit on the block...
Went with CancelWrite(), since it also reverses the effects if the write is 
finished.


PS20, Line 458: or
> double or
Done


http://gerrit.cloudera.org:8080/#/c/5141/20/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

PS20, Line 1167: with
> will
Done


Line 1173:                                      write_range->file_, errno, 
GetStrErrMsg())));
> weird line break
clang-format gets funny with multi-line strings. I reworded the error message 
anyway to be more user-centric and it wraps better now.


http://gerrit.cloudera.org:8080/#/c/5141/20/be/src/runtime/tmp-file-mgr-internal.h
File be/src/runtime/tmp-file-mgr-internal.h:

Line 41:   /// The physical file is created on the first call to 
AllocateSpace().
> is that true?
Nope, fixed.


PS20, Line 43:  
> on
Done


Line 93:   /// blacklisted, the corresponding device will always be blacklisted.
> hmm i don't see how a device is blacklisted.
Yeah this comment is stale - the device blacklisting was removed a long time 
ago. Updated comment.


http://gerrit.cloudera.org:8080/#/c/5141/20/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

Line 404:   // Don't hold 'lock_' outside AllocateSpace() to minimise 
contention.
> not sure what this comment is telling me.
Removed, I don't think it's needed.


PS20, Line 499: Log the error before retrying so that the failure isn't 
swallowed silently.
> is this referring to the old LogError() code, or the scratch_errors_?
Yeah it needs updating - "Save and report" i think is more accurate.


Line 500:   // Make sure we don't allocate new blocks in this file. This is to 
avoid hitting
Removed the second part of the comment - I don't think it added anything that 
wasn't covered by the long comment further down.


Line 507:   handle->file_->ReportIOError(write_status.msg());
> this does the file-level blacklist. But i don't see where the device-level 
We don't do any permanent device-level blacklisting.


http://gerrit.cloudera.org:8080/#/c/5141/19/be/src/runtime/tmp-file-mgr.h
File be/src/runtime/tmp-file-mgr.h:

Line 330:     /// so no locking is required. This is a terminal lock and should 
not be held while
> There's nothing really about this class that makes it need to have query sc
Yep, it's all gone now - much simpler


http://gerrit.cloudera.org:8080/#/c/5141/20/be/src/runtime/tmp-file-mgr.h
File be/src/runtime/tmp-file-mgr.h:

PS20, Line 40: local
> delete, or say (usually local) or similar
Done


PS20, Line 48: Temporary files are managed via a FileGroup. To write to a 
temporary file, first a
             : /// FileGroup is created, then FileGroup::Write() is called to 
asynchronously write a
             : /// memory buffer to disk.
> This still kinda makes it sound like the user of this class can care about 
Done


PS20, Line 78: It is used as a handle for external classes to identify devices.
> if this is only used externally for tests, I think we should say something 
Reworded to make it clear that it's internal.


PS20, Line 167: w
> handle->write_in_flight_ ?
Done


Line 276:     void Cancel();
> does this need to be exposed, or could all callers just use CancelWriteAndR
It doesn't need to be exposed now that we don't have 'state_' (previously it 
was called from BufferedBlockMgr to avoid that issue). Might be useful in 
future to cancel the write while discarding the data, but agree it's easier to 
keep it private for now to avoid complicating lifecycle.


PS20, Line 300: SetWriteComplete
> "Set" makes it sound like it just sets state.  Maybe just call this WriteCo
Done


http://gerrit.cloudera.org:8080/#/c/5141/20/be/src/util/buffer-ref.h
File be/src/util/buffer-ref.h:

PS20, Line 27: Ref
> what does "Ref" stand for? Reference?  This might get confused with BufferH
Changed to MemRange.

Yeah I was planning to do a follow-up patch to use this in other places in 
bufferpool/.


Line 29:   BufferRef(uint8_t* data, int64_t len) : data_(data), len_(len) {}
Added some DCHECKs to the constructor.


http://gerrit.cloudera.org:8080/#/c/5141/20/be/src/util/disk-info.h
File be/src/util/disk-info.h:

Line 91:   static int num_datanode_dirs_;
> dead code also?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5141
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9c587df006d2f09d72dd636adafbd295fcdc17
Gerrit-PatchSet: 20
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to