Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8707 )

Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront
......................................................................


Patch Set 20:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/disk-io-mgr.h
File be/src/runtime/io/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/disk-io-mgr.h@289
PS20, Line 289: AddBuffersToRange
nit: AllocateBuffersForRange


http://gerrit.cloudera.org:8080/#/c/8707/14/be/src/runtime/io/disk-io-mgr.h
File be/src/runtime/io/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8707/14/be/src/runtime/io/disk-io-mgr.h@259
PS14, Line 259: do
nit: does


http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/disk-io-mgr.cc@826
PS20, Line 826: ScheduleMode::IMMEDIATELY
please see comment on request-context.cc::179


http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/request-context.cc
File be/src/runtime/io/request-context.cc:

http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/request-context.cc@174
PS20, Line 174: disk_state->ScheduleContext(lock, this, range->disk_id());
shouldn't this be called only for ScheduleMode::IMMEDIATELY?

for ScheduleMode::UPON_GETNEXT: GetNextRange() will be scheduling this context
and
for ScheduleMode::BY_CALLER : the caller will explicitly call ScheduleContext()


http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/request-context.cc@179
PS20, Line 179: DCHECK(schedule_mode == ScheduleMode::IMMEDIATELY) << 
static_cast<int>(schedule_mode);
can you document this in the method comment, that schedule_mode should always 
be ScheduleMode::IMMEDIATELY for writes.

Also, I have some confusion over here too, initially we required 
schedule_immediately to be always false, yet we always called  
ScheduleContext() for writes. What should be the desired behavior here?


http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/request-ranges.h
File be/src/runtime/io/request-ranges.h:

http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/request-ranges.h@437
PS20, Line 437: unused_buffers_
should we call this unused_internal_buffers? to make the distinction clear from 
external buffers


http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/request-ranges.h@444
PS20, Line 444: unused_buffer_bytes_returned_
this implicitly signifies the amount of bytes that have been read or are being 
read, which is why we are using this to check if we need to hold onto buffers.

this however is incremented on call to GetUnusedBuffer(), which is also being 
used in CleanUpUnusedBuffers().
Although this wont make any difference to the correctness of how this variable 
is being used, since CleanUpUnusedBuffers() is only called when eosr is true or 
scanner is cancelled, after which this variable will be redundant....still I 
would prefer this variable be incremented in GetNextBufferForRange() instead 
and renamed appropriately


http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/scan-range.cc
File be/src/runtime/io/scan-range.cc:

http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/scan-range.cc@139
PS20, Line 139: unused_buffer_bytes_ += buffer->buffer_len();
should we do a check for (unused_buffer_bytes_ >= len_ - 
unused_buffer_bytes_returned_) after absorbing each buffer.

This way we can hold on to only the buffers we need to read the remainder of 
the file and cleanup the rest of them


http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/scan-range.cc@141
PS20, Line 141: if (blocked_on_buffer_) {
              :           // This scan range was blocked and is no longer, 
schedule it again.
              :           blocked_on_buffer_ = false;
              :           schedule_range = true;
              :         }
maybe do this outside the for loop?


http://gerrit.cloudera.org:8080/#/c/8707/20/be/src/runtime/io/scan-range.cc@212
PS20, Line 212: boost::
nit: can be removed



--
To view, visit http://gerrit.cloudera.org:8080/8707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f
Gerrit-Change-Number: 8707
Gerrit-PatchSet: 20
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Tianyi Wang <tw...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Thu, 01 Feb 2018 01:28:21 +0000
Gerrit-HasComments: Yes

Reply via email to