Tim Armstrong has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool ......................................................................
Patch Set 5: (39 comments) http://gerrit.cloudera.org:8080/#/c/5811/5/be/src/runtime/buffered-tuple-stream-v2.cc File be/src/runtime/buffered-tuple-stream-v2.cc: PS5, Line 181: write_page_->is_pinned() > when do we have an unpinned write page? what does that mean? I documented that they have to be pinned and added some consistency checks. If I set write_page_ = null earlier that keeps it in a consistent state. Line 189: *got_page = buffer_pool_client_->reservation()->IncreaseReservationToFit(page_len); > shouldn't this come with IMPALA-3208? or do we need this already? We need it already for the first page or if got_page was false on an earlier call and we're retrying. Line 198: write_end_ptr_ = new_page.data() + page_len; > would be slightly clearer to compute this based on write_page_ Done Line 234: DCHECK(&*read_page_ != write_page_); > here and above: DCHECK_NE/EQ It doesn't work for the first one since the values of iterators aren't printable. Line 246: bytes_in_mem_ -= read_page_->len(); > UnpinPage()? Done Line 256: if (!read_page_->is_pinned()) { > is this check sufficient? couldn't the page by pinned due to it also being I just kept this logic from the old BufferedTupleStream. Added a comment to 'write_page_' to explain. I think it's easiest to leave as-is for now. Having a pin per iterator makes a lot of sense when we have a separate iterator abstraction that manages its own reservation, but it interacts strangely with the current UnpinStream() API - if you UnpinStream() a read-write stream with a single page then it could actually use more reservation. The only current user -AnalyticEvalNode - shouldn't do that but it adds a weird edge case to be wary of. I'm open to changing it but it wasn't a strict improvement in my eyes. Line 300: bytes_in_mem_ += pages_.front().len(); > maybe factor this into PinPage() to match UnpinPage() Done PS5, Line 324: is_pinned > i guess this use is probably okay I think we'd need to change it if we started pinned pages multiple times - we might need to unpin a page here if it was pinned multiple times. http://gerrit.cloudera.org:8080/#/c/5811/5/be/src/runtime/buffered-tuple-stream-v2.h File be/src/runtime/buffered-tuple-stream-v2.h: Line 65: // > /// Done PS5, Line 66: In the pinned mode all pages are : /// pinned and the data is fully in memory. > seems redundant with the next sentence Done PS5, Line 74: PrepareForWrite > shouldn't that be PrepareForRead()? Done PS5, Line 74: the > to Done Line 75: /// pass over the stream. > this should go away soon after switching to BufferPool, right? i.e. buffers I think the concept of a destructive read pass will continue to exist, but the pages would be destroyed and buffers attached instead of the current dance with NeedsDeepCopy(). There will probably be other related changes though, e.g. around buffer management for non-destructive read passes. To support non-destructive unpinned read passes we might actually need a new BufferPool API to extract a buffer from a page without destroying the page. Line 121: /// <---12b---> <---12b---> <-----5b----> > i guess these examples all assume non-nullable? Added a nulllable example Line 127: /// caller does not need to copy if it is streaming. > are these listed in order of precedence? i.e. if a stream is both delete on Reworked this and the memory lifetime section to avoid ambiguity. Line 139: /// or free up other memory before retrying. > nit:indentation of some of these paragraphs is inconsistent Done Line 152: /// layout described above. > maybe a TODO/JIRA for what we had discussed with unifying AddRow() and Allo Done Line 199: /// Returns false and sets 'status' to an error. > nit: indentations of these bullets are inconsistent Done Line 262: /// rows will remain valid until the stream is unpinned, destroyed, etc. > let's also put TODO IMPALA-4179 to make it easier to keep straight of where Done Line 267: /// stream remains pinned. > does this interface require the stream to be pinned? if so, let's be explic Done PS5, Line 370: offset > this isn't really the offset. Done Line 374: uint8_t* read_end_ptr_; > given we store number of rows in the page, what is this used for? oh, i see Done Line 380: uint8_t* write_end_ptr_; > same, and this one looks dead anyway. This one is used in AllocateRow() where the perf matters - it likely saves a few cycles per row. I ended up removing write_page_bytes_remaining() and just inlining the calculations. Line 382: /// Number of rows returned to the caller from GetNext(). > ... since PrepareForRead() was called? Done Line 386: int read_page_idx_; > this looks dead Done Line 393: int num_pinned_; > this doesn't looked used either, except for DebugString() which iterates th Removed - it isn't needed now that we have bytes_in_mem_ Line 407: /// to AddRow() must occur before calling PrepareForRead() and subsequent calls to > or AllocateRow() Done PS5, Line 414: > nit extra space Done PS5, Line 419: This changes the : /// memory management of the stream > this sentence isn't very helpful Removed and edited. PS5, Line 454: vailable > available? Updated comment PS5, Line 463: blocks > is this talking about blocking on disk in order to repin a page? or somethi Yes - clarified the comment. Line 517: int CalcNumPinned(); > won't need this if we get rid of num_pinned_ Done http://gerrit.cloudera.org:8080/#/c/5811/5/be/src/runtime/buffered-tuple-stream-v2.inline.h File be/src/runtime/buffered-tuple-stream-v2.inline.h: PS5, Line 29: null_indicator_bytes_per_row > NullIndicatorBytesPerRow Done http://gerrit.cloudera.org:8080/#/c/5811/5/be/src/runtime/exec-env.cc File be/src/runtime/exec-env.cc: Line 242: } > when does this get destructed? Only in backend tests. I added a comment in the header to clarify that. http://gerrit.cloudera.org:8080/#/c/5811/5/be/src/runtime/exec-env.h File be/src/runtime/exec-env.h: PS5, Line 104: buffer_reservation > this could use an explanation I extended the comment on the private member to be consistent with elsewhere in the backend where we don't document trivial accessors. http://gerrit.cloudera.org:8080/#/c/5811/5/be/src/runtime/query-state.h File be/src/runtime/query-state.h: PS5, Line 153: the > that? Done http://gerrit.cloudera.org:8080/#/c/5811/5/be/src/runtime/row-batch.h File be/src/runtime/row-batch.h: PS5, Line 226: freing > freeing Done PS5, Line 227: deleted when freeing resources > garbled Done Line 232: /// consistent between buffers and I/O buffers. > in addition to making it consistent, i think we should also revisit this ma Agreed, done -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-HasComments: Yes