Tim Armstrong has posted comments on this change. Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool ......................................................................
Patch Set 8: (19 comments) http://gerrit.cloudera.org:8080/#/c/5811/8/be/src/runtime/buffered-tuple-stream-v2.cc File be/src/runtime/buffered-tuple-stream-v2.cc: PS8, Line 108: if > iff, just to be super clear. Done PS8, Line 110: read_page_ != pages_.end() > why is this condition needed? what does it mean to have read_page_ == pages This means that there's a read iterator active. read_page_ == pages_.end() && pinned_ means the stream is pinned but there is no read iterator. I clarified in the comment of read_page_ and restructured this condition so the meanings of subexpressions were more explicit. Line 116: } > fine with me to keep, but isn't this the same as line 109? Done Line 141: ss << &*read_page_; > will it be possible to identify which page in the list printed below this i Yes, you can match up the addresses. I made this change to reduce the amount of redundancy in the debug output. Line 166: DCHECK(read_page_ == pages_.end()); > DCHECK_EQ x2 We don't generally use DCHECK_EQ for null pointer comparison (I did switch this to nullptr for consistency though). We also can't use it for iterators since they're not printable. PS8, Line 232: read_page_ != pages_.end() > what does this condition mean? why isn't it just: if the stream is pinned f It means there's an active read iterator. I actually updated the has_read_page() method to be called has_read_iterator() and used that instead - I think it's clear. If there's only a write iterator we don't need to double-pin (because it's not really a read/write stream). I rewrote things in terms of the presence of read/write iterators instead of the read_write mode flag. Now read_write_ mode only directly influences whether PrepareForRead() invalidates the write iterator. Line 241: RETURN_IF_ERROR(buffer_pool_->Pin(buffer_pool_client_, &new_page.handle)); > does this purposely not go through PinPage()? I was experimenting a bit with different flows here so that related operations were grouped together. Seems like it was confusing so I switched back to PinPage(). PS8, Line 250: (double_pin ? 2 : 1) > i guess because of this. maybe move this to be after the CreatePage() call Done Line 348: *pinned = true; > why was this needed? Previously PinStream() was idempotent - I wanted to preserve that behaviour. PS8, Line 356: We need to double-pin the current write page if we also have a read iterator. > why? or maybe you mean just that we need to take a pin-count for reading ev Tried to reword to be clearer. PS8, Line 358: write_page_ == &page && read_page_ == pages_.end() > why? i thought we now do take a separate pin count for read and for write? This essentially implements that idea. Some of the complication is that on a non-read-write stream where there is only ever one iterator active, we don't need to do the double-pinning of the current write page. Line 387: if (pinned_) UnpinPage(&page); > do we allow you to call UnpinStream() on an already unpinned stream? why? I restructured this code to hopefully be clearer. As a bonus, calling UnpinStream() on an already-unpinned stream is now O(1) instead of O(n) PS8, Line 388: continue > i think this would be clearer as: I think the new version is easier to follow. http://gerrit.cloudera.org:8080/#/c/5811/8/be/src/runtime/buffered-tuple-stream-v2.h File be/src/runtime/buffered-tuple-stream-v2.h: PS8, Line 68: . > maybe add: "for reading" Done PS8, Line 69: access > read access? We do update some tuples in place in the agg so that might be more misleading. I changed this so that it's clearer that it means that pointers to rows can be saved (since we don't really support random access via the stream interface). PS8, Line 75: event > even Done PS8, Line 319: bytes_pinned > i guess should be BytesPinned() since not a simple accessor Done PS8, Line 385: including any pages already deleted in 'delete_on_read' : /// mode. > that's not intuitive. is there a good reason why we do that rather than mak Currently those kind of invariants aren't maintained when reading in delete_on_read mode - that has to be the last pass over the stream. num_rows_ also isn't updated. I don't see a use case for why client code would depend on these values being updated while the stream is destructively read so the value of adding additional logic to keep this in sync would be limited. I already put some effort into updating the documentation and DCHECKs so that it was clearer that delete_on_read was destructive and you couldn't re-read a stream afterwards. PS8, Line 392: num_rows_returned_ > rows_returned_? (or rename that member, which might be better). Updated the comment. I would rename the member in other circumstances but I don't want to make too make minor API changes since it will increase the amount of code churn and noise in the patch that switches the ExecNodes over to this. -- To view, visit http://gerrit.cloudera.org:8080/5811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-HasComments: Yes