Dan Hecht has posted comments on this change. Change subject: IMPALA-3567 Part 2, IMPALA-3899: factor out PHJ builder ......................................................................
Patch Set 18: (18 comments) http://gerrit.cloudera.org:8080/#/c/3873/18/be/src/exec/partitioned-hash-join-builder.cc File be/src/exec/partitioned-hash-join-builder.cc: PS18, Line 326: we'd like all partitions to ... all partitions .. PS18, Line 327: be are (to match above edit) Line 353: // building hash tables because allocating probe buffers may cause some more partitions duplicated sentences PS18, Line 797: filters_.size() > 0) this doesn't seem to exactly match the logic at line 166 (that doesn't care about filters_.size(). http://gerrit.cloudera.org:8080/#/c/3873/18/be/src/exec/partitioned-hash-join-builder.h File be/src/exec/partitioned-hash-join-builder.h: Line 276: /// Returns non-ok status if we couldn't spill a partition. nit: weird line breaking. PS18, Line 280: should have all of their build rows what is this trying to say? that we're done partitioning the build? PS18, Line 286: no buffers in either the build or probe stream. clarify whether this means neither has buffers or at least one does not have buffers (the latter being what the current wording seems to imply). PS18, Line 287: in_mem are these states actually represented in code? (if not, why the underscore?). also, how about pointing at that for #2, no probe streams are needed. PS18, Line 291: probe_buffers_for_spilled_partitions_ what's that? PS18, Line 301: probe_buffers_for_spilled_partitions_ same PS18, Line 349: PartitionedHashJoinNode who owns it? PS18, Line 427: so that the : /// initial probe buffer is allocated this doesn't really explain why the builder has to do it. Would be good to explain that. i.e. this can lead to more spilling Line 432: /// phase of the algorithm without spilling more partitions. this hints at it, so maybe move this up and reword to be more explicit. http://gerrit.cloudera.org:8080/#/c/3873/18/be/src/exec/partitioned-hash-join-node-ir.cc File be/src/exec/partitioned-hash-join-node-ir.cc: PS18, Line 285: *status if we can now do this, can we also just make NextProbeRow() and others return Status directly? Line 293: if (LIKELY(hash_tbl != NULL)) { is probe_partition NULL in this case? http://gerrit.cloudera.org:8080/#/c/3873/18/be/src/exec/partitioned-hash-join-node.cc File be/src/exec/partitioned-hash-join-node.cc: Line 47: "the memory limit may help this query to complete successfully."; weird line breakages http://gerrit.cloudera.org:8080/#/c/3873/18/be/src/exec/partitioned-hash-join-node.h File be/src/exec/partitioned-hash-join-node.h: Line 404: /// builder_->hash_partitions. This is non-empty only in the PARTITIONING_PROBE or underscore, right? PS18, Line 406: a and? -- To view, visit http://gerrit.cloudera.org:8080/3873 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1e02ea9c7a7d1a0f373b11aa06c3237e1c7bd4cb Gerrit-PatchSet: 18 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com> Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho <k...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-HasComments: Yes