[ https://issues.apache.org/jira/browse/IMPALA-9127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Armstrong reassigned IMPALA-9127:
-------------------------------------

    Assignee: Tim Armstrong

> Clean up probe-side state machine in hash join
> ----------------------------------------------
>
>                 Key: IMPALA-9127
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9127
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Major
>
> There's an implicit state machine in the main loop in
> PartitionedHashJoinNode::GetNext()
> https://github.com/apache/impala/blob/eea617b/be/src/exec/partitioned-hash-join-node.cc#L510
> The state is implicitly defined based on the following conditions:
> * !output_build_partitions_.empty() -> "outputting build rows after probing"
> * builder_->null_aware_partition() == NULL -> "eos, because the
> null-aware partition is processed after all other partitions"
> * null_probe_output_idx_ >= 0 -> "null probe rows being processed"
> * output_null_aware_probe_rows_running_ -> "null-aware partition being
> processed"
> * probe_batch_pos_ != -1 -> "processing probe batch"
> * builder_->num_hash_partitions() != 0 -> "have active hash partitions that
> are being probed"
> * spilled_partitions_.empty() -> "no more spilled partitions"
> I think this would be a lot easier to follow if the state machine were
> explicit and documented, and it would make separating out the build side of a
> spilling hash join easier to get right.
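
To illustrate what an explicit version of the state machine described in the issue might look like, here is a minimal, standalone C++ sketch. It is not Impala code: `ProbeSideFlags`, `ProbeState`, `DeriveProbeState()` and `ToString()` are hypothetical names invented for this example, the struct fields only mirror the member variables quoted in the issue, and the order in which the conditions are checked is an assumption rather than the order used in PartitionedHashJoinNode::GetNext().

```cpp
#include <cassert>
#include <string>

// Hypothetical snapshot of the conditions listed in the issue. The comments
// name the Impala expression each field stands in for.
struct ProbeSideFlags {
  bool has_output_build_partitions = false;    // !output_build_partitions_.empty()
  int null_probe_output_idx = -1;              // null_probe_output_idx_
  bool null_aware_probe_rows_running = false;  // output_null_aware_probe_rows_running_
  int probe_batch_pos = -1;                    // probe_batch_pos_
  int num_hash_partitions = 0;                 // builder_->num_hash_partitions()
  bool has_spilled_partitions = false;         // !spilled_partitions_.empty()
  bool has_null_aware_partition = false;       // builder_->null_aware_partition() != NULL
};

// Hypothetical explicit states, one per implicit condition.
enum class ProbeState {
  kOutputtingBuildRows,
  kOutputtingNullProbeRows,
  kOutputtingNullAwareRows,
  kProcessingProbeBatch,
  kProbingHashPartitions,
  kSpilledPartitionsPending,
  kNullAwarePartitionPending,
  kEos,
};

// Maps the implicit flags to a single named state in one place.
// ASSUMPTION: this precedence is illustrative only; the real loop may
// test the conditions in a different order.
inline ProbeState DeriveProbeState(const ProbeSideFlags& f) {
  if (f.has_output_build_partitions) return ProbeState::kOutputtingBuildRows;
  if (f.null_probe_output_idx >= 0) return ProbeState::kOutputtingNullProbeRows;
  if (f.null_aware_probe_rows_running) return ProbeState::kOutputtingNullAwareRows;
  if (f.probe_batch_pos != -1) return ProbeState::kProcessingProbeBatch;
  if (f.num_hash_partitions != 0) return ProbeState::kProbingHashPartitions;
  if (f.has_spilled_partitions) return ProbeState::kSpilledPartitionsPending;
  if (f.has_null_aware_partition) return ProbeState::kNullAwarePartitionPending;
  // Nothing left to process, including the null-aware partition.
  return ProbeState::kEos;
}

// Human-readable state name, e.g. for logging or DCHECK messages.
inline std::string ToString(ProbeState s) {
  switch (s) {
    case ProbeState::kOutputtingBuildRows: return "OutputtingBuildRows";
    case ProbeState::kOutputtingNullProbeRows: return "OutputtingNullProbeRows";
    case ProbeState::kOutputtingNullAwareRows: return "OutputtingNullAwareRows";
    case ProbeState::kProcessingProbeBatch: return "ProcessingProbeBatch";
    case ProbeState::kProbingHashPartitions: return "ProbingHashPartitions";
    case ProbeState::kSpilledPartitionsPending: return "SpilledPartitionsPending";
    case ProbeState::kNullAwarePartitionPending: return "NullAwarePartitionPending";
    case ProbeState::kEos: return "Eos";
  }
  assert(false && "unhandled ProbeState");
  return "Unknown";
}
```

With a single enum like this, the main loop could switch on the current state and each transition could be documented next to the state it leaves. A real cleanup would more likely store the state in a member and update it at each transition instead of re-deriving it from flags, but that is a design choice the issue leaves open.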