[ 
https://issues.apache.org/jira/browse/IMPALA-9127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-9127:
-------------------------------------

    Assignee: Tim Armstrong

> Clean up probe-side state machine in hash join
> ----------------------------------------------
>
>                 Key: IMPALA-9127
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9127
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Major
>
> There's an implicit state machine in the main loop in  
> PartitionedHashJoinNode::GetNext() 
> https://github.com/apache/impala/blob/eea617b/be/src/exec/partitioned-hash-join-node.cc#L510
> The state is implicitly defined based on the following conditions:
> * !output_build_partitions_.empty() -> "outputting build rows after probing"
> * builder_->null_aware_partition() == NULL -> "eos, because the 
> null-aware partition is processed after all other partitions"
> * null_probe_output_idx_ >= 0 -> "null probe rows being processed"
> * output_null_aware_probe_rows_running_ -> "null-aware partition being 
> processed"
> * probe_batch_pos_ != -1 -> "processing probe batch"
> * builder_->num_hash_partitions() != 0 -> "have active hash partitions that 
> are being probed"
> * spilled_partitions_.empty() -> "no more spilled partitions"
> I think this code would be a lot easier to follow if the state machine were 
> explicit and documented. That would also make separating out the build side 
> of a spilling hash join easier to get right.
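
To illustrate what "explicit and documented" could look like, here is a minimal C++ sketch. It is not Impala's actual code: the enum ProbeState, its value names, ProbeStateName() and ProbeStateTracker are all hypothetical, chosen only to mirror the quoted descriptions in the list above. The initial state and the absence of any transition validation are likewise assumptions made for brevity.

```cpp
#include <cassert>
#include <string>

// Hypothetical explicit states. Each value mirrors one of the quoted
// descriptions in the ticket; none of these names come from Impala itself.
enum class ProbeState {
  OUTPUTTING_BUILD_ROWS,            // "outputting build rows after probing"
  PROCESSING_NULL_PROBE_ROWS,       // "null probe rows being processed"
  PROCESSING_NULL_AWARE_PARTITION,  // "null-aware partition being processed"
  PROCESSING_PROBE_BATCH,           // "processing probe batch"
  PROBING_HASH_PARTITIONS,          // "have active hash partitions ..."
  NO_MORE_SPILLED_PARTITIONS,       // "no more spilled partitions"
  EOS,                              // "eos"
};

// Human-readable name for logging or debugging (hypothetical helper).
inline const char* ProbeStateName(ProbeState s) {
  switch (s) {
    case ProbeState::OUTPUTTING_BUILD_ROWS: return "OUTPUTTING_BUILD_ROWS";
    case ProbeState::PROCESSING_NULL_PROBE_ROWS: return "PROCESSING_NULL_PROBE_ROWS";
    case ProbeState::PROCESSING_NULL_AWARE_PARTITION: return "PROCESSING_NULL_AWARE_PARTITION";
    case ProbeState::PROCESSING_PROBE_BATCH: return "PROCESSING_PROBE_BATCH";
    case ProbeState::PROBING_HASH_PARTITIONS: return "PROBING_HASH_PARTITIONS";
    case ProbeState::NO_MORE_SPILLED_PARTITIONS: return "NO_MORE_SPILLED_PARTITIONS";
    case ProbeState::EOS: return "EOS";
  }
  return "UNKNOWN";
}

// Holds the current state in one place, so the main loop can switch on it
// instead of re-deriving it from several member variables.
class ProbeStateTracker {
 public:
  ProbeState state() const { return state_; }
  int num_transitions() const { return num_transitions_; }

  // Records an explicit transition. A real implementation would probably
  // check here that the transition is legal; this sketch does not.
  void Transition(ProbeState next) {
    state_ = next;
    ++num_transitions_;
  }

 private:
  // Assumed starting state, picked arbitrarily for the sketch.
  ProbeState state_ = ProbeState::PROBING_HASH_PARTITIONS;
  int num_transitions_ = 0;
};
```

With something along these lines, the main loop could switch on tracker.state() and call Transition() at each point where the current code changes one of the member variables listed above.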



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
