github-actions[bot] commented on code in PR #63345:
URL: https://github.com/apache/doris/pull/63345#discussion_r3257745170


##########
be/src/exec/operator/hashjoin_build_sink.cpp:
##########
@@ -243,16 +252,9 @@ Status HashJoinBuildSinkLocalState::close(RuntimeState* 
state, Status exec_statu
 
         if (p._use_shared_hash_table) {
             LockGuard lock(p._mutex);
-            // Only signal non-builder tasks when the builder actually built 
the hash table.
-            // When the builder is terminated (woken up early because the 
probe side finished
-            // first), it never called process_build_block() so the hash table 
variant is still
-            // monostate. Setting _signaled=true in that case would cause 
non-builder tasks to
-            // enter std::visit on monostate and crash with "Hash table type 
mismatch".
-            //
-            // _terminated is reliably true here when the task was woken up 
early, because
-            // operator terminate() is called from the execute() Defer in 
PipelineTask
-            // before close() is invoked.
-            if (!_terminated) {
+            // Do not wake non-builders into a monostate hash table after 
early termination/errors.
+            const auto should_signal = !_terminated && 
hash_table_built(_shared_state);
+            if (should_signal) {

Review Comment:
   This still signals non-builder tasks when the builder is closing after a 
failed build path as long as `_hash_table_init()` has already changed the first 
variant away from `std::monostate`. For example, in shared broadcast mode 
`_hash_table_init()` initializes variants before 
`process_build_block()`/`build_asof_index()` fully complete; if a later variant 
initialization fails, or `build_asof_index()` returns an error after the hash 
table variant is initialized, `close(exec_status)` will run with `_terminated 
== false` and `hash_table_built(_shared_state) == true`, set `_signaled`, and 
wake non-builders to copy/probe shared state from a failed builder. That can 
still mask the original failure with the shared-table type mismatch or expose 
incomplete ASOF state. Please gate signaling on successful builder completion 
as well, for example by checking `exec_status.ok()` or by setting an explicit 
`build_completed` flag only after all build-side steps succeed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to