zanmato1984 commented on code in PR #45268:
URL: https://github.com/apache/arrow/pull/45268#discussion_r1924100107
##########
cpp/src/arrow/acero/task_util.cc:
##########
@@ -278,8 +278,8 @@ Status TaskSchedulerImpl::ExecuteMore(size_t thread_id, int
num_tasks_to_execute
bool task_group_finished = false;
Status status = ExecuteTask(thread_id, group_id, task_id,
&task_group_finished);
if (!status.ok()) {
- // Mark the remaining picked tasks as finished
- for (size_t j = i + 1; j < tasks.size(); ++j) {
+ // Mark the current and remaining picked tasks as finished
+ for (size_t j = i; j < tasks.size(); ++j) {
Review Comment:
This is needed for the serial execution path to correctly set the task count.
##########
cpp/src/arrow/acero/task_util.cc:
##########
@@ -413,6 +421,8 @@ void TaskSchedulerImpl::Abort(AbortContinuationImpl impl) {
all_finished = false;
task_group.state_ = TaskGroupState::ALL_TASKS_STARTED;
}
+ } else if (task_group.state_ == TaskGroupState::ALL_TASKS_STARTED) {
Review Comment:
This is needed for allowing the `Abort` to be called in a task itself,
otherwise the abort continuation will be called twice.
I suppose the `Abort` function is not necessarily to be called inside a task
body to trigger the issue. In other words, it could happen when `Abort` is
called in a timing that is subtle enough.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]