Sky-Gu opened a new issue, #13726: URL: https://github.com/apache/dolphinscheduler/issues/13726
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar issues.

### What happened

Version: 3.1.3

When a task group contains too many tasks, the tasks waiting in its queue are updated incorrectly: `t_ds_task_group_queue.in_queue` is set to 1.

Once `t_ds_task_group_queue.in_queue = 1`:

- the queued task cannot be submitted again and can only be started forcibly;
- the workflow cannot be terminated or even cancelled, and tasks submitted to the task group afterwards are affected.

The task being released has queue id=2175, but the row with id=2178 was updated during the release, as shown in the log screenshot.

The code suspected of causing the problem is shown in the screenshot below.

### What you expected to happen

Task groups can be released correctly.

### How to reproduce

Submit multiple tasks to the task group at the same time, with each task taking a different amount of time to run. The problem occurs intermittently.

### Anything else

[INFO] 2023-03-13 11:37:23.412 +0800 org.apache.dolphinscheduler.service.process.ProcessServiceImpl:[3030] - [WorkflowInstance-10245][TaskInstance-80134] - Begin to release task group: 10
[DEBUG] 2023-03-13 11:37:23.412 +0800 org.apache.dolphinscheduler.dao.mapper.TaskGroupMapper.selectById:[137] - [WorkflowInstance-10245][TaskInstance-80134] - ==> Preparing: SELECT id,name,description,group_size,use_size,user_id,status,create_time,update_time,project_code FROM t_ds_task_group WHERE id=?
[DEBUG] 2023-03-13 11:37:23.413 +0800 org.apache.dolphinscheduler.dao.mapper.TaskGroupMapper.selectById:[137] - [WorkflowInstance-10245][TaskInstance-80134] - ==> Parameters: 10(Integer)
[DEBUG] 2023-03-13 11:37:23.416 +0800 org.apache.dolphinscheduler.dao.mapper.TaskGroupMapper.selectById:[137] - [WorkflowInstance-10245][TaskInstance-80134] - <== Total: 1
[DEBUG] 2023-03-13 11:37:23.416 +0800 org.apache.dolphinscheduler.dao.mapper.TaskGroupQueueMapper.queryByTaskId:[137] - [WorkflowInstance-10245][TaskInstance-80134] - ==> Preparing: select id, task_id, task_name, group_id, process_id, priority, status , force_start , in_queue, create_time, update_time from t_ds_task_group_queue where task_id = ?
[DEBUG] 2023-03-13 11:37:23.416 +0800 org.apache.dolphinscheduler.dao.mapper.TaskGroupQueueMapper.queryByTaskId:[137] - [WorkflowInstance-10245][TaskInstance-80134] - ==> Parameters: 80134(Integer)
[DEBUG] 2023-03-13 11:37:23.420 +0800 org.apache.dolphinscheduler.dao.mapper.TaskGroupQueueMapper.queryByTaskId:[137] - [WorkflowInstance-10245][TaskInstance-80134] - <== Total: 1
[DEBUG] 2023-03-13 11:37:23.420 +0800 org.apache.dolphinscheduler.dao.mapper.TaskGroupMapper.releaseTaskGroupResource:[137] - [WorkflowInstance-10245][TaskInstance-80134] - ==> Preparing: update t_ds_task_group set use_size = use_size-1 where id = ? and use_size > 0 and (select count(1) FROM t_ds_task_group_queue where id = ? and status = ? ) = 1
[DEBUG] 2023-03-13 11:37:23.420 +0800 org.apache.dolphinscheduler.dao.mapper.TaskGroupMapper.releaseTaskGroupResource:[137] - [WorkflowInstance-10245][TaskInstance-80134] - ==> Parameters: 10(Integer), 2175(Integer), 1(Integer)
[DEBUG] 2023-03-13 11:37:23.426 +0800 org.apache.dolphinscheduler.dao.mapper.TaskGroupMapper.releaseTaskGroupResource:[137] - [WorkflowInstance-10245][TaskInstance-80134] - <== Updates: 1
[INFO] 2023-03-13 11:37:23.426 +0800 org.apache.dolphinscheduler.service.process.ProcessServiceImpl:[3056] - [WorkflowInstance-10245][TaskInstance-80134] - Finished to release task group, taskGroupId: 10
[INFO] 2023-03-13 11:37:23.426 +0800 org.apache.dolphinscheduler.service.process.ProcessServiceImpl:[3058] - [WorkflowInstance-10245][TaskInstance-80134] - Begin to release task group queue, taskGroupId: 10
[DEBUG] 2023-03-13 11:37:23.426 +0800 org.apache.dolphinscheduler.dao.mapper.TaskGroupQueueMapper.queryByTaskId:[137] - [WorkflowInstance-10245][TaskInstance-80134] - ==> Preparing: select id, task_id, task_name, group_id, process_id, priority, status , force_start , in_queue, create_time, update_time from t_ds_task_group_queue where task_id = ?
[DEBUG] 2023-03-13 11:37:23.427 +0800 org.apache.dolphinscheduler.dao.mapper.TaskGroupQueueMapper.queryByTaskId:[137] - [WorkflowInstance-10245][TaskInstance-80134] - ==> Parameters: 80134(Integer)
[DEBUG] 2023-03-13 11:37:23.429 +0800 org.apache.dolphinscheduler.remote.handler.NettyClientHandler:[191] - [WorkflowInstance-0][TaskInstance-0] - Client send heart beat to: 172.16.10.205:1234
[DEBUG] 2023-03-13 11:37:23.430 +0800 org.apache.dolphinscheduler.dao.mapper.TaskGroupQueueMapper.queryByTaskId:[137] - [WorkflowInstance-10245][TaskInstance-80134] - <== Total: 1
[DEBUG] 2023-03-13 11:37:23.431 +0800 org.apache.dolphinscheduler.dao.mapper.TaskGroupQueueMapper.updateById:[137] - [WorkflowInstance-10245][TaskInstance-80134] - ==> Preparing: UPDATE t_ds_task_group_queue SET task_id=?, task_name=?, group_id=?, process_id=?, priority=?, force_start=?, in_queue=?, status=?, create_time=?, update_time=? WHERE id=?
[DEBUG] 2023-03-13 11:37:23.431 +0800 org.apache.dolphinscheduler.dao.mapper.TaskGroupQueueMapper.updateById:[137] - [WorkflowInstance-10245][TaskInstance-80134] - ==> Parameters: 80134(Integer), report_xxx(String), 10(Integer), 10245(Integer), 0(Integer), 0(Integer), 0(Integer), 2(Integer), 2023-03-13 11:37:00.0(Timestamp), 2023-03-13 11:37:23.431(Timestamp), 2175(Integer)
[DEBUG] 2023-03-13 11:37:23.437 +0800 org.apache.dolphinscheduler.dao.mapper.TaskGroupQueueMapper.updateById:[137] - [WorkflowInstance-10245][TaskInstance-80134] - <== Updates:
[DEBUG] 2023-03-13 11:37:23.437 +0800 org.apache.dolphinscheduler.dao.mapper.TaskGroupQueueMapper.queryTheHighestPriorityTasks:[137] - [WorkflowInstance-10245][TaskInstance-80134] - ==> Preparing: select id, task_id, task_name, group_id, process_id, priority, status , force_start , in_queue, create_time, update_time from t_ds_task_group_queue where priority = (select max(priority) from t_ds_task_group_queue where group_id = ? and status = ? and in_queue = ? and force_start = ? ) and group_id = ? and status = ? and in_queue = ? and force_start = ? limit 1
[DEBUG] 2023-03-13 11:37:23.437 +0800 org.apache.dolphinscheduler.dao.mapper.TaskGroupQueueMapper.queryTheHighestPriorityTasks:[137] - [WorkflowInstance-10245][TaskInstance-80134] - ==> Parameters: 10(Integer), -1(Integer), 0(Integer), 0(Integer), 10(Integer), -1(Integer), 0(Integer), 0(Integer)
[DEBUG] 2023-03-13 11:37:23.442 +0800 org.apache.dolphinscheduler.dao.mapper.TaskGroupQueueMapper.queryTheHighestPriorityTasks:[137] - [WorkflowInstance-10245][TaskInstance-80134] - <== Total: 1
[DEBUG] 2023-03-13 11:37:23.442 +0800 org.apache.dolphinscheduler.dao.mapper.TaskGroupQueueMapper.updateInQueueCAS:[137] - [WorkflowInstance-10245][TaskInstance-80134] - ==> Preparing: update t_ds_task_group_queue set in_queue = ? where id = ? and in_queue = ?
[DEBUG] 2023-03-13 11:37:23.443 +0800 org.apache.dolphinscheduler.dao.mapper.TaskGroupQueueMapper.updateInQueueCAS:[137] - [WorkflowInstance-10245][TaskInstance-80134] - ==> Parameters: 1(Integer), 2178(Integer), 0(Integer)
[DEBUG] 2023-03-13 11:37:23.448 +0800 org.apache.dolphinscheduler.dao.mapper.TaskGroupQueueMapper.updateInQueueCAS:[137] - [WorkflowInstance-10245][TaskInstance-80134] - <== Updates: 1
[INFO] 2023-03-13 11:37:23.448 +0800 org.apache.dolphinscheduler.service.process.ProcessServiceImpl:[3073] - [WorkflowInstance-10245][TaskInstance-80134] - Finished to release task group queue: taskGroupId: 10, taskGroupQueueId: 2178

### Version

3.1.x

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
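To make the statement sequence in the log easier to follow, here is a minimal sketch that replays it against an in-memory SQLite database. It is not DolphinScheduler code. The SQL text is taken from the log, but everything else is an assumption: the two tables are cut down to the columns the statements touch, the status meanings (-1 waiting, 1 acquired, 2 released) are inferred from the logged parameters, the `updateById` call is reduced to the `status` column, and `group_size` and the task id 80137 for queue row 2178 are made up.

```python
import sqlite3

# Status codes as they appear in the log parameters (meanings are inferred).
WAITING, ACQUIRED, RELEASED = -1, 1, 2


def make_db():
    """Build a simplified, hypothetical copy of the two tables in the log."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE t_ds_task_group (
            id INTEGER PRIMARY KEY, group_size INTEGER, use_size INTEGER);
        CREATE TABLE t_ds_task_group_queue (
            id INTEGER PRIMARY KEY, task_id INTEGER, group_id INTEGER,
            priority INTEGER, status INTEGER, force_start INTEGER,
            in_queue INTEGER);
        -- Group 10 with one slot, held by task 80134 (queue id 2175).
        INSERT INTO t_ds_task_group VALUES (10, 1, 1);
        INSERT INTO t_ds_task_group_queue VALUES (2175, 80134, 10, 0, 1, 0, 0);
        -- A waiting row; task_id 80137 is made up for this sketch.
        INSERT INTO t_ds_task_group_queue VALUES (2178, 80137, 10, 0, -1, 0, 0);
    """)
    return conn


def release(conn, task_id):
    """Replay the logged statements; return the queue id that gets in_queue = 1."""
    queue_id, group_id = conn.execute(
        "select id, group_id from t_ds_task_group_queue where task_id = ?",
        (task_id,)).fetchone()
    # releaseTaskGroupResource: free one slot if this queue row holds it.
    conn.execute(
        "update t_ds_task_group set use_size = use_size-1 "
        "where id = ? and use_size > 0 and "
        "(select count(1) from t_ds_task_group_queue "
        " where id = ? and status = ?) = 1",
        (group_id, queue_id, ACQUIRED))
    # updateById, reduced here to the status column.
    conn.execute("update t_ds_task_group_queue set status = ? where id = ?",
                 (RELEASED, queue_id))
    # queryTheHighestPriorityTasks: pick a waiting row in the same group.
    row = conn.execute(
        "select id from t_ds_task_group_queue where priority = "
        "(select max(priority) from t_ds_task_group_queue "
        " where group_id = ? and status = ? and in_queue = ? and force_start = ?) "
        "and group_id = ? and status = ? and in_queue = ? and force_start = ? "
        "limit 1",
        (group_id, WAITING, 0, 0, group_id, WAITING, 0, 0)).fetchone()
    if row is None:
        return None
    # updateInQueueCAS: flip in_queue from 0 to 1 on the selected row.
    conn.execute(
        "update t_ds_task_group_queue set in_queue = ? "
        "where id = ? and in_queue = ?", (1, row[0], 0))
    return row[0]


if __name__ == "__main__":
    conn = make_db()
    print(release(conn, 80134))  # id of the row whose in_queue becomes 1
```

In this replay the final compare-and-set touches queue row 2178 while the task being released is queue row 2175, which matches the ids in the log. The sketch runs the statements once in a single connection and does not attempt to reproduce the concurrent submissions described under "How to reproduce".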
