starrysxy opened a new issue, #16340: URL: https://github.com/apache/dolphinscheduler/issues/16340
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar issues.

### What happened

When I run a workflow in parallel using the complement mode, I can't stop the process instances. If I do, a few other process instances will not be scheduled, in addition to the instance I stopped (please scroll down and refer to **How to reproduce**).

I have checked the code. Process instances executed in parallel are divided into queues according to the degree of parallelism, but there is something wrong in the `org.apache.dolphinscheduler.server.master.event.WorkflowStateEventHandler#handleStateEvent` method.

When I stop one process instance, I get these logs:

```text
songxingyin@songxinyindeMBP logs % cat dolphinscheduler-standalone.2024-07-17_23.0.log | grep 'Handle workflow instance state event, the current workflow instance state WorkflowExecutionStatus' | grep 'stop'
[INFO] 2024-07-17 23:33:35.736 +0800 o.a.d.s.m.e.WorkflowStateEventHandler:[43] - Handle workflow instance state event, the current workflow instance state WorkflowExecutionStatus{code=4, desc='ready stop'} will be changed to WorkflowExecutionStatus{code=4, desc='ready stop'}
[INFO] 2024-07-17 23:33:38.809 +0800 o.a.d.s.m.e.WorkflowStateEventHandler:[43] - Handle workflow instance state event, the current workflow instance state WorkflowExecutionStatus{code=5, desc='stop'} will be changed to WorkflowExecutionStatus{code=5, desc='stop'}
```

In each event, the initial state and the target state are the same: both `{code=4, desc='ready stop'}`, or both `{code=5, desc='stop'}`.

I know the next complement command is created in the `org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteRunnable#processComplementData` method. When the state is `{code=4, desc='ready stop'}`, the `processComplementData()` method returns `false` before creating the next complement command.
And, when `{code=5, desc='stop'}`, the `processComplementData()` method will not be called, so there is also no next complement command.

```java
@Override
public boolean handleStateEvent(WorkflowExecuteRunnable workflowExecuteRunnable,
                                StateEvent stateEvent) throws StateEventHandleException {
    WorkflowStateEvent workflowStateEvent = (WorkflowStateEvent) stateEvent;
    ProcessInstance processInstance =
            workflowExecuteRunnable.getWorkflowExecuteContext().getWorkflowInstance();
    ProcessDefinition processDefinition = processInstance.getProcessDefinition();
    measureProcessState(workflowStateEvent, processInstance.getProcessDefinitionCode().toString());
    log.info(
            "Handle workflow instance state event, the current workflow instance state {} will be changed to {}",
            processInstance.getState(), workflowStateEvent.getStatus());

    if (workflowStateEvent.getStatus().isStop()) {
        // serial wait execution type needs to wake up the waiting process
        if (processDefinition.getExecutionType().typeIsSerialWait()
                || processDefinition.getExecutionType().typeIsSerialPriority()) {
            workflowExecuteRunnable.endProcess();
            return true;
        }
        workflowExecuteRunnable.updateProcessInstanceState(workflowStateEvent);
        return true;
    }

    if (workflowExecuteRunnable.processComplementData()) {
        return true;
    }

    if (workflowStateEvent.getStatus().isFinished()) {
        ...
    }
    if (workflowStateEvent.getStatus().isReadyStop()) {
        ...
    }
    return true;
}
```

```java
public boolean processComplementData() {
    ProcessInstance workflowInstance = workflowExecuteContext.getWorkflowInstance();
    if (!needComplementProcess()) {
        return false;
    }

    // when the serial complement is executed, the next complement instance is created,
    // and this method does not need to be executed when the parallel complement is used.
    if (workflowInstance.getState().isReadyStop() || !workflowInstance.getState().isFinished()) {
        return false;
    }
    ...
    return true;
}
```

### What you expected to happen

When I stop one process instance executed in parallel, the other instances should not be affected.

### How to reproduce

1. Click the 'Start' button and set the config as in the following picture.
   ![image](https://github.com/user-attachments/assets/daa3d181-f1b9-4ce1-bd1e-006fc9799612)
2. There will be 5 process instances running, and I stop one of them. In this cycle, when I stop the instance, everything looks good. But in the next cycle, there are only 4 process instances running (in the following picture, A is the first cycle and B is the second cycle). If I stop another one in a later cycle, there will be only 3 process instances running in the next cycle, and so on.
   ![image](https://github.com/user-attachments/assets/89a79487-1911-4242-ba37-fc7662aee9cf)

### Anything else

In my opinion, this is a bug: process instances executed in parallel should not influence each other. If this is a bug, I am willing to try to fix it.

### Version

dev

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@dolphinscheduler.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
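
To make the effect described in the report concrete, here is a minimal standalone sketch of the behaviour as I understand it. It is not DolphinScheduler code. The class `ParallelComplementSketch`, the `State` enum, and the methods `isFinished`, `createsNextCommand` and `runningPerCycle` are hypothetical names. The sketch only mirrors the order of the two guards quoted in the report, and it assumes that each parallel queue creates its own follow-up command and still has dates left to run.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ParallelComplementSketch {

    // Hypothetical stand-in for WorkflowExecutionStatus; only the states discussed in the report.
    enum State { RUNNING, READY_STOP, STOP, SUCCESS }

    // Assumption for this sketch: only SUCCESS and STOP count as finished.
    static boolean isFinished(State s) {
        return s == State.SUCCESS || s == State.STOP;
    }

    /**
     * Mirrors the guard order quoted in the report.
     * Returns true only if a follow-up complement command would be created.
     */
    static boolean createsNextCommand(State eventStatus, State instanceState) {
        // handleStateEvent: the isStop() branch returns before processComplementData() is called.
        if (eventStatus == State.STOP) {
            return false;
        }
        // processComplementData: returns false for 'ready stop' or any unfinished state.
        if (instanceState == State.READY_STOP || !isFinished(instanceState)) {
            return false;
        }
        // Assumes needComplementProcess() is true and the queue still has dates left.
        return true;
    }

    /**
     * Simulates one instance per live queue per cycle.
     * stopAt[c] is the index of the queue stopped by the user in cycle c, or -1 for none.
     * Returns how many instances run in each cycle.
     */
    static List<Integer> runningPerCycle(int parallelism, int[] stopAt) {
        boolean[] alive = new boolean[parallelism];
        Arrays.fill(alive, true);
        List<Integer> running = new ArrayList<>();
        for (int stopped : stopAt) {
            int count = 0;
            for (int q = 0; q < parallelism; q++) {
                if (!alive[q]) {
                    continue; // this queue never received a follow-up command
                }
                count++;
                // As in the logs, the event status equals the instance state.
                State state = (q == stopped) ? State.STOP : State.SUCCESS;
                alive[q] = createsNextCommand(state, state);
            }
            running.add(count);
        }
        return running;
    }

    public static void main(String[] args) {
        // Stop queue 0 in the first cycle and queue 1 in the second cycle.
        System.out.println(runningPerCycle(5, new int[] {0, 1, -1})); // prints [5, 4, 3]
    }
}
```

With a parallelism of 5, stopping one instance in each of the first two cycles gives 5, 4 and 3 running instances, which matches the pictures under **How to reproduce**. In this model, a stopped instance never hands its queue on to a successor.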