starrysxy opened a new issue, #16340:
URL: https://github.com/apache/dolphinscheduler/issues/16340

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### What happened
   
   When I run a workflow in parallel using complement mode, I cannot safely stop a process instance. If I stop one, several other process instances are never scheduled, in addition to the one I stopped.
   
   For an example, see **How to reproduce** below.
   
   I have checked the code. Process instances executed in parallel are divided into queues according to the degree of parallelism, but something is wrong in the `org.apache.dolphinscheduler.server.master.event.WorkflowStateEventHandler#handleStateEvent` method.
   
   When I stop one process instance, I get these logs:
   ```text
   songxingyin@songxinyindeMBP logs % cat dolphinscheduler-standalone.2024-07-17_23.0.log | grep 'Handle workflow instance state event, the current workflow instance state WorkflowExecutionStatus' | grep 'stop'
   [INFO] 2024-07-17 23:33:35.736 +0800 o.a.d.s.m.e.WorkflowStateEventHandler:[43] - Handle workflow instance state event, the current workflow instance state WorkflowExecutionStatus{code=4, desc='ready stop'} will be changed to WorkflowExecutionStatus{code=4, desc='ready stop'}
   [INFO] 2024-07-17 23:33:38.809 +0800 o.a.d.s.m.e.WorkflowStateEventHandler:[43] - Handle workflow instance state event, the current workflow instance state WorkflowExecutionStatus{code=5, desc='stop'} will be changed to WorkflowExecutionStatus{code=5, desc='stop'}
   ```
   
   In each log line the current state and the target state are identical: both `{code=4, desc='ready stop'}` in the first line, and both `{code=5, desc='stop'}` in the second.
   
   The next complement command is created in the `org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteRunnable#processComplementData` method.
   
   When the state is `{code=4, desc='ready stop'}`, `processComplementData()` returns `false` before creating the next complement command. When the state is `{code=5, desc='stop'}`, `processComplementData()` is not called at all, so again no next complement command is created.
   
   ```java
   @Override
   public boolean handleStateEvent(WorkflowExecuteRunnable workflowExecuteRunnable,
                                   StateEvent stateEvent) throws StateEventHandleException {
       WorkflowStateEvent workflowStateEvent = (WorkflowStateEvent) stateEvent;
       ProcessInstance processInstance =
               workflowExecuteRunnable.getWorkflowExecuteContext().getWorkflowInstance();
       ProcessDefinition processDefinition = processInstance.getProcessDefinition();
       measureProcessState(workflowStateEvent, processInstance.getProcessDefinitionCode().toString());
   
       log.info(
               "Handle workflow instance state event, the current workflow instance state {} will be changed to {}",
               processInstance.getState(), workflowStateEvent.getStatus());
   
       if (workflowStateEvent.getStatus().isStop()) {
           // serial wait execution type needs to wake up the waiting process
           if (processDefinition.getExecutionType().typeIsSerialWait() || processDefinition.getExecutionType()
                   .typeIsSerialPriority()) {
               workflowExecuteRunnable.endProcess();
               return true;
           }
           workflowExecuteRunnable.updateProcessInstanceState(workflowStateEvent);
           return true;
       }
       if (workflowExecuteRunnable.processComplementData()) {
           return true;
       }
       if (workflowStateEvent.getStatus().isFinished()) {
           ...
       }
   
       if (workflowStateEvent.getStatus().isReadyStop()) {
           ...
       }
       return true;
   }
   ```
   
   ```java
   public boolean processComplementData() {
       ProcessInstance workflowInstance = workflowExecuteContext.getWorkflowInstance();
       if (!needComplementProcess()) {
           return false;
       }
   
       // when the serial complement is executed, the next complement instance is created,
       // and this method does not need to be executed when the parallel complement is used.
       if (workflowInstance.getState().isReadyStop() || !workflowInstance.getState().isFinished()) {
           return false;
       }
       ...
       return true;
   }
   ```
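
To make the two early exits easier to see, here is a minimal, self-contained sketch of the control flow quoted above. It is a simplified model, not the real DolphinScheduler code: the `Status` enum, the class name `ComplementStopSketch`, and the boolean return value meaning "next complement command created" are all hypothetical stand-ins, `needComplementProcess()` is assumed to be `true`, and `isFinished()` is assumed to cover only the stop and success states.

```java
// Simplified model of the control flow quoted above; NOT the real DolphinScheduler classes.
class ComplementStopSketch {

    // Hypothetical stand-in for WorkflowExecutionStatus, reduced to four states.
    enum Status {
        RUNNING, READY_STOP, STOP, SUCCESS;

        boolean isStop() { return this == STOP; }
        boolean isReadyStop() { return this == READY_STOP; }
        // Assumption: a stopped or successful instance counts as finished.
        boolean isFinished() { return this == STOP || this == SUCCESS; }
    }

    // Mirrors the guard in processComplementData(); returning true stands for
    // "the next complement command was created" (the elided part of the method).
    static boolean processComplementData(Status instanceState) {
        if (instanceState.isReadyStop() || !instanceState.isFinished()) {
            return false;
        }
        return true;
    }

    // Mirrors the branch order of handleStateEvent() for a parallel workflow.
    // Returns whether the next complement command was created.
    static boolean handleStateEvent(Status instanceState, Status eventStatus) {
        if (eventStatus.isStop()) {
            // Returns before processComplementData() is ever reached.
            return false;
        }
        return processComplementData(instanceState);
    }

    public static void main(String[] args) {
        // As in the logs, the instance state already equals the event status.
        for (Status s : Status.values()) {
            System.out.println(s + " -> next command created: " + handleStateEvent(s, s));
        }
    }
}
```

In this model only the success state reaches the point where the next command would be created. Both the ready-stop and the stop events return earlier, which matches the behaviour described in this report.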
   
   ### What you expected to happen
   
   When I stop one process instance executed in parallel, the other instances should not be affected.
   
   ### How to reproduce
   
   1. Click the 'Start' button and configure the run as shown in the following picture.
   
![image](https://github.com/user-attachments/assets/daa3d181-f1b9-4ce1-bd1e-006fc9799612)
   2. Five process instances start running, and I stop one of them. Within this cycle everything looks fine, but in the next cycle only 4 process instances are running (in the following picture, A is the first cycle and B is the second cycle). If I stop another one in a later cycle, only 3 process instances run in the cycle after that, and so on.
   
![image](https://github.com/user-attachments/assets/89a79487-1911-4242-ba37-fc7662aee9cf)
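
The shrinking counts can be reproduced with a toy model. It assumes, as described under **What happened**, that each parallel queue advances only when its finished instance creates the next complement command, so a stopped instance ends its queue. The class name `ParallelComplementChains` and the method `runningPerCycle` are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the reported behaviour; not DolphinScheduler code.
class ParallelComplementChains {

    // parallelism: number of parallel queues at the start.
    // stopsPerCycle[i]: how many running instances the user stops in cycle i.
    // Returns how many instances are running in each cycle.
    static List<Integer> runningPerCycle(int parallelism, int[] stopsPerCycle) {
        List<Integer> running = new ArrayList<>();
        int liveQueues = parallelism;
        for (int stops : stopsPerCycle) {
            running.add(liveQueues);
            // A stopped instance never creates the next command, so its queue ends.
            liveQueues -= Math.min(stops, liveQueues);
        }
        return running;
    }

    public static void main(String[] args) {
        // Stop one instance in cycle 1 and one in cycle 2, none in cycle 3.
        System.out.println(runningPerCycle(5, new int[] {1, 1, 0})); // prints [5, 4, 3]
    }
}
```

With the expected behaviour, stopping one instance would not remove its queue, and every cycle would still start 5 instances.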
   
   
   ### Anything else
   
   In my opinion this is a bug: process instances executed in parallel should not influence each other.
   
   If this is a bug, I am willing to try to fix it.
   
   ### Version
   
   dev
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   

