[ https://issues.apache.org/jira/browse/CLOUDSTACK-7778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Min Chen reassigned CLOUDSTACK-7778:
------------------------------------

    Assignee: Min Chen

> Start VM checkWorkItem loop should also check VM DB state before going into idle waiting to exit faster.
> --------------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-7778
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-7778
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public (Anyone can view this level - this is the default.)
>          Components: Management Server
>    Affects Versions: 4.0.0
>            Reporter: Min Chen
>            Assignee: Min Chen
>             Fix For: 4.5.0
>
>
> VM deployment may involve starting a virtual router (VR). Meanwhile, our HA
> process may also try to start the same VR, for example after a host
> disconnect. Before the 4.3 release, we did not serialize these two VR start
> operations and instead relied on a VM state transition failure to detect a
> concurrent operation. When a concurrent operation is detected, we do not
> fail the VM deployment job immediately. Instead, retry logic keeps checking
> the op_it_work table to see whether other outstanding work items are
> operating on the same VR. If an op_it_work item is left dangling, this retry
> runs for more than one hour and then fails, even though the HA process may
> have started the VR long before. The recent VMsync framework change has made
> such concurrent VM operations less frequent, but the while loop that checks
> op_it_work items should still check the current VM state so that it can exit
> early instead of relying purely on the op_it_work table being updated
> properly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
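
The early exit proposed in the issue description can be sketched roughly as
follows. This is only an illustration of the idea, not the CloudStack code or
the actual fix: the class, enum, and method names, the two suppliers standing
in for the VM state lookup and the op_it_work query, and the retry and sleep
parameters are all hypothetical.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

// Hypothetical sketch; none of these names come from CloudStack itself.
class StartWaitSketch {

    enum VmState { Starting, Running, Stopped }

    enum Outcome { ALREADY_RUNNING, WORK_ITEM_CLEARED, TIMED_OUT, INTERRUPTED }

    /**
     * Waits for a concurrent start operation on the same VM to finish.
     *
     * @param vmStateFromDb       assumed lookup of the VM's current DB state
     * @param workItemOutstanding assumed query: is another work item still
     *                            recorded for this VM (e.g. in op_it_work)?
     */
    static Outcome waitForConcurrentStart(Supplier<VmState> vmStateFromDb,
                                          BooleanSupplier workItemOutstanding,
                                          int maxRetries,
                                          long sleepMillis) {
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            // Check the VM state first: if another process already started
            // the VM, a dangling work item no longer matters.
            if (vmStateFromDb.get() == VmState.Running) {
                return Outcome.ALREADY_RUNNING;
            }
            // No outstanding work item: the caller may retry its own start.
            if (!workItemOutstanding.getAsBoolean()) {
                return Outcome.WORK_ITEM_CLEARED;
            }
            // Only now go into the idle wait before the next check.
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return Outcome.INTERRUPTED;
            }
        }
        return Outcome.TIMED_OUT;
    }
}
```

With a work item that never clears, a call such as
waitForConcurrentStart(() -> VmState.Running, () -> true, 3, 1000L) returns
on the first pass because the state check runs before the sleep. Without that
check, the same loop would use up every retry before giving up.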