Github user rafaelweingartner commented on the issue:
https://github.com/apache/cloudstack/pull/1762
@serg38 I have just now started reading this PR (excuse me if I overlooked
some information).
> If we are to try to implement a general way of dealing with deadlocks in
ACS how could it be done to ensure DB consistency and correct transaction retry?
To answer your question: in my opinion, we should not "try" to
implement a general way of managing transactions. We only have this type
of problem because, instead of using a framework to manage database access
and transactions, a module was developed to do that and incorporated
into ACS; this means we have to maintain and live with this code.
Now, the problem is that it would be a Dantesque task to change the way ACS
manages transactions today.
I am with John on this one: retrying is not a good idea; it can hide
problems, add overhead and cause even more headaches. I think that the best
approach is to deal with this type of problem case by case; this means, as
John said, addressing them as bugs when they are reported.
Having said that, I have not helped a bit to solve the problem... Let's
see if I can be of any help.
I was reading the ticket #CLOUDSTACK-9595. It seems that the problem
(reported there) happened when a VM was being removed from the table
"instance_group_vm_map". I just do not understand this, because the method
called is "UserVmManagerImpl.addInstanceToGroup". I hope this makes
sense. Anyway...
The MySQL docs have the following on deadlocks:
> A deadlock is a situation where different transactions are unable to
proceed because each holds a lock that the other needs.
This means there was something else being executed when that VM was
deleted/added, and this caused the deadlock and the exception. Probably
something else is using the table "instance_group_vm_map".
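
To make the deadlock definition quoted above concrete, here is a minimal
sketch in plain Java. It is not CloudStack code: the class and method names
are hypothetical, and two in-process ReentrantLock objects merely stand in
for the database locks that two transactions might hold. The workers use
tryLock with a timeout so the demo reports the conflict instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; the locks stand in for database locks.
public class LockOrderSketch {

    // Two workers take the locks in opposite order.
    // Returns how many workers managed to obtain both locks.
    public static int runOppositeOrder() {
        ReentrantLock lockA = new ReentrantLock();
        ReentrantLock lockB = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothAttempted = new CountDownLatch(2);
        AtomicInteger completed = new AtomicInteger();
        Thread t1 = new Thread(() -> attempt(lockA, lockB, bothHoldFirst, bothAttempted, completed));
        Thread t2 = new Thread(() -> attempt(lockB, lockA, bothHoldFirst, bothAttempted, completed));
        runBoth(t1, t2);
        return completed.get();
    }

    // Two workers take the locks in the same order (A, then B).
    // Returns how many workers managed to obtain both locks.
    public static int runSameOrder() {
        ReentrantLock lockA = new ReentrantLock();
        ReentrantLock lockB = new ReentrantLock();
        AtomicInteger completed = new AtomicInteger();
        Runnable work = () -> {
            lockA.lock();
            try {
                lockB.lock();
                try {
                    completed.incrementAndGet();
                } finally {
                    lockB.unlock();
                }
            } finally {
                lockA.unlock();
            }
        };
        runBoth(new Thread(work), new Thread(work));
        return completed.get();
    }

    private static void attempt(ReentrantLock first, ReentrantLock second,
                                CountDownLatch bothHoldFirst, CountDownLatch bothAttempted,
                                AtomicInteger completed) {
        first.lock();
        try {
            // Wait until both workers hold their first lock.
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            // Each worker now needs the lock the other one holds.
            if (second.tryLock(100, TimeUnit.MILLISECONDS)) {
                try {
                    completed.incrementAndGet();
                } finally {
                    second.unlock();
                }
            }
            // Keep the first lock until both workers have made their attempt.
            bothAttempted.countDown();
            bothAttempted.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    private static void runBoth(Thread t1, Thread t2) {
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("opposite order, workers finished: " + runOppositeOrder());
        System.out.println("same order, workers finished: " + runSameOrder());
    }
}
```

With opposite ordering neither worker can finish (0 of 2); with a consistent
ordering both do (2 of 2). This is only an analogy for what the database
does, but it shows why finding the two code paths that touch the same rows
in a different order is the interesting part.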
I think we should track these two tasks/processes that can cause the
problem and work them out, instead of looking for a generic way to deal with
this situation. Maybe these processes that are causing deadlock are locking
tables that are not needed or executing some processing that could be avoided
or modified.
Do we have a use case that can reproduce the problem?