[ https://issues.apache.org/jira/browse/CLOUDSTACK-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665209#comment-15665209 ]
ASF GitHub Bot commented on CLOUDSTACK-9595:
--------------------------------------------

Github user jburwell commented on the issue:

    https://github.com/apache/cloudstack/pull/1762

@serg38 that is not a safe assumption. Transactions often span multiple statements and methods across DAOs. `TransactionLegacy` has a transaction stacking/nested model that further obscures when a transaction actually completes.

Deadlocks are a severe problem that needs to be fixed. Unfortunately, this patch would do more harm than good, as it would eventually corrupt the database. In and of themselves, retries are also a very expensive solution to the problem, both in terms of the engineering effort required to do it properly and the extra stress placed on the database to perform additional work that will likely fail. Furthermore, a generic **and** correct retry mechanism is a very difficult thing to write. Given the way transaction boundaries are managed in ACS, I think such an effort would be nearly impossible.

In a properly written application, deadlocks should very rarely, if ever, occur. Their presence is a symptom of improper transaction handling and/or poor lock management. Therefore, my suggestion is that we change this patch to log details about the context in which deadlocks occur. We can then use this information to identify the areas in ACS where these contention problems are located and fix the root cause.

> Transactions are not getting retried in case of database deadlock errors
> ------------------------------------------------------------------------
>
> Key: CLOUDSTACK-9595
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9595
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the default.)
> Affects Versions: 4.8.0
> Reporter: subhash yedugundla
> Fix For: 4.8.1
>
>
> Customer is seeing occasional error 'Deadlock found when trying to get lock; try restarting transaction' messages in their management server logs. It happens regularly at least once a day. The following is the error seen
> 2015-12-09 19:23:19,450 ERROR [cloud.api.ApiServer] (catalina-exec-3:ctx-f05c58fc ctx-39c17156 ctx-7becdf6e) unhandled exception executing api command: [Ljava.lang.String;@230a6e7f
> com.cloud.utils.exception.CloudRuntimeException: DB Exception on: com.mysql.jdbc.JDBC4PreparedStatement@74f134e3: DELETE FROM instance_group_vm_map WHERE instance_group_vm_map.instance_id = 941374
>     at com.cloud.utils.db.GenericDaoBase.expunge(GenericDaoBase.java:1209)
>     at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>     at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>     at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
>     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
>     at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>     at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>     at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>     at com.sun.proxy.$Proxy237.expunge(Unknown Source)
>     at com.cloud.vm.UserVmManagerImpl$2.doInTransactionWithoutResult(UserVmManagerImpl.java:2593)
>     at com.cloud.utils.db.TransactionCallbackNoReturn.doInTransaction(TransactionCallbackNoReturn.java:25)
>     at com.cloud.utils.db.Transaction$2.doInTransaction(Transaction.java:57)
>     at com.cloud.utils.db.Transaction.execute(Transaction.java:45)
>     at com.cloud.utils.db.Transaction.execute(Transaction.java:54)
>     at com.cloud.vm.UserVmManagerImpl.addInstanceToGroup(UserVmManagerImpl.java:2575)
>     at com.cloud.vm.UserVmManagerImpl.updateVirtualMachine(UserVmManagerImpl.java:2332)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
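
As a rough illustration of the suggestion in the comment to log the context in which deadlocks occur rather than retry, a minimal sketch might look like the following. The class `DeadlockContextLogger` and its methods are hypothetical and are not part of CloudStack; the sketch uses `java.util.logging` only to stay self-contained, and it assumes a MySQL backend, where a deadlock is reported as vendor error code 1213 with SQLSTATE 40001.

```java
import java.sql.SQLException;
import java.util.logging.Logger;

// Hypothetical helper, not an existing CloudStack class.
public class DeadlockContextLogger {
    // MySQL reports ER_LOCK_DEADLOCK as vendor code 1213 with SQLSTATE 40001.
    static final int MYSQL_DEADLOCK_ERROR_CODE = 1213;
    static final String SERIALIZATION_FAILURE_SQLSTATE = "40001";

    private static final Logger LOG =
            Logger.getLogger(DeadlockContextLogger.class.getName());

    /** Walks the cause chain looking for a SQLException that signals a deadlock. */
    static boolean isDeadlock(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur instanceof SQLException) {
                SQLException se = (SQLException) cur;
                if (se.getErrorCode() == MYSQL_DEADLOCK_ERROR_CODE
                        || SERIALIZATION_FAILURE_SQLSTATE.equals(se.getSQLState())) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Builds a log message with the thread, the statement, and the call stack. */
    static String describe(String sql, Throwable t) {
        StringBuilder sb = new StringBuilder("Deadlock detected");
        sb.append(" thread=").append(Thread.currentThread().getName());
        sb.append(" sql=[").append(sql).append("]");
        for (StackTraceElement frame : t.getStackTrace()) {
            sb.append("\n    at ").append(frame);
        }
        return sb.toString();
    }

    /** Logs the context if the failure is a deadlock; never retries. */
    static boolean logIfDeadlock(String sql, Throwable t) {
        if (!isDeadlock(t)) {
            return false;
        }
        LOG.warning(describe(sql, t));
        return true;
    }
}
```

A caller would invoke `logIfDeadlock` from the place where the database exception is caught and then rethrow the original exception, so that transaction behavior is unchanged and only the diagnostic information is added.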