[ https://issues.apache.org/jira/browse/CLOUDSTACK-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15696396#comment-15696396 ]
ASF GitHub Bot commented on CLOUDSTACK-9595:
--------------------------------------------

Github user serg38 commented on the issue:

https://github.com/apache/cloudstack/pull/1762

@rafaelweingartner You might be right that pod_vlan_map should be in the join. Maybe I didn't find the correct methods after all. @jburwell @rhtyd What do you think?

I was able to find the management server log for Deadlock 1. It looks like one of the transactions came from the findAndUpdateDirectAgentToLoad method in HostDaoImpl, which creates a rather complex transaction:

2016-11-24 15:04:39,284 DEBUG [host.dao.HostDaoImpl] (ClusteredAgentManager Timer:ctx-a8e9449c) Resetting hosts suitable for reconnect
2016-11-24 15:04:39,320 DEBUG [db.Transaction.Transaction] (ClusteredAgentManager Timer:ctx-a8e9449c) Rolling back the transaction: Time = 36 Name = ClusteredAgentManager Timer; called by -TransactionLegacy.rollback:879-TransactionLegacy.removeUpTo:822-TransactionLegacy.close:646-TransactionContextInterceptor.invoke:36-ReflectiveMethodInvocation.proceed:161-ExposeInvocationInterceptor.invoke:91-ReflectiveMethodInvocation.proceed:172-JdkDynamicAopProxy.invoke:204-$Proxy48.findAndUpdateDirectAgentToLoad:-1-ClusteredAgentManagerImpl.scanDirectAgentToLoad:195-ClusteredAgentManagerImpl.runDirectAgentScanTimerTask:185-ClusteredAgentManagerImpl.access$100:99
2016-11-24 15:04:39,322 ERROR [agent.manager.ClusteredAgentManagerImpl] (ClusteredAgentManager Timer:ctx-a8e9449c) Unexpected exception
DB Exception on: com.mysql.jdbc.JDBC4PreparedStatement@1e58727c: SELECT host.id, host.disconnected, host.name, host.status, host.type, host.private_ip_address, host.private_mac_address, host.private_netmask, host.public_netmask, host.public_ip_address, host.public_mac_address, host.storage_ip_address, host.cluster_id, host.storage_netmask, host.storage_mac_address, host.storage_ip_address_2, host.storage_netmask_2, host.storage_mac_address_2, host.hypervisor_type, host.proxy_port, host.resource, host.fs_type, host.available, host.setup, host.resource_state, host.hypervisor_version, host.update_count, host.uuid, host.data_center_id, host.pod_id, host.cpu_sockets, host.cpus, host.url, host.speed, host.ram, host.parent, host.guid, host.capabilities, host.total_size, host.last_ping, host.mgmt_server_id, host.dom0_memory, host.version, host.created, host.removed FROM host WHERE host.resource IS NOT NULL AND host.mgmt_server_id = 345048964870 AND host.last_ping <= 1445339907 AND host.cluster_id IS NOT NULL AND host.status IN ('Disconnected','Down','Alert') AND host.removed IS NULL FOR UPDATE
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction

The beginning of the second transaction was:

SELECT host.id, host.disconnected, host.name, host.status, host.type, host.private_ip_address, host.private_mac_address, host.private_netmask, host.public_netmask, host.public_ip_address, host.public_mac_address, host.storage_ip_address, host.cluster_id, host.storage_netmask, host.storage_mac_address, host.storage_ip_address_2, host.storage_netmask_2, host.storage_mac_address_2, host.hypervisor_type, host.proxy_port, host.resource, host.fs_type, host.available, host.setup, host.resource_state, host.hypervisor_version, host.update_count, host.uuid, host.data_center_id, host.pod_id, host.cpu_sockets, host.cpus, host.url, host.speed, host.ram, host.parent, host.guid, host.capabilities, host.total_size, host.last_ping, host.mgmt_server_id, host.dom0_memory, host.version, host.created, host.removed FROM host LEFT OUTER JOIN op_host_transfer ON host.id=op_host_transfer.id IN

I will try to trace it to the ACS method.
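
For reference, the retry behaviour this issue asks for could look roughly like the sketch below. It is not the CloudStack implementation or the change proposed in the pull request: the class DeadlockRetry and its methods withRetry and isDeadlock are hypothetical names. It assumes the deadlock surfaces as a java.sql.SQLException somewhere in the cause chain (as the "Caused by" line in the log above suggests) and that it carries MySQL error code 1213 or SQLSTATE 40001. The unit of work passed in has to be the whole transaction, since MySQL rolls back the entire transaction it picks as the deadlock victim.

```java
import java.sql.SQLException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of CloudStack: re-runs a whole unit of work
// when the failure looks like a MySQL deadlock.
final class DeadlockRetry {
    // MySQL reports deadlocks as error 1213 (ER_LOCK_DEADLOCK), SQLSTATE 40001.
    static final int MYSQL_ER_LOCK_DEADLOCK = 1213;
    static final String SQLSTATE_DEADLOCK = "40001";

    private DeadlockRetry() {
    }

    // Walks the cause chain, because the SQLException may be wrapped in an
    // application exception (assumption: the wrapper keeps the cause).
    static boolean isDeadlock(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SQLException) {
                SQLException e = (SQLException) c;
                if (e.getErrorCode() == MYSQL_ER_LOCK_DEADLOCK
                        || SQLSTATE_DEADLOCK.equals(e.getSQLState())) {
                    return true;
                }
            }
        }
        return false;
    }

    // Runs work up to maxAttempts times. Only deadlock failures are retried;
    // anything else, or a deadlock on the last attempt, is rethrown as-is.
    static <T> T withRetry(int maxAttempts, long backoffMillis, Callable<T> work)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return work.call();
            } catch (Exception e) {
                if (!isDeadlock(e) || attempt >= maxAttempts) {
                    throw e;
                }
                // Linear backoff so the competing transaction can finish first.
                Thread.sleep(backoffMillis * attempt);
            }
        }
    }
}
```

A caller would wrap the complete transaction, for example DeadlockRetry.withRetry(3, 50L, () -> runWholeTransaction()), where runWholeTransaction is a placeholder for code that begins, executes and commits the transaction. Retrying a single statement inside an already rolled-back transaction would not be safe.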
> Transactions are not getting retried in case of database deadlock errors
> ------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-9595
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9595
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.)
>    Affects Versions: 4.8.0
>            Reporter: subhash yedugundla
>             Fix For: 4.8.1
>
>
> Customer is seeing occasional error 'Deadlock found when trying to get lock; try restarting transaction' messages in their management server logs. It happens regularly at least once a day. The following is the error seen
> 2015-12-09 19:23:19,450 ERROR [cloud.api.ApiServer] (catalina-exec-3:ctx-f05c58fc ctx-39c17156 ctx-7becdf6e) unhandled exception executing api command: [Ljava.lang.String;@230a6e7f
> com.cloud.utils.exception.CloudRuntimeException: DB Exception on: com.mysql.jdbc.JDBC4PreparedStatement@74f134e3: DELETE FROM instance_group_vm_map WHERE instance_group_vm_map.instance_id = 941374
>       at com.cloud.utils.db.GenericDaoBase.expunge(GenericDaoBase.java:1209)
>       at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>       at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>       at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>       at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:34)
>       at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
>       at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>       at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>       at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>       at com.sun.proxy.$Proxy237.expunge(Unknown Source)
>       at com.cloud.vm.UserVmManagerImpl$2.doInTransactionWithoutResult(UserVmManagerImpl.java:2593)
>       at com.cloud.utils.db.TransactionCallbackNoReturn.doInTransaction(TransactionCallbackNoReturn.java:25)
>       at com.cloud.utils.db.Transaction$2.doInTransaction(Transaction.java:57)
>       at com.cloud.utils.db.Transaction.execute(Transaction.java:45)
>       at com.cloud.utils.db.Transaction.execute(Transaction.java:54)
>       at com.cloud.vm.UserVmManagerImpl.addInstanceToGroup(UserVmManagerImpl.java:2575)
>       at com.cloud.vm.UserVmManagerImpl.updateVirtualMachine(UserVmManagerImpl.java:2332)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)