Rayees Namathponnan created CLOUDSTACK-7112:
-----------------------------------------------

Summary: [Automation] Deadlock observed while restarting MS, and MS never comes back
Key: CLOUDSTACK-7112
URL: https://issues.apache.org/jira/browse/CLOUDSTACK-7112
Project: CloudStack
Issue Type: Bug
Security Level: Public (Anyone can view this level - this is the default.)
Components: Management Server
Affects Versions: 4.5.0
Environment: 4.5
Reporter: Rayees Namathponnan
Priority: Blocker
Fix For: 4.5.0

Steps to reproduce

Step 1: Create an advanced zone on VMware
Step 2: Make sure the CPVM and SSVM are up
Step 3: Deploy a VM
Step 4: Restart the management server (MS)

The MS never comes back up. The following deadlock was observed in the log:

2014-07-15 13:18:33,221 INFO [c.c.u.d.GenericDaoBase] (main:null) Cache created: [ name = HostPodDaoImpl status = STATUS_ALIVE eternal = false overflowToDisk = false maxEntriesLocalHeap = 50 maxEntriesLocalDisk = 0 memoryStoreEvictionPolicy = LRU timeToLiveSeconds = 600 timeToIdleSeconds = 300 persistence = none diskExpiryThreadIntervalSeconds = 120 cacheEventListeners: net.sf.ehcache.statistics.LiveCacheStatisticsWrapper hitCount = 0 memoryStoreHitCount = 0 diskStoreHitCount = 0 missCountNotFound = 0 missCountExpired = 0 maxBytesLocalHeap = 0 overflowToOffHeap = false maxBytesLocalOffHeap = 0 maxBytesLocalDisk = 0 pinned = false ]
2014-07-15 13:18:33,267 INFO [c.c.u.d.GenericDaoBase] (main:null) Cache created: [ name = ServiceOfferingDaoImpl status = STATUS_ALIVE eternal = false overflowToDisk = false maxEntriesLocalHeap = 50 maxEntriesLocalDisk = 0 memoryStoreEvictionPolicy = LRU timeToLiveSeconds = 600 timeToIdleSeconds = 300 persistence = none diskExpiryThreadIntervalSeconds = 120 cacheEventListeners: net.sf.ehcache.statistics.LiveCacheStatisticsWrapper hitCount = 0 memoryStoreHitCount = 0 diskStoreHitCount = 0 missCountNotFound = 0 missCountExpired = 0 maxBytesLocalHeap = 0 overflowToOffHeap = false maxBytesLocalOffHeap = 0 maxBytesLocalDisk = 0 pinned = false ]
2014-07-15 13:18:33,314 DEBUG [c.c.s.ConfigurationServerImpl] (main:null) Caught exception when inserting system account: Duplicate entry '1' for key 'PRIMARY'

*** Java threads running at time of deadlock ***

"net.sf.ehcache.CacheManager@1f465da5" tid=24 TIMED_WAITING on lock=java.util.TaskQueue@c1665ee in java.lang.Object.wait()
    at java.lang.Object.wait(Native Method)
    at java.util.TimerThread.mainLoop(Timer.java:552)
    at java.util.TimerThread.run(Timer.java:505)

"RMI TCP Accept-41360" tid=23 RUNNABLE (running in native) in java.net.PlainSocketImpl.socketAccept()
    at java.net.PlainSocketImpl.socketAccept(Native Method)
    at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
    at java.net.ServerSocket.implAccept(ServerSocket.java:530)
    at java.net.ServerSocket.accept(ServerSocket.java:498)
    at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:388)
    at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:360)
    at java.lang.Thread.run(Thread.java:744)

"Multicast Heartbeat Sender Thread" tid=22 TIMED_WAITING on lock=net.sf.ehcache.distribution.MulticastKeepaliveHeartbeatSender$MulticastServerThread@127df275 in java.lang.Object.wait()

--
This message was sent by Atlassian JIRA (v6.2#6252)
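
Note on triage: the dump above lists every thread alive at the time, and the three threads shown are waiting or accepting connections rather than visibly blocked on each other. To narrow this down to the threads that are actually in a lock cycle, the JVM's standard ThreadMXBean can be queried. The following is only a minimal sketch, assuming it can be run inside the affected JVM; the class name DeadlockProbe and the method describeDeadlocks are hypothetical and not part of CloudStack.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not CloudStack code: reports threads that the JVM
// itself considers deadlocked on monitors or ownable synchronizers.
public class DeadlockProbe {

    /** Returns a description of deadlocked threads, or "" if none are found. */
    static String describeDeadlocks() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when no deadlock exists
        if (ids == null) {
            return "";
        }
        StringBuilder report = new StringBuilder();
        // Request locked monitors and synchronizers so lock owners are shown.
        for (ThreadInfo info : threads.getThreadInfo(ids, true, true)) {
            if (info != null) { // a thread may have exited in the meantime
                report.append(info);
            }
        }
        return report.toString();
    }

    public static void main(String[] args) {
        String report = describeDeadlocks();
        System.out.println(report.isEmpty() ? "No deadlocked threads found" : report);
    }
}
```

From outside the process, running jstack against the management server's PID gives a comparable full thread dump and flags any Java-level deadlock it finds.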