[ https://issues.apache.org/jira/browse/CLOUDSTACK-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

     [ https://issues.apache.org/jira/browse/CLOUDSTACK-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kelven Yang resolved CLOUDSTACK-4288.
-------------------------------------

    Resolution: Fixed

> Management server is hanging quite often and in indefinite time intervals.
> ----------------------------------------------------------------------------
>
>                  Key: CLOUDSTACK-4288
>                  URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4288
>              Project: CloudStack
>           Issue Type: Bug
>       Security Level: Public (Anyone can view this level - this is the default.)
>           Components: Install and Setup
>     Affects Versions: 4.2.0
>             Reporter: Kiran Koneti
>             Assignee: Kelven Yang
>             Priority: Blocker
>              Fix For: 4.2.0
>
>          Attachments: catalina.zip, management-server.zip
>
>
> I created an Advanced Zone setup using the latest rhel63 399 build, which
> was generated around 12:08 PM IST. The management server hangs quite often
> for a few minutes and then recovers on its own.
> During a hang, all CS operations are halted and even the management server
> logs stop. Once it resumes, the hosts go into the Alert state and come back
> up later.
> This is observed quite often, and when I took a thread dump it showed the
> messages below:
> ""SecGrp-Worker-1" prio=10 tid=0x00007f86bc1de000 nid=0x28b waiting on condition [0x00007f86b7cfb000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for <0x000000077b193618> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>         at com.cloud.network.security.LocalSecurityGroupWorkQueue.getWork(LocalSecurityGroupWorkQueue.java:152)
>         at com.cloud.network.security.SecurityGroupManagerImpl2.work(SecurityGroupManagerImpl2.java:136)
>         at com.cloud.network.security.SecurityGroupManagerImpl2$WorkerThread.run(SecurityGroupManagerImpl2.java:71)
> "SecGrp-Worker-0" prio=10 tid=0x00007f86bc1dc000 nid=0x28a waiting on condition [0x00007f86b7dfc000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for <0x000000077b193618> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>         at com.cloud.network.security.LocalSecurityGroupWorkQueue.getWork(LocalSecurityGroupWorkQueue.java:152)
>         at com.cloud.network.security.SecurityGroupManagerImpl2.work(SecurityGroupManagerImpl2.java:136)
>         at com.cloud.network.security.SecurityGroupManagerImpl2$WorkerThread.run(SecurityGroupManagerImpl2.java:71)
> "HA-2" prio=10 tid=0x00007f86bc1da000 nid=0x289 waiting on condition [0x00007f86b7efd000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for <0x000000077dc218b0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2081)
>         at java.util.concurrent.DelayQueue.take(DelayQueue.java:193)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:688)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:681)
>         at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1043)"
> I am attaching catalina.out as well as the management server logs.
> This issue is observed in two different setups, i.e. with the rhel63 build
> in my environment and also in the rhel62 environment that Manasa is using.
> During the hang period, top shows the CPU% dropping to very low values.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira