Kiran Koneti created CLOUDSTACK-4288:
----------------------------------------

             Summary: Management server is hanging quite often and in indefinite time intervals.
                 Key: CLOUDSTACK-4288
                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4288
             Project: CloudStack
          Issue Type: Bug
      Security Level: Public (Anyone can view this level - this is the default.)
          Components: Install and Setup
    Affects Versions: 4.2.0
            Reporter: Kiran Koneti
            Priority: Blocker
             Fix For: 4.2.0


I have created an Advanced Zone setup using the latest rhel63 399 build, which 
was generated around 12:08 PM IST. I see the management server hang quite often 
for a few minutes at a time and then recover on its own.

During a hang all CS operations are halted, and even the management server 
logs stop. Once the server resumes, the hosts go into the Alert state and 
come back up later.

This is observed quite often. When I took a thread dump, it showed the 
messages below:


""SecGrp-Worker-1" prio=10 tid=0x00007f86bc1de000 nid=0x28b waiting on 
condition [0x00007f86b7cfb000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000077b193618> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
        at com.cloud.network.security.LocalSecurityGroupWorkQueue.getWork(LocalSecurityGroupWorkQueue.java:152)
        at com.cloud.network.security.SecurityGroupManagerImpl2.work(SecurityGroupManagerImpl2.java:136)
        at com.cloud.network.security.SecurityGroupManagerImpl2$WorkerThread.run(SecurityGroupManagerImpl2.java:71)

"SecGrp-Worker-0" prio=10 tid=0x00007f86bc1dc000 nid=0x28a waiting on condition 
[0x00007f86b7dfc000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000077b193618> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
        at com.cloud.network.security.LocalSecurityGroupWorkQueue.getWork(LocalSecurityGroupWorkQueue.java:152)
        at com.cloud.network.security.SecurityGroupManagerImpl2.work(SecurityGroupManagerImpl2.java:136)
        at com.cloud.network.security.SecurityGroupManagerImpl2$WorkerThread.run(SecurityGroupManagerImpl2.java:71)

"HA-2" prio=10 tid=0x00007f86bc1da000 nid=0x289 waiting on condition 
[0x00007f86b7efd000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000077dc218b0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2081)
        at java.util.concurrent.DelayQueue.take(DelayQueue.java:193)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:688)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:681)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1043)"
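
For reference when reading the dump above: a thread parked in 
ConditionObject.await() underneath a getWork() or take() call is what a worker 
looks like when it is simply waiting for work to arrive, so these frames alone 
may not identify the cause of the hang. The sketch below is a standalone 
illustration, not CloudStack code; the class name IdleWorkerDemo and the 
thread name Demo-Worker-0 are made up for the example. It starts a worker that 
blocks on an empty queue and reports the thread state it settles into.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical demo class; it does not exist in CloudStack.
public class IdleWorkerDemo {

    // Starts a worker that blocks on an empty queue and returns the
    // thread state observed once the worker has parked.
    static Thread.State stateOfIdleWorker() throws InterruptedException {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();

        Thread worker = new Thread(() -> {
            try {
                // take() awaits a Condition internally until an item arrives.
                queue.take().run();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "Demo-Worker-0");
        worker.setDaemon(true);
        worker.start();

        // Poll (for at most five seconds) until the worker has parked.
        long deadline = System.nanoTime() + 5_000_000_000L;
        while (worker.getState() != Thread.State.WAITING
                && System.nanoTime() < deadline) {
            Thread.sleep(10);
        }

        Thread.State observed = worker.getState();
        worker.interrupt(); // release the worker so it can exit
        return observed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Idle worker state: " + stateOfIdleWorker());
    }
}
```

If the idle pattern matches, the threads worth inspecting in a full dump are 
more likely the ones in BLOCKED or RUNNABLE state at the time of the hang.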


Attaching the catalina.out as well as the management server logs.


This issue is observed in two different setups, i.e. with the rhel63 build in my 
environment and also in the rhel62 environment that Manasa is using.

During the hang period, when I ran top, the CPU% dropped to very low values.

 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
