[ https://issues.apache.org/jira/browse/AMBARI-21013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gaurav Kanade updated AMBARI-21013:
-----------------------------------
    Description:
Ambari server process is consuming very high CPU thus rendering ambari inaccessible and causing availability issues to other dependent services. On killing and restarting ambari-server process, the issue seems mitigated but we would like to investigate root cause
Unfortunately an attempt at taking a thread dump with jstack (or other means) is unresponsive, but the one occasion where I managed to take a jstack trace I saw several threads blocked on ambari-agent. The ps output shows an unexpectedly large number of threads for the ambari server process (as well as other processes).
The ambari server logs show some interesting details - initially we find an error message as:
07 May 2017 04:06:55,258 ERROR [ExecutionScheduler_QuartzSchedulerThread] JobStoreTX:3652 - Couldn't rollback jdbc connection. The connection is closed.
eventually leading to messages regarding to apparent deadlocks
08 May 2017 18:33:31,478 WARN [C3P0PooledConnectionPoolManager[identityToken->2rvxua9n1o51qkzhlhyqx|1fd73dcb]-AdminTaskTimer] ThreadPoolAsynchronousRunner:220 - com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector@78a76923 -- APPARENT DEADLOCK!!! Creating emergency threads for unassigned pending tasks!
and subsequently continuous messages failing to connect to backend db
08 May 2017 18:35:32,215 ERROR [ambari-client-thread-24746] ReadHandler:102 - Caught a runtime exception executing a query
javax.persistence.PersistenceException: Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.6.2.v20151217-774c696): org.eclipse.persistence.exceptions.DatabaseException
Internal Exception: com.microsoft.sqlserver.jdbc.SQLServerException: The TCP/IP connection to the host d630qym524.database.windows.net, port 1433 has failed. Error: "null. Verify the connection properties. Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port. Make sure that TCP connections to the port are not blocked by a firewall.".
Error Code: 0
I also see instances of java heap space (out of memory) messages
08 May 2017 05:34:42,833 WARN [qtp-ambari-agent-26306] nio:726 - handle failed
java.lang.OutOfMemoryError: Java heap space
    at org.eclipse.jetty.io.ByteArrayBuffer.<init>(ByteArrayBuffer.java:41)
    at org.eclipse.jetty.io.nio.IndirectNIOBuffer.<init>(IndirectNIOBuffer.java:32)
    at org.eclipse.jetty.io.nio.SslConnection$SslBuffers.<init>(SslConnection.java:83)
    at org.eclipse.jetty.io.nio.SslConnection.allocateBuffers(SslConnection.java:146)
    at org.eclipse.jetty.io.nio.SslConnection.handle(SslConnection.java:183)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Thread.java:745)
All logs attached

  was:
Ambari server process is consuming very high CPU thus rendering ambari inaccessible and causing availability issues to other dependent services.
Unfortunately an attempt at taking a thread dump with jstack (or other means) is unresponsive, but the one occasion where I managed to take a jstack trace I saw several threads blocked on ambari-agent. The ps output shows an unexpectedly large number of threads for the ambari server process (as well as other processes).
The ambari server logs show some interesting details - initially we find an error message as:
07 May 2017 04:06:55,258 ERROR [ExecutionScheduler_QuartzSchedulerThread] JobStoreTX:3652 - Couldn't rollback jdbc connection. The connection is closed.
eventually leading to messages regarding to apparent deadlocks
08 May 2017 18:33:31,478 WARN [C3P0PooledConnectionPoolManager[identityToken->2rvxua9n1o51qkzhlhyqx|1fd73dcb]-AdminTaskTimer] ThreadPoolAsynchronousRunner:220 - com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector@78a76923 -- APPARENT DEADLOCK!!! Creating emergency threads for unassigned pending tasks!
and subsequently continuous messages failing to connect to backend db
08 May 2017 18:35:32,215 ERROR [ambari-client-thread-24746] ReadHandler:102 - Caught a runtime exception executing a query
javax.persistence.PersistenceException: Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.6.2.v20151217-774c696): org.eclipse.persistence.exceptions.DatabaseException
Internal Exception: com.microsoft.sqlserver.jdbc.SQLServerException: The TCP/IP connection to the host d630qym524.database.windows.net, port 1433 has failed. Error: "null. Verify the connection properties. Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port. Make sure that TCP connections to the port are not blocked by a firewall.".
Error Code: 0
I also see instances of java heap space (out of memory) messages
08 May 2017 05:34:42,833 WARN [qtp-ambari-agent-26306] nio:726 - handle failed
java.lang.OutOfMemoryError: Java heap space
    at org.eclipse.jetty.io.ByteArrayBuffer.<init>(ByteArrayBuffer.java:41)
    at org.eclipse.jetty.io.nio.IndirectNIOBuffer.<init>(IndirectNIOBuffer.java:32)
    at org.eclipse.jetty.io.nio.SslConnection$SslBuffers.<init>(SslConnection.java:83)
    at org.eclipse.jetty.io.nio.SslConnection.allocateBuffers(SslConnection.java:146)
    at org.eclipse.jetty.io.nio.SslConnection.handle(SslConnection.java:183)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Thread.java:745)
All logs attached


> Ambari Server high CPU renders Ambari unavailable
> -------------------------------------------------
>
>                 Key: AMBARI-21013
>                 URL: https://issues.apache.org/jira/browse/AMBARI-21013
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>            Reporter: Gaurav Kanade
>         Attachments: ambari-server-check-database.log, ambari-server.log, ambari-server.out, ps1.txt, ps2.txt, top.txt
>
>
> Ambari server process is consuming very high CPU thus rendering ambari inaccessible and causing availability issues to other dependent services. On killing and restarting ambari-server process, the issue seems mitigated but we would like to investigate root cause
> Unfortunately an attempt at taking a thread dump with jstack (or other means) is unresponsive, but the one occasion where I managed to take a jstack trace I saw several threads blocked on ambari-agent. The ps output shows an unexpectedly large number of threads for the ambari server process (as well as other processes).
> The ambari server logs show some interesting details - initially we find an error message as:
> 07 May 2017 04:06:55,258 ERROR [ExecutionScheduler_QuartzSchedulerThread] JobStoreTX:3652 - Couldn't rollback jdbc connection. The connection is closed.
> eventually leading to messages regarding to apparent deadlocks
> 08 May 2017 18:33:31,478 WARN [C3P0PooledConnectionPoolManager[identityToken->2rvxua9n1o51qkzhlhyqx|1fd73dcb]-AdminTaskTimer] ThreadPoolAsynchronousRunner:220 - com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector@78a76923 -- APPARENT DEADLOCK!!! Creating emergency threads for unassigned pending tasks!
> and subsequently continuous messages failing to connect to backend db
> 08 May 2017 18:35:32,215 ERROR [ambari-client-thread-24746] ReadHandler:102 - Caught a runtime exception executing a query
> javax.persistence.PersistenceException: Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.6.2.v20151217-774c696): org.eclipse.persistence.exceptions.DatabaseException
> Internal Exception: com.microsoft.sqlserver.jdbc.SQLServerException: The TCP/IP connection to the host d630qym524.database.windows.net, port 1433 has failed. Error: "null. Verify the connection properties. Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port. Make sure that TCP connections to the port are not blocked by a firewall.".
> Error Code: 0
> I also see instances of java heap space (out of memory) messages
> 08 May 2017 05:34:42,833 WARN [qtp-ambari-agent-26306] nio:726 - handle failed
> java.lang.OutOfMemoryError: Java heap space
>     at org.eclipse.jetty.io.ByteArrayBuffer.<init>(ByteArrayBuffer.java:41)
>     at org.eclipse.jetty.io.nio.IndirectNIOBuffer.<init>(IndirectNIOBuffer.java:32)
>     at org.eclipse.jetty.io.nio.SslConnection$SslBuffers.<init>(SslConnection.java:83)
>     at org.eclipse.jetty.io.nio.SslConnection.allocateBuffers(SslConnection.java:146)
>     at org.eclipse.jetty.io.nio.SslConnection.handle(SslConnection.java:183)
>     at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696)
>     at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
>     at java.lang.Thread.run(Thread.java:745)
> All logs attached



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)