Re: Java process segfaulting and using 100% CPU

David Delbecq Wed, 19 Apr 2006 07:24:54 -0700

Hi,

"or other processes may end up using 100% CPU as well and presenting the
same behavior while on strace"


If you mean also non java processes have similar problems, randomly, i
bet you may have either an OS issue (which affect all processes
randomly) either an hardware issue (corrupting RAM segments?)

If you want to discover where you tomcat is eating all it's CPU cycle, i
recommend you issue a 'kill -3 <java process id>', this will dump in
java stdout a list of current stacktrace, including native frames. You
might get there hints on what is problematic (might simply be an
HttpThread in a webapp endless loop)

Regards

Gustavo Noronha a écrit :

>Hello,
>
>I've searched a lot about this issue on the web, but did not find a
>solution. I have the following setup:
>
>Debian GNU/Linux 3.1 with official Tomcat5 package (5.0.30),
>java-package-built Sun JVM:
>
># java -version
>java version "1.5.0_05"
>Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_05-b05)
>Java HotSpot(TM) Server VM (build 1.5.0_05-b05, mixed mode)
>
>Our main application uses Hibernate to persist objects and C3P0 to control
>the connection pool. Here's the configuration file:
>
>hibernate.connection.driver_class com.p6spy.engine.spy.P6SpyDriver
>hibernate.connection.url jdbc:oracle:thin:@dbhost:1521:sid
>hibernate.connection.username user
>hibernate.connection.password pass
>hibernate.transaction.factory_class =
>org.hibernate.transaction.JDBCTransactionFactory
>hibernate.use_outer_join true
>hibernate.show_sql false
>hibernate.c3p0.minPoolSize = 10
>hibernate.c3p0.maxPoolSize = 200;
>hibernate.c3p0.min_size=10
>hibernate.c3p0.max_size=200
>hibernate.c3p0.timeout=5000
>hibernate.c3p0.max_statements=0
>hibernate.cglib.use_reflection_optimizer=true
>hibernate.dialect org.hibernate.dialect.OracleDialect
>
>I use the P6SpyDriver to be able to log the SQL queries that go through to
>the Oracle database, but I have the same problem with the OracleDriver as
>well. So here's the problem:
>
>After some time running one java process begins using 100% CPU; strace shows
>this:
>
>--- SIGSEGV (Segmentation fault) @ 0 (0) ---
>rt_sigreturn(0x55d32bda)                = -442421359
>futex(0x805c770, FUTEX_WAIT, 2, NULL)   = -1 EAGAIN (Resource temporarily
>unavailable)
>futex(0x81ff918, FUTEX_REQUEUE, 1, 2147483647, 0x805c770) = 1
>futex(0x805c770, FUTEX_WAKE, 1)         = 1
>futex(0x805c7f0, FUTEX_WAIT, 2, NULL)   = 0
>futex(0x805c7f0, FUTEX_WAKE, 1)         = 0
>--- SIGSEGV (Segmentation fault) @ 0 (0) ---
>rt_sigreturn(0x500ddbda)                = -442421359
>futex(0x805c770, FUTEX_WAIT, 2, NULL)   = 0
>futex(0x81ff918, FUTEX_REQUEUE, 1, 2147483647, 0x805c770) = 1
>futex(0x805c770, FUTEX_WAKE, 1)         = 1
>futex(0x805c7f0, FUTEX_WAIT, 2, NULL)   = 0
>futex(0x805c7f0, FUTEX_WAKE, 1)         = 0
>
>This may stay like that for the whole day, tomcat being responsive, or other
>processes may end up using 100% CPU as well and presenting the same behavior
>while on strace. Eventually, tomcat may become unresponsive, but I could not
>estabilish a pattern. I found similar SIGSEGV/futex behavior on a
>non-100%-using process, too:
>
>send(23,
>"\2c\0\0\6\0\0\0\0\0\21i\0\1\1\1\1\1\3^\0\2\200!\0\1\2\2\"\1\1\r\0\0\0\0\4\177\377\377\377\0\0\0\0\0\0\0\0\0select
>papelpesso0_.ID_ENTIDADE_DIRIGENTE as ID1_, papelpesso0_.DT_INICIO_MANDATO
>as DT2_29_, papelpesso0_.DT_FIM_MANDATO as DT3_29_,
>papelpesso0_.DT_NOMEACAO_DIRIGENTE as DT4_29_, papelpesso0_.NU_REMESSA_SIAFI
>as NU5_29_, papelpesso0_.DT_ENVIO_CREDOR_SIAFI as DT6_29_,
>papelpesso0_.DT_RETORNO_CREDOR_SIAFI as DT7_29_, papelpesso0_.ST_RESPONSAVEL
>as ST8_29_, papelpesso0_.CO_ENTIDADE as CO9_29_, papelpesso0_.CO_DIRIGENTE
>as CO10_29_, papelpesso0_.CO_CARGO as CO11_29_ from RL_ENTIDADE_DIRIGENTE
>papelpesso0_ where
>papelpesso0_.CO_DIRIGENTE=28071\1\1\0\0\0\0\0\0\1\1\0\0\0\0\0", 611, 0) =
>611
>recv(23, "\1\217\0\0\6\0\0\0", 8, 0)    = 8
>recv(23, "\0\0\20\31\0279
>\273t6X\203\0\0xj\4\22\27\0220\4\31\0\0\0\0\0\0\1\201\1\v7\2\0\t\0\1\26\0\0\0\0\0\0\0\0\4\1\4\4ID1_\0\0\f\0\0\0\1\1\0\0\0\0\0\0\0\1\7\1\7\7DT2_29_\0\0\f\0\0\0\1\1\0\0\0\0\0\0\0\1\7\1\7\7DT3_29_\0\0\f\0\0\0\1\1\0\0\0\0\0\0\0\1\7\1\7\7DT4_29_\0\0\1\200\0\0\1\5\0\0\0\0\1\37\1\1\5\1\7\1\7\7NU5_29_\0\0\f\0\0\0\1\1\0\0\0\0\0\0\0\1\7\1\7\7DT6_29_\0\0\f\0\0\0\1\1\0\0\0\0\0\0\0\1\7\1\7\7DT7_29_\0\0`\200\0\0\1\1\0\0\0\0\1\37\1\1\1\1\7\1\7\7ST8_29_\0\0\2\0\f\0\1\26\0\0\0\0\0\0\0\0\7\1\7\7CO9_29_\0\0\2\0\n\0\1\26\0\0\0\0\0\0\0\0\10\1\10\10CO10_29_\0\0\2\0\t\0\1\26\0\0\0\0\0\0\0\0\10\1\10\10CO11_29_\0\0\1\7\7xj\4\23\2!\17\10\1\5\3+\367Y\0\1\1\0\0\0\0\4\0\0\0\0\1\1\2\1\332\3\0\0\0\0\0\2\177|\1\1\0\2\317/\1\5\0\0\0\0\1\1",
>391, 0) = 391
>send(23, "\0\21\0\0\6\0\0\0\0\0\3\5\0\1\1\1\n", 17, 0) = 17
>recv(23, "\0F\0\0\6\0\0\0", 8, 0)       = 8
>recv(23, "\0\0\4\0\2\5{\0\0\1\1\0\3\0\1
>\0\0\0\2\177|\1\1\0\2\317/\1\5\0\0\0\0\1\1\31ORA-01403: no data found\n",
>62, 0) = 62
>gettimeofday({1145421443, 130050}, NULL) = 0
>gettimeofday({1145421443, 130213}, NULL) = 0
>write(14, "1145421443130|17|37|statement|select
>papelpesso0_.ID_ENTIDADE_DIRIGENTE as ID1_, papelpesso0_.DT_INICIO_MANDATO
>as DT2_29_, papelpesso0_.DT_FIM_MANDATO as DT3_29_,
>papelpesso0_.DT_NOMEACAO_DIRIGENTE as DT4_29_, papelpesso0_.NU_REMESSA_SIAFI
>as NU5_29_, papelpesso0_.DT_ENVIO_CREDOR_SIAFI as DT6_29_,
>papelpesso0_.DT_RETORNO_CREDOR_SIAFI as DT7_29_, papelpesso0_.ST_RESPONSAVEL
>as ST8_29_,
>papelpesso0_.CO_ENTIDADE as CO9_29_, papelpesso0_.CO_DIRIGENTE as CO10_29_,
>papelpesso0_.CO_CARGO as CO11_29_ from RL_ENTIDADE_DIRIGENTE papelpesso0_
>where papelpesso0_.CO_DIRIGENTE=28071|select
>papelpesso0_.ID_ENTIDADE_DIRIGENTE as ID1_, papelpesso0_.DT_INICIO_MANDATO
>as DT2_29_, papelpesso0_.DT_FIM_MANDATO as DT3_29_,
>papelpesso0_.DT_NOMEACAO_DIRIGENTE as DT4_29_, papelpesso0_.NU_REMESSA_SIAFI
>as NU5_29_, papelpesso0_.DT_ENVIO_CREDOR_SIAFI as DT6_29_,
>papelpesso0_.DT_RETORNO_CREDOR_SIAFI as DT7_29_, papelpesso0_.ST_RESPONSAVEL
>as ST8_29_, papelpesso0_.CO_ENTIDADE as CO9_29_, papelpesso0_.CO_DIRIGENTE
>as CO10_29_, papelpesso0_"..., 1123) = 1123
>write(14, "\n", 1)                      = 1
>gettimeofday({1145421443, 131610}, NULL) = 0
>gettimeofday({1145421443, 131692}, NULL) = 0
>--- SIGSEGV (Segmentation fault) @ 0 (0) ---
>rt_sigreturn(0x443e228c)                = 0
>futex(0x805c770, FUTEX_WAIT, 2, NULL)   = 0
>futex(0x8ab5810, FUTEX_REQUEUE, 1, 2147483647, 0x805c770) = 1
>futex(0x805c770, FUTEX_WAKE, 1)         = 1
>futex(0x805c7f0, FUTEX_WAIT, 2, NULL)   = 0
>futex(0x805c7f0, FUTEX_WAKE, 1)         = 0
>
>At this moment Tomcat was unresponsive. Moments before there were some
>exceptions:
>
>org.hibernate.HibernateException: Found shared references to a collection
>        at org.hibernate.engine.Collections.processReachableCollection (
>Collections.java:130)
>[...]
>ERROR 19 Abr 2006 01:25:14,744 suasnob.component.HibernateSession  -> error
>during commit
>org.hibernate.HibernateException: Flush during cascade is dangerous
>[...]
>WARN  19 Abr 2006 01:25:14,775 hibernate.util.JDBCExceptionReporter  -> SQL
>Error: 0, SQLState: null
>WARN  19 Abr 2006 01:25:14,775 hibernate.util.JDBCExceptionReporter  -> SQL
>Error: 0, SQLState: null
>ERROR 19 Abr 2006 01:25:14,776 hibernate.util.JDBCExceptionReporter   -> You
>can't operate on a closed Connection!!!
>ERROR 19 Abr 2006 01:25:14,776 hibernate.util.JDBCExceptionReporter  -> You
>can't operate on a closed Connection!!!
>WARN  19 Abr 2006 01:25:14,776 hibernate.util.JDBCExceptionReporter   -> SQL
>Error: 0, SQLState: null
>WARN  19 Abr 2006 01:25:14,776 hibernate.util.JDBCExceptionReporter  -> SQL
>Error: 0, SQLState: null
>ERROR 19 Abr 2006 01:25:14,776 hibernate.util.JDBCExceptionReporter  -> You
>can't operate on a closed Connection!!!
>ERROR 19 Abr 2006 01:25:14,776 hibernate.util.JDBCExceptionReporter  -> You
>can't operate on a closed Connection!!!
>org.hibernate.exception.GenericJDBCException: could not execute query
>[...]
>Caused by: java.sql.SQLException : You can't operate on a closed
>Connection!!!
>        at com.mchange.v2.sql.SqlUtils.toSQLException(SqlUtils.java:104)
>        at com.mchange.v2.sql.SqlUtils.toSQLException(SqlUtils.java:65)
>        at com.mchange.v2.c3p0.impl.NewProxyConnection.prepareStatement (
>NewProxyConnection.java:199)
>[...]
>Caused by: java.lang.NullPointerException
>        at com.mchange.v2.c3p0.impl.NewProxyConnection.prepareStatement(
>NewProxyConnection.java:174)
>        ... 108 more
>[...]
>ERROR 19 Abr 2006 01:25:20,408 suasnob.interceptor.HibernateInterceptor  ->
>HibernateException in execute()
>org.hibernate.HibernateException: Flush during cascade is dangerous
>[...]
>
>Then Tomcat became unresponsive. Moments after this was spit:
>
>WARN  19 Abr 2006 01:46:23,018 hibernate.jdbc.ConnectionManager  -> unclosed
>connection, forgot to call close() on your session?
>WARN  19 Abr 2006 01:46:23,018 hibernate.jdbc.ConnectionManager  -> unclosed
>connection, forgot to call close() on your session?
>
>And then I restarted tomcat to get it back. I believe these might be
>related. We have lots of exceptions related to database connections,
>hibernate and c3p0 on the logs, such as:
>
>exception clearing maxRows/queryTimeout, You can't operate on a closed
>Connection!!!, You can't operate on a closed ResultSet!!!, SQL Error: 17010,
>SQLState: null.
>
>Also, it might be important to notice that we have some of these from time
>to time:
>
>WARN  19 Abr 2006 06:48:09,577 v2.async.ThreadPoolAsynchronousRunner  ->
>[EMAIL PROTECTED] --
>APPARENT DEADLOCK!!! Creating emergency threads for unassigned pending
>tasks!
>WARN  19 Abr 2006 06:48:09,578 v2.async.ThreadPoolAsynchronousRunner  ->
>[EMAIL PROTECTED] --
>APPARENT DEADLOCK!!! Complete Status: [num_managed_threads: 3, num_active:
>3; activeTasks:
>[EMAIL PROTECTED] (
>com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#0),
>[EMAIL PROTECTED] (
>com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#1),
>[EMAIL PROTECTED] (
>com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#2 );
>pendingTasks: [EMAIL PROTECTED],
>[EMAIL PROTECTED],
>[EMAIL PROTECTED],
>[EMAIL PROTECTED] ,
>[EMAIL PROTECTED],
>[EMAIL PROTECTED],
>[EMAIL PROTECTED],
>[EMAIL PROTECTED] ]
>WARN  19 Abr 2006 06:49:23,468 v2.async.ThreadPoolAsynchronousRunner  ->
>Task [EMAIL PROTECTED] (in deadlocked
>PoolThread) failed to complete in maximum time 60000ms. Trying interrupt().
>WARN  19 Abr 2006 06:49:23,468 v2.async.ThreadPoolAsynchronousRunner  ->
>Task [EMAIL PROTECTED] (in deadlocked
>PoolThread)
>failed to complete in maximum time 60000ms. Trying interrupt().
>WARN  19 Abr 2006 06:49:23,468 v2.async.ThreadPoolAsynchronousRunner  ->
>Task [EMAIL PROTECTED] (in deadlocked
>PoolThread) failed to complete in maximum time 60000ms. Trying interrupt().
>
>Being a beginner on Tomcat, I'm unable to easily identify what may be faulty
>here: if our application, our tomcat configuration, mainly hibernate's and
>c3p0's, or if it's something deeper on Java or the OS... Any hints on where
>to investigate, and how to solve this symptom?
>
>Thanks,
>
>--
>Gustavo Noronha Silva
>Coordenação de Segurança e Sustentação
>Ministério do Desenvolvimento Social - Brasil
>  
>


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: Java process segfaulting and using 100% CPU

Reply via email to