[jira] [Commented] (AMQ-3075) Auto-create database fails with PostgreSQL (Error in SQL: 'drop primary key')
[ https://issues.apache.org/jira/browse/AMQ-3075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266305#comment-17266305 ]

Volker Kleinschmidt commented on AMQ-3075:
------------------------------------------

Same here.

> Auto-create database fails with PostgreSQL (Error in SQL: 'drop primary key')
> -----------------------------------------------------------------------------
>
>              Key: AMQ-3075
>              URL: https://issues.apache.org/jira/browse/AMQ-3075
>          Project: ActiveMQ
>       Issue Type: Bug
>       Components: Broker
> Affects Versions: 5.4.2
>      Environment: ActiveMQ 5.4.2 fresh install, Ubuntu 64-bit, OpenJDK 6b20-1.9.2-0ubuntu1, PostgreSQL 8.4
>         Reporter: Ned Wolpert
>         Assignee: Dejan Bosanac
>         Priority: Major
>          Fix For: 5.5.0
>
> Trying to do a fresh install with persistence fails to create the database, with the database error listed below.
> Persistence support added to the activemq.xml file:
> [XML configuration not preserved by the mailing-list archive]
> postgresql-8.4-701.jdbc4.jar added to the lib directory.
> Log from startup:
> INFO | Pre-instantiating singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory@40b0095d: defining beans [org.springframework.beans.factory.config.PropertyPlaceholderConfigurer#0,postgres-ds,org.apache.activemq.xbean.XBeanBrokerService#0,securityLoginService,securityConstraint,securityConstraintMapping,securityHandler,contexts,Server]; root of factory hierarchy
> WARN | destroyApplicationContextOnStop parameter is deprecated, please use shutdown hooks instead
> INFO | PListStore:/home/wolpert/Downloads/apache-activemq-5.4.2/data/localhost/tmp_storage started
> INFO | Using Persistence Adapter: JDBCPersistenceAdapter(org.postgresql.ds.PGPoolingDataSource@3302fc5)
> INFO | Database adapter driver override recognized for : [postgresql_native_driver] - adapter: class org.apache.activemq.store.jdbc.adapter.PostgresqlJDBCAdapter
> WARN | Could not create JDBC tables; they could already exist.
> Failure was: ALTER TABLE ACTIVEMQ_ACKS DROP PRIMARY KEY
> Message: ERROR: syntax error at or near "PRIMARY"
>   Position: 32  SQLState: 42601  Vendor code: 0
> WARN | Failure details: ERROR: syntax error at or near "PRIMARY"
>   Position: 32
> org.postgresql.util.PSQLException: ERROR: syntax error at or near "PRIMARY"
>   Position: 32
>     at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2062)
>     at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1795)
>     at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
>     at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:479)
>     at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:353)
>     at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:345)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.postgresql.ds.jdbc23.AbstractJdbc23PooledConnection$StatementHandler.invoke(AbstractJdbc23PooledConnection.java:455)
>     at $Proxy5.execute(Unknown Source)
>     at org.apache.activemq.store.jdbc.adapter.DefaultJDBCAdapter.doCreateTables(DefaultJDBCAdapter.java:101)
>     at org.apache.activemq.store.jdbc.JDBCPersistenceAdapter.start(JDBCPersistenceAdapter.java:272)
>     at org.apache.activemq.broker.BrokerService.start(BrokerService.java:485)
>     at org.apache.activemq.xbean.XBeanBrokerService.afterPropertiesSet(XBeanBrokerService.java:60)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     ...
> The database reports the following with its log turned on full.
> 2010-12-08 14:35:31 MST LOG: execute : SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
> 2010-12-08 14:35:31 MST LOG: execute S_1: BEGIN
> 2010-12-08 14:35:31 MST LOG: execute : SELECT NULL AS TABLE_CAT, n.nspname AS TABLE_SCHEM, c.relname AS TABLE_NAME, CASE n.nspname ~ '^pg_' OR n.nspname = 'information_schema' WHEN true THEN CASE WHEN n.nspname = 'pg_catalog' OR n.nspname = 'information_schema' THEN CASE c.relkind WHEN 'r' THEN 'SYSTEM TABLE' WHEN 'v' THEN 'SYSTEM VIEW'
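The failing statement is the crux of this report: "ALTER TABLE ... DROP PRIMARY KEY" is MySQL-style DDL, while PostgreSQL drops a primary key only by its constraint name. A minimal sketch of the two dialects (class and method names are illustrative, not ActiveMQ code; the "<table>_pkey" constraint name assumes PostgreSQL's default naming for an unnamed primary key):

```java
// Illustrative only -- shows why the statement in the log above is rejected
// by PostgreSQL and what the equivalent PostgreSQL statement looks like.
public class DropPrimaryKeySql {

    // MySQL accepts an anonymous primary-key drop.
    public static String forMysql(String table) {
        return "ALTER TABLE " + table + " DROP PRIMARY KEY";
    }

    // PostgreSQL requires the constraint name; by default an unnamed
    // primary key on table t is named "t_pkey" (lower-cased identifier).
    public static String forPostgresql(String table) {
        return "ALTER TABLE " + table + " DROP CONSTRAINT "
                + table.toLowerCase() + "_pkey";
    }
}
```

A per-dialect adapter (which the PostgresqlJDBCAdapter override in the log is meant to provide) would emit the second form instead of the first.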
[jira] [Comment Edited] (AMQ-2520) Oracle 10g RAC resource usage VERY high from the passive servers SQL requests to the Database.
[ https://issues.apache.org/jira/browse/AMQ-2520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16200986#comment-16200986 ]

Volker Kleinschmidt edited comment on AMQ-2520 at 10/11/17 9:38 PM:
--------------------------------------------------------------------

Also note that if you use the FOR UPDATE NOWAIT or FOR UPDATE WAIT N form of the locking query (as should be the default!), the attempt to lock throws an exception. Currently these exceptions are logged at "info" or "warn" level, depending on the locker implementation you use, so they flood your logs with something that should be ignored: this exception is the expected state 99.99% of the time, and should be logged at debug level at most. So there is definitely a need for a code change here.

was (Author: volkerk):
Also note that if you use the FOR UPDATE NOWAIT or FOR UPDATE WAIT N form of the locking query (as should be the default!), the attempt to lock throws an exception. Currently these exceptions are logged at "warn" level, so they flood your logs with something that should be ignored: this exception is the expected state 99.99% of the time, and should be logged at debug level at most. So there is definitely a need for a code change here.

> Oracle 10g RAC resource usage VERY high from the passive servers SQL requests to the Database.
> ----------------------------------------------------------------------------------------------
>
>              Key: AMQ-2520
>              URL: https://issues.apache.org/jira/browse/AMQ-2520
>          Project: ActiveMQ
>       Issue Type: Bug
>       Components: Broker
> Affects Versions: 5.3.0, 5.4.0
>      Environment: Redhat Enterprise Linux 5, Oracle 10g RAC
>         Reporter: Thomas Connolly
>          Fix For: 5.x
>
> Two ActiveMQ brokers are installed on RHEL 5 servers (one per server). They are configured as a JDBC master/slave failover pair (as per the examples). Failover is tested and working, and messages are delivered. Oracle is used for synchronisation (the ACTIVEMQ_ tables), persistence, etc. We run a durable subscriber, and the client connects via a failover operation.
> The SELECT * FROM ACTIVEMQ_LOCK FOR UPDATE is causing a spin lock on the Oracle database. Basically, the indefinite waiting by the passive MQ instance causes high resource usage on Oracle. After a short period the Oracle dashboard shows a high number of active sessions from ActiveMQ due to the continuous execution of
>
>     UPDATE ACTIVEMQ_LOCK SET TIME = ? WHERE ID = 1
>
> in the keepAlive method in
> https://svn.apache.org/repos/asf/activemq/trunk/activemq-core/src/main/java/org/apache/activemq/store/jdbc/DatabaseLocker.java
> As a workaround we had to push the lockAcquireSleepInterval out to 5 minutes in the ActiveMQ configuration, but this didn't work.
> [XML element partially stripped by the archive:] lockAcquireSleepInterval="30" createTablesOnStartup="true"/>
> We're currently changing the broker to poll rather than block, so in Statement.java we've added a WAIT 0 that throws an exception if the lock is not acquired:
>
>     public String getLockCreateStatement() {
>         if (lockCreateStatement == null) {
>             lockCreateStatement = "SELECT * FROM " + getFullLockTableName();
>             if (useLockCreateWhereClause) {
>                 lockCreateStatement += " WHERE ID = 1";
>             }
>             lockCreateStatement += " FOR UPDATE WAIT 0";
>         }
>         return lockCreateStatement;
>     }
>
> Any suggestions? This seems to be a quite fundamental issue.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
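The poll-instead-of-block approach in the report can be sketched standalone. Everything below is illustrative rather than ActiveMQ API; the Oracle error codes (ORA-00054 for NOWAIT, ORA-30006 for an expired WAIT timeout) are what a failed non-blocking acquisition raises, and per the comment above they represent the expected state on a passive broker, so they belong at debug level:

```java
// Illustrative sketch of a non-blocking lock acquisition for a JDBC
// master/slave pair; not the ActiveMQ implementation.
public class PollingLocker {
    // Oracle vendor codes raised by a failed non-blocking lock attempt.
    static final int ORA_RESOURCE_BUSY = 54;     // NOWAIT: resource busy
    static final int ORA_WAIT_TIMEOUT  = 30006;  // WAIT n: timeout expired

    public static String lockStatement(String lockTable, boolean withWhereClause) {
        String sql = "SELECT * FROM " + lockTable;
        if (withWhereClause) {
            sql += " WHERE ID = 1";
        }
        // WAIT 0 returns immediately instead of parking the session in Oracle.
        return sql + " FOR UPDATE WAIT 0";
    }

    // A failed acquisition is the expected state on the passive broker
    // 99.99% of the time, so callers should log it at debug level at most.
    public static boolean isExpectedLockBusy(int vendorCode) {
        return vendorCode == ORA_RESOURCE_BUSY || vendorCode == ORA_WAIT_TIMEOUT;
    }
}
```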
[jira] [Commented] (AMQ-2520) Oracle 10g RAC resource usage VERY high from the passive servers SQL requests to the Database.
[ https://issues.apache.org/jira/browse/AMQ-2520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16200986#comment-16200986 ]

Volker Kleinschmidt commented on AMQ-2520:
------------------------------------------

Also note that if you use the FOR UPDATE NOWAIT or FOR UPDATE WAIT N form of the locking query (as should be the default!), the attempt to lock throws an exception. Currently these exceptions are logged at "warn" level, so they flood your logs with something that should be ignored: this exception is the expected state 99.99% of the time, and should be logged at debug level at most. So there is definitely a need for a code change here.

> Oracle 10g RAC resource usage VERY high from the passive servers SQL requests to the Database.
> ----------------------------------------------------------------------------------------------
>
>              Key: AMQ-2520
>              URL: https://issues.apache.org/jira/browse/AMQ-2520
>
> [quoted issue description omitted; it is identical to the copy quoted in the previous AMQ-2520 message]

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (AMQ-3189) Postgresql with spring embedded activeMQ has "table already created" exception
[ https://issues.apache.org/jira/browse/AMQ-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16200552#comment-16200552 ]

Volker Kleinschmidt commented on AMQ-3189:
------------------------------------------

Gary, providing the table name in the properties file is a feasible workaround, but a workaround that should not be necessary. All supported DBs should be handled correctly out of the box.

> Postgresql with spring embedded activeMQ has "table already created" exception
> ------------------------------------------------------------------------------
>
>              Key: AMQ-3189
>              URL: https://issues.apache.org/jira/browse/AMQ-3189
>          Project: ActiveMQ
>       Issue Type: Improvement
> Affects Versions: 5.4.2
>      Environment: PostgreSQL 8.4, latest PostgreSQL JDBC 9.0.x, Tomcat 6.x, Spring 2.5
>         Reporter: steve neo
>
> This may not be a bug, as MQ is still workable after this exception warning. However, can you suppress the exception stack in the log? It should not even be a "warning", as this is just the normal process of detecting whether the tables exist or not.
> The same configuration works fine with MySQL. For PostgreSQL, the first startup creates the tables without problem. After restarting Tomcat, the log prints an annoying failure message with a long exception stack.
> = Exception
> [13:38:53] INFO [JDBCPersistenceAdapter] Database adapter driver override recognized for : [postgresql_native_driver] - adapter: class org.apache.activemq.store.jdbc.adapter.PostgresqlJDBCAdapter
> [13:38:53] WARN [DefaultJDBCAdapter] Could not create JDBC tables; they could already exist.
> Failure was: CREATE TABLE EDG_ACTIVEMQ_MSGS(ID BIGINT NOT NULL, CONTAINER VARCHAR(80), MSGID_PROD VARCHAR(80), MSGID_SEQ BIGINT, EXPIRATION BIGINT, MSG BYTEA, PRIMARY KEY ( ID ) )
> Message: ERROR: relation "edg_activemq_msgs" already exists  SQLState: 42P07  Vendor code: 0
> [13:38:53] WARN [JDBCPersistenceAdapter] Failure details: ERROR: relation "edg_activemq_msgs" already exists
> org.postgresql.util.PSQLException: ERROR: relation "edg_activemq_msgs" already exists
>     at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102) ~[postgresql-9.0-801.jdbc4.jar:na]
>     at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835) ~[postgresql-9.0-801.jdbc4.jar:na]
>     at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257) ~[postgresql-9.0-801.jdbc4.jar:na]
>     at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500) ~[postgresql-9.0-801.jdbc4.jar:na]
>     at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:374) ~[postgresql-9.0-801.jdbc4.jar:na]
>     at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:366) ~[postgresql-9.0-801.jdbc4.jar:na]
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.6.0_21]
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) ~[na:1.6.0_21]
> = End Exception
> I tried both the dbcp DataSource and the PostgreSQL DataSource.
> = Datasource
> [XML datasource definition largely stripped by the archive; surviving fragment:] destroy-method="close">
> = broker
> [XML broker definition largely stripped by the archive; surviving fragments:]
> start="true" useShutdownHook="true" dataDirectory="${geniuswiki.tmp.dir}activemq-data">
> dataSource="#coreDS" createTablesOnStartup="true" useDatabaseLock="false">
> tablePrefix="@TOKEN.TABLE.PREFIX@" stringIdDataType="VARCHAR(80)" msgIdDataType="VARCHAR(80)" containerNameDataType="VARCHAR(80)"/>
> uri="tcp://${mq.server.url}?wireFormat.maxInactivityDuration=0"/>

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
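The suppression the reporter asks for can be implemented by classifying the vendor error before logging: on a restart, an "already exists" failure from CREATE TABLE is the normal path and deserves debug level at most. A hedged sketch (the helper name is hypothetical; SQLState 42P07 is taken from the log above, while the other states are assumptions about Derby and MySQL):

```java
// Hypothetical helper: decide whether a CREATE TABLE failure just means the
// table is already there, so the WARN and stack trace can be suppressed.
public class CreateTableErrors {
    public static boolean isAlreadyExists(String sqlState) {
        return "42P07".equals(sqlState)   // PostgreSQL: duplicate_table (seen in the log above)
            || "X0Y32".equals(sqlState)   // Derby: object already exists (assumption)
            || "42S01".equals(sqlState);  // MySQL: base table already exists (assumption)
    }
}
```

A check like this, consulted before emitting the warning, would turn the restart-time noise into a single debug line.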
[jira] [Commented] (AMQ-3075) Auto-create database fails with PostgreSQL (Error in SQL: 'drop primary key')
[ https://issues.apache.org/jira/browse/AMQ-3075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15689497#comment-15689497 ]

Volker Kleinschmidt commented on AMQ-3075:
------------------------------------------

The issue remains on SQL Server in 5.14.1 and prevents broker startup, as the tables do not get created. Despite the existence of a workaround (manually pre-create the tables), it's rather stunning that such a critical issue remains unfixed.

> Auto-create database fails with PostgreSQL (Error in SQL: 'drop primary key')
> -----------------------------------------------------------------------------
>
>              Key: AMQ-3075
>              URL: https://issues.apache.org/jira/browse/AMQ-3075
>          Project: ActiveMQ
>       Issue Type: Bug
>       Components: Broker
> Affects Versions: 5.4.2
>      Environment: ActiveMQ 5.4.2 fresh install, Ubuntu 64-bit, OpenJDK 6b20-1.9.2-0ubuntu1, PostgreSQL 8.4
>         Reporter: Ned Wolpert
>         Assignee: Dejan Bosanac
>          Fix For: 5.5.0
>
> [quoted issue description, startup log, and stack trace omitted; they are identical to the copy quoted in the first AMQ-3075 message above]
[jira] [Commented] (AMQ-3189) Postgresql with spring embedded activeMQ has "table already created" exception
[ https://issues.apache.org/jira/browse/AMQ-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15670872#comment-15670872 ]

Volker Kleinschmidt commented on AMQ-3189:
------------------------------------------

This isn't an improvement, it's a bug. While the createTablesOnStartup option provides a workaround, it's not a good one, since you need to set it to true initially and then change it to false later to avoid the annoying logged errors. Not suitable for mass deployment. It's pretty harmless, though - merely annoying. But it's also trivial to fix (see my previous comment). Just something to fix for goodwill...

> Postgresql with spring embedded activeMQ has "table already created" exception
> ------------------------------------------------------------------------------
>
>              Key: AMQ-3189
>              URL: https://issues.apache.org/jira/browse/AMQ-3189
>
> [quoted issue description, exception stack, and configuration omitted; they are identical to the copy quoted in the earlier AMQ-3189 message]

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AMQ-5543) Broker logs show repeated "java.lang.IllegalStateException: Timer already cancelled." lines
[ https://issues.apache.org/jira/browse/AMQ-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400743#comment-15400743 ]

Volker Kleinschmidt commented on AMQ-5543:
------------------------------------------

Hmm, while that's nice to hear, it's very non-obvious that the two issues are even related.

> Broker logs show repeated "java.lang.IllegalStateException: Timer already cancelled." lines
> -------------------------------------------------------------------------------------------
>
>              Key: AMQ-5543
>              URL: https://issues.apache.org/jira/browse/AMQ-5543
>          Project: ActiveMQ
>       Issue Type: Bug
>       Components: Transport
> Affects Versions: 5.10.0
>         Reporter: Tim Bain
>
> One of our brokers running 5.10.0 spewed over 1 million of the following exceptions to the log over a 2-hour period:
>
>     Transport Connection to: tcp://127.0.0.1:x failed: java.io.IOException: Unexpected error occurred: java.lang.IllegalStateException: Timer already cancelled.
>
> Clients were observed to hang on startup (which we believe means they were unable to connect to the broker) until it was rebooted, after which we haven't seen the exception again.
> Once the exceptions started, there were no stack traces or other log lines that would indicate anything else about the cause, just those messages repeating. The problems started immediately (a few milliseconds) after an EOFException in the broker logs; we see those EOFExceptions pretty often and they've never before resulted in "Timer already cancelled" exceptions, so that might indicate what got us into a bad state, but then again it might be entirely unrelated.
> I searched JIRA and the mailing-list archives for similar issues, and although there are a lot of incidences of "Timer already cancelled" exceptions, none of them exactly matches our situation.
> * http://activemq.2283324.n4.nabble.com/Producer-connections-keep-breaking-td4671152.html describes repeated copies of the line in the logs and is the closest parallel I've found, but it sounded like messages were still getting passed, albeit more slowly than normal, whereas the developer on my team who hit this said he didn't think any messages were getting sent. But it's the closest match of the group.
> * AMQ-5508 has a detailed investigation into the root cause of the problem that Pero Atanasov saw, but his scenario occurred only on broker shutdown, whereas our broker was not shutting down at the time, and it wasn't clear that the log line repeated for him.
> * http://activemq.2283324.n4.nabble.com/Timer-already-cancelled-and-KahaDB-Recovering-checkpoint-thread-after-death-td4676684.html has repeated messages like I'm seeing, but appears to be specific to KahaDB, which I'm not using (we use non-persistent messages only).
> * AMQ-4805 has the same inner exception but not the full log line, and it shows a full stack trace, whereas I only see the one line without the stack trace.
> * http://activemq.2283324.n4.nabble.com/can-t-send-message-Timer-already-cancelled-td4680175.html appears to see the exception on the producer when sending, rather than on the broker.
> The only thing we observed that might be related is that earlier in the broker's uptime, the developer who was running it ran up against his maxProc limit and wasn't able to create new native threads. That didn't result in any issues in the broker logs at the time (we only hit these issues several hours later), so I'm skeptical that it's related, given that there are lots of other things that cause "Timer already cancelled" exceptions, as evidenced by the partial list above, but it could be.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AMQ-5995) ActiveMQ exits on startup with UTFDataFormatException: bad string
[ https://issues.apache.org/jira/browse/AMQ-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278208#comment-15278208 ]

Volker Kleinschmidt commented on AMQ-5995:
------------------------------------------

At minimum there needs to be a release-notes entry telling everybody that they must empty their message store before upgrading. And that requires no external contributions.

> ActiveMQ exits on startup with UTFDataFormatException: bad string
> -----------------------------------------------------------------
>
>              Key: AMQ-5995
>              URL: https://issues.apache.org/jira/browse/AMQ-5995
>          Project: ActiveMQ
>       Issue Type: Bug
> Affects Versions: 5.12.0
>         Reporter: Gijsbert van den Brink
>
> When upgrading from 5.11.1 to 5.12.0, ActiveMQ does not start and exits with an error (see full stack trace below). The issue occurs when ActiveMQ reads a message from the store (in my case an Oracle database using JDBCPersistenceAdapter) that was stored under 5.11.1 and tries to read it under 5.12.0.
> Tim Bain suggested on the mailing list that maybe the OpenWire format is incompatible between 5.11.1 and 5.12.0. One thing I noticed when debugging this is that 5.11.1 uses v6.MessageIdMarshaller, while 5.12.0 uses v11. Not sure if that has anything to do with this issue, though.
> ERROR o.a.activemq.broker.BrokerService - Failed to start Apache ActiveMQ ([eda2e5a4c4d0, null], {})
> java.io.UTFDataFormatException: bad string
>     at org.apache.activemq.util.DataByteArrayInputStream.readUTF(DataByteArrayInputStream.java:315) ~[activemq-client-5.12.0.jar:5.12.0]
>     at org.apache.activemq.openwire.v11.BaseDataStreamMarshaller.looseUnmarshalString(BaseDataStreamMarshaller.java:571) ~[activemq-client-5.12.0.jar:5.12.0]
>     at org.apache.activemq.openwire.v11.MessageIdMarshaller.looseUnmarshal(MessageIdMarshaller.java:122) ~[activemq-client-5.12.0.jar:5.12.0]
>     at org.apache.activemq.openwire.OpenWireFormat.looseUnmarshalNestedObject(OpenWireFormat.java:473) ~[activemq-client-5.12.0.jar:5.12.0]
>     at org.apache.activemq.openwire.v11.BaseDataStreamMarshaller.looseUnmarsalNestedObject(BaseDataStreamMarshaller.java:466) ~[activemq-client-5.12.0.jar:5.12.0]
>     at org.apache.activemq.openwire.v11.MessageMarshaller.looseUnmarshal(MessageMarshaller.java:220) ~[activemq-client-5.12.0.jar:5.12.0]
>     at org.apache.activemq.openwire.v11.ActiveMQMessageMarshaller.looseUnmarshal(ActiveMQMessageMarshaller.java:101) ~[activemq-client-5.12.0.jar:5.12.0]
>     at org.apache.activemq.openwire.v11.ActiveMQObjectMessageMarshaller.looseUnmarshal(ActiveMQObjectMessageMarshaller.java:101) ~[activemq-client-5.12.0.jar:5.12.0]
>     at org.apache.activemq.openwire.OpenWireFormat.doUnmarshal(OpenWireFormat.java:366) ~[activemq-client-5.12.0.jar:5.12.0]
>     at org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:200) ~[activemq-client-5.12.0.jar:5.12.0]
>     at org.apache.activemq.store.jdbc.JDBCPersistenceAdapter.getLastMessageBrokerSequenceId(JDBCPersistenceAdapter.java:266) ~[activemq-jdbc-store-5.12.0.jar:5.12.0]
>     at org.apache.activemq.broker.region.DestinationFactoryImpl.getLastMessageBrokerSequenceId(DestinationFactoryImpl.java:147) ~[activemq-broker-5.12.0.jar:5.12.0]
>     at org.apache.activemq.broker.region.RegionBroker.<init>(RegionBroker.java:130) ~[activemq-broker-5.12.0.jar:5.12.0]
>     at org.apache.activemq.broker.jmx.ManagedRegionBroker.<init>(ManagedRegionBroker.java:112) ~[activemq-broker-5.12.0.jar:5.12.0]
>     at org.apache.activemq.broker.BrokerService.createRegionBroker(BrokerService.java:2297) ~[activemq-broker-5.12.0.jar:5.12.0]
>     at org.apache.activemq.broker.BrokerService.createRegionBroker(BrokerService.java:2290) ~[activemq-broker-5.12.0.jar:5.12.0]
>     at org.apache.activemq.broker.BrokerService.createBroker(BrokerService.java:2247) ~[activemq-broker-5.12.0.jar:5.12.0]
>     at org.apache.activemq.broker.BrokerService.getBroker(BrokerService.java:981) ~[activemq-broker-5.12.0.jar:5.12.0]
>     at org.apache.activemq.broker.BrokerService.getAdminConnectionContext(BrokerService.java:2518) ~[activemq-broker-5.12.0.jar:5.12.0]
>     at org.apache.activemq.broker.BrokerService.startVirtualConsumerDestinations(BrokerService.java:2657) ~[activemq-broker-5.12.0.jar:5.12.0]
>     at org.apache.activemq.broker.BrokerService.startDestinations(BrokerService.java:2509) ~[activemq-broker-5.12.0.jar:5.12.0]
>     at org.apache.activemq.broker.BrokerService.doStartBroker(BrokerService.java:692) ~[activemq-broker-5.12.0.jar:5.12.0]
>     at org.apache.activemq.broker.BrokerService.startBroker(BrokerService.java:684) ~[a
[jira] [Commented] (AMQ-6108) SelectorManager Executor is not shutdown when transport os stopped.
[ https://issues.apache.org/jira/browse/AMQ-6108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196057#comment-15196057 ]

Volker Kleinschmidt commented on AMQ-6108:
------------------------------------------

Thanks for the quick fix!

> SelectorManager Executor is not shutdown when transport os stopped.
> -------------------------------------------------------------------
>
>              Key: AMQ-6108
>              URL: https://issues.apache.org/jira/browse/AMQ-6108
>          Project: ActiveMQ
>       Issue Type: Bug
>         Reporter: Andy Gumbrecht
>         Assignee: Timothy Bish
>          Fix For: 5.13.1, 5.14.0
>      Attachments: SelectorManager.Shutdown.patch
>
> SelectorManager creates an Executor that is not shut down on termination of the Transport.
> The Executor currently uses non-daemon threads, and it is not guaranteed that the SelectorWorker thread exit condition is ever met.
> This causes the shutdown to hang when using transports that utilise the SelectorManager, such as nio+ssl for example.
> The proposed patch shuts down the ExecutorService on/after Transport shutdown. The SelectorWorkers also check for this as an exit condition.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (AMQ-6108) SelectorManager Executor is not shutdown when transport is stopped.
[ https://issues.apache.org/jira/browse/AMQ-6108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194293#comment-15194293 ]

Volker Kleinschmidt edited comment on AMQ-6108 at 3/15/16 2:58 PM:
-------------------------------------------------------------------

Actually these check-ins by [~tabish121] call setDaemon(false), i.e. they create the ActiveMQ IO Worker thread as user thread, not as daemon. This is exactly the wrong way around, so this issue remains unfixed, or rather the fix here really broke things! This is a major problem, as it prevents proper shutdown and restart of JVMs running AMQ. In all previous releases these threads actually were daemon threads, as can easily be verified in thread dumps, but the newly added explicit setDaemon(false) call broke that in 5.13.1.

was (Author: volkerk):
Actually these check-ins call setDaemon(false), i.e. they create the ActiveMQ IO Worker thread as user thread, not as daemon. This is exactly the wrong way around, so this issue remains unfixed, or rather the fix here really broke things! This is a major problem, as it prevents proper shutdown and restart of JVMs running AMQ. In all previous releases these threads actually were daemon threads, as can easily be verified in thread dumps, but the newly added explicit setDaemon(false) call broke that in 5.13.1.

-- 
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Comment Edited] (AMQ-6108) SelectorManager Executor is not shutdown when transport is stopped.
[ https://issues.apache.org/jira/browse/AMQ-6108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194293#comment-15194293 ]

Volker Kleinschmidt edited comment on AMQ-6108 at 3/15/16 2:56 PM:
-------------------------------------------------------------------

Actually these check-ins call setDaemon(false), i.e. they create the ActiveMQ IO Worker thread as user thread, not as daemon. This is exactly the wrong way around, so this issue remains unfixed, or rather the fix here really broke things! This is a major problem, as it prevents proper shutdown and restart of JVMs running AMQ. In all previous releases these threads actually were daemon threads, as can easily be verified in thread dumps, but the newly added explicit setDaemon(false) call broke that in 5.13.1.

was (Author: volkerk):
Actually these check-ins call setDaemon(false), i.e. they create the ActiveMQ IO Worker thread as user thread, not as daemon. This is exactly the wrong way around, so this issue remains unfixed! This is a major problem, as it prevents proper shutdown and restart of JVMs running AMQ. In all previous releases these threads actually were daemon threads, as can easily be verified in thread dumps, but the newly added explicit setDaemon(false) call broke that in 5.13.1.

-- 
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Comment Edited] (AMQ-6108) SelectorManager Executor is not shutdown when transport is stopped.
[ https://issues.apache.org/jira/browse/AMQ-6108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194293#comment-15194293 ]

Volker Kleinschmidt edited comment on AMQ-6108 at 3/15/16 2:54 PM:
-------------------------------------------------------------------

Actually these check-ins call setDaemon(false), i.e. they create the ActiveMQ IO Worker thread as user thread, not as daemon. This is exactly the wrong way around, so this issue remains unfixed! This is a major problem, as it prevents proper shutdown and restart of JVMs running AMQ. In all previous releases these threads actually were daemon threads, as can easily be verified in thread dumps, but the newly added explicit setDaemon(false) call broke that in 5.13.1.

was (Author: volkerk):
Actually these check-ins call setDaemon(false), i.e. they create the ActiveMQ IO Worker thread as user thread, not as daemon. This is exactly the wrong way around, so this issue remains unfixed!

-- 
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Comment Edited] (AMQ-6108) SelectorManager Executor is not shutdown when transport is stopped.
[ https://issues.apache.org/jira/browse/AMQ-6108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194293#comment-15194293 ]

Volker Kleinschmidt edited comment on AMQ-6108 at 3/14/16 10:28 PM:
--------------------------------------------------------------------

Actually these check-ins call setDaemon(false), i.e. they create the ActiveMQ IO Worker thread as user thread, not as daemon. This is exactly the wrong way around, so this issue remains unfixed!

was (Author: volkerk):
Actually these check-ins call setDaemon(false), i.e. they create the ActiveMQ IO Worker thread as user thread, not as daemon. This is exactly the wrong way around!

-- 
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (AMQ-6108) SelectorManager Executor is not shutdown when transport is stopped.
[ https://issues.apache.org/jira/browse/AMQ-6108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194293#comment-15194293 ]

Volker Kleinschmidt commented on AMQ-6108:
------------------------------------------

Actually these check-ins call setDaemon(false), i.e. they create the ActiveMQ IO Worker thread as user thread, not as daemon. This is exactly the wrong way around!

-- 
This message was sent by Atlassian JIRA
(v6.3.4#6332)
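The daemon-vs-user-thread point argued in these comments can be illustrated with a small ThreadFactory. This is a hedged sketch, not ActiveMQ's actual factory; the thread-name prefix is only an example. The commenter's complaint is that the fix called setDaemon(false), where setDaemon(true), as below, is what keeps worker threads from blocking JVM exit:

```java
import java.util.concurrent.ThreadFactory;

// Illustrative: a ThreadFactory that marks its threads as daemon threads,
// so they cannot keep the JVM alive once all user threads have exited.
// setDaemon(false) here would have the opposite, shutdown-blocking effect.
class DaemonThreadFactory implements ThreadFactory {
    private final String prefix;
    private int count = 0;

    DaemonThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public synchronized Thread newThread(Runnable r) {
        Thread t = new Thread(r, prefix + "-" + (++count));
        t.setDaemon(true); // daemon: JVM may exit even while this thread runs
        return t;
    }
}
```

Whether a thread is a daemon is easy to verify in a thread dump, which is how the commenter observed the regression.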
[jira] [Comment Edited] (AMQ-3189) Postgresql with spring embedded activeMQ has "table already created" exception
[ https://issues.apache.org/jira/browse/AMQ-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172818#comment-15172818 ]

Volker Kleinschmidt edited comment on AMQ-3189 at 3/3/16 2:03 AM:
------------------------------------------------------------------

It appears this issue is simply due to a case discrepancy between the form of the table name used during creation and the one used when checking for existence. Note that the creation statement uses all upper case, but the existence message reports the tablename in lower case. All identifiers (including column names) that are not double-quoted are folded to lower case in PostgreSQL. It does not help that this is exactly the opposite behavior from Oracle!

A proper table existence query in Oracle:
{noformat}
select count(1) from user_tables where table_name = UPPER(string_to_check);
{noformat}

A proper table existence query in Postgres:
{noformat}
select count(1) from information_schema.tables
 where table_catalog = CURRENT_CATALOG
   and table_schema = CURRENT_SCHEMA
   and table_name = LOWER(string_to_check);
{noformat}

In other words, what needs to be fixed here is the following line in DefaultJDBCAdapter.doCreateTables():
{noformat}
rs = c.getConnection().getMetaData().getTables(null, null, this.statements.getFullMessageTableName(),
{noformat}

For Postgres this needs to use getFullMessageTableName().toLowerCase() instead. So you either need platform-specific code here, overriding this in PostgresqlJDBCAdapter, or you need to lower-case the table names already in Statements.setMessageTableName() etc. depending on the platform (lower-casing it across the board would obviously not work with Oracle). Or perhaps the tablenamePattern for the DatabaseMetaData.getTables() method can simply specify that it should be a case-insensitive match; I don't find documented what types of patterns are supported there.
was (Author: volkerk):
It appears this issue is simply due to a case discrepancy between the form of the table name used during creation and the one used when checking for existence. Note that the creation statement uses all upper case, but the existence message reports the tablename in lower case. All identifiers (including column names) that are not double-quoted are folded to lower case in PostgreSQL. It does not help that this is exactly the opposite behavior from Oracle!

A proper table existence query in Oracle:
{noformat}
select count(1) from user_tables where table_name = UPPER(string_to_check);
{noformat}

A proper table existence query in Postgres:
{noformat}
select count(1) from information_schema.tables
 where table_catalog = CURRENT_CATALOG
   and table_schema = CURRENT_SCHEMA
   and table_name = LOWER(string_to_check);
{noformat}

> Postgresql with spring embedded activeMQ has "table already created" exception
> ------------------------------------------------------------------------------
>
>                 Key: AMQ-3189
>                 URL: https://issues.apache.org/jira/browse/AMQ-3189
>             Project: ActiveMQ
>          Issue Type: Improvement
>    Affects Versions: 5.4.2
>         Environment: Postgresql 8.4, latest Postgresql JDBC 9.0.x, tomcat 6.x, spring 2.5
>            Reporter: steve neo
>
> This may not be a bug, as MQ is still workable after this exception warning. However, can you suppress the exception stack in the log? It should not even be a "warning", since this is just the normal process of detecting whether the tables exist or not.
> The same configuration works fine with MySQL. For Postgresql, the first startup creates the tables without problem. After restarting tomcat, the log prints an annoying failure message with a long exception stack.
> ===== Exception =====
> [13:38:53] INFO [JDBCPersistenceAdapter] Database adapter driver override recognized for : [postgresql_native_driver] - adapter: class org.apache.activemq.store.jdbc.adapter.PostgresqlJDBCAdapter
> [13:38:53] WARN [DefaultJDBCAdapter] Could not create JDBC tables; they could already exist. Failure was: CREATE TABLE EDG_ACTIVEMQ_MSGS(ID BIGINT NOT NULL, CONTAINER VARCHAR(80), MSGID_PROD VARCHAR(80), MSGID_SEQ BIGINT, EXPIRATION BIGINT, MSG BYTEA, PRIMARY KEY ( ID ) ) Message: ERROR: relation "edg_activemq_msgs" already exists SQLState: 42P07 Vendor code: 0
> [13:38:53] WARN [JDBCPersistenceAdapter] Failure details: ERROR: relation "edg_activemq_msgs" already exists
> org.postgresql.util.PSQLException: ERROR: relation "edg_activemq_msgs" already exists
> 	at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102) ~[postgresql-9.0-801.jdbc4.jar:na]
> 	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835) ~[postgresql-9.0-801.jdbc4.jar:na]
> 	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257) ~[postgresql-9.0-801.jdbc4.jar:na]
> 	at o
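The platform-specific case folding described in the comment above could be factored into a small helper applied before the name is handed to DatabaseMetaData.getTables(). This is an illustrative sketch; the class, enum, and method names are hypothetical and not part of the ActiveMQ adapter API:

```java
import java.util.Locale;

// Illustrative helper: normalize an unquoted table name to the case in
// which the database catalog actually stores it, before passing it to
// DatabaseMetaData.getTables(). PostgreSQL folds unquoted identifiers to
// lower case; Oracle folds them to upper case.
final class TableNameCase {
    enum Platform { POSTGRESQL, ORACLE, OTHER }

    static String forMetadataLookup(Platform platform, String tableName) {
        switch (platform) {
            case POSTGRESQL:
                return tableName.toLowerCase(Locale.ROOT); // e.g. edg_activemq_msgs
            case ORACLE:
                return tableName.toUpperCase(Locale.ROOT); // e.g. EDG_ACTIVEMQ_MSGS
            default:
                return tableName; // assume the driver's folding matches creation
        }
    }
}
```

Lower-casing across the board would break the Oracle path, which is why the comment suggests either an override in the PostgreSQL adapter or platform-aware handling in the shared Statements code.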
[jira] [Comment Edited] (AMQ-3189) Postgresql with spring embedded activeMQ has "table already created" exception
[ https://issues.apache.org/jira/browse/AMQ-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172818#comment-15172818 ]

Volker Kleinschmidt edited comment on AMQ-3189 at 3/2/16 10:47 PM:
-------------------------------------------------------------------

It appears this issue is simply due to a case discrepancy between the form of the table name used during creation and the one used when checking for existence. Note that the creation statement uses all upper case, but the existence message reports the tablename in lower case. All identifiers (including column names) that are not double-quoted are folded to lower case in PostgreSQL. It does not help that this is exactly the opposite behavior from Oracle!

A proper table existence query in Oracle:
{noformat}
select count(1) from user_tables where table_name = UPPER(string_to_check);
{noformat}

A proper table existence query in Postgres:
{noformat}
select count(1) from information_schema.tables
 where table_catalog = CURRENT_CATALOG
   and table_schema = CURRENT_SCHEMA
   and table_name = LOWER(string_to_check);
{noformat}

was (Author: volkerk):
It appears this issue is simply due to a case discrepancy between the form of the table name used during creation and the one used when checking for existence.
[jira] [Comment Edited] (AMQ-3189) Postgresql with spring embedded activeMQ has "table already created" exception
[ https://issues.apache.org/jira/browse/AMQ-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172818#comment-15172818 ]

Volker Kleinschmidt edited comment on AMQ-3189 at 2/29/16 11:06 PM:
--------------------------------------------------------------------

It appears this issue is simply due to a case discrepancy between the form of the table name used during creation and the one used when checking for existence.

was (Author: volkerk):
It appears this issue is simply due to a case discrepancy between the form of the table name used during creation and the one used when checking.

-- 
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (AMQ-3189) Postgresql with spring embedded activeMQ has "table already created" exception
[ https://issues.apache.org/jira/browse/AMQ-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172818#comment-15172818 ]

Volker Kleinschmidt commented on AMQ-3189:
------------------------------------------

It appears this issue is simply due to a case discrepancy between the form of the table name used during creation and the one used when checking.

-- 
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (AMQ-2520) Oracle 10g RAC resource usage VERY high from the passive servers SQL requests to the Database.
[ https://issues.apache.org/jira/browse/AMQ-2520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172717#comment-15172717 ] Volker Kleinschmidt commented on AMQ-2520: -- Hmm. Why on earth isn't this the default behavior of the default DB locker? You naturally do NOT want to sit around holding active transactions that are trying to acquire a DB resource that you know you'll never get unless the master broker fails. That's being a really bad citizen in the DB world. The FOR UPDATE NOWAIT clause is the right thing to use, period. > Oracle 10g RAC resource usage VERY high from the passive servers SQL requests > to the Database. > -- > > Key: AMQ-2520 > URL: https://issues.apache.org/jira/browse/AMQ-2520 > Project: ActiveMQ > Issue Type: Bug > Components: Broker >Affects Versions: 5.3.0, 5.4.0 > Environment: Redhat Enterprise Linux 5, Oracle 10g RAC >Reporter: Thomas Connolly > Fix For: 5.x > > > Two active MQ brokers are installed on RH EL 5 servers (one per server). > They're configured as a JDBC master / slave failover (as per examples). > Failover is tested and working and messages delivered. > Oracle is used for synchronisation (ACTIVEMQ_ tables), persistence etc. > We run a durable subscriber, and the client connects via a failover operation. > The SELECT * FROM ACTIVEMQ_LOCK FOR UPDATE is causing spin lock on the Oracle > database. > Basically the indefinite waiting from the passive mq instance is causing high > resource usage on Oracle. > After a short period Oracle dashboard shows a high number of active sessions > from Active MQ due to the continuous execution of > UPDATE ACTIVEMQ_LOCK SET TIME = ? WHERE ID = 1 > in the keepAlive method in > > https://svn.apache.org/repos/asf/activemq/trunk/activemq-core/src/main/java/org/apache/activemq/store/jdbc/DatabaseLocker.java > As a workaround we've had to push out the lockAcquireSleepInterval to 5 > minutes in the configuration of ActiveMQ, but this didn't work. 
> lockAcquireSleepInterval="30" createTablesOnStartup="true"/>
> We're currently changing the broker to poll rather than block, so in Statement.java we've added a WAIT 0 that throws an exception if the lock is not acquired.
> public String getLockCreateStatement() {
>     if (lockCreateStatement == null) {
>         lockCreateStatement = "SELECT * FROM " + getFullLockTableName();
>         if (useLockCreateWhereClause) {
>             lockCreateStatement += " WHERE ID = 1";
>         }
>         lockCreateStatement += " FOR UPDATE WAIT 0";
>     }
>     return lockCreateStatement;
> }
> Any suggestions on this issue? It seems to be a quite fundamental one.

-- 
This message was sent by Atlassian JIRA
(v6.3.4#6332)
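The polling workaround quoted above boils down to appending WAIT 0 to the lock query so Oracle fails fast instead of parking a session on the row lock. Below is a self-contained sketch of that statement builder; the surrounding class is hypothetical, while the field names follow the quoted snippet:

```java
// Illustrative sketch of the quoted workaround: build the broker lock
// statement so Oracle's "SELECT ... FOR UPDATE" fails immediately
// ("WAIT 0", equivalent to NOWAIT) rather than holding an active session
// blocked on the lock. The slave then sleeps and retries instead.
class LockStatementSketch {
    private String lockCreateStatement;
    private final String fullLockTableName;
    private final boolean useLockCreateWhereClause;

    LockStatementSketch(String fullLockTableName, boolean useLockCreateWhereClause) {
        this.fullLockTableName = fullLockTableName;
        this.useLockCreateWhereClause = useLockCreateWhereClause;
    }

    String getLockCreateStatement() {
        if (lockCreateStatement == null) { // lazily built, then cached
            lockCreateStatement = "SELECT * FROM " + fullLockTableName;
            if (useLockCreateWhereClause) {
                lockCreateStatement += " WHERE ID = 1";
            }
            // WAIT 0 makes lock acquisition non-blocking on Oracle.
            lockCreateStatement += " FOR UPDATE WAIT 0";
        }
        return lockCreateStatement;
    }
}
```

The commenter's point is that this fail-fast form is kinder to Oracle RAC than an indefinitely blocking FOR UPDATE, since the passive node never holds an open transaction waiting on the row.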
[jira] [Commented] (AMQ-6125) Potential NPE in session rollback if no default redlivery policy configured
[ https://issues.apache.org/jira/browse/AMQ-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172493#comment-15172493 ]

Volker Kleinschmidt commented on AMQ-6125:
------------------------------------------

Hmm, red livery stables wonder why they're mentioned in AMQ tickets :)

> Potential NPE in session rollback if no default redlivery policy configured
> ---------------------------------------------------------------------------
>
>                 Key: AMQ-6125
>                 URL: https://issues.apache.org/jira/browse/AMQ-6125
>             Project: ActiveMQ
>          Issue Type: Bug
>            Reporter: Timothy Bish
>            Assignee: Timothy Bish
>             Fix For: 5.13.1, 5.14.0
>
> If the RedliveryPolicyMap is set on a ConnectionFactory and no default entry is set on that instance, then the MessageConsumer can throw an NPE on Rollback because its policy will be null.

-- 
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (AMQ-6005) Slave broker startup corrupts shared PList storage
[ https://issues.apache.org/jira/browse/AMQ-6005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15051137#comment-15051137 ] Volker Kleinschmidt commented on AMQ-6005: -- Thanks for the fix, much appreciated! > Slave broker startup corrupts shared PList storage > -- > > Key: AMQ-6005 > URL: https://issues.apache.org/jira/browse/AMQ-6005 > Project: ActiveMQ > Issue Type: Bug > Components: KahaDB >Affects Versions: 5.7.0, 5.10.0 > Environment: RHLinux6 >Reporter: Volker Kleinschmidt >Assignee: Gary Tully > Fix For: 5.13.1, 5.14.0 > > > h4. Background > When multiple JVMs run AMQ in a master/slave configuration with the broker > directory in a shared filesystem location (as is required e.g. for > kahaPersistence), and when due to high message volume or slow producers the > broker's memory needs exceed the configured memory usage limit, AMQ will > overflow asynchronous messages to a PList store inside the "tmp_storage" > subdirectory of said shared broker directory. > h4. Issue > We frequently observed this tmpDB store getting corrupted with "stale NFS > filehandle" errors for tmpDB.data, tmpDB.redo, and some journal files, all of > which suddenly went missing from the tmp_storage folder. This puts the entire > broker into a bad state from which it cannot recover. Only restarting the > service (which causes a broker slave to take over and loses the > yet-undelivered messages) gets a working state back. > h4. Symptoms > Stack trace: > {noformat} > ... 
> Caused by: java.io.IOException: Stale file handle
> 	at java.io.RandomAccessFile.readBytes0(Native Method)
> 	at java.io.RandomAccessFile.readBytes(RandomAccessFile.java:350)
> 	at java.io.RandomAccessFile.read(RandomAccessFile.java:385)
> 	at java.io.RandomAccessFile.readFully(RandomAccessFile.java:444)
> 	at java.io.RandomAccessFile.readFully(RandomAccessFile.java:424)
> 	at org.apache.kahadb.page.PageFile.readPage(PageFile.java:876)
> 	at org.apache.kahadb.page.Transaction$2.readPage(Transaction.java:446)
> 	at org.apache.kahadb.page.Transaction$2.<init>(Transaction.java:437)
> 	at org.apache.kahadb.page.Transaction.openInputStream(Transaction.java:434)
> 	at org.apache.kahadb.page.Transaction.load(Transaction.java:410)
> 	at org.apache.kahadb.page.Transaction.load(Transaction.java:367)
> 	at org.apache.kahadb.index.ListIndex.loadNode(ListIndex.java:306)
> 	at org.apache.kahadb.index.ListIndex.getHead(ListIndex.java:99)
> 	at org.apache.kahadb.index.ListIndex.iterator(ListIndex.java:284)
> 	at org.apache.activemq.store.kahadb.plist.PList$PListIterator.<init>(PList.java:199)
> 	at org.apache.activemq.store.kahadb.plist.PList.iterator(PList.java:189)
> 	at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor$DiskIterator.<init>(FilePendingMessageCursor.java:496)
> {noformat}
> h4. Cause
> During BrokerThread startup, the BrokerService.startPersistenceAdapter() method is called, which via doStartPersistenceAdapter() and getProducerSystemUsage() invokes getSystemUsage(), that calls getTempDataStore(), and that method summarily cleans out the existing contents of the tmp_storage directory.
> All of this happens *before* the broker lock is obtained in the PersistenceAdapter.start() method at the end of doStartPersistenceAdapter(). So a JVM that doesn't get to be the broker (because there already is one) and runs in slave mode (waiting to obtain the broker lock) interferes with and corrupts the running broker's tmp_storage and thus breaks the broker.
That's > a critical bug. The slave has no business starting up the persistence adapter > and cleaning out data as it hasn't gotten the lock yet, so isn't allowed to > do any work, period. > h4. Workaround > As workaround, an unshared local directory needs to be specified as > tempDirectory for the broker, even if the main broker directory is shared. > Also, since broker startup will clear the tmp_storage out anyway, there > really is no advantage to having this in a shared location - since the next > broker that starts up after a broker failure will never re-use that data > anyway. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
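The workaround above (an unshared local temp directory alongside a shared broker directory) can be expressed in broker configuration. A sketch only, assuming a standard activemq.xml; the tmpDataDirectory attribute name and all paths here are assumptions to verify against your ActiveMQ version:

{code:xml}
<!-- Sketch only: attribute names and paths are illustrative, not from this ticket. -->
<!-- Shared dataDirectory for the master/slave lock, but an UNSHARED local     -->
<!-- tmpDataDirectory so a starting slave can never wipe the running master's  -->
<!-- PList store.                                                              -->
<broker xmlns="http://activemq.apache.org/schema/core"
        brokerName="localhost"
        dataDirectory="/nfs/shared/activemq/data"
        tmpDataDirectory="/var/local/activemq/tmp_storage">
  <!-- ... persistenceAdapter, transportConnectors, etc. ... -->
</broker>
{code}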
[jira] [Comment Edited] (AMQ-6005) Slave broker startup corrupts shared PList storage
[ https://issues.apache.org/jira/browse/AMQ-6005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967709#comment-14967709 ] Volker Kleinschmidt edited comment on AMQ-6005 at 12/10/15 3:51 PM: Also, to replicate the source of the problem, you can simply hand-create empty tmpDB.data, tmpDB.redo, and db-1.log files in the shared tmp_storage folder, then restart one of the slave nodes while the master broker is still running - you will see that those files get deleted by the slave startup, which would corrupt the master broker's tmpDB if it were currently using it. So for issue replication it's not necessary to actually create enough asynchronous message load to be using tmp_storage - it's the tmpDB deletion by a *slave* that is the problem. I should also note that we'd originally been seeing this in action only when using JDBC persistence (where there's technically no need for a shared dataDirectory at all anymore). With kahaPersistence things used to go south before ever reaching this point, due to unreliable file locking over NFS, so we've moved away from that. (Update: it does happen with kahaPersistence too, seen it just recently, with 160 GB of error logging as a result)
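The hand-seeding step described in the comment above can be scripted. A minimal sketch; the directory layout is assumed from the ticket, and this helper is not part of ActiveMQ itself:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Creates the empty PList files named in the comment inside the shared broker
// directory, so a subsequent slave restart can be observed deleting them while
// the master is still running.
public class SeedTmpStorage {
    public static void seed(Path brokerDir) throws IOException {
        Path tmp = brokerDir.resolve("tmp_storage");
        Files.createDirectories(tmp);
        for (String name : new String[] { "tmpDB.data", "tmpDB.redo", "db-1.log" }) {
            Path f = tmp.resolve(name);
            if (Files.notExists(f)) {
                Files.createFile(f);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Path is illustrative; point it at the shared broker directory.
        seed(Paths.get(args.length > 0 ? args[0] : "data/localhost"));
    }
}
```

After seeding, restart one slave node and re-list tmp_storage to observe the deletion.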
[jira] [Comment Edited] (AMQ-4645) The lease-database-locker does not work properly slave broker ssytem clock is behind Database server
[ https://issues.apache.org/jira/browse/AMQ-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15001792#comment-15001792 ] Volker Kleinschmidt edited comment on AMQ-4645 at 11/12/15 7:54 AM: Minor nitpick: Title is missing "if", and system is misspelled. > The lease-database-locker does not work properly slave broker ssytem clock is > behind Database server > > > Key: AMQ-4645 > URL: https://issues.apache.org/jira/browse/AMQ-4645 > Project: ActiveMQ > Issue Type: Bug > Components: Message Store >Affects Versions: 5.8.0 > Environment: jdbc master slave >Reporter: Gary Tully >Assignee: Gary Tully > Labels: clock, jdbc, lease, masterSlave > Fix For: 5.9.0 > > > The lease locker can adjust the lease duration based on the DB current time > but this only works if the broker is ahead of the Db. > If the broker is behind, it will always obtain a lease due to incorrect > adjustment. > If the clocks are in sync there is no issue.{code} > > lockKeepAlivePeriod="5000"> > > > > > {code} > The problem is that the negative diff is being treated as a positive diff.
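The {code} sample in the quoted description lost its XML tags; only the lockKeepAlivePeriod="5000" attribute survives. A sketch of a jdbcPersistenceAdapter with a lease-database-locker, keeping that value and treating every other name and value as illustrative:

{code:xml}
<!-- Reconstruction sketch: only lockKeepAlivePeriod="5000" is from the
     original; the data source name and sleep interval are illustrative. -->
<persistenceAdapter>
  <jdbcPersistenceAdapter dataSource="#jdbc-ds" lockKeepAlivePeriod="5000">
    <locker>
      <lease-database-locker lockAcquireSleepInterval="10000"/>
    </locker>
  </jdbcPersistenceAdapter>
</persistenceAdapter>
{code}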
[jira] [Commented] (AMQ-4645) The lease-database-locker does not work properly slave broker ssytem clock is behind Database server
[ https://issues.apache.org/jira/browse/AMQ-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15001792#comment-15001792 ] Volker Kleinschmidt commented on AMQ-4645: -- Title is missing "if".
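The sign error described in the issue ("the negative diff is being treated as a positive diff") can be illustrated in isolation. This is not the actual LeaseDatabaseLocker code; the names and the adjustment formula are simplified assumptions:

```java
// Converting a broker-local timestamp into the database's clock frame needs
// the *signed* skew. If a negative diff (broker clock behind the DB) is
// treated as positive, the adjusted time is wrong by twice the skew, so
// lease-expiry comparisons against DB time come out wrong.
public class ClockSkewDemo {
    /** diff = brokerNow - dbNow; negative when the broker clock is behind. */
    static long toDbTime(long brokerTimestamp, long diff) {
        return brokerTimestamp - diff; // correct: signed adjustment
    }

    static long toDbTimeBuggy(long brokerTimestamp, long diff) {
        return brokerTimestamp - Math.abs(diff); // bug: sign of diff dropped
    }
}
```

When the clocks are in sync (diff = 0) the two variants agree, which matches the ticket's note that synchronized clocks show no issue.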
[jira] [Comment Edited] (AMQ-6005) Slave broker startup corrupts shared PList storage
[ https://issues.apache.org/jira/browse/AMQ-6005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967709#comment-14967709 ] Volker Kleinschmidt edited comment on AMQ-6005 at 10/21/15 7:23 PM: Also, to replicate the source of the problem, you can simply hand-create empty tmpDB.data, tmpDB.redo, and db-1.log files in the shared tmp_storage folder, then restart one of the slave nodes while the master broker is still running - you will see that those files get deleted by the slave startup, which would corrupt the master broker's tmpDB if it were currently using it. So for issue replication it's not necessary to actually create enough asynchronous message load to be using tmp_storage - it's the tmpDB deletion by a *slave* that is the problem. I should also note that we've been seeing this in action only when using JDBC persistence (where there's technically no need for a shared dataDirectory at all anymore). With kahaPersistence things used to go south before ever reaching this point, due to unreliable file locking over NFS, so we've moved away from that.
[jira] [Commented] (AMQ-6005) Slave broker startup corrupts shared PList storage
[ https://issues.apache.org/jira/browse/AMQ-6005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967709#comment-14967709 ] Volker Kleinschmidt commented on AMQ-6005: -- Also, to replicate the source of the problem, you can simply hand-create empty tmpDB.data, tmpDB.redo, and db-1.log files in the shared tmp_storage folder, then restart one of the slave nodes while the master broker is still running - you will see that those files get deleted by the slave startup, which would corrupt the master broker's tmpDB if it were currently using it. So for issue replication it's not necessary to actually create enough asynchronous message load to be using tmp_storage - it's the tmpDB deletion by a *slave* that is the problem.
[jira] [Comment Edited] (AMQ-6005) Slave broker startup corrupts shared PList storage
[ https://issues.apache.org/jira/browse/AMQ-6005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954334#comment-14954334 ] Volker Kleinschmidt edited comment on AMQ-6005 at 10/13/15 4:15 AM: Not yet - we have an AMQ 5.12 upgrade in the pipeline, but not in production use yet, and this problem is only reproduced in production use, since you need to create quite a few asynchronous topic messages to even get to use tmp_storage - while there are few messages they all stay in memory and this issue doesn't arise. Plus you need a multi-server environment. So this isn't easily reproduced in the lab. However the relevant code has not changed in 5.12, I've verified that. It's all in one class and should be easy enough to follow the outline in the ticket description. I've added some additional detail to make it easier to follow.
[jira] [Updated] (AMQ-6005) Slave broker startup corrupts shared PList storage
[ https://issues.apache.org/jira/browse/AMQ-6005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Volker Kleinschmidt updated AMQ-6005: - Description: (full text as quoted in the messages above)
[jira] [Commented] (AMQ-6005) Slave broker startup corrupts shared PList storage
[ https://issues.apache.org/jira/browse/AMQ-6005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954334#comment-14954334 ] Volker Kleinschmidt commented on AMQ-6005: -- Not yet - we have an AMQ 5.12 upgrade in the pipeline, but not in production use yet, and this problem is only reproduced in production use, since you need to create quite a few asynchronous topic messages to even get to use tmp_storage - while there are few messages they all stay in memory and this issue doesn't arise. Plus you need a multi-server environment. So this isn't easily reproduced in the lab. However the relevant code has not changed in 5.12, I've verified that. It's all in one class and should be easy enough to follow the outline in the ticket description.
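The fix the Cause section calls for - do no destructive work before owning the broker lock - can be sketched with plain file locking. Names are hypothetical and the real ActiveMQ locker abstraction differs; this only illustrates the ordering:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// A slave blocks in ch.lock() and never reaches the cleanup, so the running
// master's tmp_storage is left alone.
public class LockedStartup {
    public static void startBroker(Path brokerDir) throws IOException {
        Path lockFile = brokerDir.resolve("lock");
        try (FileChannel ch = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             FileLock lock = ch.lock()) {
            // Only the lock holder may touch shared state.
            cleanTempStorage(brokerDir.resolve("tmp_storage"));
            // ... start persistence adapter, transports, etc.
        }
    }

    static void cleanTempStorage(Path tmp) throws IOException {
        if (!Files.isDirectory(tmp)) {
            return;
        }
        try (DirectoryStream<Path> ds = Files.newDirectoryStream(tmp)) {
            for (Path p : ds) {
                Files.deleteIfExists(p); // flat file layout assumed for the sketch
            }
        }
    }
}
```

The key design point is simply ordering: acquire first, clean second, so waiting slaves hold no destructive code path.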
[jira] [Commented] (AMQ-3434) Contention in PList creation results in NPE on load - FilePendingMessageCursor
[ https://issues.apache.org/jira/browse/AMQ-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14952616#comment-14952616 ] Volker Kleinschmidt commented on AMQ-3434: -- Btw, even in 5.10 this very error can still be seen under circumstances leading to tmpDB corruption (such as AMQ-6005), now from line 308 of ListIndex.java: {noformat} 305 ListNode<Key,Value> loadNode(Transaction tx, long pageId) throws IOException { 306 Page<ListNode<Key,Value>> page = tx.load(pageId, marshaller); 307 ListNode<Key,Value> node = page.get(); 308 node.setPage(page); {noformat} At this point the broker is broken, gone, kaput, and only a restart helps. > Contention in PList creation results in NPE on load - > FilePendingMessageCursor > > > Key: AMQ-3434 > URL: https://issues.apache.org/jira/browse/AMQ-3434 > Project: ActiveMQ > Issue Type: Bug > Components: Broker >Affects Versions: 5.5.0 > Environment: stomp stress test >Reporter: Gary Tully >Assignee: Gary Tully > Labels: filependingmessagecursor, pliststore > Fix For: 5.6.0 > > > Occasional occurrence of stack trace{code}2011-06-30 16:02:09,903 > [127.0.0.1:50524] ERROR FilePendingMessageCursor - Caught an IO > Exception getting the DiskList 98_PendingCursor:loadq-3 > java.lang.NullPointerException > at org.apache.kahadb.index.ListIndex.loadNode(ListIndex.java:203) > at org.apache.kahadb.index.ListIndex.load(ListIndex.java:75) > at > org.apache.activemq.store.kahadb.plist.PListStore$1.execute(PListStore.java:219) > at org.apache.kahadb.page.Transaction.execute(Transaction.java:729) > at > org.apache.activemq.store.kahadb.plist.PListStore.getPList(PListStore.java:216) > at > org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.getDiskList(FilePendingMessageCursor.java:454) > at > org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.flushToDisk(FilePendingMessageCursor.java:432) > at > org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.tryAddMessageLast(FilePendingMessageCursor.java:217) > 
at > org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.addMessageLast(FilePendingMessageCursor.java:193) > at org.apache.activemq.broker.region.Queue.sendMessage(Queue.java:1629) > at org.apache.activemq.broker.region.Queue.doMessageSend(Queue.java:720) > at org.apache.activemq.broker.region.Queue.send(Queue.java:652) > at > org.apache.activemq.broker.region.AbstractRegion.send(AbstractRegion.java:379) > at > org.apache.activemq.broker.region.RegionBroker.send(RegionBroker.java:523) > at org.apache.activemq.broker.BrokerFilter.send(BrokerFilter.java:129) > at > org.apache.activemq.broker.CompositeDestinationBroker.send(CompositeDestinationBroker.java:96) > at > org.apache.activemq.broker.TransactionBroker.send(TransactionBroker.java:304) > at org.apache.activemq.broker.BrokerFilter.send(BrokerFilter.java:129) > at org.apache.activemq.broker.UserIDBroker.send(UserIDBroker.java:56) > at > org.apache.activemq.broker.MutableBrokerFilter.send(MutableBrokerFilter.java:135) > at > org.apache.activemq.broker.TransportConnection.processMessage(TransportConnection.java:468) > at > org.apache.activemq.command.ActiveMQMessage.visit(ActiveMQMessage.java:681) > at > org.apache.activemq.broker.TransportConnection.service(TransportConnection.java:316) > at > org.apache.activemq.broker.TransportConnection$1.onCommand(TransportConnection.java:180) > at > org.apache.activemq.transport.TransportFilter.onCommand(TransportFilter.java:69) > at > org.apache.activemq.transport.stomp.StompTransportFilter.sendToActiveMQ(StompTransportFilter.java:81) > at > org.apache.activemq.transport.stomp.ProtocolConverter.sendToActiveMQ(ProtocolConverter.java:140) > at > org.apache.activemq.transport.stomp.ProtocolConverter.onStompSend(ProtocolConverter.java:257) > at > org.apache.activemq.transport.stomp.ProtocolConverter.onStompCommand(ProtocolConverter.java:178) > at > org.apache.activemq.transport.stomp.StompTransportFilter.onCommand(StompTransportFilter.java:70) > at > 
org.apache.activemq.transport.TransportSupport.doConsume(TransportSupport.java:83) > at > org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:221) > at > org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:203) > at java.lang.Thread.run(Thread.java:662) > 2011-06-30 16:02:09,912 [127.0.0.1:50524] ERROR FilePendingMessageCursor > - Caught an Exception adding a message: ActiveMQBytesMessage {commandId = > 19796, > responseRequir
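A defensive variant of the loadNode() pattern quoted above would fail fast with a descriptive IOException when the loaded page carries no payload, instead of hitting the NPE at node.setPage(page). A minimal, self-contained sketch - the Page and loadNode names here are simplified stand-ins, not the real org.apache.kahadb types:

```java
import java.io.IOException;

// Sketch of a fail-fast null check for the loadNode() pattern quoted
// above. Page is a simplified stand-in, not org.apache.kahadb.page.Page.
public class LoadNodeSketch {
    static class Page<T> {
        private final T payload;
        Page(T payload) { this.payload = payload; }
        T get() { return payload; }
    }

    // Returns the node stored on the page, or throws a descriptive
    // IOException when the page is empty/corrupted - rather than letting
    // a later node.setPage(page) call blow up with an NPE.
    static <T> T loadNode(Page<T> page, long pageId) throws IOException {
        T node = page.get();
        if (node == null) {
            throw new IOException("No list node at page " + pageId
                    + "; PList index is likely corrupted");
        }
        return node;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(loadNode(new Page<>("head"), 5L)); // prints "head"
        try {
            loadNode(new Page<>((String) null), 6L);
        } catch (IOException expected) {
            System.out.println("caught: " + expected.getMessage());
        }
    }
}
```

A check like this would not repair the corrupted tmpDB, but it would surface the corruption with a page id at the point of failure instead of an opaque NullPointerException several frames later.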
[jira] [Created] (AMQ-6005) Slave broker startup corrupts shared PList storage
Volker Kleinschmidt created AMQ-6005: Summary: Slave broker startup corrupts shared PList storage Key: AMQ-6005 URL: https://issues.apache.org/jira/browse/AMQ-6005 Project: ActiveMQ Issue Type: Bug Components: KahaDB Affects Versions: 5.10.0, 5.7.0 Environment: RHLinux6 Reporter: Volker Kleinschmidt h4. Background When multiple JVMs run AMQ in a master/slave configuration with the broker directory in a shared filesystem location (as is required e.g. for kahaPersistence), and when due to high message volume or slow producers the broker's memory needs exceed the configured memory usage limit, AMQ will overflow asynchronous messages to a PList store inside the "tmp_storage" subdirectory of said shared broker directory. h4. Issue We frequently observed this tmpDB store getting corrupted with "stale NFS filehandle" errors for tmpDB.data, tmpDB.redo, and some journal files, all of which suddenly went missing from the tmp_storage folder. This puts the entire broker into a bad state from which it cannot recover. Only restarting the service (which causes a broker slave to take over and loses the yet-undelivered messages) gets a working state back. h4. Symptoms Stack trace: {noformat} ... 
Caused by: java.io.IOException: Stale file handle
at java.io.RandomAccessFile.readBytes0(Native Method)
at java.io.RandomAccessFile.readBytes(RandomAccessFile.java:350)
at java.io.RandomAccessFile.read(RandomAccessFile.java:385)
at java.io.RandomAccessFile.readFully(RandomAccessFile.java:444)
at java.io.RandomAccessFile.readFully(RandomAccessFile.java:424)
at org.apache.kahadb.page.PageFile.readPage(PageFile.java:876)
at org.apache.kahadb.page.Transaction$2.readPage(Transaction.java:446)
at org.apache.kahadb.page.Transaction$2.<init>(Transaction.java:437)
at org.apache.kahadb.page.Transaction.openInputStream(Transaction.java:434)
at org.apache.kahadb.page.Transaction.load(Transaction.java:410)
at org.apache.kahadb.page.Transaction.load(Transaction.java:367)
at org.apache.kahadb.index.ListIndex.loadNode(ListIndex.java:306)
at org.apache.kahadb.index.ListIndex.getHead(ListIndex.java:99)
at org.apache.kahadb.index.ListIndex.iterator(ListIndex.java:284)
at org.apache.activemq.store.kahadb.plist.PList$PListIterator.<init>(PList.java:199)
at org.apache.activemq.store.kahadb.plist.PList.iterator(PList.java:189)
at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor$DiskIterator.<init>(FilePendingMessageCursor.java:496)
{noformat}
h4. Cause
During BrokerThread startup, the BrokerService.startPersistenceAdapter() method is called, which eventually invokes getSystemUsage(), which calls getTempDataStore(), which summarily cleans out the existing contents of the tmp_storage directory. All of this happens before the broker lock is obtained in the startBroker() method. So a JVM that doesn't get to be the broker (because there already is one) and runs in slave mode (waiting to obtain the broker lock) interferes with and corrupts the running broker's tmp_storage, and thus breaks the broker. That's a critical bug. The slave has no business starting up the persistence adapter and cleaning out data: it hasn't obtained the lock yet, so it isn't allowed to do any work, period.
h4. Workaround
As a workaround, an unshared local directory needs to be specified as the tempDirectory for the broker, even if the main broker directory is shared. Since broker startup clears tmp_storage out anyway, there is also no advantage to having it in a shared location - the next broker that starts up after a broker failure will never re-use that data.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
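The workaround described in AMQ-6005 can be expressed in activemq.xml by pointing the broker's temp store at a node-local path while the main data directory stays on the shared filesystem. A hedged sketch - the attribute on BrokerService is tmpDataDirectory (the comment above calls it "tempDirectory"), and the paths are illustrative only:

```xml
<!-- Sketch: keep the journal/lock on the shared filesystem, but give each
     node its own local tmp_storage so a waiting slave cannot clobber the
     master's overflow PList store. Paths are examples, not recommendations. -->
<broker xmlns="http://activemq.apache.org/schema/core"
        brokerName="broker1"
        dataDirectory="/shared/nfs/activemq-data"
        tmpDataDirectory="/var/local/activemq/tmp_storage">
    <!-- ... transport connectors, persistence adapter, etc. ... -->
</broker>
```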
[jira] [Commented] (AMQ-4346) Activemq-5.8.0 Shutdown failing when using NIO + LevelDB
[ https://issues.apache.org/jira/browse/AMQ-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949460#comment-14949460 ] Volker Kleinschmidt commented on AMQ-4346:
--
Found it - dupe of AMQ-4349. The point I was trying to make is that it's *not* a dupe unless there's a valid reference to the issue it duplicates.
> Activemq-5.8.0 Shutdown failing when using NIO + LevelDB
>
> Key: AMQ-4346
> URL: https://issues.apache.org/jira/browse/AMQ-4346
> Project: ActiveMQ
> Issue Type: Bug
> Reporter: RK G
>
> I configured activemq 5.8.0 with nio connector and leveldb. When ./activemq stop is issued, the shutdown process throws an exception. It's a standalone installation.
> Here is the exception:
> 2013-02-25 12:15:07,431 | INFO | Connector amqp Stopped | org.apache.activemq.broker.TransportConnector | ActiveMQ ShutdownHook
> 2013-02-25 12:15:07,549 | INFO | Stopped LevelDB[/opt/activemq/data/leveldb] | org.apache.activemq.leveldb.LevelDBStore | ActiveMQ ShutdownHook
> 2013-02-25 12:15:07,550 | ERROR | Could not stop service: QueueRegion: destinations=1, subscriptions=0, memory=0%.
> Reason: java.lang.NullPointerException | org.apache.activemq.broker.jmx.ManagedQueueRegion | ActiveMQ ShutdownHook
> java.lang.NullPointerException
> at org.fusesource.hawtdispatch.package$RichExecutor.execute(hawtdispatch.scala:171)
> at org.fusesource.hawtdispatch.package$RichExecutorTrait$class.apply(hawtdispatch.scala:68)
> at org.fusesource.hawtdispatch.package$RichExecutor.apply(hawtdispatch.scala:169)
> at org.fusesource.hawtdispatch.package$RichExecutorTrait$class.future(hawtdispatch.scala:116)
> at org.fusesource.hawtdispatch.package$RichExecutor.future(hawtdispatch.scala:169)
> at org.fusesource.hawtdispatch.package$RichExecutorTrait$class.sync(hawtdispatch.scala:107)
> at org.fusesource.hawtdispatch.package$RichExecutor.sync(hawtdispatch.scala:169)
> at org.apache.activemq.leveldb.DBManager.destroyPList(DBManager.scala:773)
> at org.apache.activemq.leveldb.LevelDBStore.removePList(LevelDBStore.scala:454)
> at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.destroyDiskList(FilePendingMessageCursor.java:168)
> at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.destroy(FilePendingMessageCursor.java:163)
> at org.apache.activemq.broker.region.cursors.StoreQueueCursor.stop(StoreQueueCursor.java:82)
> at org.apache.activemq.broker.region.Queue.stop(Queue.java:910)
> at org.apache.activemq.broker.region.AbstractRegion.stop(AbstractRegion.java:117)
> at org.apache.activemq.util.ServiceStopper.stop(ServiceStopper.java:41)
> at org.apache.activemq.broker.region.RegionBroker.doStop(RegionBroker.java:574)
> at org.apache.activemq.broker.jmx.ManagedRegionBroker.doStop(ManagedRegionBroker.java:126)
> at org.apache.activemq.broker.region.RegionBroker.stop(RegionBroker.java:194)
> at org.apache.activemq.broker.BrokerFilter.stop(BrokerFilter.java:161)
> at org.apache.activemq.broker.BrokerFilter.stop(BrokerFilter.java:161)
> at org.apache.activemq.broker.TransactionBroker.stop(TransactionBroker.java:204)
> at org.apache.activemq.broker.BrokerService$5.stop(BrokerService.java:2070)
> at org.apache.activemq.util.ServiceStopper.stop(ServiceStopper.java:41)
> at org.apache.activemq.broker.BrokerService.stop(BrokerService.java:715)
> at org.apache.activemq.xbean.XBeanBrokerService.stop(XBeanBrokerService.java:96)
> at org.apache.activemq.broker.BrokerService.containerShutdown(BrokerService.java:2282)
> at org.apache.activemq.broker.BrokerService$6.run(BrokerService.java:2249)
[jira] [Commented] (AMQ-4346) Activemq-5.8.0 Shutdown failing when using NIO + LevelDB
[ https://issues.apache.org/jira/browse/AMQ-4346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949456#comment-14949456 ] Volker Kleinschmidt commented on AMQ-4346:
--
Of what?
[jira] [Commented] (AMQ-2935) java.io.EOFException: Chunk stream does not exist at page on broker start
[ https://issues.apache.org/jira/browse/AMQ-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933672#comment-14933672 ] Volker Kleinschmidt commented on AMQ-2935: -- This error still happens in 5.10 with the PList store (tmp_storage), which always uses kahaDB. A lot! > java.io.EOFException: Chunk stream does not exist at page on broker start > - > > Key: AMQ-2935 > URL: https://issues.apache.org/jira/browse/AMQ-2935 > Project: ActiveMQ > Issue Type: Bug > Components: Broker >Affects Versions: 5.4.0, 5.4.1, 5.4.2 > Environment: Win7 32bit, JDK 1.6_20 >Reporter: Andy Gumbrecht >Assignee: Gary Tully >Priority: Blocker > Fix For: 5.4.2 > > Attachments: activemq-data.zip, activemq.xml, stacktraces.txt > > > I am seeing this regularly upon restarts in all versions from 5.4.x - I > cannot downgrade due to breaking issues in previous versions. > The broker was shutdown cleanly with no logged issues. > Deleting the activemq-data directory seems to be the only recovery solution > (which is not an option in production) > 2010-09-23 13:54:30,997 [Starting ActiveMQ Broker] ERROR > org.apache.activemq.broker.BrokerService - Failed to start ActiveMQ JMS > Message Broker. 
> Reason: java.io.EOFException: Chunk stream does not exist at page: 0
> java.io.EOFException: Chunk stream does not exist at page: 0
> at org.apache.kahadb.page.Transaction$2.readPage(Transaction.java:454)
> at org.apache.kahadb.page.Transaction$2.<init>(Transaction.java:431)
> at org.apache.kahadb.page.Transaction.openInputStream(Transaction.java:428)
> at org.apache.kahadb.page.Transaction.load(Transaction.java:404)
> at org.apache.kahadb.page.Transaction.load(Transaction.java:361)
> at org.apache.activemq.broker.scheduler.JobSchedulerStore$3.execute(JobSchedulerStore.java:250)
> at org.apache.kahadb.page.Transaction.execute(Transaction.java:728)
> at org.apache.activemq.broker.scheduler.JobSchedulerStore.doStart(JobSchedulerStore.java:239)
> at org.apache.activemq.util.ServiceSupport.start(ServiceSupport.java:53)
> at org.apache.activemq.broker.scheduler.SchedulerBroker.getStore(SchedulerBroker.java:198)
> at org.apache.activemq.broker.scheduler.SchedulerBroker.getInternalScheduler(SchedulerBroker.java:185)
> at org.apache.activemq.broker.scheduler.SchedulerBroker.start(SchedulerBroker.java:85)
> at org.apache.activemq.broker.BrokerFilter.start(BrokerFilter.java:157)
> at org.apache.activemq.broker.BrokerFilter.start(BrokerFilter.java:157)
> at org.apache.activemq.broker.TransactionBroker.start(TransactionBroker.java:112)
> at org.apache.activemq.broker.BrokerService$3.start(BrokerService.java:1786)
> at org.apache.activemq.broker.BrokerService.start(BrokerService.java:496)
> at org.apache.activemq.ra.ActiveMQResourceAdapter$1.run(ActiveMQResourceAdapter.java:85)
[jira] [Commented] (AMQ-1382) Unnecessary creation of /activemq-data/localhost/tmp_storage directory with AMQ 5.x
[ https://issues.apache.org/jira/browse/AMQ-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14731700#comment-14731700 ] Volker Kleinschmidt commented on AMQ-1382: -- The tmp_storage directory is used to store non-persistent messages when their creation occurs faster than their consumption (i.e. when the consumer is slow). You really have no control over whether this gets created or not - it happens dynamically as needed. But you do have settings to control how much memory and disk space the broker may use, and by allocating more memory to the broker you can often prevent the use of tmp_storage. > Unnecessary creation of /activemq-data/localhost/tmp_storage directory with > AMQ 5.x > --- > > Key: AMQ-1382 > URL: https://issues.apache.org/jira/browse/AMQ-1382 > Project: ActiveMQ > Issue Type: Bug > Environment: NA >Reporter: Dave Stanley >Assignee: Hiram Chirino > Fix For: 5.3.0 > > Attachments: AMQ-1382_Unit_Test1.patch > > > With AMQ 5.0 everytime AMQ runs the following directory structure is created: > /activemq-data/localhost/tmp_storage. > This didn't happen in AMQ 4.1.0.X and looks to be a side effect of the new > temporary spooling feature in 5.x. > Since the broker is configured to be non-persistent, ActiveMQ should not be > creating this directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
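The memory and disk limits mentioned in the comment above live in the systemUsage section of activemq.xml. A hedged sketch with illustrative values only - a larger memoryUsage lets more non-persistent messages stay in RAM before the broker spools them to tmp_storage, while tempUsage caps the disk space tmp_storage may consume:

```xml
<!-- Illustrative limits only, to be sized for the actual workload. -->
<broker xmlns="http://activemq.apache.org/schema/core">
    <systemUsage>
        <systemUsage>
            <memoryUsage>
                <!-- RAM budget for in-flight messages; raising this often
                     avoids tmp_storage being used at all -->
                <memoryUsage limit="512 mb"/>
            </memoryUsage>
            <tempUsage>
                <!-- Upper bound on tmp_storage disk consumption -->
                <tempUsage limit="10 gb"/>
            </tempUsage>
        </systemUsage>
    </systemUsage>
</broker>
```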
[jira] [Comment Edited] (AMQ-5543) Broker logs show repeated "java.lang.IllegalStateException: Timer already cancelled." lines
[ https://issues.apache.org/jira/browse/AMQ-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14580607#comment-14580607 ] Volker Kleinschmidt edited comment on AMQ-5543 at 8/6/15 2:41 PM: -- We've seen this very same issue. Unable to create native thread due to ulimit setting, then broker gets into bad state and sulks for hours, slamming the door in the face of every client that's trying to connect. On the client this is witnessed by sockets being closed from the remote end, which looks like potential external interference at first, but when you look at the broker log it's clear that it's the broker resetting these sockets. I vote for making the broker more resilient against this type of problem, since hitting a ulimit and thus not being able to create a new pool worker or some such shouldn't throw the entire broker into the abyss. Here's the original OOM reported in the JVM's stdout.log: {noformat} INFO | jvm 1| 2015/06/08 05:30:26 | WARNING: RMI TCP Accept-0: accept loop for ServerSocket[addr=localhost/127.0.0.1,localport=42882] throws INFO | jvm 1| 2015/06/08 05:30:26 | java.lang.OutOfMemoryError: unable to create new native thread INFO | jvm 1| 2015/06/08 05:30:26 | at java.lang.Thread.start0(Native Method) INFO | jvm 1| 2015/06/08 05:30:26 | at java.lang.Thread.start(Thread.java:714) INFO | jvm 1| 2015/06/08 05:30:26 | at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949) INFO | jvm 1| 2015/06/08 05:30:26 | at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1371) INFO | jvm 1| 2015/06/08 05:30:26 | at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:414) INFO | jvm 1| 2015/06/08 05:30:26 | at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:371) INFO | jvm 1| 2015/06/08 05:30:26 | at java.lang.Thread.run(Thread.java:745) {noformat} And this next OOM is what starts the "timer already cancelled" messages in the 
activeMQ log:
{noformat}
INFO | jvm 1| 2015/06/08 05:39:34 | Exception in thread "ActiveMQ InactivityMonitor WriteCheckTimer" java.lang.OutOfMemoryError: unable to create new native thread
INFO | jvm 1| 2015/06/08 05:39:34 | at java.lang.Thread.start0(Native Method)
INFO | jvm 1| 2015/06/08 05:39:34 | at java.lang.Thread.start(Thread.java:714)
INFO | jvm 1| 2015/06/08 05:39:34 | at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
INFO | jvm 1| 2015/06/08 05:39:34 | at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1371)
INFO | jvm 1| 2015/06/08 05:39:34 | at org.apache.activemq.transport.AbstractInactivityMonitor.writeCheck(AbstractInactivityMonitor.java:158)
INFO | jvm 1| 2015/06/08 05:39:34 | at org.apache.activemq.transport.AbstractInactivityMonitor$2.run(AbstractInactivityMonitor.java:122)
INFO | jvm 1| 2015/06/08 05:39:34 | at org.apache.activemq.thread.SchedulerTimerTask.run(SchedulerTimerTask.java:33)
INFO | jvm 1| 2015/06/08 05:39:34 | at java.util.TimerThread.mainLoop(Timer.java:555)
INFO | jvm 1| 2015/06/08 05:39:34 | at java.util.TimerThread.run(Timer.java:505)
{noformat}
So basically it's the InactivityMonitor getting into a bad state. If that fails to create a new thread we should just disable the monitor entirely - there's certainly no stability to be gained by locking everyone out! At minimum, if the broker gets into such a bad state it needs to shut itself down and either restart or relinquish its lock, so another broker can start somewhere else.
[jira] [Commented] (AMQ-4575) JDBCIOExceptionHandler does not restart TransportConnector when JMX is enabled on broker - java.io.IOException: Transport Connector could not be registered in JMX
[ https://issues.apache.org/jira/browse/AMQ-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593817#comment-14593817 ] Volker Kleinschmidt commented on AMQ-4575: -- Ugh. "The transport connectors are unregistered when stopping the broker service but the transport connector list is not cleared". In other words they were NOT being unregistered. > JDBCIOExceptionHandler does not restart TransportConnector when JMX is > enabled on broker - java.io.IOException: Transport Connector could not be > registered in JMX > --- > > Key: AMQ-4575 > URL: https://issues.apache.org/jira/browse/AMQ-4575 > Project: ActiveMQ > Issue Type: Bug > Components: Broker >Affects Versions: 5.8.0 > Environment: tested against latest apache trunk (5.9 - snapshot) > svn 14900032 >Reporter: Pat Fox >Assignee: Timothy Bish > Fix For: 5.9.0 > > Attachments: DuplicateTransportConnectorsInListTest.java, > JDBCIOExceptionHandlerTest.java > > > Without JMX enabled on the broker; when the DB is shutdown and subsequently > restarted the JDBCIOExceptionHandler does a shutdown and restart on the > transport connector as expected. 
> However when JMX is enabled on the broker the transport connector fails to > restart throwing the following exception and subsequently shutting down the > broker > {code} > 2013-06-06 15:25:22,113 [st IO exception] - INFO DefaultIOExceptionHandler >- Stopping the broker due to exception, java.io.IOException: Transport > Connector could not be registered in JMX: > org.apache.activemq:type=Broker,brokerName=localhost,connector=clientConnectors,connectorName=tcp_//sideshow.home_61616 > java.io.IOException: Transport Connector could not be registered in JMX: > org.apache.activemq:type=Broker,brokerName=localhost,connector=clientConnectors,connectorName=tcp_//sideshow.home_61616 > at > org.apache.activemq.util.IOExceptionSupport.create(IOExceptionSupport.java:27) > at > org.apache.activemq.broker.BrokerService.registerConnectorMBean(BrokerService.java:1972) > at > org.apache.activemq.broker.BrokerService.startTransportConnector(BrokerService.java:2434) > at > org.apache.activemq.broker.BrokerService.startAllConnectors(BrokerService.java:2351) > at > org.apache.activemq.util.DefaultIOExceptionHandler$2.run(DefaultIOExceptionHandler.java:101) > Caused by: javax.management.InstanceAlreadyExistsException: > org.apache.activemq:type=Broker,brokerName=localhost,connector=clientConnectors,connectorName=tcp_//sideshow.home_61616 > at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:453) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1484) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:963) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:917) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:312) > at > com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:482) > at > 
org.apache.activemq.broker.jmx.ManagementContext.registerMBean(ManagementContext.java:380) > at > org.apache.activemq.broker.jmx.AnnotatedMBean.registerMBean(AnnotatedMBean.java:72) > at > org.apache.activemq.broker.BrokerService.registerConnectorMBean(BrokerService.java:1969) > ... 3 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (AMQ-4911) Activemq running on standalone not able to to post messages to database in case of database failover
[ https://issues.apache.org/jira/browse/AMQ-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593815#comment-14593815 ] Volker Kleinschmidt commented on AMQ-4911: -- You'd need to provide details of how your RAC connection is configured, and how failover is being handled. Transparent application failover (TAF) is only a feature of the OCI driver. And TAF exists only for SELECT queries - not for locking or modifying queries, so in any case for failover to work we'd need application code to handle the failover. And Kishan, what does a "mysql-ds" do in a ticket specific to Oracle RAC? > Activemq running on standalone not able to to post messages to database in > case of database failover > - > > Key: AMQ-4911 > URL: https://issues.apache.org/jira/browse/AMQ-4911 > Project: ActiveMQ > Issue Type: Bug > Components: Broker >Affects Versions: 5.8.0 > Environment: activemq 5.8 with oracle RAC with 2 nodes. activemq in > windows platform >Reporter: kishan > > when both the database nodes were up and running things were working fine > with active mq , then after closing the database server on one of the nodes > say node2, things some how worked good and messages were posted in the > database, but when brought back node2 and brought node1 down, messages went > to queued state and message finally were lost with these exceptions, but > after 5 -6mins things were stablized again > 2013-11-28 11:04:09,515 | WARN | Error while closing connection: No more > data to read from socket, due to: No more data to read from socket | > org.apache.activemq.store.jdbc.JDBCPersistenceAdapter | ActiveMQ Transport: > tcp:///10.167.91.198:58115@61618 > java.sql.SQLException: No more data to read from socket > at > oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:112) > at > oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:146) > at > oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:208) > at 
oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1123) > at oracle.jdbc.driver.T4CMAREngine.unmarshalSB1(T4CMAREngine.java:1075) > at oracle.jdbc.driver.T4C8Oall.receive(T4C8Oall.java:480) > at > oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:219) > at > oracle.jdbc.driver.T4CPreparedStatement.executeForRows(T4CPreparedStatement.java:970) > at > oracle.jdbc.driver.OraclePreparedStatement.executeBatch(OraclePreparedStatement.java:10690) > at > org.apache.commons.dbcp.DelegatingStatement.executeBatch(DelegatingStatement.java:297) > at > org.apache.commons.dbcp.DelegatingStatement.executeBatch(DelegatingStatement.java:297) > at > org.apache.commons.dbcp.DelegatingStatement.executeBatch(DelegatingStatement.java:297) > at > org.apache.activemq.store.jdbc.TransactionContext.executeBatch(TransactionContext.java:106) > at > org.apache.activemq.store.jdbc.TransactionContext.executeBatch(TransactionContext.java:84) > at > org.apache.activemq.store.jdbc.TransactionContext.close(TransactionContext.java:132) > at > org.apache.activemq.store.jdbc.JDBCMessageStore.addMessage(JDBCMessageStore.java:129) > at > org.apache.activemq.store.memory.MemoryTransactionStore.addMessage(MemoryTransactionStore.java:327) > at > org.apache.activemq.store.memory.MemoryTransactionStore$1.asyncAddQueueMessage(MemoryTransactionStore.java:154) > at org.apache.activemq.broker.region.Queue.doMessageSend(Queue.java:748) > at org.apache.activemq.broker.region.Queue.send(Queue.java:721) > at > org.apache.activemq.broker.region.AbstractRegion.send(AbstractRegion.java:406) > at > org.apache.activemq.broker.region.RegionBroker.send(RegionBroker.java:392) > at org.apache.activemq.broker.BrokerFilter.send(BrokerFilter.java:129) > at > org.apache.activemq.broker.scheduler.SchedulerBroker.send(SchedulerBroker.java:177) > at org.apache.activemq.broker.BrokerFilter.send(BrokerFilter.java:129) > at > 
org.apache.activemq.broker.CompositeDestinationBroker.send(CompositeDestinationBroker.java:96) > at > org.apache.activemq.broker.TransactionBroker.send(TransactionBroker.java:317) > at > org.apache.activemq.broker.MutableBrokerFilter.send(MutableBrokerFilter.java:135) > at > org.apache.activemq.broker.MutableBrokerFilter.send(MutableBrokerFilter.java:135) > at > org.apache.activemq.broker.TransportConnection.processMessage(TransportConnection.java:499) > at > org.apache.activemq.command.ActiveMQMessage.visit(ActiveMQMessage.java:749) > at > org.apache.activemq.broker.TransportConnection.service
[jira] [Commented] (AMQ-1780) ActiveMQ broker does not automatically reconnect if the connection to the database is lost
[ https://issues.apache.org/jira/browse/AMQ-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593638#comment-14593638 ] Volker Kleinschmidt commented on AMQ-1780: -- When using useDatabaseLock="true" of course a stopped DB will cause the broker to go down - it no longer has a lock, nor a means of obtaining one. But when the DB is back up, it needs to be able to re-establish connections, and obtain a new lock, since the DB going down means there no longer is anyone holding that lock. But if you have merely a network blip, and the DB did not go down, you may have a stale session in the DB that still holds the lock, and needs to be killed before a broker can be reestablished. That's tough, but tends to be a lot more reliable than filesystem-based locks, which go AWOL all the time. > ActiveMQ broker does not automatically reconnect if the connection to the > database is lost > -- > > Key: AMQ-1780 > URL: https://issues.apache.org/jira/browse/AMQ-1780 > Project: ActiveMQ > Issue Type: Bug > Components: Broker >Affects Versions: 5.0.0 > Environment: Windows 2003 Server >Reporter: Jaya Srinivasan >Assignee: Gary Tully > Fix For: 5.5.0 > > > hi > We are noticing that after any SQL Server restart or network blip between > ActiveMQ and the database, after the connection or the database comes back > online activeMQ broker needs to be restarted as well i.e it doesn't > automatically re-establish connection to the database as result any message > send fails because the broker is still using the stale connection to the > database. > Is this designed behaviour or a bug? we are using ActiveMQ 5.0.0 and the > latest version of the JSQLConnect database driver: version 5.7. The database > we are using is MS SQL Server 2005 > Right now, in our production environment any time we have network maintenance > or database restart we also have to restart the ActiveMQ broker which is not > a good option for us. 
> Also, we are using a single ActiveMQ broker, not the JDBC (Master/Slave)
> setup.
> Issue details in
> http://www.nabble.com/Database-connection-between-ActiveMQ-and-broker-td17321330s2354.html
> Please let me know if I need to give more information.
> Thanks,
> Jaya
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
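The reconnect behaviour the reporter asks for boils down to a retry loop with backoff: instead of keeping the stale connection, keep attempting to obtain a fresh DB connection (and, in the locked master/slave case, re-acquire the lock). This is a minimal sketch of that idea, not ActiveMQ's actual implementation; the names `ReconnectSketch`, `DbConnector`, and `backoffDelaysMs` are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of reconnect-with-backoff: retry obtaining a fresh DB
// connection (and the database lock) instead of reusing a stale connection.
// None of these names are ActiveMQ APIs.
public class ReconnectSketch {

    /** Capped exponential backoff delays for maxAttempts retries. */
    static List<Long> backoffDelaysMs(long initialMs, long maxMs, int maxAttempts) {
        List<Long> delays = new ArrayList<>();
        long delay = initialMs;
        for (int i = 0; i < maxAttempts; i++) {
            delays.add(delay);
            delay = Math.min(delay * 2, maxMs); // double, but never exceed the cap
        }
        return delays;
    }

    /** Stand-in for "try to reconnect and re-acquire the DB lock". */
    interface DbConnector {
        boolean tryConnectAndLock();
    }

    /** Retry the connector, sleeping between attempts; true on success. */
    static boolean reconnect(DbConnector db, List<Long> delaysMs) throws InterruptedException {
        for (long delay : delaysMs) {
            if (db.tryConnectAndLock()) {
                return true;
            }
            Thread.sleep(delay); // wait before the next attempt
        }
        return db.tryConnectAndLock(); // one final attempt
    }

    public static void main(String[] args) {
        System.out.println(backoffDelaysMs(100, 1000, 5)); // [100, 200, 400, 800, 1000]
    }
}
```

The cap matters: without it, a long outage would push the retry interval out indefinitely, and the broker would take arbitrarily long to notice the database coming back.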
[jira] [Comment Edited] (AMQ-5543) Broker logs show repeated "java.lang.IllegalStateException: Timer already cancelled." lines
[ https://issues.apache.org/jira/browse/AMQ-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14580607#comment-14580607 ] Volker Kleinschmidt edited comment on AMQ-5543 at 6/10/15 3:30 PM:
---
We've seen this very same issue: the broker is unable to create a native thread due to the ulimit setting, then gets into a bad state and sulks for hours, slamming the door in the face of every client that's trying to connect. On the client this is witnessed as sockets being closed from the remote end, which looks like potential external interference at first, but when you look at the broker log it's clear that it's the broker resetting these sockets.

I vote for making the broker more resilient against this type of problem, since hitting a ulimit and thus not being able to create a new pool worker or some such shouldn't throw the entire broker into the abyss.

Here's the original OOM reported in the JVM's stdout.log:
{noformat}
INFO | jvm 1| 2015/06/08 05:30:26 | WARNING: RMI TCP Accept-0: accept loop for ServerSocket[addr=localhost/127.0.0.1,localport=42882] throws
INFO | jvm 1| 2015/06/08 05:30:26 | java.lang.OutOfMemoryError: unable to create new native thread
INFO | jvm 1| 2015/06/08 05:30:26 | at java.lang.Thread.start0(Native Method)
INFO | jvm 1| 2015/06/08 05:30:26 | at java.lang.Thread.start(Thread.java:714)
INFO | jvm 1| 2015/06/08 05:30:26 | at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
INFO | jvm 1| 2015/06/08 05:30:26 | at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1371)
INFO | jvm 1| 2015/06/08 05:30:26 | at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:414)
INFO | jvm 1| 2015/06/08 05:30:26 | at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:371)
INFO | jvm 1| 2015/06/08 05:30:26 | at java.lang.Thread.run(Thread.java:745)
{noformat}
And this next OOM is what starts the "Timer already cancelled" messages in the activemq log:
{noformat}
INFO | jvm 1| 2015/06/08 05:39:34 | Exception in thread "ActiveMQ InactivityMonitor WriteCheckTimer" java.lang.OutOfMemoryError: unable to create new native thread
INFO | jvm 1| 2015/06/08 05:39:34 | at java.lang.Thread.start0(Native Method)
INFO | jvm 1| 2015/06/08 05:39:34 | at java.lang.Thread.start(Thread.java:714)
INFO | jvm 1| 2015/06/08 05:39:34 | at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
INFO | jvm 1| 2015/06/08 05:39:34 | at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1371)
INFO | jvm 1| 2015/06/08 05:39:34 | at org.apache.activemq.transport.AbstractInactivityMonitor.writeCheck(AbstractInactivityMonitor.java:158)
INFO | jvm 1| 2015/06/08 05:39:34 | at org.apache.activemq.transport.AbstractInactivityMonitor$2.run(AbstractInactivityMonitor.java:122)
INFO | jvm 1| 2015/06/08 05:39:34 | at org.apache.activemq.thread.SchedulerTimerTask.run(SchedulerTimerTask.java:33)
INFO | jvm 1| 2015/06/08 05:39:34 | at java.util.TimerThread.mainLoop(Timer.java:555)
INFO | jvm 1| 2015/06/08 05:39:34 | at java.util.TimerThread.run(Timer.java:505)
{noformat}
So basically it's the InactivityMonitor getting into a bad state. If that fails to create a new thread, we should just disable the monitor entirely - there's certainly no stability to be gained by shutting everything down!
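The commenter's proposal - disable the InactivityMonitor if it cannot create its check thread, rather than tearing the whole broker down - can be sketched with plain stdlib Java. This is a hypothetical illustration, not ActiveMQ code; `ResilientMonitor` and `checkRunner` are made-up names, and the OutOfMemoryError is thrown deliberately to simulate thread-creation failure.

```java
// Hypothetical sketch of "degrade instead of die": if the inactivity check
// cannot run (OutOfMemoryError: unable to create new native thread), switch
// the monitor off and keep serving clients. Not an ActiveMQ API.
public class ResilientMonitor {
    private volatile boolean disabled = false;

    /** Run one write-check; on thread-creation OOM, disable the monitor. */
    void writeCheck(Runnable checkRunner) {
        if (disabled) {
            return; // monitor switched off; the broker keeps serving traffic
        }
        try {
            checkRunner.run(); // would normally hand the check to a pool thread
        } catch (OutOfMemoryError oom) {
            // Could not create a native thread: degrade gracefully instead of
            // cancelling timers and resetting every client socket.
            disabled = true;
        }
    }

    boolean isDisabled() {
        return disabled;
    }

    public static void main(String[] args) {
        ResilientMonitor m = new ResilientMonitor();
        // Simulate the ulimit being hit during a write check.
        m.writeCheck(() -> { throw new OutOfMemoryError("unable to create new native thread"); });
        System.out.println(m.isDisabled()); // true
    }
}
```

The trade-off is that a disabled monitor no longer detects dead connections, so a real implementation would presumably also log loudly and perhaps retry re-enabling itself later.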
[jira] [Commented] (AMQ-5543) Broker logs show repeated "java.lang.IllegalStateException: Timer already cancelled." lines
[ https://issues.apache.org/jira/browse/AMQ-5543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14580607#comment-14580607 ] Volker Kleinschmidt commented on AMQ-5543:
--
> Broker logs show repeated "java.lang.IllegalStateException: Timer already
> cancelled." lines
> --
>
> Key: AMQ-5543
> URL: https://issues.apache.org/jira/browse/AMQ-5543
> Project: ActiveMQ
> Issue Type: Bug
> Components: Transport
> Affects Versions: 5.10.0
> Reporter: Tim Bain
>
> One of our brokers running 5.10.0 spewed over 1 million of the following
> exceptions to the log over a 2-hour period:
> Transport Connection to: tcp://127.0.0.1:x failed: java.io.IOException:
> Unexpected error occurred: java.lang.IllegalStateException: Timer already
> cancelled.
> Clients were observed to hang on startup (which we believe means they were
> unable to connect to the broker) until it was rebooted, after which we
> haven't seen the exception again.
> Once the exceptions started, there were no stack traces or other log lines
> that would indicate anything else about the cause, just those messages
> repeating. The problems started immediately (a few milliseconds) after an
> EOFException in the broker logs; we see those EOFExceptions pretty often and
> they've never before resulted in "Timer already cancelled" exceptions, so
> that might indicate what got us into a bad state, but then again it might be
> entirely unrelated.
> I searched JIRA and the mailing list archives for similar issues, and
> although there are a lot of incidences of "Timer already cancelled"
> exceptions, none of them exactly matches our situation.
> * http://activemq.2283324.n4.nabble.com/Producer-connections-keep-breaking-td4671152.html
> describes repeated copies of the line in the logs and is the closest
> parallel I've found, but it sounded like messages were still getting passed,
> albeit more slowly than normal, whereas the developer on my team who hit this
> said he didn't think any messages were getting sent. Still, it's the closest
> match of the group.
> * AMQ-5508 has a detailed investigation into the root cause of the problem
> that Pero Atanasov saw, but his scenario occurred only on broker shutdown,
> whereas our broker was not shutting down at the time, and it wasn't clear
> that the log line repeated for him.
> * http://activemq.2283324.n4.nabble.com/Timer-already-cancelled-and-KahaDB-Recovering-checkpoint-thread-after-death-td4676684.html
> has repeated messages like the ones I'm seeing, but appears to be specific
> to KahaDB, which I'm not using (we use non-persistent messages only).
> * AMQ-4805 has the same inner exception but not the full log line, and it
> shows a full stack trace, whereas I only see the one line without the stack
> trace.
> * http://activemq.2283324.n4.nabble.com/can-t-send-message-Timer-already-cancelled-td4680175.html
> appears to see the exception on the producer when sending rather than on the
> broker.
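The exception text in these logs comes straight from the JDK: once a `java.util.Timer` has been cancelled, any further `schedule()` call throws `IllegalStateException("Timer already cancelled.")`. ActiveMQ's InactivityMonitor schedules its read/write checks on a shared Timer (via `SchedulerTimerTask`, visible in the stack traces above), so once that Timer dies, every subsequent check fails with exactly this message. A minimal stdlib reproduction:

```java
import java.util.Timer;
import java.util.TimerTask;

// Reproduce the exception flooding the broker log: scheduling a task on a
// java.util.Timer that has already been cancelled throws
// IllegalStateException("Timer already cancelled.").
public class CancelledTimerDemo {

    static String scheduleOnCancelledTimer() {
        Timer timer = new Timer("demo-timer");
        timer.cancel(); // simulate the shared timer having been cancelled
        try {
            timer.schedule(new TimerTask() {
                @Override
                public void run() { /* never runs */ }
            }, 100);
            return "scheduled";
        } catch (IllegalStateException e) {
            return e.getMessage(); // "Timer already cancelled."
        }
    }

    public static void main(String[] args) {
        System.out.println(scheduleOnCancelledTimer()); // prints: Timer already cancelled.
    }
}
```

This also explains why the message repeats without a deeper stack trace: the broker keeps reusing the dead Timer for every connection check, so the same one-line failure recurs until the broker is restarted with a fresh Timer.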