[jira] [Commented] (AMQ-5446) activemq doesn't start when java home contains a space character
[ https://issues.apache.org/jira/browse/AMQ-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14219945#comment-14219945 ]

Jesse Fugitt commented on AMQ-5446:
------------------------------------

Sounds like this was what I was seeing on Windows, and I attached a patch to JIRA 5422: https://issues.apache.org/jira/browse/AMQ-5422

I didn't test on Linux, but that patch should help for Windows. Basically the variable needs quotes around it in the script:

    -Djava.security.auth.login.config=%ACTIVEMQ_CONF%\login.config

should be

    "-Djava.security.auth.login.config=%ACTIVEMQ_CONF%\login.config"

> activemq doesn't start when java home contains a space character
> -----------------------------------------------------------------
>
>                 Key: AMQ-5446
>                 URL: https://issues.apache.org/jira/browse/AMQ-5446
>             Project: ActiveMQ
>          Issue Type: Bug
>    Affects Versions: 5.10.0
>         Environment: cygwin
>            Reporter: Herve Dumont
>            Priority: Minor
>
> Found on cygwin/Windows 7, but should be applicable to Linux/Unix platforms too.
> If Java is installed in a directory containing a space character, the activemq start command doesn't start the broker. The script doesn't report any error, and the activemq.pid file contains the PID of a process which doesn't exist.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Comment Edited] (AMQ-5446) activemq doesn't start when java home contains a space character
[ https://issues.apache.org/jira/browse/AMQ-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14219945#comment-14219945 ]

Jesse Fugitt edited comment on AMQ-5446 at 11/20/14 8:35 PM:
-------------------------------------------------------------

Updating my comment now that I re-read and see you are referring to Java home having a space. I was having a similar problem when ActiveMQ was installed to a directory path with spaces (like Program Files), so maybe this other info could help you. Based on what I was seeing on Windows, I attached a patch to JIRA 5422: https://issues.apache.org/jira/browse/AMQ-5422

I didn't test on Linux, but that patch should help for Windows. Basically the variable needs quotes around it in the script:

    -Djava.security.auth.login.config=%ACTIVEMQ_CONF%\login.config

should be

    "-Djava.security.auth.login.config=%ACTIVEMQ_CONF%\login.config"

was (Author: jfugitt):
Sounds like this was what I was seeing on Windows, and I attached a patch to JIRA 5422: https://issues.apache.org/jira/browse/AMQ-5422

I didn't test on Linux, but that patch should help for Windows. Basically the variable needs quotes around it in the script:

    -Djava.security.auth.login.config=%ACTIVEMQ_CONF%\login.config

should be

    "-Djava.security.auth.login.config=%ACTIVEMQ_CONF%\login.config"

> activemq doesn't start when java home contains a space character
> -----------------------------------------------------------------
>
>                 Key: AMQ-5446
>                 URL: https://issues.apache.org/jira/browse/AMQ-5446
>             Project: ActiveMQ
>          Issue Type: Bug
>    Affects Versions: 5.10.0
>         Environment: cygwin
>            Reporter: Herve Dumont
>            Priority: Minor
>
> Found on cygwin/Windows 7, but should be applicable to Linux/Unix platforms too.
> If Java is installed in a directory containing a space character, the activemq start command doesn't start the broker. The script doesn't report any error, and the activemq.pid file contains the PID of a process which doesn't exist.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (AMQ-5440) KahaDB error at startup "Looking for key N but not found in fileMap"
[ https://issues.apache.org/jira/browse/AMQ-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220075#comment-14220075 ]

Jesse Fugitt commented on AMQ-5440:
------------------------------------

Debugged this a little more, but the root cause isn't obvious even after adding more trace debugging. Basically, the metadata that is written to disk at the time the checkpoint cleanup is done looks correct and would not cause a startup error, because it does not reference the files that are being cleaned and deleted from disk. However, upon restart, a different, older set of metadata is loaded from disk that still references a file deleted during the last checkpoint cleanup, which causes the exception during open/recover shown in the stack trace.

One workaround I tried was to force a rebuild of the index through journal replay for any exception thrown by the recover method (as shown below), but it doesn't feel like this is really getting to the root of the problem.

    // inside the open method:
    public void open() throws IOException {
        ...
        startCheckpoint();

        //recover();
        // instead, try to catch exceptions and rebuild the index
        try {
            recover();
        } catch (IOException ex) {
            LOG.warn("Recovery failure during open. Recovering the index through journal replay.", ex);
            // try to recover the index
            try {
                pageFile.unload();
            } catch (Exception ignore) {
            }
            if (archiveCorruptedIndex) {
                pageFile.archive();
            } else {
                pageFile.delete();
            }
            metadata = createMetadata();
            pageFile = null;
            loadPageFile();
            recover();
        }
    }

> KahaDB error at startup "Looking for key N but not found in fileMap"
> ---------------------------------------------------------------------
>
>                 Key: AMQ-5440
>                 URL: https://issues.apache.org/jira/browse/AMQ-5440
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.10.0
>            Reporter: Jesse Fugitt
>            Priority: Critical
>         Attachments: KahaDB.zip, TestApp.java, kahadbtest.log
>
> After being shut down uncleanly, KahaDB can at times hit a startup error that causes the broker to fail to start and potentially causes messages to be re-assigned without being marked as redelivered. The log message at startup is:
> 2014-11-17 11:10:36,826 | ERROR | Looking for key 275 but not found in fileMap: {305=db-305.log number = 305 , length = 8217, 304=db-304.log number = 304 , length = 8217, 307=db-307.log number = 307 , length = 8217, 306=db-306.log number = 306 , length = 8217, 309=db-309.log number = 309 , length = 8217, 308=db-308.log number = 308 , length = 8217, 311=db-311.log number = 311 , length = 8217, 310=db-310.log number = 310 , length = 8217, 313=db-313.log number = 313 , length = 8217, 312=db-312.log number = 312 , length = 8217, 314=db-314.log number = 314 , length = 317, 303=db-303.log number = 303 , length = 8433} | org.apache.activemq.store.kahadb.disk.journal.Journal | main
> and the stack trace is:
> Starting TestApp...
>  INFO | KahaDB is version 5
>  ERROR | Looking for key 275 but not found in fileMap: {305=db-305.log number = 305 , length = 8217, 304=db-304.log number = 304 , length = 8217, 307=db-307.log number = 307 , length = 8217, 306=db-306.log number = 306 , length = 8217, 309=db-309.log number = 309 , length = 8217, 308=db-308.log number = 308 , length = 8217, 311=db-311.log number = 311 , length = 8217, 310=db-310.log number = 310 , length = 8217, 313=db-313.log number = 313 , length = 8217, 312=db-312.log number = 312 , length = 8217, 314=db-314.log number = 314 , length = 317, 303=db-303.log number = 303 , length = 8433}
> Exception in thread "main" java.io.IOException: Could not locate data file KahaDB\db-275.log
>     at org.apache.activemq.store.kahadb.disk.journal.Journal.getDataFile(Journal.java:353)
>     at org.apache.activemq.store.kahadb.disk.journal.Journal.read(Journal.java:600)
>     at org.apache.activemq.store.kahadb.MessageDatabase.load(MessageDatabase.java:1014)
>     at org.apache.activemq.store.kahadb.MessageDatabase.recoverProducerAudit(MessageDatabase.java:687)
>     at org.apache.activemq.store.kahadb.MessageDatabase.recover(MessageDatabase.java:595)
>     at org.apache.activemq.store.kahadb.MessageDatabase.open(MessageDatabase.java:400)
>     at org.apache.activemq.store.kahadb.MessageDatabase.load(MessageDatabase.java:418)
>     at org.apache.activemq.store.kahadb.MessageDatabase.doStart(MessageDatabase.java:262)
>     at org.apache.activemq.store.kahadb.KahaDBStore.doStart(KahaDBStore.java:194)
>     at org.apache.activemq.util.ServiceSupport.start(ServiceSupport.java:55)
>     at org.apache.activemq.store.kahadb.KahaDBPersistenceAdapter.doStart(KahaDBPersistenceAdapter.java:215)
>     at org.apache.activemq.util.ServiceSupport.start(ServiceSupport.java:55)
>     at kahadbtest.TestApp.run(TestApp.java:29)
>     at kahadbtest.TestApp.main(TestApp.java:21)
> This was fairly hard to reproduce without unclean shutdown, but the attached log and broken KahaDB folder should help debug the problem.
> Also, I will attach the small test app that exercises the KahaDB APIs that I was using to cause the invalid state (I normally start and stop the app a few times until the problem appears at startup, at which point it will no longer start).
> Some initial debugging suggests it might be related to the way that message acks are stored via the metadata serialization and how that interacts with the GC timer, but I didn't see anything obvious.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (AMQ-5444) KahaDB bug that skips doing a sync on recoveryFile
Jesse Fugitt created AMQ-5444:
---------------------------------

             Summary: KahaDB bug that skips doing a sync on recoveryFile
                 Key: AMQ-5444
                 URL: https://issues.apache.org/jira/browse/AMQ-5444
             Project: ActiveMQ
          Issue Type: Bug
          Components: Message Store
            Reporter: Jesse Fugitt

There appears to be a bug in the KahaDB PageFile.java class when attempting to sync files to disk. If the enableDiskSyncs option is set to true, it looks like the code is intending to sync the recoveryFile and the writeFile. However, it accidentally syncs the writeFile twice and fails to sync the recoveryFile. In the method below, the if statement towards the bottom that checks the enableDiskSyncs boolean shows the problem:

    private void writeBatch() throws IOException {
        ...
        if (enableDiskSyncs) {
            // Sync to make sure recovery buffer writes land on disk..
            if (enableRecoveryFile) {
                writeFile.sync(); // This should not be writeFile.sync!!
            }
            writeFile.sync();
        }
        ...
    }

The code above should have a recoveryFile.sync() on the line with the comment.
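For reference, this is what the branch looks like with the fix described above applied; a sketch mirroring the shape of the quoted snippet rather than the full writeBatch method (enableDiskSyncs, enableRecoveryFile, recoveryFile and writeFile are the fields the report itself names):

    if (enableDiskSyncs) {
        // Sync the recovery buffer first so its writes land on disk...
        if (enableRecoveryFile) {
            recoveryFile.sync(); // fixed: was incorrectly writeFile.sync()
        }
        // ...then sync the main data file.
        writeFile.sync();
    }

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)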
[jira] [Created] (AMQ-5440) KahaDB error at startup "Looking for key N but not found in fileMap"
Jesse Fugitt created AMQ-5440:
---------------------------------

             Summary: KahaDB error at startup "Looking for key N but not found in fileMap"
                 Key: AMQ-5440
                 URL: https://issues.apache.org/jira/browse/AMQ-5440
             Project: ActiveMQ
          Issue Type: Bug
          Components: Message Store
    Affects Versions: 5.10.0
            Reporter: Jesse Fugitt
            Priority: Critical

After being shut down uncleanly, KahaDB can at times hit a startup error that causes the broker to fail to start and potentially causes messages to be re-assigned without being marked as redelivered. The log message at startup is:

2014-11-17 11:10:36,826 | ERROR | Looking for key 275 but not found in fileMap: {305=db-305.log number = 305 , length = 8217, 304=db-304.log number = 304 , length = 8217, 307=db-307.log number = 307 , length = 8217, 306=db-306.log number = 306 , length = 8217, 309=db-309.log number = 309 , length = 8217, 308=db-308.log number = 308 , length = 8217, 311=db-311.log number = 311 , length = 8217, 310=db-310.log number = 310 , length = 8217, 313=db-313.log number = 313 , length = 8217, 312=db-312.log number = 312 , length = 8217, 314=db-314.log number = 314 , length = 317, 303=db-303.log number = 303 , length = 8433} | org.apache.activemq.store.kahadb.disk.journal.Journal | main

and the stack trace is:

Starting TestApp...
 INFO | KahaDB is version 5
 ERROR | Looking for key 275 but not found in fileMap: {305=db-305.log number = 305 , length = 8217, 304=db-304.log number = 304 , length = 8217, 307=db-307.log number = 307 , length = 8217, 306=db-306.log number = 306 , length = 8217, 309=db-309.log number = 309 , length = 8217, 308=db-308.log number = 308 , length = 8217, 311=db-311.log number = 311 , length = 8217, 310=db-310.log number = 310 , length = 8217, 313=db-313.log number = 313 , length = 8217, 312=db-312.log number = 312 , length = 8217, 314=db-314.log number = 314 , length = 317, 303=db-303.log number = 303 , length = 8433}
Exception in thread "main" java.io.IOException: Could not locate data file KahaDB\db-275.log
    at org.apache.activemq.store.kahadb.disk.journal.Journal.getDataFile(Journal.java:353)
    at org.apache.activemq.store.kahadb.disk.journal.Journal.read(Journal.java:600)
    at org.apache.activemq.store.kahadb.MessageDatabase.load(MessageDatabase.java:1014)
    at org.apache.activemq.store.kahadb.MessageDatabase.recoverProducerAudit(MessageDatabase.java:687)
    at org.apache.activemq.store.kahadb.MessageDatabase.recover(MessageDatabase.java:595)
    at org.apache.activemq.store.kahadb.MessageDatabase.open(MessageDatabase.java:400)
    at org.apache.activemq.store.kahadb.MessageDatabase.load(MessageDatabase.java:418)
    at org.apache.activemq.store.kahadb.MessageDatabase.doStart(MessageDatabase.java:262)
    at org.apache.activemq.store.kahadb.KahaDBStore.doStart(KahaDBStore.java:194)
    at org.apache.activemq.util.ServiceSupport.start(ServiceSupport.java:55)
    at org.apache.activemq.store.kahadb.KahaDBPersistenceAdapter.doStart(KahaDBPersistenceAdapter.java:215)
    at org.apache.activemq.util.ServiceSupport.start(ServiceSupport.java:55)
    at kahadbtest.TestApp.run(TestApp.java:29)
    at kahadbtest.TestApp.main(TestApp.java:21)

This was fairly hard to reproduce without unclean shutdown, but the attached log and broken KahaDB folder should help debug the problem. Also, I will attach the small test app that exercises the KahaDB APIs that I was using to cause the invalid state (I normally start and stop the app a few times until the problem appears at startup, at which point it will no longer start).

Some initial debugging suggests it might be related to the way that message acks are stored via the metadata serialization and how that interacts with the GC timer, but I didn't see anything obvious.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (AMQ-5440) KahaDB error at startup "Looking for key N but not found in fileMap"
[ https://issues.apache.org/jira/browse/AMQ-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesse Fugitt updated AMQ-5440:
------------------------------
    Attachment: KahaDB.zip
                kahadbtest.log

Broker KahaDB folder that causes the startup error, and a TRACE-level log showing the GC records that caused it.

> KahaDB error at startup "Looking for key N but not found in fileMap"
> ---------------------------------------------------------------------
>
>                 Key: AMQ-5440
>                 URL: https://issues.apache.org/jira/browse/AMQ-5440
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.10.0
>            Reporter: Jesse Fugitt
>            Priority: Critical
>         Attachments: KahaDB.zip, kahadbtest.log
>
> After being shut down uncleanly, KahaDB can at times hit a startup error that causes the broker to fail to start and potentially causes messages to be re-assigned without being marked as redelivered. The log message at startup is:
> 2014-11-17 11:10:36,826 | ERROR | Looking for key 275 but not found in fileMap: {305=db-305.log number = 305 , length = 8217, 304=db-304.log number = 304 , length = 8217, 307=db-307.log number = 307 , length = 8217, 306=db-306.log number = 306 , length = 8217, 309=db-309.log number = 309 , length = 8217, 308=db-308.log number = 308 , length = 8217, 311=db-311.log number = 311 , length = 8217, 310=db-310.log number = 310 , length = 8217, 313=db-313.log number = 313 , length = 8217, 312=db-312.log number = 312 , length = 8217, 314=db-314.log number = 314 , length = 317, 303=db-303.log number = 303 , length = 8433} | org.apache.activemq.store.kahadb.disk.journal.Journal | main
> and the stack trace is:
> Starting TestApp...
>  INFO | KahaDB is version 5
>  ERROR | Looking for key 275 but not found in fileMap: {305=db-305.log number = 305 , length = 8217, 304=db-304.log number = 304 , length = 8217, 307=db-307.log number = 307 , length = 8217, 306=db-306.log number = 306 , length = 8217, 309=db-309.log number = 309 , length = 8217, 308=db-308.log number = 308 , length = 8217, 311=db-311.log number = 311 , length = 8217, 310=db-310.log number = 310 , length = 8217, 313=db-313.log number = 313 , length = 8217, 312=db-312.log number = 312 , length = 8217, 314=db-314.log number = 314 , length = 317, 303=db-303.log number = 303 , length = 8433}
> Exception in thread "main" java.io.IOException: Could not locate data file KahaDB\db-275.log
>     at org.apache.activemq.store.kahadb.disk.journal.Journal.getDataFile(Journal.java:353)
>     at org.apache.activemq.store.kahadb.disk.journal.Journal.read(Journal.java:600)
>     at org.apache.activemq.store.kahadb.MessageDatabase.load(MessageDatabase.java:1014)
>     at org.apache.activemq.store.kahadb.MessageDatabase.recoverProducerAudit(MessageDatabase.java:687)
>     at org.apache.activemq.store.kahadb.MessageDatabase.recover(MessageDatabase.java:595)
>     at org.apache.activemq.store.kahadb.MessageDatabase.open(MessageDatabase.java:400)
>     at org.apache.activemq.store.kahadb.MessageDatabase.load(MessageDatabase.java:418)
>     at org.apache.activemq.store.kahadb.MessageDatabase.doStart(MessageDatabase.java:262)
>     at org.apache.activemq.store.kahadb.KahaDBStore.doStart(KahaDBStore.java:194)
>     at org.apache.activemq.util.ServiceSupport.start(ServiceSupport.java:55)
>     at org.apache.activemq.store.kahadb.KahaDBPersistenceAdapter.doStart(KahaDBPersistenceAdapter.java:215)
>     at org.apache.activemq.util.ServiceSupport.start(ServiceSupport.java:55)
>     at kahadbtest.TestApp.run(TestApp.java:29)
>     at kahadbtest.TestApp.main(TestApp.java:21)
> This was fairly hard to reproduce without unclean shutdown, but the attached log and broken KahaDB folder should help debug the problem.
> Also, I will attach the small test app that exercises the KahaDB APIs that I was using to cause the invalid state (I normally start and stop the app a few times until the problem appears at startup, at which point it will no longer start).
> Some initial debugging suggests it might be related to the way that message acks are stored via the metadata serialization and how that interacts with the GC timer, but I didn't see anything obvious.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (AMQ-5440) KahaDB error at startup "Looking for key N but not found in fileMap"
[ https://issues.apache.org/jira/browse/AMQ-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesse Fugitt updated AMQ-5440:
------------------------------
    Attachment: TestApp.java

Test app used to hit the KahaDB error by calling many add/update/remove APIs while db-log files are being deleted.

> KahaDB error at startup "Looking for key N but not found in fileMap"
> ---------------------------------------------------------------------
>
>                 Key: AMQ-5440
>                 URL: https://issues.apache.org/jira/browse/AMQ-5440
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.10.0
>            Reporter: Jesse Fugitt
>            Priority: Critical
>         Attachments: KahaDB.zip, TestApp.java, kahadbtest.log
>
> After being shut down uncleanly, KahaDB can at times hit a startup error that causes the broker to fail to start and potentially causes messages to be re-assigned without being marked as redelivered. The log message at startup is:
> 2014-11-17 11:10:36,826 | ERROR | Looking for key 275 but not found in fileMap: {305=db-305.log number = 305 , length = 8217, 304=db-304.log number = 304 , length = 8217, 307=db-307.log number = 307 , length = 8217, 306=db-306.log number = 306 , length = 8217, 309=db-309.log number = 309 , length = 8217, 308=db-308.log number = 308 , length = 8217, 311=db-311.log number = 311 , length = 8217, 310=db-310.log number = 310 , length = 8217, 313=db-313.log number = 313 , length = 8217, 312=db-312.log number = 312 , length = 8217, 314=db-314.log number = 314 , length = 317, 303=db-303.log number = 303 , length = 8433} | org.apache.activemq.store.kahadb.disk.journal.Journal | main
> and the stack trace is:
> Starting TestApp...
>  INFO | KahaDB is version 5
>  ERROR | Looking for key 275 but not found in fileMap: {305=db-305.log number = 305 , length = 8217, 304=db-304.log number = 304 , length = 8217, 307=db-307.log number = 307 , length = 8217, 306=db-306.log number = 306 , length = 8217, 309=db-309.log number = 309 , length = 8217, 308=db-308.log number = 308 , length = 8217, 311=db-311.log number = 311 , length = 8217, 310=db-310.log number = 310 , length = 8217, 313=db-313.log number = 313 , length = 8217, 312=db-312.log number = 312 , length = 8217, 314=db-314.log number = 314 , length = 317, 303=db-303.log number = 303 , length = 8433}
> Exception in thread "main" java.io.IOException: Could not locate data file KahaDB\db-275.log
>     at org.apache.activemq.store.kahadb.disk.journal.Journal.getDataFile(Journal.java:353)
>     at org.apache.activemq.store.kahadb.disk.journal.Journal.read(Journal.java:600)
>     at org.apache.activemq.store.kahadb.MessageDatabase.load(MessageDatabase.java:1014)
>     at org.apache.activemq.store.kahadb.MessageDatabase.recoverProducerAudit(MessageDatabase.java:687)
>     at org.apache.activemq.store.kahadb.MessageDatabase.recover(MessageDatabase.java:595)
>     at org.apache.activemq.store.kahadb.MessageDatabase.open(MessageDatabase.java:400)
>     at org.apache.activemq.store.kahadb.MessageDatabase.load(MessageDatabase.java:418)
>     at org.apache.activemq.store.kahadb.MessageDatabase.doStart(MessageDatabase.java:262)
>     at org.apache.activemq.store.kahadb.KahaDBStore.doStart(KahaDBStore.java:194)
>     at org.apache.activemq.util.ServiceSupport.start(ServiceSupport.java:55)
>     at org.apache.activemq.store.kahadb.KahaDBPersistenceAdapter.doStart(KahaDBPersistenceAdapter.java:215)
>     at org.apache.activemq.util.ServiceSupport.start(ServiceSupport.java:55)
>     at kahadbtest.TestApp.run(TestApp.java:29)
>     at kahadbtest.TestApp.main(TestApp.java:21)
> This was fairly hard to reproduce without unclean shutdown, but the attached log and broken KahaDB folder should help debug the problem.
> Also, I will attach the small test app that exercises the KahaDB APIs that I was using to cause the invalid state (I normally start and stop the app a few times until the problem appears at startup, at which point it will no longer start).
> Some initial debugging suggests it might be related to the way that message acks are stored via the metadata serialization and how that interacts with the GC timer, but I didn't see anything obvious.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (AMQ-5422) Windows bat script doesn't work when run from a directory with spaces
Jesse Fugitt created AMQ-5422:
---------------------------------

             Summary: Windows bat script doesn't work when run from a directory with spaces
                 Key: AMQ-5422
                 URL: https://issues.apache.org/jira/browse/AMQ-5422
             Project: ActiveMQ
          Issue Type: Bug
          Components: Broker
    Affects Versions: 5.10.0
            Reporter: Jesse Fugitt

When you attempt to run ActiveMQ on Windows with the activemq.bat file from a directory with spaces (ex: C:\Program Files), you get a JVM-related error about the main class not being found. The problem is that one of the -D values in the ACTIVEMQ_OPTS line needs to have double quotes around it. Will attach a patch to fix the problem.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (AMQ-5422) Windows bat script doesn't work when run from a directory with spaces
[ https://issues.apache.org/jira/browse/AMQ-5422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesse Fugitt updated AMQ-5422:
------------------------------
    Attachment: AMQ5422.patch

Patch to fix the Windows bat problem.

> Windows bat script doesn't work when run from a directory with spaces
> ----------------------------------------------------------------------
>
>                 Key: AMQ-5422
>                 URL: https://issues.apache.org/jira/browse/AMQ-5422
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 5.10.0
>            Reporter: Jesse Fugitt
>         Attachments: AMQ5422.patch
>
> When you attempt to run ActiveMQ on Windows with the activemq.bat file from a directory with spaces (ex: C:\Program Files), you get a JVM-related error about the main class not being found. The problem is that one of the -D values in the ACTIVEMQ_OPTS line needs to have double quotes around it. Will attach a patch to fix the problem.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (AMQ-5347) persistJMSRedelivered flag doesn't work correctly when exceptions occur
[ https://issues.apache.org/jira/browse/AMQ-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14192307#comment-14192307 ]

Jesse Fugitt commented on AMQ-5347:
------------------------------------

Great, thanks. Killing the connection was the important thing, and the runtime exception looks like it accomplishes that. Originally, I looked at a way to sneak the exception through the interface since changing it would impact so many files/plugins, but ended up just submitting the patch so that the intent was clear. Glad you were able to get this in.

> persistJMSRedelivered flag doesn't work correctly when exceptions occur
> ------------------------------------------------------------------------
>
>                 Key: AMQ-5347
>                 URL: https://issues.apache.org/jira/browse/AMQ-5347
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 5.10.0
>            Reporter: Jesse Fugitt
>            Assignee: Gary Tully
>             Fix For: 5.11.0
>         Attachments: AMQ5347.patch, RedeliveryRestartWithExceptionTest.java
>
> The new flag in 5.10 that ensures the JMSRedelivered flag persists across broker restarts does not work correctly when an exception occurs while attempting to write the message update to disk before the restart. In that case, messages can be assigned to receivers, the broker can be restarted, and then the messages are re-assigned to receivers and do not include the JMSRedelivered flag as expected. I will attach a unit test and proposed fix to illustrate the problem. Also, here is additional information I had sent to the mailing list:
> When using the new option persistJMSRedelivered (to ensure the redelivered flag is set correctly on potentially duplicate messages that are re-dispatched by the broker even after a restart):
> <policyEntry queue=">" persistJMSRedelivered="true"/>
> there is still a case where a message can be re-sent and will not be marked as redelivered. I can open a JIRA and probably create a unit test, but it is pretty clear from the pasted code below where the exception is getting swallowed. Would the preferred fix be to update the broker interface and make preProcessDispatch throw an IOException, or would it be better to add a new field to the MessageDispatch class to indicate an exception occurred and leave the interface alone?
> The specific case when this can happen is when a MessageStore returns an exception during the updateMessage call, which then gets swallowed (and an ERROR logged) and still allows the message to be dispatched to the consumer. The exception seems like it should actually propagate out of the preProcessDispatch function in RegionBroker as shown below, but this would require changing the Broker interface and making the void preProcessDispatch function throw an IOException.
>
> //RegionBroker.java
> @Override
> public void preProcessDispatch(MessageDispatch messageDispatch) {
>     Message message = messageDispatch.getMessage();
>     if (message != null) {
>         long endTime = System.currentTimeMillis();
>         message.setBrokerOutTime(endTime);
>         if (getBrokerService().isEnableStatistics()) {
>             long totalTime = endTime - message.getBrokerInTime();
>             ((Destination) message.getRegionDestination()).getDestinationStatistics().getProcessTime().addTime(totalTime);
>         }
>         if (((BaseDestination) message.getRegionDestination()).isPersistJMSRedelivered()
>                 && !message.isRedelivered() && message.isPersistent()) {
>             final int originalValue = message.getRedeliveryCounter();
>             message.incrementRedeliveryCounter();
>             try {
>                 ((BaseDestination) message.getRegionDestination()).getMessageStore().updateMessage(message);
>             } catch (IOException error) {
>                 LOG.error("Failed to persist JMSRedeliveryFlag on {} in {}",
>                         message.getMessageId(), message.getDestination(), error);
>             } finally {
>                 message.setRedeliveryCounter(originalValue);
>             }
>         }
>     }
> }
>
> //TransportConnection.java
> protected void processDispatch(Command command) throws IOException {
>     MessageDispatch messageDispatch = (MessageDispatch) (command.isMessageDispatch() ? command : null);
>     try {
>         if (!stopping.get()) {
>             if (messageDispatch != null) {
>                 broker.preProcessDispatch(messageDispatch);
>             }
>             dispatch(command); // This code will dispatch the message whether or not the updateMessage function actually worked
>         }
>     ...
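The runtime-exception approach mentioned in the comment above can be sketched as a change to the catch block in preProcessDispatch. This is an illustration of the idea discussed in the thread, not the exact committed change:

    try {
        ((BaseDestination) message.getRegionDestination()).getMessageStore().updateMessage(message);
    } catch (IOException error) {
        LOG.error("Failed to persist JMSRedeliveryFlag on {} in {}",
                message.getMessageId(), message.getDestination(), error);
        // Rethrowing as an unchecked exception aborts the dispatch (and ultimately
        // kills the connection) without changing the Broker.preProcessDispatch signature.
        throw new RuntimeException("Failed to persist JMSRedelivered flag on " + message.getMessageId(), error);
    } finally {
        message.setRedeliveryCounter(originalValue);
    }

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)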
[jira] [Commented] (AMQ-5394) Incorrect handling of duplicate update message commands in KahaDB can lead to broker startup errors
[ https://issues.apache.org/jira/browse/AMQ-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173992#comment-14173992 ]

Jesse Fugitt commented on AMQ-5394:
------------------------------------

After further testing, it looks like the metadata.lastUpdate location in KahaDB is being set incorrectly in more places than just the function that handles update message (the add message and remove message functions also appear to have branches where this can happen). A more complete patch will be needed to catch all the cases where this could occur.

> Incorrect handling of duplicate update message commands in KahaDB can lead to broker startup errors
> ----------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-5394
>                 URL: https://issues.apache.org/jira/browse/AMQ-5394
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.10.0
>            Reporter: Jesse Fugitt
>         Attachments: AMQ5394.patch
>
> When using the new (in 5.10) persistJMSRedelivered option to make sure all duplicates are marked as redelivered (the activemq.xml config file used <policyEntry queue=">" persistJMSRedelivered="true"/>), we occasionally had a broker fail to start up with the following error:
> 2014-10-07 17:31:15,117 | ERROR | Looking for key 7 but not found in fileMap: {8=db-8.log number = 8 , length = 9132256} | org.apache.activemq.store.kahadb.disk.journal.Journal | main
> 2014-10-07 17:31:15,117 | ERROR | Failed to start Apache ActiveMQ ([broker0, null], java.io.IOException: Could not locate data file /local/temp/apache-activemq-5.10.0/data/kahadb/db-7.log) | org.apache.activemq.broker.BrokerService | main
> The root cause seems to be when KahaDB processes a duplicate update message command, or if it processes an update message command after the message has been removed from KahaDB. The code in KahaDB logs a warning when this occurs from the following else statement, then updates the metadata location and exits the function, as shown below:
>
> ...
> } else {
>     LOG.warn("Non existent message update attempt rejected. Destination: {}://{}, Message id: {}",
>             command.getDestination().getType(), command.getDestination().getName(),
>             command.getMessageId());
> }
> metadata.lastUpdate = location;
> ...
>
> It turns out that the metadata.lastUpdate = location; line should not run if we took the else branch above, so the simple fix is to move that line up into the if block so that it will not run after the log warning. Once we did that, we no longer see the broker startup errors. Note that this log warning does not always lead to a broker startup error, as it is also related to writing at the end of a transaction log file or the checkpoint timer interval, so it is not simple to reproduce; but we have not seen the startup error since the metadata.lastUpdate line was moved to the correct location. A patch will be provided to show the change.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Issue Comment Deleted] (AMQ-5394) Incorrect handling of duplicate update message commands in KahaDB can lead to broker startup errors
[ https://issues.apache.org/jira/browse/AMQ-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesse Fugitt updated AMQ-5394:
------------------------------
    Comment: was deleted

(was: Patch to fix the update message command in KahaDB that incorrectly sets the metadata lastUpdate location.)

> Incorrect handling of duplicate update message commands in KahaDB can lead to broker startup errors
> ----------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-5394
>                 URL: https://issues.apache.org/jira/browse/AMQ-5394
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.10.0
>            Reporter: Jesse Fugitt
>         Attachments: AMQ5394.patch
>
> When using the new (in 5.10) persistJMSRedelivered option to make sure all duplicates are marked as redelivered (the activemq.xml config file used <policyEntry queue=">" persistJMSRedelivered="true"/>), we occasionally had a broker fail to start up with the following error:
> 2014-10-07 17:31:15,117 | ERROR | Looking for key 7 but not found in fileMap: {8=db-8.log number = 8 , length = 9132256} | org.apache.activemq.store.kahadb.disk.journal.Journal | main
> 2014-10-07 17:31:15,117 | ERROR | Failed to start Apache ActiveMQ ([broker0, null], java.io.IOException: Could not locate data file /local/temp/apache-activemq-5.10.0/data/kahadb/db-7.log) | org.apache.activemq.broker.BrokerService | main
> The root cause seems to be when KahaDB processes a duplicate update message command, or if it processes an update message command after the message has been removed from KahaDB. The code in KahaDB logs a warning when this occurs from the following else statement, then updates the metadata location and exits the function, as shown below:
>
> ...
> } else {
>     LOG.warn("Non existent message update attempt rejected. Destination: {}://{}, Message id: {}",
>             command.getDestination().getType(), command.getDestination().getName(),
>             command.getMessageId());
> }
> metadata.lastUpdate = location;
> ...
>
> It turns out that the metadata.lastUpdate = location; line should not run if we took the else branch above, so the simple fix is to move that line up into the if block so that it will not run after the log warning. Once we did that, we no longer see the broker startup errors. Note that this log warning does not always lead to a broker startup error, as it is also related to writing at the end of a transaction log file or the checkpoint timer interval, so it is not simple to reproduce; but we have not seen the startup error since the metadata.lastUpdate line was moved to the correct location. A patch will be provided to show the change.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (AMQ-5394) Incorrect handling of duplicate update message commands in KahaDB can lead to broker startup errors
[ https://issues.apache.org/jira/browse/AMQ-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesse Fugitt updated AMQ-5394:
------------------------------
    Attachment: AMQ5394.patch

Updated patch to fix the add/update/remove message commands in KahaDB that incorrectly set the metadata lastUpdate location in several places.

> Incorrect handling of duplicate update message commands in KahaDB can lead to broker startup errors
> ----------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-5394
>                 URL: https://issues.apache.org/jira/browse/AMQ-5394
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.10.0
>            Reporter: Jesse Fugitt
>         Attachments: AMQ5394.patch, AMQ5394.patch
>
> When using the new (in 5.10) persistJMSRedelivered option to make sure all duplicates are marked as redelivered (the activemq.xml config file used <policyEntry queue=">" persistJMSRedelivered="true"/>), we occasionally had a broker fail to start up with the following error:
> 2014-10-07 17:31:15,117 | ERROR | Looking for key 7 but not found in fileMap: {8=db-8.log number = 8 , length = 9132256} | org.apache.activemq.store.kahadb.disk.journal.Journal | main
> 2014-10-07 17:31:15,117 | ERROR | Failed to start Apache ActiveMQ ([broker0, null], java.io.IOException: Could not locate data file /local/temp/apache-activemq-5.10.0/data/kahadb/db-7.log) | org.apache.activemq.broker.BrokerService | main
> The root cause seems to be when KahaDB processes a duplicate update message command, or if it processes an update message command after the message has been removed from KahaDB. The code in KahaDB logs a warning when this occurs from the following else statement, then updates the metadata location and exits the function, as shown below:
>
> ...
> } else {
>     LOG.warn("Non existent message update attempt rejected. Destination: {}://{}, Message id: {}",
>             command.getDestination().getType(), command.getDestination().getName(),
>             command.getMessageId());
> }
> metadata.lastUpdate = location;
> ...
>
> It turns out that the metadata.lastUpdate = location; line should not run if we took the else branch above, so the simple fix is to move that line up into the if block so that it will not run after the log warning. Once we did that, we no longer see the broker startup errors. Note that this log warning does not always lead to a broker startup error, as it is also related to writing at the end of a transaction log file or the checkpoint timer interval, so it is not simple to reproduce; but we have not seen the startup error since the metadata.lastUpdate line was moved to the correct location. A patch will be provided to show the change.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (AMQ-5394) Incorrect handling of duplicate update message commands in KahaDB can lead to broker startup errors
Jesse Fugitt created AMQ-5394:
---------------------------------

             Summary: Incorrect handling of duplicate update message commands in KahaDB can lead to broker startup errors
                 Key: AMQ-5394
                 URL: https://issues.apache.org/jira/browse/AMQ-5394
             Project: ActiveMQ
          Issue Type: Bug
          Components: Message Store
    Affects Versions: 5.10.0
            Reporter: Jesse Fugitt

When using the new (in 5.10) persistJMSRedelivered option to make sure all duplicates are marked as redelivered (the activemq.xml config file used <policyEntry queue=">" persistJMSRedelivered="true"/>), we occasionally had a broker fail to start up with the following error:

2014-10-07 17:31:15,117 | ERROR | Looking for key 7 but not found in fileMap: {8=db-8.log number = 8 , length = 9132256} | org.apache.activemq.store.kahadb.disk.journal.Journal | main
2014-10-07 17:31:15,117 | ERROR | Failed to start Apache ActiveMQ ([broker0, null], java.io.IOException: Could not locate data file /local/temp/apache-activemq-5.10.0/data/kahadb/db-7.log) | org.apache.activemq.broker.BrokerService | main

The root cause seems to be when KahaDB processes a duplicate update message command. The code in KahaDB logs a warning when this occurs from the following else statement, then updates the metadata location and exits the function, as shown below:

    ...
    } else {
        LOG.warn("Non existent message update attempt rejected. Destination: {}://{}, Message id: {}",
                command.getDestination().getType(), command.getDestination().getName(),
                command.getMessageId());
    }
    metadata.lastUpdate = location;
    ...

It turns out that the metadata.lastUpdate = location; line should not run if we took the else branch above, so the simple fix is to move that line up into the if block so that it will not run after the log warning. Once we did that, we no longer see the broker startup errors. Note that this log warning does not always lead to a broker startup error, as it is also related to writing at the end of a transaction log file or the checkpoint timer interval, so it is not simple to reproduce; but we have not seen the startup error since the metadata.lastUpdate line was moved to the correct location. A patch will be provided to show the change.
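A sketch of the fix described above, with the assignment moved inside the if block. The surrounding guard is assumed here, since the report only quotes the else branch:

    if (id != null) { // assumed guard: the update found an existing message
        // ... apply the update to the message indexes ...
        metadata.lastUpdate = location; // only advance lastUpdate when the update was applied
    } else {
        LOG.warn("Non existent message update attempt rejected. Destination: {}://{}, Message id: {}",
                command.getDestination().getType(), command.getDestination().getName(),
                command.getMessageId());
        // lastUpdate is intentionally not set on this branch
    }

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)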
[jira] [Updated] (AMQ-5394) Incorrect handling of duplicate update message commands in KahaDB can lead to broker startup errors
[ https://issues.apache.org/jira/browse/AMQ-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesse Fugitt updated AMQ-5394:
------------------------------
    Attachment: AMQ5394.patch

Patch to fix the update message command in KahaDB that incorrectly sets the metadata lastUpdate location.

> Incorrect handling of duplicate update message commands in KahaDB can lead to broker startup errors
> ----------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-5394
>                 URL: https://issues.apache.org/jira/browse/AMQ-5394
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.10.0
>            Reporter: Jesse Fugitt
>         Attachments: AMQ5394.patch
>
> When using the new (in 5.10) persistJMSRedelivered option to make sure all duplicates are marked as redelivered (the activemq.xml config file used <policyEntry queue=">" persistJMSRedelivered="true"/>), we occasionally had a broker fail to start up with the following error:
> 2014-10-07 17:31:15,117 | ERROR | Looking for key 7 but not found in fileMap: {8=db-8.log number = 8 , length = 9132256} | org.apache.activemq.store.kahadb.disk.journal.Journal | main
> 2014-10-07 17:31:15,117 | ERROR | Failed to start Apache ActiveMQ ([broker0, null], java.io.IOException: Could not locate data file /local/temp/apache-activemq-5.10.0/data/kahadb/db-7.log) | org.apache.activemq.broker.BrokerService | main
> The root cause seems to be when KahaDB processes a duplicate update message command. The code in KahaDB logs a warning when this occurs from the following else statement, then updates the metadata location and exits the function, as shown below:
>
> ...
> } else {
>     LOG.warn("Non existent message update attempt rejected. Destination: {}://{}, Message id: {}",
>             command.getDestination().getType(), command.getDestination().getName(),
>             command.getMessageId());
> }
> metadata.lastUpdate = location;
> ...
>
> It turns out that the metadata.lastUpdate = location; line should not run if we took the else branch above, so the simple fix is to move that line up into the if block so that it will not run after the log warning. Once we did that, we no longer see the broker startup errors. Note that this log warning does not always lead to a broker startup error, as it is also related to writing at the end of a transaction log file or the checkpoint timer interval, so it is not simple to reproduce; but we have not seen the startup error since the metadata.lastUpdate line was moved to the correct location. A patch will be provided to show the change.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (AMQ-5394) Incorrect handling of duplicate update message commands in KahaDB can lead to broker startup errors
[ https://issues.apache.org/jira/browse/AMQ-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesse Fugitt updated AMQ-5394:
------------------------------
    Description:

When using the new (in 5.10) persistJMSRedelivered option to make sure all duplicates are marked as redelivered (the activemq.xml config file used <policyEntry queue=">" persistJMSRedelivered="true"/>), we occasionally had a broker fail to start up with the following error:

2014-10-07 17:31:15,117 | ERROR | Looking for key 7 but not found in fileMap: {8=db-8.log number = 8 , length = 9132256} | org.apache.activemq.store.kahadb.disk.journal.Journal | main
2014-10-07 17:31:15,117 | ERROR | Failed to start Apache ActiveMQ ([broker0, null], java.io.IOException: Could not locate data file /local/temp/apache-activemq-5.10.0/data/kahadb/db-7.log) | org.apache.activemq.broker.BrokerService | main

The root cause seems to be when KahaDB processes a duplicate update message command, or if it processes an update message command after the message has been removed from KahaDB. The code in KahaDB logs a warning when this occurs from the following else statement, then updates the metadata location and exits the function, as shown below:

    ...
    } else {
        LOG.warn("Non existent message update attempt rejected. Destination: {}://{}, Message id: {}",
                command.getDestination().getType(), command.getDestination().getName(),
                command.getMessageId());
    }
    metadata.lastUpdate = location;
    ...

It turns out that the metadata.lastUpdate = location; line should not run if we took the else branch above, so the simple fix is to move that line up into the if block so that it will not run after the log warning. Once we did that, we no longer see the broker startup errors. Note that this log warning does not always lead to a broker startup error, as it is also related to writing at the end of a transaction log file or the checkpoint timer interval, so it is not simple to reproduce; but we have not seen the startup error since the metadata.lastUpdate line was moved to the correct location. A patch will be provided to show the change.

  was:

When using the new (in 5.10) persistJMSRedelivered option to make sure all duplicates are marked as redelivered (the activemq.xml config file used <policyEntry queue=">" persistJMSRedelivered="true"/>), we occasionally had a broker fail to start up with the following error:

2014-10-07 17:31:15,117 | ERROR | Looking for key 7 but not found in fileMap: {8=db-8.log number = 8 , length = 9132256} | org.apache.activemq.store.kahadb.disk.journal.Journal | main
2014-10-07 17:31:15,117 | ERROR | Failed to start Apache ActiveMQ ([broker0, null], java.io.IOException: Could not locate data file /local/temp/apache-activemq-5.10.0/data/kahadb/db-7.log) | org.apache.activemq.broker.BrokerService | main

The root cause seems to be when KahaDB processes a duplicate update message command. The code in KahaDB logs a warning when this occurs from the following else statement, then updates the metadata location and exits the function, as shown below:

    ...
    } else {
        LOG.warn("Non existent message update attempt rejected. Destination: {}://{}, Message id: {}",
                command.getDestination().getType(), command.getDestination().getName(),
                command.getMessageId());
    }
    metadata.lastUpdate = location;
    ...

It turns out that the metadata.lastUpdate = location; line should not run if we took the else branch above, so the simple fix is to move that line up into the if block so that it will not run after the log warning. Once we did that, we no longer see the broker startup errors. Note that this log warning does not always lead to a broker startup error, as it is also related to writing at the end of a transaction log file or the checkpoint timer interval, so it is not simple to reproduce; but we have not seen the startup error since the metadata.lastUpdate line was moved to the correct location. A patch will be provided to show the change.

> Incorrect handling of duplicate update message commands in KahaDB can lead to broker startup errors
> ----------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-5394
>                 URL: https://issues.apache.org/jira/browse/AMQ-5394
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.10.0
>            Reporter: Jesse Fugitt
>         Attachments: AMQ5394.patch
>
> When using the new (in 5.10) persistJMSRedelivered option to make sure all duplicates are marked as redelivered (the activemq.xml config file used <policyEntry queue=">" persistJMSRedelivered="true"/>), we occasionally had a broker fail to start up with the following error:
> 2014-10-07 17:31:15,117 | ERROR | Looking for key 7 but not found in fileMap: {8=db-8.log number = 8 , length = 9132256} | org.apache.activemq.store.kahadb.disk.journal.Journal | main

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (AMQ-5354) persistJMSRedelivered feature breaks the ability for KahaDB to compact its journal files
Jesse Fugitt created AMQ-5354:
---------------------------------

             Summary: persistJMSRedelivered feature breaks the ability for KahaDB to compact its journal files
                 Key: AMQ-5354
                 URL: https://issues.apache.org/jira/browse/AMQ-5354
             Project: ActiveMQ
          Issue Type: Bug
          Components: Message Store
    Affects Versions: 5.10.0
            Reporter: Jesse Fugitt
            Priority: Critical

While doing testing with persistJMSRedelivered enabled in the ActiveMQ config file (which is new in 5.10), it became obvious that the KahaDB transaction log files were never being compacted even though all messages had been consumed. This is very easy to reproduce using a standard config with the following policyEntry to enable the feature:

    <destinationPolicy>
      <policyMap>
        <policyEntries>
          <policyEntry queue=">" persistJMSRedelivered="true"/>

After waiting several minutes it was obvious the KahaDB transaction logs (~2500 files using 30GB of disk space) were not getting compacted, and a log with DEBUG enabled (attached) shows that the files are getting filtered out as gc candidates. Since the updateMessage function is essentially doing a second add message operation down in KahaDB, it appears that the reference to the original message is not being cleaned up from the locationIndex, preventing compaction of any of the journal files. I will attach a patch that fixes the issue, but this appears to be a pretty critical issue when using the persistJMSRedelivered feature.
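A hedged sketch of the kind of cleanup the description is pointing at: when an update acts as a second add, the old location recorded for the same message id would also need to be dropped from the locationIndex so the journal file holding the original copy can become a gc candidate. The index and field names follow the report's wording, but the exact surrounding KahaDB code is an assumption:

    // inside the (assumed) update-message path of the KahaDB index update:
    Long id = sd.messageIdIndex.get(tx, command.getMessageId());
    if (id != null) {
        MessageKeys previousKeys = sd.orderIndex.get(tx, id); // assumed lookup of the old entry
        if (previousKeys != null) {
            // remove the reference to the old copy of the message so the journal
            // file that holds it is no longer filtered out as a gc candidate
            sd.locationIndex.remove(tx, previousKeys.location);
        }
        sd.locationIndex.put(tx, location, id);
    }

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)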
[jira] [Updated] (AMQ-5354) persistJMSRedelivered feature breaks the ability for KahaDB to compact its journal files
[ https://issues.apache.org/jira/browse/AMQ-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesse Fugitt updated AMQ-5354:
------------------------------
    Attachment: activemq.log

ActiveMQ TRACE log that illustrates the problem with attempting to GC KahaDB transaction log files when using the persistJMSRedelivered feature.

> persistJMSRedelivered feature breaks the ability for KahaDB to compact its journal files
> -----------------------------------------------------------------------------------------
>
>                 Key: AMQ-5354
>                 URL: https://issues.apache.org/jira/browse/AMQ-5354
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.10.0
>            Reporter: Jesse Fugitt
>            Priority: Critical
>         Attachments: activemq.log
>
> While doing testing with persistJMSRedelivered enabled in the ActiveMQ config file (which is new in 5.10), it became obvious that the KahaDB transaction log files were never being compacted even though all messages had been consumed. This is very easy to reproduce using a standard config with the following policyEntry to enable the feature:
>
> <destinationPolicy>
>   <policyMap>
>     <policyEntries>
>       <policyEntry queue=">" persistJMSRedelivered="true"/>
>
> After waiting several minutes it was obvious the KahaDB transaction logs (~2500 files using 30GB of disk space) were not getting compacted, and a log with DEBUG enabled (attached) shows that the files are getting filtered out as gc candidates. Since the updateMessage function is essentially doing a second add message operation down in KahaDB, it appears that the reference to the original message is not being cleaned up from the locationIndex, preventing compaction of any of the journal files. I will attach a patch that fixes the issue, but this appears to be a pretty critical issue when using the persistJMSRedelivered feature.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (AMQ-5354) persistJMSRedelivered feature breaks the ability for KahaDB to compact its journal files
[ https://issues.apache.org/jira/browse/AMQ-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesse Fugitt updated AMQ-5354:
------------------------------
    Attachment: AMQ5354.patch

Patch for KahaDB that fixes the problem where persistJMSRedelivered is breaking KahaDB's ability to compact its transaction log files.

> persistJMSRedelivered feature breaks the ability for KahaDB to compact its journal files
> -----------------------------------------------------------------------------------------
>
>                 Key: AMQ-5354
>                 URL: https://issues.apache.org/jira/browse/AMQ-5354
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.10.0
>            Reporter: Jesse Fugitt
>            Priority: Critical
>         Attachments: AMQ5354.patch, activemq.log
>
> While doing testing with persistJMSRedelivered enabled in the ActiveMQ config file (which is new in 5.10), it became obvious that the KahaDB transaction log files were never being compacted even though all messages had been consumed. This is very easy to reproduce using a standard config with the following policyEntry to enable the feature:
>
> <destinationPolicy>
>   <policyMap>
>     <policyEntries>
>       <policyEntry queue=">" persistJMSRedelivered="true"/>
>
> After waiting several minutes it was obvious the KahaDB transaction logs (~2500 files using 30GB of disk space) were not getting compacted, and a log with DEBUG enabled (attached) shows that the files are getting filtered out as gc candidates. Since the updateMessage function is essentially doing a second add message operation down in KahaDB, it appears that the reference to the original message is not being cleaned up from the locationIndex, preventing compaction of any of the journal files. I will attach a patch that fixes the issue, but this appears to be a pretty critical issue when using the persistJMSRedelivered feature.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (AMQ-5347) persistJMSRedelivered flag doesn't work correctly when exceptions occur
[ https://issues.apache.org/jira/browse/AMQ-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesse Fugitt updated AMQ-5347:
------------------------------
    Patch Info: Patch Available

> persistJMSRedelivered flag doesn't work correctly when exceptions occur
> ------------------------------------------------------------------------
>
>                 Key: AMQ-5347
>                 URL: https://issues.apache.org/jira/browse/AMQ-5347
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 5.10.0
>            Reporter: Jesse Fugitt
>         Attachments: AMQ5347.patch, RedeliveryRestartWithExceptionTest.java
>
> The new flag in 5.10 that ensures the JMSRedelivered flag persists across broker restarts does not work correctly when an exception occurs while attempting to write the message update to disk before the restart. In that case, messages can be assigned to receivers, the broker can be restarted, and then the messages are re-assigned to receivers and do not include the JMSRedelivered flag as expected. I will attach a unit test and proposed fix to illustrate the problem. Also, here is additional information I had sent to the mailing list:
> When using the new option persistJMSRedelivered (to ensure the redelivered flag is set correctly on potentially duplicate messages that are re-dispatched by the broker even after a restart):
> <policyEntry queue=">" persistJMSRedelivered="true"/>
> there is still a case where a message can be re-sent and will not be marked as redelivered. I can open a JIRA and probably create a unit test, but it is pretty clear from the pasted code below where the exception is getting swallowed. Would the preferred fix be to update the broker interface and make preProcessDispatch throw an IOException, or would it be better to add a new field to the MessageDispatch class to indicate an exception occurred and leave the interface alone?
> The specific case when this can happen is when a MessageStore returns an exception during the updateMessage call, which then gets swallowed (and an ERROR logged) and still allows the message to be dispatched to the consumer. The exception seems like it should actually propagate out of the preProcessDispatch function in RegionBroker as shown below, but this would require changing the Broker interface and making the void preProcessDispatch function throw an IOException.
>
> //RegionBroker.java
> @Override
> public void preProcessDispatch(MessageDispatch messageDispatch) {
>     Message message = messageDispatch.getMessage();
>     if (message != null) {
>         long endTime = System.currentTimeMillis();
>         message.setBrokerOutTime(endTime);
>         if (getBrokerService().isEnableStatistics()) {
>             long totalTime = endTime - message.getBrokerInTime();
>             ((Destination) message.getRegionDestination()).getDestinationStatistics().getProcessTime().addTime(totalTime);
>         }
>         if (((BaseDestination) message.getRegionDestination()).isPersistJMSRedelivered()
>                 && !message.isRedelivered() && message.isPersistent()) {
>             final int originalValue = message.getRedeliveryCounter();
>             message.incrementRedeliveryCounter();
>             try {
>                 ((BaseDestination) message.getRegionDestination()).getMessageStore().updateMessage(message);
>             } catch (IOException error) {
>                 LOG.error("Failed to persist JMSRedeliveryFlag on {} in {}",
>                         message.getMessageId(), message.getDestination(), error);
>             } finally {
>                 message.setRedeliveryCounter(originalValue);
>             }
>         }
>     }
> }
>
> //TransportConnection.java
> protected void processDispatch(Command command) throws IOException {
>     MessageDispatch messageDispatch = (MessageDispatch) (command.isMessageDispatch() ? command : null);
>     try {
>         if (!stopping.get()) {
>             if (messageDispatch != null) {
>                 broker.preProcessDispatch(messageDispatch);
>             }
>             dispatch(command); // This code will dispatch the message whether or not the updateMessage function actually worked
>         }
>     ...

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (AMQ-5347) persistJMSRedelivered flag doesn't work correctly when exceptions occur
Jesse Fugitt created AMQ-5347:
---------------------------------

             Summary: persistJMSRedelivered flag doesn't work correctly when exceptions occur
                 Key: AMQ-5347
                 URL: https://issues.apache.org/jira/browse/AMQ-5347
             Project: ActiveMQ
          Issue Type: Bug
          Components: Broker
    Affects Versions: 5.10.0
            Reporter: Jesse Fugitt

The new flag in 5.10 that ensures the JMSRedelivered flag persists across broker restarts does not work correctly when an exception occurs while attempting to write the message update to disk before the restart. In that case, messages can be assigned to receivers, the broker can be restarted, and then the messages are re-assigned to receivers and do not include the JMSRedelivered flag as expected. I will attach a unit test and proposed fix to illustrate the problem. Also, here is additional information I had sent to the mailing list:

When using the new option persistJMSRedelivered (to ensure the redelivered flag is set correctly on potentially duplicate messages that are re-dispatched by the broker even after a restart):

    <policyEntry queue=">" persistJMSRedelivered="true"/>

there is still a case where a message can be re-sent and will not be marked as redelivered. I can open a JIRA and probably create a unit test, but it is pretty clear from the pasted code below where the exception is getting swallowed. Would the preferred fix be to update the broker interface and make preProcessDispatch throw an IOException, or would it be better to add a new field to the MessageDispatch class to indicate an exception occurred and leave the interface alone?

The specific case when this can happen is when a MessageStore returns an exception during the updateMessage call, which then gets swallowed (and an ERROR logged) and still allows the message to be dispatched to the consumer. The exception seems like it should actually propagate out of the preProcessDispatch function in RegionBroker as shown below, but this would require changing the Broker interface and making the void preProcessDispatch function throw an IOException.

    //RegionBroker.java
    @Override
    public void preProcessDispatch(MessageDispatch messageDispatch) {
        Message message = messageDispatch.getMessage();
        if (message != null) {
            long endTime = System.currentTimeMillis();
            message.setBrokerOutTime(endTime);
            if (getBrokerService().isEnableStatistics()) {
                long totalTime = endTime - message.getBrokerInTime();
                ((Destination) message.getRegionDestination()).getDestinationStatistics().getProcessTime().addTime(totalTime);
            }
            if (((BaseDestination) message.getRegionDestination()).isPersistJMSRedelivered()
                    && !message.isRedelivered() && message.isPersistent()) {
                final int originalValue = message.getRedeliveryCounter();
                message.incrementRedeliveryCounter();
                try {
                    ((BaseDestination) message.getRegionDestination()).getMessageStore().updateMessage(message);
                } catch (IOException error) {
                    LOG.error("Failed to persist JMSRedeliveryFlag on {} in {}",
                            message.getMessageId(), message.getDestination(), error);
                } finally {
                    message.setRedeliveryCounter(originalValue);
                }
            }
        }
    }

    //TransportConnection.java
    protected void processDispatch(Command command) throws IOException {
        MessageDispatch messageDispatch = (MessageDispatch) (command.isMessageDispatch() ? command : null);
        try {
            if (!stopping.get()) {
                if (messageDispatch != null) {
                    broker.preProcessDispatch(messageDispatch);
                }
                dispatch(command); // This code will dispatch the message whether or not the updateMessage function actually worked
            }
        ...

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (AMQ-5347) persistJMSRedelivered flag doesn't work correctly when exceptions occur
[ https://issues.apache.org/jira/browse/AMQ-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesse Fugitt updated AMQ-5347:
------------------------------
Attachment: RedeliveryRestartWithExceptionTest.java
            AMQ5347.patch

Attached a proposed patch against the latest code on GitHub for the persistJMSRedelivered issue. Also attached a new unit test that passes when run from activemq-unit-tests after the fix is applied.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
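For context, here is a rough standalone sketch of the restart scenario such a test exercises, assuming persistJMSRedelivered is enabled via a default PolicyEntry. It omits the fault injection on MessageStore.updateMessage that the real test needs; the attached RedeliveryRestartWithExceptionTest.java is the authoritative version:

import javax.jms.Connection;
import javax.jms.Message;
import javax.jms.Queue;
import javax.jms.Session;
import org.apache.activemq.ActiveMQConnectionFactory;
import org.apache.activemq.broker.BrokerService;
import org.apache.activemq.broker.region.policy.PolicyEntry;
import org.apache.activemq.broker.region.policy.PolicyMap;

public class RedeliveryRestartSketch {

    public static void main(String[] args) throws Exception {
        // Pass 1: deliver a persistent message but never acknowledge it
        BrokerService broker = startBroker();
        ActiveMQConnectionFactory cf = new ActiveMQConnectionFactory("tcp://localhost:61616");
        Connection conn = cf.createConnection();
        conn.start();
        Session session = conn.createSession(false, Session.CLIENT_ACKNOWLEDGE);
        Queue queue = session.createQueue("TEST.REDELIVERY");
        session.createProducer(queue).send(session.createTextMessage("payload"));
        session.createConsumer(queue).receive(5000); // received, not acked
        conn.close();
        broker.stop();
        broker.waitUntilStopped();

        // Pass 2: after a restart the redispatched copy should carry JMSRedelivered
        broker = startBroker();
        conn = cf.createConnection();
        conn.start();
        session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Message redispatched = session.createConsumer(session.createQueue("TEST.REDELIVERY")).receive(5000);
        System.out.println("JMSRedelivered after restart: " + redispatched.getJMSRedelivered());
        conn.close();
        broker.stop();
    }

    private static BrokerService startBroker() throws Exception {
        BrokerService broker = new BrokerService();
        broker.setPersistent(true);
        broker.setUseJmx(false);
        PolicyEntry entry = new PolicyEntry();
        entry.setPersistJMSRedelivered(true); // the 5.10 flag under test
        PolicyMap map = new PolicyMap();
        map.setDefaultEntry(entry);
        broker.setDestinationPolicy(map);
        broker.addConnector("tcp://localhost:61616");
        broker.start();
        broker.waitUntilStarted();
        return broker;
    }
}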
[jira] [Created] (AMQ-5278) Proton JMS transformer causes corrupted payload with redelivered messages
Jesse Fugitt created AMQ-5278:
------------------------------
Summary: Proton JMS transformer causes corrupted payload with redelivered messages
Key: AMQ-5278
URL: https://issues.apache.org/jira/browse/AMQ-5278
Project: ActiveMQ
Issue Type: Bug
Components: AMQP
Affects Versions: 5.10.0
Reporter: Jesse Fugitt

When using AMQP with transformer=jms, redelivered messages have a corrupted payload (a payload of all zeros). Creating this JIRA to link to the already-opened Proton JIRA (PROTON-624) and to provide a unit test for ActiveMQ. See https://issues.apache.org/jira/browse/PROTON-624 for details.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
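As a rough illustration of the reported symptom (not the unit test attached to this issue), the sketch below assumes the QPID JMS client (org.apache.qpid.jms.JmsConnectionFactory) connected to a broker AMQP connector configured with transformer=jms, and uses Session.recover() to force a redelivery so the two payloads can be compared; whether this exact path exercises the transformer bug is an assumption:

import java.util.Arrays;
import javax.jms.BytesMessage;
import javax.jms.Connection;
import javax.jms.MessageConsumer;
import javax.jms.Queue;
import javax.jms.Session;
import org.apache.qpid.jms.JmsConnectionFactory;

public class AmqpRedeliveryPayloadSketch {
    public static void main(String[] args) throws Exception {
        // Assumes a broker transportConnector like amqp://0.0.0.0:5672?transformer=jms
        Connection conn = new JmsConnectionFactory("amqp://localhost:5672").createConnection();
        conn.start();
        Session session = conn.createSession(false, Session.CLIENT_ACKNOWLEDGE);
        Queue queue = session.createQueue("TEST.AMQP");

        BytesMessage sent = session.createBytesMessage();
        sent.writeBytes(new byte[] {1, 2, 3, 4});
        session.createProducer(queue).send(sent);

        MessageConsumer consumer = session.createConsumer(queue);
        byte[] first = readBody((BytesMessage) consumer.receive(5000));
        session.recover(); // force redelivery without acknowledging
        byte[] second = readBody((BytesMessage) consumer.receive(5000));

        // With this bug, the redelivered copy arrives as all zeros
        System.out.println("first:  " + Arrays.toString(first));
        System.out.println("second: " + Arrays.toString(second));
        conn.close();
    }

    private static byte[] readBody(BytesMessage msg) throws Exception {
        byte[] body = new byte[(int) msg.getBodyLength()];
        msg.readBytes(body);
        return body;
    }
}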
[jira] [Created] (AMQ-5273) Problem handling connections from multiple AMQP clients in ActiveMQ
Jesse Fugitt created AMQ-5273:
------------------------------
Summary: Problem handling connections from multiple AMQP clients in ActiveMQ
Key: AMQ-5273
URL: https://issues.apache.org/jira/browse/AMQ-5273
Project: ActiveMQ
Issue Type: Bug
Components: AMQP
Affects Versions: 5.10.0
Reporter: Jesse Fugitt
Priority: Critical

When multiple AMQP clients try to connect to the broker at exactly the same time, the broker can end up in a state where it gets an AMQP parsing error during the connection handshake, after which all future AMQP connections fail until the broker is stopped. This was reproduced with C proton clients and QPID JMS clients, but hitting the timing window seemed to depend on the speed of the machine running the broker and on network speed. Turning on remote debugging in the ActiveMQ startup script made it happen much more frequently. The QPID JMS clients end up hung in the ConnectionEndpoint.open function, and the C proton clients return a SASL error.

Code analysis in the broker pointed to an incorrect use of static for the list data structure in the AMQPProtocolDiscriminator class (see the distilled illustration after this message). I am planning to attach a patch and a unit test that demonstrate the behavior, as well as some logs. The unit test fails with a timeout as all of the threads hang while attempting to make a connection, and it seems to require the following MAVEN_OPTS to be set to fail consistently:

export MAVEN_OPTS="-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5005"

When running the broker in console mode, the following is output for each failed client connection attempt:

Could not decode AMQP frame: hex: 00210201005341d000110002a309414e4f4e594d4f5553a000

--
This message was sent by Atlassian JIRA
(v6.2#6252)
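To make the suspected root cause concrete, here is a distilled illustration (not the actual AMQPProtocolDiscriminator source) of why per-connection handshake state must not live in a static field:

import java.util.ArrayList;
import java.util.List;

// Illustration only: a static mutable list is shared by every connection in
// the JVM, so two handshakes arriving together interleave (or corrupt) each
// other's buffered data -- the kind of state mixing that would break an AMQP
// handshake parser. Unsynchronized concurrent adds can even throw.
public class StaticStateSketch {
    static final List<String> sharedFrames = new ArrayList<String>(); // broken: JVM-wide
    final List<String> connectionFrames = new ArrayList<String>();    // fix: per connection

    public static void main(String[] args) throws InterruptedException {
        Runnable connA = new Runnable() {
            public void run() { for (int i = 0; i < 3; i++) sharedFrames.add("A" + i); }
        };
        Runnable connB = new Runnable() {
            public void run() { for (int i = 0; i < 3; i++) sharedFrames.add("B" + i); }
        };
        Thread a = new Thread(connA);
        Thread b = new Thread(connB);
        a.start(); b.start();
        a.join(); b.join();
        // Frames from both "connections" end up mixed in one shared buffer
        System.out.println(sharedFrames);
    }
}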
[jira] [Updated] (AMQ-5273) Problem handling connections from multiple AMQP clients in ActiveMQ
[ https://issues.apache.org/jira/browse/AMQ-5273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesse Fugitt updated AMQ-5273:
------------------------------
Attachment: org.apache.activemq.transport.amqp.AMQ5273Test-output.txt
            amqp_connection_race.patch
            AMQ5273Test.java

Attached a git diff and unit test. See the comments in the unit test source code for how this was run with Maven to reproduce.

--
This message was sent by Atlassian JIRA
(v6.2#6252)