Hi Quinn

I don't think you need to match my broker version exactly. I first discovered this issue on a standard ActiveMQ 5.9.0 broker, and I guess that every broker version is affected. I really don't think it is an ActiveMQ problem; according to Red Hat, it is a Spring JMS problem.

No, I never tried to use an embedded broker, probably because I was using remote brokers when I discovered the problem during master-slave failover tests. I will try to rewrite the test project to use an embedded broker that can be stopped and started as part of the test.

Yes, that's what I meant: the remote broker increases the probability of hitting the issue. If Red Hat's analysis is correct, it really is a timing issue. You can also increase the chance of hitting it by producing even more messages per second, which makes it more likely that a message falls into the problematic time slice where the consumer has committed but the producer has not.

Yes, that's right. I start the test, and when I see lots of console output I hit enter on the console where the broker's stop command is waiting. Then I wait about 5 to 10 seconds and start the broker again. The test reconnects and continues.

Regards
Stephan

On Fri, Feb 5, 2016 at 7:40 PM, Quinn Stevenson <qu...@pronoia-solutions.com> wrote:

> Stephan -
>
> I’ll get a broker running and try to match your version - I think I can get it from one of my customers who’s running Fuse 6.2.
>
> While I do that - have you considered trying to reproduce this using an embedded broker that the test could control? It would make it much easier to reproduce.
>
> I don’t think running the broker locally vs remotely should increase any probability of losing messages - we shouldn’t lose any as long as the configuration is correct. It may increase the probability of an issue, but we shouldn’t lose messages.
>
> Also, just to confirm - when you’re testing this you are stopping/starting the broker in the middle of the test, not killing and restarting the broker - correct?
>
> > On Feb 5, 2016, at 12:37 AM, Stephan Burkard <sburk...@gmail.com> wrote:
> >
> > Hi Quinn
> >
> > I just tested the POM changes you posted and the second run failed (without failover-URL).
> > I then tested with the failover-URL and the third attempt failed.
> >
> > The latter is no big surprise since I discovered the problem during failover tests in a master-slave config. I then reduced the setup to a single-broker environment and it was still there.
> >
> > My test broker is apache-activemq-5.11.0.redhat-620133, a patched Red Hat version of AMQ 5.11. Like you, I don't change the AMQ version number in the POM; I just use a newer broker than the library version. My broker runs on a different machine than the test. Perhaps this increases the probability of losing a message?
> >
> > Regards
> > Stephan
> >
> > On Thu, Feb 4, 2016 at 7:06 PM, Quinn Stevenson <qu...@pronoia-solutions.com> wrote:
> >
> >> I tested this with a 5.9.0 broker and I am seeing messages dropped with the TxText, but I still have to use the failover URL or the test just stops after the broker is restarted.
> >>
> >> I don’t have a 5.9.1 broker to test with, so I don’t know if that would help, but the next oldest broker I have is 5.10.1, and it seems to be working with that broker.
> >>
> >> NOTE: I’m not changing the activemq-version in the POM when I change the broker version - I’m just starting a different broker (locally) on the same port.
> >>
> >>> On Feb 4, 2016, at 10:41 AM, Quinn Stevenson <qu...@pronoia-solutions.com> wrote:
> >>>
> >>> I still can’t make either test drop messages between the input and the output queue with the POM changes I sent, but I did find one difference between what you’ve done and what I normally do that changes the output I’m seeing - I always use a failover URL
> >>>
> >>> <property name="brokerURL" value="failover:(tcp://localhost:61616?wireFormat.tightEncodingEnabled=false)"/>
> >>>
> >>> My test broker is v 5.10.1 as well - I’ll see if it makes any difference with 5.9.0
> >>>
> >>>> On Feb 4, 2016, at 9:52 AM, Quinn Stevenson <qu...@pronoia-solutions.com> wrote:
> >>>>
> >>>> It is strange - I’m trying to compare what you have in the “standard” version to what I did before. We tested our configs pretty heavily under all sorts of strange conditions to verify we weren’t losing messages, but we were using newer versions of Camel and ActiveMQ.
> >>>>
> >>>> So we’re on the same page - can you try your tests again with POM dependencies that look something like this?
> >>>>
> >>>> <properties>
> >>>>   <camel-version>2.12.5</camel-version>
> >>>>   <activemq-version>5.9.0</activemq-version>
> >>>> </properties>
> >>>>
> >>>> <dependencies>
> >>>>   <dependency>
> >>>>     <groupId>org.apache.activemq</groupId>
> >>>>     <artifactId>activemq-all</artifactId>
> >>>>     <version>${activemq-version}</version>
> >>>>   </dependency>
> >>>>   <dependency>
> >>>>     <groupId>org.apache.activemq</groupId>
> >>>>     <artifactId>activemq-pool</artifactId>
> >>>>     <version>${activemq-version}</version>
> >>>>   </dependency>
> >>>>
> >>>>   <dependency>
> >>>>     <groupId>org.apache.camel</groupId>
> >>>>     <artifactId>camel-spring</artifactId>
> >>>>     <version>${camel-version}</version>
> >>>>   </dependency>
> >>>>   <dependency>
> >>>>     <groupId>org.apache.camel</groupId>
> >>>>     <artifactId>camel-jms</artifactId>
> >>>>     <version>${camel-version}</version>
> >>>>   </dependency>
> >>>>
> >>>>   <dependency>
> >>>>     <groupId>org.apache.camel</groupId>
> >>>>     <artifactId>camel-test-spring</artifactId>
> >>>>     <version>${camel-version}</version>
> >>>>     <scope>test</scope>
> >>>>   </dependency>
> >>>>
> >>>>   <dependency>
> >>>>     <groupId>commons-collections</groupId>
> >>>>     <artifactId>commons-collections</artifactId>
> >>>>     <version>3.2.1</version>
> >>>>     <scope>test</scope>
> >>>>   </dependency>
> >>>>   <dependency>
> >>>>     <groupId>org.hamcrest</groupId>
> >>>>     <artifactId>hamcrest-integration</artifactId>
> >>>>     <version>1.3</version>
> >>>>     <scope>test</scope>
> >>>>   </dependency>
> >>>>
> >>>> </dependencies>
> >>>>
> >>>>> On Feb 4, 2016, at 9:49 AM, Stephan Burkard <sburk...@gmail.com> wrote:
> >>>>>
> >>>>> Hi Quinn
> >>>>>
> >>>>> The "standard" version is the big mystery.
> >>>>> As I stated in my first post, a Red Hat engineer analysed a similar project (with less book-keeping and logging stuff) and his conclusion was that as soon as a transaction manager is explicitly defined, Spring JMS Template (that is used by Camel under the hood) creates two of them, whether by bug, by accident or just by strange behaviour.
> >>>>>
> >>>>> This conclusion was quite surprising since it meant that all our Camel-JMS projects are theoretically suffering from message loss.
> >>>>>
> >>>>> The "no-tx" version should definitely be OK, see also CAMEL-5055 for the "lazyCreateTransactionManager" flag. The JMS transaction manager may not be defined but it creates one implicitly because of "transacted = true".
> >>>>>
> >>>>> The two "flaws" you mentioned are perhaps an issue. It would be somehow calming if it is my project that has a flaw.
> >>>>>
> >>>>> Regards
> >>>>> Stephan
> >>>>>
> >>>>> On Thu, Feb 4, 2016 at 4:44 PM, Quinn Stevenson <qu...@pronoia-solutions.com> wrote:
> >>>>>
> >>>>>> I’m still going through the project, but the first couple of things that jump out at me are you have two Spring versions - the one you explicitly put in your POM (3.2.8.RELEASE) and the one pulled in by camel-spring (3.2.11.RELEASE). Also, camel-spring should be included in the POM since you’re using Spring routes. I’m not sure if that’s enough to cause issues or not.
> >>>>>>
> >>>>>> I believe what’s going on with the “no-tx” version is you’re actually using JMS transactions since you still have transacted set to true in the JmsConfiguration.
> >>>>>>
> >>>>>> I’m not sure what’s going on with the “standard” version - it looks similar to some XA stuff I’ve set up before (because I had multiple brokers involved) except I had to use XA Connection Factories.
> >>>>>>
> >>>>>>> On Feb 3, 2016, at 3:12 PM, Stephan Burkard <sburk...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> Yes, same broker. There is only one ActiveMQ connection config in the project.
> >>>>>>>
> >>>>>>> On Wed, Feb 3, 2016 at 8:00 PM, Quinn Stevenson <qu...@pronoia-solutions.com> wrote:
> >>>>>>>
> >>>>>>>> Are both the source and destination queues hosted by the same ActiveMQ broker?
> >>>>>>>>
> >>>>>>>>> On Feb 3, 2016, at 8:21 AM, Stephan Burkard <sburk...@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>> Hi
> >>>>>>>>>
> >>>>>>>>> I have built a small Maven project (attached) to demonstrate a JMS transaction problem in Camel routes under certain load conditions. In fact I am losing messages between two queues.
> >>>>>>>>>
> >>>>>>>>> The project contains two different flavours of the same test. One of them suffers from the problem, the other (according to my tests) does not.
> >>>>>>>>>
> >>>>>>>>> *** What does the testcase do?
> >>>>>>>>> 1. Produces 1000 messages (100/s) and sends them to an "input" queue.
> >>>>>>>>> 2. Sends the messages from the "input" queue to an "output" queue.
> >>>>>>>>> 3. Finally consumes the messages from the "output" queue to count them.
> >>>>>>>>>
> >>>>>>>>> *** What is the difference between the two test flavours?
> >>>>>>>>> - There is a "standard" flavour that suffers from the problem
> >>>>>>>>> - And there is a "noTxManager" flavour that seems to not have the problem
> >>>>>>>>> - The "standard" flavour is kind of a well known Camel/ActiveMQ configuration
> >>>>>>>>>   - with a Spring transaction manager
> >>>>>>>>>   - with a Spring transaction policy
> >>>>>>>>>   - with a "transacted" flag in Camel routes
> >>>>>>>>> - The "noTxManager" flavour is a "simple" configuration
> >>>>>>>>>   - no Spring transaction manager
> >>>>>>>>>   - no Spring transaction policy
> >>>>>>>>>   - no "transacted" flag in Camel routes
> >>>>>>>>>   - BUT: "lazyCreateTransactionManager" = false (so routes are transacted too)
> >>>>>>>>>
> >>>>>>>>> *** How to run the testcases?
> >>>>>>>>> 1. Replace "[yourBrokerHost]" with the hostname of your ActiveMQ broker
> >>>>>>>>> 2. Run the testcase as JUnit test
> >>>>>>>>> 3. When you see lots of console messages that messages are sent, stop your ActiveMQ broker (do not kill -9 it, just shut it down normally)
> >>>>>>>>> 4. Exceptions are thrown on the console output
> >>>>>>>>> 5. After some seconds start your broker again
> >>>>>>>>> 6. The test finishes normally and after some seconds dumps the book-keeping on the console
> >>>>>>>>>
> >>>>>>>>> *** How to interpret the results?
> >>>>>>>>> - When the test is successful, no message is lost. You can run the test without broker shutdown/startup and it will obviously always be successful.
> >>>>>>>>> - When the test fails, one or more messages are lost between queue "input" and "output". In my tests I was not able to run the "standard" flavour three times in a row successfully. About every second run failed.
> >>>>>>>>> In contrast, the "noTxManager" flavour never failed in my tests.
> >>>>>>>>>
> >>>>>>>>> The book-keeping for a failed test looks like the following. In this example, message number 281 arrived at the input queue but not at the output queue, so it is lost.
> >>>>>>>>>
> >>>>>>>>> Messages created by Client: 1000
> >>>>>>>>> Client Exceptions during send: 0 []
> >>>>>>>>>
> >>>>>>>>> Messages received at input queue: 993
> >>>>>>>>> Missing Messages at input queue: 7 [282,283,284,285,286,287,288]
> >>>>>>>>> Duplicate Messages at input queue: 0 []
> >>>>>>>>>
> >>>>>>>>> Messages received at output queue: 992
> >>>>>>>>> Missing Messages at output queue: 8 [281,282,283,284,285,286,287,288]
> >>>>>>>>> Duplicate Messages at output queue: 0 []
> >>>>>>>>>
> >>>>>>>>> Lost Messages between Queues: 1 [281]
> >>>>>>>>>
> >>>>>>>>> *** What is the problem?
> >>>>>>>>> A Red Hat engineer tracked the problem down to a Spring JMS template behaviour that is kind of strange. If a Spring transaction manager is defined in the config, it will end up with two of them. This opens a small time window in which messages can get lost, and it only shows up under a certain load.
> >>>>>>>>>
> >>>>>>>>> *** So, what is my question?
> >>>>>>>>> - Does this really mean that it is unsafe to use the "standard" flavour of configuration?
> >>>>>>>>> - Is there another config with TxManager etc. that works correctly?
> >>>>>>>>> - What are the limits of the "noTxManager" config? When is it not sufficient?
> >>>>>>>>>
> >>>>>>>>> Regards
> >>>>>>>>> Stephan
> >>>>>>>>>
> >>>>>>>>> <CamelAmqTxTest.zip>
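
For readers following the book-keeping in the original post: the "Lost Messages between Queues" figure can be derived with plain set differences over message IDs. The following is only a minimal sketch in Java and is not taken from the attached test project; the class name BookKeepingSketch and the helpers range and missing are hypothetical, and the IDs are the ones from the example output quoted above.

```java
import java.util.Set;
import java.util.TreeSet;

public class BookKeepingSketch {

    // All message IDs from 'from' to 'to' inclusive (hypothetical helper).
    static Set<Integer> range(int from, int to) {
        Set<Integer> ids = new TreeSet<>();
        for (int id = from; id <= to; id++) {
            ids.add(id);
        }
        return ids;
    }

    // IDs that are in 'expected' but not in 'received', in ascending order.
    static Set<Integer> missing(Set<Integer> expected, Set<Integer> received) {
        Set<Integer> result = new TreeSet<>(expected);
        result.removeAll(received);
        return result;
    }

    public static void main(String[] args) {
        // The client created messages 1..1000.
        Set<Integer> created = range(1, 1000);

        // Values from the quoted example: 282..288 never reached the
        // "input" queue, 281..288 never reached the "output" queue.
        Set<Integer> atInput = missing(created, range(282, 288));
        Set<Integer> atOutput = missing(created, range(281, 288));

        Set<Integer> missingAtInput = missing(created, atInput);
        Set<Integer> missingAtOutput = missing(created, atOutput);

        // A message counts as lost between the queues if it arrived at
        // "input" but never arrived at "output".
        Set<Integer> lost = missing(atInput, atOutput);

        System.out.println("Missing at input queue: " + missingAtInput.size() + " " + missingAtInput);
        System.out.println("Missing at output queue: " + missingAtOutput.size() + " " + missingAtOutput);
        System.out.println("Lost between queues: " + lost.size() + " " + lost);
    }
}
```

Messages 282 to 288 are missing at both queues, so they were never accepted by the broker in the first place and are not counted as lost in transit. Only 281 reached "input" without reaching "output", which matches the single lost message reported in the example.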