Re: Transaction problem with Camel, ActiveMQ and Spring JMS

Stephan Burkard Sat, 06 Feb 2016 13:32:45 -0800

Hi Quinn

I don't think you need to match my broker version exactly. I first
discovered this issue on ActiveMQ 5.9.0 standard edition, and I guess that
every broker version suffers from it. I really don't think it is an
ActiveMQ problem. According to Red Hat it is a Spring JMS problem.


No, I never tried to use an embedded broker. Probably because I used remote
brokers when I discovered the problem during Master-Slave failover tests. I
will try to rewrite the test project to use an embedded broker that can be
stopped and started as part of the test.

Yes, that's what I meant: the remote broker increases the probability of
showing the issue. If Red Hat's analysis is correct, it is really a timing
issue. You can also increase the chance of hitting it by producing even
more messages per second. That increases the probability that a message
falls into the problematic time slice where the consumer has committed but
the producer has not.
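
The time slice can be illustrated with a toy model. The sketch below is
plain Java with no JMS and no broker involved; the class name
CommitWindowSketch, the method lostMessages and the parameter failAt are
hypothetical names made up for this illustration. It assumes, as in the
analysis mentioned above, that the consumer side and the producer side are
committed separately (consumer first), and compares that with a single
commit covering both.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: no JMS, no broker. All names here are hypothetical. */
final class CommitWindowSketch {

    /**
     * Moves messages 1..total from an "input" list to an "output" list.
     * The simulated broker stops once, while message failAt is in flight
     * (failAt <= 0 means it never stops).
     *
     * separateCommits = true : receive and send are committed independently,
     *                          consumer first (the assumed problem case).
     * separateCommits = false: one commit covers both receive and send.
     *
     * Returns the ids that left "input" but never reached "output".
     */
    static List<Integer> lostMessages(int total, int failAt, boolean separateCommits) {
        List<Integer> removedFromInput = new ArrayList<>();
        List<Integer> output = new ArrayList<>();

        for (int id = 1; id <= total; id++) {
            boolean interrupted = (id == failAt);
            if (separateCommits) {
                // Consumer commit: the message is gone from "input" for good.
                removedFromInput.add(id);
                if (interrupted) {
                    // Broker stops inside the window: the send is never committed.
                    continue;
                }
                // Producer commit: the message arrives in "output".
                output.add(id);
            } else {
                // One commit covers receive and send. If the broker stops before it,
                // both roll back and the message is redelivered after the restart,
                // so the two lines below stand in for that successful retry.
                removedFromInput.add(id);
                output.add(id);
            }
        }

        List<Integer> lost = new ArrayList<>(removedFromInput);
        lost.removeAll(output);
        return lost;
    }

    public static void main(String[] args) {
        System.out.println("separate commits: " + lostMessages(1000, 281, true));
        System.out.println("single commit:    " + lostMessages(1000, 281, false));
    }
}
```

With the stop falling between the two commits of one message, only that
message is reported as lost; with one commit covering both sides nothing is
lost. This only models the window itself, it does not reproduce what
Spring JMS actually does.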

Yes, that's right. I start the test and when I see lots of console output I
hit enter on the console where the broker's stop command is waiting.
Then I wait about 5 to 10 seconds and start the broker again. The test
reconnects and continues.

Regards
Stephan





On Fri, Feb 5, 2016 at 7:40 PM, Quinn Stevenson <qu...@pronoia-solutions.com> wrote:

> Stephan -
>
> I’ll get a broker running and try to match your version - I think I can
> get it from one of my customers who’s running Fuse 6.2.
>
> While I do that - have you considered trying to reproduce this using an
> embedded broker that the test could control?  It would make it much easier
> to reproduce.
>
> I don’t think running the broker locally vs remotely should increase the
> probability of losing messages - we shouldn’t lose any as long as the
> configuration is correct.  It may increase the probability of an issue, but
> we shouldn’t lose messages.
>
> Also, just to confirm - when you’re testing this you are stopping/starting
> the broker in the middle of the test, not killing and restarting the broker
> - correct?
>
>
> > On Feb 5, 2016, at 12:37 AM, Stephan Burkard <sburk...@gmail.com> wrote:
> >
> > Hi Quinn
> >
> > I just tested the POM changes you posted and the second run failed
> > (without failover-URL). I then tested with the failover-URL and the
> > third attempt failed.
> >
> > The latter is no big surprise since I discovered the problem during
> > failover tests in a master-slave-config. I then reduced the setup to a
> > single broker environment and it was still there.
> >
> > My test broker is apache-activemq-5.11.0.redhat-620133, a patched Red Hat
> > version of AMQ 5.11. Like you, I don't change the AMQ version number in
> > the POM; I just use a newer broker than the library version. My broker
> > runs on a different machine than the test. Perhaps this increases the
> > probability of losing a message?
> >
> > Regards
> > Stephan
> >
> >
> >
> >
> > On Thu, Feb 4, 2016 at 7:06 PM, Quinn Stevenson <qu...@pronoia-solutions.com> wrote:
> >
> >> I tested this with a 5.9.0 broker and I am seeing messages dropped with
> >> the TxText, but I still have to use the failover URL or the test just stops
> >> after the broker is restarted.
> >>
> >> I don’t have a 5.9.1 broker to test with, so I don’t know if that would
> >> help, but the next oldest broker I have is 5.10.1, and it seems to be
> >> working with that broker.
> >>
> >> NOTE:  I’m not changing the activemq-version in the POM when I change the
> >> broker version - I’m just starting a different broker (locally) on the same
> >> port.
> >>
> >>
> >>> On Feb 4, 2016, at 10:41 AM, Quinn Stevenson <qu...@pronoia-solutions.com> wrote:
> >>>
> >>> I still can’t make either test drop messages between the input and the
> >>> output queue with the POM changes I sent, but I did find one difference
> >>> between what you’ve done and what I normally do that changes the output I’m
> >>> seeing - I always use a failover URL
> >>>
> >>> <property name="brokerURL"
> >>>           value="failover:(tcp://localhost:61616?wireFormat.tightEncodingEnabled=false)"/>
> >>>
> >>> My test broker is v 5.10.1 as well - I’ll see if it makes any difference
> >>> with 5.9.0
> >>>
> >>>
> >>>
> >>>> On Feb 4, 2016, at 9:52 AM, Quinn Stevenson <qu...@pronoia-solutions.com> wrote:
> >>>>
> >>>> It is strange - I’m trying to compare what you have in the “standard”
> >>>> version to what I did before.  We tested our configs pretty heavily under
> >>>> all sorts of strange conditions to verify we weren’t losing messages, but
> >>>> we were using newer versions of Camel and ActiveMQ.
> >>>>
> >>>> So we’re on the same page - can you try your tests again with POM
> >>>> dependencies that look something like this?
> >>>>
> >>>> <properties>
> >>>>    <camel-version>2.12.5</camel-version>
> >>>>    <activemq-version>5.9.0</activemq-version>
> >>>> </properties>
> >>>>
> >>>> <dependencies>
> >>>>    <dependency>
> >>>>        <groupId>org.apache.activemq</groupId>
> >>>>        <artifactId>activemq-all</artifactId>
> >>>>        <version>${activemq-version}</version>
> >>>>    </dependency>
> >>>>    <dependency>
> >>>>        <groupId>org.apache.activemq</groupId>
> >>>>        <artifactId>activemq-pool</artifactId>
> >>>>        <version>${activemq-version}</version>
> >>>>    </dependency>
> >>>>
> >>>>    <dependency>
> >>>>        <groupId>org.apache.camel</groupId>
> >>>>        <artifactId>camel-spring</artifactId>
> >>>>        <version>${camel-version}</version>
> >>>>    </dependency>
> >>>>    <dependency>
> >>>>        <groupId>org.apache.camel</groupId>
> >>>>        <artifactId>camel-jms</artifactId>
> >>>>        <version>${camel-version}</version>
> >>>>    </dependency>
> >>>>
> >>>>    <dependency>
> >>>>        <groupId>org.apache.camel</groupId>
> >>>>        <artifactId>camel-test-spring</artifactId>
> >>>>        <version>${camel-version}</version>
> >>>>        <scope>test</scope>
> >>>>    </dependency>
> >>>>
> >>>>    <dependency>
> >>>>        <groupId>commons-collections</groupId>
> >>>>        <artifactId>commons-collections</artifactId>
> >>>>        <version>3.2.1</version>
> >>>>        <scope>test</scope>
> >>>>    </dependency>
> >>>>    <dependency>
> >>>>        <groupId>org.hamcrest</groupId>
> >>>>        <artifactId>hamcrest-integration</artifactId>
> >>>>        <version>1.3</version>
> >>>>        <scope>test</scope>
> >>>>    </dependency>
> >>>>
> >>>> </dependencies>
> >>>>
> >>>>
> >>>>
> >>>>> On Feb 4, 2016, at 9:49 AM, Stephan Burkard <sburk...@gmail.com> wrote:
> >>>>>
> >>>>> Hi Quinn
> >>>>>
> >>>>> The "standard" version is the big mystery. As I stated in my first post, a
> >>>>> Red Hat engineer analysed a similar project (with less book-keeping and
> >>>>> logging stuff) and his conclusion was that as soon as a transaction manager
> >>>>> is explicitly defined, Spring JMS Template (that is used by Camel under the
> >>>>> hood) creates two of them by bug, by accident or just by strange behaviour.
> >>>>>
> >>>>> This conclusion was quite surprising since it meant that all our Camel-JMS
> >>>>> projects are theoretically suffering from message loss.
> >>>>>
> >>>>> The "no-tx" version should definitely be OK, see also CAMEL-5055 for the
> >>>>> "lazyCreateTransactionManager" flag. The JMS transaction manager may not be
> >>>>> defined but it creates one implicitly because of "transacted = true".
> >>>>>
> >>>>> The two "flaws" you mentioned are perhaps an issue. It would be somewhat
> >>>>> reassuring if it is my project that has a flaw.
> >>>>>
> >>>>> Regards
> >>>>> Stephan
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Thu, Feb 4, 2016 at 4:44 PM, Quinn Stevenson <qu...@pronoia-solutions.com> wrote:
> >>>>>
> >>>>>> I’m still going through the project, but the first couple of things that
> >>>>>> jump out at me are you have two Spring versions - the one you explicitly
> >>>>>> put in your POM (3.2.8.RELEASE) and the one pulled in by camel-spring
> >>>>>> (3.2.11.RELEASE).  Also, camel-spring should be included in the POM since
> >>>>>> you’re using Spring routes.  I’m not sure if that’s enough to cause issues
> >>>>>> or not.
> >>>>>>
> >>>>>> I believe what’s going on with the “no-tx” version is you’re actually
> >>>>>> using JMS transactions since you still have transacted set to true in the
> >>>>>> JmsConfiguration.
> >>>>>>
> >>>>>> I’m not sure what’s going on with the “standard” version - it looks
> >>>>>> similar to some XA stuff I’ve set up before (because I had multiple brokers
> >>>>>> involved) except I had to use XA Connection Factories.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> On Feb 3, 2016, at 3:12 PM, Stephan Burkard <sburk...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> Yes, same broker. There is only one ActiveMQ connection config in the
> >>>>>>> project.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Wed, Feb 3, 2016 at 8:00 PM, Quinn Stevenson <qu...@pronoia-solutions.com> wrote:
> >>>>>>>
> >>>>>>>> Are both the source and destination queues hosted by the same ActiveMQ
> >>>>>>>> broker?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> On Feb 3, 2016, at 8:21 AM, Stephan Burkard <sburk...@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>> Hi
> >>>>>>>>>
> >>>>>>>>> I have built a small Maven project (attached) to demonstrate a JMS
> >>>>>>>>> transaction problem in Camel routes under certain load conditions. In
> >>>>>>>>> fact I am losing messages between two queues.
> >>>>>>>>>
> >>>>>>>>> The project contains two different flavours of the same test. One of
> >>>>>>>>> them suffers from the problem, the other (according to my tests) does not.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> *** What does the testcase do?
> >>>>>>>>> 1. Produces 1000 messages (100/s) and sends them to an "input" queue.
> >>>>>>>>> 2. Sends the messages from the "input" queue to an "output" queue.
> >>>>>>>>> 3. Finally consumes the messages from the "output" queue to count them.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> *** What is the difference between the two test flavours?
> >>>>>>>>> - There is a "standard" flavour that suffers from the problem
> >>>>>>>>> - And there is a "noTxManager" flavour that seems to not have the problem
> >>>>>>>>> - The "standard" flavour is kind of a well known Camel/ActiveMQ configuration
> >>>>>>>>>   - with a Spring transaction manager
> >>>>>>>>>   - with a Spring transaction policy
> >>>>>>>>>   - with a "transacted" flag in Camel routes
> >>>>>>>>> - The "noTxManager" flavour is a "simple" configuration
> >>>>>>>>>   - no Spring transaction manager
> >>>>>>>>>   - no Spring transaction policy
> >>>>>>>>>   - no "transacted" flag in Camel routes
> >>>>>>>>>   - BUT: "lazyCreateTransactionManager" = false (so routes are transacted too)
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> *** How to run the testcases?
> >>>>>>>>> 1. Replace "[yourBrokerHost]" with the hostname of your ActiveMQ broker
> >>>>>>>>> 2. Run the testcase as JUnit test
> >>>>>>>>> 3. When you see lots of console messages that messages are sent, stop
> >>>>>>>>>    your ActiveMQ broker (do not kill -9 it, just shut it down normally)
> >>>>>>>>> 4. Exceptions are thrown on the console output
> >>>>>>>>> 5. After some seconds start your broker again
> >>>>>>>>> 6. The test finishes normally and after some seconds dumps the
> >>>>>>>>>    book-keeping on the console
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> *** How to interpret the results?
> >>>>>>>>> - When the test is successful, no message is lost. You can run the test
> >>>>>>>>>   without broker shutdown/startup and it will obviously always be successful.
> >>>>>>>>> - When the test fails, one or more messages are lost between queue
> >>>>>>>>>   "input" and "output". In my tests I was not able to run the "standard"
> >>>>>>>>>   flavour three times in a row successfully. About every second run failed.
> >>>>>>>>>   In contrast, the "noTxManager" flavour never failed in my tests.
> >>>>>>>>>
> >>>>>>>>> The book-keeping for a failed test looks like the following. In this
> >>>>>>>>> example message number 281 arrived at the input queue but not at the
> >>>>>>>>> output queue, so it is lost.
> >>>>>>>>>
> >>>>>>>>> Messages created by Client:          1000
> >>>>>>>>> Client Exceptions during send:       0 []
> >>>>>>>>>
> >>>>>>>>> Messages received at input queue:    993
> >>>>>>>>> Missing Messages at input queue:     7 [282,283,284,285,286,287,288]
> >>>>>>>>> Duplicate Messages at input queue:   0 []
> >>>>>>>>>
> >>>>>>>>> Messages received at output queue:   992
> >>>>>>>>> Missing Messages at output queue:    8 [281,282,283,284,285,286,287,288]
> >>>>>>>>> Duplicate Messages at output queue:  0 []
> >>>>>>>>>
> >>>>>>>>> Lost Messages between Queues:        1 [281]
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> *** What is the problem?
> >>>>>>>>> A Red Hat engineer tracked the problem down to a Spring JMS template
> >>>>>>>>> behaviour that is kind of strange. If a Spring transaction manager is
> >>>>>>>>> defined in the config, it will end up with two of them. This opens a
> >>>>>>>>> small time window in which messages can get lost, and it only shows up
> >>>>>>>>> under a certain load.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> *** So, what is my question?
> >>>>>>>>> - Does this really mean that it is unsafe to use the "standard" flavour
> >>>>>>>>>   of configuration?
> >>>>>>>>> - Is there another config with TxManager etc that works correctly?
> >>>>>>>>> - What are the limits of the "noTxManager" config? When is it not sufficient?
> >>>>>>>>>
> >>>>>>>>> Regards
> >>>>>>>>> Stephan
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> <CamelAmqTxTest.zip>
>
>
