[ 
https://issues.apache.org/jira/browse/AMQ-4157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martin Serrano updated AMQ-4157:
--------------------------------

    Description: 
This was very difficult to track down.  It occurs rarely because a specific 
sequence of events is needed to trigger it.  I have marked it a Blocker 
because when it does occur, it is silent and leads to a message not being 
persisted in the MessageStore.

*Description*
The crux of the bug is that when a session is rolled back, the resulting 
MessageAck can overlap with the async store of the message in KahaDB.  
When this happens, the message is never persisted.  Additionally, the resulting 
{{CancellationException}} is ignored in o.a.a.broker.region.Queue:796.  The 
steps:

# a StoreQueueTask is created to add a message X.  this is put on the async 
task queue
# meanwhile this message is dispatched via a prefetch subscription to a 
transacted consumer.
# the transacted consumer calls session.rollback
# this leads to acknowledgement of the dispatched messages 
# as a result destination.removeAsyncMessage is called
# if the original add has not yet executed then it will be cancelled leading to 
the message never being persisted!  (occurs at KahaDBStore:401)
# the Queue.send method uses the result future to make sure the persist happens 
in the store, but it ignores cancellation, so control can return to the sender 
without an error even though nothing was persisted.
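
The steps above can be illustrated with plain JDK classes.  This is only a 
sketch of the pattern, not ActiveMQ code: {{PendingAdd}}, 
{{sendIgnoringCancellation}} and {{sendCheckingCancellation}} are hypothetical 
names, and a {{FutureTask}} stands in for the queued StoreQueueTask.  It 
assumes the remove path cancels the pending add before the store thread runs it.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.concurrent.atomic.AtomicBoolean;

public class AsyncStoreRaceSketch {

    /** Hypothetical stand-in for a queued async add; not ActiveMQ's StoreQueueTask. */
    public static final class PendingAdd {
        public final AtomicBoolean persisted = new AtomicBoolean(false);
        // The "store" work: flips the flag when (and only when) the task really runs.
        public final FutureTask<Void> future = new FutureTask<>(() -> {
            persisted.set(true);
            return null;
        });
    }

    /** Waits on the future but swallows cancellation, so the caller always sees success. */
    public static boolean sendIgnoringCancellation(PendingAdd add) {
        try {
            add.future.get();
        } catch (CancellationException ignored) {
            // Swallowed: the sender cannot tell that nothing was stored.
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        }
        return true;
    }

    /** Same wait, but reports a cancelled add to the caller as a failure. */
    public static boolean sendCheckingCancellation(PendingAdd add) {
        try {
            add.future.get();
            return true;
        } catch (CancellationException e) {
            return false;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        PendingAdd add = new PendingAdd();

        // The ack from the rollback arrives first and cancels the pending add.
        boolean cancelled = add.future.cancel(false);

        // The store thread reaches the task later; run() does nothing once cancelled.
        add.future.run();

        System.out.println("cancelled          = " + cancelled);
        System.out.println("persisted          = " + add.persisted.get());
        System.out.println("ignoring send says = " + sendIgnoringCancellation(add));
        System.out.println("checking send says = " + sendCheckingCancellation(add));
    }
}
```

In this sketch the first send variant reports success although the add never 
ran, which is the silent loss described above.  The second variant at least 
surfaces the cancellation; whether that is the right fix for the broker is a 
separate question.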

I have not been able to reproduce this in a small activemq-only test.  But I 
can reproduce it in my environment.  

*Proposed Solutions*
I'm really unsure of the solution here.  Should 
{{KahaDBStore.removeAsyncMessage}} (line 393) check the context and only cancel 
tasks if it is not in a transaction context?  But what would that mean in the 
log?  Would there be a removeMessage prior to the addMessage?

*Workaround*
* turn off caching for the destination (see [dest 
policies|http://activemq.apache.org/per-destination-policies.html]).  this will 
cause messages to be added to the store synchronously, so they will not be 
subject to the async cancellation

  was:
This was very difficult to track down.  It occurs rarely because a specific 
sequence of events is needed to trigger it.  I have marked it a Blocker 
because when it does occur, it is silent and leads to a message not being 
persisted in the MessageStore.

*Description*
The crux of the bug is that when a session is rolled back, the resulting 
MessageAck can overlap with the async store of the message in KahaDB.  
When this happens, the message is never persisted.  Additionally, the resulting 
{{CancellationException}} is ignored in o.a.a.broker.region.Queue:796.  The 
steps:

# a StoreQueueTask is created to add a message X.  this is put on the async 
task queue
# meanwhile this message is dispatched via a prefetch subscription to a 
transacted consumer.
# the transacted consumer calls session.rollback
# this leads to acknowledgement of the dispatched messages 
# as a result destination.removeAsyncMessage is called
# if the original add has not yet executed then it will be cancelled leading to 
the message never being persisted!  (occurs at KahaDBStore:401)

I have not been able to reproduce this in a small activemq-only test.  But I 
can reproduce it in my environment.  

*Proposed Solutions*
I think the issue lies either with:
* the check at KahaDBTransactionStore:477, should it be calling 
{{theStore.isConcurrentStoreAndDispatchQueues()}} as 


*Workaround*
* turn off caching for the destination (see [dest 
policies|http://activemq.apache.org/per-destination-policies.html]).  this will 
cause messages to be added to the store synchronously, so they will not be 
subject to the async cancellation

    
> KahaDBTransactionStore.removeAsyncMessage may cancel addMessage when in 
> transaction leading to unpersisted messages
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-4157
>                 URL: https://issues.apache.org/jira/browse/AMQ-4157
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.7.0
>         Environment: linux 64-bit, kahadb, persisted messages, cached dest, 
> transacted
>            Reporter: Martin Serrano
>            Priority: Blocker

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
