On 18 Mar 2011, at 12:13, Mircea Markus wrote:

> Hi,
> 
> It's about the stage where the TM's recovery process finds an in-doubt 
> transaction and notifies the sys admin about it: what hooks does ISPN provide 
> to the sys admin in order to "fix" the tx?
> E.g. step >= 3.3 : 
> http://community.jboss.org/servlet/JiveServlet/showImage/102-16552-14-11811/3_non_originator_failure.png
> 
> Here is what I have in mind:
> 
> Expose (JMX) two operations:
> 
>   // all the params together fully describe an Xid.
>   replayTx(byte[] txBranch, byte[] txId, int formatId); 
>   forceRollbackTx(byte[] txBranch, byte[] txId, int formatId);

You expect a sysadmin to type a byte array into a JMX console?  :-)  You might 
get death threats from sysadmins... 
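One way around that, purely as a sketch: the JMX operations could take the Xid parts as hex strings and decode them internally, so the admin can paste what the TM logged. The class and method names below (XidHex, fromHex, toHex) are made up for illustration and are not existing ISPN API.

```java
// Hypothetical helper, not part of Infinispan: converts between the byte[]
// parts of an Xid and hex strings that can be pasted into a JMX console.
final class XidHex {

    private XidHex() {
    }

    /** Decodes a hex string such as "0aff10" into the bytes it represents. */
    static byte[] fromHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd-length hex string: " + hex);
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex string: " + hex);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    /** Encodes bytes as lower-case hex, two characters per byte. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```

With something like this, the notification sent to the admin could print the in-doubt Xid via toHex, and the JMX operation would call fromHex on whatever is typed back in.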

> Here is how these two ops would work:
> A. replayTx 
>    1. the node has the PrepareCommand associated with that Xid locally
>       - re-issues a prepare: TransactionXAResource.prepare
>       - if successful, re-issues a commit: TransactionXAResource.commit
>       - if a failure happens at any step, the user is informed and she/he can 
> re-do the JMX call
>       - on success, the recovery information is removed from the cluster 
> (async)
>    2. the node doesn't have the PrepareCommand associated with that Xid
>       - broadcasts ReplayTxCommand (Xid)
>       - when a node receives ReplayTxCommand
>               - if it doesn't have a PrepareCommand associated with the Xid, 
> it ignores it
>               - if it has a PrepareCommand...
>                       - is it the first in the view that has it [1]? 

How does a node know the answer to this question?  Is the list of nodes that 
hold the prepare replay info stored on the PrepareCommand?

>                               - yes. Executes A.1, then returns the result to the node 
> that broadcast ReplayTxCommand. This is guaranteed to happen on at most [2] 
> one node in the cluster
>                               - no. Ignores it.
>       - on success, the recovery information is removed from the cluster 
> (async)
> B. forceRollbackTx
>   - node broadcasts RollbackCommand
>   - each node that has the PrepareCommand forces a rollback
>   - each node that doesn't have the PrepareCommand ignores it
>   - on success, the recovery information is removed from the cluster (async)
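For reference, step A.1 above could look roughly like the following. This is only a sketch against the standard javax.transaction.xa.XAResource interface, not the actual TransactionXAResource code; ReplaySketch and replayLocally are hypothetical names, and removing the recovery information afterwards is left out.

```java
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Hypothetical sketch of step A.1: re-issue prepare, then commit.
final class ReplaySketch {

    private ReplaySketch() {
    }

    /**
     * Returns true if the replay went through, false if prepare or commit
     * failed, in which case the admin can simply repeat the JMX call.
     */
    static boolean replayLocally(XAResource resource, Xid xid) {
        try {
            int vote = resource.prepare(xid);
            // XA_RDONLY means there is nothing to commit for this branch.
            if (vote == XAResource.XA_OK) {
                resource.commit(xid, false); // false: two-phase commit
            }
            return true;
        } catch (XAException e) {
            // Assumption: the JMX operation reports the failure to the caller.
            return false;
        }
    }
}
```

Returning a boolean keeps the JMX operation trivially retryable; a real implementation would probably want to surface the XAException error code as well.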
> 
> Cheers,
> Mircea
> 
> [1] this is determined by building the set of nodes on which the tx spreads, 
> based on the tx's state, and then determining the first of them in the view. 
> [2] it is possible that it doesn't happen on any node, as the PrepareCommand 
> might have been removed from all nodes in the meantime (node failures, 
> expiration from the recovery cache). 
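If I read [1] right, the check boils down to something like the sketch below. It assumes every node can compute the same set of nodes that hold the PrepareCommand for the Xid, and that the view is an ordered member list; node addresses are plain strings here and all names (ReplayCoordinator, firstHolderInView, shouldReplay) are hypothetical.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch of footnote [1]: pick the first member of the view
// that holds the PrepareCommand for a given Xid.
final class ReplayCoordinator {

    private ReplayCoordinator() {
    }

    /** Returns the first node in view order that is also in holders, if any. */
    static Optional<String> firstHolderInView(List<String> view, Set<String> holders) {
        for (String node : view) {
            if (holders.contains(node)) {
                return Optional.of(node);
            }
        }
        // Footnote [2]: no node may hold the PrepareCommand any more.
        return Optional.empty();
    }

    /** True only on the single node that should execute A.1. */
    static boolean shouldReplay(String self, List<String> view, Set<String> holders) {
        return firstHolderInView(view, holders).map(self::equals).orElse(false);
    }
}
```

This only yields "at most one node" if all nodes evaluate it against the same view and the same holder set, which is really what my question above is about.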
> 
> 
> 
> 
> 
> _______________________________________________
> infinispan-dev mailing list
> [email protected]
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

--
Manik Surtani
[email protected]
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org
