> If I haven't missed something I think each subcoordinator would be
> handled as a RM. So the only thing to cover is to add each
> sub-coordinator as a RM, and the rest will work automatically by way of
> recursion.

Yes, but the recovery process of every TM becomes more complex because it
has more cases to support. The idea is not complicated, but it is hard to
get right.

> > > > and
> > > > the large amount of possible failures that can happen,
> > >
> > > All of which are reported to the TM as an exception.
> >
> > I don't think that is the case. Consider the case when we have a
> > subordinate coordinator as a participant in a transaction that dies
> > after sending PREPARED to the coordinating TM. When the participant
> > does its recovery it sees that COMMITTED is not logged, so it has no
> > way of telling if the coordinator has sent COMMIT or not; it then has
> > to run a termination protocol of some kind. Right?
>
> Hm.. I thought that only the top TM was allowed to do commits... I need
> to read up perhaps.. you might be right.

You are right, only the top TM can do commits. Let me try to explain the
example above better.

You have a transaction with a top-TM that has one subcoordinator, called
sub-TM. Let us see how the top-TM and the sub-TM handle a COMMIT. Assume
presumed rollback as well.

top-TM ("Simplified normal procedure")
--------------------------------------
1. Sends a PREPARE request to all of its participants
2. Gets PREPARED responses from all of its participants
3. Sends a COMMIT request to all of its participants. A log record of the
commit, with a list of all participants, is written.

sub-TM ("Simplified normal procedure")
--------------------------------------
1. Gets a PREPARE request from the top-TM.
2. Relays the PREPARE request to all of its participants
3. Gets PREPARED responses from all of its participants
4. Returns a PREPARED response to the top-TM. A log record of the PREPARED
state, a reference to the top-TM and a list of its participants has to be
written.
5. Gets a COMMIT request from the top-TM
6. Relays the COMMIT request to all of its participants
7. Gets responses from all of its participants
8. Responds to the top-TM
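
The two procedures above can be sketched in a few lines of Python. This is
only an illustration of the happy path, not how any real TM is written: the
class names (Participant, TopTM, SubTM), the in-memory "log" lists and the
"ABORT" vote are all made up for the example, and the sketch assumes the log
record is written before the reply or COMMIT is sent, which the lists above
leave open. The rollback path and all failure handling are left out.

```python
class Participant:
    """Hypothetical participant: a plain RM that always votes PREPARED."""

    def __init__(self, name):
        self.name = name
        self.state = "ACTIVE"

    def prepare(self):
        self.state = "PREPARED"
        return "PREPARED"

    def commit(self):
        self.state = "COMMITTED"
        return "COMMITTED"


class TopTM:
    """Top-level coordinator; the only one that decides on commit."""

    def __init__(self, participants):
        self.participants = participants
        self.log = []  # stand-in for a durable log

    def commit(self):
        # Steps 1-2: send PREPARE to everyone and collect the votes.
        votes = [p.prepare() for p in self.participants]
        if any(v != "PREPARED" for v in votes):
            return "ABORT"  # rollback path omitted in this sketch
        # Step 3: log the decision with the participant list, send COMMIT.
        self.log.append(("COMMIT", [p.name for p in self.participants]))
        for p in self.participants:
            p.commit()
        return "COMMITTED"


class SubTM(Participant):
    """Sub-coordinator: a participant upwards, a coordinator downwards."""

    def __init__(self, name, parent_name, participants):
        super().__init__(name)
        self.parent_name = parent_name
        self.participants = participants
        self.log = []  # stand-in for a durable log

    def prepare(self):
        # Steps 1-3: relay PREPARE and collect the votes.
        votes = [p.prepare() for p in self.participants]
        if any(v != "PREPARED" for v in votes):
            return "ABORT"  # rollback path omitted in this sketch
        # Step 4: log PREPARED, the parent and the participants, then reply.
        names = [p.name for p in self.participants]
        self.log.append(("PREPARED", self.parent_name, names))
        self.state = "PREPARED"
        return "PREPARED"

    def commit(self):
        # Steps 5-8: relay COMMIT, log COMMITTED, answer the top-TM.
        for p in self.participants:
            p.commit()
        self.log.append(("COMMITTED",))
        self.state = "COMMITTED"
        return "COMMITTED"
```

Because SubTM is itself a Participant, the top-TM treats it like any other
RM, which is the recursion mentioned at the top of this mail. The extra work
is in the log records the sub-TM has to write.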


What complicates the example above is the possible failures, and only those.
For example, when the sub-TM dies between steps 4 and 5 above (which is what
I tried to explain in the earlier mail), it cannot tell after a restart
whether the top-TM has issued a COMMIT or not, so it has to run a
termination protocol to find out.
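
As a rough sketch of that recovery decision, assuming presumed rollback: the
function names, the record names and the ask_top_tm callback below are
hypothetical and only meant to show the three cases a restarted sub-TM has
to tell apart. A real termination protocol also has to deal with retries,
timeouts and heuristic outcomes.

```python
def recover_sub_tm(log, ask_top_tm):
    """Decide what a restarted sub-TM does for one transaction.

    log        -- names of the records the sub-TM wrote before it died
    ask_top_tm -- hypothetical callable returning "COMMIT", "ROLLBACK",
                  or None when the top-TM cannot be reached
    """
    if "COMMITTED" in log:
        # Outcome already known; COMMIT may need to be re-sent downwards.
        return "COMMIT"
    if "PREPARED" not in log:
        # Never voted, so the top-TM cannot have committed: roll back.
        return "ROLLBACK"
    # In doubt: PREPARED is logged but the outcome is not. This is the
    # termination protocol, here reduced to asking the top-TM.
    outcome = ask_top_tm()
    if outcome is None:
        return "IN_DOUBT"  # keep the locks and try again later
    return outcome


def top_tm_answer(committed_tx_ids, tx_id):
    """Presumed rollback: no commit record means the answer is ROLLBACK."""
    return "COMMIT" if tx_id in committed_tx_ids else "ROLLBACK"
```

The IN_DOUBT case is the one that an exception at commit time cannot
report, because the sub-TM only discovers it after it has been restarted.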

(Why all these examples? My only intent was to show that I don't think all
possible errors are in there as exceptions...)

To summarize: the recovery process a TM has to run at start-up is much more
complicated when we have sub-TMs. OTS has all of that covered. OTS uses IIOP
as its protocol, but I don't see why it couldn't be any transport. If you
aren't a transaction expert (and I'm not one myself), it can really be
better to follow a standard document than to try to figure out what to do
by yourself.


Regards,
/Tommy

