Manfred Koizar wrote:
> On Fri, 22 Nov 2002 00:32:46 -0500 (EST), Bruce Momjian
> <[EMAIL PROTECTED]> wrote:
> >I am going to work on nested transactions for 7.4.
> > [...]
> >And finally, I must abort tuple changes made by the aborted
> >subtransaction.  One way of doing that is to keep all relation id's
> >modified by the transaction, and do a sequential scan of the tables on
> >abort, changing the transaction id's to a fixed aborted transaction id. 
> >However, this could be slow.  (We could store tids if only a few rows
> >are updated by a subtransaction.  That would speed it up considerably.)
> 
> Depends on your definition of "few".  I don't expect problems for up
> to several thousand tids.  If there are more modified tuples, we could
> first reduce the list to page numbers, before finally falling back to
> table scans.

Yes, and the key point is that those tids are kept only in backend-local
memory, so thousands of them are clearly possible.  The outer transaction
takes care of all the ACID issues.

> >Another idea is to use new transaction id's for the subtransactions, and
> >[...]
> >would increase the clog size per transaction from 2 bits to 4 bytes 
> >(two bits for status, 30 bits for offset to parent).
> 
> Nice idea, this 30 bit offset.  But one could argue that increased
> clog size even hurts users who don't use nested transactions at all.
> If parent/child dependency is kept separate from status bits (in
> pg_subtransxxxx files), additional I/O cost is only paid if
> subtransactions are actually used.  New status bits (XMIN_IS_SUB,
> XMAX_IS_SUB) in tuple headers can avoid unnecessary parent xid
> lookups.
> 
> I also thought of subtransaction xids in tuple headers as short lived
> information.  Under certain conditions they can be replaced with the
> parent xid as soon as the parent transaction has finished.  I proposed
> this to be done on the next tuple access just like we set
> committed/aborted flags now, though I'm not sure anymore that it is
> safe to do this.
> 
> Old pg_subtrans files can be removed by VACUUM.
> 
> One more difference between the two proposals:  The former (locally
> remember modified tuples) can be used for recovery after a failed
> command.  The latter (subtrans tree) can only help, if we give a new
> xid to each command, which I'm sure we don't want to do.

The interesting issue is that if we could set the commit/abort bits all
at the same time, we could have the parent/child dependency local to the
backend --- other backends don't need to know the parent, only the
status of the (subtransaction's) xid, and they need to see all those
xid's committed at the same time.

You could store the backend slot id in pg_clog rather than the parent
xid and look up the status of the outer xid for that backend slot.  That
would allow you to use 2 bytes per xid (2 status bits plus a 14-bit slot
id), with a max of 16k backends.  The problem is that after a crash,
pg_clog points to slots that are no longer valid --- it would probably
have to be cleaned up on startup.
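
A rough sketch of that 2-byte entry, assuming 2 status bits in the high
end and a 14-bit backend slot id below them.  The status values, the
"child" marker, and all function names are hypothetical and are not the
actual pg_clog format.

```c
#include <assert.h>
#include <stdint.h>

#define SLOT_BITS 14
#define MAX_SLOTS (1u << SLOT_BITS)   /* 16384 backends, i.e. "16k" */

/* Hypothetical 2-bit status codes; XS_CHILD means "ask my backend slot". */
typedef enum {
    XS_IN_PROGRESS = 0,
    XS_COMMITTED   = 1,
    XS_ABORTED     = 2,
    XS_CHILD       = 3
} XidStatus;

/* Pack a status and a backend slot id into one 16-bit entry. */
static uint16_t entry_pack(XidStatus st, uint16_t slot)
{
    assert(slot < MAX_SLOTS);
    return (uint16_t)(((unsigned)st << SLOT_BITS) | slot);
}

static XidStatus entry_status(uint16_t e)
{
    return (XidStatus)(e >> SLOT_BITS);
}

static uint16_t entry_slot(uint16_t e)
{
    return (uint16_t)(e & (MAX_SLOTS - 1));
}

/*
 * Resolve the status another backend should see.  A child xid takes the
 * status of the outer xid currently running in its backend slot, so all
 * children of one outer transaction flip to committed or aborted together.
 * After a crash the slot numbers are stale, which is the cleanup problem.
 */
static XidStatus resolve(uint16_t e, const XidStatus outer_status_by_slot[])
{
    if (entry_status(e) != XS_CHILD)
        return entry_status(e);
    return outer_status_by_slot[entry_slot(e)];
}
```

Here outer_status_by_slot stands in for whatever shared-memory array
would hold the status of each backend's outer transaction.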

But still, you have an interesting idea of just setting the bit to be "I
am a child".  The trick is allowing backends to figure out whose child
you are.  We could store this somehow in shared memory, but that is
finite and there can be lots of xid's for a backend using
subtransactions.

I still think there must be a clean way, but I haven't figured it out yet.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
