[HACKERS] Question concerning XTM (eXtensible Transaction Manager API)

Konstantin Knizhnik Mon, 16 Nov 2015 00:48:13 -0800

Hello,

Some time ago at PgConn.Vienna we proposed the eXtensible Transaction Manager API (XTM). The idea is to make it possible to provide custom transaction manager implementations as standard Postgres extensions; the primary goal is to implement a distributed transaction manager.

It should not only support 2PC, but also provide consistent snapshots for global transactions executed at different nodes.

Actually, the current version of the XTM API does not impose any particular 2PC model. It can be implemented either on the coordinator side (as is done in our pg_tsdtm <https://github.com/postgrespro/pg_tsdtm> implementation, which is based on timestamps and does not require a centralized arbiter), or by an arbiter (pg_dtm <https://github.com/postgrespro/pg_dtm>). In the latter case the 2PC logic is hidden under the XTM SetTransactionStatus method:

bool (*SetTransactionStatus)(TransactionId xid, int nsubxids, TransactionId *subxids, XidStatus status, XLogRecPtr lsn);


which encapsulates TransactionIdSetTreeStatus in clog.c.
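
To make the indirection concrete, here is a minimal standalone sketch of how a bool-returning hook can wrap a void status-setting routine. It is only a model and not the actual XTM code: the typedefs are stand-ins so that it compiles outside the PostgreSQL tree, and the names TransactionManager, local_set_tree_status, arbiter_accepts, default_tm, dtm_tm and the XID_STATUS_* constants are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in typedefs (assumptions) so the sketch builds without PostgreSQL headers. */
typedef uint32_t TransactionId;
typedef int      XidStatus;
typedef uint64_t XLogRecPtr;

/* Hypothetical status codes used only by this sketch. */
#define XID_STATUS_COMMITTED 1
#define XID_STATUS_ABORTED   2

/* Hypothetical, reduced function table: only the one hook discussed here. */
typedef struct
{
    bool (*SetTransactionStatus)(TransactionId xid, int nsubxids,
                                 TransactionId *subxids, XidStatus status,
                                 XLogRecPtr lsn);
} TransactionManager;

/* Models the status recorded in CLOG for the last transaction. */
static XidStatus last_status_written = 0;

/* Models the arbiter's decision; a real DTM would ask a remote service. */
static bool arbiter_accepts = true;

/* Stand-in for the void routine in clog.c: it cannot report failure. */
static void
local_set_tree_status(TransactionId xid, int nsubxids, TransactionId *subxids,
                      XidStatus status, XLogRecPtr lsn)
{
    (void) xid; (void) subxids; (void) lsn;
    assert(nsubxids >= 0);
    last_status_written = status;
}

/* Default manager: delegates to the local routine and always succeeds. */
static bool
default_set_status(TransactionId xid, int nsubxids, TransactionId *subxids,
                   XidStatus status, XLogRecPtr lsn)
{
    local_set_tree_status(xid, nsubxids, subxids, status, lsn);
    return true;
}

/* DTM-style manager: a commit rejected by the arbiter is recorded as aborted. */
static bool
dtm_set_status(TransactionId xid, int nsubxids, TransactionId *subxids,
               XidStatus status, XLogRecPtr lsn)
{
    if (status == XID_STATUS_COMMITTED && !arbiter_accepts)
    {
        local_set_tree_status(xid, nsubxids, subxids, XID_STATUS_ABORTED, lsn);
        return false;           /* caller must report the abort to the client */
    }
    local_set_tree_status(xid, nsubxids, subxids, status, lsn);
    return true;
}

static TransactionManager default_tm = { default_set_status };
static TransactionManager dtm_tm     = { dtm_set_status };
```

A caller goes through the table, for example dtm_tm.SetTransactionStatus(xid, 0, NULL, XID_STATUS_COMMITTED, lsn), and has to be prepared for a false result, which is exactly what the original void function never required.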

But you may notice that the original TransactionIdSetTreeStatus function is void: it is not intended to return anything. It is called in RecordTransactionCommit inside a critical section, where a commit is not expected to fail. But with a DTM, a transaction may be rejected by the arbiter. The XTM API allows us to control access to CLOG, so everybody will see that the transaction is aborted. But in any case we have to somehow notify the client that the transaction was aborted.

We cannot just call elog(ERROR,...) in the SetTransactionStatus implementation, because inside a critical section it causes Postgres to crash with a PANIC message. So we have to remember that the transaction was rejected and report the error later, after leaving the critical section:



        /*
         * Now we may update the CLOG, if we wrote a COMMIT record above
         */
        if (markXidCommitted) {
            committed = TransactionIdCommitTree(xid, nchildren, children);
        }
...
    /*
     * If we entered a commit critical section, leave it now, and let
     * checkpoints proceed.
     */
    if (markXidCommitted)
    {
        MyPgXact->delayChkpt = false;
        END_CRIT_SECTION();
        if (!committed) {
            CurrentTransactionState->state = TRANS_ABORT;
            CurrentTransactionState->blockState = TBLOCK_ABORT_PENDING;
            elog(ERROR, "Transaction commit rejected by XTM");
        }
    }
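
The reason the report has to wait until after END_CRIT_SECTION() can be shown with a small standalone model. This is a sketch under assumptions, not PostgreSQL code: a counter stands in for the critical-section nesting depth, report_error() stands in for elog(ERROR, ...) and is promoted to a panic while the counter is non-zero, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of an error report in this model. */
typedef enum { REPORT_NONE, REPORT_ERROR, REPORT_PANIC } ReportLevel;

static int         crit_section_count = 0;      /* nesting depth of critical sections */
static ReportLevel last_report = REPORT_NONE;   /* what the last report turned into */

static void
start_crit_section(void)
{
    crit_section_count++;
}

static void
end_crit_section(void)
{
    assert(crit_section_count > 0);
    crit_section_count--;
}

/* Stand-in for elog(ERROR, ...): inside a critical section it becomes a panic. */
static void
report_error(void)
{
    last_report = (crit_section_count > 0) ? REPORT_PANIC : REPORT_ERROR;
}

/* Naive variant: report the rejection while still inside the critical section. */
static ReportLevel
commit_report_inside(bool accepted)
{
    last_report = REPORT_NONE;
    start_crit_section();
    if (!accepted)
        report_error();
    end_crit_section();
    return last_report;
}

/* Deferred variant: remember the result, leave the section, then report. */
static ReportLevel
commit_report_deferred(bool accepted)
{
    bool committed;

    last_report = REPORT_NONE;
    start_crit_section();
    committed = accepted;       /* result of the status-setting hook */
    end_crit_section();
    if (!committed)
        report_error();
    return last_report;
}
```

In this model commit_report_inside(false) ends in REPORT_PANIC, while commit_report_deferred(false) ends in an ordinary REPORT_ERROR. The model does not cover the transaction state handling, which is a separate problem.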

There is one more problem: at this moment the state of the transaction is TRANS_COMMIT. If the ERROR handler tries to abort it, we get yet another fatal error: an attempt to roll back a committed transaction. So we need to hide the fact that the transaction is actually committed in the local XLOG.

This approach works, but it looks a bit like a hack. It requires not only replacing the direct call of TransactionIdSetTreeStatus with an indirect one (through the XTM API), but also making some non-obvious changes in RecordTransactionCommit.


So what are the alternatives?

1. Move RecordTransactionCommit to XTM. In this case we have to copy the original RecordTransactionCommit into the DTM implementation and patch it there. This is also not nice, because it will complicate maintenance of the DTM implementation. The primary idea of XTM is to allow development of a DTM as a standard PostgreSQL extension, without creating specific clones of the main PostgreSQL source tree. But this idea is compromised if we have to copy and paste pieces of PostgreSQL code. In some sense it is even worse than maintaining a separate branch: in the latter case we at least have some way to perform an automatic merge.

2. Propose some alternative two-phase commit implementation in PostgreSQL core. The main motivation for such a "lightweight" implementation of 2PC in pg_dtm is that the original mechanism of prepared transactions in PostgreSQL adds too much overhead. In our benchmarks we found that a simple credit-debit banking test (without any DTM) runs almost 10 times slower with PostgreSQL 2PC than without it. This is why we are trying to propose an alternative solution (right now pg_dtm is 2 times slower than vanilla PostgreSQL, but it not only performs 2PC but also provides consistent snapshots).


Maybe somebody can suggest some other solution?
Or give some comments concerning the current approach?

Thanks in advance,
Konstantin,
Postgres Professional
