On 04/12/14 10:03, Andres Freund wrote:
On 2014-04-12 09:47:24 -0400, Tom Lane wrote:
Heikki Linnakangas <hlinnakan...@vmware.com> writes:
> On 04/12/2014 12:07 AM, Jan Wieck wrote:
>> the Slony team has been getting seldom reports of a problem with the
>> txid_snapshot data type.
>> The symptom is that a txid_snapshot on output lists the same txid
>> multiple times in the xip list part of the external representation.

> It's two-phase commit. When preparing a transaction, the state of the
> transaction is first transfered to a dummy PGXACT entry, and then the
> PGXACT entry of the backend is cleared. There is a transient state when
> both PGXACT entries have the same xid.

Hm, yeah, but why is that intermediate state visible to anyone else?
Don't we have exclusive lock on the PGPROC array while we're doing this?

It's done outside the remit of ProcArray lock :(. And documented to lead
to duplicate xids in PGXACT.
EndPrepare():
        /*
         * Mark the prepared transaction as valid.      As soon as xact.c marks
         * MyPgXact as not running our XID (which it will do immediately after
         * this function returns), others can commit/rollback the xact.
         *
         * NB: a side effect of this is to make a dummy ProcArray entry for the
         * prepared XID.  This must happen before we clear the XID from 
MyPgXact,
         * else there is a window where the XID is not running according to
         * TransactionIdIsInProgress, and onlookers would be entitled to assume
         * the xact crashed.  Instead we have a window where the same XID 
appears
         * twice in ProcArray, which is OK.
         */
        MarkAsPrepared(gxact);

It doesn't sound too hard to essentially move PrepareTransaction()'s
ProcArrayClearTransaction() into MarkAsPrepared() and rejigger the
locking to remove the intermediate state. But I think it'll lead to
bigger changes than we'd be comfortable backpatching.

If we don't, aren't we letting other backends see non-self-consistent
state in regards to who holds which locks, for example?

I think that actually works out ok, because the locks aren't owned by
xids/xacts, but procs. Otherwise we'd be in deep trouble in
CommitTransaction() as well where ProcArrayEndTransaction() clearing
that state.
After the whole xid transfer, there's PostPrepare_Locks() transferring
the locks.

Since it doesn't seem to produce any side effects, I'd think that making the snapshot unique within txid_current_snapshot() and filtering duplicates on input should be sufficient and eligible for backpatching.

The attached patch adds a unique loop to the internal sort_snapshot() function and skips duplicates on input. The git commit is here:

https://github.com/wieck/postgres/commit/a88a2b2c25b856478d7e2b012fc718106338fe00


Regards,
Jan

--
Jan Wieck
Senior Software Engineer
http://slony.info
diff --git a/src/backend/utils/adt/txid.c b/src/backend/utils/adt/txid.c
index a005e67..44f7880 100644
*** a/src/backend/utils/adt/txid.c
--- b/src/backend/utils/adt/txid.c
*************** cmp_txid(const void *aa, const void *bb)
*** 131,146 ****
  }
  
  /*
!  * sort a snapshot's txids, so we can use bsearch() later.
   *
   * For consistency of on-disk representation, we always sort even if bsearch
!  * will not be used.
   */
  static void
  sort_snapshot(TxidSnapshot *snap)
  {
        if (snap->nxip > 1)
                qsort(snap->xip, snap->nxip, sizeof(txid), cmp_txid);
  }
  
  /*
--- 131,161 ----
  }
  
  /*
!  * unique sort a snapshot's txids, so we can use bsearch() later.
   *
   * For consistency of on-disk representation, we always sort even if bsearch
!  * will not be used. 
   */
  static void
  sort_snapshot(TxidSnapshot *snap)
  {
+       txid    last = 0;
+       int             nxip, idx1, idx2;
+ 
        if (snap->nxip > 1)
+       {
                qsort(snap->xip, snap->nxip, sizeof(txid), cmp_txid);
+               nxip = snap->nxip;
+               idx1 = idx2 = 0;
+               while (idx1 < nxip)
+               {
+                       if (snap->xip[idx1] != last)
+                               last = snap->xip[idx2++] = snap->xip[idx1];
+                       else
+                               snap->nxip--;
+                       idx1++;
+               }
+       }
  }
  
  /*
*************** parse_snapshot(const char *str)
*** 294,304 ****
                val = str2txid(str, &endp);
                str = endp;
  
!               /* require the input to be in order */
!               if (val < xmin || val >= xmax || val <= last_val)
                        goto bad_format;
  
!               buf_add_txid(buf, val);
                last_val = val;
  
                if (*str == ',')
--- 309,320 ----
                val = str2txid(str, &endp);
                str = endp;
  
!               /* require the input to be in order and skip duplicates */
!               if (val < xmin || val >= xmax || val < last_val)
                        goto bad_format;
  
!               if (val != last_val)
!                       buf_add_txid(buf, val);
                last_val = val;
  
                if (*str == ',')
diff --git a/src/test/regress/expected/txid.out 
b/src/test/regress/expected/txid.out
index 930b86a..7750b7b 100644
*** a/src/test/regress/expected/txid.out
--- b/src/test/regress/expected/txid.out
*************** select '12:18:14,16'::txid_snapshot;
*** 12,17 ****
--- 12,23 ----
   12:18:14,16
  (1 row)
  
+ select '12:16:14,14'::txid_snapshot;
+  txid_snapshot 
+ ---------------
+  12:16:14
+ (1 row)
+ 
  -- errors
  select '31:12:'::txid_snapshot;
  ERROR:  invalid input for txid_snapshot: "31:12:"
*************** select '12:16:14,13'::txid_snapshot;
*** 29,38 ****
  ERROR:  invalid input for txid_snapshot: "12:16:14,13"
  LINE 1: select '12:16:14,13'::txid_snapshot;
                 ^
- select '12:16:14,14'::txid_snapshot;
- ERROR:  invalid input for txid_snapshot: "12:16:14,14"
- LINE 1: select '12:16:14,14'::txid_snapshot;
-                ^
  create temp table snapshot_test (
        nr      integer,
        snap    txid_snapshot
--- 35,40 ----
diff --git a/src/test/regress/sql/txid.sql b/src/test/regress/sql/txid.sql
index ecae10e..b6650b9 100644
*** a/src/test/regress/sql/txid.sql
--- b/src/test/regress/sql/txid.sql
***************
*** 3,15 ****
  -- i/o
  select '12:13:'::txid_snapshot;
  select '12:18:14,16'::txid_snapshot;
  
  -- errors
  select '31:12:'::txid_snapshot;
  select '0:1:'::txid_snapshot;
  select '12:13:0'::txid_snapshot;
  select '12:16:14,13'::txid_snapshot;
- select '12:16:14,14'::txid_snapshot;
  
  create temp table snapshot_test (
        nr      integer,
--- 3,15 ----
  -- i/o
  select '12:13:'::txid_snapshot;
  select '12:18:14,16'::txid_snapshot;
+ select '12:16:14,14'::txid_snapshot;
  
  -- errors
  select '31:12:'::txid_snapshot;
  select '0:1:'::txid_snapshot;
  select '12:13:0'::txid_snapshot;
  select '12:16:14,13'::txid_snapshot;
  
  create temp table snapshot_test (
        nr      integer,
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to