Noah, thank you for this excellent review. I have taken over most (allmost all) of your suggestions (except for the documentation which is still missing). Please check the updated & attached patch for details.
On Fri, Jan 14, 2011 at 10:13 PM, Noah Misch <n...@leadboat.com> wrote: > "00000000000000000000" is a valid md5 message digest. To avoid always > accepting > a snapshot with that digest, we would need to separately store a flag > indicating > whether a given syncSnapshotChksums member is currently valid. Maybe that > hole > is acceptable, though. In the end I decided to store md5 checksums as printable characters in shmem. We now need a few more bytes for each checksum but as soon as a byte is NUL we know that it is not a valid checksum. >> + >> + if (!XactReadOnly) >> + elog(WARNING, "A snapshot exporting function should be >> readonly."); > > There are legitimate use cases for copying the snapshot of a read-write > transaction. Consider a task like calculating some summaries for insert into > a > table. To speed this up, you might notionally partition the source data and > have each of several workers calculate each partition. I'd vote for removing > this warning entirely. Warning removed, adding this fact to the documentation only is fine with me. > InvalidBackendId is -1. Is InvalidOid (0) in fact also invalid as a backend > ID? Yes, there is a check in utils/init/postinit.c that makes sure that 0 is never a valid backendId. But anyway, I have now made this more explicit by adding an own parse routine for ints. >> + while ((xid = parseXactFromText(&s, "xip:")) != InvalidTransactionId) >> + { >> + if (xid == GetTopTransactionIdIfAny()) >> + continue; >> + snapshot->xip[i++] = xid; > > This works on a virgin GetSnapshotData() snapshot, which always has enough > space > (sized according to max_connections). A snapshot from CopySnapshot(), > however, > has just enough space for the original usage. I get a SIGSEGV at this line > with > the attached test script. If I change things around so xip starts non-NULL, I > get messages like these as the code scribbles past the end of the allocation: This has been fixed. xip/subxip are now allocated separately if necessary. > Though unlikely, the other backend may not even exist by now. Indeed, that > could have happened and RecentGlobalXmin advanced since we compared the > checksum. Not sure what the right locking is here, but something is needed. Good catch. What I have done now is a second check at the end of ImportSnapshot(). At that point we have set up the new snapshot and adjusted MyProc->xmin. If we succeed, then we are fine. If not we throw an error and invalidate the whole transaction. >> + * Instead we must check to not go forward (we might have opened a >> cursor >> + * in this transaction and still have its snapshot registered) >> + */ > > I'm thinking this case should yield an ERROR. Otherwise, our net result would > be to silently adopt a snapshot that is neither our old self nor the target. > Maybe there's a use case for that, but none comes to my mind. This can happen when you do: DECLARE foo CURSOR FOR SELECT * FROM bar; import snapshot... FETCH 1 FROM foo; > I guess the use case for pg_import_snapshot from READ COMMITTED would be > something like "DO $$BEGIN PERFORM pg_import_snapshot('...'); stuff; END$$;". > First off, would that work ("stuff" use the imported snapshot)? If it does > work, we should take the precedent of SET LOCAL and issue no warning. If it > doesn't work, then we should document that the snapshot does take effect until > the next command and probably keep this warning or something similar. No, this will also give you a new snapshot for every query, no matter if it is executed within or outside of a DO-Block. >> + Datum >> + pg_import_snapshot(PG_FUNCTION_ARGS) >> + { >> + bytea *snapshotData = PG_GETARG_BYTEA_P(0); >> + bool ret = true; >> + >> + if (ActiveSnapshotSet()) >> + ret = ImportSnapshot(snapshotData, GetActiveSnapshot()); >> + >> + ret &= ImportSnapshot(snapshotData, GetTransactionSnapshot()); > > Is it valid to scribble directly on snapshots like this? I figured that previously executed code still has references pointing to the snapshots and so we cannot just push a new snapshot on top but really need to change the memory where they are pointing to. I am also adding a test script that shows the difference of READ COMMITTED and SERIALIZABLE in an importing transaction in a DO-Block. It's based on the script you sent. So thanks again and I'm looking forward to your next review... :-) Joachim
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c index 2dbac56..c96942a 100644 *** a/src/backend/storage/ipc/ipci.c --- b/src/backend/storage/ipc/ipci.c *************** CreateSharedMemoryAndSemaphores(bool mak *** 124,129 **** --- 124,130 ---- size = add_size(size, BTreeShmemSize()); size = add_size(size, SyncScanShmemSize()); size = add_size(size, AsyncShmemSize()); + size = add_size(size, SyncSnapshotShmemSize()); #ifdef EXEC_BACKEND size = add_size(size, ShmemBackendArraySize()); #endif *************** CreateSharedMemoryAndSemaphores(bool mak *** 228,233 **** --- 229,235 ---- BTreeShmemInit(); SyncScanShmemInit(); AsyncShmemInit(); + SyncSnapshotInit(); #ifdef EXEC_BACKEND diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c index 8b36df4..29c9426 100644 *** a/src/backend/storage/ipc/procarray.c --- b/src/backend/storage/ipc/procarray.c *************** *** 50,60 **** --- 50,63 ---- #include "access/transam.h" #include "access/xact.h" #include "access/twophase.h" + #include "libpq/md5.h" #include "miscadmin.h" #include "storage/procarray.h" #include "storage/spin.h" #include "storage/standby.h" #include "utils/builtins.h" + #include "utils/bytea.h" + #include "utils/memutils.h" #include "utils/snapmgr.h" *************** static int KnownAssignedXidsGetAndSetXmi *** 159,164 **** --- 162,171 ---- static TransactionId KnownAssignedXidsGetOldestXmin(void); static void KnownAssignedXidsDisplay(int trace_level); + typedef char snapshotChksum[33]; + static snapshotChksum *syncSnapshotChksums; + static Snapshot exportedSnapshot; + /* * Report shared-memory space needed by CreateSharedProcArray. */ *************** KnownAssignedXidsDisplay(int trace_level *** 3065,3067 **** --- 3072,3424 ---- pfree(buf.data); } + + + /* + * Report space needed for our shared memory area, which is basically an + * md5 checksum per connection. + */ + Size + SyncSnapshotShmemSize(void) + { + return PROCARRAY_MAXPROCS * sizeof(snapshotChksum); + } + + void + SyncSnapshotInit(void) + { + Size size; + bool found; + + size = SyncSnapshotShmemSize(); + + syncSnapshotChksums = (snapshotChksum*) ShmemInitStruct("SyncSnapshotChksums", + size, &found); + if (!found) + memset(syncSnapshotChksums[0], 0, sizeof(snapshotChksum) * PROCARRAY_MAXPROCS); + } + + void + InvalidateExportedSnapshot(void) + { + if (syncSnapshotChksums[MyBackendId][0] == '\0') + return; + + memset(syncSnapshotChksums[MyBackendId], 0, sizeof(snapshotChksum)); + + UnregisterSnapshotFromOwner(exportedSnapshot, TopTransactionResourceOwner); + exportedSnapshot = NULL; + } + + static void + snapshot_appendf(char ** buffer, int *bufsize_filled, int *bufsize_left, char *fmt, ...) + { + va_list ap; + int ret; + + if (*bufsize_left < 64) + { + /* enlarge buffer by 1024 bytes */ + *buffer = repalloc(*buffer, *bufsize_filled + 1024); + *bufsize_left = *bufsize_filled + 1024; + } + + va_start(ap, fmt); + ret = vsnprintf(*buffer + *bufsize_filled, *bufsize_left, fmt, ap); + va_end(ap); + + /* There shouldn't be any error, we leave enough room for the data that we + * are writing (only numbers basically). */ + Assert(ret < *bufsize_left && ret > 0); + + *bufsize_left -= ret; + *bufsize_filled += ret; + + Assert(strlen(*buffer) == *bufsize_filled); + } + + bytea * + ExportSnapshot(Snapshot snapshot) + { + int bufsize = 1024; + int bufsize_filled = 0; /* doesn't include NUL byte */ + int bufsize_left = bufsize - bufsize_filled; + char *buf = (char *) palloc(bufsize); + int i; + TransactionId *children; + int nchildren; + + /* In a subtransaction we don't see our open subxip values in the snapshot + * so they would be missing in the backend applying it. */ + /* XXX is ERRCODE_INVALID_TRANSACTION_STATE more appropriate or less? + * It's a valid transaction state but invalid for the requested operation. */ + if (GetCurrentTransactionNestLevel() != 1) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("can only export a snapshot from a top level transaction"))); + + /* We do however see our already committed subxip values and add them to + * the subxip array. */ + nchildren = xactGetCommittedChildren(&children); + + /* Write up all the data that we return */ + snapshot_appendf(&buf, &bufsize_filled, &bufsize_left, + "did:%d ", MyDatabaseId); + snapshot_appendf(&buf, &bufsize_filled, &bufsize_left, + "bid:%d ", MyBackendId); + snapshot_appendf(&buf, &bufsize_filled, &bufsize_left, + "xmi:%d ", snapshot->xmin); + snapshot_appendf(&buf, &bufsize_filled, &bufsize_left, + "xma:%d ", snapshot->xmax); + /* Include our own transaction ID into the count if any. */ + snapshot_appendf(&buf, &bufsize_filled, &bufsize_left, + "xcnt:%d ", TransactionIdIsValid(GetTopTransactionIdIfAny()) ? + snapshot->xcnt + 1 : snapshot->xcnt); + for (i = 0; i < snapshot->xcnt; i++) + snapshot_appendf(&buf, &bufsize_filled, &bufsize_left, + "xip:%d ", snapshot->xip[i]); + /* + * Finally add our own XID if we have one, since by definition we will + * still be running when the other transaction takes over the snapshot. + */ + if (TransactionIdIsValid(GetTopTransactionIdIfAny())) + snapshot_appendf(&buf, &bufsize_filled, &bufsize_left, + "xip:%d ", GetTopTransactionIdIfAny()); + if (snapshot->suboverflowed || snapshot->subxcnt + nchildren > TOTAL_MAX_CACHED_SUBXIDS) + snapshot_appendf(&buf, &bufsize_filled, &bufsize_left, + "sof:1 "); + else + { + snapshot_appendf(&buf, &bufsize_filled, &bufsize_left, + "sxcnt:%d ", snapshot->subxcnt + nchildren); + for (i = 0; i < snapshot->subxcnt; i++) + snapshot_appendf(&buf, &bufsize_filled, &bufsize_left, + "sxp:%d ", snapshot->subxip[i]); + /* Add already committed subtransactions. */ + for (i = 0; i < nchildren; i++) + snapshot_appendf(&buf, &bufsize_filled, &bufsize_left, + "sxp:%d ", children[i]); + } + + /* + * buf ends with a trailing space but we leave it in for simplicity. The + * parsing routines also depend on it. + */ + + /* Register the snapshot */ + snapshot = RegisterSnapshotOnOwner(snapshot, TopTransactionResourceOwner); + exportedSnapshot = snapshot; + + pg_md5_hash(buf, bufsize_filled, syncSnapshotChksums[MyBackendId]); + + return DatumGetByteaP(DirectFunctionCall1(byteain, CStringGetDatum(buf))); + } + + static Oid + parseOidFromText(char **s, const char *prefix) + { + char *n, *p = strstr(*s, prefix); + Oid oid; + + if (!p) + return InvalidOid; + p += strlen(prefix); + n = strchr(p, ' '); + if (!n) + return InvalidOid; + *n = '\0'; + oid = DatumGetObjectId(DirectFunctionCall1(oidin, CStringGetDatum(p))); + *s = n + 1; + return oid; + } + + static int + parseIntFromText(char **s, const char *prefix, int notfound) + { + char *n, *p = strstr(*s, prefix); + int i; + + if (!p) + return notfound; + p += strlen(prefix); + n = strchr(p, ' '); + if (!n) + return notfound; + *n = '\0'; + i = DatumGetObjectId(DirectFunctionCall1(int4in, CStringGetDatum(p))); + *s = n + 1; + return i; + } + + static bool + parseBoolFromText(char **s, const char *prefix) + { + return (bool) parseIntFromText(s, prefix, 0); + } + + static TransactionId + parseXactFromText(char **s, const char *prefix) + { + char *n, *p = strstr(*s, prefix); + TransactionId xid; + + if (!p) + return InvalidTransactionId; + p += strlen(prefix); + n = strchr(p, ' '); + if (!n) + return InvalidTransactionId; + *n = '\0'; + xid = DatumGetTransactionId(DirectFunctionCall1(xidin, CStringGetDatum(p))); + *s = n + 1; + return xid; + } + + bool + ImportSnapshot(bytea *snapshotData, Snapshot snapshot) + { + snapshotChksum cksum; + Oid databaseId; + BackendId backendId; + int i; + TransactionId xid; + int len = VARSIZE_ANY_EXHDR(snapshotData); + char *s = palloc(len + 1); + MemoryContext oldctx; + + /* XXX is ERRCODE_INVALID_TRANSACTION_STATE more appropriate or less? + * It's a valid transaction state but invalid for the requested operation. */ + if (GetCurrentTransactionNestLevel() != 1) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("can only import a snapshot to a top level transaction"))); + + if (len == 0) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid data in snapshot information"))); + + strncpy(s, VARDATA_ANY(snapshotData), len); + s[len] = '\0'; + + pg_md5_hash(s, len, (void *) &cksum); + + databaseId = parseOidFromText(&s, "did:"); + backendId = (BackendId) parseIntFromText(&s, "bid:", (int) InvalidBackendId); + + if (databaseId != MyDatabaseId + || backendId == InvalidBackendId + /* + * Make sure backendId is in a reasonable range before using it as an + * array subscript. + */ + || backendId >= PROCARRAY_MAXPROCS + || backendId < 0 + || backendId == MyBackendId + /* + * Lock considerations: + * + * syncSnapshotChksums[backendId] is only changed by the backend with ID + * backendID and read by another backend that is asked to import a + * snapshot. + * syncSnapshotChksums[backendId] contains either NUL bytes or the checksum + * of a valid snapshot. + * Every comparision to the checksum while it is being written or + * deleted is okay to fail, so we don't need to take a lock at all. + */ + || strcmp(cksum, syncSnapshotChksums[backendId]) != 0) + { + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid data in snapshot information"))); + } + + Assert(databaseId != InvalidOid); + + oldctx = MemoryContextSwitchTo(TopTransactionContext); + + snapshot->xmin = parseXactFromText(&s, "xmi:"); + Assert(snapshot->xmin != InvalidTransactionId); + snapshot->xmax = parseXactFromText(&s, "xma:"); + Assert(snapshot->xmax != InvalidTransactionId); + + if (snapshot->copied) + { + int xcnt = parseIntFromText(&s, "xcnt:", 0); + /* + * Unfortunately CopySnapshot allocates just one large chunk of memory, + * and makes snapshot->xip then point past the end of the fixed-size + * snapshot data. That way we cannot just repalloc(snapshot->xip, ...). + * Neither can we just change the base address of the snapshot because + * this address might still be saved somewhere. + */ + if (snapshot->xcnt < xcnt) + snapshot->xip = palloc(xcnt * sizeof(TransactionId)); + } + + i = 0; + while ((xid = parseXactFromText(&s, "xip:")) != InvalidTransactionId) + { + if (xid == GetTopTransactionIdIfAny()) + continue; + snapshot->xip[i++] = xid; + } + snapshot->xcnt = i; + + /* + * We only write "sof:1" if the snapshot overflowed. If not, then there is + * no "sof:x" entry at all and parseBoolFromText() will just return false. + */ + snapshot->suboverflowed = parseBoolFromText(&s, "sof:"); + + if (!snapshot->suboverflowed) + { + if (snapshot->copied) + { + int sxcnt = parseIntFromText(&s, "sxcnt:", 0); + if (snapshot->subxcnt < sxcnt) + snapshot->subxip = palloc(sxcnt * sizeof(TransactionId)); + } + + i = 0; + while ((xid = parseXactFromText(&s, "sxp:")) != InvalidTransactionId) + snapshot->subxip[i++] = xid; + snapshot->subxcnt = i; + } + + MemoryContextSwitchTo(oldctx); + + /* Leave snapshot->curcid as is. */ + + /* + * No ProcArrayLock held here, we assume that a write is atomic. Also note + * that MyProc->xmin can go backwards here. However this is safe because + * the xmin we set here is the same as in the backend's proc->xmin whose + * snapshot we are copying. At this very moment, anybody computing a + * minimum will calculate at least this xmin as the overall xmin with or + * without us setting MyProc->xmin to this value. + * Instead we must check to not go forward (we might have opened a cursor + * in this transaction and still have its snapshot registered) + */ + if (TransactionIdPrecedes(snapshot->xmin, MyProc->xmin)) + MyProc->xmin = snapshot->xmin; + + /* + * Check the checksum again to prevent a race condition. If the exporting + * backend invalidated its snapshot since we last checked or before we + * finish with this second check, then we fail here and error out, thus + * invalidating the snapshot we've built up. + * We could succeed here even though the exporting transaction is at the + * same time invalidating the checksum while we are checking it here for + * the second time. But even then we are good, because we have set up our + * MyProc record already. + */ + if (strcmp(cksum, syncSnapshotChksums[backendId]) != 0) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid data in snapshot information"))); + + return true; + } + + diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c index 45b92a0..f74ea76 100644 *** a/src/backend/utils/time/snapmgr.c --- b/src/backend/utils/time/snapmgr.c *************** *** 29,34 **** --- 29,35 ---- #include "access/xact.h" #include "storage/proc.h" #include "storage/procarray.h" + #include "utils/builtins.h" #include "utils/memutils.h" #include "utils/memutils.h" #include "utils/resowner.h" *************** AtEarlyCommit_Snapshot(void) *** 523,528 **** --- 524,530 ---- TopTransactionResourceOwner); registered_xact_snapshot = false; + InvalidateExportedSnapshot(); } /* *************** AtEOXact_Snapshot(bool isCommit) *** 559,561 **** --- 561,595 ---- FirstSnapshotSet = false; registered_xact_snapshot = false; } + + Datum + pg_export_snapshot(PG_FUNCTION_ARGS) + { + bytea *snapshotData = ExportSnapshot(GetTransactionSnapshot()); + PG_RETURN_BYTEA_P(snapshotData); + } + + Datum + pg_import_snapshot(PG_FUNCTION_ARGS) + { + bytea *snapshotData = PG_GETARG_BYTEA_P(0); + bool ret = true; + + if (ActiveSnapshotSet()) + ret = ImportSnapshot(snapshotData, GetActiveSnapshot()); + + ret &= ImportSnapshot(snapshotData, GetTransactionSnapshot()); + + /* + * If we are in read committed mode then the next query will execute with a + * new snapshot making this function call quite useless. + */ + if (!IsolationUsesXactSnapshot()) + ereport(WARNING, + (errcode(ERRCODE_INAPPROPRIATE_ISOLATION_LEVEL_FOR_BRANCH_TRANSACTION), + errmsg("a snapshot importing transaction should be ISOLATION LEVEL SERIALIZABLE"), + errhint("A transaction of isolation level READ COMMITTED gives you a new snapshot for each query."))); + + PG_RETURN_BOOL(ret); + } + diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index f8b5d4d..3a4bc6d 100644 *** a/src/include/catalog/pg_proc.h --- b/src/include/catalog/pg_proc.h *************** DATA(insert OID = 2171 ( pg_cancel_backe *** 3391,3396 **** --- 3391,3400 ---- DESCR("cancel a server process' current query"); DATA(insert OID = 2096 ( pg_terminate_backend PGNSP PGUID 12 1 0 0 f f f t f v 1 0 16 "23" _null_ _null_ _null_ _null_ pg_terminate_backend _null_ _null_ _null_ )); DESCR("terminate a server process"); + DATA(insert OID = 3115 ( pg_export_snapshot PGNSP PGUID 12 1 0 0 f f f t f v 0 0 17 "" _null_ _null_ _null_ _null_ pg_export_snapshot _null_ _null_ _null_ )); + DESCR("export a snapshot"); + DATA(insert OID = 3116 ( pg_import_snapshot PGNSP PGUID 12 1 0 0 f f f t f v 1 0 16 "17" _null_ _null_ _null_ _null_ pg_import_snapshot _null_ _null_ _null_ )); + DESCR("import a snapshot"); DATA(insert OID = 2172 ( pg_start_backup PGNSP PGUID 12 1 0 0 f f f t f v 2 0 25 "25 16" _null_ _null_ _null_ _null_ pg_start_backup _null_ _null_ _null_ )); DESCR("prepare for taking an online backup"); DATA(insert OID = 2173 ( pg_stop_backup PGNSP PGUID 12 1 0 0 f f f t f v 0 0 25 "" _null_ _null_ _null_ _null_ pg_stop_backup _null_ _null_ _null_ )); diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h index 334f9a2..ee94451 100644 *** a/src/include/storage/procarray.h --- b/src/include/storage/procarray.h *************** extern void XidCacheRemoveRunningXids(Tr *** 71,74 **** --- 71,81 ---- int nxids, const TransactionId *xids, TransactionId latestXid); + extern Size SyncSnapshotShmemSize(void); + extern void SyncSnapshotInit(void); + + extern bytea *ExportSnapshot(Snapshot snapshot); + extern bool ImportSnapshot(bytea *snapshotData, Snapshot snapshot); + extern void InvalidateExportedSnapshot(void); + #endif /* PROCARRAY_H */ diff --git a/src/include/utils/snapmgr.h b/src/include/utils/snapmgr.h index 6d784d2..3dcb924 100644 *** a/src/include/utils/snapmgr.h --- b/src/include/utils/snapmgr.h *************** extern void AtSubAbort_Snapshot(int leve *** 43,46 **** --- 43,49 ---- extern void AtEarlyCommit_Snapshot(void); extern void AtEOXact_Snapshot(bool isCommit); + extern Datum pg_export_snapshot(PG_FUNCTION_ARGS); + extern Datum pg_import_snapshot(PG_FUNCTION_ARGS); + #endif /* SNAPMGR_H */
use DBI; use Data::Dumper; $dad = DBI->connect('dbi:Pg:dbname=template1;host=localhost', '', '', {AutoCommit => 0, RaiseError => 1}); $kid = DBI->connect('dbi:Pg:dbname=template1;host=localhost', '', '', {AutoCommit => 0, RaiseError => 1}); # Cleanup... $kid->do("DROP TABLE IF EXISTS kidseen"); $kid->do("DROP TABLE IF EXISTS kidtest"); $kid->commit(); # Dad opens a transaction... $dad->rollback(); $dad->do("SET bytea_output = 'escape'"); # Dad exports its snapshot for the kid to take it over... ($snap) = $dad->selectrow_array('SELECT pg_export_snapshot()'); print "Dad: " . Dumper($dad->selectall_arrayref("SELECT * FROM pg_class where relname = 'kidtest'")); # The kid creates a table that will disappear when it takes over dad's snapshot... $kid->do("CREATE TABLE kidtest (a int);"); $kid->commit(); # READ COMMITTED $kid->do("set default_transaction_isolation to 'read committed'"); $kid->commit(); $kid->do("DO \$\$BEGIN PERFORM pg_import_snapshot('$snap'); CREATE TABLE kidseen AS SELECT * FROM pg_class WHERE relname = 'kidtest'; END\$\$"); print "Kid RC: " . Dumper($kid->selectall_arrayref("SELECT * FROM kidseen")); $kid->rollback(); # SERIALIZABLE $kid->do("set default_transaction_isolation to serializable"); $kid->commit(); $kid->do("DO \$\$BEGIN PERFORM pg_import_snapshot('$snap'); CREATE TABLE kidseen AS SELECT * FROM pg_class WHERE relname = 'kidtest'; END\$\$"); print "Kid SERIAL: " . Dumper($kid->selectall_arrayref("SELECT * FROM kidseen")); $kid->rollback();
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers