Noah, thank you for this excellent review. I have taken over most
(allmost all) of your suggestions (except for the documentation which
is still missing). Please check the updated & attached patch for
details.

On Fri, Jan 14, 2011 at 10:13 PM, Noah Misch <n...@leadboat.com> wrote:
> "00000000000000000000" is a valid md5 message digest.  To avoid always 
> accepting
> a snapshot with that digest, we would need to separately store a flag 
> indicating
> whether a given syncSnapshotChksums member is currently valid.  Maybe that 
> hole
> is acceptable, though.

In the end I decided to store md5 checksums as printable characters in
shmem. We now need a few more bytes for each checksum but as soon as a
byte is NUL we know that it is not a valid checksum.

>> +
>> +     if (!XactReadOnly)
>> +             elog(WARNING, "A snapshot exporting function should be 
>> readonly.");
>
> There are legitimate use cases for copying the snapshot of a read-write
> transaction.  Consider a task like calculating some summaries for insert into 
> a
> table.  To speed this up, you might notionally partition the source data and
> have each of several workers calculate each partition.  I'd vote for removing
> this warning entirely.

Warning removed, adding this fact to the documentation only is fine with me.


> InvalidBackendId is -1.  Is InvalidOid (0) in fact also invalid as a backend 
> ID?

Yes, there is a check in utils/init/postinit.c that makes sure that 0
is never a valid backendId. But anyway, I have now made this more
explicit by adding an own parse routine for ints.


>> +     while ((xid = parseXactFromText(&s, "xip:")) != InvalidTransactionId)
>> +     {
>> +             if (xid == GetTopTransactionIdIfAny())
>> +                     continue;
>> +             snapshot->xip[i++] = xid;
>
> This works on a virgin GetSnapshotData() snapshot, which always has enough 
> space
> (sized according to max_connections).  A snapshot from CopySnapshot(), 
> however,
> has just enough space for the original usage.  I get a SIGSEGV at this line 
> with
> the attached test script.  If I change things around so xip starts non-NULL, I
> get messages like these as the code scribbles past the end of the allocation:

This has been fixed. xip/subxip are now allocated separately if necessary.


> Though unlikely, the other backend may not even exist by now.  Indeed, that
> could have happened and RecentGlobalXmin advanced since we compared the
> checksum.  Not sure what the right locking is here, but something is needed.

Good catch. What I have done now is a second check at the end of
ImportSnapshot(). At that point we have set up the new snapshot and
adjusted MyProc->xmin. If we succeed, then we are fine. If not we
throw an error and invalidate the whole transaction.


>> +      * Instead we must check to not go forward (we might have opened a 
>> cursor
>> +      * in this transaction and still have its snapshot registered)
>> +      */
>
> I'm thinking this case should yield an ERROR.  Otherwise, our net result would
> be to silently adopt a snapshot that is neither our old self nor the target.
> Maybe there's a use case for that, but none comes to my mind.

This can happen when you do:

DECLARE foo CURSOR FOR SELECT * FROM bar;
import snapshot...
FETCH 1 FROM foo;


> I guess the use case for pg_import_snapshot from READ COMMITTED would be
> something like "DO $$BEGIN PERFORM pg_import_snapshot('...'); stuff; END$$;".
> First off, would that work ("stuff" use the imported snapshot)?  If it does
> work, we should take the precedent of SET LOCAL and issue no warning.  If it
> doesn't work, then we should document that the snapshot does take effect until
> the next command and probably keep this warning or something similar.

No, this will also give you a new snapshot for every query, no matter
if it is executed within or outside of a DO-Block.


>> + Datum
>> + pg_import_snapshot(PG_FUNCTION_ARGS)
>> + {
>> +     bytea  *snapshotData = PG_GETARG_BYTEA_P(0);
>> +     bool    ret = true;
>> +
>> +     if (ActiveSnapshotSet())
>> +             ret = ImportSnapshot(snapshotData, GetActiveSnapshot());
>> +
>> +     ret &= ImportSnapshot(snapshotData, GetTransactionSnapshot());
>
> Is it valid to scribble directly on snapshots like this?

I figured that previously executed code still has references pointing
to the snapshots and so we cannot just push a new snapshot on top but
really need to change the memory where they are pointing to.


I am also adding a test script that shows the difference of READ
COMMITTED and SERIALIZABLE in an importing transaction in a DO-Block.
It's based on the script you sent.


So thanks again and I'm looking forward to your next review... :-)

Joachim
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 2dbac56..c96942a 100644
*** a/src/backend/storage/ipc/ipci.c
--- b/src/backend/storage/ipc/ipci.c
*************** CreateSharedMemoryAndSemaphores(bool mak
*** 124,129 ****
--- 124,130 ----
  		size = add_size(size, BTreeShmemSize());
  		size = add_size(size, SyncScanShmemSize());
  		size = add_size(size, AsyncShmemSize());
+ 		size = add_size(size, SyncSnapshotShmemSize());
  #ifdef EXEC_BACKEND
  		size = add_size(size, ShmemBackendArraySize());
  #endif
*************** CreateSharedMemoryAndSemaphores(bool mak
*** 228,233 ****
--- 229,235 ----
  	BTreeShmemInit();
  	SyncScanShmemInit();
  	AsyncShmemInit();
+ 	SyncSnapshotInit();
  
  #ifdef EXEC_BACKEND
  
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 8b36df4..29c9426 100644
*** a/src/backend/storage/ipc/procarray.c
--- b/src/backend/storage/ipc/procarray.c
***************
*** 50,60 ****
--- 50,63 ----
  #include "access/transam.h"
  #include "access/xact.h"
  #include "access/twophase.h"
+ #include "libpq/md5.h"
  #include "miscadmin.h"
  #include "storage/procarray.h"
  #include "storage/spin.h"
  #include "storage/standby.h"
  #include "utils/builtins.h"
+ #include "utils/bytea.h"
+ #include "utils/memutils.h"
  #include "utils/snapmgr.h"
  
  
*************** static int KnownAssignedXidsGetAndSetXmi
*** 159,164 ****
--- 162,171 ----
  static TransactionId KnownAssignedXidsGetOldestXmin(void);
  static void KnownAssignedXidsDisplay(int trace_level);
  
+ typedef char snapshotChksum[33];
+ static snapshotChksum *syncSnapshotChksums;
+ static Snapshot exportedSnapshot;
+ 
  /*
   * Report shared-memory space needed by CreateSharedProcArray.
   */
*************** KnownAssignedXidsDisplay(int trace_level
*** 3065,3067 ****
--- 3072,3424 ----
  
  	pfree(buf.data);
  }
+ 
+ 
+ /*
+  *  Report space needed for our shared memory area, which is basically an
+  *  md5 checksum per connection.
+  */
+ Size
+ SyncSnapshotShmemSize(void)
+ {
+ 	return PROCARRAY_MAXPROCS * sizeof(snapshotChksum);
+ }
+ 
+ void
+ SyncSnapshotInit(void)
+ {
+ 	Size	size;
+ 	bool	found;
+ 
+ 	size = SyncSnapshotShmemSize();
+ 
+ 	syncSnapshotChksums = (snapshotChksum*) ShmemInitStruct("SyncSnapshotChksums",
+ 															size, &found);
+ 	if (!found)
+ 		memset(syncSnapshotChksums[0], 0, sizeof(snapshotChksum) * PROCARRAY_MAXPROCS);
+ }
+ 
+ void
+ InvalidateExportedSnapshot(void)
+ {
+ 	if (syncSnapshotChksums[MyBackendId][0] == '\0')
+ 		return;
+ 
+ 	memset(syncSnapshotChksums[MyBackendId], 0, sizeof(snapshotChksum));
+ 
+ 	UnregisterSnapshotFromOwner(exportedSnapshot, TopTransactionResourceOwner);
+ 	exportedSnapshot = NULL;
+ }
+ 
+ static void
+ snapshot_appendf(char ** buffer, int *bufsize_filled, int *bufsize_left, char *fmt, ...)
+ {
+ 	va_list		ap;
+ 	int			ret;
+ 
+ 	if (*bufsize_left < 64)
+ 	{
+ 		/* enlarge buffer by 1024 bytes */
+ 		*buffer = repalloc(*buffer, *bufsize_filled + 1024);
+ 		*bufsize_left = *bufsize_filled + 1024;
+ 	}
+ 
+ 	va_start(ap, fmt);
+ 	ret = vsnprintf(*buffer + *bufsize_filled, *bufsize_left, fmt, ap);
+ 	va_end(ap);
+ 
+ 	/* There shouldn't be any error, we leave enough room for the data that we
+ 	 * are writing (only numbers basically). */
+ 	Assert(ret < *bufsize_left && ret > 0);
+ 
+ 	*bufsize_left -= ret;
+ 	*bufsize_filled += ret;
+ 
+ 	Assert(strlen(*buffer) == *bufsize_filled);
+ }
+ 
+ bytea *
+ ExportSnapshot(Snapshot snapshot)
+ {
+ 	int				bufsize = 1024;
+ 	int				bufsize_filled = 0; /* doesn't include NUL byte */
+ 	int				bufsize_left = bufsize - bufsize_filled;
+ 	char		   *buf = (char *) palloc(bufsize);
+ 	int				i;
+ 	TransactionId  *children;
+ 	int				nchildren;
+ 
+ 	/* In a subtransaction we don't see our open subxip values in the snapshot
+ 	 * so they would be missing in the backend applying it. */
+ 	/* XXX is ERRCODE_INVALID_TRANSACTION_STATE more appropriate or less?
+ 	 * It's a valid transaction state but invalid for the requested operation. */
+ 	if (GetCurrentTransactionNestLevel() != 1)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ 				 errmsg("can only export a snapshot from a top level transaction")));
+ 
+ 	/* We do however see our already committed subxip values and add them to
+ 	 * the subxip array. */
+ 	nchildren = xactGetCommittedChildren(&children);
+ 
+ 	/* Write up all the data that we return */
+ 	snapshot_appendf(&buf, &bufsize_filled, &bufsize_left,
+ 					 "did:%d ", MyDatabaseId);
+ 	snapshot_appendf(&buf, &bufsize_filled, &bufsize_left,
+ 					 "bid:%d ", MyBackendId);
+ 	snapshot_appendf(&buf, &bufsize_filled, &bufsize_left,
+ 					 "xmi:%d ", snapshot->xmin);
+ 	snapshot_appendf(&buf, &bufsize_filled, &bufsize_left,
+ 					 "xma:%d ", snapshot->xmax);
+ 	/* Include our own transaction ID into the count if any. */
+ 	snapshot_appendf(&buf, &bufsize_filled, &bufsize_left,
+ 					 "xcnt:%d ", TransactionIdIsValid(GetTopTransactionIdIfAny()) ?
+ 								 snapshot->xcnt + 1 : snapshot->xcnt);
+ 	for (i = 0; i < snapshot->xcnt; i++)
+ 		snapshot_appendf(&buf, &bufsize_filled, &bufsize_left,
+ 						 "xip:%d ", snapshot->xip[i]);
+ 	/*
+ 	 * Finally add our own XID if we have one, since by definition we will
+ 	 * still be running when the other transaction takes over the snapshot.
+ 	 */
+ 	if (TransactionIdIsValid(GetTopTransactionIdIfAny()))
+ 		snapshot_appendf(&buf, &bufsize_filled, &bufsize_left,
+ 						 "xip:%d ", GetTopTransactionIdIfAny());
+ 	if (snapshot->suboverflowed || snapshot->subxcnt + nchildren > TOTAL_MAX_CACHED_SUBXIDS)
+ 		snapshot_appendf(&buf, &bufsize_filled, &bufsize_left,
+ 						 "sof:1 ");
+ 	else
+ 	{
+ 		snapshot_appendf(&buf, &bufsize_filled, &bufsize_left,
+ 						 "sxcnt:%d ", snapshot->subxcnt + nchildren);
+ 		for (i = 0; i < snapshot->subxcnt; i++)
+ 			snapshot_appendf(&buf, &bufsize_filled, &bufsize_left,
+ 							 "sxp:%d ", snapshot->subxip[i]);
+ 		/* Add already committed subtransactions. */
+ 		for (i = 0; i < nchildren; i++)
+ 			snapshot_appendf(&buf, &bufsize_filled, &bufsize_left,
+ 							 "sxp:%d ", children[i]);
+ 	}
+ 
+ 	/*
+ 	 * buf ends with a trailing space but we leave it in for simplicity. The
+ 	 * parsing routines also depend on it.
+ 	 */
+ 
+ 	/* Register the snapshot */
+ 	snapshot = RegisterSnapshotOnOwner(snapshot, TopTransactionResourceOwner);
+ 	exportedSnapshot = snapshot;
+ 
+ 	pg_md5_hash(buf, bufsize_filled, syncSnapshotChksums[MyBackendId]);
+ 
+ 	return DatumGetByteaP(DirectFunctionCall1(byteain, CStringGetDatum(buf)));
+ }
+ 
+ static Oid
+ parseOidFromText(char **s, const char *prefix)
+ {
+ 	char   *n, *p = strstr(*s, prefix);
+ 	Oid		oid;
+ 
+ 	if (!p)
+ 		return InvalidOid;
+ 	p += strlen(prefix);
+ 	n = strchr(p, ' ');
+ 	if (!n)
+ 		return InvalidOid;
+ 	*n = '\0';
+ 	oid = DatumGetObjectId(DirectFunctionCall1(oidin, CStringGetDatum(p)));
+ 	*s = n + 1;
+ 	return oid;
+ }
+ 
+ static int
+ parseIntFromText(char **s, const char *prefix, int notfound)
+ {
+ 	char   *n, *p = strstr(*s, prefix);
+ 	int		i;
+ 
+ 	if (!p)
+ 		return notfound;
+ 	p += strlen(prefix);
+ 	n = strchr(p, ' ');
+ 	if (!n)
+ 		return notfound;
+ 	*n = '\0';
+ 	i = DatumGetObjectId(DirectFunctionCall1(int4in, CStringGetDatum(p)));
+ 	*s = n + 1;
+ 	return i;
+ }
+ 
+ static bool
+ parseBoolFromText(char **s, const char *prefix)
+ {
+ 	return (bool) parseIntFromText(s, prefix, 0);
+ }
+ 
+ static TransactionId
+ parseXactFromText(char **s, const char *prefix)
+ {
+ 	char			*n, *p = strstr(*s, prefix);
+ 	TransactionId	xid;
+ 
+ 	if (!p)
+ 		return InvalidTransactionId;
+ 	p += strlen(prefix);
+ 	n = strchr(p, ' ');
+ 	if (!n)
+ 		return InvalidTransactionId;
+ 	*n = '\0';
+ 	xid = DatumGetTransactionId(DirectFunctionCall1(xidin, CStringGetDatum(p)));
+ 	*s = n + 1;
+ 	return xid;
+ }
+ 
+ bool
+ ImportSnapshot(bytea *snapshotData, Snapshot snapshot)
+ {
+ 	snapshotChksum	cksum;
+ 	Oid				databaseId;
+ 	BackendId		backendId;
+ 	int				i;
+ 	TransactionId	xid;
+ 	int				len = VARSIZE_ANY_EXHDR(snapshotData);
+ 	char			*s = palloc(len + 1);
+ 	MemoryContext	oldctx;
+ 
+ 	/* XXX is ERRCODE_INVALID_TRANSACTION_STATE more appropriate or less?
+ 	 * It's a valid transaction state but invalid for the requested operation. */
+ 	if (GetCurrentTransactionNestLevel() != 1)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ 				 errmsg("can only import a snapshot to a top level transaction")));
+ 
+ 	if (len == 0)
+ 		ereport(ERROR,
+ 			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 			 errmsg("invalid data in snapshot information")));
+ 
+ 	strncpy(s, VARDATA_ANY(snapshotData), len);
+ 	s[len] = '\0';
+ 
+ 	pg_md5_hash(s, len, (void *) &cksum);
+ 
+ 	databaseId = parseOidFromText(&s, "did:");
+ 	backendId = (BackendId) parseIntFromText(&s, "bid:", (int) InvalidBackendId);
+ 
+ 	if (databaseId != MyDatabaseId
+ 		|| backendId == InvalidBackendId
+ 		/*
+ 		 * Make sure backendId is in a reasonable range before using it as an
+ 		 * array subscript.
+ 		 */
+ 		|| backendId >= PROCARRAY_MAXPROCS
+ 		|| backendId < 0
+ 		|| backendId == MyBackendId
+ 		/*
+ 		 * Lock considerations:
+ 		 *
+ 		 * syncSnapshotChksums[backendId] is only changed by the backend with ID
+ 		 * backendID and read by another backend that is asked to import a
+ 		 * snapshot.
+ 		 * syncSnapshotChksums[backendId] contains either NUL bytes or the checksum
+ 		 * of a valid snapshot.
+ 		 * Every comparision to the checksum while it is being written or
+ 		 * deleted is okay to fail, so we don't need to take a lock at all.
+ 		 */
+ 		|| strcmp(cksum, syncSnapshotChksums[backendId]) != 0)
+ 	{
+ 		ereport(ERROR,
+ 			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 			 errmsg("invalid data in snapshot information")));
+ 	}
+ 
+ 	Assert(databaseId != InvalidOid);
+ 
+ 	oldctx = MemoryContextSwitchTo(TopTransactionContext);
+ 
+ 	snapshot->xmin = parseXactFromText(&s, "xmi:");
+ 	Assert(snapshot->xmin != InvalidTransactionId);
+ 	snapshot->xmax = parseXactFromText(&s, "xma:");
+ 	Assert(snapshot->xmax != InvalidTransactionId);
+ 
+ 	if (snapshot->copied)
+ 	{
+ 		int xcnt = parseIntFromText(&s, "xcnt:", 0);
+ 		/*
+ 		 * Unfortunately CopySnapshot allocates just one large chunk of memory,
+ 		 * and makes snapshot->xip then point past the end of the fixed-size
+ 		 * snapshot data. That way we cannot just repalloc(snapshot->xip, ...).
+ 		 * Neither can we just change the base address of the snapshot because
+ 		 * this address might still be saved somewhere.
+ 		 */
+ 		if (snapshot->xcnt < xcnt)
+ 			snapshot->xip = palloc(xcnt * sizeof(TransactionId));
+ 	}
+ 
+ 	i = 0;
+ 	while ((xid = parseXactFromText(&s, "xip:")) != InvalidTransactionId)
+ 	{
+ 		if (xid == GetTopTransactionIdIfAny())
+ 			continue;
+ 		snapshot->xip[i++] = xid;
+ 	}
+ 	snapshot->xcnt = i;
+ 
+ 	/*
+ 	 * We only write "sof:1" if the snapshot overflowed. If not, then there is
+ 	 * no "sof:x" entry at all and parseBoolFromText() will just return false.
+ 	 */
+ 	snapshot->suboverflowed = parseBoolFromText(&s, "sof:");
+ 
+ 	if (!snapshot->suboverflowed)
+ 	{
+ 		if (snapshot->copied)
+ 		{
+ 			int sxcnt = parseIntFromText(&s, "sxcnt:", 0);
+ 			if (snapshot->subxcnt < sxcnt)
+ 				snapshot->subxip = palloc(sxcnt * sizeof(TransactionId));
+ 		}
+ 
+ 		i = 0;
+ 		while ((xid = parseXactFromText(&s, "sxp:")) != InvalidTransactionId)
+ 			snapshot->subxip[i++] = xid;
+ 		snapshot->subxcnt = i;
+ 	}
+ 
+ 	MemoryContextSwitchTo(oldctx);
+ 
+ 	/* Leave snapshot->curcid as is. */
+ 
+ 	/*
+ 	 * No ProcArrayLock held here, we assume that a write is atomic. Also note
+ 	 * that MyProc->xmin can go backwards here. However this is safe because
+ 	 * the xmin we set here is the same as in the backend's proc->xmin whose
+ 	 * snapshot we are copying. At this very moment, anybody computing a
+ 	 * minimum will calculate at least this xmin as the overall xmin with or
+ 	 * without us setting MyProc->xmin to this value.
+ 	 * Instead we must check to not go forward (we might have opened a cursor
+ 	 * in this transaction and still have its snapshot registered)
+ 	 */
+ 	if (TransactionIdPrecedes(snapshot->xmin, MyProc->xmin))
+ 		MyProc->xmin = snapshot->xmin;
+ 
+ 	/*
+ 	 * Check the checksum again to prevent a race condition. If the exporting
+ 	 * backend invalidated its snapshot since we last checked or before we
+ 	 * finish with this second check, then we fail here and error out, thus
+ 	 * invalidating the snapshot we've built up.
+ 	 * We could succeed here even though the exporting transaction is at the
+ 	 * same time invalidating the checksum while we are checking it here for
+ 	 * the second time. But even then we are good, because we have set up our
+ 	 * MyProc record already.
+ 	 */
+ 	if (strcmp(cksum, syncSnapshotChksums[backendId]) != 0)
+ 		ereport(ERROR,
+ 			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 			 errmsg("invalid data in snapshot information")));
+ 
+ 	return true;
+ }
+ 
+ 
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index 45b92a0..f74ea76 100644
*** a/src/backend/utils/time/snapmgr.c
--- b/src/backend/utils/time/snapmgr.c
***************
*** 29,34 ****
--- 29,35 ----
  #include "access/xact.h"
  #include "storage/proc.h"
  #include "storage/procarray.h"
+ #include "utils/builtins.h"
  #include "utils/memutils.h"
  #include "utils/memutils.h"
  #include "utils/resowner.h"
*************** AtEarlyCommit_Snapshot(void)
*** 523,528 ****
--- 524,530 ----
  									TopTransactionResourceOwner);
  	registered_xact_snapshot = false;
  
+ 	InvalidateExportedSnapshot();
  }
  
  /*
*************** AtEOXact_Snapshot(bool isCommit)
*** 559,561 ****
--- 561,595 ----
  	FirstSnapshotSet = false;
  	registered_xact_snapshot = false;
  }
+ 
+ Datum
+ pg_export_snapshot(PG_FUNCTION_ARGS)
+ {
+ 	bytea  *snapshotData = ExportSnapshot(GetTransactionSnapshot());
+ 	PG_RETURN_BYTEA_P(snapshotData);
+ }
+ 
+ Datum
+ pg_import_snapshot(PG_FUNCTION_ARGS)
+ {
+ 	bytea  *snapshotData = PG_GETARG_BYTEA_P(0);
+ 	bool	ret = true;
+ 
+ 	if (ActiveSnapshotSet())
+ 		ret = ImportSnapshot(snapshotData, GetActiveSnapshot());
+ 
+ 	ret &= ImportSnapshot(snapshotData, GetTransactionSnapshot());
+ 
+ 	/*
+ 	 * If we are in read committed mode then the next query will execute with a
+ 	 * new snapshot making this function call quite useless.
+ 	 */
+ 	if (!IsolationUsesXactSnapshot())
+ 		ereport(WARNING,
+ 				(errcode(ERRCODE_INAPPROPRIATE_ISOLATION_LEVEL_FOR_BRANCH_TRANSACTION),
+ 				 errmsg("a snapshot importing transaction should be ISOLATION LEVEL SERIALIZABLE"),
+ 				 errhint("A transaction of isolation level READ COMMITTED gives you a new snapshot for each query.")));
+ 
+ 	PG_RETURN_BOOL(ret);
+ }
+ 
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index f8b5d4d..3a4bc6d 100644
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*************** DATA(insert OID = 2171 ( pg_cancel_backe
*** 3391,3396 ****
--- 3391,3400 ----
  DESCR("cancel a server process' current query");
  DATA(insert OID = 2096 ( pg_terminate_backend		PGNSP PGUID 12 1 0 0 f f f t f v 1 0 16 "23" _null_ _null_ _null_ _null_ pg_terminate_backend _null_ _null_ _null_ ));
  DESCR("terminate a server process");
+ DATA(insert OID = 3115 ( pg_export_snapshot		PGNSP PGUID 12 1 0 0 f f f t f v 0 0 17 "" _null_ _null_ _null_ _null_ pg_export_snapshot _null_ _null_ _null_ ));
+ DESCR("export a snapshot");
+ DATA(insert OID = 3116 ( pg_import_snapshot		PGNSP PGUID 12 1 0 0 f f f t f v 1 0 16 "17" _null_ _null_ _null_ _null_ pg_import_snapshot _null_ _null_ _null_ ));
+ DESCR("import a snapshot");
  DATA(insert OID = 2172 ( pg_start_backup		PGNSP PGUID 12 1 0 0 f f f t f v 2 0 25 "25 16" _null_ _null_ _null_ _null_ pg_start_backup _null_ _null_ _null_ ));
  DESCR("prepare for taking an online backup");
  DATA(insert OID = 2173 ( pg_stop_backup			PGNSP PGUID 12 1 0 0 f f f t f v 0 0 25 "" _null_ _null_ _null_ _null_ pg_stop_backup _null_ _null_ _null_ ));
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index 334f9a2..ee94451 100644
*** a/src/include/storage/procarray.h
--- b/src/include/storage/procarray.h
*************** extern void XidCacheRemoveRunningXids(Tr
*** 71,74 ****
--- 71,81 ----
  						  int nxids, const TransactionId *xids,
  						  TransactionId latestXid);
  
+ extern Size SyncSnapshotShmemSize(void);
+ extern void SyncSnapshotInit(void);
+ 
+ extern bytea *ExportSnapshot(Snapshot snapshot);
+ extern bool ImportSnapshot(bytea *snapshotData, Snapshot snapshot);
+ extern void InvalidateExportedSnapshot(void);
+ 
  #endif   /* PROCARRAY_H */
diff --git a/src/include/utils/snapmgr.h b/src/include/utils/snapmgr.h
index 6d784d2..3dcb924 100644
*** a/src/include/utils/snapmgr.h
--- b/src/include/utils/snapmgr.h
*************** extern void AtSubAbort_Snapshot(int leve
*** 43,46 ****
--- 43,49 ----
  extern void AtEarlyCommit_Snapshot(void);
  extern void AtEOXact_Snapshot(bool isCommit);
  
+ extern Datum pg_export_snapshot(PG_FUNCTION_ARGS);
+ extern Datum pg_import_snapshot(PG_FUNCTION_ARGS);
+ 
  #endif   /* SNAPMGR_H */
use DBI;
use Data::Dumper;

$dad = DBI->connect('dbi:Pg:dbname=template1;host=localhost', '', '', {AutoCommit => 0, RaiseError => 1});
$kid = DBI->connect('dbi:Pg:dbname=template1;host=localhost', '', '', {AutoCommit => 0, RaiseError => 1});

# Cleanup...
$kid->do("DROP TABLE IF EXISTS kidseen");
$kid->do("DROP TABLE IF EXISTS kidtest");
$kid->commit();

# Dad opens a transaction...
$dad->rollback();
$dad->do("SET bytea_output = 'escape'");

# Dad exports its snapshot for the kid to take it over...
($snap) = $dad->selectrow_array('SELECT pg_export_snapshot()');
print "Dad: " . Dumper($dad->selectall_arrayref("SELECT * FROM pg_class where relname = 'kidtest'"));

# The kid creates a table that will disappear when it takes over dad's snapshot...
$kid->do("CREATE TABLE kidtest (a int);");
$kid->commit();


# READ COMMITTED

$kid->do("set default_transaction_isolation to 'read committed'");
$kid->commit();

$kid->do("DO \$\$BEGIN
PERFORM pg_import_snapshot('$snap');
CREATE TABLE kidseen AS SELECT * FROM pg_class WHERE relname = 'kidtest';
END\$\$");
print "Kid RC: " . Dumper($kid->selectall_arrayref("SELECT * FROM kidseen"));
$kid->rollback();



# SERIALIZABLE

$kid->do("set default_transaction_isolation to serializable");
$kid->commit();

$kid->do("DO \$\$BEGIN
PERFORM pg_import_snapshot('$snap');
CREATE TABLE kidseen AS SELECT * FROM pg_class WHERE relname = 'kidtest';
END\$\$");
print "Kid SERIAL: " . Dumper($kid->selectall_arrayref("SELECT * FROM kidseen"));
$kid->rollback();
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to