Re: [HACKERS] tracking commit timestamps
Petr Jelinek wrote: On Thu, Dec 4, 2014 at 12:08 PM, Fujii Masao masao.fu...@gmail.com wrote: I got the following assertion failure when I executed pg_xact_commit_timestamp() in the standby server. =# select pg_xact_commit_timestamp('1000'::xid); TRAP: FailedAssertion(!(((oldestCommitTs) != ((TransactionId) 0)) == ((newestCommitTs) != ((TransactionId) 0))), File: commit_ts.c, Line: 315) Attached patch fixes it, I am not sure how happy I am with the way I did it though. Pushed, thanks. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 05/01/15 17:50, Petr Jelinek wrote: On 05/01/15 16:17, Petr Jelinek wrote: On 05/01/15 07:28, Fujii Masao wrote: On Thu, Dec 4, 2014 at 12:08 PM, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Dec 3, 2014 at 11:54 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Pushed with some extra cosmetic tweaks. I got the following assertion failure when I executed pg_xact_commit_timestamp() in the standby server. =# select pg_xact_commit_timestamp('1000'::xid); TRAP: FailedAssertion(!(((oldestCommitTs) != ((TransactionId) 0)) == ((newestCommitTs) != ((TransactionId) 0))), File: commit_ts.c, Line: 315) server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: 2014-12-04 12:01:08 JST sby1 LOG: server process (PID 15545) was terminated by signal 6: Aborted 2014-12-04 12:01:08 JST sby1 DETAIL: Failed process was running: select pg_xact_commit_timestamp('1000'::xid); The way to reproduce this problem is #1. set up and start the master and standby servers with track_commit_timestamp disabled #2. enable track_commit_timestamp in the master and restart the master #3. run some write transactions #4. enable track_commit_timestamp in the standby and restart the standby #5. execute select pg_xact_commit_timestamp('1000'::xid) in the standby BTW, at the step #4, I got the following log messages. This might be a hint for this problem. LOG: file pg_commit_ts/ doesn't exist, reading as zeroes CONTEXT: xlog redo Transaction/COMMIT: 2014-12-04 12:00:16.428702+09; inval msgs: catcache 59 catcache 58 catcache 59 catcache 58 catcache 45 catcache 44 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 snapshot 2608 relcache 16384 This problem still happens in the master. Regards, Attached patch fixes it, I am not sure how happy I am with the way I did it though. Added more comments and made the error message bit clearer. Fujii, Alvaro, did one of you had chance to look at this fix? -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Tue, Jan 6, 2015 at 2:58 AM, Michael Paquier michael.paqu...@gmail.com wrote: So, we would need additional information other than the node ID *and* the timestamp to ensure proper transaction commit ordering on Windows. That's not cool and makes this feature very limited on this platform. You can't use the timestamp alone for commit ordering on any platform. Eventually, two transactions will manage to commit in a single clock tick, no matter how short that is. Now, if we'd included the LSN in there... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Fri, Dec 19, 2014 at 3:53 PM, Noah Misch n...@leadboat.com wrote: localhost template1=# select clock_timestamp(), pg_sleep(.1 * (n % 2)) from generate_series(0,7) t(n); clock_timestamp| pg_sleep ---+-- 2014-12-18 08:34:34.522126+00 | 2014-12-18 08:34:34.522126+00 | 2014-12-18 08:34:34.631508+00 | 2014-12-18 08:34:34.631508+00 | 2014-12-18 08:34:34.74089+00 | 2014-12-18 08:34:34.74089+00 | 2014-12-18 08:34:34.850272+00 | 2014-12-18 08:34:34.850272+00 | (8 rows) So, we would need additional information other than the node ID *and* the timestamp to ensure proper transaction commit ordering on Windows. That's not cool and makes this feature very limited on this platform. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 05/01/15 16:17, Petr Jelinek wrote: On 05/01/15 07:28, Fujii Masao wrote: On Thu, Dec 4, 2014 at 12:08 PM, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Dec 3, 2014 at 11:54 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Pushed with some extra cosmetic tweaks. I got the following assertion failure when I executed pg_xact_commit_timestamp() in the standby server. =# select pg_xact_commit_timestamp('1000'::xid); TRAP: FailedAssertion(!(((oldestCommitTs) != ((TransactionId) 0)) == ((newestCommitTs) != ((TransactionId) 0))), File: commit_ts.c, Line: 315) server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: 2014-12-04 12:01:08 JST sby1 LOG: server process (PID 15545) was terminated by signal 6: Aborted 2014-12-04 12:01:08 JST sby1 DETAIL: Failed process was running: select pg_xact_commit_timestamp('1000'::xid); The way to reproduce this problem is #1. set up and start the master and standby servers with track_commit_timestamp disabled #2. enable track_commit_timestamp in the master and restart the master #3. run some write transactions #4. enable track_commit_timestamp in the standby and restart the standby #5. execute select pg_xact_commit_timestamp('1000'::xid) in the standby BTW, at the step #4, I got the following log messages. This might be a hint for this problem. LOG: file pg_commit_ts/ doesn't exist, reading as zeroes CONTEXT: xlog redo Transaction/COMMIT: 2014-12-04 12:00:16.428702+09; inval msgs: catcache 59 catcache 58 catcache 59 catcache 58 catcache 45 catcache 44 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 snapshot 2608 relcache 16384 This problem still happens in the master. Regards, Attached patch fixes it, I am not sure how happy I am with the way I did it though. Added more comments and made the error message bit clearer. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c index ca074da..59d19a0 100644 --- a/src/backend/access/transam/commit_ts.c +++ b/src/backend/access/transam/commit_ts.c @@ -557,6 +557,12 @@ StartupCommitTs(void) TransactionId xid = ShmemVariableCache-nextXid; int pageno = TransactionIdToCTsPage(xid); + if (track_commit_timestamp) + { + ActivateCommitTs(); + return; + } + LWLockAcquire(CommitTsControlLock, LW_EXCLUSIVE); /* @@ -571,14 +577,30 @@ StartupCommitTs(void) * This must be called ONCE during postmaster or standalone-backend startup, * when commit timestamp is enabled. Must be called after recovery has * finished. + */ +void +CompleteCommitTsInitialization(void) +{ + if (!track_commit_timestamp) + DeactivateCommitTs(true); +} + +/* + * This must be called when track_commit_timestamp is turned on. + * Note that this only happens during postmaster or standalone-backend startup + * or during WAL replay. + * + * The reason why this SLRU needs separate activation/deactivation functions is + * that it can be enabled/disabled during start and the activation/deactivation + * on master is propagated to slave via replay. Other SLRUs don't have this + * property and they can be just initialized during normal startup. * * This is in charge of creating the currently active segment, if it's not * already there. The reason for this is that the server might have been * running with this module disabled for a while and thus might have skipped * the normal creation point. */ -void -CompleteCommitTsInitialization(void) +void ActivateCommitTs(void) { TransactionId xid = ShmemVariableCache-nextXid; int pageno = TransactionIdToCTsPage(xid); @@ -591,22 +613,6 @@ CompleteCommitTsInitialization(void) LWLockRelease(CommitTsControlLock); /* - * If this module is not currently enabled, make sure we don't hand back - * possibly-invalid data; also remove segments of old data. - */ - if (!track_commit_timestamp) - { - LWLockAcquire(CommitTsLock, LW_EXCLUSIVE); - ShmemVariableCache-oldestCommitTs = InvalidTransactionId; - ShmemVariableCache-newestCommitTs = InvalidTransactionId; - LWLockRelease(CommitTsLock); - - TruncateCommitTs(ReadNewTransactionId()); - - return; - } - - /* * If CommitTs is enabled, but it wasn't in the previous server run, we * need to set the oldest and newest values to the next Xid; that way, we * will not try to read data that might not have been set. @@ -641,6 +647,35 @@ CompleteCommitTsInitialization(void) } /* + * This must be called when track_commit_timestamp is turned off. + * Note that this only happens during postmaster or standalone-backend startup + * or during WAL replay. + * + * Resets CommitTs into invalid state to
Re: [HACKERS] tracking commit timestamps
On 05/01/15 07:28, Fujii Masao wrote: On Thu, Dec 4, 2014 at 12:08 PM, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Dec 3, 2014 at 11:54 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Pushed with some extra cosmetic tweaks. I got the following assertion failure when I executed pg_xact_commit_timestamp() in the standby server. =# select pg_xact_commit_timestamp('1000'::xid); TRAP: FailedAssertion(!(((oldestCommitTs) != ((TransactionId) 0)) == ((newestCommitTs) != ((TransactionId) 0))), File: commit_ts.c, Line: 315) server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: 2014-12-04 12:01:08 JST sby1 LOG: server process (PID 15545) was terminated by signal 6: Aborted 2014-12-04 12:01:08 JST sby1 DETAIL: Failed process was running: select pg_xact_commit_timestamp('1000'::xid); The way to reproduce this problem is #1. set up and start the master and standby servers with track_commit_timestamp disabled #2. enable track_commit_timestamp in the master and restart the master #3. run some write transactions #4. enable track_commit_timestamp in the standby and restart the standby #5. execute select pg_xact_commit_timestamp('1000'::xid) in the standby BTW, at the step #4, I got the following log messages. This might be a hint for this problem. LOG: file pg_commit_ts/ doesn't exist, reading as zeroes CONTEXT: xlog redo Transaction/COMMIT: 2014-12-04 12:00:16.428702+09; inval msgs: catcache 59 catcache 58 catcache 59 catcache 58 catcache 45 catcache 44 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 snapshot 2608 relcache 16384 This problem still happens in the master. Regards, Attached patch fixes it, I am not sure how happy I am with the way I did it though. And while at it I noticed that redo code for XLOG_PARAMETER_CHANGE sets ControlFile-wal_log_hints = wal_log_hints; shouldn't it be ControlFile-wal_log_hints = xlrec.wal_log_hints; instead? -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c index ca074da..fcfccf8 100644 --- a/src/backend/access/transam/commit_ts.c +++ b/src/backend/access/transam/commit_ts.c @@ -557,6 +557,12 @@ StartupCommitTs(void) TransactionId xid = ShmemVariableCache-nextXid; int pageno = TransactionIdToCTsPage(xid); + if (track_commit_timestamp) + { + ActivateCommitTs(); + return; + } + LWLockAcquire(CommitTsControlLock, LW_EXCLUSIVE); /* @@ -571,14 +577,25 @@ StartupCommitTs(void) * This must be called ONCE during postmaster or standalone-backend startup, * when commit timestamp is enabled. Must be called after recovery has * finished. + */ +void +CompleteCommitTsInitialization(void) +{ + if (!track_commit_timestamp) + DeactivateCommitTs(true); +} + +/* + * This must be called when track_commit_timestamp is turned on. + * Note that this only happens during postmaster or standalone-backend startup + * or during WAL replay. * * This is in charge of creating the currently active segment, if it's not * already there. The reason for this is that the server might have been * running with this module disabled for a while and thus might have skipped * the normal creation point. */ -void -CompleteCommitTsInitialization(void) +void ActivateCommitTs(void) { TransactionId xid = ShmemVariableCache-nextXid; int pageno = TransactionIdToCTsPage(xid); @@ -591,22 +608,6 @@ CompleteCommitTsInitialization(void) LWLockRelease(CommitTsControlLock); /* - * If this module is not currently enabled, make sure we don't hand back - * possibly-invalid data; also remove segments of old data. - */ - if (!track_commit_timestamp) - { - LWLockAcquire(CommitTsLock, LW_EXCLUSIVE); - ShmemVariableCache-oldestCommitTs = InvalidTransactionId; - ShmemVariableCache-newestCommitTs = InvalidTransactionId; - LWLockRelease(CommitTsLock); - - TruncateCommitTs(ReadNewTransactionId()); - - return; - } - - /* * If CommitTs is enabled, but it wasn't in the previous server run, we * need to set the oldest and newest values to the next Xid; that way, we * will not try to read data that might not have been set. @@ -641,6 +642,35 @@ CompleteCommitTsInitialization(void) } /* + * This must be called when track_commit_timestamp is turned off. + * Note that this only happens during postmaster or standalone-backend startup + * or during WAL replay. + * + * Resets CommitTs into invalid state to make sure we don't hand back + * possibly-invalid data; also removes segments of old data. + */ +void +DeactivateCommitTs(bool do_wal) +{ + TransactionId xid = ShmemVariableCache-nextXid; +
Re: [HACKERS] tracking commit timestamps
On 12/19/2014 02:53 PM, Noah Misch wrote: The test assumed that no two transactions of a given backend will get the same timestamp value from now(). That holds so long as ticks of the system time are small enough. Not so on at least some Windows configurations. Most Windows systems with nothing else running will have 15 ms timer granularity. So multiple timestamps allocated within the same millisecond will have the same value for timestamps captured within that interval. If you're running other programs that use the multimedia timer APIs (including Google Chrome, MS SQL Server, and all sorts of other apps you might not expect) you'll probably have 1ms timer granularity instead. Since PostgreSQL 9.4 and below capture time on Windows using GetSystemTime the sub-millisecond part is lost anyway. On 9.5 it's retained but will usually be some fixed value because the timer tick is still 1ms. If you're on Windows 8 or Windows 2012 and running PostgreSQL 9.5 (master), but not earlier versions, you'll get sub-microsecond resolution like on sensible platforms. Some details here: https://github.com/2ndQuadrant/pg_sysdatetime -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Thu, Dec 4, 2014 at 12:08 PM, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Dec 3, 2014 at 11:54 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Pushed with some extra cosmetic tweaks. I got the following assertion failure when I executed pg_xact_commit_timestamp() in the standby server. =# select pg_xact_commit_timestamp('1000'::xid); TRAP: FailedAssertion(!(((oldestCommitTs) != ((TransactionId) 0)) == ((newestCommitTs) != ((TransactionId) 0))), File: commit_ts.c, Line: 315) server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: 2014-12-04 12:01:08 JST sby1 LOG: server process (PID 15545) was terminated by signal 6: Aborted 2014-12-04 12:01:08 JST sby1 DETAIL: Failed process was running: select pg_xact_commit_timestamp('1000'::xid); The way to reproduce this problem is #1. set up and start the master and standby servers with track_commit_timestamp disabled #2. enable track_commit_timestamp in the master and restart the master #3. run some write transactions #4. enable track_commit_timestamp in the standby and restart the standby #5. execute select pg_xact_commit_timestamp('1000'::xid) in the standby BTW, at the step #4, I got the following log messages. This might be a hint for this problem. LOG: file pg_commit_ts/ doesn't exist, reading as zeroes CONTEXT: xlog redo Transaction/COMMIT: 2014-12-04 12:00:16.428702+09; inval msgs: catcache 59 catcache 58 catcache 59 catcache 58 catcache 45 catcache 44 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 snapshot 2608 relcache 16384 This problem still happens in the master. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Tue, Dec 16, 2014 at 01:05:31AM -0300, Alvaro Herrera wrote: Noah Misch wrote: On Mon, Dec 15, 2014 at 12:12:10AM -0800, Michael Paquier wrote: FWIW, I just tried that with MinGW-32 and I can see the error on Win7. I also checked that changing now() to = now() fixed the problem, so your assumption was right, Petr. Committed, after fixing the alternate expected output. Thanks. I admit I don't understand what the issue is. The test assumed that no two transactions of a given backend will get the same timestamp value from now(). That holds so long as ticks of the system time are small enough. Not so on at least some Windows configurations. Notice the repeated timestamp values: Windows Server 2003 x64, 32-bit build w/ VS2010 localhost template1=# select clock_timestamp(), pg_sleep(.1 * (n % 2)) from generate_series(0,7) t(n); clock_timestamp| pg_sleep ---+-- 2014-12-18 08:34:34.522126+00 | 2014-12-18 08:34:34.522126+00 | 2014-12-18 08:34:34.631508+00 | 2014-12-18 08:34:34.631508+00 | 2014-12-18 08:34:34.74089+00 | 2014-12-18 08:34:34.74089+00 | 2014-12-18 08:34:34.850272+00 | 2014-12-18 08:34:34.850272+00 | (8 rows) GNU/Linux [local] test=# select clock_timestamp(), pg_sleep(.1 * (n % 2)) from generate_series(0,7) t(n); clock_timestamp| pg_sleep ---+-- 2014-12-19 06:49:47.590556+00 | 2014-12-19 06:49:47.590611+00 | 2014-12-19 06:49:47.691488+00 | 2014-12-19 06:49:47.691508+00 | 2014-12-19 06:49:47.801483+00 | 2014-12-19 06:49:47.801502+00 | 2014-12-19 06:49:47.921486+00 | 2014-12-19 06:49:47.921505+00 | (8 rows) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Wed, Dec 10, 2014 at 6:50 PM, Noah Misch n...@leadboat.com wrote: On Mon, Dec 08, 2014 at 02:23:39AM +0100, Petr Jelinek wrote: On 08/12/14 00:56, Noah Misch wrote: The commit_ts test suite gives me the attached diff on a 32-bit MinGW build running on 64-bit Windows Server 2003. I have not checked other Windows configurations; the suite does pass on GNU/Linux. Hmm I wonder if now() needs to be changed to = now() in those queries to make them work correctly on that plarform, I don't have machine with that environment handy right now, so I would appreciate if you could try that, in case you don't have time for that, I will try to setup something later... I will try that, though perhaps not until next week. FWIW, I just tried that with MinGW-32 and I can see the error on Win7. I also checked that changing now() to = now() fixed the problem, so your assumption was right, Petr. Regards, -- Michael 20141215_committs_mingw_fix.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 15/12/14 09:12, Michael Paquier wrote: On Wed, Dec 10, 2014 at 6:50 PM, Noah Misch n...@leadboat.com wrote: On Mon, Dec 08, 2014 at 02:23:39AM +0100, Petr Jelinek wrote: On 08/12/14 00:56, Noah Misch wrote: The commit_ts test suite gives me the attached diff on a 32-bit MinGW build running on 64-bit Windows Server 2003. I have not checked other Windows configurations; the suite does pass on GNU/Linux. Hmm I wonder if now() needs to be changed to = now() in those queries to make them work correctly on that plarform, I don't have machine with that environment handy right now, so I would appreciate if you could try that, in case you don't have time for that, I will try to setup something later... I will try that, though perhaps not until next week. FWIW, I just tried that with MinGW-32 and I can see the error on Win7. I also checked that changing now() to = now() fixed the problem, so your assumption was right, Petr. Regards, Cool, thanks, I think it was the time granularity problem in Windows. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Mon, Dec 15, 2014 at 12:12:10AM -0800, Michael Paquier wrote: On Wed, Dec 10, 2014 at 6:50 PM, Noah Misch n...@leadboat.com wrote: On Mon, Dec 08, 2014 at 02:23:39AM +0100, Petr Jelinek wrote: On 08/12/14 00:56, Noah Misch wrote: The commit_ts test suite gives me the attached diff on a 32-bit MinGW build running on 64-bit Windows Server 2003. I have not checked other Windows configurations; the suite does pass on GNU/Linux. Hmm I wonder if now() needs to be changed to = now() in those queries to make them work correctly on that plarform, I don't have machine with that environment handy right now, so I would appreciate if you could try that, in case you don't have time for that, I will try to setup something later... I will try that, though perhaps not until next week. FWIW, I just tried that with MinGW-32 and I can see the error on Win7. I also checked that changing now() to = now() fixed the problem, so your assumption was right, Petr. Committed, after fixing the alternate expected output. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
Noah Misch wrote: On Mon, Dec 15, 2014 at 12:12:10AM -0800, Michael Paquier wrote: FWIW, I just tried that with MinGW-32 and I can see the error on Win7. I also checked that changing now() to = now() fixed the problem, so your assumption was right, Petr. Committed, after fixing the alternate expected output. Thanks. I admit I don't understand what the issue is. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Mon, Dec 08, 2014 at 02:23:39AM +0100, Petr Jelinek wrote: On 08/12/14 00:56, Noah Misch wrote: The commit_ts test suite gives me the attached diff on a 32-bit MinGW build running on 64-bit Windows Server 2003. I have not checked other Windows configurations; the suite does pass on GNU/Linux. Hmm I wonder if now() needs to be changed to = now() in those queries to make them work correctly on that plarform, I don't have machine with that environment handy right now, so I would appreciate if you could try that, in case you don't have time for that, I will try to setup something later... I will try that, though perhaps not until next week. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Wed, Dec 03, 2014 at 11:54:38AM -0300, Alvaro Herrera wrote: Pushed with some extra cosmetic tweaks. The commit_ts test suite gives me the attached diff on a 32-bit MinGW build running on 64-bit Windows Server 2003. I have not checked other Windows configurations; the suite does pass on GNU/Linux. *** Z:/nm/postgresql/src/test/modules/commit_ts/expected/commit_timestamp.out 2014-12-05 05:43:01.07442 + --- Z:/nm/postgresql/src/test/modules/commit_ts/results/commit_timestamp.out 2014-12-05 08:24:13.094705200 + *** *** 19,27 ORDER BY id; id | ?column? | ?column? | ?column? +--+--+-- ! 1 | t| t| t ! 2 | t| t| t ! 3 | t| t| t (3 rows) DROP TABLE committs_test; --- 19,27 ORDER BY id; id | ?column? | ?column? | ?column? +--+--+-- ! 1 | t| f| t ! 2 | t| f| t ! 3 | t| f| t (3 rows) DROP TABLE committs_test; *** *** 34,39 SELECT x.xid::text::bigint 0, x.timestamp '-infinity'::timestamptz, x.timestamp now() FROM pg_last_committed_xact() x; ?column? | ?column? | ?column? --+--+-- ! t| t| t (1 row) --- 34,39 SELECT x.xid::text::bigint 0, x.timestamp '-infinity'::timestamptz, x.timestamp now() FROM pg_last_committed_xact() x; ?column? | ?column? | ?column? --+--+-- ! t| t| f (1 row) == -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 08/12/14 00:56, Noah Misch wrote: On Wed, Dec 03, 2014 at 11:54:38AM -0300, Alvaro Herrera wrote: Pushed with some extra cosmetic tweaks. The commit_ts test suite gives me the attached diff on a 32-bit MinGW build running on 64-bit Windows Server 2003. I have not checked other Windows configurations; the suite does pass on GNU/Linux. Hmm I wonder if now() needs to be changed to = now() in those queries to make them work correctly on that plarform, I don't have machine with that environment handy right now, so I would appreciate if you could try that, in case you don't have time for that, I will try to setup something later... -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
Pushed with some extra cosmetic tweaks. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 03/12/14 15:54, Alvaro Herrera wrote: Pushed with some extra cosmetic tweaks. Cool, thanks! -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Mon, Dec 1, 2014 at 5:34 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: I made two more changes: 1. introduce newestCommitTs. Original code was using lastCommitXact to check that no future transaction is asked for, but this doesn't really work if a long-running transaction is committed, because asking for transactions with a higher Xid but which were committed earlier would raise an error. I'm kind of disappointed that, in spite of previous review comments, this got committed with extensive use of the CommitTs naming. I think that's confusing, but it's also something that will be awkward if we want to add other data, such as the much-discussed commit LSN, to the facility. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
Robert Haas wrote: On Mon, Dec 1, 2014 at 5:34 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: I made two more changes: 1. introduce newestCommitTs. Original code was using lastCommitXact to check that no future transaction is asked for, but this doesn't really work if a long-running transaction is committed, because asking for transactions with a higher Xid but which were committed earlier would raise an error. I'm kind of disappointed that, in spite of previous review comments, this got committed with extensive use of the CommitTs naming. I think that's confusing, but it's also something that will be awkward if we want to add other data, such as the much-discussed commit LSN, to the facility. I never saw a comment that CommitTs was an unwanted name. There were some that said that committs wasn't liked because it looked like a misspelling, so we added an underscore -- stuff in lower case is commit_ts everywhere. Stuff in camel case didn't get the underscore because it didn't seem necessary. But other than that issue, the name wasn't questioned, as far as I'm aware. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
Alvaro Herrera wrote: Robert Haas wrote: I'm kind of disappointed that, in spite of previous review comments, this got committed with extensive use of the CommitTs naming. I think that's confusing, but it's also something that will be awkward if we want to add other data, such as the much-discussed commit LSN, to the facility. I never saw a comment that CommitTs was an unwanted name. There were some that said that committs wasn't liked because it looked like a misspelling, so we added an underscore -- stuff in lower case is commit_ts everywhere. Stuff in camel case didn't get the underscore because it didn't seem necessary. But other than that issue, the name wasn't questioned, as far as I'm aware. I found one email where you said you didn't like committs and preferred commit_timestamp instead. I don't see how making that change would have made you happy wrt the concern you just expressed. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Wed, Dec 3, 2014 at 2:36 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Alvaro Herrera wrote: Robert Haas wrote: I'm kind of disappointed that, in spite of previous review comments, this got committed with extensive use of the CommitTs naming. I think that's confusing, but it's also something that will be awkward if we want to add other data, such as the much-discussed commit LSN, to the facility. I never saw a comment that CommitTs was an unwanted name. There were some that said that committs wasn't liked because it looked like a misspelling, so we added an underscore -- stuff in lower case is commit_ts everywhere. Stuff in camel case didn't get the underscore because it didn't seem necessary. But other than that issue, the name wasn't questioned, as far as I'm aware. I found one email where you said you didn't like committs and preferred commit_timestamp instead. I don't see how making that change would have made you happy wrt the concern you just expressed. Fair point. I'm still not sure we got this one right, but I don't know that I want to spend more time wrangling about it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Wed, Dec 3, 2014 at 11:54 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Pushed with some extra cosmetic tweaks. I got the following assertion failure when I executed pg_xact_commit_timestamp() in the standby server. =# select pg_xact_commit_timestamp('1000'::xid); TRAP: FailedAssertion(!(((oldestCommitTs) != ((TransactionId) 0)) == ((newestCommitTs) != ((TransactionId) 0))), File: commit_ts.c, Line: 315) server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: 2014-12-04 12:01:08 JST sby1 LOG: server process (PID 15545) was terminated by signal 6: Aborted 2014-12-04 12:01:08 JST sby1 DETAIL: Failed process was running: select pg_xact_commit_timestamp('1000'::xid); The way to reproduce this problem is #1. set up and start the master and standby servers with track_commit_timestamp disabled #2. enable track_commit_timestamp in the master and restart the master #3. run some write transactions #4. enable track_commit_timestamp in the standby and restart the standby #5. execute select pg_xact_commit_timestamp('1000'::xid) in the standby BTW, at the step #4, I got the following log messages. This might be a hint for this problem. LOG: file pg_commit_ts/ doesn't exist, reading as zeroes CONTEXT: xlog redo Transaction/COMMIT: 2014-12-04 12:00:16.428702+09; inval msgs: catcache 59 catcache 58 catcache 59 catcache 58 catcache 45 catcache 44 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 snapshot 2608 relcache 16384 Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 4 December 2014 at 03:08, Fujii Masao masao.fu...@gmail.com wrote: #1. set up and start the master and standby servers with track_commit_timestamp disabled #2. enable track_commit_timestamp in the master and restart the master #3. run some write transactions #4. enable track_commit_timestamp in the standby and restart the standby #5. execute select pg_xact_commit_timestamp('1000'::xid) in the standby I'm not sure what step4 is supposed to do? Surely if steps 1-3 generate any WAL then the standby should replay it, whether or not track_commit_timestamp is enabled. So what effect does setting that parameter on the standby? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Thu, Dec 4, 2014 at 12:58 PM, Simon Riggs si...@2ndquadrant.com wrote: On 4 December 2014 at 03:08, Fujii Masao masao.fu...@gmail.com wrote: #1. set up and start the master and standby servers with track_commit_timestamp disabled #2. enable track_commit_timestamp in the master and restart the master #3. run some write transactions #4. enable track_commit_timestamp in the standby and restart the standby #5. execute select pg_xact_commit_timestamp('1000'::xid) in the standby I'm not sure what step4 is supposed to do? Surely if steps 1-3 generate any WAL then the standby should replay it, whether or not track_commit_timestamp is enabled. So what effect does setting that parameter on the standby? At least track_commit_timestamp seems to need to be enabled even in the standby when we want to call pg_xact_commit_timestamp() and pg_last_committed_xact() in the standby. I'm not sure if this is good design, though. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
Petr Jelinek wrote: On 25/11/14 16:30, Alvaro Herrera wrote: Petr Jelinek wrote: On 25/11/14 16:23, Alvaro Herrera wrote: Robert Haas wrote: Maybe 0 should get translated to a NULL return, instead of a bogus timestamp. That's one idea --- surely no transaction is going to commit at 00:00:00 on 2000-01-01 anymore. Yet this is somewhat discomforting. I solved it for xids that are out of range by returning -infinity and then changing that to NULL in sql interface, but no idea how to do that for aborted transactions. I guess the idea is that we just read the value from the slru and if it exactly matches allballs we do the same -infinity return and translation to NULL. (Do we really love this -infinity idea? If it's just an internal API we can use a boolean instead.) As in returning boolean instead of void as found? That works for me (for the C interface). Petr sent me privately some changes to implement this idea. I also reworked the tests so that they only happen on src/test/modules (getting rid of the one in core regress) and made them work with both enabled and disabled track_commit_timestamps; in make installcheck, they pass regardless of the setting of the installed server, and in make check, they run a server with the setting enabled. I made two more changes: 1. introduce newestCommitTs. Original code was using lastCommitXact to check that no future transaction is asked for, but this doesn't really work if a long-running transaction is committed, because asking for transactions with a higher Xid but which were committed earlier would raise an error. 2. change CommitTsControlLock to CommitTsLock as the lock that protects the stuff we keep in ShmemVariableCache. The Control one is for SLRU access, and so it might be slow at times. This is important because we fill the checkpoint struct from values protected by that lock, so a checkpoint might be delayed if it happens to land in the middle of a slru IO operation. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services diff --git a/contrib/pg_upgrade/pg_upgrade.c b/contrib/pg_upgrade/pg_upgrade.c index 3b8241b..f0a023f 100644 --- a/contrib/pg_upgrade/pg_upgrade.c +++ b/contrib/pg_upgrade/pg_upgrade.c @@ -423,8 +423,10 @@ copy_clog_xlog_xid(void) /* set the next transaction id and epoch of the new cluster */ prep_status(Setting next transaction ID and epoch for new cluster); exec_prog(UTILITY_LOG_FILE, NULL, true, - \%s/pg_resetxlog\ -f -x %u \%s\, - new_cluster.bindir, old_cluster.controldata.chkpnt_nxtxid, + \%s/pg_resetxlog\ -f -x %u -c %u \%s\, + new_cluster.bindir, + old_cluster.controldata.chkpnt_nxtxid, + old_cluster.controldata.chkpnt_nxtxid, new_cluster.pgdata); exec_prog(UTILITY_LOG_FILE, NULL, true, \%s/pg_resetxlog\ -f -e %u \%s\, diff --git a/contrib/pg_xlogdump/rmgrdesc.c b/contrib/pg_xlogdump/rmgrdesc.c index 9397198..180818d 100644 --- a/contrib/pg_xlogdump/rmgrdesc.c +++ b/contrib/pg_xlogdump/rmgrdesc.c @@ -10,6 +10,7 @@ #include access/brin_xlog.h #include access/clog.h +#include access/commit_ts.h #include access/gin.h #include access/gist_private.h #include access/hash.h diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index ab8c263..e3713d3 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -2673,6 +2673,20 @@ include_dir 'conf.d' /listitem /varlistentry + varlistentry id=guc-track-commit-timestamp xreflabel=track_commit_timestamp + termvarnametrack_commit_timestamp/varname (typebool/type)/term + indexterm + primaryvarnametrack_commit_timestamp/ configuration parameter/primary + /indexterm + listitem + para +Record commit time of transactions. This parameter +can only be set in filenamepostgresql.conf/ file or on the server +command line. The default value is literaloff/literal. + /para + /listitem + /varlistentry + /variablelist /sect2 diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index baf81ee..62ec275 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -15938,6 +15938,45 @@ SELECT collation for ('foo' COLLATE de_DE); For example literal10:20:10,14,15/literal means literalxmin=10, xmax=20, xip_list=10, 14, 15/literal. /para + + para +The functions shown in xref linkend=functions-commit-timestamp +provide information about transactions that have been already committed. +These functions mainly provide information about when the transactions +were committed. They only provide useful data when +xref linkend=guc-track-commit-timestamp configuration option is enabled +and only for transactions that were committed after it was enabled. + /para + + table id=functions-commit-timestamp +titleCommitted transaction information/title +tgroup cols=3 + thead +
Re: [HACKERS] tracking commit timestamps
On Tue, Nov 25, 2014 at 7:58 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: And here is v10 which fixes conflicts with Heikki's WAL API changes (no changes otherwise). After some slight additional changes, here's v11, which I intend to commit early tomorrow. The main change is moving the test module from contrib to src/test/modules. When I specify the XID of the aborted transaction in pg_xact_commit_timestamp(), it always returns 2000-01-01 09:00:00+09. Is this intentional? Can I check my understanding? Probably we cannot use this feature to calculate the actual replication lag by, for example, comparing the result of pg_last_committed_xact() in the master and that of pg_last_xact_replay_timestamp() in the standby. Because pg_last_xact_replay_timestamp() can return even the timestamp of aborted transaction, but pg_last_committed_xact() cannot. Right? Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
Fujii Masao wrote: On Tue, Nov 25, 2014 at 7:58 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: And here is v10 which fixes conflicts with Heikki's WAL API changes (no changes otherwise). After some slight additional changes, here's v11, which I intend to commit early tomorrow. The main change is moving the test module from contrib to src/test/modules. When I specify the XID of the aborted transaction in pg_xact_commit_timestamp(), it always returns 2000-01-01 09:00:00+09. Is this intentional? Well, when a transaction has not committed, nothing is written so on reading we get all zeroes which corresponds to the timestamp you give. So yeah, it is intentional. We could alternatively check pg_clog and raise an error if the transaction is not marked either COMMITTED or SUBCOMMITTED, but I'm not real sure there's much point. The other option is to record a commit time for aborted transactions too, but that doesn't seem very good either: first, this doesn't do anything for crashed or for in-progress transactions; and second, how does it make sense to have a commit time for a transaction that doesn't actually commit? Can I check my understanding? Probably we cannot use this feature to calculate the actual replication lag by, for example, comparing the result of pg_last_committed_xact() in the master and that of pg_last_xact_replay_timestamp() in the standby. Because pg_last_xact_replay_timestamp() can return even the timestamp of aborted transaction, but pg_last_committed_xact() cannot. Right? I don't think it's suited for that. I guess if you recorded the time of the last transaction that actually committed, you can use that. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Tue, Nov 25, 2014 at 11:19 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Fujii Masao wrote: On Tue, Nov 25, 2014 at 7:58 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: And here is v10 which fixes conflicts with Heikki's WAL API changes (no changes otherwise). After some slight additional changes, here's v11, which I intend to commit early tomorrow. The main change is moving the test module from contrib to src/test/modules. When I specify the XID of the aborted transaction in pg_xact_commit_timestamp(), it always returns 2000-01-01 09:00:00+09. Is this intentional? Well, when a transaction has not committed, nothing is written so on reading we get all zeroes which corresponds to the timestamp you give. So yeah, it is intentional. We could alternatively check pg_clog and raise an error if the transaction is not marked either COMMITTED or SUBCOMMITTED, but I'm not real sure there's much point. The other option is to record a commit time for aborted transactions too, but that doesn't seem very good either: first, this doesn't do anything for crashed or for in-progress transactions; and second, how does it make sense to have a commit time for a transaction that doesn't actually commit? What about the PREPARE and COMMIT PREPARED transactions? ISTM that this feature tracks neither of them. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Tue, Nov 25, 2014 at 9:19 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Fujii Masao wrote: On Tue, Nov 25, 2014 at 7:58 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: And here is v10 which fixes conflicts with Heikki's WAL API changes (no changes otherwise). After some slight additional changes, here's v11, which I intend to commit early tomorrow. The main change is moving the test module from contrib to src/test/modules. When I specify the XID of the aborted transaction in pg_xact_commit_timestamp(), it always returns 2000-01-01 09:00:00+09. Is this intentional? Well, when a transaction has not committed, nothing is written so on reading we get all zeroes which corresponds to the timestamp you give. So yeah, it is intentional. We could alternatively check pg_clog and raise an error if the transaction is not marked either COMMITTED or SUBCOMMITTED, but I'm not real sure there's much point. Maybe 0 should get translated to a NULL return, instead of a bogus timestamp. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
Robert Haas wrote: On Tue, Nov 25, 2014 at 9:19 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Fujii Masao wrote: On Tue, Nov 25, 2014 at 7:58 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: And here is v10 which fixes conflicts with Heikki's WAL API changes (no changes otherwise). After some slight additional changes, here's v11, which I intend to commit early tomorrow. The main change is moving the test module from contrib to src/test/modules. When I specify the XID of the aborted transaction in pg_xact_commit_timestamp(), it always returns 2000-01-01 09:00:00+09. Is this intentional? Well, when a transaction has not committed, nothing is written so on reading we get all zeroes which corresponds to the timestamp you give. So yeah, it is intentional. We could alternatively check pg_clog and raise an error if the transaction is not marked either COMMITTED or SUBCOMMITTED, but I'm not real sure there's much point. Maybe 0 should get translated to a NULL return, instead of a bogus timestamp. That's one idea --- surely no transaction is going to commit at 00:00:00 on 2000-01-01 anymore. Yet this is somewhat discomforting. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 25/11/14 16:23, Alvaro Herrera wrote: Robert Haas wrote: On Tue, Nov 25, 2014 at 9:19 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Fujii Masao wrote: On Tue, Nov 25, 2014 at 7:58 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: And here is v10 which fixes conflicts with Heikki's WAL API changes (no changes otherwise). After some slight additional changes, here's v11, which I intend to commit early tomorrow. The main change is moving the test module from contrib to src/test/modules. When I specify the XID of the aborted transaction in pg_xact_commit_timestamp(), it always returns 2000-01-01 09:00:00+09. Is this intentional? Well, when a transaction has not committed, nothing is written so on reading we get all zeroes which corresponds to the timestamp you give. So yeah, it is intentional. We could alternatively check pg_clog and raise an error if the transaction is not marked either COMMITTED or SUBCOMMITTED, but I'm not real sure there's much point. Maybe 0 should get translated to a NULL return, instead of a bogus timestamp. That's one idea --- surely no transaction is going to commit at 00:00:00 on 2000-01-01 anymore. Yet this is somewhat discomforting. I solved it for xids that are out of range by returning -infinity and then changing that to NULL in sql interface, but no idea how to do that for aborted transactions. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
Petr Jelinek wrote: On 25/11/14 16:23, Alvaro Herrera wrote: Robert Haas wrote: Maybe 0 should get translated to a NULL return, instead of a bogus timestamp. That's one idea --- surely no transaction is going to commit at 00:00:00 on 2000-01-01 anymore. Yet this is somewhat discomforting. I solved it for xids that are out of range by returning -infinity and then changing that to NULL in sql interface, but no idea how to do that for aborted transactions. I guess the idea is that we just read the value from the slru and if it exactly matches allballs we do the same -infinity return and translation to NULL. (Do we really love this -infinity idea? If it's just an internal API we can use a boolean instead.) -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 25/11/14 16:30, Alvaro Herrera wrote: Petr Jelinek wrote: On 25/11/14 16:23, Alvaro Herrera wrote: Robert Haas wrote: Maybe 0 should get translated to a NULL return, instead of a bogus timestamp. That's one idea --- surely no transaction is going to commit at 00:00:00 on 2000-01-01 anymore. Yet this is somewhat discomforting. I solved it for xids that are out of range by returning -infinity and then changing that to NULL in sql interface, but no idea how to do that for aborted transactions. I guess the idea is that we just read the value from the slru and if it exactly matches allballs we do the same -infinity return and translation to NULL. (Do we really love this -infinity idea? If it's just an internal API we can use a boolean instead.) As in returning boolean instead of void as found? That works for me (for the C interface). -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 25 November 2014 at 13:35, Fujii Masao masao.fu...@gmail.com wrote: Can I check my understanding? Probably we cannot use this feature to calculate the actual replication lag by, for example, comparing the result of pg_last_committed_xact() in the master and that of pg_last_xact_replay_timestamp() in the standby. Because pg_last_xact_replay_timestamp() can return even the timestamp of aborted transaction, but pg_last_committed_xact() cannot. Right? It was intended for that, but I forgot that pg_last_xact_replay_timestamp() includes abort as well. I suggest we add a function that returns both the xid and timestamp on the standby: * pg_last_commit_replay_info() - which returns both the xid and timestamp of the last commit replayed on standby * then we use the xid from the standby to lookup the commit timestamp on the master. We then have two timestamps that refer to the same xid and can subtract to give us an exact replication lag. That can be done manually by user if requested. We can also do that by sending the replay info back as a feedback message from standby to master, so the information can be calculated by pg_stat_replication when requested. I'll work on that once we have this current patch committed. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 25/11/14 17:16, Simon Riggs wrote: On 25 November 2014 at 13:35, Fujii Masao masao.fu...@gmail.com wrote: Can I check my understanding? Probably we cannot use this feature to calculate the actual replication lag by, for example, comparing the result of pg_last_committed_xact() in the master and that of pg_last_xact_replay_timestamp() in the standby. Because pg_last_xact_replay_timestamp() can return even the timestamp of aborted transaction, but pg_last_committed_xact() cannot. Right? It was intended for that, but I forgot that pg_last_xact_replay_timestamp() includes abort as well. I suggest we add a function that returns both the xid and timestamp on the standby: * pg_last_commit_replay_info() - which returns both the xid and timestamp of the last commit replayed on standby * then we use the xid from the standby to lookup the commit timestamp on the master. We then have two timestamps that refer to the same xid and can subtract to give us an exact replication lag. Won't the pg_last_committed_xact() on slave + pg_xact_commit_timestamp() on master with the xid returned by slave accomplish the same thing? -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 25 November 2014 at 16:18, Petr Jelinek p...@2ndquadrant.com wrote: Won't the pg_last_committed_xact() on slave + pg_xact_commit_timestamp() on master with the xid returned by slave accomplish the same thing? Surely the pg_last_committed_xact() will return the same value on standby as it did on the master? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Wed, Nov 26, 2014 at 1:51 AM, Simon Riggs si...@2ndquadrant.com wrote: On 25 November 2014 at 16:18, Petr Jelinek p...@2ndquadrant.com wrote: Won't the pg_last_committed_xact() on slave + pg_xact_commit_timestamp() on master with the xid returned by slave accomplish the same thing? Surely the pg_last_committed_xact() will return the same value on standby as it did on the master? It should. Now it needs some extra help as well as in its current shape this patch will WAL log a commit timestamp if the Node ID is valid, per RecordTransactionCommit. The node ID can be set only through CommitTsSetDefaultNodeId, which is called nowhere actually. So if an extension or an extra library needs to do some leg work to have to allow this information to be replayed on other nodes. Regards, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
And here is v10 which fixes conflicts with Heikki's WAL API changes (no changes otherwise). After some slight additional changes, here's v11, which I intend to commit early tomorrow. The main change is moving the test module from contrib to src/test/modules. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services *** a/contrib/pg_upgrade/pg_upgrade.c --- b/contrib/pg_upgrade/pg_upgrade.c *** *** 423,430 copy_clog_xlog_xid(void) /* set the next transaction id and epoch of the new cluster */ prep_status(Setting next transaction ID and epoch for new cluster); exec_prog(UTILITY_LOG_FILE, NULL, true, ! \%s/pg_resetxlog\ -f -x %u \%s\, ! new_cluster.bindir, old_cluster.controldata.chkpnt_nxtxid, new_cluster.pgdata); exec_prog(UTILITY_LOG_FILE, NULL, true, \%s/pg_resetxlog\ -f -e %u \%s\, --- 423,432 /* set the next transaction id and epoch of the new cluster */ prep_status(Setting next transaction ID and epoch for new cluster); exec_prog(UTILITY_LOG_FILE, NULL, true, ! \%s/pg_resetxlog\ -f -x %u -c %u \%s\, ! new_cluster.bindir, ! old_cluster.controldata.chkpnt_nxtxid, ! old_cluster.controldata.chkpnt_nxtxid, new_cluster.pgdata); exec_prog(UTILITY_LOG_FILE, NULL, true, \%s/pg_resetxlog\ -f -e %u \%s\, *** a/contrib/pg_xlogdump/rmgrdesc.c --- b/contrib/pg_xlogdump/rmgrdesc.c *** *** 10,15 --- 10,16 #include access/brin_xlog.h #include access/clog.h + #include access/commit_ts.h #include access/gin.h #include access/gist_private.h #include access/hash.h *** a/doc/src/sgml/config.sgml --- b/doc/src/sgml/config.sgml *** *** 2673,2678 include_dir 'conf.d' --- 2673,2692 /listitem /varlistentry + varlistentry id=guc-track-commit-timestamp xreflabel=track_commit_timestamp + termvarnametrack_commit_timestamp/varname (typebool/type)/term + indexterm +primaryvarnametrack_commit_timestamp/ configuration parameter/primary + /indexterm + listitem +para + Record commit time of transactions. This parameter + can only be set in filenamepostgresql.conf/ file or on the server + command line. The default value is literaloff/literal. +/para + /listitem + /varlistentry + /variablelist /sect2 *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** *** 15923,15928 SELECT collation for ('foo' COLLATE de_DE); --- 15923,15960 For example literal10:20:10,14,15/literal means literalxmin=10, xmax=20, xip_list=10, 14, 15/literal. /para + +para + The functions shown in xref linkend=functions-committs + provide information about transactions that have been already committed. + These functions mainly provide information about when the transactions + were committed. They only provide useful data when + xref linkend=guc-track-commit-timestamp configuration option is enabled + and only for transactions that were committed after it was enabled. +/para + +table id=functions-committs + titleCommitted transaction information/title + tgroup cols=3 + thead + rowentryName/entry entryReturn Type/entry entryDescription/entry/row + /thead + + tbody + row +entryliteralfunctionpg_xact_commit_timestamp(parameterxid/parameter)/function/literal/entry +entrytypetimestamp with time zone/type/entry +entryget commit timestamp of a transaction/entry + /row + row +entryliteralfunctionpg_last_committed_xact()/function/literal/entry +entryparameterxid/ typexid/, parametertimestamp/ typetimestamp with time zone//entry +entryget transaction Id and commit timestamp of latest transaction commit/entry + /row + /tbody + /tgroup +/table + /sect1 sect1 id=functions-admin *** a/doc/src/sgml/ref/pg_resetxlog.sgml --- b/doc/src/sgml/ref/pg_resetxlog.sgml *** *** 22,27 PostgreSQL documentation --- 22,28 refsynopsisdiv cmdsynopsis commandpg_resetxlog/command +arg choice=optoption-c/option replaceable class=parameterxid/replaceable/arg arg choice=optoption-f/option/arg arg choice=optoption-n/option/arg arg choice=optoption-o/option replaceable class=parameteroid/replaceable/arg *** *** 77,88 PostgreSQL documentation /para para !The option-o/, option-x/, option-e/, !option-m/, option-O/, !and option-l/ options allow the next OID, next transaction ID, next transaction ID's !epoch, next and oldest multitransaction ID, next multitransaction offset, and WAL !starting address values to be set manually. These are only needed when commandpg_resetxlog/command is unable to determine appropriate values by reading
Re: [HACKERS] tracking commit timestamps
On 19/11/14 17:30, Steve Singer wrote: On 11/19/2014 08:22 AM, Alvaro Herrera wrote: I think we're overblowing the pg_upgrade issue. Surely we don't need to preserve commit_ts data when upgrading across major versions; and pg_upgrade is perfectly prepared to remove old data when upgrading (actually it just doesn't copy it; consider pg_subtrans or pg_serial, for instance.) If we need to change binary representation in a future major release, we can do so without any trouble. That sounds reasonable. I am okay with Petr removing the LSN portion this patch. I did that then, v9 attached with following changes: - removed lsn info (obviously) - the pg_xact_commit_timestamp and pg_last_committed_xact return NULLs when timestamp data was not found - made the default nodeid crash safe - this also makes use of the do_xlog parameter in TransactionTreeSetCommitTsData if nodeid is set, although that still does not happen without extension actually using the API - added some more regression tests - some small comment and docs adjustments based on Michael's last email I didn't change the pg_last_committed_xact function name and I didn't make nodeid visible from SQL level interfaces and don't plan to in this patch as I think it's very premature to do so before we have some C code using it. Just to explain once more and hopefully more clearly how the crash safety/WAL logging is handled since there was some confusion in last review: We only do WAL logging when nodeid is also logged (when nodeid is not 0) because the timestamp itself can be read from WAL record of transaction commit so it's pointless to log another WAL record just to store the timestamp again. Extension can either set default nodeid which is then logged transparently, or can override the default logging mechanism by calling TransactionTreeSetCommitTsData with whatever data it wants and do_xlog set to true which will then write the WAL record with this overriding information. During WAL replay the commit timestamp is set from transaction commit record and then if committs WAL record is found it's used to overwrite the commit timestamp/nodeid for given xid. Also, there is exactly one record in SLRU for each xid so there is no point in making the SQL interfaces return multiple results. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services diff --git a/contrib/Makefile b/contrib/Makefile index b37d0dd..e331297 100644 --- a/contrib/Makefile +++ b/contrib/Makefile @@ -50,6 +50,7 @@ SUBDIRS = \ spi \ tablefunc \ tcn \ + test_committs \ test_decoding \ test_parser \ test_shm_mq \ diff --git a/contrib/pg_upgrade/pg_upgrade.c b/contrib/pg_upgrade/pg_upgrade.c index 3b8241b..f0a023f 100644 --- a/contrib/pg_upgrade/pg_upgrade.c +++ b/contrib/pg_upgrade/pg_upgrade.c @@ -423,8 +423,10 @@ copy_clog_xlog_xid(void) /* set the next transaction id and epoch of the new cluster */ prep_status(Setting next transaction ID and epoch for new cluster); exec_prog(UTILITY_LOG_FILE, NULL, true, - \%s/pg_resetxlog\ -f -x %u \%s\, - new_cluster.bindir, old_cluster.controldata.chkpnt_nxtxid, + \%s/pg_resetxlog\ -f -x %u -c %u \%s\, + new_cluster.bindir, + old_cluster.controldata.chkpnt_nxtxid, + old_cluster.controldata.chkpnt_nxtxid, new_cluster.pgdata); exec_prog(UTILITY_LOG_FILE, NULL, true, \%s/pg_resetxlog\ -f -e %u \%s\, diff --git a/contrib/pg_xlogdump/rmgrdesc.c b/contrib/pg_xlogdump/rmgrdesc.c index 9397198..e0af3cf 100644 --- a/contrib/pg_xlogdump/rmgrdesc.c +++ b/contrib/pg_xlogdump/rmgrdesc.c @@ -10,6 +10,7 @@ #include access/brin_xlog.h #include access/clog.h +#include access/committs.h #include access/gin.h #include access/gist_private.h #include access/hash.h diff --git a/contrib/test_committs/.gitignore b/contrib/test_committs/.gitignore new file mode 100644 index 000..1f95503 --- /dev/null +++ b/contrib/test_committs/.gitignore @@ -0,0 +1,5 @@ +# Generated subdirectories +/log/ +/isolation_output/ +/regression_output/ +/tmp_check/ diff --git a/contrib/test_committs/Makefile b/contrib/test_committs/Makefile new file mode 100644 index 000..2240749 --- /dev/null +++ b/contrib/test_committs/Makefile @@ -0,0 +1,45 @@ +# Note: because we don't tell the Makefile there are any regression tests, +# we have to clean those result files explicitly +EXTRA_CLEAN = $(pg_regress_clean_files) ./regression_output ./isolation_output + +ifdef USE_PGXS +PG_CONFIG = pg_config +PGXS := $(shell $(PG_CONFIG) --pgxs) +include $(PGXS) +else +subdir = contrib/test_committs +top_builddir = ../.. +include $(top_builddir)/src/Makefile.global +include $(top_srcdir)/contrib/contrib-global.mk +endif + +# We can't support installcheck because normally installcheck users don't have +# the required track_commit_timestamp on +installcheck:; + +check: regresscheck + +submake-regress: + $(MAKE) -C
Re: [HACKERS] tracking commit timestamps
On 19 November 2014 02:12, Petr Jelinek p...@2ndquadrant.com wrote: Maybe we need better explanation of the LSN use-case(s) to understand why it should be stored here and why the other solutions are significantly worse. We should apply the same standard that has been applied elsewhere. If someone can show some software that could actually make use of LSN and there isn't a better way, then we can include it. PostgreSQL isn't a place where we speculate about possible future needs. I don't see why it should take 2+ years of prototypes, designs and discussions to get something in from BDR, but then we simply wave a hand and include something else at last minute without careful thought. Even if that means that later additions might need to think harder about upgrades. Timestamp and nodeid are useful for a variety of cases; LSN doesn't meet the same standard and should not be included now. We still have many months before even beta for people that want LSN to make a *separate* case for its inclusion as a separate feature. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 19/11/14 12:20, Simon Riggs wrote: On 19 November 2014 02:12, Petr Jelinek p...@2ndquadrant.com wrote: Maybe we need better explanation of the LSN use-case(s) to understand why it should be stored here and why the other solutions are significantly worse. We should apply the same standard that has been applied elsewhere. If someone can show some software that could actually make use of LSN and there isn't a better way, then we can include it. ... We still have many months before even beta for people that want LSN to make a *separate* case for its inclusion as a separate feature. This is good point, we are not too late in the cycle that LSN couldn't be added later if we find that it is indeed needed (and we don't have to care about pg_upgrade until beta). -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
Petr Jelinek wrote: This is good point, we are not too late in the cycle that LSN couldn't be added later if we find that it is indeed needed (and we don't have to care about pg_upgrade until beta). I think we're overblowing the pg_upgrade issue. Surely we don't need to preserve commit_ts data when upgrading across major versions; and pg_upgrade is perfectly prepared to remove old data when upgrading (actually it just doesn't copy it; consider pg_subtrans or pg_serial, for instance.) If we need to change binary representation in a future major release, we can do so without any trouble. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Wed, Nov 19, 2014 at 8:22 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Petr Jelinek wrote: This is good point, we are not too late in the cycle that LSN couldn't be added later if we find that it is indeed needed (and we don't have to care about pg_upgrade until beta). I think we're overblowing the pg_upgrade issue. Surely we don't need to preserve commit_ts data when upgrading across major versions; and pg_upgrade is perfectly prepared to remove old data when upgrading (actually it just doesn't copy it; consider pg_subtrans or pg_serial, for instance.) If we need to change binary representation in a future major release, we can do so without any trouble. Actually, that's a good point. I still don't understand what the resistance is to add something quite inexpensive that multiple people obviously want, but at least if we don't, we still have the option to change it later. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 11/19/2014 08:22 AM, Alvaro Herrera wrote: I think we're overblowing the pg_upgrade issue. Surely we don't need to preserve commit_ts data when upgrading across major versions; and pg_upgrade is perfectly prepared to remove old data when upgrading (actually it just doesn't copy it; consider pg_subtrans or pg_serial, for instance.) If we need to change binary representation in a future major release, we can do so without any trouble. That sounds reasonable. I am okay with Petr removing the LSN portion this patch. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 15/11/14 13:36, Simon Riggs wrote: On 15 November 2014 04:32, Steve Singer st...@ssinger.info wrote: The use cases I'm talking about aren't really replication related. Often I have come across systems that want to do something such as 'select * from orders where X the_last_row_I_saw order by X' and then do further processing on the order. Yes, existing facilities provide mechanisms for different types of application change queues. If you want to write a processing queue in SQL, that isn't the best way. You'll need some way to keep track of whether or not its been successfully processed. That's either a column in the table, or a column in a queue table maintained by triggers, with the row write locked on read. You can then have multiple readers from this queue using the new SKIP LOCKED feature, which was specifically designed to facilitate that. Logical decoding was intended for much more than just replication. It provides commit order access to changed data in a form that is both usable and efficient for high volume applicatiion needs. I don't see any reason to add LSN into a SLRU updated at commit to support those application needs. I am still on the fence about the LSN issue, I don't mind it from code perspective, it's already written anyway, but I am not sure if we really want it in the SLRU as Simon says. Mainly because of three things: One, this patch is not really feature patch, as you can do most of what it does via tables already, but more a performance improvement and we should try to make it perform as good as possible then, adding more things does not really improve performance (according to my benchmarks the performance difference with/without LSN is under 1% so it's not terrible, but it's there), not to mention additional disk space. Two, the LSN use-cases seem to still be only theoretical to me, while the timestamp use-case has been production problem for at least a decade. Three, even if we add LSN, I am still not convinced that the use-cases presented here wouldn't be better served by putting that info into actual table instead of SLRU - as people want to use it as filter in WHERE clause, somebody mentioned exporting to different db, etc. Maybe we need better explanation of the LSN use-case(s) to understand why it should be stored here and why the other solutions are significantly worse. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 15 November 2014 04:32, Steve Singer st...@ssinger.info wrote: The use cases I'm talking about aren't really replication related. Often I have come across systems that want to do something such as 'select * from orders where X the_last_row_I_saw order by X' and then do further processing on the order. Yes, existing facilities provide mechanisms for different types of application change queues. If you want to write a processing queue in SQL, that isn't the best way. You'll need some way to keep track of whether or not its been successfully processed. That's either a column in the table, or a column in a queue table maintained by triggers, with the row write locked on read. You can then have multiple readers from this queue using the new SKIP LOCKED feature, which was specifically designed to facilitate that. Logical decoding was intended for much more than just replication. It provides commit order access to changed data in a form that is both usable and efficient for high volume applicatiion needs. I don't see any reason to add LSN into a SLRU updated at commit to support those application needs. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Thu, Nov 13, 2014 at 6:55 PM, Simon Riggs si...@2ndquadrant.com wrote: On 13 November 2014 21:24, Robert Haas robertmh...@gmail.com wrote: On Thu, Nov 13, 2014 at 8:18 AM, Simon Riggs si...@2ndquadrant.com wrote: Ordering transactions in LSN order is very precisly the remit of the existing logical decoding API. Any user that wishes to see a commits in sequence can do so using that API. BDR already does this, as do other users of the decoding API. So Slony already has access to a useful ordering if it wishes it. We do not need to anything *on this patch* to enable LSNs for Slony or anyone else. I don't see any reason to provide the same facility twice, in two different ways. Perhaps you could respond more specifically to concerns expressed upthread, like: http://www.postgresql.org/message-id/blu436-smtp28b68b9312cbe5d9125441dc...@phx.gbl I don't see that as a dumb argument on the face of it. We have a clear must have argument for timestamps to support replication conflicts. Adding LSNs, when we already have a way to access them, is much more of a nice to have. I don't really see it as a particularly nice to have either, since the SLRU is in no way ordered. Scope creep is a dangerous thing, so we shouldn't, and elsewhere don't, collect up ideas like a bag of mixed sweets. It's easy to overload things to the point where they don't fly at all. The whole point of this is that we're building something faster than trigger based systems. I think that's slamming the door closed and nailing it shut behind you. If we add the feature without LSNs, how will someone go back and add that later? It would change the on-disk format of the SLRU, so to avoid breaking pg_upgrade, someone would have to write a conversion utility. Even at that, it would slow pg_upgrade down when the feature has been used. By way of contrast, adding the feature now is quite easy. It just requires storing a few more bytes per transaction. I am all in favor of incremental development when possible, but not when it so greatly magnifies the work that needs to be done. People have been asking for the ability to determine the commit ordering for years, and this is a way to do that, very inexpensively, as part of a patch that is needed anyway. We are not talking about loading 20 new requirements on top of this patch; that would be intolerable. We're talking about adding one additional piece of information that has been requested multiple times over the years. The way I see it, there are at least three uses for this information. One, trigger-based replication solutions. While logical decoding will doubtless be preferable, I don't think trigger-based replication solutions will go away completely, and certainly not right away. They've wanted this since forever. Two, for user applications that want to know the commit order for their own purposes, as in Steve's example. Three, for O(1) snapshots. Heikki's patch to make that happen seems to have stalled a bit, but if it's ever to go anywhere it will need something like this. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 14 November 2014 17:12, Robert Haas robertmh...@gmail.com wrote: We are not talking about loading 20 new requirements on top of this patch; that would be intolerable. We're talking about adding one additional piece of information that has been requested multiple times over the years. The requested information is already available, as discussed. Logical decoding adds commit ordering for *exactly* the purpose of using it for replication, available to all solutions. This often requested feature has now been added and doesn't need to be added twice. So what we are discussing is adding a completely superfluous piece of information. Not including the LSN info does nothing to trigger based replication, which will no doubt live on happily for many years. But adding LSN will slow down logical replication, for no purpose at all. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 11/14/2014 08:21 PM, Simon Riggs wrote: The requested information is already available, as discussed. Logical decoding adds commit ordering for *exactly* the purpose of using it for replication, available to all solutions. This often requested feature has now been added and doesn't need to be added twice. So what we are discussing is adding a completely superfluous piece of information. Not including the LSN info does nothing to trigger based replication, which will no doubt live on happily for many years. But adding LSN will slow down logical replication, for no purpose at all. Simon, The use cases I'm talking about aren't really replication related. Often I have come across systems that want to do something such as 'select * from orders where X the_last_row_I_saw order by X' and then do further processing on the order. This is kind of awkard to do today because you don't have a good candidate for 'X' to order on. Using either a sequence or insert-row timestamp doesn't work well because a transaction with a lower value for X might end up committing after the higher value in in a query result. Yes you could setup a logical wal slot and listen on the stream of inserts into your order table but thats a lot of administration overhead compared to just issuing an SQL query for what really is a query type operation. Using the commit timestamp for my X sounded very tempting but could allow duplicates. One could argue that this patch is about replication features, and providing commit ordering for query purposes should be a separate patch to add that on top of this infrastructure. I see merit to smaller more focused patches but that requires leaving the door open to easily extending things later. It could also be that I'm the only one who wants to order and filter queries in this manner (but that would surprise me). If the commit lsn has limited appeal and we decide we don't want it at all then we shouldn't add it. I've seen this type of requirement in a number of different systems at a number of different companies. I've generally seen it dealt with by either selecting rows behind the last now() timestamp seen and then filtering out already processed rows or by tracking the 'processed' state of each row individually (ie performing an update on each row once its been processed) which performs poorly. Steve -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 13/11/14 07:04, Michael Paquier wrote: On Wed, Nov 12, 2014 at 10:06 PM, Petr Jelinek p...@2ndquadrant.com wrote: Brief list of changes: - the commit timestamp record now stores timestamp, lsn and nodeid Now that not only the commit timestamp is stored, calling that commit timestamp, committs or commit_timestamp is strange, no? If this patch is moving toward being a more complex information provider, calling it commit information or commit data is more adapted, no? It's not, since adding more info will break upgrades, I doubt we will add more anytime soon. I was thinking about it too tbh, but don't have better name (I don't like commit data as it seems confusing - isn't commit data the dml you just committed?). Maybe commit_metadata, commit_information is probably ok also, this would need input from others, I am personally fine with keeping the commit_timestamp name too. - if the xid passed to get interface is out of range -infinity timestamp is returned (I think it's bad idea to throw errors here as the valid range is not static and same ID can start throwing errors between calls theoretically) Already mentioned by Jim in a previous mail: this would be better as NULL. Yeah that make sense, I have no idea what I was thinking :) - renamed the sql interfaces to pg_xact_commit_timestamp, pg_xact_commit_timestamp_data and pg_last_committed_xact, they don't expose the nodeid atm, I personally am not big fan of the xact but it seems more consistent with existing naming pg_xact_commit_timestamp and pg_xact_commit_timestamp_data are overlapping. What's wrong with a single function able to return the whole set (node ID, commit timetamp, commit LSN)? Let's say That's what pg_xact_commit_timestamp_data does (it does not show nodeid because we agreed that it should not be exposed yet on sql level). Might make sense to rename, but let's wait for input about the general renaming point at the beginning of the mail. pg_xact_commit_information or pg_xact_commit_data. Already mentioned, but I also find using a SRF able to return all the available information from a given XID value quite useful. And this does not conflict with what is proposed currently, you would need just to call the function with XID + number of entries wanted to get a single one. Comments from other folks about that? No idea what you mean by this to be honest, there is exactly one record stored for single XID. Then more input about the latest patch: 1) This block is not needed, option -e is listed twice: The option-o/, option-x/, option-e/, - option-m/, option-O/, - and option-l/ + option-m/, option-O/, option-l/ + and option-e/ 2) Very small thing: a couple of files have no newlines at the end, among them committs.conf and test_committs/Makefile. 3) pg_last_committed_xact and not pg_last_xact_commit_information or similar? Just inspiration from DB2's rollforward (which shows among other things last committed transaction: timestamp), but I don't feel strongly about naming so can be changed. 4) storage.sgml needs to be updated with the new folder pg_committs Right. 5) Er.. node ID is missing in pg_last_committed_xact, no? That's intentional (for now). 6) This XXX notice can be removed: + /* +* Return empty if the requested value is older than what we have or +* newer than newest we have. +* +* XXX: should this be error instead? +*/ Ok. (Note that I am still a partisan of an error message to let the caller know that commit info history does not have the information requested). IMHO throwing error there would be same as throwing error when WHERE clause in SELECT does not match anything. As the xid range for which we store data is dynamic we need to accept any xid as valid input because the caller has no way of validating if the xid passed is out of range or not. 7) Note that TransactionTreeSetCommitTsData still never sets do_xlog at true and that WriteSetTimestampXlogRec never gets called. So no information is WAL-logged with this patch. Wouldn't that be useful for standbys as well? Perhaps I am missing why this is disabled? This code should be activated IMO or it would be just untested. True is only needed here when you are setting this info to different transaction than the one you are in since the info can be reconstructed from normal transaction WAL record (see that it's actually called from xact_redo_commit_internal, which is how we get the WAL safety and why it works on slave). So the true is for use by extensions only, it's not completely uncommon that we have APIs that are used only by extensions. 8) As a more general point, the node ID stuff makes me uncomfortable and is just added on top of the existing patch without much thinking... So I am really skeptical about it. The need here is to pass on demand a int8 that is a node ID that can only be set through a C interface, so only extensions could play with
Re: [HACKERS] tracking commit timestamps
On 9 November 2014 16:57, Steve Singer st...@ssinger.info wrote: On 11/07/2014 07:07 PM, Petr Jelinek wrote: The list of what is useful might be long, but we can't have everything there as there are space constraints, and LSN is another 8 bytes and I still want to have some bytes for storing the origin or whatever you want to call it there, as that's the one I personally have biggest use-case for. So this would be ~24bytes per txid already, hmm I wonder if we can pull some tricks to lower that a bit. The reason why Jim and myself are asking for the LSN and not just the timestamp is that I want to be able to order the transactions. Jim pointed out earlier in the thread that just ordering on timestamp allows for multiple transactions with the same timestamp. I think we need to be clear about the role and function of components here. Xid timestamps allow a replication system to do post-commit conflict resolution based upon timestamp, i.e. last update wins. That is potentially usable by BDR, Slony, xdb and anything else that wants that. Ordering transactions in LSN order is very precisly the remit of the existing logical decoding API. Any user that wishes to see a commits in sequence can do so using that API. BDR already does this, as do other users of the decoding API. So Slony already has access to a useful ordering if it wishes it. We do not need to anything *on this patch* to enable LSNs for Slony or anyone else. I don't see any reason to provide the same facility twice, in two different ways. So in summary... the components are * Commit LSN order is useful for applying changes - available by logical decoding * Commit timestamps and nodeid are useful for conflict resolution - available from this patch Both components have been designed in ways that allow multiple replication systems to use these facilities. So, -1 to including commit LSN in the SLRU alongside commit timestamp and nodeid. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 13/11/14 14:18, Simon Riggs wrote: So in summary... the components are * Commit LSN order is useful for applying changes - available by logical decoding * Commit timestamps and nodeid are useful for conflict resolution - available from this patch Both components have been designed in ways that allow multiple replication systems to use these facilities. So, -1 to including commit LSN in the SLRU alongside commit timestamp and nodeid. I am of the same opinion, I added the LSN by popular demand, but I still personally don't see the value in having it there as it does *not* enable us to do something that would be impossible otherwise. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Thu, Nov 13, 2014 at 8:18 AM, Simon Riggs si...@2ndquadrant.com wrote: Ordering transactions in LSN order is very precisly the remit of the existing logical decoding API. Any user that wishes to see a commits in sequence can do so using that API. BDR already does this, as do other users of the decoding API. So Slony already has access to a useful ordering if it wishes it. We do not need to anything *on this patch* to enable LSNs for Slony or anyone else. I don't see any reason to provide the same facility twice, in two different ways. Perhaps you could respond more specifically to concerns expressed upthread, like: http://www.postgresql.org/message-id/blu436-smtp28b68b9312cbe5d9125441dc...@phx.gbl I don't see that as a dumb argument on the face of it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 13 November 2014 21:24, Robert Haas robertmh...@gmail.com wrote: On Thu, Nov 13, 2014 at 8:18 AM, Simon Riggs si...@2ndquadrant.com wrote: Ordering transactions in LSN order is very precisly the remit of the existing logical decoding API. Any user that wishes to see a commits in sequence can do so using that API. BDR already does this, as do other users of the decoding API. So Slony already has access to a useful ordering if it wishes it. We do not need to anything *on this patch* to enable LSNs for Slony or anyone else. I don't see any reason to provide the same facility twice, in two different ways. Perhaps you could respond more specifically to concerns expressed upthread, like: http://www.postgresql.org/message-id/blu436-smtp28b68b9312cbe5d9125441dc...@phx.gbl I don't see that as a dumb argument on the face of it. We have a clear must have argument for timestamps to support replication conflicts. Adding LSNs, when we already have a way to access them, is much more of a nice to have. I don't really see it as a particularly nice to have either, since the SLRU is in no way ordered. Scope creep is a dangerous thing, so we shouldn't, and elsewhere don't, collect up ideas like a bag of mixed sweets. It's easy to overload things to the point where they don't fly at all. The whole point of this is that we're building something faster than trigger based systems. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 10/11/14 14:53, Robert Haas wrote: On Mon, Nov 10, 2014 at 8:39 AM, Petr Jelinek p...@2ndquadrant.com wrote: I did the calculation above wrong btw, it's actually 20 bytes not 24 bytes per record, I am inclined to just say we can live with that. If you do it as 20 bytes, you'll have to do some work to squeeze out the alignment padding. I'm inclined to think it's fine to have a few extra padding bytes here; someone might want to use those for something in the future, and they probably don't cost much. I did get around the alignment via memcpy, so it is still 20bytes. Since we agreed that the (B) case is not really feasible and we are doing the (C), I also wonder if extradata should be renamed to nodeid (even if it's not used at this point as nodeid). And then there is question about the size of it, since the nodeid itself can live with 2 bytes probably (64k of nodes ought to be enough for everybody ;) ). Or leave the extradata as is but use as reserved space for future use and not expose it at this time on SQL level at all? I vote for calling it node-ID, and for allowing at least 4 bytes for it. Penny wise, pound foolish. Ok, I went this way. Anyway here is v8 version of the patch, I think I addressed all the concerns mentioned, it's also rebased against current master (BRIN commit added some conflicts). Brief list of changes: - the commit timestamp record now stores timestamp, lsn and nodeid - added code to disallow turning track_commit_timestamp on with too small pagesize - the get interfaces error out when track_commit_timestamp is off - if the xid passed to get interface is out of range -infinity timestamp is returned (I think it's bad idea to throw errors here as the valid range is not static and same ID can start throwing errors between calls theoretically) - renamed the sql interfaces to pg_xact_commit_timestamp, pg_xact_commit_timestamp_data and pg_last_committed_xact, they don't expose the nodeid atm, I personally am not big fan of the xact but it seems more consistent with existing naming - documented pg_resetxlog changes and make all the pg_resetxlog options alphabetically ordered - committs is not used anymore, it's commit_ts (and CommitTs in camelcase), I think it's not really good idea to spell the timestamp everywhere as some interface then get 30+ chars long names... - added WAL logging of the track_commit_timestamp GUC - added alternative expected output of the regression test so that it works with make installcheck when track_commit_timestamp is on - added C interface to set default nodeid for current backend - several minor comment and naming adjustments mostly suggested by Michael -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services diff --git a/contrib/Makefile b/contrib/Makefile index b37d0dd..e331297 100644 --- a/contrib/Makefile +++ b/contrib/Makefile @@ -50,6 +50,7 @@ SUBDIRS = \ spi \ tablefunc \ tcn \ + test_committs \ test_decoding \ test_parser \ test_shm_mq \ diff --git a/contrib/pg_upgrade/pg_upgrade.c b/contrib/pg_upgrade/pg_upgrade.c index 3b8241b..f0a023f 100644 --- a/contrib/pg_upgrade/pg_upgrade.c +++ b/contrib/pg_upgrade/pg_upgrade.c @@ -423,8 +423,10 @@ copy_clog_xlog_xid(void) /* set the next transaction id and epoch of the new cluster */ prep_status(Setting next transaction ID and epoch for new cluster); exec_prog(UTILITY_LOG_FILE, NULL, true, - \%s/pg_resetxlog\ -f -x %u \%s\, - new_cluster.bindir, old_cluster.controldata.chkpnt_nxtxid, + \%s/pg_resetxlog\ -f -x %u -c %u \%s\, + new_cluster.bindir, + old_cluster.controldata.chkpnt_nxtxid, + old_cluster.controldata.chkpnt_nxtxid, new_cluster.pgdata); exec_prog(UTILITY_LOG_FILE, NULL, true, \%s/pg_resetxlog\ -f -e %u \%s\, diff --git a/contrib/pg_xlogdump/rmgrdesc.c b/contrib/pg_xlogdump/rmgrdesc.c index 9397198..e0af3cf 100644 --- a/contrib/pg_xlogdump/rmgrdesc.c +++ b/contrib/pg_xlogdump/rmgrdesc.c @@ -10,6 +10,7 @@ #include access/brin_xlog.h #include access/clog.h +#include access/committs.h #include access/gin.h #include access/gist_private.h #include access/hash.h diff --git a/contrib/test_committs/.gitignore b/contrib/test_committs/.gitignore new file mode 100644 index 000..1f95503 --- /dev/null +++ b/contrib/test_committs/.gitignore @@ -0,0 +1,5 @@ +# Generated subdirectories +/log/ +/isolation_output/ +/regression_output/ +/tmp_check/ diff --git a/contrib/test_committs/Makefile b/contrib/test_committs/Makefile new file mode 100644 index 000..2240749 --- /dev/null +++ b/contrib/test_committs/Makefile @@ -0,0 +1,45 @@ +# Note: because we don't tell the Makefile there are any regression tests, +# we have to clean those result files explicitly +EXTRA_CLEAN = $(pg_regress_clean_files) ./regression_output ./isolation_output + +ifdef USE_PGXS +PG_CONFIG = pg_config +PGXS := $(shell $(PG_CONFIG) --pgxs) +include $(PGXS)
Re: [HACKERS] tracking commit timestamps
On 11/12/14, 7:06 AM, Petr Jelinek wrote: - if the xid passed to get interface is out of range -infinity timestamp is returned (I think it's bad idea to throw errors here as the valid range is not static and same ID can start throwing errors between calls theoretically) Wouldn't NULL be more appropriate? -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Thu, Nov 13, 2014 at 7:56 AM, Jim Nasby jim.na...@bluetreble.com wrote: On 11/12/14, 7:06 AM, Petr Jelinek wrote: - if the xid passed to get interface is out of range -infinity timestamp is returned (I think it's bad idea to throw errors here as the valid range is not static and same ID can start throwing errors between calls theoretically) Wouldn't NULL be more appropriate? Definitely. Defining a given value for information not valid is awkward. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Wed, Nov 12, 2014 at 10:06 PM, Petr Jelinek p...@2ndquadrant.com wrote: Brief list of changes: - the commit timestamp record now stores timestamp, lsn and nodeid Now that not only the commit timestamp is stored, calling that commit timestamp, committs or commit_timestamp is strange, no? If this patch is moving toward being a more complex information provider, calling it commit information or commit data is more adapted, no? Documentation would need a fresh brush as well in this case. - added code to disallow turning track_commit_timestamp on with too small pagesize - the get interfaces error out when track_commit_timestamp is off OK, that's sane. - if the xid passed to get interface is out of range -infinity timestamp is returned (I think it's bad idea to throw errors here as the valid range is not static and same ID can start throwing errors between calls theoretically) Already mentioned by Jim in a previous mail: this would be better as NULL. - renamed the sql interfaces to pg_xact_commit_timestamp, pg_xact_commit_timestamp_data and pg_last_committed_xact, they don't expose the nodeid atm, I personally am not big fan of the xact but it seems more consistent with existing naming pg_xact_commit_timestamp and pg_xact_commit_timestamp_data are overlapping. What's wrong with a single function able to return the whole set (node ID, commit timetamp, commit LSN)? Let's say pg_xact_commit_information or pg_xact_commit_data. Already mentioned, but I also find using a SRF able to return all the available information from a given XID value quite useful. And this does not conflict with what is proposed currently, you would need just to call the function with XID + number of entries wanted to get a single one. Comments from other folks about that? - documented pg_resetxlog changes and make all the pg_resetxlog options alphabetically ordered - added WAL logging of the track_commit_timestamp GUC - added alternative expected output of the regression test so that it works with make installcheck when track_commit_timestamp is on - added C interface to set default nodeid for current backend - several minor comment and naming adjustments mostly suggested by Michael Thanks for those adjustments. Then more input about the latest patch: 1) This block is not needed, option -e is listed twice: The option-o/, option-x/, option-e/, - option-m/, option-O/, - and option-l/ + option-m/, option-O/, option-l/ + and option-e/ 2) Very small thing: a couple of files have no newlines at the end, among them committs.conf and test_committs/Makefile. 3) pg_last_committed_xact and not pg_last_xact_commit_information or similar? 4) storage.sgml needs to be updated with the new folder pg_committs 5) Er.. node ID is missing in pg_last_committed_xact, no? 6) This XXX notice can be removed: + /* +* Return empty if the requested value is older than what we have or +* newer than newest we have. +* +* XXX: should this be error instead? +*/ We are moving toward returning invalid information in the SQL interface when the information is not in history instead of an error, no? (Note that I am still a partisan of an error message to let the caller know that commit info history does not have the information requested). 7) Note that TransactionTreeSetCommitTsData still never sets do_xlog at true and that WriteSetTimestampXlogRec never gets called. So no information is WAL-logged with this patch. Wouldn't that be useful for standbys as well? Perhaps I am missing why this is disabled? This code should be activated IMO or it would be just untested. 8) As a more general point, the node ID stuff makes me uncomfortable and is just added on top of the existing patch without much thinking... So I am really skeptical about it. The need here is to pass on demand a int8 that is a node ID that can only be set through a C interface, so only extensions could play with it. The data passed to a WAL record is always built and determined by the system and entirely transparent to the user, inserting user-defined data like that inconsistent with what we've been doing until now, no? Also, a question particularly for BDR and Slony folks: do you sometimes add a new node using the base backup of an existing node :) See what I come up with: a duplication of this new node ID system with the already present system ID, no? Similarly, the LSN is added to the WAL record containing the commit timestamp, but cannot the LSN of the WAL record containing the commit timestamp itself be used as a point of reference for a better ordering? That's not exactly the same as the LSN of the transaction commit, still it provides a WAL-based reference. Regards, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 4 November 2014 08:23, Andres Freund and...@2ndquadrant.com wrote: 6) Shouldn't any value update of track_commit_timestamp be tracked in XLogReportParameters? That's thinking about making the commit timestamp available on standbys as well.. Yes, it should. Agree committs should be able to run on standby, but it seems possible to do that without it running on the master. The two should be unconnected. Not sure why we'd want to have parameter changes on master reported? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 2014-11-11 16:10:47 +, Simon Riggs wrote: On 4 November 2014 08:23, Andres Freund and...@2ndquadrant.com wrote: 6) Shouldn't any value update of track_commit_timestamp be tracked in XLogReportParameters? That's thinking about making the commit timestamp available on standbys as well.. Yes, it should. Agree committs should be able to run on standby, but it seems possible to do that without it running on the master. I don't think that's realistic. It requires WAL to be written in some cases, so that's not going to work. I also don't think it's a particularly interesting ability? The two should be unconnected. Why? Not sure why we'd want to have parameter changes on master reported? So it works correctly. We're currently truncating the slru on startup when the guc is disabled which would cause problems WAL records coming in from the primary. I think the code also needs some TLC to correctly work after a failover. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 9 November 2014 16:57, Steve Singer st...@ssinger.info wrote: On 11/07/2014 07:07 PM, Petr Jelinek wrote: The list of what is useful might be long, but we can't have everything there as there are space constraints, and LSN is another 8 bytes and I still want to have some bytes for storing the origin or whatever you want to call it there, as that's the one I personally have biggest use-case for. So this would be ~24bytes per txid already, hmm I wonder if we can pull some tricks to lower that a bit. The reason why Jim and myself are asking for the LSN and not just the timestamp is that I want to be able to order the transactions. Jim pointed out earlier in the thread that just ordering on timestamp allows for multiple transactions with the same timestamp. Maybe we don't need the entire LSN to solve that. If you already have the commit timestamp maybe you only need another byte or two of granularity to order transactions that are within the same microsecond. It looks like there are quite a few potential uses for this. If we include everything it will be too fat to use for any of the potential uses, since each will be pulled down by the others. Sounds like it needs to be configurable in some way. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 11 November 2014 16:19, Andres Freund and...@2ndquadrant.com wrote: On 2014-11-11 16:10:47 +, Simon Riggs wrote: On 4 November 2014 08:23, Andres Freund and...@2ndquadrant.com wrote: 6) Shouldn't any value update of track_commit_timestamp be tracked in XLogReportParameters? That's thinking about making the commit timestamp available on standbys as well.. Yes, it should. Agree committs should be able to run on standby, but it seems possible to do that without it running on the master. I don't think that's realistic. It requires WAL to be written in some cases, so that's not going to work. I also don't think it's a particularly interesting ability? OK, so we are saying commit timestamp will NOT be available on Standbys. I'm fine with that, since data changes aren't generated there. So it works correctly. We're currently truncating the slru on startup when the guc is disabled which would cause problems WAL records coming in from the primary. I think the code also needs some TLC to correctly work after a failover. OK -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 2014-11-11 17:09:54 +, Simon Riggs wrote: On 11 November 2014 16:19, Andres Freund and...@2ndquadrant.com wrote: On 2014-11-11 16:10:47 +, Simon Riggs wrote: On 4 November 2014 08:23, Andres Freund and...@2ndquadrant.com wrote: 6) Shouldn't any value update of track_commit_timestamp be tracked in XLogReportParameters? That's thinking about making the commit timestamp available on standbys as well.. Yes, it should. Agree committs should be able to run on standby, but it seems possible to do that without it running on the master. I don't think that's realistic. It requires WAL to be written in some cases, so that's not going to work. I also don't think it's a particularly interesting ability? OK, so we are saying commit timestamp will NOT be available on Standbys. Hm? They should be available - xact.c WAL replay will redo the setting of the timestamps and explicitly overwritten timestamps will generate their own WAL records. What I mean is just that you can't use commit timestamps without also using it on the primary. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 11/10/14, 7:40 AM, Alvaro Herrera wrote: Robert Haas wrote: On Sun, Nov 9, 2014 at 8:41 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Robert Haas wrote: I think the key question here is the time for which the data needs to be retained. 2^32 of anything is a lot, but why keep around that number of records rather than more (after all, we have epochs to distinguish one use of a given txid from another) or fewer? The problem is not how much data we retain; is about how much data we can address. I thought I was responding to a concern about disk space utilization. Ah, right. So AFAIK we don't need to keep anything older than RecentXmin or something like that -- which is not too old. If I recall correctly Josh Berkus was saying in a thread about pg_multixact that it used about 128kB or so in = 9.2 for his customers; that one was also limited to RecentXmin AFAIR. I think a similar volume of commit_ts data would be pretty acceptable. Moreso considering that it's turned off by default. FWIW, AFAICS MultiXacts are only truncated after a (auto)vacuum process is able to advance datminmxid, which will (now) only happen when an entire relation has been scanned (which should be infrequent). I believe the low normal space usage is just an indication that most databases don't use many MultiXacts. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
Jim Nasby wrote: On 11/10/14, 7:40 AM, Alvaro Herrera wrote: Ah, right. So AFAIK we don't need to keep anything older than RecentXmin or something like that -- which is not too old. If I recall correctly Josh Berkus was saying in a thread about pg_multixact that it used about 128kB or so in = 9.2 for his customers; that one was also limited to RecentXmin AFAIR. I think a similar volume of commit_ts data would be pretty acceptable. Moreso considering that it's turned off by default. FWIW, AFAICS MultiXacts are only truncated after a (auto)vacuum process is able to advance datminmxid, which will (now) only happen when an entire relation has been scanned (which should be infrequent). I believe the low normal space usage is just an indication that most databases don't use many MultiXacts. That's in 9.3. Prior to that, they were truncated much more often. Maybe you've not heard enough about this commit: commit 0ac5ad5134f2769ccbaefec73844f8504c4d6182 Author: Alvaro Herrera alvhe...@alvh.no-ip.org Date: Wed Jan 23 12:04:59 2013 -0300 Improve concurrency of foreign key locking -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 11/11/14, 2:03 PM, Alvaro Herrera wrote: Jim Nasby wrote: On 11/10/14, 7:40 AM, Alvaro Herrera wrote: Ah, right. So AFAIK we don't need to keep anything older than RecentXmin or something like that -- which is not too old. If I recall correctly Josh Berkus was saying in a thread about pg_multixact that it used about 128kB or so in = 9.2 for his customers; that one was also limited to RecentXmin AFAIR. I think a similar volume of commit_ts data would be pretty acceptable. Moreso considering that it's turned off by default. FWIW, AFAICS MultiXacts are only truncated after a (auto)vacuum process is able to advance datminmxid, which will (now) only happen when an entire relation has been scanned (which should be infrequent). I believe the low normal space usage is just an indication that most databases don't use many MultiXacts. That's in 9.3. Prior to that, they were truncated much more often. Well, we're talking about a new feature, so I wasn't looking in back branches. ;P Maybe you've not heard enough about this commit: commit 0ac5ad5134f2769ccbaefec73844f8504c4d6182 Interestingly, git.postgresql.org hasn't either: http://git.postgresql.org/gitweb/?p=postgresql.gita=searchh=HEADst=commits=0ac5ad5134f2769ccbaefec73844f8504c4d6182 The commit is certainly there though... decibel@decina:[15:12]~/pgsql/HEAD/src/backend (master=)$git log 0ac5ad5134f2769ccbaefec73844f8504c4d6182|head -n1 commit 0ac5ad5134f2769ccbaefec73844f8504c4d6182 -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
Jim Nasby wrote: On 11/11/14, 2:03 PM, Alvaro Herrera wrote: Jim Nasby wrote: On 11/10/14, 7:40 AM, Alvaro Herrera wrote: Ah, right. So AFAIK we don't need to keep anything older than RecentXmin or something like that -- which is not too old. If I recall correctly Josh Berkus was saying in a thread about pg_multixact that it used about 128kB or so in = 9.2 for his customers; that one was also limited to RecentXmin AFAIR. I think a similar volume of commit_ts data would be pretty acceptable. Moreso considering that it's turned off by default. FWIW, AFAICS MultiXacts are only truncated after a (auto)vacuum process is able to advance datminmxid, which will (now) only happen when an entire relation has been scanned (which should be infrequent). I believe the low normal space usage is just an indication that most databases don't use many MultiXacts. That's in 9.3. Prior to that, they were truncated much more often. Well, we're talking about a new feature, so I wasn't looking in back branches. ;P Well, I did mention = 9.2 in the text above ... -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Sun, Nov 9, 2014 at 8:41 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Robert Haas wrote: I think the key question here is the time for which the data needs to be retained. 2^32 of anything is a lot, but why keep around that number of records rather than more (after all, we have epochs to distinguish one use of a given txid from another) or fewer? The problem is not how much data we retain; is about how much data we can address. I thought I was responding to a concern about disk space utilization. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 10/11/14 08:01, Anssi Kääriäinen wrote: On Sun, 2014-11-09 at 11:57 -0500, Steve Singer wrote: The reason why Jim and myself are asking for the LSN and not just the timestamp is that I want to be able to order the transactions. Jim pointed out earlier in the thread that just ordering on timestamp allows for multiple transactions with the same timestamp. Maybe we don't need the entire LSN to solve that. If you already have the commit timestamp maybe you only need another byte or two of granularity to order transactions that are within the same microsecond. There is no guarantee that a commit with later LSN has a later timestamp. There are cases where the clock could move significantly backwards. A robust solution to storing transaction commit information (including commit order) in a way that can be referenced from other tables, can be loaded to another cluster, and survives crashes would be a great feature. But this feature doesn't have those properties. It has the property of surviving crashes. Not sure what you mean by referencing from other tables? And about loading to another cluster, the txid does not really have any meaning on another cluster, so the info about it does not have either? But anyway this patch is targeting extensions not DBAs - you could write extension that will provide that feature on top of this patch (although given what I wrote above I don't see how it's useful). -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Mon, Nov 10, 2014 at 2:01 AM, Anssi Kääriäinen anssi.kaariai...@thl.fi wrote: There is no guarantee that a commit with later LSN has a later timestamp. There are cases where the clock could move significantly backwards. Good point. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 09/11/14 17:57, Steve Singer wrote: On 11/07/2014 07:07 PM, Petr Jelinek wrote: The list of what is useful might be long, but we can't have everything there as there are space constraints, and LSN is another 8 bytes and I still want to have some bytes for storing the origin or whatever you want to call it there, as that's the one I personally have biggest use-case for. So this would be ~24bytes per txid already, hmm I wonder if we can pull some tricks to lower that a bit. The reason why Jim and myself are asking for the LSN and not just the timestamp is that I want to be able to order the transactions. Jim pointed out earlier in the thread that just ordering on timestamp allows for multiple transactions with the same timestamp. Maybe we don't need the entire LSN to solve that. If you already have the commit timestamp maybe you only need another byte or two of granularity to order transactions that are within the same microsecond. Hmm maybe just one part of LSN, but I don't really like that either, if we want to store LSN we should probably store it as is as somebody might want to map it to txid for other reasons. I did the calculation above wrong btw, it's actually 20 bytes not 24 bytes per record, I am inclined to just say we can live with that. Since we agreed that the (B) case is not really feasible and we are doing the (C), I also wonder if extradata should be renamed to nodeid (even if it's not used at this point as nodeid). And then there is question about the size of it, since the nodeid itself can live with 2 bytes probably (64k of nodes ought to be enough for everybody ;) ). Or leave the extradata as is but use as reserved space for future use and not expose it at this time on SQL level at all? I guess Andres could answer what suits him better here. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
Robert Haas wrote: On Sun, Nov 9, 2014 at 8:41 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Robert Haas wrote: I think the key question here is the time for which the data needs to be retained. 2^32 of anything is a lot, but why keep around that number of records rather than more (after all, we have epochs to distinguish one use of a given txid from another) or fewer? The problem is not how much data we retain; is about how much data we can address. I thought I was responding to a concern about disk space utilization. Ah, right. So AFAIK we don't need to keep anything older than RecentXmin or something like that -- which is not too old. If I recall correctly Josh Berkus was saying in a thread about pg_multixact that it used about 128kB or so in = 9.2 for his customers; that one was also limited to RecentXmin AFAIR. I think a similar volume of commit_ts data would be pretty acceptable. Moreso considering that it's turned off by default. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Mon, Nov 10, 2014 at 8:39 AM, Petr Jelinek p...@2ndquadrant.com wrote: I did the calculation above wrong btw, it's actually 20 bytes not 24 bytes per record, I am inclined to just say we can live with that. If you do it as 20 bytes, you'll have to do some work to squeeze out the alignment padding. I'm inclined to think it's fine to have a few extra padding bytes here; someone might want to use those for something in the future, and they probably don't cost much. Since we agreed that the (B) case is not really feasible and we are doing the (C), I also wonder if extradata should be renamed to nodeid (even if it's not used at this point as nodeid). And then there is question about the size of it, since the nodeid itself can live with 2 bytes probably (64k of nodes ought to be enough for everybody ;) ). Or leave the extradata as is but use as reserved space for future use and not expose it at this time on SQL level at all? I vote for calling it node-ID, and for allowing at least 4 bytes for it. Penny wise, pound foolish. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Mon, Nov 10, 2014 at 8:40 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Ah, right. So AFAIK we don't need to keep anything older than RecentXmin or something like that -- which is not too old. If I recall correctly Josh Berkus was saying in a thread about pg_multixact that it used about 128kB or so in = 9.2 for his customers; that one was also limited to RecentXmin AFAIR. I think a similar volume of commit_ts data would be pretty acceptable. Moreso considering that it's turned off by default. I'm not sure whether keeping it just back to RecentXmin will be enough for everybody's needs. But we certainly don't need to keep the last 2^32 records as someone-or-other was suggesting. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 11/10/2014 08:39 AM, Petr Jelinek wrote: On 09/11/14 17:57, Steve Singer wrote: On 11/07/2014 07:07 PM, Petr Jelinek wrote: The list of what is useful might be long, but we can't have everything there as there are space constraints, and LSN is another 8 bytes and I still want to have some bytes for storing the origin or whatever you want to call it there, as that's the one I personally have biggest use-case for. So this would be ~24bytes per txid already, hmm I wonder if we can pull some tricks to lower that a bit. The reason why Jim and myself are asking for the LSN and not just the timestamp is that I want to be able to order the transactions. Jim pointed out earlier in the thread that just ordering on timestamp allows for multiple transactions with the same timestamp. Maybe we don't need the entire LSN to solve that. If you already have the commit timestamp maybe you only need another byte or two of granularity to order transactions that are within the same microsecond. Hmm maybe just one part of LSN, but I don't really like that either, if we want to store LSN we should probably store it as is as somebody might want to map it to txid for other reasons. I did the calculation above wrong btw, it's actually 20 bytes not 24 bytes per record, I am inclined to just say we can live with that. Since we agreed that the (B) case is not really feasible and we are doing the (C), I also wonder if extradata should be renamed to nodeid (even if it's not used at this point as nodeid). And then there is question about the size of it, since the nodeid itself can live with 2 bytes probably (64k of nodes ought to be enough for everybody ;) ). Or leave the extradata as is but use as reserved space for future use and not expose it at this time on SQL level at all? I guess Andres could answer what suits him better here. I am happy with renaming extradata to nodeid and not exposing it at this time. If we feel that commit-order (ie LSN or something equivalent) is really a different patch/feature than commit-timestamp then I am okay with that also but we should make sure to warn users of the commit-timestamp in the documentation that two transactions might have the same timestamp and that the commit order might not be the same as ordering by the commit timestamp. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 11/07/2014 07:07 PM, Petr Jelinek wrote: The list of what is useful might be long, but we can't have everything there as there are space constraints, and LSN is another 8 bytes and I still want to have some bytes for storing the origin or whatever you want to call it there, as that's the one I personally have biggest use-case for. So this would be ~24bytes per txid already, hmm I wonder if we can pull some tricks to lower that a bit. The reason why Jim and myself are asking for the LSN and not just the timestamp is that I want to be able to order the transactions. Jim pointed out earlier in the thread that just ordering on timestamp allows for multiple transactions with the same timestamp. Maybe we don't need the entire LSN to solve that. If you already have the commit timestamp maybe you only need another byte or two of granularity to order transactions that are within the same microsecond. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
Robert Haas wrote: I think the key question here is the time for which the data needs to be retained. 2^32 of anything is a lot, but why keep around that number of records rather than more (after all, we have epochs to distinguish one use of a given txid from another) or fewer? The problem is not how much data we retain; is about how much data we can address. We must be able to address the data for transaction with xid=2^32-1, even if you only retain the 1000 most recent transactions. In fact, we already only retain data back to RecentXmin, if I recall correctly. All slru.c users work that way. Back when pg_multixact/members had the 5-char issue, I came up with a patch that had each slru.c user declare how many chars maximum were the filenames. I didn't push further with that because there was an issue with it, I don't remember what it was offhand (but I don't think I posted it). But this is only needed so that the filenames are all equal width, which is mostly cosmetical; the rest of the module works fine with 4- or 5-char filenames, and can be trivially expanded to support 6 or more. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Sun, 2014-11-09 at 11:57 -0500, Steve Singer wrote: The reason why Jim and myself are asking for the LSN and not just the timestamp is that I want to be able to order the transactions. Jim pointed out earlier in the thread that just ordering on timestamp allows for multiple transactions with the same timestamp. Maybe we don't need the entire LSN to solve that. If you already have the commit timestamp maybe you only need another byte or two of granularity to order transactions that are within the same microsecond. There is no guarantee that a commit with later LSN has a later timestamp. There are cases where the clock could move significantly backwards. A robust solution to storing transaction commit information (including commit order) in a way that can be referenced from other tables, can be loaded to another cluster, and survives crashes would be a great feature. But this feature doesn't have those properties. - Anssi -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 08/11/14 03:05, Robert Haas wrote: On Fri, Nov 7, 2014 at 7:07 PM, Petr Jelinek p...@2ndquadrant.com wrote: but we can't have everything there as there are space constraints, and LSN is another 8 bytes and I still want to have some bytes for storing the origin or whatever you want to call it there, as that's the one I personally have biggest use-case for. So this would be ~24bytes per txid already, hmm I wonder if we can pull some tricks to lower that a bit. It won't do to say let's do the things that I want, and foreclose forever the things that other people want. I find it quite hard to believe that 16 bytes per transaction is a perfectly tolerable overhead but 24 bytes per transaction will break the bank. But if that is really true then we ought to reject this patch altogether, because it's unacceptable, in any arena, for a patch that only benefits extensions to consume all of the available bit-space in, leaving none for future core needs. That's not what I said. I am actually ok with adding the LSN if people see it useful. I was just wondering if we can make the record smaller somehow - 24bytes per txid is around 96GB of data for whole txid range and won't work with pages smaller than ~4kBs unless we add 6 char support to SLRU (which is not hard and we could also not allow track_commit_timestamps to be turned on with smaller pagesize...). I remember somebody was worried about this already during the original patch submission and it can't be completely ignored in the discussion about adding more stuff into the record. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Sat, Nov 8, 2014 at 5:35 AM, Petr Jelinek p...@2ndquadrant.com wrote: That's not what I said. I am actually ok with adding the LSN if people see it useful. I was just wondering if we can make the record smaller somehow - 24bytes per txid is around 96GB of data for whole txid range and won't work with pages smaller than ~4kBs unless we add 6 char support to SLRU (which is not hard and we could also not allow track_commit_timestamps to be turned on with smaller pagesize...). I remember somebody was worried about this already during the original patch submission and it can't be completely ignored in the discussion about adding more stuff into the record. Fair point. Sorry I misunderstood. I think the key question here is the time for which the data needs to be retained. 2^32 of anything is a lot, but why keep around that number of records rather than more (after all, we have epochs to distinguish one use of a given txid from another) or fewer? Obvious alternatives include: - Keep the data for some period of time; discard the data when it's older than some threshold. - Keep a certain amount of total data; every time we create a new file, discard the oldest one. - Let consumers of the data say how much they need, and throw away data when it's no longer needed by the oldest consumer. - Some combination of the above. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 06/11/14 08:50, Andres Freund wrote: On 2014-11-05 19:31:52 -0500, Steve Singer wrote: It also doesn't allow you to load two extensions at once on a system. You wouldn't be able to have both the 'steve_commit_order' extension and BDR installed at the same time. I don't think this patch does a very good job at (B) but It wasn't intended to. Well, I don't agree that steve_commit_order makes much sense in this context. But there actually is a real problem here, namely that there's no namespacing in those bytes. I'd be ok with saying that we split the extradata in for bytes for the namespace and four for the actual data. That's roughly how it works for advisory locks already. I am not sure how would this solve problem of 2 extensions using the extradata given that there can be only 1 record per txid anyway. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 06/11/14 01:31, Steve Singer wrote: On 11/05/2014 05:43 PM, Andres Freund wrote: Is this patch supposed to: A) Add commit timestamp tracking but nothing more B) Add infrastructure to store commit timestamps and provide a facility for storing additional bits of data extensions might want to be associated with the commit C). Add commit timestamps and node identifiers to commits If the answer is (A) then I can see why some people are objecting to adding extradata.If the answer is (B) then it's fair to ask how well does this patch handle various types of things people might want to attach to the commit record (such as the LSN). I think the problem is that once you start providing a facility extensions can use to store data along with the commit record then being restricted to 4 or 8 bytes is very limiting. It also doesn't allow you to load two extensions at once on a system. You wouldn't be able to have both the 'steve_commit_order' extension and BDR installed at the same time. I don't think this patch does a very good job at (B) but It wasn't intended to. I would love to have (B) but I don't think that's realistic, at least not in the extent some people on this thread would like. I mean you can already do (B) by using table, it just isn't that great when it comes to performance of that solution. This patch is aimed to do limited version of (B) where you don't have dynamic record for storing whatever you might desire but on the upside the performance is good. And yes so far this look more like we are actually doing (C) since main purpose of the patch is enabling conflict detection and resolving of those conflicts, which is useful in many replication scenarios that are not limited to the classical multi-master setup. If what we are really doing is C, and just calling the space 'extradata' until we get the logical identifier stuff in and then we are going to rename extradata to nodeid then we should say so. If we are really concerned about the pg_upgrade impact of expanding the record later then maybe we should add 4 bytes of padding to the CommitTimeStampEntry now and but leave the manipulating the node id until later. This might not be bad idea. I don't see the extradata being useful for multiple extensions at the same time given that there is single record per txid unless we would enforce some kind of limitation that extension can only set the extradata for txids produced by that extension. The namespacing idea that Andres has would again work fine for various replication solutions as it would make it easier for them to coexist but it would still not work for your 'steve_commit_order' (which I also think should be done differently anyway). In general I do see this patch to be similar in purpose to what we did with replica triggers or logical decoding, those features also didn't really have in-core use, were optional and enabled us to take step forward with replication and had some side uses besides replication just like commit timestamps do. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 2014-11-07 17:54:32 +0100, Petr Jelinek wrote: On 06/11/14 08:50, Andres Freund wrote: On 2014-11-05 19:31:52 -0500, Steve Singer wrote: It also doesn't allow you to load two extensions at once on a system. You wouldn't be able to have both the 'steve_commit_order' extension and BDR installed at the same time. I don't think this patch does a very good job at (B) but It wasn't intended to. Well, I don't agree that steve_commit_order makes much sense in this context. But there actually is a real problem here, namely that there's no namespacing in those bytes. I'd be ok with saying that we split the extradata in for bytes for the namespace and four for the actual data. That's roughly how it works for advisory locks already. I am not sure how would this solve problem of 2 extensions using the extradata given that there can be only 1 record per txid anyway. It'd help you to detect problems. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Nov 5, 2014, at 7:31 PM, Steve Singer st...@ssinger.info wrote: On 11/05/2014 05:43 PM, Andres Freund wrote: On 2014-11-05 17:17:05 -0500, Steve Singer wrote: Imo that's essentially a different feature. What you essentially would need here is a 'commit sequence number' - but no timestamps. And probably to be useful that number has to be 8 bytes in itself. I think this gets to the heart of some of the differing views people have expressed on this patch Is this patch supposed to: A) Add commit timestamp tracking but nothing more B) Add infrastructure to store commit timestamps and provide a facility for storing additional bits of data extensions might want to be associated with the commit C). Add commit timestamps and node identifiers to commits Well put. I think the authors of this patch are suffering from a certain amount of myopia. Commit timestamps are useful, but so are commit LSNs, and it makes little sense to me to suppose that we should have two different systems for those closely-related needs. Like Andres, I think B is impractical, so let's just be honest and admit that C is what we're really doing. But let's add LSNs so the people who want that can be happy too. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 08/11/14 00:35, Robert Haas wrote: On Nov 5, 2014, at 7:31 PM, Steve Singer st...@ssinger.info wrote: On 11/05/2014 05:43 PM, Andres Freund wrote: On 2014-11-05 17:17:05 -0500, Steve Singer wrote: Imo that's essentially a different feature. What you essentially would need here is a 'commit sequence number' - but no timestamps. And probably to be useful that number has to be 8 bytes in itself. I think this gets to the heart of some of the differing views people have expressed on this patch Is this patch supposed to: A) Add commit timestamp tracking but nothing more B) Add infrastructure to store commit timestamps and provide a facility for storing additional bits of data extensions might want to be associated with the commit C). Add commit timestamps and node identifiers to commits Well put. I think the authors of this patch are suffering from a certain amount of myopia. Commit timestamps are useful, but so are commit LSNs, and it makes little sense to me to suppose that we should have two different systems for those closely-related needs. Like Andres, I think B is impractical, so let's just be honest and admit that C is what we're really doing. But let's add LSNs so the people who want that can be happy too. The list of what is useful might be long, but we can't have everything there as there are space constraints, and LSN is another 8 bytes and I still want to have some bytes for storing the origin or whatever you want to call it there, as that's the one I personally have biggest use-case for. So this would be ~24bytes per txid already, hmm I wonder if we can pull some tricks to lower that a bit. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Fri, Nov 7, 2014 at 7:07 PM, Petr Jelinek p...@2ndquadrant.com wrote: The list of what is useful might be long, That's FUD. It might also be short. but we can't have everything there as there are space constraints, and LSN is another 8 bytes and I still want to have some bytes for storing the origin or whatever you want to call it there, as that's the one I personally have biggest use-case for. So this would be ~24bytes per txid already, hmm I wonder if we can pull some tricks to lower that a bit. It won't do to say let's do the things that I want, and foreclose forever the things that other people want. I find it quite hard to believe that 16 bytes per transaction is a perfectly tolerable overhead but 24 bytes per transaction will break the bank. But if that is really true then we ought to reject this patch altogether, because it's unacceptable, in any arena, for a patch that only benefits extensions to consume all of the available bit-space in, leaving none for future core needs. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 11/01/2014 09:00 PM, Michael Paquier wrote: 1) It is untested and actually there is no direct use for it in core. 2) Pushing code that we know as dead is no good, that's a feature more or less defined as maybe-useful-but-we-are-not-sure-yet-what-to-do-with-it. 3) If you're going to re-use this API in BDR, which is a fork of Postgres. I would like to emphasise that BDR is not really a fork at all, no more than any big topic branch would be a fork. BDR is two major parts: - A collection of patches to core (commit timestamps/extradata, sequence AM, replication identifiers, logical decoding, DDL deparse, event triggers, etc). These are being progressively submitted to core. maintained as multiple feature branches plus a merged version; and - An extension that uses core features and, where necessary, the additions to core to implement bi-directional logical replication. Because of the time scales involved in getting things into core it's been necessary to *temporarily* get the 9.4-based feature branch into wider use so that it can be used to run the BDR extension, but if we can get the required features into core that need will go away. Event triggers and logical decoding were already merged in 9.4. If we can get things like commit timestamps, commit extradata / logical replication identifiers, the sequence access method, etc merged in 9.5 then it should be possible to do away with the need for the patches to core entirely and run BDR on top of stock 9.5. I'd be delighted if that were possible, as doing away with the patched 9.4 would get rid of a great deal of work and frustration on my part. Note that the BDR extension its self is PostgreSQL-licensed. Externally maintained extensions have been bought in-core before. It's a lot of code though, and I can't imagine that being a quick process. You'd better complete this API in BDR by yourself and not bother core with that. This argument would've prevented the inclusion of logical decoding, which is rapidly becoming the headline feature for 9.4, or at least shortly behind jsonb. Slony is being adapted to use it, multiple people are working on auditing systems based on it, and AFAIK EDB's xDB is being adapted to take advantage of it too. As development gets more complex and people attempt bigger features, the One Big Patch that adds a feature and an in-core user of the feature is not going to be a viable approach all the time. In my view it's already well past that, and some recent patches (like RLS) really should've been split up into patch series. If we want to avoid unreviewable monster-patches it will, IMO, be necessary to have progressive, iterative enhancement. That may sometimes mean that there's code in core that's only used by future yet-to-be-merged patches and/or by extensions. Of course its desirable to have an in-tree user of the code wherever possible/practical - but sometimes it may *not* be possible or practical. It seems to me like the benefits of committing work in smaller, more reviewable chunks outweigh the benefits of merging multiple related but separate changes just so everything can have an immediate in-tree user. That's not to say that extradata must remain glued to commit timestamps. It might make more sense as a separate patch with an API to allow extensions to manipulate it directly, plus a dummy extension showing how it works, like we do with various hooks and with APIs like FDWs. However, just like the various hooks that we have, it *does* make sense to have something in-core that has no real world in-core users. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Tue, 2014-11-04 at 23:43 -0600, Jim Nasby wrote: I'm worried about 2 commits in the same microsecond on the same system, not on 2 different systems. Or, put another way, if we're going to expose this I think it should also provide a guaranteed unique commit ordering for a single cluster. Presumably, this shouldn't be that hard since we do know the exact order in which things committed. Addition of LSN when the timestamps for two transactions are exactly same isn't enough. There isn't any guarantee that a later commit gets a later timestamp than an earlier commit. In addition, I wonder if this feature would be misused. Record transaction ids to a table to find out commit order (use case could be storing historical row versions for example). Do a dump and restore on another cluster, and all the transaction ids are completely meaningless to the system. Having the ability to record commit order into an audit table would be extremely welcome, but as is, this feature doesn't provide it. - Anssi -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On Wed, Nov 5, 2014 at 5:24 PM, Anssi Kääriäinen anssi.kaariai...@thl.fi wrote: On Tue, 2014-11-04 at 23:43 -0600, Jim Nasby wrote: I'm worried about 2 commits in the same microsecond on the same system, not on 2 different systems. Or, put another way, if we're going to expose this I think it should also provide a guaranteed unique commit ordering for a single cluster. Presumably, this shouldn't be that hard since we do know the exact order in which things committed. Addition of LSN when the timestamps for two transactions are exactly same isn't enough. There isn't any guarantee that a later commit gets a later timestamp than an earlier commit. True if WAL record ID is not globally consistent. Two-level commit ordering can be done with (timestamp or LSN, nodeID). At equal timestamp, we could say as well that the node with the lowest systemID wins for example. That's not something In addition, I wonder if this feature would be misused. Record transaction ids to a table to find out commit order (use case could be storing historical row versions for example). Do a dump and restore on another cluster, and all the transaction ids are completely meaningless to the system. I think you are forgetting the fact to be able to take a consistent dump using an exported snapshot. In this case the commit order may not be that meaningless.. Having the ability to record commit order into an audit table would be extremely welcome, but as is, this feature doesn't provide it. That's something that can actually be achieved with this feature if the SQL interface is able to query all the timestamps in a xid range with for example a background worker that tracks this data periodically. Now the thing is as well: how much timestamp history do we want to keep? The patch truncating SLRU files with frozenID may cover a sufficient range... -- Michael
Re: [HACKERS] tracking commit timestamps
On 11/5/14, 6:10 AM, Michael Paquier wrote: In addition, I wonder if this feature would be misused. Record transaction ids to a table to find out commit order (use case could be storing historical row versions for example). Do a dump and restore on another cluster, and all the transaction ids are completely meaningless to the system. I think you are forgetting the fact to be able to take a consistent dump using an exported snapshot. In this case the commit order may not be that meaningless.. Anssi's point is that you can't use xmin because it can change, but I think anyone working with this feature would understand that. Having the ability to record commit order into an audit table would be extremely welcome, but as is, this feature doesn't provide it. That's something that can actually be achieved with this feature if the SQL interface is able to query all the timestamps in a xid range with for example a background worker that tracks this data periodically. Now the thing is as well: how much timestamp history do we want to keep? The patch truncating SLRU files with frozenID may cover a sufficient range... Except that commit time is not guaranteed unique *even on a single system*. That's my whole point. If we're going to bother with all the commit time machinery it seems really silly to provide a way to uniquely order every commit. Clearly trying to uniquely order commits across multiple systems is a far larger problem, and I'm not suggesting we attempt that. But for a single system AIUI all we need to do is expose the LSN of each commit record and that will give you the exact and unique order in which transactions committed. This isn't a hypothetical feature either; if we had this, logical replication systems wouldn't have to try and fake this via batches. You could actually recreate exactly what data was visible at what time to all transactions, not just repeatable read ones (as long as you kept snapshot data as well, which isn't hard). As for how much data to keep, if you have a process that's doing something to record this information permanently all it needs to do is keep an old enough snapshot around. That's not that hard to do, even from user space. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 11/5/14, 10:30 AM, Andres Freund wrote: Except that commit time is not guaranteed unique *even on a single system*. That's my whole point. If we're going to bother with all the commit time machinery it seems really silly to provide a way to uniquely order every commit. Well. I think that's the misunderstanding here. That's absolutely not what committs is supposed to be used for. For the replication stream you'd hopefully use logical decoding. That gives you the transaction data exactly in commit order. So presumably you'd want to use logical decoding to insert into a table with a sequence on it, or similar? I agree, that sounds like a better way to handle this. I think it's worth mentioning in the docs for commit_ts, because people WILL mistakenly try and use it to determine commit ordering. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 2014-11-05 10:23:15 -0600, Jim Nasby wrote: On 11/5/14, 6:10 AM, Michael Paquier wrote: In addition, I wonder if this feature would be misused. Record transaction ids to a table to find out commit order (use case could be storing historical row versions for example). Do a dump and restore on another cluster, and all the transaction ids are completely meaningless to the system. I think you are forgetting the fact to be able to take a consistent dump using an exported snapshot. In this case the commit order may not be that meaningless.. Anssi's point is that you can't use xmin because it can change, but I think anyone working with this feature would understand that. Having the ability to record commit order into an audit table would be extremely welcome, but as is, this feature doesn't provide it. That's something that can actually be achieved with this feature if the SQL interface is able to query all the timestamps in a xid range with for example a background worker that tracks this data periodically. Now the thing is as well: how much timestamp history do we want to keep? The patch truncating SLRU files with frozenID may cover a sufficient range... Except that commit time is not guaranteed unique *even on a single system*. That's my whole point. If we're going to bother with all the commit time machinery it seems really silly to provide a way to uniquely order every commit. Well. I think that's the misunderstanding here. That's absolutely not what committs is supposed to be used for. For the replication stream you'd hopefully use logical decoding. That gives you the transaction data exactly in commit order. Clearly trying to uniquely order commits across multiple systems is a far larger problem, and I'm not suggesting we attempt that. But for a single system AIUI all we need to do is expose the LSN of each commit record and that will give you the exact and unique order in which transactions committed. I don't think that's something you should attempt. That's what logical decoding is for. Hence I see little point in exposing the LSN that way. Where I think committs is useful is a method for analyzing and resolving conflicts between multiple systems. In that case you likely can't use the LSN for anything as it won't be very meaningful. If you get conflicts below the accuracy of the timestamps you better use another deterministic method of resolving them - BDR e.g. compares the system identifier, timeline id, database oid, and a user defined name. While enforcing that those aren't the same between systems. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 2014-11-05 10:34:40 -0600, Jim Nasby wrote: On 11/5/14, 10:30 AM, Andres Freund wrote: Except that commit time is not guaranteed unique *even on a single system*. That's my whole point. If we're going to bother with all the commit time machinery it seems really silly to provide a way to uniquely order every commit. Well. I think that's the misunderstanding here. That's absolutely not what committs is supposed to be used for. For the replication stream you'd hopefully use logical decoding. That gives you the transaction data exactly in commit order. So presumably you'd want to use logical decoding to insert into a table with a sequence on it, or similar? I'm not following. I'd use logical decoding to replicate the data to another system, thereby guaranteeing its done in commit order. Then, when applying the data on the other side, I can detect/resolve some forms of conflicts by looking at the timestamps of rows via committs. I agree, that sounds like a better way to handle this. I think it's worth mentioning in the docs for commit_ts, because people WILL mistakenly try and use it to determine commit ordering. Ok, sounds sensible. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
Jim Nasby jim.na...@bluetreble.com wrote: for a single system AIUI all we need to do is expose the LSN of each commit record and that will give you the exact and unique order in which transactions committed. This isn't a hypothetical feature either; if we had this, logical replication systems wouldn't have to try and fake this via batches. You could actually recreate exactly what data was visible at what time to all transactions, not just repeatable read ones (as long as you kept snapshot data as well, which isn't hard). Well, that not entirely true for serializable transactions; there are points in time where reading the committed state could cause a transaction to roll back[1] -- either a writing transaction which would make that visible state inconsistent with the later committed state or the reading transaction if it views something which is not (yet) consistent. That's not to say that this feature is a bad idea; part of the serializable implementation itself depends on being able to accurately determine commit order, and this feature could allow that to work more efficiently. I'm saying that, like hot standby, a replicated database could not provide truly serializable transactions (even read only ones) without something else in addition to this. We've discussed various ways of doing that. Perhaps the most promising is to include in the stream some indication of which points in the transaction stream are safe for a serializable transaction to read. If there's a way to implement commit order recording such that a two-state flag could be associated with each commit, I think it could be made to work for serializable transactions. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company [1] https://wiki.postgresql.org/wiki/SSI#Read_Only_Transactions -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 11/05/2014 11:23 AM, Jim Nasby wrote: Except that commit time is not guaranteed unique *even on a single system*. That's my whole point. If we're going to bother with all the commit time machinery it seems really silly to provide a way to uniquely order every commit. Clearly trying to uniquely order commits across multiple systems is a far larger problem, and I'm not suggesting we attempt that. But for a single system AIUI all we need to do is expose the LSN of each commit record and that will give you the exact and unique order in which transactions committed. This isn't a hypothetical feature either; if we had this, logical replication systems wouldn't have to try and fake this via batches. You could actually recreate exactly what data was visible at what time to all transactions, not just repeatable read ones (as long as you kept snapshot data as well, which isn't hard). As for how much data to keep, if you have a process that's doing something to record this information permanently all it needs to do is keep an old enough snapshot around. That's not that hard to do, even from user space. +1 for this. It isn't just 'replication' systems that have a need for getting the commit order of transactions on a single system. I have a application (not slony) where we want to query a table but order the output based on the transaction commit order of when the insert into the table was done (think of a queue). I'm not replicating the output but passing the data to other applications for further processing. If I just had the commit timestamp I would need to put in some other condition to break ties in a consistent way. I think being able to get an ordering by commit LSN is what I really want in this case not the timestamp. Logical decoding is one solution to this (that I was considering) but being able to do something like select * FROM event_log order by commit_id would be a lot simpler. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 2014-11-05 17:17:05 -0500, Steve Singer wrote: It isn't just 'replication' systems that have a need for getting the commit order of transactions on a single system. I have a application (not slony) where we want to query a table but order the output based on the transaction commit order of when the insert into the table was done (think of a queue). I'm not replicating the output but passing the data to other applications for further processing. If I just had the commit timestamp I would need to put in some other condition to break ties in a consistent way. I think being able to get an ordering by commit LSN is what I really want in this case not the timestamp. Logical decoding is one solution to this (that I was considering) but being able to do something like select * FROM event_log order by commit_id would be a lot simpler. Imo that's essentially a different feature. What you essentially would need here is a 'commit sequence number' - but no timestamps. And probably to be useful that number has to be 8 bytes in itself. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 11/05/2014 05:43 PM, Andres Freund wrote: On 2014-11-05 17:17:05 -0500, Steve Singer wrote: Imo that's essentially a different feature. What you essentially would need here is a 'commit sequence number' - but no timestamps. And probably to be useful that number has to be 8 bytes in itself. I think this gets to the heart of some of the differing views people have expressed on this patch Is this patch supposed to: A) Add commit timestamp tracking but nothing more B) Add infrastructure to store commit timestamps and provide a facility for storing additional bits of data extensions might want to be associated with the commit C). Add commit timestamps and node identifiers to commits If the answer is (A) then I can see why some people are objecting to adding extradata.If the answer is (B) then it's fair to ask how well does this patch handle various types of things people might want to attach to the commit record (such as the LSN). I think the problem is that once you start providing a facility extensions can use to store data along with the commit record then being restricted to 4 or 8 bytes is very limiting. It also doesn't allow you to load two extensions at once on a system. You wouldn't be able to have both the 'steve_commit_order' extension and BDR installed at the same time. I don't think this patch does a very good job at (B) but It wasn't intended to. If what we are really doing is C, and just calling the space 'extradata' until we get the logical identifier stuff in and then we are going to rename extradata to nodeid then we should say so. If we are really concerned about the pg_upgrade impact of expanding the record later then maybe we should add 4 bytes of padding to the CommitTimeStampEntry now and but leave the manipulating the node id until later. Steve Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 2014-11-05 19:31:52 -0500, Steve Singer wrote: On 11/05/2014 05:43 PM, Andres Freund wrote: On 2014-11-05 17:17:05 -0500, Steve Singer wrote: Imo that's essentially a different feature. What you essentially would need here is a 'commit sequence number' - but no timestamps. And probably to be useful that number has to be 8 bytes in itself. I think this gets to the heart of some of the differing views people have expressed on this patch I think it's actually besides the heart... Is this patch supposed to: A) Add commit timestamp tracking but nothing more B) Add infrastructure to store commit timestamps and provide a facility for storing additional bits of data extensions might want to be associated with the commit C). Add commit timestamps and node identifiers to commits If the answer is (A) then I can see why some people are objecting to adding extradata.If the answer is (B) then it's fair to ask how well does this patch handle various types of things people might want to attach to the commit record (such as the LSN). I think there's a mistake exactly here. The LSN of the commit isn't just some extra information about the commit. You can't say 'here, also attach this piece of information'. Instead you need special case code in xact.c to add it. Thus prohibiting that space to be used for something else. I think the problem is that once you start providing a facility extensions can use to store data along with the commit record then being restricted to 4 or 8 bytes is very limiting. Well, you can easily use those 4/8 bytes to start adding more data to the transaction. By referring to some table with transaction metadata for example. It also doesn't allow you to load two extensions at once on a system. You wouldn't be able to have both the 'steve_commit_order' extension and BDR installed at the same time. I don't think this patch does a very good job at (B) but It wasn't intended to. Well, I don't agree that steve_commit_order makes much sense in this context. But there actually is a real problem here, namely that there's no namespacing in those bytes. I'd be ok with saying that we split the extradata in for bytes for the namespace and four for the actual data. That's roughly how it works for advisory locks already. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 2014-11-02 19:27:25 +0100, Petr Jelinek wrote: Well, Michael has point that the extradata is pretty much useless currently, perhaps it would help to add the interface to set extradata? Only accessible via C and useless aren't the same thing. But sure, add it. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers