[HACKERS] \copy (query) delimiter syntax error
psql's \copy (query) with a delimiter yields a syntax error: test= \copy foo to foo.txt delimiter '|' (works) test= \copy (select * from foo) to foo.txt (works) test= \copy (select * from foo) to foo.txt delimiter '|' ERROR: syntax error at or near USING LINE 1: COPY ( select * from foo ) TO STDOUT USING DELIMITERS '|' The problem is that \copy sends USING DELIMITERS for backward compatibility (comment on line 502 of src/bin/psql/copy.c) but that COPY (query) doesn't support USING DELIMITERS: CopyStmt: COPY opt_binary qualified_name opt_column_list opt_oids copy_from copy_file_name copy_delimiter opt_with copy_opt_list ... | COPY select_with_parens TO copy_file_name opt_with copy_opt_list copy_delimiter: /* USING DELIMITERS kept for backward compatibility. 2002-06-15 */ opt_using DELIMITERS Sconst What should be fixed -- COPY or \copy? Does psql's \copy still need backward compatibility to unsupported pre-7.3? -- Michael Fuhr ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Referential Integrity and SHARE locks
On Fri, 2007-02-02 at 16:50 -0500, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > On Fri, 2007-02-02 at 15:57 -0500, Tom Lane wrote:
> > > , and it doesn't scale to more than two holders, and I don't think
> > > it works for combinations of share and exclusive lock. Also, what
> > > happened to the third type of lock?
> > 
> > Well, we just need to record the maximum two lock holders (given the
> > semantics described). The third lock type is both locks at once.
> 
> You're not going to support shared locks? That will be a big step
> backwards ...

I did say that Shared locks were supported also. The lack of ordering of
multitransactions is a hole in my suggestion, so I need to reconsider.

> > Anyway, implementation aside, I wanted to agree the overall TODO, so
> > we can think through the best way over a long period, if you agree in
> > general.
> 
> No, I don't. I think knowledge of which columns are in a PK is quite a
> few levels away from the semantics of row locking. To point out just
> one problem, what happens when you add or drop a PK? Or drop and
> replace with a different column set? Yes, I know dropping one requires
> exclusive lock on the table, but the transaction doing it could hold
> row locks within the table, and now it's very unclear what they mean.

There are issues, yes. Dropping a PK is a very irregular occurrence, nor
is it likely to be part of a complex transaction. It wouldn't bother me
to say that if a transaction already holds a RowExclusiveLock or a
RowShareLock it cannot upgrade to an AccessExclusiveLock.

For me, PKs are intimately tied to the rows they uniquely identify; we
should be mentally linking them and seeing them as a replacement for
Oids, which we still see as intimately linked at low level. We'll be
missing many optimizations if we don't; we've discussed others this week
already.

The TODO I was requesting you consider was this:

    Develop non-conflicting locking scheme to allow RI checks to
    co-exist peacefully with non-PK UPDATEs on the referenced table.

That is, IMHO, a general statement of an important unresolved issue with
our Referential Integrity implementation. It is in no way intended as
any form of negative commentary on the excellent detailed work that has
got us so far already.

-- 
Simon Riggs
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] \copy (query) delimiter syntax error
Michael Fuhr wrote:
> psql's \copy (query) with a delimiter yields a syntax error:
> 
> test=> \copy foo to foo.txt delimiter '|'                    (works)
> test=> \copy (select * from foo) to foo.txt                  (works)
> test=> \copy (select * from foo) to foo.txt delimiter '|'
> ERROR:  syntax error at or near "USING"
> LINE 1: COPY ( select * from foo ) TO STDOUT USING DELIMITERS '|'
> 
> The problem is that \copy sends USING DELIMITERS for backward
> compatibility (comment on line 502 of src/bin/psql/copy.c) but
> COPY (query) doesn't support USING DELIMITERS:
> 
> CopyStmt: COPY opt_binary qualified_name opt_column_list opt_oids
>               copy_from copy_file_name copy_delimiter opt_with copy_opt_list
>           ...
>           | COPY select_with_parens TO copy_file_name opt_with copy_opt_list
> 
> copy_delimiter:
>         /* USING DELIMITERS kept for backward compatibility. 2002-06-15 */
>         opt_using DELIMITERS Sconst
> 
> What should be fixed -- COPY or \copy? Does psql's \copy still need
> backward compatibility with the unsupported pre-7.3 releases?

I'd say fix psql. Not sure how far back we should backpatch it. It's
interesting that this has been there since 8.0 and is only now
discovered.

cheers

andrew
Re: [HACKERS] \copy (query) delimiter syntax error
On Sat, Feb 03, 2007 at 10:52:29AM -0600, Andrew Dunstan wrote:
> I'd say fix psql. Not sure how far back we should backpatch it. It's
> interesting that this has been there since 8.0 and is only now
> discovered.

The problem is new in 8.2 because COPY (query) doesn't support USING
DELIMITERS. COPY tablename does, so it has worked all along.

-- 
Michael Fuhr
Re: [HACKERS] Referential Integrity and SHARE locks
On Sat, 3 Feb 2007, Simon Riggs wrote:
> On Fri, 2007-02-02 at 16:50 -0500, Tom Lane wrote:
> > No, I don't. I think knowledge of which columns are in a PK is quite
> > a few levels away from the semantics of row locking. To point out
> > just one problem, what happens when you add or drop a PK? Or drop
> > and replace with a different column set? Yes, I know dropping one
> > requires exclusive lock on the table, but the transaction doing it
> > could hold row locks within the table, and now it's very unclear
> > what they mean.
> 
> There are issues, yes. Dropping a PK is a very irregular occurrence,
> nor is it likely to be part of a complex transaction. It wouldn't
> bother me to say that if a transaction already holds a
> RowExclusiveLock or a RowShareLock it cannot upgrade to an
> AccessExclusiveLock.

The lock check seems like a strange constraint, given that it's not
necessarily going to be anything that conflicts with the row locks. I'm
not sure there'd be a better idea given this sort of scheme, but it
still seems strange.

> The TODO I was requesting you consider was this:
> 
>     Develop non-conflicting locking scheme to allow RI checks to
>     co-exist peacefully with non-PK UPDATEs on the referenced table.
> 
> That is, IMHO, a general statement of an important unresolved issue
> with our Referential Integrity implementation. That is in no way
> intended as any form of negative commentary on the excellent detailed
> work that has got us so far already.

Well, if we really want to solve that completely then we really need
column locking, or at least locking at the level of arbitrary (possibly
overlapping) unique constraints, not just the PK, because foreign keys
don't necessarily reference the primary key. But the PK case is
certainly the most common and it'd certainly be nice to cover that case.
Re: [HACKERS] --enable-debug does not work with gcc
Hi all,

Magnus Hagander wrote:
> > Personally, in my development tree I use a Makefile.custom containing
> > 
> > # back off optimization unless profiling
> > ifeq ($(PROFILE),)
> > CFLAGS:= $(patsubst -O2,-O1,$(CFLAGS))
> > endif
> > 
> > -O1 still generates uninitialized variable warnings but the code is
> > a lot saner to step through ... not perfect, but saner. It's been a
> > workable compromise for a long time. I don't recommend developing
> > with -O0, exactly because it disables some mighty valuable warnings.
> 
> Agreed. I use -O1 by default myself, unless I am doing performance
> testing. Something for the developers FAQ perhaps? I confess I did not
> know of Makefile.custom :-D

I did not know, either, but would have been glad, had I found such info
a few weeks ago when I started digging into the pg source ;)

And while we are at it, how about some CFLAGS=-DOPTIMIZER_DEBUG
sweetness for the debugging section in the Dev FAQ? Are there any other
macros that enable some more debug output? (Or is there a place where
all this has already been documented?)

Cheers,
Matthias
Re: [HACKERS] \copy (query) delimiter syntax error
Michael Fuhr wrote:
> On Sat, Feb 03, 2007 at 10:52:29AM -0600, Andrew Dunstan wrote:
> > I'd say fix psql. Not sure how far back we should backpatch it. It's
> > interesting that this has been there since 8.0 and is only now
> > discovered.
> 
> The problem is new in 8.2 because COPY (query) doesn't support USING
> DELIMITERS. COPY tablename does, so it has worked all along.

oh, good point.

OK, I have cut this quick patch that will continue to accept the legacy
syntax in psql in non-inline-query cases, but will make psql
unreservedly emit new style syntax for COPY to the backend. Does that
seem reasonable, or is it too much of a change for the stable branch?

cheers

andrew

Index: src/bin/psql/copy.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/psql/copy.c,v
retrieving revision 1.72
diff -c -r1.72 copy.c
*** src/bin/psql/copy.c	5 Jan 2007 22:19:49 -0000	1.72
--- src/bin/psql/copy.c	3 Feb 2007 19:37:34 -0000
***************
*** 118,123 ****
--- 118,124 ----
  	char	   *token;
  	const char *whitespace = " \t\n\r";
  	char		nonstd_backslash = standard_strings() ? 0 : '\\';
+ 	bool		have_query = false;
  
  	if (args)
  		line = pg_strdup(args);
***************
*** 163,168 ****
--- 164,170 ----
  			xstrcat(&result->table, " ");
  			xstrcat(&result->table, token);
  		}
+ 		have_query = true;
  	}
  
  	token = strtokx(NULL, whitespace, ".,()", "\"",
***************
*** 268,291 ****
  					0, false, false, pset.encoding);
  
  	/*
! 	 * Allows old COPY syntax for backward compatibility 2002-06-19
  	 */
! 	if (token && pg_strcasecmp(token, "using") == 0)
! 	{
! 		token = strtokx(NULL, whitespace, NULL, NULL,
! 						0, false, false, pset.encoding);
! 		if (!(token && pg_strcasecmp(token, "delimiters") == 0))
! 			goto error;
! 	}
! 	if (token && pg_strcasecmp(token, "delimiters") == 0)
  	{
! 		token = strtokx(NULL, whitespace, NULL, "'",
! 						nonstd_backslash, true, false, pset.encoding);
! 		if (!token)
! 			goto error;
! 		result->delim = pg_strdup(token);
! 		token = strtokx(NULL, whitespace, NULL, NULL,
! 						0, false, false, pset.encoding);
  	}
  
  	if (token)
--- 270,297 ----
  					0, false, false, pset.encoding);
  
  	/*
! 	 * Allows old COPY syntax for backward compatibility.
! 	 * Skip if we have an inline query instead of a table name.
  	 */
! 	if (!have_query)
  	{
! 		if (token && pg_strcasecmp(token, "using") == 0)
! 		{
! 			token = strtokx(NULL, whitespace, NULL, NULL,
! 							0, false, false, pset.encoding);
! 			if (!(token && pg_strcasecmp(token, "delimiters") == 0))
! 				goto error;
! 		}
! 		if (token && pg_strcasecmp(token, "delimiters") == 0)
! 		{
! 			token = strtokx(NULL, whitespace, NULL, "'",
! 							nonstd_backslash, true, false, pset.encoding);
! 			if (!token)
! 				goto error;
! 			result->delim = pg_strdup(token);
! 			token = strtokx(NULL, whitespace, NULL, NULL,
! 							0, false, false, pset.encoding);
! 		}
  	}
  
  	if (token)
***************
*** 480,511 ****
  	printfPQExpBuffer(&query, "COPY ");
  
- 	/* Uses old COPY syntax for backward compatibility 2002-06-19 */
- 	if (options->binary)
- 		appendPQExpBuffer(&query, "BINARY ");
- 
  	appendPQExpBuffer(&query, "%s ", options->table);
  
  	if (options->column_list)
  		appendPQExpBuffer(&query, "%s ", options->column_list);
  
- 	/* Uses old COPY syntax for backward compatibility 2002-06-19 */
- 	if (options->oids)
- 		appendPQExpBuffer(&query, "WITH OIDS ");
- 
  	if (options->from)
  		appendPQExpBuffer(&query, "FROM STDIN");
  	else
  		appendPQExpBuffer(&query, "TO STDOUT");
  
! 	/* Uses old COPY syntax for backward compatibility 2002-06-19 */
  	if (options->delim)
! 		emit_copy_option(&query, " USING DELIMITERS ", options->delim);
  
- 	/* There is no backward-compatible CSV syntax */
  	if (options->null)
! 		emit_copy_option(&query, " WITH NULL AS ", options->null);
  
  	if (options->csv_mode)
  		appendPQExpBuffer(&query, " CSV");
--- 486,513 ----
  	printfPQExpBuffer(&query, "COPY ");
  
  	appendPQExpBuffer(&query, "%s ", options->table);
  
  	if (options->column_list)
  		appendPQExpBuffer(&query, "%s ", options->column_list);
  
  	if (options->from)
  		appendPQExpBuffer(&query, "FROM STDIN");
  	else
  		appendPQExpBuffer(&query, "TO STDOUT");
  
! 	if (options->binary)
! 		appendPQExpBuffer(&query, " BINARY ");
! 
! 	if (options->oids)
! 		appendPQExpBuffer(&query, " OIDS ");
! 
! 	if (options->delim)
! 		emit_copy_option(&query, " DELIMITER ", options->delim);
  
  	if (options->null)
! 		emit_copy_option(&query, " NULL AS ", options->null);
  
  	if (options->csv_mode)
  		appendPQExpBuffer(&query, " CSV");
Re: [HACKERS] Proposal: Commit timestamp
On 2/1/2007 11:23 PM, Jim Nasby wrote:
> On Jan 25, 2007, at 6:16 PM, Jan Wieck wrote:
> > If a per database configurable tslog_priority is given, the
> > timestamp will be truncated to milliseconds and the increment logic
> > is done on milliseconds. The priority is added to the timestamp.
> > This guarantees that no two timestamps for commits will ever be
> > exactly identical, even across different servers.
> 
> Wouldn't it be better to just store that information separately,
> rather than mucking with the timestamp?
> 
> Though, there's another issue here... I don't think NTP is good for
> any better than a few milliseconds, even on a local network. How exact
> does the conflict resolution need to be, anyway? Would it really be a
> problem if transaction B committed 0.1 seconds after transaction A yet
> the cluster thought it was the other way around?

Since the timestamp is basically a Lamport counter which is just bumped
by the clock as well, it doesn't need to be too precise.

Jan

-- 
#==========================================================================#
# It's easier to get forgiveness for being wrong than for being right.     #
# Let's break this rule - forgive me.                                      #
#========================================== [EMAIL PROTECTED] #
Re: [HACKERS] Proposal: Commit timestamp
On Feb 3, 2007, at 3:52 PM, Jan Wieck wrote:
> On 2/1/2007 11:23 PM, Jim Nasby wrote:
> > Though, there's another issue here... I don't think NTP is good for
> > any better than a few milliseconds, even on a local network. How
> > exact does the conflict resolution need to be, anyway? Would it
> > really be a problem if transaction B committed 0.1 seconds after
> > transaction A yet the cluster thought it was the other way around?
> 
> Since the timestamp is basically a Lamport counter which is just
> bumped by the clock as well, it doesn't need to be too precise.

Unless I'm missing something, you are _treating_ the counter as a
Lamport timestamp, when in fact it is not and thus does not provide the
semantics of a Lamport timestamp. As such, any algorithms that use
Lamport timestamps as a basis or assumption for the proof of their
correctness will not translate (provably) to this system. How is your
counter semantically equivalent to Lamport timestamps?

// Theo Schlossnagle
// CTO -- http://www.omniti.com/~jesus/
// OmniTI Computer Consulting, Inc. -- http://www.omniti.com/
Re: [HACKERS] --enable-debug does not work with gcc
Matthias Luedtke wrote:
> And while we are at it, how about some CFLAGS=-DOPTIMIZER_DEBUG
> sweetness for the debugging section in the Dev FAQ?

Most of the debugging macros are not documented because their purpose
only arises out of the source code.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/
Re: [HACKERS] Proposal: Commit timestamp
On 2/3/2007 4:05 PM, Theo Schlossnagle wrote:
> On Feb 3, 2007, at 3:52 PM, Jan Wieck wrote:
> > Since the timestamp is basically a Lamport counter which is just
> > bumped by the clock as well, it doesn't need to be too precise.
> 
> Unless I'm missing something, you are _treating_ the counter as a
> Lamport timestamp, when in fact it is not and thus does not provide
> the semantics of a Lamport timestamp. As such, any algorithms that use
> Lamport timestamps as a basis or assumption for the proof of their
> correctness will not translate (provably) to this system. How is your
> counter semantically equivalent to Lamport timestamps?

Yes, you must be missing something. The last used timestamp is
remembered. When a remote transaction is replicated, the remembered
timestamp is set to max(remembered, remote). For a local transaction,
the remembered timestamp is set to max(remembered+1ms, systemclock) and
that value is used as the transaction commit timestamp.

Jan

-- 
#==========================================================================#
# It's easier to get forgiveness for being wrong than for being right.     #
# Let's break this rule - forgive me.                                      #
#========================================== [EMAIL PROTECTED] #
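[Editorial note] A minimal standalone sketch of the bump rule Jan describes, using whole-millisecond arithmetic and illustrative names (this is not code from the proposal): a replicated remote commit only advances the remembered value, while a local commit takes max(remembered + 1ms, system clock), so local timestamps never repeat and never go backwards relative to anything already seen.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sketch of the proposed commit-timestamp bump rule,
 * in whole milliseconds since some epoch. */
typedef int64_t ts_ms;

ts_ms remembered = 0;   /* last used timestamp, per node */

/* Replaying a remote commit: remembered = max(remembered, remote). */
void note_remote_commit(ts_ms remote)
{
    if (remote > remembered)
        remembered = remote;
}

/* Local commit: remembered = max(remembered + 1ms, system clock),
 * and that value becomes the commit timestamp. */
ts_ms next_local_commit(ts_ms system_clock)
{
    ts_ms candidate = remembered + 1;
    remembered = (system_clock > candidate) ? system_clock : candidate;
    return remembered;
}
```

The design choice this illustrates: when the local clock lags behind what has been observed (e.g. after replaying a commit from a node with a faster clock), local commits still advance by 1ms per commit, so ordering within and across nodes stays strictly increasing.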
Re: [HACKERS] Proposal: Change of pg_trigger.tg_enabled and adding
Attached is the implementation of the proposed changes as a patch for
discussion. The chosen syntax is backward compatible and uses

    ALTER TABLE tab ENABLE TRIGGER trig          (fires on origin - default)
    ALTER TABLE tab DISABLE TRIGGER trig         (disabled)
    ALTER TABLE tab ENABLE REPLICA TRIGGER trig  (fires on replica only)
    ALTER TABLE tab ENABLE ALWAYS TRIGGER trig   (fires always)

A session's current role is controlled by the PG_SUSET GUC
session_replication_role. The possible states are origin, replica and
local. The local state is identical to origin with respect to trigger
firing. It is intended to be used to issue statements that do not get
replicated at all.

The commands psql and pg_dump are adjusted in a backward compatible
manner, although I noticed that psql currently is incompatible with at
least 8.1 databases due to querying indisvalid on \d.

Comments?

Jan

-- 
#==========================================================================#
# It's easier to get forgiveness for being wrong than for being right.     #
# Let's break this rule - forgive me.                                      #
#========================================== [EMAIL PROTECTED] #

Index: src/backend/commands/tablecmds.c
===================================================================
RCS file: /usr/local/pgsql/CvsRoot/pgsql/src/backend/commands/tablecmds.c,v
retrieving revision 1.213
diff -u -r1.213 tablecmds.c
--- src/backend/commands/tablecmds.c	2 Feb 2007 00:07:02 -0000	1.213
+++ src/backend/commands/tablecmds.c	2 Feb 2007 20:27:47 -0000
@@ -3,7 +3,7 @@
  * tablecmds.c
  *	  Commands for creating and altering table structures and settings
  *
- * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1996-2007, PostgreSQL Global DevelopmEnt Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  *
@@ -252,7 +252,7 @@
 static void ATExecSetTableSpace(Oid tableOid, Oid newTableSpace);
 static void ATExecSetRelOptions(Relation rel, List *defList, bool isReset);
 static void ATExecEnableDisableTrigger(Relation rel, char *trigname,
-						   bool enable, bool skip_system);
+						   char fires_when, bool skip_system);
 static void ATExecAddInherit(Relation rel, RangeVar *parent);
 static void ATExecDropInherit(Relation rel, RangeVar *parent);
 static void copy_relation_data(Relation rel, SMgrRelation dst);
@@ -2192,6 +2192,8 @@
 			pass = AT_PASS_MISC;
 			break;
 		case AT_EnableTrig:		/* ENABLE TRIGGER variants */
+		case AT_EnableAlwaysTrig:
+		case AT_EnableReplicaTrig:
 		case AT_EnableTrigAll:
 		case AT_EnableTrigUser:
 		case AT_DisableTrig:	/* DISABLE TRIGGER variants */
@@ -2364,24 +2366,40 @@
 		case AT_ResetRelOptions:	/* RESET (...) */
 			ATExecSetRelOptions(rel, (List *) cmd->def, true);
 			break;
-		case AT_EnableTrig:		/* ENABLE TRIGGER name */
-			ATExecEnableDisableTrigger(rel, cmd->name, true, false);
+
+		case AT_EnableTrig:		/* ENABLE TRIGGER name */
+			ATExecEnableDisableTrigger(rel, cmd->name,
+									   TRIGGER_FIRES_ON_ORIGIN, false);
+			break;
+		case AT_EnableAlwaysTrig:	/* ENABLE ALWAYS TRIGGER name */
+			ATExecEnableDisableTrigger(rel, cmd->name,
+									   TRIGGER_FIRES_ALWAYS, false);
+			break;
+		case AT_EnableReplicaTrig:	/* ENABLE REPLICA TRIGGER name */
+			ATExecEnableDisableTrigger(rel, cmd->name,
+									   TRIGGER_FIRES_ON_REPLICA, false);
 			break;
 		case AT_DisableTrig:	/* DISABLE TRIGGER name */
-			ATExecEnableDisableTrigger(rel, cmd->name, false, false);
+			ATExecEnableDisableTrigger(rel, cmd->name,
+									   TRIGGER_DISABLED, false);
 			break;
 		case AT_EnableTrigAll:	/* ENABLE TRIGGER ALL */
-			ATExecEnableDisableTrigger(rel, NULL, true, false);
+			ATExecEnableDisableTrigger(rel, NULL,
+									   TRIGGER_FIRES_ON_ORIGIN, false);
 			break;
 		case AT_DisableTrigAll:	/* DISABLE TRIGGER ALL */
-			ATExecEnableDisableTrigger(rel, NULL, false, false);
+			ATExecEnableDisableTrigger(rel, NULL,
+									   TRIGGER_DISABLED, false);
 			break;
 		case AT_EnableTrigUser:	/* ENABLE
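[Editorial note] Read as a truth table, the proposed semantics cross four trigger states (origin/default, disabled, replica-only, always) with the session role, where local behaves like origin. The matrix follows the proposal text; the function, enum, and single-letter state codes here are an illustrative sketch, not the patch's actual representation.

```c
#include <assert.h>

/* Illustrative sketch of the proposed trigger-firing decision. */
typedef enum { ROLE_ORIGIN, ROLE_REPLICA, ROLE_LOCAL } SessionRole;

/* state: 'O' = fires on origin (default), 'D' = disabled,
 *        'R' = fires on replica only, 'A' = fires always */
int trigger_fires(char state, SessionRole role)
{
    switch (state)
    {
        case 'A': return 1;                     /* fires always */
        case 'D': return 0;                     /* disabled */
        case 'R': return role == ROLE_REPLICA;  /* replica only */
        case 'O':
        default:  return role != ROLE_REPLICA;  /* origin; local == origin */
    }
}
```

For example, a replica-only trigger stays silent during normal (origin) operation but fires when the session replays replicated data, which is what makes the report-building use case discussed later in the thread possible.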
Re: [HACKERS] Proposal: Commit timestamp
On Feb 3, 2007, at 4:38 PM, Jan Wieck wrote:
> Yes, you must be missing something. The last used timestamp is
> remembered. When a remote transaction is replicated, the remembered
> timestamp is set to max(remembered, remote). For a local transaction,
> the remembered timestamp is set to max(remembered+1ms, systemclock)
> and that value is used as the transaction commit timestamp.

A Lamport clock, IIRC, requires a cluster-wide tick. This seems based
only on activity and is thus an observational tick only, which means
various nodes can have various perspectives at different times. Given
that time skew is prevalent, why is the system clock involved at all?

As is usual with distributed systems problems, they are very hard to
explain casually and also hard to review from a theoretical angle
without a proof. Are you basing this off a paper? If so, which one? If
not, have you written a rigorous proof of correctness for this approach?

// Theo Schlossnagle
// CTO -- http://www.omniti.com/~jesus/
// OmniTI Computer Consulting, Inc. -- http://www.omniti.com/
Re: [HACKERS] Proposal: Commit timestamp
On 2/3/2007 4:58 PM, Theo Schlossnagle wrote:
> A Lamport clock, IIRC, requires a cluster-wide tick. This seems based
> only on activity and is thus an observational tick only, which means
> various nodes can have various perspectives at different times. Given
> that time skew is prevalent, why is the system clock involved at all?

This question was already answered.

> As is usual with distributed systems problems, they are very hard to
> explain casually and also hard to review from a theoretical angle
> without a proof. Are you basing this off a paper? If so, which one? If
> not, have you written a rigorous proof of correctness for this
> approach?

I don't have any such paper, and the proof of concept will be the
implementation of the system. I do however see enough resistance against
this proposal to withdraw the commit timestamp at this time. The new
replication system will therefore require the installation of a patched,
non-standard PostgreSQL version, compiled from sources cluster wide, in
order to be used. I am aware that this will dramatically reduce its
popularity, but it is impossible to develop this essential feature as an
external module.

I thank everyone for their attention.

Jan

-- 
#==========================================================================#
# It's easier to get forgiveness for being wrong than for being right.     #
# Let's break this rule - forgive me.                                      #
#========================================== [EMAIL PROTECTED] #
Re: [HACKERS] Proposal: Commit timestamp
On Feb 3, 2007, at 5:09 PM, Jan Wieck wrote:
> I don't have any such paper, and the proof of concept will be the
> implementation of the system. I do however see enough resistance
> against this proposal to withdraw the commit timestamp at this time.
> The new replication system will therefore require the installation of
> a patched, non-standard PostgreSQL version, compiled from sources
> cluster wide, in order to be used. I am aware that this will
> dramatically reduce its popularity, but it is impossible to develop
> this essential feature as an external module.
> 
> I thank everyone for their attention.

Actually, I believe the commit timestamp stuff would be very useful in
general. I would certainly like to see rigorous proofs of any
multi-master replication technology built on top of them. I believe that
while your replication stuff might rely on the commit timestamps, the
commit timestamps rely on nothing else (except the work that you have
been churning on).

Using commit timestamps, one can easily implement cross-vendor database
replication. These can be used to implement something like
trigger-selective redo logs. I think they can be used to produce DML
logs that will require a lot less accounting to manage replicating
tables from PostgreSQL into another database (like Oracle or MySQL).

// Theo Schlossnagle
// CTO -- http://www.omniti.com/~jesus/
// OmniTI Computer Consulting, Inc. -- http://www.omniti.com/
Re: [HACKERS] Proposal: Commit timestamp
Jan Wieck wrote:
> I don't have any such paper, and the proof of concept will be the
> implementation of the system. I do however see enough resistance
> against this proposal to withdraw the commit timestamp at this time.
> The new replication system will therefore require the installation of
> a patched, non-standard PostgreSQL version, compiled from sources
> cluster wide, in order to be used. I am aware that this will
> dramatically reduce its popularity, but it is impossible to develop
> this essential feature as an external module.
> 
> I thank everyone for their attention.

Going and working on it on your own doesn't seem like the proper
solution. I don't see people objecting to adding it, but they want it to
work, which I am sure you want too. You have to show how it will work
and convince others of that; then it has a higher chance of working, and
of being in the PostgreSQL codebase.

-- 
Bruce Momjian  [EMAIL PROTECTED]
EnterpriseDB   http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Proposal: Change of pg_trigger.tg_enabled and adding
Jan Wieck wrote:
> Attached is the implementation of the proposed changes as a patch for
> discussion. The chosen syntax is backward compatible and uses
> 
>     ALTER TABLE tab ENABLE TRIGGER trig          (fires on origin - default)
>     ALTER TABLE tab DISABLE TRIGGER trig         (disabled)
>     ALTER TABLE tab ENABLE REPLICA TRIGGER trig  (fires on replica only)
>     ALTER TABLE tab ENABLE ALWAYS TRIGGER trig   (fires always)

<snip>

> The commands psql and pg_dump are adjusted in a backward compatible
> manner. Although I noticed that psql currently is incompatible with at
> least 8.1 databases due to querying indisvalid on \d.
> 
> Comments?

This is interesting. If I understand correctly, the idea here is to be
able to determine which triggers will get fired based on the role the
database plays? E.g., I have a REPLICA TRIGGER and thus I can use that
on a subscriber/slave to take replicated data and create reports
automatically.

How do we deal with other problems such as a PROMOTED state?

Sincerely,

Joshua D. Drake

-- 
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
Re: [HACKERS] recovery.conf parsing problems
Added to TODO:

    o Allow recovery.conf to support the same syntax as postgresql.conf,
      including quoting

      http://archives.postgresql.org/pgsql-hackers/2006-12/msg00497.php

---------------------------------------------------------------------------

Simon Riggs wrote:
> On Wed, 2006-12-13 at 19:28 +0000, Simon Riggs wrote:
> > On Wed, 2006-12-13 at 04:23 +0000, Andrew - Supernews wrote:
> > > While testing a PITR recovery, I discovered that recovery.conf
> > > doesn't seem to allow specifying ' in the command string, making
> > > it hard to protect the restore_command against problematic
> > > filenames (whitespace etc.). This doesn't seem to be a problem for
> > > archive_command, which allows \' (e.g. archive_command =
> > > '/path/to/script \'%f\' \'%p\''). Immediate workaround is to use a
> > > script to encapsulate the actual desired request. Should this be
> > > fixed?
> > 
> > Yes, I'll look into that.
> 
> OK, I would propose to extend the guc-file.l to include sufficient
> code to allow the parsing of the conf files to be identical between
> the postgresql.conf and the recovery.conf (it isn't the same yet).
> There'll be care taken to ensure that the various options are not
> settable in the wrong file.
> 
> Any requests for specific implementation details? I'll be looking to
> remove code from xlog.c, if possible. Implementation won't be
> immediate because of other current work.
> 
> -- 
> Simon Riggs
> EnterpriseDB   http://www.enterprisedb.com

-- 
Bruce Momjian  [EMAIL PROTECTED]
EnterpriseDB   http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
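[Editorial note] The quoting Andrew asks for matters because %f/%p expansion ends up on a shell command line. A toy expander shows the idea (illustrative function name, only %f handled; the real server-side substitution is more involved): wrapping the substituted filename in single quotes keeps whitespace in archive file names from splitting the command.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustrative sketch: expand %f in a restore_command-style template,
 * single-quoting the value so the shell treats it as one word.
 * Only %f is handled here; %p and embedded quotes are left alone. */
void expand_restore_command(char *buf, size_t len,
                            const char *tmpl, const char *file)
{
    size_t out = 0;
    for (const char *p = tmpl; *p && out + 1 < len; p++)
    {
        if (p[0] == '%' && p[1] == 'f')
        {
            out += snprintf(buf + out, len - out, "'%s'", file);
            p++;                /* skip the 'f' */
        }
        else
            buf[out++] = *p;
    }
    buf[out] = '\0';
}
```

This is exactly what the \' escape in archive_command lets an administrator write by hand; the TODO is about letting recovery.conf accept the same quoting.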
Re: [HACKERS] 8.1.5 release note
Tatsuo Ishii wrote: As usual, the following item in the 8.1.5 release note is pretty vague: * Efficiency improvements in hash tables and bitmap index scans (Tom) In particular, I'm wondering what was actually improved in bitmap index scans. I see several commit messages regarding bitmap index scans, but I cannot figure out which one is related to the item. Sorry for the delay. I don't know the answer. The commit messages are the only way. -- Bruce Momjian
[HACKERS] SRF optimization question
I am writing a set returning function in C. There are cases where I can know definitively, up front, that this function will only return one row. I have noticed, through the happenstance of a partially converted function, that I can mark a normal, non-set-returning function as returning SETOF something while not using the SRF macros, use PG_RETURN_DATUM, and it still works as returning one row. I am wondering: if I know up front that a function will only return one row, is it an acceptable optimization to skip all of the SRF overhead (setting up a new memory context and a function context struct, plus the extra call into my function to tell Postgres that I am done sending rows) and simply interact with Postgres as though I were not returning SETOF? Is this a sane idea, or did I just stumble into an accidental feature when I changed my CREATE FUNCTION statement without changing my C code? -- UNIX was half a billion (500000000) seconds old on Tue Nov 5 00:53:20 1985 GMT (measuring since the time(2) epoch). -- Andy Tannenbaum
Re: [HACKERS] Dead code in _bt_split?
Heikki, did this code cleanup get included in your recent btree split fix? --- Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: In that case, newitemleft would be false, right? I'm saying the piece marked with X below is unreachable: Oh, I see. Hmm ... probably so. I think that chunk of code was just copied and pasted from where it occurs within the loop. regards, tom lane -- Bruce Momjian
Re: [HACKERS] Proposal: Change of pg_trigger.tg_enabled and adding
On 2/3/2007 5:25 PM, Joshua D. Drake wrote: Jan Wieck wrote: Attached is the implementation of the proposed changes as a patch for discussion. The chosen syntax is backward compatible and uses:
ALTER TABLE tab ENABLE TRIGGER trig (fires on origin - default)
ALTER TABLE tab DISABLE TRIGGER trig (disabled)
ALTER TABLE tab ENABLE REPLICA TRIGGER trig (fires on replica only)
ALTER TABLE tab ENABLE ALWAYS TRIGGER trig (fires always)
snip The commands psql and pg_dump are adjusted in a backward compatible manner. Although I noticed that psql is currently incompatible with at least 8.1 databases due to querying indisvalid on \d. Comments? This is interesting. If I understand correctly, the idea here is to be able to determine which triggers will get fired based on the role the database plays? Not the database; the session actually has a role, which defaults to origin. The default configuration for triggers (including RI triggers) is O (fires on origin). If the session does SET session_replication_role = replica; only triggers configured A (always) or R (replica) will fire, not those configured O (origin) or D (disabled). This means that a row-based replication system like Slony only has to set the replication role of the session in order to disable triggers. It does not need to touch the system catalog or even ALTER TABLE to do its work. This would even suppress Slony-I's deny-access trigger, which is in place on subscribers to prevent accidental updates on a replica. Doing it at the session level is even more important for row-based multimaster. While a user session does an update that needs to be added to the replication log, the replication engine in another session must be able to apply a remote transaction's updates without firing the log trigger. E.g., I have a REPLICA TRIGGER and thus I can use that on a subscriber/slave to take replicated data and create reports automatically. How do we deal with other problems such as a PROMOTED state?
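The firing matrix Jan describes (a session role of origin or replica, crossed with a per-trigger configuration of O, D, R, or A) can be sketched as a small decision function. This is an illustrative model only; the enum and function names are not identifiers from the patch.

```c
#include <stdbool.h>

/* Session replication role, per the proposal:
 * SET session_replication_role = origin | replica (origin is the default). */
typedef enum { ROLE_ORIGIN, ROLE_REPLICA } SessionRole;

/* Per-trigger configuration: O = fires on origin (default), D = disabled,
 * R = fires on replica only, A = always fires. */
static bool
trigger_fires(char config, SessionRole role)
{
    switch (config)
    {
        case 'A': return true;                  /* ENABLE ALWAYS TRIGGER */
        case 'D': return false;                 /* DISABLE TRIGGER */
        case 'O': return role == ROLE_ORIGIN;   /* ENABLE TRIGGER (default) */
        case 'R': return role == ROLE_REPLICA;  /* ENABLE REPLICA TRIGGER */
        default:  return false;
    }
}
```

A replication engine session that sets the replica role thus fires only R- and A-configured triggers, which is exactly how it suppresses the log trigger and deny-access trigger without touching the catalog.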
Promoted as in transfer of origin to a replica? In the case of a master-slave system like Slony-I, the origin of a table has the log trigger, which captures row changes, while a replica has a trigger that simply bails out with an error. Transfer of ownership is done by dropping one trigger and creating the other. Jan -- # It's easier to get forgiveness for being wrong than for being right. # Let's break this rule - forgive me. # [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] Remove log segment and log_id fields from pg_controldata
The original discussion of this patch was here: http://archives.postgresql.org/pgsql-hackers/2006-11/msg00876.php Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it. --- Simon Riggs wrote: On Tue, 2006-12-05 at 17:26 -0500, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: On Tue, 2006-12-05 at 16:24 -0500, Tom Lane wrote: Sure, what would happen is that every backend passing through this code would execute the several lines of computation needed to decide whether to call RequestCheckpoint. Right, but the calculation uses RedoRecPtr, which may not be completely up to date. So presumably you want to re-read the shared memory value again to make sure we are exactly accurate and allow only one person to call checkpoint? Either way we have to take a lock. The insert lock causes deadlock, so we would need to use infolock. Not at all. It's highly unlikely that RedoRecPtr would be so out of date as to result in a false request for a checkpoint, and if it does, so what? Worst case is we perform an extra checkpoint. On its own, I wouldn't normally agree... Also, given the current structure of the routine, this is probably not the best place for that code at all --- it'd make more sense for it to be in the just-finished-a-segment code stretch, which would ensure that it's only done by one backend once per segment. But that's a much better plan since it requires no locking. There are a lot more changes there for such a simple fix, though, and lots more potential for bugs, but I've coded it as you suggest and removed the fields from pg_control. The patch passes make check and applies cleanly on HEAD. pg_resetxlog and pg_controldata tested. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com [ Attachment, skipping...
] -- Bruce Momjian
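The just-finished-a-segment placement Tom suggests means a single backend per segment evaluates whether a checkpoint is due, which reduces to a simple distance test against the last checkpoint's redo pointer. A minimal sketch, using invented names and plain segment numbers in place of real WAL pointers:

```c
#include <stdbool.h>
#include <stdint.h>

/* Decide whether finishing WAL segment 'finished_seg' should trigger a
 * checkpoint request, given the segment containing the last checkpoint's
 * redo pointer and the checkpoint_segments setting. Because this runs
 * only in the just-finished-a-segment path, exactly one backend evaluates
 * it per segment and no extra locking is needed. */
static bool
should_request_checkpoint(uint64_t finished_seg, uint64_t redo_seg,
                          uint64_t checkpoint_segments)
{
    return finished_seg >= redo_seg + checkpoint_segments;
}
```

A slightly stale redo pointer at worst makes this fire one segment early, i.e. an extra checkpoint, which matches Tom's "so what?" argument above.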
Re: [HACKERS] psql possible TODO
Added to TODO: o Add \# to list command history like \s, but with line numbers http://archives.postgresql.org/pgsql-hackers/2006-12/msg00255.php --- Joshua D. Drake wrote: Hello, O.k., this is my current thinking: \# will list just like \s except it will include line numbers:
1 SELECT * FROM FOO; (not a tab of course)
2 UPDATE bar SET foo = 6;
\# 2 will execute query number 2
\#e 2 will open the numbered query in $EDITOR
I would love to figure out a way to auto-explain these queries without obnoxious syntax. Any ideas? Sincerely, Joshua D. Drake -- Bruce Momjian
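The proposed behavior can be mocked up outside psql as a numbered history list; the array contents and function names here are hypothetical stand-ins for psql's real readline history, not psql internals:

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for psql's command history. */
static const char *history[] = {
    "SELECT * FROM FOO;",
    "UPDATE bar SET foo = 6;",
};
static const size_t history_len = sizeof(history) / sizeof(history[0]);

/* \#      -- like \s, but prefix each entry with a 1-based number */
static void
list_history(void)
{
    for (size_t i = 0; i < history_len; i++)
        printf("%zu %s\n", i + 1, history[i]);
}

/* \# n and \#e n -- look up entry n for execution or editing;
 * returns NULL when n is out of range */
static const char *
history_entry(size_t n)
{
    if (n < 1 || n > history_len)
        return NULL;
    return history[n - 1];
}
```

An auto-explain variant could then simply prepend EXPLAIN to the string returned by history_entry before sending it, which may be the least obnoxious syntax: something like \#x n.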
Re: [PATCHES] [HACKERS] elog(FATAL)ing non-existent roles during client
Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it. --- Gavin Sherry wrote: On Tue, 5 Dec 2006, Gavin Sherry wrote: On Thu, 30 Nov 2006, Tom Lane wrote: Gavin Sherry [EMAIL PROTECTED] writes: I wonder if we should check if the role exists for the other authentication methods too? get_role_line() should be very cheap and it would prevent unnecessary authentication work if we did it before contacting, for example, the client ident server. Even with trust, it would save work because otherwise we do not check if the user exists until InitializeSessionUserId(), at which time we've set up our proc entry etc. This only saves work if the supplied ID is in fact invalid, which one would surely think isn't the normal case; otherwise it costs more. Yes. I could see doing this in the ident path, because contacting a remote ident server is certainly expensive on both sides. I doubt it's a good idea in the trust case. Agreed. How about Kerberos too, applying the same logic? Attached is a patch that adds the checks. Gavin Content-Description: [ Attachment, skipping... ] -- Bruce Momjian
Re: [HACKERS] Compacting a relation
Tom Lane wrote: Peter Eisentraut [EMAIL PROTECTED] writes: vacuumlazy.c contains a hint "Consider compacting this relation" but AFAICT, there is no indication anywhere how compacting is supposed to be achieved. I guess this means VACUUM FULL or CLUSTER, but I don't think the hint can be processed effectively by a user. So change it ... The new message is: errhint("Consider using VACUUM FULL on this relation or increasing the configuration parameter \"max_fsm_pages\"."))); -- Bruce Momjian
Re: [HACKERS] custom variable classes
Added to TODO: * Allow custom variable classes that can restrict who can set the values http://archives.postgresql.org/pgsql-hackers/2006-11/msg00911.php --- Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: One thing I want to look at for 8.3 is improving custom variable classes. Right now these are all user-settable, which makes them quite inappropriate for security-related settings (such as which perl modules to load for use by trusted plperl). I'm wondering if we should perhaps allow something like: custom_variable_classes = 'foo' foo:security_level.bar = 'blurfl' This seems really ugly --- for one thing, what if the DBA gets it wrong? The value won't mean anything until the code that uses it gets loaded, at which time the correct GucContext for the variable will be supplied. How about we just compare that to the current definition source of the value, and if we see it's been set improperly, revert the value to default? It might also be a good idea to disallow ALTER USER/DATABASE SET for a custom variable that's a placeholder, since we don't at that time know if the value should be SUSET or not; and there seems no pressing need to allow these ALTERs to be done without having loaded the defining module. [ thinks for a bit... ] With that provision in place, I think it would be safe to revert to the reset value instead of the wired-in default, which would be marginally nicer. regards, tom lane -- Bruce Momjian
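Tom's revert-on-load idea boils down to a single predicate: when the defining module loads and the variable's true context becomes known, keep the placeholder's value only if its source was privileged enough. The enums below are a simplified, hypothetical rendering of GUC's real GucContext/GucSource machinery, not the actual backend types:

```c
#include <stdbool.h>

/* Simplified stand-ins for GUC's GucContext and GucSource. */
typedef enum { CTX_USERSET, CTX_SUSET } VarContext;
typedef enum { SRC_DEFAULT, SRC_CONF_FILE, SRC_SUPERUSER, SRC_USER } ValueSource;

/* When the module defining a placeholder variable finally loads, decide
 * whether the value set earlier may be kept. If not, the value is
 * reverted (to the reset value, per the refinement at the end of the
 * mail, rather than the wired-in default). */
static bool
keep_placeholder_value(VarContext ctx, ValueSource src)
{
    if (ctx == CTX_USERSET)
        return true;            /* anyone may set USERSET variables */
    /* SUSET: only default, config-file, or superuser settings survive */
    return src != SRC_USER;
}
```

Disallowing ALTER USER/DATABASE SET on placeholders, as suggested, keeps SRC-tracking simple because a stored per-user value would otherwise have an unknown privilege level at load time.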
Re: [HACKERS] Proposal: Commit timestamp
On 2/3/2007 5:20 PM, Bruce Momjian wrote: Jan Wieck wrote: I don't have any such paper, and the proof of concept will be the implementation of the system. I do however see enough resistance against this proposal to withdraw the commit timestamp at this time. The new replication system will therefore require the installation of a patched, non-standard PostgreSQL version, compiled from sources cluster-wide, in order to be used. I am aware that this will dramatically reduce its popularity, but it is impossible to develop this essential feature as an external module. I thank everyone for their attention. Going and working on it on your own doesn't seem like the proper solution. I don't see people objecting to adding it, but they want it to work, which I am sure you want too. You have to show how it will work and convince others of that; then you have a higher chance it will work and be in the PostgreSQL codebase. Bruce, I think I have explained in sufficient detail how this Lamport timestamp will be unique and ever-increasing, with the node's ID used as a tie breaker. The only thing important for last-update-wins conflict resolution is that whatever timestamp is associated with a row, the update you do to it must be associated with a later timestamp, so that all other nodes will overwrite the data. If a third node gets the two updates out of order, it will apply the second node's update, and since the row it then has carries a later timestamp than the late-arriving first update, it will throw that information away. All nodes are in sync again. This is all that is needed for last-update-wins resolution. And as said before, the only reason the clock is involved at all is so that nodes can continue autonomously when they lose connection, without conflict resolution going crazy later on, which it would do if these were simple counters. It doesn't require microsecond-synchronized clocks, and the system clock isn't just used as a Lamport timestamp.
The problem seems to me that people want a full-scale proof of concept for the whole multimaster replication system I'm planning, instead of thinking in isolation about this one aspect, the intended use case, and other possible uses for it (like table logging). And we all know that that discussion will take us way past the 8.3 feature freeze date, so the whole thing will never get done. I don't want to work on this on my own, and I sure would prefer it to be a default PostgreSQL feature. As said, I have learned some things from Slony-I. One of them is that I will not go through any more ugly workarounds in order to avoid requiring a patched backend. If the features I really need aren't going to be in the default codebase, people will have to install from patched sources. Finally, again, Slony-I could well have used this feature. With a logical commit timestamp, I would never even have thought about that other wart called xxid. It would have all been sooo much easier. Jan
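Jan's last-update-wins rule (later timestamp wins, node ID breaks ties, and every local update must carry a stamp later than anything already seen) can be sketched in a few lines. The CommitStamp struct and function names are illustrative, not from any patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* A Lamport-style commit timestamp with the node ID as tie breaker. */
typedef struct { uint64_t ts; int node; } CommitStamp;

/* Last-update-wins: should 'incoming' overwrite the row's 'current' stamp?
 * A late-arriving older update returns false and is thrown away, so all
 * nodes converge on the same winner regardless of arrival order. */
static bool
update_wins(CommitStamp current, CommitStamp incoming)
{
    if (incoming.ts != current.ts)
        return incoming.ts > current.ts;
    return incoming.node > current.node;
}

/* Stamp a new local update: take the system clock if it is ahead,
 * otherwise just advance past the last stamp seen. This keeps stamps
 * ever-increasing even when clocks drift, stall, or nodes reconnect
 * after operating autonomously. */
static CommitStamp
next_stamp(CommitStamp last_seen, uint64_t clock_now, int my_node)
{
    CommitStamp s;
    s.ts = (clock_now > last_seen.ts) ? clock_now : last_seen.ts + 1;
    s.node = my_node;
    return s;
}
```

Replaying the out-of-order scenario from the mail: a third node that has already applied the second update simply rejects the late first one, and all nodes end up in sync.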
Re: [HACKERS] Referential Integrity and SHARE locks
On 2/2/2007 4:51 AM, Simon Riggs wrote: It sounds like if we don't put a SHARE lock on the referenced table then we can end the transaction in an inconsistent state if the referenced table has concurrent UPDATEs or DELETEs. BUT those operations do impose locking rules back onto the referencing tables that would not be granted until after any changes to the referencing table complete, whereupon they would restrict or cascade. So an inconsistent state doesn't seem possible to me. What am I missing? You're missing MVCC. The newly inserted reference only becomes visible when it is committed. If the order of actions is: insert and check for the PK; the other transaction deletes the PK and commits; the inserted FK row commits ... then the other transaction didn't see it coming. Jan
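Jan's race can be replayed with a toy snapshot-visibility check. Tuples carry the inserting (xmin) and deleting (xmax) transaction IDs, and a snapshot is just the set of transactions that had committed when it was taken; the names below are illustrative, not backend code:

```c
#include <stdbool.h>

/* Toy MVCC tuple: xmin = inserting xact, xmax = deleting xact (0 = live). */
typedef struct { int xmin; int xmax; } Tuple;

/* A tuple is visible to a snapshot if its inserter had committed when the
 * snapshot was taken and its deleter (if any) had not. The committed[]
 * array is indexed by transaction ID. */
static bool
visible(Tuple t, const bool committed[])
{
    if (!committed[t.xmin])
        return false;
    if (t.xmax != 0 && committed[t.xmax])
        return false;
    return true;
}
```

Under the FK inserter's snapshot the PK row is visible, so the RI check passes; after both transactions commit, the PK row is dead while the FK row is live, leaving a dangling reference. That is why the RI check must also lock the PK row rather than merely observe it.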