Re: [HACKERS] pg_rewind and log messages
On Tue, Apr 7, 2015 at 11:59 AM, Michael Paquier michael.paqu...@gmail.com wrote:
> On Mon, Apr 6, 2015 at 9:10 PM, Fujii Masao masao.fu...@gmail.com wrote:
>> I'm not familiar with native language support (sorry), but don't we need
>> to add the shortcut of gettext into every calls of pg_log and pg_fatal,
>> e.g., change pg_fatal("xxx") to pg_fatal(_("xxx"))? I know that fprintf()
>> in pg_Log_v() has such shortcut, but I'm not sure if that's enough or not.
>
> I think I addressed those things...
>
>> 	case PG_FATAL:
>> -		printf("\n%s", _(message));
>> -		printf("%s", _("Failure, exiting\n"));
>> +		fprintf(stderr, _("\n%s: fatal: %s\n"), progname, message);
>> +		fprintf(stderr, _("Failure, exiting\n"));
>>
>> Why do we need to output the error level like "fatal"? This seems also
>> inconsistent with the log message policy of other tools.
>
> initdb and psql do not for a couple of warning messages, but perhaps I am
> just taking consistency with the original code too seriously.
>
>> Why do we need to output \n at the head of the message? The second
>> message "Failure, exiting" is really required?
>
> I think that those things were here to highlight the fact that this is a
> fatal bailout, but the error code would help the same way I guess... I
> eliminated a bunch of newlines in the log messages that seemed really
> unnecessary to me, simplifying a bit the whole.
>
> While hacking this stuff, I noticed as well that pg_rewind could be
> called as root on non-Windows platforms; that's dangerous from a security
> point of view as the process manipulates files in PGDATA. Hence let's
> block that. On Windows, a restricted token should be used.

Good catch! I think it's better to implement it as a separate patch because it's very different from the log message problem.

> Attached are new patches:
> - 0001 fixes the restriction issues: restricted token on Windows, and
>   forbid the root user on non-Windows.

Thanks for the patch! I'd like to push this patch at first. But one comment is:

+				 "You must run %s as the PostgreSQL superuser.\n",

Isn't the term "PostgreSQL superuser" confusing? I'm afraid that a user might confuse "PostgreSQL superuser" with a database superuser. I see you just borrowed that term from pg_resetxlog.c, though.

BTW, initdb and pg_ctl also have the same check and the error message is as follows. Isn't this better? Thoughts?

	"%s: cannot be run as root\n"
	"Please log in (using, e.g., \"su\") as the (unprivileged) user that will\n"
	"own the server process.\n"

Regards,

--
Fujii Masao

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_rewind and log messages
On Tue, Apr 7, 2015 at 4:33 PM, Fujii Masao wrote:
> Isn't the term "PostgreSQL superuser" confusing? I'm afraid that a user
> might confuse "PostgreSQL superuser" with a database superuser. I see you
> just borrowed that term from pg_resetxlog.c, though. BTW, initdb and
> pg_ctl also have the same check and the error message is as follows.
> Isn't this better? Thoughts?
>
> 	"%s: cannot be run as root\n"
> 	"Please log in (using, e.g., \"su\") as the (unprivileged) user that will\n"
> 	"own the server process.\n"

I'm fine with that as well. Why not update the message of pg_resetxlog as well then? It would be good to be consistent.

I guess that the call to set_pglocale_pgservice() should be added as well in the first patch (see last version upthread), as this has nothing to do with the logging issues.

--
Michael
Re: [HACKERS] pg_rewind and log messages
On Tue, Apr 7, 2015 at 4:16 PM, Fujii Masao masao.fu...@gmail.com wrote:
> On Mon, Apr 6, 2015 at 10:01 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
>> Fujii Masao wrote:
>>> On Mon, Apr 6, 2015 at 5:33 PM, Michael Paquier michael.paqu...@gmail.com wrote:
>>>> On Mon, Apr 6, 2015 at 1:41 PM, Michael Paquier wrote:
>>>>> I guess that you are working on a patch? If not, you are looking for one?
>>>>
>>>> Code-speaking, this gives the patch attached.
>>>
>>> Thanks! Here are the review comments:
>>>
>>> I'm not familiar with native language support (sorry), but don't we need
>>> to add the shortcut of gettext into every calls of pg_log and pg_fatal,
>>> e.g., change pg_fatal("xxx") to pg_fatal(_("xxx"))? I know that fprintf()
>>> in pg_Log_v() has such shortcut, but I'm not sure if that's enough or not.
>>
>> It's not necessary for pg_fatal and the like, because those functions are
>> marked to have their first argument automatically translated in nls.mk.
>> This means that the string literal is automatically extracted into
>> pg_rewind.pot for translators. Of course, the function itself must call
>> _() (or some variant thereof) to actually fetch the translated string at
>> run time.
>
> Understood. Thanks!
>
> BTW, as far as I read pg_rewind's nls.mk correctly, it also has two problems.
> (1) file_ops.c should be added into GETTEXT_FILES.

And ../../common/restricted_tokens.c.

> (2) pg_log should be pg_log:2 in GETTEXT_TRIGGERS

Seems so.

--
Michael

From e5e08188c33adb74fc722c29e660832d88fdd765 Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Mon, 6 Apr 2015 17:18:21 +0900
Subject: [PATCH 1/2] Fix process handling of pg_rewind

To begin with, pg_rewind should not be allowed to run as root on
non-Windows platforms as it manipulates data folders, and file
permissions. On Windows platforms, it can run under a user that has
Administrator rights but in this case a restricted token needs to be
used. Also add a call to set_pglocale_pgservice() that was missing.
---
 src/bin/pg_rewind/nls.mk      |  2 +-
 src/bin/pg_rewind/pg_rewind.c | 17 +
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/src/bin/pg_rewind/nls.mk b/src/bin/pg_rewind/nls.mk
index e43f3b9..69e87d1 100644
--- a/src/bin/pg_rewind/nls.mk
+++ b/src/bin/pg_rewind/nls.mk
@@ -1,7 +1,7 @@
 # src/bin/pg_rewind/nls.mk
 CATALOG_NAME     = pg_rewind
 AVAIL_LANGUAGES  =
-GETTEXT_FILES    = copy_fetch.c datapagemap.c fetch.c filemap.c libpq_fetch.c logging.c parsexlog.c pg_rewind.c timeline.c ../../common/fe_memutils.c ../../../src/backend/access/transam/xlogreader.c
+GETTEXT_FILES    = copy_fetch.c datapagemap.c fetch.c filemap.c libpq_fetch.c logging.c parsexlog.c pg_rewind.c timeline.c ../../common/fe_memutils.c ../../common/restricted_token.c ../../../src/backend/access/transam/xlogreader.c
 GETTEXT_TRIGGERS = pg_log pg_fatal report_invalid_record:2
 GETTEXT_FLAGS    = pg_log:2:c-format \

diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c
index dda3a79..04d6a46 100644
--- a/src/bin/pg_rewind/pg_rewind.c
+++ b/src/bin/pg_rewind/pg_rewind.c
@@ -24,6 +24,7 @@
 #include "access/xlog_internal.h"
 #include "catalog/catversion.h"
 #include "catalog/pg_control.h"
+#include "common/restricted_token.h"
 #include "getopt_long.h"
 #include "storage/bufpage.h"
@@ -102,6 +103,7 @@ main(int argc, char **argv)
 	TimeLineID	endtli;
 	ControlFileData ControlFile_new;

+	set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_rewind"));
 	progname = get_progname(argv[0]);

 	/* Process command-line arguments */
@@ -174,6 +176,21 @@ main(int argc, char **argv)
 		exit(1);
 	}

+	/*
+	 * Don't allow pg_rewind to be run as root, to avoid overwriting the
+	 * ownership of files in the data directory. We need only check for root
+	 * -- any other user won't have sufficient permissions to modify files in
+	 * the data directory.
+	 */
+#ifndef WIN32
+	if (geteuid() == 0)
+		pg_fatal("cannot be executed by \"root\"\n"
+				 "You must run %s as the PostgreSQL superuser.\n",
+				 progname);
+#endif
+
+	get_restricted_token(progname);
+
 	/* Connect to remote server */
 	if (connstr_source)
 		libpqConnect(connstr_source);
--
2.3.5

From 9bd5eec75cceb8a12d26a9e16dabc2e23294c148 Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Mon, 6 Apr 2015 17:18:21 +0900
Subject: [PATCH 2/2] Fix inconsistent handling of logs in pg_rewind

pg_rewind was handling a couple of things differently compared to the
other src/bin utilities:
- Logging output needs to be flushed on stderr, not stdout
- Logging messages should be prefixed with the application name
- pg_fatal exits with error code of 1, but it was sometimes followed by
  subsequent logs, making this information lost in the process.
---
 src/bin/pg_rewind/copy_fetch.c  | 24 +-
 src/bin/pg_rewind/datapagemap.c |  4 ++-
 src/bin/pg_rewind/file_ops.c    | 30 +++---
 src/bin/pg_rewind/filemap.c     | 16 ++--
 src/bin/pg_rewind/libpq_fetch.c | 52
Re: [HACKERS] pg_rewind and log messages
On Mon, Apr 6, 2015 at 10:01 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
> Fujii Masao wrote:
>> On Mon, Apr 6, 2015 at 5:33 PM, Michael Paquier michael.paqu...@gmail.com wrote:
>>> On Mon, Apr 6, 2015 at 1:41 PM, Michael Paquier wrote:
>>>> I guess that you are working on a patch? If not, you are looking for one?
>>>
>>> Code-speaking, this gives the patch attached.
>>
>> Thanks! Here are the review comments:
>>
>> I'm not familiar with native language support (sorry), but don't we need
>> to add the shortcut of gettext into every calls of pg_log and pg_fatal,
>> e.g., change pg_fatal("xxx") to pg_fatal(_("xxx"))? I know that fprintf()
>> in pg_Log_v() has such shortcut, but I'm not sure if that's enough or not.
>
> It's not necessary for pg_fatal and the like, because those functions are
> marked to have their first argument automatically translated in nls.mk.
> This means that the string literal is automatically extracted into
> pg_rewind.pot for translators. Of course, the function itself must call
> _() (or some variant thereof) to actually fetch the translated string at
> run time.

Understood. Thanks!

BTW, as far as I read pg_rewind's nls.mk correctly, it also has two problems.
(1) file_ops.c should be added into GETTEXT_FILES.
(2) pg_log should be pg_log:2 in GETTEXT_TRIGGERS

> Another thing is compound messages like "foo has happened\nSee --help for
> usage.\n" and "bar didn't happen\nSee --help for usage." In those cases,
> the "see --help" part would need to be translated over and over, so it's
> best to separate them in phrases to avoid repetitive work for translators.
> Not sure how to do this -- maybe something like
>
> 	_("foo has happened\n") _("See --help")
>
> but I'm not sure how to appease the compiler. Having them in two separate
> pg_log() calls (or whatever) was handy for this reason.

Yep.

Regards,

--
Fujii Masao
Re: [HACKERS] Assertion failure when streaming logical changes
On 11 February 2015 at 08:51, Heikki Linnakangas hlinnakan...@vmware.com wrote:
> The new xmin tracking code assumes that if a snapshot's regd_count > 0,
> it has already been pushed to the RegisteredSnapshots heap. That
> assumption doesn't hold because the logical decoding stuff creates its
> own snapshots and sets regd_count=1 to prevent snapmgr.c from freeing
> them, even though they haven't been registered with RegisterSnapshot.
>
> The second paragraph in this comment in snapmgr.c needs fixing:
>
>  * Likewise, any snapshots that have been exported by pg_export_snapshot
>  * have regd_count = 1 and are counted in RegisteredSnapshots, but are not
>  * tracked by any resource owner.
>  *
>  * The same is true for historic snapshots used during logical decoding,
>  * their lifetime is managed separately (as they life longer as one xact.c
>  * transaction).
>
> Besides the spelling, that's incorrect: historic snapshots are *not*
> counted in RegisteredSnapshots. That was harmless before commit 94028691,
> but it was always wrong.
>
> I think setting regd_count=1 outside snapmgr.c is a pretty ugly hack.
> snapbuild.c also abuses active_count as a reference counter, which is
> similarly ugly. I think regd_count and active_count should both be
> treated as private to snapmgr.c, and initialized to zero elsewhere.
>
> As a minimal fix, we could change the logical decoding code to not use
> regd_count to prevent snapmgr.c from freeing its snapshots, but use
> active_count for that too. Setting regd_count to 1 in
> SnapBuildBuildSnapshot() seems unnecessary in the first place, as the
> callers also call SnapBuildSnapIncRefcount(). Patch attached.

It might be a good idea to apply this if nothing better is forthcoming. Logical decoding in WALsenders is broken at the moment. Otherwise it needs to go on the 9.5 blockers list.

But could we get rid of those active_count manipulations too? Could you replace the SnapBuildSnap[Inc|Dec]Refcount calls with [Un]RegisterSnapshot()?
It would be interesting to know why it was done that way in the first place, rather than using the snapshot management infrastructure. I presume it needed to do something not directly offered by the snapshot manager but I haven't managed to grasp what, exactly. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
On Tue, Apr 7, 2015 at 9:32 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote:

On Mon, Apr 6, 2015 at 10:21 AM, Sawada Masahiko sawada.m...@gmail.com wrote:

On Fri, Mar 13, 2015 at 5:10 PM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote:

Hello, I have some trivial comments about the latest patch.

At Thu, 12 Mar 2015 21:15:14 +0900, Sawada Masahiko sawada.m...@gmail.com wrote in CAD21AoBxPCpPvKQmvJMUh+p=2pfau03gkjq2r2zy47xhsh2...@mail.gmail.com sawada.mshk

On Thu, Mar 12, 2015 at 6:36 AM, Jim Nasby jim.na...@bluetreble.com wrote:

Are the parentheses necessary? No other WITH option requires them, other than create table/matview (COPY doesn't actually require them).

I was imagining EXPLAIN syntax. Is there some possibility of supporting multiple options for the REINDEX command in future? If there is, the syntax will be as follows:

REINDEX { INDEX | ... } name WITH VERBOSE, XXX, XXX;

I thought the style with parentheses is better than the above style.

The thing is, ()s are actually an odd duck. Very little supports it, and while COPY allows it they're not required. EXPLAIN is a different story, because that's not WITH; we're actually using () *instead of* WITH. So because almost all commands that use WITH don't even accept (), I don't think this should either. It certainly shouldn't require them, because unlike EXPLAIN, there's no need to require them.

I understood what your point is. The attached patch changes the syntax; it does not have parentheses.

As I looked into the code to find what the syntax would be, I found some points which would be better to be fixed. In gram.y the options is a list of cstring, but it is not necessary to be a list because there's only one kind of option now. If you prefer it to be a list, I have a comment on the way to make a string list in gram.y. You stored a bare cstring in the options list, but I think it is not the preferable form. I suppose the following are preferable.
Corresponding fixes are needed in ReindexTable, ReindexIndex, ReindexMultipleTables.

	$$ = list_make1(makeString($1));
	$$ = lappend($1, list_make1(makeString($3)));

In equalfuncs.c, _equalReindexStmt forgets to compare the member options. _copyReindexStmt also forgets to copy it. The way you constructed the options list prevents them from doing their jobs using prepared methods. Comparing and copying the member option is needed even if it becomes a simple string.

I revised the patch, and changed gram.y so it doesn't use the list. So this patch adds the new syntax:

REINDEX { INDEX | ... } name WITH VERBOSE;

Also the documentation is updated. Please give me feedback.

Some notes:

1) There is a trailing space in src/backend/parser/gram.y:

-	| REINDEX DATABASE name opt_force
+	| REINDEX reindex_target_multitable name WITH opt_verbose
		{
			ReindexStmt *n = makeNode(ReindexStmt);
-			n->kind = REINDEX_OBJECT_DATABASE;
+			n->kind = $2;
			n->name = $3;
			n->relation = NULL;
+			n->verbose = $5;
			$$ = (Node *)n;
		}
	;

2) The documentation was updated and matches the behaviour.

3) psql autocomplete is ok.

4) Lack of regression tests. I think you should add some regression tests like this:

fabrizio=# \set VERBOSITY terse
fabrizio=# create table reindex_verbose(id integer primary key);
CREATE TABLE
fabrizio=# reindex table reindex_verbose with verbose;
INFO: index "reindex_verbose_pkey" was reindexed.
REINDEX

5) Code style and organization is ok.

6) You should add the new field ReindexStmt->verbose to src/backend/nodes/copyfuncs.c.

Regards,

Thank you for your reviewing. I modified the patch and attached the latest version patch (v7). Please have a look at it.
Regards, --- Sawada Masahiko diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml index 0a4c7d4..27be1a4 100644 --- a/doc/src/sgml/ref/reindex.sgml +++ b/doc/src/sgml/ref/reindex.sgml @@ -22,6 +22,7 @@ PostgreSQL documentation refsynopsisdiv synopsis REINDEX { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } replaceable class=PARAMETERname/replaceable [ FORCE ] +REINDEX { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } replaceable class=PARAMETERname/replaceable WITH VERBOSE /synopsis /refsynopsisdiv @@ -159,6 +160,15 @@ REINDEX { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } replaceable class=PARAM /para /listitem /varlistentry + + varlistentry +termliteralVERBOSE/literal/term +listitem + para + Prints a progress report as each index is reindexed. + /para +/listitem + /varlistentry /variablelist /refsect1 diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index 351dcb2..fc44495 100644 --- a/src/backend/catalog/index.c
[HACKERS] PATCH: Add 'pid' column to pg_replication_slots
The attached patch adds a 'pid' column to pg_replication_slots, so it's possible to associate an active slot with the pg_stat_replication entry that describes the walsender using the slot. If a user backend (or bgworker) is using the slot over the SQL interface, the 'pid' column will correspond to the pg_stat_activity entry for that backend instead. After all, both it and pg_stat_replication are views over pg_stat_get_activity() anyway. Detailed rationale in patch. Please consider this for 9.5. It's a pretty light patch, and it'd be good to have it in place to ease monitoring of slot-based replication. Note that logical replication walsenders are broken in HEAD so testing this using the test_decoding module and pg_recvlogical will crash the walsender. This is a pre-existing bug; see http://www.postgresql.org/message-id/CAB7nPqQSdx7coHk0D6G=mkjntgyjxpdw+pwiskkssaezfts...@mail.gmail.com and http://www.postgresql.org/message-id/CAMsr+YEh50r70+hP+w=rCzEuenoQRCNMDA7PmRSK06Ro9r=9...@mail.gmail.com ) -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services From eabd1a1a66045703d7561b3c8883e90756582206 Mon Sep 17 00:00:00 2001 From: Craig Ringer cr...@2ndquadrant.com Date: Wed, 1 Apr 2015 17:18:39 +0800 Subject: [PATCH] Add a 'pid' column to pg_replication_slots This makes it easier to associate pg_stat_replication and pg_replication_slots entries. Right now an active replication slot could be being read by a walsender (using the replication protocol) or via normal user backend (using the SQL interface). The 'active' flag shows that it's in use, but not by whom. pg_stat_replication and pg_stat_activity both expose the relevant backend pid. Neither expose information about any replication slots being used. This patch adds a 'pid' column to pg_replication_slots, which is NULL when the slot isn't active, otherwise the PID of the backend that's using the slot. 
It makes the 'active' field wholly redundant, but that field is retained for backwards compatibility at the SQL level. This makes it possible to find out which walsenders or normal backends are using a slot, something that was not previously possible from the SQL level. This is particularly important for monitoring of replication lag. --- contrib/test_decoding/expected/ddl.out | 4 ++-- doc/src/sgml/catalogs.sgml | 13 + src/backend/catalog/system_views.sql | 1 + src/backend/replication/slot.c | 22 +++--- src/backend/replication/slotfuncs.c| 13 + src/include/catalog/pg_proc.h | 2 +- src/include/replication/slot.h | 6 -- src/test/regress/expected/rules.out| 3 ++- 8 files changed, 43 insertions(+), 21 deletions(-) diff --git a/contrib/test_decoding/expected/ddl.out b/contrib/test_decoding/expected/ddl.out index 780120d..80cf09b 100644 --- a/contrib/test_decoding/expected/ddl.out +++ b/contrib/test_decoding/expected/ddl.out @@ -603,7 +603,7 @@ SELECT pg_drop_replication_slot('regression_slot'); /* check that the slot is gone */ SELECT * FROM pg_replication_slots; - slot_name | plugin | slot_type | datoid | database | active | xmin | catalog_xmin | restart_lsn ++---++--++--+--+- + slot_name | plugin | slot_type | datoid | database | active | pid | xmin | catalog_xmin | restart_lsn +---++---++--++-+--+--+- (0 rows) diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index d0b78f2..9945d1f 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -5401,6 +5401,19 @@ /row row + entrystructfieldpid/structfield/entry + entrytypeinteger/type/entry + entry/entry + entryThe process ID of the session or WALsender using this slot if the + slot is currently actively being used. literalNULL/literal if + inactive. Corresponds to + structnamepg_stat_activity/structname.structfieldpid/structfield + (for normal backends) or + structnamepg_stat_replication/structname.structfieldpid/structfield + (for WALsenders). 
(Since: 9.5)/entry + /row + + row entrystructfieldxmin/structfield/entry entrytypexid/type/entry entry/entry diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 2800f73..5977075 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -654,6 +654,7 @@ CREATE VIEW pg_replication_slots AS L.datoid, D.datname AS database, L.active, +L.pid, L.xmin, L.catalog_xmin, L.restart_lsn diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index dd7ff0f..1ed0cd9 100644 --- a/src/backend/replication/slot.c +++
[HACKERS] Row security violation error is misleading
When attempting to insert a row that violates a row security policy that applies to writes, the error message emitted references WITH CHECK OPTION, even though (as far as the user knows) there's no such thing involved. If you understand the internals you'd know that row security re-uses the same logic as WITH CHECK OPTION, but it's going to be confusing for users.

postgres=> INSERT INTO clients (account_name, account_manager)
VALUES ('peters', 'peter'), ('johannas', 'johanna');
ERROR:  44000: new row violates WITH CHECK OPTION for "clients"
DETAIL:  Failing row contains (7, johannas, johanna).
LOCATION:  ExecWithCheckOptions, execMain.c:1683

... yet "clients" is a table, not a view, and cannot have a WITH CHECK OPTION clause. There is no reference to the policy being violated or to the fact that it's row security involved. I think this is going to be very confusing for users.

I was expecting to see something more like:

ERROR:  44000: new row in table "clients" violates row level security policy "just_own_clients"

Re-using the SQLSTATE 44000 is a bit iffy too. We should probably define something to differentiate this, like:

44P01	ROW SECURITY WRITE POLICY VIOLATION

(I've finally found some time to try to review the user-facing side of the row security patch as-committed.)

--
Craig Ringer
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Assertion failure when streaming logical changes
On 2015-04-07 17:22:12 +0800, Craig Ringer wrote:
> It might be a good idea to apply this if nothing better is forthcoming.
> Logical decoding in WALsenders is broken at the moment.

Yes.

> Otherwise it needs to go on the 9.5 blockers list. But could we get rid
> of those active_count manipulations too? Could you replace the
> SnapBuildSnap[Inc|Dec]Refcount calls with [Un]RegisterSnapshot()?
>
> It would be interesting to know why it was done that way in the first
> place, rather than using the snapshot management infrastructure.

Because the snapshot tracking infrastructure isn't particularly suitable. These snapshots are much longer lived; they span transactions and such. Nothing snapmgr.c is made for. It seemed more complicated/bug-prone to change snapmgr.c to support a foreign use case.

Greetings,

Andres Freund
Re: [HACKERS] Column mis-spelling hints vs case folding
On Sun, Apr 5, 2015 at 12:16 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> A newbie error that we see *constantly* is misunderstanding identifier
> case-folding rules. ISTM that commit e529cd4ff missed a chance to help
> with that. You do get a hint for this:
>
> regression=# create table t1 (foo int, Bar int);
> CREATE TABLE
> regression=# select bar from t1;
> ERROR:  column "bar" does not exist
> LINE 1: select bar from t1;
>                ^
> HINT:  Perhaps you meant to reference the column "t1.Bar".
>
> but apparently it's just treating that as a vanilla
> one-mistyped-character error, because there's no hint for this:
>
> regression=# create table t2 (foo int, BAR int);
> CREATE TABLE
> regression=# select BAR from t2;
> ERROR:  column "bar" does not exist
> LINE 1: select BAR from t2;
>                ^
>
> I think this hint might be a lot more useful if its comparison mechanism
> were case-insensitive.

If you want to make that change, I will not object.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Column mis-spelling hints vs case folding
On 7 Apr 2015 09:37, Robert Haas robertmh...@gmail.com wrote:
> On Sun, Apr 5, 2015 at 12:16 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> A newbie error that we see *constantly* is misunderstanding identifier
>> case-folding rules.
>> I think this hint might be a lot more useful if its comparison
>> mechanism were case-insensitive.
>
> If you want to make that change, I will not object.

If you just make it case insensitive that would be an improvement. Nobody sane has similar columns that differ only in case.

But if the original token was not quoted and the suggested token requires quoting, IWBNI the hint directly addressed that source of confusion. But if that gets fiddly, trying to squeeze too much into one hint, then better the generic hint than nothing at all. I don't want to kill a good simple change with bike-shedding here.
Re: [HACKERS] Replication identifiers, take 4
On 2015-03-28 23:50:20 +0100, Petr Jelinek wrote:
> And finally I have issue with how the new identifiers are allocated.
> Currently, if you create identifier 'foo', remove identifier 'foo' and
> create identifier 'bar', the identifier 'bar' will have same id as the
> old 'foo' identifier. This can be problem because the identifier id is
> used as origin of the data and the replication solution using the
> replication identifiers can end up thinking that data came from node
> 'bar' even though they came from the node 'foo' which no longer exists.
> This can have bad effects for example on conflict detection or
> debugging problems with replication. Maybe another reason to use
> standard Oids?

As the same reason exists for oids, just somewhat less likely, I don't see it as a reason for much. It's really not that hard to get oid conflicts once your server has lived for a while. As soon as the oid counter has wrapped around once, it's not unlikely to have conflicts. And with temp tables (or much more extremely WITH OID tables) and such it's not that hard to reach that point.

The only material difference this makes is that it's much easier to notice the problem during development.

Greetings,

Andres Freund
Re: [HACKERS] Replication identifiers, take 4
On 2015-03-24 23:11:26 -0400, Robert Haas wrote:
> On Mon, Feb 16, 2015 at 4:46 AM, Andres Freund and...@2ndquadrant.com wrote:
>>> At a quick glance, this basic design seems workable. I would suggest
>>> expanding the replication IDs to regular 4 byte oids. Two extra bytes
>>> is a small price to pay, to make it work more like everything else in
>>> the system.
>>
>> I don't know. Growing from 3 to 5 byte overhead per relevant record (or
>> even 0 to 5 in case the padding is reused) is rather noticeable. If we
>> later find it to be a limit (I seriously doubt that), we can still
>> increase it in a major release without anybody really noticing.
>
> You might notice that Heikki is making the same point here that I've
> attempted to make multiple times in the past: limiting the replication
> identifier to 2 bytes because that's how much padding space you happen
> to have available is optimizing for the wrong thing. What we should be
> optimizing for is consistency and uniformity of design. System catalogs
> have OIDs, so this one should, too. You're not going to be able to
> paper over the fact that the column has some funky data type that is
> unlike every other column in the system. To the best of my knowledge,
> the statement that there is a noticeable performance cost for those 2
> extra bytes is also completely unsupported by any actual benchmarking.

I'm starting benchmarks now. But I have to say: I find the idea that you'd need more than 2^16 identifiers anytime soon not very credible. The likelihood that replication identifiers are the limiting factor towards that seems incredibly small. Just consider how you'd apply changes from so many remotes; how to stream changes to them; how to even configure such a complex setup. We can easily change the size limits in the next major release without anybody being inconvenienced.

We've gone through quite some lengths reducing the overhead of WAL. I don't understand why it's important that we do not make compromises here; but why that doesn't matter elsewhere.
Greetings,

Andres Freund
Re: [HACKERS] PATCH: Add 'pid' column to pg_replication_slots
Hi,

On 2015-04-07 18:41:59 +0800, Craig Ringer wrote:
> The attached patch adds a 'pid' column to pg_replication_slots, so it's
> possible to associate an active slot with the pg_stat_replication entry
> that describes the walsender using the slot.

This really should have been done that way from the get go. So I'm inclined to just apply this soon. Will do unless somebody protests.

Greetings,

Andres Freund
Re: [HACKERS] Replication identifiers, take 4
On 2015-04-07 16:30:25 +0200, Andres Freund wrote:
> And with temp tables (or much more extremely WITH OID tables) and such
> it's not that hard to reach that point.

Oh, and obviously toast data. A couple tables with toasted columns is also a good way to rapidly consume oids.

Greetings,

Andres Freund
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
On Tue, Apr 7, 2015 at 7:22 AM, Sawada Masahiko sawada.m...@gmail.com wrote: Thank you for your reviewing. I modified the patch and attached latest version patch(v7). Please have a look it. Looks good to me. Attached patch (v8) just fix a tab indentation in gram.y. Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog: http://fabriziomello.github.io Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello Github: http://github.com/fabriziomello diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml index 0a4c7d4..27be1a4 100644 --- a/doc/src/sgml/ref/reindex.sgml +++ b/doc/src/sgml/ref/reindex.sgml @@ -22,6 +22,7 @@ PostgreSQL documentation refsynopsisdiv synopsis REINDEX { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } replaceable class=PARAMETERname/replaceable [ FORCE ] +REINDEX { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } replaceable class=PARAMETERname/replaceable WITH VERBOSE /synopsis /refsynopsisdiv @@ -159,6 +160,15 @@ REINDEX { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } replaceable class=PARAM /para /listitem /varlistentry + + varlistentry +termliteralVERBOSE/literal/term +listitem + para + Prints a progress report as each index is reindexed. 
+ /para +/listitem + /varlistentry /variablelist /refsect1 diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index 351dcb2..fc44495 100644 --- a/src/backend/catalog/index.c +++ b/src/backend/catalog/index.c @@ -63,6 +63,7 @@ #include utils/inval.h #include utils/lsyscache.h #include utils/memutils.h +#include utils/pg_rusage.h #include utils/syscache.h #include utils/tuplesort.h #include utils/snapmgr.h @@ -3133,13 +3134,18 @@ IndexGetRelation(Oid indexId, bool missing_ok) * reindex_index - This routine is used to recreate a single index */ void -reindex_index(Oid indexId, bool skip_constraint_checks, char persistence) +reindex_index(Oid indexId, bool skip_constraint_checks, char persistence, +bool verbose) { Relation iRel, heapRelation; Oid heapId; IndexInfo *indexInfo; volatile bool skipped_constraint = false; + int elevel = verbose ? INFO : DEBUG2; + PGRUsage ru0; + + pg_rusage_init(ru0); /* * Open and lock the parent heap relation. ShareLock is sufficient since @@ -3283,6 +3289,13 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence) heap_close(pg_index, RowExclusiveLock); } + /* Log what we did */ + ereport(elevel, + (errmsg(index \%s\ was reindexed., + get_rel_name(indexId)), + errdetail(%s., + pg_rusage_show(ru0; + /* Close rels, but keep locks */ index_close(iRel, NoLock); heap_close(heapRelation, NoLock); @@ -3324,7 +3337,7 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence) * index rebuild. */ bool -reindex_relation(Oid relid, int flags) +reindex_relation(Oid relid, int flags, bool verbose) { Relation rel; Oid toast_relid; @@ -3415,7 +3428,7 @@ reindex_relation(Oid relid, int flags) RelationSetIndexList(rel, doneIndexes, InvalidOid); reindex_index(indexOid, !(flags REINDEX_REL_CHECK_CONSTRAINTS), - persistence); + persistence, verbose); CommandCounterIncrement(); @@ -3450,7 +3463,7 @@ reindex_relation(Oid relid, int flags) * still hold the lock on the master table. 
*/ if ((flags REINDEX_REL_PROCESS_TOAST) OidIsValid(toast_relid)) - result |= reindex_relation(toast_relid, flags); + result |= reindex_relation(toast_relid, flags, verbose); return result; } diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c index 3febdd5..34ffaba 100644 --- a/src/backend/commands/cluster.c +++ b/src/backend/commands/cluster.c @@ -1532,7 +1532,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap, else if (newrelpersistence == RELPERSISTENCE_PERMANENT) reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT; - reindex_relation(OIDOldHeap, reindex_flags); + reindex_relation(OIDOldHeap, reindex_flags, false); /* * If the relation being rebuild is pg_class, swap_relation_files() diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index 99acd4a..a42a508 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -1681,7 +1681,7 @@ ChooseIndexColumnNames(List *indexElems) * Recreate a specific index. */ Oid -ReindexIndex(RangeVar *indexRelation) +ReindexIndex(RangeVar *indexRelation, bool verbose) { Oid indOid; Oid heapOid = InvalidOid; @@ -1706,7 +1706,7 @@ ReindexIndex(RangeVar *indexRelation) persistence = irel-rd_rel-relpersistence; index_close(irel, NoLock); - reindex_index(indOid, false, persistence); + reindex_index(indOid, false, persistence, verbose); return indOid; } @@ -1775,7 +1775,7 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation, * Recreate
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
On Wed, Apr 8, 2015 at 1:11 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Tue, Apr 7, 2015 at 1:04 PM, Sawada Masahiko sawada.m...@gmail.com wrote: On Tue, Apr 7, 2015 at 10:12 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Tue, Apr 7, 2015 at 7:22 AM, Sawada Masahiko sawada.m...@gmail.com wrote: Thank you for your review. I modified the patch and attached the latest version (v7). Please have a look at it. Looks good to me. The attached patch (v8) just fixes a tab indentation in gram.y. I had forgotten to fix the tab indentation, sorry. Thank you for reviewing! It looks good to me too. Can this patch be marked as Ready for Committer? +1 Changed status to Ready for Committer. Thank you for the final review! Regards, --- Sawada Masahiko
Re: [HACKERS] Replication identifiers, take 4
On 4/7/15 9:30 AM, Andres Freund wrote: On 2015-03-28 23:50:20 +0100, Petr Jelinek wrote: And finally I have issue with how the new identifiers are allocated. Currently, if you create identifier 'foo', remove identifier 'foo' and create identifier 'bar', the identifier 'bar' will have same id as the old 'foo' identifier. This can be problem because the identifier id is used as origin of the data and the replication solution using the replication identifiers can end up thinking that data came from node 'bar' even though they came from the node 'foo' which no longer exists. This can have bad effects for example on conflict detection or debugging problems with replication. Maybe another reason to use standard Oids? As the same reason exists for oids, just somewhat less likely, I don't see it as a reason for much. It's really not that hard to get oid conflicts once your server has lived for a while. As soon as the oid counter has wrapped around once, it's not unlikely to have conflicts. And with temp tables (or much more extremely WITH OID tables) and such it's not that hard to reach that point. The only material difference this makes is that it's much easier to notice the problem during development. Why not just create a sequence? I suspect it may not be as fast to assign as an OID, but it's not like you'd be doing this all the time. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
On Tue, Apr 7, 2015 at 10:12 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Tue, Apr 7, 2015 at 7:22 AM, Sawada Masahiko sawada.m...@gmail.com wrote: Thank you for your review. I modified the patch and attached the latest version (v7). Please have a look at it. Looks good to me. The attached patch (v8) just fixes a tab indentation in gram.y. I had forgotten to fix the tab indentation, sorry. Thank you for reviewing! It looks good to me too. Can this patch be marked as Ready for Committer? Regards, --- Sawada Masahiko
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
On Tue, Apr 7, 2015 at 1:04 PM, Sawada Masahiko sawada.m...@gmail.com wrote: On Tue, Apr 7, 2015 at 10:12 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Tue, Apr 7, 2015 at 7:22 AM, Sawada Masahiko sawada.m...@gmail.com wrote: Thank you for your review. I modified the patch and attached the latest version (v7). Please have a look at it. Looks good to me. The attached patch (v8) just fixes a tab indentation in gram.y. I had forgotten to fix the tab indentation, sorry. Thank you for reviewing! It looks good to me too. Can this patch be marked as Ready for Committer? +1 -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog: http://fabriziomello.github.io Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello Github: http://github.com/fabriziomello
Re: [HACKERS] Row security violation error is misleading
Craig, * Craig Ringer (cr...@2ndquadrant.com) wrote: When attempting to insert a row that violates a row security policy that applies to writes, the error message emitted references WITH CHECK OPTION, even though (as far as the user knows) there's no such thing involved. If you understand the internals you'd know that row security re-uses the same logic as WITH CHECK OPTION, but it's going to be confusing for users. Agreed, and we actually have a patch from Dean already to address this; it's just been waiting on me (with a couple of other ones). It'd certainly be great if you have time to take a look at those, though, generally speaking, I feel pretty happy about where those are and believe they really just need to be reviewed/tested and maybe a bit of word-smithing around the docs. Thanks! Stephen
Re: [HACKERS] Support UPDATE table SET(*)=...
Atri Sharma atri.j...@gmail.com writes: Please find attached latest version of UPDATE (*) patch.The patch implements review comments and Tom's gripe about touching transformTargetList is addressed now. I have added regression tests and simplified parser representation a bit. I spent a fair amount of time cleaning this patch up to get it into committable shape, but as I was working on the documentation I started to lose enthusiasm for it, because I was having a hard time coming up with compelling examples. The originally proposed motivation was It solves the problem of doing UPDATE from a record variable of the same type as the table e.g. update foo set (*) = (select foorec.*) where ...; but it feels to me that this is not actually a very good solution to that problem --- even granting that the problem needs to be solved, a claim that still requires some justification IMO. Here is a possible use-case: regression=# create table src (f1 int, f2 text, f3 real); CREATE TABLE regression=# create table dst (f1 int, f2 text, f3 real); CREATE TABLE regression=# create rule r1 as on update to src do instead regression-# update dst set (*) = (new.*) where dst.f1 = old.f1; ERROR: number of columns does not match number of values LINE 2: update dst set (*) = (new.*) where dst.f1 = old.f1; ^ So that's annoying. You can work around it with this very unobvious (and undocumented) syntax hack: regression=# create rule r1 as on update to src do instead regression-# update dst set (*) = (select new.*) where dst.f1 = old.f1; CREATE RULE But what happens if dst has no matching row? Your data goes into the bit bucket, that's what. What you really wanted here was some kind of UPSERT. There's certainly a possible use-case for a notation like this in UPSERT, when we get around to implementing that; but I'm less than convinced that we need it in plain UPDATE. 
What's much worse though is that the rule that actually gets stored is: regression=# \d+ src Table public.src Column | Type | Modifiers | Storage | Stats target | Description +-+---+--+--+- f1 | integer | | plain| | f2 | text| | extended | | f3 | real| | plain| | Rules: r1 AS ON UPDATE TO src DO INSTEAD UPDATE dst SET (f1, f2, f3) = ( SELECT new.f1, new.f2, new.f3) WHERE dst.f1 = old.f1 That is, you don't actually have any protection at all against future additions of columns, which seems like it would be most of the point of using a notation like this. We could possibly address that by postponing expansion of the *-notation to rule rewrite time, but that seems like it'd be a complete disaster in terms of modularity, or even usability (since column-mismatch errors wouldn't get raised till then either). And it'd certainly be a far more invasive and complex patch than this. So I'm feeling that this may not be a good idea, or at least not a good implementation of the idea. I'm inclined to reject the patch rather than lock us into something that is not standard and doesn't really do what people would be likely to want. Attached is the updated patch that I had before arriving at this discouraging conclusion. regards, tom lane diff --git a/doc/src/sgml/ref/update.sgml b/doc/src/sgml/ref/update.sgml index 35b0699..adb4473 100644 *** a/doc/src/sgml/ref/update.sgml --- b/doc/src/sgml/ref/update.sgml *** PostgreSQL documentation *** 25,31 UPDATE [ ONLY ] replaceable class=PARAMETERtable_name/replaceable [ * ] [ [ AS ] replaceable class=parameteralias/replaceable ] SET { replaceable class=PARAMETERcolumn_name/replaceable = { replaceable class=PARAMETERexpression/replaceable | DEFAULT } | ( replaceable class=PARAMETERcolumn_name/replaceable [, ...] ) = ( { replaceable class=PARAMETERexpression/replaceable | DEFAULT } [, ...] ) | ! ( replaceable class=PARAMETERcolumn_name/replaceable [, ...] ) = ( replaceable class=PARAMETERsub-SELECT/replaceable ) } [, ...] 
[ FROM replaceable class=PARAMETERfrom_list/replaceable ] [ WHERE replaceable class=PARAMETERcondition/replaceable | WHERE CURRENT OF replaceable class=PARAMETERcursor_name/replaceable ] --- 25,33 UPDATE [ ONLY ] replaceable class=PARAMETERtable_name/replaceable [ * ] [ [ AS ] replaceable class=parameteralias/replaceable ] SET { replaceable class=PARAMETERcolumn_name/replaceable = { replaceable class=PARAMETERexpression/replaceable | DEFAULT } | ( replaceable class=PARAMETERcolumn_name/replaceable [, ...] ) = ( { replaceable class=PARAMETERexpression/replaceable | DEFAULT } [, ...] ) | ! ( replaceable class=PARAMETERcolumn_name/replaceable [, ...] ) = ( replaceable class=PARAMETERsub-SELECT/replaceable ) | ! (*) = ( {
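The column-freezing hazard Tom describes (the stored rule expanding SET (*) into a concrete column list, so later ALTER TABLE ... ADD COLUMN is silently ignored) can be sketched in miniature. This is an illustrative Python analogue with invented helper names, not PostgreSQL code:

```python
# Hypothetical sketch of the modularity problem described above: because
# SET (*) is expanded into a concrete column list at the time the rule
# is *stored*, columns added to the table afterwards are silently left
# out of the stored rule.

def store_rule(table_columns):
    """Expand SET (*) at rule-definition time, as the patch does."""
    return list(table_columns)            # frozen snapshot of the columns

def columns_updated_by_rule(stored_rule, current_columns):
    """Which of the table's current columns the stored rule still covers."""
    return [c for c in current_columns if c in stored_rule]
```

Deferring the expansion to rule rewrite time would avoid the stale snapshot, but, as noted above, at the cost of modularity and of delaying column-mismatch errors.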
Re: [HACKERS] Replication identifiers, take 4
Why not just create a sequence? I suspect it may not be as fast to assign as an OID, but it's not like you'd be doing this all the time. What does that have to do with the thread? Greetings, Andres Freund
[HACKERS] rare avl shutdown slowness (related to signal handling)
I am playing with git tip on Windows 7/32-bit, with the backend compiled with Visual Studio 2005 (I know, it is very old :-( ). I encountered autovacuum launcher (avl) shutdown slowness twice, last night and this morning: after Ctrl-C is hit, the server waits another 60 seconds before it shuts down. Here is one log:

D:\pgsql\bin> postgres -D../data --log_line_prefix=%t %p
2015-04-07 10:30:04 PDT 3148 LOG: database system was shut down at 2015-04-07 10:29:24 PDT
2015-04-07 10:30:04 PDT 19548 LOG: database system is ready to accept connections
2015-04-07 10:30:04 PDT 27008 LOG: autovacuum launcher started
2015-04-07 10:30:08 PDT 19548 LOG: received fast shutdown request
2015-04-07 10:30:08 PDT 19548 LOG: aborting any active transactions
2015-04-07 10:30:08 PDT 27008 LOG: autovacuum launcher shutting down
2015-04-07 10:30:08 PDT 27008 ERROR: canceling statement due to user request
2015-04-07 10:31:09 PDT 27008 LOG: autovacuum launcher shutting down
2015-04-07 10:31:09 PDT 15656 LOG: shutting down
2015-04-07 10:31:09 PDT 15656 LOG: database system is shut down

The interesting part is PID 27008: the avl first ereport()s its shutdown message, which is at the very end of the main loop and just one step away from proc_exit(). It then apparently honors a SIGINT within ereport(), longjmps back to the loop head, and waits another 60 seconds. After the timeout, it ereports shutdown again and finally exits. Not sure if this is Windows-only or general. I can hardly repro it. Has anyone ever seen this? Regards, Qingqing
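The suspected interaction, an interrupt serviced after the loop has decided to exit but before proc_exit(), bouncing control back to the loop's error-recovery point, can be sketched with an illustrative Python analogue (invented structure, not PostgreSQL code; KeyboardInterrupt plays the role of the longjmp back to the loop head):

```python
import signal

# Illustrative analogue of the suspected bug: the launcher's main loop
# has its error-recovery point at the loop head.  If a cancel interrupt
# is serviced after the loop has logged "shutting down" but before it
# reaches proc_exit(), control jumps back to the loop head and the
# process waits out another full naptime before actually exiting.

def run_launcher():
    entries = 0
    shutdown_logged = False
    while True:
        entries += 1                   # loop head == error-recovery point
        try:
            if not shutdown_logged:
                shutdown_logged = True              # "shutting down" ereport ...
                signal.raise_signal(signal.SIGINT)  # ... cancel sneaks in here
            # Only reached on the second pass; a real launcher would
            # first sleep out its 60 s naptime before getting here.
            return entries
        except KeyboardInterrupt:
            continue                   # the "longjmp" back to the loop head
```

The fix direction this suggests is the usual one: once shutdown has been decided, further cancel interrupts must not be allowed to re-enter the loop.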
Re: [HACKERS] Support UPDATE table SET(*)=...
I wrote: So I'm feeling that this may not be a good idea, or at least not a good implementation of the idea. I'm inclined to reject the patch rather than lock us into something that is not standard and doesn't really do what people would be likely to want. BTW, a potentially workable fix to the problem of not wanting to lock down column lists in stored rules is to create a syntax that represents whole-row, record-oriented assignment directly. Then we need not be concerned with individual columns at parse time at all. So imagine something like this: UPDATE dst SET * = new WHERE ...; UPDATE dst SET * = (SELECT src FROM src WHERE ...); or if you needed to construct a row value at runtime you could write UPDATE dst SET * = ROW(x,y,z) WHERE ...; UPDATE dst SET * = (SELECT ROW(x,y,z) FROM src WHERE ...); The main bit of functionality that would be lost compared to the current patch is the ability to use DEFAULT for some of the row members. But I am not sure there is a compelling use-case for that: seems like if you have some DEFAULTs in there then it's unlikely that you don't know the column list accurately, so the existing (col1,col2,...) syntax will serve fine. This seems like it might not be unduly complex to implement, although it would have roughly nothing in common with the current patch. If we were to go in this direction, it would be nice to at the same time add a similar whole-record syntax for INSERT. I'm not sure exactly what that should look like though. Also, again, we ought to be paying attention to how this would match up with UPSERT syntax. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support UPDATE table SET(*)=...
On Tue, Apr 7, 2015 at 12:01 PM, Peter Geoghegan p...@heroku.com wrote: I still don't like the idea of supporting this, though. I'm not aware of any other system allowing something like this for either MERGE or a non-standard UPSERT. That includes MySQL, BTW. Only their REPLACE statement (which is a disaster for various reasons) can do something like this. Their INSERT ... ON DUPLICATE KEY UPDATE statement (which is roughly comparable to the proposed UPSERT patch) cannot update the entire row using a terse expression that references a row excluded from insertion (following the implementation taking the UPDATE path). -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support UPDATE table SET(*)=...
On Tue, Apr 7, 2015 at 11:19 AM, Tom Lane t...@sss.pgh.pa.us wrote: If we were to go in this direction, it would be nice to at the same time add a similar whole-record syntax for INSERT. I'm not sure exactly what that should look like though. Also, again, we ought to be paying attention to how this would match up with UPSERT syntax. I expressed concern about allowing this for UPSERT [1]. To be fair, VoltDB's new UPSERT statement allows something similar (or rather mandates it, since you cannot just update some columns), and that doesn't look wholly unreasonable. I still don't like the idea of supporting this, though. I'm not aware of any other system allowing something like this for either MERGE or a non-standard UPSERT. [1] http://www.postgresql.org/message-id/CAM3SWZT=vxbj7qkaidamybu40ap10udsqooqhvix3ykj7wb...@mail.gmail.com -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support UPDATE table SET(*)=...
Tom Lane wrote: I spent a fair amount of time cleaning this patch up to get it into committable shape, but as I was working on the documentation I started to lose enthusiasm for it, because I was having a hard time coming up with compelling examples. The originally proposed motivation was It solves the problem of doing UPDATE from a record variable of the same type as the table e.g. update foo set (*) = (select foorec.*) where ...; but it feels to me that this is not actually a very good solution to that problem --- even granting that the problem needs to be solved, a claim that still requires some justification IMO. How about an UPDATE run inside a plpgsql function, which is using a row variable of the table type? You could assign values to individual columns of a row variable, and run the multicolumn UPDATE last. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] pg_restore -t should match views, matviews, and foreign tables
Peter Eisentraut pete...@gmx.net writes: On 3/31/15 11:01 PM, Craig Ringer wrote: this patch adds support for views, foreign tables, and materialised views to the pg_restore -t flag. I think this is a good change. Any concerns? Are we happy with pg_dump/pg_restore not distinguishing these objects by type? It seems rather analogous to letting ALTER TABLE work on views etc. Personally I'm fine with this, but certainly some people have complained about that approach so far as ALTER is concerned. (But the implication would be that we'd need four distinct switches, which is not an outcome I favor.) Also, I think you missed MATERIALIZED VIEW DATA. Also, shouldn't there be a documentation update? regards, tom lane
Re: [HACKERS] deparsing utility commands
Executive summary: There is now a CommandDeparse_hook; deparse_utility_command is provided as an extension, intended for 9.6; rest of patch would be pushed to 9.5. Long version: I've made command deparsing hookable. Attached there are three patches: the first patch contains changes to core that just add the command list stuff, so that on ddl_command_end there is access to what has been executed. This includes the OID of the object just created, the command tag, and assorted other details; the deparsed command in JSON format is not immediately part of the result. The third patch contains all the deparse code, packaged as a contrib module and extension named ddl_deparse. Essentially, it's everything that was previously in tcop/deparse_utility.c and utils/adt/ddl_json.c: the stuff that takes the parsenode and OID of a command execution and turns it into a JSON blob, and also the support function that takes the JSON blob and converts back into the plain text rendering of the command. The second patch contains some changes to core code that support the ddl_deparse extension; mainly some ruleutils.c changes. What links patches 0001 and 0003 is a hook, CommandDeparse_hook. If unset, the pg_event_trigger_ddl_commands function returns some boilerplate text like no deparse function installed; if the extension is installed, the JSON rendering is returned instead and can be used with the ddl_deparse_expand_command() function. The rationale for doing things this way is that it will be useful to have 9.5 expose the pg_event_trigger_ddl_commands() function for various uses, while we refine the JSON bits some more and get it committed for 9.6. In reviews, it's clear that there's some more bits to fiddle so that it can be as general as possible. I think we should label the whole DDL command reporting as experimental in 9.5 and subject to change, so that we can just remove the hook in 9.6 when the ddl_deparse thing becomes part of core. Thoughts? 
-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
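The hook arrangement described above can be sketched in miniature. This is an illustrative Python analogue with invented names, not PostgreSQL's actual C API: core keeps a hook slot that is empty by default, and the command-reporting function only returns a JSON rendering once an extension such as ddl_deparse has filled the slot.

```python
import json

# Hypothetical sketch of the CommandDeparse_hook arrangement (invented
# names, not the real C API): unset hook -> boilerplate text; extension
# installed -> JSON rendering of the command.

command_deparse_hook = None       # core's hook slot, unset by default

def pg_event_trigger_ddl_commands(command_tag, object_oid):
    if command_deparse_hook is None:
        return "<no deparse function installed>"
    return command_deparse_hook(command_tag, object_oid)

def ddl_deparse(command_tag, object_oid):
    """What a ddl_deparse-style extension might plug into the slot."""
    return json.dumps({"tag": command_tag, "oid": object_oid})

def load_ddl_deparse_extension():
    """Simulates CREATE EXTENSION ddl_deparse filling the hook slot."""
    global command_deparse_hook
    command_deparse_hook = ddl_deparse
```

The design choice matches the rationale above: 9.5 can ship the stable core function while the JSON format stays swappable behind the hook until it is ready for core.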
Re: [HACKERS] pg_restore -t should match views, matviews, and foreign tables
On 3/31/15 11:01 PM, Craig Ringer wrote: Following on from this -bugs post: http://www.postgresql.org/message-id/camsr+ygj50tvtvk4dbp66gajeoc0kap6kxfehaom+neqmhv...@mail.gmail.com this patch adds support for views, foreign tables, and materialised views to the pg_restore -t flag. I think this is a good change. Any concerns?
Re: [HACKERS] Support UPDATE table SET(*)=...
Peter Geoghegan p...@heroku.com writes: On Tue, Apr 7, 2015 at 11:19 AM, Tom Lane t...@sss.pgh.pa.us wrote: If we were to go in this direction, it would be nice to at the same time add a similar whole-record syntax for INSERT. I'm not sure exactly what that should look like though. Also, again, we ought to be paying attention to how this would match up with UPSERT syntax. I expressed concern about allowing this for UPSERT [1]. Yeah, your analogy to SELECT * being considered dangerous in production is not without merit. However, to the extent that the syntax is used to assign from a composite variable of the same (or compatible) data type, it seems like it would be safe enough. IOW, I think that analogy holds for the syntax implemented by the current patch, but not what I suggested in my followup. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] rejected vs returned with feedback in new CF app
On 4/7/15 3:33 PM, Tom Lane wrote: I tried to mark the UPDATE SET (*) patch as returned with feedback, but the CF app informed me that if I did that the patch would automatically be moved to the next commitfest. That seems completely stupid. There is no need to reconsider it unless a new version of the patch is forthcoming (which there may or may not ever be, but that's beside the point for now). When and if the author does submit a new patch, that would be the time to include it in the next commitfest, no? I noticed that as well and have avoided closing some patches because of it.
Re: [HACKERS] pg_rewind and log messages
On 04/07/2015 05:59 AM, Michael Paquier wrote: On Mon, Apr 6, 2015 at 9:10 PM, Fujii Masao masao.fu...@gmail.com wrote: I eliminated a bunch of newlines in the log messages that seemed really unnecessary to me, simplifying a bit the whole. So the patch removed the newlines from the error messages, and added the newline into pg_fatal/log instead. Perhaps that's good idea, but it's not actually consistent with what we do in other utilities in src/bin. See write_msg() and exit_horribly() in pg_dump, write_stderr() in pg_ctl, and psql_error() in psql. If we want to change that in pg_rewind (IMHO we shouldn't), let's do that as a completely separate patch. While hacking this stuff, I noticed as well that pg_rewind could be called as root on non-Windows platform, that's dangerous from a security point of view as process manipulates files in PGDATA. Hence let's block that. On Windows, a restricted token should be used. Good catch! I think it's better to implement it as a separate patch because it's very different from log message problem. Attached are new patches: - 0001 fixes the restriction issues: restricted token on Windows, and forbid non-root user on non-Windows. Committed this one. - 0002 includes the improvements for logging, addressing the comments above. This is a bit of a mixed bag. I'll comment on the three points from the commit message separately: Fix inconsistent handling of logs in pg_rewind pg_rewind was handling a couple of things differently compared to the other src/bin utilities: - Logging output needs to be flushed on stderr, not stdout Agreed in general. But there's also precedent in printing some stuff to stdout: pg_ctl does that for the status message, like server starting. As does initdb. I'm pretty unclear on what the rule here is. - Logging messages should be prefixed with the application name We tend to do that for error messages, but not for other messages. Seems inappropriate for the progress messages. 
- pg_fatal exits with error code of 1, but it was sometimes followed by subsequent logs, making this information lost in the process. Fixed. - Heikki
[HACKERS] rejected vs returned with feedback in new CF app
I tried to mark the UPDATE SET (*) patch as returned with feedback, but the CF app informed me that if I did that the patch would automatically be moved to the next commitfest. That seems completely stupid. There is no need to reconsider it unless a new version of the patch is forthcoming (which there may or may not ever be, but that's beside the point for now). When and if the author does submit a new patch, that would be the time to include it in the next commitfest, no? I ended up marking it rejected instead, but that seems a bit harsh. regards, tom lane
Re: [HACKERS] Auditing extension for PostgreSQL (Take 2)
On 4/6/15 5:03 PM, Alvaro Herrera wrote: Simon Riggs wrote: The present version can trigger an audit trail event for a statement, without tracking the object that was being audited. This prevents you from searching for all SQL that touches table X, i.e. we know the statements were generated, but not which ones they were. IMHO that makes the resulting audit trail unusable for auditing purposes. I would like to see that functionality put back before it gets committed, if that occurs. Is there a consensus that the current version is the one that we should be reviewing, rather than the one Abhijit submitted? Last I checked, that wasn't at all clear. Well, this one is the commitfest thread of record. At quick glance, my comments about how this maps to specific customer requirements apply to the other submission as well.
Re: [HACKERS] Row security violation error is misleading
On Tue, Apr 7, 2015 at 5:11 AM, Craig Ringer cr...@2ndquadrant.com wrote: postgres=> INSERT INTO clients (account_name, account_manager) VALUES ('peters', 'peter'), ('johannas', 'johanna'); ERROR: 44000: new row violates WITH CHECK OPTION for "clients" DETAIL: Failing row contains (7, johannas, johanna). LOCATION: ExecWithCheckOptions, execMain.c:1683 ... yet clients is a table, not a view, and cannot have a WITH CHECK OPTION clause. There is no reference to the policy being violated or to the fact that it's row security involved. FWIW, I also think that this is very confusing. -- Peter Geoghegan
Re: [HACKERS] Parallel Seq Scan
David Rowley dgrowle...@gmail.com wrote:

If we attempt to do this parallel stuff at plan time, and we happen to plan at some quiet period, or perhaps worse, some application's start-up process happens to PREPARE a load of queries when the database is nice and quiet, then quite possibly we'll end up with some highly parallel queries. Then perhaps come the time these queries are actually executed the server is very busy... Things will fall apart quite quickly due to the masses of IPC and context switches that would be going on. I completely understand that this parallel query stuff is all quite new to us all and we're likely still trying to nail down the correct infrastructure for it to work well, so this is why I'm proposing that the planner should know nothing of parallel query; instead I think it should work more along the lines of:

* The planner should be completely oblivious to what parallel query is.
* Before executor startup the plan is passed to a function which decides if we should parallelise it, and does so if the plan meets the correct requirements. This should likely have a very fast exit path such as:

if root node's cost < parallel_query_cost_threshold
    return; /* the query is not expensive enough to attempt to make parallel */

The above check will allow us to have almost zero overhead for small low-cost queries. This function would likely also have some sort of logic to determine if the server has enough spare resources at the current point in time to allow queries to be parallelised.

There is a lot to like about this suggestion. I've seen enough performance crashes due to too many concurrent processes (even when each connection can only use a single process) to believe that, for a plan which will be saved, it is not possible to know at planning time whether parallelization will be a nice win or a devastating over-saturation of resources during some later execution phase.
Another thing to consider is that this is not entirely unrelated to the concept of admission control policies. Perhaps this phase could be a more general execution start-up admission control phase, where parallel processing would be one adjustment that could be considered. Initially it might be the *only* consideration, but it might be good to try to frame it in a way that allowed implementation of other policies, too.
--
Kevin Grittner
EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
Etsuro Fujita fujita.ets...@lab.ntt.co.jp writes: To support ROW_MARK_REFERENCE on (postgres_fdw) foreign tables, I'd like to propose the following FDW APIs: RowMarkType GetForeignRowMarkType(Oid relid, LockClauseStrength strength); Decide which rowmark type to use for a foreign table (that has strength = LCS_NONE), ie, ROW_MARK_REFERENCE or ROW_MARK_COPY. (For now, the second argument takes LCS_NONE only, but is intended to be used for the possible extension to the other cases.) This is called during select_rowmark_type() in the planner. Why would you restrict that to LCS_NONE? Seems like the point is to give the FDW control of the rowmark behavior, not only partial control. (For the same reason I disagree with the error check in the patch that restricts which ROW_MARK options this function can return. If the FDW has TIDs, seems like it could reasonably use whatever options tables can use.) void BeginForeignFetch(EState *estate, ExecRowMark *erm, List *fdw_private, int eflags); Begin a remote fetch. This is called during InitPlan() in the executor. The begin/end functions seem like useless extra mechanism. Why wouldn't the FDW just handle this during its regular scan setup? It could look to see whether the foreign table is referenced by any ExecRowMarks (possibly there's room to add some convenience functions to help with that). What's more, that design would make it simpler if the basic row fetch needs to be modified. And I'd also like to propose to add a table/server option, row_mark_reference, to postgres_fdw. When a user sets the option to true for eg a foreign table, ROW_MARK_REFERENCE will be used for the table, not ROW_MARK_COPY. Why would we leave that in the hands of the user? Hardly anybody is going to know what to do with the option, or even that they should do something with it. It's basically only of value for debugging AFAICS, but if we expose an option we're going to be stuck with supporting it forever. 
regards, tom lane
Re: [HACKERS] New error code to track unsupported contexts
On 11/28/14 11:41 PM, Michael Paquier wrote: Hi all, When pg_event_trigger_dropped_objects is run in a context that is not the one of an event trigger, currently the error code ERRCODE_FEATURE_NOT_SUPPORTED is returned. Wouldn't it be better to have an error code defined for out-of-context calls instead? It seems that it would be a good thing to have more error verbosity for situations like the case above. Note that this idea has been mentioned on this ML a couple of weeks back. In any case, attached is a patch showing the idea. Opinions? Is that worth having?

Anything ever happen with this? (FWIW, I'm in favor of it. Reporting that the feature isn't supported is confusing...)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] rare avl shutdown slowness (related to signal handling)
I got another repro of the shutdown slowness (DEBUG5 logs with verbose output are attached). It gives a finer picture of what's going on:

1. Avl ereport()s "autovacuum launcher shutting down";
2. At the end of errfinish(), it honors a pending SIGINT;
3. The SIGINT handler longjmps to the start of avl error handling;
4. The error handling continues and calls rebuild_database_list() (that's why we see the begin/commit pair);
5. In the main loop, it WaitLatch()es for 60 seconds;
6. Finally it ereport()s again and proc_exit()s.

This looks like a general pattern - I don't think *nix is immune. Notice that this ereport() is special as there is no way to go back after it. So we can insert HOLD_INTERRUPTS() just before it. Thoughts?

Regards, Qingqing

On Tue, Apr 7, 2015 at 10:54 AM, Qingqing Zhou zhouqq.postg...@gmail.com wrote:

I am playing git tip on windows 7/32 bits, with the backend compiled with visual studio 2005 (I know, it is very old :-( ). I encountered avl shutdown slowness twice, last night and this morning: after a ctrl_c is hit, wait for another 60 seconds, server shuts down. Here is one log:

D:\pgsql\bin> postgres -D ../data --log_line_prefix=%t %p
2015-04-07 10:30:04 PDT 3148 LOG: database system was shut down at 2015-04-07 10:29:24 PDT
2015-04-07 10:30:04 PDT 19548 LOG: database system is ready to accept connections
2015-04-07 10:30:04 PDT 27008 LOG: autovacuum launcher started
2015-04-07 10:30:08 PDT 19548 LOG: received fast shutdown request
2015-04-07 10:30:08 PDT 19548 LOG: aborting any active transactions
2015-04-07 10:30:08 PDT 27008 LOG: autovacuum launcher shutting down
2015-04-07 10:30:08 PDT 27008 ERROR: canceling statement due to user request
2015-04-07 10:31:09 PDT 27008 LOG: autovacuum launcher shutting down
2015-04-07 10:31:09 PDT 15656 LOG: shutting down
2015-04-07 10:31:09 PDT 15656 LOG: database system is shut down

The interesting part is on PID 27008: avl first ereport()s the shutdown message, which is at the very end of the main loop and just one step away from proc_exit().
Then it seems to honor a SIGINT within ereport(), longjmps to the loop head, and waits for another 60 seconds. After the timeout, it ereport()s shutdown again, and finally exits. Not sure if this is Windows-only or general. I can hardly repro it. Has anyone ever seen this?

Regards, Qingqing
Re: [HACKERS] rare avl shutdown slowness (related to signal handling)
Qingqing Zhou zhouqq.postg...@gmail.com writes: I got another repro of the shutdown slowness (DEBUG5 logs with verbose output are attached). It gives a finer picture of what's going on: 1. Avl ereport()s "autovacuum launcher shutting down"; 2. At the end of errfinish(), it honors a pending SIGINT; 3. The SIGINT handler longjmps to the start of avl error handling; 4. The error handling continues and calls rebuild_database_list() (that's why we see the begin/commit pair); 5. In the main loop, it WaitLatch()es for 60 seconds; 6. Finally it ereport()s again and proc_exit()s. This looks like a general pattern - I don't think *nix is immune. Notice that this ereport() is special as there is no way to go back after it. So we can insert HOLD_INTERRUPTS() just before it. Thoughts?

That seems like (a) a hack, and (b) not likely to solve the problem completely, unless you leave interrupts held throughout proc_exit(), which would create all sorts of opportunities for corner-case bugs during on_proc_exit hooks. I think changing the outer for(;;) to while (!got_SIGTERM) would be a much safer fix.

It looks like there's a related risk associated with this bit:

    /* in emergency mode, just start a worker and go away */
    if (!AutoVacuumingActive())
    {
        do_start_worker();
        proc_exit(0);       /* done */
    }

If we get SIGHUP and see that autovacuum has been turned off, we exit the main loop, but we don't set got_SIGTERM. So if we then get a similar error at the shutdown report, we'd not merely waste some time, but actually incorrectly launch a child.

regards, tom lane
Re: [HACKERS] Replication identifiers, take 4
On 4/7/15 10:58 AM, Andres Freund wrote: Why not just create a sequence? I suspect it may not be as fast to assign as an OID, but it's not like you'd be doing this all the time. What does that have to do with the thread? The original bit was... And finally I have issue with how the new identifiers are allocated. Currently, if you create identifier 'foo', remove identifier 'foo' and create identifier 'bar', the identifier 'bar' will have same id as the old 'foo' identifier. This can be problem because the identifier id is used as origin of the data and the replication solution using the replication identifiers can end up thinking that data came from node 'bar' even though they came from the node 'foo' which no longer exists. This can have bad effects for example on conflict detection or debugging problems with replication. Maybe another reason to use standard Oids? Wasn't the reason for using OIDs so that we're not doing the equivalent of max(identifier) + 1? Perhaps I'm just confused... -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_restore -t should match views, matviews, and foreign tables
On Tue, Apr 7, 2015 at 1:33 PM, Tom Lane t...@sss.pgh.pa.us wrote: Peter Eisentraut pete...@gmx.net writes: On 3/31/15 11:01 PM, Craig Ringer wrote: this patch adds support for views, foreign tables, and materialised views to the pg_restore -t flag. I think this is a good change. Any concerns? Are we happy with pg_dump/pg_restore not distinguishing these objects by type? It seems rather analogous to letting ALTER TABLE work on views etc. Personally I'm fine with this, but certainly some people have complained about that approach so far as ALTER is concerned. (But the implication would be that we'd need four distinct switches, which is not an outcome I favor.) The pg_dump documentation for the equivalent -t switch states: Dump only tables (or views or sequences or foreign tables) matching table Does pg_dump need to be updated to address materialized views here? Does pg_restore need to be updated to address sequences here? ISTM that the two should mirror each other. David J.
Re: [HACKERS] rare avl shutdown slowness (related to signal handling)
Alvaro Herrera alvhe...@2ndquadrant.com writes: Tom Lane wrote: I think changing the outer for(;;) to while (!got_SIGTERM) would be a much safer fix. Ah, yeah. I was thinking in changing PG_exception_stack once shutdown was requested, but this is much simpler. Your proposed patch seems to be doing both of those, which is probably unnecessary. I don't object to the SIGHUP test and goto in the error path, but I'd put it a lot further down, like after the existing RESUME_INTERRUPTS. I doubt it's a good idea to skip the transaction cleanup steps. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support UPDATE table SET(*)=...
On 4/7/15 2:00 PM, Alvaro Herrera wrote: Tom Lane wrote: I spent a fair amount of time cleaning this patch up to get it into committable shape, but as I was working on the documentation I started to lose enthusiasm for it, because I was having a hard time coming up with compelling examples. The originally proposed motivation was "It solves the problem of doing UPDATE from a record variable of the same type as the table", e.g.

    update foo set (*) = (select foorec.*) where ...;

but it feels to me that this is not actually a very good solution to that problem --- even granting that the problem needs to be solved, a claim that still requires some justification IMO.

How about an UPDATE run inside a plpgsql function, which is using a row variable of the table type? You could assign values to individual columns of a row variable, and run the multicolumn UPDATE last.

Along similar lines, I've often wanted something akin to *, but allowing finer control over what you got. Generally when I want this it's because I really do want everything (as in, I don't want to re-code a bunch of stuff if a column is added), but perhaps not the surrogate key field. Or I want everything, but rename some field to something else. Basically, another way to do what Alvaro is suggesting (though the ability to rename something is new...). If we had that ability I think UPDATE * would become a lot more useful.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] pg_rewind and log messages
Heikki Linnakangas wrote: On 04/07/2015 05:59 AM, Michael Paquier wrote: Fix inconsistent handling of logs in pg_rewind: pg_rewind was handling a couple of things differently compared to the other src/bin utilities: - Logging output needs to be flushed on stderr, not stdout.

Agreed in general. But there's also precedent for printing some stuff to stdout: pg_ctl does that for status messages, like "server starting". As does initdb. I'm pretty unclear on what the rule here is.

One principle that sometimes helps is to consider what happens if you use the command as part of a larger pipeline; progress messages can be read by some other command further down (and perhaps reported in a dialog box, if you embed the program in a GUI, say), but error messages should probably be processed differently; normally the pipeline would be aborted as a whole.
--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Doubt about AccessExclusiveLock in ALTER TABLE .. SET ( .. );
On Mon, Apr 6, 2015 at 12:53 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Fabrízio de Royes Mello wrote: Ok guys. The attached patch refactors the reloptions, adding a new field "lockmode" in the relopt_gen struct and a new method to determine the required lock level from an option list. We need to determine the appropriate lock level for each reloption:

I don't think AccessShareLock is appropriate for any option change. You should be using a lock level that's self-conflicting, as a minimum requirement, to avoid two processes changing the value concurrently.

What lock level do you suggest? Maybe ShareLock?

(I would probably go as far as ensuring that the lock level specified in the table DoLockModesConflict() with itself in an Assert somewhere.)

If I understood correctly, this is to check that the lock levels in the reloption list don't conflict with each other, is it?

Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello
Re: [HACKERS] rare avl shutdown slowness (related to signal handling)
On Tue, Apr 7, 2015 at 2:32 PM, Tom Lane t...@sss.pgh.pa.us wrote: That seems like (a) a hack, and (b) not likely to solve the problem completely, unless you leave interrupts held throughout proc_exit(), which would create all sorts of opportunities for corner case bugs during on_proc_exit hooks.

Hmm, it looks like proc_exit() has already taken care of this by setting proc_exit_inprogress, and StatementCancelHandler() respects it. Actually, in quickdie(), I found a similar practice for the same reason:

    /*
     * Prevent interrupts while exiting; though we just blocked signals that
     * would queue new interrupts, one may have been pending.  We don't want a
     * quickdie() downgraded to a mere query cancel.
     */
    HOLD_INTERRUPTS();

I do feel that we have too many functions instructing how to handle interrupts, and they are subtle - I just found a new friend, HOLD_CANCEL_INTERRUPTS :-(

Regards, Qingqing
Re: [HACKERS] rare avl shutdown slowness (related to signal handling)
Qingqing Zhou zhouqq.postg...@gmail.com writes: I do feel that we have too many functions instructing how to handle interrupts and they are subtle - I just found a new friend HOLD_CANCEL_INTERRUPTS :-( Indeed, which is why I think a patch for this issue should not introduce a new mode/context in which proc_exit() is executed. The code path would be so seldom executed that we'd never manage to isolate any bugs that might be in it. Thus also my objection to Alvaro's patch that had us going to proc_exit from partway through the error cleanup sequence. regards, tom lane
Re: [HACKERS] rare avl shutdown slowness (related to signal handling)
On Tue, Apr 7, 2015 at 4:01 PM, Tom Lane t...@sss.pgh.pa.us wrote: Indeed, which is why I think a patch for this issue should not introduce a new mode/context in which proc_exit() is executed.

Agreed. Along this line, we can add an on_proc_exit hook that simply ereport()s "we are exiting" there. In this way, we reuse all the shields proc_exit() already has. Abuse?

Regards, Qingqing
[HACKERS] Tuple visibility within a single XID
My understanding is that all subtransactions get their own unique XID (assuming they need one), and that CommandId can't move backwards within a transaction. If that's correct, then shouldn't we be able to prune tuples where XMIN and XMAX match our *exact* XID (not all the extra stuff that TransactionIdIsCurrentTransactionId() does) and whose CommandId < CurrentCommandId? I thought of this because of a post to -general. It's certainly not a common case, but it seems like there's not much downside...
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Doubt about AccessExclusiveLock in ALTER TABLE .. SET ( .. );
Fabrízio de Royes Mello wrote: On Mon, Apr 6, 2015 at 12:53 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Fabrízio de Royes Mello wrote: Ok guys. The attached patch refactors the reloptions, adding a new field "lockmode" in the relopt_gen struct and a new method to determine the required lock level from an option list. We need to determine the appropriate lock level for each reloption: I don't think AccessShareLock is appropriate for any option change. You should be using a lock level that's self-conflicting, as a minimum requirement, to avoid two processes changing the value concurrently. What lock level do you suggest? Maybe ShareLock?

ShareUpdateExclusive, probably. ShareLock doesn't conflict with itself.

(I would probably go as far as ensuring that the lock level specified in the table DoLockModesConflict() with itself in an Assert somewhere.) If I understood correctly, this is to check that the lock levels in the reloption list don't conflict with each other, is it?

I mean

    Assert(DoLockModesConflict(relopt_gen->locklevel, relopt_gen->locklevel));

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] pg_rewind and log messages
On Wed, Apr 8, 2015 at 5:53 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Heikki Linnakangas wrote: On 04/07/2015 05:59 AM, Michael Paquier wrote: Fix inconsistent handling of logs in pg_rewind pg_rewind was handling a couple of things differently compared to the other src/bin utilities: - Logging output needs to be flushed on stderr, not stdout Agreed in general. But there's also precedent in printing some stuff to stdout: pg_ctl does that for the status message, like server starting. As does initdb. I'm pretty unclear on what the rule here is. One principle that sometimes helps is to consider what happens if you use the command as part of a larger pipeline; progress messages can be read by some other command further down (and perhaps report them in a dialog box, if you embed the program in a GUI, say), but error messages should probably be processed differently; normally the pipeline would be aborted as a whole. Make sense. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Tuple visibility within a single XID
On Tue, Apr 7, 2015 at 6:11 PM, Peter Geoghegan p...@heroku.com wrote: No. For one thing, unique index enforcement still requires the tuples to be treated as a conflict while the other transaction is running IMV.

Not sure if I understand correctly: in a uniqueness check, we see all possible tuples with a dirty snapshot. For a tuple version inserted and updated by myself, it is surely dead no matter how the transaction ends. So I interpret that we can safely prune the version.

Early pruning may cause some behavior change, though. For example, consider a transaction T1:

    begin;
    -- loop the following statements many times
    insert into pk values (1);  -- uniqueness defined
    delete from pk;
    abort;

If another transaction T2 comes later than T1, and we prune early, then T1 can suddenly hang on insertion waiting for T2 to complete. But does this violate any isolation rule?

Thanks, Qingqing
Re: [HACKERS] Tuple visibility within a single XID
On 4/7/15 8:11 PM, Peter Geoghegan wrote: You're not the first to consider trying something like this in specific scenarios, but my work on UPSERT leads me to believe it isn't workable. Yeah, every time I get into the really nitty-gritty details of this stuff it gets scary. That's why I didn't even bother with a patch before asking. Thanks for setting me straight. :)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Doubt about AccessExclusiveLock in ALTER TABLE .. SET ( .. );
On Tue, Apr 7, 2015 at 10:15 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Fabrízio de Royes Mello wrote: On Mon, Apr 6, 2015 at 12:53 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Fabrízio de Royes Mello wrote: Ok guys. The attached patch refactors the reloptions, adding a new field "lockmode" in the relopt_gen struct and a new method to determine the required lock level from an option list. We need to determine the appropriate lock level for each reloption: I don't think AccessShareLock is appropriate for any option change. You should be using a lock level that's self-conflicting, as a minimum requirement, to avoid two processes changing the value concurrently. What lock level do you suggest? Maybe ShareLock? ShareUpdateExclusive, probably. ShareLock doesn't conflict with itself.

Ok.

(I would probably go as far as ensuring that the lock level specified in the table DoLockModesConflict() with itself in an Assert somewhere.) If I understood correctly, this is to check that the lock levels in the reloption list don't conflict with each other, is it? I mean

    Assert(DoLockModesConflict(relopt_gen->locklevel, relopt_gen->locklevel));

Understood... IMHO the best place to perform this assert is in initialize_reloptions. Thoughts?
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello
Re: [HACKERS] pg_regress writes into source tree
On Wed, Apr 8, 2015 at 10:28 AM, Jim Nasby wrote: Just so this doesn't get lost... did something make it into a CommitFest on this? Peter's patch has been committed as 64cdbbc, while the idea to create sql/ by pg_regress if it is not present did not gather much interest in this CF: https://commitfest.postgresql.org/4/100/ -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Really bad blowups with hash outer join and nulls
On 2/15/15 7:16 PM, Tomas Vondra wrote:

Hi,

On 16.2.2015 00:50, Andrew Gierth wrote: "Tom" == Tom Lane t...@sss.pgh.pa.us writes: I've now tried the attached patch to correct the bucketsize estimates, and it does indeed stop the planner from considering the offending path (in this case it just does the join the other way round). One thing I couldn't immediately see how to do was account for the case where there are a lot of nulls in the table but a strict qual (or an IS NOT NULL) filters them out; this patch will be overly pessimistic about such cases. Do estimates normally try and take things like this into account? I didn't find any other relevant examples.

Improving the estimates is always good, but it's not going to fix the case of non-NULL values (it shouldn't be all that difficult to create such examples with a value whose hash starts with a bunch of zeroes).

Tom: There may also be something we can do in the executor, but it would take closer analysis to figure out what's going wrong. I don't think kluging the behavior for NULL in particular is the answer.

The point with nulls is that a hash value of 0 is currently special in two distinct ways: it's always in batch 0 and bucket 0 regardless of how many batches and buckets there are, and it's the result of hashing a null. These two special cases interact in a worst-case manner, so it seems worthwhile to avoid that.

I think you're right this is a flaw in general - all it takes is a sufficiently common value with a hash value falling into the first batch (i.e. either 0 or starting with a lot of 0 bits, thus giving hashno==0). I think this might be solved by relaxing the check a bit. Currently we do this (see nodeHash.c:735):

    if (nfreed == 0 || nfreed == ninmemory)
    {
        hashtable->growEnabled = false;
    }

which only disables nbatch growth if *all* the tuples remain in the batch. That's rather strict, and it takes a single tuple to break this. With each nbatch increase we expect a 50:50 split, i.e. 1/2 the tuples staying in the batch and 1/2 moved to the new one. So a much higher ratio is suspicious and most likely means the same hash value, so what if we did something like this:

    if ((nfreed >= 0.9 * (nfreed + ninmemory)) ||
        (nfreed <= 0.1 * (nfreed + ninmemory)))
    {
        hashtable->growEnabled = false;
    }

which disables nbatch growth if either >=90% of the tuples remained in the batch, or >=90% of the tuples were moved from it. The exact thresholds might be set a bit differently, but I think 10:90 sounds about right. A trivial patch implementing this is attached - with it the explain analyze looks like this:

test=# explain analyze select status, count(*) from q3 left join m3 on (m3.id=q3.id) group by status;

    QUERY PLAN
    --------------------------------------------------------------
    HashAggregate (cost=64717.63..64717.67 rows=4 width=4) (actual time=514.619..514.630 rows=5 loops=1)
      Group Key: m3.status
      -> Hash Right Join (cost=18100.00..62217.63 rows=50 width=4) (actual time=75.260..467.911 rows=500108 loops=1)
           Hash Cond: (m3.id = q3.id)
           -> Seq Scan on m3 (cost=0.00..18334.00 rows=100 width=37) (actual time=0.003..91.799 rows=100 loops=1)
           -> Hash (cost=7943.00..7943.00 rows=50 width=33) (actual time=74.916..74.916 rows=50 loops=1)
                Buckets: 65536 (originally 65536) Batches: 32 (originally 16) Memory Usage: 8824kB
                -> Seq Scan on q3 (cost=0.00..7943.00 rows=50 width=33) (actual time=0.005..27.395 rows=50 loops=1)
    Planning time: 0.172 ms
    Execution time: 514.663 ms
    (10 rows)

Without the patch this runs in ~240 seconds and the number of batches explodes to ~131k.

In theory it might happen that there are just a few hash values and all of them are exactly the same within the first N bits (thus falling into the first batch), but the N+1 bits are different. So the next split would actually help. But I think we can leave that up to probability, just like all the hash code.

Anything happen with this, or the patch Andrew posted?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Replication identifiers, take 4
On Tue, Apr 7, 2015 at 11:37 PM, Andres Freund and...@anarazel.de wrote: On 2015-04-07 16:30:25 +0200, Andres Freund wrote: And with temp tables (or much more extremely WITH OID tables) and such it's not that hard to reach that point. Oh, and obviously toast data. A couple tables with toasted columns is also a good way to rapidly consume oids. You are forgetting as well large objects on the stack, when client application does not assign an OID by itself. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0
On Mon, Mar 30, 2015 at 9:20 AM, Peter Geoghegan p...@heroku.com wrote: * The code uses LockTupleExclusive to lock rows. That means the fkey key locking doesn't work; That's annoying. This means that using upsert will potentially cause deadlocks in other sessions :(. I think you'll have to determine what lock to acquire by fetching the tuple, doing the key comparison, and acquire the appropriate lock. That should be fine. Looking into the implementation of this. As we discussed, the difficulty about avoiding a lock escalation within ExecUpdate() is that we must fetch the row, run EvalPlanQual() to check if the new row version generated by updating will require a LockTupleExclusive or LockTupleNoKeyExclusive, and then lock the row using the right lockmode, and only then call ExecUpdate(). Right now, UPSERT benefits from fetching and locking the row together, so going this way imposes a little additional complexity. It's probably worth it, though. Why do you think deadlocks will be a particular concern? Sure, it could make the difference between deadlocking and not deadlocking, which is bad, but it's not obvious to me that the risk would be any worse than the risk of deadlocking with FKs in 9.2. Is that really all that bad? -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bringing text abbreviation debug output under the control of trace_sort
On Sun, Apr 5, 2015 at 3:59 PM, Peter Geoghegan p...@heroku.com wrote: Attached patch makes trace_sort control abbreviation debug output for the text opclass, which makes it consistent with the numeric opclass. This seems better than relying on someone going to the trouble of building Postgres themselves to debug cases where abbreviation is the wrong thing, which we're not 100% sure will not occur. It also allows wider analysis of where abbreviation helps the most and the least in production, which is surely a good thing. I have added this to the next commitfest, but I suggest that it be quickly committed to the master branch as a maintenance commit. Done. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bringing text abbreviation debug output under the control of trace_sort
On Tue, Apr 7, 2015 at 7:46 PM, Robert Haas robertmh...@gmail.com wrote: Done. Thank you. -- Peter Geoghegan
Re: [HACKERS] pg_dump / copy bugs with big lines ?
On Mon, Apr 6, 2015 at 1:51 PM, Jim Nasby jim.na...@bluetreble.com wrote: In any case, I don't think it would be terribly difficult to allow a bit more than 1GB in a StringInfo. Might need to tweak palloc too; ISTR there's some 1GB limits there too. The point is, those limits are there on purpose. Changing things arbitrarily wouldn't be hard, but doing it in a principled way is likely to require some thought. For example, in the COPY OUT case, presumably what's happening is that we palloc a chunk for each individual datum, and then palloc a buffer for the whole row. Now, we could let the whole-row buffer be bigger, but maybe it would be better not to copy all of the (possibly very large) values for the individual columns over into a row buffer before sending it. Some refactoring that avoids the need for a potentially massive (1.6TB?) whole-row buffer would be better than just deciding to allow it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
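The refactoring suggested here — stream each field as it is produced rather than assembling one potentially huge whole-row buffer — can be illustrated with a small sketch. This is hypothetical Python, not the actual COPY OUT code; the point is only that streaming bounds memory per column instead of per row:

```python
# Hedged sketch of the idea: with a whole-row buffer, the ~1GB
# allocation limit applies to the concatenation of all column values;
# with streaming, only one datum needs to be materialized at a time.

def copy_out_row_buffered(values):
    """Whole-row-buffer style: build the entire output row in memory."""
    return "\t".join(values) + "\n"

def copy_out_row_streamed(values, send):
    """Streamed style: emit fields and separators one piece at a time,
    so peak memory is bounded by the largest single datum."""
    for i, v in enumerate(values):
        if i:
            send("\t")
        send(v)
    send("\n")

chunks = []
copy_out_row_streamed(["1", "hello", "world"], chunks.append)
assert "".join(chunks) == copy_out_row_buffered(["1", "hello", "world"])
```

The streamed form produces byte-identical output while never holding the full row, which is the property that would let rows exceed the single-allocation limit.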
Re: [HACKERS] rejected vs returned with feedback in new CF app
On Tue, Apr 7, 2015 at 3:35 PM, Peter Eisentraut pete...@gmx.net wrote: On 4/7/15 3:33 PM, Tom Lane wrote: I tried to mark the UPDATE SET (*) patch as returned with feedback, but the CF app informed me that if I did that the patch would automatically be moved to the next commitfest. That seems completely stupid. There is no need to reconsider it unless a new version of the patch is forthcoming (which there may or may not ever be, but that's beside the point for now). When and if the author does submit a new patch, that would be the time to include it in the next commitfest, no? I noticed that as well and have avoided closing some patches because of it. Several people, including me, have complained about this before. I hope that Magnus will fix it soon. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] libpq's multi-threaded SSL callback handling is busted
On 4/3/15 7:44 AM, Jan Urbański wrote: To reiterate: I recognise that not handling the callbacks is not the right answer. But not stomping on callbacks others might have set sounds like a minimal and safe improvement. I think your patch is okay in that it is not a good idea to overwrite or unset someone else's callbacks. But we shouldn't mistake that for fixing the underlying problem. The only reason this patch appears to fix the presented test cases is because other OpenSSL users are also misbehaving and/or the OpenSSL interfaces are so stupid that they cannot be worked with reasonably.
Re: [HACKERS] rejected vs returned with feedback in new CF app
On Wed, Apr 8, 2015 at 11:57 AM, Robert Haas robertmh...@gmail.com wrote: On Tue, Apr 7, 2015 at 3:35 PM, Peter Eisentraut pete...@gmx.net wrote: On 4/7/15 3:33 PM, Tom Lane wrote: I tried to mark the UPDATE SET (*) patch as returned with feedback, but the CF app informed me that if I did that the patch would automatically be moved to the next commitfest. That seems completely stupid. There is no need to reconsider it unless a new version of the patch is forthcoming (which there may or may not ever be, but that's beside the point for now). When and if the author does submit a new patch, that would be the time to include it in the next commitfest, no? I noticed that as well and have avoided closing some patches because of it. Several people, including me, have complained about this before. I hope that Magnus will fix it soon. Yeah, you can find references about that here and there... And the current behavior is confusing. -- Michael
Re: [HACKERS] pg_dump / copy bugs with big lines ?
On Wed, Apr 8, 2015 at 11:53 AM, Robert Haas robertmh...@gmail.com wrote: On Mon, Apr 6, 2015 at 1:51 PM, Jim Nasby jim.na...@bluetreble.com wrote: In any case, I don't think it would be terribly difficult to allow a bit more than 1GB in a StringInfo. Might need to tweak palloc too; ISTR there's some 1GB limits there too. The point is, those limits are there on purpose. Changing things arbitrarily wouldn't be hard, but doing it in a principled way is likely to require some thought. For example, in the COPY OUT case, presumably what's happening is that we palloc a chunk for each individual datum, and then palloc a buffer for the whole row. Now, we could let the whole-row buffer be bigger, but maybe it would be better not to copy all of the (possibly very large) values for the individual columns over into a row buffer before sending it. Some refactoring that avoids the need for a potentially massive (1.6TB?) whole-row buffer would be better than just deciding to allow it. Something to be aware of is that this will also require some rethinking of the existing libpq functions that fetch a row during COPY with PQgetCopyData, to make them able to fetch chunks of data from one row. -- Michael
Re: [HACKERS] Parallel Seq Scan
On Wed, Apr 8, 2015 at 1:53 AM, Kevin Grittner kgri...@ymail.com wrote: David Rowley dgrowle...@gmail.com wrote: If we attempt to do this parallel stuff at plan time, and we happen to plan at some quiet period, or perhaps worse, some application's start-up process happens to PREPARE a load of queries when the database is nice and quiet, then quite possibly we'll end up with some highly parallel queries. Then perhaps come the time these queries are actually executed the server is very busy... Things will fall apart quite quickly due to the masses of IPC and context switches that would be going on. I completely understand that this parallel query stuff is all quite new to us all and we're likely still trying to nail down the correct infrastructure for it to work well, so this is why I'm proposing that the planner should know nothing of parallel query; instead I think it should work more along the lines of: * Planner should be completely oblivious to what parallel query is. * Before executor startup the plan is passed to a function which decides if we should parallelise it, and does so if the plan meets the correct requirements. This should likely have a very fast exit path such as: if root node's cost < parallel_query_cost_threshold then return; /* the query is not expensive enough to attempt to make parallel */ The above check will allow us to have an almost zero overhead for small low-cost queries. This function would likely also have some sort of logic in order to determine if the server has enough spare resources at the current point in time to allow queries to be parallelised. There is a lot to like about this suggestion. I've seen enough performance crashes due to too many concurrent processes (even when each connection can only use a single process) to doubt that, for a plan which will be saved, it is possible to know at planning time whether parallelization will be a nice win or a devastating over-saturation of resources during some later execution phase.
Another thing to consider is that this is not entirely unrelated to the concept of admission control policies. Perhaps this phase could be a more general execution start-up admission control phase, where parallel processing would be one adjustment that could be considered. I think there is always a chance that resources (like parallel workers) won't be available at run time even if we decide about them at executor-start phase, unless we block them for that node's usage; and OTOH if we block those resources (by allocating them) during the executor-start phase, then we might end up blocking them too early, or maybe they won't even get used if we decide not to execute that node. On that basis, it seems to me the current strategy is not bad: we decide during planning time, and later during execution time, if not all resources (particularly parallel workers) are available, we use only the available ones to execute the plan. Going forward, I think we can improve this if we decide not to shut down parallel workers until postmaster shutdown once they are started, and then just allocate them during the executor-start phase. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
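David's proposed fast exit path, combined with the admission-control idea raised here, might look roughly like the following. This is an illustrative Python sketch with invented names (including the `parallel_query_cost_threshold` GUC floated in the mail), not a proposed implementation:

```python
# Hedged sketch of an executor-startup admission check: cheap queries
# exit immediately with zero extra overhead; otherwise we parallelize
# only to the degree that spare worker processes are available right now.

def maybe_parallelize(plan, parallel_query_cost_threshold, spare_workers):
    """Return the plan, possibly annotated with a parallel degree.
    'plan' is a dict with 'total_cost' and 'max_useful_workers'."""
    if plan["total_cost"] < parallel_query_cost_threshold:
        return plan  # not expensive enough to attempt to make parallel
    if spare_workers <= 0:
        return plan  # no capacity at this moment: run the serial plan
    degree = min(spare_workers, plan["max_useful_workers"])
    return dict(plan, parallel_degree=degree)

cheap = {"total_cost": 10.0, "max_useful_workers": 4}
assert "parallel_degree" not in maybe_parallelize(cheap, 1000.0, 8)

big = {"total_cost": 5000.0, "max_useful_workers": 4}
assert maybe_parallelize(big, 1000.0, 2)["parallel_degree"] == 2
```

Amit's objection applies directly to the third branch: the `spare_workers` figure observed at executor start is only a snapshot, and the workers may be gone (or wasted) by the time the node actually runs.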
Re: [HACKERS] Tuple visibility within a single XID
On Tue, Apr 7, 2015 at 5:59 PM, Jim Nasby jim.na...@bluetreble.com wrote: My understanding is that all subtransactions get their own unique XID (assuming they need one), and that CommandId can't move backwards within a transaction. If that's correct, then shouldn't we be able to prune tuples where XMIN and XMAX match our *exact* XID (not all the extra stuff that TransactionIdIsCurrentTransactionId() does) and CommandId < CurrentCommandId? No. For one thing, unique index enforcement still requires the tuples to be treated as a conflict while the other transaction is running IMV. For another, this is necessary today (from ExecUpdate()): /* * The target tuple was already updated or deleted by the * current command, or by a later command in the current * transaction. The former case is possible in a join UPDATE * where multiple tuples join to the same target tuple. This * is pretty questionable, but Postgres has always allowed it: * we just execute the first update action and ignore * additional update attempts. * * The latter case arises if the tuple is modified by a * command in a BEFORE trigger, or perhaps by a command in a * volatile function used in the query. In such situations we * should not ignore the update, but it is equally unsafe to * proceed. We don't want to discard the original UPDATE * while keeping the triggered actions based on it; and we * have no principled way to merge this update with the * previous ones. So throwing an error is the only safe * course. * * If a trigger actually intends this type of interaction, it * can re-execute the UPDATE (assuming it can figure out how) * and then return NULL to cancel the outer update. 
*/
if (hufd.cmax != estate->es_output_cid)
    ereport(ERROR,
            (errcode(ERRCODE_TRIGGERED_DATA_CHANGE_VIOLATION),
             errmsg("tuple to be updated was already modified by an operation triggered by the current command"),
             errhint("Consider using an AFTER trigger instead of a BEFORE trigger to propagate changes to other rows.")));
You're not the first to consider trying something like this in specific scenarios, but my work on UPSERT leads me to believe it isn't workable. -- Peter Geoghegan
Re: [HACKERS] New error code to track unsupported contexts
Jim Nasby wrote: On 11/28/14 11:41 PM, Michael Paquier wrote: Hi all, When pg_event_trigger_dropped_objects is run in a context that is not the one of an event trigger, currently the error code ERRCODE_FEATURE_NOT_SUPPORTED is returned. Wouldn't it be better to have an error to define an out-of-context instead? It seems that it would be a good thing to have more error verbosity for situations like the case above. Note that this idea has been mentioned on this ML a couple of weeks back. In any case, attached is a patch showing the idea. Opinions? Is that worth having? Anything ever happen with this? (FWIW, I'm in favor of it. Reporting the feature isn't supported is confusing...) Not opposed to the idea. Maybe it should be in class 39 'External Routine Invocation Exception' instead, like ERRCODE_E_R_I_E_TRIGGER_PROTOCOL_VIOLATED is used by various trigger functions. We could invent ERRCODE_E_R_I_E_EVENT_TRIGGER_PROTOCOL_VIOLATED with value 39P03, for example. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
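For context on how a value like 39P03 becomes an internal error code: PostgreSQL's elog.h packs each of the five SQLSTATE characters into 6 bits via the MAKE_SQLSTATE macro. The sketch below mirrors that encoding in Python for illustration; it is a reconstruction of the macro's arithmetic, not server code:

```python
# Hedged sketch of PostgreSQL's SQLSTATE packing (MAKE_SQLSTATE /
# PGSIXBIT in src/include/utils/elog.h): each of the five characters
# is stored as (ch - '0') & 0x3F in a 6-bit field of one integer.

def make_sqlstate(code):
    """Pack a five-character SQLSTATE like '39P03' into an int."""
    assert len(code) == 5
    val = 0
    for i, ch in enumerate(code):
        val |= ((ord(ch) - ord('0')) & 0x3F) << (6 * i)
    return val

def unpack_sqlstate(val):
    """Inverse operation, analogous to unpack_sql_state()."""
    return "".join(chr(((val >> (6 * i)) & 0x3F) + ord('0'))
                   for i in range(5))

# The proposed event-trigger error code round-trips cleanly:
assert unpack_sqlstate(make_sqlstate("39P03")) == "39P03"
```

Any new ERRCODE_E_R_I_E_EVENT_TRIGGER_PROTOCOL_VIOLATED would just be this packed form of "39P03" registered in errcodes.txt.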
Re: [HACKERS] pg_regress writes into source tree
On 2/13/15 11:20 PM, Michael Paquier wrote: On Sat, Feb 14, 2015 at 6:24 AM, Andrew Dunstan and...@dunslane.net wrote: On 12/18/2014 06:05 PM, Andrew Dunstan wrote: On 12/18/2014 03:02 AM, Michael Paquier wrote: On Fri, Dec 12, 2014 at 10:45 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Another thing in that patch was that I had to add the sql/ directory to the source tree, but other than that .gitignore file it was empty. Maybe pg_regress should create the sql/ directory in the build dir if it doesn't exist. This is only a problem if a pg_regress suite only runs stuff from input/, because otherwise the sql/ dir already exists in the source. +1 for having pg_regress create the sql/ directory when it does not exist. Current behavior is annoying when modules having only tests in input/... That seems like a separate issue. I think Peter should commit his patch and backpatch it immediately, and we can deal with the missing sql directory when someone sends in a patch. What's happened on this? Nothing has been committed, and as far as I understood this patch could have been simply pushed... Just so this doesn't get lost... did something make it into a CommitFest on this? -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Parallel Seq Scan
On Sat, Apr 4, 2015 at 5:19 AM, David Rowley dgrowle...@gmail.com wrote: Going over the previous emails in this thread I see that it has been a long time since anyone discussed anything around how we might decide at planning time how many workers should be used for the query, and from the emails I don't recall anyone proposing a good idea about how this might be done, and I for one can't see how this is at all possible to do at planning time. I think that the planner should know nothing of parallel query at all, and the planner quite possibly should go completely unmodified for this patch. One major problem I can see is that, given a query such as: SELECT * FROM million_row_product_table WHERE category = 'ELECTRONICS'; Where we have a non-unique index on category, some plans which may be considered might be: 1. Index scan on the category index to get all rows matching 'ELECTRONICS' 2. Sequence scan on the table, filter matching rows. 3. Parallel plan which performs a series of partial sequence scans pulling out all matching rows. I really think that if we end up choosing things like plan 3, when plan 2 was thrown out because of its cost, then we'll end up consuming more CPU and I/O than we can possibly justify using. The environmentalist in me screams that this is wrong. What if we kicked off 128 worker processes on some high-end hardware to do this? I certainly wouldn't want to pay the power bill. I understand there's costing built in to perhaps stop this, but I still think it's wrong-headed, and we need to still choose the fastest non-parallel plan and only consider parallelising that later. I agree that this is an area that needs more thought. I don't (currently, anyway) agree that the planner shouldn't know anything about parallelism. The problem with that is that there's lots of relevant stuff that can only be known at plan time. For example, consider the query you mention above on a table with no index. 
If the WHERE clause is highly selective, a parallel plan may well be best. But if the selectivity is only, say, 50%, a parallel plan is stupid: the IPC costs of shipping many rows back to the master will overwhelm any benefit we could possibly have hoped to get, and the overall result will likely be that the parallel plan both runs slower and uses more resources. At plan time, we have the selectivity information conveniently at hand, and can use that as part of the cost model to make educated decisions. Execution time is way too late to be thinking about those kinds of questions. I think one of the philosophical questions that has to be answered here is what does it mean to talk about the cost of a parallel plan?. For a non-parallel plan, the cost of the plan means both the amount of effort we will spend executing the plan and also the amount of time we think the plan will take to complete, but those two things are different for parallel plans. I'm inclined to think it's right to view the cost of a parallel plan as a proxy for execution time, because the fundamental principle of the planner is that we pick the lowest-cost plan. But there also clearly needs to be some way to prevent the selection of a plan which runs slightly faster at the cost of using vastly more resources. Currently, the planner tracks the best unsorted path for each relation as well as the best path for each useful sort order. Suppose we treat parallelism as another axis for judging the quality of a plan: we keep the best unsorted, non-parallel path; the best non-parallel path for each useful sort order; the best unsorted, parallel path; and the best parallel path for each sort order. Each time we plan a node, we generate non-parallel paths first, and then parallel paths. But, if a parallel plan isn't markedly faster than the non-parallel plan for the same sort order, then we discard it. 
I'm not sure exactly what the thresholds should be here, and they probably need to be configurable, because on a single-user system with excess capacity available it may be absolutely desirable to use ten times the resources to get an answer 25% faster, but on a heavy-loaded system that will stink. Some ideas for GUCs: max_parallel_degree = The largest number of processes we'll consider using for a single query. min_parallel_speedup = The minimum percentage by which a parallel path must be cheaper (in terms of execution time) than a non-parallel path in order to survive. I'm imagining the default here might be something like 15%. min_parallel_speedup_per_worker = Like the previous one, but per worker. e.g. if this is 5%, which might be a sensible default, then a plan with 4 workers must be at least 20% better to survive, but a plan using only 2 workers only needs to be 10% better. An additional benefit of this line of thinking is that planning would always produce a best non-parallel path. And sometimes, there would also be a best parallel path that is expected to run faster. We could then choose
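The pruning rule proposed above can be sketched concretely. The GUC names and defaults here are the ones floated in the mail, not anything committed, and how the flat and per-worker thresholds would combine is unspecified in the proposal — taking the larger of the two is an assumption of this sketch:

```python
# Hedged sketch of the proposed parallel-path pruning rule: a parallel
# path survives only if its (execution-time) cost beats the serial path
# for the same sort order by a sufficient margin.

def keep_parallel_path(serial_cost, parallel_cost, n_workers,
                       min_parallel_speedup=0.15,
                       min_parallel_speedup_per_worker=0.05):
    """Return True if the parallel path should be retained.
    Required speedup: the larger of the flat threshold and the
    per-worker threshold times the worker count (assumed combination)."""
    required = max(min_parallel_speedup,
                   min_parallel_speedup_per_worker * n_workers)
    return parallel_cost <= serial_cost * (1.0 - required)

# With the defaults, a 4-worker plan must be at least 20% cheaper:
assert not keep_parallel_path(100.0, 85.0, 4)  # only 15% better: discard
assert keep_parallel_path(100.0, 75.0, 4)      # 25% better: keep
```

Applying this test each time a node is planned is what keeps the path lists from filling up with parallel plans that burn many workers for marginal wins.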
Re: [HACKERS] Tuple visibility within a single XID
On Tue, Apr 7, 2015 at 7:16 PM, Qingqing Zhou zhouqq.postg...@gmail.com wrote: If another transaction T2 coming later than T1, and if we prune early, then T1 can suddenly hang on insertion waiting for T2 to complete. But does this violate any isolation rule? Well, it means that you don't lock a row that you delete (and values that appeared in that row if they're constrained by a unique index) just because you happened to also insert that row. That would probably break client code. -- Peter Geoghegan
[HACKERS] Re: File count restriction of directory limits number of relations inside a database.
Hi, yes, you are right: ext4 allows more directory entries (more than 32000), but we limited the number of files inside the directory to 32000 to get better performance. Sorry, I did not mention that in my post. That is the reason we plan to use tablespaces. The problem we faced with tablespaces is that the location must be present on both master and slave, and we need to create multiple tablespaces. That is why I changed the source to create a subdirectory under the given location and use that location for tablespace creation, so I can give one location (present on both master and slave) to create multiple tablespaces. - sudalai -- View this message in context: http://postgresql.nabble.com/File-count-restriction-of-directory-limits-number-of-relations-inside-a-database-tp5844711p5845044.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
Re: [HACKERS] Parallel Seq Scan
On Wed, Apr 8, 2015 at 7:54 AM, Robert Haas robertmh...@gmail.com wrote: I agree that this is an area that needs more thought. I don't (currently, anyway) agree that the planner shouldn't know anything about parallelism. The problem with that is that there's lots of relevant stuff that can only be known at plan time. For example, consider the query you mention above on a table with no index. If the WHERE clause is highly selective, a parallel plan may well be best. But if the selectivity is only, say, 50%, a parallel plan is stupid: the IPC costs of shipping many rows back to the master will overwhelm any benefit we could possibly have hoped to get, and the overall result will likely be that the parallel plan both runs slower and uses more resources. At plan time, we have the selectivity information conveniently at hand, and can use that as part of the cost model to make educated decisions. Execution time is way too late to be thinking about those kinds of questions. I think one of the philosophical questions that has to be answered here is what does it mean to talk about the cost of a parallel plan?. For a non-parallel plan, the cost of the plan means both the amount of effort we will spend executing the plan and also the amount of time we think the plan will take to complete, but those two things are different for parallel plans. I'm inclined to think it's right to view the cost of a parallel plan as a proxy for execution time, because the fundamental principle of the planner is that we pick the lowest-cost plan. But there also clearly needs to be some way to prevent the selection of a plan which runs slightly faster at the cost of using vastly more resources. Currently, the planner tracks the best unsorted path for each relation as well as the best path for each useful sort order. 
Suppose we treat parallelism as another axis for judging the quality of a plan: we keep the best unsorted, non-parallel path; the best non-parallel path for each useful sort order; the best unsorted, parallel path; and the best parallel path for each sort order. Each time we plan a node, we generate non-parallel paths first, and then parallel paths. But, if a parallel plan isn't markedly faster than the non-parallel plan for the same sort order, then we discard it. One disadvantage of retaining parallel-paths could be that it can increase the number of combinations planner might need to evaluate during planning (in particular during join path evaluation) unless we do some special handling to avoid evaluation of such combinations. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
On Wed, Apr 8, 2015 at 1:57 AM, Sawada Masahiko sawada.m...@gmail.com wrote: On Wed, Apr 8, 2015 at 1:11 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Tue, Apr 7, 2015 at 1:04 PM, Sawada Masahiko sawada.m...@gmail.com wrote: On Tue, Apr 7, 2015 at 10:12 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Tue, Apr 7, 2015 at 7:22 AM, Sawada Masahiko sawada.m...@gmail.com wrote: Thank you for your reviewing. I modified the patch and attached latest version patch(v7). Please have a look it. Looks good to me. Attached patch (v8) just fix a tab indentation in gram.y. I had forgotten fix a tab indentation, sorry. Thank you for reviewing! It looks good to me too. Can this patch be marked as Ready for Committer? +1 Changed status to Ready for Committer. The patch adds new syntax like REINDEX ... WITH VERBOSE, i.e., () is not added after WITH clause. Did we reach the consensus about this syntax? The last email from Robert just makes me think that () should be added into the syntax. Regards, -- Fujii Masao
[HACKERS] Removal of FORCE option in REINDEX
Hi, While reviewing the REINDEX VERBOSE patch, I felt inclined to remove FORCE option support from the REINDEX command. It has been marked obsolete since the very old version 7.4. I think that it's no longer worth keeping support for it. Thoughts? Regards, -- Fujii Masao
Re: [HACKERS] Removal of FORCE option in REINDEX
On Wednesday, April 8, 2015, Fujii Masao masao.fu...@gmail.com wrote: Hi, While reviewing the REINDEX VERBOSE patch, I felt inclined to remove FORCE option support from REINDEX command. It has been marked obsolete since very old version 7.4. I think that it's no longer worth keeping supporting it. Thought? +1 -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog: http://fabriziomello.github.io Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello Github: http://github.com/fabriziomello