Re: [HACKERS] pg_rewind and log messages
On Tue, Apr 7, 2015 at 11:59 AM, Michael Paquier michael.paqu...@gmail.com wrote:
> On Mon, Apr 6, 2015 at 9:10 PM, Fujii Masao masao.fu...@gmail.com wrote:
>> I'm not familiar with native language support (sorry), but don't we need
>> to add the shortcut of gettext into every calls of pg_log and pg_fatal,
>> e.g., change pg_fatal("xxx") to pg_fatal(_("xxx"))? I know that fprintf()
>> in pg_Log_v() has such shortcut, but I'm not sure if that's enough or not.
>
> I think I addressed those things...
>
>> 	case PG_FATAL:
>> -		printf("\n%s", _(message));
>> -		printf("%s", _("Failure, exiting\n"));
>> +		fprintf(stderr, _("\n%s: fatal: %s\n"), progname, message);
>> +		fprintf(stderr, _("Failure, exiting\n"));
>>
>> Why do we need to output the error level like "fatal"? This seems also
>> inconsistent with the log message policy of other tools.
>
> initdb and psql do not for a couple of warning messages, but perhaps I am
> just taking consistency with the original code too seriously.
>
>> Why do we need to output \n at the head of the message? The second
>> message "Failure, exiting" is really required?
>
> I think that those things were here to highlight the fact that this is a
> fatal bailout, but the error code would help the same way I guess... I
> eliminated a bunch of newlines in the log messages that seemed really
> unnecessary to me, simplifying a bit the whole.
>
> While hacking this stuff, I noticed as well that pg_rewind could be
> called as root on non-Windows platforms; that's dangerous from a security
> point of view as the process manipulates files in PGDATA. Hence let's
> block that. On Windows, a restricted token should be used.

Good catch! I think it's better to implement it as a separate patch because it's very different from the log message problem.

> Attached are new patches:
> - 0001 fixes the restriction issues: restricted token on Windows, and
>   forbid the root user on non-Windows.

Thanks for the patch! I'd like to push this patch at first. But one comment is:

+				 "You must run %s as the PostgreSQL superuser.\n",

Isn't the term "PostgreSQL superuser" confusing? I'm afraid that a user might confuse "PostgreSQL superuser" with a database superuser. I see you just borrowed that term from pg_resetxlog.c, though.

BTW, initdb and pg_ctl also have the same check and the error message is as follows. Isn't this better? Thoughts?

	"%s: cannot be run as root\n"
	"Please log in (using, e.g., \"su\") as the (unprivileged) user that will\n"
	"own the server process.\n"

Regards,

--
Fujii Masao

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_rewind and log messages
On Tue, Apr 7, 2015 at 4:33 PM, Fujii Masao wrote:
> Isn't the term "PostgreSQL superuser" confusing? I'm afraid that a user
> might confuse "PostgreSQL superuser" with a database superuser. I see you
> just borrowed that term from pg_resetxlog.c, though. BTW, initdb and
> pg_ctl also have the same check and the error message is as follows.
> Isn't this better? Thoughts?
>
> 	"%s: cannot be run as root\n"
> 	"Please log in (using, e.g., \"su\") as the (unprivileged) user that will\n"
> 	"own the server process.\n"

I'm fine with that as well. Why not update the message of pg_resetxlog as well then? It would be good to be consistent.

I guess that the call to set_pglocale_pgservice() should be added as well in the first patch (see last version upthread), as this has nothing to do with the logging issues.

--
Michael
Re: [HACKERS] pg_rewind and log messages
On Tue, Apr 7, 2015 at 4:16 PM, Fujii Masao masao.fu...@gmail.com wrote:
> On Mon, Apr 6, 2015 at 10:01 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
>> Fujii Masao wrote:
>>> On Mon, Apr 6, 2015 at 5:33 PM, Michael Paquier michael.paqu...@gmail.com wrote:
>>>> On Mon, Apr 6, 2015 at 1:41 PM, Michael Paquier wrote:
>>>>> I guess that you are working on a patch? If not, you are looking for one?
>>>>
>>>> Code-speaking, this gives the patch attached.
>>>
>>> Thanks! Here are the review comments:
>>>
>>> I'm not familiar with native language support (sorry), but don't we need
>>> to add the shortcut of gettext into every calls of pg_log and pg_fatal,
>>> e.g., change pg_fatal("xxx") to pg_fatal(_("xxx"))? I know that fprintf()
>>> in pg_Log_v() has such shortcut, but I'm not sure if that's enough or not.
>>
>> It's not necessary for pg_fatal and the like, because those functions are
>> marked to have their first argument automatically translated in nls.mk.
>> This means that the string literal is automatically extracted into
>> pg_rewind.pot for translators. Of course, the function itself must call
>> _() (or some variant thereof) to actually fetch the translated string at
>> run time.
>
> Understood. Thanks!
>
> BTW, as far as I read pg_rewind's nls.mk correctly, it also has two problems.
> (1) file_ops.c should be added into GETTEXT_FILES.

And ../../common/restricted_tokens.c.

> (2) pg_log should be pg_log:2 in GETTEXT_TRIGGERS

Seems so.

--
Michael

From e5e08188c33adb74fc722c29e660832d88fdd765 Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Mon, 6 Apr 2015 17:18:21 +0900
Subject: [PATCH 1/2] Fix process handling of pg_rewind

To begin with, pg_rewind should not be allowed to run as root on
non-Windows platforms as it manipulates data folders, and file
permissions. On Windows platforms, it can run under a user that has
Administrator rights but in this case a restricted token needs to be
used. Also add a call to set_pglocale_pgservice() that was missing.
---
 src/bin/pg_rewind/nls.mk      |  2 +-
 src/bin/pg_rewind/pg_rewind.c | 17 +
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/src/bin/pg_rewind/nls.mk b/src/bin/pg_rewind/nls.mk
index e43f3b9..69e87d1 100644
--- a/src/bin/pg_rewind/nls.mk
+++ b/src/bin/pg_rewind/nls.mk
@@ -1,7 +1,7 @@
 # src/bin/pg_rewind/nls.mk
 CATALOG_NAME     = pg_rewind
 AVAIL_LANGUAGES  =
-GETTEXT_FILES    = copy_fetch.c datapagemap.c fetch.c filemap.c libpq_fetch.c logging.c parsexlog.c pg_rewind.c timeline.c ../../common/fe_memutils.c ../../../src/backend/access/transam/xlogreader.c
+GETTEXT_FILES    = copy_fetch.c datapagemap.c fetch.c filemap.c libpq_fetch.c logging.c parsexlog.c pg_rewind.c timeline.c ../../common/fe_memutils.c ../../common/restricted_token.c ../../../src/backend/access/transam/xlogreader.c
 GETTEXT_TRIGGERS = pg_log pg_fatal report_invalid_record:2
 GETTEXT_FLAGS    = pg_log:2:c-format \

diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c
index dda3a79..04d6a46 100644
--- a/src/bin/pg_rewind/pg_rewind.c
+++ b/src/bin/pg_rewind/pg_rewind.c
@@ -24,6 +24,7 @@
 #include "access/xlog_internal.h"
 #include "catalog/catversion.h"
 #include "catalog/pg_control.h"
+#include "common/restricted_token.h"
 #include "getopt_long.h"
 #include "storage/bufpage.h"
@@ -102,6 +103,7 @@ main(int argc, char **argv)
 	TimeLineID	endtli;
 	ControlFileData ControlFile_new;

+	set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_rewind"));
 	progname = get_progname(argv[0]);

 	/* Process command-line arguments */
@@ -174,6 +176,21 @@ main(int argc, char **argv)
 		exit(1);
 	}

+	/*
+	 * Don't allow pg_rewind to be run as root, to avoid overwriting the
+	 * ownership of files in the data directory. We need only check for root
+	 * -- any other user won't have sufficient permissions to modify files in
+	 * the data directory.
+	 */
+#ifndef WIN32
+	if (geteuid() == 0)
+		pg_fatal("cannot be executed by \"root\"\n"
+				 "You must run %s as the PostgreSQL superuser.\n",
+				 progname);
+#endif
+
+	get_restricted_token(progname);
+
 	/* Connect to remote server */
 	if (connstr_source)
 		libpqConnect(connstr_source);
--
2.3.5

From 9bd5eec75cceb8a12d26a9e16dabc2e23294c148 Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Mon, 6 Apr 2015 17:18:21 +0900
Subject: [PATCH 2/2] Fix inconsistent handling of logs in pg_rewind

pg_rewind was handling a couple of things differently compared to the
other src/bin utilities:
- Logging output needs to be flushed on stderr, not stdout
- Logging messages should be prefixed with the application name
- pg_fatal exits with error code of 1, but it was sometimes followed by
  subsequent logs, making this information lost in the process.
---
 src/bin/pg_rewind/copy_fetch.c  | 24 +-
 src/bin/pg_rewind/datapagemap.c |  4 ++-
 src/bin/pg_rewind/file_ops.c    | 30 +++---
 src/bin/pg_rewind/filemap.c     | 16 ++--
 src/bin/pg_rewind/libpq_fetch.c | 52
Re: [HACKERS] pg_rewind and log messages
On Mon, Apr 6, 2015 at 10:01 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
> Fujii Masao wrote:
>> On Mon, Apr 6, 2015 at 5:33 PM, Michael Paquier michael.paqu...@gmail.com wrote:
>>> On Mon, Apr 6, 2015 at 1:41 PM, Michael Paquier wrote:
>>>> I guess that you are working on a patch? If not, you are looking for one?
>>>
>>> Code-speaking, this gives the patch attached.
>>
>> Thanks! Here are the review comments:
>>
>> I'm not familiar with native language support (sorry), but don't we need
>> to add the shortcut of gettext into every calls of pg_log and pg_fatal,
>> e.g., change pg_fatal("xxx") to pg_fatal(_("xxx"))? I know that fprintf()
>> in pg_Log_v() has such shortcut, but I'm not sure if that's enough or not.
>
> It's not necessary for pg_fatal and the like, because those functions are
> marked to have their first argument automatically translated in nls.mk.
> This means that the string literal is automatically extracted into
> pg_rewind.pot for translators. Of course, the function itself must call
> _() (or some variant thereof) to actually fetch the translated string at
> run time.

Understood. Thanks!

BTW, as far as I read pg_rewind's nls.mk correctly, it also has two problems.
(1) file_ops.c should be added into GETTEXT_FILES.
(2) pg_log should be pg_log:2 in GETTEXT_TRIGGERS

> Another thing is compound messages like "foo has happened\nSee --help for
> usage.\n" and "bar didn't happen\nSee --help for usage." In those cases,
> the "see --help" part would need to be translated over and over, so it's
> best to separate them in phrases to avoid repetitive work for translators.
> Not sure how to do this -- maybe something like
>
> 	_("foo has happened\n") _("See --help")
>
> but I'm not sure how to appease the compiler. Having them in two separate
> pg_log() calls (or whatever) was handy for this reason.

Yep.

Regards,

--
Fujii Masao
Re: [HACKERS] Assertion failure when streaming logical changes
On 11 February 2015 at 08:51, Heikki Linnakangas hlinnakan...@vmware.com wrote:
> The new xmin tracking code assumes that if a snapshot's regd_count > 0,
> it has already been pushed to the RegisteredSnapshots heap. That
> assumption doesn't hold because the logical decoding stuff creates its
> own snapshots and sets regd_count=1 to prevent snapmgr.c from freeing
> them, even though they haven't been registered with RegisterSnapshot.
>
> The second paragraph in this comment in snapmgr.c needs fixing:
>
>  * Likewise, any snapshots that have been exported by pg_export_snapshot
>  * have regd_count = 1 and are counted in RegisteredSnapshots, but are not
>  * tracked by any resource owner.
>  *
>  * The same is true for historic snapshots used during logical decoding,
>  * their lifetime is managed separately (as they life longer as one xact.c
>  * transaction).
>
> Besides the spelling, that's incorrect: historic snapshots are *not*
> counted in RegisteredSnapshots. That was harmless before commit 94028691,
> but it was always wrong.
>
> I think setting regd_count=1 outside snapmgr.c is a pretty ugly hack.
> snapbuild.c also abuses active_count as a reference counter, which is
> similarly ugly. I think regd_count and active_count should both be
> treated as private to snapmgr.c, and initialized to zero elsewhere.
>
> As a minimal fix, we could change the logical decoding code to not use
> regd_count to prevent snapmgr.c from freeing its snapshots, but use
> active_count for that too. Setting regd_count to 1 in
> SnapBuildBuildSnapshot() seems unnecessary in the first place, as the
> callers also call SnapBuildSnapIncRefcount(). Patch attached.

It might be a good idea to apply this if nothing better is forthcoming. Logical decoding in WALsenders is broken at the moment. Otherwise it needs to go on the 9.5 blockers list.

But could we get rid of those active_count manipulations too? Could you replace the SnapBuildSnap[Inc|Dec]Refcount calls with [Un]RegisterSnapshot()?
It would be interesting to know why it was done that way in the first place, rather than using the snapshot management infrastructure. I presume it needed to do something not directly offered by the snapshot manager but I haven't managed to grasp what, exactly. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
On Tue, Apr 7, 2015 at 9:32 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote:

On Mon, Apr 6, 2015 at 10:21 AM, Sawada Masahiko sawada.m...@gmail.com wrote:

On Fri, Mar 13, 2015 at 5:10 PM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote:

Hello, I have some trivial comments about the latest patch.

At Thu, 12 Mar 2015 21:15:14 +0900, Sawada Masahiko sawada.m...@gmail.com wrote in CAD21AoBxPCpPvKQmvJMUh+p=2pfau03gkjq2r2zy47xhsh2...@mail.gmail.com sawada.mshk

On Thu, Mar 12, 2015 at 6:36 AM, Jim Nasby jim.na...@bluetreble.com wrote:

Are the parentheses necessary? No other WITH option requires them, other than create table/matview (COPY doesn't actually require them).

I was imagining EXPLAIN syntax. Is there some possibility of supporting multiple options for the REINDEX command in future? If there is, the syntax will be as follows:

REINDEX { INDEX | ... } name WITH VERBOSE, XXX, XXX;

I thought the style with parentheses is better than the above style.

The thing is, ()s are actually an odd duck. Very little supports it, and while COPY allows it they're not required. EXPLAIN is a different story, because that's not WITH; we're actually using () *instead of* WITH. So because almost all commands that use WITH don't even accept (), I don't think this should either. It certainly shouldn't require them, because unlike EXPLAIN, there's no need to require them.

I understood what your point is. The attached patch changes the syntax; it does not have parentheses.

As I looked into the code to find what the syntax would be, I found some points which would be better to be fixed. In gram.y the options is a list of cstring, but it is not necessary to be a list because there's only one kind of option now. If you prefer it to be a list, I have a comment on the way to make a string list in gram.y. You stored a bare cstring in the options list, but I think it is not the preferable form. I suppose the following are preferable.
Corresponding fixes are needed in ReindexTable, ReindexIndex, ReindexMultipleTables.

	$$ = list_make1(makeString($1));
	$$ = lappend($1, list_make1(makeString($3)));

In equalfuncs.c, _equalReindexStmt forgets to compare the member options. _copyReindexStmt also forgets to copy it. The way you constructed the options list prevents them from doing their jobs using prepared methods. Comparing and copying the member option is needed even if it becomes a simple string.

I revised the patch, and changed gram.y so it doesn't use the list. So this patch adds the new syntax:

REINDEX { INDEX | ... } name WITH VERBOSE;

Also the documentation is updated. Please give me feedback.

Some notes:

1) There is a trailing space in src/backend/parser/gram.y:

-	| REINDEX DATABASE name opt_force
+	| REINDEX reindex_target_multitable name WITH opt_verbose
		{
			ReindexStmt *n = makeNode(ReindexStmt);
-			n->kind = REINDEX_OBJECT_DATABASE;
+			n->kind = $2;
			n->name = $3;
			n->relation = NULL;
+			n->verbose = $5;
			$$ = (Node *)n;
		}
	;

2) The documentation was updated and matches the behaviour.

3) psql autocomplete is ok.

4) Lack of regression tests. I think you should add some regression tests like this:

fabrizio=# \set VERBOSITY terse
fabrizio=# create table reindex_verbose(id integer primary key);
CREATE TABLE
fabrizio=# reindex table reindex_verbose with verbose;
INFO: index "reindex_verbose_pkey" was reindexed.
REINDEX

5) Code style and organization is ok.

6) You should add the new field ReindexStmt->verbose to src/backend/nodes/copyfuncs.c.

Regards,

Thank you for your reviewing. I modified the patch and attached the latest version patch (v7). Please have a look at it.
Regards, --- Sawada Masahiko diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml index 0a4c7d4..27be1a4 100644 --- a/doc/src/sgml/ref/reindex.sgml +++ b/doc/src/sgml/ref/reindex.sgml @@ -22,6 +22,7 @@ PostgreSQL documentation refsynopsisdiv synopsis REINDEX { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } replaceable class=PARAMETERname/replaceable [ FORCE ] +REINDEX { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } replaceable class=PARAMETERname/replaceable WITH VERBOSE /synopsis /refsynopsisdiv @@ -159,6 +160,15 @@ REINDEX { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } replaceable class=PARAM /para /listitem /varlistentry + + varlistentry +termliteralVERBOSE/literal/term +listitem + para + Prints a progress report as each index is reindexed. + /para +/listitem + /varlistentry /variablelist /refsect1 diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index 351dcb2..fc44495 100644 --- a/src/backend/catalog/index.c
[HACKERS] PATCH: Add 'pid' column to pg_replication_slots
The attached patch adds a 'pid' column to pg_replication_slots, so it's possible to associate an active slot with the pg_stat_replication entry that describes the walsender using the slot. If a user backend (or bgworker) is using the slot over the SQL interface, the 'pid' column will correspond to the pg_stat_activity entry for that backend instead. After all, both it and pg_stat_replication are views over pg_stat_get_activity() anyway. Detailed rationale in patch. Please consider this for 9.5. It's a pretty light patch, and it'd be good to have it in place to ease monitoring of slot-based replication. Note that logical replication walsenders are broken in HEAD so testing this using the test_decoding module and pg_recvlogical will crash the walsender. This is a pre-existing bug; see http://www.postgresql.org/message-id/CAB7nPqQSdx7coHk0D6G=mkjntgyjxpdw+pwiskkssaezfts...@mail.gmail.com and http://www.postgresql.org/message-id/CAMsr+YEh50r70+hP+w=rCzEuenoQRCNMDA7PmRSK06Ro9r=9...@mail.gmail.com ) -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services From eabd1a1a66045703d7561b3c8883e90756582206 Mon Sep 17 00:00:00 2001 From: Craig Ringer cr...@2ndquadrant.com Date: Wed, 1 Apr 2015 17:18:39 +0800 Subject: [PATCH] Add a 'pid' column to pg_replication_slots This makes it easier to associate pg_stat_replication and pg_replication_slots entries. Right now an active replication slot could be being read by a walsender (using the replication protocol) or via normal user backend (using the SQL interface). The 'active' flag shows that it's in use, but not by whom. pg_stat_replication and pg_stat_activity both expose the relevant backend pid. Neither expose information about any replication slots being used. This patch adds a 'pid' column to pg_replication_slots, which is NULL when the slot isn't active, otherwise the PID of the backend that's using the slot. 
It makes the 'active' field wholly redundant, but that field is retained for backwards compatibility at the SQL level. This makes it possible to find out which walsenders or normal backends are using a slot, something that was not previously possible from the SQL level. This is particularly important for monitoring of replication lag. --- contrib/test_decoding/expected/ddl.out | 4 ++-- doc/src/sgml/catalogs.sgml | 13 + src/backend/catalog/system_views.sql | 1 + src/backend/replication/slot.c | 22 +++--- src/backend/replication/slotfuncs.c| 13 + src/include/catalog/pg_proc.h | 2 +- src/include/replication/slot.h | 6 -- src/test/regress/expected/rules.out| 3 ++- 8 files changed, 43 insertions(+), 21 deletions(-) diff --git a/contrib/test_decoding/expected/ddl.out b/contrib/test_decoding/expected/ddl.out index 780120d..80cf09b 100644 --- a/contrib/test_decoding/expected/ddl.out +++ b/contrib/test_decoding/expected/ddl.out @@ -603,7 +603,7 @@ SELECT pg_drop_replication_slot('regression_slot'); /* check that the slot is gone */ SELECT * FROM pg_replication_slots; - slot_name | plugin | slot_type | datoid | database | active | xmin | catalog_xmin | restart_lsn ++---++--++--+--+- + slot_name | plugin | slot_type | datoid | database | active | pid | xmin | catalog_xmin | restart_lsn +---++---++--++-+--+--+- (0 rows) diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index d0b78f2..9945d1f 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -5401,6 +5401,19 @@ /row row + entrystructfieldpid/structfield/entry + entrytypeinteger/type/entry + entry/entry + entryThe process ID of the session or WALsender using this slot if the + slot is currently actively being used. literalNULL/literal if + inactive. Corresponds to + structnamepg_stat_activity/structname.structfieldpid/structfield + (for normal backends) or + structnamepg_stat_replication/structname.structfieldpid/structfield + (for WALsenders). 
(Since: 9.5)/entry + /row + + row entrystructfieldxmin/structfield/entry entrytypexid/type/entry entry/entry diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 2800f73..5977075 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -654,6 +654,7 @@ CREATE VIEW pg_replication_slots AS L.datoid, D.datname AS database, L.active, +L.pid, L.xmin, L.catalog_xmin, L.restart_lsn diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index dd7ff0f..1ed0cd9 100644 --- a/src/backend/replication/slot.c +++
[HACKERS] Row security violation error is misleading
When attempting to insert a row that violates a row security policy that applies to writes, the error message emitted references WITH CHECK OPTION, even though (as far as the user knows) there's no such thing involved. If you understand the internals you'd know that row security re-uses the same logic as WITH CHECK OPTION, but it's going to be confusing for users.

postgres=> INSERT INTO clients (account_name, account_manager)
VALUES ('peters', 'peter'), ('johannas', 'johanna');
ERROR:  44000: new row violates WITH CHECK OPTION for "clients"
DETAIL:  Failing row contains (7, johannas, johanna).
LOCATION:  ExecWithCheckOptions, execMain.c:1683

... yet "clients" is a table, not a view, and cannot have a WITH CHECK OPTION clause. There is no reference to the policy being violated or to the fact that it's row security involved. I think this is going to be very confusing for users.

I was expecting to see something more like:

ERROR:  44000: new row in table "clients" violates row level security policy "just_own_clients"

Re-using the SQLSTATE 44000 is a bit iffy too. We should probably define something to differentiate this, like:

44P01	ROW SECURITY WRITE POLICY VIOLATION

(I've finally found some time to try to review the user-facing side of the row security patch as-committed.)

--
Craig Ringer
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Assertion failure when streaming logical changes
On 2015-04-07 17:22:12 +0800, Craig Ringer wrote:
> It might be a good idea to apply this if nothing better is forthcoming.
> Logical decoding in WALsenders is broken at the moment.

Yes.

> Otherwise it needs to go on the 9.5 blockers list. But could we get rid
> of those active_count manipulations too? Could you replace the
> SnapBuildSnap[Inc|Dec]Refcount calls with [Un]RegisterSnapshot()?
>
> It would be interesting to know why it was done that way in the first
> place, rather than using the snapshot management infrastructure.

Because the snapshot tracking infrastructure isn't particularly suitable. These snapshots are much longer lived; they span transactions and such. Nothing snapmgr.c is made for. It seemed more complicated/bug-prone to change snapmgr.c to support a foreign use case.

Greetings,

Andres Freund
Re: [HACKERS] Column mis-spelling hints vs case folding
On Sun, Apr 5, 2015 at 12:16 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> A newbie error that we see *constantly* is misunderstanding identifier
> case-folding rules. ISTM that commit e529cd4ff missed a chance to help
> with that. You do get a hint for this:
>
> regression=# create table t1 (foo int, Bar int);
> CREATE TABLE
> regression=# select bar from t1;
> ERROR:  column "bar" does not exist
> LINE 1: select bar from t1;
>                ^
> HINT:  Perhaps you meant to reference the column "t1.Bar".
>
> but apparently it's just treating that as a vanilla
> one-mistyped-character error, because there's no hint for this:
>
> regression=# create table t2 (foo int, BAR int);
> CREATE TABLE
> regression=# select BAR from t2;
> ERROR:  column "bar" does not exist
> LINE 1: select BAR from t2;
>                ^
>
> I think this hint might be a lot more useful if its comparison mechanism
> were case-insensitive.

If you want to make that change, I will not object.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Column mis-spelling hints vs case folding
On 7 Apr 2015 09:37, Robert Haas robertmh...@gmail.com wrote:
> On Sun, Apr 5, 2015 at 12:16 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> A newbie error that we see *constantly* is misunderstanding identifier
>> case-folding rules.
>> I think this hint might be a lot more useful if its comparison
>> mechanism were case-insensitive.
>
> If you want to make that change, I will not object.

If you just make it case insensitive that would be an improvement. Nobody sane has similar columns that differ only in case.

But if the original token was not quoted and the suggested token requires quoting, IWBNI the hint directly addressed that source of confusion. But if that gets fiddly, trying to squeeze too much into one hint, then better the generic hint than nothing at all. I don't want to kill a good simple change with bike-shedding here.
Re: [HACKERS] Replication identifiers, take 4
On 2015-03-28 23:50:20 +0100, Petr Jelinek wrote:
> And finally I have issue with how the new identifiers are allocated.
> Currently, if you create identifier 'foo', remove identifier 'foo' and
> create identifier 'bar', the identifier 'bar' will have same id as the
> old 'foo' identifier. This can be problem because the identifier id is
> used as origin of the data and the replication solution using the
> replication identifiers can end up thinking that data came from node
> 'bar' even though they came from the node 'foo' which no longer exists.
> This can have bad effects for example on conflict detection or
> debugging problems with replication. Maybe another reason to use
> standard Oids?

As the same reason exists for oids, just somewhat less likely, I don't see it as a reason for much. It's really not that hard to get oid conflicts once your server has lived for a while. As soon as the oid counter has wrapped around once, it's not unlikely to have conflicts. And with temp tables (or much more extremely WITH OID tables) and such it's not that hard to reach that point.

The only material difference this makes is that it's much easier to notice the problem during development.

Greetings,

Andres Freund
Re: [HACKERS] Replication identifiers, take 4
On 2015-03-24 23:11:26 -0400, Robert Haas wrote:
> On Mon, Feb 16, 2015 at 4:46 AM, Andres Freund and...@2ndquadrant.com wrote:
>>> At a quick glance, this basic design seems workable. I would suggest
>>> expanding the replication IDs to regular 4 byte oids. Two extra bytes
>>> is a small price to pay, to make it work more like everything else in
>>> the system.
>>
>> I don't know. Growing from 3 to 5 byte overhead per relevant record (or
>> even 0 to 5 in case the padding is reused) is rather noticeable. If we
>> later find it to be a limit (I seriously doubt that), we can still
>> increase it in a major release without anybody really noticing.
>
> You might notice that Heikki is making the same point here that I've
> attempted to make multiple times in the past: limiting the replication
> identifier to 2 bytes because that's how much padding space you happen
> to have available is optimizing for the wrong thing. What we should be
> optimizing for is consistency and uniformity of design. System catalogs
> have OIDs, so this one should, too. You're not going to be able to
> paper over the fact that the column has some funky data type that is
> unlike every other column in the system. To the best of my knowledge,
> the statement that there is a noticeable performance cost for those 2
> extra bytes is also completely unsupported by any actual benchmarking.

I'm starting benchmarks now. But I have to say: I find the idea that you'd need more than 2^16 identifiers anytime soon not very credible. The likelihood that replication identifiers are the limiting factor towards that seems incredibly small. Just consider how you'd apply changes from so many remotes; how to stream changes to them; how to even configure such a complex setup. We can easily change the size limits in the next major release without anybody being inconvenienced.

We've gone through quite some lengths reducing the overhead of WAL. I don't understand why it's important that we do not make compromises here; but why that doesn't matter elsewhere.
Greetings,

Andres Freund
Re: [HACKERS] PATCH: Add 'pid' column to pg_replication_slots
Hi,

On 2015-04-07 18:41:59 +0800, Craig Ringer wrote:
> The attached patch adds a 'pid' column to pg_replication_slots, so it's
> possible to associate an active slot with the pg_stat_replication entry
> that describes the walsender using the slot.

This really should have been done that way from the get go. So I'm inclined to just apply this soon. Will do unless somebody protests.

Greetings,

Andres Freund
Re: [HACKERS] Replication identifiers, take 4
On 2015-04-07 16:30:25 +0200, Andres Freund wrote:
> And with temp tables (or much more extremely WITH OID tables) and such
> it's not that hard to reach that point.

Oh, and obviously toast data. A couple tables with toasted columns is also a good way to rapidly consume oids.

Greetings,

Andres Freund
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
On Tue, Apr 7, 2015 at 7:22 AM, Sawada Masahiko sawada.m...@gmail.com wrote: Thank you for your reviewing. I modified the patch and attached latest version patch(v7). Please have a look it. Looks good to me. Attached patch (v8) just fix a tab indentation in gram.y. Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog: http://fabriziomello.github.io Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello Github: http://github.com/fabriziomello diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml index 0a4c7d4..27be1a4 100644 --- a/doc/src/sgml/ref/reindex.sgml +++ b/doc/src/sgml/ref/reindex.sgml @@ -22,6 +22,7 @@ PostgreSQL documentation refsynopsisdiv synopsis REINDEX { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } replaceable class=PARAMETERname/replaceable [ FORCE ] +REINDEX { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } replaceable class=PARAMETERname/replaceable WITH VERBOSE /synopsis /refsynopsisdiv @@ -159,6 +160,15 @@ REINDEX { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } replaceable class=PARAM /para /listitem /varlistentry + + varlistentry +termliteralVERBOSE/literal/term +listitem + para + Prints a progress report as each index is reindexed. 
+ /para +/listitem + /varlistentry /variablelist /refsect1 diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index 351dcb2..fc44495 100644 --- a/src/backend/catalog/index.c +++ b/src/backend/catalog/index.c @@ -63,6 +63,7 @@ #include utils/inval.h #include utils/lsyscache.h #include utils/memutils.h +#include utils/pg_rusage.h #include utils/syscache.h #include utils/tuplesort.h #include utils/snapmgr.h @@ -3133,13 +3134,18 @@ IndexGetRelation(Oid indexId, bool missing_ok) * reindex_index - This routine is used to recreate a single index */ void -reindex_index(Oid indexId, bool skip_constraint_checks, char persistence) +reindex_index(Oid indexId, bool skip_constraint_checks, char persistence, +bool verbose) { Relation iRel, heapRelation; Oid heapId; IndexInfo *indexInfo; volatile bool skipped_constraint = false; + int elevel = verbose ? INFO : DEBUG2; + PGRUsage ru0; + + pg_rusage_init(ru0); /* * Open and lock the parent heap relation. ShareLock is sufficient since @@ -3283,6 +3289,13 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence) heap_close(pg_index, RowExclusiveLock); } + /* Log what we did */ + ereport(elevel, + (errmsg(index \%s\ was reindexed., + get_rel_name(indexId)), + errdetail(%s., + pg_rusage_show(ru0; + /* Close rels, but keep locks */ index_close(iRel, NoLock); heap_close(heapRelation, NoLock); @@ -3324,7 +3337,7 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence) * index rebuild. */ bool -reindex_relation(Oid relid, int flags) +reindex_relation(Oid relid, int flags, bool verbose) { Relation rel; Oid toast_relid; @@ -3415,7 +3428,7 @@ reindex_relation(Oid relid, int flags) RelationSetIndexList(rel, doneIndexes, InvalidOid); reindex_index(indexOid, !(flags REINDEX_REL_CHECK_CONSTRAINTS), - persistence); + persistence, verbose); CommandCounterIncrement(); @@ -3450,7 +3463,7 @@ reindex_relation(Oid relid, int flags) * still hold the lock on the master table. 
*/ if ((flags REINDEX_REL_PROCESS_TOAST) OidIsValid(toast_relid)) - result |= reindex_relation(toast_relid, flags); + result |= reindex_relation(toast_relid, flags, verbose); return result; } diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c index 3febdd5..34ffaba 100644 --- a/src/backend/commands/cluster.c +++ b/src/backend/commands/cluster.c @@ -1532,7 +1532,7 @@ finish_heap_swap(Oid OIDOldHeap, Oid OIDNewHeap, else if (newrelpersistence == RELPERSISTENCE_PERMANENT) reindex_flags |= REINDEX_REL_FORCE_INDEXES_PERMANENT; - reindex_relation(OIDOldHeap, reindex_flags); + reindex_relation(OIDOldHeap, reindex_flags, false); /* * If the relation being rebuild is pg_class, swap_relation_files() diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index 99acd4a..a42a508 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -1681,7 +1681,7 @@ ChooseIndexColumnNames(List *indexElems) * Recreate a specific index. */ Oid -ReindexIndex(RangeVar *indexRelation) +ReindexIndex(RangeVar *indexRelation, bool verbose) { Oid indOid; Oid heapOid = InvalidOid; @@ -1706,7 +1706,7 @@ ReindexIndex(RangeVar *indexRelation) persistence = irel-rd_rel-relpersistence; index_close(irel, NoLock); - reindex_index(indOid, false, persistence); + reindex_index(indOid, false, persistence, verbose); return indOid; } @@ -1775,7 +1775,7 @@ RangeVarCallbackForReindexIndex(const RangeVar *relation, * Recreate
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
On Wed, Apr 8, 2015 at 1:11 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Tue, Apr 7, 2015 at 1:04 PM, Sawada Masahiko sawada.m...@gmail.com wrote: On Tue, Apr 7, 2015 at 10:12 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Tue, Apr 7, 2015 at 7:22 AM, Sawada Masahiko sawada.m...@gmail.com wrote: Thank you for your review. I modified the patch and attached the latest version (v7). Please have a look at it. Looks good to me. The attached patch (v8) just fixes a tab indentation in gram.y. I had forgotten to fix the tab indentation, sorry. Thank you for reviewing! It looks good to me too. Can this patch be marked as Ready for Committer? +1 Changed status to Ready for Committer. Thank you for the final review! Regards, --- Sawada Masahiko
Re: [HACKERS] Replication identifiers, take 4
On 4/7/15 9:30 AM, Andres Freund wrote: On 2015-03-28 23:50:20 +0100, Petr Jelinek wrote: And finally I have issue with how the new identifiers are allocated. Currently, if you create identifier 'foo', remove identifier 'foo' and create identifier 'bar', the identifier 'bar' will have same id as the old 'foo' identifier. This can be problem because the identifier id is used as origin of the data and the replication solution using the replication identifiers can end up thinking that data came from node 'bar' even though they came from the node 'foo' which no longer exists. This can have bad effects for example on conflict detection or debugging problems with replication. Maybe another reason to use standard Oids? As the same reason exists for oids, just somewhat less likely, I don't see it as a reason for much. It's really not that hard to get oid conflicts once your server has lived for a while. As soon as the oid counter has wrapped around once, it's not unlikely to have conflicts. And with temp tables (or much more extremely WITH OID tables) and such it's not that hard to reach that point. The only material difference this makes is that it's much easier to notice the problem during development. Why not just create a sequence? I suspect it may not be as fast to assign as an OID, but it's not like you'd be doing this all the time. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
On Tue, Apr 7, 2015 at 10:12 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Tue, Apr 7, 2015 at 7:22 AM, Sawada Masahiko sawada.m...@gmail.com wrote: Thank you for your review. I modified the patch and attached the latest version (v7). Please have a look at it. Looks good to me. The attached patch (v8) just fixes a tab indentation in gram.y. I had forgotten to fix the tab indentation, sorry. Thank you for reviewing! It looks good to me too. Can this patch be marked as Ready for Committer? Regards, --- Sawada Masahiko
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
On Tue, Apr 7, 2015 at 1:04 PM, Sawada Masahiko sawada.m...@gmail.com wrote: On Tue, Apr 7, 2015 at 10:12 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Tue, Apr 7, 2015 at 7:22 AM, Sawada Masahiko sawada.m...@gmail.com wrote: Thank you for your review. I modified the patch and attached the latest version (v7). Please have a look at it. Looks good to me. The attached patch (v8) just fixes a tab indentation in gram.y. I had forgotten to fix the tab indentation, sorry. Thank you for reviewing! It looks good to me too. Can this patch be marked as Ready for Committer? +1 -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog: http://fabriziomello.github.io Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello Github: http://github.com/fabriziomello
Re: [HACKERS] Row security violation error is misleading
Craig, * Craig Ringer (cr...@2ndquadrant.com) wrote: When attempting to insert a row that violates a row security policy that applies to writes, the error message emitted references WITH CHECK OPTION, even though (as far as the user knows) there's no such thing involved. If you understand the internals you'd know that row security re-uses the same logic as WITH CHECK OPTION, but it's going to be confusing for users. Agreed, and we actually have a patch from Dean already to address this; it's just been waiting on me (with a couple of other ones). It'd certainly be great if you have time to take a look at those, though, generally speaking, I feel pretty happy about where those are and believe they really just need to be reviewed/tested and maybe a bit of word-smithing around the docs. Thanks! Stephen
Re: [HACKERS] Support UPDATE table SET(*)=...
Atri Sharma atri.j...@gmail.com writes: Please find attached latest version of UPDATE (*) patch.The patch implements review comments and Tom's gripe about touching transformTargetList is addressed now. I have added regression tests and simplified parser representation a bit. I spent a fair amount of time cleaning this patch up to get it into committable shape, but as I was working on the documentation I started to lose enthusiasm for it, because I was having a hard time coming up with compelling examples. The originally proposed motivation was It solves the problem of doing UPDATE from a record variable of the same type as the table e.g. update foo set (*) = (select foorec.*) where ...; but it feels to me that this is not actually a very good solution to that problem --- even granting that the problem needs to be solved, a claim that still requires some justification IMO. Here is a possible use-case: regression=# create table src (f1 int, f2 text, f3 real); CREATE TABLE regression=# create table dst (f1 int, f2 text, f3 real); CREATE TABLE regression=# create rule r1 as on update to src do instead regression-# update dst set (*) = (new.*) where dst.f1 = old.f1; ERROR: number of columns does not match number of values LINE 2: update dst set (*) = (new.*) where dst.f1 = old.f1; ^ So that's annoying. You can work around it with this very unobvious (and undocumented) syntax hack: regression=# create rule r1 as on update to src do instead regression-# update dst set (*) = (select new.*) where dst.f1 = old.f1; CREATE RULE But what happens if dst has no matching row? Your data goes into the bit bucket, that's what. What you really wanted here was some kind of UPSERT. There's certainly a possible use-case for a notation like this in UPSERT, when we get around to implementing that; but I'm less than convinced that we need it in plain UPDATE. 
What's much worse though is that the rule that actually gets stored is: regression=# \d+ src Table public.src Column | Type | Modifiers | Storage | Stats target | Description +-+---+--+--+- f1 | integer | | plain| | f2 | text| | extended | | f3 | real| | plain| | Rules: r1 AS ON UPDATE TO src DO INSTEAD UPDATE dst SET (f1, f2, f3) = ( SELECT new.f1, new.f2, new.f3) WHERE dst.f1 = old.f1 That is, you don't actually have any protection at all against future additions of columns, which seems like it would be most of the point of using a notation like this. We could possibly address that by postponing expansion of the *-notation to rule rewrite time, but that seems like it'd be a complete disaster in terms of modularity, or even usability (since column-mismatch errors wouldn't get raised till then either). And it'd certainly be a far more invasive and complex patch than this. So I'm feeling that this may not be a good idea, or at least not a good implementation of the idea. I'm inclined to reject the patch rather than lock us into something that is not standard and doesn't really do what people would be likely to want. Attached is the updated patch that I had before arriving at this discouraging conclusion. regards, tom lane diff --git a/doc/src/sgml/ref/update.sgml b/doc/src/sgml/ref/update.sgml index 35b0699..adb4473 100644 *** a/doc/src/sgml/ref/update.sgml --- b/doc/src/sgml/ref/update.sgml *** PostgreSQL documentation *** 25,31 UPDATE [ ONLY ] replaceable class=PARAMETERtable_name/replaceable [ * ] [ [ AS ] replaceable class=parameteralias/replaceable ] SET { replaceable class=PARAMETERcolumn_name/replaceable = { replaceable class=PARAMETERexpression/replaceable | DEFAULT } | ( replaceable class=PARAMETERcolumn_name/replaceable [, ...] ) = ( { replaceable class=PARAMETERexpression/replaceable | DEFAULT } [, ...] ) | ! ( replaceable class=PARAMETERcolumn_name/replaceable [, ...] ) = ( replaceable class=PARAMETERsub-SELECT/replaceable ) } [, ...] 
[ FROM replaceable class=PARAMETERfrom_list/replaceable ] [ WHERE replaceable class=PARAMETERcondition/replaceable | WHERE CURRENT OF replaceable class=PARAMETERcursor_name/replaceable ] --- 25,33 UPDATE [ ONLY ] replaceable class=PARAMETERtable_name/replaceable [ * ] [ [ AS ] replaceable class=parameteralias/replaceable ] SET { replaceable class=PARAMETERcolumn_name/replaceable = { replaceable class=PARAMETERexpression/replaceable | DEFAULT } | ( replaceable class=PARAMETERcolumn_name/replaceable [, ...] ) = ( { replaceable class=PARAMETERexpression/replaceable | DEFAULT } [, ...] ) | ! ( replaceable class=PARAMETERcolumn_name/replaceable [, ...] ) = ( replaceable class=PARAMETERsub-SELECT/replaceable ) | ! (*) = ( {
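The column-freezing hazard Tom describes (the stored rule expanding SET (*) into a concrete column list, so later ALTER TABLE ... ADD COLUMN is silently ignored) can be sketched in miniature. This is an illustrative Python analogue with invented helper names, not PostgreSQL code:

```python
# Hypothetical sketch of the modularity problem described above: because
# SET (*) is expanded into a concrete column list at the time the rule
# is *stored*, columns added to the table afterwards are silently left
# out of the stored rule.

def store_rule(table_columns):
    """Expand SET (*) at rule-definition time, as the patch does."""
    return list(table_columns)            # frozen snapshot of the columns

def columns_updated_by_rule(stored_rule, current_columns):
    """Which of the table's current columns the stored rule still covers."""
    return [c for c in current_columns if c in stored_rule]
```

Deferring the expansion to rule rewrite time would avoid the stale snapshot, but, as noted above, at the cost of modularity and of delaying column-mismatch errors.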
Re: [HACKERS] Replication identifiers, take 4
Why not just create a sequence? I suspect it may not be as fast to assign as an OID, but it's not like you'd be doing this all the time. What does that have to do with the thread? Greetings, Andres Freund
[HACKERS] rare avl shutdown slowness (related to signal handling)
I am playing with git tip on Windows 7/32-bit, with the backend compiled with Visual Studio 2005 (I know, it is very old :-( ). I encountered autovacuum launcher (avl) shutdown slowness twice, last night and this morning: after Ctrl-C is hit, the server waits another 60 seconds before it shuts down. Here is one log:

D:\pgsql\bin> postgres -D../data --log_line_prefix=%t %p
2015-04-07 10:30:04 PDT 3148 LOG: database system was shut down at 2015-04-07 10:29:24 PDT
2015-04-07 10:30:04 PDT 19548 LOG: database system is ready to accept connections
2015-04-07 10:30:04 PDT 27008 LOG: autovacuum launcher started
2015-04-07 10:30:08 PDT 19548 LOG: received fast shutdown request
2015-04-07 10:30:08 PDT 19548 LOG: aborting any active transactions
2015-04-07 10:30:08 PDT 27008 LOG: autovacuum launcher shutting down
2015-04-07 10:30:08 PDT 27008 ERROR: canceling statement due to user request
2015-04-07 10:31:09 PDT 27008 LOG: autovacuum launcher shutting down
2015-04-07 10:31:09 PDT 15656 LOG: shutting down
2015-04-07 10:31:09 PDT 15656 LOG: database system is shut down

The interesting part is PID 27008: the avl first ereport()s its shutdown message, which is at the very end of the main loop and just one step away from proc_exit(). It then apparently honors a SIGINT within ereport(), longjmps back to the loop head, and waits another 60 seconds. After the timeout, it ereports shutdown again and finally exits. Not sure if this is Windows-only or general. I can hardly repro it. Has anyone ever seen this? Regards, Qingqing
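The suspected interaction, an interrupt serviced after the loop has decided to exit but before proc_exit(), bouncing control back to the loop's error-recovery point, can be sketched with an illustrative Python analogue (invented structure, not PostgreSQL code; KeyboardInterrupt plays the role of the longjmp back to the loop head):

```python
import signal

# Illustrative analogue of the suspected bug: the launcher's main loop
# has its error-recovery point at the loop head.  If a cancel interrupt
# is serviced after the loop has logged "shutting down" but before it
# reaches proc_exit(), control jumps back to the loop head and the
# process waits out another full naptime before actually exiting.

def run_launcher():
    entries = 0
    shutdown_logged = False
    while True:
        entries += 1                   # loop head == error-recovery point
        try:
            if not shutdown_logged:
                shutdown_logged = True              # "shutting down" ereport ...
                signal.raise_signal(signal.SIGINT)  # ... cancel sneaks in here
            # Only reached on the second pass; a real launcher would
            # first sleep out its 60 s naptime before getting here.
            return entries
        except KeyboardInterrupt:
            continue                   # the "longjmp" back to the loop head
```

The fix direction this suggests is the usual one: once shutdown has been decided, further cancel interrupts must not be allowed to re-enter the loop.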
Re: [HACKERS] Support UPDATE table SET(*)=...
I wrote: So I'm feeling that this may not be a good idea, or at least not a good implementation of the idea. I'm inclined to reject the patch rather than lock us into something that is not standard and doesn't really do what people would be likely to want. BTW, a potentially workable fix to the problem of not wanting to lock down column lists in stored rules is to create a syntax that represents whole-row, record-oriented assignment directly. Then we need not be concerned with individual columns at parse time at all. So imagine something like this: UPDATE dst SET * = new WHERE ...; UPDATE dst SET * = (SELECT src FROM src WHERE ...); or if you needed to construct a row value at runtime you could write UPDATE dst SET * = ROW(x,y,z) WHERE ...; UPDATE dst SET * = (SELECT ROW(x,y,z) FROM src WHERE ...); The main bit of functionality that would be lost compared to the current patch is the ability to use DEFAULT for some of the row members. But I am not sure there is a compelling use-case for that: seems like if you have some DEFAULTs in there then it's unlikely that you don't know the column list accurately, so the existing (col1,col2,...) syntax will serve fine. This seems like it might not be unduly complex to implement, although it would have roughly nothing in common with the current patch. If we were to go in this direction, it would be nice to at the same time add a similar whole-record syntax for INSERT. I'm not sure exactly what that should look like though. Also, again, we ought to be paying attention to how this would match up with UPSERT syntax. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support UPDATE table SET(*)=...
On Tue, Apr 7, 2015 at 12:01 PM, Peter Geoghegan p...@heroku.com wrote: I still don't like the idea of supporting this, though. I'm not aware of any other system allowing something like this for either MERGE or a non-standard UPSERT. That includes MySQL, BTW. Only their REPLACE statement (which is a disaster for various reasons) can do something like this. Their INSERT ... ON DUPLICATE KEY UPDATE statement (which is roughly comparable to the proposed UPSERT patch) cannot update the entire row using a terse expression that references a row excluded from insertion (following the implementation taking the UPDATE path). -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support UPDATE table SET(*)=...
On Tue, Apr 7, 2015 at 11:19 AM, Tom Lane t...@sss.pgh.pa.us wrote: If we were to go in this direction, it would be nice to at the same time add a similar whole-record syntax for INSERT. I'm not sure exactly what that should look like though. Also, again, we ought to be paying attention to how this would match up with UPSERT syntax. I expressed concern about allowing this for UPSERT [1]. To be fair, VoltDB's new UPSERT statement allows something similar (or rather mandates it, since you cannot just update some columns), and that doesn't look wholly unreasonable. I still don't like the idea of supporting this, though. I'm not aware of any other system allowing something like this for either MERGE or a non-standard UPSERT. [1] http://www.postgresql.org/message-id/CAM3SWZT=vxbj7qkaidamybu40ap10udsqooqhvix3ykj7wb...@mail.gmail.com -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support UPDATE table SET(*)=...
Tom Lane wrote: I spent a fair amount of time cleaning this patch up to get it into committable shape, but as I was working on the documentation I started to lose enthusiasm for it, because I was having a hard time coming up with compelling examples. The originally proposed motivation was It solves the problem of doing UPDATE from a record variable of the same type as the table e.g. update foo set (*) = (select foorec.*) where ...; but it feels to me that this is not actually a very good solution to that problem --- even granting that the problem needs to be solved, a claim that still requires some justification IMO. How about an UPDATE run inside a plpgsql function, which is using a row variable of the table type? You could assign values to individual columns of a row variable, and run the multicolumn UPDATE last. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] pg_restore -t should match views, matviews, and foreign tables
Peter Eisentraut pete...@gmx.net writes: On 3/31/15 11:01 PM, Craig Ringer wrote: this patch adds support for views, foreign tables, and materialised views to the pg_restore -t flag. I think this is a good change. Any concerns? Are we happy with pg_dump/pg_restore not distinguishing these objects by type? It seems rather analogous to letting ALTER TABLE work on views etc. Personally I'm fine with this, but certainly some people have complained about that approach so far as ALTER is concerned. (But the implication would be that we'd need four distinct switches, which is not an outcome I favor.) Also, I think you missed MATERIALIZED VIEW DATA. Also, shouldn't there be a documentation update? regards, tom lane
Re: [HACKERS] deparsing utility commands
Executive summary: There is now a CommandDeparse_hook; deparse_utility_command is provided as an extension, intended for 9.6; rest of patch would be pushed to 9.5. Long version: I've made command deparsing hookable. Attached there are three patches: the first patch contains changes to core that just add the command list stuff, so that on ddl_command_end there is access to what has been executed. This includes the OID of the object just created, the command tag, and assorted other details; the deparsed command in JSON format is not immediately part of the result. The third patch contains all the deparse code, packaged as a contrib module and extension named ddl_deparse. Essentially, it's everything that was previously in tcop/deparse_utility.c and utils/adt/ddl_json.c: the stuff that takes the parsenode and OID of a command execution and turns it into a JSON blob, and also the support function that takes the JSON blob and converts back into the plain text rendering of the command. The second patch contains some changes to core code that support the ddl_deparse extension; mainly some ruleutils.c changes. What links patches 0001 and 0003 is a hook, CommandDeparse_hook. If unset, the pg_event_trigger_ddl_commands function returns some boilerplate text like no deparse function installed; if the extension is installed, the JSON rendering is returned instead and can be used with the ddl_deparse_expand_command() function. The rationale for doing things this way is that it will be useful to have 9.5 expose the pg_event_trigger_ddl_commands() function for various uses, while we refine the JSON bits some more and get it committed for 9.6. In reviews, it's clear that there's some more bits to fiddle so that it can be as general as possible. I think we should label the whole DDL command reporting as experimental in 9.5 and subject to change, so that we can just remove the hook in 9.6 when the ddl_deparse thing becomes part of core. Thoughts? 
-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
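The hook arrangement described above can be sketched in miniature. This is an illustrative Python analogue with invented names, not PostgreSQL's actual C API: core keeps a hook slot that is empty by default, and the command-reporting function only returns a JSON rendering once an extension such as ddl_deparse has filled the slot.

```python
import json

# Hypothetical sketch of the CommandDeparse_hook arrangement (invented
# names, not the real C API): unset hook -> boilerplate text; extension
# installed -> JSON rendering of the command.

command_deparse_hook = None       # core's hook slot, unset by default

def pg_event_trigger_ddl_commands(command_tag, object_oid):
    if command_deparse_hook is None:
        return "<no deparse function installed>"
    return command_deparse_hook(command_tag, object_oid)

def ddl_deparse(command_tag, object_oid):
    """What a ddl_deparse-style extension might plug into the slot."""
    return json.dumps({"tag": command_tag, "oid": object_oid})

def load_ddl_deparse_extension():
    """Simulates CREATE EXTENSION ddl_deparse filling the hook slot."""
    global command_deparse_hook
    command_deparse_hook = ddl_deparse
```

The design choice matches the rationale above: 9.5 can ship the stable core function while the JSON format stays swappable behind the hook until it is ready for core.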
Re: [HACKERS] pg_restore -t should match views, matviews, and foreign tables
On 3/31/15 11:01 PM, Craig Ringer wrote: Following on from this -bugs post: http://www.postgresql.org/message-id/camsr+ygj50tvtvk4dbp66gajeoc0kap6kxfehaom+neqmhv...@mail.gmail.com this patch adds support for views, foreign tables, and materialised views to the pg_restore -t flag. I think this is a good change. Any concerns?
Re: [HACKERS] Support UPDATE table SET(*)=...
Peter Geoghegan p...@heroku.com writes: On Tue, Apr 7, 2015 at 11:19 AM, Tom Lane t...@sss.pgh.pa.us wrote: If we were to go in this direction, it would be nice to at the same time add a similar whole-record syntax for INSERT. I'm not sure exactly what that should look like though. Also, again, we ought to be paying attention to how this would match up with UPSERT syntax. I expressed concern about allowing this for UPSERT [1]. Yeah, your analogy to SELECT * being considered dangerous in production is not without merit. However, to the extent that the syntax is used to assign from a composite variable of the same (or compatible) data type, it seems like it would be safe enough. IOW, I think that analogy holds for the syntax implemented by the current patch, but not what I suggested in my followup. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] rejected vs returned with feedback in new CF app
On 4/7/15 3:33 PM, Tom Lane wrote: I tried to mark the UPDATE SET (*) patch as returned with feedback, but the CF app informed me that if I did that the patch would automatically be moved to the next commitfest. That seems completely stupid. There is no need to reconsider it unless a new version of the patch is forthcoming (which there may or may not ever be, but that's beside the point for now). When and if the author does submit a new patch, that would be the time to include it in the next commitfest, no? I noticed that as well and have avoided closing some patches because of it.
Re: [HACKERS] pg_rewind and log messages
On 04/07/2015 05:59 AM, Michael Paquier wrote: On Mon, Apr 6, 2015 at 9:10 PM, Fujii Masao masao.fu...@gmail.com wrote: I eliminated a bunch of newlines in the log messages that seemed really unnecessary to me, simplifying a bit the whole. So the patch removed the newlines from the error messages, and added the newline into pg_fatal/log instead. Perhaps that's good idea, but it's not actually consistent with what we do in other utilities in src/bin. See write_msg() and exit_horribly() in pg_dump, write_stderr() in pg_ctl, and psql_error() in psql. If we want to change that in pg_rewind (IMHO we shouldn't), let's do that as a completely separate patch. While hacking this stuff, I noticed as well that pg_rewind could be called as root on non-Windows platform, that's dangerous from a security point of view as process manipulates files in PGDATA. Hence let's block that. On Windows, a restricted token should be used. Good catch! I think it's better to implement it as a separate patch because it's very different from log message problem. Attached are new patches: - 0001 fixes the restriction issues: restricted token on Windows, and forbid non-root user on non-Windows. Committed this one. - 0002 includes the improvements for logging, addressing the comments above. This is a bit of a mixed bag. I'll comment on the three points from the commit message separately: Fix inconsistent handling of logs in pg_rewind pg_rewind was handling a couple of things differently compared to the other src/bin utilities: - Logging output needs to be flushed on stderr, not stdout Agreed in general. But there's also precedent in printing some stuff to stdout: pg_ctl does that for the status message, like server starting. As does initdb. I'm pretty unclear on what the rule here is. - Logging messages should be prefixed with the application name We tend to do that for error messages, but not for other messages. Seems inappropriate for the progress messages. 
- pg_fatal exits with error code of 1, but it was sometimes followed by subsequent logs, making this information lost in the process. Fixed. - Heikki
[HACKERS] rejected vs returned with feedback in new CF app
I tried to mark the UPDATE SET (*) patch as returned with feedback, but the CF app informed me that if I did that the patch would automatically be moved to the next commitfest. That seems completely stupid. There is no need to reconsider it unless a new version of the patch is forthcoming (which there may or may not ever be, but that's beside the point for now). When and if the author does submit a new patch, that would be the time to include it in the next commitfest, no? I ended up marking it rejected instead, but that seems a bit harsh. regards, tom lane
Re: [HACKERS] Auditing extension for PostgreSQL (Take 2)
On 4/6/15 5:03 PM, Alvaro Herrera wrote: Simon Riggs wrote: The present version can trigger an audit trail event for a statement, without tracking the object that was being audited. This prevents you from searching for all SQL that touches table X, i.e. we know the statements were generated, but not which ones they were. IMHO that makes the resulting audit trail unusable for auditing purposes. I would like to see that functionality put back before it gets committed, if that occurs. Is there a consensus that the current version is the one that we should be reviewing, rather than the one Abhijit submitted? Last I checked, that wasn't at all clear. Well, this one is the commitfest thread of record. At quick glance, my comments about how this maps to specific customer requirements apply to the other submission as well.
Re: [HACKERS] Row security violation error is misleading
On Tue, Apr 7, 2015 at 5:11 AM, Craig Ringer cr...@2ndquadrant.com wrote: postgres=> INSERT INTO clients (account_name, account_manager) VALUES ('peters', 'peter'), ('johannas', 'johanna'); ERROR: 44000: new row violates WITH CHECK OPTION for "clients" DETAIL: Failing row contains (7, johannas, johanna). LOCATION: ExecWithCheckOptions, execMain.c:1683 ... yet clients is a table, not a view, and cannot have a WITH CHECK OPTION clause. There is no reference to the policy being violated or to the fact that it's row security involved. FWIW, I also think that this is very confusing. -- Peter Geoghegan
Re: [HACKERS] Parallel Seq Scan
David Rowley dgrowle...@gmail.com wrote:

If we attempt to do this parallel stuff at plan time, and we happen to plan at some quiet period, or perhaps worse, some application's start-up process happens to PREPARE a load of queries when the database is nice and quiet, then quite possibly we'll end up with some highly parallel queries. Then perhaps come the time these queries are actually executed the server is very busy... Things will fall apart quite quickly due to the masses of IPC and context switches that would be going on. I completely understand that this parallel query stuff is all quite new to us all and we're likely still trying to nail down the correct infrastructure for it to work well, so this is why I'm proposing that the planner should know nothing of parallel query; instead I think it should work more along the lines of:

* The planner should be completely oblivious to what parallel query is.
* Before executor startup the plan is passed to a function which decides if we should parallelise it, and does so if the plan meets the correct requirements. This should likely have a very fast exit path such as:

if root node's cost < parallel_query_cost_threshold
    return; /* the query is not expensive enough to attempt to make parallel */

The above check will allow us to have almost zero overhead for small low-cost queries. This function would likely also have some sort of logic to determine if the server has enough spare resources at the current point in time to allow queries to be parallelised.

There is a lot to like about this suggestion. I've seen enough performance crashes due to too many concurrent processes (even when each connection can only use a single process) to believe that, for a plan which will be saved, it is not possible to know at planning time whether parallelization will be a nice win or a devastating over-saturation of resources during some later execution phase.
Another thing to consider is that this is not entirely unrelated to the concept of admission control policies. Perhaps this phase could be a more general execution start-up admission control phase, where parallel processing would be one adjustment that could be considered. Initially it might be the *only* consideration, but it might be good to try to frame it in a way that allowed implementation of other policies, too.
--
Kevin Grittner
EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns
Etsuro Fujita fujita.ets...@lab.ntt.co.jp writes: To support ROW_MARK_REFERENCE on (postgres_fdw) foreign tables, I'd like to propose the following FDW APIs: RowMarkType GetForeignRowMarkType(Oid relid, LockClauseStrength strength); Decide which rowmark type to use for a foreign table (that has strength = LCS_NONE), ie, ROW_MARK_REFERENCE or ROW_MARK_COPY. (For now, the second argument takes LCS_NONE only, but is intended to be used for the possible extension to the other cases.) This is called during select_rowmark_type() in the planner. Why would you restrict that to LCS_NONE? Seems like the point is to give the FDW control of the rowmark behavior, not only partial control. (For the same reason I disagree with the error check in the patch that restricts which ROW_MARK options this function can return. If the FDW has TIDs, seems like it could reasonably use whatever options tables can use.) void BeginForeignFetch(EState *estate, ExecRowMark *erm, List *fdw_private, int eflags); Begin a remote fetch. This is called during InitPlan() in the executor. The begin/end functions seem like useless extra mechanism. Why wouldn't the FDW just handle this during its regular scan setup? It could look to see whether the foreign table is referenced by any ExecRowMarks (possibly there's room to add some convenience functions to help with that). What's more, that design would make it simpler if the basic row fetch needs to be modified. And I'd also like to propose to add a table/server option, row_mark_reference, to postgres_fdw. When a user sets the option to true for eg a foreign table, ROW_MARK_REFERENCE will be used for the table, not ROW_MARK_COPY. Why would we leave that in the hands of the user? Hardly anybody is going to know what to do with the option, or even that they should do something with it. It's basically only of value for debugging AFAICS, but if we expose an option we're going to be stuck with supporting it forever. 
regards, tom lane
Re: [HACKERS] New error code to track unsupported contexts
On 11/28/14 11:41 PM, Michael Paquier wrote: Hi all, When pg_event_trigger_dropped_objects is run in a context that is not the one of an event trigger, currently the error code ERRCODE_FEATURE_NOT_SUPPORTED is returned. Wouldn't it be better to have an error code defined for out-of-context calls instead? It seems that it would be a good thing to have more error verbosity for situations like the case above. Note that this idea has been mentioned on this ML a couple of weeks back. In any case, attached is a patch showing the idea. Opinions? Is that worth having?

Anything ever happen with this? (FWIW, I'm in favor of it. Reporting that the feature isn't supported is confusing...)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] rare avl shutdown slowness (related to signal handling)
I got another repro of the shutdown slowness (DEBUG5 logs with verbose output are attached). It gives a finer picture of what's going on:

1. Avl ereport()s "autovacuum launcher shutting down";
2. At the end of errfinish(), it honors a pending SIGINT;
3. The SIGINT handler longjmps to the start of avl error handling;
4. The error handling continues and calls rebuild_database_list() (that's why we see the begin/commit pair);
5. In the main loop, it WaitLatch()es for 60 seconds;
6. Finally it ereport()s again and proc_exit()s.

This looks like a general pattern - I don't think *nix is immune. Notice that this ereport() is special as there is no way to go back after it. So we can insert HOLD_INTERRUPTS() just before it. Thoughts?

Regards, Qingqing

On Tue, Apr 7, 2015 at 10:54 AM, Qingqing Zhou zhouqq.postg...@gmail.com wrote:

I am playing git tip on windows 7/32 bits, with the backend compiled with visual studio 2005 (I know, it is very old :-( ). I encountered avl shutdown slowness twice, last night and this morning: after a ctrl_c is hit, wait for another 60 seconds, server shuts down. Here is one log:

D:\pgsql\bin> postgres -D ../data --log_line_prefix=%t %p
2015-04-07 10:30:04 PDT 3148 LOG: database system was shut down at 2015-04-07 10:29:24 PDT
2015-04-07 10:30:04 PDT 19548 LOG: database system is ready to accept connections
2015-04-07 10:30:04 PDT 27008 LOG: autovacuum launcher started
2015-04-07 10:30:08 PDT 19548 LOG: received fast shutdown request
2015-04-07 10:30:08 PDT 19548 LOG: aborting any active transactions
2015-04-07 10:30:08 PDT 27008 LOG: autovacuum launcher shutting down
2015-04-07 10:30:08 PDT 27008 ERROR: canceling statement due to user request
2015-04-07 10:31:09 PDT 27008 LOG: autovacuum launcher shutting down
2015-04-07 10:31:09 PDT 15656 LOG: shutting down
2015-04-07 10:31:09 PDT 15656 LOG: database system is shut down

The interesting part is on PID 27008: avl first ereport()s the shutdown message, which is at the very end of the main loop and just one step away from proc_exit().
Then it seems to honor a SIGINT within ereport(), longjmps to the loop head, and waits for another 60 seconds. After the timeout, it ereport()s shutdown again, and finally exits. Not sure if this is Windows-only or general. I can hardly repro it. Has anyone ever seen this?

Regards, Qingqing
Re: [HACKERS] rare avl shutdown slowness (related to signal handling)
Qingqing Zhou zhouqq.postg...@gmail.com writes: I got another repro of the shutdown slowness (DEBUG5 logs with verbose output are attached). It gives a finer picture of what's going on: 1. Avl ereport()s "autovacuum launcher shutting down"; 2. At the end of errfinish(), it honors a pending SIGINT; 3. The SIGINT handler longjmps to the start of avl error handling; 4. The error handling continues and calls rebuild_database_list() (that's why we see the begin/commit pair); 5. In the main loop, it WaitLatch()es for 60 seconds; 6. Finally it ereport()s again and proc_exit()s. This looks like a general pattern - I don't think *nix is immune. Notice that this ereport() is special as there is no way to go back after it. So we can insert HOLD_INTERRUPTS() just before it. Thoughts?

That seems like (a) a hack, and (b) not likely to solve the problem completely, unless you leave interrupts held throughout proc_exit(), which would create all sorts of opportunities for corner-case bugs during on_proc_exit hooks. I think changing the outer for(;;) to while (!got_SIGTERM) would be a much safer fix.

It looks like there's a related risk associated with this bit:

    /* in emergency mode, just start a worker and go away */
    if (!AutoVacuumingActive())
    {
        do_start_worker();
        proc_exit(0);       /* done */
    }

If we get SIGHUP and see that autovacuum has been turned off, we exit the main loop, but we don't set got_SIGTERM. So if we then get a similar error at the shutdown report, we'd not merely waste some time, but actually incorrectly launch a child.

regards, tom lane
Re: [HACKERS] Replication identifiers, take 4
On 4/7/15 10:58 AM, Andres Freund wrote: Why not just create a sequence? I suspect it may not be as fast to assign as an OID, but it's not like you'd be doing this all the time. What does that have to do with the thread? The original bit was... And finally I have issue with how the new identifiers are allocated. Currently, if you create identifier 'foo', remove identifier 'foo' and create identifier 'bar', the identifier 'bar' will have same id as the old 'foo' identifier. This can be problem because the identifier id is used as origin of the data and the replication solution using the replication identifiers can end up thinking that data came from node 'bar' even though they came from the node 'foo' which no longer exists. This can have bad effects for example on conflict detection or debugging problems with replication. Maybe another reason to use standard Oids? Wasn't the reason for using OIDs so that we're not doing the equivalent of max(identifier) + 1? Perhaps I'm just confused... -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_restore -t should match views, matviews, and foreign tables
On Tue, Apr 7, 2015 at 1:33 PM, Tom Lane t...@sss.pgh.pa.us wrote: Peter Eisentraut pete...@gmx.net writes: On 3/31/15 11:01 PM, Craig Ringer wrote: this patch adds support for views, foreign tables, and materialised views to the pg_restore -t flag. I think this is a good change. Any concerns? Are we happy with pg_dump/pg_restore not distinguishing these objects by type? It seems rather analogous to letting ALTER TABLE work on views etc. Personally I'm fine with this, but certainly some people have complained about that approach so far as ALTER is concerned. (But the implication would be that we'd need four distinct switches, which is not an outcome I favor.) The pg_dump documentation for the equivalent -t switch states: Dump only tables (or views or sequences or foreign tables) matching table Does pg_dump need to be updated to address materialized views here? Does pg_restore need to be updated to address sequences here? ISTM that the two should mirror each other. David J.
Re: [HACKERS] rare avl shutdown slowness (related to signal handling)
Alvaro Herrera alvhe...@2ndquadrant.com writes: Tom Lane wrote: I think changing the outer for(;;) to while (!got_SIGTERM) would be a much safer fix. Ah, yeah. I was thinking in changing PG_exception_stack once shutdown was requested, but this is much simpler. Your proposed patch seems to be doing both of those, which is probably unnecessary. I don't object to the SIGHUP test and goto in the error path, but I'd put it a lot further down, like after the existing RESUME_INTERRUPTS. I doubt it's a good idea to skip the transaction cleanup steps. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support UPDATE table SET(*)=...
On 4/7/15 2:00 PM, Alvaro Herrera wrote: Tom Lane wrote: I spent a fair amount of time cleaning this patch up to get it into committable shape, but as I was working on the documentation I started to lose enthusiasm for it, because I was having a hard time coming up with compelling examples. The originally proposed motivation was "It solves the problem of doing UPDATE from a record variable of the same type as the table", e.g.

    update foo set (*) = (select foorec.*) where ...;

but it feels to me that this is not actually a very good solution to that problem --- even granting that the problem needs to be solved, a claim that still requires some justification IMO.

How about an UPDATE run inside a plpgsql function, which is using a row variable of the table type? You could assign values to individual columns of a row variable, and run the multicolumn UPDATE last.

Along similar lines, I've often wanted something akin to *, but allowing finer control over what you got. Generally when I want this it's because I really do want everything (as in, I don't want to re-code a bunch of stuff if a column is added), but perhaps not the surrogate key field. Or I want everything, but rename some field to something else. Basically, another way to do what Alvaro is suggesting (though the ability to rename something is new...). If we had that ability I think UPDATE * would become a lot more useful.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] pg_rewind and log messages
Heikki Linnakangas wrote: On 04/07/2015 05:59 AM, Michael Paquier wrote: Fix inconsistent handling of logs in pg_rewind: pg_rewind was handling a couple of things differently compared to the other src/bin utilities: - Logging output needs to be flushed on stderr, not stdout.

Agreed in general. But there's also precedent for printing some stuff to stdout: pg_ctl does that for status messages, like "server starting". As does initdb. I'm pretty unclear on what the rule here is.

One principle that sometimes helps is to consider what happens if you use the command as part of a larger pipeline; progress messages can be read by some other command further down (and perhaps reported in a dialog box, if you embed the program in a GUI, say), but error messages should probably be processed differently; normally the pipeline would be aborted as a whole.
--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Doubt about AccessExclusiveLock in ALTER TABLE .. SET ( .. );
On Mon, Apr 6, 2015 at 12:53 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Fabrízio de Royes Mello wrote: Ok guys. The attached patch refactors the reloptions, adding a new field "lockmode" in the relopt_gen struct and a new method to determine the required lock level from an option list. We need to determine the appropriate lock level for each reloption:

I don't think AccessShareLock is appropriate for any option change. You should be using a lock level that's self-conflicting, as a minimum requirement, to avoid two processes changing the value concurrently.

What lock level do you suggest? Maybe ShareLock?

(I would probably go as far as ensuring that the lock level specified in the table DoLockModesConflict() with itself in an Assert somewhere.)

If I understood correctly, this is to check that the lock levels in the reloption list don't conflict with each other, is it?

Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello
Re: [HACKERS] rare avl shutdown slowness (related to signal handling)
On Tue, Apr 7, 2015 at 2:32 PM, Tom Lane t...@sss.pgh.pa.us wrote: That seems like (a) a hack, and (b) not likely to solve the problem completely, unless you leave interrupts held throughout proc_exit(), which would create all sorts of opportunities for corner case bugs during on_proc_exit hooks.

Hmm, it looks like proc_exit() has already taken care of this by setting proc_exit_inprogress, and StatementCancelHandler() respects it. Actually, in quickdie(), I found a similar practice for the same reason:

    /*
     * Prevent interrupts while exiting; though we just blocked signals that
     * would queue new interrupts, one may have been pending.  We don't want a
     * quickdie() downgraded to a mere query cancel.
     */
    HOLD_INTERRUPTS();

I do feel that we have too many functions instructing how to handle interrupts, and they are subtle - I just found a new friend, HOLD_CANCEL_INTERRUPTS :-(

Regards, Qingqing
Re: [HACKERS] rare avl shutdown slowness (related to signal handling)
Qingqing Zhou zhouqq.postg...@gmail.com writes: I do feel that we have too many functions instructing how to handle interrupts and they are subtle - I just found a new friend HOLD_CANCEL_INTERRUPTS :-( Indeed, which is why I think a patch for this issue should not introduce a new mode/context in which proc_exit() is executed. The code path would be so seldom executed that we'd never manage to isolate any bugs that might be in it. Thus also my objection to Alvaro's patch that had us going to proc_exit from partway through the error cleanup sequence. regards, tom lane
Re: [HACKERS] rare avl shutdown slowness (related to signal handling)
On Tue, Apr 7, 2015 at 4:01 PM, Tom Lane t...@sss.pgh.pa.us wrote: Indeed, which is why I think a patch for this issue should not introduce a new mode/context in which proc_exit() is executed.

Agreed. Along this line, we can add an on_proc_exit hook that simply ereport()s "we are exiting" there. In this way, we reuse all the shields proc_exit() already has. Abuse?

Regards, Qingqing
[HACKERS] Tuple visibility within a single XID
My understanding is that all subtransactions get their own unique XID (assuming they need one), and that CommandId can't move backwards within a transaction. If that's correct, then shouldn't we be able to prune tuples where XMIN and XMAX match our *exact* XID (not all the extra stuff that TransactionIdIsCurrentTransactionId() does) and whose CommandId < CurrentCommandId? I thought of this because of a post to -general. It's certainly not a common case, but it seems like there's not much downside...
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Doubt about AccessExclusiveLock in ALTER TABLE .. SET ( .. );
Fabrízio de Royes Mello wrote: On Mon, Apr 6, 2015 at 12:53 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Fabrízio de Royes Mello wrote: Ok guys. The attached patch refactors the reloptions, adding a new field "lockmode" in the relopt_gen struct and a new method to determine the required lock level from an option list. We need to determine the appropriate lock level for each reloption: I don't think AccessShareLock is appropriate for any option change. You should be using a lock level that's self-conflicting, as a minimum requirement, to avoid two processes changing the value concurrently. What lock level do you suggest? Maybe ShareLock?

ShareUpdateExclusive, probably. ShareLock doesn't conflict with itself.

(I would probably go as far as ensuring that the lock level specified in the table DoLockModesConflict() with itself in an Assert somewhere.) If I understood correctly, this is to check that the lock levels in the reloption list don't conflict with each other, is it?

I mean

    Assert(DoLockModesConflict(relopt_gen->locklevel, relopt_gen->locklevel));

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] pg_rewind and log messages
On Wed, Apr 8, 2015 at 5:53 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Heikki Linnakangas wrote: On 04/07/2015 05:59 AM, Michael Paquier wrote: Fix inconsistent handling of logs in pg_rewind pg_rewind was handling a couple of things differently compared to the other src/bin utilities: - Logging output needs to be flushed on stderr, not stdout Agreed in general. But there's also precedent in printing some stuff to stdout: pg_ctl does that for the status message, like server starting. As does initdb. I'm pretty unclear on what the rule here is. One principle that sometimes helps is to consider what happens if you use the command as part of a larger pipeline; progress messages can be read by some other command further down (and perhaps report them in a dialog box, if you embed the program in a GUI, say), but error messages should probably be processed differently; normally the pipeline would be aborted as a whole. Make sense. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Tuple visibility within a single XID
On Tue, Apr 7, 2015 at 6:11 PM, Peter Geoghegan p...@heroku.com wrote: No. For one thing, unique index enforcement still requires the tuples to be treated as a conflict while the other transaction is running IMV.

Not sure if I understand correctly: in a uniqueness check, we see all possible tuples with a dirty snapshot. For a tuple version inserted and updated by myself, it is surely dead no matter how the transaction ends. So I interpret that we can safely prune the version.

Early pruning may cause some behavior change, though. For example, consider a transaction T1:

    begin;
    -- loop the following statements many times
    insert into pk values (1);  -- uniqueness defined
    delete from pk;
    abort;

If another transaction T2 comes later than T1, and we prune early, then T1 can suddenly hang on insertion waiting for T2 to complete. But does this violate any isolation rule?

Thanks, Qingqing
Re: [HACKERS] Tuple visibility within a single XID
On 4/7/15 8:11 PM, Peter Geoghegan wrote: You're not the first to consider trying something like this in specific scenarios, but my work on UPSERT leads me to believe it isn't workable. Yeah, every time I get into the really nitty-gritty details of this stuff it gets scary. That's why I didn't even bother with a patch before asking. Thanks for setting me straight. :)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Doubt about AccessExclusiveLock in ALTER TABLE .. SET ( .. );
On Tue, Apr 7, 2015 at 10:15 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Fabrízio de Royes Mello wrote: On Mon, Apr 6, 2015 at 12:53 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Fabrízio de Royes Mello wrote: Ok guys. The attached patch refactors the reloptions, adding a new field "lockmode" in the relopt_gen struct and a new method to determine the required lock level from an option list. We need to determine the appropriate lock level for each reloption: I don't think AccessShareLock is appropriate for any option change. You should be using a lock level that's self-conflicting, as a minimum requirement, to avoid two processes changing the value concurrently. What lock level do you suggest? Maybe ShareLock? ShareUpdateExclusive, probably. ShareLock doesn't conflict with itself.

Ok.

(I would probably go as far as ensuring that the lock level specified in the table DoLockModesConflict() with itself in an Assert somewhere.) If I understood correctly, this is to check that the lock levels in the reloption list don't conflict with each other, is it? I mean

    Assert(DoLockModesConflict(relopt_gen->locklevel, relopt_gen->locklevel));

Understood... IMHO the best place to perform this assert is in initialize_reloptions. Thoughts?
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello
Re: [HACKERS] pg_regress writes into source tree
On Wed, Apr 8, 2015 at 10:28 AM, Jim Nasby wrote: Just so this doesn't get lost... did something make it into a CommitFest on this? Peter's patch has been committed as 64cdbbc, while the idea to create sql/ by pg_regress if it is not present did not gather much interest in this CF: https://commitfest.postgresql.org/4/100/ -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Really bad blowups with hash outer join and nulls
On 2/15/15 7:16 PM, Tomas Vondra wrote:

Hi,

On 16.2.2015 00:50, Andrew Gierth wrote: "Tom" == Tom Lane t...@sss.pgh.pa.us writes: I've now tried the attached patch to correct the bucketsize estimates, and it does indeed stop the planner from considering the offending path (in this case it just does the join the other way round). One thing I couldn't immediately see how to do was account for the case where there are a lot of nulls in the table but a strict qual (or an IS NOT NULL) filters them out; this patch will be overly pessimistic about such cases. Do estimates normally try and take things like this into account? I didn't find any other relevant examples.

Improving the estimates is always good, but it's not going to fix the case of non-NULL values (it shouldn't be all that difficult to create such examples with a value whose hash starts with a bunch of zeroes).

Tom: There may also be something we can do in the executor, but it would take closer analysis to figure out what's going wrong. I don't think kluging the behavior for NULL in particular is the answer.

The point with nulls is that a hash value of 0 is currently special in two distinct ways: it's always in batch 0 and bucket 0 regardless of how many batches and buckets there are, and it's the result of hashing a null. These two special cases interact in a worst-case manner, so it seems worthwhile to avoid that.

I think you're right this is a flaw in general - all it takes is a sufficiently common value with a hash value falling into the first batch (i.e. either 0 or starting with a lot of 0 bits, thus giving hashno==0). I think this might be solved by relaxing the check a bit. Currently we do this (see nodeHash.c:735):

    if (nfreed == 0 || nfreed == ninmemory)
    {
        hashtable->growEnabled = false;
    }

which only disables nbatch growth if *all* the tuples remain in the batch. That's rather strict, and it takes a single tuple to break this. With each nbatch increase we expect a 50:50 split, i.e. 1/2 the tuples staying in the batch and 1/2 moved to the new one. So a much higher ratio is suspicious and most likely means the same hash value, so what if we did something like this:

    if ((nfreed >= 0.9 * (nfreed + ninmemory)) ||
        (nfreed <= 0.1 * (nfreed + ninmemory)))
    {
        hashtable->growEnabled = false;
    }

which disables nbatch growth if either >=90% of the tuples remained in the batch, or >=90% of the tuples were moved from it. The exact thresholds might be set a bit differently, but I think 10:90 sounds about right. A trivial patch implementing this is attached - with it the explain analyze looks like this:

test=# explain analyze select status, count(*) from q3 left join m3 on (m3.id=q3.id) group by status;

    QUERY PLAN
    --------------------------------------------------------------
    HashAggregate (cost=64717.63..64717.67 rows=4 width=4) (actual time=514.619..514.630 rows=5 loops=1)
      Group Key: m3.status
      -> Hash Right Join (cost=18100.00..62217.63 rows=50 width=4) (actual time=75.260..467.911 rows=500108 loops=1)
           Hash Cond: (m3.id = q3.id)
           -> Seq Scan on m3 (cost=0.00..18334.00 rows=100 width=37) (actual time=0.003..91.799 rows=100 loops=1)
           -> Hash (cost=7943.00..7943.00 rows=50 width=33) (actual time=74.916..74.916 rows=50 loops=1)
                Buckets: 65536 (originally 65536) Batches: 32 (originally 16) Memory Usage: 8824kB
                -> Seq Scan on q3 (cost=0.00..7943.00 rows=50 width=33) (actual time=0.005..27.395 rows=50 loops=1)
    Planning time: 0.172 ms
    Execution time: 514.663 ms
    (10 rows)

Without the patch this runs in ~240 seconds and the number of batches explodes to ~131k.

In theory it might happen that there are just a few hash values and all of them are exactly the same within the first N bits (thus falling into the first batch), but the N+1 bits are different. So the next split would actually help. But I think we can leave that up to probability, just like all the hash code.

Anything happen with this, or the patch Andrew posted?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Replication identifiers, take 4
On Tue, Apr 7, 2015 at 11:37 PM, Andres Freund and...@anarazel.de wrote: On 2015-04-07 16:30:25 +0200, Andres Freund wrote: And with temp tables (or much more extremely WITH OID tables) and such it's not that hard to reach that point. Oh, and obviously toast data. A couple tables with toasted columns is also a good way to rapidly consume oids. You are forgetting as well large objects on the stack, when client application does not assign an OID by itself. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0
On Mon, Mar 30, 2015 at 9:20 AM, Peter Geoghegan p...@heroku.com wrote: * The code uses LockTupleExclusive to lock rows. That means the fkey key locking doesn't work; That's annoying. This means that using upsert will potentially cause deadlocks in other sessions :(. I think you'll have to determine what lock to acquire by fetching the tuple, doing the key comparison, and acquire the appropriate lock. That should be fine. Looking into the implementation of this. As we discussed, the difficulty about avoiding a lock escalation within ExecUpdate() is that we must fetch the row, run EvalPlanQual() to check if the new row version generated by updating will require a LockTupleExclusive or LockTupleNoKeyExclusive, and then lock the row using the right lockmode, and only then call ExecUpdate(). Right now, UPSERT benefits from fetching and locking the row together, so going this way imposes a little additional complexity. It's probably worth it, though. Why do you think deadlocks will be a particular concern? Sure, it could make the difference between deadlocking and not deadlocking, which is bad, but it's not obvious to me that the risk would be any worse than the risk of deadlocking with FKs in 9.2. Is that really all that bad? -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bringing text abbreviation debug output under the control of trace_sort
On Sun, Apr 5, 2015 at 3:59 PM, Peter Geoghegan p...@heroku.com wrote: Attached patch makes trace_sort control abbreviation debug output for the text opclass, which makes it consistent with the numeric opclass. This seems better than relying on someone going to the trouble of building Postgres themselves to debug cases where abbreviation is the wrong thing, which we're not 100% sure will not occur. It also allows wider analysis of where abbreviation helps the most and the least in production, which is surely a good thing. I have added this to the next commitfest, but I suggest that it be quickly committed to the master branch as a maintenance commit. Done. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bringing text abbreviation debug output under the control of trace_sort
On Tue, Apr 7, 2015 at 7:46 PM, Robert Haas robertmh...@gmail.com wrote: Done. Thank you. -- Peter Geoghegan
Re: [HACKERS] pg_dump / copy bugs with big lines ?
On Mon, Apr 6, 2015 at 1:51 PM, Jim Nasby jim.na...@bluetreble.com wrote: In any case, I don't think it would be terribly difficult to allow a bit more than 1GB in a StringInfo. Might need to tweak palloc too; ISTR there's some 1GB limits there too. The point is, those limits are there on purpose. Changing things arbitrarily wouldn't be hard, but doing it in a principled way is likely to require some thought. For example, in the COPY OUT case, presumably what's happening is that we palloc a chunk for each individual datum, and then palloc a buffer for the whole row. Now, we could let the whole-row buffer be bigger, but maybe it would be better not to copy all of the (possibly very large) values for the individual columns over into a row buffer before sending it. Some refactoring that avoids the need for a potentially massive (1.6TB?) whole-row buffer would be better than just deciding to allow it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
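The refactoring suggested here — stream each field as it is produced rather than assembling one potentially huge whole-row buffer — can be illustrated with a small sketch. This is hypothetical Python, not the actual COPY OUT code; the point is only that streaming bounds memory per column instead of per row:

```python
# Hedged sketch of the idea: with a whole-row buffer, the ~1GB
# allocation limit applies to the concatenation of all column values;
# with streaming, only one datum needs to be materialized at a time.

def copy_out_row_buffered(values):
    """Whole-row-buffer style: build the entire output row in memory."""
    return "\t".join(values) + "\n"

def copy_out_row_streamed(values, send):
    """Streamed style: emit fields and separators one piece at a time,
    so peak memory is bounded by the largest single datum."""
    for i, v in enumerate(values):
        if i:
            send("\t")
        send(v)
    send("\n")

chunks = []
copy_out_row_streamed(["1", "hello", "world"], chunks.append)
assert "".join(chunks) == copy_out_row_buffered(["1", "hello", "world"])
```

The streamed form produces byte-identical output while never holding the full row, which is the property that would let rows exceed the single-allocation limit.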
Re: [HACKERS] rejected vs returned with feedback in new CF app
On Tue, Apr 7, 2015 at 3:35 PM, Peter Eisentraut pete...@gmx.net wrote: On 4/7/15 3:33 PM, Tom Lane wrote: I tried to mark the UPDATE SET (*) patch as returned with feedback, but the CF app informed me that if I did that the patch would automatically be moved to the next commitfest. That seems completely stupid. There is no need to reconsider it unless a new version of the patch is forthcoming (which there may or may not ever be, but that's beside the point for now). When and if the author does submit a new patch, that would be the time to include it in the next commitfest, no? I noticed that as well and have avoided closing some patches because of it. Several people, including me, have complained about this before. I hope that Magnus will fix it soon. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] libpq's multi-threaded SSL callback handling is busted
On 4/3/15 7:44 AM, Jan Urbański wrote: To reiterate: I recognise that not handling the callbacks is not the right answer. But not stomping on callbacks others might have set sounds like a minimal and safe improvement. I think your patch is okay in that it is not a good idea to overwrite or unset someone else's callbacks. But we shouldn't mistake that for fixing the underlying problem. The only reason this patch appears to fix the presented test cases is because other OpenSSL users are also misbehaving and/or the OpenSSL interfaces are so stupid that they cannot be worked with reasonably.
Re: [HACKERS] rejected vs returned with feedback in new CF app
On Wed, Apr 8, 2015 at 11:57 AM, Robert Haas robertmh...@gmail.com wrote: On Tue, Apr 7, 2015 at 3:35 PM, Peter Eisentraut pete...@gmx.net wrote: On 4/7/15 3:33 PM, Tom Lane wrote: I tried to mark the UPDATE SET (*) patch as returned with feedback, but the CF app informed me that if I did that the patch would automatically be moved to the next commitfest. That seems completely stupid. There is no need to reconsider it unless a new version of the patch is forthcoming (which there may or may not ever be, but that's beside the point for now). When and if the author does submit a new patch, that would be the time to include it in the next commitfest, no? I noticed that as well and have avoided closing some patches because of it. Several people, including me, have complained about this before. I hope that Magnus will fix it soon. Yeah, you can find references about that here and there... And the current behavior is confusing. -- Michael
Re: [HACKERS] pg_dump / copy bugs with big lines ?
On Wed, Apr 8, 2015 at 11:53 AM, Robert Haas robertmh...@gmail.com wrote: On Mon, Apr 6, 2015 at 1:51 PM, Jim Nasby jim.na...@bluetreble.com wrote: In any case, I don't think it would be terribly difficult to allow a bit more than 1GB in a StringInfo. Might need to tweak palloc too; ISTR there's some 1GB limits there too. The point is, those limits are there on purpose. Changing things arbitrarily wouldn't be hard, but doing it in a principled way is likely to require some thought. For example, in the COPY OUT case, presumably what's happening is that we palloc a chunk for each individual datum, and then palloc a buffer for the whole row. Now, we could let the whole-row buffer be bigger, but maybe it would be better not to copy all of the (possibly very large) values for the individual columns over into a row buffer before sending it. Some refactoring that avoids the need for a potentially massive (1.6TB?) whole-row buffer would be better than just deciding to allow it. Something to be aware of is that this will also require some rethinking of the existing libpq functions that fetch a row during COPY with PQgetCopyData, to make them able to fetch chunks of data from one row. -- Michael
Re: [HACKERS] Parallel Seq Scan
On Wed, Apr 8, 2015 at 1:53 AM, Kevin Grittner kgri...@ymail.com wrote: David Rowley dgrowle...@gmail.com wrote: If we attempt to do this parallel stuff at plan time, and we happen to plan at some quiet period, or perhaps worse, some application's start-up process happens to PREPARE a load of queries when the database is nice and quiet, then quite possibly we'll end up with some highly parallel queries. Then perhaps come the time these queries are actually executed the server is very busy... Things will fall apart quite quickly due to the masses of IPC and context switches that would be going on. I completely understand that this parallel query stuff is all quite new to us all and we're likely still trying to nail down the correct infrastructure for it to work well, so this is why I'm proposing that the planner should know nothing of parallel query; instead I think it should work more along the lines of: * Planner should be completely oblivious to what parallel query is. * Before executor startup the plan is passed to a function which decides if we should parallelise it, and does so if the plan meets the correct requirements. This should likely have a very fast exit path such as: if root node's cost < parallel_query_cost_threshold then return; /* the query is not expensive enough to attempt to make parallel */ The above check will allow us to have an almost zero overhead for small low-cost queries. This function would likely also have some sort of logic in order to determine if the server has enough spare resources at the current point in time to allow queries to be parallelised. There is a lot to like about this suggestion. I've seen enough performance crashes due to too many concurrent processes (even when each connection can only use a single process) to doubt that, for a plan which will be saved, it is possible to know at planning time whether parallelization will be a nice win or a devastating over-saturation of resources during some later execution phase.
Another thing to consider is that this is not entirely unrelated to the concept of admission control policies. Perhaps this phase could be a more general execution start-up admission control phase, where parallel processing would be one adjustment that could be considered. I think there is always a chance that resources (like parallel workers) won't be available at run time even if we decide about them at executor-start phase, unless we block them for that node's usage; and OTOH if we block those resources (by allocating them) during the executor-start phase, then we might end up blocking them too early, or maybe they won't even get used if we decide not to execute that node. On that basis, it seems to me the current strategy is not bad: we decide during planning time, and later during execution time, if not all resources (particularly parallel workers) are available, we use only the available ones to execute the plan. Going forward, I think we can improve this if we decide not to shut down parallel workers until postmaster shutdown once they are started, and then just allocate them during the executor-start phase. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
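David's proposed fast exit path, combined with the admission-control idea raised here, might look roughly like the following. This is an illustrative Python sketch with invented names (including the `parallel_query_cost_threshold` GUC floated in the mail), not a proposed implementation:

```python
# Hedged sketch of an executor-startup admission check: cheap queries
# exit immediately with zero extra overhead; otherwise we parallelize
# only to the degree that spare worker processes are available right now.

def maybe_parallelize(plan, parallel_query_cost_threshold, spare_workers):
    """Return the plan, possibly annotated with a parallel degree.
    'plan' is a dict with 'total_cost' and 'max_useful_workers'."""
    if plan["total_cost"] < parallel_query_cost_threshold:
        return plan  # not expensive enough to attempt to make parallel
    if spare_workers <= 0:
        return plan  # no capacity at this moment: run the serial plan
    degree = min(spare_workers, plan["max_useful_workers"])
    return dict(plan, parallel_degree=degree)

cheap = {"total_cost": 10.0, "max_useful_workers": 4}
assert "parallel_degree" not in maybe_parallelize(cheap, 1000.0, 8)

big = {"total_cost": 5000.0, "max_useful_workers": 4}
assert maybe_parallelize(big, 1000.0, 2)["parallel_degree"] == 2
```

Amit's objection applies directly to the third branch: the `spare_workers` figure observed at executor start is only a snapshot, and the workers may be gone (or wasted) by the time the node actually runs.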
Re: [HACKERS] Tuple visibility within a single XID
On Tue, Apr 7, 2015 at 5:59 PM, Jim Nasby jim.na...@bluetreble.com wrote: My understanding is that all subtransactions get their own unique XID (assuming they need one), and that CommandId can't move backwards within a transaction. If that's correct, then shouldn't we be able to prune tuples where XMIN and XMAX match our *exact* XID (not all the extra stuff that TransactionIdIsCurrentTransactionId() does) and CommandId < CurrentCommandId? No. For one thing, unique index enforcement still requires the tuples to be treated as a conflict while the other transaction is running IMV. For another, this is necessary today (from ExecUpdate()): /* * The target tuple was already updated or deleted by the * current command, or by a later command in the current * transaction. The former case is possible in a join UPDATE * where multiple tuples join to the same target tuple. This * is pretty questionable, but Postgres has always allowed it: * we just execute the first update action and ignore * additional update attempts. * * The latter case arises if the tuple is modified by a * command in a BEFORE trigger, or perhaps by a command in a * volatile function used in the query. In such situations we * should not ignore the update, but it is equally unsafe to * proceed. We don't want to discard the original UPDATE * while keeping the triggered actions based on it; and we * have no principled way to merge this update with the * previous ones. So throwing an error is the only safe * course. * * If a trigger actually intends this type of interaction, it * can re-execute the UPDATE (assuming it can figure out how) * and then return NULL to cancel the outer update. 
*/
if (hufd.cmax != estate->es_output_cid)
    ereport(ERROR,
            (errcode(ERRCODE_TRIGGERED_DATA_CHANGE_VIOLATION),
             errmsg("tuple to be updated was already modified by an operation triggered by the current command"),
             errhint("Consider using an AFTER trigger instead of a BEFORE trigger to propagate changes to other rows.")));
You're not the first to consider trying something like this in specific scenarios, but my work on UPSERT leads me to believe it isn't workable. -- Peter Geoghegan
Re: [HACKERS] New error code to track unsupported contexts
Jim Nasby wrote: On 11/28/14 11:41 PM, Michael Paquier wrote: Hi all, When pg_event_trigger_dropped_objects is run in a context that is not the one of an event trigger, currently the error code ERRCODE_FEATURE_NOT_SUPPORTED is returned. Wouldn't it be better to have an error to define an out-of-context instead? It seems that it would be a good thing to have more error verbosity for situations like the case above. Note that this idea has been mentioned on this ML a couple of weeks back. In any case, attached is a patch showing the idea. Opinions? Is that worth having? Anything ever happen with this? (FWIW, I'm in favor of it. Reporting the feature isn't supported is confusing...) Not opposed to the idea. Maybe it should be in class 39 'External Routine Invocation Exception' instead, like ERRCODE_E_R_I_E_TRIGGER_PROTOCOL_VIOLATED is used by various trigger functions. We could invent ERRCODE_E_R_I_E_EVENT_TRIGGER_PROTOCOL_VIOLATED with value 39P03, for example. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
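For context on how a value like 39P03 becomes an internal error code: PostgreSQL's elog.h packs each of the five SQLSTATE characters into 6 bits via the MAKE_SQLSTATE macro. The sketch below mirrors that encoding in Python for illustration; it is a reconstruction of the macro's arithmetic, not server code:

```python
# Hedged sketch of PostgreSQL's SQLSTATE packing (MAKE_SQLSTATE /
# PGSIXBIT in src/include/utils/elog.h): each of the five characters
# is stored as (ch - '0') & 0x3F in a 6-bit field of one integer.

def make_sqlstate(code):
    """Pack a five-character SQLSTATE like '39P03' into an int."""
    assert len(code) == 5
    val = 0
    for i, ch in enumerate(code):
        val |= ((ord(ch) - ord('0')) & 0x3F) << (6 * i)
    return val

def unpack_sqlstate(val):
    """Inverse operation, analogous to unpack_sql_state()."""
    return "".join(chr(((val >> (6 * i)) & 0x3F) + ord('0'))
                   for i in range(5))

# The proposed event-trigger error code round-trips cleanly:
assert unpack_sqlstate(make_sqlstate("39P03")) == "39P03"
```

Any new ERRCODE_E_R_I_E_EVENT_TRIGGER_PROTOCOL_VIOLATED would just be this packed form of "39P03" registered in errcodes.txt.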
Re: [HACKERS] pg_regress writes into source tree
On 2/13/15 11:20 PM, Michael Paquier wrote: On Sat, Feb 14, 2015 at 6:24 AM, Andrew Dunstan and...@dunslane.net wrote: On 12/18/2014 06:05 PM, Andrew Dunstan wrote: On 12/18/2014 03:02 AM, Michael Paquier wrote: On Fri, Dec 12, 2014 at 10:45 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Another thing in that patch was that I had to add the sql/ directory to the source tree, but other than that .gitignore file it was empty. Maybe pg_regress should create the sql/ directory in the build dir if it doesn't exist. This is only a problem if a pg_regress suite only runs stuff from input/, because otherwise the sql/ dir already exists in the source. +1 for having pg_regress create the sql/ directory when it does not exist. Current behavior is annoying when modules having only tests in input/... That seems like a separate issue. I think Peter should commit his patch and backpatch it immediately, and we can deal with the missing sql directory when someone sends in a patch. What's happened on this? Nothing has been committed, and as far as I understood this patch could have been simply pushed... Just so this doesn't get lost... did something make it into a CommitFest on this? -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Parallel Seq Scan
On Sat, Apr 4, 2015 at 5:19 AM, David Rowley dgrowle...@gmail.com wrote: Going over the previous emails in this thread I see that it has been a long time since anyone discussed anything around how we might decide at planning time how many workers should be used for the query, and from the emails I don't recall anyone proposing a good idea about how this might be done, and I for one can't see how this is at all possible to do at planning time. I think that the planner should know nothing of parallel query at all, and the planner quite possibly should go completely unmodified for this patch. One major problem I can see is that, given a query such as: SELECT * FROM million_row_product_table WHERE category = 'ELECTRONICS'; Where we have a non-unique index on category, some plans which may be considered might be: 1. Index scan on the category index to get all rows matching 'ELECTRONICS' 2. Sequence scan on the table, filter matching rows. 3. Parallel plan which performs a series of partial sequence scans pulling out all matching rows. I really think that if we end up choosing things like plan 3, when plan 2 was thrown out because of its cost, then we'll end up consuming more CPU and I/O than we can possibly justify using. The environmentalist in me screams that this is wrong. What if we kicked off 128 worker processes on some high-end hardware to do this? I certainly wouldn't want to pay the power bill. I understand there's costing built in to perhaps stop this, but I still think it's wrong-headed, and we need to still choose the fastest non-parallel plan and only consider parallelising that later. I agree that this is an area that needs more thought. I don't (currently, anyway) agree that the planner shouldn't know anything about parallelism. The problem with that is that there's lots of relevant stuff that can only be known at plan time. For example, consider the query you mention above on a table with no index. 
If the WHERE clause is highly selective, a parallel plan may well be best. But if the selectivity is only, say, 50%, a parallel plan is stupid: the IPC costs of shipping many rows back to the master will overwhelm any benefit we could possibly have hoped to get, and the overall result will likely be that the parallel plan both runs slower and uses more resources. At plan time, we have the selectivity information conveniently at hand, and can use that as part of the cost model to make educated decisions. Execution time is way too late to be thinking about those kinds of questions. I think one of the philosophical questions that has to be answered here is what does it mean to talk about the cost of a parallel plan?. For a non-parallel plan, the cost of the plan means both the amount of effort we will spend executing the plan and also the amount of time we think the plan will take to complete, but those two things are different for parallel plans. I'm inclined to think it's right to view the cost of a parallel plan as a proxy for execution time, because the fundamental principle of the planner is that we pick the lowest-cost plan. But there also clearly needs to be some way to prevent the selection of a plan which runs slightly faster at the cost of using vastly more resources. Currently, the planner tracks the best unsorted path for each relation as well as the best path for each useful sort order. Suppose we treat parallelism as another axis for judging the quality of a plan: we keep the best unsorted, non-parallel path; the best non-parallel path for each useful sort order; the best unsorted, parallel path; and the best parallel path for each sort order. Each time we plan a node, we generate non-parallel paths first, and then parallel paths. But, if a parallel plan isn't markedly faster than the non-parallel plan for the same sort order, then we discard it. 
I'm not sure exactly what the thresholds should be here, and they probably need to be configurable, because on a single-user system with excess capacity available it may be absolutely desirable to use ten times the resources to get an answer 25% faster, but on a heavy-loaded system that will stink. Some ideas for GUCs: max_parallel_degree = The largest number of processes we'll consider using for a single query. min_parallel_speedup = The minimum percentage by which a parallel path must be cheaper (in terms of execution time) than a non-parallel path in order to survive. I'm imagining the default here might be something like 15%. min_parallel_speedup_per_worker = Like the previous one, but per worker. e.g. if this is 5%, which might be a sensible default, then a plan with 4 workers must be at least 20% better to survive, but a plan using only 2 workers only needs to be 10% better. An additional benefit of this line of thinking is that planning would always produce a best non-parallel path. And sometimes, there would also be a best parallel path that is expected to run faster. We could then choose
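The pruning rule proposed above can be sketched concretely. The GUC names and defaults here are the ones floated in the mail, not anything committed, and how the flat and per-worker thresholds would combine is unspecified in the proposal — taking the larger of the two is an assumption of this sketch:

```python
# Hedged sketch of the proposed parallel-path pruning rule: a parallel
# path survives only if its (execution-time) cost beats the serial path
# for the same sort order by a sufficient margin.

def keep_parallel_path(serial_cost, parallel_cost, n_workers,
                       min_parallel_speedup=0.15,
                       min_parallel_speedup_per_worker=0.05):
    """Return True if the parallel path should be retained.
    Required speedup: the larger of the flat threshold and the
    per-worker threshold times the worker count (assumed combination)."""
    required = max(min_parallel_speedup,
                   min_parallel_speedup_per_worker * n_workers)
    return parallel_cost <= serial_cost * (1.0 - required)

# With the defaults, a 4-worker plan must be at least 20% cheaper:
assert not keep_parallel_path(100.0, 85.0, 4)  # only 15% better: discard
assert keep_parallel_path(100.0, 75.0, 4)      # 25% better: keep
```

Applying this test each time a node is planned is what keeps the path lists from filling up with parallel plans that burn many workers for marginal wins.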
Re: [HACKERS] Tuple visibility within a single XID
On Tue, Apr 7, 2015 at 7:16 PM, Qingqing Zhou zhouqq.postg...@gmail.com wrote: If another transaction T2 coming later than T1, and if we prune early, then T1 can suddenly hang on insertion waiting for T2 to complete. But does this violate any isolation rule? Well, it means that you don't lock a row that you delete (and values that appeared in that row if they're constrained by a unique index) just because you happened to also insert that row. That would probably break client code. -- Peter Geoghegan
[HACKERS] Re: File count restriction of directory limits number of relations inside a database.
Hi, yes, you are right: ext4 allows more directory entries (more than 32000), but we limited the number of files inside the directory to 32000 to get better performance. Sorry, I did not mention that in my post. That is the reason we plan to use tablespaces. The problem we faced with tablespaces is that the location must be present on both master and slave, and we need to create multiple tablespaces. That is why I changed the source to create a subdirectory under the given location and use that location for tablespace creation, so I can give one location (present on both master and slave) to create multiple tablespaces. - sudalai -- View this message in context: http://postgresql.nabble.com/File-count-restriction-of-directory-limits-number-of-relations-inside-a-database-tp5844711p5845044.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
Re: [HACKERS] Parallel Seq Scan
On Wed, Apr 8, 2015 at 7:54 AM, Robert Haas robertmh...@gmail.com wrote: I agree that this is an area that needs more thought. I don't (currently, anyway) agree that the planner shouldn't know anything about parallelism. The problem with that is that there's lots of relevant stuff that can only be known at plan time. For example, consider the query you mention above on a table with no index. If the WHERE clause is highly selective, a parallel plan may well be best. But if the selectivity is only, say, 50%, a parallel plan is stupid: the IPC costs of shipping many rows back to the master will overwhelm any benefit we could possibly have hoped to get, and the overall result will likely be that the parallel plan both runs slower and uses more resources. At plan time, we have the selectivity information conveniently at hand, and can use that as part of the cost model to make educated decisions. Execution time is way too late to be thinking about those kinds of questions. I think one of the philosophical questions that has to be answered here is what does it mean to talk about the cost of a parallel plan?. For a non-parallel plan, the cost of the plan means both the amount of effort we will spend executing the plan and also the amount of time we think the plan will take to complete, but those two things are different for parallel plans. I'm inclined to think it's right to view the cost of a parallel plan as a proxy for execution time, because the fundamental principle of the planner is that we pick the lowest-cost plan. But there also clearly needs to be some way to prevent the selection of a plan which runs slightly faster at the cost of using vastly more resources. Currently, the planner tracks the best unsorted path for each relation as well as the best path for each useful sort order. 
Suppose we treat parallelism as another axis for judging the quality of a plan: we keep the best unsorted, non-parallel path; the best non-parallel path for each useful sort order; the best unsorted, parallel path; and the best parallel path for each sort order. Each time we plan a node, we generate non-parallel paths first, and then parallel paths. But, if a parallel plan isn't markedly faster than the non-parallel plan for the same sort order, then we discard it. One disadvantage of retaining parallel-paths could be that it can increase the number of combinations planner might need to evaluate during planning (in particular during join path evaluation) unless we do some special handling to avoid evaluation of such combinations. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
On Wed, Apr 8, 2015 at 1:57 AM, Sawada Masahiko sawada.m...@gmail.com wrote: On Wed, Apr 8, 2015 at 1:11 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Tue, Apr 7, 2015 at 1:04 PM, Sawada Masahiko sawada.m...@gmail.com wrote: On Tue, Apr 7, 2015 at 10:12 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Tue, Apr 7, 2015 at 7:22 AM, Sawada Masahiko sawada.m...@gmail.com wrote: Thank you for your reviewing. I modified the patch and attached latest version patch(v7). Please have a look it. Looks good to me. Attached patch (v8) just fix a tab indentation in gram.y. I had forgotten fix a tab indentation, sorry. Thank you for reviewing! It looks good to me too. Can this patch be marked as Ready for Committer? +1 Changed status to Ready for Committer. The patch adds new syntax like REINDEX ... WITH VERBOSE, i.e., () is not added after WITH clause. Did we reach the consensus about this syntax? The last email from Robert just makes me think that () should be added into the syntax. Regards, -- Fujii Masao
[HACKERS] Removal of FORCE option in REINDEX
Hi, While reviewing the REINDEX VERBOSE patch, I felt inclined to remove FORCE option support from the REINDEX command. It has been marked obsolete since the very old version 7.4. I think that it's no longer worth keeping support for it. Thoughts? Regards, -- Fujii Masao
Re: [HACKERS] Removal of FORCE option in REINDEX
On Wednesday, April 8, 2015, Fujii Masao masao.fu...@gmail.com wrote: Hi, While reviewing the REINDEX VERBOSE patch, I felt inclined to remove FORCE option support from REINDEX command. It has been marked obsolete since very old version 7.4. I think that it's no longer worth keeping supporting it. Thought? +1 -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog: http://fabriziomello.github.io Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello Github: http://github.com/fabriziomello