Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly

2020-01-07 Thread Michael Paquier
On Sat, Jan 04, 2020 at 09:38:24PM +0300, Alexey Kondratov wrote:
> Finally, I have also merged and unified all your and Masahiko's proposals
> with my recent changes: ereport corrections, tab-completion, docs update,
> copy/equalfuncs update, etc. New version is attached. Has it come any
> closer to a committable state now?

I have not yet reviewed this patch in detail (I have that on my
TODO), but at a quick glance what you have here is rather close to what
I'd expect to be committable, as the tablespace OID assignment in
your patch is consistent in the REINDEX code paths with the existing
ALTER TABLE handling.
--
Michael




Re: table partitioning and access privileges

2020-01-07 Thread Amit Langote
On Fri, Dec 27, 2019 at 4:26 AM Tom Lane  wrote:
> Fujii Masao  writes:
> > My customer reported to me that the queries through a partitioned table
> > ignore each partition's SELECT, INSERT, UPDATE, and DELETE privileges,
> > on the other hand, only TRUNCATE privilege specified for each partition
> > is applied. I'm not sure if this behavior is expected or not. But anyway
> > is it better to document that? For example,
>
> > Access privileges may be defined and removed separately for each partition.
> > But note that queries through a partitioned table ignore each partition's
> > SELECT, INSERT, UPDATE and DELETE privileges, and apply only the TRUNCATE one.
>
> I believe it's intentional that we only check access privileges on
> the table explicitly named in the query.  So I'd say SELECT etc
> are doing the right thing, and if TRUNCATE isn't in step with them
> that's a bug to fix, not something to document.

I tend to agree that TRUNCATE's permission model for inheritance
should be consistent with that for the other commands.  How about the
attached patch toward that end?
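
For illustration, the behavior this aims for could be exercised with a
sketch along these lines (the role and table names here are made up, not
taken from the patch's regression tests):

  CREATE TABLE p (a int) PARTITION BY LIST (a);
  CREATE TABLE p1 PARTITION OF p FOR VALUES IN (1);
  CREATE ROLE truncator;
  GRANT TRUNCATE ON p TO truncator;   -- no grant on the partition p1

  SET ROLE truncator;
  TRUNCATE p;    -- should succeed: only the named table's privilege is checked
  TRUNCATE p1;   -- should still fail: no TRUNCATE privilege on p1 itself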

Thanks,
Amit
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 1c4394abea..43377e24c4 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -298,6 +298,7 @@ struct DropRelationCallbackState
((child_is_partition) ? DEPENDENCY_AUTO : DEPENDENCY_NORMAL)
 
 static void truncate_check_rel(Oid relid, Form_pg_class reltuple);
+static void truncate_check_perms(Oid relid, Form_pg_class reltuple);
 static void truncate_check_activity(Relation rel);
 static void RangeVarCallbackForTruncate(const RangeVar *relation,
 										 Oid relId, Oid oldRelId, void *arg);
@@ -1575,6 +1576,10 @@ ExecuteTruncate(TruncateStmt *stmt)
}
 
 				truncate_check_rel(RelationGetRelid(rel), rel->rd_rel);
+				/*
+				 * We skip checking the child's permission, because we
+				 * already checked the parent's.
+				 */
truncate_check_activity(rel);
 
rels = lappend(rels, rel);
@@ -1660,6 +1665,7 @@ ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
 						(errmsg("truncate cascades to table \"%s\"",
 								RelationGetRelationName(rel))));
truncate_check_rel(relid, rel->rd_rel);
+   truncate_check_perms(relid, rel->rd_rel);
truncate_check_activity(rel);
rels = lappend(rels, rel);
relids = lappend_oid(relids, relid);
@@ -1910,7 +1916,6 @@ ExecuteTruncateGuts(List *explicit_rels, List *relids, List *relids_logged,
 static void
 truncate_check_rel(Oid relid, Form_pg_class reltuple)
 {
-   AclResult   aclresult;
char   *relname = NameStr(reltuple->relname);
 
/*
@@ -1924,12 +1929,6 @@ truncate_check_rel(Oid relid, Form_pg_class reltuple)
(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 errmsg("\"%s\" is not a table", relname)));
 
-   /* Permissions checks */
-   aclresult = pg_class_aclcheck(relid, GetUserId(), ACL_TRUNCATE);
-   if (aclresult != ACLCHECK_OK)
-		aclcheck_error(aclresult, get_relkind_objtype(reltuple->relkind),
-					   relname);
-
if (!allowSystemTableMods && IsSystemClass(relid, reltuple))
ereport(ERROR,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
@@ -1939,6 +1938,22 @@ truncate_check_rel(Oid relid, Form_pg_class reltuple)
InvokeObjectTruncateHook(relid);
 }
 
+/*
+ * Check that current user has the permission to truncate given relation.
+ */
+static void
+truncate_check_perms(Oid relid, Form_pg_class reltuple)
+{
+   char   *relname = NameStr(reltuple->relname);
+   AclResult   aclresult;
+
+   /* Permissions checks */
+   aclresult = pg_class_aclcheck(relid, GetUserId(), ACL_TRUNCATE);
+   if (aclresult != ACLCHECK_OK)
+		aclcheck_error(aclresult, get_relkind_objtype(reltuple->relkind),
+					   relname);
+}
+
 /*
  * Set of extra sanity checks to check if a given relation is safe to
  * truncate.  This is split with truncate_check_rel() as
@@ -14799,6 +14814,7 @@ RangeVarCallbackForTruncate(const RangeVar *relation,
elog(ERROR, "cache lookup failed for relation %u", relId);
 
truncate_check_rel(relId, (Form_pg_class) GETSTRUCT(tuple));
+	truncate_check_perms(relId, (Form_pg_class) GETSTRUCT(tuple));

Re: pgbench - use pg logging capabilities

2020-01-07 Thread Michael Paquier
On Mon, Jan 06, 2020 at 01:36:23PM +0100, Fabien COELHO wrote:
>> I think that I would just remove the "debug" variable defined in
>> pgbench.c altogether, and switch the messages for the duration and the
>> one in executeMetaCommand to use info-level logging..
> 
> Ok, done.

Thanks.  The output then gets kind of inconsistent when using --debug:
pgbench: client 2 executing script ""
client 2 executing \set aid
client 2 executing \set bid
client 2 executing \set tid
client 2 executing \set delta

My point was to just modify the code so that this uses pg_log_debug(),
with a routine doing some reverse-engineering of the Command data to
generate a string to show up in the logs.  Sorry for the confusion..
And there is no need to use __pg_log_level either, which should remain
internal to logging.h IMO.

We'd likely want a similar business in syntax_error() to be fully
consistent with all other code paths dealing with an error showing up
before exiting.

No idea what others think here.  I may be too pedantic.
--
Michael




Re: pgbench - use pg logging capabilities

2020-01-07 Thread Fabien COELHO


Bonjour Michaël,


>>> I think that I would just remove the "debug" variable defined in
>>> pgbench.c altogether, and switch the messages for the duration and the
>>> one in executeMetaCommand to use info-level logging..
>>
>> Ok, done.
>
> Thanks.  The output then gets kind of inconsistent when using --debug:
> pgbench: client 2 executing script ""
> client 2 executing \set aid
> client 2 executing \set bid
> client 2 executing \set tid
> client 2 executing \set delta
>
> My point was to just modify the code so that this uses pg_log_debug(),
> with a routine doing some reverse-engineering of the Command data to
> generate a string to show up in the logs.  Sorry for the confusion..
> And there is no need to use __pg_log_level either, which should remain
> internal to logging.h IMO.


For the first case with the output you point out, there is a loop to
generate the output. I do not think that we want to pay the cost of
generating the string and then throwing it away afterwards when not under
debug, especially as string manipulation is not that cheap, so we need to
enter that code only when under debug. However, there is no easy way to do
that without accessing __pg_log_level. It could be hidden in a macro yet to
be created, but that's it.


For the second case I called pg_log_debug just once.


> We'd likely want a similar business in syntax_error() to be fully
> consistent with all other code paths dealing with an error showing up
> before exiting.


The syntax error is kind of complicated because there is the location
information, which is better left as is, IMHO. I moved the remainder to a
PQExpBuffer and pg_log_fatal.



> No idea what others think here.  I may be too pedantic.


Maybe a little:-)

Note that I submitted another patch to use PQExpBuffer wherever possible 
in pgbench, especially to get rid of doubtful snprintf/strlen patterns.


Patch v5 attached tries to follow your above suggestions.

--
Fabien.
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index a1e0663c8b..2e7f873702 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -70,7 +70,6 @@
 #define M_PI 3.14159265358979323846
 #endif
 
-
 #define ERRCODE_UNDEFINED_TABLE  "42P01"
 
 /*
@@ -541,8 +540,6 @@ static ParsedScript sql_script[MAX_SCRIPTS];	/* SQL script files */
 static int	num_scripts;		/* number of scripts in sql_script[] */
 static int64 total_weight = 0;
 
-static int	debug = 0;			/* debug flag */
-
 /* Builtin test scripts */
 typedef struct BuiltinScript
 {
@@ -787,14 +784,12 @@ strtoint64(const char *str, bool errorOK, int64 *result)
 
 out_of_range:
 	if (!errorOK)
-		fprintf(stderr,
-"value \"%s\" is out of range for type bigint\n", str);
+		pg_log_error("value \"%s\" is out of range for type bigint", str);
 	return false;
 
 invalid_syntax:
 	if (!errorOK)
-		fprintf(stderr,
-"invalid input syntax for type bigint: \"%s\"\n", str);
+		pg_log_error("invalid input syntax for type bigint: \"%s\"", str);
 	return false;
 }
 
@@ -810,16 +805,14 @@ strtodouble(const char *str, bool errorOK, double *dv)
 	if (unlikely(errno != 0))
 	{
 		if (!errorOK)
-			fprintf(stderr,
-	"value \"%s\" is out of range for type double\n", str);
+			pg_log_error("value \"%s\" is out of range for type double", str);
 		return false;
 	}
 
 	if (unlikely(end == str || *end != '\0'))
 	{
 		if (!errorOK)
-			fprintf(stderr,
-	"invalid input syntax for type double: \"%s\"\n", str);
+			pg_log_error("invalid input syntax for type double: \"%s\"", str);
 		return false;
 	}
 	return true;
@@ -1149,7 +1142,8 @@ executeStatement(PGconn *con, const char *sql)
 	res = PQexec(con, sql);
 	if (PQresultStatus(res) != PGRES_COMMAND_OK)
 	{
-		fprintf(stderr, "%s", PQerrorMessage(con));
+		pg_log_fatal("query failed: %s", PQerrorMessage(con));
+		pg_log_info("query was: %s", sql);
 		exit(1);
 	}
 	PQclear(res);
@@ -1164,8 +1158,8 @@ tryExecuteStatement(PGconn *con, const char *sql)
 	res = PQexec(con, sql);
 	if (PQresultStatus(res) != PGRES_COMMAND_OK)
 	{
-		fprintf(stderr, "%s", PQerrorMessage(con));
-		fprintf(stderr, "(ignoring this error and continuing anyway)\n");
+		pg_log_error("%s", PQerrorMessage(con));
+		pg_log_info("(ignoring this error and continuing anyway)");
 	}
 	PQclear(res);
 }
@@ -1211,8 +1205,7 @@ doConnect(void)
 
 		if (!conn)
 		{
-			fprintf(stderr, "connection to database \"%s\" failed\n",
-	dbName);
+			pg_log_error("connection to database \"%s\" failed", dbName);
 			return NULL;
 		}
 
@@ -1230,8 +1223,8 @@ doConnect(void)
 	/* check to see that the backend connection was successfully made */
 	if (PQstatus(conn) == CONNECTION_BAD)
 	{
-		fprintf(stderr, "connection to database \"%s\" failed:\n%s",
-dbName, PQerrorMessage(conn));
+		pg_log_error("connection to database \"%s\" failed: %s",
+	 dbName, PQerrorMessage(conn));
 		PQfinish(conn);
 		return NULL;
 	}
@@ -1360,9 +1353,8 @@ makeVariableValue(Variable *var)
 
 		if (!strtodouble(var->svalue, true, &dv))
 		{
-			fprintf(stderr,
-	"malformed variable \"%s\"

Re: WIP: System Versioned Temporal Table

2020-01-07 Thread Kyotaro Horiguchi
Hello.

Isn't this patch somehow broken?

At Mon, 28 Oct 2019 16:36:09 +0100, Vik Fearing  
wrote in 
> On 28/10/2019 13:48, Surafel Temesgen wrote:
> >
> >
> > On Fri, Oct 25, 2019 at 10:45 PM Vik Fearing wrote:
> >
> > >
> > >     I don't understand what you mean by this.
> > >
> > >
> > >
> > The primary purpose of adding the row end time to the primary key is to
> > allow duplicate values to be inserted into a table while keeping the
> > constraint for current data, but allowing duplicates in history data.
> > Adding the row start time column to the primary key would eliminate this
> > uniqueness for current data, which is not correct.
> >
> >
> > How?  The primary/unique keys must always be unique at every point
> > in time.
> >
> >
> > From a user's perspective it is acceptable to delete and reinsert a
> > record with the same key value multiple times, which means there will be
> > multiple records with the same key value in the history data, but there
> > is only one value in the current data, as in a table without system
> > versioning. I added the row end time column to the primary key to allow
> > user-supplied primary key values to be duplicated in history data, which
> > is acceptable.
> >
> 
> Yes, I understand that.  I'm saying you should also add the row start
> time column.

I think that the start and end timestamps represent the period where
that version of the row was active. So UPDATE should set the start
timestamp of the new version and the end timestamp of the old version
to the same value, the time of the update. Thus, I don't think adding
the start timestamp to the PK works as expected. That would hinder us
from rejecting rows with the same user-defined unique key, because the
start timestamp is different on each insert. I think what Surafel is
doing in the current patch is correct. Only end_timestamp = +infinity
rejects another non-historical (= live) version with the same
user-defined unique key.
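
To illustrate the point, a rough sketch (using the WIP syntax from this
patch, so details may differ):

 CREATE TABLE t (
   id int PRIMARY KEY,
   s timestamp(6) GENERATED ALWAYS AS ROW START,
   e timestamp(6) GENERATED ALWAYS AS ROW END
 ) WITH SYSTEM VERSIONING;
 -- the patch appends the row-end column to the primary key internally

 INSERT INTO t (id) VALUES (1);   -- live row for id = 1
 DELETE FROM t WHERE id = 1;      -- that version moves to history
 INSERT INTO t (id) VALUES (1);   -- accepted again: uniqueness is only
                                  -- required among live rows (e = +infinity)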

I'm not sure why the patch numbering starts from "0002", but anyway it
applied on e369f37086. Then I ran make distclean, ./configure, make all,
make install, and initdb, and started the server after all of them.

First, I tried to create a temporal table.

When I used timestamptz as the type of the versioning columns, the ALTER
and CREATE commands below ended with a server crash.

 "CREATE TABLE t (a int, s timestamptz GENERATED ALWAYS AS ROW START, e 
timestamptz GENERATED ALWAYS AS ROW END);"
 (CREATE TABLE t (a int);)
 "ALTER TABLE t ADD COLUMN s timestamptz GENERATED ALWAYS AS ROW START"
 "ALTER TABLE t ADD COLUMN s timestamptz GENERATED ALWAYS AS ROW START, ADD 
COLUMN e timestamptz GENERATED ALWAYS AS ROW END"

If I added the start/end timestamp columns to an existing table, it
returned a puzzling error.

 ALTER TABLE t ADD COLUMN s timestamp(6) GENERATED ALWAYS AS ROW START;
 ERROR:  column "s" contains null values
 ALTER TABLE t ADD COLUMN s timestamp(6) GENERATED ALWAYS AS ROW START, ADD 
COLUMN e timestamp(6) GENERATED ALWAYS AS ROW END;
 ERROR:  column "s" contains null values


When I defined only the start column, SELECT on the table crashed.

 "CREATE TABLE t (s timestamp(6) GENERATED ALWAYS AS ROW START);"
 "SELECT * from t;"
 (crashed)

The following command ended with an ERROR whose cause I cannot understand;
I expected the command to be accepted.

  ALTER TABLE t ADD COLUMN start timestamp(6) GENERATED ALWAYS AS ROW START, 
ADD COLUMN end timestamp(6) GENERATED ALWAYS AS ROW END;
  ERROR:  syntax error at or near "end"

I didn't examine it further, but the syntax part doesn't seem well
designed, and the body part seems vulnerable to unexpected input.


I ran a few queries:

SELECT * shows the timestamp columns; don't we need to hide the period
timestamp columns from this query?

I think UPDATE needs to update the start timestamp, but it doesn't. As a
result the timestamps don't represent the correct lifetime of the row
version, and we wouldn't be able to pick up the correct versions of a row
that experienced updates. (I didn't confirm that, because I couldn't run a
"FOR SYSTEM_TIME AS OF" query due to a syntax error.)

(Sorry in advance for possible pointless comments due to my lack of
access to the SQL:2011 standard.)  If the period-timestamp columns are
placed before the other columns, an INSERT written in the common way
fails, which doesn't seem to me to be the expected behavior:

 CREATE TABLE t (s timestamp(6) GENERATED ALWAYS AS ROW START, e timestamp(6) 
GENERATED ALWAYS AS ROW END, a int) WITH SYSTEM VERSIONING;
 INSERT INTO t (SELECT a FROM generate_series(0, 99) a);
 ERROR:  column "s" is of type timestamp without time zone but expression is of 
type integer

Some queries using SYSTEM_TIME which I think should be accepted end with
an error. Is the grammar part missing something?

 SELECT * FROM t FOR SYSTEM_TIME AS OF '2020-01-07 09:57:55';
 ERROR:  syntax error at or near "system_time"
 LINE 1: SELECT * FROM t FOR SYSTEM_TIME AS OF '2020-01-07 09:57:55';


Re: Errors when update a view with conditional-INSTEAD rules

2020-01-07 Thread Dean Rasheed
On Sat, 4 Jan 2020 at 18:12, Dean Rasheed  wrote:
>
> On Sat, 4 Jan 2020 at 17:13, Tom Lane  wrote:
> >
> > Dean Rasheed  writes:
> > > That included a change to rewriteTargetListIU() to prevent it from
> > > adding dummy targetlist entries for unassigned-to attributes for
> > > auto-updatable views, in case they are no longer simple references to
> > > the underlying relation. Instead, that is left to expand_targetlist(),
> > > as for a normal table. However, in this case (an UPDATE on a view with
> > > a conditional rule), the target relation of the original query isn't
> > > rewritten (we leave it to the executor to report the error), and so
> > > expand_targetlist() ends up adding a new targetlist entry that
> > > references the target relation, which is still the original view.
> >
> > So why did we leave it to the executor to throw an error?  I have
> > a feeling it was either because the rewriter didn't have (easy?)
> > access to the info, or it seemed like it'd be duplicating code.
> >
> I think that the required information is easily available in the
> rewriter ...

Here's a patch along those lines. Yes, it's a little more code
duplication, but I think it's worth it for the more detailed error.
There was no previous regression test coverage of this case so I added
some (all other test output is unaltered).

The existing comment in the executor check clearly implied that it
thought that error was unreachable there, and I think it now is, but
it seems worth leaving it just in case.
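
For context, a hypothetical sketch of the kind of case involved (not the
exact regression test added by the patch):

  CREATE TABLE base (a int, b text);
  CREATE VIEW v AS SELECT * FROM base;
  CREATE RULE v_upd AS ON UPDATE TO v WHERE old.a > 0
    DO INSTEAD UPDATE base SET b = new.b WHERE a = old.a;

  -- v now has only a conditional INSTEAD rule, no unconditional rule and
  -- no INSTEAD OF trigger, so with the patch the rewriter reports that the
  -- view is not automatically updatable (with errdetail), rather than
  -- letting the planner/executor fail later.
  UPDATE v SET b = 'x';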

Regards,
Dean
From e9036832192b27b11f1beff5871618f349477819 Mon Sep 17 00:00:00 2001
From: Dean Rasheed 
Date: Tue, 7 Jan 2020 09:17:12 +
Subject: [PATCH] Make rewriter prevent auto-updates on views with conditional
 INSTEAD rules.

A view with conditional INSTEAD rules and no unconditional INSTEAD
rules or INSTEAD OF triggers is not auto-updatable. Previously we
relied on a check in the executor to catch this, but that's
problematic since the planner may fail to properly handle such a query
and thus return a particularly unhelpful error to the user.

Instead, trap this in the rewriter and report the correct error there.
Doing so also allows us to include more useful error detail than the
executor check can provide.

Per report from Pengzhou Tang.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/CAG4reAQn+4xB6xHJqWdtE0ve_WqJkdyCV4P=tryr4kn8_3_...@mail.gmail.com
---
 src/backend/executor/execMain.c   |  8 ++--
 src/backend/rewrite/rewriteHandler.c  | 60 ---
 src/test/regress/expected/updatable_views.out | 21 ++
 src/test/regress/sql/updatable_views.sql  | 14 +++
 4 files changed, 94 insertions(+), 9 deletions(-)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 4181a7e343..b03e02ae6c 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -1101,10 +1101,10 @@ CheckValidResultRel(ResultRelInfo *resultRelInfo, CmdType operation)
 
 			/*
 			 * Okay only if there's a suitable INSTEAD OF trigger.  Messages
-			 * here should match rewriteHandler.c's rewriteTargetView, except
-			 * that we omit errdetail because we haven't got the information
-			 * handy (and given that we really shouldn't get here anyway, it's
-			 * not worth great exertion to get).
+			 * here should match rewriteHandler.c's rewriteTargetView and
+			 * RewriteQuery, except that we omit errdetail because we haven't
+			 * got the information handy (and given that we really shouldn't
+			 * get here anyway, it's not worth great exertion to get).
 			 */
 			switch (operation)
 			{
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index e9fefec8b3..3b4f28874a 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -3670,21 +3670,71 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
 		}
 
 		/*
-		 * If there were no INSTEAD rules, and the target relation is a view
-		 * without any INSTEAD OF triggers, see if the view can be
+		 * If there was no unqualified INSTEAD rule, and the target relation
+		 * is a view without any INSTEAD OF triggers, see if the view can be
 		 * automatically updated.  If so, we perform the necessary query
 		 * transformation here and add the resulting query to the
 		 * product_queries list, so that it gets recursively rewritten if
 		 * necessary.
+		 *
+		 * If the view cannot be automatically updated, we throw an error here
+		 * which is OK since the query would fail at runtime anyway.  Throwing
+		 * the error here is preferable to the executor check since we have
+		 * more detailed information available about why the view isn't
+		 * updatable.
 		 */
-		if (!instead && qual_product == NULL &&
+		if (!instead &&
 			rt_entry_relation->rd_rel->relkind == RELKIND_VIEW &&
 			!view_has_instead_trigger(rt_entry_relation, event))
 		{
 			/*
+			 * If there were any qualified INSTEAD rules, 

Re: RFC: seccomp-bpf support

2020-01-07 Thread Joe Conway
On 1/6/20 8:37 PM, Tomas Vondra wrote:
> Hi,
> 
> This patch is currently in "needs review" state, but the last message is
> from August 29, and my understanding is that there have been a couple of
> objections / disagreements about the architecture, difficulties with
> producing the set of syscalls, and not providing any built-in policy.
> 
> I don't think we're any closer to resolving those disagreements since
> August, so I think we should make some decision about this patch,
> instead of just moving it from one CF to the next one. The "needs
> review" status does not seem to reflect the situation.
> 
> Are there any plans to post a new version of the patch with a different
> design, or something like that? If not, I propose we mark it either as
> rejected or returned with feedback (and maybe get a new patch in the
> future).


I assumed it was rejected.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: Greatest Common Divisor

2020-01-07 Thread Dean Rasheed
Do we actually need the smallint versions of these functions?

I would have thought that automatic casting would take care of any
cases that need smallints, and I can't believe that there's any
performance benefit to be had that's worth maintaining the extra code.
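
For example, assuming just the integer, bigint and numeric variants, a
smallint argument should be handled by the usual implicit cast:

  SELECT gcd(21::smallint, 14);   -- smallint is promoted to integer; result 7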

Regards,
Dean




[Logical Replication] TRAP: FailedAssertion("rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT || rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL || rel->rd_rel->relreplident == REPLICA_IDENTIT

2020-01-07 Thread Neha Sharma
Hi,
I am getting a server crash on the publication server on HEAD for the
below test case.

Commit: b9c130a1fdf16cd99afb390c186d19acaea7d132

Data setup:
Publication server:
wal_level = logical
max_wal_senders = 10
max_replication_slots = 15
wal_log_hints = on
hot_standby_feedback = on
wal_receiver_status_interval = 1
listen_addresses='*'
log_min_messages=debug1
wal_sender_timeout = 0
logical_decoding_work_mem=64kB

Subscription server:
wal_level = logical
wal_log_hints = on
hot_standby_feedback = on
wal_receiver_status_interval = 1
log_min_messages=debug1
port=5433
logical_decoding_work_mem=64kB

Test case:
Publication server:
create table test(a int);
create publication test_pub for all tables;
alter table test replica identity NOTHING ;

Subscription server:
create table test(a int);
create subscription test_sub CONNECTION 'host=172.16.208.32 port=5432
dbname=postgres user=centos' PUBLICATION test_pub WITH ( slot_name =
test_slot_sub);

Publication server:
insert into test values(generate_series(1,5),'aa');

After executing the DML on the publication server, it crashed with the
mentioned assert.

*Publication Server log File snippet:*
2020-01-07 11:54:00.476 UTC [17417] DETAIL:  Streaming transactions
committing after 0/163CC30, reading WAL from 0/163CC30.
2020-01-07 11:54:00.476 UTC [17417] LOG:  logical decoding found consistent
point at 0/163CC30
2020-01-07 11:54:00.476 UTC [17417] DETAIL:  There are no running
transactions.
*TRAP: FailedAssertion("rel->rd_rel->relreplident ==
REPLICA_IDENTITY_DEFAULT || rel->rd_rel->relreplident ==
REPLICA_IDENTITY_FULL || rel->rd_rel->relreplident ==
REPLICA_IDENTITY_INDEX", File: "proto.c", Line: 148)*
postgres: walsender centos 172.16.208.32(40292)
idle(ExceptionalCondition+0x53)[0x8ca453]
postgres: walsender centos 172.16.208.32(40292) idle[0x74c515]
/home/centos/PG_master/postgresql/inst/lib/pgoutput.so(+0x2114)[0x7f3bb463d114]
postgres: walsender centos 172.16.208.32(40292) idle[0x747fa8]
postgres: walsender centos 172.16.208.32(40292)
idle(ReorderBufferCommit+0x12ee)[0x75187e]
postgres: walsender centos 172.16.208.32(40292) idle[0x7455a8]
postgres: walsender centos 172.16.208.32(40292)
idle(LogicalDecodingProcessRecord+0x2ea)[0x74593a]
postgres: walsender centos 172.16.208.32(40292) idle[0x766c24]
postgres: walsender centos 172.16.208.32(40292) idle[0x7693a2]
postgres: walsender centos 172.16.208.32(40292)
idle(exec_replication_command+0xbb1)[0x76a091]
postgres: walsender centos 172.16.208.32(40292)
idle(PostgresMain+0x4b9)[0x7b1099]
postgres: walsender centos 172.16.208.32(40292) idle[0x482bc7]
postgres: walsender centos 172.16.208.32(40292)
idle(PostmasterMain+0xdbf)[0x73339f]
postgres: walsender centos 172.16.208.32(40292) idle(main+0x44f)[0x48403f]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f3bc53f23d5]
postgres: walsender centos 172.16.208.32(40292) idle[0x4840a6]
2020-01-07 11:54:00.802 UTC [17359] LOG:  server process (PID 17417) was
terminated by signal 6: Aborted
2020-01-07 11:54:00.802 UTC [17359] LOG:  terminating any other active
server processes
2020-01-07 11:54:00.802 UTC [17413] WARNING:  terminating connection
because of crash of another server process
2020-01-07 11:54:00.802 UTC [17413] DETAIL:  The postmaster has commanded
this server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.


Stack Trace:
Core was generated by `postgres: walsender centos 172.16.208.32(40286) idle
 '.
Program terminated with signal 6, Aborted.
#0  0x7f3bc5406207 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install
glibc-2.17-260.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64
krb5-libs-1.15.1-34.el7.x86_64 libcom_err-1.42.9-13.el7.x86_64
libgcc-4.8.5-36.el7.x86_64 libselinux-2.5-14.1.el7.x86_64
openssl-libs-1.0.2k-16.el7.x86_64 pcre-8.32-17.el7.x86_64
zlib-1.2.7-18.el7.x86_64
(gdb) bt
#0  0x7f3bc5406207 in raise () from /lib64/libc.so.6
#1  0x7f3bc54078f8 in abort () from /lib64/libc.so.6
#2  0x008ca472 in ExceptionalCondition (
conditionName=conditionName@entry=0xa5b9c8 "rel->rd_rel->relreplident
== REPLICA_IDENTITY_DEFAULT || rel->rd_rel->relreplident ==
REPLICA_IDENTITY_FULL || rel->rd_rel->relreplident ==
REPLICA_IDENTITY_INDEX", errorType=errorType@entry=0x918fe9
"FailedAssertion", fileName=fileName@entry=0xa5b903 "proto.c",
lineNumber=lineNumber@entry=148) at assert.c:67
#3  0x0074c515 in logicalrep_write_insert (out=0xe8c768,
rel=rel@entry=0x7f3bc69548d8, newtuple=0x7f3bb3e3a068) at proto.c:146
#4  0x7f3bb463d114 in pgoutput_change (ctx=0xe8a8b0, txn=, relation=0x7f3bc69548d8, change=0xf3e018) at pgoutput.c:345
#5  0x00747fa8 in change_cb_wrapper (cache=,
txn=, relation=, change=) at
logical.c:754
#6  0x0075187e in ReorderBufferCommit (rb=0xf2e090, xid=xid@entry=489,
commit_lsn=23318344, end_lsn=,
commit_time=commit_time@entry=631713235036663,
origin_id=origin_id@entry=0,
origin_lsn=origi

Re: Greatest Common Divisor

2020-01-07 Thread Tom Lane
Dean Rasheed  writes:
> Do we actually need the smallint versions of these functions?

Doubt it.  It'd be fairly hard even to call those, since e.g. "42"
is an int not a smallint.

regards, tom lane




Re: RFC: seccomp-bpf support

2020-01-07 Thread Tomas Vondra

On Tue, Jan 07, 2020 at 06:02:14AM -0500, Joe Conway wrote:

> On 1/6/20 8:37 PM, Tomas Vondra wrote:
>> Hi,
>>
>> This patch is currently in "needs review" state, but the last message is
>> from August 29, and my understanding is that there have been a couple of
>> objections / disagreements about the architecture, difficulties with
>> producing the set of syscalls, and not providing any built-in policy.
>>
>> I don't think we're any closer to resolving those disagreements since
>> August, so I think we should make some decision about this patch,
>> instead of just moving it from one CF to the next one. The "needs
>> review" status does not seem to reflect the situation.
>>
>> Are there any plans to post a new version of the patch with a different
>> design, or something like that? If not, I propose we mark it either as
>> rejected or returned with feedback (and maybe get a new patch in the
>> future).
>
> I assumed it was rejected.



I don't know. I still see it in the CF app with "needs review" status:

  https://commitfest.postgresql.org/26/2263/

Barring objections, I'll mark it as rejected.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: reduce size of fmgr_builtins array

2020-01-07 Thread Heikki Linnakangas

On 02/01/2020 01:15, John Naylor wrote:

> I wrote:
>> Currently, we include the function name string in each FmgrBuiltin
>> struct, whose size is 24 bytes on 64 bit platforms. As far as I can
>> tell, the name is usually unused, so the attached (WIP, untested)
>> patch stores it separately, reducing this struct to 16 bytes.
>
> We can go one step further and allocate the names as a single
> character string, reducing the binary size. It doesn't help much to
> store offsets, since there are ~40k characters, requiring 32-bit
> offsets. If we instead compute the offset on the fly from stored name
> lengths, we can use 8-bit values, saving 19kB of space in the binary
> over using string pointers.


> I tested with the attached C function to make sure
> fmgr_internal_function() still returned the correct answer. I assume
> this is not a performance critical function, but I still wanted to see
> if there was a visible performance regression. I get this when calling
> fmgr_internal_function() 100k times:
>
> master: 833ms
> patch: 886ms


Hmm. I was actually expecting this to slightly speed up 
fmgr_internal_function(), now that all the names fit in a smaller amount 
of cache. I guess there are more branches or a data dependency or 
something now. I'm not too worried about that though. If it mattered, we 
should switch to binary searching the array.



> The point of the patch is to increase the likelihood of
> fmgr_isbuiltin() finding the fmgr_builtins[] element in L1 cache. It
> seems harder to put a number on that for a realistic workload, but
> reducing the array size by 1/3 couldn't hurt.


Yeah. Nevertheless, it would be nice to be able to demonstrate the 
benefit in some test, at least. It feels hard to justify committing a 
performance patch if we can't show the benefit. Otherwise, we should 
just try to keep it as simple as possible, to optimize for readability.


A similar approach was actually discussed a couple of years back: 
https://www.postgresql.org/message-id/bd13812c-c4ae-3788-5b28-5633beed2929%40iki.fi. 
The conclusion then was that it's not worth the trouble or the code 
complication. So I think this patch is Rejected, unless you can come up 
with a test case that concretely shows the benefit.


- Heikki




Re: adding partitioned tables to publications

2020-01-07 Thread Rafia Sabih
On Tue, 7 Jan 2020 at 06:02, Amit Langote  wrote:

> Rebased and updated to address your comments.
>
+  
+   Partitioned tables are not considered when FOR ALL TABLES
+   is specified.
+  
+
What is the reason for the above? I mean, not for the comment itself, but
for not including partitioned tables in the FOR ALL TABLES option.


-- 
Regards,
Rafia Sabih




Re: Assert failure due to "drop schema pg_temp_3 cascade" for temporary tables and \d+ is not showing any info after drooping temp table schema

2020-01-07 Thread Robert Haas
On Mon, Jan 6, 2020 at 7:22 PM Michael Paquier  wrote:
> Okay for the first one, printing the OID sounds like a good idea.
> Like Tom, I would prefer keeping the relation name with "(null)" for
> the schema name.  Or even better, could we just print the OID all the
> time?  What's preventing us from showing that information in the first
> place?  And that still looks good to have when debugging issues IMO
> for orphaned entries.

I think we should have two different messages, rather than trying to
shoehorn things into one message using a fake schema name.

> For the second one, I would really wish that we keep the restriction
> put in place by a052f6c until we actually figure out how to make the
> operation safe in the ways we want it to work because this puts
> the catalogs into an inconsistent state for any object type able to
> use a temporary schema, like functions, domains etc. for example able
> to use "pg_temp" as a synonym for the temp namespace name.  And any
> connected user is able to do that.

So what?

> On top of that, except for tables,
> these could remain as orphaned entries after a crash, no?

Tables, too, although they won't have storage any more. But your patch
in no way prevents that. It just makes it harder to fix when it does
happen. So I see no advantages of it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: TRUNCATE on foreign tables

2020-01-07 Thread Kohei KaiGai
On Tue, Jan 7, 2020 at 16:03, Michael Paquier wrote:
>
> On Mon, Jan 06, 2020 at 04:32:39PM -0500, Stephen Frost wrote:
> > RESTRICT, yes.  I don't know about ONLY being sensible as we don't
> > really deal with inheritance and foreign tables very cleanly today, as I
> > said up-thread, so I'm not sure if we really want to put in the effort
> > that it'd require to figure out how to make ONLY make sense.
>
> True enough.
>
> > The question about how to handle IDENTITY is a good one.  I suppose
> > we could just pass that down and let the FDW sort it out..?
>
> Looking at the code, ExecuteTruncateGuts() passes down restart_seqs,
> so it sounds sensible to me to just pass down that to the FDW
> callback and the callback decide what to do.
>
It looks to me like the local sequences owned by a foreign table should be
restarted by the core code, regardless of the relkind of the owning
relation. So even if the FDW driver is buggy, consistency of the local
database is kept, right?
Indeed, the "restart_seqs" flag is needed to propagate the behavior;
however, it should be processed on the remote side via a secondary
"TRUNCATE" command. Is it so sensitive?

Best regards,
-- 
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei 




Re: [PATCH] Atomic pgrename on Windows

2020-01-07 Thread Tomas Vondra

On Tue, Aug 06, 2019 at 09:03:08PM +0300, Alexander Korotkov wrote:


> ...
>
> Unfortunately, it seems that none of such strategies would fit all the
> cases.
>
> Basically, we have two options for renaming a file.
> * MoveFileEx() – the safest possible option, less likely to lose files,
>   but takes a lock on the target.
> * ReplaceFile() – a "lockless" option, but may lose files on an OS crash.
>
> Also we have two different cases of file renaming:
> 1) Renaming durable files.  When the target already exists, on an OS
>    crash we expect to find either the old or the new file in the target
>    filename.  We don't expect to find nothing in the target filename.
> 2) Renaming temporary files.  In this case we don't much care about
>    losing files on an OS crash.  But locking appears to be an issue in
>    some cases.
>
> So, in the 1st case it doesn't seem like an option to try ReplaceFile()
> in either the first or second place.  Both ways carry a risk of losing
> the target file, which seems unacceptable.
>
> In the 2nd case, trying MoveFileEx() then ReplaceFile() or vice versa
> might work.  But the error codes of these functions don't tell
> explicitly whether the target file exists.  So, I prefer checking it
> explicitly with GetFileAttributes().
>
> The attached patch implements the idea described in [1].  It implements
> a separate pgrename_temp() function, which is better for concurrent
> operation but less safe.



I don't have access to a Windows machine and my developer experience
with that platform is pretty much nil, but I think this patch makes
sense. It's not an ideal solution, but it's not clear such a solution
exists, and an improvement is better than nothing.

I have two minor comments about rename_temp:

1) The name might seem to hint that it's about files opened using
OpenTemporaryFile, but in practice it's about files that are not
critical. But maybe that's true anyway.

2) I think the rename_temp comment should mention it can only be used in
cases where the renames happen in a single process (non-concurrently). If
we could call rename_temp() concurrently from two different processes, it
wouldn't work as expected. Of course, we only call rename_temp from the
stats collector and syslogger, where it obviously works.

Anyway, I'm really nitpicking here ...

Are there any objections to getting the current patch committed as is, so
that it does not get pushed yet again to the next commitfest?


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: RFC: seccomp-bpf support

2020-01-07 Thread Robert Haas
On Tue, Jan 7, 2020 at 7:59 AM Tomas Vondra wrote:
> Barring objections, I'll mark it as rejected.

I think that's right. Since I just caught up on this thread, I'd like
to offer a few belated comments:

1. I don't think it would kill us to add a few hooks that would allow
this feature to be added by a loadable module. Someone may argue that
we should never add a hook unless we know exactly how it's going to be
used and agree with it as a goal, but I don't agree with that.
Refusing to add hooks just causes people to fork the server. If
somebody loads code that uses a hook at least you can tell that
they've done it by looking at shared_preload_libraries; if they fork
the server it may be much harder to tell that you're not dealing with
straight-up PostgreSQL. At any rate, ease-of-debugging considerations
for core developers should not take precedence over letting people do
with PostgreSQL what they wish.

2. I feel strongly that shipping this feature with mechanism but not
policy is unwise; I thought Alvaro articulated this problem
particularly well. I think the evidence on this thread is pretty
clear: this WILL break for some users, and it WILL need fixing. If the
mechanism is in core and the policy is not, then it seems likely that
employees at Crunchy, who apparently run into customers that need this
on a regular basis, will develop a set of best practices which will
allow them to advise customers as to what settings will or will not
work well, but because that knowledge will not be embedded in core, it
will be pretty hard for anybody else to support such customers, unless
they too have a lot of customers who want to run in this mode. I would
be a lot more supportive of this if both the mechanism and the policy
were going to ship in core and be maintained in core, with adequate
documentation.

3. The difficulty in making that happen is that the set of system
calls that need to be whitelisted seems likely to vary based on
platform, kernel version, glibc version, PostgreSQL build options,
loadable modules used, and which specific PostgreSQL features you care
about. I can't help feeling that this is designed mostly for processes
that do far simpler things than PostgreSQL. It would be interesting,
for example, to know what bash or perl does about this. They have the
same problem that PostgreSQL does, namely, that they are intended to
let users do almost arbitrary things by design -- not a totally
unlimited set of things, but an awful lot of things. Perhaps over time
this mechanism will undergo design changes, or a clearer set of best
practices will emerge, so that it's easier to see how PostgreSQL could
use this without breaking things. If indeed this is the future, you
can imagine something like glibc getting a "seccomp-clean" mode in
which it can be built, and if that happened and were widely used, then
the difficulties for PostgreSQL would be reduced. Because such
improvements typically happen over time through trial and error and
the efforts of many people, I think it is to our advantage to allow
people to experiment with the feature even as it exists today out of
core, which gets me back to point #1. I agree with Joshua Brindle's
point that holding our breath in response to a widely-adopted
technology is not a very useful response.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[QUESTION/PROPOSAL] loose quadtree in spgist

2020-01-07 Thread Peter Griggs
Hello, I wanted some guidance/suggestions about creating an spgist
extension. For context, I am a grad student doing research that involves
comparing the performance of different indexes for spatial data. We've
built a system that uses Postgres and one of the data structures we want to
use is a loose quadtree, but there is no implementation of this data
structure in spgist. The reason why I think this is pretty do-able is that
it is quite similar to a quadtree on boxes, which is implemented in
src/backend/utils/adt/geo_spgist.c.

Additionally, I found by grepping through the repo for the existing
functions in spgist/box_ops operator class that several catalog files need
to be updated to reflect a new operator class in spgist. The files that I
believe need to be changed to create a new
spgist_loose_box_ops operator class are:

src/include/catalog/pg_amop.dat
src/include/catalog/pg_amproc.dat
src/include/catalog/pg_opclass.dat
src/include/catalog/pg_opfamily.dat


I've poked around quite a bit in the spgist code and have tried making
minimal changes to geo_spgist.c, but I haven't done any development on
postgres before, so I'm running into some issues that I couldn't find help
with on the postgres slack, by searching the mailing list, or by scouring
the development wikis. For example, I wanted to just print out some data to
see what quadrant a box is being placed into in the geo_spgist.c code. I
understand that printing to stdout won't work in postgres, but I thought
that I could possibly write some data to the logfile. I tried updating a
function to use both elog and ereport and re-built the code. However, I
can't get anything to print out to the logfile no matter what I try. Does
anyone have tips for printing out and debugging in general for postgres
development?


Any tips or guidance would be much appreciated. Also, if there's a
different route I should go to turn this into a proposal for a patch please
let me know. I'm new to postgres dev.

Best,
Peter


Re: ERROR: attribute number 6 exceeds number of columns 5

2020-01-07 Thread Tom Lane
Kyotaro Horiguchi  writes:
> At Tue, 26 Nov 2019 10:49:11 -0500, Tom Lane  wrote in 
>> Hmm, interesting.  IMO, that *should* have thrown an error, but of
>> course not that one.  The ADD COLUMN operations are all processed
>> in parallel, so it's not okay for one of them to have a GENERATED
>> expression that refers to another one of the new columns.  But you
>> should have gotten a "no such column" type of error, not a run-time
>> cross-check failure.

> Something like this works?

I started to look at this, but it felt a bit brute-force to me.
After awhile I began to think that my offhand comment above was
wrong --- why *shouldn't* this case work?  When we insert or
update a tuple, we expect that GENERATED columns should be
computed based on the new tuple values, so why is the executor
evidently evaluating them based on the old tuple?
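
For reference, the originally reported case has roughly this shape, with
both columns added in one ALTER TABLE and the generated column's expression
referring to the other new column (it is similar to the new regression test
in the follow-up patch later in this thread; the reported table evidently
had more columns, hence the attribute numbers in the error message):

  create table tst (f1 int);
  insert into tst values (1), (2);
  alter table tst add column f2 int default 42,
    add column f3 int generated always as (f2 * 2) stored;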

That thought soon led me to realize that there's an adjacent
bug that this patch fails to fix:

regression=# create table foo (f1 int);
CREATE TABLE
regression=# insert into foo values(1),(2);
INSERT 0 2
regression=# alter table foo alter column f1 type float8, add column f2 int 
generated always as (f1 * 2) stored;
ERROR:  attribute 1 of type foo has wrong type
DETAIL:  Table has type integer, but query expects double precision.

So I believe that the real problem here is that the executor is
evaluating GENERATED expressions at the wrong time.  It's evaluating
them against the pre-conversion tuples when it should be evaluating
them against the post-conversion tuples.  We need to go fix that,
rather than inserting arbitrary restrictions in the DDL code.

regards, tom lane




Re: Removing pg_pltemplate and creating "trustable" extensions

2020-01-07 Thread Robert Haas
On Mon, Jan 6, 2020 at 6:56 PM Stephen Frost  wrote:
> The first is this- ANYONE can create an extension in the system today,
> if it's marked as superuser=false.  If anything, it seems like that's
> probably too loose- certainly based on your contention that ONLY
> superusers should wield such a power and that letting anyone else do so
> is a right that a superuser must explicitly grant.

I don't think this argument makes any sense. Sure, anyone can create
an extension with superuser=false, but so what? From a security point
of view, when you create such an extension, you are using your own
privileges to do things that you could do anyway. The interesting case
is creating an extension that requires superuser privileges, probably
because it's going to call C functions. The superuser can and must
have the last word regarding who is allowed to do such things, because
the superuser is equivalent to the OS user and any other user of the
system is not. The "tenants" of the database system can't be allowed
to use it for things which the "owner" does not wish to permit.

On Mon, Jan 6, 2020 at 6:26 PM Tom Lane  wrote:
> If we were willing to break backwards compatibility, what I'd prefer
> is to just have the grantable role, and to say that you have to grant
> that to DB owners if you want them to be able to install PLs.  I'm
> not sure how loud the howls would be if we did that, but it'd be a
> lot cleaner than any of these other ideas.

That seems like a fine idea. Then the superuser has ultimate control,
and can decide which database owners they want to trust, and whether
they'd like the database owners to be able to subdelegate those
permissions. The only thing it doesn't do is give any control over
exactly which extensions can be installed by non-superusers, which
would be a really nice thing to have, especially if we're going to
significant expand the list of trusted extensions (something that I
think is, overall, quite a good idea). I accept Tom's argument that he
isn't obliged to add every related feature somebody might want just
because he's doing some work in this area, but not his contention that
the demand for such a feature is entirely hypothetical and the
suggestion that perhaps nobody will care anyway. I expect the reaction
to be along the lines of "hey, it's great that we can let DB owners do
this now, but it's really too bad that I can't blacklist
$SCARY_EXTENSION". I don't think that we'll be better off if this
entire proposal gets voted down for lack of that capability, but I
think it would be a really good thing to add.

FWIW, I don't really buy the argument that you can adjust the
extension control files to get out from under this problem.
Technically, that is true. But in practice, the extension control
files are provided by your packager, and you don't want to modify them
because then your packaging system will get grumpy with you. While
it's reasonable for the packaging to provide a tentative answer to the
question of what should be trusted, trust is ultimately a matter of
local policy, and that policy should be configured someplace that's
not managed by RPM.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [PATCH] Increase the maximum value track_activity_query_size

2020-01-07 Thread Robert Haas
On Sun, Jan 5, 2020 at 11:01 PM Michael Paquier  wrote:
> Sounds like an agreement then.  The original patch documents the range
> in postgresql.conf.sample, which is fine by me as this is done for
> some parameters, and skips the part about doc/, which also matches
> with the surrounding effort for other parameters, so the whole looks
> fine seen from here.  Anybody willing to commit that?

Done. I didn't commit the postgresql.conf.sample change because:

(1) I think Bruce voted against it.

(2) It makes the line a little wide, and I'd rather not do that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: ERROR: attribute number 6 exceeds number of columns 5

2020-01-07 Thread Tom Lane
I wrote:
> So I believe that the real problem here is that the executor is
> evaluating GENERATED expressions at the wrong time.  It's evaluating
> them against the pre-conversion tuples when it should be evaluating
> them against the post-conversion tuples.  We need to go fix that,
> rather than inserting arbitrary restrictions in the DDL code.

I looked at that more closely, and realized that blaming the executor
is wrong: the real issue is that ALTER TABLE itself supposes that it
need only evaluate expressions against the old tuple.  That's easy
to fix with a bit more code though.  I propose the attached.

(Note that this should also allow relaxing the existing implementation
restriction against changing types of columns that GENERATED columns
depend on: all we have to do is re-parse the generation expression
and schedule it for evaluation.  I've not looked into that, and it
doesn't seem like a bug fix anyway.)

regards, tom lane

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 1c4394a..421bc28 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -192,13 +192,15 @@ typedef struct NewConstraint
  * Phase 3 copy (this could be either a new column with a non-null default, or
  * a column that we're changing the type of).  Columns without such an entry
  * are just copied from the old table during ATRewriteTable.  Note that the
- * expr is an expression over *old* table values.
+ * expr is an expression over *old* table values, except when is_generated
+ * is true; then it is an expression over columns of the *new* tuple.
  */
 typedef struct NewColumnValue
 {
 	AttrNumber	attnum;			/* which column */
 	Expr	   *expr;			/* expression to compute */
 	ExprState  *exprstate;		/* execution state */
+	bool		is_generated;	/* is it a GENERATED expression? */
 } NewColumnValue;
 
 /*
@@ -4961,7 +4963,9 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap, LOCKMODE lockmode)
 
 /*
  * Process supplied expressions to replace selected columns.
- * Expression inputs come from the old tuple.
+ *
+ * First, evaluate expressions whose inputs come from the old
+ * tuple.
  */
 econtext->ecxt_scantuple = oldslot;
 
@@ -4969,6 +4973,9 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap, LOCKMODE lockmode)
 {
 	NewColumnValue *ex = lfirst(l);
 
+	if (ex->is_generated)
+		continue;
+
 	newslot->tts_values[ex->attnum - 1]
 		= ExecEvalExpr(ex->exprstate,
 	   econtext,
@@ -4978,6 +4985,26 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap, LOCKMODE lockmode)
 ExecStoreVirtualTuple(newslot);
 
 /*
+ * Now, evaluate any expressions whose inputs come from the
+ * new tuple.  We assume these columns won't reference each
+ * other, so that there's no ordering dependency.
+ */
+econtext->ecxt_scantuple = newslot;
+
+foreach(l, tab->newvals)
+{
+	NewColumnValue *ex = lfirst(l);
+
+	if (!ex->is_generated)
+		continue;
+
+	newslot->tts_values[ex->attnum - 1]
+		= ExecEvalExpr(ex->exprstate,
+	   econtext,
+	   &newslot->tts_isnull[ex->attnum - 1]);
+}
+
+/*
  * Constraints might reference the tableoid column, so
  * initialize t_tableOid before evaluating them.
  */
@@ -5892,6 +5919,7 @@ ATExecAddColumn(List **wqueue, AlteredTableInfo *tab, Relation rel,
 			newval = (NewColumnValue *) palloc0(sizeof(NewColumnValue));
 			newval->attnum = attribute.attnum;
 			newval->expr = expression_planner(defval);
+			newval->is_generated = (colDef->generated != '\0');
 
 			tab->newvals = lappend(tab->newvals, newval);
 		}
@@ -10379,6 +10407,7 @@ ATPrepAlterColumnType(List **wqueue,
 		newval = (NewColumnValue *) palloc0(sizeof(NewColumnValue));
 		newval->attnum = attnum;
 		newval->expr = (Expr *) transform;
+		newval->is_generated = false;
 
 		tab->newvals = lappend(tab->newvals, newval);
 		if (ATColumnChangeRequiresRewrite(transform, attnum))
diff --git a/src/test/regress/expected/generated.out b/src/test/regress/expected/generated.out
index f62c93f..3ef10fa 100644
--- a/src/test/regress/expected/generated.out
+++ b/src/test/regress/expected/generated.out
@@ -593,6 +593,31 @@ ERROR:  cannot use generated column "b" in column generation expression
 DETAIL:  A generated column cannot reference another generated column.
 ALTER TABLE gtest25 ADD COLUMN x int GENERATED ALWAYS AS (z * 4) STORED;  -- error
 ERROR:  column "z" does not exist
+ALTER TABLE gtest25 ADD COLUMN c int DEFAULT 42,
+  ADD COLUMN x int GENERATED ALWAYS AS (c * 4) STORED;
+ALTER TABLE gtest25 ADD COLUMN d int DEFAULT 101;
+ALTER TABLE gtest25 ALTER COLUMN d SET DATA TYPE float8,
+  ADD COLUMN y float8 GENERATED ALWAYS AS (d * 4) STORED;
+SELECT * FROM gtest25 ORDER BY a;
+ a | b  | c  |  x  |  d  |  y  
+---+++-+-+-
+ 3 |  9 | 42 | 168 | 101 | 404
+ 4 | 12 | 42 | 168 | 101 | 404
+(2 rows)
+
+\d

Re: RFC: seccomp-bpf support

2020-01-07 Thread Tomas Vondra

On Tue, Jan 07, 2020 at 11:33:07AM -0500, Robert Haas wrote:

> On Tue, Jan 7, 2020 at 7:59 AM Tomas Vondra wrote:
>> Barring objections, I'll mark it as rejected.
>
> I think that's right.


Done.


> Since I just caught up on this thread, I'd like to offer a few belated
> comments:
>
> 1. I don't think it would kill us to add a few hooks that would allow
> this feature to be added by a loadable module. Someone may argue that
> we should never add a hook unless we know exactly how it's going to be
> used and agree with it as a goal, but I don't agree with that.
> Refusing to add hooks just causes people to fork the server. If
> somebody loads code that uses a hook at least you can tell that they've
> done it by looking at shared_preload_libraries; if they fork the server
> it may be much harder to tell that you're not dealing with straight-up
> PostgreSQL. At any rate, ease-of-debugging considerations for core
> developers should not take precedence over letting people do with
> PostgreSQL what they wish.



Not sure I understand. I don't think anyone argued about hooks vs. forking
the server in this particular thread, but the thread is fairly long so
maybe I'm missing something.

I think the "hook issue" is that the actual code is somewhere else. On
the one hand that minimizes the dev/testing/maintenance burden for us,
on the other hand it means we're not really testing the hooks. Meh.

But in this case I think the main argument against hooks was that Tom
thinks it's not really the right way to implement this. I don't know if
he's right or not, I don't have an opinion on how to integrate seccomp.


> 2. I feel strongly that shipping this feature with mechanism but not
> policy is unwise; I thought Alvaro articulated this problem
> particularly well. I think the evidence on this thread is pretty clear:
> this WILL break for some users, and it WILL need fixing. If the
> mechanism is in core and the policy is not, then it seems likely that
> employees at Crunchy, who apparently run into customers that need this
> on a regular basis, will develop a set of best practices which will
> allow them to advise customers as to what settings will or will not
> work well, but because that knowledge will not be embedded in core, it
> will be pretty hard for anybody else to support such customers, unless
> they too have a lot of customers who want to run in this mode. I would
> be a lot more supportive of this if both the mechanism and the policy
> were going to ship in core and be maintained in core, with adequate
> documentation.



Well, but this exact argument applies to various other approaches:

1) no hooks, forking PostgreSQL
2) hooks added, but neither code nor policy included
3) hooks added, code included, policy not included

Essentially the only case where Crunchy would not have this "lock-in"
advantage is when everything is included in PostgreSQL, at which point
we can probably make this work without hooks, I suppose.


> 3. The difficulty in making that happen is that the set of system calls
> that need to be whitelisted seems likely to vary based on platform,
> kernel version, glibc version, PostgreSQL build options, loadable
> modules used, and which specific PostgreSQL features you care about. I
> can't help feeling that this is designed mostly for processes that do
> far simpler things than PostgreSQL. It would be interesting, for
> example, to know what bash or perl does about this. They have the same
> problem that PostgreSQL does, namely, that they are intended to let
> users do almost arbitrary things by design -- not a totally unlimited
> set of things, but an awful lot of things. Perhaps over time this
> mechanism will undergo design changes, or a clearer set of best
> practices will emerge, so that it's easier to see how PostgreSQL could
> use this without breaking things. If indeed this is the future, you can
> imagine something like glibc getting a "seccomp-clean" mode in which it
> can be built, and if that happened and were widely used, then the
> difficulties for PostgreSQL would be reduced. Because such improvements
> typically happen over time through trial and error and the efforts of
> many people, I think it is to our advantage to allow people to
> experiment with the feature even as it exists today out of core, which
> gets me back to point #1. I agree with Joshua Brindle's point that
> holding our breath in response to a widely-adopted technology is not a
> very useful response.



I think this is probably the main challenge of this patch - development,
maintenance and testing of the policies. I think it's pretty clear the
community can't really support this on all possible environments, or
with third-party extensions. I don't know whether it's even possible to write
a "universal policy" covering a significant range of systems. It seems
much more realistic that individual providers will develop policies for
their customers' systems.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: backup manifests

2020-01-07 Thread Robert Haas
On Fri, Jan 3, 2020 at 2:35 PM Stephen Frost  wrote:
> > Well, I don't know how to make you happy here.
>
> I suppose I should admit that, first off, I don't feel you're required
> to make me happy, and I don't think it's necessary to make me happy to
> get this feature into PG.

Fair enough. That is gracious of you, but I would like to try to make
you happy if it is possible to do so.

> Since you expressed that interest though, I'll go out on a limb and say
> that what would make me *really* happy would be to think about where the
> project should be taking pg_basebackup, what we should be working on
> *today* to address the concerns we hear about from our users, and to
> consider the best way to implement solutions to what they're actively
> asking for a core backup solution to be providing.  I get that maybe
> that isn't how the world works and that sometimes we have people who
> write our paychecks wanting us to work on something else, and yes, I'm
> sure there are some users who are asking for this specific thing but I
> certainly don't think it's a common ask of pg_basebackup or what users
> feel is missing from the backup options we offer in core; we had users
> on this list specifically saying they *wouldn't* use this feature
> (referring to the differential backup stuff, of course), in fact,
> because of the things which are missing, which is pretty darn rare.

Well, I mean, what you seem to be suggesting here is that somebody is
driving me with a stick to do something that I don't really like but
have to do because otherwise I won't be able to make rent, but that's
actually not the case. I genuinely believe that this is a good design,
and it's driven by me, not some shadowy conglomerate of EnterpriseDB
executives who are out to make PostgreSQL suck. If I'm wrong and the
design sucks, that's again not the fault of shadowy EnterpriseDB
executives; it's my fault. Incidentally, my boss is not very shadowy
anyhow; he's a super-nice guy, and a major reason why I work here. :-)

I don't think the issue here is that I haven't thought about what
users want, but that not everybody wants the same thing, and it
seems like the people with whom I interact want somewhat different
things than those with whom you interact. EnterpriseDB has an existing
tool that does parallel and block-level incremental backup, and I
started out with the goal of providing those same capabilities in
core. They are quite popular with EnterpriseDB customers, and I'd like
to make them more widely available and, as far as I can, improve on
them. From our previous discussion and from a (brief) look at
pgbackrest, I gather that the interests of your customers are somewhat
different. Apparently, block-level incremental backup isn't quite as
important to your customers, perhaps because you've already got
file-level incremental backup, but various other things like
encryption and backup verification are extremely important, and you've
got a set of ideas about what would be valuable in the future which
I'm sure is based on real input from your customers. I hope you pursue
those ideas, and I hope you do it in core rather than in a separate
piece of software, but that's up to you. Meanwhile, I think that if I
have somewhat different ideas about what I'd like to pursue, that
ought to be just fine. And I don't think it is unreasonable to hope
that you'll acknowledge my goals as legitimate even if you have
different ones.

I want to point out that my idea about how to do all of this has
shifted by a considerable amount based on the input that you and David
have provided. My original design didn't involve a backup manifest,
but now it does. That turned out to be necessary, but it was also
something you suggested, and something where I asked and took advice
on what ought to go into it. Likewise, you suggested that the process
of taking the backup should involve giving the client more control
rather than trying to do everything on the server side, and that is
now the design which I plan to pursue. You suggested that because it
would be more advantageous for out-of-core backup tools, such as
pgbackrest, and I acknowledge that as a benefit and I think we're
headed in that direction. I am not doing a single thing which, to my
knowledge, blocks anything that you might want to do with
pg_basebackup in the future. I have accepted as much of your input as
I believe that I can without killing the project off completely. To go
further, I'd have to either accept years of delay or abandon my
priorities entirely and pursue yours.

> That's what would make *me* happy.  Even some comments about how to
> *get* there while also working towards these features would be likely
> to make me happy.  Instead, I feel like we're being told that we need
> this feature badly in v13 and we're going to cut bait and do whatever
> is necessary to get us there.

This seems like a really unfair accusation given how much work I've
put into trying to satisfy you and David. If th

Re: Berserk Autovacuum (let's save next Mandrill)

2020-01-07 Thread Tomas Vondra

Hi,

This patch is currently in "needs review" state, but that seems quite
wrong - there's been a lot of discussions about how we might improve
behavior for append-only tables, but IMO there's no clear consensus nor
a patch that we might review.

So I think this should be either "waiting on author" or maybe "rejected
with feedback". Is there any chance of getting a reviewable patch in the
current commitfest? If not, I propose to mark it as RWF.

I still hope we can improve this somehow in time for PG13. The policy is
not to allow new non-trivial patches in the last CF, but hopefully this
might be considered an old patch.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Assert failure due to "drop schema pg_temp_3 cascade" for temporary tables and \d+ is not showing any info after drooping temp table schema

2020-01-07 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jan 6, 2020 at 7:22 PM Michael Paquier  wrote:
>> For the second one, I would really wish that we keep the restriction
>> put in place by a052f6c until we actually figure out how to make the
>> operation safe in the ways we want it to work because this puts
>> the catalogs into an inconsistent state for any object type able to
>> use a temporary schema, like functions, domains etc. for example able
>> to use "pg_temp" as a synonym for the temp namespace name.  And any
>> connected user is able to do that.

> So what?

I still agree with Robert that a052f6c is a bad idea.  It's not the case
that that's blocking "any connected user" from causing an issue.  The
temp schemas are always owned by the bootstrap superuser, so only a
superuser could delete them.  All that that patch is doing is preventing
superusers from doing something that they could reasonably wish to do,
and that is perfectly safe when there's not concurrent usage of the
schema.  We are not normally that nanny-ish, and the case for being so
here seems pretty thin.

regards, tom lane




Re: Removing pg_pltemplate and creating "trustable" extensions

2020-01-07 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Mon, Jan 6, 2020 at 6:56 PM Stephen Frost  wrote:
> > The first is this- ANYONE can create an extension in the system today,
> > if it's marked as superuser=false.  If anything, it seems like that's
> > probably too loose- certainly based on your contention that ONLY
> > superusers should wield such a power and that letting anyone else do so
> > is a right that a superuser must explicitly grant.
> 
> I don't think this argument makes any sense. Sure, anyone can create
> an extension with superuser=false, but so what? From a security point
> of view, when you create such an extension, you are using your own
> privileges to do things that you could do anyway. The interesting case
> is creating an extension that requires superuser privileges, probably
> because it's going to call C functions. The superuser can and must
> have the last word regarding who is allowed to do such things, because
> the superuser is equivalent to the OS user and any other user of the
> system is not. The "tenants" of the database system can't be allowed
> to use it for things which the "owner" does not wish to permit.

What's the security issue from installing a trusted extension?

> On Mon, Jan 6, 2020 at 6:26 PM Tom Lane  wrote:
> > If we were willing to break backwards compatibility, what I'd prefer
> > is to just have the grantable role, and to say that you have to grant
> > that to DB owners if you want them to be able to install PLs.  I'm
> > not sure how loud the howls would be if we did that, but it'd be a
> > lot cleaner than any of these other ideas.
> 
> That seems like a fine idea. Then the superuser has ultimate control,
> and can decide which database owners they want to trust, and whether
> they'd like the database owners to be able to subdelegate those
> permissions. The only thing it doesn't do is give any control over
> exactly which extensions can be installed by non-superusers, which
> would be a really nice thing to have, especially if we're going to
> significant expand the list of trusted extensions (something that I
> think is, overall, quite a good idea). I accept Tom's argument that he
> isn't obliged to add every related feature somebody might want just
> because he's doing some work in this area, but not his contention that
> the demand for such a feature is entirely hypothetical and the
> suggestion that perhaps nobody will care anyway. I expect the reaction
> to be along the lines of "hey, it's great that we can let DB owners do
> this now, but it's really too bad that I can't blacklist
> $SCARY_EXTENSION". I don't think that we'll be better off if this
> entire proposal gets voted down for lack of that capability, but I
> think it would be a really good thing to add.

Why would it be trusted if it's $SCARY_EXTENSION ...?  Why are we trying
to punt on solving for that question by installing a much more
complicated system here than is really necessary, just to avoid having
to make that decision?

If the extension is trusted, then there isn't a security issue with it,
and it isn't scary, by definition, imv, which negates these arguments
about making the right to install it have to be hand delegated by a
superuser and needing a system for managing who is allowed to install
what extension.

If these functions were to just be put into core (as many really should
be...), instead of being out in contrib, this whole issue also wouldn't
exist and everyone would be able to use the functions (at least, those
that we decide are safe for users to directly use- and with appropriate
privilege access over ones that aren't), without any "the superuser must
approve of this explicitly after installation" fuss.

> FWIW, I don't really buy the argument that you can adjust the
> extension control files to get out from under this problem.
> Technically, that is true. But in practice, the extension control
> files are provided by your packager, and you don't want to modify them
> because then your packaging system will get grumpy with you. While
> it's reasonable for the packaging to provide a tentative answer to the
> question of what should be trusted, trust is ultimately a matter of
> local policy, and that policy should be configured someplace that's
> not managed by RPM.

This I tend to agree with- hacking around with control files or other
files installed with extensions from RPMs isn't a great plan.

A possible alternative would be to have a *new* configuration file (not
part of the GUC system) which admins could hack up to specify who should
be allowed to install what extension.  Or we make that a catalog table
instead because, well, such things should probably be in the database
where we can have dependency management and validity checking...

On the other hand, having individual packages for different extensions
is a pretty handy way of letting an administrator decide if they want
that extension to be installed on their system or not.  That's a pain
for co

Re: [PATCH] Atomic pgrename on Windows

2020-01-07 Thread Alexander Korotkov
Hi!

On Tue, Jan 7, 2020 at 7:16 PM Tomas Vondra
 wrote:
> I don't have access to a Windows machine and my developer experience
> with that platform is pretty much nil, but I think this patch makes
> sense. It's not an ideal solution, but it's not clear such solution
> exists, and an improvement is better than nothing.

Thank you for your attention to this patch!

> I have two minor comments about rename_temp:
>
> 1) The name might seem to be hinting it's about files opened using
> OpenTemporaryFile, but in practice it's about files that are not
> critical. But maybe it's true.


We may invent another name.  What about rename_fragile()?

> 2) I think the rename_temp comment should mention it can only be used in
> cases when the renames happen in a single process (non-concurrently). If
> we could call rename_temp() concurrently from two different processes,
> it won't work as expected. Of course, we only call rename_temp from stat
> collector and syslogger, where it obviously works.

Good point, this should be reflected in comments.

> Anyway, I'm really nitpicking here ...
>
> Are there any objections to getting the current patch committed as is, so
> that it does not get pushed yet again to the next commitfest?

It would be good to commit.  Let's get agreement on function name first.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [PATCH] Atomic pgrename on Windows

2020-01-07 Thread Tom Lane
Alexander Korotkov  writes:
> On Tue, Jan 7, 2020 at 7:16 PM Tomas Vondra
>  wrote:
>> I don't have access to a Windows machine and my developer experience
>> with that platform is pretty much nil, but I think this patch makes
>> sense. It's not an ideal solution, but it's not clear such solution
>> exists, and an improvement is better than nothing.

> Thank you for your attention to this patch!

FWIW, I don't like this patch much at all.  I too know nothing about
Windows, but I do *not* like inventing a distinction between "rename"
and "rename_temp" and expecting all call sites to have to decide
which one to use.  That's allowing a single platform's implementation
bugs to dictate our APIs globally; plus it's not clear that every
call site can know which is more appropriate.

regards, tom lane




Re: [PATCH] Atomic pgrename on Windows

2020-01-07 Thread Alexander Korotkov
On Tue, Jan 7, 2020 at 9:40 PM Tom Lane  wrote:
> Alexander Korotkov  writes:
> > On Tue, Jan 7, 2020 at 7:16 PM Tomas Vondra
> >  wrote:
> >> I don't have access to a Windows machine and my developer experience
> >> with that platform is pretty much nil, but I think this patch makes
> >> sense. It's not an ideal solution, but it's not clear such solution
> >> exists, and an improvement is better than nothing.
>
> > Thank you for your attention to this patch!
>
> FWIW, I don't like this patch much at all.  I too know nothing about
> Windows, but I do *not* like inventing a distinction between "rename"
> and "rename_temp" and expecting all call sites to have to decide
> which one to use.  That's allowing a single platform's implementation
> bugs to dictate our APIs globally; plus it's not clear that every
> call site can know which is more appropriate.

I'm not sure the issue we faced is really about a single platform.  TBH, the
assumptions we place on the rename function are very strict.  We assume
rename works atomically on system crash.  And we indirectly assume it
can work concurrently, at least with file readers.  Both are
probably true for Linux with the most popular filesystems.  But we do
support pretty many platforms.  I think the issue is that we didn't
investigate rename properties well on all of them.  But if we do, it
might turn out that some assumptions are wrong on multiple platforms.
Windows is just used widely enough to spot the problem.


--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Removing pg_pltemplate and creating "trustable" extensions

2020-01-07 Thread Robert Haas
On Tue, Jan 7, 2020 at 1:17 PM Stephen Frost  wrote:
> Why would it be trusted if it's $SCARY_EXTENSION ...?  Why are we trying
> to punt on solving for that question by installing a much more
> complicated system here than is really necessary, just to avoid having
> to make that decision?

I'm not convinced that whether or not something is trusted is an
altogether objective question. For instance, postgres_fdw probably
doesn't let you become the superuser, unless it has bugs. But it does
let you make network connections originating from the database host,
and somebody might reasonably want to restrict that in a
security-sensitive environment. But the same user might be totally OK
with a particular database owner installing citext.

> If these functions were to just be put into core (as many really should
> be...), instead of being out in contrib, this whole issue also wouldn't
> exist and everyone would be able to use the functions (at least, those
> that we decide are safe for users to directly use- and with appropriate
> privilege access over ones that aren't), without any "the superuser must
> approve of this explicitly after installation" fuss.

Well, I don't agree with the idea of moving everything into core, but
I think a good solution to the problem at hand will reduce the fuss
while allowing superusers to retain some control.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: RFC: seccomp-bpf support

2020-01-07 Thread Robert Haas
On Tue, Jan 7, 2020 at 12:56 PM Tomas Vondra
 wrote:
> I think the "hook issue" is that the actual code is somewhere else. On
> the one hand that minimizes the dev/testing/maintenance burden for us,
> on the other hand it means we're not really testing the hooks. Meh.

I don't care about the testing the hooks. If we provide hooks and
someone finds them useful, great. If not, they don't have to use them.
The mere existence of this hook isn't going to put any significant
maintenance burden on the community, while the feature as a whole
probably would.

> Well, but this exact argument applies to various other approaches:
>
> 1) no hooks, forking PostgreSQL
> 2) hooks added, but neither code nor policy included
> 3) hooks aded, code included, policy not included
>
> Essentially the only case where Crunchy would not have this "lock-in"
> advantage is when everything is included into PostgreSQL, at which point
> we can probably make this work without hooks I suppose.

Well, from my point of view, in case 1 or 2, the feature is entirely
Crunchy's. If it works great, good for them. If it sucks, it's their
problem. In case 3, the feature is ostensibly a community feature but
probably nobody other than Crunchy can actually make it work. That
latter situation seems a lot more problematic to me. I don't like
PostgreSQL features that I can't make work. If it's too complicated
for other developers to figure out, it's probably going to be a real
pain for users, too.

Putting my cards on the table, I think it's likely that the proposed
design contains a significant amount of suckitude. Serious usability
and security concerns have been raised, and I find those concerns
legitimate. On the other hand, it may still be useful to some people.
More importantly, if they can more easily experiment with it, they'll
have a chance to find out whether it sucks and, if so, make it better.
Perhaps something that we can accept into core will ultimately result.
That would be good for everybody.

Also, generally, I don't think we should block features (hooks or
otherwise) because some other company might get more benefit than our
own employer. That seems antithetical to the concept of open source.
Blocking them because they're poorly designed or will impose a burden
on the community is a different thing.

> I think this is probably the main challenge of this patch - development,
> maintenance and testing of the policies. I think it's pretty clear the
> community can't really support this on all possible environments, or
> with third-party extensions. I don't know if it's even possible to write
> a "universal policy" covering significant range of systems? It seems
> much more realistic that individual providers will develop policies for
> systems of customers.

I generally agree with this, although I'm not sure that I understand
what you're advocating that we do. Accepting the feature as-is seems
like a no-go given the remarks already made, and while I don't think I
feel as strongly about it as some of the people who have already
spoken on the topic, I do share their doubts to some degree and am not
in a position to override them even if I disagreed completely. But,
hooks would give individual providers those same options, without
really burdening anyone else.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: tableam vs. TOAST

2020-01-07 Thread Robert Haas
On Wed, Dec 18, 2019 at 11:37 AM Robert Haas  wrote:
> If nobody has further comments or objections, I plan to commit these
> in early January.

Done.

Which, I think, wraps up the work I felt needed to be done here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [PATCH] Atomic pgrename on Windows

2020-01-07 Thread Tom Lane
Alexander Korotkov  writes:
> I'm not sure the issue we faced is really about a single platform.  TBH, the
> assumptions we place on the rename function are very strict.  We assume
> rename works atomically on system crash.  And we indirectly assume it
> can work concurrently, at least with file readers.  Both are
> probably true for Linux with the most popular filesystems.  But we do
> support pretty many platforms.  I think the issue is that we didn't
> investigate rename properties well on all of them.  But if we do, it
> might turn out that some assumptions are wrong on multiple platforms.
> Windows is just used widely enough to spot the problem.

Well, atomic rename is required by POSIX.  True, we have found bugs
related to that in one or two Unix-ish platforms.  But nobody
is going to deny that those are OS bugs that the OS ought to fix,
rather than accepted behaviors that applications have to find
some way to work around.  I'm not pleased with the idea that
Windows' deficiencies in this area should result in kluges all over
our code.  I think we should stick to the autoconf recommendation:

Autoconf follows a philosophy that was formed over the years by
those who have struggled for portability: isolate the portability
issues in specific files, and then program as if you were in a
Posix environment.

regards, tom lane




Re: jsonb_set() strictness considered harmful to data

2020-01-07 Thread Pavel Stehule
Hi

On Mon, Jan 6, 2020 at 22:34, Andrew Dunstan <
andrew.duns...@2ndquadrant.com> wrote:

> On Thu, Nov 28, 2019 at 2:15 PM Andrew Dunstan
>  wrote:
> >
> >
> > On 11/27/19 9:35 PM, Michael Paquier wrote:
> > > On Fri, Nov 15, 2019 at 09:45:59PM +0100, Pavel Stehule wrote:
> > >> Maybe ERRCODE_NULL_VALUE_NOT_ALLOWED, and "NULL is not allowed",
> > >> errdetail - a exception due setting "null_value_treatment" =>
> > >> raise_exception
> > >> and maybe some errhint - "Maybe you would to use Jsonb NULL -
> "null"::jsonb"
> > >>
> > >> I don't know, but in this case, the exception should be verbose. This
> is
> > >> "rich" function with lot of functionality
> > > @Andrew: This patch is waiting on input from you for a couple of days
> > > now.
> > >
> >
> >
>
>
> Updated version including docco and better error message.
>
> cheers
>
> andrew
>

I think my objections are solved. I have one small objection:

+ errdetail("exception raised due to \"null_value_treatment :=
'raise_exception'\""),
+ errhint("to avoid, either change the null_value_treatment argument or
ensure that an SQL NULL is not used")));

"null_value_treatment := 'raise_exception'\""

it uses the proprietary PostgreSQL syntax for named parameters. Better to use
the ANSI/SQL syntax:

"null_value_treatment => 'raise_exception'\""

It is fixed in the attached patch.
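
For illustration, with the signature added by the patch below, a call in the
ANSI-style notation would look like this (a minimal sketch; the literal
arguments are only for demonstration, and with
null_value_treatment => 'raise_exception' it raises the patch's
"NULL is not allowed" error):

  SELECT jsonb_set_lax('[{"f1":1,"f2":null},2]', '{0,f1}', NULL,
                       true, null_value_treatment => 'raise_exception');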

source compilation without warnings,
compilation docs without warnings
check-world passed without any problems

I'll mark this patch as ready for committer

Thank you for your work

Pavel


>
> --
> Andrew Dunstan    https://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4b42f12862..72072e7545 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12231,6 +12231,9 @@ table2-mapping
   
jsonb_set
   
+  
+   jsonb_set_lax
+  
   
jsonb_insert
   
@@ -12545,6 +12548,26 @@ table2-mapping
  [{"f1": 1, "f2": null, "f3": [2, 3, 4]}, 2]
 

+  
+   jsonb_set_lax(target jsonb, path text[], new_value jsonb , create_missing boolean , null_value_treatment text)
+ 
+   jsonb
+   
+If new_value is not null,
+behaves identically to jsonb_set. Otherwise behaves
+according to the value of null_value_treatment
+which must be one of 'raise_exception',
+'use_json_null', 'delete_key', or
+'return_target'. The default is
+'use_json_null'.
+   
+   jsonb_set_lax('[{"f1":1,"f2":null},2,null,3]', '{0,f1}',null)
+ jsonb_set_lax('[{"f1":99,"f2":null},2]', '{0,f3}',null, true, 'return_target')
+ 
+   [{"f1":null,"f2":null},2,null,3]
+ [{"f1": 99, "f2": null}, 2]
+
+   
   


diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2fc3e3ff90..1cb2af1bcd 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1237,6 +1237,15 @@ LANGUAGE INTERNAL
 STRICT IMMUTABLE PARALLEL SAFE
 AS 'jsonb_set';
 
+CREATE OR REPLACE FUNCTION
+  jsonb_set_lax(jsonb_in jsonb, path text[] , replacement jsonb,
+create_if_missing boolean DEFAULT true,
+null_value_treatment text DEFAULT 'use_json_null')
+RETURNS jsonb
+LANGUAGE INTERNAL
+CALLED ON NULL INPUT IMMUTABLE PARALLEL SAFE
+AS 'jsonb_set_lax';
+
 CREATE OR REPLACE FUNCTION
   parse_ident(str text, strict boolean DEFAULT true)
 RETURNS text[]
diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
index ab5a24a858..4b5a0214dc 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -4395,6 +4395,70 @@ jsonb_set(PG_FUNCTION_ARGS)
 }
 
 
+/*
+ * SQL function jsonb_set_lax(jsonb, text[], jsonb, boolean, text)
+ */
+Datum
+jsonb_set_lax(PG_FUNCTION_ARGS)
+{
+	/* Jsonb	   *in = PG_GETARG_JSONB_P(0); */
+	/* ArrayType  *path = PG_GETARG_ARRAYTYPE_P(1); */
+	/* Jsonb	  *newval = PG_GETARG_JSONB_P(2); */
+	/* bool		create = PG_GETARG_BOOL(3); */
+	text   *handle_null;
+	char   *handle_val;
+
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(3))
+		PG_RETURN_NULL();
+
+	/* could happen if they pass in an explicit NULL */
+	if (PG_ARGISNULL(4))
+		ereport(ERROR,
+(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("need delete_key, return_target, use_json_null, or raise_exception")));
+
+	/* if the new value isn't an SQL NULL just call jsonb_set */
+	if (! PG_ARGISNULL(2))
+		return jsonb_set(fcinfo);
+
+	handle_null = PG_GETARG_TEXT_P(4);
+	handle_val = text_to_cstring(handle_null);
+
+	if (strcmp(handle_val,"raise_exception") == 0)
+	{
+		ereport(ERROR,
+(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
+ errmsg("NULL is not allowed"),
+ errdetail("exception raised due to \"null_value_treatment => 'raise_exception'\""),
+ errhint("to avoid, either change the null_value_treatment argument or ensu

Re: xact_start for walsender & logical decoding not updated

2020-01-07 Thread Alvaro Herrera
On 2019-Dec-29, Tomas Vondra wrote:

> On Fri, Dec 27, 2019 at 04:46:18PM -0300, Alvaro Herrera wrote:
> > 
> > This patch changes xact.c to avoid updating transaction start timestamps
> > for walsenders (maybe more commentary is desirable).  I think logical
> > decoding is just a special form of walsender and thus it would also be
> > updated by this patch, unless I misunderstood what Tomas explained.
> > 
> 
> It's true walsender should not be doing any read-write transactions or
> executing statements (well, maybe a decoding plugin could, but using
> historic snapshot).
> 
> So I agree not leaving xact_start for walsender processes seems OK.

OK, I pushed my patch to branches 10 - master.
(See https://postgr.es/m/20200107211624.GA18974@alvherre.pgsql )

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Removing pg_pltemplate and creating "trustable" extensions

2020-01-07 Thread Stephen Frost
Greetings!

* Robert Haas (robertmh...@gmail.com) wrote:
> On Tue, Jan 7, 2020 at 1:17 PM Stephen Frost  wrote:
> > Why would it be trusted if it's $SCARY_EXTENSION ...?  Why are we trying
> > to punt on solving for that question by installing a much more
> > complicated system here than is really necessary, just to avoid having
> > to make that decision?
> 
> I'm not convinced that whether or not something is trusted is an
> altogether objective question. 

How have we managed to have an answer to that question for all of the
languages that work with PG then..?  I feel like the answer is actually
pretty clear, at least if we view it in that light, and in the specific
case below, we are in agreement on which way it goes.

> For instance, postgres_fdw probably
> doesn't let you become the superuser, unless it has bugs. But it does
> let you make network connections originating from the database host,
> and somebody might reasonably want to restrict that in a
> security-sensitive environment. But the same user might be totally OK
> with a particular database owner installing citext.

Agreement!  Progress!  At least as it relates to, specifically,
postgres_fdw and about how non-superusers should have to be granted
something special to be allowed to make network connections.

Here's the thing though..  creating the extension isn't *really* (in our
permissions model anyway) what lets you create outbound connections-
it's creating a 'SERVER', and to be able to do that you need to have
USAGE rights on the FDW, which, normally, only a superuser can create.
The crux here is that the FDW is created as part of the extension
though.  As long as only superusers can create extensions, that's fine,
but when we allow others to do so, we come to an interesting question:

No matter how we end up allowing a non-superuser to create a trusted
extension, who should end up owning it and being able to modify it
and/or grant access to objects within it?
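
For reference, the privilege chain described above looks roughly like this
today (the role and server names are hypothetical):

  -- as superuser
  CREATE EXTENSION postgres_fdw;
  GRANT USAGE ON FOREIGN DATA WRAPPER postgres_fdw TO alice;

  -- only with that USAGE grant can alice then do:
  CREATE SERVER remote_pg FOREIGN DATA WRAPPER postgres_fdw
      OPTIONS (host 'example.org', dbname 'remotedb');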

We don't currently have anything that prevents objects from an
extension from being modified by their owner, for example, and that
seems like a problem from where I'm sitting when you're talking about
having non-superusers creating objects that previously only superusers
could, and where the ownership-level rights on those objects would allow
that user to do things we generally don't feel an 'untrusted' user
should be able to.

Which basically leads to- in my mental model of this, the 'create
trusted extension' action would be kind of like a 'sudo apt install',
where the result is an extension that's installed as if a superuser did
install it and therefore it's owned by a superuser and the DB owner
can't go monkey around with any of the functions or tables or such
(unless allowed by the extension), beyond granting access (or not) to
the schema that the extension is installed into (which is actually more
than the 'sudo apt install' example above would probably let you do).
Further, that installation doesn't give the DB owner any more rights to
do things on the system than they already had.

Of course, there's the other option, which is to just agree that,
because of the way postgres_fdw works, it's gotta be marked as
untrusted.  I would ask though- are we really sure that we aren't ever
going to have any issues with functions in untrusted languages (or any
other objects) created by extensions being owned by non-superusers?

> > If these functions were to just be put into core (as many really should
> > be...), instead of being out in contrib, this whole issue also wouldn't
> > exist and everyone would be able to use the functions (at least, those
> > that we decide are safe for users to directly use- and with appropriate
> > privilege access over ones that aren't), without any "the superuser must
> > approve of this explicitly after installation" fuss.
> 
> Well, I don't agree with the idea of moving everything into core, but
> I think a good solution to the problem at hand will reduce the fuss
> while allowing superusers to retain some control.

I don't actually mean everything, just to be clear, but a whole lot of
what's in contrib really could be in core with only a relatively modest
increase in the size of our base install/catalog (and, yes, I know some
people would complain about that, but in that case I'd argue that maybe
we should arrange to let them optionally not include those during the
build or something else because they're probably taking other steps to
minimize the size of PG on disk if they care that much..).

Having something like postgres_fdw installed as part of core would also
address the complications we have above regarding who owns it and who
gets to grant out access to it.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: pgsql: Add basic TAP tests for psql's tab-completion logic.

2020-01-07 Thread Tom Lane
I wrote:
> I installed libedit_3.1-20150325.orig.tar.gz from source here, and it
> passes our current regression test and seems to behave just fine in
> light manual testing.  (I did not apply any of the Debian-specific
> patches at [1], but they don't look like they'd explain much.)
> So I'm a bit at a loss as to what's going wrong for you.  Is the test
> environment for Xenial the same as for the other branches?

To dig deeper, I set up an actual installation of xenial, and on that
I can replicate the tab-completion misbehavior you reported.  The cause
appears to be that libedit's rl_line_buffer does not contain the current
line as expected, but the previous line (or, initially, an empty string).
Thus, the hack I put in to make things pass on current libedit actually
makes things worse on this version --- although it doesn't fully pass
even if I revert ddd87d564, since there are other places where we
depend on rl_line_buffer to be valid.

So that raises the question: why does xenial's version of libedit
not match either its documentation or the distributed source code?
Because it doesn't.

regards, tom lane




Re: [QUESTION/PROPOSAL] loose quadtree in spgist

2020-01-07 Thread Tomas Vondra

On Tue, Jan 07, 2020 at 11:33:31AM -0500, Peter Griggs wrote:

Hello, I wanted some guidance/suggestions about creating an spgist
extension. For context, I am a grad student doing research that involves
comparing the performance of different indexes for spatial data. We've
built a system that uses Postgres and one of the data structures we want to
use is a loose quadtree, but there is no implementation of this data
structure in spgist. The reason why I think this is pretty do-able is that
it is quite similar to a quadtree on boxes, which is implemented in
src/backend/utils/adt/geo_spgist.c.

Additionally, I found by grepping through the repo for the existing
functions in spgist/box_ops operator class that several catalog files need
to be updated to reflect a new operator class in spgist. The files that I
believe need to be changed to create a new
spgist_loose_box_ops operator class are:

src/include/catalog/pg_amop.dat
src/include/catalog/pg_amproc.dat
src/include/catalog/pg_opclass.dat
src/include/catalog/pg_opfamily.dat



You should probably try using CREATE OPERATOR CLASS command [1], not
modify the catalogs directly. That's only necessary for built-in index
types (i.e. available right after initdb). But you mentioned you're
working on an extension, so the command is the right thing to do (after
all, you don't know OIDs of objects from the extension).

[1] https://www.postgresql.org/docs/current/sql-createopclass.html
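
A rough sketch of what that might look like for your case follows; the
operator strategy numbers and support function names are placeholders for
whatever your extension actually provides, and an SP-GiST opclass needs the
five support functions shown:

  CREATE OPERATOR CLASS loose_box_ops
      FOR TYPE box USING spgist AS
          OPERATOR 3  &&,
          OPERATOR 6  ~=,
          OPERATOR 7  @>,
          OPERATOR 8  <@,
          FUNCTION 1  loose_box_config(internal, internal),
          FUNCTION 2  loose_box_choose(internal, internal),
          FUNCTION 3  loose_box_picksplit(internal, internal),
          FUNCTION 4  loose_box_inner_consistent(internal, internal),
          FUNCTION 5  loose_box_leaf_consistent(internal, internal);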



I've poked around quite a bit in the spgist code and have tried making
minimal changes to geo_spgist.c, but I haven't done any development on
postgres before, so I'm running into some issues that I couldn't find help
with on the postgres slack, by searching the mailing list, or by scouring
the development wikis.


Well, learning the ropes may take a bit of time, and pgsql-hackers is
probably the right place to ask ...


For example, I wanted to just print out some data to
see what quadrant a box is being placed into in the geo_spgist.c code. I
understand that printing to stdout won't work in postgres, but I thought
that I could possibly write some data to the logfile. I tried updating a
function to use both elog and ereport and re-built the code. However, I
can't get anything to print out to the logfile no matter what I try. Does
anyone have tips for printing out and debugging in general for postgres
development?



Well, elog/ereport are the easiest approach (it's what I'd do), and they
do about the same thing. The main difference is that ereport allows
translations of messages to other languages, while elog is for internal
things that should not happen (unexpected errors, ...). For debugging
just use elog(), I guess.

It's hard to say why you're not getting anything logged, because you
haven't shown us any code. My guess is that you're using a log level that
is not high enough to make it into the log file.

The default config in postgresql.conf says

  log_min_messages = warning

which means the level has to be at least WARNING to make it into the
file. So either WARNING, ERROR, LOG, FATAL, PANIC. So for example

  elog(INFO, "test message");

won't do anything, but

  elog(LOG, "test message");

will write stuff to the log file. If you use WARNING, you'll actually
get the message on the client console (well, there's client_min_messages
but you get the idea).
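
If you just want to see low-severity messages while experimenting, you can
also lower the thresholds for your session instead of editing
postgresql.conf (log_min_messages can only be changed by a superuser, which
is usually what you are on a dev install anyway):

  SET log_min_messages = debug1;     -- this backend now logs DEBUG1 and up
  SET client_min_messages = debug1;  -- and psql shows them on the console too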



Any tips or guidance would be much appreciated. Also, if there's a
different route I should go to turn this into a proposal for a patch
please let me know. I'm new to postgres dev.



A general recommendation is to show snippets of code, so that people on
this list actually can help without too much guessing what you're doing.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: xact_start for walsender & logical decoding not updated

2020-01-07 Thread Tom Lane
Alvaro Herrera  writes:
> OK, I pushed my patch to branches 10 - master.
> (See https://postgr.es/m/20200107211624.GA18974@alvherre.pgsql )

The buildfarm seems less than happy with this.

regards, tom lane




Re: A rather hackish POC for alternative implementation of WITH TIES

2020-01-07 Thread Andrew Gierth
> "Surafel" == Surafel Temesgen  writes:

 Surafel> Unlike most other executor nodes, the limit node has an
 Surafel> implementation for handling backward scan that supports cursor
 Surafel> operations, but your approach doesn't do this inherently because
 Surafel> it outsources the limitNode functionality to a window function,
 Surafel> and window functions don't do this

Correct. But this is a non-issue: if you want to be able to do backward
scan you are supposed to declare the cursor as SCROLL; if it happens to
work without it, that is pure coincidence. (Cursors declared with neither
SCROLL nor NO SCROLL support backwards scan only if the underlying plan
supports backward scan with no additional overhead, which is something
you can't predict from the query.)
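
In other words, if backward fetches matter then the cursor should say so
explicitly, e.g. (table and column names are hypothetical):

  BEGIN;
  DECLARE c SCROLL CURSOR FOR
      SELECT * FROM t ORDER BY x LIMIT 10;
  FETCH FORWARD 5 FROM c;
  FETCH BACKWARD 2 FROM c;   -- guaranteed to work only because of SCROLL
  COMMIT;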

The Limit node declares that it supports backwards scan if, and only if,
its immediate child node supports it. It happens that WindowAgg does
not, so in this implementation, LIMIT ... WITH TIES will not support
backward scan without a tuplestore. I don't consider this an especially
big deal; backward scans are extremely rare (as shown by the fact that
bugs in backward scan have tended to go unnoticed for decades, e.g. bug
#15336), and therefore we should not optimize for them.

 Surafel> If I am not mistaken, the patch also reevaluates the limit every time

The (offset+limit) expression is, yes. I noted in the original post that
this needs work - probably it should be pushed out to an InitPlan if it
doesn't fold to a constant. i.e. using the expression

  rank() over (...) > (select offset+limit)

where it currently has

  rank() over (...) > (offset+limit)

(Generating the limit expression so late in planning is the main thing
that needs changing to get this from a hack POC to usable code)
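
For readers following along, the user-visible semantics correspond roughly to
this SQL-level rewrite (hypothetical table t, a limit of 10, no OFFSET):

  SELECT *
    FROM (SELECT t.*, rank() OVER (ORDER BY x) AS rnk FROM t) s
   WHERE rnk <= 10
   ORDER BY x;

i.e. rank() <= N keeps the first N rows plus any rows tied with the Nth one,
which is exactly what LIMIT N WITH TIES is supposed to return.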

The main point here is that the same rather minimal executor changes
allow support for not only WITH TIES but also PERCENT and possibly
arbitrary stop conditions as well. (I know I've often wanted LIMIT WHEN
to stop a query at a data-dependent point without having to resort to
recursion - this patch doesn't quite get there, because of the scope
issues involved in analyzing the WHEN condition, but it at least sets up
the concept.)

-- 
Andrew (irc:RhodiumToad)




Re: xact_start for walsender & logical decoding not updated

2020-01-07 Thread Tom Lane
I wrote:
> The buildfarm seems less than happy with this.

... and, having now looked at the patch, I'm not surprised.
Breaking stmtStartTimestamp, which is what you did, seems like
an awfully side-effect-filled route to the goal.  If you want
to prevent monitoring from showing this, why didn't you just
prevent monitoring from showing it?  That is, I'd have expected
some am_walsender logic in or near pgstat.c, not here.

regards, tom lane




Re: Extracting only the columns needed for a query

2020-01-07 Thread Melanie Plageman
I just wanted to address a question we got about making scanCols
instead of using RelOptInfo->attr_needed.

David Rowley had asked:

> For planning, isn't this information already available via either
> targetlists or from the RelOptInfo->attr_needed array combined with
> min_attr and max_attr?

attr_needed does not have all of the attributes needed set. Attributes
are only added in add_vars_to_targetlist() and this is only called for
certain query classes.

Also, Jeff Davis had asked off-list why we didn't start using
RelOptInfo->attr_needed for our purpose (marking which columns will
need to be scanned for the use of table access methods) instead of
adding a new member to RangeTblEntry.

The primary reason is that RelOptInfos are used during planning and
not execution. We need access to this information somewhere in the
PlannedStmt.

Even if we used attr_needed, we would, at some point, need to transfer
the columns needed to be scanned to a data structure available during
execution.

However, the next question was why not use attr_needed during costing
(which is the other time the table access method can use scanCols).

David Kimura and I dug into how attr_needed is used currently in
Postgres to understand if its meaning and usage is consistent with
using it instead of scanCols during costing.

We could set attr_needed in the same way we are now setting scanCols
and then use it during costing, however, besides the fact that we
would then have to create a member like scanCols anyway during
execution, it seems like adding an additional meaning to attr_needed
during planning is confusing and dangerous.

attr_needed is documented as being used to determine the highest
joinrel in which an attribute is needed (when it was introduced in
835bb975d8d).
Since then it has been extended to be used for removing joins and
relations from queries (b78f6264eba33) and to determine if whole-row
vars are used in a baserel, which isn't supported as a participant in
a partition-wise join (7cfdc77023ad507317).

It actually seems like attr_needed might be too general a name for
this field.
It isn't set for every attribute in a query -- only in specific cases
for attributes in certain parts of the query. If a developer checks
attr_needed for the columns in his/her query, many times those
columns will not be present.
It might actually be better to rename attr_needed to clarify its
usage.
scanCols, on the other hand, has a clear meaning and usage. For table
access methods, scanCols are the columns that need to be scanned.
There might even be cases where this ends up being a different set
than attr_needed for a base rel.

Melanie & David


Re: Removing pg_pltemplate and creating "trustable" extensions

2020-01-07 Thread Robert Haas
On Tue, Jan 7, 2020 at 4:36 PM Stephen Frost  wrote:
> Here's the thing though..  creating the extension isn't *really* (in our
> permissions model anyway) what lets you create outbound connections-
> it's creating a 'SERVER', and to be able to do that you need to have
> USAGE rights on the FDW, which, normally, only a superuser can create.
> The crux here is that the FDW is created as part of the extension
> though.  As long as only superusers can create extensions, that's fine,
> but when we allow others to do so, we come to an interesting question:
>
> No matter how we end up allowing a non-superuser to create a trusted
> extension, who should end up owning it and being able to modify it
> and/or grant access to objects within it?

Hmm.  Good question. But it's addressed in the documentation for the
patch Tom wrote, so I don't know why we need to discuss it de novo.
His answer seems pretty sensible and also happens to, I think, match
what you've written here.

> Of course, there's the other option, which is to just agree that,
> because of the way postgres_fdw works, it's gotta be marked as
> untrusted.  I would ask though- are we really sure that we aren't ever
> going to have any issues with functions in untrusted languages (or any
> other objects) created by extensions being owned by non-superusers?

But I don't see what the question of "who owns the objects?" has to do
with whether a superuser might want to allow some extensions to be
installed but not others. I think someone might want that, and if I
understand correctly, Tom thought so too when he wrote v1 of the
patch, because it had some capabilities along these lines. All I'm
doing is arguing that his first instinct was correct. And I'm not even
sure that you're disagreeing, since you seem to think that the
question of whether postgres_fdw ought to be marked trusted is
debatable. I'm really not sure what we're arguing about here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: xact_start for walsender & logical decoding not updated

2020-01-07 Thread Alvaro Herrera
On 2020-Jan-07, Tom Lane wrote:

> I wrote:
> > The buildfarm seems less than happy with this.
> 
> ... and, having now looked at the patch, I'm not surprised.
> Breaking stmtStartTimestamp, which is what you did, seems like
> an awfully side-effect-filled route to the goal.  If you want
> to prevent monitoring from showing this, why didn't you just
> prevent monitoring from showing it?  That is, I'd have expected
> some am_walsender logic in or near pgstat.c, not here.

That seems a pretty simple patch; attached (untested).  However, my
patch seemed a pretty decent way to achieve the goal, and I don't
understand why it causes the failure, or indeed why we care about
stmtStartTimestamp at all.  I'll look into this again tomorrow.

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
commit 2808f98edb6889ec602534e070e6a2c660f6377e
Author: Alvaro Herrera 
AuthorDate: Tue Jan 7 20:33:28 2020 -0300
CommitDate: Tue Jan 7 20:33:28 2020 -0300

fix previous

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 0dd6b82f99..9b252213d9 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -758,13 +758,6 @@ GetCurrentTransactionStopTimestamp(void)
 void
 SetCurrentStatementStartTimestamp(void)
 {
-	/*
-	 * Skip if on a walsender; this is not needed, and it confuses monitoring
-	 * if we publish non-NULL values.
-	 */
-	if (am_walsender)
-		return;
-
 	if (!IsParallelWorker())
 		stmtStartTimestamp = GetCurrentTimestamp();
 	else
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 20ce48b2d8..8557684aad 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -732,7 +732,8 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
 			else
 nulls[7] = true;
 
-			if (beentry->st_xact_start_timestamp != 0)
+			if (beentry->st_xact_start_timestamp != 0 ||
+beentry->st_backendType != B_WAL_SENDER)
 values[8] = TimestampTzGetDatum(beentry->st_xact_start_timestamp);
 			else
 nulls[8] = true;


Re: Avoiding hash join batch explosions with extreme skew and weird stats

2020-01-07 Thread Thomas Munro
On Mon, Dec 30, 2019 at 4:34 PM Melanie Plageman
 wrote:
> So, I finally have a prototype to share of parallel hashloop fallback.

Hi Melanie,

Thanks for all your continued work on this!  I started looking at it
today; it's a difficult project and I think it'll take me a while to
grok.  I do have some early comments though:

* I am uneasy about BarrierArriveExplicitAndWait() (a variant of
BarrierArriveAndWait() that lets you skip directly to a given phase?);
perhaps you only needed that for a circular phase system, which you
could do with modular phase numbers, like PHJ_GROW_BATCHES_PHASE?  I
tried to make the barrier interfaces look like the libraries in other
parallel programming environments, and I'd be worried that the
explicit phase thing could easily lead to bugs.
* It seems a bit strange to have "outer_match_status_file" in
SharedTupleStore; something's gone awry layering-wise there.
* I'm not sure it's OK to wait at the end of each loop, as described
in the commit message:

Workers probing a fallback batch will wait until all workers have
finished probing before moving on so that an elected worker can read
and combine the outer match status files into a single bitmap and use
it to emit unmatched outer tuples after all chunks of the inner side
have been processed.

Maybe I misunderstood completely, but that seems to break the
programming rule described in nodeHashjoin.c's comment beginning "To
avoid deadlocks, ...".  To recap: (1) When you emit a tuple, the
program counter escapes to some other node, and maybe that other node
waits for thee, (2) Maybe the leader is waiting for you but you're
waiting for it to drain its queue so you can emit a tuple (I learned a
proper name for this: "flow control deadlock").  That's why the
current code only ever detaches (a non-waiting operation) after it's
begun emitting tuples (that is, the probing phase).  It just moves
on to another batch.  That's not a solution here: you can't simply move
to another loop; loops are not independent of each other the way batches are.
It's possible that barriers are not the right tool for this part of
the problem, or that there is a way to use a barrier that you don't
remain attached to while emitting, or that we should remove the
deadlock risks another way entirely[1] but I'm not sure.  Furthermore,
the new code in ExecParallelHashJoinNewBatch() appears to break the
rule even in the non-looping case (it calls BarrierArriveAndWait() in
ExecParallelHashJoinNewBatch(), where the existing code just
detaches).

> This patch does contain refactoring of nodeHashjoin.
>
> I have split the Parallel HashJoin and Serial HashJoin state machines
> up, as they were diverging in my patch to a point that made for a
> really cluttered ExecHashJoinImpl() (ExecHashJoinImpl() is now gone).

Hmm.  I'm rather keen on extending that technique further: I'd like
there to be more configuration points in the form of parameters to
that function, so that we write the algorithm just once but we
generate a bunch of specialised variants that are the best possible
machine code for each combination of parameters via constant-folding
using the "always inline" trick (steampunk C++ function templates).
My motivations for wanting to do that are: supporting different hash
sizes (CF commit e69d6445), removing branches for unused optimisations
(eg skew), and inlining common hash functions.  That isn't to say we
couldn't have two different templatoid functions from which many
others are specialised, but I feel like that's going to lead to a lot
of duplication.

> The reason I didn't do this refactoring in one patch and then put the
> adaptive hashjoin code on top of it is that I might like to make
> Parallel HashJoin and Serial HashJoin different nodes.
>
> I think that has been discussed elsewhere and was looking to
> understand the rationale for keeping them in the same node.

Well, there is a discussion about getting rid of the Hash node, since
it's so tightly coupled with Hash Join that it might as well not exist
as a separate entity.  (Incidentally, I noticed in someone's blog that
MySQL now shows Hash separately in its PostgreSQL-style EXPLAIN
output; now we'll remove it, CF the Dr Seuss story about the
Sneetches).  But as for Parallel Hash Join vs [Serial] Hash Join, I
think it makes sense to use the same node because they are
substantially the same thing, with optional extra magic, and I think
it's our job to figure out how to write code in a style that makes the
differences maintainable.  That fits into a general pattern that
"Parallel" is a mode, not a different node.  On the other hand, PHJ is
by far the most different from the original code, compared to things
like Parallel Sequential Scan etc.  FWIW I think we're probably in
relatively new territory here: as far as I know, other traditional
RDBMSs didn't really seem to have a concept like parallel-aware
executor nodes, because they tended to be based on partitioning, so
that the operators are a

Re: Created feature for to_date() conversion using patterns 'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'

2020-01-07 Thread Tom Lane
Mark Lorenz  writes:
>> Why not?  These format codes are specified as
>> Dday of the week, Sunday (1) to Saturday (7)
>> WW   week number of year (1–53) (the first week starts on the first day
>> of the year)

> I don't want to create any connection here. The day is calculated 
> correctly. But the week number is wrong. 1997-02-03 was in week number 
> 6, as well as 1997-02-04. But Postgres returns 5.

The week number is only wrong if you persist in ignoring the very clear
definition given in the manual.  According to the stated definition of WW,
"week 1" consists of Jan 1 to Jan 7, "week 2" to Jan 8-14, etc.  So it's
correct for both of those dates to be in "week 5".  There are other
possible definitions of "week" of course, such as the ISO week, under
which both those dates would be in week 6 (of 1997 anyway, not all other
years).  But if you want ISO week you should ask for it with "IW", not
expect that we'll change the longstanding behavior of "WW" to match.

As far as I can see, the only way to make a week definition that
gives sensible results in combination with "D" is to do something
like what ISO does, but with Sunday as the start day instead of Monday.
But having three different week definitions seems more likely to
confuse people (even more) than to be helpful.  Plus you'd also need
analogs of IYYY, IDDD, etc.

Why not just use IYYY-IW-ID, instead?  You'd have to adapt to
week-starts-on-Monday, but you'd be using a notation that a lot
of people are already familiar with, instead of inventing your own.

Another possibility, perhaps, is to use WW in combination with
some new field that counts 1-7, 1-7, 1-7, ... starting on Jan 1.
But then that wouldn't have any easy mapping to day names, so
there's no free lunch.

Throwing MM into the mix makes it even more exciting, as month
boundaries don't correspond with week boundaries either.  I don't
see any rational way to make -MM-W or -MM-W-D patterns
that behave in a numerically consistent fashion.  (Note that ISO
didn't try --- there is no "ISO month".)

The bottom line is that these various definitions aren't mutually
consistent, and that's just a fact of life, not something that can
be fixed.

In any case, backwards compatibility alone would be a sufficient
reason to reject a patch that changes the established behavior
of the existing format codes.  Whether you think they're buggy or
not, other people are relying on the existing documented behavior.

Perhaps we'd consider a patch that adds some new format codes with
new behavior.  But personally I'd vote against implementing new
format codes unless you can point to well-established standards
supporting their definitions.  to_char/to_date are impossibly
complex and unmaintainable already; we don't need to add more
features with narrow use-cases to them.

regards, tom lane




Re: Removing pg_pltemplate and creating "trustable" extensions

2020-01-07 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Tue, Jan 7, 2020 at 4:36 PM Stephen Frost  wrote:
> > Here's the thing though..  creating the extension isn't *really* (in our
> > permissions model anyway) what lets you create outbound connections-
> > it's creating a 'SERVER', and to be able to do that you need to have
> > USAGE rights on the FDW, which, normally, only a superuser can create.
> > The crux here is that the FDW is created as part of the extension
> > though.  As long as only superusers can create extensions, that's fine,
> > but when we allow others to do so, we come to an interesting question:
> >
> > No matter how we end up allowing a non-superuser to create a trusted
> > extension, who should end up owning it and being able to modify it
> > and/or grant access to objects within it?
> 
> Hmm.  Good question. But it's addressed in the documentation for the
> patch Tom wrote, so I don't know why we need to discuss it de novo.
> His answer seems pretty sensible and also happens to, I think, match
> what you've written here.

I would disagree about it matching what I wrote, but only because it
goes farther and lets the extension choose, which is even better.
Thanks for pointing that out, I had missed how that was addressed.

> > Of course, there's the other option, which is to just agree that,
> > because of the way postgres_fdw works, it's gotta be marked as
> > untrusted.  I would ask though- are we really sure that we aren't ever
> > going to have any issues with functions in untrusted languages (or any
> > other objects) created by extensions being owned by non-superusers?
> 
> But I don't see what the question of "who owns the objects?" has to do
> with whether a superuser might want to allow some extensions to be
> installed but not others. I think someone might want that, and if I
> understand correctly, Tom thought so too when he wrote v1 of the
> patch, because it had some capabilities along these lines. All I'm
> doing is arguing that his first instinct was correct. And I'm not even
> sure that you're disagreeing, since you seem to think that the
> question of whether postgres_fdw ought to be marked trusted is
> debatable. I'm really not sure what we're arguing about here.

Here's the thing though- I *am* disagreeing, and that Tom had addressed
the ownership issue solidifies my feeling that the justification that's
been proposed for why a superuser might want to allow some extensions to
be installed but not others (beyond the "trustable" question that we are
proposing to address) isn't valid.

You raised the point regarding postgres_fdw and a DB owner being able to
run 'create extension postgres_fdw;' and to then make network
connections, but that's proven to be invalid because, assuming we make
postgres_fdw trustable, we will surely make the FDW itself that's
created be owned by the bootstrap superuser and therefore the DB owner
*couldn't* create such network connections- at least, not without an
additional step being taken by a superuser.  Further, it's pretty clear
to everyone *why* that additional step has to be taken for postgres_fdw.

So I come back to the questions that were raised up-thread but either
weren't answered or were done so with invalid points, as explained above
regarding postgres_fdw:

What's the security issue from installing a trusted extension?

Why would a $SCARY_EXTENSION be marked as trusted?

If there's no security issue, and no $SCARY_EXTENSION that's marked as
trusted, then why wouldn't a superuser be comfortable allowing a DB
owner to install a trusted extension into the DB they own?  The DB where
they can create any other trusted object from functions in trusted
languages to operators to schemas to tables and indexes and views and
all the others?  What is the superuser concerned about?  What do they
need to check before allowing this?  What's dangerous about allowing it?
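
To make that comparison concrete (the extension name is only an example,
and assumes it ends up marked trusted):

-- things a non-superuser database owner can already do today
CREATE SCHEMA app;
CREATE TABLE app.t (id int);
CREATE FUNCTION app.noop() RETURNS void LANGUAGE plpgsql AS 'begin end';

-- the one additional capability under discussion
CREATE EXTENSION citext;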

Maybe it would help to say that I'm seeing the pattern here being
something along the lines of:

0) DBA owns prod database, but is not a superuser.
1) DBA decides they want $trusted_extension but it isn't installed
2) DBA submits a ticket to the infra team and says "please install this
   RPM on to this database server"
3) infra reviews that request and decides if they're ok with the RPM
4) infra resolves the ticket and installs the RPM,
5) DBA then runs 'create extension $trusted_extension;' and go about
   doing whatever it is they want to do with that extension.

The approach you're advocating for, assuming I've understood it
correctly, requires the infra team to also log in to the database as the
postgres superuser and then grant this role to the DBA, and I just don't
see the justification for that additional step, and I'm sure they're
going to be asking themselves "why do I need to do this..?  what power
is this granting?  why is this dangerous?  What about this isn't to be
trusted?"

To the question of "how do we know if the extension is trusted?" I
answer- how do w

Re: xact_start for walsender & logical decoding not updated

2020-01-07 Thread Tom Lane
Alvaro Herrera  writes:
> On 2020-Jan-07, Tom Lane wrote:
>> ... and, having now looked at the patch, I'm not surprised.
>> Breaking stmtStartTimestamp, which is what you did, seems like
>> an awfully side-effect-filled route to the goal.  If you want
>> to prevent monitoring from showing this, why didn't you just
>> prevent monitoring from showing it?  That is, I'd have expected
>> some am_walsender logic in or near pgstat.c, not here.

> That seems a pretty simple patch; attached (untested).

I think you want && not ||, but otherwise that looks about right.

> However, my
> patch seemed a pretty decent way to achieve the goal, and I don't
> understand why it causes the failure, or indeed why we care about
> stmtStartTimestamp at all.  I'll look into this again tomorrow.

I'm not 100% sure why the failure either.  The assertion is in
code that should only be reached in a parallel worker, and surely
walsenders don't launch parallel queries?  But it looks to me
that all the critters using force_parallel_mode are unhappy.

In any case, my larger point is that stmtStartTimestamp is globally
accessible state (via GetCurrentStatementStartTimestamp()) and you
can have little idea which corners of our code are using it, let
alone what extensions might expect about it.  Plus it feeds into
xactStartTimestamp (cf StartTransaction()), increasing the footprint
for unwanted side-effects even more.  Redefining its meaning
to fix this problem is a really bad idea IMO.
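
The monitoring in question is just what pg_stat_activity reports for
walsender backends, e.g.:

SELECT pid, backend_type, state, xact_start, query_start
FROM pg_stat_activity
WHERE backend_type = 'walsender';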

regards, tom lane




Re: Assert failure due to "drop schema pg_temp_3 cascade" for temporary tables and \d+ is not showing any info after drooping temp table schema

2020-01-07 Thread Michael Paquier
On Tue, Jan 07, 2020 at 01:06:08PM -0500, Tom Lane wrote:
> I still agree with Robert that a052f6c is a bad idea.  It's not the case
> that that's blocking "any connected user" from causing an issue.  The
> temp schemas are always owned by the bootstrap superuser, so only a
> superuser could delete them.  All that that patch is doing is preventing
> superusers from doing something that they could reasonably wish to do,
> and that is perfectly safe when there's not concurrent usage of the
> schema.  We are not normally that nanny-ish, and the case for being so
> here seems pretty thin.

Okay, I am running out of arguments then, so attached is a patch to
address things.  I would also prefer if we keep the relation name in
the log even if the namespace is missing.
--
Michael
diff --git a/src/backend/commands/dropcmds.c b/src/backend/commands/dropcmds.c
index e7891a4418..b3131ab208 100644
--- a/src/backend/commands/dropcmds.c
+++ b/src/backend/commands/dropcmds.c
@@ -101,21 +101,6 @@ RemoveObjects(DropStmt *stmt)
 		 errhint("Use DROP AGGREGATE to drop aggregate functions.")));
 		}
 
-		/*
-		 * Prevent the drop of a temporary schema, be it owned by the current
-		 * session or another backend as this would mess up with the callback
-		 * registered to clean up temporary objects at the end of a session.
-		 * Note also that the creation of any follow-up temporary object would
-		 * result in inconsistencies within the session whose temporary schema
-		 * has been dropped.
-		 */
-		if (stmt->removeType == OBJECT_SCHEMA &&
-			isAnyTempNamespace(address.objectId))
-			ereport(ERROR,
-	(errcode(ERRCODE_WRONG_OBJECT_TYPE),
-	 errmsg("cannot drop temporary schema \"%s\"",
-			get_namespace_name(address.objectId;
-
 		/* Check permissions. */
 		namespaceId = get_object_namespace(&address);
 		if (!OidIsValid(namespaceId) ||
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index f0e40e36af..f6afe4dbb8 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2250,11 +2250,22 @@ do_autovacuum(void)
 		}
 
 		/* OK, let's delete it */
-		ereport(LOG,
-(errmsg("autovacuum: dropping orphan temp table \"%s.%s.%s\"",
-		get_database_name(MyDatabaseId),
-		get_namespace_name(classForm->relnamespace),
-		NameStr(classForm->relname;
+		if (log_min_messages <= LOG)
+		{
+			char   *nspname = get_namespace_name(classForm->relnamespace);
+
+			if (nspname != NULL)
+ereport(LOG,
+		(errmsg("autovacuum: dropping orphan temp table \"%s.%s.%s\"",
+get_database_name(MyDatabaseId),
+get_namespace_name(classForm->relnamespace),
+NameStr(classForm->relname;
+			else
+ereport(LOG,
+		(errmsg("autovacuum: dropping orphan temp table \"%s.(null).%s\" with OID %u",
+get_database_name(MyDatabaseId),
+NameStr(classForm->relname), relid)));
+		}
 
 		object.classId = RelationRelationId;
 		object.objectId = relid;


signature.asc
Description: PGP signature


Re: Assert failure due to "drop schema pg_temp_3 cascade" for temporary tables and \d+ is not showing any info after drooping temp table schema

2020-01-07 Thread Tom Lane
Michael Paquier  writes:
> Okay, I am running out of arguments then, so attached is a patch to
> address things.  I would also prefer if we keep the relation name in
> the log even if the namespace is missing.

A couple of thoughts:

* Please revert a052f6c as a separate commit specifically doing that,
so that when it comes time to make the release notes, it's clear that
a052f6c doesn't require documentation.

* I think the check on log_min_messages <= LOG is probably wrong, since
LOG sorts out of order for this purpose.  Compare is_log_level_output()
in elog.c.  I'd suggest not bothering with trying to optimize away the
get_namespace_name call here; we shouldn't be in this code path often
enough for performance to matter, and nobody ever cared about it before.

* I don't greatly like the notation
dropping orphan temp table \"%s.(null).%s\" ...
and I bet Robert won't either.  Not sure offhand about a better
idea --- maybe
dropping orphan temp table \"%s\" with OID %u in database \"%s\"

regards, tom lane




Re: mdclose() does not cope w/ FileClose() failure

2020-01-07 Thread Kyotaro Horiguchi
At Wed, 1 Jan 2020 23:46:02 -0800, Noah Misch  wrote in 
> On Wed, Dec 25, 2019 at 10:39:32AM +0900, Kyotaro Horiguchi wrote:
> > I'm not sure which is better. If we say we know that
> > repalloc(AllocSetRealloc) doesn't free memory at all, there's no point
> > in calling repalloc for shrinking and we could omit that under the
> > name of optimization.  If we say we want to free memory as much as
> > possible, we should call repalloc pretending to believe that that
> > happens.
> 
> As long as we free the memory by the end of mdclose(), I think it doesn't
> matter whether we freed memory in the middle of mdclose().

Agreed.

> I ran a crude benchmark that found PathNameOpenFile()+FileClose() costing at
> least two hundred times as much as the repalloc() pair.  Hence, I now plan not
> to avoid repalloc(), as attached.  Crude benchmark code:

I got about a 25x difference with -O0 and about a 50x difference with
-O2 (xfs / CentOS 8).  It's smaller than I intuitively expected, but
perhaps a 50x difference is large enough.

The patch looks good to me.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: backup manifests

2020-01-07 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Fri, Jan 3, 2020 at 2:35 PM Stephen Frost  wrote:
> > > Well, I don't know how to make you happy here.
> >
> > I suppose I should admit that, first off, I don't feel you're required
> > to make me happy, and I don't think it's necessary to make me happy to
> > get this feature into PG.
> 
> Fair enough. That is gracious of you, but I would like to try to make
> you happy if it is possible to do so.

I certainly appreciate that, but I don't know that it is possible to do
so while approaching this in the order that you are, which I tried to
point out previously.

> > Since you expressed that interest though, I'll go out on a limb and say
> > that what would make me *really* happy would be to think about where the
> > project should be taking pg_basebackup, what we should be working on
> > *today* to address the concerns we hear about from our users, and to
> > consider the best way to implement solutions to what they're actively
> > asking for a core backup solution to be providing.  I get that maybe
> > that isn't how the world works and that sometimes we have people who
> > write our paychecks wanting us to work on something else, and yes, I'm
> > sure there are some users who are asking for this specific thing but I
> > certainly don't think it's a common ask of pg_basebackup or what users
> > feel is missing from the backup options we offer in core; we had users
> > on this list specifically saying they *wouldn't* use this feature
> > (referring to the differential backup stuff, of course), in fact,
> > because of the things which are missing, which is pretty darn rare.
> 
> Well, I mean, what you seem to be suggesting here is that somebody is
> driving me with a stick to do something that I don't really like but
> have to do because otherwise I won't be able to make rent, but that's
> actually not the case. I genuinely believe that this is a good design,
> and it's driven by me, not some shadowy conglomerate of EnterpriseDB
> executives who are out to make PostgreSQL suck. If I'm wrong and the
> design sucks, that's again not the fault of shadowy EnterpriseDB
> executives; it's my fault. Incidentally, my boss is not very shadowy
> anyhow; he's a super-nice guy, and a major reason why I work here. :-)

Then I just have to disagree, really vehemently, that having a
block-level incremental backup solution without solid dependency
handling between incremental and full backups, solid WAL management and
archiving, expiration handling for incremental/full backups and WAL, and
the manifest that this thread has been about, is a good design.

Ultimately, what this calls for is some kind of 'repository' which
you've stressed you don't think is a good idea for pg_basebackup to ever
deal with and I just can't disagree more with that.  I could perhaps
agree that it isn't appropriate for the specific tool "pg_basebackup" to
work with a repo because of the goal of that particular tool, but in
that case, I don't think pg_basebackup should be the tool to provide a
block-level incremental backup solution, it should continue to be a tool
to provide a simple and easy way to take a one-time, complete, snapshot
of a running PG system over the replication protocol- and adding support
for parallel backups, or encrypted backups, or similar things would be
completely in-line and appropriate for such a tool, and I'm not against
those features being added to pg_basebackup even in advance of anything
like support for a repo or dependency handling.

> I don't think the issue here is that I haven't thought about what
> users want, but that not everybody wants the same thing, and it's
> seems like the people with whom I interact want somewhat different
> things than those with whom you interact. EnterpriseDB has an existing
> tool that does parallel and block-level incremental backup, and I
> started out with the goal of providing those same capabilities in
> core. They are quite popular with EnterpriseDB customers, and I'd like
> to make them more widely available and, as far as I can, improve on
> them. From our previous discussion and from a (brief) look at
> pgbackrest, I gather that the interests of your customers are somewhat
> different. Apparently, block-level incremental backup isn't quite as
> important to your customers, perhaps because you've already got
> file-level incremental backup, but various other things like
> encryption and backup verification are extremely important, and you've
> got a set of ideas about what would be valuable in the future which
> I'm sure is based on real input from your customers. I hope you pursue
> those ideas, and I hope you do it in core rather than in a separate
> piece of software, but that's up to you. Meanwhile, I think that if I
> have somewhat different ideas about what I'd like to pursue, that
> ought to be just fine. And I don't think it is unreasonable to hope
> that you'll acknowledge my goals as legitimate even i

Re: Consolidate 'unique array values' logic into a reusable function?

2020-01-07 Thread Thomas Munro
On Sun, Dec 29, 2019 at 8:02 PM Noah Misch  wrote:
> ==00:00:00:28.322 1527557== Source and destination overlap in 
> memcpy(0x1000104, 0x1000104, 4)
> ==00:00:00:28.322 1527557==at 0x4C2E74D: memcpy@@GLIBC_2.14 
> (vg_replace_strmem.c:1035)
> ==00:00:00:28.322 1527557==by 0xA9A57B: qunique (qunique.h:34)
> ==00:00:00:28.322 1527557==by 0xA9A843: InitCatalogCache (syscache.c:1056)
> ==00:00:00:28.322 1527557==by 0xAB6B18: InitPostgres (postinit.c:682)
> ==00:00:00:28.322 1527557==by 0x91F98E: PostgresMain (postgres.c:3909)
> ==00:00:00:28.322 1527557==by 0x872DE9: BackendRun (postmaster.c:4498)
> ==00:00:00:28.322 1527557==by 0x8725B3: BackendStartup (postmaster.c:4189)
> ==00:00:00:28.322 1527557==by 0x86E7F4: ServerLoop (postmaster.c:1727)
> ==00:00:00:28.322 1527557==by 0x86E0AA: PostmasterMain (postmaster.c:1400)
> ==00:00:00:28.322 1527557==by 0x77CB56: main (main.c:210)
> ==00:00:00:28.322 1527557==
> {
>
>Memcheck:Overlap
>fun:memcpy@@GLIBC_2.14
>fun:qunique
>fun:InitCatalogCache
>fun:InitPostgres
>fun:PostgresMain
>fun:BackendRun
>fun:BackendStartup
>fun:ServerLoop
>fun:PostmasterMain
>fun:main
> }
>
> This is like the problem fixed in 9a9473f; the precedent from there would be
> to test src!=dst before calling memcpy(), e.g. as attached.  I suppose the
> alternative would be to add a suppression like the one 9a9473f removed.

Thanks for fixing that.

> I do wonder why the Valgrind buildfarm animals haven't noticed.

Optimisation levels?  For example, I see that skink is using -Og, at
which level my local GCC inlines qunique() and memcpy() so that in the
case you quoted there's a MOV instruction and valgrind has nothing to
complain about.




Re: Assert failure due to "drop schema pg_temp_3 cascade" for temporary tables and \d+ is not showing any info after drooping temp table schema

2020-01-07 Thread Michael Paquier
On Tue, Jan 07, 2020 at 07:55:17PM -0500, Tom Lane wrote:
> * Please revert a052f6c as a separate commit specifically doing that,
> so that when it comes time to make the release notes, it's clear that
> a052f6c doesn't require documentation.

Okay.  Committed the revert first then.

> * I think the check on log_min_messages <= LOG is probably wrong, since
> LOG sorts out of order for this purpose.  Compare is_log_level_output()
> in elog.c.  I'd suggest not bothering with trying to optimize away the
> get_namespace_name call here; we shouldn't be in this code path often
> enough for performance to matter, and nobody ever cared about it before.

Done.

> * I don't greatly like the notation
> dropping orphan temp table \"%s.(null).%s\" ...
> and I bet Robert won't either.  Not sure offhand about a better
> idea --- maybe
> dropping orphan temp table \"%s\" with OID %u in database \"%s\"

And done this way as per the attached.  I am of course open to
objections or better ideas, though this formulation looks pretty
good to me.  Robert?
--
Michael
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index f0e40e36af..22f7bdeaff 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2202,6 +2202,7 @@ do_autovacuum(void)
 		Oid			relid = lfirst_oid(cell);
 		Form_pg_class classForm;
 		ObjectAddress object;
+		char	   *nspname;
 
 		/*
 		 * Check for user-requested abort.
@@ -2249,12 +2250,18 @@ do_autovacuum(void)
 			continue;
 		}
 
-		/* OK, let's delete it */
-		ereport(LOG,
-(errmsg("autovacuum: dropping orphan temp table \"%s.%s.%s\"",
-		get_database_name(MyDatabaseId),
-		get_namespace_name(classForm->relnamespace),
-		NameStr(classForm->relname;
+		nspname = get_namespace_name(classForm->relnamespace);
+
+		if (nspname != NULL)
+			ereport(LOG,
+	(errmsg("autovacuum: dropping orphan temp table \"%s.%s.%s\"",
+			get_database_name(MyDatabaseId),
+			nspname, NameStr(classForm->relname;
+		else
+			ereport(LOG,
+	(errmsg("autovacuum: dropping orphan temp table \"%s\" with OID %u in database \"%s\"",
+			NameStr(classForm->relname), relid,
+			get_database_name(MyDatabaseId;
 
 		object.classId = RelationRelationId;
 		object.objectId = relid;


signature.asc
Description: PGP signature


Re: [PATCH] Increase the maximum value track_activity_query_size

2020-01-07 Thread Michael Paquier
On Tue, Jan 07, 2020 at 12:21:26PM -0500, Robert Haas wrote:
> Done. I didn't commit the postgresql.conf.sample change because:
> 
> (1) I think Bruce voted against it.
> 
> (2) It makes the line a little wide, and I'd rather not do that.

Thanks!
--
Michael


signature.asc
Description: PGP signature


RE: Let people set host(no)ssl settings from initdb

2020-01-07 Thread tsunakawa.ta...@fujitsu.com
From: David Fetter 
> > But I see two problems with the proposed approach: (1) initdb
> > doesn't support setting up SSL, so the only thing you can achieve
> > here is to reject all TCP/IP connections, until you have set up SSL.
> 
> I don't believe any special setup is needed to require TLS for the
> connection, which is what this patch handles in a straightforward way.

I think this feature can be useful because it's common to reject remote non-TLS 
connections.  Eliminating the need to script edits to pg_hba.conf is welcome.  
Setting GUC parameters just after initdb is relatively easy, because we can 
simply add lines at the end of postgresql.conf.  But pg_hba.conf is not, because 
the first matching entry takes effect.

In terms of rejecting non-secure remote connections, should 
hostgssenc/hostnogssenc also be handled similarly?


> > (2) The default pg_hba.conf only covers localhost connections.
> 
> As of this patch, it can be asked to cover all connections.

+  --auth-hostssl=authmethod
+  
+   
+This option specifies the authentication method for users via
+TLS connections used in pg_hba.conf
+(hostssl lines).
+   
+  

The relationship between --auth/--auth-local/--auth-host and 
--auth-hostssl/--auth-hostnossl is confusing.  The former is for local 
connections, and the latter is for remote ones.  Can we just add "remote" in 
the above documentation?

Plus, you're adding the first option to initdb that handles remote connections. 
 As the following execution shows, it doesn't warn about using "trust" for 
remote connections.


$ initdb --auth=md5 --pwprompt --auth-hostssl=trust --auth-hostnossl=trust
...
syncing data to disk ... ok

Success. You can now start the database server using:

pg_ctl -D /tuna/pg2 -l logfile start



I think we should emit a warning message like the following existing one:

--
initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
-
initdb: warning: enabling "trust" authentication 


Regards
Takayuki Tsunakawa





Re: pgsql: Add basic TAP tests for psql's tab-completion logic.

2020-01-07 Thread Amit Kapila
On Mon, Jan 6, 2020 at 4:26 PM Christoph Berg  wrote:
>
> Re: Tom Lane 2020-01-05 <25771.1578249...@sss.pgh.pa.us>
>
> Ubuntu bionic fails elsewhere:
>
> 07:19:51 t/001_stream_rep.pl .. ok
> 07:19:53 t/002_archiving.pl ... ok
> 07:19:59 t/003_recovery_targets.pl  ok
> 07:20:01 t/004_timeline_switch.pl . ok
> 07:20:08 t/005_replay_delay.pl  ok
> 07:20:10 Bailout called.  Further testing stopped:  system pg_ctl failed
> 07:20:10 FAILED--Further testing stopped: system pg_ctl failed
>
> 07:20:10 2020-01-06 06:19:41.285 UTC [26415] LOG:  received fast shutdown 
> request
> 07:20:10 2020-01-06 06:19:41.285 UTC [26415] LOG:  aborting any active 
> transactions
> 07:20:10 2020-01-06 06:19:41.287 UTC [26415] LOG:  background worker "logical 
> replication launcher" (PID 26424) exited with exit code 1
> 07:20:10 2020-01-06 06:19:41.287 UTC [26419] LOG:  shutting down
> 07:20:10 2020-01-06 06:19:41.287 UTC [26419] LOG:  checkpoint starting: 
> shutdown immediate
>

It looks like this failure is similar to what we are getting on
"sidewinder", where it failed because of "insufficient file descriptors
available to start server process".  Can you check in the log
(probably 006_logical_decoding_master.log) whether this is the same
thing you are getting, or something else?


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: sidewinder has one failure

2020-01-07 Thread Amit Kapila
On Sun, Jan 5, 2020 at 8:00 AM Noah Misch  wrote:
>
> On Sat, Jan 04, 2020 at 06:56:48AM +0530, Amit Kapila wrote:
> > In the latter case, we either want to
> > (a) tweak the test to raise the value of max_files_per_process, (b)
> > remove the test entirely.
>
> I generally favor keeping the test, but feel free to decide it's too hard.
>

I am thinking that, for now, we should raise max_files_per_process in
the test to something like 35 or 40, so that sidewinder passes and we
unblock other people who might be blocked by this; for example, I think
one such case is reported here
(https://www.postgresql.org/message-id/20200106105608.GB18560%40msg.df7cb.de,
see "Ubuntu bionic ...").  I feel that with this we shall still be able
to catch the problem we are facing on 'tern' and 'mandrill'.

Do you have any opinion on this?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [PATCH] lazy relations delete

2020-01-07 Thread Kyotaro Horiguchi
Hello.

At Tue, 31 Dec 2019 13:16:49 +0300, Maxim Orlov  wrote 
in 
> Now we decided to drop many tables, let's say 1000 or 1 not in a
> single transaction, but each table in a separate one. So, due to
> "plain" shared_buffers memory we have to do for loop for every
> relation which leads to lag between master and slave.
> 
> In real case scenario such issue lead to not a minutes lag, but hours
> lag. At the same time PostgreSQL have a great routine to delete many
> relations in a single transaction.
> 
> So, to get rid of this kind of issue here came up an idea: what if not
> to delete everyone of relations right away and just store them in an
> array, prevent shared buffers (correspond to a deleted relations) from
> been flushed. And then array reaches it max size we need to walk all
> buffers only once to "free" shared buffers correspond to a deleted
> relations.

That is a great performance gain, but the proposal seems to lead to
database corruption. We must avoid such cases.

A relfilenode can be reused right after commit. There can be a case
where readers of the reused relfilenode see pages from the already
removed files left in shared buffers. On the other hand, newly
allocated buffers for the reused relfilenode are not flushed out until
the lazy invalidation machinery actually frees the "garbage" buffers,
which leads to a broken database after a crash.  And finally the
machinery discards the buffers, including the correct ones, at
execution time.

As for performance, a hash lookup for every BufferFlush call could be
a cost for unrelated transactions. And it leaves garbage buffers around
as dead until more than LAZY_DELETE_ARRAY_SIZE relfilenodes have been
removed.


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [PATCH] lazy relations delete

2020-01-07 Thread Thomas Munro
On Wed, Jan 8, 2020 at 5:20 PM Kyotaro Horiguchi
 wrote:
> A relfilenode can be reused right after commit. There can be a case
> where readers of the reused relfilenode see pages from the already
> removed files left in shared buffers. On the other hand, newly
> allocated buffers for the reused relfilenode are not flushed out until
> the lazy invalidation machinery actually frees the "garbage" buffers,
> which leads to a broken database after a crash.  And finally the
> machinery discards the buffers, including the correct ones, at
> execution time.

The relfilenode can't be reused until the next checkpoint, can it?
The truncated file remains in the file system, specifically to prevent
anyone from reusing the relfilenode.  See the comment for mdunlink().
There may be other problems with the idea, but wouldn't the zombie
buffers be harmless, if they are invalidated before
SyncPostCheckpoint() unlinks the underlying files (and you never try
to flush them)?




Re: parallel vacuum options/syntax

2020-01-07 Thread Masahiko Sawada
On Mon, 6 Jan 2020 at 15:27, Masahiko Sawada
 wrote:
>
> On Sun, 5 Jan 2020 at 23:28, Amit Kapila  wrote:
> >
> > On Sun, Jan 5, 2020 at 7:38 PM Masahiko Sawada
> >  wrote:
> > >
> > > On Sun, 5 Jan 2020 at 22:39, Tomas Vondra  
> > > wrote:
> > > >
> > > >
> > > > So if we think we need an option to determine vacuum parallel degree, we
> > > > should have an option to disable parallelism too. I don't care much if
> > > > it's called DISABLE_PARALLEL, NOPARALLEL or PARALLEL 0, as long as we
> > > > make our mind and don't unnecessarily break it in the next release.
> > > >
> >
> > Fair point.  I favor parallel 0 as that avoids adding more options and
> > also it is not very clear whether that is required at all.  Till now,
> > if I see most people who have shared their opinion seems to favor this
> > as compared to another idea where we need to introduce more options.
> >
> > >
> > > Okay I got your point. It's just an idea but how about controlling
> > > parallel vacuum using two options. That is, we have PARALLEL option
> > > that takes a boolean value (true by default) and enables/disables
> > > parallel vacuum. And we have WORKERS option that takes an integer more
> > > than 1 to specify the number of workers. Of course we should raise an
> > > error if only WORKERS option is specified. WORKERS option is optional.
> > > If WORKERS option is omitted the number of workers is determined based
> > > on the number of indexes on the table.
> > >
> >
> > I think this would add failure modes without serving any additional
> > purpose.  Sure, it might give the better feeling that we have separate
> > options to enable/disable parallelism and then specify the number of
> > workers with a separate option, but we already have various examples
> > as shared by me previously where setting the value as zero means the
> > option is disabled, so why to invent something new here?
>
> I just felt it's not intuitive that specifying parallel degree to 0
> means to disable parallel vacuum. But since majority of hackers seem
> to agree with this syntax I'm not going to insist on that any further.
>

Okay I'm going to go with enabling parallel vacuum by default and
disabling it by specifying PARALLEL 0.

Regarding combinations of VACUUM command options: although parallel
vacuum is enabled by default and VACUUM FULL doesn't support it yet,
'VACUUM (FULL)' would work. On the other hand, 'VACUUM (FULL, PARALLEL)'
and 'VACUUM (FULL, PARALLEL 1)' would fail with an error. And I think it
would be better if 'VACUUM (FULL, PARALLEL 0)' also worked, but I'd like
to hear opinions.
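
Put another way, the user-visible behaviour I have in mind is roughly
(table name is just an example):

VACUUM (PARALLEL 4) pgbench_accounts;        -- up to 4 workers for index vacuuming
VACUUM (PARALLEL 0) pgbench_accounts;        -- parallel vacuum explicitly disabled
VACUUM (FULL) pgbench_accounts;              -- fine; FULL just doesn't use parallelism
VACUUM (FULL, PARALLEL 1) pgbench_accounts;  -- rejected with an error
VACUUM (FULL, PARALLEL 0) pgbench_accounts;  -- arguably should also be allowed (see above)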

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: parallel vacuum options/syntax

2020-01-07 Thread Amit Kapila
On Wed, Jan 8, 2020 at 11:31 AM Masahiko Sawada
 wrote:
>
> On Mon, 6 Jan 2020 at 15:27, Masahiko Sawada
>  wrote:
> >
> > I just felt it's not intuitive that specifying parallel degree to 0
> > means to disable parallel vacuum. But since majority of hackers seem
> > to agree with this syntax I'm not going to insist on that any further.
> >
>
> Okay I'm going to go with enabling parallel vacuum by default and
> disabling it by specifying PARALLEL 0.
>

Sounds fine to me.  However, I have already started updating the patch
for that.  I shall post the new version today or tomorrow.  Is that
fine with you?

> For combination of VACUUM command options, although parallel vacuum is
> enabled by default and VACUUM FULL doesn't support it yet, 'VACUUM
> (FULL)' would work. On the other hand 'VACUUM (FULL, PARALLEL)' and
> 'VACUUM(FULL, PARALLEL 1)' would not work with error. And I think it
> is better if 'VACUUM (FULL, PARALLEL 0)' also work but I'd like to
> hear opinions.
>

I agree with all these points.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: jsonb_set() strictness considered harmful to data

2020-01-07 Thread Andrew Dunstan
On Wed, Jan 8, 2020 at 7:08 AM Pavel Stehule  wrote:
>
> Hi
>
> po 6. 1. 2020 v 22:34 odesílatel Andrew Dunstan 
>  napsal:
>>
>>
>> Updated version including docco and better error message.
>>
>> cheers
>>
>> andrew
>
>
> I think my objections are solved. I have one small objection:
>
> + errdetail("exception raised due to \"null_value_treatment := 
> 'raise_exception'\""),
> + errhint("to avoid, either change the null_value_treatment argument or 
> ensure that an SQL NULL is not used")));
>
> "null_value_treatment := 'raise_exception'\""
>
> it uses proprietary PostgreSQL syntax for named parameters. Better to use 
> ANSI/SQL syntax
>
> "null_value_treatment => 'raise_exception'\""
>
> It is fixed in attached patch
>
> source compilation without warnings,
> compilation docs without warnings
> check-world passed without any problems
>
> I'll mark this patch as ready for committer
>
> Thank you for your work
>


Thanks for the review. I propose to commit this shortly.
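
For the archives, a rough sketch of the new behaviour (assuming the
non-strict function is jsonb_set_lax as in the patch; 'raise_exception'
is the value from the error message above, 'delete_key' is another
treatment the patch provides, as I recall):

SELECT jsonb_set_lax('{"a": 1, "b": 2}', '{a}', NULL,
                     null_value_treatment => 'raise_exception');
-- raises an error instead of silently returning SQL NULL

SELECT jsonb_set_lax('{"a": 1, "b": 2}', '{a}', NULL,
                     null_value_treatment => 'delete_key');
-- removes the key instead: {"b": 2}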

cheers

andrew

-- 
Andrew Dunstan             https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [Proposal] Global temporary tables

2020-01-07 Thread 曾文旌(义从)


> 2020年1月6日 下午8:17,Dean Rasheed  写道:
> 
> On Mon, 6 Jan 2020 at 11:01, Tomas Vondra  
> wrote:
>> 
>> On Mon, Jan 06, 2020 at 01:04:15PM +0800, 曾文旌(义从) wrote:
>> 
>>> 2 We feel that gtt needs to maintain statistics, but there is no
>>> agreement on what it will be done.
>>> 
>> 
>> I certainly agree GTT needs to maintain statistics, otherwise it'll lead
>> to poor query plans.
> 
> +1
> 
>> AFAIK the current patch stores the info in a hash
>> table in a backend private memory, and I don't see how else to do that
>> (e.g. storing it in a catalog would cause catalog bloat).
>> 
> 
> It sounds like it needs a pair of system GTTs to hold the table and
> column statistics for other GTTs. One would probably have the same
> columns as pg_statistic, and the other just the relevant columns from
> pg_class. I can see it being useful for the user to be able to see
> these stats, so perhaps they could be UNIONed into the existing stats
> view.
The current patch provides several functions in an extension (pg_gtt) for 
reading GTT statistics. 
Next I can move them into the kernel so that the pg_stats view can show a 
GTT's statistics.


> Regards,
> Dean



Re: Memory-Bounded Hash Aggregation

2020-01-07 Thread Adam Lee
Hi, Jeff

I tried to use the logical tape APIs for hash agg spilling, based on
your 1220 version.

It turns out it doesn't make much of a performance difference with the
default 8K block size (might be my patch's problem), but a lot of disk
space (not I/O) is saved because I force the respilling to reuse the
same LogicalTapeSet.

Logtape APIs with default block size 8K:
```
postgres=# EXPLAIN ANALYZE SELECT avg(g) FROM generate_series(0,500) g 
GROUP BY g;
 QUERY PLAN

 HashAggregate  (cost=75000.02..75002.52 rows=200 width=36) (actual 
time=7701.706..24473.002 rows=501 loops=1)
   Group Key: g
   Memory Usage: 4096kB  Batches: 516  Disk: 116921kB
   ->  Function Scan on generate_series g  (cost=0.00..5.01 rows=501 
width=4) (actual time=1611.829..3253.150 rows=501 loops=1)
 Planning Time: 0.194 ms
 Execution Time: 25129.239 ms
(6 rows)
```

Bare BufFile APIs:
```
postgres=# EXPLAIN ANALYZE SELECT avg(g) FROM generate_series(0,500) g 
GROUP BY g;
 QUERY PLAN

 HashAggregate  (cost=75000.02..75002.52 rows=200 width=36) (actual 
time=7339.835..24472.466 rows=501 loops=1)
   Group Key: g
   Memory Usage: 4096kB  Batches: 516  Disk: 232773kB
   ->  Function Scan on generate_series g  (cost=0.00..5.01 rows=501 
width=4) (actual time=1580.057..3128.749 rows=501 loops=1)
 Planning Time: 0.769 ms
 Execution Time: 26696.502 ms
(6 rows)
```

Even so, I'm not sure which API is better, because we should avoid
respilling as much as we can in the planner, and hash join uses the
bare BufFile API.

Attached my hacky and probably not robust diff for your reference.

-- 
Adam Lee
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index f1989b10ea..8c743d7561 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -247,6 +247,7 @@
 #include "utils/datum.h"
 #include "utils/dynahash.h"
 #include "utils/expandeddatum.h"
+#include "utils/logtape.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
@@ -288,8 +289,9 @@ typedef struct HashAggSpill
int   n_partitions; /* number of output partitions */
int   partition_bits;   /* number of bits for partition mask
   
log2(n_partitions) parent partition bits */
-   BufFile **partitions;   /* output partition files */
+   int  *partitions;   /* output logtape numbers */
int64*ntuples;  /* number of tuples in each 
partition */
+   LogicalTapeSet *lts;
 } HashAggSpill;
 
 /*
@@ -298,11 +300,12 @@ typedef struct HashAggSpill
  */
 typedef struct HashAggBatch
 {
-   BufFile *input_file;/* input partition */
+   int  input_tape;/* input partition */
int  input_bits;/* number of bits for input partition 
mask */
int64input_tuples;  /* number of tuples in this batch */
int  setno; /* grouping set */
HashAggSpill spill; /* spill output */
+   LogicalTapeSet *lts;
 } HashAggBatch;
 
 static void select_current_set(AggState *aggstate, int setno, bool is_hash);
@@ -359,9 +362,8 @@ static void hash_spill_init(HashAggSpill *spill, int 
input_bits,
uint64 input_tuples, 
double hashentrysize);
 static Size hash_spill_tuple(HashAggSpill *spill, int input_bits,
 TupleTableSlot *slot, 
uint32 hash);
-static MinimalTuple hash_read_spilled(BufFile *file, uint32 *hashp);
-static HashAggBatch *hash_batch_new(BufFile *input_file, int setno,
-   int64 
input_tuples, int input_bits);
+static MinimalTuple hash_read_spilled(LogicalTapeSet *lts, int tapenum, uint32 
*hashp);
+static HashAggBatch *hash_batch_new(LogicalTapeSet *lts, int tapenum, int 
setno, int64 input_tuples, int input_bits);
 static void hash_finish_initial_spills(AggState *aggstate);
 static void hash_spill_finish(AggState *aggstate, HashAggSpill *spill,
  int setno, int 
input_bits);
@@ -2462,7 +2464,7 @@ agg_refill_hash_table(AggState *aggstate)
 
CHECK_FOR_INTERRUPTS();
 
-   tuple = hash_read_spilled(batch->input_file, &hash);
+   tuple = hash_read_spilled(batch->lts, batch->input_tape, &hash);
if (tuple

Re: Parallel grouping sets

2020-01-07 Thread Richard Guo
On Sun, Dec 1, 2019 at 10:03 AM Michael Paquier  wrote:

> On Thu, Nov 28, 2019 at 07:07:22PM +0800, Pengzhou Tang wrote:
> > Richard pointed out that he get incorrect results with the patch I
> > attached, there are bugs somewhere,
> > I fixed them now and attached the newest version, please refer to [1] for
> > the fix.
>
> Mr Robot is reporting that the latest patch fails to build at least on
> Windows.  Could you please send a rebase?  I have moved for now the
> patch to next CF, waiting on author.


Thanks for reporting this issue. Here is the rebase.

Thanks
Richard


v3-0001-Support-for-parallel-grouping-sets.patch
Description: Binary data


Re: [HACKERS] pg_shmem_allocations view

2020-01-07 Thread Kyotaro Horiguchi
At Mon, 30 Dec 2019 13:52:50 -0500, Robert Haas  wrote 
in 
> On Wed, Dec 18, 2019 at 9:53 PM Kyotaro Horiguchi
>  wrote:
> > The doc is saying that "size" is "Size of the allocation" and
> > "allocated_size" is "size including padding". It seems somewhat
> > confusing to me. I'm not sure what wording is best but I think people
> > use net/gross wordings to describe like that.
> 
> I don't think I'd find that particularly clear. It seems to me that if
> the second size includes padding, then the first one differs in not
> including padding, so I'm not really sure where the confusion is. I
> thought about writing, for the first one, that is the requested size
> of the allocation, but that seemed like it might confuse someone by
> suggesting that the actual size of the allocation might be less than
> what was requested. I also thought about describing the first one as
> the size excluding padding, but that seems redundant. Maybe it would
> be good to change the second one to say "Size of the allocation
> including padding added by the allocator itself."

Ugh.. Thanks for the explanation and I'm convinced that the current
wording is the best.
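
For reference, eyeballing the two columns with the patch applied would
be something like (column names as in the patch):

SELECT name, off, size, allocated_size
FROM pg_shmem_allocations
ORDER BY allocated_size DESC
LIMIT 5;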

> > > All of this makes me think that we might want to do some followup to
> > > (1) convert anonymous allocations to non-anonymous allocations, for
> > > inspectability, and (2) do some renaming to get better stylistic
> > > agreement between the names of various shared memory chunks. But I
> > > think that work is properly done after this patch is committed, not
> > > before.
> >
> > I agree to (2), but regarding (1), most or perhaps all of the
> > anonymous allocations seems belonging to one of the shared hashes but
> > are recognized as holes shown by the above statement. So the current
> > output of the view is wrong in that sense. I think (1) should be
> > resolved before adding the view.
> 
> I guess I don't understand how this makes the output wrong. Either the
> allocations have a name, or they are anonymous. This dumps everything
> that has a name. What would you suggest? It seems to me that it's more
> appropriate for this patch to just tell us about what's in shared
> memory, not change what's in shared memory. If we want to do the
> latter, that's a job for another patch.

Sorry for the strange sentence. "I agree to (2)" meant "I agree that
(2) should be done in a later patch".

About (1), I was perplexed by the word "hole", which seemed to me to
mean a region that is not allocated to anything.  But nothing by that
name actually appears in the view, so the view is not wrong in the
first place.

I agree to the patch as-is. Thanks for the explanation.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: parallel vacuum options/syntax

2020-01-07 Thread Masahiko Sawada
On Wed, 8 Jan 2020 at 15:31, Amit Kapila  wrote:
>
> On Wed, Jan 8, 2020 at 11:31 AM Masahiko Sawada
>  wrote:
> >
> > On Mon, 6 Jan 2020 at 15:27, Masahiko Sawada
> >  wrote:
> > >
> > > I just felt it's not intuitive that specifying parallel degree to 0
> > > means to disable parallel vacuum. But since majority of hackers seem
> > > to agree with this syntax I'm not going to insist on that any further.
> > >
> >
> > Okay I'm going to go with enabling parallel vacuum by default and
> > disabling it by specifying PARALLEL 0.
> >
>
> Sounds fine to me.  However, I have already started updating the patch
> for that.  I shall post the new version today or tomorrow.  Is that
> fine with you?

Yes, that's fine. Thanks.

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2020-01-07 Thread Dilip Kumar
On Mon, Jan 6, 2020 at 4:44 PM Dilip Kumar  wrote:
>
> On Mon, Jan 6, 2020 at 4:36 PM Amit Kapila  wrote:
> >
> > On Mon, Jan 6, 2020 at 3:56 PM Dilip Kumar  wrote:
> > >
> > > On Mon, Jan 6, 2020 at 2:11 PM Amit Kapila  
> > > wrote:
> > > >
> > > > 3.
> > > > +static void
> > > > +ReorderBufferStreamTXN(ReorderBuffer *rb, ReorderBufferTXN *txn)
> > > > {
> > > > ..
> > > > + /*
> > > > + * If this is a subxact, we need to stream the top-level transaction
> > > > + * instead.
> > > > + */
> > > > + if (txn->toptxn)
> > > > + {
> > > > +
> > > > ReorderBufferStreamTXN(rb, txn->toptxn);
> > > > + return;
> > > > + }
> > > >
> > > > Is it ever possible that we reach here for subtransaction, if not,
> > > > then it should be Assert rather than if condition?
> > >
> > > ReorderBufferCheckMemoryLimit, can call it either for the
> > > subtransaction or for the main transaction, depends upon in which
> > > ReorderBufferTXN you are adding the current change.
> > >
> >
> > That function has code like below:
> >
> > ReorderBufferCheckMemoryLimit()
> > {
> > ..
> > if (ReorderBufferCanStream(rb))
> > {
> > /*
> > * Pick the largest toplevel transaction and evict it from memory by
> > * streaming the already decoded part.
> > */
> > txn = ReorderBufferLargestTopTXN(rb);
> > /* we know there has to be one, because the size is not zero */
> > Assert(txn && !txn->toptxn);
> > ..
> > ReorderBufferStreamTXN(rb, txn);
> > ..
> > }
> >
> > How can it ReorderBufferTXN pass for subtransaction?
> >
> Hmm, I missed it. You are right, will fix it.
>
I have observed one more design issue.  The problem is that when we
get toasted chunks we remember the changes in memory (in a hash table)
but don't stream them until we get the actual change on the main table.
Now, the problem is that we might get the changes of the toast table
and the main table in different streams.  So basically, in a stream,
if we have only got the toasted tuples then even after
ReorderBufferStreamTXN the memory usage will not be reduced.
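
As an illustration (table and column names are made up), a single
statement like this generates a long run of TOAST-table changes that
are decoded before the corresponding main-table change arrives, so a
stream taken in between can consist entirely of toast chunks:

BEGIN;
INSERT INTO big_docs (id, payload)
VALUES (1, repeat('x', 10 * 1024 * 1024));   -- TOAST chunk inserts are decoded
                                             -- first, the big_docs tuple last
COMMIT;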


--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com