Re: [PATCH] Remove Extra palloc Of raw_buf For Binary Format In COPY FROM

2020-07-17 Thread Tom Lane
Bharath Rupireddy  writes:
> On Sat, Jul 11, 2020 at 11:53 PM Tom Lane  wrote:
>> Pushed with some fiddling.  Mainly, if we're going to the trouble of
>> checking for binary mode here, we might as well skip allocating the
>> line_buf too.

> Isn't it good if we backpatch this to versions 13, 12, 11 and so on?

Given the lack of complaints, I wasn't excited about it.  Transient
consumption of 64K is not a huge deal these days.  (And yes, I've
worked on machines where that was the entire address space.  But that
was a very long time ago.)

regards, tom lane




Re: [PATCH] Remove Extra palloc Of raw_buf For Binary Format In COPY FROM

2020-07-17 Thread Bharath Rupireddy
On Sat, Jul 11, 2020 at 11:53 PM Tom Lane  wrote:
>
> vignesh C  writes:
> > On Tue, Jun 30, 2020 at 2:41 PM Bharath Rupireddy
> >  wrote:
> >> I've added this patch to commitfest - 
> >> https://commitfest.postgresql.org/28/.
>
> > I felt this patch is ready for committer, changing the status to ready
> > for committer.
>
> Pushed with some fiddling.  Mainly, if we're going to the trouble of
> checking for binary mode here, we might as well skip allocating the
> line_buf too.
>

Hi Tom,

Isn't it good if we backpatch this to versions 13, 12, 11 and so on?
We can save a good amount of memory with this patch for non-binary
COPY.

Attaching the patch which applies on versions 13, 12, 11.
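
For context, here is a rough sketch (not the committed diff, but using the
field names from copy.c) of the kind of conditional allocation being
discussed:

    /*
     * Illustrative sketch only: binary COPY FROM never reads through
     * raw_buf or line_buf, so only allocate them for text/CSV input.
     */
    if (!cstate->binary)
    {
        cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1);
        cstate->raw_buf_index = cstate->raw_buf_len = 0;
        initStringInfo(&cstate->line_buf);
    }
    else
        cstate->raw_buf = NULL;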

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com


v1-0001-Avoid-useless-buffer-allocations-during-binary-CO.patch
Description: Binary data


Re: Which SET TYPE don't actually require a rewrite

2020-07-17 Thread Noah Misch
On Fri, Jul 17, 2020 at 04:08:36PM +0200, Magnus Hagander wrote:
> On Fri, Jul 17, 2020 at 5:40 AM Noah Misch  wrote:
> > On Wed, Jul 15, 2020 at 02:54:37PM +0200, Magnus Hagander wrote:
> > > Our Fine Manual (TM) specifies:
> > > "As an exception, when changing the type of an existing column, if the
> > > USING clause does not change the column contents and the old type is
> > > either
> > > binary coercible to the new type or an unconstrained domain over the new
> > > type, a table rewrite is not needed; but any indexes on the affected
> > > columns must still be rebuilt."
> > >
> > > First of all, how is a non-internals-expert even supposed to know what a
> > > binary coercible type is?
> >
The manual does define binary coercible (in the CREATE CAST documentation).
> 
> The only way to actually realize that this is a defined term is to look at
> the source code though, right?

I see italic typeface for it.  This one deserves an index entry, too.
(I bet many other such uses deserve one.)

> It's definitely not clear that one should go
> look at the CREATE CAST documentation to find the definition -- certainly
> not from the ALTER TABLE documentation, which I would argue is the place
> where most people would go.

Agreed.

> > > We can also for example increase the precision of numeric without a
> > > rewrite
> > > (but not scale). Or we can change between text and varchar. And we can
> > > increase the length of a varchar but not decrease it.
> > >
> > > Surely we can do better than this when it comes to documenting it? Even
> > > if
> > > it's a pluggable thing so it may or may not be true of external
> > > datatypes installed later, we should be able to at least be more clear
> > > about the builtin types, I think?
> >
> > I recall reasoning that ATColumnChangeRequiresRewrite() is a DDL analog of
> > query optimizer logic.  The manual brings up only a minority of planner
> > optimizations, and comprehensive lists of optimization preconditions are
> > even
> > rarer.  But I don't mind if $SUBJECT documentation departs from that norm.
> 
> I can see the argument being made for that, and certainly having been made
> for it in the future. But I'd say given the very bad consequences of
> getting it wrong, it's far from minor. And given the number of times I've
> had to answer the question "can I make this change safely" (which usually
> amounts to me trying it out to see what happens, if I hadn't done that
> exact one many times before) indicates the need for a more detailed
> documentation on it.

Such a doc addition is fine with me.  I agree with Tom that it will be prone
to staleness, but I don't conclude that the potential for staleness reduces
its net value below zero.  Having said that, if the consequences of doc
staleness are "very bad", you may consider documenting the debug1 user
interface (https://postgr.es/m/20121202020736.gd13...@tornado.leadboat.com)
instead of documenting the exact rules.  Either way is fine with me.




Re: Added tab completion for the missing options in copy statement

2020-07-17 Thread Michael Paquier
On Fri, Jul 17, 2020 at 05:28:51PM +0530, vignesh C wrote:
> On Fri, Jul 17, 2020 at 11:15 AM Michael Paquier  wrote:
>> Not completely actually.  The page of psql for \copy does not mention
>> the optional where clause, and I think that it would be better to add
>> that for consistency (perhaps that's the point raised by Ahsan?).  I
>> don't see much point in splitting the description of the meta-command
>> into two lines as we already mix stdin and stdout for example which
>> only apply to respectively "FROM" and "TO", so let's just append the
>> conditional where clause at its end.  Attached is a patch doing so
>> that I intend to back-patch down to v12.
> 
> I would like to split it into 2 lines, similar to the documentation of
> sql-copy, which gives better readability; attaching a new patch along
> similar lines.

Fine by me.  I have applied and back-patched this part down to 12.

>> Coming back to your proposal, another thing is that with your patch
>> you recommend a syntax still present for compatibility reasons, but I
>> don't think that we should recommend it to the users anymore, giving
>> priority to the new grammar of the post-9.0 era.  I would actually go
>> as far as removing BINARY from the completion when specified just
>> after COPY to simplify the code, and specify the list of available
>> options after typing "COPY ... WITH (FORMAT ", with "text", "csv" and
>> "binary".  Adding completion for WHERE after COPY FROM is of course a
>> good idea.
> 
> I agree with your comments, and have made a new patch accordingly.
> Thoughts?

Nope, that's not what I meant.  My point was to drop completely from
the completion the past grammar we are keeping around for
compatibility reasons, and just complete with the new grammar
documented at the top of the COPY page.  This leads me to the
attached, which actually simplifies the code, adds more completion
patterns with the mixes of WHERE/WITH depending on if FROM or TO is
used, and at the end is less bug-prone if the grammar gets more
extended.  I have also added some completion for "WITH (FORMAT" for
text, csv and binary.
--
Michael
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index eb018854a5..d5a47be422 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2316,19 +2316,14 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("COPY|\\copy"))
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables,
    " UNION ALL SELECT '('");
-	/* If we have COPY BINARY, complete with list of tables */
-	else if (Matches("COPY", "BINARY"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
-	/* If we have COPY (, complete it with legal commands */
+	/* Complete COPY ( with legal query commands */
 	else if (Matches("COPY|\\copy", "("))
 		COMPLETE_WITH("SELECT", "TABLE", "VALUES", "INSERT", "UPDATE", "DELETE", "WITH");
-	/* If we have COPY [BINARY] <sth>, complete it with "TO" or "FROM" */
-	else if (Matches("COPY|\\copy", MatchAny) ||
-			 Matches("COPY", "BINARY", MatchAny))
+	/* Complete COPY <sth> */
+	else if (Matches("COPY|\\copy", MatchAny))
 		COMPLETE_WITH("FROM", "TO");
-	/* If we have COPY [BINARY] <sth> FROM|TO, complete with filename */
-	else if (Matches("COPY", MatchAny, "FROM|TO") ||
-			 Matches("COPY", "BINARY", MatchAny, "FROM|TO"))
+	/* Complete COPY <sth> FROM|TO with filename */
+	else if (Matches("COPY", MatchAny, "FROM|TO"))
 	{
 		completion_charp = "";
 		completion_force_quote = true;	/* COPY requires quoted filename */
@@ -2340,17 +2335,28 @@ psql_completion(const char *text, int start, int end)
 		completion_force_quote = false;
 		matches = rl_completion_matches(text, complete_from_files);
 	}
-	/* Offer options after COPY [BINARY] <sth> FROM|TO filename */
-	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny) ||
-			 Matches("COPY", "BINARY", MatchAny, "FROM|TO", MatchAny))
-		COMPLETE_WITH("BINARY", "DELIMITER", "NULL", "CSV",
-	  "ENCODING");
 
-	/* Offer options after COPY [BINARY] <sth> FROM|TO filename CSV */
-	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "CSV") ||
-			 Matches("COPY", "BINARY", MatchAny, "FROM|TO", MatchAny, "CSV"))
-		COMPLETE_WITH("HEADER", "QUOTE", "ESCAPE", "FORCE QUOTE",
-	  "FORCE NOT NULL");
+	/* Complete COPY <sth> TO <sth> */
+	else if (Matches("COPY|\\copy", MatchAny, "TO", MatchAny))
+		COMPLETE_WITH("WITH (");
+
+	/* Complete COPY <sth> FROM <sth> */
+	else if (Matches("COPY|\\copy", MatchAny, "FROM", MatchAny))
+		COMPLETE_WITH("WITH (", "WHERE");
+
+	/* Complete COPY <sth> FROM|TO filename WITH ( */
+	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "("))
+		COMPLETE_WITH("FORMAT", "FREEZE", "DELIMITER", "NULL",
+	  "HEADER", "QUOTE", "ESCAPE", "FORCE_QUOTE",
+	  "FORCE_NOT_NULL", "FORCE_NULL", "ENCODING");
+
+	/* Complete COPY <sth> FROM|TO filename WITH (FORMAT */
+	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "FORMAT"))
+		COMPLETE_WITH("binary", "csv", "text");
+
+	/* Complete CO

Re: Encoding of src/timezone/tznames/Europe.txt

2020-07-17 Thread Michael Paquier
On Fri, Jul 17, 2020 at 07:24:28PM +0200, Christoph Berg wrote:
> Re: Tom Lane
>> Done that way.  I also checked for other discrepancies between
>> tznames/Default and the other files, and found a few more trivialities.
> 
> Thanks!

+1.
--
Michael


signature.asc
Description: PGP signature


Re: Default setting for enable_hashagg_disk

2020-07-17 Thread Peter Geoghegan
On Fri, Jul 17, 2020 at 5:13 PM Jeff Davis  wrote:
> The idea is growing on me a bit. It doesn't give exactly v12 behavior,
> but it does offer another lever that might tackle a lot of the
> practical cases.

Cool.

> If I were making the decision alone, I'd still choose
> the escape hatch based on simplicity, but I'm fine with this approach
> as well.

There is also the separate question of what to do about the
hashagg_avoid_disk_plan GUC (this is a separate open item that
requires a separate resolution). Tom leans slightly towards removing
it now. Is your position about the same as before?

> The patch itself looks reasonable to me. I don't see a lot of obvious
> dangers, but perhaps someone would like to take a closer look at the
> planner changes as you suggest.

It would be good to get further input on the patch from somebody else,
particularly the planner aspects.

My intention is to commit the patch myself. I was the primary advocate
for hash_mem_multiplier, so it seems as if I should own it. (You may
have noticed that I just pushed the preparatory
local-variable-renaming patch, to get that piece out of the way.)

-- 
Peter Geoghegan




Re: Using Valgrind to detect faulty buffer accesses (no pin or buffer content lock held)

2020-07-17 Thread Peter Geoghegan
On Thu, Jul 16, 2020 at 10:24 AM Anastasia Lubennikova
 wrote:
> It's impressive that this check helped to find several bugs.

While it's definitely true that it *could* have detected the bug fixed
by commit b0229f26, it's kind of debatable whether or not the bugs I
fixed in commit fa7ff642 and commit 7154aa16 (which actually were
found using this new instrumentation) were truly bugs.

The behavior in question was probably safe, since only the
special/opaque page area was accessed -- and with at least a buffer
pin held. But it's not worth having a debate about whether or not it
should be considered safe. There is no downside to having a simple,
strict rule that's easy to enforce. Also, I myself spotted some bugs
in the skip scan patch series at one point that would also be caught
by the new instrumentation.
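
For anyone curious what the new instrumentation boils down to, here is a
rough sketch (not the exact committed code) using the Valgrind client
request wrappers from utils/memdebug.h: when the last pin on a buffer is
dropped its page is marked inaccessible, and it is made accessible again
when a pin is taken.

    #include "postgres.h"
    #include "storage/bufmgr.h"
    #include "utils/memdebug.h"

    /* Illustrative sketch only: forbid access to an unpinned buffer's page,
     * so any later access without a pin is reported by Valgrind. */
    static void
    forbid_unpinned_access(Buffer buf)
    {
        VALGRIND_MAKE_MEM_NOACCESS(BufferGetPage(buf), BLCKSZ);
    }

    /* ... and allow access again once a pin is (re)acquired. */
    static void
    allow_pinned_access(Buffer buf)
    {
        VALGRIND_MAKE_MEM_DEFINED(BufferGetPage(buf), BLCKSZ);
    }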

> I only noticed small inconsistency in the new comment for
> _bt_conditionallockbuf().
>
> It says "Note: Caller is responsible for calling _bt_checkpage() on
> success.", while in _bt_getbuf() the call is not followed by
> _bt_checkpage().
> Moreover, _bt_page_recyclable() contradicts _bt_checkpage() checks.

Nice catch.

> Other than that, patches look good to me, so move them to "Ready For
> Committer".

Pushed the first patch just now, and intend to push the other one soon. Thanks!

> Are you planning to add same checks for other access methods?

Not at the moment, but it does seem like my approach could be
generalized to other index access methods.

--
Peter Geoghegan




Re: Transactions involving multiple postgres foreign servers, take 2

2020-07-17 Thread Masahiro Ikeda

On 2020-07-17 15:55, Masahiko Sawada wrote:
On Fri, 17 Jul 2020 at 11:06, Masahiro Ikeda  
wrote:


On 2020-07-16 13:16, Masahiko Sawada wrote:
> On Tue, 14 Jul 2020 at 17:24, Masahiro Ikeda 
> wrote:
>>
>> > I've attached the latest version patches. I've incorporated the review
>> > comments I got so far and improved locking strategy.
>>
>> I want to ask a question about streaming replication with 2PC.
>> Are you going to support 2PC with streaming replication?
>>
>> I tried streaming replication using v23 patches.
>> I confirm that 2PC works with streaming replication,
>> which there are primary/standby coordinator.
>>
>> But, in my understanding, the WAL of "PREPARE" and
>> "COMMIT/ABORT PREPARED" can't be replicated to the standby server in
>> sync.
>>
>> If this is right, the unresolved transaction can be occurred.
>>
>> For example,
>>
>> 1. PREPARE is done
>> 2. crash primary before the WAL related to PREPARE is
>> replicated to the standby server
>> 3. promote standby server // but can't execute "ABORT PREPARED"
>>
>> In above case, the remote server has the unresolved transaction.
>> Can we solve this problem to support in-sync replication?
>>
>> But, I think some users use async replication for performance.
>> Do we need to document the limitation or make another solution?
>>
>
> IIUC with synchronous replication, we can guarantee that WAL records
> are written on both primary and replicas when the client got an
> acknowledgment of commit. We don't replicate each WAL records
> generated during transaction one by one in sync. In the case you
> described, the client will get an error due to the server crash.
> Therefore I think the user cannot expect WAL records generated so far
> has been replicated. The same issue could happen also when the user
> executes PREPARE TRANSACTION and the server crashes.

Thanks! I hadn't noticed that the behavior when a user executes PREPARE
TRANSACTION is the same.

IIUC there is a difference between (1) PREPARE TRANSACTION and (2) 2PC.
The point is whether the client can know when the server crashed
and its global tx id.

If (1) PREPARE TRANSACTION fails, it's OK for the client to execute the
same command again, because if the remote server has already prepared
the transaction the command will be ignored.

But if (2) 2PC fails because of a coordinator crash, the client can't
know what operations should be done.

If the old coordinator already executed PREPARE, there are some
transactions which should be ABORT PREPARED.
But if the PREPARE WAL is not sent to the standby, the new coordinator
can't execute ABORT PREPARED.
And the client can't know which remote servers have PREPARED
transactions which should be aborted either.

Even if the client can know that, only the old coordinator knows its
global transaction id.
Only the database administrator can analyze the old coordinator's log
and then execute the appropriate commands manually, right?


I think that's right. In the case of the coordinator crash, the user
can look for orphaned foreign prepared transactions by checking the
'identifier' column of pg_foreign_xacts on the new standby server and
the prepared transactions on the remote servers.


I think there is a case where we can't check for orphaned foreign
prepared transactions in the pg_foreign_xacts view on the new standby
server. That confuses users and database administrators.

If the primary coordinator crashes after preparing a foreign transaction,
but before sending the XLOG_FDWXACT_INSERT records to the standby server,
the standby server can't restore their transaction status and the
pg_foreign_xacts view doesn't show the prepared foreign transactions.

Sending XLOG_FDWXACT_INSERT records asynchronously leads to this problem.


> To prevent this
> issue, I think we would need to send each WAL records in sync but I'm
> not sure it's reasonable behavior, and as long as we write WAL in the
> local and then send it to replicas we would need a smart mechanism to
> prevent this situation.

I agree. Sending each 2PC WAL record in sync would have a large
performance impact.
At least, we need to document the limitation and how to handle this
situation.


Ok. I'll add it.


Thanks a lot.


> Related to the pointing out by Ikeda-san, I realized that with the
> current patch the backend waits for synchronous replication and then
> waits for foreign transaction resolution. But it should be reversed.
> Otherwise, it could lead to data loss even when the client got an
> acknowledgment of commit. Also, when the user is using both atomic
> commit and synchronous replication and wants to cancel waiting, he/she
> will need to press ctl-c twice with the current patch, which also
> should be fixed.

I'm sorry, I couldn't understand that.

In my understanding, if the COMMIT WAL is replicated to the standby in
sync, the standby server can resolve the transaction after crash
recovery once it is promoted.

If reversed, there are some situations which can't guarantee atomic
commit.
In case that some foreign transaction resolutions ar

Re: Default setting for enable_hashagg_disk

2020-07-17 Thread Jeff Davis
On Tue, 2020-07-14 at 21:12 -0700, Peter Geoghegan wrote:
> Attached is a WIP patch implementing hash_mem_multiplier, with 1.0 as
> the GUC's default value (i.e. the patch introduces no behavioral
> changes by default). The first patch in the series renames some local
> variables whose name is made ambiguous by the second, main patch.

The idea is growing on me a bit. It doesn't give exactly v12 behavior,
but it does offer another lever that might tackle a lot of the
practical cases. If I were making the decision alone, I'd still choose
the escape hatch based on simplicity, but I'm fine with this approach
as well.
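
For context, a rough sketch of what the multiplier amounts to, using the
GUC names from the WIP patch (the committed code, if any, will differ):

    #include "postgres.h"
    #include "miscadmin.h"
    #include "utils/guc.h"

    /* Assumed to exist per the WIP patch; the proposed default of 1.0
     * introduces no behavior change. */
    extern double hash_mem_multiplier;

    /*
     * Illustrative sketch only: hash-based nodes would be allowed work_mem
     * scaled by hash_mem_multiplier, while sorts keep using plain work_mem.
     */
    static int
    get_hash_mem_sketch(void)
    {
        double hash_mem = (double) work_mem * hash_mem_multiplier;

        return (int) Min(hash_mem, (double) MAX_KILOBYTES);
    }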

The patch itself looks reasonable to me. I don't see a lot of obvious
dangers, but perhaps someone would like to take a closer look at the
planner changes as you suggest.

Regards,
Jeff Davis






pg_ctl behavior on Windows

2020-07-17 Thread Chapman Flack
Hi,

So, I am not a Windows native, and here I am mentoring a GSoC student
setting up CI on multiple environments, including Windows.

In my own development and testing, by habit I do everything from
unprivileged accounts, just spinning up an instance in a temp location,
running some tests, and shutting it down. So I rarely run into the
"I refuse to run from a privileged account" check in postgres.
So rarely that it tends to slip my mind.

The popular hosted CI services, by contrast, tend to run one's test
scripts from a privileged account, so the scripts can just blithely
install packages, frob configurations, and so on, and it all just works,
Except for just spinning up a one-off postgres instance; that runs afoul
of the privilege check.

One workaround, of course, is to just use the postgres instance officially
supplied by the CI service, already started and listening on the standard
port. Then, in fact, you /need/ to be running with privilege, so you can
install into the standard locations, frob configs, restart it, etc.

Another is for the testing script to use its admin powers to create a
new user without admin powers, and switch to that identity for the rest
of the show.

If I understand correctly what I'm seeing in the pg_ctl source, that would
be the sole other option on a non-Windows system; 'pg_ctl start' as root on
non-Windows will simply refuse, the same way direct invocation of postgres
would.

On the other hand, it seems that pg_ctl start on Windows has another
trick up its sleeve, and in about 180 lines of fussing with arcane Windows
APIs, it can arrange to run under the current identity but with its
administrator-ness removed and privileges capped. Which seems cool.

But there's a NOTE! in the comment for CreateRestrictedProcess: "Job object
will only work when running as a service, because it's automatically
destroyed when pg_ctl exits."

I haven't been able to find any documentation of what that really means
in practical terms, or quite figure it out from the code. Does that mean
'pg_ctl start' won't really work after all from a privileged account, or
will seem to work but something will go wrong after the server is ready
and pg_ctl exits? Does it mean the tersely-documented 'register' operation
must be used, and that's the only way to start from a privileged account?

I don't have an especially easy way to experiment on Windows; I can push
experiments to the CI service and wait a bit to see what they do, but
I figured I'd ask here first.

Regards,
-Chap




Re: expose parallel leader in CSV and log_line_prefix

2020-07-17 Thread Justin Pryzby
On Fri, Jul 17, 2020 at 05:27:21PM -0400, Alvaro Herrera wrote:
> On 2020-Jul-17, Justin Pryzby wrote:
> > Ok, but should we then consider changing pg_stat_activity for consistency ?
> > Probably in v13 to avoid changing it a year later.
> > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b025f32e0b5d7668daec9bfa957edf3599f4baa8
> > 
> > I think the story is that we're exposing to the user as "leader pid" what's
> > internally called (and used as) the "lock group leader", which for the
> > leader process is set to its own PID.  But I think what we're exposing as
> > leader_pid will seem like an implementation artifact to users.
> 
> IMO it *is* an implementation artifact if, as you say, the leader PID
> remains set after the parallel query is done.  I mentioned the pgbouncer
> case before: if you run a single parallel query, then the process
> remains a "parallel leader" for days or weeks afterwards even if it
> hasn't run a parallel query ever since.  That doesn't sound great to me.
> 
> I think it's understandable and OK if there's a small race condition
> that means you report a process as a leader shortly before or shortly
> after a parallel query is actually executed.  But doing so until backend
> termination seems confusing as well as useless.

I'm not sure that connection pooling is the strongest argument against the
current behavior, but we could change it as suggested to show as NULL the
leader_pid for the leader's own process.  I think that's the intuitive behavior
a user expects.  Parallel processes are those with leader_pid IS NOT NULL.  If
we ever used lockGroupLeader for something else, you'd also have to say AND
backend_type='parallel worker'.

We should talk about doing that for PSA and for v13 as well.  Here or on the
other thread or a new thread ?  It's a simple enough change, but the question
is if we want to provide a more "cooked" view which hides the internals, and if
so, is this really enough.

--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -737,3 +737,4 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
leader = proc->lockGroupLeader;
-   if (leader)
+   if (leader && leader->pid != beentry->st_procpid)
{
values[29] = Int32GetDatum(leader->pid);

-- 
Justin




Re: renaming configure.in to configure.ac

2020-07-17 Thread Thomas Munro
On Sat, Jul 18, 2020 at 2:12 AM Tom Lane  wrote:
> Another issue is that we're not going to open up the main repo for
> access by non-committers, so this approach doesn't help for most
> developers.  We've had some success, I think, with Munro's cfbot
> solution --- I'd rather see that approach expanded to provide more
> test environments.

Recently I've been using Cirrus CI for my own development branches
that involve portability stuff, because it supports Linux, FreeBSD,
macOS and Windows in one place.  That's nearly half the OSes we
support, and they hinted that they might be about to add more OSes
too.  What you get (if you're lucky) is a little green check mark
beside the commit hash on github, which you can click for more info,
like this:

https://github.com/macdice/postgres/tree/cirrus-ci

The .cirrus.yml file shown in that branch is just a starting point.
See list of problems at the top; help wanted.  I also put some
information about this on
https://wiki.postgresql.org/wiki/Continuous_Integration.  I think if
we could get to a really good dot file for (say) the three providers
shown there, we should just stick them in the tree so that anyone can
turn that on for their own public development branches with a click.
Then cfbot wouldn't have to add it, but it'd still have a good reason
to exist, to catch bitrot and as a second line of defence for people
who don't opt into the first one.




Re: expose parallel leader in CSV and log_line_prefix

2020-07-17 Thread Alvaro Herrera
On 2020-Jul-17, Justin Pryzby wrote:

> Ok, but should we then consider changing pg_stat_activity for consistency ?
> Probably in v13 to avoid changing it a year later.
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b025f32e0b5d7668daec9bfa957edf3599f4baa8
> 
> I think the story is that we're exposing to the user as "leader pid" what's
> internally called (and used as) the "lock group leader", which for the leader
> process is set to its own PID.  But I think what we're exposing as leader_pid
> will seem like an implementation artifact to users.

IMO it *is* an implementation artifact if, as you say, the leader PID
remains set after the parallel query is done.  I mentioned the pgbouncer
case before: if you run a single parallel query, then the process
remains a "parallel leader" for days or weeks afterwards even if it
hasn't run a parallel query ever since.  That doesn't sound great to me.

I think it's understandable and OK if there's a small race condition
that means you report a process as a leader shortly before or shortly
after a parallel query is actually executed.  But doing so until backend
termination seems confusing as well as useless.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: expose parallel leader in CSV and log_line_prefix

2020-07-17 Thread Justin Pryzby
On Fri, Jul 17, 2020 at 11:35:40AM -0400, Alvaro Herrera wrote:
> > On Fri, Jul 17, 2020 at 7:01 AM Michael Paquier  wrote:
> >
> > > Hmm.  Knowing if a leader is actually running parallel query or not
> > > requires a lookup at lockGroupMembers, that itself requires a LWLock.
> > > I think that it would be better to not require that.  So what if
> > > instead we logged %P only if Myproc has lockGroupLeader set and it
> > > does *not* match MyProcPid?
> 
> That's what I said first, so +1 for that approach.

Ok, but should we then consider changing pg_stat_activity for consistency ?
Probably in v13 to avoid changing it a year later.
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b025f32e0b5d7668daec9bfa957edf3599f4baa8

I think the story is that we're exposing to the user as "leader pid" what's
internally called (and used as) the "lock group leader", which for the leader
process is set to its own PID.  But I think what we're exposing as leader_pid
will seem like an implementation artifact to users.  It's unnatural to define a
leader PID for the leader itself, and I'm guessing that at least 30% of people
who use pg_stat_activity.leader_pid will be surprised by rows with
| backend_type='client backend' AND leader_pid IS NOT NULL
And maybe additionally confused if PSA doesn't match CSV or other log.

Right now, PSA will include processes that "were leader", found with queries like:
| SELECT pid FROM pg_stat_activity WHERE pid=leader_pid
If we change it, I think you can get the same thing for a *current* leader like:
| SELECT pid FROM pg_stat_activity a WHERE EXISTS (SELECT 1 FROM 
pg_stat_activity b WHERE b.leader_pid=a.pid);
But once the children die, you can't get that anymore.  Is that a problem ?

I didn't think of it until now, but it would be useful to query logs for
processes which were involved in parallel process.  (It would be more useful if
it indicated the query, and not just the process)

I agree that showing the PID as the leader PID while using a connection pooler
is "noisy".  But I think that's maybe just a consequence of connection pooling.
As an analogy, I would normally use a query like:
| SELECT session_line, message, query FROM postgres_log WHERE session_id='..' 
ORDER BY 1
But that already doesn't work usefully with connection pooling (and I'm not
sure how to resolve that other than by not using pooling when logs are useful)

I'm not sure what the answer is.  Probably we should either make both expose
lockGroupLeader exactly (and not filtered) or make both show lockGroupLeader
only if lockGroupLeader!=getpid().
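
To make the second option concrete, here is a rough sketch (invented here,
not taken from any posted patch) of the filtered check, which would apply
the same way in pg_stat_get_activity() and in the log_line_prefix code:

    /*
     * Illustrative sketch only: report a leader PID only for backends that
     * are parallel workers, i.e. whose lock group leader is set and is a
     * different process; the leader itself then reports nothing.
     */
    PGPROC *leader = proc->lockGroupLeader;

    if (leader != NULL && leader->pid != proc->pid)
        values[29] = Int32GetDatum(leader->pid);   /* leader_pid column */
    else
        nulls[29] = true;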

-- 
Justin




Re: Busted includes somewhere near worker_internal.h

2020-07-17 Thread Andres Freund
Hi,

On 2020-07-17 16:09:14 -0400, Tom Lane wrote:
> headerscheck and cpluspluscheck are both unhappy about this:
> 
> ./src/include/replication/worker_internal.h:49:2: error: unknown type name 
> 'slock_t'
>   slock_t  relmutex;
>   ^~~
> 
> Now, worker_internal.h itself hasn't changed in some time.
> I conclude that somebody rearranged one of the header files
> it depends on.  Anyone have an idea what the relevant change
> was?  Should we just include spin.h here, or is there a
> better fix?

I'm probably to blame for that - I've removed the s_lock.h (it wasn't
spin.h for some reason) include from lwlock.h:

commit f219167910ad33dfd8f1b0bba15323d71a91c4e9
Author: Andres Freund 
Date:   2020-06-18 19:40:09 -0700

Clean up includes of s_lock.h.

Users of spinlocks should use spin.h, not s_lock.h. And lwlock.h
hasn't utilized spinlocks for quite a while.

Discussion: 
https://postgr.es/m/20200618183041.upyrd25eosecy...@alap3.anarazel.de

I think including spin.h is the right fix, given that it needs to know
the size of slock_t.
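
For illustration, a minimal sketch (not the actual header) of why the
include matters: a struct with an slock_t member can only be laid out if
the compiler has seen the type's definition.

    #include "postgres.h"
    #include "storage/spin.h"   /* defines slock_t */

    /*
     * Illustrative sketch only: without the spin.h include above, the
     * compiler cannot know the size of slock_t and the struct definition
     * fails, which is exactly the headerscheck error reported.
     */
    typedef struct ExampleSharedEntry
    {
        slock_t  mutex;     /* protects the counter below */
        int      counter;
    } ExampleSharedEntry;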

Greetings,

Andres Freund




Busted includes somewhere near worker_internal.h

2020-07-17 Thread Tom Lane
headerscheck and cpluspluscheck are both unhappy about this:

./src/include/replication/worker_internal.h:49:2: error: unknown type name 
'slock_t'
  slock_t  relmutex;
  ^~~

Now, worker_internal.h itself hasn't changed in some time.
I conclude that somebody rearranged one of the header files
it depends on.  Anyone have an idea what the relevant change
was?  Should we just include spin.h here, or is there a
better fix?

regards, tom lane




Re: Binary support for pgoutput plugin

2020-07-17 Thread Tom Lane
Working through this ... what is the rationale for having changed
the API of logicalrep_read_update?  It seems kind of random,
especially since no comparable change was made to
logicalrep_read_insert.  If there's actually a good reason,
it seems like it'd apply to both.  If there's not, I'd be
inclined to not change the API, because this sort of thing
is a recipe for bugs when making cross-version patches.

regards, tom lane




Re: renaming configure.in to configure.ac

2020-07-17 Thread Andres Freund
Hi,

On 2020-07-17 10:46:30 +0200, Peter Eisentraut wrote:
> Okay, let's take a look.  Attached is a patch series.

Cool.


> One thing that's annoying is that the release notes claim that configure
> should now be faster, and some of the changes they have made should support
> that, but my (limited) testing doesn't bear that out.  Most notably, the
> newly arisen test
> 
> checking for g++ option to enable C++11 features... none needed
> 
> takes approximately 10 seconds(!) on my machine (for one loop, since "none
> needed"; good luck if you need more than none).

Something's got to be wrong here, no? I see that there's a surprisingly
large c++ program embedded for this test, but still, 10s?

It's not even clear why we're seeing this test at all? Is this now
always part of AC_PROG_CXX?

Greetings,

Andres Freund




Re: NaN divided by zero should yield NaN

2020-07-17 Thread Dean Rasheed
On Thu, 16 Jul 2020 at 20:29, Tom Lane  wrote:
>
> Dean Rasheed questioned this longstanding behavior:
>
> regression=# SELECT 'nan'::float8 / '0'::float8;
> ERROR:  division by zero
>
> After a bit of research I think he's right: per IEEE 754 this should
> yield NaN, not an error.  Accordingly I propose the attached patch.
> This is probably not something to back-patch, though.
>

Agreed.

> One thing that's not very clear to me is which of these spellings
> is preferable:
>
> if (unlikely(val2 == 0.0) && !isnan(val1))
> if (unlikely(val2 == 0.0 && !isnan(val1)))
>

My guess is that the first would be better, since it would tell the
compiler that it's unlikely to need to do the NaN test, so it would be
kind of like doing

if (unlikely(val2 == 0.0))
    if (!isnan(val1))
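
As a self-contained illustration of the semantics being discussed (this is
not Tom's actual patch), a sketch where NaN divided by zero yields NaN
instead of raising an error:

    #include <math.h>
    #include <stdio.h>
    #include <stdlib.h>

    /*
     * Illustrative sketch only: error on division by zero unless the
     * dividend is NaN, in which case IEEE 754 says the result is NaN.
     */
    static double
    float8_div_sketch(double val1, double val2)
    {
        if (val2 == 0.0 && !isnan(val1))
        {
            fprintf(stderr, "division by zero\n");  /* ereport(ERROR, ...) in the backend */
            exit(EXIT_FAILURE);
        }
        return val1 / val2;     /* NaN / 0.0 evaluates to NaN */
    }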

Regards,
Dean




Re: Error during make, second install

2020-07-17 Thread Tom Lane
"David G. Johnston"  writes:
> Sorry for the noise - though maybe some insight is still warranted - but
> running make clean first seems to have cleared up my problem.

Yeah.  Just doing "git pull" and "make" will often fail, because by
default there's nothing guaranteeing that all dependent files are remade.
There are two safe workflows that I know of:

1. Run "make distclean" when pulling an update.  It works a bit cleaner
if you do this before not after "git pull".  If there was no update
of the configure script, you can get away with just "make clean", but
you generally don't know that before pulling ...

2. Always configure with --enable-depend.

I prefer #1, as I find it more reliable.  If you use ccache the
build-speed advantage of #2 is pretty minimal.

In either case, when in doubt, try "git clean -dfx" and rebuild
from scratch.

regards, tom lane




Re: Encoding of src/timezone/tznames/Europe.txt

2020-07-17 Thread Christoph Berg
Re: Tom Lane
> Done that way.  I also checked for other discrepancies between
> tznames/Default and the other files, and found a few more trivialities.

Thanks!

Christoph




Re: Error during make, second install

2020-07-17 Thread Alvaro Herrera
On 2020-Jul-17, David G. Johnston wrote:

> On Fri, Jul 17, 2020 at 8:58 AM David G. Johnston <
> david.g.johns...@gmail.com> wrote:

> Sorry for the noise - though maybe some insight is still warranted - but
> running make clean first seems to have cleared up my problem.

Do you run "configure --enable-depend"?  If not, then make clean is
mandatory before pulling changes.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Warn when parallel restoring a custom dump without data offsets

2020-07-17 Thread Tom Lane
I wrote:
> Given the current state of affairs, I'm inclined to commit the
> attached with no new test coverage, and then come back and look
> at better testing after the other bugs are dealt with.

Pushed back to v12.

regards, tom lane




Re: Transactions involving multiple postgres foreign servers, take 2

2020-07-17 Thread Fujii Masao




On 2020/07/16 14:47, Masahiko Sawada wrote:

On Tue, 14 Jul 2020 at 11:19, Fujii Masao  wrote:




On 2020/07/14 9:08, Masahiro Ikeda wrote:

I've attached the latest version patches. I've incorporated the review
comments I got so far and improved locking strategy.


Thanks for updating the patch!


+1
I'm interested in these patches and now studying them. While checking
the behaviors of the patched PostgreSQL, I got three comments.


Thank you for testing this patch!



1. We can access the foreign table even during recovery in HEAD.
But in the patched version, when I did that, I got the following error.
Is this intentional?

ERROR:  cannot assign TransactionIds during recovery


No, it should be fixed. I'm going to fix this by not collecting
participants for atomic commit during recovery.


Thanks for trying to fix the issues!

I'd like to report one more issue. When I started a new transaction
on the local server, executed an INSERT on the remote server via
postgres_fdw, and then quit psql, I got the following assertion failure.

TRAP: FailedAssertion("fdwxact", File: "fdwxact.c", Line: 1570)
0   postgres            0x00010d52f3c0 ExceptionalCondition + 160
1   postgres            0x00010cefbc49 ForgetAllFdwXactParticipants + 313
2   postgres            0x00010cefff14 AtProcExit_FdwXact + 20
3   postgres            0x00010d313fe3 shmem_exit + 179
4   postgres            0x00010d313e7a proc_exit_prepare + 122
5   postgres            0x00010d313da3 proc_exit + 19
6   postgres            0x00010d35112f PostgresMain + 3711
7   postgres            0x00010d27bb3a BackendRun + 570
8   postgres            0x00010d27af6b BackendStartup + 475
9   postgres            0x00010d279ed1 ServerLoop + 593
10  postgres            0x00010d277940 PostmasterMain + 6016
11  postgres            0x00010d1597b9 main + 761
12  libdyld.dylib       0x7fff7161e3d5 start + 1
13  ???                 0x0003 0x0 + 3

Regards,


--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Add header support to text format and matching feature

2020-07-17 Thread Rémi Lapeyre
> 
> I don't know how to do that with git-send-email, but you can certainly do it 
> easily with git-format-patch and just attach them using your regular MUA.
> 
> (and while the cfbot and the archives have no problems dealing with the 
> change in subject, it does break threading in some other MUAs, so I would 
> recommend not doing that and sticking to the existing subject of the thread)
> 

Thanks, here are both patches attached so cfbot can read them.




v3-0001-Add-header-support-to-COPY-TO-text-format.patch
Description: Binary data


v3-0002-Add-header-matching-mode-to-COPY-FROM.patch
Description: Binary data


Re: Transactions involving multiple postgres foreign servers, take 2

2020-07-17 Thread Fujii Masao




On 2020/07/17 20:04, Masahiko Sawada wrote:

On Fri, 17 Jul 2020 at 14:22, tsunakawa.ta...@fujitsu.com
 wrote:


From: Masahiko Sawada 
I have briefly checked the only oracle_fdw but in general I think that

if an existing FDW supports transaction begin, commit, and rollback,
these can be ported to new FDW transaction APIs easily.


Does oracle_fdw support begin, commit and rollback?

And most importantly, do other major DBMSs, including Oracle, provide the API 
for preparing a transaction?  In other words, will the FDWs other than 
postgres_fdw really be able to take advantage of the new FDW functions to join 
the 2PC processing?  I think we need to confirm that there are concrete 
examples.


I also believe they do. But I'm concerned that some FDW needs to start
a transaction differently when using 2PC. For instance, IIUC MySQL
also supports 2PC but the transaction needs to be started with "XA
START id" when the transaction needs to be prepared. The transaction
started with XA START can be closed by XA END followed by XA PREPARE
or XA COMMIT ONE PHASE.


This means that the FDW should also provide an API for xa_end()?
Maybe we need to reconsider which APIs we should provide in the FDW,
based on the XA specification?



It means that when starting a new transaction, we need to prepare the
transaction identifier and to know that 2PC might be used. It’s quite
different from PostgreSQL. In PostgreSQL, we can start a transaction by
BEGIN and end it by PREPARE TRANSACTION, COMMIT, or ROLLBACK. The
transaction identifier is required when issuing PREPARE TRANSACTION.

With MySQL, I guess the FDW needs a way to tell that the (next)
transaction needs to be started with XA START so it can be prepared. It
could be a custom GUC or an SQL function. Then, when starting a new
transaction on the MySQL server, the FDW can generate and store a
transaction identifier somewhere alongside the connection. At the prepare
phase, it passes the transaction identifier via the GetPrepareId() API to
the core.
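
To make that flow concrete, a purely hypothetical sketch (the connection
type and helper below are invented for illustration; only the XA statement
sequence comes from the description above):

    #include "postgres.h"

    /* Hypothetical sketch only: "MySQLConn" and run_remote_stmt_sketch()
     * are invented names for an opaque connection handle and a helper that
     * sends one statement over it. */
    typedef struct MySQLConn MySQLConn;
    extern void run_remote_stmt_sketch(MySQLConn *conn, const char *stmt);

    static void
    mysql_fdw_prepare_sketch(MySQLConn *conn, const char *fdwxact_id)
    {
        /* the transaction was opened earlier with "XA START '<id>'" */
        run_remote_stmt_sketch(conn, psprintf("XA END '%s'", fdwxact_id));
        run_remote_stmt_sketch(conn, psprintf("XA PREPARE '%s'", fdwxact_id));
    }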

I haven’t tested the above yet and it’s just a desk plan. It's
definitely a good idea to try integrating this 2PC feature into FDWs
other than postgres_fdw to see whether the design and interfaces are
well designed.


With the current patch, we track whether write queries are executed
in each server. Then, if the number of servers that execute write queries
is less than two, 2PC is skipped. This "optimization" is not necessary
(cannot be applied) when using mysql_fdw because the transaction starts
with XA START. Right?

If that's the "optimization" only for postgres_fdw, maybe it's better to
get rid of that "optimization" from the first patch, to make the patch simpler.

Regards,


--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Partitioning and postgres_fdw optimisations for multi-tenancy

2020-07-17 Thread Etsuro Fujita
On Sat, Jul 18, 2020 at 12:44 AM Ashutosh Bapat
 wrote:
> On Fri, Jul 17, 2020 at 8:24 PM Etsuro Fujita  wrote:
> > On Fri, Jul 17, 2020 at 1:56 AM Alexey Kondratov
> >  wrote:
> > > However, there is an issue with aggregates as well. For a query like:
> > >
> > > SELECT
> > >  count(*)
> > > FROM
> > >  documents
> > > WHERE
> > >  company_id = 5;
> > >
> > > It would be great to teach planner to understand, that it's a
> > > partition-wise aggregate as well, even without GROUP BY company_id,
> > > which doesn't always help as well. I'll try to look closer on this
> > > problem, but if you have any thoughts about it, then I'd be glad to
> > > know.
> >
> > The reason why the aggregation count(*) isn't pushed down to the
> > remote side is: 1) we allow the FDW to push the aggregation down only
> > when the input relation to the aggregation is a foreign (base or join)
> > relation (see create_grouping_paths()), but 2) for your case the input
> > relation would be an append relation that contains the foreign
> > partition as only one child relation, NOT just the foreign partition.
> > The resulting Append path would be removed in the postprocessing (see
> > [1]), but that would be too late for the FDW to do the push-down work.
> > I have no idea what to do about this issue.
>
> Won't partitionwise aggregate push aggregate down to partition and
> then from there to the foreign server through FDW?

Sorry, my words were not clear.  The aggregation above is count(*)
*without GROUP BY*, so we can’t apply PWA to it.

Best regards,
Etsuro Fujita




Re: Error during make, second install

2020-07-17 Thread David G. Johnston
On Fri, Jul 17, 2020 at 8:58 AM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> Hey,
>
> I installed PostgreSQL source for the first time a few weeks ago.  I am
> now just getting to my first pull-and-reinstall.  I run make again at the
> top of the repo and I get:
> [...]
>
> I then ran ./configure again and got the same result.  Ubuntu 18.04.
>

Sorry for the noise - though maybe some insight is still warranted - but
running make clean first seems to have cleared up my problem.

David J.


Error during make, second install

2020-07-17 Thread David G. Johnston
Hey,

I installed PostgreSQL source for the first time a few weeks ago.  I am now
just getting to my first pull-and-reinstall.  I run make again at the top
of the repo and I get:

git @ 7fe3083f4

gcc -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Werror=vla -Wendif-labels
-Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wformat-security
-fno-strict-aliasing -fwrapv -fexcess-precision=standard
-Wno-format-truncation -O2 [...] -L../../src/port -L../../src/common
-Wl,--as-needed -Wl,-rpath,'/usr/local/pgsql/lib',--enable-new-dtags
 -Wl,-E -lpthread -lrt -ldl -lm -o postgres
catalog/catalog.o: In function `GetNewRelFileNode':
catalog.c:(.text+0x3f3): undefined reference to `ParallelMasterBackendId'
catalog/storage.o: In function `RelationCreateStorage':
storage.c:(.text+0x283): undefined reference to `ParallelMasterBackendId'
utils/adt/dbsize.o: In function `pg_relation_filepath':
dbsize.c:(.text+0x166e): undefined reference to `ParallelMasterBackendId'
collect2: error: ld returned 1 exit status
Makefile:66: recipe for target 'postgres' failed
make[2]: *** [postgres] Error 1
make[2]: Leaving directory '/home/postgres/postgresql/src/backend'
Makefile:42: recipe for target 'all-backend-recurse' failed
make[1]: *** [all-backend-recurse] Error 2
make[1]: Leaving directory '/home/postgres/postgresql/src'
GNUmakefile:11: recipe for target 'all-src-recurse' failed
make: *** [all-src-recurse] Error 2

I then ran ./configure again and got the same result.  Ubuntu 18.04.

Simply checking out and re-making 3a990a12635 (plus my two patches) works
just fine.

Please advise, fixing stuff in the C parts of the codebase is not a skill
I've picked up yet - been focused on docs and tests.

Thanks!

David J.


Re: Partitioning and postgres_fdw optimisations for multi-tenancy

2020-07-17 Thread Ashutosh Bapat
On Fri, Jul 17, 2020 at 8:24 PM Etsuro Fujita  wrote:
>
> On Fri, Jul 17, 2020 at 1:56 AM Alexey Kondratov
>  wrote:
> > However, there is an issue with aggregates as well. For a query like:
> >
> > SELECT
> >  count(*)
> > FROM
> >  documents
> > WHERE
> >  company_id = 5;
> >
> > It would be great to teach planner to understand, that it's a
> > partition-wise aggregate as well, even without GROUP BY company_id,
> > which doesn't always help as well. I'll try to look closer on this
> > problem, but if you have any thoughts about it, then I'd be glad to
> > know.
>
> The reason why the aggregation count(*) isn't pushed down to the
> remote side is: 1) we allow the FDW to push the aggregation down only
> when the input relation to the aggregation is a foreign (base or join)
> relation (see create_grouping_paths()), but 2) for your case the input
> relation would be an append relation that contains the foreign
> partition as only one child relation, NOT just the foreign partition.
> The resulting Append path would be removed in the postprocessing (see
> [1]), but that would be too late for the FDW to do the push-down work.
> I have no idea what to do about this issue.

Won't partitionwise aggregate push aggregate down to partition and
then from there to the foreign server through FDW? Something else must
be stopping it. Maybe a whole-var expression?

-- 
Best Wishes,
Ashutosh Bapat




Re: expose parallel leader in CSV and log_line_prefix

2020-07-17 Thread Alvaro Herrera
> On Fri, Jul 17, 2020 at 7:01 AM Michael Paquier  wrote:
>
> > Hmm.  Knowing if a leader is actually running parallel query or not
> > requires a lookup at lockGroupMembers, that itself requires a LWLock.
> > I think that it would be better to not require that.  So what if
> > instead we logged %P only if Myproc has lockGroupLeader set and it
> > does *not* match MyProcPid?

That's what I said first, so +1 for that approach.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Patch for reorderbuffer.c documentation.

2020-07-17 Thread Dave Cramer
On Fri, 17 Jul 2020 at 11:17, David G. Johnston 
wrote:

> On Fri, Jul 17, 2020 at 8:10 AM Dave Cramer  wrote:
>
>> This started out with just fixing
>>
>> "One option do deal" to " One option to deal"
>>
>> But after reading the rest I'd propose the following patch.
>>
>
> Suggest replacing "though" with "however" instead of trying to figure out
> what amount of commas is readable (the original seemed better IMO).
>
> "However, the transaction records are fairly small and"
>

Works for me.

Thanks,
Dave


Re: Add header support to text format and matching feature

2020-07-17 Thread Magnus Hagander
On Fri, Jul 17, 2020 at 5:11 PM Rémi Lapeyre 
wrote:

>
>
> > It's hard to find an explanation what this patch actually does.  I don't
> > want to have to go through threads dating back 4 months to determine
> > what was discussed and what was actually implemented.  Since you're
> > already using git format-patch, just add something to the commit message.
> >
> >
> > It appears that these are really two separate features, so perhaps they
> > should be two patches.
>
> Thanks for the feedback, I've split cleanly the two patches, simplified the
> tests and tried to explain the changes in the commit message.
>
> > Also, the new header matching mode could probably use more than one line
> > of documentation.
>
> I've improved the documentation, let me know if it's better.
>
> It seems like cfbot is not happy with the way I'm sending my patches. The
> wiki
> has some good advices on how to write a patch but I couldn't find anything
> on
> how to send it. I've used


>git send-email -v3 --compose --to=... --in-reply-to=... HEAD^^
>
> here but I'm not sure if it's correct. I will see if it works and will try
> to fix
> it if it's not but since it runs once a day it may take some time.
>

If you have two patches that depend on each other, you should send them as
two attachment to the same email. You now sent them as two separate emails,
and cfbot will then pick up the latest one of them which is only patch 0002
(at least I'm fairly sure that's how it works).

I don't know how to do that with git-send-email, but you can certainly do
it easily with git-format-patch and just attach them using your regular MUA.

(and while the cfbot and the archives have no problems dealing with the
change in subject, it does break threading in some other MUAs, so I would
recommend not doing that and sticking to the existing subject of the thread)

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: Which SET TYPE don't actually require a rewrite

2020-07-17 Thread Tom Lane
Magnus Hagander  writes:
> As Amit mentions it is also triggered by some store parameter changes. But
> not all. So looking at it the other way, the part that the end user really
> cares about it "which ALTER TABLE operations will rewrite the table and
> which will not". Maybe what we need is a section specifically on this that
> summarizes all the different ways that it can happen.

No, what we need is EXPLAIN for DDL ;-).  Trying to keep such
documentation in sync with the actual code behavior would be impossible.
(For one thing, some aspects can be affected by extension datatype
behaviors.)

regards, tom lane




Re: renaming configure.in to configure.ac

2020-07-17 Thread Tom Lane
Magnus Hagander  writes:
> That one does more or less what Dagfinn suggests except in a separate repo.
> We could also just have a separate repo for it where people could push if
> we wanted to. Which could be committers, or others. But in comparison with
> what Perl does, I would assume actually having "just committers"be able to
> push would really be enough for that. A committer should be able to judge
> whether a patch needs extra cross-platform testing (and the cfbot does just
> fine for the limited platforms it runs on, which would still be good enough
> for *most* patches).

By and large, once a patch has reached that stage, we just push it to
master and deal with any fallout.  I suppose you could argue that
pushing to a testing branch first would reduce the amount of time that
HEAD is broken, but TBH I think it would not help much.  An awful lot
of the stuff that breaks the buildfarm is patches that the committer
was not expecting trouble with.

regards, tom lane




Re: Patch for reorderbuffer.c documentation.

2020-07-17 Thread David G. Johnston
On Fri, Jul 17, 2020 at 8:10 AM Dave Cramer  wrote:

> This started out with just fixing
>
> "One option do deal" to " One option to deal"
>
> But after reading the rest I'd propose the following patch.
>

Suggest replacing "though" with "however" instead of trying to figure out
what amount of commas is readable (the original seemed better IMO).

"However, the transaction records are fairly small and"

The rest is straight-forward.

David J.


[PATCH v3 2/2] Add header matching mode to "COPY FROM"

2020-07-17 Thread Rémi Lapeyre

COPY FROM supports the HEADER option to silently discard the header from
a CSV or text file. It is possible to load by mistake a file that
matches the expected format, for example if two text columns have been
swapped, resulting in garbage in the database.

This option adds the possibility to actually check the header to make
sure it matches what is expected and exit immediately if it does not.

Discussion: 
https://www.postgresql.org/message-id/flat/caf1-j-0ptcwmeltswwgv2m70u26n4g33gpe1rckqqe6wvqd...@mail.gmail.com
---
 contrib/file_fdw/input/file_fdw.source  |  6 ++
 contrib/file_fdw/output/file_fdw.source |  9 ++-
 doc/src/sgml/ref/copy.sgml  |  8 ++-
 src/backend/commands/copy.c | 84 +++--
 src/test/regress/input/copy.source  | 25 
 src/test/regress/output/copy.source | 17 +
 6 files changed, 140 insertions(+), 9 deletions(-)

diff --git a/contrib/file_fdw/input/file_fdw.source b/contrib/file_fdw/input/file_fdw.source
index 83edb71077..7a3983c785 100644
--- a/contrib/file_fdw/input/file_fdw.source
+++ b/contrib/file_fdw/input/file_fdw.source
@@ -79,6 +79,12 @@ CREATE FOREIGN TABLE agg_bad (
 OPTIONS (format 'csv', filename '@abs_srcdir@/data/agg.bad', header 'true', delimiter ';', quote '@', escape '"', null '');
 ALTER FOREIGN TABLE agg_bad ADD CHECK (a >= 0);
 
+-- test header matching
+CREATE FOREIGN TABLE header_match ("1" int, foo text) SERVER file_server
+OPTIONS (format 'csv', filename '@abs_srcdir@/data/list1.csv', delimiter ',', header 'match');
+CREATE FOREIGN TABLE header_dont_match (a int, foo text) SERVER file_server
+OPTIONS (format 'csv', filename '@abs_srcdir@/data/list1.csv', delimiter ',', header 'match');	-- ERROR
+
 -- per-column options tests
 CREATE FOREIGN TABLE text_csv (
 word1 text OPTIONS (force_not_null 'true'),
diff --git a/contrib/file_fdw/output/file_fdw.source b/contrib/file_fdw/output/file_fdw.source
index 547b81fd16..d76a3dc6f8 100644
--- a/contrib/file_fdw/output/file_fdw.source
+++ b/contrib/file_fdw/output/file_fdw.source
@@ -93,6 +93,11 @@ CREATE FOREIGN TABLE agg_bad (
 ) SERVER file_server
 OPTIONS (format 'csv', filename '@abs_srcdir@/data/agg.bad', header 'true', delimiter ';', quote '@', escape '"', null '');
 ALTER FOREIGN TABLE agg_bad ADD CHECK (a >= 0);
+-- test header matching
+CREATE FOREIGN TABLE header_match ("1" int, foo text) SERVER file_server
+OPTIONS (format 'csv', filename '/Users/remi/src/postgresql/contrib/file_fdw/data/list1.csv', delimiter ',', header 'match');
+CREATE FOREIGN TABLE header_dont_match (a int, foo text) SERVER file_server
+OPTIONS (format 'csv', filename '/Users/remi/src/postgresql/contrib/file_fdw/data/list1.csv', delimiter ',', header 'match');	-- ERROR
 -- per-column options tests
 CREATE FOREIGN TABLE text_csv (
 word1 text OPTIONS (force_not_null 'true'),
@@ -439,12 +444,14 @@ SET ROLE regress_file_fdw_superuser;
 -- cleanup
 RESET ROLE;
 DROP EXTENSION file_fdw CASCADE;
-NOTICE:  drop cascades to 7 other objects
+NOTICE:  drop cascades to 9 other objects
 DETAIL:  drop cascades to server file_server
 drop cascades to user mapping for regress_file_fdw_superuser on server file_server
 drop cascades to user mapping for regress_no_priv_user on server file_server
 drop cascades to foreign table agg_text
 drop cascades to foreign table agg_csv
 drop cascades to foreign table agg_bad
+drop cascades to foreign table header_match
+drop cascades to foreign table header_dont_match
 drop cascades to foreign table text_csv
 DROP ROLE regress_file_fdw_superuser, regress_file_fdw_user, regress_no_priv_user;
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index c628a69c57..c35914511f 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -36,7 +36,7 @@ COPY { table_name [ ( boolean ]
 DELIMITER 'delimiter_character'
 NULL 'null_string'
-HEADER [ boolean ]
+HEADER { match | true | false }
 QUOTE 'quote_character'
 ESCAPE 'escape_character'
 FORCE_QUOTE { ( column_name [, ...] ) | * }
@@ -268,7 +268,11 @@ COPY { table_name [ ( 
   Specifies that the file contains a header line with the names of each
   column in the file.  On output, the first line contains the column
-  names from the table, and on input, the first line is ignored.
+  names from the table. On input, the first line is discarded when set
+  to true or required to match the column names if set
+  to match. If the number of columns in the header is
+  not correct, their order differs from the one expected, or the name or
+  case do not match, the copy will be aborted with an error.
   This option is allowed only when using CSV or
   text format.
  
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index a21508a974..cde6582f1a 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -95,6 +95,16 @@ typedef enum CopyInsertMethod
 	CIM_MULTI_CONDITIONAL		

Add header support to text format and matching feature

2020-07-17 Thread Rémi Lapeyre



> It's hard to find an explanation what this patch actually does.  I don't
> want to have to go through threads dating back 4 months to determine
> what was discussed and what was actually implemented.  Since you're
> already using git format-patch, just add something to the commit message.
>
>
> It appears that these are really two separate features, so perhaps they
> should be two patches.

Thanks for the feedback, I've split cleanly the two patches, simplified the
tests and tried to explain the changes in the commit message.

> Also, the new header matching mode could probably use more than one line
> of documentation.

I've improved the documentation, let me know if it's better.
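
For illustration, here is a rough sketch of what the two patches allow, using a
hypothetical table named tab (syntax as in the posted patches):

    -- patch 1/2: HEADER is now accepted for the default text format on output
    COPY tab TO STDOUT (FORMAT text, HEADER true);

    -- patch 2/2: HEADER match checks the file's first line against the
    -- table's column names (number, order, and case must all match)
    COPY tab FROM '/path/to/tab.csv' (FORMAT csv, HEADER match);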

It seems like cfbot is not happy with the way I'm sending my patches. The wiki
has some good advice on how to write a patch but I couldn't find anything on
how to send it. I've used

   git send-email -v3 --compose --to=... --in-reply-to=... HEAD^^

here but I'm not sure if it's correct. I will see if it works and will try to
fix it if it's not, but since it runs once a day it may take some time.





[PATCH v3 1/2] Add header support to "COPY TO" text format

2020-07-17 Thread Rémi Lapeyre

The CSV format supports the HEADER option to output a header line, which
is convenient when other programs need to consume the output. This patch
adds the same option to the default text format.

Discussion: 
https://www.postgresql.org/message-id/flat/caf1-j-0ptcwmeltswwgv2m70u26n4g33gpe1rckqqe6wvqd...@mail.gmail.com
---
 contrib/file_fdw/input/file_fdw.source  |  1 -
 contrib/file_fdw/output/file_fdw.source |  4 +---
 doc/src/sgml/ref/copy.sgml  |  3 ++-
 src/backend/commands/copy.c | 11 +++
 src/test/regress/input/copy.source  | 12 
 src/test/regress/output/copy.source |  8 
 6 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/contrib/file_fdw/input/file_fdw.source b/contrib/file_fdw/input/file_fdw.source
index 45b728eeb3..83edb71077 100644
--- a/contrib/file_fdw/input/file_fdw.source
+++ b/contrib/file_fdw/input/file_fdw.source
@@ -37,7 +37,6 @@ CREATE USER MAPPING FOR regress_no_priv_user SERVER file_server;
 
 -- validator tests
 CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'xml');  -- ERROR
-CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'text', header 'true');  -- ERROR
 CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'text', quote ':');  -- ERROR
 CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'text', escape ':'); -- ERROR
 CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'binary', header 'true');-- ERROR
diff --git a/contrib/file_fdw/output/file_fdw.source b/contrib/file_fdw/output/file_fdw.source
index 52b4d5f1df..547b81fd16 100644
--- a/contrib/file_fdw/output/file_fdw.source
+++ b/contrib/file_fdw/output/file_fdw.source
@@ -33,14 +33,12 @@ CREATE USER MAPPING FOR regress_no_priv_user SERVER file_server;
 -- validator tests
 CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'xml');  -- ERROR
 ERROR:  COPY format "xml" not recognized
-CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'text', header 'true');  -- ERROR
-ERROR:  COPY HEADER available only in CSV mode
 CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'text', quote ':');  -- ERROR
 ERROR:  COPY quote available only in CSV mode
 CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'text', escape ':'); -- ERROR
 ERROR:  COPY escape available only in CSV mode
 CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'binary', header 'true');-- ERROR
-ERROR:  COPY HEADER available only in CSV mode
+ERROR:  COPY HEADER available only in CSV and text mode
 CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'binary', quote ':');-- ERROR
 ERROR:  COPY quote available only in CSV mode
 CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'binary', escape ':');   -- ERROR
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 18189abc6c..c628a69c57 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -269,7 +269,8 @@ COPY { table_name [ ( CSV format.
+  This option is allowed only when using CSV or
+  text format.
  
 

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 44da71c4cb..a21508a974 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -136,7 +136,7 @@ typedef struct CopyStateData
 	bool		binary;			/* binary format? */
 	bool		freeze;			/* freeze rows on loading? */
 	bool		csv_mode;		/* Comma Separated Value format? */
-	bool		header_line;	/* CSV header line? */
+	bool		header_line;	/* CSV or text header line? */
 	char	   *null_print;		/* NULL marker string (server encoding!) */
 	int			null_print_len; /* length of same */
 	char	   *null_print_client;	/* same converted to file encoding */
@@ -1363,10 +1363,10 @@ ProcessCopyOptions(ParseState *pstate,
  errmsg("COPY delimiter cannot be \"%s\"", cstate->delim)));
 
 	/* Check header */
-	if (!cstate->csv_mode && cstate->header_line)
+	if (cstate->binary && cstate->header_line)
 		ereport(ERROR,
 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("COPY HEADER available only in CSV mode")));
+ errmsg("COPY HEADER available only in CSV and text mode")));
 
 	/* Check quote */
 	if (!cstate->csv_mode && cstate->quote != NULL)
@@ -2099,8 +2099,11 @@ CopyTo(CopyState cstate)
 
 colname = NameStr(TupleDescAttr(tupDesc, attnum - 1)->attname);
 
-CopyAttributeOutCSV(cstate, colname, false,
+if (cstate->csv_mode)
+	CopyAttributeOutCSV(cstate, colname, false,
 	list_length(cstate->attnumlist) == 1);
+else
+	CopyAttributeOutText(cstate, colname);
 			}
 
 			CopySendEndOfRow(cstate);
diff --git a/src/test/regress/input/copy.source b/src/test/regress/input/copy.source
index a1d529ad36..2368649111 100644
--- a/src/test/regress/input/copy.source
+++ b/src/test/regress/input/copy.source
@@ -134,6 +134,18 @@ this is just a line full of junk that would error

Patch for reorderbuffer.c documentation.

2020-07-17 Thread Dave Cramer
This started out with just fixing

"One option do deal" to " One option to deal"

But after reading the rest I'd propose the following patch.

Dave Cramer


reorderdoc.patch
Description: Binary data


Re: Encoding of src/timezone/tznames/Europe.txt

2020-07-17 Thread Tom Lane
Michael Paquier  writes:
> On Thu, Jul 16, 2020 at 09:46:03PM +0200, Christoph Berg wrote:
>> Or that, yes. (The correct German transliteration is
>> "Mitteleuropaeische" with 'ae'.)

> tznames/Europe.txt is iso-latin-1-unix for buffer-file-coding-system
> since its introduction in d8b5c95, and tznames/Default is using ASCII
> as well since this point.  +1 to switch all that to ASCII and give up
> on the accents.

Done that way.  I also checked for other discrepancies between
tznames/Default and the other files, and found a few more trivialities.

regards, tom lane




Re: Partitioning and postgres_fdw optimisations for multi-tenancy

2020-07-17 Thread Etsuro Fujita
On Fri, Jul 17, 2020 at 1:56 AM Alexey Kondratov
 wrote:
> However, there is an issue with aggregates as well. For a query like:
>
> SELECT
>  count(*)
> FROM
>  documents
> WHERE
>  company_id = 5;
>
> It would be great to teach planner to understand, that it's a
> partition-wise aggregate as well, even without GROUP BY company_id,
> which doesn't always help as well. I'll try to look closer on this
> problem, but if you have any thoughts about it, then I'd be glad to
> know.

The reason why the aggregation count(*) isn't pushed down to the
remote side is: 1) we allow the FDW to push the aggregation down only
when the input relation to the aggregation is a foreign (base or join)
relation (see create_grouping_paths()), but 2) for your case the input
relation would be an append relation that contains the foreign
partition as its only child relation, NOT just the foreign partition.
The resulting Append path would be removed in the postprocessing (see
[1]), but that would be too late for the FDW to do the push-down work.
I have no idea what to do about this issue.
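
For illustration, here is a minimal sketch of the two plan shapes involved
(table, partition, and server names are hypothetical):

    CREATE TABLE documents (company_id int, body text) PARTITION BY LIST (company_id);
    CREATE FOREIGN TABLE documents_5 PARTITION OF documents
        FOR VALUES IN (5) SERVER shard1 OPTIONS (table_name 'documents_5');

    -- not pushed down: the aggregate's input is the single-child append rel
    EXPLAIN (VERBOSE, COSTS OFF)
    SELECT count(*) FROM documents WHERE company_id = 5;

    -- pushed down: the aggregate's input is the foreign base relation itself
    EXPLAIN (VERBOSE, COSTS OFF)
    SELECT count(*) FROM documents_5;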

Best regards,
Etsuro Fujita

[1] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=8edd0e79460b414b1d971895312e549e95e12e4f;hp=f21668f328c864c6b9290f39d41774cb2422f98e




Re: renaming configure.in to configure.ac

2020-07-17 Thread Magnus Hagander
On Fri, Jul 17, 2020 at 4:12 PM Tom Lane  wrote:

> ilm...@ilmari.org (Dagfinn Ilmari =?utf-8?Q?Manns=C3=A5ker?=) writes:
> > Noah Misch  writes:
> >> On Thu, Jul 16, 2020 at 11:41:56AM +0100, Dagfinn Ilmari Mannsåker
> wrote:
> >>> Instead of doing this on the master branch, would it be worth defining
> a
> >>> namespace for branches that the buildfarm tests in addition to master
> >>> and REL_*_STABLE?
>
> >> Potentially.  What advantages and disadvantages has Perl experienced?
>
> > The advantage is getting proposed changes tested on a number of
> > platforms that individual developers otherwise don't have access to.
> > For example
> http://perl.develop-help.com/?b=smoke-me%2Filmari%2Fremove-symbian
> > shows the reults of one branch of mine.
> > The only disadvantage is that it takes up more build farm capacity, but
> > it's not used for all changes, only ones that developers are concerned
> > might break on other platforms (e.g. affecting platform-specific code or
> > constructs otherwise known to behave differently across platforms and
> > compilers).
>
> I'd argue that cluttering the main development repo with dead branches
> is a non-negligible cost.  We have one or two such left over from very
> ancient days, and I don't really want more.  (Is there a way to remove
> a branch once it's been pushed to a shared git repo?)
>

Yes, it's trivial to remove a branch from a shared git repo. In modern
versions of git, just "git push origin --delete stupidbranch".

The actual commits remain in the repo of course, until such time as they are
garbage-collected.


> Another issue is that we're not going to open up the main repo for
> access by non-committers, so this approach doesn't help for most
> developers.  We've had some success, I think, with Munro's cfbot
> solution --- I'd rather see that approach expanded to provide more
> test environments.
>

That one does more or less what Dagfinn suggests except in a separate repo.
We could also just have a separate repo for it where people could push if
we wanted to. Which could be committers, or others. But in comparison with
what Perl does, I would assume actually having "just committers" be able to
push would really be enough for that. A committer should be able to judge
whether a patch needs extra cross-platform testing (and the cfbot does just
fine for the limited platforms it runs on, which would still be good enough
for *most* patches).

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: renaming configure.in to configure.ac

2020-07-17 Thread Tom Lane
ilm...@ilmari.org (Dagfinn Ilmari =?utf-8?Q?Manns=C3=A5ker?=) writes:
> Noah Misch  writes:
>> On Thu, Jul 16, 2020 at 11:41:56AM +0100, Dagfinn Ilmari Mannsåker wrote:
>>> Instead of doing this on the master branch, would it be worth defining a
>>> namespace for branches that the buildfarm tests in addition to master
>>> and REL_*_STABLE?

>> Potentially.  What advantages and disadvantages has Perl experienced?

> The advantage is getting proposed changes tested on a number of
> platforms that individual developers otherwise don't have access to.
> For example http://perl.develop-help.com/?b=smoke-me%2Filmari%2Fremove-symbian
> shows the results of one branch of mine.
> The only disadvantage is that it takes up more build farm capacity, but
> it's not used for all changes, only ones that developers are concerned
> might break on other platforms (e.g. affecting platform-specific code or
> constructs otherwise known to behave differently across platforms and
> compilers).

I'd argue that cluttering the main development repo with dead branches
is a non-negligible cost.  We have one or two such left over from very
ancient days, and I don't really want more.  (Is there a way to remove
a branch once it's been pushed to a shared git repo?)

Another issue is that we're not going to open up the main repo for
access by non-committers, so this approach doesn't help for most
developers.  We've had some success, I think, with Munro's cfbot
solution --- I'd rather see that approach expanded to provide more
test environments.

regards, tom lane




Re: Which SET TYPE don't actually require a rewrite

2020-07-17 Thread Magnus Hagander
On Fri, Jul 17, 2020 at 5:40 AM Noah Misch  wrote:

> On Wed, Jul 15, 2020 at 02:54:37PM +0200, Magnus Hagander wrote:
> > Our Fine Manual (TM) specifies:
> > "As an exception, when changing the type of an existing column, if the
> > USING clause does not change the column contents and the old type is
> either
> > binary coercible to the new type or an unconstrained domain over the new
> > type, a table rewrite is not needed; but any indexes on the affected
> > columns must still be rebuilt."
> >
> > First of all, how is a non-internals-expert even supposed to know what a
> > binary coercible type is?
>
> The manual defines it at binary coercible.
>

The only way to actually realize that this is a <firstterm> is to look at
the source code though, right? It's definitely not clear that one should go
look at the CREATE CAST documentation to find the definition -- certainly
not from the ALTER TABLE documentation, which I would argue is the place
where most people would go.

And while having the definition there is nice, it doesn't help an end user
in any way at all to determine if their ALTER TABLE statement is going to
be "safe from rewrites" or not. It (hopefully) helps someone who knows some
things about the database internals, which is of course a valuable thing as
well, but not the end user.
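
FWIW, the information is at least queryable: the binary-coercible casts are the
pg_cast rows with castmethod = 'b', so a user who knows where to look can check
a particular pair of types, e.g.:

    SELECT castsource::regtype AS source, casttarget::regtype AS target
    FROM pg_cast
    WHERE castmethod = 'b'
      AND 'varchar'::regtype IN (castsource, casttarget);

But of course that still assumes the user already knows that this is what
"binary coercible" means, which rather proves the point.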


> We can also for example increase the precision of numeric without a
> rewrite
> > (but not scale). Or we can change between text and varchar. And we can
> > increase the length of a varchar but not decrease it.
> >
> > Surely we can do better than this when it comes to documenting it? Even
> if
> > it's a pluggable thing so it may or may not be true of external
> > datatypes installed later, we should be able to at least be more clear
> > about the builtin types, I think?
>
> I recall reasoning that ATColumnChangeRequiresRewrite() is a DDL analog of
> query optimizer logic.  The manual brings up only a minority of planner
> optimizations, and comprehensive lists of optimization preconditions are
> even
> rarer.  But I don't mind if $SUBJECT documentation departs from that norm.
>

I can see the argument being made for that, and it will certainly be made
again in the future. But I'd say that given the very bad consequences of
getting it wrong, it's far from minor. And the number of times I've had to
answer the question "can I make this change safely" (which usually amounts
to me trying it out to see what happens, if I hadn't done that exact one
many times before) indicates the need for more detailed documentation on it.

As Amit mentions it is also triggered by some storage parameter changes. But
not all. So looking at it the other way, the part that the end user really
cares about is "which ALTER TABLE operations will rewrite the table and
which will not". Maybe what we need is a section specifically on this that
summarizes all the different ways that it can happen.
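
In the meantime, one empirical way to answer it for a concrete case today: a
rewrite assigns the table a new relfilenode, so comparing it before and after
the ALTER tells you what happened (hypothetical table below):

    CREATE TABLE t (v varchar(10));
    SELECT pg_relation_filenode('t');               -- note the value

    ALTER TABLE t ALTER COLUMN v TYPE varchar(20);  -- widening: no rewrite
    SELECT pg_relation_filenode('t');               -- unchanged

    ALTER TABLE t ALTER COLUMN v TYPE varchar(5);   -- shrinking: table is rewritten
    SELECT pg_relation_filenode('t');               -- new value

That is exactly the "trying it out to see what happens" dance, though, which is
what better documentation should make unnecessary.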

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: Parallel copy

2020-07-17 Thread Ashutosh Sharma
Some review comments (mostly) from the leader side code changes:

1) Do we need a DSM key for the FORCE_QUOTE option? I think FORCE_QUOTE
option is only used with COPY TO and not COPY FROM so not sure why you have
added it.

PARALLEL_COPY_KEY_FORCE_QUOTE_LIST

2) Should we be allocating the parallel copy data structure only when it is
confirmed that the parallel copy is allowed?

pcdata = (ParallelCopyData *) palloc0(sizeof(ParallelCopyData));
cstate->pcdata = pcdata;

Or, if you want it to be allocated before confirming if Parallel copy is
allowed or not, then I think it would be good to allocate it in
*cstate->copycontext* memory context so that when EndCopy is called towards
the end of the COPY FROM operation, the entire context itself gets deleted
thereby freeing the memory space allocated for pcdata. In fact it would be
good to ensure that all the local memory allocated inside the ctstate
structure gets allocated in the *cstate->copycontext* memory context.

3) Should we allow Parallel Copy when the insert method is
CIM_MULTI_CONDITIONAL?

+   /* Check if the insertion mode is single. */
+   if (FindInsertMethod(cstate) == CIM_SINGLE)
+   return false;

I know we have added checks in CopyFrom() to ensure that if any trigger
(before row or instead of) is found on any of the partitions being loaded with
data, then the COPY FROM operation would fail, but does it mean that we are
okay to perform parallel copy on a partitioned table? Have we done some
performance testing with a partitioned table where the data in the input
file needs to be routed to the different partitions?

4) There are lot of if-checks in IsParallelCopyAllowed function that are
checked in CopyFrom function as well which means in case of Parallel Copy
those checks will get executed multiple times (first by the leader and from
second time onwards by each worker process). Is that required?

5) Should the worker process be calling this function when the leader has
already called it once in ExecBeforeStmtTrigger()?

/* Verify the named relation is a valid target for INSERT */
CheckValidResultRel(resultRelInfo, CMD_INSERT);

6) I think it would be good to re-write the comments atop
ParallelCopyLeader(). From the present comments it appears as if you were
trying to put the information pointwise but somehow you ended up putting in
a paragraph. The comments also have some typos like *line beaks* which
possibly means line breaks. This is applicable for other comments as well
where you

7) Is the following check equivalent to IsWorker()? If so, it would be
good to replace it with an IsWorker like macro to increase the readability.

(IsParallelCopy() && !IsLeader())

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

On Fri, Jul 17, 2020 at 2:09 PM vignesh C  wrote:

> >
> > Please find the updated patch with the fixes included.
> >
>
> Patch 0003-Allow-copy-from-command-to-process-data-from-file-ST.patch
> had few indentation issues, I have fixed and attached the patch for
> the same.
>
> Regards,
> Vignesh
> EnterpriseDB: http://www.enterprisedb.com
>


RE: Parallel Seq Scan vs kernel read ahead

2020-07-17 Thread k.jami...@fujitsu.com
On Wednesday, July 15, 2020 12:52 PM (GMT+9), David Rowley wrote:

>On Wed, 15 Jul 2020 at 14:51, Amit Kapila  wrote:
>>
>> On Wed, Jul 15, 2020 at 5:55 AM David Rowley  wrote:
>>> If we've not seen any performance regressions within 1 week, then I 
>>> propose that we (pending final review) push this to allow wider 
>>> testing.
>>
>> I think Soumyadeep has reported a regression case [1] with the earlier 
>> version of the patch.  I am not sure if we have verified that the 
>> situation improves with the latest version of the patch.  I request 
>> Soumyadeep to please try once with the latest patch.
>...
>Yeah, it would be good to see some more data points on that test.
>Jumping from 2 up to 6 workers just leaves us to guess where the performance
>started to become bad. >It would be good to know if the regression is
>repeatable or if it was affected by some other process.
>...
>It would be good to see EXPLAIN (ANALYZE, BUFFERS) with SET track_io_timing = 
>on;
>for each value of >max_parallel_workers.

Hi,

If I'm following the thread correctly, we may have gains on this patch
of Thomas and David, but we need to test its effects on different
filesystems. It's also been clarified by David through benchmark tests
that synchronize_seqscans is not affected as long as the set cap per
chunk size of parallel scan is at 8192.

I also agree that having a control on this through GUC can be
beneficial for users, however, that can be discussed in another
thread or development in the future.

David Rowley wrote:
>I'd like to propose that if anyone wants to do further testing on
>other operating systems with SSDs or HDDs then it would be good if
>that could be done within a 1 week from this email. There are various
>benchmarking ideas on this thread for inspiration.

I'd like to join in testing it, this one using an HDD, and at the bottom
are the results. Due to my machine limitations, I only tested
0~6 workers; even if I increase max_parallel_workers_per_gather
beyond that, the query planner would still cap the workers at 6.
I also set track_io_timing to on as per David's recommendation.

Tested on:
XFS filesystem, HDD virtual machine
RHEL4, 64-bit,
4 CPUs, Intel Core Processor (Haswell, IBRS)
PostgreSQL 14devel on x86_64-pc-linux-gnu


Test Case (Soumyadeep's) [1]

shared_buffers = 32MB (to use OS page cache)

create table t_heap as select generate_series(1, 1) i;   --about 3.4GB 
size

SET track_io_timing = on;

\timing

set max_parallel_workers_per_gather = 0;  --0 to 6

SELECT count(*) from t_heap;
EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) from t_heap;

[Summary]
I used the same query from the thread. However, the sql query execution time
and query planner execution time results between the master and patched do
not vary much.
OTOH, in terms of I/O stats, I observed similar regression in both master
and patched as we increase max_parallel_workers_per_gather.

It could also be possible that each benchmark setting for
max_parallel_workers_per_gather is affected by the previous result. IOW,
later benchmark runs benefit from the data cached by previous runs at the
OS level.
Any advice? Please refer to the tables below for results.

(MASTER/UNPATCHED)
| Parallel Workers | SQLExecTime  | PlannerExecTime | Buffers                     |
|------------------|--------------|-----------------|-----------------------------|
| 0                | 12942.606 ms | 37031.786 ms    | shared hit=32 read=442446   |
| 1                |  4959.567 ms | 17601.813 ms    | shared hit=128 read=442350  |
| 2                |  3273.610 ms | 11766.441 ms    | shared hit=288 read=442190  |
| 3                |  2449.342 ms |  9057.236 ms    | shared hit=512 read=441966  |
| 4                |  2482.404 ms |  8853.702 ms    | shared hit=800 read=441678  |
| 5                |  2430.944 ms |  8777.630 ms    | shared hit=1152 read=441326 |
| 6                |  2493.416 ms |  8798.200 ms    | shared hit=1568 read=440910 |

(PATCHED V2)
| Parallel Workers | SQLExecTime  | PlannerExecTime | Buffers                     |
|------------------|--------------|-----------------|-----------------------------|
| 0                |  9283.193 ms | 34471.050 ms    | shared hit=2624 read=439854 |
| 1                |  4872.728 ms | 17449.725 ms    | shared hit=2528 read=439950 |
| 2                |  3240.301 ms | 11556.243 ms    | shared hit=2368 read=440110 |
| 3                |  2419.512 ms |  8709.572 ms    | shared hit=2144 read=440334 |
| 4                |  2746.820 ms |  8768.812 ms    | shared hit=1856 read=440622 |
| 5                |  2424.687 ms |  8699.762 ms    | shared hit=1504 read=440974 |
| 6                |  2581.999 ms |  8627.627 ms    | shared hit=1440 read=441038 |

(I/O Read Stat)
| Parallel Workers | I/O (Master)  | I/O (Patched) |
|------------------|---------------|---------------|
| 0                | read=1850.233 | read=1071.209 |
| 1                | read=1246.939

Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away

2020-07-17 Thread Bharath Rupireddy
>
> On Tue, Jul 14, 2020 at 6:13 PM Ashutosh Bapat
>  wrote:
> >
> > Has this been added to CF, possibly next CF?
> >
>
> I have not added yet. Request to get it done in this CF, as the final
> patch for review(v3 patch) is ready and shared. We can target it to
> the next CF if there are major issues with the patch.
>

I added this feature to the next CF - https://commitfest.postgresql.org/29/2651/

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Added tab completion for the missing options in copy statement

2020-07-17 Thread vignesh C
On Fri, Jul 17, 2020 at 11:15 AM Michael Paquier  wrote:
>
> On Fri, Jul 10, 2020 at 09:58:28AM +0530, vignesh C wrote:
> > Thanks for reviewing the patch.
> > This changes is already present in the document, no need to make any
> > changes as shown below:
> >
> > COPY table_name [ ( column_name [, ...] ) ]
> > FROM { 'filename' | PROGRAM 'command' | STDIN }
> > [ [ WITH ] ( option [, ...] ) ]
> > [ WHERE condition ]
>
> Not completely actually.  The page of psql for \copy does not mention
> the optional where clause, and I think that it would be better to add
> that for consistency (perhaps that's the point raised by Ahsan?).  I
> don't see much point in splitting the description of the meta-command
> into two lines as we already mix stdin and stdout for example which
> only apply to respectively "FROM" and "TO", so let's just append the
> conditional where clause at its end.  Attached is a patch doing so
> that I intend to back-patch down to v12.

I would like to split it into 2 lines, similar to the sql-copy
documentation, which gives better readability; I'm attaching a new patch
along those lines.

> Coming back to your proposal, another thing is that with your patch
> you recommend a syntax still present for compatibility reasons, but I
> don't think that we should recommend it to the users anymore, giving
> priority to the new grammar of the post-9.0 era.  I would actually go
> as far as removing BINARY from the completion when specified just
> after COPY to simplify the code, and specify the list of available
> options after typing "COPY ... WITH (FORMAT ", with "text", "csv" and
> "binary".  Adding completion for WHERE after COPY FROM is of course a
> good idea.

I agree with your comments, and have made a new patch accordingly.
Thoughts?

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
From c7a7897f22aa3ad9b995a75a5fc4c933d0f9b383 Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Fri, 17 Jul 2020 16:01:06 +0530
Subject: [PATCH 1/2] Corrected copy syntax.

Split copy command into copy to & copy from and included the missing options.
---
 doc/src/sgml/ref/psql-ref.sgml | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 42e862c..13179e8 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -982,9 +982,15 @@ testdb=>
   
 
   
+\copy { table [ ( column_list ) ] }
+from
+{ 'filename' | program 'command' | stdin | pstdin }
+[ [ with ] ( option [, ...] ) ]
+[ where condition ]
+
 \copy { table [ ( column_list ) ] | ( query ) }
-{ from | to }
-{ 'filename' | program 'command' | stdin | stdout | pstdin | pstdout }
+to
+{ 'filename' | program 'command' | stdout | pstdout }
 [ [ with ] ( option [, ...] ) ]
 
 
-- 
1.8.3.1

From 4c56df45327aee448f1f55a7aef24b01f3f343b9 Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Fri, 17 Jul 2020 17:06:15 +0530
Subject: [PATCH 2/2] Tab completion for copy statement.

Tab completion for suggesting WITH, OPTIONS & WHERE was missing, this patch has
the fix to suggest the same. I did not see any tests for tab completion, hence
no tests were added.
---
 src/bin/psql/tab-complete.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index eb01885..f835758 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2346,6 +2346,16 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("BINARY", "DELIMITER", "NULL", "CSV",
 	  "ENCODING");
 
+	/* Offer options after COPY  FROM|TO filename WITH ( */
+	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "("))
+		COMPLETE_WITH("FORMAT", "FREEZE", "DELIMITER", "NULL",
+	  "HEADER", "QUOTE", "ESCAPE", "FORCE_QUOTE",
+	  "FORCE_NOT_NULL", "FORCE_NULL", "ENCODING");
+
+	/* Offer options after COPY  FROM filename WITH options */
+	else if (Matches("COPY|\\copy", MatchAny, "FROM", MatchAny, "WITH", MatchAny))
+		COMPLETE_WITH("WHERE");
+
 	/* Offer options after COPY [BINARY]  FROM|TO filename CSV */
 	else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "CSV") ||
 			 Matches("COPY", "BINARY", MatchAny, "FROM|TO", MatchAny, "CSV"))
-- 
1.8.3.1



Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-17 Thread Bharath Rupireddy
> >
> > I will change the status to "ready for committer" in commitfest
> > tomorrow. Hope that's fine.
>
> I agree, a committer can have a look at this.
>

I changed the status in the commit fest to "Ready for Committer".

https://commitfest.postgresql.org/28/2632/

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: max_slot_wal_keep_size and wal_keep_segments

2020-07-17 Thread David Steele



On 7/17/20 5:11 AM, Fujii Masao wrote:



On 2020/07/14 20:30, David Steele wrote:

On 7/14/20 12:00 AM, Fujii Masao wrote:


The patch was no longer applied cleanly because of recent commit.
So I updated the patch. Attached.

Barring any objection, I will commit this patch.


This doesn't look right:

+   the  most recent megabytes
+   WAL files plus one WAL file are

How about:

+    megabytes of
+   WAL files plus one WAL file are


Thanks for the comment! Isn't it better to keep "most recent" part?
If so, what about either of the followings?

1.  megabytes of WAL files plus
     one WAL file that were most recently generated are kept all time.

2.  megabytes + linkend="guc-wal-segment-size"> bytes

     of WAL files that were most recently generated are kept all time.


"most recent" seemed implied to me, but I see your point.

How about:

+   the most recent  megabytes of
+   WAL files plus one additional WAL file are

Regards,
--
-David
da...@pgmasters.net




Re: Transactions involving multiple postgres foreign servers, take 2

2020-07-17 Thread Masahiko Sawada
On Fri, 17 Jul 2020 at 14:22, tsunakawa.ta...@fujitsu.com
 wrote:
>
> From: Masahiko Sawada 
> I have briefly checked the only oracle_fdw but in general I think that
> > if an existing FDW supports transaction begin, commit, and rollback,
> > these can be ported to new FDW transaction APIs easily.
>
> Does oracle_fdw support begin, commit and rollback?
>
> And most importantly, do other major DBMSs, including Oracle, provide the API 
> for preparing a transaction?  In other words, will the FDWs other than 
> postgres_fdw really be able to take advantage of the new FDW functions to 
> join the 2PC processing?  I think we need to confirm that there are concrete 
> examples.

I also believe they do. But I'm concerned that some FDW needs to start
a transaction differently when using 2PC. For instance, IIUC MySQL
also supports 2PC but the transaction needs to be started with "XA
START id" when the transaction needs to be prepared. The transaction
started with XA START can be closed by XA END followed by XA PREPARE
or XA COMMIT ONE PHASE. It means that when starting a new transaction,
the FDW needs to prepare the transaction identifier and to know that
2PC might be used. It’s quite different from PostgreSQL. In
PostgreSQL, we can start a transaction by BEGIN and end it by PREPARE
TRANSACTION, COMMIT, or ROLLBACK. The transaction identifier is
required when PREPARE TRANSACTION.
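
For reference, the MySQL-side sequence would look something like this
(transaction identifier chosen arbitrarily):

    XA START 'fx1';
    INSERT INTO t VALUES (1);
    XA END 'fx1';
    XA PREPARE 'fx1';
    XA COMMIT 'fx1';              -- or XA ROLLBACK 'fx1'

    -- a transaction that turns out not to need 2PC can instead finish with
    -- XA COMMIT 'fx1' ONE PHASE right after XA END, skipping the prepare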

With MySQL, I guess the FDW needs a way to tell that the (next) transaction
needs to be started with XA START so it can be prepared. It could be a
custom GUC or an SQL function. Then, when starting a new transaction on the
MySQL server, the FDW can generate and store a transaction identifier
somewhere alongside the connection. At the prepare phase, it passes
the transaction identifier via the GetPrepareId() API to the core.

I haven’t tested the above yet and it’s just a desk plan. It's
definitely a good idea to try integrating this 2PC feature into FDWs
other than postgres_fdw to see if the design and interfaces are
sophisticated enough.

>
> What I'm worried is that if only postgres_fdw can implement the prepare 
> function, it's a sign that FDW interface will be riddled with functions only 
> for Postgres.  That is, the FDW interface is getting away from its original 
> purpose "access external data as a relation" and complex.  Tomas Vondra 
> showed this concern as follows:
>
> Horizontal scalability/sharding
> https://www.postgresql.org/message-id/flat/CANP8%2BjK%3D%2B3zVYDFY0oMAQKQVJ%2BqReDHr1UPdyFEELO82yVfb9A%40mail.gmail.com#2c45f0ee97855449f1f7fedcef1d5e11
>
>
> [Tomas Vondra's remarks]
> --
> > This strikes me as a bit of a conflict of interest with FDW which
> > seems to want to hide the fact that it's foreign; the FDW
> > implementation makes it's own optimization decisions which might
> > make sense for single table queries but breaks down in the face of
> > joins.
>
> +1 to these concerns
>
> In my mind, FDW is a wonderful tool to integrate PostgreSQL with
> external data sources, and it's nicely shaped for this purpose, which
> implies the abstractions and assumptions in the code.
>
> The truth however is that many current uses of the FDW API are actually
> using it for different purposes because there's no other way to do that,
> not because FDWs are the "right way". And this includes the attempts to
> build sharding on FDW, I think.
>
> Situations like this result in "improvements" of the API that seem to
> improve the API for the second group, but make the life harder for the
> original FDW API audience by making the API needlessly complex. And I
> say "seem to improve" because the second group eventually runs into the
> fundamental abstractions and assumptions the API is based on anyway.
>
> And based on the discussions at pgcon, I think this is the main reason
> why people cringe when they hear "FDW" and "sharding" in the same sentence.
>
> ...
> My other worry is that we'll eventually mess the FDW infrastructure,
> making it harder to use for the original purpose. Granted, most of the
> improvements proposed so far look sane and useful for FDWs in general,
> but sooner or later that ceases to be the case - there sill be changes
> needed merely for the sharding. Those will be tough decisions.
> --
>
>
> > Regarding the comparison between FDW transaction APIs and transaction
> > callbacks, I think one of the benefits of providing FDW transaction
> > APIs is that the core is able to manage the status of foreign
> > transactions. We need to track the status of individual foreign
> > transactions to support atomic commit. If we use transaction callbacks
> > (XactCallback) that many FDWs are using, I think we will end up
> > calling the transaction callback and leave the transaction work to
> > FDWs, leading that the core is not able to know the return values of
> > PREPARE TRANSACTION for example. We can add more arguments passed to
> > tran

Re: renaming configure.in to configure.ac

2020-07-17 Thread Dagfinn Ilmari Mannsåker
Noah Misch  writes:

> On Thu, Jul 16, 2020 at 11:41:56AM +0100, Dagfinn Ilmari Mannsåker wrote:
>
>> Instead of doing this on the master branch, would it be worth defining a
>> namespace for branches that the buildfarm tests in addition to master
>> and REL_*_STABLE?
>> 
>> In the Perl world we have this in the form of smoke-me/* branches, and
>> it's invaluable to be able to test things across many platforms without
>> breaking blead (our name for the main development branch).
>
> Potentially.  What advantages and disadvantages has Perl experienced?

The advantage is getting proposed changes tested on a number of
platforms that individual developers otherwise don't have access to.
For example http://perl.develop-help.com/?b=smoke-me%2Filmari%2Fremove-symbian
shows the results of one branch of mine.

The only disadvantage is that it takes up more build farm capacity, but
it's not used for all changes, only ones that developers are concerned
might break on other platforms (e.g. affecting platform-specific code or
constructs otherwise known to behave differently across platforms and
compilers).

- ilmari
-- 
- Twitter seems more influential [than blogs] in the 'gets reported in
  the mainstream press' sense at least.   - Matt McLeod
- That'd be because the content of a tweet is easier to condense down
  to a mainstream media article.  - Calle Dybedahl




Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-07-17 Thread Dilip Kumar
The attached patch allows the vacuum to continue by emitting a WARNING
for the corrupted tuple instead of immediately erroring out, as discussed
at [1].

Basically, it provides a new GUC called vacuum_tolerate_damage to
control whether to continue the vacuum or to stop on the occurrence of
a corrupted tuple.  So if vacuum_tolerate_damage is set, then in
all the cases in heap_prepare_freeze_tuple where a corrupted xid is
detected, it will emit a warning and report that nothing was changed in
the tuple, and 'tuple_totally_frozen' will also be set to false.
Since we are returning false, the caller will not try to freeze such a
tuple, and because tuple_totally_frozen is also set to false the
page will not be marked all-frozen even if all other tuples in the
page are frozen.

Alternatively, we could try to freeze the other XIDs in the tuple that are
not corrupted, but I don't think we will gain anything from this,
because if either xmin or xmax is wrong then the next time we run the
vacuum we are going to get the same WARNING or ERROR again.
Is there any other opinion on this?

[1] 
http://postgr.es/m/ca+tgmoazwzhtffu6nujgeap6adds-qwfnoxpzgqpzmmm0vt...@mail.gmail.com
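
To make the intended usage concrete, with the patch applied the operator would
presumably do something like (table name is just an example):

    SET vacuum_tolerate_damage = on;
    VACUUM (VERBOSE) damaged_table;   -- corrupted XIDs produce WARNINGs,
                                      -- the rest of the table is still processed
    RESET vacuum_tolerate_damage;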

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From 5421c567a168705272bde425a02eb248c4468cb0 Mon Sep 17 00:00:00 2001
From: dilipkumar 
Date: Thu, 16 Jul 2020 11:41:28 +0530
Subject: [PATCH v1] Provide a GUC to allow vacuum to continue on corrupted
 tuple

A new GUC to control whether the vacuum will error out immediately
on detection of a corrupted tuple or it will just emit a WARNING for
each such instance and complete the rest of the vacuuming.
---
 doc/src/sgml/config.sgml  | 21 
 src/backend/access/heap/heapam.c  | 28 +--
 src/backend/commands/vacuum.c | 15 ++
 src/backend/utils/misc/guc.c  |  9 
 src/backend/utils/misc/postgresql.conf.sample |  1 +
 src/include/commands/vacuum.h |  2 +
 .../test_misc/t/002_vacuum_tolerate_damage.pl | 49 +++
 7 files changed, 120 insertions(+), 5 deletions(-)
 create mode 100644 src/test/modules/test_misc/t/002_vacuum_tolerate_damage.pl

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b353c61683..3c1e647e27 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8250,6 +8250,27 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
   
  
 
+  
+  vacuum_tolerate_damage (boolean)
+  
+   vacuum_tolerate_damage configuration parameter
+  
+  
+  
+   
+When set to off, which is the default, the VACUUM will raise
+a ERROR-level error if detected a corrupted tuple.  This causes the operation to
+stop immediately.
+   
+
+   
+If set to on, the VACUUM will instead emit a WARNING for each
+such tuple but the operation will continue.
+vacuuming.
+   
+  
+ 
+
  
   bytea_output (enum)
   
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index e09c8101e7..fc31799da5 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -51,6 +51,7 @@
 #include "access/xloginsert.h"
 #include "access/xlogutils.h"
 #include "catalog/catalog.h"
+#include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "port/atomics.h"
@@ -6123,6 +6124,8 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple,
 	frz->t_infomask = tuple->t_infomask;
 	frz->xmax = HeapTupleHeaderGetRawXmax(tuple);
 
+	*totally_frozen_p = false;
+
 	/*
 	 * Process xmin.  xmin_frozen has two slightly different meanings: in the
 	 * !XidIsNormal case, it means "the xmin doesn't need any freezing" (it's
@@ -6137,19 +6140,25 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple,
 	else
 	{
 		if (TransactionIdPrecedes(xid, relfrozenxid))
-			ereport(ERROR,
+		{
+			ereport(vacuum_damage_elevel(),
 	(errcode(ERRCODE_DATA_CORRUPTED),
 	 errmsg_internal("found xmin %u from before relfrozenxid %u",
 	 xid, relfrozenxid)));
+			return false;
+		}
 
 		xmin_frozen = TransactionIdPrecedes(xid, cutoff_xid);
 		if (xmin_frozen)
 		{
 			if (!TransactionIdDidCommit(xid))
-ereport(ERROR,
+			{
+ereport(vacuum_damage_elevel(),
 		(errcode(ERRCODE_DATA_CORRUPTED),
 		 errmsg_internal("uncommitted xmin %u from before xid cutoff %u needs to be frozen",
 		 xid, cutoff_xid)));
+return false;
+			}
 
 			frz->t_infomask |= HEAP_XMIN_FROZEN;
 			changed = true;
@@ -6218,10 +6227,13 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple,
 	else if (TransactionIdIsNormal(xid))
 	{
 		if (TransactionIdPrecedes(xid, relfrozenxid))
-			ereport(ERROR,
+		{
+			ereport(vacuum_damage_elevel(),
 	(errcode(ERRCODE_DATA_CORRUPTED),
 	 errmsg_internal("found xmax %u from before relfrozenxid %u",
 	 xid, relfrozenxid)));
+			return false;
+		}
 
 		if (Tra

Re: Implement UNLOGGED clause for COPY FROM

2020-07-17 Thread Amit Kapila
On Fri, Jul 17, 2020 at 9:53 AM osumi.takami...@fujitsu.com
 wrote:
>
> Lastly, I have to admit that
> the status of target table where data is loaded by COPY UNLOGGED would be 
> marked
> as invalid and notified to standbys under replication environment
> from the point in time when the operation takes place.
> But, I'm going to allow users with special privileges (like DBA) to use this 
> clause
> and this kind of tables would be judged by them not to replicate.
> Of course, I'm thinking better idea but now what I can say is like this for 
> replication.
>

If you are going to suggest that users not replicate such tables, then
why can't you suggest that they create such tables as UNLOGGED in the
first place?  Another idea could be that you create an 'unlogged'
table and copy the data to it, then perform Alter Table .. SET LOGGED
and attach it to the main table.  I think for this you need the main
table to be partitioned, but if that is possible then it might
be better than all the hacking you are proposing to do in the server
for this special operation.
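
A sketch of that alternative, with hypothetical names:

    CREATE TABLE main_tbl (id int, payload text) PARTITION BY RANGE (id);

    CREATE UNLOGGED TABLE staging (id int, payload text);
    COPY staging FROM '/path/to/data.csv' (FORMAT csv);  -- data load writes no WAL

    ALTER TABLE staging SET LOGGED;        -- table is rewritten and WAL-logged here, once
    ALTER TABLE main_tbl ATTACH PARTITION staging
        FOR VALUES FROM (1) TO (1000001);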

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: Parallel Seq Scan vs kernel read ahead

2020-07-17 Thread Amit Kapila
On Fri, Jul 17, 2020 at 11:35 AM k.jami...@fujitsu.com
 wrote:
>
> On Wednesday, July 15, 2020 12:52 PM (GMT+9), David Rowley wrote:
>
> >On Wed, 15 Jul 2020 at 14:51, Amit Kapila  wrote:
> >>
> >> On Wed, Jul 15, 2020 at 5:55 AM David Rowley  wrote:
> >>> If we've not seen any performance regressions within 1 week, then I
> >>> propose that we (pending final review) push this to allow wider
> >>> testing.
> >>
> >> I think Soumyadeep has reported a regression case [1] with the earlier
> >> version of the patch.  I am not sure if we have verified that the
> >> situation improves with the latest version of the patch.  I request
> >> Soumyadeep to please try once with the latest patch.
> >...
> >Yeah, it would be good to see some more data points on that test.
> >Jumping from 2 up to 6 workers just leaves us to guess where the performance
> >started to become bad. >It would be good to know if the regression is
> >repeatable or if it was affected by some other process.
> >...
> >It would be good to see EXPLAIN (ANALYZE, BUFFERS) with SET track_io_timing 
> >= on;
> >for each value of >max_parallel_workers.
>
> Hi,
>
> If I'm following the thread correctly, we may have gains on this patch
> of Thomas and David, but we need to test its effects on different
> filesystems. It's also been clarified by David through benchmark tests
> that synchronize_seqscans is not affected as long as the set cap per
> chunk size of parallel scan is at 8192.
>
> I also agree that having a control on this through GUC can be
> beneficial for users, however, that can be discussed in another
> thread or development in the future.
>
> David Rowley wrote:
> >I'd like to propose that if anyone wants to do further testing on
> >other operating systems with SSDs or HDDs then it would be good if
> >that could be done within a 1 week from this email. There are various
> >benchmarking ideas on this thread for inspiration.
>
> I'd like to join in testing it, this one using an HDD, and at the bottom
> are the results. Due to my machine limitations, I only tested
> 0~6 workers; even if I increase max_parallel_workers_per_gather
> beyond that, the query planner would still cap the workers at 6.
> I also set track_io_timing to on as per David's recommendation.
>
> Tested on:
> XFS filesystem, HDD virtual machine
> RHEL4, 64-bit,
> 4 CPUs, Intel Core Processor (Haswell, IBRS)
> PostgreSQL 14devel on x86_64-pc-linux-gnu
>
>
> Test Case (Soumyadeep's) [1]
>
> shared_buffers = 32MB (to use OS page cache)
>
> create table t_heap as select generate_series(1, 1) i;   --about 
> 3.4GB size
>
> SET track_io_timing = on;
>
> \timing
>
> set max_parallel_workers_per_gather = 0;  --0 to 6
>
> SELECT count(*) from t_heap;
> EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) from t_heap;
>
> [Summary]
> I used the same query from the thread. However, the sql query execution time
> and query planner execution time results between the master and patched do
> not vary much.
> OTOH, in terms of I/O stats, I observed similar regression in both master
> and patched as we increase max_parallel_workers_per_gather.
>
> It could also be possible that each benchmark setting for
> max_parallel_workers_per_gather is affected by the previous result. IOW,
> later benchmark runs benefit from the data cached by previous runs at the
> OS level.
>

Yeah, I think to some extent that is visible in results because, after
patch, at 0 workers, the execution time is reduced significantly
whereas there is not much difference at other worker counts.  I think
for non-parallel case (0 workers), there shouldn't be any difference.
Also, I am not sure if there is any reason why after patch the number
of shared hits is improved, probably due to caching effects?

> Any advice?

I think recreating the database and restarting the server after each
run might help in getting consistent results.  Also, you might want to
take median of three runs.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: max_slot_wal_keep_size and wal_keep_segments

2020-07-17 Thread Fujii Masao




On 2020/07/14 20:30, David Steele wrote:

On 7/14/20 12:00 AM, Fujii Masao wrote:


The patch was no longer applied cleanly because of recent commit.
So I updated the patch. Attached.

Barring any objection, I will commit this patch.


This doesn't look right:

+   the  most recent megabytes
+   WAL files plus one WAL file are

How about:

+    megabytes of
+   WAL files plus one WAL file are


Thanks for the comment! Isn't it better to keep "most recent" part?
If so, what about either of the followings?

1.  megabytes of WAL files plus
one WAL file that were most recently generated are kept all time.

2.  megabytes +  bytes
of WAL files that were most recently generated are kept all time.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: renaming configure.in to configure.ac

2020-07-17 Thread Peter Eisentraut

On 2020-07-16 18:17, Tom Lane wrote:

Andres Freund  writes:

I think it'd be a good plan to adopt the beta on master.



We already have parts of it backpatched, there have been things we couldn't
easily do because of bugs in 2.69. There aren't that many changes to configure
in total, and particularly not in the back branches. So I think it'd be ok
overhead-wise.


Yeah.  Because we'd want to rip out those hacks, it's not quite as simple
as "regenerate configure with this other autoconf version"; somebody will
have to do some preliminary investigation and produce a patch for the
autoconf input files.  Peter, were you intending to do that?


Okay, let's take a look.  Attached is a patch series.

v1-0001-Rename-configure.in-to-configure.ac.patch

This is unsurprising.

v1-0002-Update-to-Autoconf-2.69b.patch.bz2

This runs auto(re)conf 2.69b and cleans up a minimal amount of obsoleted 
stuff.


The bulk of the changes in the produced configure are from the change 
from echo to printf.  Not much else that's too interesting.  I think a 
lot of the compatibility/testing advisories relate to the way you write 
your configure.ac, not so much to the produced shell code.


v1-0003-Remove-check_decls.m4-obsoleted-by-Autoconf-updat.patch

This is something we had backported and is now no longer necessary. 
Note that there are no significant changes in the produced configure, 
which is good.


v1-0004-configure.ac-Remove-_DARWIN_USE_64_BIT_INODE-hack.patch

This is also something that has been obsoleted.

I'm not immediately aware of anything else that can be removed, cleaned, 
or simplified.


One thing that's annoying is that the release notes claim that configure 
should now be faster, and some of the changes they have made should 
support that, but my (limited) testing doesn't bear that out.  Most 
notably, the newly arisen test


checking for g++ option to enable C++11 features... none needed

takes approximately 10 seconds(!) on my machine (for one loop, since 
"none needed"; good luck if you need more than none).


This clearly depends on a lot of specifics of the environment, so some 
more testing would be useful.  This is perhaps something we can 
construct some useful feedback for.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From f13940654f84b8fd0954408c69d5f64134fa5045 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 17 Jul 2020 10:26:27 +0200
Subject: [PATCH v1 1/4] Rename configure.in to configure.ac

---
 config/general.m4|  2 +-
 configure.in => configure.ac |  4 ++--
 src/backend/catalog/Makefile |  4 ++--
 src/include/pg_config.h.in   |  2 +-
 src/tools/msvc/Solution.pm   | 10 +-
 src/tools/version_stamp.pl   | 12 ++--
 6 files changed, 17 insertions(+), 17 deletions(-)
 rename configure.in => configure.ac (99%)

diff --git a/config/general.m4 b/config/general.m4
index 95d65ceb09..140b9737bf 100644
--- a/config/general.m4
+++ b/config/general.m4
@@ -8,7 +8,7 @@
 # argument (other than "yes/no"), etc.
 #
 # The point of this implementation is to reduce code size and
-# redundancy in configure.in and to improve robustness and consistency
+# redundancy in configure.ac and to improve robustness and consistency
 # in the option evaluation code.
 
 
diff --git a/configure.in b/configure.ac
similarity index 99%
rename from configure.in
rename to configure.ac
index 2e05ce2e4d..3f49467ccd 100644
--- a/configure.in
+++ b/configure.ac
@@ -1,5 +1,5 @@
 dnl Process this file with autoconf to produce a configure script.
-dnl configure.in
+dnl configure.ac
 dnl
 dnl Developers, please strive to achieve this order:
 dnl
@@ -21,7 +21,7 @@ AC_INIT([PostgreSQL], [14devel], 
[pgsql-b...@lists.postgresql.org], [], [https:/
 
 m4_if(m4_defn([m4_PACKAGE_VERSION]), [2.69], [], [m4_fatal([Autoconf version 
2.69 is required.
 Untested combinations of 'autoconf' and PostgreSQL versions are not
-recommended.  You can remove the check from 'configure.in' but it is then
+recommended.  You can remove the check from 'configure.ac' but it is then
 your responsibility whether the result works or not.])])
 AC_COPYRIGHT([Copyright (c) 1996-2020, PostgreSQL Global Development Group])
 AC_CONFIG_SRCDIR([src/backend/access/common/heaptuple.c])
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index 9499bb33e5..93cf6d4368 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -103,10 +103,10 @@ generated-header-symlinks: 
$(top_builddir)/src/include/catalog/header-stamp
 # won't update them if they didn't change (to avoid unnecessary recompiles).
 # Technically, this should depend on Makefile.global which supplies
 # $(MAJORVERSION); but then genbki.pl would need to be re-run after every
-# configure run, even in distribution tarballs.  So depending on configure.in
+# configure run, even in distribution tarballs.  So depending on configure.ac
 # instead is cheating

RE: Transactions involving multiple postgres foreign servers, take 2

2020-07-17 Thread tsunakawa.ta...@fujitsu.com
From: Laurenz Albe 
> On Fri, 2020-07-17 at 05:21 +, tsunakawa.ta...@fujitsu.com wrote:
> > And most importantly, do other major DBMSs, including Oracle, provide the
> API for
> > preparing a transaction?  In other words, will the FDWs other than
> postgres_fdw
> > really be able to take advantage of the new FDW functions to join the 2PC
> processing?
> > I think we need to confirm that there are concrete examples.
> 
> I bet they do.  There is even a standard for that.

If you're thinking of xa_prepare() defined in the X/Open XA specification, we 
need to be sure that other FDWs can really utilize this new 2PC mechanism.  
What I'm especially wondering is when the FDW can call xa_start().


Regards
Takayuki Tsunakawa




Re: Parallel worker hangs while handling errors.

2020-07-17 Thread Bharath Rupireddy
>
> Leader process then identifies that there are some messages that need
> to be processed, it copies the messages and sets the latch so that the
> worker process can copy the remaining message from the below function:
> shm_mq_inc_bytes_read -> SetLatch(&sender->procLatch);, Worker is not
> able to receive any signal at this point of time & hangs infinitely
> Worker hangs in this case because when the worker is started the
> signals will be masked using sigprocmask. Unblocking of signals is
> done by calling BackgroundWorkerUnblockSignals in ParallelWorkerMain.
> Now due to error handling the worker has jumped to setjmp in
> StartBackgroundWorker function. Here the signals are in a blocked
> state, hence the signal is not received by the worker process.
>

Your analysis looks fine to me.

Adding some more info:

The worker uses SIGUSR1 (with a special shared memory flag
PROCSIG_PARALLEL_MESSAGE) both for error message sharing (from
mq_putmessage) and for parallel worker shutdown (from
ParallelWorkerShutdown).

And yes, the postmaster blocks SIGUSR1 before forking the bgworker (in
PostmasterMain with pqinitmask() and PG_SETMASK(&BlockSig)). The bgworker
receives the same blocked signal mask for itself and enters
StartBackgroundWorker(), uses sigsetjmp for error handling, and then
goes to ParallelWorkerMain() where a few of the signal handlers are set
and then BackgroundWorkerUnblockSignals() is called to stop blocking any
signals.

But observe when we did sigsetjmp the state of the signal mask is that
of we received from postmaster which is all signals blocked.

So, now in error cases, when control jumps to sigsetjmp, we
still have the blocked signal mask (with SIGUSR1 blocked too), and in the
code path of EmitErrorReport we do send SIGUSR1 with the flag
PROCSIG_PARALLEL_MESSAGE to the leader/backend and wait for the latch
to be set; this happens only if the worker is able to receive back
SIGUSR1 from the leader/backend.

In this reported issue, since SIGUSR1 is blocked at sigsetjmp in
StartBackgroundWorker(), there is no way the worker process can
receive it from the leader, so the latch cannot be set and hence the
hang occurs.

The same hang issue can occur (though I'm not able to back it up with a
use case) wherever EmitErrorReport() is called from an
"if (sigsetjmp(local_sigjmp_buf, 1) != 0)" block, such as in
autovacuum.c, bgwriter.c, bgworker.c, checkpointer.c, walwriter.c, and
postgres.c.

>
> One of the fixes could be to call BackgroundWorkerUnblockSignals just
> after sigsetjmp. I'm not sure if this is the best solution.
> Robert & myself had a discussion about the problem yesterday. We felt
> this is a genuine problem with the parallel worker error handling and
> need to be fixed.
>

Note that, in all sigsetjmp blocks, we intentionally
HOLD_INTERRUPTS() so as not to cause any issues while performing error
handling.  I'm concerned here that if we directly call
BackgroundWorkerUnblockSignals(), which will open up all the signals,
our main intention of holding interrupts or blocking signals may go
away.

Since the main cause of this hang issue is the blocked SIGUSR1, can't we,
in the sigsetjmp block, unblock only SIGUSR1 instead of unblocking all
signals? I tried this with the parallel copy hang and the issue is
resolved.

Something like below -

if (sigsetjmp(local_sigjmp_buf, 1) != 0)
{
    sigset_t    x;

    /* Unblock only SIGUSR1, which was blocked at sigsetjmp time */
    sigemptyset(&x);
    sigaddset(&x, SIGUSR1);
    sigprocmask(SIG_UNBLOCK, &x, NULL);

    /* Since not using PG_TRY, must reset error stack by hand */
    error_context_stack = NULL;

    /* Prevent interrupts while cleaning up */
    HOLD_INTERRUPTS();

If the above approach is okay, we can put that sigprocmask(SIG_UNBLOCK, ...)
piece of code (generalized to unblock any given signal) into a macro
similar to PG_SETMASK() and use it in all the places where
EmitErrorReport() is called from a sigsetjmp block. We should mind the
portability of sigprocmask as well.
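
To make the idea concrete, a rough sketch of such a macro could look
like the below. The name PG_UNBLOCK_SIGNAL is only my placeholder for
illustration, not existing code; it needs <signal.h>, which backend
code already includes:

/*
 * Unblock a single signal in the calling process, along the lines of
 * PG_SETMASK().  Name and placement are illustrative only.
 */
#define PG_UNBLOCK_SIGNAL(signo) \
    do { \
        sigset_t    unblock_set_; \
        sigemptyset(&unblock_set_); \
        sigaddset(&unblock_set_, (signo)); \
        sigprocmask(SIG_UNBLOCK, &unblock_set_, NULL); \
    } while (0)

The sigsetjmp block above would then simply start with
PG_UNBLOCK_SIGNAL(SIGUSR1).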


With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Is it useful to record whether plans are generic or custom?

2020-07-17 Thread Fujii Masao



On 2020/07/16 11:50, torikoshia wrote:

On 2020-07-15 11:44, Fujii Masao wrote:

On 2020/07/14 21:24, torikoshia wrote:

On 2020-07-10 10:49, torikoshia wrote:

On 2020-07-08 16:41, Fujii Masao wrote:

On 2020/07/08 10:14, torikoshia wrote:

On 2020-07-06 22:16, Fujii Masao wrote:

On 2020/06/11 14:59, torikoshia wrote:

On 2020-06-10 18:00, Kyotaro Horiguchi wrote:



+    TupleDescInitEntry(tupdesc, (AttrNumber) 8, "last_plan",

This could be a problem if we showed the last plan in this view. I
think "last_plan_type" would be better.

+    if (prep_stmt->plansource->last_plan_type == PLAN_CACHE_TYPE_CUSTOM)
+    values[7] = CStringGetTextDatum("custom");
+    else if (prep_stmt->plansource->last_plan_type == PLAN_CACHE_TYPE_GENERIC)
+    values[7] = CStringGetTextDatum("generic");
+    else
+    nulls[7] = true;

Using switch-case prevents future additional types (if any) from being
unhandled.  I think we are recommending that as a convention.


Thanks for your reviewing!

I've attached a patch that reflects your comments.


Thanks for the patch! Here are the comments.


Thanks for your review!


+    Number of times generic plan was choosen
+    Number of times custom plan was choosen

Typo: "choosen" should be "chosen"?


Thanks, fixed them.


+  
+   last_plan_type text
+  
+  
+    Tells the last plan type was generic or custom. If the prepared
+    statement has not executed yet, this field is null
+  

Could you tell me how this information is expected to be used?
I think that generic_plans and custom_plans are useful when investigating
the cause of a performance drop under cached-plan mode. But I failed to
see how useful last_plan_type is.


This may be an exceptional case, but I once had a case where I needed
to check whether a generic or custom plan was chosen for specific
queries in a development environment.


In your case, you probably had to ensure that the last several (or every)
executions chose a generic or custom plan? If so, I'm afraid that
displaying only the last plan mode is not enough for your case, no?
So it seems better to check the generic_plans or custom_plans columns in
the view rather than last_plan_type, even in your case. Thoughts?


Yeah, I now feel last_plan is not really necessary and the numbers of
generic/custom plans are enough.

If there are no objections, I'm going to remove this column and related codes.


As mentioned, I removed the last_plan column.


Thanks for updating the patch! It basically looks good to me.

I have one comment; you added the regression tests for generic and
custom plans into prepare.sql. But similar tests already exist in
plancache.sql. So isn't it better to add the tests for the generic_plans
and custom_plans columns into plancache.sql?



Thanks for your comments!

Agreed.
I removed the tests from prepare.sql and added them to plancache.sql.
Thoughts?


Thanks for updating the patch!
I also applied the following minor changes to the patch.

-Number of times generic plan was chosen
+   Number of times generic plan was chosen
-Number of times custom plan was chosen
+   Number of times custom plan was chosen

I got rid of one space character before those descriptions because
they should start at the 7th character position.

 -- but we can force a custom plan
 set plan_cache_mode to force_custom_plan;
 explain (costs off) execute test_mode_pp(2);
+select name, generic_plans, custom_plans from pg_prepared_statements
+  where  name = 'test_mode_pp';

In the regression test, I added a read of pg_prepared_statements after
the last execution of the test query, to confirm from that view that a
custom plan is used when force_custom_plan is set.

I changed the status of this patch to "Ready for Committer" in CF.

Barring any objection, I will commit this patch.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index e9cdff4864..de69487bec 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -10830,6 +10830,24 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
frontend/backend protocol
   
  
+
+ 
+  
+   generic_plans int8
+  
+  
+   Number of times generic plan was chosen
+  
+ 
+
+ 
+  
+   custom_plans int8
+  
+  
+   Number of times custom plan was chosen
+  
+ 
 

   
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 80d6df8ac1..4b18be5b27 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -694,7 +694,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
 
 /*
  * This set returning function reads all the prepared statements and
- * returns a set of (name, statemen