Re: Use of backup_label not noted in log

2023-11-20 Thread Laurenz Albe
On Mon, 2023-11-20 at 11:03 -0800, Andres Freund wrote:
> > If we add a message for starting with "backup_label", shouldn't
> > we also add a corresponding message for starting from a checkpoint
> > found in the control file?  If you see that in a problem report,
> > you immediately know what is going on.
> 
> Maybe - the reason I hesitate on that is that emitting an additional log
> message when starting from a base backup just adds something "once in the
> lifetime of a node". Whereas emitting something every start obviously doesn't
> impose any limit.

The message should only be shown if PostgreSQL replays WAL, that is,
after a crash.  That would (hopefully) make it a rare message too.

Yours,
Laurenz Albe




Re: About #13489, array dimensions and CREATE TABLE ... LIKE

2023-11-21 Thread Laurenz Albe
On Mon, 2023-11-20 at 21:13 -0500, Tom Lane wrote:
> Bruce Momjian  writes:
> > On Mon, Nov 20, 2023 at 09:04:21PM -0500, Tom Lane wrote:
> > > Bruce Momjian  writes:
> > > > An alternate approach would
> > > > be to remove pg_attribute.attndims so we don't even try to preserve 
> > > > dimensionality.
> 
> > > I could get behind that, perhaps.  It looks like we're not using the
> > > field in any meaningful way, and we could simplify TupleDescInitEntry
> > > and perhaps some other APIs.
> 
> > So should I work on that patch or do you want to try?  I think we should
> > do something.
> 
> Let's wait for some other opinions, first ...

Looking at the code, I get the impression that we wouldn't lose anything
without "pg_attribute.attndims", so +1 for removing it.

This would call for some documentation.  We should remove most of the
documentation about the nonexistent difference between declaring a column
"integer[]", "integer[][]" or "integer[3][3]" and just describe the first
variant in detail, perhaps mentioning that the other notations are
accepted for backward compatibility.
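
For illustration, all three notations already produce the same column type:

    CREATE TABLE dim_demo (a integer[], b integer[][], c integer[3][3]);

    SELECT attname, format_type(atttypid, atttypmod) AS coltype
    FROM pg_attribute
    WHERE attrelid = 'dim_demo'::regclass AND attnum > 0;
    -- all three columns are reported as "integer[]"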

I also think that it would be helpful to emphasize that while dimensionality
does not matter to a column definition, it matters for individual array values.
Perhaps it would make sense to recommend a check constraint if one wants
to make sure that an array column contains only a certain kind of array.
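
A sketch of such a check constraint (with a made-up table; note that
array_ndims() returns NULL for an empty array, which a CHECK constraint
treats as satisfied):

    CREATE TABLE vectors (
        v integer[] CHECK (array_ndims(v) = 1 AND cardinality(v) = 3)
    );

    INSERT INTO vectors VALUES ('{1,2,3}');        -- ok
    INSERT INTO vectors VALUES ('{{1,2},{3,4}}');  -- violates the constraint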

Yours,
Laurenz Albe




Re: Return value of pg_promote()

2023-08-28 Thread Laurenz Albe
On Thu, 2023-08-17 at 09:37 +0900, Michael Paquier wrote:
> I have just noticed that we do not have a CF entry for this proposal,
> so I have added one with Laurenz as author:
> https://commitfest.postgresql.org/44/4504/

I have changed the author to Fujii Masao.

Yours,
Laurenz Albe




Re: Disabling Heap-Only Tuples

2023-08-28 Thread Laurenz Albe
On Thu, 2023-08-24 at 18:23 +0200, Matthias van de Meent wrote:
> On Wed, 19 Jul 2023 at 15:13, Thom Brown  wrote:
> > 
> > On Wed, 19 Jul 2023, 13:58 Laurenz Albe,  wrote:
> > > I agree that the name "max_local_update" could be improved.
> > > Perhaps "avoid_hot_above_size_mb".
> > 
> > Or "hot_table_size_threshold" or "hot_update_limit"?
> 
> Although I like these names, it doesn't quite cover the use of the
> parameter for me, as updated tuples prefer to be inserted on the same
> page as the old tuple regardless of whether HOT applies.
> 
> How about 'local_update_limit'?

I agree with your concern.  I cannot think of a better name than yours.

Yours,
Laurenz Albe




Re: Disabling Heap-Only Tuples

2023-09-05 Thread Laurenz Albe
On Wed, 2023-08-30 at 09:31 -0400, Robert Haas wrote:
> On Wed, Aug 30, 2023 at 9:01 AM Matthias van de Meent
>  wrote:
> > I've reworked the patch a bit to remove the "excessive bloat with low
> > fillfactors when local space is available" issue that this parameter
> > could cause - local updates are now done if the selected page we would
> > be inserting into is after the old tuple's page and the old tuple's
> > page still (or: now) has space available.
> > 
> > Does that alleviate your concerns?
> 
> That seems like a good change, but my core concern is around people
> having to micromanage local_update_limit, and probably either not
> knowing how to do it properly, or not being able or willing to keep
> updating it as things change.
> 
> In a way, this parameter is a lot like work_mem, which is notoriously
> very difficult to tune.

I don't think that is a good comparison.  While most people probably
never need to touch "local_update_limit", "work_mem" is something everybody
has to consider.

And it is not so hard to tune: the setting would be the desired table
size, and you could use pgstattuple to find a good value.
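
For example, something along these lines (a rough sketch; the table name and
the 20% headroom are assumptions, and 8192 is the default block size):

    CREATE EXTENSION IF NOT EXISTS pgstattuple;

    SELECT pg_relation_size('mytab') / 8192 AS current_pages,
           ceil(pg_relation_size('mytab') / 8192.0
                * tuple_percent / 100.0 * 1.2) AS suggested_limit_pages
    FROM pgstattuple('mytab');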

I don't know what other use cases there may be, but I see it as a tool to
shrink a table after it has grown big holes, perhaps after a mass delete.
Today, you can only VACUUM (FULL) or play with the likes of pg_squeeze and
pg_repack.

I think this is useful.

To alleviate your concerns, perhaps it would help to describe the use case
and ideas for a good setting in the documentation.

Yours,
Laurenz Albe




Re: document the need to analyze partitioned tables

2023-09-05 Thread Laurenz Albe
Sorry for dropping the ball on this; I'll add it to the next commitfest.

On Wed, 2023-01-25 at 21:43 +1300, David Rowley wrote:
> > I think your first sentence is a bit clumsy and might be streamlined to
> > 
> >   Partitioned tables do not directly store tuples and consequently do not
> >   require autovacuum to perform any VACUUM operations.
> 
> That seems better than what I had.

Ok, I went with it.

> > Also, I am a little bit unhappy about
> > 
> > 1. Your paragraph states that partitioned tables need no autovacuum,
> >    but doesn't state unmistakably that they will never be treated
> >    by autovacuum.
> 
> hmm. I assume the reader realises from the text that lack of any
> tuples means VACUUM is not required.  The remaining part of what
> autovacuum does not do is explained when the text goes on to say that
> ANALYZE operations are also not performed on partitioned tables. I'm
> not sure what is left that's mistakable there.

I rewrote the paragraph a little so that it looks clearer to me.
I hope it is OK for you as well.

> > 2. You make a distinction between table partitions and "normal tables",
> >    but really there is no distinction.
> 
> We may have different mental models here. This relates to the part
> that I wasn't keen on in your patch, i.e:
> 
> +    The partitions of a partitioned table are normal tables and get processed
> +    by autovacuum
> 
> While I agree that the majority of partitions are likely to be
> relkind='r', which you might ordinarily consider a "normal table", you
> just might change your mind when you try to INSERT or UPDATE records
> that would violate the partition constraint. Some partitions might
> also be themselves partitioned tables and others might be foreign
> tables. That does not really matter much when it comes to what
> autovacuum does or does not do, but I'm not really keen to imply in
> our documents that partitions are "normal tables".

Agreed, there are differences between partitions and normal tables.
And this is not the place in the documentation where we would like to
get into detail about the differences.

Attached is the next version of my patch.

Yours,
Laurenz Albe
From 33ef30888b5f5b57c776a1bd00065e0c94daccdb Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Wed, 6 Sep 2023 05:43:30 +0200
Subject: [PATCH] Improve autovacuum doc on partitioned tables

The documentation mentioned that autovacuum doesn't process
partitioned tables, but it was unclear about the impact.
The old wording could be interpreted to mean that there are
problems with dead tuple cleanup on partitioned tables.
Clarify that the only potential problem is autoanalyze, and
that statistics for the partitions will be gathered.

Author: Laurenz Albe
Reviewed-by: Nathan Bossart, Justin Pryzby
Discussion: https://postgr.es/m/1fd81ddc7710a154834030133c6fea41e55c8efb.camel%40cybertec.at
---
 doc/src/sgml/maintenance.sgml | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 9cf9d030a8..10b1f211e8 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -861,10 +861,18 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu

 

-Partitioned tables are not processed by autovacuum.  Statistics
-should be collected by running a manual ANALYZE when it is
-first populated, and again whenever the distribution of data in its
-partitions changes significantly.
+Partitioned tables do not directly store tuples and consequently do not
+require autovacuum to perform any VACUUM operations.
+Autovacuum performs a VACUUM on the partitioned
+table's partitions the same as it does with other tables.  While that
+works fine, the fact that autovacuum doesn't process partitioned tables
+also means that it doesn't run ANALYZE on them, and this
+can be problematic, as there are various places in the query planner that
+attempt to make use of table statistics for partitioned tables when
+partitioned tables are queried.  For now, you can work around this problem
+by running a manual ANALYZE command on the partitioned
+table when the partitioned table is first populated, and again whenever
+the distribution of data in its partitions changes significantly.

 

-- 
2.41.0



Re: to_regtype() Raises Error

2023-09-17 Thread Laurenz Albe
On Mon, 2023-09-18 at 00:57 +0200, Vik Fearing wrote:
> On 9/18/23 00:41, Erik Wienhold wrote:
> > On 18/09/2023 00:13 CEST David E. Wheeler  wrote:
> > > The docs for `to_regtype()` say, “this function will return NULL rather 
> > > than
> > > throwing an error if the name is not found.” And it’s true most of the 
> > > time:
> > > 
> > > david=# select to_regtype('foo'), to_regtype('clam');
> > >   to_regtype | to_regtype
> > > ------------+------------
> > >   [null] | [null]
> > > 
> > > But not others:
> > > 
> > > david=# select to_regtype('inteval second');
> > > ERROR:  syntax error at or near "second"
> > > LINE 1: select to_regtype('inteval second');
> > >  ^
> > > CONTEXT:  invalid type name "inteval second"
> > 
> > Probably a typo and you meant 'interval second' which works.
> No, that is precisely the point.  The result should be null instead of 
> an error.

Right.  I debugged into this, and found this comment on typeStringToTypeName():

 * If the string cannot be parsed as a type, an error is raised,
 * unless escontext is an ErrorSaveContext node, in which case we may
 * fill that and return NULL.  But note that the ErrorSaveContext option
 * is mostly aspirational at present: errors detected by the main
 * grammar, rather than here, will still be thrown.

"escontext" is an ErrorSaveContext node, and it is the parser failing.

Not sure if we can do anything about that or if it is worth the effort.

Perhaps the documentation could reflect the implementation.

Yours,
Laurenz Albe




Re: Disabling Heap-Only Tuples

2023-09-18 Thread Laurenz Albe
On Mon, 2023-09-18 at 12:22 -0400, Robert Haas wrote:
> On Tue, Sep 5, 2023 at 11:15 PM Laurenz Albe  wrote:
> > I don't think that is a good comparison.  While most people probably
> > never need to touch "local_update_limit", "work_mem" is something everybody
> > has to consider.
> > 
> > And it is not so hard to tune: the setting would be the desired table
> > size, and you could use pgstattuple to find a good value.
> 
> What I suspect would happen, though, is that you'd end up tuning the
> value over and over. You'd set it to some value and after some number
> of vacuums maybe you'd realize that you could save even more disk
> space if you reduced it a bit further or maybe your data set would
> grow a bit and you'd have to increase it a little (or a lot). And if
> you didn't keep adjusting it then maybe something quite bad would
> happen to your database.

There is that risk, yes.

> work_mem isn't quite the same [...] But what is the same here and in the case
> of work_mem is that you can suddenly get hosed if the situation
> changes substantially and you don't respond by updating the parameter
> setting. In the case of work_mem, again in my experience, it's quite
> common for people to suddenly find themselves in a lot of trouble if
> they have a load spike, because now they're running a lot more copies
> of the same query and the machine runs out of memory.

So the common ground is "both parameters are not so easy to get right,
and if you get them wrong, it's a problem".  For me the big difference is
that while you pretty much have to tune "work_mem", you can normally just ignore
"local_update_limit".

> The equivalent
> problem here would be if the table suddenly gets a lot bigger due to a
> load spike or some change in the way the application is used. Then
> suddenly, a setting that was previously serving to keep the table
> pleasantly small and un-bloated on disk is instead causing tons of
> updates that would have been HOT to become non-HOT, which could very
> easily result in both the table and its indexes bloating quite
> rapidly. I really don't like the idea of an anti-bloat feature that,
> when set to the wrong value, becomes a bloat-amplification feature. I
> don't know how to describe that other than "fragile and dangerous."

Yes, you can hurt yourself that way.  But that applies to many other
settings as well.  You can tank your performance with a bad value for
"commit_delay", "hot_standby_feedback" can bloat your primary, and
so on.  Still, we consider these useful parameters.

> Imagine a hypothetical feature that knew how small the table could
> reasonably be kept, say by magic, and did non-HOT updates instead of
> HOT updates whenever doing so would allow moving a tuple from a page
> beyond that magical boundary to an earlier page. Such a feature would
> not have the downsides that this one does -- if there were
> opportunities to make the table smaller, the system would take
> advantage of them automatically, and if the table grew, the system
> would automatically become more relaxed to stay out of trouble. Such a
> feature is clearly more work to design and implement than what is
> proposed here, but it would also work a lot better in practice.

That sounds a bit like we should not have "shared_buffers" unless we
have a magical tool built in that gets the value right automatically.
Yes, the better is the enemy of the good.  You can kill everything with
a line of reasoning like that.

> In
> fact, I daresay that if we accept the feature as proposed, somebody's
> going to go out and write a tool to calculate what the threshold ought
> to be and automatically adjust it as things change. Users of the tool
> will then divide into two camps:
> 
> - People who try to tune it manually and get burned if anything
> changes on their system.
> - People who use that out-of-core tool.
> 
> So the out-of-core tool that does this tuning becomes a stealth
> dependency for any user who is facing this problem. Gosh, don't we
> have enough of those already? Connection pooling being perhaps the
> most obvious example, but far from the only one.

I cannot follow you there.  What I envision is that "local_update_limit"
is not set permanently on a table.  You set it when you realize your table
got bloated.  Then you wait until the bloat goes away or you launch a
couple of UPDATEs that eventually shrink the table.  Then you reset
"local_update_limit" again.
It's a more difficult, but less invasive alternative to VACUUM (FULL).
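
As a sketch of that workflow (all names and numbers are made up, and I am
assuming the reloption takes a table size in pages):

    ALTER TABLE big_table SET (local_update_limit = 10000);
    -- no-op updates relocate tuples from the tail of the table
    UPDATE big_table SET id = id WHERE ctid >= '(10000,0)';
    -- vacuum truncation can now cut off the empty tail
    VACUUM big_table;
    ALTER TABLE big_table RESET (local_update_limit);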

If a setting is hard to understand and hard to get right, we could invest
in good documentation that explains the use cases and pitfalls.
Wouldn't that go a long way towards defusing this perceived footgun?
I am aware that a frightening number of users don't read documentation,
but I find it hard to believe that anyone would twiddle a non-obvious
knob like "local_update_limit" without first trying to figure out what
it actually does.

Yours,
Laurenz Albe




Re: Disabling Heap-Only Tuples

2023-09-19 Thread Laurenz Albe
On Tue, 2023-09-19 at 14:50 -0400, Robert Haas wrote:
> But I know people will try to use it for instant compaction too, and
> there it's worth remembering why we removed old-style VACUUM FULL. The
> main problem is that it was mind-bogglingly slow. The other really bad
> problem is that it caused massive index bloat. I think any system
> that's based on moving around my tuples right now to make my table
> smaller right now is likely to have similar issues.

I had the same feeling that this is sort of bringing back old-style
VACUUM (FULL).  But I don't think that index bloat is a show stopper
these days, when we have REINDEX CONCURRENTLY, so I am not worried.

Yours,
Laurenz Albe




Re: Disabling Heap-Only Tuples

2023-09-19 Thread Laurenz Albe
On Tue, 2023-09-19 at 12:52 -0400, Robert Haas wrote:
> On Tue, Sep 19, 2023 at 12:30 PM Alvaro Herrera  
> wrote:
> > I was thinking something vaguely like "a table size that's roughly what
> > an optimal autovacuuming schedule would leave the table at" assuming 0.2
> > vacuum_scale_factor.  You would determine the absolute minimum size for
> > the table given the current live tuples in the table, then add 20% to
> > account for a steady state of dead tuples and vacuumed space.  So it's
> > not 1.2x of the "current" table size at the time the local_update_limit
> > feature is installed, but 1.2x of the optimal table size.
> 
> Right, that would be great. And honestly if that's something we can
> figure out, then why does the parameter even need to be an integer
> instead of a Boolean? If the system knows the optimal table size, then
> the user can just say "try to compact this table" and need not say to
> what size. The 1.2 multiplier is probably situation dependent and
> maybe the multiplier should indeed be a configuration parameter, but
> we would be way better off if the absolute size didn't need to be.

I don't have high hopes for a reliable way to automatically determine
the target table size.  There are these queries floating around to estimate
table bloat, which are used by various monitoring systems.  I find that they
get it right a lot of the time, but sometimes they get it wrong.  Perhaps
we can do better than that, but I vastly prefer a setting that I can control
(even at the danger that I can misconfigure it) over an automatism that I
cannot control and that sometimes gets it wrong.

I like Alvaro's idea to automatically reset "local_update_limit" when the
table has shrunk enough.  Why not perform that task during vacuum truncation?
If vacuum truncation has taken place, check if the table size is no bigger
than "local_update_limit" * (1 + "autovacuum_vacuum_scale_factor"), and if
it is no bigger, reset "local_update_limit".  That way, we would not have
to worry about a lock, because vacuum truncation already has the table locked.

Yours,
Laurenz Albe




Regression in COPY FROM caused by 9f8377f7a2

2023-09-25 Thread Laurenz Albe
In v16 and later, the following fails:

CREATE TABLE boom (t character varying(5) DEFAULT 'a long string');

COPY boom FROM STDIN;
ERROR:  value too long for type character varying(5)

In PostgreSQL v15 and earlier, the COPY statement succeeds.

The error is thrown in BeginCopyFrom in line 1578 (HEAD)

  defexpr = expression_planner(defexpr);

Bisecting shows that the regression was introduced by commit 9f8377f7a2,
which introduced DEFAULT values for COPY FROM.

The table definition is clearly silly, so I am not sure if that
regression is worth fixing.  On the other hand, it is not cool if
something that worked without an error in v15 starts to fail later on.

Yours,
Laurenz Albe




Re: Regression in COPY FROM caused by 9f8377f7a2

2023-09-25 Thread Laurenz Albe
On Mon, 2023-09-25 at 09:54 +0200, Laurenz Albe wrote:
> In v16 and later, the following fails:
> 
> CREATE TABLE boom (t character varying(5) DEFAULT 'a long string');
> 
> COPY boom FROM STDIN;
> ERROR:  value too long for type character varying(5)
> 
> In PostgreSQL v15 and earlier, the COPY statement succeeds.
> 
> The error is thrown in BeginCopyFrom in line 1578 (HEAD)
> 
>   defexpr = expression_planner(defexpr);
> 
> Bisecting shows that the regression was introduced by commit 9f8377f7a2,
> which introduced DEFAULT values for COPY FROM.

I suggest the attached fix, which evaluates default values only if
the DEFAULT option was specified or if the column does not appear in
the column list of COPY.

Yours,
Laurenz Albe
From 4af982c56df57a1a0f04340d394c297559fbabb5 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Mon, 25 Sep 2023 10:56:15 +0200
Subject: [PATCH] Evaluate defaults in COPY FROM only if necessary

Since commit 9f8377f7a2, we evaluate the column default values in
COPY FROM for all columns except generated ones, because they could
be needed if the input value matches the DEFAULT option.

This can cause a surprising regression:

  CREATE TABLE boom (t character varying(5) DEFAULT 'a long string');
  COPY boom FROM STDIN;
  ERROR:  value too long for type character varying(5)

This worked before 9f8377f7a2, since default values were only
evaluated for columns that were not specified in the column list.

To fix, fetch the default values only if the DEFAULT option was
specified or for columns not specified in the column list.
---
 src/backend/commands/copyfrom.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 70871ed819..320b764aa9 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1567,7 +1567,14 @@ BeginCopyFrom(ParseState *pstate,
 		/* Get default info if available */
 		defexprs[attnum - 1] = NULL;
 
-		if (!att->attgenerated)
+		/*
+		 * We need the default values only for columns that do not appear in the
+		 * column list or if the DEFAULT option was given.  We also don't need
+		 * it for generated columns.
+		 */
+		if ((!list_member_int(cstate->attnumlist, attnum) ||
+			 cstate->opts.default_print != NULL) &&
+			!att->attgenerated)
 		{
 			Expr	   *defexpr = (Expr *) build_column_default(cstate->rel,
 															attnum);
-- 
2.41.0



Re: Regression in COPY FROM caused by 9f8377f7a2

2023-09-25 Thread Laurenz Albe
On Mon, 2023-09-25 at 17:49 -0400, Tom Lane wrote:
> Andrew Dunstan  writes:
> > On 2023-09-25 Mo 11:06, Andrew Dunstan wrote:
> > > On 2023-09-25 Mo 04:59, Laurenz Albe wrote:
> > > > CREATE TABLE boom (t character varying(5) DEFAULT 'a long string');
> 
> > Thinking about this a little more, wouldn't it be better if we checked 
> > at the time we set the default that the value is actually valid for the 
> > given column? This is only one manifestation of a problem you could run 
> > into given this table definition.
> 
> I dunno, it seems at least possible that someone would do this
> deliberately as a means of preventing the column from being defaulted.
> In any case, the current behavior has stood for a very long time and
> no one has complained that an error should be thrown sooner.

Moreover, this makes restoring a pg_dump from v15 to v16 fail, which
should never happen.  This is how I got that bug report.

Yours,
Laurenz Albe




Re: Regression in COPY FROM caused by 9f8377f7a2

2023-09-26 Thread Laurenz Albe
Here is an improved version of the patch with regression tests.

Yours,
Laurenz Albe
From 71744ada1e2c8cfdbb57e03018572a1af623b09e Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Tue, 26 Sep 2023 10:09:49 +0200
Subject: [PATCH] Evaluate defaults in COPY FROM only if necessary

Since commit 9f8377f7a2, we evaluate the column default values in
COPY FROM for all columns except generated ones, because they could
be needed if the input value matches the DEFAULT option.

This can cause a surprising regression:

  CREATE TABLE boom (t character varying(5) DEFAULT 'a long string');
  COPY boom FROM STDIN;
  ERROR:  value too long for type character varying(5)

This worked before 9f8377f7a2, since default values were only
evaluated for columns that were not specified in the column list.

To fix, fetch the default values only if the DEFAULT option was
specified or for columns not specified in the column list.
---
 src/backend/commands/copyfrom.c|  9 -
 src/test/regress/expected/copy.out | 17 +
 src/test/regress/sql/copy.sql  | 15 +++
 3 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 70871ed819..3f3e631dee 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1567,7 +1567,14 @@ BeginCopyFrom(ParseState *pstate,
 		/* Get default info if available */
 		defexprs[attnum - 1] = NULL;
 
-		if (!att->attgenerated)
+		/*
+		 * We need the default values only for columns that do not appear in the
+		 * column list.  But if the DEFAULT option was given, we may need all
+		 * column default values.  We never need defaults for generated columns.
+		 */
+		if ((cstate->opts.default_print != NULL ||
+			 !list_member_int(cstate->attnumlist, attnum)) &&
+			!att->attgenerated)
 		{
 			Expr	   *defexpr = (Expr *) build_column_default(cstate->rel,
 															attnum);
diff --git a/src/test/regress/expected/copy.out b/src/test/regress/expected/copy.out
index 8a8bf43fde..a5912c13a8 100644
--- a/src/test/regress/expected/copy.out
+++ b/src/test/regress/expected/copy.out
@@ -240,3 +240,20 @@ SELECT * FROM header_copytest ORDER BY a;
 (5 rows)
 
 drop table header_copytest;
+-- test COPY with overlong column defaults
+create temp table oversized_column_default (
+col1 varchar(5) DEFAULT 'more than 5 chars',
+col2 varchar(5));
+-- normal COPY should work
+copy oversized_column_default from stdin;
+-- error if the column is excluded
+copy oversized_column_default (col2) from stdin;
+ERROR:  value too long for type character varying(5)
+\.
+invalid command \.
+-- error if the DEFAULT option is given
+copy oversized_column_default from stdin (default '');
+ERROR:  value too long for type character varying(5)
+\.
+invalid command \.
+drop table oversized_column_default;
diff --git a/src/test/regress/sql/copy.sql b/src/test/regress/sql/copy.sql
index f9da7b1508..7fdb26d14f 100644
--- a/src/test/regress/sql/copy.sql
+++ b/src/test/regress/sql/copy.sql
@@ -268,3 +268,18 @@ a	c	b
 
 SELECT * FROM header_copytest ORDER BY a;
 drop table header_copytest;
+
+-- test COPY with overlong column defaults
+create temp table oversized_column_default (
+col1 varchar(5) DEFAULT 'more than 5 chars',
+col2 varchar(5));
+-- normal COPY should work
+copy oversized_column_default from stdin;
+\.
+-- error if the column is excluded
+copy oversized_column_default (col2) from stdin;
+\.
+-- error if the DEFAULT option is given
+copy oversized_column_default from stdin (default '');
+\.
+drop table oversized_column_default;
-- 
2.41.0



Re: document the need to analyze partitioned tables

2023-09-29 Thread Laurenz Albe
On Fri, 2023-09-29 at 18:08 -0400, Bruce Momjian wrote:
> On Wed, Sep  6, 2023 at 05:53:56AM +0200, Laurenz Albe wrote:
> > > We may have different mental models here. This relates to the part
> > > that I wasn't keen on in your patch, i.e:
> > > 
> > > +    The partitions of a partitioned table are normal tables and get 
> > > processed
> > > +    by autovacuum
> > > 
> > > While I agree that the majority of partitions are likely to be
> > > relkind='r', which you might ordinarily consider a "normal table", you
> > > just might change your mind when you try to INSERT or UPDATE records
> > > that would violate the partition constraint. Some partitions might
> > > also be themselves partitioned tables and others might be foreign
> > > tables. That does not really matter much when it comes to what
> > > autovacuum does or does not do, but I'm not really keen to imply in
> > > our documents that partitions are "normal tables".
> > 
> > Agreed, there are differences between partitions and normal tables.
> > And this is not the place in the documentation where we would like to
> > get into detail about the differences.
> > 
> > Attached is the next version of my patch.
> 
> I adjusted your patch to be shorter and clearer, patch attached.  I am
> planning to apply this back to PG 11.

Thanks for looking at this.

I am mostly fine with your version, but it does not directly state that
autovacuum does not process partitioned tables.  I think this should be
clarified in the beginning.

Yours,
Laurenz Albe




Re: SHARED locks barging behaviour

2023-09-29 Thread Laurenz Albe
On Fri, 2023-09-29 at 17:45 -0400, Bruce Momjian wrote:
> On Tue, Jan 17, 2023 at 12:18:28PM -0500, Arul Ajmani wrote:
> > I'm trying to better understand the following barging behaviour with SHARED
> > locks.
> ...
> > Given there is a transaction waiting to acquire a FOR UPDATE lock, I was
> > surprised to see the second FOR SHARE transaction return immediately 
> > instead of
> > waiting. I have two questions:
> > 
> > 1) Could this barging behaviour potentially starve out the transaction 
> > waiting
> > to acquire the FOR UPDATE lock, if there is a continuous queue of 
> > transactions
> > that acquire a FOR SHARE lock briefly?
> 
> Yes, see below.
> 
> > 2) Assuming this is by design, I couldn't find (in code) where this explicit
> > policy choice is made. I was looking around LockAcquireExtended, but it 
> > seems
> > like the decision is made above this layer. Could someone more familiar with
> > this code point me at the right place? 
> 
> I know this is from January, but I do have an answer.  [...]

You answer the question where this is implemented.  But the more important
question is whether this is intentional.  This code was added by 0ac5ad5134f
(introducing FOR KEY SHARE and FOR NO KEY UPDATE).  My feeling is that it is
not intentional that a continuous stream of share row locks can starve out
an exclusive row lock, since PostgreSQL behaves differently with other locks.

On the other hand, if nobody has complained about it in these ten years, perhaps
it is just fine the way it is, whether by design or not.

Yours,
Laurenz Albe




Re: document the need to analyze partitioned tables

2023-10-01 Thread Laurenz Albe
On Fri, 2023-09-29 at 22:34 -0400, Bruce Momjian wrote:
> Very good point!   Updated patch attached.

Thanks!  Some small corrections:

> diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
> index 9cf9d030a8..be1c522575 100644
> --- a/doc/src/sgml/maintenance.sgml
> +++ b/doc/src/sgml/maintenance.sgml
> @@ -861,10 +861,16 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
> 
>  
> 
> -Partitioned tables are not processed by autovacuum.  Statistics
> -should be collected by running a manual ANALYZE when 
> it is
> -first populated, and again whenever the distribution of data in its
> -partitions changes significantly.
> +Partitioned tables do not directly store tuples and consequently
> +autovacuum does not VACUUM them.  (Autovacuum does

... does not VACUUM or ANALYZE them.

Perhaps it would be shorter to say "does not process them" like the
original wording.

> +perform VACUUM on table partitions just like other

Just like *on* other tables, right?

> +tables.)  Unfortunately, this also means that autovacuum doesn't
> +run ANALYZE on partitioned tables, and this
> +can cause suboptimal plans for queries that reference partitioned
> +table statistics.  You can work around this problem by manually
> +running ANALYZE on partitioned tables when they
> +are first populated, and again whenever the distribution of data in
> +their partitions changes significantly.
> 
>  
> 

Yours,
Laurenz Albe




Re: Regression in COPY FROM caused by 9f8377f7a2

2023-10-01 Thread Laurenz Albe
On Sun, 2023-10-01 at 10:55 -0400, Andrew Dunstan wrote:
> Thanks, pushed.

Thanks for taking care of that.

Yours,
Laurenz Albe




Trigger violates foreign key constraint

2023-10-02 Thread Laurenz Albe
CREATE TABLE parent (id integer PRIMARY KEY);

CREATE TABLE child (id integer REFERENCES parent ON DELETE CASCADE);

CREATE FUNCTION silly() RETURNS trigger LANGUAGE plpgsql AS 'BEGIN RETURN NULL; 
END;';

CREATE TRIGGER silly BEFORE DELETE ON child FOR EACH ROW EXECUTE FUNCTION 
silly();

INSERT INTO parent VALUES (1);

INSERT INTO child VALUES (1);

DELETE FROM parent WHERE id = 1;

TABLE child;
 id 
----
  1
(1 row)

The trigger function cancels the cascaded delete on "child", and we are left 
with
a row in "child" that references no row in "parent".

Yours,
Laurenz Albe




Re: Trigger violates foreign key constraint

2023-10-02 Thread Laurenz Albe
Perhaps it would be enough to run "RI_FKey_noaction_del" after
"RI_FKey_cascade_del", although that would impact performance.

Yours,
Laurenz Albe




Re: Trigger violates foreign key constraint

2023-10-02 Thread Laurenz Albe
On Mon, 2023-10-02 at 09:49 -0400, Tom Lane wrote:
> Laurenz Albe  writes:
> > CREATE FUNCTION silly() RETURNS trigger LANGUAGE plpgsql AS 'BEGIN RETURN 
> > NULL; END;';
> > CREATE TRIGGER silly BEFORE DELETE ON child FOR EACH ROW EXECUTE FUNCTION 
> > silly();
> 
> > The trigger function cancels the cascaded delete on "child", and we are 
> > left with
> > a row in "child" that references no row in "parent".
> 
> Yes.  This is by design: triggers operate at a lower level than
> foreign keys, so an ill-conceived trigger can break an FK constraint.
> That's documented somewhere, though maybe not visibly enough.
> 
> There are good reasons to want triggers to be able to see and
> react to FK-driven updates, so it's unlikely that we'd want to
> revisit that design decision, even if it hadn't already stood
> for decades.

Thanks for the clarification.  I keep learning.

I didn't find anything about that in the documentation or the
READMEs in the source, but perhaps I didn't look well enough.

Yours,
Laurenz Albe




Re: Trigger violates foreign key constraint

2023-10-03 Thread Laurenz Albe
On Mon, 2023-10-02 at 09:49 -0400, Tom Lane wrote:
> This is by design: triggers operate at a lower level than
> foreign keys, so an ill-conceived trigger can break an FK constraint.
> That's documented somewhere, though maybe not visibly enough.

Not having found any documentation, I propose the attached caution.

Yours,
Laurenz Albe
From b6abd7dfdf1e25515ead092489efde0d239f7053 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Tue, 3 Oct 2023 09:20:54 +0200
Subject: [PATCH] Document foreign key internals

Warn the user that foreign keys are implemented as triggers, and
that user-defined triggers can interact with them and break
referential integrity.

Discussion: https://postgr.es/m/b81fe38fcc25a81be6e2e5b3fc1ff624130762fa.camel%40cybertec.at
---
 doc/src/sgml/ddl.sgml | 12 
 1 file changed, 12 insertions(+)

diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 075ff32991..8016b9fd81 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -1224,6 +1224,18 @@ CREATE TABLE posts (
 syntax in the reference documentation for
 .

+
+   
+
+ Foreign key constraints are implemented as system triggers in PostgreSQL.
+ As such, they are subject to the trigger firing rules described in
+ .  In particular, other triggers
+ defined on the referencing table can cancel or modify the effects of
+ cascading delete or update, thereby breaking referential integrity.
+ This is not considered a bug, and it is the responsibility of the user
+ to write triggers so that such problems are avoided.
+
+   
   
 
   
-- 
2.41.0



Re: Document efficient self-joins / UPDATE LIMIT techniques.

2023-10-04 Thread Laurenz Albe
On Wed, 2023-06-28 at 14:20 -0400, Corey Huinker wrote:
> This patch adds a few examples to demonstrate the following:
> 
> * The existence of the ctid column on every table
> * The utility of ctds in self joins
> * A practical usage of SKIP LOCKED

I had a look at your patch, and I am in favor of the general idea.

Style considerations:
-

I think the SQL statements should end with semicolons.  Our SQL examples
are usually written like that.

Our general style with CTEs seems to be (according to
https://www.postgresql.org/docs/current/queries-with.html):

 WITH quaxi AS (
     SELECT ...
 )
 SELECT ...;

About the DELETE example:
-

The text suggests that a single, big DELETE operation can consume
too many resources.  That may be true, but the sum of your DELETEs
will consume even more resources.

In my experience, the bigger problem with bulk deletes like that is
that you can run into deadlocks easily, so maybe that would be a
better rationale to give.  You could say that with this technique,
you can force the locks to be taken in a certain order, which will
avoid the possibility of deadlock with other such DELETEs.
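
Something like this sketch (table, condition and batch size are made up)
would take the row locks in a deterministic order:

    WITH batch AS (
        SELECT ctid FROM big_table
        WHERE expired
        ORDER BY ctid
        LIMIT 10000
        FOR UPDATE
    )
    DELETE FROM big_table AS t
    USING batch
    WHERE t.ctid = batch.ctid;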

About the SELECT example:
-

That example belongs to UPDATE, I'd say, because that is the main
operation.

The reason you give (avoid excessive locking) is good.
Perhaps you could mention that updating in batches also avoids
excessive bloat (if you VACUUM between the batches).
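
For instance (names are made up; repeat until the UPDATE reports zero rows):

    UPDATE orders SET archived = true
    WHERE ctid IN (SELECT ctid FROM orders
                   WHERE NOT archived LIMIT 10000);
    VACUUM orders;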

About the UPDATE example:
-

I think that could go, because it is pretty similar to the previous
one.  You even use ctid in both examples.

Status set to "waiting for author".

Yours,
Laurenz Albe




Re: Good News Everyone! + feature proposal

2023-10-05 Thread Laurenz Albe
On Thu, 2023-10-05 at 02:22 +, Jon Erdman wrote:

> For the proposal (this one is a bit Apple specific): because my team 
> offers managed postgres to our Apple-internal customers, many of whom 
> are not database experts, or at least not postgres experts, we'd like to 
> be able to toggle the availability of UNLOGGED tables in 
> postgresql.conf, so our less clueful users have fewer footguns.
> 
> So, my proposal is for a GUC to implement that, which would *OF COURSE* 
default to allowing UNLOGGED.

It certainly sounds harmless, but there are two things that make me
unhappy about this:

- Yet another GUC.  It's not like we don't have enough of them.
  (This is a small quibble.)

- This setting would influence the way SQL is processed.
  We have had bad experiences with those; an often-quoted example is
  the "autocommit" parameter that got removed in 7.4.
  This certainly is less harmful, but it is still another pitfall that
  can confuse users.

This reminds me of the proposal for a GUC to forbid UPDATE and DELETE
without a WHERE clause.  That didn't get anywhere, see
https://www.postgresql.org/message-id/flat/20160721045746.GA25043%40fetter.org

> PS: I'm SO happy that this phase of my postgres journey has finally 
> started

I am happy for you.

Please don't be discouraged if some of your patches get stuck because
no consensus can be reached or because nobody cares enough.  Your
contributions are still welcome.  One good way to gain experience
is to review others' patches.  In fact, you are expected to do that
if you submit your own.

Yours,
Laurenz Albe




Re: Good News Everyone! + feature proposal

2023-10-05 Thread Laurenz Albe
On Thu, 2023-10-05 at 14:58 +, Jon Erdman wrote:
> > > > For the proposal (this one is a bit Apple specific): because my team
> > > > offers managed postgres to our Apple-internal customers, many of whom
> > > > are not database experts, or at least not postgres experts, we'd like to
> > > > be able to toggle the availability of UNLOGGED tables in
> > > > postgresql.conf, so our less clueful users have fewer footguns.
> 
> Someone on linked-in suggested an event trigger, so now I'm thinking of 
> a custom extension that would do nothing but create said event trigger, 
> and maybe could be toggled with a customized setting (but that might 
> allow users to turn it off themselves...which is maybe ok).

An event trigger is the perfect solution for this requirement.
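
A minimal sketch of such a trigger (untested; it rejects unlogged relations
at ddl_command_end):

    CREATE FUNCTION reject_unlogged_tables() RETURNS event_trigger
    LANGUAGE plpgsql AS
    $$DECLARE
        obj record;
    BEGIN
        FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands()
        LOOP
            IF EXISTS (SELECT 1 FROM pg_class
                       WHERE oid = obj.objid
                         AND relpersistence = 'u')
            THEN
                RAISE EXCEPTION 'unlogged tables are not allowed';
            END IF;
        END LOOP;
    END;$$;

    CREATE EVENT TRIGGER no_unlogged_tables
        ON ddl_command_end
        WHEN TAG IN ('CREATE TABLE', 'ALTER TABLE')
        EXECUTE FUNCTION reject_unlogged_tables();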

> If the extension were installed by the DBA user, the customer wouldn't 
> be able to drop it, and if we decided to give them an exception, we just 
> drop or disable the extension.

Right.  Also, only a superuser can create or drop event triggers.

> As a second more general question: could my original idea (i.e. sans 
> event trigger) be implemented in an extension somehow, or is that not 
> technically possible (I suspect not)?

You could perhaps use "object_access_hook" in an extension.

Yours,
Laurenz Albe




Re: document the need to analyze partitioned tables

2023-10-06 Thread Laurenz Albe
On Fri, 2023-10-06 at 12:20 -0400, Bruce Momjian wrote:
> Good points, updated patch attached.

That patch is good to go, as far as I am concerned.

Yours,
Laurenz Albe




Re: Restoring default privileges on objects

2023-10-06 Thread Laurenz Albe
On Wed, 2023-08-30 at 12:00 +0200, Peter J. Holzer wrote:
> On 2023-08-29 14:44:48 -0600, Stuart McGraw wrote:
> > On 8/29/23 13:27, Tom Lane wrote:
> > > Fixing \dp to honor "\pset null" for this might be a reasonable
> > > thing to do too.  I'm actually a bit surprised that that doesn't
> > > work already.
> > 
> > That change would still require someone using \dp to realize that
> > the "Access privileges" value could be either '' or NULL (I guess
> > that could be pointed out more obviously in the psql doc), and then
> > do a '\pset null' before doing \dp?  That seems a little inconvenient.
> 
> Or just always do a \pset null. For me printing NULL the same as an
> empty string is just as confusing in normal tables, so that's the first
> line in my ~/.psqlrc. YMMV, of course.
> 
> But I guess the point is that people who do \pset null expect to be able
> to distinguish '' and NULL visually and might be surprised if that
> doesn't work everywhere, while people who don't \pset null know that ''
> and NULL are visually indistinguishable and that they may need some
> other way to distinguish them if the difference matters.
> 
> So +1 for me fixing \dp to honor "\pset null".

+1

Here is a patch that does away with the special handling of NULL values
in psql backslash commands.

Yours,
Laurenz Albe




Re: Restoring default privileges on objects

2023-10-06 Thread Laurenz Albe
On Fri, 2023-10-06 at 22:16 +0200, Laurenz Albe wrote:
> Here is a patch that does away with the special handling of NULL values
> in psql backslash commands.

Erm, I forgot to attach the patch.

Yours,
Laurenz Albe
From 6c67f15f011ddf1e309cb7e84580b266d674a1e2 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Fri, 6 Oct 2023 22:12:33 +0200
Subject: [PATCH] psql: honor "\pset null" in backslash commands

In the output of backslash commands, NULL was always displayed
as empty string, rather than honoring the values set with
"\pset null".

This was surprising, and in some cases it was downright confusing:
for example, default privileges (stored as NULL) were displayed
just like an empty aclitem[], making these cases indistinguishable
in the output.

Discussion: https://postgr.es/m/96d6885a-5e25-9ae8-4a1a-d7e557a5fe9c%40mtneva.com
---
 src/bin/psql/describe.c | 43 -
 1 file changed, 43 deletions(-)

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index bac94a338c..224aa22575 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -124,7 +124,6 @@ describeAggregates(const char *pattern, bool verbose, bool showSystem)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of aggregate functions");
 	myopt.translate_header = true;
 
@@ -197,7 +196,6 @@ describeAccessMethods(const char *pattern, bool verbose)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of access methods");
 	myopt.translate_header = true;
 	myopt.translate_columns = translate_columns;
@@ -262,7 +260,6 @@ describeTablespaces(const char *pattern, bool verbose)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of tablespaces");
 	myopt.translate_header = true;
 
@@ -585,7 +582,6 @@ describeFunctions(const char *functypes, const char *func_pattern,
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of functions");
 	myopt.translate_header = true;
 	if (pset.sversion >= 90600)
@@ -702,7 +698,6 @@ describeTypes(const char *pattern, bool verbose, bool showSystem)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of data types");
 	myopt.translate_header = true;
 
@@ -893,7 +888,6 @@ describeOperators(const char *oper_pattern,
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of operators");
 	myopt.translate_header = true;
 
@@ -995,7 +989,6 @@ listAllDbs(const char *pattern, bool verbose)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of databases");
 	myopt.translate_header = true;
 
@@ -1146,7 +1139,6 @@ permissionsList(const char *pattern, bool showSystem)
 	if (!res)
 		goto error_return;
 
-	myopt.nullPrint = NULL;
 	printfPQExpBuffer(&buf, _("Access privileges"));
 	myopt.title = buf.data;
 	myopt.translate_header = true;
@@ -1218,7 +1210,6 @@ listDefaultACLs(const char *pattern)
 	if (!res)
 		goto error_return;
 
-	myopt.nullPrint = NULL;
 	printfPQExpBuffer(&buf, _("Default access privileges"));
 	myopt.title = buf.data;
 	myopt.translate_header = true;
@@ -1417,7 +1408,6 @@ objectDescription(const char *pattern, bool showSystem)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("Object descriptions");
 	myopt.translate_header = true;
 	myopt.translate_columns = translate_columns;
@@ -3852,7 +3842,6 @@ listDbRoleSettings(const char *pattern, const char *pattern2)
 	}
 	else
 	{
-		myopt.nullPrint = NULL;
 		myopt.title = _("List of settings");
 		myopt.translate_header = true;
 
@@ -3926,7 +3915,6 @@ describeRoleGrants(const char *pattern, bool showSystem)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of role grants");
 	myopt.translate_header = true;
 
@@ -4122,7 +4110,6 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
 	}
 	else
 	{
-		myopt.nullPrint = NULL;
 		myopt.title = _("List of relations");
 		myopt.translate_header = true;
 		myopt.translate_columns = translate_columns;
@@ -4332,7 +4319,6 @@ listPartitionedTables(const char *reltypes, const char *pattern, bool verbose)
 	initPQExpBuffer(&title);
 	appendPQExpBufferStr(&title, tabletitle);
 
-	myopt.nullPrint = NULL;
 	myopt.title = title.data;
 	myopt.translate_header = true;
 	myopt.translate_columns = translate_columns;
@@ -4412,7 +4398,6 @@ listLanguages(const char *pattern, bool verbose, bool showSystem)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of languages");
 	myopt.translate_header = true;
 
@@ -4497,7 +4482,6 @@ listDomains(const char *pattern, bool verbose, bool showSystem)
 	if (!res)
 		return false;
 
-	myopt.nullPrint = NULL;
 	myopt.title = _("List of d

Re: Remove Deprecated Exclusive Backup Mode

2019-01-10 Thread Laurenz Albe
Stephen Frost wrote:
> * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> > Clearly, not having to do that at all is better, but if this is all
> > there is to it, then I'm confused by the characterizations of how awful
> > and terrible this feature is and how we must rush to remove it.
> 
> It's not all there is to it though.
> 
> This issue leads to extended downtime regularly and is definitely a huge
> 'gotcha' for users, even if you don't want to call it outright broken,

Only if PostgreSQL crashes regularly, right?

Yours,
Laurenz Albe




Re: Libpq support to connect to standby server as priority

2019-01-17 Thread Laurenz Albe
Tsunakawa, Takayuki wrote:
> From: Tom Lane [mailto:t...@sss.pgh.pa.us]
> > The problem here of course is that whoever invented target_session_attrs
> > was unconcerned with following that precedent, so what we have is
> > "target_session_attrs=(any | read-write)".
> > Are we prepared to add some aliases in service of unifying these names?
> 
> I think "yes".
> 
> > 2. Whether or not you want to follow pgJDBC's naming, it seems like we
> > ought to have both "require read only" and "prefer read only" behaviors
> > in this patch, and maybe likewise "require read write" versus "prefer
> > read write".
> 
> Agreed, although I don't see a use case for "prefer read write".  I don't 
> think
> there's an app like "I want to write, but I'm OK if I cannot."

I don't think so either, although of course I cannot prove it.

My opinion is that we shouldn't add options like "prefer read write"
just out of a fuzzy desire for symmetry.  It would probably make the code
even more complicated, and more choice means that it becomes harder for
the user to pick the right one (the latter may be a weak argument).

The motivation behind all this is to load balance reading and writing
sessions among a group of replicating servers where you don't know for sure
who is what at the moment, so "preferably read-only", "must be able to write"
and "don't care" are choice enough.

There is nothing that bars future patches from adding additional modes
if the need really arises.

> > 4. Given that other discussion, it's not quite clear what we should
> > even be checking.  The existing logic devolves to checking that
> > transaction_read_only is true, but that's not really the same thing as
> > "is a master server", eg you might have connected to a master server
> > under a role that has SET ROLE default_transaction_read_only = false.
> 
> PgJDBC uses transaction_read_only like this: [...]
> 
> But as some people said, I don't think this is the right way.  I suppose 
> what's leading
> to the current somewhat complicated situation is that there was no easy way 
> for the
> client to know whether the server is the master.  That ended up in using
> "SHOW transaction_read_only" instead, and people supported that compromise by 
> saying
> "read only status is more useful than whether the server is standby or not," 
> I'm afraid.
> 
> The original desire should have been the ability to connect to a primary or a 
> standby.
> So, I think we should go back to the original thinking (and not complicate 
> the feature),
> and create a read only GUC_REPORT variable, say, server_role, that identifies 
> whether
> the server is a primary or a standby.

I think that transaction_read_only is good.

If it is set to false, we are sure to be on a replication primary or
stand-alone server, which is enough to know for the load balancing use case.
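
For illustration, that is the check you get today with a multi-host
connection string like this (host names are made up):

    host=node1,node2 dbname=app target_session_attrs=read-write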

I deem it unlikely that someone will set default_transaction_read_only to
FALSE and then complain that the feature is not working as expected, but again
I cannot prove that claim.

As Robert said, transaction_read_only might even give you the option to
use the feature for more than just load balancing between replication master 
and standby.

Yours,
Laurenz Albe




Re: Libpq support to connect to standby server as priority

2019-01-18 Thread Laurenz Albe
Tsunakawa, Takayuki wrote:
> From: Laurenz Albe [mailto:laurenz.a...@cybertec.at]
> > I think that transaction_read_only is good.
> > 
> > If it is set to false, we are sure to be on a replication primary or
> > stand-alone server, which is enough to know for the load balancing use case.
> 
> As Tatsuo-san said, setting default_transaction_read_only leads to a 
> misjudgement of the primary.

Yes, you can have a false negative, i.e. fail to recognize a primary server.

> > I deem it unlikely that someone will set default_transaction_read_only to
> > FALSE and then complain that the feature is not working as expected, but
> > again
> > I cannot prove that claim.
> 
> I wonder what default_transaction_read_only exists for.  For making the
> database read-only by default and allowing only specific users to write to
> the database with "CREATE/ALTER USER SET default_transaction_read_only =
> false"?

I'd guess that the main use of default_transaction_read_only is to make sure an
application (that isn't smart enough to change the parameter) won't modify any 
data.

> I'm sorry to repeat myself, but anyway, I think we need a method to connect
> to a standby, as was the original desire, because the primary instance may
> be read-only by default while only limited users update data.  That's for
> reducing the burden on the primary and minimizing the impact on users who
> update data.  For example,
> 
> * run data reporting on the standby
> * backup the database from the standby
> * cascade replication from the standby

I see.

But then the new value should not be called "prefer-read", because that would be
misleading.  It would also not be related to the existing "read-write".

For what you have in mind, there should be the options "primary-required" and
"standby-preferred", however we implement them.

Have there been a lot of complaints that the existing "read-write" is not good
enough to detect replication primaries?

> > As Robert said, transaction_read_only might even give you the option to
> > use the feature for more than just load balancing between replication master
> > and standby.
> 
> What use case do you think of?  If you want to load balance the workload 
> between
> the primary and standbys, we can follow PgJDBC -- targetServerType=any.

One use case I can think of is logical replication (or other replication 
methods like
Slony).  You can use the feature by setting default_transaction_read_only = on
on the standby.

Yours,
Laurenz Albe




Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2019-01-28 Thread Laurenz Albe
Stefan Keller wrote:
> Pls. try to give DUAL a better name, since it's IMHO neither
> self-explaining nor correct.

I agree with the sentiment.
On the other hand, people who migrate from Oracle might be happy if
there is a DUAL table that allows them to migrate some of their
statements unmodified.

I don't know if that is a good enough argument, though. 
Currently there is "orafce" which provides DUAL, and it might be
good enough if it defines DUAL as a view on DUMMY.

Yours,
Laurenz Albe




Re: "SELECT ... FROM DUAL" is not quite as silly as it appears

2019-01-28 Thread Laurenz Albe
David Rowley wrote:
> hmm. You both know the table of that name exists only as part of a
> regression test, right?  It'll just exist for a handful of
> milliseconds during make check.

Er, I wasn't aware of that, as I didn't read the patch.
Then I think my comment should be discarded as being overly pedantic.

Yours,
Laurenz Albe




Re: reloption to prevent VACUUM from truncating empty pages at the end of relation

2019-02-01 Thread Laurenz Albe
Jamison, Kirk wrote:
> On February 1, 2019, Tsunakawa, Takayuki wrote: 
> 
> > > As most people seem to agree adding the reloption, here's the patch.  
> > > It passes make check, and works like this:
> > Sorry, I forgot to include the modified header file.  Revised patch 
> > attached.
> 
> I wonder if there is a better reloption name for shrink_enabled. 
> (truncate_enabled, vacuum_enabled? Hmm. No?)
> On the other hand, shrink_enabled seems to describe well what it's supposed 
> to do when vacuuming tables.
> Besides there's a similarly-named autovacuum_enabled option.

I like "shrink_enabled".

It may sound weird to the ears of PostgreSQL hackers, but it will make sense
to users.

Perhaps "vacuum_shrink_enabled" would be even better.

Yours,
Laurenz Albe




Re: What happens if checkpoint haven't completed until the next checkpoint interval or max_wal_size?

2019-02-04 Thread Laurenz Albe
Mohammad Sherafat wrote:
> In the name of god!

It is not considered good style to hurt people's religious feelings
by using the name of god in vain.

> What happens if checkpoint haven't completed until the next checkpoint 
> interval or max_wal_size?

Then you have two checkpoints active at the same time.

Yours,
Laurenz Albe




Re: Fix optimization of foreign-key on update actions

2019-02-06 Thread Laurenz Albe
Andrew Gierth wrote:
> SQL2016, 15.17 Execution of referential actions
> 
> 10) If a non-null value of a referenced column RC in the referenced
> table is updated to a value that is distinct from the current value
> of RC, then,
> 
>   [snip all the stuff about how ON UPDATE actions work]
> 
> does that "is distinct from" mean that IS DISTINCT FROM would be true,
> or does it mean "is in some way distinguishable from"? Nothing I can see
> in the spec suggests the latter.

My 2003 standard defines, and even condescends to be informal:

3.1.6.8 distinct (of a pair of comparable values): Capable of being 
distinguished within a given context.
Informally, not equal, not both null. A null value and a non-null value are 
distinct.

Yours,
Laurenz Albe




Re: Row Level Security − leakproof-ness and performance implications

2019-02-19 Thread Laurenz Albe
Pierre Ducroquet wrote:
> In order to increase our security, we have started deploying row-level 
> security in order to add another safety net if any issue was to happen in our 
> applications.
> After a careful monitoring of our databases, we found out that a lot of 
> queries started to go south, going extremely slow.
> The root of these slowdowns is that a lot of the PostgreSQL functions are not 
> marked as leakproof, especially the ones used for operators.
> In current git master, the following query returns 258 functions that are
> used by operators returning booleans and not marked leakproof:
> 
> SELECT proname FROM pg_proc 
> WHERE exists(select 1 from pg_operator where oprcode::name = proname) 
> AND NOT(proleakproof) 
> AND prorettype = 16;
> 
> 
> Among these, if you have for instance a table like:
> create table demo_enum (
> username text,
> flag my_enum
> );
> 
> With an index on (username, flag), the index will only be used on username 
> because flag is an enum and the enum_eq function is not marked leakproof.
> 
> For simple functions like enum_eq/enum_ne, marking them leakproof is an 
> obvious fix (patch attached to this email, including also textin/textout). And

The enum_eq part looks totally safe, but the text functions allocate memory,
so an attacker could create memory pressure, wait for error messages that
report the size of the failed allocation, and in this way learn something
about the data.

Is this a paranoid idea?

> if we had a 'RLS-enabled' context on functions, a way to make a lot of built-
> in functions leakproof would simply be to prevent them from logging errors 
> containing values.
> 
> For others, like arraycontains, it's much trickier :[...]

I feel a little out of my depth here, so I won't comment.

> A third issue we noticed is the usage of functional indexes. If you create an
> index on, for instance, (username, leaky_function(my_field)), and then search
> on leaky_function(my_field) = 42, the functional index can be used only if
> leaky_function is marked leakproof, even if it is never going to be executed
> on invalid rows thanks to the index. After a quick look at the source code, it
> also seems complicated to implement, since the decision to reject potentially
> dangerous calls to leaky_function is made really early in the process, before
> the optimizer starts.

If there is a bitmap index scan, the condition will be rechecked, so the
function will be executed.

> I am willing to spend some time on these issues, but I have no specific 
> knowledge of the planner and optimizer, and I fear proper fixes would require 
> much digging there. If you think it would be appropriate for functions like 
> arraycontains or range_contains to require the 'inner' comparison function to 
> be leakproof, or just keep looking at the other functions in pg_proc and 
> leakproof the ones that can be, I would be happy to write the corresponding 
> patches.

Thanks, and I think that every function that can safely be marked leakproof
is progress!
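
For the archives: once a function is deemed safe, marking it leakproof is
a superuser one-liner, for example with enum_eq from your patch:

ALTER FUNCTION enum_eq(anyenum, anyenum) LEAKPROOF;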

Yours,
Laurenz Albe




Identity columns should own only one sequence

2019-04-14 Thread Laurenz Albe
Identity columns don't work if they own more than one sequence.

So if one tries to convert a "serial" column to an identity column,
the following can happen:

test=> CREATE TABLE ser(id serial);
CREATE TABLE
test=> ALTER TABLE ser ALTER id ADD GENERATED ALWAYS AS IDENTITY;
ERROR:  column "id" of relation "ser" already has a default value

Hm, ok, let's drop the column default value.

test=> ALTER TABLE ser ALTER id DROP DEFAULT;
ALTER TABLE

Now it works:

test=> ALTER TABLE ser ALTER id ADD GENERATED ALWAYS AS IDENTITY;
ALTER TABLE

But not very much:

test=> INSERT INTO ser (id) VALUES (DEFAULT);
ERROR:  more than one owned sequence found


I propose that we check if there already is a dependent sequence
before adding an identity column.

The attached patch does that, and also forbids setting the ownership
of a sequence to an identity column.

I think this should be backpatched.

Yours,
Laurenz Albe
From ab536da87fa8ffc70469d3dbdaf3e1b84b0ef793 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Sun, 14 Apr 2019 17:37:03 +0200
Subject: [PATCH] Make sure identity columns own only a single sequence

If an identity column owns more than one sequence, inserting the
default value will fail with "more than one owned sequence found".

Consequently, we should prevent this from happening, and disallow
a) turning a column that already owns a sequence into an identity column
b) setting ownership of a sequence to an identity column
---
 src/backend/commands/sequence.c  | 20 +---
 src/backend/commands/tablecmds.c | 11 +++
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index e9add1b987..dc3bff5fdf 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -1644,6 +1644,7 @@ process_owned_by(Relation seqrel, List *owned_by, bool for_identity)
 	int			nnames;
 	Relation	tablerel;
 	AttrNumber	attnum;
+	HeapTuple	heaptup;
 
 	deptype = for_identity ? DEPENDENCY_INTERNAL : DEPENDENCY_AUTO;
 
@@ -1694,9 +1695,22 @@ process_owned_by(Relation seqrel, List *owned_by, bool for_identity)
 	(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 	 errmsg("sequence must be in same schema as table it is linked to")));
 
-		/* Now, fetch the attribute number from the system cache */
-		attnum = get_attnum(RelationGetRelid(tablerel), attrname);
-		if (attnum == InvalidAttrNumber)
+		/* Now, fetch the attribute from the system cache */
+		heaptup = SearchSysCacheAttName(RelationGetRelid(tablerel), attrname);
+		if (HeapTupleIsValid(heaptup))
+		{
+			Form_pg_attribute att_tup = (Form_pg_attribute) GETSTRUCT(heaptup);
+			bool	is_identity = att_tup->attidentity;
+
+			attnum = att_tup->attnum;
+			ReleaseSysCache(heaptup);
+			if (is_identity && !for_identity)
+				ereport(ERROR,
+						(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+						 errmsg("column \"%s\" of relation \"%s\" is an identity column",
+								attrname, RelationGetRelationName(tablerel))));
+		}
+		else
 			ereport(ERROR,
 	(errcode(ERRCODE_UNDEFINED_COLUMN),
 	 errmsg("column \"%s\" of relation \"%s\" does not exist",
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index d48a947f7c..9ea9dae2fb 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -6378,6 +6378,17 @@ ATExecAddIdentity(Relation rel, const char *colName,
 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
  errmsg("column \"%s\" of relation \"%s\" already has a default value",
 		colName, RelationGetRelationName(rel;
+	/*
+	 * Make sure that the column does not already own a sequence.
+	 * Otherwise, inserting a default value would fail, since it is not
+	 * clear which sequence should be used.
+	 */
+	if (getOwnedSequences(RelationGetRelid(rel), attnum) != NIL)
+		ereport(ERROR,
+				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+				 errmsg("column \"%s\" of relation \"%s\" already owns a sequence",
+						colName, RelationGetRelationName(rel)),
+				 errhint("Drop the dependent sequence before making the column an identity column.")));
 
 	attTup->attidentity = cdef->identity;
 	CatalogTupleUpdate(attrelation, &tuple->t_self, tuple);
-- 
2.20.1



Re: Identity columns should own only one sequence

2019-04-14 Thread Laurenz Albe
I wrote:
> Identity columns don't work if they own more than one sequence.
> 
[...]
> test=> INSERT INTO ser (id) VALUES (DEFAULT);
> ERROR:  more than one owned sequence found
> 
> 
> I propose that we check if there already is a dependent sequence
> before adding an identity column.
> 
> The attached patch does that, and also forbids setting the ownership
> of a sequence to an identity column.

Alternatively, maybe getOwnedSequence should only consider sequences
with an "internal" dependency on the column.  That would avoid the problem
without forbidding anything, since normal OWNED BY dependencies are "auto".

What do you think?

Yours,
Laurenz Albe





Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

2019-04-20 Thread Laurenz Albe
On Fri, 2018-04-06 at 23:24 +, Robert Haas wrote:
> Allow insert and update tuple routing and COPY for foreign tables.
> 
> Also enable this for postgres_fdw.
> 
> Etsuro Fujita, based on an earlier patch by Amit Langote. The larger
> patch series of which this is a part has been reviewed by Amit
> Langote, David Fetter, Maksim Milyutin, Álvaro Herrera, Stephen Frost,
> and me.  Minor documentation changes to the final version by me.
> 
> Discussion: 
> http://postgr.es/m/29906a26-da12-8c86-4fb9-d8f88442f...@lab.ntt.co.jp

This commit makes life hard for foreign data wrappers that support
data modifications.

If an FDW implements ExecForeignInsert, this commit automatically assumes
that it also supports COPY FROM.  It will call ExecForeignInsert without
calling PlanForeignModify and BeginForeignModify, and an FDW that does not
expect that will probably fail.

So this commit silently turns a functioning FDW into a broken one.
That is not nice.  Probably not every FDW author is aware of this change,
and maybe there are FDWs that support INSERT but don't want to support COPY
for some reason.

I propose that PostgreSQL only allows COPY FROM on a foreign table if the FDW
implements BeginForeignInsert.  The attached patch implements that.

I think this should be backpatched to v11.

Yours,
Laurenz Albe
From c4b0e871658c757124dad992578da0b60fccf962 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Sat, 20 Apr 2019 13:36:56 +0200
Subject: [PATCH] COPY FROM on foreign tables requires BeginForeignInsert

Commit 3d956d956a introduced COPY FROM support for foreign tables.

If a foreign data wrapper supports data modifications, but either has
not adapted to this change yet or doesn't want to support COPY FROM
for other reasons, it probably got broken by the above commit,
because COPY will just call ExecForeignInsert anyway, which might not
work because neither PlanForeignModify nor BeginForeignModify have
been called.

To avoid breaking third-party foreign data wrappers in that way, allow
COPY FROM for foreign tables only if the foreign data wrapper implements
BeginForeignInsert.
---
 doc/src/sgml/fdwhandler.sgml | 2 ++
 src/backend/commands/copy.c  | 5 +
 2 files changed, 7 insertions(+)

diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
index 2c07a86637..8cd174a204 100644
--- a/doc/src/sgml/fdwhandler.sgml
+++ b/doc/src/sgml/fdwhandler.sgml
@@ -724,6 +724,8 @@ BeginForeignInsert(ModifyTableState *mtstate,
  perform any initialization needed prior to the actual insertion.
  Subsequently, ExecForeignInsert will be called for
  each tuple to be inserted into the foreign table.
+ All foreign data wrappers that support COPY FROM have
+ to provide this callback function.
 
 
 
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index c39218f8db..944d558cd4 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -2857,6 +2857,11 @@ CopyFrom(CopyState cstate)
 		resultRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL)
 		resultRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate,
 		 resultRelInfo);
+	else
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot copy to foreign table \"%s\"",
+						RelationGetRelationName(cstate->rel))));
 
 	/* Prepare to catch AFTER triggers. */
 	AfterTriggerBeginQuery();
-- 
2.20.1



Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

2019-04-22 Thread Laurenz Albe
On Mon, 2019-04-22 at 21:45 +0900, Etsuro Fujita wrote:

Thanks for looking into this!

> (2019/04/20 20:53), Laurenz Albe wrote:
> > On Fri, 2018-04-06 at 23:24 +, Robert Haas wrote:
> > > Allow insert and update tuple routing and COPY for foreign tables.
> > > 
> > > Also enable this for postgres_fdw.
> > > 
> > > Etsuro Fujita, based on an earlier patch by Amit Langote. The larger
> > > patch series of which this is a part has been reviewed by Amit
> > > Langote, David Fetter, Maksim Milyutin, Álvaro Herrera, Stephen Frost,
> > > and me.  Minor documentation changes to the final version by me.
> > > 
> > > Discussion: 
> > > http://postgr.es/m/29906a26-da12-8c86-4fb9-d8f88442f...@lab.ntt.co.jp
> > 
> > This commit makes life hard for foreign data wrappers that support
> > data modifications.
> > 
> > If a FDW implements ExecForeignInsert, this commit automatically assumes
> > that it also supports COPY FROM.  It will call ExecForeignInsert without
> > calling PlanForeignModify and BeginForeignModify, and a FDW that does not
> > expect that will probably fail.
> 
> This is not 100% correct; the FDW documentation says:
> 
>   Tuples inserted into a partitioned table by INSERT or
>   COPY FROM are routed to partitions.  If an FDW
>   supports routable foreign-table partitions, it should also provide the
>   following callback functions.  These functions are also called when
>   COPY FROM is executed on a foreign table.

I don't see the difference between the documentation and what I wrote above.

Before v11, a FDW could expect that ExecForeignInsert is only called if
BeginForeignModify was called earlier.
That has silently changed with v11.

> > maybe there are FDWs that support INSERT but don't want to support COPY
> > for some reason.
> 
> I agree on that point.
> 
> > I propose that PostgreSQL only allows COPY FROM on a foreign table if the 
> > FDW
> > implements BeginForeignInsert.  The attached patch implements that.
> 
> I don't think that is a good idea, because there might be some FDWs that 
> want to support COPY FROM on foreign tables without providing 
> BeginForeignInsert.  (As for INSERT into foreign tables, we actually 
> allow FDWs to support it without providing PlanForeignModify, 
> BeginForeignModify, or EndForeignModify.)
> 
> It's permissible to throw an error in BeginForeignInsert, so what I was 
> thinking for FDWs that don't want to support COPY FROM and 
> INSERT/UPDATE/COPY FROM tuple routing was to provide BeginForeignInsert 
> implementing something like this:
> 
> static void
> fooBeginForeignInsert(ModifyTableState *mtstate,
>                       ResultRelInfo *resultRelInfo)
> {
>     Relation    rel = resultRelInfo->ri_RelationDesc;
> 
>     if (mtstate->ps.plan == NULL)
>         ereport(ERROR,
>                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>                  errmsg("cannot copy to foreign table \"%s\"",
>                         RelationGetRelationName(rel))));
>     else
>         ereport(ERROR,
>                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>                  errmsg("cannot route tuples into foreign table \"%s\"",
>                         RelationGetRelationName(rel))));
> }

Sure, it is not hard to modify an FDW to continue working with v11.

My point is that this should not be necessary.

If an FDW worked well with v10, it should continue to work with v11
unless there is a necessity for a compatibility-breaking change.

On the other hand, if an FDW wants to support COPY in v11 and has no
need for BeginForeignInsert to support that, it is a simple exercise
for it to provide an empty BeginForeignInsert just to signal that it
wants to support COPY.
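
Something like this minimal sketch would do ("foo" standing in for the
foreign data wrapper's function prefix):

static void
fooBeginForeignInsert(ModifyTableState *mtstate,
                      ResultRelInfo *resultRelInfo)
{
    /* nothing to do; providing the callback merely signals COPY support */
}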

I realized that my previous patch forgot to check for tuple routing;
an updated patch is attached.

Yours,
Laurenz Albe
From 545ac9d567ea6ca610ce08355451923cc787e13d Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Mon, 22 Apr 2019 21:35:06 +0200
Subject: [PATCH] Foreign table COPY FROM and tuple routing requires
 BeginForeignInsert

Commit 3d956d956a introduced support for foreign tables as partitions
and COPY FROM on foreign tables.

If a foreign data wrapper supports data modifications, but either has
not adapted to this change yet or doesn't want to support it
for other reasons, it probably got broken by the above commit,
because COPY will just call ExecForeignInsert anyway, which might not
work because neither PlanForeignModify nor BeginForeignModify have
been called.

To avoid breaking third-party foreign data wrappers in that way, allow
COPY FROM and tuple routing for foreign tables only if the foreign data
wrapper implements BeginForeignInsert.

Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

2019-04-22 Thread Laurenz Albe
On Mon, 2019-04-22 at 16:24 -0400, Robert Haas wrote:
> On Mon, Apr 22, 2019 at 3:37 PM Laurenz Albe  wrote:
> > Sure, it is not hard to modify a FDW to continue working with v11.
> > 
> > My point is that this should not be necessary.
> 
> I'm not sure whether this proposal is a good idea or a bad idea, but I
> think that it's inevitable that FDWs are going to need some updating
> for new releases as the API evolves.  That has happened before and
> will continue to happen.

Absolutely.
I am just unhappy that this change caused unnecessary breakage.

If you developed a read-only FDW for 9.2, it didn't break with the
write support introduced in 9.3, because that used different API
functions.  That's how it should be IMHO.

If you developed an FDW for 9.1, it got broken in 9.2, because the
API had to change to allow returning multiple paths.
That was unfortunate but necessary, so it is ok.

Nothing in this patch required an incompatible change.

Yours,
Laurenz Albe





Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

2019-04-22 Thread Laurenz Albe
On Mon, 2019-04-22 at 13:27 -0700, Andres Freund wrote:
> On 2019-04-22 21:37:25 +0200, Laurenz Albe wrote:
> > Commit 3d956d956a introduced support for foreign tables as partitions
> > and COPY FROM on foreign tables.
> > 
> > If a foreign data wrapper supports data modifications, but either has
> > not adapted to this change yet or doesn't want to support it
> > for other reasons, it probably got broken by the above commit,
> > because COPY will just call ExecForeignInsert anyway, which might not
> > work because neither PlanForeignModify nor BeginForeignModify have
> > been called.
> > 
> > To avoid breaking third-party foreign data wrappers in that way, allow
> > COPY FROM and tuple routing for foreign tables only if the foreign data
> > wrapper implements BeginForeignInsert.
> 
> Isn't this worse though? Before this it's an API change between major
> versions. With this it's an API change in a *minor* version. Sure, it's
> one that doesn't crash, but it's still a pretty substantial function
> regression, no?

Hm, that's a good point.
You could say that this patch is too late, because an FDW might already be
relying on COPY FROM to work without BeginForeignInsert in v11.

How about just applying the patch from v12 on?
Then it is a behavior change in a major release, which is acceptable.
It requires the imaginary FDW above to add an empty BeginForeignInsert
callback function, but will unbreak FDWs that slept through the change
that added COPY support.

Yours,
Laurenz Albe





Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

2019-04-23 Thread Laurenz Albe
On Mon, 2019-04-22 at 14:07 -0700, Andres Freund wrote:
> > How about just applying the patch from v12 on?
> > Then it is a behavior change in a major release, which is acceptable.
> > It requires the imaginary FDW above to add an empty BeginForeignInsert
> > callback function, but will unbreak FDW that slept through the change
> > that added COPY support.
> 
> I fail to see the advantage. It'll still require FDWs to be fixed to
> work correctly for v11, but additionally adds another set of API
> differences that needs to be fixed by another set of FDWs.  I think this
> ship simply has sailed.

I can accept that (having fixed my own FDW), but I am worried that it will
cause problems for FDW users.  Well, I guess they can always avoid COPY if
they don't want FDWs to crash.

Yours,
Laurenz Albe





Re: Symbol referencing errors

2019-04-23 Thread Laurenz Albe
On Tue, 2019-04-23 at 04:26 +, Li Japin wrote:
> Yes, those errors does not impact the postgresql, but when
> I use oracle_fdw extension, I couldn't startup the postgresql,
> and I find that the dlopen throw an error which lead postmaster
> exit, and there is not more information.

That may well be a bug in oracle_fdw, since I have no reports of
anybody running it on that operating system.

Maybe you should open an oracle_fdw issue, but I don't know how
much I can help you, since this is the first time I have heard
of SmartOS.

Yours,
Laurenz Albe





Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

2019-04-24 Thread Laurenz Albe
On Wed, 2019-04-24 at 20:54 +0900, Etsuro Fujita wrote:
> How about adding to the documentation for BeginForeignInsert a mention 
> that if an FDW doesn't support COPY FROM and/or routable foreign tables, 
> it must throw an error in BeginForeignInsert accordingly.

Sure, some more documentation would be good.

The documentation of ExecForeignInsert should mention something like:

  ExecForeignInsert is called for INSERT statements as well
  as for COPY FROM and tuples that are inserted into a foreign table
  because it is a partition of a partitioned table.

  In the case of a normal INSERT, BeginForeignModify will be called
  before ExecForeignInsert to perform any necessary setup.
  In the other cases, this setup has to happen in BeginForeignInsert.

  Before PostgreSQL v11, a foreign data wrapper could be certain that
  BeginForeignModify is always called before ExecForeignInsert.
  This is no longer true.

> > On the other hand, if a FDW wants to support COPY in v11 and has no
> > need for BeginForeignInsert to support that, it is a simple exercise
> > for it to provide an empty BeginForeignInsert just to signal that it
> > wants to support COPY.
> 
> That seems to me inconsistent with the concept of the existing APIs for 
> updating foreign tables, because for an FDW that wants to support 
> INSERT/UPDATE/DELETE and has no need for 
> PlanForeignModify/BeginForeignModify, those APIs don't require the FDW 
> to provide empty PlanForeignModify/BeginForeignModify to tell the core 
> that it wants to support INSERT/UPDATE/DELETE.

That is true, but so far there hasn't been a change to the FDW API that
caused a callback to be invoked in a different fashion than it used to be.

Perhaps it would have been better to create a new callback like
ExecForeignCopy with the same signature as ExecForeignInsert so that
you can use the same callback function for both if you want.
That would also have avoided the breakage.
But, of course it is too late for that now.

Note that postgres_fdw would have been broken by that API change as well
if it hadn't been patched.

At the very least, this should have been mentioned in the list of
incompatible changes for v11.

Yours,
Laurenz Albe





Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

2019-04-24 Thread Laurenz Albe
On Wed, 2019-04-24 at 14:25 +0100, Simon Riggs wrote:
> On Wed, 24 Apr 2019 at 12:55, Etsuro Fujita  
> wrote:
>  
> > > My point is that this should not be necessary.
> > 
> > In my opinion, I think this is necessary...
> 
> Could we decide by looking at what FDWs are known to exist?
> I hope we are trying to avoid breakage in the smallest number of FDWs.

A good idea.  I don't volunteer to go through the list, but I had a look
at Multicorn, which is an FDW framework used by many FDWs, and it seems
to rely on multicornBeginForeignModify being called before
multicornExecForeignInsert (the former sets up a MulticornModifyState
used by the latter).

https://github.com/Kozea/Multicorn/blob/master/src/multicorn.c

Multicorn obviously hasn't got the message yet that the API has
changed in an incompatible fashion, so I'd argue that every
Multicorn FDW with write support is currently broken.


As Andres has argued above, it is too late to do anything more about
it than to document this and warn FDW authors as well as we can.

Yours,
Laurenz Albe





Re: Identity columns should own only one sequence

2019-04-24 Thread Laurenz Albe
On Sun, 2019-04-14 at 20:15 +0200, I wrote:
> I wrote:
> > Identity columns don't work if they own more than one sequence.
> 
> Alternatively, maybe getOwnedSequence should only consider sequences
> with an "internal" dependency on the column.  That would avoid the problem
> without forbidding anything, since normal OWNED BY dependencies are "auto".
> 
> What do you think?

Here is a patch that illustrates the second approach.

I'll add this thread to the next commitfest.

Yours,
Laurenz Albe

From 7f7bae5315b7770f1327a80eb192bb098ee9df89 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Wed, 24 Apr 2019 16:46:39 +0200
Subject: [PATCH] Change getOwnedSequence to only find sequences for identity
 columns

This makes identity columns work even if there is another sequence
owned by the column (with an auto dependency).
---
 src/backend/catalog/pg_depend.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index d63bf5e56d..4d8c333243 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -684,14 +684,18 @@ getOwnedSequences(Oid relid, AttrNumber attnum)
 		Form_pg_depend deprec = (Form_pg_depend) GETSTRUCT(tup);
 
 		/*
-		 * We assume any auto or internal dependency of a sequence on a column
-		 * must be what we are looking for.  (We need the relkind test because
-		 * indexes can also have auto dependencies on columns.)
+		 * If no "attnum" was given, we are looking for sequences with an
+		 * auto or internal dependency.
+		 * If "attnum" was given, only look for sequences with an internal
+		 * dependency, because that is what we need for identity columns.
+		 * (We need the relkind test because indexes can also have auto
+		 * dependencies on columns.)
 		 */
 		if (deprec->classid == RelationRelationId &&
 			deprec->objsubid == 0 &&
 			deprec->refobjsubid != 0 &&
-			(deprec->deptype == DEPENDENCY_AUTO || deprec->deptype == DEPENDENCY_INTERNAL) &&
+			((!attnum && deprec->deptype == DEPENDENCY_AUTO) ||
+			 deprec->deptype == DEPENDENCY_INTERNAL) &&
 			get_rel_relkind(deprec->objid) == RELKIND_SEQUENCE)
 		{
 			result = lappend_oid(result, deprec->objid);
-- 
2.20.1



Re: pgsql: Allow insert and update tuple routing and COPY for foreign table

2019-04-25 Thread Laurenz Albe
On Thu, 2019-04-25 at 22:17 +0900, Etsuro Fujita wrote:
> > The documentation of ExecForeignInsert should mention something like:
> >
> >ExecForeignInsert is called for INSERT statements as well
> >as for COPY FROM and tuples that are inserted into a foreign table
> >because it is a partition of a partitioned table.
> >
> >In the case of a normal INSERT, BeginForeignModify will be called
> >before ExecForeignInsert to perform any necessary setup.
> >In the other cases, this setup has to happen in BeginForeignInsert.
> 
> These seem a bit redundant to me [...]
> 
> OK, how about something like the attached?  I reworded this a bit, though.

I like your patch better than my wording.

Thanks for the effort!

Yours,
Laurenz Albe





Re: Identity columns should own only one sequence

2019-04-26 Thread Laurenz Albe
On Thu, 2019-04-25 at 09:55 +0900, Michael Paquier wrote:
> On Sun, Apr 14, 2019 at 05:51:47PM +0200, Laurenz Albe wrote:
> > test=> INSERT INTO ser (id) VALUES (DEFAULT);
> > ERROR:  more than one owned sequence found
> 
> Yes this should never be user-triggerable, so it seems that we need to
> fix and back-patch something if possible.
> 
> > I propose that we check if there already is a dependent sequence
> > before adding an identity column.
> 
> That looks awkward.  Shouldn't we make sure that when dropping the
> default associated with a serial column then the dependency between
> the column and the sequence is removed instead?  This implies more
> complication in ATExecColumnDefault().
> 
> > The attached patch does that, and also forbids setting the ownership
> > of a sequence to an identity column.
> > 
> > I think this should be backpatched.
> 
> Could you add some test cases with what you think is adapted?

You are right!  Dropping the dependency with the DEFAULT is the
cleanest approach.

I have left the checks to prevent double sequence ownership from
happening.

I added regression tests covering all added code.

Yours,
Laurenz Albe

From 36bcd47cf0aaf88cd6ca84ad68b246063b838488 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Fri, 26 Apr 2019 12:39:16 +0200
Subject: [PATCH] Fix interaction of owned sequences and identity columns

If an identity column owned more than one sequence, inserting the default
value would fail with "ERROR:  more than one owned sequence found".

This was particularly likely to happen in attempts to convert a "serial"
column to an identity column by first dropping the column default,
because doing that didn't remove the sequence ownership.  Not cool.

Hence we should make sure that this does not happen:
- remove all sequence ownerships from pg_depend if the column default
  value is dropped
- forbid turning a column that owns a sequence into an identity column
- forbid that identity columns become owners of a second sequence
---
 src/backend/catalog/heap.c | 10 -
 src/backend/catalog/pg_depend.c| 56 ++
 src/backend/commands/sequence.c| 20 +++--
 src/backend/commands/tablecmds.c   | 11 +
 src/include/catalog/dependency.h   |  1 +
 src/test/regress/expected/identity.out | 22 ++
 src/test/regress/sql/identity.sql  | 18 +
 7 files changed, 134 insertions(+), 4 deletions(-)

diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 6b77eff0af..26aadc8c8d 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -1723,6 +1723,9 @@ RemoveAttrDefault(Oid relid, AttrNumber attnum,
 		object.objectId = attrtuple->oid;
 		object.objectSubId = 0;
 
+		/* renounce ownership of all sequences */
+		disownSequences(relid, attnum);
+
 		performDeletion(&object, behavior,
 		internal ? PERFORM_DELETION_INTERNAL : 0);
 
@@ -1778,7 +1781,12 @@ RemoveAttrDefaultById(Oid attrdefId)
 	/* Get an exclusive lock on the relation owning the attribute */
 	myrel = relation_open(myrelid, AccessExclusiveLock);
 
-	/* Now we can delete the pg_attrdef row */
+	/*
+	 * Now we can delete the pg_attrdef row.
+	 * Note that we don't have to explicitly disown any sequences OWNED BY
+	 * the column here:  This function is only called if the column is
+	 * dropped, and that will remove the sequence and the dependency anyway.
+	 */
 	CatalogTupleDelete(attrdef_rel, &tuple->t_self);
 
 	systable_endscan(scan);
diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index d63bf5e56d..7cb9ac3ea8 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -721,6 +721,62 @@ getOwnedSequence(Oid relid, AttrNumber attnum)
 	return linitial_oid(seqlist);
 }
 
+/*
+ * Remove dependencies between an attribute and sequences OWNED BY it.
+ */
+void
+disownSequences(Oid relid, AttrNumber attnum)
+{
+	Relation	depRel;
+	ScanKeyData key[3];
+	SysScanDesc scan;
+	HeapTuple	tup;
+
+	if (attnum <= 0)
+		elog(ERROR, "cannot remove sequence ownership for system columns");
+
+	depRel = table_open(DependRelationId, RowExclusiveLock);
+
+	ScanKeyInit(&key[0],
+Anum_pg_depend_refclassid,
+BTEqualStrategyNumber, F_OIDEQ,
+ObjectIdGetDatum(RelationRelationId));
+	ScanKeyInit(&key[1],
+Anum_pg_depend_refobjid,
+BTEqualStrategyNumber, F_OIDEQ,
+ObjectIdGetDatum(relid));
+	ScanKeyInit(&key[2],
+Anum_pg_depend_refobjsubid,
+BTEqualStrategyNumber, F_INT4EQ,
+Int32GetDatum(attnum));
+
+	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
+			  NULL, 3, key);
+
+	while (HeapTupleIsValid(tup = systable_getnext(scan)))
+	{
+		Form_pg_depend deprec = (Form_pg_depend) GETSTRUCT(tup);
+
+		/*
+		 * We assume 

Re: Identity columns should own only one sequence

2019-04-26 Thread Laurenz Albe
On Fri, 2019-04-26 at 15:23 +0200, Peter Eisentraut wrote:
> > So if one tries to convert a "serial" column to an identity column,
> > the following can happen:
> > 
> > test=> CREATE TABLE ser(id serial);
> > CREATE TABLE
> > test=> ALTER TABLE ser ALTER id ADD GENERATED ALWAYS AS IDENTITY;
> > ERROR:  column "id" of relation "ser" already has a default value
> > 
> > Hm, ok, let's drop the column default value.
> > 
> > test=> ALTER TABLE ser ALTER id DROP DEFAULT;
> > ALTER TABLE
> > 
> > Now it works:
> > 
> > test=> ALTER TABLE ser ALTER id ADD GENERATED ALWAYS AS IDENTITY;
> > ALTER TABLE
> > 
> > But not very much:
> > 
> > test=> INSERT INTO ser (id) VALUES (DEFAULT);
> > ERROR:  more than one owned sequence found
> 
> You also need to run
> 
> ALTER SEQUENCE ser_id_seq OWNED BY NONE;
> 
> because dropping the default doesn't release the linkage of the sequence
> with the table.  These are just weird artifacts of how serial is
> implemented, but that's why identity columns were added to improve
> things.  I don't think we need to make things more complicated here.

What do you think of the patch I just posted on this thread to
remove ownership automatically when the default is dropped, as Michael
suggested?  I think that would make things much more intuitive from
the user's perspective.

Correct me if I am wrong, but the sequence behind identity columns
should be an implementation detail that the user doesn't have to know about.
So the error message about "owned sequences" is likely to confuse users.

I have had a report by a confused user, so I think the problem is real.

Yours,
Laurenz Albe





Re: Identity columns should own only one sequence

2019-04-29 Thread Laurenz Albe
On Sat, 2019-04-27 at 14:16 +0200, Peter Eisentraut wrote:
> On 2019-04-26 15:37, Laurenz Albe wrote:
> > What do you think of the patch I just posted on this thread to
> > remove ownership automatically when the default is dropped, as Michael
> > suggested?  I think that would make things much more intuitive from
> > the user's perspective.
> 
> I think that adds more nonstandard behavior on top of an already
> confusing and obsolescent feature, so I can't get too excited about it.
> 
> A more forward-looking fix would be your other idea of having
> getOwnedSequence() only deal with identity sequences (not serial
> sequences).  See attached patch for a draft.

That looks good to me.

I agree that slapping on black magic that appropriates a pre-existing
owned sequence seems out of proportion.

I still think that there is merit to Michael's idea of removing
sequence "ownership" (which is just a dependency) when the DEFAULT
on the column is dropped, but this approach is possibly cleaner.

Yours,
Laurenz Albe





Bad canonicalization for dateranges with 'infinity' bounds

2019-05-02 Thread Laurenz Albe
https://www.postgresql.org/docs/current/rangetypes.html#RANGETYPES-INFINITE has:

   Also, some element types have a notion of “infinity”, but that is just
   another value so far as the range type mechanisms are concerned.
   For example, in timestamp ranges, [today,] means the same thing as [today,).
   But [today,infinity] means something different from [today,infinity) —
   the latter excludes the special timestamp value infinity.

This does not work as expected for ranges with discrete base types,
notably daterange:

test=> SELECT '[2000-01-01,infinity]'::daterange;
   daterange   
---
 [2000-01-01,infinity)
(1 row)

test=> SELECT '(-infinity,2000-01-01)'::daterange;
   daterange

 [-infinity,2000-01-01)
(1 row)

This is because "daterange_canonical" makes no difference for 'infinity',
and adding one to infinity does not change the value.
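
That is easy to see at the SQL level:

test=> SELECT 'infinity'::date + 1;
 ?column? 
----------
 infinity
(1 row)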

I propose the attached patch which fixes the problem.

Yours,
Laurenz Albe





Re: Bad canonicalization for dateranges with 'infinity' bounds

2019-05-02 Thread Laurenz Albe
I wrote:
> I propose the attached patch which fixes the problem.

I forgot to attach the patch.  Here it is.

Yours,
Laurenz Albe
From 6bbad0acf3baae3a08d1f911b7017642c8a8afe9 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Thu, 2 May 2019 14:32:27 +0200
Subject: [PATCH] Don't canonicalize dateranges with 'infinity' bounds

Since adding one to infinity doesn't change the value, canonicalization
of dateranges with an explicit '-infinity' as exclusive lower bound
would make it an inclusive bound, while dateranges with 'infinity' as
inclusive upper bound would turn it into an exclusive bound.
---
 src/backend/utils/adt/rangetypes.c   |  4 ++--
 src/test/regress/expected/rangetypes.out | 24 
 src/test/regress/sql/rangetypes.sql  |  4 
 3 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index 72c450c70e..a98f17bb66 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -1431,13 +1431,13 @@ daterange_canonical(PG_FUNCTION_ARGS)
 	if (empty)
 		PG_RETURN_RANGE_P(r);
 
-	if (!lower.infinite && !lower.inclusive)
+	if (!(lower.infinite || DATE_NOT_FINITE(lower.val)) && !lower.inclusive)
 	{
 		lower.val = DirectFunctionCall2(date_pli, lower.val, Int32GetDatum(1));
 		lower.inclusive = true;
 	}
 
-	if (!upper.infinite && upper.inclusive)
+	if (!(upper.infinite || DATE_NOT_FINITE(upper.val)) && upper.inclusive)
 	{
 		upper.val = DirectFunctionCall2(date_pli, upper.val, Int32GetDatum(1));
 		upper.inclusive = false;
diff --git a/src/test/regress/expected/rangetypes.out b/src/test/regress/expected/rangetypes.out
index accf1e0d9e..60d875e898 100644
--- a/src/test/regress/expected/rangetypes.out
+++ b/src/test/regress/expected/rangetypes.out
@@ -652,6 +652,30 @@ select daterange('2000-01-10'::date, '2000-01-11'::date, '(]');
  [01-11-2000,01-12-2000)
 (1 row)
 
+select daterange('-infinity'::date, '2000-01-01'::date, '()');
+   daterange
+
+ (-infinity,01-01-2000)
+(1 row)
+
+select daterange('-infinity'::date, '2000-01-01'::date, '[)');
+   daterange
+
+ [-infinity,01-01-2000)
+(1 row)
+
+select daterange('2000-01-01'::date, 'infinity'::date, '[)');
+   daterange   
+---
+ [01-01-2000,infinity)
+(1 row)
+
+select daterange('2000-01-01'::date, 'infinity'::date, '[]');
+   daterange   
+---
+ [01-01-2000,infinity]
+(1 row)
+
 -- test GiST index that's been built incrementally
 create table test_range_gist(ir int4range);
 create index test_range_gist_idx on test_range_gist using gist (ir);
diff --git a/src/test/regress/sql/rangetypes.sql b/src/test/regress/sql/rangetypes.sql
index 55638a85ee..9fdb1953df 100644
--- a/src/test/regress/sql/rangetypes.sql
+++ b/src/test/regress/sql/rangetypes.sql
@@ -165,6 +165,10 @@ select daterange('2000-01-10'::date, '2000-01-20'::date, '(]');
 select daterange('2000-01-10'::date, '2000-01-20'::date, '()');
 select daterange('2000-01-10'::date, '2000-01-11'::date, '()');
 select daterange('2000-01-10'::date, '2000-01-11'::date, '(]');
+select daterange('-infinity'::date, '2000-01-01'::date, '()');
+select daterange('-infinity'::date, '2000-01-01'::date, '[)');
+select daterange('2000-01-01'::date, 'infinity'::date, '[)');
+select daterange('2000-01-01'::date, 'infinity'::date, '[]');
 
 -- test GiST index that's been built incrementally
 create table test_range_gist(ir int4range);
-- 
2.20.1



Re: Identity columns should own only one sequence

2019-05-02 Thread Laurenz Albe
On Thu, 2019-05-02 at 22:43 +0200, Peter Eisentraut wrote:
> On 2019-04-29 18:28, Laurenz Albe wrote:
> > I still think that there is merit to Michael's idea of removing
> > sequence "ownership" (which is just a dependency) when the DEFAULT
> > on the column is dropped, but this approach is possibly cleaner.
> 
> I think the proper way to address this would be to create some kind of
> dependency between the sequence and the default.

That is certainly true.  But that's hard to retrofit into existing databases,
so it would probably be a modification that is not backpatchable.

Yours,
Laurenz Albe





Re: Identity columns should own only one sequence

2019-05-08 Thread Laurenz Albe
On Tue, 2019-05-07 at 13:06 +0900, Michael Paquier wrote:
> On Fri, May 03, 2019 at 08:14:35AM +0200, Laurenz Albe wrote:
> > On Thu, 2019-05-02 at 22:43 +0200, Peter Eisentraut wrote:
> >> I think the proper way to address this would be to create some kind of
> >> dependency between the sequence and the default.
> > 
> > That is certainly true.  But that's hard to retrofit into existing 
> > databases,
> > so it would probably be a modification that is not backpatchable.
> 
> And this is basically already the dependency which exists between the
> sequence and the relation created with the serial column.  So what's
> the advantage of adding more dependencies if we already have what we
> need?  I still think that we should be more careful to drop the
> dependency between the sequence and the relation's column if dropping
> the default using it.  If a DDL defines first a sequence, and then a
> default expression using nextval() on a column, then no serial-related

I believe we should have both:

- Identity columns should only use sequences with an INTERNAL dependency,
  as in Peter's patch.

- When a column default is dropped, remove all dependencies between the
  column and sequences.

In the spirit of moving this along, I have attached a patch which is
Peter's patch from above with a regression test.

Yours,
Laurenz Albe
From 4280a5251320d2051ada7aa1888ba20a610efa94 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Wed, 8 May 2019 16:42:10 +0200
Subject: [PATCH] XXX Draft with regression tests XXX

---
 src/backend/catalog/pg_depend.c| 26 +++---
 src/backend/commands/tablecmds.c   |  2 +-
 src/backend/parser/parse_utilcmd.c | 12 +---
 src/backend/rewrite/rewriteHandler.c   |  2 +-
 src/include/catalog/dependency.h   |  2 +-
 src/test/regress/expected/identity.out |  5 +
 src/test/regress/sql/identity.sql  |  7 +++
 7 files changed, 39 insertions(+), 17 deletions(-)

diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index f7caedcc02..63c94cfbe6 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -707,8 +707,8 @@ sequenceIsOwned(Oid seqId, char deptype, Oid *tableId, int32 *colId)
  * Collect a list of OIDs of all sequences owned by the specified relation,
  * and column if specified.
  */
-List *
-getOwnedSequences(Oid relid, AttrNumber attnum)
+static List *
+getOwnedSequences_internal(Oid relid, AttrNumber attnum, char deptype)
 {
 	List	   *result = NIL;
 	Relation	depRel;
@@ -750,7 +750,8 @@ getOwnedSequences(Oid relid, AttrNumber attnum)
 			(deprec->deptype == DEPENDENCY_AUTO || deprec->deptype == DEPENDENCY_INTERNAL) &&
 			get_rel_relkind(deprec->objid) == RELKIND_SEQUENCE)
 		{
-			result = lappend_oid(result, deprec->objid);
+			if (!deptype || deprec->deptype == deptype)
+				result = lappend_oid(result, deprec->objid);
 		}
 	}
 
@@ -761,18 +762,29 @@ getOwnedSequences(Oid relid, AttrNumber attnum)
 	return result;
 }
 
+List *
+getOwnedSequences(Oid relid, AttrNumber attnum)
+{
+	return getOwnedSequences_internal(relid, attnum, 0);
+}
+
 /*
- * Get owned sequence, error if not exactly one.
+ * Get owned identity sequence, error if not exactly one.
  */
 Oid
-getOwnedSequence(Oid relid, AttrNumber attnum)
+getIdentitySequence(Oid relid, AttrNumber attnum, bool missing_ok)
 {
-	List	   *seqlist = getOwnedSequences(relid, attnum);
+	List	   *seqlist = getOwnedSequences_internal(relid, attnum, DEPENDENCY_INTERNAL);
 
 	if (list_length(seqlist) > 1)
 		elog(ERROR, "more than one owned sequence found");
 	else if (list_length(seqlist) < 1)
-		elog(ERROR, "no owned sequence found");
+	{
+		if (missing_ok)
+			return InvalidOid;
+		else
+			elog(ERROR, "no owned sequence found");
+	}
 
 	return linitial_oid(seqlist);
 }
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 8e4743d110..8240fec578 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -6607,7 +6607,7 @@ ATExecDropIdentity(Relation rel, const char *colName, bool missing_ok, LOCKMODE
 	table_close(attrelation, RowExclusiveLock);
 
 	/* drop the internal sequence */
-	seqid = getOwnedSequence(RelationGetRelid(rel), attnum);
+	seqid = getIdentitySequence(RelationGetRelid(rel), attnum, false);
 	deleteDependencyRecordsForClass(RelationRelationId, seqid,
 	RelationRelationId, DEPENDENCY_INTERNAL);
 	CommandCounterIncrement();
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index 4564c0ae81..a3c5b005d5 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -1083,7 +1083,7 @@ transformTableLikeClause(CreateStmtContext *cxt, TableLikeClause *table_like_cla
 			 * find sequence owned by old column; extract se

Re: integrate Postgres Users Authentication with our own LDAP Server

2019-05-08 Thread Laurenz Albe
On Thu, 2019-05-09 at 04:51 +, M Tarkeshwar Rao wrote:
> We would need to integrate Postgres Users Authentication with our own LDAP 
> Server.  
>  
> Basically as of now we are able to login to Postgress DB with a user/password 
> credential.
>
> [roles "pg_signal_backend" and "postgres"]
>  
> These user objects are the part of Postgres DB server. Now we want that these 
> users should be authenticated by LDAP server.
> We would want the authentication to be done with LDAP, so basically the user
> credentials should be stored in the LDAP server
>  
> Can you mention the prescribed steps in Postgres needed for this integration 
> with LDAP Server?

LDAP authentication is well documented:
https://www.postgresql.org/docs/current/auth-ldap.html

But I don't think you are on the right track.

"pg_signal_backend" cannot login, it is a role to which you add a login user
to give it certain privileges.  So you don't need to authenticate the role.

"postgres" is the installation superuser.  If security is important for you,
you won't set a password for that user and you won't allow remote logins
with that user.

But for your application users LDAP authentication is a fine thing, and not
hard to set up if you know a little bit about LDAP.
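
For example, a "search+bind" configuration in pg_hba.conf could look like
this (server name and base DN are placeholders, of course):

host  all  all  0.0.0.0/0  ldap ldapserver=ldap.example.com ldapbasedn="dc=example,dc=com" ldapsearchattribute=uid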

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Subplan result caching

2018-05-23 Thread Laurenz Albe
Heikki Linnakangas wrote:
> I've been working on a patch to add a little cache to SubPlans, to speed 
> up queries with correlated subqueries, where the same subquery is 
> currently executed multiple times with the same parameters. The idea is 
> to cache the result of the subplan, with the correlation vars as the 
> cache key.

I like the idea.

I think the cache should be limited in size, perhaps by work_mem.
Also, there probably should be a GUC for this, defaulting to "off".

Yours,
Laurenz Albe



Locks on unlogged tables are locked?!

2018-05-24 Thread Laurenz Albe
While looking at this:
https://stackoverflow.com/q/50413623/6464308
I realized that "LOCK TABLE " puts a
Standby/LOCK into the WAL stream, which causes a log flush
at COMMIT time.

That hurts performance, and I doubt that it is necessary.
At any rate, DROP TABLE on an unlogged table logs nothing.
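
A minimal script to trigger the flush:

CREATE UNLOGGED TABLE unlogged_tab (id integer);
BEGIN;
LOCK TABLE unlogged_tab;  -- takes ACCESS EXCLUSIVE, emits a Standby/LOCK record
COMMIT;                   -- has to wait for the WAL flush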

The attached patch would take care of it, but I'm not sure
if that's the right place to check.

Yours,
Laurenz Albe
From d474e7b41298944e43bb9141eb33adbdd9ea1098 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Tue, 22 May 2018 18:13:31 +0200
Subject: [PATCH] Don't log locks on unlogged tables

---
 src/backend/storage/lmgr/lock.c | 28 +---
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index dc3d8d9817..70cac47ab3 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -37,6 +37,7 @@
 #include "access/twophase_rmgr.h"
 #include "access/xact.h"
 #include "access/xlog.h"
+#include "catalog/pg_class.h"
 #include "miscadmin.h"
 #include "pg_trace.h"
 #include "pgstat.h"
@@ -47,6 +48,7 @@
 #include "storage/standby.h"
 #include "utils/memutils.h"
 #include "utils/ps_status.h"
+#include "utils/rel.h"
 #include "utils/resowner_private.h"
 
 
@@ -1041,13 +1043,25 @@ LockAcquireExtended(const LOCKTAG *locktag,
 	 */
 	if (log_lock)
 	{
-		/*
-		 * Decode the locktag back to the original values, to avoid sending
-		 * lots of empty bytes with every message.  See lock.h to check how a
-		 * locktag is defined for LOCKTAG_RELATION
-		 */
-		LogAccessExclusiveLock(locktag->locktag_field1,
-			   locktag->locktag_field2);
+		bool unlogged_rel = false;
+
+		if (locktag->locktag_type == LOCKTAG_RELATION)
+		{
+			Relation r = RelationIdGetRelation(locktag->locktag_field2);
+			unlogged_rel = r->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED;
+			RelationClose(r);
+		}
+
+		if (!unlogged_rel)
+		{
+			/*
+			 * Decode the locktag back to the original values, to avoid sending
+			 * lots of empty bytes with every message.  See lock.h to check how a
+			 * locktag is defined for LOCKTAG_RELATION
+			 */
+			LogAccessExclusiveLock(locktag->locktag_field1,
+   locktag->locktag_field2);
+		}
 	}
 
 	return LOCKACQUIRE_OK;
-- 
2.14.3



Re: PGXS fails to byte-compile pgsql-ogr-fdw

2018-05-28 Thread Laurenz Albe
Christoph Berg wrote:
> pgsql-ogr-fdw fails to build against PG 11beta1 with JIT enabled. I
> just reported this as https://github.com/pramsey/pgsql-ogr-fdw/issues/153,
> but I think the problem might actually be in the PGXS Makefile - it
> assumes that all objects have a .c file to build the .bc from.

I have been bitten by that when building PostGIS.

A simple patch could fix it, but I agree that it would be better
to not build them.

Yours,
Laurenz Albe



Loaded footgun open_datasync on Windows

2018-06-01 Thread Laurenz Albe
I recently read our documentation about reliability on Windows:

> On Windows, if wal_sync_method is open_datasync (the default), write caching
> can be disabled by unchecking
> My Computer\Open\disk drive\Properties\Hardware\Properties\Policies\Enable write caching
> on the disk. Alternatively, set wal_sync_method to fsync or fsync_writethrough,
> which prevent write caching.

It seems dangerous to me to initialize "wal_sync_method" to a method that is 
unsafe
by default.  Admittedly I am not a Windows man, but the fact that this has 
eluded me
up to now leads me to believe that other people running PostgreSQL on Windows 
might
also have missed that important piece of advice and are consequently running 
with
an unsafe setup.

Wouldn't it be smarter to set a different default value on Windows, like we do
on Linux (for other reasons)?

I am worried that there might be loads of Windows installations out there
happily running productive databases with an unsafe setup, so I'd even
suggest backpatching such a change.
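
Until that is decided, the workaround from the documentation quoted above is
a one-line configuration change, for example:

ALTER SYSTEM SET wal_sync_method = 'fsync_writethrough';
SELECT pg_reload_conf();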

Yours,
Laurenz Albe



Re: Loaded footgun open_datasync on Windows

2018-06-01 Thread Laurenz Albe
Amit Kapila wrote:
> On Fri, Jun 1, 2018 at 3:13 PM, Laurenz Albe  wrote:
> > I recently read our documentation about reliability on Windows:
> > 
> > > On Windows, if wal_sync_method is open_datasync (the default), write 
> > > caching can
> > > be disabled by unchecking
> > > My Computer\Open\disk 
> > > drive\Properties\Hardware\Properties\Policies\Enable write caching
> > > on the disk. Alternatively, set wal_sync_method to fsync or 
> > > fsync_writethrough,
> > > which prevent write caching.
> > 
> > It seems dangerous to me to initialize "wal_sync_method" to a method that 
> > is unsafe
> > by default.  Admittedly I am not a Windows man, but the fact that this has 
> > eluded me
> > up to now leads me to believe that other people running PostgreSQL on 
> > Windows might
> > also have missed that important piece of advice and are consequently 
> > running with
> > an unsafe setup.
> > 
> > Wouldn't it be smarter to set a different default value on Windows, like we 
> > do on
> > Linux (for other reasons)?
> > 
> 
> One thing to note is that it seems that in code we use 
> FILE_FLAG_WRITE_THROUGH for
> open_datasync which according to MSDN [1] will bypass any intermediate cache .
> See pgwin32_open.  Have you experimented to set any other option as we have a 
> comment
> in code which say Win32 only has O_DSYNC?
> 
> 
> [1] - 
> https://msdn.microsoft.com/en-us/library/windows/desktop/aa363858(v=vs.85).aspx

After studying the code I feel somewhat safer; it looks like the code is ok.
I have no Windows at hand, so I cannot test right now.

What happened is that I ran "pg_test_fsync" at a customer site on Windows, and
it returned ridiculously high rates for open_datasync.

So I think that the following should be fixed:

- Change pg_test_fsync to actually use FILE_FLAG_WRITE_THROUGH.
  Currently it uses O_DSYNC, which is defined in port/win32_port.h as

  /*
   * Supplement to <fcntl.h>.
   * This is the same value as _O_NOINHERIT in the MS header file. This is
   * to ensure that we don't collide with a future definition. It means
   * we cannot use _O_NOINHERIT ourselves.
   */
  #define O_DSYNC 0x0080

- Change the documentation so that it does not claim that open_datasync is 
unsafe
  unless you change the device settings.

If there is a consensus that this is fine, I can do the legwork.

Yours,
Laurenz Albe



Re: Loaded footgun open_datasync on Windows

2018-06-05 Thread Laurenz Albe
Amit Kapila wrote:
> On Fri, Jun 1, 2018 at 8:18 PM, Laurenz Albe  wrote:
> > Amit Kapila wrote:
> > > On Fri, Jun 1, 2018 at 3:13 PM, Laurenz Albe  
> > > wrote:
> > > > I recently read our documentation about reliability on Windows:
> > > > 
> > > > > On Windows, if wal_sync_method is open_datasync (the default), write 
> > > > > caching can
> > > > > be disabled by unchecking
> > > > > My Computer\Open\disk 
> > > > > drive\Properties\Hardware\Properties\Policies\Enable write caching
> > > > > on the disk. Alternatively, set wal_sync_method to fsync or 
> > > > > fsync_writethrough,
> > > > > which prevent write caching.
> > > > 
> > > > It seems dangerous to me to initialize "wal_sync_method" to a method 
> > > > that is unsafe
> > > > by default.  Admittedly I am not a Windows man, but the fact that this 
> > > > has eluded me
> > > > up to now leads me to believe that other people running PostgreSQL on 
> > > > Windows might
> > > > also have missed that important piece of advice and are consequently 
> > > > running with
> > > > an unsafe setup.
> > > > 
> > > > Wouldn't it be smarter to set a different default value on Windows, 
> > > > like we do on
> > > > Linux (for other reasons)?
> > > > 
> > > 
> > > One thing to note is that it seems that in code we use 
> > > FILE_FLAG_WRITE_THROUGH for
> > > open_datasync which according to MSDN [1] will bypass any intermediate 
> > > cache .
> > > See pgwin32_open.  Have you experimented to set any other option as we 
> > > have a comment
> > > in code which say Win32 only has O_DSYNC?
> > > 
> > > 
> > > [1] - 
> > > https://msdn.microsoft.com/en-us/library/windows/desktop/aa363858(v=vs.85).aspx
> > 
> > After studying the code I feel somewhat safer; it looks like the code is ok.
> > I have no Windows at hand, so I cannot test right now.
> > 
> > What happened is that I ran "pg_test_fsync" at a customer site on Windows, 
> > and
> > it returned ridiculously high rates for open_datasync.
> > 
> > So I think that the following should be fixed:
> > 
> > - Change pg_test_fsync to actually use FILE_FLAG_WRITE_THROUGH.
> 
> It sounds sensible to me to make a Windows specific change in pg_test_fsync 
> for open_datasync method.
> That will make pg_test_fsync behave similar to server.

The attached patch makes pg_test_fsync use pgwin32_open on Windows, which is
what we use elsewhere.  That should fix the problem.
Is there a better way to do this?  The problem is that "c.h" is only included
at the very end of "postgres_fe.h", which makes front-end code use "open"
rather than "pgwin32_open" on Windows.

Having read it again, I think that the documentation is fine as it is:
after all, this is just advice on what you can do if you are running on
unsafe hardware that doesn't flush to disk like it should.

Yours,
Laurenz Albe
From 6ab651c48df66ec93b8d6730bebeaf60e2bb865b Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Tue, 5 Jun 2018 16:05:10 +0200
Subject: [PATCH] Make pg_test_fsync use pgwin32_open on Windows

---
 src/bin/pg_test_fsync/pg_test_fsync.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/bin/pg_test_fsync/pg_test_fsync.c b/src/bin/pg_test_fsync/pg_test_fsync.c
index e6f7ef8557..a0139a1302 100644
--- a/src/bin/pg_test_fsync/pg_test_fsync.c
+++ b/src/bin/pg_test_fsync/pg_test_fsync.c
@@ -3,6 +3,8 @@
  *		tests all supported fsync() methods
  */
 
+/* we have to include c.h first so that pgwin32_open is used on Windows */
+#include "c.h"
 #include "postgres_fe.h"
 
 #include 
-- 
2.14.3



Re: Loaded footgun open_datasync on Windows

2018-06-07 Thread Laurenz Albe
Amit Kapila wrote:
> On Wed, Jun 6, 2018 at 3:06 PM, Kuntal Ghosh  
> wrote:
> > It seems the "#ifndef FRONTEND" restriction was added around
> > pgwin32_open() for building libpq with Visual C++ or Borland C++.  The
> > restriction was added in commit 422d4819 to build libpq with VC++[1].
> > Later, in commit fd7c3f67e0bc4, the support for Borland C++ was also
> > added.
> > 
> > So, I'm not sure whether removing that restriction will work for all
> > front-end modules.
> > 
> 
> Thanks for doing investigation.  I agree with you that it appears from the old
> discussion that we have added this restriction to build libpq/psql (basically 
> frontend)
> modules.  However, I tried to build those modules on windows by removing that
> restriction and it doesn't give me any problem (except one extra argument 
> error,
> which can be easily resolved).  It might be that it gives problems with
> certain versions of MSVC.  I think if we want to pursue this direction, one
> needs to verify on various supported versions (of Windows) to see if things
> are okay.

That's what the buildfarm is for, right?

> Speaking about the issue at hand, one way to make pg_test_fsync work is to
> just include c.h and remove the inclusion of postgres_fe.h, but I am not
> sure if that is the best way.
> I am not sure if we have any other options to proceed in this case.
> 
> Thoughts?

I thought of a better way.
pg_test_fsync is properly listed as a server application in the documentation,
so I think it should be built as such, as per the attached patch.

Yours,
Laurenz Albe

From 032a0463072097e489f912337ea08f7373a4f107 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Fri, 8 Jun 2018 04:07:55 +0200
Subject: [PATCH] Build pg_test_fsync as backend code

Including "postgres.h" rather than "postgres_fe.h" makes
pg_test_fsync use pgwin32_open() on Windows, which is the
proper thing to do because it should test the behavior on
the backend side.
---
 src/bin/pg_test_fsync/pg_test_fsync.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/bin/pg_test_fsync/pg_test_fsync.c b/src/bin/pg_test_fsync/pg_test_fsync.c
index e6f7ef8557..c1e1d7ef38 100644
--- a/src/bin/pg_test_fsync/pg_test_fsync.c
+++ b/src/bin/pg_test_fsync/pg_test_fsync.c
@@ -3,7 +3,7 @@
  *		tests all supported fsync() methods
  */
 
-#include "postgres_fe.h"
+#include "postgres.h"
 
 #include 
 #include 
-- 
2.14.4



Re: SHOW ALL does not honor pg_read_all_settings membership

2018-06-10 Thread Laurenz Albe
Alvaro Herrera wrote:
> On 2018-Mar-01, Laurenz Albe wrote:
> 
> > I noticed that commit 25fff40798fc4ac11a241bfd9ab0c45c085e2212 forgot
> > to teach SHOW ALL to show all GUCs when the user belongs to 
> > pg_read_all_settings.
> > 
> > Patch attached; I think this should be backpatched.
> 
> Done, with the further fixes downthread.  Thanks!

Thank you!

Laurenz Albe




Re: Loaded footgun open_datasync on Windows

2018-06-20 Thread Laurenz Albe
Kuntal Ghosh wrote:
[pg_test_fsync doesn't use pgwin32_open and hence doesn't respect O_DSYNC]
> On Wed, Jun 6, 2018 at 2:39 PM, Amit Kapila  wrote:
> > On Wed, Jun 6, 2018 at 10:18 AM, Michael Paquier  
> > wrote:
> > > On Wed, Jun 06, 2018 at 09:58:34AM +0530, Amit Kapila wrote:
> > > > One another alternative could be that we
> > > > define open as pgwin32_open (for WIN32) wherever we need it.
> > > 
> > > Which is what basically happens on any *nix platform, are you foreseeing
> > > anything bad here?
> > 
> > Nothing apparent, but I think we should try to find out why at the first
> > place this has been made backend specific.
> 
> It seems the "#ifndef FRONTEND" restriction was added around
> pgwin32_open() for building libpq with Visual C++ or Borland C++.  The
> restriction was added in commit 422d4819 to build libpq with VC++[1].
> Later, in commit fd7c3f67e0bc4, the support for Borland C++ was also
> added.
> 
> So, I'm not sure whether removing that restriction will work for all
> front-end modules.

Thanks for the research, and sorry for my long silence.

If I remove the "#ifndef FRONTEND", building with MSVC fails for all
call sites that use the two-argument version of open(2).

If I use the three-argument version throughout, which should be no problem,
PostgreSQL builds just fine with MSVC.

How about checking what the buildfarm thinks about the attached?

Yours,
Laurenz Albe

From 5e10792cf720cedd6ebaa67cfc6163b9c5d5f305 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Wed, 20 Jun 2018 15:54:20 +0200
Subject: [PATCH] Use pgwin32_open in frontend code on Windows

This is particularly important for pg_test_fsync which does not handle
O_DSYNC correctly otherwise, leading to false claims that disks are
unsafe.

All call sites that use the two-argument form of open(2) were changed
to the three-argument form to make the Windows build succeed.
---
 src/bin/pg_basebackup/pg_receivewal.c | 2 +-
 src/bin/pg_verify_checksums/pg_verify_checksums.c | 2 +-
 src/common/file_utils.c   | 2 +-
 src/include/port.h| 3 ---
 4 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_receivewal.c b/src/bin/pg_basebackup/pg_receivewal.c
index 071b32d19d..b97fce3c2b 100644
--- a/src/bin/pg_basebackup/pg_receivewal.c
+++ b/src/bin/pg_basebackup/pg_receivewal.c
@@ -287,7 +287,7 @@ FindStreamingStart(uint32 *tli)
 
 			snprintf(fullpath, sizeof(fullpath), "%s/%s", basedir, dirent->d_name);
 
-			fd = open(fullpath, O_RDONLY | PG_BINARY);
+			fd = open(fullpath, O_RDONLY | PG_BINARY, 0);
 			if (fd < 0)
 			{
 fprintf(stderr, _("%s: could not open compressed file \"%s\": %s\n"),
diff --git a/src/bin/pg_verify_checksums/pg_verify_checksums.c b/src/bin/pg_verify_checksums/pg_verify_checksums.c
index 845d5aba27..efb32ffcb9 100644
--- a/src/bin/pg_verify_checksums/pg_verify_checksums.c
+++ b/src/bin/pg_verify_checksums/pg_verify_checksums.c
@@ -82,7 +82,7 @@ scan_file(char *fn, int segmentno)
 	int			f;
 	int			blockno;
 
-	f = open(fn, 0);
+	f = open(fn, 0, 0);
 	if (f < 0)
 	{
 		fprintf(stderr, _("%s: could not open file \"%s\": %s\n"),
diff --git a/src/common/file_utils.c b/src/common/file_utils.c
index 48876061c3..8a4cdb47ff 100644
--- a/src/common/file_utils.c
+++ b/src/common/file_utils.c
@@ -283,7 +283,7 @@ fsync_fname(const char *fname, bool isdir, const char *progname)
 	 * unsupported operations, e.g. opening a directory under Windows), and
 	 * logging others.
 	 */
-	fd = open(fname, flags);
+	fd = open(fname, flags, 0);
 	if (fd < 0)
 	{
 		if (errno == EACCES || (isdir && errno == EISDIR))
diff --git a/src/include/port.h b/src/include/port.h
index 74a9dc4d4d..2d40045ad5 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -249,11 +249,8 @@ extern bool rmtree(const char *path, bool rmtopdir);
 #define		O_DIRECT	0x8000
 extern int	pgwin32_open(const char *, int,...);
 extern FILE *pgwin32_fopen(const char *, const char *);
-
-#ifndef FRONTEND
 #define		open(a,b,c) pgwin32_open(a,b,c)
 #define		fopen(a,b) pgwin32_fopen(a,b)
-#endif
 
 /*
  * Mingw-w64 headers #define popen and pclose to _popen and _pclose.  We want
-- 
2.14.4



Re: Loaded footgun open_datasync on Windows

2018-06-25 Thread Laurenz Albe
Kuntal Ghosh wrote:
> > If I use the three-argument version throughout, which should be no problem,
> > PostgreSQL builds just fine with MSVC.
> > 
> 
> I don't have enough experience on MSVC/Windows to comment on the same.
> 
> > How about checking what the buildfarm thinks about the attached?
> 
> +1

I have added it to the July commitfest.

Yours,
Laurenz Albe



Re: Loaded footgun open_datasync on Windows

2018-06-27 Thread Laurenz Albe
Michael Paquier wrote:
> > I have added it to the July commitfest.
> 
> Have you looked at the possibility of removing the log file constraints
> in pg_upgrade with the change you are doing here so as things would be
> more consistent with non-Windows platforms, simplifying some code on the
> way?

Can you explain what the "log file constraints" are?
If it is in any way related, and I can handle it, sure.

Yours,
Laurenz Albe



Re: Libpq support to connect to standby server as priority

2018-07-04 Thread Laurenz Albe
Haribabu Kommi wrote:
> On Wed, Jan 24, 2018 at 9:01 AM Jing Wang  wrote:
> > Hi All,
> > 
> > Recently I put a proposal to support 'prefer-read' parameter in 
> > target_session_attrs in libpq. Now I updated the patch with adding content 
> > in the sgml and regression test case.
> > 
> > Some people may have noticed there is already another patch 
> > (https://commitfest.postgresql.org/15/1148/ ) which looks similar with 
> > this. But I would say this patch is more complex than my proposal. 
> > 
> > It is better separate these 2 patches to consider.
> 
> I also feel prefer-read and read-only options needs to take as two different 
> options.
> prefer-read is simple to support than read-only.
> 
> Here I attached an updated patch that is rebased to the latest master and also
> fixed some of the corner scenarios.

The patch applies, builds and passes "make check-world".

I think the "prefer-read" functionality is desirable: It is exactly what you 
need
if you want to use replication for load balancing, and your application supports
different database connections for reading and writing queries.

"read-only" does not have a clear use case in my opinion.

With the patch, PostgreSQL behaves as expected if I have a primary and a 
standby and run:

  psql "host=/tmp,/tmp port=5433,5434 target_session_attrs=prefer-read"

But if I stop the standby (port 5434), libpq goes into an endless loop.
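For reference, a simple way to check what kind of server a session ended
up on (these are standard functions, nothing from the patch):

  SELECT pg_is_in_recovery();    -- true on a streaming standby
  SHOW transaction_read_only;    -- the value that target_session_attrs inspects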

Concerning the code:

- The documentation needs some attention. Suggestion:

   If this parameter is set to prefer-read, connections
   where SHOW transaction_read_only returns off are 
preferred.
   If no such connection can be found, a connection that allows read-write
   transactions will be accepted.

- I think the construction with "read_write_host_index" makes the code even more
  complicated than it already is.

  What about keeping the first successful connection open and storing it in a
  variable if we are in "prefer-read" mode.
  If we get the read-only connection we desire, close that cached connection,
  otherwise use it.

Yours,
Laurenz Albe



Re: Libpq support to connect to standby server as priority

2018-07-16 Thread Laurenz Albe
Haribabu Kommi wrote:
> > On Wed, Jul 4, 2018 at 11:14 PM Laurenz Albe  
> > wrote:
> > > Haribabu Kommi wrote:
> > >  
> > > - I think the construction with "read_write_host_index" makes the code 
> > > even more
> > >   complicated than it already is.
> > > 
> > >   What about keeping the first successful connection open and storing it 
> > > in a
> > >   variable if we are in "prefer-read" mode.
> > >   If we get the read-only connection we desire, close that cached 
> > > connection,
> > >   otherwise use it.
> > 
> > Even if we add a variable to cache the connection, I don't think the
> > logic of checking the next host for a read-only server will change, but
> > the extra connection request to the read-write host will be removed.
> 
> I evaluated your suggestion of caching the connection and reusing it when
> no read-only server is found, but I am thinking that it will add more
> complexity and also delay the connections to the other servers; the cached
> connection may also be closed by the server because of a timeout.
> 
> I feel the extra time during connection may be fine if the user prefers
> the prefer-read mode, instead of adding more complexity in handling the
> cached connection.
> 
> Comments?

I tested the new patch, and it works as expected.

I don't think that time-out of the cached session is a valid concern, because 
that
would have to be a really short timeout.
On the other hand, establishing the connection twice (first to check if it is 
read-only,
then again because no read-only connection is found) can be quite costly.

But that is a matter of debate, as is the readability of the code.

Since I don't think I can contribute more to this patch, I'll mark it as
ready for committer.

Yours,
Laurenz Albe



Re: pg_receivewal documentation

2019-07-09 Thread Laurenz Albe
On Thu, 2019-06-27 at 10:06 -0400, Jesper Pedersen wrote:
> Here is a patch for the pg_receivewal documentation to highlight that 
> WAL isn't acknowledged to be applied.

I think it is a good idea to document this, but I have a few quibbles
with the patch as it is:

- I think there shouldn't be commas after the "note" and before the "if".
  Disclaimer: I am not a native speaker, so I am lacking authority.

- The assertion is wrong.  "on" (remote flush) is perfectly fine
  for synchronous_commit, only "remote_apply" is a problem.

- There is already something about "--synchronous" in the "Description"
  section.  It might make sense to add the additional information there.

How about the attached patch?

Yours,
Laurenz Albe
From c18b4b384a963e04cc5b5b50537c150858824f0a Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Tue, 9 Jul 2019 11:15:09 +0200
Subject: [PATCH] Better documentation for "pg_receivewal --synchronous"

"synchronous_commit" must not be set to "remote_apply" because
pg_receivewal doesn't apply WAL.
---
 doc/src/sgml/ref/pg_receivewal.sgml | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/pg_receivewal.sgml b/doc/src/sgml/ref/pg_receivewal.sgml
index 0506120c00..43932c 100644
--- a/doc/src/sgml/ref/pg_receivewal.sgml
+++ b/doc/src/sgml/ref/pg_receivewal.sgml
@@ -52,7 +52,10 @@ PostgreSQL documentation
Unlike the WAL receiver of a PostgreSQL standby server, pg_receivewal
by default flushes WAL data only when a WAL file is closed.
The option --synchronous must be specified to flush WAL data
-   in real time.
+   in real time.  Note that while WAL will be flushed with this setting,
+   it will never be applied, so  must
+   not be set to remote_apply if pg_receivewal
+   is the only synchronous standby.
   
 
   
-- 
2.20.1



Re: pg_receivewal documentation

2019-07-09 Thread Laurenz Albe
Jesper Pedersen wrote:
> Thanks for the review, and the changes.
> 
> However, I think it belongs in the --synchronous section, so what about
> moving it there as attached?

Works for me.

Marked as "ready for committer".

Yours,
Laurenz Albe





Re: pg_receivewal documentation

2019-07-10 Thread Laurenz Albe
On Wed, 2019-07-10 at 17:04 +0900, Michael Paquier wrote:
> Hmm.  synchronous_commit is user-settable, which means that it is
> possible to enforce a value in the connection string doing the
> connection.  Isn't that something we had better enforce directly in
> the tool?  In this case what could be fixed is GetConnection() which
> builds the connection string parameters.

I don't follow.

Are you talking about the replication connection from pg_receivewal
to the PostgreSQL server?  That wouldn't do anything, because it is
the setting of "synchronous_commit" for an independent client
connection that is the problem:

- pg_receivewal starts a replication connection.

- It is added to "synchronous_standby_names" on the server.

- A client connects. It sets "synchronous_commit" to "remote_apply".

- If the client modifies data, COMMIT will hang indefinitely,
  because pg_receivewal will never send confirmation that it has
  applied the changes.
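
A minimal reproduction of that hang, assuming pg_receivewal runs with
--synchronous and connects with application_name "pg_receivewal", which
is listed in "synchronous_standby_names" (table name hypothetical):

  SET synchronous_commit = remote_apply;
  INSERT INTO t VALUES (1);   -- the commit hangs, because pg_receivewal
                              -- never reports an applied LSN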

One alternative option I see is for pg_receivewal to confirm that
it has applied the changes as soon as it has flushed them.
It would be cheating somewhat, but it would work around the problem
in a way that few people would find surprising.

Yours,
Laurenz Albe





Re: PG 12 draft release notes

2019-07-15 Thread Laurenz Albe
On Sat, 2019-05-11 at 16:33 -0400, Bruce Momjian wrote:
> I have posted a draft copy of the PG 12 release notes here:

I wonder if commits 0ba06e0bf and 40cfe8606 are worth mentioning
in the release notes.  They make "pg_test_fsync" work correctly
on Windows for the first time.

Yours,
Laurenz Albe
diff --git a/doc/src/sgml/release-12.sgml b/doc/src/sgml/release-12.sgml
index 71f60d6961..e69d4955ce 100644
--- a/doc/src/sgml/release-12.sgml
+++ b/doc/src/sgml/release-12.sgml
@@ -2965,10 +2965,24 @@ Author: Andres Freund 
Require a C99-supported compiler, and MSVC
2013 or later on Windows (Andres Freund)
   
  
 
+ 
+
+
+  
+   Use the same functions to open files in frontend and backend code
+   on Windows, thus fixing the "open_datasync"
+   test in "pg_test_fsync" (Laurenz Albe)
+  
+ 
+
 
 

 



Re: pg_receivewal documentation

2019-07-16 Thread Laurenz Albe
On Tue, 2019-07-16 at 14:05 +0900, Michael Paquier wrote:
> >> How about
> >>  Note that while WAL will be flushed with this setting,
> >>  pg_receivewal never applies it, so
> >>   must not be set to
> >>  remote_apply if 
> >> pg_receivewal
> >>  is the only synchronous standby.
> 
> This is not true in all cases as since 9.6 it is possible to specify
> multiple synchronous standbys.  So if for example pg_receivewal and
> another synchronous standby are set in s_s_names and that the number
> of a FIRST (priority-based) or ANY (quorum set) is two, then the same
> issue exists, but this documentation is incorrect.  I think that we
> should have a more extensive wording  here, like "if pg_receivewal is
> part of a quorum-based or priority-based set of synchronous standbys."

I think this would be overly complicated.
The wording above seems to cover the priority-based case sufficiently
in my opinion.
Maybe a second sentence with more detail would be better:

  ... must not be set to remote_apply if
  pg_receivewal is the only synchronous standby.
  Similarly, if pg_receivewal is part of
  a quorum-based set of synchronous standbys, it won't count towards
  the quorum if  is set to
  remote_apply.
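
For concreteness, the kinds of configuration being talked about
(standby names hypothetical):

  ALTER SYSTEM SET synchronous_standby_names = 'pg_receivewal';
  ALTER SYSTEM SET synchronous_standby_names = 'ANY 1 (pg_receivewal, standby1)';
  SELECT pg_reload_conf();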

Yours,
Laurenz Albe





Re: pg_receivewal documentation

2019-07-16 Thread Laurenz Albe
On Wed, 2019-07-17 at 10:38 +0900, Michael Paquier wrote:
> +   
> +Note that while WAL will be flushed with this setting,
> +pg_receivewal never applies it, so
> + must not be set to
> +remote_apply if 
> pg_receivewal
> +is the only synchronous standby. Similarly, if
> +pg_receivewal is part of a quorum-based
> +set of synchronous standbys, it won't count towards the quorum if
> + is set to
> +remote_apply.
> +   
> 
> I think we should really document the caveat with priority-based sets
> of standbys as much as quorum-based sets.  For example if a user sets
> synchronous_commit = remote_apply in postgresql.conf, and then sets
> s_s_names to '2(pg_receivewal, my_connected_standby)' to get a
> priority-based set, then you have the same problem, and pg_receivewal
> is not the only synchronous standby in this configuration.  The patch
> does not cover that case properly.

I understand the concern, I'm just worried that too much accuracy may
render the sentence hard to read.

How about adding "or priority-based" after "quorum-based"?

Yours,
Laurenz Albe





Re: pg_receivewal documentation

2019-07-17 Thread Laurenz Albe
On Wed, 2019-07-17 at 13:59 -0400, Jesper Pedersen wrote:
> +   
> +Note that while WAL will be flushed with this setting,
> +pg_receivewal never applies it, so
> + must not be set to
> +remote_apply or on
> +if pg_receivewal is the only synchronous 
> standby.
> +Similarly, if pg_receivewal is part of a
> +priority-based synchronous replication setup 
> (FIRST),
> +or a quorum-based setup (ANY) it won't count 
> towards
> +the policy specified if  is
> +set to remote_apply or on.
> +   

That's factually wrong.  "on" (wait for WAL flush) works fine with
pg_receivewal, only "remote_apply" doesn't.

Ok, here's another attempt:

   Note that while WAL will be flushed with this setting,
   pg_receivewal never applies it, so
must not be set to
   remote_apply if pg_receivewal
   is the only synchronous standby.
   Similarly, it is no use adding pg_receivewal to a
   priority-based (FIRST) or a quorum-based
   (ANY) synchronous replication setup if
is set to 
remote_apply.

Yours,
Laurenz Albe





Re: Bad canonicalization for dateranges with 'infinity' bounds

2019-07-18 Thread Laurenz Albe
On Thu, 2019-07-18 at 13:56 -0700, Jeff Davis wrote:
> I went ahead and committed this using Thomas's suggestion to remove the
> parentheses.

Thanks for the review and the commit!

Yours,
Laurenz Albe





Re: pg_receivewal documentation

2019-07-18 Thread Laurenz Albe
On Fri, 2019-07-19 at 10:27 +0900, Michael Paquier wrote:
> On Thu, Jul 18, 2019 at 08:40:36AM -0400, Jesper Pedersen wrote:
> > On 7/18/19 1:29 AM, Michael Paquier wrote:
> > > Or more simply like that?
> > > "Note that while WAL will be flushed with this setting,
> > > pg_receivewal never applies it, so synchronous_commit must not be set
> > > to remote_apply if pg_receivewal is a synchronous standby, be it a
> > > member of a priority-based (FIRST) or a quorum-based (ANY) synchronous
> > > replication setup."
> > 
> > Yeah, better.
> 
> I was looking into committing that, and the part about
> synchronous_commit = on is not right.  The location of the warning is
> also harder to catch for the reader, so instead let's move it to the
> top where we have an extra description for --synchronous.  I am
> finishing with the attached that I would be fine to commit and
> back-patch as needed.  Does that sound fine?

It was my first reaction too that this had better be at the top.

I'm happy with the patch as it is.

Yours,
Laurenz Albe





Re: reloption to prevent VACUUM from truncating empty pages at the end of relation

2018-11-14 Thread Laurenz Albe
On Tue, 2018-04-17 at 15:09 -0400, Tom Lane wrote:
> Alvaro Herrera  writes:
> > Andres was working on a radix tree structure to fix this problem, but
> > that seems to be abandoned now, and it seems a major undertaking.  While
> > I agree that the proposed solution is a wart, it seems much better than
> > no solution at all.  Can we consider Fujii's proposal as a temporary
> > measure until we fix shared buffers?  I'm +1 on it myself.
> 
> Once we've introduced a user-visible reloption it's going to be
> practically impossible to get rid of it, so I'm -1.  I'd much rather
> see somebody put some effort into the radix-tree idea than introduce
> a kluge that we'll be stuck with, and that doesn't even provide a
> good user experience.  Disabling vacuum truncation is *not* something
> that I think we should recommend.

This new option would not only mitigate the long shared_buffers scan,
it would also get rid of the replication conflict caused by the
AccessExclusiveLock taken during truncation, which is discussed in
https://www.postgresql.org/message-id/c9374921e50a5e8fb1ecf04eb8c6ebc3%40postgrespro.ru
and seems to be a more difficult problem than anticipated.

Could that tip the scales in favor of this stop-gap?
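
For illustration, usage of such a reloption would presumably look like
this (the option name is my guess from the thread, the table name is
hypothetical):

  ALTER TABLE vactab SET (vacuum_truncate = off);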

FWIW, I have always considered heap truncation on VACUUM to be something
strange anyway.  VACUUM does not get rid of empty pages in the middle of
a relation, so why is it so important to do it at the end of the relation?
If the answer is "just because we can do it easily", then I think it would be
ok to disable the feature in cases where it causes problems.

Yours,
Laurenz Albe




Re: Libpq support to connect to standby server as priority

2018-11-16 Thread Laurenz Albe
Tom Lane wrote:
> > As it is now, the patch doesn't keep two connections open.  It remembers
> > the index of the host of the first successful writable connection, but
> > closes the connection, and opens another one to that host if no read-only
> > host can be found.
> 
> Oh!  The reason I assumed it wasn't doing that is that such a behavior
> seems completely insane.  If the point is to keep down the load on your
> master server, then connecting only to immediately disconnect is not
> a friendly way to do that --- even without counting the fact that you
> might later come back and connect again.

That's why I had argued initially to keep the session open, but you
seem to dislike that idea as well.

> If that's the best we can do, we should forget the whole feature and
> just recommend putting slave servers first in your hosts list when
> you want prefer-slave.

If you know which is which, certainly.
But in a setup with automated failover you cannot be certain which is which.
That's what the proposed feature targets.

Yours,
Laurenz Albe




Re: Updated backup APIs for non-exclusive backups

2018-11-25 Thread Laurenz Albe
I'm a bit late to the party, but I only recently noticed that the exclusive
backup API is deprecated.

On Sun, 2016-04-24 at 11:49 -0400, Stephen Frost wrote:
> * Magnus Hagander (mag...@hagander.net) wrote:
> > On Sun, Apr 24, 2016 at 5:37 AM, Bruce Momjian  wrote:
> > > On Fri, Apr 22, 2016 at 11:53:46AM -0400, Robert Haas wrote:
> > > > 
> > > > Frankly, I think that's right.  It is one thing to say that the new
> > > > method is preferred - +1 for that.  But the old method is going to
> > > > continue to be used by many people for a long time, and in some cases
> > > > will be superior.  That's not something we can deprecate, unless I'm
> > > > misunderstanding the situation.
> > > 
> > > I agree with Robert.  One the one hand we are saying pg_stop_backup()
> > > doesn't work well in psql because you get those two file contents output
> > > that you have to write, and on the other hand we are saying we are going
> > > to deprecate the method that does work well in psql?  I must be missing
> > > something too, as that makes no sense.
> > 
> > I don't agree. I don't see how "making a backup using psql" is more
> > important than "making a backup without potentially dangerous side effects".
> > But if others don't agree, could one of you at least provide an example of
> > how you'd like the docs to read in a way that doesn't deprecate the unsafe
> > way but still informs the user properly?
> 
> I'm with Magnus on this, primairly because I've come to understand just
> how dangerous the old backup method is.  That method *should* be
> deprecated and discouraged.  A backup method which means your database
> doesn't restart properly if the system crashes during the backup is
> *bad*.
> 
> Fixing that means using something more complicated than the old method
> and that's a bit of a pain in psql, but that doesn't mean we should tell
> people that the old method is an acceptable approach.
> 
> Perhaps we can look at improving psql to make it easier to use it for
> the new backup method but, honestly, all these hackish scripts to do
> backups aren't doing a lot of things that a real backup solution needs
> to do.  Improving psql for this is a bit like new paint on the titanic.

I guess that the danger mentioned above is that the database may not
restart if it crashes while in exclusive backup mode, because the
WAL containing the starting checkpoint has already been purged.
True, that is a problem.

I would say the typical use case for the exclusive backup method is
the following (and I have seen it often):

You have some kind of backup software that does file system backups
and is able to run a "pre-backup" and "post-backup" script.
The backup is triggered by the backup software.
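
With the exclusive API, such pre- and post-backup scripts boil down to
something as simple as this (the label is arbitrary):

  -- pre-backup script:
  SELECT pg_start_backup('filesystem-backup', true, true);
  -- post-backup script, possibly from a different connection:
  SELECT pg_stop_backup();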

Another thing that is problematic with non-exclusive backups is that
you have to write the backup_label file into the backup after the
backup has been taken.  With a technique like the above, you cannot
easily do that.

So in essence, all backup methods that work like the above won't be
possible any more once the exclusive backup is gone.


I expect that that will make a lot of users unhappy.

Would it be an option to change the semantics so that "backup_label"
is ignored if there is no "recovery.conf"?  Of course that opens
another way to corrupt your database (by starting it from a backup
without first creating "recovery.conf"), but we could add another
big warning.

I'd say that the people who ignore such a warning are the same
people that happily remove "backup_label" when they see the following
message upon starting a cluster from a backup without recovery:

  If you are not restoring from a backup, try removing the file "backup_label".

Yours,
Laurenz Albe




Re: Updated backup APIs for non-exclusive backups

2018-11-25 Thread Laurenz Albe
On Sun, 2018-11-25 at 13:50 -0500, Stephen Frost wrote:
> I don't see any compelling argument for trying to do something half-way
> any more today than I did two years ago when this was being discussed.

That may well be so.  It may be better to make users unhappy than to
make them very unhappy...

But I find the following points unconvincing:

> > I would say the typical use case for the exclusive backup method is
> > the following (and I have seen it often):
> > 
> > You have some kind of backup software that does file system backups
> > and is able to run a "pre-backup" and "post-backup" script.
> > The backup is triggered by the backup software.
> 
> Seeing it often doesn't make it a good solution.  Running just
> pre-backup and post-backup scripts and copying the filesystem isn't
> enough to perform an online PostgreSQL backup- the WAL needs to be
> collected as well, and you need to make sure that you have all of the
> WAL before the backup can be considered complete.

Yes, that's why "pg_stop_backup" has the "wait_for_archive" parameter.
So this is not a problem.
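
A sketch, run in the same session that called pg_start_backup()
non-exclusively (the second argument is wait_for_archive, available
since v10):

  SELECT lsn, labelfile, spcmapfile FROM pg_stop_backup(false, true);
  -- returns only after all required WAL has been archived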

> On restore, you're
> going to need to create a recovery.conf (at least in released versions)
> which provides a restore command (needed even in HEAD today) to get the
> old WAL, so having to also create the backup_label file shouldn't be
> that difficult.

You write "recovery.conf" upon recovery, when you have the restored
backup, so you have it on a file system.  No problem adding a file then.

This is entirely different from adding a "backup_label" file to
a backup that has been taken by a backup software in some arbitrary
format in some arbitrary location (think snapshot).

So these two cases don't compare.

> Lastly, if you really want, you can extract out the data from
> pg_stop_backup in whatever your post-backup script is.

Come on, now.
You usually use backup techniques like that because you can't get
your large database backed up in the available time window otherwise.

> > Another thing that is problematic with non-exclusive backups is that
> > you have to write the backup_label file into the backup after the
> > backup has been taken.  With a technique like the above, you cannot
> > easily do that.
> 
> ... why not?  You need to create the recovery.conf anyway, and you need
> to be archiving WAL somewhere, so it certainly seems like you could put
> the backup_label there too.

As I said above, you don't add "recovery.conf" to the backup right away,
so these two cases don't compare.

> > I expect that that will make a lot of users unhappy.
> 
> If it means that they implement a better backup strategy, then it's
> probably a good thing, which is the goal of this.

I thought our goal is to provide convenient backup methods...


Ignoring "backup_label" on restart, as I suggested in my previous message,
probably isn't such a hot idea.

But what's wrong with retaining the exclusive backup method and just
sticking a big "Warning: this may cause a restart to fail after a crash"
on it?  That sure wouldn't be unsafe.

If somebody prefers a simpler backup method at the price of having to
manually restart the server after a crash, what's wrong with that?
Why not give them the choice?

I'd say that one could write a server start script that just removes
"backup_label", but I am sure someone will argue that then somebody
will restore a backup and then restart the server without first
recovering the database...

Yours,
Laurenz Albe




Re: Updated backup APIs for non-exclusive backups

2018-11-25 Thread Laurenz Albe
Stephen Frost wrote: 
> > > Seeing it often doesn't make it a good solution.  Running just
> > > pre-backup and post-backup scripts and copying the filesystem isn't
> > > enough to perform an online PostgreSQL backup- the WAL needs to be
> > > collected as well, and you need to make sure that you have all of the
> > > WAL before the backup can be considered complete.
> > 
> > Yes, that's why "pg_stop_backup" has the "wait_for_archive" parameter.
> > So this is not a problem.
> 
> That doesn’t actually make sure you have all of the WAL reliably saved
> across the backup, it just cares what archive command returns, which is
> sadly often a bad thing to depend on.  I certainly wouldn’t rely on only
> that for any system I cared about. 

If you write a bad archive_command, you have a problem.
But that is quite unrelated to the problem at hand, if I am not mistaken.

> > > On restore, you're
> > > going to need to create a recovery.conf (at least in released versions)
> > > which provides a restore command (needed even in HEAD today) to get the
> > > old WAL, so having to also create the backup_label file shouldn't be
> > > that difficult.
> > 
> > You write "recovery.conf" upon recovery, when you have the restored
> > backup, so you have it on a file system.  No problem adding a file then.
> > 
> > This is entirely different from adding a "backup_label" file to
> > a backup that has been taken by a backup software in some arbitrary
> > format in some arbitrary location (think snapshot).
> 
> There isn’t any need to write the backup label before you restore the 
> database,
> just as you write recovery.conf then.

Granted.
But it is pretty convenient, and writing it to the data directory right away
is a good thing on top, because it reduces the danger of inadvertently
starting the backup without recovery.

> > > Lastly, if you really want, you can extract out the data from
> > > pg_stop_backup in whatever your post-backup script is.
> > 
> > Come on, now.
> > You usually use backup techniques like that because you can't get
> > your large database backed up in the available time window otherwise.
> 
> I’m not following what you’re trying to get at here, why can’t you extract
> the data for the backup label from pg_stop_backup..?  Certainly other tools
> do, even ones that do extremely fast parallel backups..  the two are
> completely independent.
> 
> Did you think I meant pg_basebackup..?  I certaily didn’t.

Oh yes, I misunderstood.  Sorry.

Yes, you can come up with a post-backup script that somehow communicates
with your pre-backup script to get the information, but it sure is
inconvenient.  Simplicity is good in backup solutions, because complicated
things tend to break more easily.

> > I thought our goal is to provide convenient backup methods...
> 
> Correctness would be first and having a broken system because of a crash 
> during a backup isn’t correct.

"Not starting up without manual intervention" is not actually broken...

> > But what's wrong with retaining the exclusive backup method and just
> > sticking a big "Warning: this may cause a restart to fail after a crash"
> > on it?  That sure wouldn't be unsafe.
> 
> I haven’t seen anyone pushing for it to be removed immediately, but users 
> should
> not use it and newcomers would be much better served by using the non 
> exclusive api.
> There is a reason it was deprecated and it’s because it simply isn’t a good 
> API.
> Coming along a couple years later and saying that it’s a good API while 
> ignoring
> the issues that it has doesn’t change that.

I don't think I'm ignoring the issues, I just think there is a valid use case 
for
the exclusive backup API, with all its caveats.

Of course I'm not arguing on behalf of organizations running lots of databases
for whom manual intervention after a crash is unacceptable.

I'm arguing on behalf of users that run a few databases, want a simple backup
solution and are ready to deal with the shortcomings.


But I will gladly accept defeat in this matter, I just needed to vent my 
opinion.

Yours,
Laurenz Albe




Re: Updated backup APIs for non-exclusive backups

2018-11-25 Thread Laurenz Albe
On Sun, 2018-11-25 at 16:04 -0500, Stephen Frost wrote:
> > > There isn’t any need to write the backup label before you restore the 
> > > database,
> > > just as you write recovery.conf then.
> > 
> > Granted.
> > But it is pretty convenient, and writing it to the data directory right away
> > is a good thing on top, because it reduces the danger of inadvertedly
> > starting the backup without recovery.
> 
> Writing it into the data directory is *not* a good thing because as soon as 
> you do
> that you risk there being an issue if there’s a crash.  Writing into the 
> backup
> isn’t a bad idea but if you’re managing your backups then writing it 
> somewhere else
> (such as where you write your WAL) and associating it with the backup 
> (presumably
> it has a label) should make it easy to pull back when you restore. 

If there is a crash during the backup procedure, the backup is bad.
Doesn't matter during which part of the backup procedure it happens.

> > Yes, you can come up with a post-backup script that somehow communicates
> > with your pre-backup script to get the information, but it sure is
> > inconvenient.  Simplicity is good in backup solutions, because complicated
> > things tend to break more easily.
> 
> Not sure what communication is necessary here..?   The data needed for the 
> backup
> label file comes from pg_stop_backup in a non-exclusive backup.

Right, and pg_stop_backup has to be run from the "pre-backup" script.

Yours,
Laurenz Albe





Re: Updated backup APIs for non-exclusive backups

2018-11-25 Thread Laurenz Albe
On Sun, 2018-11-25 at 22:01 +0100, Magnus Hagander wrote:
[about managing backups from pre- and post-file-system-backup scripts]
> I agree with your point that it's not an uncommon thing to need. If a good 
> solution
> for it can be proposed that requires the exclusive backup interface, then I 
> wouldn't
> be against un-deprecating that.  But that's going to require a lot more than 
> just a
> documentation change, IMHO. What could perhaps be handled with a 
> documentation change,
> however, is to document a good way for this type of setup to use the new 
> interfaces.

Good point, and it puts the ball in my court :^)

> > I'm arguing on behalf of users that run a few databases, want a simple 
> > backup
> > solution and are ready to deal with the shortcomings.
> 
> Those that want a simple backup solution have one -- pg_basebackup.
> 
> The exclusive backup API is *not* simple. It is convenient, but it is not 
> simple.
> 
> Actually having a simple API that worked with the pre/post backup scripts, 
> that
> would be an improvement that we should definitely want!

Right; unfortunately it is not simple to come up with one that doesn't exhibit
the problems of the existing exclusive backup.

Perhaps it's theoretically impossible.  The goal is to disambiguate what a file
system backup sees in backup mode and what the startup process sees after a 
crash
in backup mode, and I can't see how that could be done.

Yours,
Laurenz Albe




Re: pg_config wrongly marked as not parallel safe?

2018-11-26 Thread Laurenz Albe
On Mon, 2018-11-26 at 19:51 -0500, Stephen Frost wrote:
> If a function's results can change across minor or major versions, we
> shouldn't be marking it as immutable because, by definition, it's not
> immutable.
> 
> We, today, have a baked in assumption that any function marked as
> immutable will remain immutable across all major versions that we allow
> indexes to be retained through, which is all of them since about 8.3 at
> this point.
> 
> We absolutely need a policy that if you decide to change the results of
> some immutable function across a major version change, you need to
> consider the results on indexes and possibly write into pg_upgrade
> checks to try and detect any usage of that immutable function.  I hope
> we're in agreement there.

It's hard to make a guarantee that a function will never change in the
future.  What if we fix some rounding or overflow problem in a floating
point function?

If we went that far, hardly any function could be IMMUTABLE.

I think it is best to use IMMUTABLE whenever we don't expect it to
change and it is a function useful for indexes, and if it happens to
change nonetheless, write into the release notes that certain indexes
have to be rebuilt after upgrade.
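
To illustrate why this matters for indexes (all names hypothetical):

  CREATE INDEX myfunc_idx ON t ((myfunc(x)));  -- requires myfunc to be IMMUTABLE
  -- if a bug fix changes myfunc's results, the release notes must say:
  REINDEX INDEX myfunc_idx;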

Of course, there is no problem with marking pg_config as stable, because
there is little chance it will be used in an index anyway.

Yours,
Laurenz Albe




Re: pg_config wrongly marked as not parallel safe?

2018-11-28 Thread Laurenz Albe
On Wed, 2018-11-28 at 21:17 -0500, Stephen Frost wrote:
> I don't agree that we can simply declare that all functional and partial
> indexes have to be rebuilt between every major version upgrade, which is
> the alternative.  There really isn't another option- either we're
> extremely careful and take immutability of functions seriously and make
> sure to preserve behavior, and therefore indexes, across major versions,
> or we don't and we require indexes to be rebuilt.

I absolutely agree with that.

I guess that when I said that we should declare functions IMMUTABLE
even if they might change owing to bug fixes, I didn't make it clear that
each such occurrence would necessitate a fat warning in the release notes
that indexes using them have to be rebuilt.

Yours,
Laurenz Albe




Re: Sketch of a fix for that truncation data corruption issue

2018-12-10 Thread Laurenz Albe
Tom Lane wrote:
> We got another report today [1] that seems to be due to the problem
> we've seen before with failed vacuum truncations leaving corrupt state
> on-disk [2].  Reflecting on that some more, [...]

This may seem heretical, but I'll say it anyway.

Why don't we do away with vacuum truncation for good?
Is that a feature that does anybody any good?
To me it has always seemed to be more a wart than a feature, like
someone just thought it was low hanging fruit without considering
all the implications.

VACUUM doesn't reclaim space, VACUUM (FULL) does.  That's the way it
(mostly) is, so why complicate matters unnecessarily?
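
For readers, the distinction I mean (table name hypothetical):

  VACUUM vactab;          -- makes dead space reusable inside the table
  VACUUM (FULL) vactab;   -- rewrites the table, returning space to the OS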

Yours,
Laurenz Albe




Re: Updated backup APIs for non-exclusive backups

2018-12-11 Thread Laurenz Albe
On Mon, 2018-11-26 at 10:18 +0100, Magnus Hagander wrote:
> > [about managing backups from pre- and post-file-system-backup scripts]
> > > I agree with your point that it's not an uncommon thing to need. If a 
> > > good solution
> > > for it can be proposed that requires the exclusive backup interface, then 
> > > I wouldn't
> > > be against un-deprecating that.  But that's going to require a lot more 
> > > than just a
> > > documentation change, IMHO. What could perhaps be handled with a 
> > > documentation change,
> > > however, is to document a good way for this type of setup to use the new 
> > > interfaces.
> > 
> > Good point, and it puts the ball in my court :^)
> 
> Enjoy :)

I have come up with some sample code here:
https://github.com/cybertec-postgresql/safe-backup

This just uses bash and psql.
Does that look reasonably safe?

It's probably too big to be introduced into the documentation, but maybe
we could add it to the Wiki.

Yours,
Laurenz Albe




Re: Updated backup APIs for non-exclusive backups

2018-12-11 Thread Laurenz Albe
On Tue, 2018-12-11 at 23:43 -0500, David Steele wrote:
> > > > [about managing backups from pre- and post-file-system-backup scripts]
> > I have come up with some sample code here:
> > https://github.com/cybertec-postgresql/safe-backup
> > 
> > This just uses bash and psql.
> > Does that look reasonably safe?
> > 
> > It's probably too big to be introduced into the documentation, but maybe
> > we could add it to the Wiki.
> 
> My bash-fu is not all that great, but it seems to me you could do this
> all in one script and forgo the table entirely.  Just do the file copy
> at about line 60 in pgpre.sh.
> 
> It does look workable to me, just wonder if it could be simpler.

Thanks for having a look.  Yes, it looks appallingly complicated.

Sure, if I knew there was a file to write, and I knew where to write
it, I could do it in the "pre" script.  But since I have no idea how the
actual backup is performed and how the "backup_label" file is going to
be saved, I thought it best to return the information to the caller and
persist it somewhere, and only the "post" script can actually return the
information.

Yours,
Laurenz Albe




Re: Remove Deprecated Exclusive Backup Mode

2018-12-11 Thread Laurenz Albe
On Wed, 2018-12-12 at 13:31 +0900, Robert Haas wrote:
> On Wed, Dec 12, 2018 at 1:23 PM David Steele  wrote:
> > I didn't get the impression that Peter was against, he just thought that
> > it needed to stand on its own, rather than be justified by the
> > recovery.conf changes, which I agree with.
> > 
> > Simon rather clearly said that he thinks we should wait until the next
> > release, which I don't see as being entirely against.
> 
> Well, nobody is saying that we should NEVER remove this.  The
> discussion is about what to do in v12.
> 
> Most of the features I've been involved in removing have been
> deprecated for 5+ years.  The first release where this one was
> deprecated was only 2 years ago.  So it feels dramatically faster to
> me than what I think we have typically done.
> 
> Actually, I hadn't realized until this discussion that the exclusive
> backup interface was actually deprecated -- I thought we were just
> recommending the new non-exclusive backup interface should be used.
> If we're in a rush to remove this (and apparently many of us are), I
> think we should make that warning a lot more prominent, maybe copy it
> into a few more places, and back-patch the changes.

+1

I too only learned about this recently, while the problem with exclusive
backups has been known at least since 2008 (c979a1fe), and nobody felt
this to be a terrible problem back then.

I believe that the danger is greatly overrated.  It is not like you end
up with a corrupted database after a crash, and you get a pretty helpful
error message.  Many people are happy enough to live with that.

I'm on board with deprecating and removing it eventually, but I see no
problem in waiting for the customary 5 years.
And yes, a prominent warning in the next major release notes would be
a good thing.

Yours,
Laurenz Albe




Re: Where to save data used by extension ?

2018-12-14 Thread Laurenz Albe
Hao Wu wrote:
> I have an extension for postgresql. The extension works across some 
> databases, and needs to save some data.
> The data might be modified dynamically by the extension, and all the
> changed results must be saved. I have considered some methods.
> 1. Use GUC
> I find no easy way to write the GUC values back to postgres.conf. And the 
> data might be a bit complex
> 2. Create a table under a special database, like postgres
> The weakness is the strong assumption that the database postgres exists
> and will not be dropped.
> 3. Create a table under a database created by the extension
> A little heavy, but without dependencies.
> 4. Write to a native file.
> The file cannot be synced to a standby
> 
> Currently, I use #2. Is there any better way to do this?

Only 2 and 3 are feasible.

Since extensions have a schema, and the database is the best place to persist
data for a database extension, I'd use a table in the extension schema, so I'd
go for 3.

Why is that heavier than 2?
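
A minimal sketch of what I mean, with a hypothetical extension "myext"
whose objects live in schema "myext":

  CREATE TABLE myext.state (key text PRIMARY KEY, value text);
  -- created from the extension script, the table belongs to the extension;
  -- mark it as a configuration table so that pg_dump includes its data:
  SELECT pg_extension_config_dump('myext.state', '');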

Yours,
Laurenz Albe




Re: Identity columns should own only one sequence

2019-08-05 Thread Laurenz Albe
Peter Eisentraut wrote:
> On 2019-05-08 16:49, Laurenz Albe wrote:
> > I believe we should have both:
> > 
> > - Identity columns should only use sequences with an INTERNAL dependency,
> >   as in Peter's patch.
> 
> I have committed this.

Thanks!

> > - When a column default is dropped, remove all dependencies between the
> >   column and sequences.
> 
> There is no proposed patch for this, AFAICT.

There was one in
https://www.postgresql.org/message-id/3916586ef7f33948235fe60f54a3750046f5d940.camel%40cybertec.at

> So I have closed this commit fest item for now.

That's fine with me.

Yours,
Laurenz Albe





Re: Identity columns should own only one sequence

2019-08-05 Thread Laurenz Albe
Peter Eisentraut wrote:
> On 2019-05-08 16:49, Laurenz Albe wrote:
> > I believe we should have both:
> > 
> > - Identity columns should only use sequences with an INTERNAL dependency,
> >   as in Peter's patch.
> 
> I have committed this.

Since this is a bug fix, shouldn't it be backpatched?

Yours,
Laurenz Albe





Re: understand the pg locks in in an simple case

2019-08-20 Thread Laurenz Albe
Alex wrote:
> But when I check pg_locks for session 1, I can see no tuple lock there;
> when I check session 2, I can see that a tuple lock (ExclusiveLock) is
> granted, but it is waiting for a transactionid.
> 
> Since every tuple has transaction information, it is not hard to
> implement it this way.  But are there any benefits over the
> straightforward way?  With the current implementation, what is the point
> of the tuple ExclusiveLock for session 2?

From what I can tell the reason is that pg_locks reports the
information from the shared memory locking table, and tuple locks
are *not* maintained there, but in the "xmax" of the row itself.

To see all tuple locks in pg_locks would require a sequential
scan of all tables which have certain locks on them, which is not
going to happen.
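
If you want to see who holds a lock on a given row, you have to look at
the rows themselves, for example with the pgrowlocks extension (table
name hypothetical):

  CREATE EXTENSION pgrowlocks;
  SELECT * FROM pgrowlocks('accounts');  -- locked tuples, lock modes, lockers
  -- or, more crudely, look at the transaction ID stored in the tuple:
  SELECT xmax, id FROM accounts WHERE id = 1;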

Yours,
Laurenz Albe





Re: Procedure support improvements

2019-08-26 Thread Laurenz Albe
[CC to -hackers]
Dave Cramer wrote:
> On Mon, 26 Aug 2019 at 13:43, Laurenz Albe wrote:
> > Dave Cramer wrote:
> > > As I said, I'd entertain a connection parameter that switched the
> > > CALL to call procedures but ideally you'd complain to the server
> > > folks to make Procedures useful.
> > 
> > Apart from the obvious problem that procedures make life hard for
> > the JDBC driver, because it does not know if it shall render a call
> > as SELECT or CALL:
> > What is missing in PostgreSQL procedures to make them useful?
> 
> being able to use transactions inside a procedure inside a
> transaction.

test=> CREATE OR REPLACE PROCEDURE testproc() LANGUAGE plpgsql AS
   $$BEGIN PERFORM 42; COMMIT; PERFORM 'x'; END;$$;
CREATE PROCEDURE
test=> CALL testproc();
CALL
test=> BEGIN;
BEGIN
test=> CALL testproc();
ERROR:  invalid transaction termination
CONTEXT:  PL/pgSQL function testproc() line 1 at COMMIT

Oops.
I find that indeed surprising.

What is the rationale for this?

Yours,
Laurenz Albe





Re: Problem during Windows service start

2018-08-02 Thread Laurenz Albe
Sakai, Teppei wrote:
> This is my first posting to the mailing list.
> 
> Currently our customer uses PostgreSQL 9.5 and hits a problem during
> Windows service start.
> The Windows service status of the instance is different from actual status.
> 
> We got the following situation.
> 1. Register service with 'pg_ctl register -N "PostgreSQL" -U postgres -P  
> -D D:\data\inst1 -w'
> 2. Start the instance from the Windows service screen.
> 3. After 60 seconds, the startup process fails with a timeout,
>    because crash recovery takes a long time.
> 
> Then, the service status of the instance became "STOPPED",
> but the instance was still running.
> It cannot be stopped from the Windows service screen (it can be stopped only 
> with pg_ctl).
> 
> PostgreSQL version : 9.5.12
> Operating system : Windows Server 2012 R2
> 
> I think this is a bug.
> I think it has not been fixed in the latest version, is my understanding 
> correct?
> If it is correct, I will fix it.

I agree that this is not nice.

How do you propose to fix it?

Yours,
Laurenz Albe



Re: Allow postgres_fdw passwordless non-superuser conns with prior superuser permission

2018-08-06 Thread Laurenz Albe
Craig Ringer wrote:
> Currently postgres_fdw cannot be used with 'cert' authentication, i.e. 
> client-certificate validation
> and cert cn => postgres username mapping. You also can't use things like 
> Kerberos, SSPI, etc with
> a superuser-created FDW and username map.
> 
> To permit this, I'd like to allow postgres_fdw user mappings to be created 
> with a new
> 'permit_passwordless' option. Only the superuser is allowed to create such a 
> mapping.
> If it's set to true, we bypass the check_conn_params(...) connection-string 
> password check
> and the connect_pg_server(...) check for the conn using a password when a 
> non-superuser
> establishes a connection.
> 
> This doesn't re-open CVE-2007-6601 because the superuser has to explicitly 
> grant the access.

I have wished for a feature like that before, so +1 on the idea.

ALTER USER MAPPING has to be restricted to superusers as well.
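
A sketch of the proposal as I read it ("permit_passwordless" is the
proposed option, not something postgres_fdw supports today; other names
hypothetical):

  CREATE USER MAPPING FOR app_user SERVER remote_srv
      OPTIONS (user 'remote_user', permit_passwordless 'true');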

Yours,
Laurenz Albe



Re: Loaded footgun open_datasync on Windows

2018-08-06 Thread Laurenz Albe
Thomas Munro wrote:
> It looks like initdb is failing with this patch:
> 
> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.6732
> 
> Unfortunately cfbot is not clever enough to print out the contents of
> the initdb log so I don't have any more information...

Sorry for the delay, I haven't been near a Windows machine lately.

Thanks for poking me, there was indeed a problem:
Now you have to tell initdb to open the BKI file as a text file.

Attached is an updated patch.

Yours,
Laurenz Albe

diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 3f203c6ca6..9f6a9dc3d9 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -490,7 +490,11 @@ readfile(const char *path)
 	char	   *buffer;
 	int			c;
 
+#ifdef WIN32
+	if ((infile = fopen(path, "rt")) == NULL)
+#else
 	if ((infile = fopen(path, "r")) == NULL)
+#endif
 	{
 		fprintf(stderr, _("%s: could not open file \"%s\" for reading: %s\n"),
 progname, path, strerror(errno));
diff --git a/src/bin/pg_basebackup/pg_receivewal.c b/src/bin/pg_basebackup/pg_receivewal.c
index 8be8d48a8a..912aed8d7c 100644
--- a/src/bin/pg_basebackup/pg_receivewal.c
+++ b/src/bin/pg_basebackup/pg_receivewal.c
@@ -288,7 +288,7 @@ FindStreamingStart(uint32 *tli)
 
 			snprintf(fullpath, sizeof(fullpath), "%s/%s", basedir, dirent->d_name);
 
-			fd = open(fullpath, O_RDONLY | PG_BINARY);
+			fd = open(fullpath, O_RDONLY | PG_BINARY, 0);
 			if (fd < 0)
 			{
 fprintf(stderr, _("%s: could not open compressed file \"%s\": %s\n"),
diff --git a/src/bin/pg_verify_checksums/pg_verify_checksums.c b/src/bin/pg_verify_checksums/pg_verify_checksums.c
index 28c975446e..d1a7121a26 100644
--- a/src/bin/pg_verify_checksums/pg_verify_checksums.c
+++ b/src/bin/pg_verify_checksums/pg_verify_checksums.c
@@ -83,7 +83,7 @@ scan_file(char *fn, int segmentno)
 	int			f;
 	int			blockno;
 
-	f = open(fn, 0);
+	f = open(fn, 0, 0);
 	if (f < 0)
 	{
 		fprintf(stderr, _("%s: could not open file \"%s\": %s\n"),
diff --git a/src/common/file_utils.c b/src/common/file_utils.c
index 48876061c3..8a4cdb47ff 100644
--- a/src/common/file_utils.c
+++ b/src/common/file_utils.c
@@ -283,7 +283,7 @@ fsync_fname(const char *fname, bool isdir, const char *progname)
 	 * unsupported operations, e.g. opening a directory under Windows), and
 	 * logging others.
 	 */
-	fd = open(fname, flags);
+	fd = open(fname, flags, 0);
 	if (fd < 0)
 	{
 		if (errno == EACCES || (isdir && errno == EISDIR))
diff --git a/src/include/port.h b/src/include/port.h
index 74a9dc4d4d..2d40045ad5 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -249,11 +249,8 @@ extern bool rmtree(const char *path, bool rmtopdir);
 #define		O_DIRECT	0x8000
 extern int	pgwin32_open(const char *, int,...);
 extern FILE *pgwin32_fopen(const char *, const char *);
-
-#ifndef FRONTEND
 #define		open(a,b,c) pgwin32_open(a,b,c)
 #define		fopen(a,b) pgwin32_fopen(a,b)
-#endif
 
 /*
  * Mingw-w64 headers #define popen and pclose to _popen and _pclose.  We want


<    1   2   3   4   5   6   7   8   9   >