RE: simplifying foreign key/RI checks

2021-03-01 Thread houzj.f...@fujitsu.com
Hi Amit,

(sorry about not cc'ing the hackers list)
I have a question about command IDs here.
It's probably not directly related to your patch, so I am sorry if it bothers 
you.

+   /*
+* Start the scan. To make the changes of the current command visible to
+* the scan and for subsequent locking of the tuple (if any) found,
+* increment the command counter.
+*/
+   CommandCounterIncrement();

For an insert on the FK relation, is it necessary to create a new command id every time?
I think it is only necessary when the command also modifies the referenced table,
for example: 1) it has a modifying CTE
 2) it has a modifying function (trigger/domain ...)

Neither of the above seems to be supported in parallel mode (they are parallel unsafe).
So I was wondering if we can avoid the CommandCounterIncrement in parallel mode.
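
For example (hypothetical parent and child tables with a foreign key), case 1)
could look like the sketch below, where the RI check on child must see the
parent row inserted by the CTE of the same command:

CREATE TABLE parent (id int PRIMARY KEY);
CREATE TABLE child  (parent_id int REFERENCES parent (id));

-- The FK check for child.parent_id has to see the parent row inserted by
-- the CTE of this very command, which is why the command counter needs to
-- be incremented before the RI scan.
WITH new_parent AS (
    INSERT INTO parent (id) VALUES (42) RETURNING id
)
INSERT INTO child (parent_id)
SELECT id FROM new_parent;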

Best regards,
houzj


Re: repeated decoding of prepared transactions

2021-03-01 Thread vignesh C
On Tue, Mar 2, 2021 at 9:33 AM Amit Kapila  wrote:
>
> On Tue, Mar 2, 2021 at 8:20 AM vignesh C  wrote:
> >
> >
> > I have a minor comment regarding the below:
> > + 
> > +  
> > +   two_phase bool
> > +  
> > +  
> > +  True if two-phase commits are enabled on this slot.
> > +  
> > + 
> >
> > Can we change something like:
> > True if the slot is enabled for decoding prepared transaction
> > information. Refer link for more information.(link should point where
> > more detailed information is available for two-phase in
> > pg_create_logical_replication_slot).
> >
> > Also there is one small indentation in that line, I think there should
> > be one space before "True if".
> >
>
> Okay, fixed these but I added a slightly different description. I have
> also added the parameter description for
> pg_create_logical_replication_slot in docs and changed the comments at
> various places in the code. Apart from that ran pgindent. The patch
> looks good to me now. Let me know what you think?

Patch applies cleanly, make check and make check-world passes. I did
not find any other issue. The patch looks good to me.

Regards,
Vignesh




Re: We should stop telling users to "vacuum that database in single-user mode"

2021-03-01 Thread David Rowley
On Tue, 2 Mar 2021 at 04:32, Hannu Krosing  wrote:
>
> It looks like we are unnecessarily instructing our users to vacuum their
> databases in single-user mode when just vacuuming would be enough.
>
> We should fix the error message to be less misleading.

It would be good to change the message as it's pretty outdated. Back
in 8ad3965a1 (2005) when the message was added, SELECT and VACUUM
would have called GetNewTransactionId().  That's no longer the case.
We only do that when we actually need an XID.

However, I wonder if it's worth going a few steps further to try and
reduce the chances of that message being seen in the first place.
Maybe it's worth considering ditching any (auto)vacuum cost limits for
any table which is within X transactions of wrapping around.
Likewise for "VACUUM;" when the database's datfrozenxid is getting
dangerously high.

Such "emergency" vacuums could be noted in the autovacuum log, and a
NOTICE or WARNING could be sent to the user during manual VACUUMs. Maybe
the value of X could be xidStopLimit minus a hundred million or so.

I have seen it happen that an instance had a vacuum_cost_limit set and
someone started the database in single-user mode, per the advice of
the error message, only to find that the VACUUM took a very long time
due to the restrictive cost limit. I struggle to imagine why anyone
wouldn't want the vacuum to run as quickly as possible in that
situation.
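
For what it's worth, here is a minimal sketch of what an operator can already
do instead of single-user mode, assuming the database still accepts
connections (table name is made up):

-- Turn off cost-based throttling for this session only, then vacuum
-- aggressively; a database-wide VACUUM (FREEZE) works the same way.
SET vacuum_cost_delay = 0;
VACUUM (FREEZE, VERBOSE) big_table;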

(Ideally, the speed of auto-vacuum would be expressed as a percentage
of time spent working vs sleeping rather than an absolute speed
limit... that way, faster servers would get faster vacuums, assuming
the same settings.  Vacuums may also get more work done per unit of
time during offpeak, which seems like it might be a thing that people
might want.)

David




Re: 64-bit XIDs in deleted nbtree pages

2021-03-01 Thread Masahiko Sawada
On Tue, Mar 2, 2021 at 1:06 PM Masahiko Sawada  wrote:
>
> On Tue, Mar 2, 2021 at 6:40 AM Peter Geoghegan  wrote:
> >
> > On Sun, Feb 28, 2021 at 8:08 PM Masahiko Sawada  
> > wrote:
> > > > Even though "the letter of the law" favors removing the
> > > > vacuum_cleanup_index_scale_factor GUC + param in the way I have
> > > > outlined, that is not the only thing that matters -- we must also
> > > > consider "the spirit of the law".
> >
> > > > I suppose I could ask Tom what he thinks?
> > >
> > > +1
> >
> > Are you going to start a new thread, or should I?
>
> Ok, I'll start a new thread soon.

I've started a new thread[1]. Please feel free to add your thoughts.

Regards,

[1] 
https://www.postgresql.org/message-id/CAD21AoA4WHthN5uU6%2BWScZ7%2BJ_RcEjmcuH94qcoUPuB42ShXzg%40mail.gmail.com

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Removing vacuum_cleanup_index_scale_factor

2021-03-01 Thread Masahiko Sawada
Hi all,

I've started this new thread, separate from the thread[1], to discuss
removing the vacuum_cleanup_index_scale_factor GUC parameter, as proposed
by Peter Geoghegan.

btvacuumcleanup() has been playing two roles: recycling deleted pages
and collecting index statistics. This discussion focuses on the
latter. Since PG11, btvacuumcleanup() uses
vacuum_cleanup_index_scale_factor as a threshold to do an index scan
to update index statistics (pg_class.reltuples and pg_class.relpages).
Therefore, even if there is no update on the btree index at all (e.g.,
btbulkdelete() was not called earlier), btvacuumcleanup() scans the
whole index to collect the index statistics if the number of newly
inserted tuples exceeds the vacuum_cleanup_index_scale_factor fraction
of the total number of heap tuples detected by the previous statistics
collection. On the other hand, those index statistics are updated also
by ANALYZE and autoanalyze. pg_class.reltuples calculated by ANALYZE
is an estimation whereas the value returned by btvacuumcleanup() is an
accurate value. But perhaps we can rely on ANALYZE and autoanalyze to
update those index statistics. The points of this discussion are what
we really need to do in btvacuumcleanup() and whether
btvacuumcleanup() really needs to do an index scan for the purpose of
index statistics update.
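
For reference, the threshold can also be set per index as a storage parameter
(at least as of Postgres 13); with the default of 0.1, an index whose table
previously had reltuples = 1,000,000 gets a cleanup-only full index scan once
more than 100,000 tuples have been inserted since the last statistics
collection. The index name below is hypothetical:

SHOW vacuum_cleanup_index_scale_factor;   -- 0.1 by default
ALTER INDEX some_index SET (vacuum_cleanup_index_scale_factor = 0.5);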

The original design that made VACUUM set
pg_class.reltuples/pg_class.relpages in indexes (from 15+ years ago)
assumed that it was cheap to handle statistics in passing. Even if we
have btvacuumcleanup() not do an index scan at all, this is 100%
allowed by the amvacuumcleanup contract described in the
documentation:

"It is OK to return NULL if the index was not changed at all during
the VACUUM operation, but otherwise correct stats should be returned."

The above description was added by commit e57345975cf in 2006 and
hasn't changed since.

For instance, looking at hash indexes, it hasn't really changed since
2006 in terms of amvacuumcleanup(). hashvacuumcleanup() simply sets
stats->num_pages and stats->num_index_tuples without an index scan.
I'd like to quote the in-depth analysis by Peter Geoghegan:

-
/*
 * Post-VACUUM cleanup.
 *
 * Result: a palloc'd struct containing statistical info for VACUUM displays.
 */
IndexBulkDeleteResult *
hashvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
{
Relationrel = info->index;
BlockNumber num_pages;

/* If hashbulkdelete wasn't called, return NULL signifying no change */
/* Note: this covers the analyze_only case too */
if (stats == NULL)
return NULL;

/* update statistics */
num_pages = RelationGetNumberOfBlocks(rel);
stats->num_pages = num_pages;

return stats;
}

Clearly hashvacuumcleanup() was considered by Tom when he revised the
documentation in 2006. Here are some observations about
hashvacuumcleanup() that seem relevant now:

* There is no "analyze_only" handling, just like nbtree.

"analyze_only" is only used by GIN, even now, 15+ years after it was
added. GIN uses it to make autovacuum workers (never VACUUM outside of
an AV worker) do pending list insertions for ANALYZE -- just to make
it happen more often.  This is a niche thing -- clearly we don't have
to care about it in nbtree, even if we make btvacuumcleanup() (almost)
always return NULL when there was no btbulkdelete() call.

* num_pages (which will become pg_class.relpages for the index) is not
set when we return NULL -- hashvacuumcleanup() assumes that ANALYZE
will get to it eventually in the case where VACUUM does no real work
(when it just returns NULL).

* We also use RelationGetNumberOfBlocks() to set pg_class.relpages for
index relations during ANALYZE -- it's called when we call
vac_update_relstats() (I quoted this do_analyze_rel() code to you
directly in a recent email).

* In general, pg_class.relpages isn't an estimate (because we use
RelationGetNumberOfBlocks(), both in the VACUUM-updates case and the
ANALYZE-updates case) -- only pg_class.reltuples is truly an estimate
during ANALYZE, and so getting a "true count" seems to have only
limited practical importance.

I think that this sets a precedent in support of my view that we can
simply get rid of vacuum_cleanup_index_scale_factor without any
special effort to maintain pg_class.reltuples. As I said before, we
can safely make btvacuumcleanup() just like hashvacuumcleanup(),
except when there are known deleted-but-not-recycled pages, where a
full index scan really is necessary for reasons that are not related
to statistics at all (of course we still need the *logic* that was
added to nbtree by the vacuum_cleanup_index_scale_factor commit --
that is clearly necessary). My guess is that Tom would have made
btvacuumcleanup() look identical to hashvacuumcleanup() in 2006 if
nbtree didn't have page deletion to consider -- but that had to be
considered.
-

The above discussion makes sense to me as support for removing the
vacuum_cleanup_index_scale_factor GUC.

Re: Add --tablespace option to reindexdb

2021-03-01 Thread Michael Paquier
On Mon, Mar 01, 2021 at 10:12:54AM -0800, Mark Dilger wrote:
> Your check verifies that reindexing a system table on a new
> tablespace fails, but does not verify what the failure was.  I
> wonder if you might want to make it more robust, something like:

I was not sure if that was worth adding or not, but no objections to
do so.  So I have updated the patch to do that.

On Mon, Mar 01, 2021 at 09:47:57AM -0800, Mark Dilger wrote:
> I recognize that you are borrowing some of that from
> src/bin/pgbench/t/001_pgbench_with_server.pl, but I don't find
> anything intuitive about the name "ets" and would rather not see
> that repeated.  There is nothing in TestLib::perl2host to explain
> this name choice, nor in pgbench's test, nor yours.

Okay, I have made the variable names more explicit.

> but I don't see what tests of the --concurrently option have to do
> with this patch.  I'm not saying there is anything wrong with this
> change, but it seems out of place.  Am I missing something?

While hacking on this feature I have just bumped into this separate
issue, where the same test just gets repeated twice.  I have just
applied an independent patch to take care of this problem separately,
and backpatched it down to 12 where this got introduced.

On Mon, Mar 01, 2021 at 09:26:03AM -0800, Mark Dilger wrote:
> I think the language "to reindex on" could lead users to think that
> indexes already existent in the given tablespace will be reindexed.
> In other words, the option might be misconstrued as a way to specify
> all the indexes you want reindexed.  Whatever you do here, beware
> that you are using similar language in the --help, so consider
> changing that, too.

OK.  I have switched the docs to "Specifies the tablespace where
indexes are rebuilt" and --help to "tablespace where indexes are
rebuilt".

I have also removed the assertions based on the version number of the
backend, based on Daniel's input sent upthread.

What do you think?
--
Michael
From 3fe23196c67685fb02782d637538ac4ba1e83c67 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Tue, 2 Mar 2021 14:56:50 +0900
Subject: [PATCH v2] Add --tablespace option to reindexdb

---
 src/bin/scripts/reindexdb.c| 109 -
 src/bin/scripts/t/090_reindexdb.pl |  81 -
 doc/src/sgml/ref/reindexdb.sgml|  10 +++
 3 files changed, 163 insertions(+), 37 deletions(-)

diff --git a/src/bin/scripts/reindexdb.c b/src/bin/scripts/reindexdb.c
index 9f072ac49a..cf28176243 100644
--- a/src/bin/scripts/reindexdb.c
+++ b/src/bin/scripts/reindexdb.c
@@ -40,14 +40,15 @@ static void reindex_one_database(const ConnParams *cparams, ReindexType type,
  SimpleStringList *user_list,
  const char *progname,
  bool echo, bool verbose, bool concurrently,
- int concurrentCons);
+ int concurrentCons, const char *tablespace);
 static void reindex_all_databases(ConnParams *cparams,
   const char *progname, bool echo,
   bool quiet, bool verbose, bool concurrently,
-  int concurrentCons);
+  int concurrentCons, const char *tablespace);
 static void run_reindex_command(PGconn *conn, ReindexType type,
 const char *name, bool echo, bool verbose,
-bool concurrently, bool async);
+bool concurrently, bool async,
+const char *tablespace);
 
 static void help(const char *progname);
 
@@ -72,6 +73,7 @@ main(int argc, char *argv[])
 		{"verbose", no_argument, NULL, 'v'},
 		{"concurrently", no_argument, NULL, 1},
 		{"maintenance-db", required_argument, NULL, 2},
+		{"tablespace", required_argument, NULL, 3},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -84,6 +86,7 @@ main(int argc, char *argv[])
 	const char *host = NULL;
 	const char *port = NULL;
 	const char *username = NULL;
+	const char *tablespace = NULL;
 	enum trivalue prompt_password = TRI_DEFAULT;
 	ConnParams	cparams;
 	bool		syscatalog = false;
@@ -164,6 +167,9 @@ main(int argc, char *argv[])
 			case 2:
 maintenance_db = pg_strdup(optarg);
 break;
+			case 3:
+tablespace = pg_strdup(optarg);
+break;
 			default:
 fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 exit(1);
@@ -228,7 +234,7 @@ main(int argc, char *argv[])
 		cparams.dbname = maintenance_db;
 
 		reindex_all_databases(&cparams, progname, echo, quiet, verbose,
-			  concurrently, concurrentCons);
+			  concurrently, concurrentCons, tablespace);
 	}
 	else if (syscatalog)
 	{
@@ -268,7 +274,7 @@ main(int argc, char *argv[])
 
 		reindex_one_database(&cparams, REINDEX_SYSTEM, NULL,
 			 progname, echo, verbose,
-			 concurrently, 1);
+			 concurrently, 1, tablespace);
 	}
 	else
 	{
@@ -298,17 +304,17 @@ main(int argc, char *argv[])
 		if (schemas.head != NULL)
 			reindex_one_database(&cparams, REINDEX_SCHEMA, &schemas,
  progname, echo, verbose,
- concurrently, concurrentCons);
+ concurrently, concurrentCons, tablespace);
 
 		if (indexes.head

Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-01 Thread Joel Jacobson
On Tue, Mar 2, 2021, at 06:22, Isaac Morland wrote:
> On Tue, 2 Mar 2021 at 00:06, Joel Jacobson  wrote:
>> I find it strange two ranges of zero-length with different bounds are 
>> considered equal:
>> 
>> SELECT '[7,7)'::int4range = '[8,8)'::int4range;
>> ?column?
>> --
>> t
>> (1 row)
>> 
>> This seems like a bug to me. What am I missing here?
>> 
>> Unless fixed, then the way I see it, I don't think we can use int4range[] 
>> for regexp_positions(),
>> if we want to allow returning the positions for zero-length matches, which 
>> would be nice.
> 
> Ranges are treated as sets. As such equality is defined by membership.
> 
> That being said, I agree that there may be situations in which it would be 
> convenient to have empty ranges at specific locations. Doing this would 
> introduce numerous questions which would have to be resolved. For example, 
> where/when is the empty range resulting from an intersection operation?

Hmm, I think I would assume the intersection of two non-overlapping ranges to 
be isempty()=TRUE,
and for lower() and upper() to continue to return NULL.

But I think a zero-length range created with actual bounds should
return the lower() and upper() values during creation, instead of NULL.
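
For reference, here is the current behaviour as I understand it:

SELECT lower(int4range(7,7)), upper(int4range(7,7)), isempty(int4range(7,7));
 lower | upper | isempty
-------+-------+---------
       |       | t
(1 row)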

I tried to find some other programming environments with range types.

The first one I found was Ada.

The below example is similar to int4range(7,6,'[]') which is invalid in 
PostgreSQL:

with Ada.Text_IO; use Ada.Text_IO;
procedure Hello is
   type Foo is range 7 .. 6;
begin
   Put_Line ( Foo'Image(Foo'First) );
   Put_Line ( Foo'Image(Foo'Last) );
end Hello;

$ ./gnatmake hello.adb
$ ./hello
7
6

In Ada, the 'Range of the Empty_String is 1 .. 0
https://en.wikibooks.org/wiki/Ada_Programming/Types/array#Array_Attributes

I think there is a case for allowing access to the lower/upper vals instead
of returning NULL,
since we can do so without changing what isempty() would return for the same
values.

/Joel

Re: libpq compression

2021-03-01 Thread Andrey Borodin


> On 26 Feb 2021, at 02:18, Daniil Zakhlystov
> wrote:
> 
> Yes, there is still no possibility for compressed traffic pass-through for 
> poolers,
> but do we actually need it?
> I don’t see any solution except starting a new compression context for
> each message in order to make it work.
> Do we actually need to hurt the compression ratio for this specific use case?
> 
> Actually, there is an ugly hack - we may force-reset the compression context 
> by sending the 
> SetCompressionMethod (which will reset the compression algorithm & context) 
> after each CompressedData message.
> 
> This should allow interpreting of each CompressedData message on its own but 
> will add overhead and hurt the compression ratio.

As a maintainer of a connection pooler, I don't think compressing individual 
messages on their own compression context adds any value for that particular 
pooler.
Network has two different costs:
1. Cross-datacenter traffic is costly and needs to be compressed. Better 
compression ratios are preferable to CPU savings.
2. Intra-datacenter traffic is cheap and compression is not that crucial. In my
opinion, on the most loaded installations the pooler must be on the same host as
the DB. Compression can simply be off.

I don't think we should craft one more CPU/network tradeoff point with our own
hands. Enough tradeoff points are already implemented by lz4/zstd/whatever.

The CPU used to compress data is negligibly small compared to the CPU overhead
of producing the data traffic.

The only real benefit of compressing individual messages is better CRIME
mitigation, i.e. not compressing Auth packets and some parameters of the query.
That feature does not depend on resetting the compression context, and resetting
the compression context often does not help to mitigate CRIME.

Thanks!

Best regards, Andrey Borodin.



Re: New IndexAM API controlling index vacuum strategies

2021-03-01 Thread Peter Geoghegan
On Mon, Mar 1, 2021 at 7:00 PM Peter Geoghegan  wrote:
> I think that you're right. However, in practice it isn't harmful
> because has_dead_tuples is only used when "all_visible = true", and
> only to detect corruption (which should never happen). I think that it
> should be fixed as part of this work, though.

Currently the first callsite that calls the new
lazy_vacuum_table_and_indexes() function in the patch
("skip_index_vacuum.patch") skips index vacuuming in exactly the same
way as the second and final lazy_vacuum_table_and_indexes() call site.
Don't we need to account for maintenance_work_mem in some way?

lazy_vacuum_table_and_indexes() should probably not skip index
vacuuming when we're close to exceeding the space allocated for the
LVDeadTuples array. Maybe we should not skip when
vacrelstats->dead_tuples->num_tuples is greater than 50% of
dead_tuples->max_tuples? Of course, this would only need to be
considered when lazy_vacuum_table_and_indexes() is only called once
for the entire VACUUM operation (otherwise we have far too little
maintenance_work_mem/dead_tuples->max_tuples anyway).

-- 
Peter Geoghegan




Re: Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-01 Thread Tom Lane
"Joel Jacobson"  writes:
> Unless fixed, then the way I see it, I don't think we can use int4range[] for 
> regexp_positions(),

Yeah.  It's a cute idea, but the semantics aren't quite right.

regards, tom lane




Re: repeated decoding of prepared transactions

2021-03-01 Thread Amit Kapila
On Tue, Mar 2, 2021 at 10:38 AM Ajin Cherian  wrote:
>
> On Tue, Mar 2, 2021 at 3:03 PM Amit Kapila  wrote:
>
> One minor comment:
> +  
> +  
> +   True if the slot is enabled for decoding prepared transactions.  
> Always
> +   false for physical slots.
> +  
> + 
>
> There is an extra space before Always. But when rendered in html this
> is not seen, so this might not be a problem.
>

I am just trying to be consistent with the nearby description. For example, see:
"The number of bytes that can be written to WAL such that this slot is
not in danger of getting in state "lost".  It is NULL for lost slots,
as well as if max_slot_wal_keep_size is
-1."

In Pg docs, comments, you will find that there are places where we use
a single space before the new line and also places where we use two
spaces. In this case, for the sake of consistency with the nearby
description, I used two spaces.

-- 
With Regards,
Amit Kapila.




Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-01 Thread Isaac Morland
On Tue, 2 Mar 2021 at 00:06, Joel Jacobson  wrote:

> I find it strange two ranges of zero-length with different bounds are
> considered equal:
>
> SELECT '[7,7)'::int4range = '[8,8)'::int4range;
> ?column?
> --
> t
> (1 row)
>
> This seems like a bug to me. What am I missing here?
>
> Unless fixed, then the way I see it, I don't think we can use int4range[]
> for regexp_positions(),
> if we want to allow returning the positions for zero-length matches, which
> would be nice.
>

Ranges are treated as sets. As such equality is defined by membership.

That being said, I agree that there may be situations in which it would be
convenient to have empty ranges at specific locations. Doing this would
introduce numerous questions which would have to be resolved. For example,
where/when is the empty range resulting from an intersection operation?


Re: repeated decoding of prepared transactions

2021-03-01 Thread Ajin Cherian
On Tue, Mar 2, 2021 at 3:03 PM Amit Kapila  wrote:

One minor comment:
+  
+  
+   True if the slot is enabled for decoding prepared transactions.  Always
+   false for physical slots.
+  
+ 

There is an extra space before Always. But when rendered in html this
is not seen, so this might not be a problem.

Other than that no more comments about the patch. Looks good.

regards,
Ajin Cherian
Fujitsu Australia




Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-01 Thread Joel Jacobson
On Tue, Mar 2, 2021, at 01:12, Mark Dilger wrote:
> I like the idea so I did a bit of testing.  I think the following should not 
> error, but does:
>
> +SELECT regexp_positions('foObARbEqUEbAz', $re$(?=beque)$re$, 'i');
> +ERROR:  range lower bound must be less than or equal to range upper bound

Thanks for testing!

The bug is due to using an (inclusive,inclusive) range,
so for the example the code tried to construct a (7,6,'[]') range.

While trying to fix this, I think I've found a general problem with ranges:

I'll use int4range() to demonstrate the problem:

First the expected error for what the patch tries to do internally using 
make_range().
This is all good:

# SELECT int4range(7,6,'[]');
ERROR:  range lower bound must be less than or equal to range upper bound

I tried to fix this like this:

@ src/backend/utils/adt/regexp.c
-   lower.val = Int32GetDatum(so + 1);
+   lower.val = Int32GetDatum(so);
lower.infinite = false;
-   lower.inclusive = true;
+   lower.inclusive = false;
lower.lower = true;

Which would give the same result as doing:

SELECT int4range(6,6,'(]');
int4range
---
empty
(1 row)

Hmm. This "empty" value what surprise to me.
I would instead have assumed the canonical form "[7,7)".

If I wanted to know if the range is empty or not,
I would have guessed I should use the isempty() function, like this:

SELECT isempty(int4range(6,6,'(]'));
isempty
-
t
(1 row)

Since we have this isempty() function, I don't see the point of discarding
the lower/upper vals, since they contain possibly interesting information
on where the empty range exists.

I find it strange two ranges of zero-length with different bounds are 
considered equal:

SELECT '[7,7)'::int4range = '[8,8)'::int4range;
?column?
--
t
(1 row)

This seems like a bug to me. What am I missing here?

Unless fixed, then the way I see it, I don't think we can use int4range[] for 
regexp_positions(),
if we want to allow returning the positions for zero-length matches, which 
would be nice.

/Joel

Re: 64-bit XIDs in deleted nbtree pages

2021-03-01 Thread Peter Geoghegan
On Mon, Mar 1, 2021 at 8:06 PM Masahiko Sawada  wrote:
> I think that removing vacuum_cleanup_index_scale_factor in the back
> branches would affect existing installations too much. It would be
> better to have btree indexes not use this parameter while not changing
> the contents of the meta page. That is, just remove the check related to
> vacuum_cleanup_index_scale_factor from _bt_vacuum_needs_cleanup().

That's really what I meant -- we cannot just remove a GUC or storage
param in the backbranches, of course (it breaks postgresql.conf, stuff
like that). But we can disable GUCs at the code level.

> And
> I personally prefer to fix the "IndexVacuumInfo.num_heap_tuples is
> inaccurate outside of btvacuumcleanup-only VACUUMs" bug separately.

I have not decided on my own position on the backbranches. Hopefully
there will be clear guidance from other hackers.

> Yeah, this argument makes sense to me. The default values of
> autovacuum_vacuum_insert_scale_factor/threshold are 0.2 and 1000
> respectively, whereas that of vacuum_cleanup_index_scale_factor is 0.1.
> It means that in an insert-only workload with default settings,
> autovacuums triggered by autovacuum_vacuum_insert_scale_factor always
> scan the whole btree index to update the index statistics. I think most
> users would not expect this behavior. As I mentioned above, I think we
> can have nbtree not use this parameter or increase the default value
> of vacuum_cleanup_index_scale_factor in back branches.

It's not just a problem when autovacuum_vacuum_insert_scale_factor
triggers a cleanup-only VACUUM in all indexes. It's also a problem
in cases where an autovacuum VACUUM triggered by
autovacuum_vacuum_insert_scale_factor finds a small number of dead tuples.
That will get index scans done by btbulkdelete() -- which are more
expensive than a VACUUM that only calls btvacuumcleanup().

Of course this is exactly what the patch you're working on for
Postgres 14 helps with. It's actually not very different (1 dead tuple
and 0 dead tuples are not very different). So it makes sense that we
ended up here -- vacuumlazy.c alone should be in control of this
stuff, because only vacuumlazy.c has the authority to see that 1 dead
tuple and 0 dead tuples should be considered the same thing (or almost
the same). So...maybe we can only truly fix the problem in Postgres 14
anyway, and should just accept that?

OTOH scanning the indexes for no reason when
autovacuum_vacuum_insert_scale_factor triggers an autovacuum VACUUM
does seem *particularly* silly. So I don't know what to think.

-- 
Peter Geoghegan




Re: Fix DROP TABLESPACE on Windows with ProcSignalBarrier?

2021-03-01 Thread Thomas Munro
On Tue, Feb 2, 2021 at 11:16 AM Thomas Munro  wrote:
> Right, the checkpoint itself is probably worse than this
> "close-all-your-files!" thing in some cases [...]

I've been wondering what obscure hazards these "tombstone" (for want
of a better word) files guard against, besides the one described in
the comments for mdunlink().  I've been  thinking about various
schemes that can be summarised as "put the tombstones somewhere else",
but first... this is probably a stupid question, but what would break
if we just ... turned all this stuff off when wal_level is high enough
(as it is by default)?


0001-Make-relfile-tombstone-files-conditional-on-WAL-leve.not-for-cfbot-patch
Description: Binary data


Re: 64-bit XIDs in deleted nbtree pages

2021-03-01 Thread Masahiko Sawada
On Tue, Mar 2, 2021 at 6:40 AM Peter Geoghegan  wrote:
>
> On Sun, Feb 28, 2021 at 8:08 PM Masahiko Sawada  wrote:
> > > Even though "the letter of the law" favors removing the
> > > vacuum_cleanup_index_scale_factor GUC + param in the way I have
> > > outlined, that is not the only thing that matters -- we must also
> > > consider "the spirit of the law".
>
> > > I suppose I could ask Tom what he thinks?
> >
> > +1
>
> Are you going to start a new thread, or should I?

Ok, I'll start a new thread soon.

>
> > Since it seems not a bug I personally think we don't need to do
> > anything for back branches. But if we want not to trigger an index
> > scan by vacuum_cleanup_index_scale_factor, we could change the default
> > value to a high value (say, to 1) so that it can skip an index
> > scan in most cases.
>
> One reason to remove vacuum_cleanup_index_scale_factor in the back
> branches is that it removes any need to fix the
> "IndexVacuumInfo.num_heap_tuples is inaccurate outside of
> btvacuumcleanup-only VACUUMs" bug -- it just won't matter if
> btm_last_cleanup_num_heap_tuples is inaccurate anymore. (I am still
> not sure about backpatch being a good idea, though.)

I think that removing vacuum_cleanup_index_scale_factor in the back
branches would affect existing installations too much. It would be
better to have btree indexes not use this parameter while not changing
the contents of the meta page. That is, just remove the check related to
vacuum_cleanup_index_scale_factor from _bt_vacuum_needs_cleanup(). And
I personally prefer to fix the "IndexVacuumInfo.num_heap_tuples is
inaccurate outside of btvacuumcleanup-only VACUUMs" bug separately.

>
> > > Another new concern for me (another concern unique to Postgres 13) is
> > > autovacuum_vacuum_insert_scale_factor-driven autovacuums.
> >
> > IIUC the purpose of autovacuum_vacuum_insert_scale_factor is
> > visibility map maintenance. And as per this discussion, it seems not
> > necessary to do an index scan in btvacuumcleanup() triggered by
> > autovacuum_vacuum_insert_scale_factor.
>
> Arguably the question of skipping scanning the index should have been
> considered by the autovacuum_vacuum_insert_scale_factor patch when it
> was committed for Postgres 13 -- but it wasn't. There is a regression
> that was tied to autovacuum_vacuum_insert_scale_factor in Postgres 13
> by Mark Callaghan, which I suspect is relevant:
>
> https://smalldatum.blogspot.com/2021/01/insert-benchmark-postgres-is-still.html
>
> The blog post says: "Updates - To understand the small regression
> mentioned above for the l.i1 test (more CPU & write IO) I repeated the
> test with 100M rows using 2 configurations: one disabled index
> deduplication and the other disabled insert-triggered autovacuum.
> Disabling index deduplication had no effect and disabling
> insert-triggered autovacuum resolves the regression."
>
> This is quite specifically with an insert-only workload, with 4
> indexes (that's from memory, but I'm pretty sure it's 4). I think that
> the failure to account for skipping index scans is probably the big
> problem here. Scanning the heap to set VM bits is unlikely to be
> expensive compared to the full index scans. An insert-only workload is
> going to find most of the heap blocks it scans to set VM bits in
> shared_buffers. Not so for the indexes.
>
> So in Postgres 13 we have this autovacuum_vacuum_insert_scale_factor
> issue, in addition to the deduplication + btvacuumcleanup issue we
> talked about (the problems left by my Postgres 13 bug fix commit
> 48e12913). These two issues make removing
> vacuum_cleanup_index_scale_factor tempting, even in the back branches
> -- it might actually be the more conservative approach, at least for
> Postgres 13.

Yeah, this argument makes sense to me. The default values of
autovacuum_vacuum_insert_scale_factor/threshold are 0.2 and 1000
respectively, whereas that of vacuum_cleanup_index_scale_factor is 0.1.
It means that in an insert-only workload with default settings,
autovacuums triggered by autovacuum_vacuum_insert_scale_factor always
scan the whole btree index to update the index statistics. I think most
users would not expect this behavior. As I mentioned above, I think we
can have nbtree not use this parameter or increase the default value
of vacuum_cleanup_index_scale_factor in back branches.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: repeated decoding of prepared transactions

2021-03-01 Thread Amit Kapila
On Tue, Mar 2, 2021 at 8:20 AM vignesh C  wrote:
>
>
> I have a minor comment regarding the below:
> + 
> +  
> +   two_phase bool
> +  
> +  
> +  True if two-phase commits are enabled on this slot.
> +  
> + 
>
> Can we change something like:
> True if the slot is enabled for decoding prepared transaction
> information. Refer link for more information.(link should point where
> more detailed information is available for two-phase in
> pg_create_logical_replication_slot).
>
> Also there is one small indentation in that line, I think there should
> be one space before "True if".
>

Okay, fixed these but I added a slightly different description. I have
also added the parameter description for
pg_create_logical_replication_slot in docs and changed the comments at
various places in the code. Apart from that ran pgindent. The patch
looks good to me now. Let me know what you think?
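
For reference, with the patch applied, usage would look something like this
(slot name made up, and assuming the parameter and view column end up being
called two_phase):

-- fourth argument is the new two_phase flag
SELECT slot_name, lsn
FROM pg_create_logical_replication_slot('test_slot', 'test_decoding', false, true);

SELECT slot_name, two_phase FROM pg_replication_slots;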

-- 
With Regards,
Amit Kapila.


v9-0001-Add-option-to-enable-two_phase-commits-via-pg_cre.patch
Description: Binary data


Re: REINDEX backend filtering

2021-03-01 Thread Julien Rouhaud
On Fri, Feb 26, 2021 at 11:17:26AM +0100, Magnus Hagander wrote:
> On Fri, Feb 26, 2021 at 11:07 AM Julien Rouhaud  wrote:
> >
> > It means that you'll have to distribute the work on a per-table basis
> > rather than a per-index basis.  The time spent to find out that a
> > table doesn't have any impacted index should be negligible compared to
> > the cost of running a reindex.  This obviously won't help that much if
> > you have a lot of tables but only one being gigantic.
> 
> Yeah -- or at least a couple of large and many small, which I find to
> be a very common scenario. Or the case of some tables having many
> affected indexes and some having few.
> 
> You'd basically want to order the operation by table on something like
> "total size of the affected indexes on table x" -- which may very well
> put a smaller table with many indexes earlier in the queue. But you
> can't do that without having access to the filter

So, a long-running reindex due to gigantic and/or numerous indexes on a
single (or a few) tables is not something that we can solve, but an inefficient
reindex due to a poor correlation between table size and the size of the
to-be-reindexed indexes can be addressed.

I would still prefer to go with a backend implementation, so that all client tools
can benefit from it by default.  We could simply expose the current
index_has_oudated_collation(oid) function at the SQL level, and tweak pg_dump to
order tables by the cumulated size of such indexes as you mentioned below; would
that work for you?
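
Something like the query below (index_has_oudated_collation() being the
proposed, not-yet-existing SQL-callable function) could then drive the
per-table ordering:

SELECT i.indrelid::regclass AS table_name,
       sum(pg_relation_size(i.indexrelid)) AS affected_index_size
FROM pg_index i
WHERE index_has_oudated_collation(i.indexrelid)
GROUP BY i.indrelid
ORDER BY affected_index_size DESC;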

Also, given Thomas' proposal in a nearby email, this function would be renamed to
index_has_oudated_dependencies(oid) or something like that.

> > But even if we put the logic in the client, this still won't help as
> > reindexdb doesn't support multiple job with an index list:
> >
> >  * Index-level REINDEX is not supported with multiple jobs as we
> >  * cannot control the concurrent processing of multiple indexes
> >  * depending on the same relation.
> >  */
> > if (concurrentCons > 1 && indexes.head != NULL)
> > {
> > pg_log_error("cannot use multiple jobs to reindex indexes");
> > exit(1);
> > }
> 
> That sounds like it would be a fixable problem though, in principle.
> It could/should probably still limit all indexes on the same table to
> be processed in the same connection for the locking reasons of course,
> but doing an order by the total size of the indexes like above, and
> ensuring that they are grouped that way, doesn't sound *that* hard. I
> doubt it's that important in the current usecase of manually listing
> the indexes, but it would be useful for something like this.

Yeah, I don't think that --index will be useful in the case of outdated
dependencies; it's likely that there will be too many indexes to process.  We can
still try to improve reindexdb to be able to process index lists with parallel
connections, but I would rather keep that separate from this patch.




Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2021-03-01 Thread Julien Rouhaud
On Wed, Jan 20, 2021 at 12:43 AM Julien Rouhaud  wrote:
>
> On Fri, Jan 8, 2021 at 1:07 AM Julien Rouhaud  wrote:
> >
> > v15 that fixes recent conflicts.
>
> Rebase only, thanks to the cfbot!  V16 attached.

A recent commit exposed that explain_filter() doesn't filter the
negative sign.  This can now be a problem with query identifiers in
EXPLAIN output, as they use the whole bigint range.  The attached v17 fixes
that, and is also rebased against current HEAD.
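
For reference, with compute_queryid enabled, the patched verbose EXPLAIN
reports the identifier as a signed bigint, e.g. (identifier value made up):

SET compute_queryid = on;
EXPLAIN (VERBOSE, COSTS OFF) SELECT 1;
              QUERY PLAN
--------------------------------------
 Result
   Output: 1
 Query Identifier: -1147616880456321802
(3 rows)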
From 8904b60ed3cc770f209ae5f97804b4d1ffe7d175 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud 
Date: Sun, 8 Mar 2020 14:34:44 +0100
Subject: [PATCH v17 3/3] Expose query identifier in verbose explain

If a query identifier has been computed, either by enabling compute_queryid or
using a third-party module, verbose explain will display it.

Author: Julien Rouhaud
Reviewed-by:
Discussion: https://postgr.es/m/CA+8PKvQnMfOE-c3YLRwxOsCYXQDyP8VXs6CDtMZp1V4=d4l...@mail.gmail.com
---
 doc/src/sgml/config.sgml  | 14 +++---
 doc/src/sgml/ref/explain.sgml |  6 --
 src/backend/commands/explain.c| 18 ++
 src/test/regress/expected/explain.out | 11 ++-
 src/test/regress/sql/explain.sql  |  5 -
 5 files changed, 43 insertions(+), 11 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 1763790473..2aeb146223 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7524,13 +7524,13 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
 Enables or disables in core query identifier computation.  A query
 identifier can be displayed in the pg_stat_activity
-view, or emitted in the log if configured via the  parameter.  The  extension also requires a query identifier
-to be computed.  Note that an external module can alternatively be used
-if the in core query identifier computation specification doesn't suit
-your need.  In this case, in core computation must be disabled.  The
-default is off.
+view, using EXPLAIN, or emitted in the log if
+configured via the  parameter.
+The  extension also requires a query
+identifier to be computed.  Note that an external module can
+alternatively be used if the in core query identifier computation
+specification doesn't suit your need.  In this case, in core
+computation must be disabled.  The default is off.

   
  
diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index c4512332a0..105b069b41 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -136,8 +136,10 @@ ROLLBACK;
   the output column list for each node in the plan tree, schema-qualify
   table and function names, always label variables in expressions with
   their range table alias, and always print the name of each trigger for
-  which statistics are displayed.  This parameter defaults to
-  FALSE.
+  which statistics are displayed.  The query identifier will also be
+  displayed if one has been compute, see  for more details.  This parameter
+  defaults to FALSE.
  
 

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index afc45429ba..ac5879c1cf 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -24,6 +24,7 @@
 #include "nodes/extensible.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "parser/analyze.h"
 #include "parser/parsetree.h"
 #include "rewrite/rewriteHandler.h"
 #include "storage/bufmgr.h"
@@ -163,6 +164,8 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 {
 	ExplainState *es = NewExplainState();
 	TupOutputState *tstate;
+	JumbleState *jstate = NULL;
+	Query		*query;
 	List	   *rewritten;
 	ListCell   *lc;
 	bool		timing_set = false;
@@ -239,6 +242,13 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
 	/* if the summary was not set explicitly, set default value */
 	es->summary = (summary_set) ? es->summary : es->analyze;
 
+	query = castNode(Query, stmt->query);
+	if (compute_queryid)
+		jstate = JumbleQuery(query, pstate->p_sourcetext);
+
+	if (post_parse_analyze_hook)
+		(*post_parse_analyze_hook) (pstate, query, jstate);
+
 	/*
 	 * Parse analysis was done already, but we still have to run the rule
 	 * rewriter.  We do not do AcquireRewriteLocks: we assume the query either
@@ -598,6 +608,14 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
 	/* Create textual dump of plan tree */
 	ExplainPrintPlan(es, queryDesc);
 
+	if (es->verbose && plannedstmt->queryId != UINT64CONST(0))
+	{
+		char	buf[MAXINT8LEN+1];
+
+		pg_lltoa(plannedstmt->queryId, buf);
+		ExplainPropertyText("Query Identifier", buf, es);
+	}
+
 	/* Show buffer usage in planning */
 	if (bufusage)
 	{
diff --git a/src/test/regress/expected/explain.out b/src/test/regress/expected/explain.out
index dc7ab2ce8b..f45f069f30 100644
--- a/src/

Re: [PATCH] refactor ATExec{En,Dis}ableRowSecurity

2021-03-01 Thread Michael Paquier
On Mon, Mar 01, 2021 at 03:30:44PM +0900, Michael Paquier wrote:
> 0001 is a clean simplification and a good catch, so I'll see about
> applying it.  0002 just makes the code more confusing to the reader
> IMO, and its interface could easily lead to unwanted errors.

0001 has been applied as of fabde52.
--
Michael


signature.asc
Description: PGP signature


Re: 64-bit XIDs in deleted nbtree pages

2021-03-01 Thread Peter Geoghegan
On Mon, Mar 1, 2021 at 1:40 PM Peter Geoghegan  wrote:
> > Since it seems not a bug I personally think we don't need to do
> > anything for back branches. But if we want not to trigger an index
> > scan by vacuum_cleanup_index_scale_factor, we could change the default
> > value to a high value (say, to 1) so that it can skip an index
> > scan in most cases.
>
> One reason to remove vacuum_cleanup_index_scale_factor in the back
> branches is that it removes any need to fix the
> "IndexVacuumInfo.num_heap_tuples is inaccurate outside of
> btvacuumcleanup-only VACUUMs" bug -- it just won't matter if
> btm_last_cleanup_num_heap_tuples is inaccurate anymore. (I am still
> not sure about backpatch being a good idea, though.)

Attached is v8 of the patch series, which has new patches. No real
changes compared to v7 for the first patch, though.

There are now two additional prototype patches to remove the
vacuum_cleanup_index_scale_factor GUC/param along the lines we've
discussed. This requires teaching VACUUM ANALYZE about when to trust
VACUUM cleanup to set the statistics (that's what v8-0002* does).

The general idea for VACUUM ANALYZE in v8-0002* is to assume that
cleanup-only VACUUMs won't set the statistics accurately -- so we need
to keep track of this during VACUUM (in case it's a VACUUM ANALYZE,
which now needs to know if index vacuuming was "cleanup only" or not).
This is not a new thing for hash indexes -- they never did anything in
the cleanup-only case (hashvacuumcleanup() just returns NULL). And now
nbtree does the same thing (usually). Not all AMs will, but the new
assumption is much better than the one it replaces.

I thought of another existing case that violated the faulty assumption
made by VACUUM ANALYZE (which v8-0002* fixes): VACUUM's INDEX_CLEANUP
feature (which was added to Postgres 12 by commit a96c41feec6) is
another case where VACUUM does nothing with indexes. VACUUM ANALYZE
mistakenly considers that index vacuuming must have run and set the
pg_class statistics to an accurate value (more accurate than it is
capable of). But with INDEX_CLEANUP off we won't even call
amvacuumcleanup().

-- 
Peter Geoghegan
From 967a057607ce2d0b648e324a9085ab4ccecd828e Mon Sep 17 00:00:00 2001
From: Peter Geoghegan 
Date: Thu, 25 Feb 2021 15:17:22 -0800
Subject: [PATCH v8 1/3] Recycle pages deleted during same VACUUM.

Author: Peter Geoghegan 
Discussion: https://postgr.es/m/CAH2-Wzk76_P=67iuscb1un44-gyzl-kgpsxbsxq_bdcma7q...@mail.gmail.com
---
 src/include/access/nbtree.h | 22 ++-
 src/backend/access/nbtree/README| 31 +
 src/backend/access/nbtree/nbtpage.c | 40 
 src/backend/access/nbtree/nbtree.c  | 97 +
 src/backend/access/nbtree/nbtxlog.c | 22 +++
 5 files changed, 211 insertions(+), 1 deletion(-)

diff --git a/src/include/access/nbtree.h b/src/include/access/nbtree.h
index b56b7b7868..876b8f3437 100644
--- a/src/include/access/nbtree.h
+++ b/src/include/access/nbtree.h
@@ -279,7 +279,8 @@ BTPageGetDeleteXid(Page page)
  * Is an existing page recyclable?
  *
  * This exists to centralize the policy on which deleted pages are now safe to
- * re-use.
+ * re-use.  The _bt_newly_deleted_pages_recycle() optimization behaves more
+ * aggressively, though that has certain known limitations.
  *
  * Note: PageIsNew() pages are always safe to recycle, but we can't deal with
  * them here (caller is responsible for that case themselves).  Caller might
@@ -316,14 +317,33 @@ BTPageIsRecyclable(Page page)
  * BTVacState is private nbtree.c state used during VACUUM.  It is exported
  * for use by page deletion related code in nbtpage.c.
  */
+typedef struct BTPendingRecycle
+{
+	BlockNumber blkno;
+	FullTransactionId safexid;
+} BTPendingRecycle;
+
 typedef struct BTVacState
 {
+	/*
+	 * VACUUM operation state
+	 */
 	IndexVacuumInfo *info;
 	IndexBulkDeleteResult *stats;
 	IndexBulkDeleteCallback callback;
 	void	   *callback_state;
 	BTCycleId	cycleid;
+
+	/*
+	 * Page deletion state for VACUUM
+	 */
 	MemoryContext pagedelcontext;
+	BTPendingRecycle *deleted;
+	bool		grow;
+	bool		full;
+	uint32		ndeletedspace;
+	uint64		maxndeletedspace;
+	uint32		ndeleted;
 } BTVacState;
 
 /*
diff --git a/src/backend/access/nbtree/README b/src/backend/access/nbtree/README
index 46d49bf025..265814ea46 100644
--- a/src/backend/access/nbtree/README
+++ b/src/backend/access/nbtree/README
@@ -430,6 +430,37 @@ whenever it is subsequently taken from the FSM for reuse.  The deleted
 page's contents will be overwritten by the split operation (it will become
 the new right sibling page).
 
+Prior to PostgreSQL 14, VACUUM was only able to recycle pages that were
+deleted by a previous VACUUM operation (VACUUM typically placed all pages
+deleted by the last VACUUM into the FSM, though there were and are no
+certainties here).  This had the obvious disadvantage of creating
+uncertainty about when and how pages get recycled, especially with bursty
+workloads.  It was naive, 

Re: archive_command / pg_stat_archiver & documentation

2021-03-01 Thread Julien Rouhaud
On Tue, Mar 2, 2021 at 9:29 AM Michael Paquier  wrote:
>
> On Mon, Mar 01, 2021 at 05:17:06PM +0800, Julien Rouhaud wrote:
> > Maybe this can be better addressed than with a link in the
> > documentation.  The final outcome is that it can be difficult to
> > monitor the archiver state in such case.  That's orthogonal to this
> > patch but maybe we can add a new "archiver_start" timestamptz column
> > in pg_stat_archiver, so monitoring tools can detect a problem if it's
> > too far away from pg_postmaster_start_time() for instance?
>
> There may be other solutions as well.  I have applied the doc patch
> for now.

Thanks!




Re: New IndexAM API controlling index vacuum strategies

2021-03-01 Thread Peter Geoghegan
On Sun, Feb 21, 2021 at 10:28 PM Masahiko Sawada  wrote:
> Sorry for the late response.

Me too!

> 1. Whereas skipping index vacuum and heap vacuum is a very attractive
> improvement, if we skip that by default I wonder if we need a way to
> disable it. Vacuum plays a role in cleaning and diagnosing tables in
> practice. So in a case where the table is in a bad state and the user wants
> to clean all heap pages, it would be good to have a way to disable
> this skipping behavior. One solution would be that index_cleanup
> option has three different behaviors: on, auto (or smart), and off. We
> enable this skipping behavior by default in ‘auto’ mode, but
> specifying "INDEX_CLEANUP true” means to enforce index vacuum and
> therefore disabling it.

Sounds reasonable to me. Maybe users should express the skipping
behavior that they desire in terms of the *proportion* of all heap
blocks with LP_DEAD line pointers that we're willing to have while
still skipping index vacuuming + lazy_vacuum_heap() heap scan. In
other words, it can be a "scale" type GUC/param (though based on heap
blocks *at the end* of the first heap scan, not tuples at the point
the av launcher considers launching AV workers).
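
To make Sawada-san's on/auto/off idea concrete (this is proposed, not existing,
syntax; table name made up), it would mean something like:

VACUUM (INDEX_CLEANUP auto) my_table;  -- default: skip index vacuuming when it looks unnecessary
VACUUM (INDEX_CLEANUP on) my_table;    -- force index vacuuming, like today's "true"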

> @@ -1299,6 +1303,7 @@ lazy_scan_heap(Relation onerel, VacuumParams
> *params, LVRelStats *vacrelstats,
> {
> lazy_record_dead_tuple(dead_tuples, &(tuple.t_self));
> all_visible = false;
> +   has_dead_tuples = true;
> continue;
> }
>
> I added the above change in the patch to count the number of heap
> pages having at least one LP_DEAD line pointer. But it's weird to me
> that we have never set has_dead_tuples true when we found an LP_DEAD
> line pointer.

I think that you're right. However, in practice it isn't harmful
because has_dead_tuples is only used when "all_visible = true", and
only to detect corruption (which should never happen). I think that it
should be fixed as part of this work, though.

Lots of stuff in this area is kind of weird already. Sometimes this is
directly exposed to users, even. This came up recently, when I was
working on VACUUM VERBOSE stuff. (This touched the precise piece of
code you've patched in the quoted diff snippet, so perhaps you know
some of the story I will tell you now already.)

I recently noticed that VACUUM VERBOSE can report a very low
tups_vacuumed/"removable heap tuples" when run against tables where
most pruning is opportunistic pruning rather than VACUUM pruning
(which is very common), provided there are no HOT updates (which is
common but not very common). This can be very confusing, because
VACUUM VERBOSE will report a "tuples_deleted" for the heap relation
that is far far less than the "tuples_removed" it reports for indexes
on the same table -- even though both fields have values that are
technically accurate (e.g., not very affected by concurrent activity
during VACUUM, nothing like that).

This came to my attention when I was running BenchmarkSQL for the
64-bit XID deleted pages patch. It showed up with one of the BenchmarkSQL
tables (though only one -- the table whose UPDATEs are not HOT safe, which
is unique among BenchmarkSQL/TPC-C tables). I pushed a commit with comment
changes [1] to make that aspect of VACUUM VERBOSE a little less
confusing. (I was actually running a quick-and-dirty hack that made
log_autovacuum show VACUUM VERBOSE index stuff -- I would probably
have missed the weird difference between heap tups_vacuumed and index
tuples_removed without this custom log_autovacuum hack.)

Just to be clear (I think you agree already): we should base any
triggering logic for skipping index vacuuming/lazy_vacuum_heap() on
logic that does not care *when* heap pages first contained LP_DEAD
line pointers (could be that they were counted in tups_vacuumed due to
being pruned during this VACUUM operation, could be from an earlier
opportunistic pruning, etc).

> Currently, we set it to true in the 'tupgone' case but
> it seems to me that we should do that in this case as well since we
> use this flag in the following check:
>
>else if (PageIsAllVisible(page) && has_dead_tuples)
>{
>elog(WARNING, "page containing dead tuples is marked as
> all-visible in relation \"%s\" page %u",
> vacrelstats->relname, blkno);
>PageClearAllVisible(page);
>MarkBufferDirty(buf);
>visibilitymap_clear(onerel, blkno, vmbuffer,
>VISIBILITYMAP_VALID_BITS);
>}

The "tupgone = true"/HEAPTUPLE_DEAD-race case is *extremely* weird. It
has zero test coverage according to coverage.postgresql.org [2],
despite being very complicated.

3 points on the "tupgone = true" weirdness (I'm writing this as a
record for myself, almost):

1. It is the reason why lazy_vacuum_heap() must be prepared to set
tuples LP_UNUSED that are not already LP_DEAD. So when
lazy_vacuum_page() says "the first dead tuple for this page",  that
doesn't

[PATCH] psql : Improve code for help option

2021-03-01 Thread miyake_kouta

Hi.

I found some redundant code in psql/help.c, so I propose a patch to fix 
it.
In the current help.c, the variable "user" is set to the value of 
$PGUSER (or get_user_name).

However, $PGUSER is referenced again in the code that follows.
We can replace this part with "user", I think.

Regards.
--
Kota Miyake

diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index e44120bf76..75a2217cac 100644
--- a/src/bin/psql/help.c
+++ b/src/bin/psql/help.c
@@ -134,9 +134,7 @@ usage(unsigned short int pager)
 	fprintf(output, _("  -p, --port=PORT  database server port (default: \"%s\")\n"),
 			env ? env : DEF_PGPORT_STR);
 	/* Display default user */
-	env = getenv("PGUSER");
-	if (!env)
-		env = user;
+	env = user;
 	fprintf(output, _("  -U, --username=USERNAME  database user name (default: \"%s\")\n"), env);
 	fprintf(output, _("  -w, --no-passwordnever prompt for password\n"));
 	fprintf(output, _("  -W, --password   force password prompt (should happen automatically)\n"));


Re: [PATCH] pgbench: Bug fix for the -d option

2021-03-01 Thread miyake_kouta

2021-02-26 20:30, Michael Paquier wrote:

By the way, I can't help but wonder why pgbench has such a different
handling for the user name, fetching first PGUSER and then looking at
the options while most of the other binaries use get_user_name().  It
seems to me that we could simplify the handling around "login" without
really impacting the usefulness of the tool, no?


Hi.

Thank you for your comment.
I modified the patch based on other binaries.
In this new patch, if $PGUSER is set, it is used as the login.
If not, get_user_name_or_exit() is executed.
Please let me know what you think about this change.
--
Kota Miyake

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 31a4df45f5..b20c7b5b27 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -32,6 +32,7 @@
 #endif
 
 #include "postgres_fe.h"
+#include "common/username.h"
 
 #include 
 #include 
@@ -5487,8 +5488,10 @@ main(int argc, char **argv)
 		pghost = env;
 	if ((env = getenv("PGPORT")) != NULL && *env != '\0')
 		pgport = env;
-	else if ((env = getenv("PGUSER")) != NULL && *env != '\0')
+	if ((env = getenv("PGUSER")) != NULL && *env != '\0')
 		login = env;
+	else
+		login = get_user_name_or_exit(progname);
 
 	state = (CState *) pg_malloc0(sizeof(CState));
 


Re: repeated decoding of prepared transactions

2021-03-01 Thread vignesh C
On Tue, Mar 2, 2021 at 6:37 AM Ajin Cherian  wrote:
>
> On Mon, Mar 1, 2021 at 8:08 PM Amit Kapila  wrote:
>
> > Few minor comments on 0002 patch
> > =
> > 1.
> >   ctx->streaming &= enable_streaming;
> > - ctx->twophase &= enable_twophase;
> > +
> >  }
> >
> > Spurious line addition.
>
> Deleted.
>
> >
> > 2.
> > -  proallargtypes =>
> > '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8}',
> > -  proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o}',
> > -  proargnames =>
> > '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size}',
> > +  proallargtypes =>
> > '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool}',
> > +  proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
> > +  proargnames =>
> > '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,twophase}',
> >prosrc => 'pg_get_replication_slots' },
> >  { oid => '3786', descr => 'set up a logical replication slot',
> >proname => 'pg_create_logical_replication_slot', provolatile => 'v',
> > -  proparallel => 'u', prorettype => 'record', proargtypes => 'name name 
> > bool',
> > -  proallargtypes => '{name,name,bool,name,pg_lsn}',
> > -  proargmodes => '{i,i,i,o,o}',
> > -  proargnames => '{slot_name,plugin,temporary,slot_name,lsn}',
> > +  proparallel => 'u', prorettype => 'record', proargtypes => 'name
> > name bool bool',
> > +  proallargtypes => '{name,name,bool,bool,name,pg_lsn}',
> > +  proargmodes => '{i,i,i,i,o,o}',
> > +  proargnames => '{slot_name,plugin,temporary,twophase,slot_name,lsn}',
> >
> > I think it is better to use two_phase here and at other places as well
> > to be consistent with similar parameters.
>
> Updated as requested.
> >
> > 3.
> > --- a/src/backend/catalog/system_views.sql
> > +++ b/src/backend/catalog/system_views.sql
> > @@ -894,7 +894,8 @@ CREATE VIEW pg_replication_slots AS
> >  L.restart_lsn,
> >  L.confirmed_flush_lsn,
> >  L.wal_status,
> > -L.safe_wal_size
> > +L.safe_wal_size,
> > + L.twophase
> >  FROM pg_get_replication_slots() AS L
> >
> > Indentation issue. Here, you need to use spaces instead of tabs.
>
> Updated.
> >
> > 4.
> > @@ -533,6 +533,12 @@ CreateDecodingContext(XLogRecPtr start_lsn,
> >
> >   ctx->reorder->output_rewrites = ctx->options.receive_rewrites;
> >
> > + /*
> > + * If twophase is set on the slot at create time, then
> > + * make sure the field in the context is also updated.
> > + */
> > + ctx->twophase &= MyReplicationSlot->data.twophase;
> > +
> >
> > Why didn't you make a similar change in CreateInitDecodingContext when I
> > already suggested the same in my previous email? If we don't make that
> > change then during slot initialization two_phase will always be true
> > even though the user passed in false. It looks inconsistent, and even
> > though there is no direct problem due to that, it could be the cause of
> > problems in the future.
>
> Updated.
>

I have a minor comment regarding the below:
+ 
+  
+   two_phase bool
+  
+  
+  True if two-phase commits are enabled on this slot.
+  
+ 

Can we change it to something like:
True if the slot is enabled for decoding prepared transaction
information. Refer to the link for more information. (The link should
point to where more detailed information about two-phase is available,
i.e. in pg_create_logical_replication_slot.)

Also there is one small indentation in that line, I think there should
be one space before "True if".

Regards,
Vignesh




Re: We should stop telling users to "vacuum that database in single-user mode"

2021-03-01 Thread Noah Misch
On Mon, Mar 01, 2021 at 04:32:23PM +0100, Hannu Krosing wrote:
> It looks like we are unnecessarily instructing our usiers to vacuum their
> databases in single-user mode when just vacuuming would be enough.

> When I started looking at improving the situation I discovered, that there
> already is no need to run VACUUM in single user mode in any currently 
> supported
> PostgreSQL version as you can run VACUUM perfectly well when the wraparound
> protection is active.

A comment in SetTransactionIdLimit() says, "VACUUM requires an XID if it
truncates at wal_level!=minimal."  Hence, I think plain VACUUM will fail some
of the time; VACUUM (TRUNCATE false) should be reliable.  In general, I like
your goal of replacing painful error message advice with less-painful advice.
At the same time, it's valuable for the advice to reliably get the user out of
the bad situation.




Re: A reloption for partitioned tables - parallel_workers

2021-03-01 Thread Amit Langote
On Tue, Mar 2, 2021 at 12:10 AM Laurenz Albe  wrote:
> Here is an updated patch with this fix.

Thanks for updating the patch.  I was about to post an updated version
myself but you beat me to it.

> I added regression tests and adapted the documentation a bit.
>
> I also added support for lowering the number of parallel workers.

+ALTER TABLE pagg_tab_ml SET (parallel_workers = 0);
+EXPLAIN (COSTS OFF)
+SELECT a FROM pagg_tab_ml WHERE b = 42;
+QUERY PLAN
+---
+ Append
+   ->  Seq Scan on pagg_tab_ml_p1 pagg_tab_ml_1
+ Filter: (b = 42)
+   ->  Seq Scan on pagg_tab_ml_p2_s1 pagg_tab_ml_2
+ Filter: (b = 42)
+   ->  Seq Scan on pagg_tab_ml_p2_s2 pagg_tab_ml_3
+ Filter: (b = 42)
+(7 rows)

I got the same result with my implementation, but I am wondering if
setting parallel_workers=0 on the parent table shouldn't really
disable a regular (non-parallel-aware) Append running under Gather
even if it does disable Parallel Append (parallel-aware)?  So in this test
case, there should have been a Gather atop Append, with individual
partitions scanned using Parallel Seq Scan where applicable.


--
Amit Langote
EDB: http://www.enterprisedb.com




Re: simplifying foreign key/RI checks

2021-03-01 Thread Amit Langote
On Mon, Mar 1, 2021 at 3:14 PM Corey Huinker  wrote:
>> > It seems to me 1 (RI_PLAN_CHECK_LOOKUPPK) is still alive. (Yeah, I
>> > know that doesn't mean the usefulness of the macro but the mechanism
>> > the macro suggests, but it is confusing.) On the other hand,
>> > RI_PLAN_CHECK_LOOKUPPK_FROM_PK and RI_PLAN_LAST_ON_PK seem to be no
>> > longer used.  (Couldn't we remove them?)
>>
>> Yeah, better to just remove those _PK macros and say this module no
>> longer runs any queries on the PK table.
>>
>> How about the attached?
>>
>
> Sorry for the delay.
> I see that the changes were made as described.
> Passes make check and make check-world yet again.
> I'm marking this Ready For Committer unless someone objects.

Thank you Corey for the review.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: doc review for v14

2021-03-01 Thread Michael Paquier
On Mon, Mar 01, 2021 at 03:17:40PM +0900, Michael Paquier wrote:
> You could say here "Checksums can be enabled", but "normally" does not
> sound bad to me either as it insists on the fact that it is better to
> do that when the cluster is initdb'd as this has no downtime compared
> to enabling checksums on an existing cluster.

I looked at that again this morning, and the last version sent
upthread still looked fine to me, so I have just applied that.

Thanks for caring about that, Justin.
--
Michael




Re: [BUG] Autovacuum not dynamically decreasing cost_limit and cost_delay

2021-03-01 Thread Masahiko Sawada
On Mon, Feb 8, 2021 at 11:49 PM Mead, Scott  wrote:
>
> Hello,
>I recently looked at what it would take to make a running autovacuum 
> pick-up a change to either cost_delay or cost_limit.  Users frequently will 
> have a conservative value set, and then wish to change it when autovacuum 
> initiates a freeze on a relation.  Most users end up finding out they are in 
> ‘to prevent wraparound’ after it has happened, this means that if they want 
> the vacuum to take advantage of more I/O, they need to stop and then restart 
> the currently running vacuum (after reloading the GUCs).
>
>   Initially, my goal was to determine feasibility for making this dynamic.  I 
> added debug code to vacuum.c:vacuum_delay_point(void) and found that changes 
> to cost_delay and cost_limit are already processed by a running vacuum.  
> There was a bug preventing the cost_delay or cost_limit from being configured 
> to allow higher throughput however.
>
> I believe this is a bug because currently, autovacuum will dynamically detect 
> and increase the cost_limit or cost_delay, but it can never decrease those 
> values beyond their setting when the vacuum began.  The current behavior is 
> for vacuum to limit the maximum throughput of currently running vacuum 
> processes to the cost_limit that was set when the vacuum process began.

Thanks for your report.

I've not looked at the patch yet, but I agree that the calculation for
the autovacuum cost delay does not seem to work correctly if
vacuum-delay-related parameters (e.g., autovacuum_vacuum_cost_delay) are
changed while vacuuming a table to speed up running autovacuums. Here is
my analysis:

Suppose we have the following parameters and 3 autovacuum workers are
running on different tables:

autovacuum_vacuum_cost_delay = 100
autovacuum_vacuum_cost_limit = 100

Vacuum cost-based delay parameters for each worker are as follows:

worker->wi_cost_limit_base = 100
worker->wi_cost_limit = 66
worker->wi_cost_delay = 100

Each running autovacuum has "wi_cost_limit = 66" because the total
limit (100) is equally rationed. And another point is that the total
wi_cost_limit (198 = 66*3) is less than autovacuum_vacuum_cost_limit,
100. Which are fine.

Here let's change autovacuum_vacuum_cost_delay/limit value to speed up
running autovacuums.

Case 1 : increasing autovacuum_vacuum_cost_limit to 1000.

After reloading the configuration file, vacuum cost-based delay
parameters for each worker become as follows:

worker->wi_cost_limit_base = 100
worker->wi_cost_limit = 100
worker->wi_cost_delay = 100

If we rationed autovacuum_vacuum_cost_limit, 1000, to 3 workers, it
would be 333. But since we cap it by wi_cost_limit_base, the
wi_cost_limit is 100. I think this is what Mead reported here.

Case 2 : decreasing autovacuum_vacuum_cost_delay to 10.

After reloading the configuration file, vacuum cost-based delay
parameters for each worker become as follows:

worker->wi_cost_limit_base = 100
worker->wi_cost_limit = 100
worker->wi_cost_delay = 100

Actually, the result is the same as case 1. But in this case, the
total cost among the three workers is 300, which is greater than
autovacuum_vacuum_cost_limit, 100. This behavior violates what the
documentation explains in the description of
autovacuum_vacuum_cost_limit:

---
Note that the value is distributed proportionally among the running
autovacuum workers, if there is more than one, so that the sum of the
limits for each worker does not exceed the value of this variable.
---

It seems to me that those problems come from the fact that we don't
change either wi_cost_limit_base or wi_cost_delay while auto-vacuuming
a table, in spite of using autovacuum_vac_cost_limit/delay to calculate
cost_avail. Such a wrong calculation persists until all running
autovacuum workers finish their current vacuums. When a worker starts to
process a new table, it resets both wi_cost_limit_base and
wi_cost_delay.

Looking at autovac_balance_cost(), it considers worker's
wi_cost_limit_base to calculate the total base cost limit of
participating active workers as follows:

cost_total +=
(double) worker->wi_cost_limit_base / worker->wi_cost_delay;

But what is the point of calculating it while assuming each worker has
a different cost limit? Since workers vacuuming a table whose cost
parameters are set individually don't participate in this
calculation (by commit 1021bd6a8 in 2014), having at_dobalance true, I
wonder if we can just assume all workers have the same cost_limit and
cost_delay except for workers setting at_dobalance true. If we can do
that, I guess we no longer need wi_cost_limit_base.
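
For illustration only, here is a rough, self-contained sketch of the
rationing logic described above (not the real autovac_balance_cost()
code; the struct and macros below exist only to make the example
compile). It shows where the wi_cost_limit_base cap prevents a raised
autovacuum_vacuum_cost_limit from taking effect on a running vacuum:

/* Hypothetical stand-ins; the real fields live in WorkerInfoData. */
#define Max(x, y)	((x) > (y) ? (x) : (y))
#define Min(x, y)	((x) < (y) ? (x) : (y))

typedef struct WorkerSketch
{
	int		wi_cost_limit_base;	/* frozen when the table's vacuum started */
	int		wi_cost_limit;		/* what the worker actually obeys */
	double	wi_cost_delay;
} WorkerSketch;

static void
balance_cost_sketch(WorkerSketch *workers, int nworkers,
					int av_cost_limit, double av_cost_delay)
{
	double	cost_total = 0.0;
	double	cost_avail;

	/* total up the base costs of the participating workers */
	for (int i = 0; i < nworkers; i++)
		cost_total += (double) workers[i].wi_cost_limit_base /
			workers[i].wi_cost_delay;
	if (cost_total <= 0)
		return;

	/* ration the configured limit among the active workers ... */
	cost_avail = (double) av_cost_limit / av_cost_delay;
	for (int i = 0; i < nworkers; i++)
	{
		int		limit = (int) (cost_avail *
							   workers[i].wi_cost_limit_base /
							   workers[i].wi_cost_delay / cost_total);

		/*
		 * ... but never above wi_cost_limit_base, so raising
		 * autovacuum_vacuum_cost_limit after the vacuum started has no
		 * effect on this worker.
		 */
		workers[i].wi_cost_limit = Max(Min(limit,
										   workers[i].wi_cost_limit_base), 1);
	}
}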

Also, we don't change wi_cost_delay while vacuuming a table, which
seems wrong to me. autovac_balance_cost() can change workers'
wi_cost_delay, which is eventually applied to VacuumCostDelay.

What do you think?

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: TRIM_ARRAY

2021-03-01 Thread Vik Fearing
On 3/2/21 1:02 AM, Dian M Fay wrote:
> On Mon Mar 1, 2021 at 6:53 PM EST, Vik Fearing wrote:
>>> This basically does what it says, and the code looks good. The
>>> documentation is out of alphabetical order (trim_array should appear
>>> under cardinality rather than over)) but good otherwise.
>>
>> Hmm. It appears between cardinality and unnest in the source code and
>> also my compiled html. Can you say more about where you're seeing the
>> wrong order?
> 
> I applied the patch to the latest commit, ffd3944ab9. Table 9.52 is
> ordered:
> 
> array_to_string
> array_upper
> trim_array
> cardinality
> unnest

So it turns out I must have fixed it locally after I posted the patch
and then forgot I did that.  Attached is a new patch with the order
correct.  Thanks for spotting it!

>> The problem here is that postgres needs to know what the return
>> type is and it can only determine that from the input.
>>
>> If you give the function a typed null, it returns null as expected.
>>
>>> The new status of this patch is: Waiting on Author
>>
>> I put it back to Needs Review without a new patch because I don't know
>> what I would change.
> 
> I'd thought that checking v and returning null instead of raising the
> error would be more friendly, should it be possible to pass an untyped
> null accidentally instead of on purpose, and I couldn't rule that out.

As Tom said, that is something that does not belong in this patch.
-- 
Vik Fearing
>From 351bf14400d8ea2948788a96b47ce6c20847de97 Mon Sep 17 00:00:00 2001
From: Vik Fearing 
Date: Tue, 16 Feb 2021 18:38:24 +0100
Subject: [PATCH] implement trim_array

---
 doc/src/sgml/func.sgml   | 18 
 src/backend/catalog/sql_features.txt |  2 +-
 src/backend/utils/adt/arrayfuncs.c   | 42 
 src/include/catalog/pg_proc.dat  |  5 
 src/test/regress/expected/arrays.out | 23 +++
 src/test/regress/sql/arrays.sql  | 19 +
 6 files changed, 108 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 08f08322ca..2aac87cf8d 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -17909,6 +17909,24 @@ SELECT NULLIF(value, '(none)') ...

   
 
+  
+   
+
+ trim_array
+
+trim_array ( array anyarray, n integer )
+anyarray
+   
+   
+Trims an array by removing the last n elements.
+If the array is multidimensional, only the first dimension is trimmed.
+   
+   
+trim_array(ARRAY[1,2,3,4,5,6], 2)
+{1,2,3,4}
+   
+  
+
   

 
diff --git a/src/backend/catalog/sql_features.txt b/src/backend/catalog/sql_features.txt
index ab0895ce3c..32eed988ab 100644
--- a/src/backend/catalog/sql_features.txt
+++ b/src/backend/catalog/sql_features.txt
@@ -398,7 +398,7 @@ S301	Enhanced UNNEST			YES
 S401	Distinct types based on array types			NO	
 S402	Distinct types based on distinct types			NO	
 S403	ARRAY_MAX_CARDINALITY			NO	
-S404	TRIM_ARRAY			NO	
+S404	TRIM_ARRAY			YES	
 T011	Timestamp in Information Schema			NO	
 T021	BINARY and VARBINARY data types			NO	
 T022	Advanced support for BINARY and VARBINARY data types			NO	
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index f7012cc5d9..d38a99f0b0 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -6631,3 +6631,45 @@ width_bucket_array_variable(Datum operand,
 
 	return left;
 }
+
+/*
+ * Trim the right N elements from an array by calculating an appropriate slice.
+ * Only the first dimension is trimmed.
+ */
+Datum
+trim_array(PG_FUNCTION_ARGS)
+{
+	ArrayType  *v = PG_GETARG_ARRAYTYPE_P(0);
+	int			n = PG_GETARG_INT32(1);
+	int			array_length = ARR_DIMS(v)[0];
+	int16		elmlen;
+	bool		elmbyval;
+	char		elmalign;
+	int			lower[MAXDIM];
+	int			upper[MAXDIM];
+	bool		lowerProvided[MAXDIM];
+	bool		upperProvided[MAXDIM];
+	Datum		result;
+
+	/* Throw an error if out of bounds */
+	if (n < 0 || n > array_length)
+		ereport(ERROR,
+(errcode(ERRCODE_ARRAY_ELEMENT_ERROR),
+ errmsg("number of elements to trim must be between 0 and %d", array_length)));
+
+	/* Set all the bounds as unprovided except the first upper bound */
+	memset(lowerProvided, 0, sizeof(lowerProvided));
+	memset(upperProvided, 0, sizeof(upperProvided));
+	upper[0] = ARR_LBOUND(v)[0] + array_length - n - 1;
+	upperProvided[0] = true;
+
+	/* Fetch the needed information about the element type */
+	get_typlenbyvalalign(ARR_ELEMTYPE(v), &elmlen, &elmbyval, &elmalign);
+
+	/* Get the slice */
+	result = array_get_slice(PointerGetDatum(v), 1,
+			 upper, lower, upperProvided, lowerProvided,
+			 -1, elmlen, elmbyval, elmalign);
+
+	PG_RETURN_DATUM(result);
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 1487710d59..8ab911238d 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.

Re: archive_command / pg_stat_archiver & documentation

2021-03-01 Thread Michael Paquier
On Mon, Mar 01, 2021 at 05:17:06PM +0800, Julien Rouhaud wrote:
> Maybe this can be better addressed than with a link in the
> documentation.  The final outcome is that it can be difficult to
> monitor the archiver state in such case.  That's orthogonal to this
> patch but maybe we can add a new "archiver_start" timestamptz column
> in pg_stat_archiver, so monitoring tools can detect a problem if it's
> too far away from pg_postmaster_start_time() for instance?

There may be other solutions as well.  I have applied the doc patch
for now.
--
Michael




Re: [PATCH] Bug fix in initdb output

2021-03-01 Thread Michael Paquier
On Tue, Mar 02, 2021 at 01:28:57AM +0100, Juan José Santamaría Flecha wrote:
> For me it is a +1 for the change to absolute. Let's see if more people want
> to weigh in on the matter.

FWIW, I don't think that it is a good idea to come back to this
decision for *nix platforms, so I would leave it as-is, and use relative
paths if initdb is called using a relative path.

The number of people using a relative path for Windows initialization
sounds rather limited to me.  However, if you can write something that
makes the path printed compatible for a WIN32 terminal while keeping
the behavior consistent with other platforms, people would have no
objections to that.
--
Michael




Re: PostmasterIsAlive() in recovery (non-USE_POST_MASTER_DEATH_SIGNAL builds)

2021-03-01 Thread Thomas Munro
On Tue, Mar 2, 2021 at 12:00 AM Thomas Munro  wrote:
> On Mon, Nov 16, 2020 at 8:56 PM Michael Paquier  wrote:
> > No objections with the two changes from pg_usleep() to WaitLatch() so
> > they could be applied separately first.
>
> I thought about committing that first part, and got as far as
> splitting the patch into two (see attached), but then I re-read
> Fujii-san's message about the speed of promotion and realised that we
> really should have something like a condition variable for walRcvState
> changes.  I'll look into that.

Here's an experimental attempt at that, though I'm not sure if it's
the right approach.  Of course it's not necessary to use condition
variables here: we could use recoveryWakeupLatch, because we're not in
any doubt about who needs to be woken up.  But then you could get
constant wakeups while recovery is paused, unless you also suppressed
that somehow.  You could use the startup process's procLatch,
advertised in shmem, but that's almost a condition variable.  With a
condition variable, you get to name it something like
walRcvStateChanged, and then the programming rule is very clear: if
you change walRcvState, you need to broadcast that fact (and you don't
have to worry about who might be interested).  One question I haven't
got to the bottom of: is it a problem for the startup process that CVs
use CHECK_FOR_INTERRUPTS()?
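
As a minimal sketch of the waiting side under that rule (a hypothetical
helper, not part of the patch below; it assumes the walRcvStateChanged
CV and the WAIT_EVENT_WALRCV_EXIT wait event that the patch adds, plus
the usual spin.h and condition_variable.h includes):

static void
WaitForWalRcvState(WalRcvData *walrcv, WalRcvState wanted)
{
	ConditionVariablePrepareToSleep(&walrcv->walRcvStateChanged);
	for (;;)
	{
		WalRcvState state;

		SpinLockAcquire(&walrcv->mutex);
		state = walrcv->walRcvState;
		SpinLockRelease(&walrcv->mutex);

		if (state == wanted)
			break;

		/* rechecks after every broadcast; this is where CFI() happens */
		ConditionVariableSleep(&walrcv->walRcvStateChanged,
							   WAIT_EVENT_WALRCV_EXIT);
	}
	ConditionVariableCancelSleep();
}
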
From 1e28b9c21a953e66c48f78a90192f3a0ca83d0aa Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Tue, 2 Mar 2021 11:08:06 +1300
Subject: [PATCH v6 1/3] Add condition variable for walreceiver state.

Use this new CV to wait for walreceiver shutdown without a sleep/poll
loop, while also benefiting from standard postmaster death handling.

Reviewed-by: Heikki Linnakangas 
Reviewed-by: Fujii Masao 
Reviewed-by: Michael Paquier 
Discussion: https://postgr.es/m/CA%2BhUKGK1607VmtrDUHQXrsooU%3Dap4g4R2yaoByWOOA3m8xevUQ%40mail.gmail.com
---
 doc/src/sgml/monitoring.sgml   |  4 +++
 src/backend/postmaster/pgstat.c|  3 ++
 src/backend/replication/walreceiver.c  | 10 ++
 src/backend/replication/walreceiverfuncs.c | 42 ++
 src/include/pgstat.h   |  1 +
 src/include/replication/walreceiver.h  |  2 ++
 6 files changed, 48 insertions(+), 14 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 3513e127b7..cf00210cb3 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1757,6 +1757,10 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34   0:00 postgres: ser
   Waiting for confirmation from a remote server during synchronous
replication.
  
+ 
+  WalrcvExit
+  Waiting for the walreceiver to exit.
+ 
  
   XactGroupUpdate
   Waiting for the group leader to update transaction status at
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index f75b52719d..2dbfb81e40 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -4122,6 +4122,9 @@ pgstat_get_wait_ipc(WaitEventIPC w)
 		case WAIT_EVENT_SYNC_REP:
 			event_name = "SyncRep";
 			break;
+		case WAIT_EVENT_WALRCV_EXIT:
+			event_name = "WalrcvExit";
+			break;
 		case WAIT_EVENT_XACT_GROUP_UPDATE:
 			event_name = "XactGroupUpdate";
 			break;
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index e5f8a06fea..397a94d7af 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -207,6 +207,8 @@ WalReceiverMain(void)
 
 		case WALRCV_STOPPED:
 			SpinLockRelease(&walrcv->mutex);
+			/* We might have changed state and fallen through above. */
+			ConditionVariableBroadcast(&walrcv->walRcvStateChanged);
 			proc_exit(1);
 			break;
 
@@ -249,6 +251,8 @@ WalReceiverMain(void)
 
 	SpinLockRelease(&walrcv->mutex);
 
+	ConditionVariableBroadcast(&walrcv->walRcvStateChanged);
+
 	pg_atomic_write_u64(&WalRcv->writtenUpto, 0);
 
 	/* Arrange to clean up at walreceiver exit */
@@ -647,6 +651,8 @@ WalRcvWaitForStartPosition(XLogRecPtr *startpoint, TimeLineID *startpointTLI)
 	walrcv->receiveStartTLI = 0;
 	SpinLockRelease(&walrcv->mutex);
 
+	ConditionVariableBroadcast(&walrcv->walRcvStateChanged);
+
 	set_ps_display("idle");
 
 	/*
@@ -675,6 +681,8 @@ WalRcvWaitForStartPosition(XLogRecPtr *startpoint, TimeLineID *startpointTLI)
 			*startpointTLI = walrcv->receiveStartTLI;
 			walrcv->walRcvState = WALRCV_STREAMING;
 			SpinLockRelease(&walrcv->mutex);
+
+			ConditionVariableBroadcast(&walrcv->walRcvStateChanged);
 			break;
 		}
 		if (walrcv->walRcvState == WALRCV_STOPPING)
@@ -784,6 +792,8 @@ WalRcvDie(int code, Datum arg)
 	walrcv->latch = NULL;
 	SpinLockRelease(&walrcv->mutex);
 
+	ConditionVariableBroadcast(&walrcv->walRcvStateChanged);
+
 	/* Terminate the connection gracefully. */
 	if (wrconn != NULL)
 		walrcv_disconnect(wrconn);
diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backen

Re: repeated decoding of prepared transactions

2021-03-01 Thread Ajin Cherian
On Mon, Mar 1, 2021 at 8:08 PM Amit Kapila  wrote:

> Few minor comments on 0002 patch
> =
> 1.
>   ctx->streaming &= enable_streaming;
> - ctx->twophase &= enable_twophase;
> +
>  }
>
> Spurious line addition.

Deleted.

>
> 2.
> -  proallargtypes =>
> '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8}',
> -  proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o}',
> -  proargnames =>
> '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size}',
> +  proallargtypes =>
> '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,int8,bool}',
> +  proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
> +  proargnames =>
> '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,safe_wal_size,twophase}',
>prosrc => 'pg_get_replication_slots' },
>  { oid => '3786', descr => 'set up a logical replication slot',
>proname => 'pg_create_logical_replication_slot', provolatile => 'v',
> -  proparallel => 'u', prorettype => 'record', proargtypes => 'name name 
> bool',
> -  proallargtypes => '{name,name,bool,name,pg_lsn}',
> -  proargmodes => '{i,i,i,o,o}',
> -  proargnames => '{slot_name,plugin,temporary,slot_name,lsn}',
> +  proparallel => 'u', prorettype => 'record', proargtypes => 'name
> name bool bool',
> +  proallargtypes => '{name,name,bool,bool,name,pg_lsn}',
> +  proargmodes => '{i,i,i,i,o,o}',
> +  proargnames => '{slot_name,plugin,temporary,twophase,slot_name,lsn}',
>
> I think it is better to use two_phase here and at other places as well
> to be consistent with similar parameters.

Updated as requested.
>
> 3.
> --- a/src/backend/catalog/system_views.sql
> +++ b/src/backend/catalog/system_views.sql
> @@ -894,7 +894,8 @@ CREATE VIEW pg_replication_slots AS
>  L.restart_lsn,
>  L.confirmed_flush_lsn,
>  L.wal_status,
> -L.safe_wal_size
> +L.safe_wal_size,
> + L.twophase
>  FROM pg_get_replication_slots() AS L
>
> Indentation issue. Here, you need to use spaces instead of tabs.

Updated.
>
> 4.
> @@ -533,6 +533,12 @@ CreateDecodingContext(XLogRecPtr start_lsn,
>
>   ctx->reorder->output_rewrites = ctx->options.receive_rewrites;
>
> + /*
> + * If twophase is set on the slot at create time, then
> + * make sure the field in the context is also updated.
> + */
> + ctx->twophase &= MyReplicationSlot->data.twophase;
> +
>
> Why didn't you make a similar change in CreateInitDecodingContext when I
> already suggested the same in my previous email? If we don't make that
> change then during slot initialization two_phase will always be true
> even though the user passed in false. It looks inconsistent, and even
> though there is no direct problem due to that, it could be the cause of
> problems in the future.

Updated.

regards,
Ajin Cherian
Fujitsu Australia


v8-0001-Add-option-to-enable-two-phase-commits-in-pg_crea.patch
Description: Binary data


Re: TRIM_ARRAY

2021-03-01 Thread Dian M Fay
On Mon Mar 1, 2021 at 6:53 PM EST, Vik Fearing wrote:
> > This basically does what it says, and the code looks good. The
> > documentation is out of alphabetical order (trim_array should appear
> > under cardinality rather than over)) but good otherwise.
>
> Hmm. It appears between cardinality and unnest in the source code and
> also my compiled html. Can you say more about where you're seeing the
> wrong order?

I applied the patch to the latest commit, ffd3944ab9. Table 9.52 is
ordered:

array_to_string
array_upper
trim_array
cardinality
unnest

> The problem here is that postgres needs to know what the return
> type is and it can only determine that from the input.
>
> If you give the function a typed null, it returns null as expected.
>
> > The new status of this patch is: Waiting on Author
>
> I put it back to Needs Review without a new patch because I don't know
> what I would change.

I'd thought that checking v and returning null instead of raising the
error would be more friendly, should it be possible to pass an untyped
null accidentally instead of on purpose, and I couldn't rule that out.
I've got no objections other than the docs having been displaced.




Re: Why does the BF sometimes not dump regression.diffs?

2021-03-01 Thread Thomas Munro
On Tue, Mar 2, 2021 at 1:32 PM Andrew Dunstan  wrote:
> The version numbering is a bit misleading on fairywren which, as it's a
> machine I control, runs from a git checkout that is clearly later than
> REL_11 even though that's what the version string says. Commit 13d2143
> fixed this. It was included in Release 12.

Oh, thanks!




Re: Why does the BF sometimes not dump regression.diffs?

2021-03-01 Thread Andrew Dunstan


On 3/1/21 4:05 PM, Thomas Munro wrote:
> Hello,
>
> I saw this failure after a recent commit (though the next build
> succeeded, and I don't yet have any particular reason to believe that
> the commits it blamed are at fault, we'll see...):
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gombessa&dt=2021-03-01%2004%3A58%3A17
>
> Strangely, it didn't spit out the regression.diffs so I don't know
> anything more than "commit_ts failed".  I emailed the owner asking for
> that log but I see now it's too late.  I wonder if there is some
> scripting in there that doesn't work correctly on OpenBSD for whatever
> reason...  I know that we *sometimes* get regression.diffs from
> failures in here, but I haven't tried to figure out the pattern (which
> animals etc).  Here's a similar failure that does show the .diffs I
> wish I could see, same BF client version (11), on Windows:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2020-07-13%2000:47:47
>
>


The version numbering is a bit misleading on fairywren which, as it's a
machine I control, runs from a git checkout that is clearly later than
REL_11 even though that's what the version string says. Commit 13d2143
fixed this. It was included in Release 12.


cheers


andrew



--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: [PATCH] Bug fix in initdb output

2021-03-01 Thread Juan José Santamaría Flecha
On Mon, Mar 1, 2021 at 10:18 PM Alvaro Herrera 
wrote:

> On 2021-Mar-01, Juan José Santamaría Flecha wrote:
>
> > Uhm, now that you point it out, an absolute path would make the message
> > more consistent and reusable.
>
> Well.  This code was introduced in a00c58314745, with discussion at
>
> http://postgr.es/m/CAHeEsBeAe1FeBypT3E8R1ZVZU0e8xv3A-7BHg6bEOi=jzny...@mail.gmail.com
> which did not touch on the point of the pg_ctl path being relative or
> absolute.  The previous decision to use relative seems to have been made
> here in commit ee814b4511ec, which was backed by this discussion
>
> https://www.postgresql.org/message-id/flat/200411020134.52513.peter_e%40gmx.net
>
> So I'm not sure that anybody would love me if I change it again to
> absolute.
>

For me it is a +1 for the change to absolute. Let's see if more people want
to weigh in on the matter.

Regards,

Juan José Santamaría Flecha


Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-01 Thread Mark Dilger



> On Mar 1, 2021, at 9:38 AM, Joel Jacobson  wrote:
> 
> Forgive me for just sending a patch without much discussion on the list,
> but it was so easy to implement, so I thought an implementation can
> help the discussion on if this is something we want or not.

I like the idea so I did a bit of testing.  I think the following should not 
error, but does:

+SELECT regexp_positions('foObARbEqUEbAz', $re$(?=beque)$re$, 'i');
+ERROR:  range lower bound must be less than or equal to range upper bound


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: TRIM_ARRAY

2021-03-01 Thread Tom Lane
Dian Fay  writes:
> This basically does what it says, and the code looks good. The
> documentation is out of alphabetical order (trim_array should appear
> under cardinality rather than over)) but good otherwise. I was able to
> "break" the function with an untyped null in psql:

> select trim_array(null, 2);
> ERROR:  could not determine polymorphic type because input has type unknown

That's a generic parser behavior for polymorphic functions, not something
this particular function could or should dodge.

regards, tom lane




Re: TRIM_ARRAY

2021-03-01 Thread Vik Fearing
On 3/2/21 12:14 AM, Dian Fay wrote:
> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, passed
> Implements feature:   tested, passed
> Spec compliant:   tested, failed
> Documentation:tested, failed

Thank you for looking at my patch!

> This basically does what it says, and the code looks good. The
> documentation is out of alphabetical order (trim_array should appear
> under cardinality rather than over)) but good otherwise.

Hmm.  It appears between cardinality and unnest in the source code and
also my compiled html.  Can you say more about where you're seeing the
wrong order?

> I was able to
> "break" the function with an untyped null in psql:
> 
> select trim_array(null, 2);
> ERROR:  could not determine polymorphic type because input has type unknown
> 
> I don't know whether there are any circumstances other than manual entry
> in psql where this could happen, since column values and variables will
> always be typed. I don't have access to the standard, but DB2's docs[1]
> note "if any argument is null, the result is the null value", so an
> up-front null check might be preferable to a slightly arcane user-facing
> error, even if it's a silly invocation of a silly function :)
> 
> [1] 
> https://www.ibm.com/support/knowledgecenter/en/SSEPEK_12.0.0/sqlref/src/tpc/db2z_bif_trimarray.html

The standard also says that if either argument is null, the result is
null.  The problem here is that postgres needs to know what the return
type is and it can only determine that from the input.

If you give the function a typed null, it returns null as expected.

> The new status of this patch is: Waiting on Author

I put it back to Needs Review without a new patch because I don't know
what I would change.
-- 
Vik Fearing




Re: 2019-03 CF now in progress

2021-03-01 Thread Justin Pryzby
Could I suggest to update the CF APP to allow:
| Target version: 15

Also, I wanted to suggest that toward the end of the devel cycle,
committers set/unset the target version to allow more focused review effort.
And so it's not a total surprise when something isn't progressed.
And as a simple way for a committer to mark an "intent to commit" so it's not a
surprise when something *is* committed.

Most of my patches are currently marked v14, but it'd be fine to kick them to
later.

-- 
Justin




Re: TRIM_ARRAY

2021-03-01 Thread Dian Fay
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, failed
Documentation:tested, failed

This basically does what it says, and the code looks good. The
documentation is out of alphabetical order (trim_array should appear
under cardinality rather than over) but good otherwise. I was able to
"break" the function with an untyped null in psql:

select trim_array(null, 2);
ERROR:  could not determine polymorphic type because input has type unknown

I don't know whether there are any circumstances other than manual entry
in psql where this could happen, since column values and variables will
always be typed. I don't have access to the standard, but DB2's docs[1]
note "if any argument is null, the result is the null value", so an
up-front null check might be preferable to a slightly arcane user-facing
error, even if it's a silly invocation of a silly function :)

[1] 
https://www.ibm.com/support/knowledgecenter/en/SSEPEK_12.0.0/sqlref/src/tpc/db2z_bif_trimarray.html

The new status of this patch is: Waiting on Author


Re: Table AM modifications to accept column projection lists

2021-03-01 Thread Jacob Champion
On Mon, 2021-03-01 at 16:59 -0600, Justin Pryzby wrote:
> -			pgstat_progress_update_param(PROGRESS_ANALYZE_CURRENT_CHILD_TABLE_RELID,
> -										 RelationGetRelid(childrel));
> -
> 
> Why is this removed ?

Mm, both of those analyze.c changes seem suspect. Possibly a mismerge
from the zedstore branch; let me double-check.

> 
> +* returningCols of it's base table's RTE.
> 
> its (possessive) not it's ("it is")

Thanks, I'll fix this at the same time.

--Jacob


Re: Table AM modifications to accept column projection lists

2021-03-01 Thread Justin Pryzby
On Thu, Dec 31, 2020 at 01:02:24PM -0800, Soumyadeep Chakraborty wrote:
> Hey Masahiko,
> 
> I added it to the Jan CF (https://commitfest.postgresql.org/31/2922/).
> 
> PFA a rebased version against latest head.

Thanks for working on this.

-			pgstat_progress_update_param(PROGRESS_ANALYZE_CURRENT_CHILD_TABLE_RELID,
-										 RelationGetRelid(childrel));
-

Why is this removed ?

+* returningCols of it's base table's RTE.

its (possessive) not it's ("it is")

-- 
Justin




Re: enable_incremental_sort changes query behavior

2021-03-01 Thread Tomas Vondra

On 3/1/21 8:23 PM, rbrooks wrote:

Just a bump to see if there is a time frame for a fix on this.

Google Cloud doesn't yet support setting the *enable_incremental_sort * flag
yet.
Was able to temporarily resolve by running the following though:

ALTER DATABASE corpdb SET enable_incremental_sort TO OFF;

Hope this helps others until the core issue has been resolved.



I don't follow - all fixes in this thread were committed - that's why 
it's marked like that in the CF app. And thus it should be in the recent 
13.2 release.


I don't know what version is currently supported by Google Cloud, maybe 
they haven't upgraded their PostgreSQL version yet.


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: new heapcheck contrib module

2021-03-01 Thread Mark Dilger



> On Mar 1, 2021, at 1:14 PM, Robert Haas  wrote:
> 
> On Wed, Feb 24, 2021 at 1:55 PM Mark Dilger
>  wrote:
>> [ new patches ]
> 
> Regarding 0001:
> 
> There seem to be whitespace-only changes to the comment for select_loop().
> 
> I wonder if the ParallelSlotsSetupOneDB()/ParallelSlotsSetupMinimal()
> changes could be simpler. First idea: Suppose you had
> ParallelSlotsSetup(numslots) that just creates the slot array with 0
> connections, and then ParallelSlotsAdopt(slots, conn, cparams) if you
> want to make it own an existing connection. That seems like it might
> be cleaner. Second idea: Why not get rid of ParallelSlotsSetupOneDB()
> altogether, and just let ParallelSlotsGetIdle() connect the other
> slots as required? Preconnecting all slots before we do anything is
> good because ... of what?

Mostly because, if --jobs is set too high, you get an error before launching 
any work.  I don't know that it's really a big deal if vacuumdb or reindexdb 
have a bunch of tasks kicked off prior to exit(1) due to not being able to open 
connections for all the slots, but it is a behavioral change.

> I also wonder if things might be simplified by introducing a wrapper
> object, e.g. ParallelSlotArray. Suppose a ParallelSlotArray stores the
> number of slots (num_slots), the array of actual PGconn objects, and
> the ConnParams to be used for new connections, and the initcmd to be
> used for new connections. Maybe also the progname. This seems like it
> would avoid a bunch of repeated parameter passing: you could just
> create the ParallelSlotArray with the right contents, and then pass it
> around everywhere, instead of having to keep passing the same stuff
> in. If you want to switch to connecting to a different DB, you tweak
> the ConnParams - maybe using an accessor function - and the system
> figures the rest out.

The existing version of parallel slots (before any of my changes) could already 
have been written that way, but the author chose not to.  I thought about 
making the sort of change you suggest, and decided against, mostly on the 
principle of stare decisis.  But the idea is very appealing, and since you're 
on board, I think I'll go make that change.

> I wonder if it's really useful to generalize this to a point of caring
> about all the ConnParams fields, too. Like, if you only provide
> ParallelSlotUpdateDB(slotarray, dbname), then that's the only field
> that can change so you don't need to care about the others. And maybe
> you also don't really need to keep the ConnParams fields in every
> slot, either. Like, couldn't you just say something like: if
> (strcmp(PQdb(conn) , slotarray->cparams->dbname) != 0) { wrong DB,
> can't reuse without a reconnect }? I know sometimes a dbname is really
> a whole connection string, but perhaps we could try to fix that by
> using PQconninfoParse() in the right place, so that what ends up in
> the cparams is just the db name, not a whole connection string.

I went a little out of my way to avoid that, as I didn't want the next 
application that uses parallel slots to have to refactor it again, if for 
example they want to process in parallel databases listening on different 
ports, or to process commands issued under different roles.

> This is just based on a relatively short amount of time spent studying
> the patch, so I might well be off-base here. What do you think?

I like the ParallelSlotArray idea, and will go do that now.  I'm happy to defer 
to your judgement on the other stuff, too, but will wait to hear back from you.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: 64-bit XIDs in deleted nbtree pages

2021-03-01 Thread Peter Geoghegan
On Sun, Feb 28, 2021 at 8:08 PM Masahiko Sawada  wrote:
> > Even though "the letter of the law" favors removing the
> > vacuum_cleanup_index_scale_factor GUC + param in the way I have
> > outlined, that is not the only thing that matters -- we must also
> > consider "the spirit of the law".

> > I suppose I could ask Tom what he thinks?
>
> +1

Are you going to start a new thread, or should I?

> Since it seems not a bug I personally think we don't need to do
> anything for back branches. But if we want not to trigger an index
> scan by vacuum_cleanup_index_scale_factor, we could change the default
> value to a high value (say, to 1) so that it can skip an index
> scan in most cases.

One reason to remove vacuum_cleanup_index_scale_factor in the back
branches is that it removes any need to fix the
"IndexVacuumInfo.num_heap_tuples is inaccurate outside of
btvacuumcleanup-only VACUUMs" bug -- it just won't matter if
btm_last_cleanup_num_heap_tuples is inaccurate anymore. (I am still
not sure about backpatch being a good idea, though.)

> > Another new concern for me (another concern unique to Postgres 13) is
> > autovacuum_vacuum_insert_scale_factor-driven autovacuums.
>
> IIUC the purpose of autovacuum_vacuum_insert_scale_factor is
> visibility map maintenance. And as per this discussion, it seems not
> necessary to do an index scan in btvacuumcleanup() triggered by
> autovacuum_vacuum_insert_scale_factor.

Arguably the question of skipping scanning the index should have been
considered by the autovacuum_vacuum_insert_scale_factor patch when it
was committed for Postgres 13 -- but it wasn't. There is a regression
that was tied to autovacuum_vacuum_insert_scale_factor in Postgres 13
by Mark Callaghan, which I suspect is relevant:

https://smalldatum.blogspot.com/2021/01/insert-benchmark-postgres-is-still.html

The blog post says: "Updates - To understand the small regression
mentioned above for the l.i1 test (more CPU & write IO) I repeated the
test with 100M rows using 2 configurations: one disabled index
deduplication and the other disabled insert-triggered autovacuum.
Disabling index deduplication had no effect and disabling
insert-triggered autovacuum resolves the regression."

This is quite specifically with an insert-only workload, with 4
indexes (that's from memory, but I'm pretty sure it's 4). I think that
the failure to account for skipping index scans is probably the big
problem here. Scanning the heap to set VM bits is unlikely to be
expensive compared to the full index scans. An insert-only workload is
going to find most of the heap blocks it scans to set VM bits in
shared_buffers. Not so for the indexes.

So in Postgres 13 we have this autovacuum_vacuum_insert_scale_factor
issue, in addition to the deduplication + btvacuumcleanup issue we
talked about (the problems left by my Postgres 13 bug fix commit
48e12913). These two issues make removing
vacuum_cleanup_index_scale_factor tempting, even in the back branches
-- it might actually be the more conservative approach, at least for
Postgres 13.

-- 
Peter Geoghegan




Re: enable_incremental_sort changes query behavior

2021-03-01 Thread rbrooks
Just a bump to see if there is a time frame for a fix on this.

Google Cloud doesn't yet support setting the *enable_incremental_sort* flag.
Was able to temporarily resolve by running the following though:

ALTER DATABASE corpdb SET enable_incremental_sort TO OFF;

Hope this helps others until the core issue has been resolved.





--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html




Re: [PATCH] Bug fix in initdb output

2021-03-01 Thread Alvaro Herrera
On 2021-Mar-01, Juan José Santamaría Flecha wrote:

> Uhm, now that you point it out, an absolute path would make the message
> more consistent and reusable.

Well.  This code was introduced in a00c58314745, with discussion at
http://postgr.es/m/CAHeEsBeAe1FeBypT3E8R1ZVZU0e8xv3A-7BHg6bEOi=jzny...@mail.gmail.com
which did not touch on the point of the pg_ctl path being relative or
absolute.  The previous decision to use relative seems to have been made
here in commit ee814b4511ec, which was backed by this discussion
https://www.postgresql.org/message-id/flat/200411020134.52513.peter_e%40gmx.net

So I'm not sure that anybody would love me if I change it again to
absolute.

-- 
Álvaro Herrera   Valdivia, Chile
Voy a acabar con todos los humanos / con los humanos yo acabaré
voy a acabar con todos (bis) / con todos los humanos acabaré ¡acabaré! (Bender)




Re: new heapcheck contrib module

2021-03-01 Thread Robert Haas
On Wed, Feb 24, 2021 at 1:55 PM Mark Dilger
 wrote:
> [ new patches ]

Regarding 0001:

There seem to be whitespace-only changes to the comment for select_loop().

I wonder if the ParallelSlotsSetupOneDB()/ParallelSlotsSetupMinimal()
changes could be simpler. First idea: Suppose you had
ParallelSlotsSetup(numslots) that just creates the slot array with 0
connections, and then ParallelSlotsAdopt(slots, conn, cparams) if you
want to make it own an existing connection. That seems like it might
be cleaner. Second idea: Why not get rid of ParallelSlotsSetupOneDB()
altogether, and just let ParallelSlotsGetIdle() connect the other
slots as required? Preconnecting all slots before we do anything is
good because ... of what?

I also wonder if things might be simplified by introducing a wrapper
object, e.g. ParallelSlotArray. Suppose a ParallelSlotArray stores the
number of slots (num_slots), the array of actual PGconn objects, and
the ConnParams to be used for new connections, and the initcmd to be
used for new connections. Maybe also the progname. This seems like it
would avoid a bunch of repeated parameter passing: you could just
create the ParallelSlotArray with the right contents, and then pass it
around everywhere, instead of having to keep passing the same stuff
in. If you want to switch to connecting to a different DB, you tweak
the ConnParams - maybe using an accessor function - and the system
figures the rest out.
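
To make that concrete, a rough sketch of what such a wrapper could look
like (purely illustrative; none of these names exist today, and
ParallelSlot/ConnParams/PGconn are the existing fe_utils and libpq types):

typedef struct ParallelSlotArray
{
	int				numslots;	/* number of slots in the array */
	ParallelSlot   *slots;		/* the slots themselves */
	ConnParams	   *cparams;	/* parameters used for new connections */
	const char	   *initcmd;	/* command run on each new connection */
	const char	   *progname;	/* for error reporting */
} ParallelSlotArray;

/* create an array of numslots unconnected slots */
extern ParallelSlotArray *ParallelSlotsSetup(int numslots, ConnParams *cparams,
											 const char *initcmd,
											 const char *progname);

/* make slot 0 adopt an already-established connection */
extern void ParallelSlotsAdopt(ParallelSlotArray *sa, PGconn *conn);

/* point future connections at a different database */
extern void ParallelSlotsUpdateDB(ParallelSlotArray *sa, const char *dbname);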

I wonder if it's really useful to generalize this to a point of caring
about all the ConnParams fields, too. Like, if you only provide
ParallelSlotUpdateDB(slotarray, dbname), then that's the only field
that can change so you don't need to care about the others. And maybe
you also don't really need to keep the ConnParams fields in every
slot, either. Like, couldn't you just say something like: if
(strcmp(PQdb(conn) , slotarray->cparams->dbname) != 0) { wrong DB,
can't reuse without a reconnect }? I know sometimes a dbname is really
a whole connection string, but perhaps we could try to fix that by
using PQconninfoParse() in the right place, so that what ends up in
the cparams is just the db name, not a whole connection string.
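
And a sketch of the reuse test plus that normalization step (hypothetical
helpers; PQstatus(), PQdb(), PQconninfoParse(), PQconninfoFree(),
PQfreemem() and pg_strdup() are the only existing calls used):

/* can this open connection be reused for the database named in cparams? */
static bool
slot_matches_dbname(PGconn *conn, const char *dbname)
{
	return conn != NULL &&
		PQstatus(conn) == CONNECTION_OK &&
		strcmp(PQdb(conn), dbname) == 0;
}

/*
 * Reduce a "dbname" that may actually be a connection string to the bare
 * database name, so the comparison above stays meaningful.
 */
static char *
extract_dbname(const char *dbname_or_conninfo)
{
	char			   *errmsg = NULL;
	PQconninfoOption   *opts = PQconninfoParse(dbname_or_conninfo, &errmsg);
	char			   *result = NULL;

	if (opts == NULL)
	{
		if (errmsg)
			PQfreemem(errmsg);
		return pg_strdup(dbname_or_conninfo);	/* plain name, not a conninfo */
	}

	for (PQconninfoOption *opt = opts; opt->keyword != NULL; opt++)
	{
		if (strcmp(opt->keyword, "dbname") == 0 && opt->val != NULL)
			result = pg_strdup(opt->val);
	}
	PQconninfoFree(opts);

	return result ? result : pg_strdup(dbname_or_conninfo);
}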

This is just based on a relatively short amount of time spent studying
the patch, so I might well be off-base here. What do you think?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Why does the BF sometimes not dump regression.diffs?

2021-03-01 Thread Thomas Munro
Hello,

I saw this failure after a recent commit (though the next build
succeeded, and I don't yet have any particular reason to believe that
the commits it blamed are at fault, we'll see...):

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gombessa&dt=2021-03-01%2004%3A58%3A17

Strangely, it didn't spit out the regression.diffs so I don't know
anything more than "commit_ts failed".  I emailed the owner asking for
that log but I see now it's too late.  I wonder if there is some
scripting in there that doesn't work correctly on OpenBSD for whatever
reason...  I know that we *sometimes* get regression.diffs from
failures in here, but I haven't tried to figure out the pattern (which
animals etc).  Here's a similar failure that does show the .diffs I
wish I could see, same BF client version (11), on Windows:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2020-07-13%2000:47:47




Re: Improvements in prepared statements

2021-03-01 Thread Pavel Stehule
po 1. 3. 2021 v 21:35 odesílatel Alejandro Sánchez 
napsal:

> Using any() has the disadvantage that in JDBC it is necessary
>
> to create an array with connection.createArrayOf() and indicate
>
> the type of the array, which complicates automation.
>
>
> With statement.setObject() you can pass any type of parameter.
>
> JDBC could add a method that doesn't need the array type.
>
>
> String sql = "select author from article where id = any(?)";
> try (PreparedStatement statement = connection.prepareStatement(sql)) {
> statement.setArray(1, connection.createArrayOf("varchar",
> new String[] {"home", "system"}));
> }
>
> VS
>
> query("select author from article where id = any(?)", new String[]
> {"home", "system"});
>

Can be, but this is a client side issue. It is about design of client side
API.

Pavel



> El lun, 01-03-2021 a las 17:21 +0100, Pavel Stehule escribió:
>
>
>
> po 1. 3. 2021 v 17:15 odesílatel Pavel Stehule 
> napsal:
>
>
>
> po 1. 3. 2021 v 17:08 odesílatel Alejandro Sánchez 
> napsal:
>
> The benefit is ease of use. One of the great advantages of prepared
> statements is not having to concatenate strings. The use of arrays would
> also be very useful.
>
> query("select " + column1 + "," + column2 + " from " + table + " where id in (?)", ids);
>
>
>
>
> The argument with arrays is not good.  You can work with arrays just on
> binary level, that is more effective. But just you should use operator =
> ANY() instead IN.
>
> Regards
>
> Pavel
>
>
>


Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2021-03-01 Thread Mark Dilger



> On Mar 1, 2021, at 12:23 PM, Pavel Borisov  wrote:
> 
> If bt_index_check() and bt_index_parent_check() are to have this 
> functionality, shouldn't there be an option controlling it much as the option 
> (heapallindexed boolean) controls checking whether all entries in the heap 
> are indexed in the btree?  It seems inconsistent to have an option to avoid 
> checking the heap for that, but not for this.  Alternately, this might even 
> be better coded as its own function, named something like 
> bt_unique_index_check() perhaps.  I hope Peter might advise?
>  
> As for heap checking, my reasoning was that we can not check whether a unique 
> constraint violated by the index, without checking heap tuple visibility. 
> I.e. we can have many identical index entries without uniqueness violated if 
> only one of them corresponds to a visible heap tuple. So heap checking 
> included in my patch is _necessary_ for unique constraint checking, it should 
> not have an option to be disabled, otherwise, the only answer we can get is 
> that unique constraint MAY be violated and may not be, which is quite 
> useless. If you meant something different, please elaborate. 

I completely agree that checking uniqueness requires looking at the heap, but I 
don't agree that every caller of bt_index_check on an index wants that 
particular check to be performed.  There are multiple ways in which an index 
might be corrupt, and Peter wrote the code to only check some of them by 
default, with options to expand the checks to other things.  This is why 
heapallindexed is optional.  If you don't want to pay the price of checking all 
entries in the heap against the btree, you don't have to.

I'm not against running uniqueness checks on unique indexes.  It seems fairly 
normal that a user would want that.  Perhaps the option should default to 
'true' if unspecified?  But having no option at all seems to run contrary to 
how the other functionality is structured.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: [PATCH] Bug fix in initdb output

2021-03-01 Thread Juan José Santamaría Flecha
On Mon, Mar 1, 2021 at 9:09 PM Alvaro Herrera 
wrote:

> On 2021-Mar-01, Juan José Santamaría Flecha wrote:
>
> > On Mon, Mar 1, 2021 at 7:50 PM Alvaro Herrera 
> > wrote:
>
> > > Ah, so another way to fix it would be to make the path to pg_ctl be
> > > absolute?
> >
> > Yes, that's right. If you call initdb with an absolute path you won't
> see a
> > problem.
>
> So, is make_native_path a better fix than make_absolute_path?  (I find
> it pretty surprising that initdb shows a relative path, but maybe that's
> just me.)
>

Uhm, now that you point it out, an absolute path would make the message
more consistent and reusable.

Regards,

Juan José Santamaría Flecha


Re: Use extended statistics to estimate (Var op Var) clauses

2021-03-01 Thread Tomas Vondra




On 3/1/21 8:58 PM, Mark Dilger wrote:




On Nov 12, 2020, at 5:14 PM, Tomas Vondra  wrote:

<0001-Support-estimation-of-clauses-of-the-form-V-20201113.patch>


Your patch no longer applies.  Can we get a new version please?



I do not plan to work on this patch in the 2021-03 commitfest. I'll 
focus on the other patch about extended statistics on expressions.


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Improvements in prepared statements

2021-03-01 Thread Alejandro Sánchez
Using any() has the disadvantage that in JDBC it is necessary to create
an array with connection.createArrayOf() and indicate the type of the
array, which complicates automation.

With statement.setObject() you can pass any type of parameter. JDBC
could add a method that doesn't need the array type.

String sql = "select author from article where id = any(?)";
try (PreparedStatement statement = connection.prepareStatement(sql)) {
    statement.setArray(1, connection.createArrayOf("varchar",
        new String[] {"home", "system"}));
}

VS

query("select author from article where id = any(?)", new String[] {"home", "system"});
El lun, 01-03-2021 a las 17:21 +0100, Pavel Stehule escribió:
> po 1. 3. 2021 v 17:15 odesílatel Pavel Stehule <
> pavel.steh...@gmail.com> napsal:
> > po 1. 3. 2021 v 17:08 odesílatel Alejandro Sánchez <
> > a...@nexttypes.com> napsal:
> > > The benefit is ease of use. One of the great advantages of prepared
> > > statements is not having to concatenate strings. The use of arrays
> > > would also be very useful.
> > > query("select " + column1 + "," + column2 + " from " + table + " where id in (?)", ids);
> 
> The argument with arrays is not good.  You can work with arrays just
> on binary level, that is more effective. But just you should use
> operator = ANY() instead IN. 
> 
> Regards
> Pavel


Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2021-03-01 Thread Pavel Borisov
>
> If bt_index_check() and bt_index_parent_check() are to have this
> functionality, shouldn't there be an option controlling it much as the
> option (heapallindexed boolean) controls checking whether all entries in
> the heap are indexed in the btree?  It seems inconsistent to have an option
> to avoid checking the heap for that, but not for this.  Alternately, this
> might even be better coded as its own function, named something like
> bt_unique_index_check() perhaps.  I hope Peter might advise?
>

As for heap checking, my reasoning was that we cannot check whether a
unique constraint is violated by the index without checking heap tuple
visibility. I.e. we can have many identical index entries without
uniqueness being violated if only one of them corresponds to a visible
heap tuple. So the heap checking included in my patch is _necessary_ for
unique constraint checking; it should not have an option to be disabled,
otherwise the only answer we can get is that the unique constraint MAY or
may not be violated, which is quite useless. If you meant something
different, please elaborate.

I can try to rewrite unique constraint checking to be done after all other
checks, but I suppose it is performance considerations that made
previous amcheck routines do many checks simultaneously. I tried to
stick to this practice. It's also not so elegant to duplicate much code to
make the uniqueness checks independent, and the resulting patch would be
much bigger and harder to review.

Anyway, your and Peter's further considerations are always welcome.

-- 
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com 


Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2021-03-01 Thread Tom Lane
Mark Dilger  writes:
> Yes, my review was of v2.  Updating to v3, I see that the test passes on my 
> laptop.  It still looks brittle to have all the tid values in the test 
> output, but it does pass.

Hm, anyone tried it on 32-bit hardware?

regards, tom lane




Re: Regex back-reference semantics and performance

2021-03-01 Thread Tom Lane
"Joel Jacobson"  writes:
> On Mon, Mar 1, 2021, at 01:53, Tom Lane wrote:
>> 0001-fix-backref-semantics.patch
>> 0002-backref-performance-hack.patch

> I've successfully tested both patches.

Again, thanks for testing!

> On HEAD the trouble-query took forever, I cancelled it after 23 minutes.

Yeah, I have not had the patience to run it to completion either.

> No notable timing differences:

I'm seeing a win of maybe 1% across the entire corpus, which isn't
much but it's something.  It's not too surprising that this backref
issue is seldom hit, or we'd have had more complaints about it.
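
For anyone following along, a back-reference forces a later part of the string
to repeat an earlier capture; a trivial sketch (not the trouble-query above):

SELECT 'abab' ~ '(ab)\1';                                -- true
SELECT regexp_matches('foo123bar123', '([0-9]+)bar\1');  -- {123}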

BTW, I had what seemed like a great idea to improve the situation in
the left-deep-tree case I talked about: we could remember the places
where we'd had to use the NFA to check a backreference subre, and
then at the point where we capture the reference string, recheck any
previous approximate answers, and fail the capture node's match when
any previous backref doesn't match.  Turns out this only mostly works.
In corner cases where the match is ambiguous, it can change the
results from what we got before, which I don't think is acceptable.
Basically, that allows the backref node to have action-at-a-distance
effects on where the earlier concat node divides a substring, which
changes the behavior.

So it seems this is about the best we can do for now.  I'll wait
a little longer to see if anyone complains about the proposed
semantics change, though.

regards, tom lane




Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2021-03-01 Thread Mark Dilger



> On Mar 1, 2021, at 12:05 PM, Pavel Borisov  wrote:
> 
> The regression test you provided is not portable.  I am getting lots of 
> errors due to differing output of the form "page lsn=0/4DAD7E0".  You might 
> turn this into a TAP test and use a regular expression to check the output.
> May I ask you to ensure you used v3 of a patch to check? I've made tests 
> portable in v3, probably, you've checked not the last version.

Yes, my review was of v2.  Updating to v3, I see that the test passes on my 
laptop.  It still looks brittle to have all the tid values in the test output, 
but it does pass.

> Thanks for your attention to the patch

Thanks for the patch!

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: [PATCH] Bug fix in initdb output

2021-03-01 Thread Alvaro Herrera
On 2021-Mar-01, Juan José Santamaría Flecha wrote:

> On Mon, Mar 1, 2021 at 7:50 PM Alvaro Herrera 
> wrote:

> > Ah, so another way to fix it would be to make the path to pg_ctl be
> > absolute?
> 
> Yes, that's right. If you call initdb with an absolute path you won't see a
> problem.

So, is make_native_path a better fix than make_absolute_path?  (I find
it pretty surprising that initdb shows a relative path, but maybe that's
just me.)

-- 
Álvaro Herrera39°49'30"S 73°17'W
"How strange it is to find the words "Perl" and "saner" in such close
proximity, with no apparent sense of irony. I doubt that Larry himself
could have managed it." (ncm, http://lwn.net/Articles/174769/)




buildfarm windows checks / tap tests on windows

2021-03-01 Thread Andres Freund
Hi,

As part of trying to make the aio branch tests on cirrus CI pass with
some tap tests I noticed that "src/tools/msvc/vcregress.pl recoverycheck"
hangs. A long phase of remote debugging later I figured out that that's
not a fault of the aio branch - it also doesn't pass on master (fixed in
[1]).

Which confused me - shouldn't the buildfarm have noticed? But it seems
that all msvc animals either don't have tap tests enabled or they
disable 'misc-checks' which in turn is what the buildfarm client uses to
trigger all the 'recovery' checks.

It seems we're not just skipping recovery, but also e.g. tap tests in
contrib, all the tests in src/test/modules/...

Andrew, what's the reason for that? Is it just that they hung at some
point? Were too slow?


On that last point: Running the tap tests on windows appears to be
*excruciatingly* slow. How does anybody develop on windows without a
mechanism to actually run tests in parallel?

I think it'd be good if vcregress.pl at least respected PROVE_FLAGS from
the environment - it can't currently be passed in for several of the
vcregress.pl tests, and it does seem to make things at least somewhat less
painful.


This makes it even clearer to me that we really need a builtin
testrunner that runs tests efficiently *and* debuggable on windows.

Greetings,

Andres Freund

[1] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=1e6e40447115ca7b4749d7d117b81b016ee5e2c2




Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2021-03-01 Thread Pavel Borisov
>
> The regression test you provided is not portable.  I am getting lots of
> errors due to differing output of the form "page lsn=0/4DAD7E0".  You might
> turn this into a TAP test and use a regular expression to check the output.
>
May I ask you to ensure you used v3 of a patch to check? I've made tests
portable in v3, probably, you've checked not the last version.

Thanks for your attention to the patch
-- 
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com 


Re: Use extended statistics to estimate (Var op Var) clauses

2021-03-01 Thread Mark Dilger



> On Nov 12, 2020, at 5:14 PM, Tomas Vondra  
> wrote:
> 
> <0001-Support-estimation-of-clauses-of-the-form-V-20201113.patch>

Your patch no longer applies.  Can we get a new version please?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2021-03-01 Thread Mark Dilger



> On Feb 9, 2021, at 10:43 AM, Pavel Borisov  wrote:
> 
> вт, 9 февр. 2021 г. в 01:46, Mark Dilger :
> 
> 
> > On Feb 8, 2021, at 2:46 AM, Pavel Borisov  wrote:
> > 
> > 0002 - is a temporary hack for testing. It will allow inserting duplicates 
> > in a table even if an index with the exact name "idx" has a unique 
> > constraint (generally it is prohibited to insert). Then a new amcheck will 
> > tell us about these duplicates. It's pity but testing can not be done 
> > automatically, as it needs a core recompile. For testing I'd recommend a 
> > protocol similar to the following:
> > 
> > - Apply patch 0002
> > - Set autovaccum = off in postgresql.conf
> 
> Thanks Pavel and Anastasia for working on this!
> 
> Updating pg_catalog directly is ugly, but the following seems a simpler way 
> to set up a regression test than having to recompile.  What do you think?
> 
> Very nice idea, thanks!
> I've made a regression test based on it. PFA v.2 of a patch. Now it doesn't 
> need anything external for testing.

If bt_index_check() and bt_index_parent_check() are to have this functionality, 
shouldn't there be an option controlling it much as the option (heapallindexed 
boolean) controls checking whether all entries in the heap are indexed in the 
btree?  It seems inconsistent to have an option to avoid checking the heap for 
that, but not for this.  Alternately, this might even be better coded as its 
own function, named something like bt_unique_index_check() perhaps.  I hope 
Peter might advise?
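
For comparison, this is how the existing option is exposed today (the index
name here is just a placeholder); a bt_unique_index_check() would presumably
take a similar shape:

CREATE EXTENSION amcheck;
SELECT bt_index_check('some_index'::regclass, heapallindexed => true);
SELECT bt_index_parent_check('some_index'::regclass, heapallindexed => true);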

The regression test you provided is not portable.  I am getting lots of errors 
due to differing output of the form "page lsn=0/4DAD7E0".  You might turn this 
into a TAP test and use a regular expression to check the output.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: [PATCH] Bug fix in initdb output

2021-03-01 Thread Juan José Santamaría Flecha
On Mon, Mar 1, 2021 at 7:50 PM Alvaro Herrera 
wrote:

> On 2021-Mar-01, Juan José Santamaría Flecha wrote:
>
> > This is not a problem with the APi, but the shell. e.g. when using a CMD:
> >
> > - This works:
> > c:\>c:\Windows\System32\notepad.exe
> > c:\>c:/Windows/System32/notepad.exe
> > c:\>/Windows/System32/notepad.exe
> >
> > - This doesn't:
> > c:\>./Windows/System32/notepad.exe
> > '.' is not recognized as an internal or external command,
> > operable program or batch file.
> > c:\>Windows/System32/notepad.exe
> > 'Windows' is not recognized as an internal or external command,
> > operable program or batch file.
>
> Ah, so another way to fix it would be to make the path to pg_ctl be
> absolute?
>

Yes, that's right. If you call initdb with an absolute path you won't see a
problem.

Regards,

Juan José Santamaría Flecha


Re: [PATCH] Bug fix in initdb output

2021-03-01 Thread Alvaro Herrera
On 2021-Mar-01, Juan José Santamaría Flecha wrote:

> This is not a problem with the APi, but the shell. e.g. when using a CMD:
> 
> - This works:
> c:\>c:\Windows\System32\notepad.exe
> c:\>c:/Windows/System32/notepad.exe
> c:\>/Windows/System32/notepad.exe
> 
> - This doesn't:
> c:\>./Windows/System32/notepad.exe
> '.' is not recognized as an internal or external command,
> operable program or batch file.
> c:\>Windows/System32/notepad.exe
> 'Windows' is not recognized as an internal or external command,
> operable program or batch file.

Ah, so another way to fix it would be to make the path to pg_ctl be
absolute?

-- 
Álvaro Herrera39°49'30"S 73°17'W




Re: [PATCH] Bug fix in initdb output

2021-03-01 Thread Juan José Santamaría Flecha
El lun., 1 mar. 2021 19:16, Alvaro Herrera 
escribió:

>
> I don't get it.  I thought the windows API accepted both forward slashes
> and backslashes as path separators.  Did you try the command and see it
> fail?
>

This is not a problem with the APi, but the shell. e.g. when using a CMD:

- This works:
c:\>c:\Windows\System32\notepad.exe
c:\>c:/Windows/System32/notepad.exe
c:\>/Windows/System32/notepad.exe

- This doesn't:
c:\>./Windows/System32/notepad.exe
'.' is not recognized as an internal or external command,
operable program or batch file.
c:\>Windows/System32/notepad.exe
'Windows' is not recognized as an internal or external command,
operable program or batch file.

Regards,

Juan José Santamaría Flecha


Re: [PATCH] Cross-reference comments on signal handling logic

2021-03-01 Thread Mark Dilger



> On Jan 17, 2021, at 11:51 PM, Craig Ringer  
> wrote:
> 
> 

In src/backend/postmaster/interrupt.c:

+ * These handlers are NOT used by normal user backends as they do not support

vs.

+ * Most backends use this handler.

These two comments seem to contradict.  If interrupt.c contains handlers that 
normal user backends to not use, then how can it be that most backends use one 
of the handlers in the file?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: [PATCH] Bug fix in initdb output

2021-03-01 Thread Alvaro Herrera
On 2021-Mar-01, Juan José Santamaría Flecha wrote:

> On Mon, Mar 1, 2021 at 5:52 AM Nitin Jadhav 
> wrote:
> 
> >
> >> Please share your thoughts on this. If we go ahead with this change,
> > then we need to back-patch. I would be happy to create those patches.
> 
> A full path works, even with the slashes. The commiter will take care of
> back-patching, if needed. As for HEAD at least, this LGTM.

I don't get it.  I thought the windows API accepted both forward slashes
and backslashes as path separators.  Did you try the command and see it
fail?

-- 
Álvaro Herrera39°49'30"S 73°17'W
"Entristecido, Wutra (canción de Las Barreras)
echa a Freyr a rodar
y a nosotros al mar"




Re: Add --tablespace option to reindexdb

2021-03-01 Thread Mark Dilger



> On Feb 25, 2021, at 10:49 PM, Michael Paquier  wrote:
> 
> Anyway, I would rather group the whole set of
> tests together, and using the --tablespace option introduced here
> within a TAP test does the job.

Your check verifies that reindexing a system table on a new tablespace fails, 
but does not verify what the failure was.  I wonder if you might want to make 
it more robust, something like:

diff --git a/src/bin/scripts/t/090_reindexdb.pl 
b/src/bin/scripts/t/090_reindexdb.pl
index 6946268209..8453acc817 100644
--- a/src/bin/scripts/t/090_reindexdb.pl
+++ b/src/bin/scripts/t/090_reindexdb.pl
@@ -3,7 +3,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 54;
+use Test::More tests => 58;
 
 program_help_ok('reindexdb');
 program_version_ok('reindexdb');
@@ -108,23 +108,35 @@ $node->issues_sql_like(
 # names, and CONCURRENTLY cannot be used in transaction blocks, preventing
 # the use of TRY/CATCH blocks in a custom function to filter error
 # messages.
-$node->command_fails(
+$node->command_checks_all(
[ 'reindexdb', '-t', $toast_table, '--tablespace', $tbspace, 'postgres' ],
+   1,
+   [ ],
+   [ qr/cannot move system relation/ ],
'reindex toast table with tablespace');
-$node->command_fails(
+$node->command_checks_all(
[
'reindexdb','--concurrently', '-t', $toast_table,
'--tablespace', $tbspace, 'postgres'
],
+   1,
+   [ ],
+   [ qr/cannot move system relation/ ],
'reindex toast table concurrently with tablespace');
-$node->command_fails(
+$node->command_checks_all(
[ 'reindexdb', '-i', $toast_index, '--tablespace', $tbspace, 'postgres' ],
+   1,
+   [ ],
+   [ qr/cannot move system relation/ ],
'reindex toast index with tablespace');
-$node->command_fails(
+$node->command_checks_all(
[
'reindexdb','--concurrently', '-i', $toast_index,
'--tablespace', $tbspace, 'postgres'
],
+   1,
+   [ ],
+   [ qr/cannot move system relation/ ],
'reindex toast index concurrently with tablespace');
 
 # connection strings


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Add --tablespace option to reindexdb

2021-03-01 Thread Mark Dilger



> On Feb 25, 2021, at 10:49 PM, Michael Paquier  wrote:
> 
> While on it, I have added tests for toast tables and indexes with a
> tablespace move during a REINDEX. 


In t/090_reindexdb.pl, you add:

+# Create a tablespace for testing.
+my $ts = $node->basedir . '/regress_reindex_tbspace';
+mkdir $ts or die "cannot create directory $ts";
+# this takes care of WIN-specific path issues
+my $ets = TestLib::perl2host($ts);
+my $tbspace = 'reindex_tbspace';
+$node->safe_psql('postgres', "CREATE TABLESPACE $tbspace LOCATION '$ets';");

I recognize that you are borrowing some of that from 
src/bin/pgbench/t/001_pgbench_with_server.pl, but I don't find anything 
intuitive about the name "ets" and would rather not see that repeated.  There 
is nothing in TestLib::perl2host to explain this name choice, nor in pgbench's 
test, nor yours.

You also change a test:

 $node->issues_sql_like(
-   [ 'reindexdb', '-v', '-t', 'test1', 'postgres' ],
-   qr/statement: REINDEX \(VERBOSE\) TABLE public\.test1;/,
-   'reindex with verbose output');
+   [ 'reindexdb', '--concurrently', '-v', '-t', 'test1', 'postgres' ],
+   qr/statement: REINDEX \(VERBOSE\) TABLE CONCURRENTLY public\.test1;/,
+   'reindex concurrently with verbose output');

but I don't see what tests of the --concurrently option have to do with this 
patch.  I'm not saying there is anything wrong with this change, but it seems 
out of place.  Am I missing something?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







[PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-03-01 Thread Joel Jacobson
Hi,

I suggest adding a new function, regexp_positions(),
which works exactly like regexp_matches(),
except it returns int4range[] start/end positions for *where* the matches 
occur.

I first thought I could live without this function,
and just get the positions using strpos(),
but as Andreas Karlsson kindly helped me understand,
that naive idea doesn't always work.

Andreas provided this pedagogic example
to demonstrate the problem:

SELECT regexp_matches('caaabaaa', '(?<=b)(a+)', 'g');
regexp_matches

{aaa}
(1 row)

If we would try to use strpos() to find the position,
based on the returned matched substring,
we would get the wrong answer:

SELECT strpos('caaabaaa','aaa');
strpos

  2
(1 row)

Sure, there is "aaa" at position 2,
but that's not where the match occurred,
since the (?<=b) means "positive lookbehind of the character b",
so the match actually occurred at position 6,
where there is a "b" before the "aaa".

Using regexp_positions(), we can now get the correct answer:

SELECT regexp_positions('caaabaaa', '(?<=b)(a+)', 'g');
regexp_positions
--
{"[6,9)"}
(1 row)

Some more examples from the regress/sql/strings.sql,
showing both regexp_matches() and regexp_positions()
for the same examples, as they both return the same structure,
but with different types:

SELECT regexp_matches('foobarbequebazilbarfbonk', $re$(b[^b]+)(b[^b]+)$re$, 
'g');
regexp_matches

{bar,beque}
{bazil,barf}
(2 rows)

SELECT regexp_positions('foobarbequebazilbarfbonk', $re$(b[^b]+)(b[^b]+)$re$, 
'g');
   regexp_positions
---
{"[4,7)","[7,12)"}
{"[12,17)","[17,21)"}
(2 rows)
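
One way to consume the ranges is to cut the matched piece back out with
substring(); a sketch, assuming the output shown in the first example above:

SELECT substring('caaabaaa' FROM lower(r) FOR upper(r) - lower(r)) AS matched
FROM (SELECT (regexp_positions('caaabaaa', '(?<=b)(a+)', 'g'))[1] AS r) AS m;
-- returns 'aaa', the text actually matched at position 6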

I've added documentation and tests.

Forgive me for just sending a patch without much discussion on the list,
but it was so easy to implement that I thought an implementation can
help the discussion on whether this is something we want or not.

A few words on the implementation:
I copied build_regexp_match_result() to a new function 
build_regexp_positions_result(),
and removed the string parts, replacing it with code to make ranges instead.
Maybe there are common parts that could be put into some helper-function,
but I think not, since the functions are too small for it to be worth it.

Thanks to David Fetter for the idea on using ranges.

Based on HEAD (f5a5773a9dc4185414fe538525e20d8512c2ba35).

/Joel

0001-regexp-positions.patch
Description: Binary data


Re: Add --tablespace option to reindexdb

2021-03-01 Thread Mark Dilger



> On Feb 25, 2021, at 10:49 PM, Michael Paquier  wrote:
> 
> Hi all,

Hi Michael,

> Since c5b2860, it is possible to specify a tablespace for a REINDEX,
> but the equivalent option has not been added to reindexdb.  Attached
> is a patch to take care of that.
> 
> This includes documentation and tests.

The documentation of the TABLESPACE option for REINDEX says:

+  Specifies that indexes will be rebuilt on a new tablespace.

but in your patch for reindexdb, it is less clear:

+Specifies the tablespace to reindex on. (This name is processed as
+a double-quoted identifier.)

I think the language "to reindex on" could lead users to think that indexes 
already existent in the given tablespace will be reindexed.  In other words, 
the option might be misconstrued as a way to specify all the indexes you want 
reindexed.  Whatever you do here, beware that you are using similar language in 
the --help, so consider changing that, too.


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Improvements in prepared statements

2021-03-01 Thread Alejandro Sánchez
I have already implemented this in my Java project with some kind of
SQL preprocessor. I leave the idea here in case more people are
interested and PostgreSQL developers find it convenient to include it.
It is just string concatenation, but it is syntactic sugar that is very useful
in any SQL application.
El lun, 01-03-2021 a las 17:35 +0100, Pavel Stehule escribió:
> Hi
> 
> po 1. 3. 2021 v 17:26 odesílatel Alejandro Sánchez <
> a...@nexttypes.com> napsal:
> > It is a matter of taste. I think this functionality would be better
> > in SQL and be the same for all languages without the need to use
> > string functions.
> 
> You can try to implement it, and send a patch. But I think a) it will
> be just string concatenation, and then it is surely useless, or b)
> you should introduce a new parser, because current parser need to
> know SQL identifiers immediately.  But anyway - anybody can write
> your opinion here. From me - I don't like this idea. 
> 
> Regards
> Pavel


Re: Improvements in prepared statements

2021-03-01 Thread Pavel Stehule
Hi

po 1. 3. 2021 v 17:26 odesílatel Alejandro Sánchez 
napsal:

> It is a matter of taste. I think this functionality would be better in SQL
>
> and be the same for all languages without the need to use string functions.
>
>
You can try to implement it, and send a patch. But I think a) it will be
just string concatenation, and then it is surely useless, or b) you should
introduce a new parser, because the current parser needs to know SQL identifiers
immediately.  But anyway - anybody can write their opinion here. From me - I
don't like this idea.

Regards

Pavel




>
> El lun, 01-03-2021 a las 17:15 +0100, Pavel Stehule escribió:
>
>
>
> po 1. 3. 2021 v 17:08 odesílatel Alejandro Sánchez 
> napsal:
>
> The benefit is ease of use. One of the great advantages of prepared
> statements is not having to concatenate strings. The use of arrays would
> also be very useful.
>
>
> query("select " + column1 + "," + column2 from " " + table + " where id in 
> (?), ids);
>
>
> VS
>
>
> query("select # from # where id in (?)", columns, table, ids);
>
>
> And it doesn't have to be done with prepared statements, it can just be 
> another SQL syntax.
>
>
> This is not too strong an argument - any language (and Database API) has
> necessary functionality now. Just you should use it.
>
> You can use fprintf in php, format in plpgsql, String.Format in C#, Java,
> ...
>
> Regards
>
> Pavel
>
>
>
>
>
> El lun, 01-03-2021 a las 16:46 +0100, Pavel Stehule escribió:
>
>
>
> po 1. 3. 2021 v 16:39 odesílatel Alejandro Sánchez 
> napsal:
>
> Hello, as far as I know it is not done in JDBC, in many frameworks it is.
>
> Although the execution plans cannot be reused it would be something
> very useful. It is included in a lot of frameworks and is a recurrent
> question in database forums. It would be nice if it was included in plain
> SQL.
>
>
>
> I am very sceptical about it. What benefit do you expect? When you cannot
> reuse an execution plan, then there is not any benefit of this. Then you
> don't need prepared statements, and all this API is useless. So some
> questions are frequent and don't mean necessity to redesign. The developers
> just miss the fundamental knowledge of database technology.
>
> Regards
>
> Pavel
>
>
> Best regards.
>
> Alejandro Sánchez.
>
>
> El lun, 01-03-2021 a las 15:31 +0100, Pavel Stehule escribió:
>
> Hi
>
> po 1. 3. 2021 v 15:20 odesílatel Alejandro Sánchez 
> napsal:
>
> Hello, some improvements in the prepared statements would facilitate
> their use from applications:
>
> - Use of table and column names in prepared statements.
>
> Example: select # from # where # = ?;
>
> - Use of arrays in prepared statements.
>
> Example: select # from article where id in (?);
>
> # = author,title
> ? = 10,24,45
>
>
> The server side prepared statements are based on reusing execution plans.
> You cannot reuse execution plans if you change table, or column. This is
> the reason why SQL identifiers are immutable in prepared statements. There
> are client side prepared statements - JDBC does it. There it is possible.
> But it is impossible on the server side. Prepared statements are like a
> compiled program. You can change parameters, variables - but you cannot
> change the program.
>
> Regards
>
> Pavel
>
>
>
>
>
> Best regards.
> Alejandro Sánchez.
>
>
>
>


Re: Improvements in prepared statements

2021-03-01 Thread Alejandro Sánchez
It is a matter of taste. I think this functionality would be better in
SQL and be the same for all languages without the need to use string
functions.
El lun, 01-03-2021 a las 17:15 +0100, Pavel Stehule escribió:
> po 1. 3. 2021 v 17:08 odesílatel Alejandro Sánchez <
> a...@nexttypes.com> napsal:
> > The benefit is ease of use. One of the great advantages of prepared
> > statements is not having to concatenate strings. The use of arrays
> > would also be very useful. 
> > query("select " + column1 + "," + column2 from " " + table + "
> > where id in (?), ids);
> > VS
> > query("select # from # where id in (?)", columns, table, ids);
> > 
> > And it doesn't have to be done with prepared statements, it can
> > just be another SQL syntax.
> 
> This is not too strong an argument - any language (and Database API)
> has necessary functionality now. Just you should use it.
> 
> You can use fprintf in php, format in plpgsql, String.Format in C#,
> Java, ...
> 
> Regards
> 
> Pavel
> 
> 
> 
> 
> > El lun, 01-03-2021 a las 16:46 +0100, Pavel Stehule escribió:
> > > po 1. 3. 2021 v 16:39 odesílatel Alejandro Sánchez <
> > > a...@nexttypes.com> napsal:
> > > > Hello, as far as I know it is not done in JDBC, in many
> > > > frameworks it is. Although the execution plans cannot be reused
> > > > it would be something very useful. It is included in a lot of
> > > > frameworks and is a recurrent question in database forums. It
> > > > would be nice if it was included in plain SQL.
> > > 
> > > I am very sceptical about it. What benefit do you expect? When
> > > you cannot reuse an execution plan, then there is not any benefit
> > > of this. Then you don't need prepared statements, and all this
> > > API is useless. So some questions are frequent and don't mean
> > > necessity to redesign. The developers just miss the fundamental
> > > knowledge of database technology. 
> > > 
> > > Regards
> > > Pavel
> > > 
> > > > Best regards.Alejandro Sánchez.
> > > > El lun, 01-03-2021 a las 15:31 +0100, Pavel Stehule escribió:
> > > > > Hi
> > > > > 
> > > > > po 1. 3. 2021 v 15:20 odesílatel Alejandro Sánchez <
> > > > > a...@nexttypes.com> napsal:
> > > > > > Hello, some improvements in the prepared statements would
> > > > > > facilitate
> > > > > > 
> > > > > > their use from applications:
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > - Use of table and column names in prepared statements.
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > Example: select # from # where # = ?;
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > - Use of arrays in prepared statements.
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > Example: select # from article where id in (?);
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > # = author,title
> > > > > > 
> > > > > > ? = 10,24,45
> > > > > 
> > > > > The server side prepared statements are based on reusing
> > > > > execution plans. You cannot reuse execution plans if you
> > > > > change table, or column. This is the reason why SQL
> > > > > identifiers are immutable in prepared statements. There are
> > > > > client side prepared statements - JDBC does it. There it is
> > > > > possible. But it is impossible on the server side. Prepared
> > > > > statements are like a compiled program. You can change
> > > > > parameters, variables - but you cannot change the program.
> > > > > 
> > > > > Regards
> > > > > Pavel
> > > > > 
> > > > > 
> > > > >  
> > > > > > 
> > > > > > Best regards.
> > > > > > 
> > > > > > Alejandro Sánchez.
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > 


Re: Improvements in prepared statements

2021-03-01 Thread Pavel Stehule
po 1. 3. 2021 v 17:15 odesílatel Pavel Stehule 
napsal:

>
>
> po 1. 3. 2021 v 17:08 odesílatel Alejandro Sánchez 
> napsal:
>
>> The benefit is ease of use. One of the great advantages of prepared
>> statements is not having to concatenate strings. The use of arrays would
>> also be very useful.
>>
>>
>> query("select " + column1 + "," + column2 from " " + table + " where id in 
>> (?), ids);
>>
>>
>>
The argument with arrays is not a good one.  You can work with arrays at the
binary level, which is more efficient. But you should just use the operator =
ANY() instead of IN.

Regards

Pavel



> VS
>>
>>
>> query("select # from # where id in (?)", columns, table, ids);
>>
>>
>> And it doesn't have to be done with prepared statements, it can just be 
>> another SQL syntax.
>>
>>
> This is not too strong an argument - any language (and Database API) has
> necessary functionality now. Just you should use it.
>
> You can use fprintf in php, format in plpgsql, String.Format in C#, Java,
> ...
>
> Regards
>
> Pavel
>
>
>
>
>
>> El lun, 01-03-2021 a las 16:46 +0100, Pavel Stehule escribió:
>>
>>
>>
>> po 1. 3. 2021 v 16:39 odesílatel Alejandro Sánchez 
>> napsal:
>>
>> Hello, as far as I know it is not done in JDBC, in many frameworks it is.
>>
>> Although the execution plans cannot be reused it would be something
>> very useful. It is included in a lot of frameworks and is a recurrent
>> question in database forums. It would be nice if it was included in plain
>> SQL.
>>
>>
>>
>> I am very sceptical about it. What benefit do you expect? When you cannot
>> reuse an execution plan, then there is not any benefit of this. Then you
>> don't need prepared statements, and all this API is useless. So some
>> questions are frequent and don't mean necessity to redesign. The developers
>> just miss the fundamental knowledge of database technology.
>>
>> Regards
>>
>> Pavel
>>
>>
>> Best regards.
>>
>> Alejandro Sánchez.
>>
>>
>> El lun, 01-03-2021 a las 15:31 +0100, Pavel Stehule escribió:
>>
>> Hi
>>
>> po 1. 3. 2021 v 15:20 odesílatel Alejandro Sánchez 
>> napsal:
>>
>> Hello, some improvements in the prepared statements would facilitate
>> their use from applications:
>>
>> - Use of table and column names in prepared statements.
>>
>> Example: select # from # where # = ?;
>>
>> - Use of arrays in prepared statements.
>>
>> Example: select # from article where id in (?);
>>
>> # = author,title
>> ? = 10,24,45
>>
>>
>> The server side prepared statements are based on reusing execution plans.
>> You cannot reuse execution plans if you change table, or column. This is
>> the reason why SQL identifiers are immutable in prepared statements. There
>> are client side prepared statements - JDBC does it. There it is possible.
>> But it is impossible on the server side. Prepared statements are like a
>> compiled program. You can change parameters, variables - but you cannot
>> change the program.
>>
>> Regards
>>
>> Pavel
>>
>>
>>
>>
>>
>> Best regards.
>> Alejandro Sánchez.
>>
>>
>>
>>


Re: Improvements in prepared statements

2021-03-01 Thread Pavel Stehule
po 1. 3. 2021 v 17:08 odesílatel Alejandro Sánchez 
napsal:

> The benefit is ease of use. One of the great advantages of prepared
> statements is not having to concatenate strings. The use of arrays would
> also be very useful.
>
>
> query("select " + column1 + "," + column2 from " " + table + " where id in 
> (?), ids);
>
>
> VS
>
>
> query("select # from # where id in (?)", columns, table, ids);
>
>
> And it doesn't have to be done with prepared statements, it can just be 
> another SQL syntax.
>
>
This is not too strong an argument - any language (and Database API) has
necessary functionality now. Just you should use it.

You can use fprintf in php, format in plpgsql, String.Format in C#, Java,
...

Regards

Pavel





> El lun, 01-03-2021 a las 16:46 +0100, Pavel Stehule escribió:
>
>
>
> po 1. 3. 2021 v 16:39 odesílatel Alejandro Sánchez 
> napsal:
>
> Hello, as far as I know it is not done in JDBC, in many frameworks it is.
>
> Although the execution plans cannot be reused it would be something
> very useful. It is included in a lot of frameworks and is a recurrent
> question in database forums. It would be nice if it was included in plain
> SQL.
>
>
>
> I am very sceptical about it. What benefit do you expect? When you cannot
> reuse an execution plan, then there is not any benefit of this. Then you
> don't need prepared statements, and all this API is useless. So some
> questions are frequent and don't mean necessity to redesign. The developers
> just miss the fundamental knowledge of database technology.
>
> Regards
>
> Pavel
>
>
> Best regards.
>
> Alejandro Sánchez.
>
>
> El lun, 01-03-2021 a las 15:31 +0100, Pavel Stehule escribió:
>
> Hi
>
> po 1. 3. 2021 v 15:20 odesílatel Alejandro Sánchez 
> napsal:
>
> Hello, some improvements in the prepared statements would facilitate
> their use from applications:
>
> - Use of table and column names in prepared statements.
>
> Example: select # from # where # = ?;
>
> - Use of arrays in prepared statements.
>
> Example: select # from article where id in (?);
>
> # = author,title
> ? = 10,24,45
>
>
> The server side prepared statements are based on reusing execution plans.
> You cannot reuse execution plans if you change table, or column. This is
> the reason why SQL identifiers are immutable in prepared statements. There
> are client side prepared statements - JDBC does it. There it is possible.
> But it is impossible on the server side. Prepared statements are like a
> compiled program. You can change parameters, variables - but you cannot
> change the program.
>
> Regards
>
> Pavel
>
>
>
>
>
> Best regards.
> Alejandro Sánchez.
>
>
>
>


Re: Improvements in prepared statements

2021-03-01 Thread Alejandro Sánchez

The benefit is ease of use. One of the great advantages of prepared
statements is not having to concatenate strings. The use of arrays would
also be very useful. 
query("select " + column1 + "," + column2 from " " + table + " where id
in (?), ids);
VS
query("select # from # where id in (?)", columns, table, ids);
And it doesn't have to be done with prepared statements, it can just be
another SQL syntax.
El lun, 01-03-2021 a las 16:46 +0100, Pavel Stehule escribió:
> po 1. 3. 2021 v 16:39 odesílatel Alejandro Sánchez <
> a...@nexttypes.com> napsal:
> > Hello, as far as I know it is not done in JDBC, in many frameworks
> > it is. Although the execution plans cannot be reused it would be
> > something very useful. It is included in a lot of frameworks and is
> > a recurrent question in database forums. It would be nice if it was
> > included in plain SQL.
> 
> I am very sceptical about it. What benefit do you expect? When you
> cannot reuse an execution plan, then there is not any benefit of
> this. Then you don't need prepared statements, and all this API is
> useless. So some questions are frequent and don't mean necessity to
> redesign. The developers just miss the fundamental knowledge of
> database technology. 
> 
> Regards
> Pavel
> 
> > Best regards.Alejandro Sánchez.
> > El lun, 01-03-2021 a las 15:31 +0100, Pavel Stehule escribió:
> > > Hi
> > > 
> > > po 1. 3. 2021 v 15:20 odesílatel Alejandro Sánchez <
> > > a...@nexttypes.com> napsal:
> > > > Hello, some improvements in the prepared statements would
> > > > facilitate
> > > > 
> > > > their use from applications:
> > > > 
> > > > 
> > > > 
> > > > - Use of table and column names in prepared statements.
> > > > 
> > > > 
> > > > 
> > > > Example: select # from # where # = ?;
> > > > 
> > > > 
> > > > 
> > > > - Use of arrays in prepared statements.
> > > > 
> > > > 
> > > > 
> > > > Example: select # from article where id in (?);
> > > > 
> > > > 
> > > > 
> > > > # = author,title
> > > > 
> > > > ? = 10,24,45
> > > 
> > > The server side prepared statements are based on reusing
> > > execution plans. You cannot reuse execution plans if you change
> > > table, or column. This is the reason why SQL identifiers are
> > > immutable in prepared statements. There are client side prepared
> > > statements - JDBC does it. There it is possible. But it is
> > > impossible on the server side. Prepared statements are like a
> > > compiled program. You can change parameters, variables - but you
> > > cannot change the program.
> > > 
> > > Regards
> > > Pavel
> > > 
> > > 
> > >  
> > > > 
> > > > Best regards.
> > > > 
> > > > Alejandro Sánchez.
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 


Re: proposal: psql –help reflecting service or URI usage

2021-03-01 Thread Mark Dilger



> On Feb 28, 2021, at 10:10 AM, Paul Förster  wrote:
> 
> Hi Mark,
> 
>> On 28. Feb, 2021, at 17:54, Mark Dilger  wrote:
>> 
>> "definited" is a typo.
> 
> yes, definitely a typo, sorry. Thanks for pointing this out.
> 
>> Should this say "as defined in pg_service.conf"?  That's the default, but 
>> the user might have $PGSERVICEFILE set to something else.  Perhaps you could 
>> borrow the wording of other options and use "(default: as defined in 
>> pg_service.conf)", or something like that, but of course being careful to 
>> still fit in the line length limit.
> 
> I agree to all, thanks. What is the line length limit?

The output from --help should fit in a terminal window with only 80 characters 
width.  For example, in src/bin/scripts/createuser.c the line about 
--interactive is wrapped:

> printf(_("  -S, --no-superuserrole will not be superuser 
> (default)\n"));
> printf(_("  -V, --version output version information, then 
> exit\n"));
> printf(_("  --interactive prompt for missing role name and 
> attributes rather\n"
>  "than using defaults\n"));
> printf(_("  --replication role can initiate replication\n"));

You can find counter-examples where applications do not follow this rule.  
pg_isready is one of them.


>> Other client applications follow the same pattern as psql, so if this change 
>> were adopted, it should apply to all of them.
> 
> well, psql is central and IMHO the best place to start. I'd have to try out 
> all of them then. What I do know, though, is that pg_isready does not 
> understand a URI (why is that?), which is very unfortunate. So, I'd have to 
> try them all out and supply patches for them all?

There is a pattern in the client applications that the --help output is less 
verbose than the docs (see, for example, 
https://www.postgresql.org/docs/13/reference-client.html).  Your proposed patch 
makes psql's --help output a bit more verbose about this issue while leaving 
the other applications less so, without any obvious reason for the difference.

> Still, supporting a feature and not documenting it in its help is IMHO not a 
> good idea.

Ok.

>> Your proposal seems like something that would have been posted to the list 
>> before, possibly multiple times.  Any chance you could dig up past 
>> conversations on this subject and post links here for context?
> 
> I don't know any past discussions here. I only subscribed today to 
> psql-hackers. It might have been mentioned on psql-general, though. But I'm 
> not sure. This idea popped into my mind just yesterday when I was playing 
> around with psql and URIs and noticed that psql –help doesn't show them.

I mentioned looking at the mailing list archives, but neglected to give you a 
link:  https://www.postgresql.org/list/

Over the years, many proposals get made and debated, with some accepted and 
some rejected.  The rejected proposals often have some merit, and get suggested 
again, only to get rejected for the same reasons as the previous times they 
were suggested.  So searching the archives before posting a patch can sometimes 
be enlightening.  The difficulty in my experience is knowing what words and 
phrases to search for.  It can be a bit time consuming trying to find a prior 
discussion on a topic.

I don't know of any specific discussion which rejected your patch idea.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Improvements in prepared statements

2021-03-01 Thread Pavel Stehule
po 1. 3. 2021 v 16:39 odesílatel Alejandro Sánchez 
napsal:

> Hello, as far as I know it is not done in JDBC, in many frameworks it is.
>
> Although the execution plans cannot be reused it would be something
> very useful. It is included in a lot of frameworks and is a recurrent
> question in database forums. It would be nice if it was included in plain
> SQL.
>
>
I am very sceptical about it. What benefit do you expect? When you cannot
reuse an execution plan, then there is not any benefit of this. Then you
don't need prepared statements, and all this API is useless. So some
questions are frequent and don't mean necessity to redesign. The developers
just miss the fundamental knowledge of database technology.

Regards

Pavel


> Best regards.
>
> Alejandro Sánchez.
>
>
> El lun, 01-03-2021 a las 15:31 +0100, Pavel Stehule escribió:
>
> Hi
>
> po 1. 3. 2021 v 15:20 odesílatel Alejandro Sánchez 
> napsal:
>
> Hello, some improvements in the prepared statements would facilitate
> their use from applications:
>
> - Use of table and column names in prepared statements.
>
> Example: select # from # where # = ?;
>
> - Use of arrays in prepared statements.
>
> Example: select # from article where id in (?);
>
> # = author,title
> ? = 10,24,45
>
>
> The server side prepared statements are based on reusing execution plans.
> You cannot reuse execution plans if you change table, or column. This is
> the reason why SQL identifiers are immutable in prepared statements. There
> are client side prepared statements - JDBC does it. There it is possible.
> But it is impossible on the server side. Prepared statements are like a
> compiled program. You can change parameters, variables - but you cannot
> change the program.
>
> Regards
>
> Pavel
>
>
>
>
>
> Best regards.
> Alejandro Sánchez.
>
>
>
>


Re: Improving connection scalability: GetSnapshotData()

2021-03-01 Thread Konstantin Knizhnik




On 27.02.2021 20:40, AJG wrote:

Hi,

Greatly appreciate if you could please reply to the following questions as
time allows.

I have seen previous discussion/patches on a built-in connection pooler. How
does this scalability improvement, particularly idle connection improvements
etc, affect that built-in pooler need, if any?


Same general question about an external connection pooler in general in
Production? Still required to route to different servers but no longer
needed for the pooling part. as an example.


Many Thanks!



A connection pooler is still needed.
The patch for GetSnapshotData() mostly improves the scalability of read-only
queries and reduces contention on the procarray lock.
But a read-write workload causes contention on many other resources:
the relation extension lock, buffer locks, tuple locks and so on.


If you run pgbench on a NUMA machine with hundreds of cores, you will
still observe significant degradation of performance with an increasing
number of connections.
And this degradation will be dramatic if you replace the uniform distribution
of keys with a non-uniform one, for example a Zipfian distribution.









--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html








Re: Improvements in prepared statements

2021-03-01 Thread Alejandro Sánchez
Hello, as far as I know it is not done in JDBC, in many frameworks it
is. Although the execution plans cannot be reused it would be
something very useful. It is included in a lot of frameworks and is a
recurrent question in database forums. It would be nice if it was
included in plain SQL.
Best regards.
Alejandro Sánchez.
El lun, 01-03-2021 a las 15:31 +0100, Pavel Stehule escribió:
> Hi
> 
> po 1. 3. 2021 v 15:20 odesílatel Alejandro Sánchez <
> a...@nexttypes.com> napsal:
> > Hello, some improvements in the prepared statements would
> > facilitate
> > 
> > their use from applications:
> > 
> > 
> > 
> > - Use of table and column names in prepared statements.
> > 
> > 
> > 
> > Example: select # from # where # = ?;
> > 
> > 
> > 
> > - Use of arrays in prepared statements.
> > 
> > 
> > 
> > Example: select # from article where id in (?);
> > 
> > 
> > 
> > # = author,title
> > 
> > ? = 10,24,45
> 
> The server side prepared statements are based on reusing execution
> plans. You cannot reuse execution plans if you change table, or
> column. This is the reason why SQL identifiers are immutable in
> prepared statements. There are client side prepared statements - JDBC
> does it. There it is possible. But it is impossible on the server
> side. Prepared statements are like a compiled program. You can change
> parameters, variables - but you cannot change the program.
> 
> Regards
> Pavel
> 
> 
>  
> > 
> > Best regards.
> > 
> > Alejandro Sánchez.
> > 
> > 
> > 
> > 
> > 
> > 
> > 


We should stop telling users to "vacuum that database in single-user mode"

2021-03-01 Thread Hannu Krosing
It looks like we are unnecessarily instructing our users to vacuum their
databases in single-user mode when just vacuuming would be enough.

We should fix the error message to be less misleading.

== The story

I think most of us have at some time seen the following message, if not in their
own database, then at some client.

ERROR: database is not accepting commands to avoid wraparound data
loss in database ""
HINT: Stop the postmaster and vacuum that database in single-user mode.
You might also need to commit or roll back old prepared transactions.

But "vacuum that database in single-user mode" is the very last thing
one wants to
do, because

*   it is single-user mode, so nothing else works ...
*   CHECKPOINTs are not running, so all the WAL segments can not be rotated and
reused
*   Replication does not work, so after vacuum is done and database is started
in normal mode, there is huge backlog to replicate
*   pg_stat_progress_vacuum is not available so you have no idea when the
command is going to complete
*   VACUUM VERBOSE isn't - there is absolutely no output from single-user mode
vacuum with or without VERBOSE, so you *really* do not know what is going on
and how much progress is made (if you are lucky you can guess something from
IO and CPU monitoring, but it is inexact at best )

When I started looking at improving the situation I discovered, that there
already is no need to run VACUUM in single user mode in any currently supported
PostgreSQL version as you can run VACUUM perfectly well when the wraparound
protection is active.

It worked in all PG versions from v9.6 to v13.

I also tested v 8.3 as this is where we added virtual transactions, but there
VACUUM really fails to run successfully without single-user mode.

So my proposal is to change the error message [*]  to something that does not
suggest the single-user mode as the requirement for running VACUUM.
Also COMMIT PREPARED still works ok in this situation.
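
A minimal sketch of what a normal session can still do at that point (only
commands that must allocate a new XID are blocked; object names are examples):

```
SELECT datname, age(datfrozenxid) FROM pg_database ORDER BY 2 DESC;
VACUUM (VERBOSE);                 -- database-wide, with progress output
COMMIT PREPARED 'trx_id_pin';     -- as used in the test below
```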

Single-user mode still may be needed in case one needs to drop a
replication slot
or something similar.

[*] The message is in src/backend/access/transam/varsup.c around line 120

=== How to test

The following instructions let you run into wraparound in about an hour,
depending on your setup (was 1.2 hours on my laptop)

==== First, set some flags

To allow PREPARE TRANSACTION to block VACUUM cleanup

```
alter system set max_prepared_transactions = 10;
```

Also set *_min_messages to errors, unless you want to get 10M of
WARNINGs (~4GB) to
logs and the same amount sent to client, slowing down the last 10M transactions
significantly.

```
alter system set log_min_messages = error;
alter system set client_min_messages = error;
```

==== Restart the system to activate the settings

==== Block Vacuum from cleaning up transactions

Create a database `wraptest` and connect to it, then

```
create table t (i int);

BEGIN;
insert into t values(1);
PREPARE TRANSACTION 'trx_id_pin';
```

Now you have a prepared transaction, which makes sure that even well-tuned
autovacuum does not prevent running into the wraparound protection.

```
[local]:5096 hannu@wraptest=# SELECT * FROM pg_prepared_xacts;
 transaction |gid |   prepared| owner | database
-++---+---+--
 593 | trx_id_pin | 2021-03-01 08:57:27.024777+01 | hannu | wraptest
(1 row)
```

==== Create a function to consume transaction ids as fast as possible:

```
CREATE OR REPLACE FUNCTION trx_eater(n int) RETURNS void
LANGUAGE plpgsql
AS $plpgsql$
BEGIN
 FOR i IN 0..n LOOP
   BEGIN
 INSERT INTO t values(i);
   EXCEPTION WHEN OTHERS THEN
   RAISE; -- raise it again, so that we actually err out on wraparound
   END;
 END LOOP;
END;
$plpgsql$;
```

==== Use pgbench to drive this function

Make a pgbench command file

$ echo 'select trx_eater(10);' > trx_eater.pgbench

and start pgbench to run this function in a few backends in parallel

$ pgbench -c 16 -T 2 -P 60 -n wraptest -f trx_eater.pgbench

==== Wait 1-2 hours

In about an hour or two this should error out with

ERROR: database is not accepting commands to avoid wraparound data
loss in database "postgres"
HINT: Stop the postmaster and vacuum that database in single-user mode.
You might also need to commit or roll back old prepared transactions.

After this just do

COMMIT PREPARED 'trx_id_pin';

to release the blocked 2PC transaction.

==== Verify that VACUUM still works

You can now verify yourself that

*   you can run VACUUM on any table, and
*   Autovacuum is working, and will eventually clear up the situation

If you have not tuned autovacuum_vacuum_cost_* at all, especially in earlier
versions where the delay is 20ms by default, the autovacuum-started vacuum runs
really slowly, and it will take about 8 hours to clean up the table. This can
be sped up if you set autovacuum_vacuum_cost_delay=0 and then either restart
the database
or just kill the running autovacuum process so it picks up the new setting.

2021-03 CF now in progress

2021-03-01 Thread David Steele

Hackers,

The 2021-03 commitfest is now in progress. It's a big one as usual.

Needs review: 213.
Waiting on Author: 21.
Ready for Committer: 28.
Committed: 29.
Withdrawn: 3.
Total: 294.

If you are a patch author please check http://commitfest.cputube.org to 
be sure your patch still applies, compiles, and passes tests.


Please be sure to review patches of equal or greater complexity to the 
patches you submit. This only works well if everyone participates.


Happy coding!

Regards,
--
-David
da...@pgmasters.net




Re: A reloption for partitioned tables - parallel_workers

2021-03-01 Thread Laurenz Albe
On Mon, 2021-03-01 at 17:39 +0900, Amit Langote wrote:
> > I am not sure.
> > IMO, for normal table, we use the following macro to get the 
> > parallel_workers:
> > --
> > /*
> >   * RelationGetParallelWorkers
> >   *  Returns the relation's parallel_workers reloption setting.
> >   *  Note multiple eval of argument!
> >   */
> > #define RelationGetParallelWorkers(relation, defaultpw) \
> >  ((relation)->rd_options ? \
> >   ((StdRdOptions *) (relation)->rd_options)->parallel_workers : 
> > (defaultpw))
> > --
> > Since we add new struct " PartitionedTableRdOptions ", It seems we need to 
> > get parallel_workers in different way.
> > Do we need similar macro to get partitioned table's parallel_workers ?
> 
> Oh, you're right.  The parallel_workers setting of a relation is only
> accessed through this macro, even for partitioned tables, and I can
> see that it is actually wrong to access a partitioned table's
> parallel_workers through this macro as-is.   Although I hadn't tested
> that, so thanks for pointing that out.
> 
> > Like:
> > #define PartitionedTableGetParallelWorkers(relation, defaultpw) \
> > xxx
> > (PartitionedTableRdOptions *) (relation)->rd_options)->parallel_workers : 
> > (defaultpw))
> 
> I'm thinking it would be better to just modify the existing macro to
> check relkind to decide which struct pointer type to cast the value in
> rd_options to.

Here is an updated patch with this fix.

I added regression tests and adapted the documentation a bit.

I also added support for lowering the number of parallel workers.

Yours,
Laurenz Albe
From d48874a61ac698dc284b83b7e3b147a71158d09d Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Mon, 1 Mar 2021 16:06:49 +0100
Subject: [PATCH] Allow setting parallel_workers on partitioned tables.

Sometimes it's beneficial to do

ALTER TABLE my_part_tbl SET (parallel_workers = 32);

when the query planner is planning too few workers to read from your
partitions. For example, if you have a partitioned table where each
partition is a foreign table that cannot itself have this setting on it.
Shown to give a 100%+ performance increase (in my case, 700%) when used
with cstore_fdw column store partitions.

This is also useful to lower the number of parallel workers, for
example when the executor is expected to prune most partitions.

Authors: Seamus Abshere, Amit Langote, Laurenz Albe
Discussion: https://postgr.es/m/95b1dd96-8634-4545-b1de-e2ac779beb44%40www.fastmail.com
---
 doc/src/sgml/ref/create_table.sgml|  15 +-
 src/backend/access/common/reloptions.c|  14 +-
 src/backend/optimizer/path/allpaths.c | 155 ++
 src/include/utils/rel.h   |  15 +-
 .../regress/expected/partition_aggregate.out  |  51 +-
 src/test/regress/sql/partition_aggregate.sql  |  17 +-
 6 files changed, 188 insertions(+), 79 deletions(-)

diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 3b2b227683..c8c14850c1 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1337,8 +1337,9 @@ WITH ( MODULUS numeric_literal, REM
 If a table parameter value is set and the
 equivalent toast. parameter is not, the TOAST table
 will use the table's parameter value.
-Specifying these parameters for partitioned tables is not supported,
-but you may specify them for individual leaf partitions.
+These parameters, with the exception of parallel_workers,
+are not supported on partitioned tables, but you may specify them for
+individual leaf partitions.

 

@@ -1401,9 +1402,13 @@ WITH ( MODULUS numeric_literal, REM
  
   This sets the number of workers that should be used to assist a parallel
   scan of this table.  If not set, the system will determine a value based
-  on the relation size.  The actual number of workers chosen by the planner
-  or by utility statements that use parallel scans may be less, for example
-  due to the setting of .
+  on the relation size and the number of scanned partitions.
+  When set on a partitioned table, the specified number of workers will
+  work on distinct partitions, so the number of partitions affected by the
+  parallel operation should be taken into account.
+  The actual number of workers chosen by the planner or by utility
+  statements that use parallel scans may be less, for example due
+  to the setting of .
  
 

diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index c687d3ee9e..f8443d2361 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -377,7 +377,7 @@ static relopt_int intRelOpts[] =
 		{
 			"parallel_workers",
 			"Number of parallel processes that can be used per executor node for this relation.",
-			RELOPT_KIND_HEAP,
+			RELOPT_KIND_HE

Re: macOS SIP, next try

2021-03-01 Thread Tom Lane
Peter Eisentraut  writes:
> I have since learned that there is a way to disable only the part of SIP 
> that is relevant for us.  This seems like a useful compromise, and it 
> appears that a number of other open-source projects are following the 
> same route.  I suggest the attached documentation patch and then close 
> this issue.

Hmm, interesting.  Where is it documented what this does?

regards, tom lane




Re: proposal: psql –help reflecting service or URI usage

2021-03-01 Thread Euler Taveira
On Mon, Mar 1, 2021, at 10:32 AM, Paul Förster wrote:
> still, what is the line length limit? Where do I find it?
We try to limit it to 80 characters but it is not a hard limit. Long
descriptions should certainly be split into multiple lines.

The question is: how popular are services and connection URIs? We certainly
cannot include all possibilities in the --help; that's why we have the
documentation. IMO we could probably include a "connection string" that accepts 2
formats: (i) a multiple keyword - value string and (ii) URIs ("service" is
included in (i)).
  

   
Usage:  
  
   psql [OPTION]... [DBNAME [USERNAME]|CONNINFO]
   

   
or even 
  

   
Usage:  
  
   psql [OPTION]... [DBNAME [USERNAME]] 
   
   psql [OPTION]... [CONNINFO]  
   

   

   
Connection options: 
  
   -h, --host=HOSTNAME  database server host or socket directory (default: 
"local socket")
   -p, --port=PORT  database server port (default: "")  
   
   -U, --username=USERNAME  database user name (default: "euler")   
   
   -w, --no-passwordnever prompt for password   
   
   -W, --password   force password prompt (should happen automatically) 

   
   CONNINFO connection string to connect to (key = value strings
or connection URI)  
   

I don't like the CONNINFO in the connection options. It seems a natural place  
but DBNAME and USERNAME aren't included in it. Should we include it too?
  
Someone can argue that they are self-explanatory, hence, a description is not  
necessary.  
  

   
It might be a different topic but since we are talking about --help 
improvements, I have some suggestions:  
  

* an Example section could help newbies to Postgres command-line   
tools to figure out how to specify the connection parameters. In this case, we
  
could include at least 3 examples: (i) -h, -p, -U parameters, (ii) key/value
  
connection string and (iii) connection URI.
* Connection options could be moved to the top (maybe after General options) if
we consider that it is more important than the other sections (you probably
cannot execute psql without using a connection parameter in production).


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Improvements in prepared statements

2021-03-01 Thread Pavel Stehule
Hi

po 1. 3. 2021 v 15:20 odesílatel Alejandro Sánchez 
napsal:

> Hello, some improvements in the prepared statements would facilitate
> their use from applications:
>
> - Use of table and column names in prepared statements.
>
> Example: select # from # where # = ?;
>
> - Use of arrays in prepared statements.
>
> Example: select # from article where id in (?);
>
> # = author,title
> ? = 10,24,45
>

Server-side prepared statements are based on reusing execution plans. You
cannot reuse an execution plan if you change the table or column, which is
why SQL identifiers are immutable in prepared statements. Client-side
prepared statements exist - JDBC does this - and there it is possible, but
it is impossible on the server side. A prepared statement is like a
compiled program: you can change parameters and variables, but you cannot
change the program itself.
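
As an aside, the array case in the example can already be approximated
today with an array parameter and = ANY. A rough sketch only, reusing the
table and column names from the example above:

    -- prepare once; subsequent EXECUTEs reuse the prepared statement
    PREPARE get_articles(int[]) AS
        SELECT author, title
        FROM article
        WHERE id = ANY($1);

    -- pass the whole id list as a single array parameter
    EXECUTE get_articles('{10,24,45}');

Changing identifiers, of course, would still require preparing a new
statement.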

Regards

Pavel




>
> Best regards.
> Alejandro Sánchez.
>
>
>
>


SSL negotiation error on massive connect/disconnect

2021-03-01 Thread Maxim Orlov

Hi!

I have a primary server on port 55942 and two standbys on ports 55943 and 55944.
I then use a connection string like
"postgresql://127.0.0.1:55942,127.0.0.1:55943,127.0.0.1:55944/postgres"
to connect to the servers via psql.


Everything works perfectly no matter how many connection attempts I make.
But when I stop the primary server, I very rarely get the error "received
invalid response to SSL negotiation" from psql. I hit this error while
running a massive connect/disconnect test, and it is unlikely to be
reproduced manually without thousands of sequential connections with no
intentional delay in between.


The problem is present only on Linux; macOS works fine.

As far as I understand, this particular problem arises because postgresql
gets a "zero" (i.e. 0) byte in SSLok in
PQconnectPoll@src/interfaces/libpq/fe-connect.c. This leads to taking the
"else" branch with the described error message. It can be fixed by
handling a zero byte the same way as an 'E' byte, but I'm not sure that is
a good solution, since I don't know why fe-connect gets a zero byte at all.


I think this is worth correcting. The error is rare, but it is really odd
to run into this unexpected and wrong behavior.


---
Best regards,
Maxim Orlov.

From 860f14c99ec70fcea58897c97e218b8a54286a39 Mon Sep 17 00:00:00 2001
From: Maxim Orlov 
Date: Mon, 1 Mar 2021 16:47:41 +0300
Subject: [PATCH] WIP

---
 src/interfaces/libpq/Makefile   |   3 +
 src/interfaces/libpq/fe-connect.c   |   2 +-
 src/interfaces/libpq/t/004-multihost.pl | 143 
 3 files changed, 147 insertions(+), 1 deletion(-)
 create mode 100644 src/interfaces/libpq/t/004-multihost.pl

diff --git a/src/interfaces/libpq/Makefile b/src/interfaces/libpq/Makefile
index f74677eaf9b..98fd1d694ef 100644
--- a/src/interfaces/libpq/Makefile
+++ b/src/interfaces/libpq/Makefile
@@ -120,6 +120,9 @@ install: all installdirs install-lib
 installcheck:
 	$(MAKE) -C test $@
 
+check:
+	$(prove_check)
+
 installdirs: installdirs-lib
 	$(MKDIR_P) '$(DESTDIR)$(includedir)' '$(DESTDIR)$(includedir_internal)' '$(DESTDIR)$(datadir)'
 
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index db71fea169c..43c14a4433a 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3023,7 +3023,7 @@ keep_going:		/* We will come back to here until there is
 		conn->status = CONNECTION_MADE;
 		return PGRES_POLLING_WRITING;
 	}
-	else if (SSLok == 'E')
+	else if (SSLok == 'E' || SSLok == '\0')
 	{
 		/*
 		 * Server failure of some sort, such as failure to
diff --git a/src/interfaces/libpq/t/004-multihost.pl b/src/interfaces/libpq/t/004-multihost.pl
new file mode 100644
index 000..116f4a82635
--- /dev/null
+++ b/src/interfaces/libpq/t/004-multihost.pl
@@ -0,0 +1,143 @@
+# target_server_type
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 1;
+use Scalar::Util qw(blessed);
+
+# Initialize master node
+my $node_master = get_new_node('master');
+$node_master->init(allows_streaming => 1);
+$node_master->append_conf('postgresql.conf', "listen_addresses = '$PostgresNode::test_localhost'");
+$node_master->start;
+
+# Take backup
+my $backup_name = 'my_backup';
+$node_master->backup($backup_name);
+
+# Create streaming standby 1 linking to master
+my $node_standby_1 = get_new_node('standby_1');
+$node_standby_1->init_from_backup($node_master, $backup_name, has_streaming => 1);
+$node_standby_1->append_conf('postgresql.conf', "listen_addresses = '$PostgresNode::test_localhost'");
+$node_standby_1->start;
+
+# Create streaming standby 2 linking to master
+my $node_standby_2 = get_new_node('standby_2');
+$node_standby_2->init_from_backup($node_master, $backup_name, has_streaming => 1);
+$node_standby_2->append_conf('postgresql.conf', "listen_addresses = '$PostgresNode::test_localhost'");
+$node_standby_2->start;
+
+sub get_host_port
+{
+	my $node = shift;
+	return "$PostgresNode::test_localhost:" . $node->port;
+}
+
+sub multiconnstring
+{
+	my $nodes= shift;
+	my $database = shift || "postgres";
+	my $params   = shift;
+	my $extra= "";
+	if ($params)
+	{
+		my @cs;
+		while (my ($key, $val) = each %$params)
+		{
+			push @cs, $key . "=" . $val;
+		}
+		$extra = "?" . join("&", @cs);
+	}
+	my $str =
+		"postgresql://"
+		. join(",", map({ get_host_port($_) } @$nodes))
+		. "/$database$extra";
+
+	return $str;
+}
+
+#
+# Copied from PosgresNode.pm passing explicit connect-string instead of
+# constructed from object
+#
+
+sub psql2
+{
+	# We expect dbname to be part of connstr
+	my ($connstr, $sql, %params) = @_;
+
+	my $stdout= $params{stdout};
+	my $stderr= $params{stderr};
+	my @psql_params   = ('psql', '-XAtq', '-d', $connstr, '-f', '-');
+
+	# Allocate temporary ones to capture them so we
+	# can return them. Otherwise we won't redirect them at all.
+	if (!defined($stdout))
+	{
+		my $temp_stdout = "";
+		$stdout = \$temp_stdout;

Improvements in prepared statements

2021-03-01 Thread Alejandro Sánchez
Hello, some improvements in the prepared statements would facilitate
their use from applications:

- Use of table and column names in prepared statements.

Example: select # from # where # = ?;

- Use of arrays in prepared statements.

Example: select # from article where id in (?);

# = author,title
? = 10,24,45

Best regards.
Alejandro Sánchez.





Re: [PATCH] Bug fix in initdb output

2021-03-01 Thread Juan José Santamaría Flecha
On Mon, Mar 1, 2021 at 5:52 AM Nitin Jadhav 
wrote:

>
>> Please share your thoughts on this. If we go ahead with this change,
> then we need to back-patch. I would be happy to create those patches.
>

A full path works, even with the slashes. The committer will take care of
back-patching, if needed. As for HEAD at least, this LGTM.

Regards,

Juan José Santamaría Flecha


Re: proposal: psql –help reflecting service or URI usage

2021-03-01 Thread Paul Förster
Hi Mark,

I revisited my first admittedly naive and faulty attempt.

> On 28. Feb, 2021, at 19:10, Paul Förster  wrote:
> 
>> but of course being careful to still fit in the line length limit.
> I agree to all, thanks. What is the line length limit?

still, what is the line length limit? Where do I find it?

I'd be very happy if you could take a look at this version. Thanks in advance 
and thanks very much for the input.

Cheers,
Paul



psql-help.c.patch
Description: Binary data


Re: Improving connection scalability: GetSnapshotData()

2021-03-01 Thread luis . roberto


- Mensagem original -
> De: "AJG" 
> Para: "Pg Hackers" 
> Enviadas: Sábado, 27 de fevereiro de 2021 14:40:58
> Assunto: Re: Improving connection scalability: GetSnapshotData()

> Hi,

> Greatly appreciate if you could please reply to the following questions as
> time allows.

> I have seen previous discussion/patches on a built-in connection pooler. How
> does this scalability improvement, particularly idle connection improvements
> etc, affect that built-in pooler need, if any?

> Same general question about an external connection pooler in general in
> Production? Still required to route to different servers but no longer
> needed for the pooling part. as an example.

> Many Thanks!

> --
> Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html


As I understand it, the improvements made to GetSnapshotData() mean that a
higher connection count does not incur as much of a performance penalty as
before.
I am not sure it solves the connection establishment side of things, but I
may be wrong.

Luis R. Weck 




Re: Improving connection scalability: GetSnapshotData()

2021-03-01 Thread AJG
Hi,

I would greatly appreciate it if you could reply to the following questions
as time allows.

I have seen previous discussion/patches on a built-in connection pooler. How
does this scalability improvement, particularly idle connection improvements
etc, affect that built-in pooler need, if any?


The same general question applies to an external connection pooler in
production: for example, is it still required for routing to different
servers but no longer needed for the pooling part?


Many Thanks!





--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html



