Re: RFC: Logging plan of the running query

2022-02-03 Thread Robert Treat
On Wed, Feb 2, 2022 at 7:59 AM torikoshia  wrote:
>
> 2022-02-01 01:51, Fujii Masao wrote:

> > +Note that nested statements (statements executed inside a
> > function) are not
> > +considered for logging. Only the plan of the most deeply nested
> > query is logged.
> >
> > Now the plan of even nested statement can be logged. So this
> > description needs to be updated?
>
> Modified it as below:
>
>  -Note that nested statements (statements executed inside a
> function) are not
>  -considered for logging. Only the plan of the most deeply nested
> query is logged.
>  +Note that when the statements are executed inside a function,
> only the
>  +plan of the most deeply nested query is logged.
>

Minor nit, but I think the "the" is superfluous, i.e.:

"Note that when statements are executed inside a function,
only the plan of the most deeply nested query is logged"


> On 2022-02-01 17:27, Kyotaro Horiguchi wrote:
>
> Thanks for reviewing Horiguchi-san!
>
> > By the way, I'm anxious about the following part and I'd like to
> > remove it.
>
> I also think it would be nice if it's possible.
> >
> > +* Ensure no page lock is held on this process.
> >
> > It seems to me what is wrong is ginInsertCleanup(), not this feature.
>
> > Actually, the discussion is a bit dubious. What we really need to
> > check is whether such locks are not held on an index *elsewhere*.
>
> Since I'm not sure how long it will take to discuss this point, the
> attached patch is based on the current HEAD at this time.
> I also think it may be better to discuss it on another thread.
>

While I agree on the above points, IMHO this shouldn't be a
show-stopper for adding this functionality to v15, but we have a few
more commitments before we get to that point.


Robert Treat
https://xzilla.net




Re: do only critical work during single-user vacuum?

2022-02-03 Thread Masahiko Sawada
On Wed, Feb 2, 2022 at 6:50 AM John Naylor  wrote:
>
> On Thu, Jan 27, 2022 at 8:28 PM Justin Pryzby  wrote:
>
> > I'm sure you meant "&" here (fixed in attached patch to appease the cfbot):
> > +   if (options | VACOPT_MINIMAL)
>
> Thanks for catching that! That copy-pasto was also masking my failure
> to process the option properly -- fixed in the attached as v5.
>
> > It should either refuse to run if a list of tables is specified with 
> > MINIMAL,
> > or it should filter that list by XID condition.
>
> I went with the former for simplicity. As a single-purpose option, it
> makes sense.
>
> > As for the name, it could be MINIMAL or FAILSAFE or EMERGENCY or ??
> > I think the name should actually be a bit more descriptive, and maybe say 
> > XID,
> > like MINIMAL_XID or XID_EMERGENCY...
>
> I went with EMERGENCY in this version to reinforce its purpose in the
> mind of the user (and reader of this code).
>
> > Normally, options are independent, but VACUUM (MINIMAL) is a "shortcut" to a
> > hardcoded set of options: freeze on, truncate off, cleanup off.  So it 
> > refuses
> > to be combined with other options - good.
> >
> > This is effectively a shortcut to hypothetical parameters for selecting 
> > tables
> > by XID/MXID age.  In the future, someone could debate adding user-facing 
> > knobs
> > for table selection by age.
>
> I used the params struct in v5 for the emergency cutoff ages. Even
> with the values hard-coded, it seems cleaner to keep them here.
>
> > I still wonder if the relations should be processed in order of decreasing 
> > age.
> > An admin might have increased autovacuum_freeze_max_age up to 2e9, and your
> > query might return thousands of tables, with a wide range of sizes and ages.
> >
> > Processing them in order of decreasing age would allow the admin to quickly
> > vacuum the oldest tables, and optionally interrupt vacuum to get out of 
> > single
> > user mode ASAP - even if their just want to run VACUUM(MINIMAL) in a normal
> > backend when services aren't offline.  Processing them out of order might be
> > pretty surprising - they might run vacuum for an hour (or overnight), cancel
> > it, attempt to start the DB in normal mode, and conclude that it made no
> > visible progress.
>
> While that seems like a nice property to have, it does complicate
> things, so can be left for follow-on work.
>
> Also in v5:
>
> - It mentions the new command in the error hint in
> GetNewTransactionId(). I'm not sure if multi-word commands should be
> quoted like this.
> - A first draft of documentation

Thank you for updating the patch.

I have a few questions and comments:

+  The only other option that may be combined with
VERBOSE, although in single-user mode no client
messages are
+  output.

Given that VERBOSE with EMERGENCY can work only in multi-user mode, why
can only VERBOSE be specified with EMERGENCY? I think the same is true
for other options like PARALLEL; PARALLEL can work only in multi-user
mode.

---
+  It performs a database-wide vacuum on tables, toast tables, and
materialized views whose
+  xid age or mxid age is older than 1 billion.

Do we need to allow the user to specify the threshold, or to use a higher
value (at least larger than 1.6 billion, the default value of
vacuum_failsafe_age)? I imagined a case where there are a few very old
tables (say 2 billion old) and many tables that are older than 1
billion. In this case, VACUUM (EMERGENCY) would take a long time to
complete. But to minimize the downtime, we might want to run VACUUM
(EMERGENCY) on only the very old tables, start the cluster in
multi-user mode, and run vacuum on multiple tables in parallel.

---
+   if (params->options & VACOPT_EMERGENCY)
+   {
+   /*
+   * Only consider relations able to hold unfrozen XIDs (anything else
+   * should have InvalidTransactionId in relfrozenxid anyway).
+   */
+   if (classForm->relkind != RELKIND_RELATION &&
+   classForm->relkind != RELKIND_MATVIEW &&
+   classForm->relkind != RELKIND_TOASTVALUE)
+   {
+   Assert(!TransactionIdIsValid(classForm->relfrozenxid));
+   Assert(!MultiXactIdIsValid(classForm->relminmxid));
+   continue;
+   }
+
+   table_xid_age = DirectFunctionCall1(xid_age,
classForm->relfrozenxid);
+   table_mxid_age = DirectFunctionCall1(mxid_age,
classForm->relminmxid);
+

I think that instead of calling xid_age and mxid_age for each
relation, we can compute the thresholds for xid and mxid once, and
then compare them to relation's relfrozenxid and relminmxid.
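
To illustrate, here is a minimal sketch of that approach, following the
usual backend idiom for computing freeze limits (EMERGENCY_AGE_THRESHOLD is
a hypothetical name for the 1-billion cutoff; untested):

/* computed once, before scanning pg_class */
TransactionId xid_cutoff;
MultiXactId   mxid_cutoff;

xid_cutoff = ReadNextTransactionId() - EMERGENCY_AGE_THRESHOLD;
if (!TransactionIdIsNormal(xid_cutoff))
    xid_cutoff = FirstNormalTransactionId;

mxid_cutoff = ReadNextMultiXactId() - EMERGENCY_AGE_THRESHOLD;
if (mxid_cutoff < FirstMultiXactId)
    mxid_cutoff = FirstMultiXactId;

/* then, inside the per-relation loop, instead of xid_age()/mxid_age() */
if (TransactionIdPrecedes(classForm->relfrozenxid, xid_cutoff) ||
    MultiXactIdPrecedes(classForm->relminmxid, mxid_cutoff))
{
    /* relation is old enough to qualify for the emergency vacuum */
}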

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Latest LLVM breaks our code again

2022-02-03 Thread Fabien COELHO


Hello Andres,


I'm doubtful that tracking development branches of LLVM is a good
investment. Their development model is to do changes in-tree much more than we
do. Adjusting to API changes the moment they're made will often end up with
further changes to the same / related lines.  Once they open the relevant
release-NN branch, it's a different story.

Maybe it'd make sense to disable --with-llvm on seawasp
and have a separate animal that tracks the newest release branch?


The point of seawasp is somehow to have an early warning of upcoming 
changes, especially the --with-llvm part. LLVM indeed is a moving target, 
but they very rarely back down on their API changes… As pg versions are 
expected to run for 5 years, they will encounter newer compiler versions, 
so being as ready as possible seems worthwhile.


ISTM that there is no actual need to fix LLVM breakage on the spot, but I 
think it is pertinent to be ok near release time. This is why there is an 
"EXPERIMENTAL" marker in the system description. There can be some noise 
when the LLVM development version is broken; this has happened in the past 
with seawasp (llvm) and moonjelly (gcc), but not often.


About tracking releases: this means more setups, runs and updates, and as 
I think they do care about compatibility *within* a release, we would not 
see breakages, so it would not be very useful; we would only get the 
actual breakages at LLVM release time, which may be much later.


For these reasons, I'm inclined to leave seawasp as it is.

--
Fabien.

Re: Design of pg_stat_subscription_workers vs pgstats

2022-02-03 Thread Peter Eisentraut



On 02.02.22 07:54, Amit Kapila wrote:

Where do you propose to store this information?



pg_subscription_worker

The error message and context are very important.  Just make sure they are only non-null when 
the worker state is "syncing failed" (or whatever term we use).


We could name the table something like pg_subscription_worker_error, so 
it's explicit that it is collecting error information only.



Sure, but is this the reason you want to store all the error info in
the system catalog? I agree that providing more error info could be
useful, and possibly the previously failed (apply) xacts info as
well, but I am not able to see why you want to have that sort of info
in the catalog. I could see storing info like err_lsn/err_xid that can
allow the apply worker to proceed automatically or to slow down the
launch of an errored apply worker, but not all sorts of other error info
(like err_cnt, err_code, err_message, err_time, etc.). I want to know
why you are insisting on making all the error info persistent via the
system catalog?


Let's flip this around and ask, why not?  Tables are the place to store 
data, by default.





Re: UNIQUE null treatment option

2022-02-03 Thread Peter Eisentraut

On 28.01.22 13:56, Pavel Borisov wrote:

Makes sense.  Here is an updated patch with this change.

I didn't end up renaming anynullkeys.  I came up with names like
"anyalwaysdistinctkeys", but in the end that felt too abstract, and
moreover, it would require rewriting a bunch of code comments that
refer
to null values in this context.  Since as you wrote, anynullkeys is
just
a local concern between two functions, this slight inaccuracy is
perhaps
better than some highly general but unclear terminology.

Agree with that. With the comment it is clear how it works.

I've looked at the patch v3. It seems good enough for me. CFbot tests 
have also come green.

Suggest it is RFC now.


Committed.  Thanks.




Re: Bugs in pgoutput.c

2022-02-03 Thread Amit Kapila
On Thu, Feb 3, 2022 at 8:18 AM Tom Lane  wrote:
>
> Amit Kapila  writes:
> > Tom, is it okay for you if I go ahead with this patch after some testing?
>
> I've been too busy to get back to it, so sure.
>

Thanks. I have tested the patch by generating an invalidation message
for table DDL before accessing the syscache in
logicalrep_write_tuple(). I see that it correctly invalidates the
entry and rebuilds it for the next operation. I couldn't come up with
some automatic test for it so used the debugger to test it. I have
made a minor change in one of the comments. I am planning to push this
tomorrow unless there are comments or suggestions.

-- 
With Regards,
Amit Kapila.


v2-0001-Improve-invalidation-handling-in-pgoutput.c.patch
Description: Binary data


Re: Replace pg_controldata output fields with macros for better code manageability

2022-02-03 Thread Bharath Rupireddy
On Thu, Feb 3, 2022 at 11:34 AM Kyotaro Horiguchi
 wrote:
> > > #define PG_CONTROL_FIELD_CHECKPOINT_OLDESTXID "Latest checkpoint's 
> > > oldestXID:"
> > > #define PG_CONTROL_FIELD_CHECKPOINT_OLDESTXID_DB "Latest checkpoint's
> > > oldestXID's DB:"
> > > and so on.
> >
> > That seems like a very good idea.
>
> +1 for consolidating the texts in any shape.
>
> But simply replacing the concrete text labels with symbols doesn't
> sound like the way we should take.  Significant downsides of doing
> that are untranslatability and the difficulty of adding the proper
> indentation before the value field.  So we need to define the texts
> including the indentation spaces and the format placeholder.

It looks like we also translate the printf statements that tools emit.
I'm not sure how having a macro in the printf creates a problem with
the translation.

Yes, the indentation part needs to be taken care of in any case, which is
also true now: different fields are adjusted to different indentation
(number of spaces, not tabs) for the sake of readability in the
output.

> But if we define the strings separately, I would rather define them in
> an array.
>
> typedef enum pg_control_fields
> {
>
> const char *pg_control_item_formats[] = {
> gettext_noop("pg_control version number:%u\n"),
>
> const char *
> get_pg_control_item_format(pg_control_fields item_id)
> {
> return _(pg_control_item_formats[item_id]);
> };
>
> const char *
> get_pg_control_item()
> {
> return _(pg_control_item_formats[item_id]);
> };
>
> pg_control_fields
> get_pg_control_item(const char *str, const char **valp)
> {
> int i;
> for (i = 0 ; i < PGCNTL_NUM_FIELDS ; i++)
> {
> const char *fmt = pg_control_item_formats[i];
> const char *p = strchr(fmt, ':');
>
> Assert(p);
> if (strncmp(str, fmt, p - fmt) == 0)
> {
> p = str + (p - fmt);
> for(p++ ; *p == ' ' ; p++);
> *valp = p;
> return i;
> }
> }
>
> return -1;
> }
>
> Printer side does:
>
>printf(get_pg_control_item_format(PGCNTRL_FIELD_VERSION),
>   ControlFile->pg_control_version);
>
> And the reader side would do:
>
>switch(get_pg_control_item(str, &v))
>{
>case PGCNTRL_FIELD_VERSION:
>cluster->controldata.ctrl_ver = str2uint(v);
>break;

Thanks for your time on the above code. IMHO, I don't think we should
go the complex way with the above sort of style (error-prone, a huge
maintenance burden and extra function calls). Instead, I would be
happy to keep the code as is if the macro way (as proposed originally
in this thread) really has a translation problem.
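
To be concrete, the macro way I have in mind is roughly like this (an
illustrative sketch only; the reader-side field and helper names are
assumptions, not the actual pg_upgrade code):

/* shared header */
#define PG_CONTROL_FIELD_CHECKPOINT_OLDESTXID "Latest checkpoint's oldestXID:"

/* pg_controldata.c / pg_resetwal.c (writer side) */
printf(_(PG_CONTROL_FIELD_CHECKPOINT_OLDESTXID "        %u\n"),
       ControlFile->checkPointCopy.oldestXid);

/* pg_upgrade/controldata.c (reader side) */
if ((p = strstr(bufin, PG_CONTROL_FIELD_CHECKPOINT_OLDESTXID)) != NULL)
{
    p = strchr(p, ':') + 1;     /* skip the label */
    cluster->controldata.chkpnt_oldstxid = str2uint(p);
}

Whether xgettext can still extract the message when the literal is built
from a macro plus a format string is, I guess, exactly the translatability
question raised above.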

Regards,
Bharath Rupireddy.


v1-0001-sample-patch.patch
Description: Binary data


Re: pg_receivewal - couple of improvements

2022-02-03 Thread Bharath Rupireddy
On Wed, Feb 2, 2022 at 9:28 PM Julien Rouhaud  wrote:
>
> On Wed, Feb 02, 2022 at 09:14:03PM +0530, Bharath Rupireddy wrote:
> >
> > FYI that thread is closed, it committed the change (f61e1dd [1]) that
> > pg_receivewal can read from its replication slot restart lsn.
> >
> > I know that providing the start pos as an option came up there [2],
> > but I wanted to start the discussion fresh as that thread got closed.
>
> Ah sorry I misunderstood your email.
>
> I'm not sure it's a good idea.  If you have missing WALs in your target
> directory but have an alternative backup location, you will have to restore 
> the
> WAL from that alternative location anyway, so I'm not sure how accepting a
> different start position is going to help in that scenario.  On the other hand
> allowing a position at the command line can also lead to accepting a bogus
> position, which could possibly make things worse.

Isn't it complex for anyone to go to the archive location, which involves
extra steps - getting authentication tokens, searching there for the
required WAL file, downloading it, unzipping it, copying it back to the
pg_receivewal node, etc. - in production environments? This will just be
problematic and adds more time before pg_receivewal is back up. Instead,
if I know the latest checkpoint LSN and archived WAL file from the
primary, I can just provide the startpos (probably the last checkpoint
LSN) to pg_receivewal so that it can continue getting the WAL records
from the primary, avoiding the whole bunch of manual work that I had to
do.

> > 2) Currently, RECONNECT_SLEEP_TIME is 5sec - but I may want to have
> > more reconnect time as I know that the primary can go down at any time
> > for whatever reasons in production environments which can take some
> > time till I bring up primary and I don't want to waste compute cycles
> > in the node on which pg_receivewal is running
>
> I don't think that attempting a connection is really costly.  Also, increasing
> this retry time also increases the amount of time you're not streaming WALs,
> and thus the amount of data you can lose so I'm not sure that's actually a 
> good
> idea.  But you might also want to make it more aggressive, so no objection to
> make it configurable.

Yeah, making it configurable helps tune the reconnect time as per the
requirements.

Regards,
Bharath Rupireddy.




Re: Replace pg_controldata output fields with macros for better code manageability

2022-02-03 Thread Peter Eisentraut

On 29.01.22 15:30, Bharath Rupireddy wrote:

While working on another pg_control patch, I observed that the
pg_controldata output fields such as "Latest checkpoint's
TimeLineID:", "Latest checkpoint's NextOID:'' and so on, are being
used in pg_resetwal.c, pg_controldata.c and pg_upgrade/controldata.c.


The duplication between pg_resetwal and pg_controldata could probably be 
handled by refactoring the code so that only one place is responsible 
for printing it.  The usages in pg_upgrade are probably best guarded by 
tests that ensure that any changes anywhere else don't break things. 
Using macros like you suggest only protects against one kind of change: 
changing the wording of a line.  But it doesn't protect for example 
against a line being removed from the output.


While I'm sympathetic to the issue you describe, the proposed solution 
introduces a significant amount of ugliness for a problem that is really 
quite rare.






Re: psql - add SHOW_ALL_RESULTS option

2022-02-03 Thread Peter Eisentraut

On 29.01.22 15:40, Fabien COELHO wrote:
With this approach results are not available till the last one has been 
returned? If so, it loses the nice asynchronous property of getting 
results as they come when they come? This might or might not be 
desirable depending on the use case. For "psql", ISTM that we should 
want immediate and asynchronous anyway??


Well, I'm not sure.  I'm thinking about this in terms of the dynamic 
result sets from stored procedures feature.  That is typically used for 
small result sets.  The interesting feature there is that the result 
sets can have different shapes.  But of course people can use it 
differently.  What is your motivation for this feature, and what is your 
experience how people would use it?






Re: pg_receivewal - couple of improvements

2022-02-03 Thread Julien Rouhaud
On Thu, Feb 03, 2022 at 06:40:55PM +0530, Bharath Rupireddy wrote:
> 
> Isn't complex for anyone to go to the archive location which involves
> extra steps - getting authentication tokens, searching there for the
> required WAL file, downloading it, unzipping it, copying back to
> pg_receivewal node etc. in production environments? You know, this
> will just be problematic and adds more time for bringing up the
> pg_receivewal. Instead if I know that the latest checkpoint LSN and
> archived WAL file from the primary, I can just provide the startpos
> (probably the last checkpoint LSN) to pg_receivewal so that it can
> continue getting the WAL records from primary, avoiding the whole
> bunch of the manual work that I had to do.

I don't get it.  If you're missing WAL it means that you will have to do that
tedious manual work to retrieve it no matter what.  So on top of that tedious
work, you also have to make sure that you don't provide a bogus start position.

Also, doesn't this scenario imply that you have both archive_command and
pg_receivewal for storing your WALs elsewhere?  In my experience this isn't
really common.

If you want to make sure you won't have to do that tedious work of retrieving
the WALs from a different location, you should probably rely on the facilities
to make sure it won't happen, like using a replication slot and monitoring
pg_wal usage.

Maybe that's a good idea, but I'm still having a hard time imagining a
scenario where it would actually help.




Re: Support for NSS as a libpq TLS backend

2022-02-03 Thread Peter Eisentraut

On 28.01.22 15:30, Robert Haas wrote:

I would really, really like to have an alternative to OpenSSL for PG.


What are the reasons people want that?  With OpenSSL 3, the main reasons 
-- license and FIPS support -- have gone away.






Re: libpq async duplicate error results

2022-02-03 Thread Peter Eisentraut
Tom, you worked on reorganizing the error message handling in libpq in 
PostgreSQL 14 (commit ffa2e4670123124b92f037d335a1e844c3782d3f).  Any 
thoughts on this?



On 25.01.22 09:32, Peter Eisentraut wrote:


This issue was discovered by Fabien in the SHOW_ALL_RESULTS thread.  I'm 
posting it here separately, because I think it ought to be addressed in 
libpq rather than with a workaround in psql, as proposed over there.


When using PQsendQuery() + PQgetResult() and the server crashes during 
the execution of the command, PQgetResult() then returns two result sets 
with partially duplicated error messages, like this from the attached 
test program:


command = SELECT 'before';
result 1 status = PGRES_TUPLES_OK
error message = ""

command = SELECT pg_terminate_backend(pg_backend_pid());
result 1 status = PGRES_FATAL_ERROR
error message = "FATAL:  terminating connection due to administrator 
command

"
result 2 status = PGRES_FATAL_ERROR
error message = "FATAL:  terminating connection due to administrator 
command

server closed the connection unexpectedly
 This probably means the server terminated abnormally
 before or while processing the request.
"

command = SELECT 'after';
PQsendQuery() error: no connection to the server


This is hidden in normal use because PQexec() throws away all but the 
last result set, but the extra one is still generated internally.


Apparently, this has changed between PG13 and PG14.  In PG13 and 
earlier, the output is


command = SELECT 'before';
result 1 status = PGRES_TUPLES_OK
error message = ""

command = SELECT pg_terminate_backend(pg_backend_pid());
result 1 status = PGRES_FATAL_ERROR
error message = "FATAL:  terminating connection due to administrator 
command

"
result 2 status = PGRES_FATAL_ERROR
error message = "server closed the connection unexpectedly
 This probably means the server terminated abnormally
 before or while processing the request.
"

command = SELECT 'after';
PQsendQuery() error: no connection to the server

In PG13, PQexec() concatenates all the error messages from multiple 
results, so a user of PQexec() sees the same output before and after. 
But for users of the lower-level APIs, things have become a bit more 
confusing.


Also, why are there multiple results being generated in the first place?


[0]: 
https://www.postgresql.org/message-id/alpine.DEB.2.22.394.2112230703530.2668598@pseudo
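
For reference, a minimal self-contained loop along the lines of the test
program mentioned above (an illustrative reconstruction, not the actual
attachment) looks like this:

#include <stdio.h>
#include <libpq-fe.h>

static void
run_command(PGconn *conn, const char *command)
{
    PGresult   *res;
    int         n = 0;

    printf("command = %s\n", command);
    if (!PQsendQuery(conn, command))
    {
        printf("PQsendQuery() error: %s", PQerrorMessage(conn));
        return;
    }
    /* collect every result of the series until PQgetResult() returns NULL */
    while ((res = PQgetResult(conn)) != NULL)
    {
        printf("result %d status = %s\n", ++n,
               PQresStatus(PQresultStatus(res)));
        printf("error message = \"%s\"\n", PQresultErrorMessage(res));
        PQclear(res);
    }
    printf("\n");
}

int
main(void)
{
    PGconn     *conn = PQconnectdb("");     /* connection settings from environment */

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }
    run_command(conn, "SELECT 'before';");
    run_command(conn, "SELECT pg_terminate_backend(pg_backend_pid());");
    run_command(conn, "SELECT 'after';");
    PQfinish(conn);
    return 0;
}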






Re: daitch_mokotoff module

2022-02-03 Thread Dag Lem
Hi,

Just some minor adjustments to the patch:

* Removed call to locale-dependent toupper()
* Cleaned up input normalization

I have been asked to sign up to review a commitfest patch or patches -
unfortunately I've been ill with COVID-19 and it's not until now that
I feel well enough to have a look.

Julien: I'll have a look at https://commitfest.postgresql.org/36/3468/
as you suggested (https://commitfest.postgresql.org/36/3379/ seems to
have been reviewed now).

If there are other suggestions for a patch or patches to review for
someone new to PostgreSQL internals, I'd be grateful for that.


Best regards

Dag Lem

diff --git a/contrib/fuzzystrmatch/Makefile b/contrib/fuzzystrmatch/Makefile
index 0704894f88..1d5bd84be8 100644
--- a/contrib/fuzzystrmatch/Makefile
+++ b/contrib/fuzzystrmatch/Makefile
@@ -3,14 +3,15 @@
 MODULE_big = fuzzystrmatch
 OBJS = \
 	$(WIN32RES) \
+	daitch_mokotoff.o \
 	dmetaphone.o \
 	fuzzystrmatch.o
 
 EXTENSION = fuzzystrmatch
-DATA = fuzzystrmatch--1.1.sql fuzzystrmatch--1.0--1.1.sql
+DATA = fuzzystrmatch--1.2.sql fuzzystrmatch--1.1--1.2.sql fuzzystrmatch--1.0--1.1.sql
 PGFILEDESC = "fuzzystrmatch - similarities and distance between strings"
 
-REGRESS = fuzzystrmatch
+REGRESS = fuzzystrmatch fuzzystrmatch_utf8
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
@@ -22,3 +23,8 @@ top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 include $(top_srcdir)/contrib/contrib-global.mk
 endif
+
+daitch_mokotoff.o: daitch_mokotoff.h
+
+daitch_mokotoff.h: daitch_mokotoff_header.pl
+	perl $< > $@
diff --git a/contrib/fuzzystrmatch/daitch_mokotoff.c b/contrib/fuzzystrmatch/daitch_mokotoff.c
new file mode 100644
index 00..ba87061845
--- /dev/null
+++ b/contrib/fuzzystrmatch/daitch_mokotoff.c
@@ -0,0 +1,587 @@
+/*
+ * Daitch-Mokotoff Soundex
+ *
+ * Copyright (c) 2021 Finance Norway
+ * Author: Dag Lem 
+ *
+ * This implementation of the Daitch-Mokotoff Soundex System aims at high
+ * performance.
+ *
+ * - The processing of each phoneme is initiated by an O(1) table lookup.
+ * - For phonemes containing more than one character, a coding tree is traversed
+ *   to process the complete phoneme.
+ * - The (alternate) soundex codes are produced digit by digit in-place in
+ *   another tree structure.
+ *
+ *  References:
+ *
+ * https://www.avotaynu.com/soundex.htm
+ * https://www.jewishgen.org/InfoFiles/Soundex.html
+ * https://familypedia.fandom.com/wiki/Daitch-Mokotoff_Soundex
+ * https://stevemorse.org/census/soundex.html (dmlat.php, dmsoundex.php)
+ * https://github.com/apache/commons-codec/ (dmrules.txt, DaitchMokotoffSoundex.java)
+ * https://metacpan.org/pod/Text::Phonetic (DaitchMokotoff.pm)
+ *
+ * A few notes on other implementations:
+ *
+ * - All other known implementations have the same unofficial rules for "UE",
+ *   these are also adapted by this implementation (0, 1, NC).
+ * - The only other known implementation which is capable of generating all
+ *   correct soundex codes in all cases is the JOS Soundex Calculator at
+ *   https://www.jewishgen.org/jos/jossound.htm
+ * - "J" is considered (only) a vowel in dmlat.php
+ * - The official rules for "RS" are commented out in dmlat.php
+ * - Identical code digits for adjacent letters are not collapsed correctly in
+ *   dmsoundex.php when double digit codes are involved. E.g. "BESST" yields
+ *   744300 instead of 743000 as for "BEST".
+ * - "J" is considered (only) a consonant in DaitchMokotoffSoundex.java
+ * - "Y" is not considered a vowel in DaitchMokotoffSoundex.java
+ *
+ * Permission to use, copy, modify, and distribute this software and its
+ * documentation for any purpose, without fee, and without a written agreement
+ * is hereby granted, provided that the above copyright notice and this
+ * paragraph and the following two paragraphs appear in all copies.
+ *
+ * IN NO EVENT SHALL THE AUTHOR OR DISTRIBUTORS BE LIABLE TO ANY PARTY FOR
+ * DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING
+ * LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS
+ * DOCUMENTATION, EVEN IF THE AUTHOR OR DISTRIBUTORS HAVE BEEN ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ * THE AUTHOR AND DISTRIBUTORS SPECIFICALLY DISCLAIM ANY WARRANTIES,
+ * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
+ * AND FITNESS FOR A PARTICULAR PURPOSE.  THE SOFTWARE PROVIDED HEREUNDER IS
+ * ON AN "AS IS" BASIS, AND THE AUTHOR AND DISTRIBUTORS HAS NO OBLIGATIONS TO
+ * PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
+*/
+
+#include "daitch_mokotoff.h"
+
+#include "postgres.h"
+#include "utils/builtins.h"
+#include "mb/pg_wchar.h"
+
+#include 
+
+/* Internal C implementation */
+static char *_daitch_mokotoff(char *word, char *soundex, size_t n);
+
+
+PG_FUNCTION_INFO_V1(daitch_mokotoff);
+Datum
+daitch_mokotoff(PG_FUNCTION_ARGS)
+{
+	text	   *arg = PG_GETARG_TEXT_PP(0);
+	char	   *string,
+			   *tmp_soundex;
+	text	   *soundex;
+
+	/*
+	 * The maximum theoreti

Re: Support for NSS as a libpq TLS backend

2022-02-03 Thread Daniel Gustafsson
> On 3 Feb 2022, at 15:07, Peter Eisentraut  
> wrote:
> 
> On 28.01.22 15:30, Robert Haas wrote:
>> I would really, really like to have an alternative to OpenSSL for PG.
> 
> What are the reasons people want that?  With OpenSSL 3, the main reasons -- 
> license and FIPS support -- have gone away.

At least it will go away when OpenSSL 3 is FIPS certified, which is yet to
happen (submitted, not processed).

I see quite a few valid reasons to want an alternative, a few off the top of my
head include:

- Using trust stores like Keychain on macOS with Secure Transport.  There is
AFAIK something similar on Windows, and NSS has its certificate databases.
Especially on the client side, it would be quite nice for libpq to integrate
with where certificates already are rather than relying on files on disk.

- Not having to install OpenSSL, Schannel and Secure Transport would make life
easier for packagers.

- Simply having an alternative.  The OpenSSL project's recent venture into
writing transport protocols has made a lot of people worried about their
bandwidth for fixing and supporting core features.

Just my $0.02; everyone's mileage varies on these.

--
Daniel Gustafsson   https://vmware.com/





Re: row filtering for logical replication

2022-02-03 Thread Ajin Cherian
On Sat, Jan 29, 2022 at 11:31 AM Andres Freund  wrote:
>
> Hi,
>
> Are there any recent performance evaluations of the overhead of row filters? I
> think it'd be good to get some numbers comparing:
>
> 1) $workload with master
> 2) $workload with patch, but no row filters
> 3) $workload with patch, row filter matching everything
> 4) $workload with patch, row filter matching few rows
>
> For workload I think it'd be worth testing:
> a) bulk COPY/INSERT into one table
> b) Many transactions doing small modifications to one table
> c) Many transactions targetting many different tables
> d) Interspersed DDL + small changes to a table
>

Here are the performance results for scenario d:

Columns: (1) HEAD, (2) with patch, no row filter, (3) with patch, 0%,
(4) row-filter-patch, 25%, (5) row-filter-patch v74, 50%,
(6) row-filter-patch, 75%, (7) row-filter-patch v74, 100%

run   (1)        (2)        (3)       (4)        (5)        (6)        (7)
1     65.397639  64.414034  5.919732  20.012096  36.35911   49.412548  64.508842
2     65.641783  65.255775  5.715082  20.157575  36.957403  51.355821  65.708444
3     65.096526  64.795163  6.146072  21.130709  37.679346  49.568513  66.602145
4     65.173569  64.68      5.787197  20.784607  34.465133  55.397313  63.545337
5     65.791092  66.000412  5.642696  20.258802  36.493626  52.873252  63.511428

The performance is similar to the other scenarios.
The script used is below:

CREATE TABLE test (key int, value text, value1 text, data jsonb,
PRIMARY KEY(key, value));

CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 0); -- 100% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 25); -- 75% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 50); -- 50% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 75); -- 25% allowed
CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 100); -- 0% allowed

DO
$do$
BEGIN
FOR i IN 1..101 BY 4000 LOOP
Alter table test alter column value1 TYPE varchar(30);
INSERT INTO test VALUES(i,'BAH', row_to_json(row(i)));
Alter table test ALTER COLUMN value1 TYPE text;
UPDATE test SET value = 'FOO' WHERE key = i;
COMMIT;
END LOOP;
END
$do$;

regards,
Ajin Cherian
Fujitsu Australia




Re: do only critical work during single-user vacuum?

2022-02-03 Thread John Naylor
On Thu, Feb 3, 2022 at 3:14 AM Masahiko Sawada  wrote:

> +  The only other option that may be combined with
> VERBOSE, although in single-user mode no client
> messages are
> +  output.
>
> Given VERBOSE with EMERGENCY can work only in multi-user mode, why
> only VERBOSE can be specified with EMERGENCY? I think the same is true
> for other options like PARALLEL; PARALLEL can work only in multi-user
> mode.

You are right; it makes sense to allow options that would be turned
off automatically in single-user mode. Even if we don't expect it to
be used in normal mode, the restrictions should make sense. Also,
maybe documenting the allowed combinations is a distraction in the
main entry and should be put in the notes at the bottom.

> +  It performs a database-wide vacuum on tables, toast tables, and
> materialized views whose
> +  xid age or mxid age is older than 1 billion.
>
> Do we need to allow the user to specify the threshold or need a higher
> value (at least larger than 1.6 billion, default value of
> vacuum_failsafe_age)? I imagined a case where there are a few very-old
> tables (say 2 billion old) and many tables that are older than 1
> billion. In this case, VACUUM (EMERGENCY) would take a long time to
> complete.

I still don't think fine-tuning is helpful here. Shutdown vacuum
should be just as trivial to run as it is now, but with better
behavior. I believe a user knowledgeable enough to come up with the
best number is unlikely to get in this situation in the first place.
I'm also not sure a production support engineer would (or should)
immediately figure out a better number than a good default. That said,
the 1 billion figure was a suggestion from Peter G. upthread, and a
higher number could be argued.

> But to minimize the downtime, we might want to run VACUUM
> (EMERGENCY) on only the very-old tables, start the cluster in
> multi-user mode, and run vacuum on multiple tables in parallel.

That's exactly the idea. Also, back in normal mode, we can start
streaming WAL again. However, we don't want to go back online so close
to the limit that we risk shutdown again. People have a reasonable
expectation that if you fix an emergency, it's now fixed and the
application can go back online. Falling down repeatedly, or worrying
if it's possible, is very bad.

> +   if (params->options & VACOPT_EMERGENCY)
> +   {
> +   /*
> +   * Only consider relations able to hold unfrozen XIDs (anything 
> else
> +   * should have InvalidTransactionId in relfrozenxid anyway).
> +   */
> +   if (classForm->relkind != RELKIND_RELATION &&
> +   classForm->relkind != RELKIND_MATVIEW &&
> +   classForm->relkind != RELKIND_TOASTVALUE)
> +   {
> +   Assert(!TransactionIdIsValid(classForm->relfrozenxid));
> +   Assert(!MultiXactIdIsValid(classForm->relminmxid));
> +   continue;
> +   }
> +
> +   table_xid_age = DirectFunctionCall1(xid_age,
> classForm->relfrozenxid);
> +   table_mxid_age = DirectFunctionCall1(mxid_age,
> classForm->relminmxid);
> +
>
> I think that instead of calling xid_age and mxid_age for each
> relation, we can compute the thresholds for xid and mxid once, and
> then compare them to relation's relfrozenxid and relminmxid.

That sounds like a good idea if it's simple to implement, so I will
try it. If it adds complexity, I don't think it's worth it. Scanning a
few thousand rows in pg_class along with the function calls is tiny
compared to the actual vacuum work.

-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: libpq async duplicate error results

2022-02-03 Thread Tom Lane
Peter Eisentraut  writes:
> Tom, you worked on reorganzing the error message handling in libpq in 
> PostgreSQL 14 (commit ffa2e4670123124b92f037d335a1e844c3782d3f).  Any 
> thoughts on this?

Ah, sorry, I'd not noticed this thread.

I concur with Fabien's analysis: we report the FATAL message from
the server during the first PQgetResult, and then the second call
discovers that the connection is gone and reports "server closed
the connection unexpectedly".  Those are two independent events
(in libpq's view anyway) so reporting them separately is correct,
or at least necessary unless you want to engage in seriously
major redesign and behavioral changes.

It is annoying that some of the text is duplicated in the second
report, but in the end that's cosmetic, and I'm not sure we can
improve it without breaking other things.  In particular, we
can't just think about what comes back in the PGgetResult calls,
we also have to consider what PQerrorMessage(conn) will report
afterwards.  So I don't think that resetting conn->errorMessage
during a PQgetResult series would be a good fix.

An idea that I don't have time to pursue right now is to track
how much of conn->errorMessage has been read out by PQgetResult,
and only report newly-added text in later PQgetResult calls of
a series, while PQerrorMessage would continue to report the
whole thing.  Buffer resets would occur where they do now.

It'd probably be necessary to make a user-visible PQgetResult
that becomes a wrapper around PQgetResultInternal, because
internal callers such as PQexecFinish will need the old behavior,
or at least some of them will.
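
Roughly, that bookkeeping might look like this (a sketch only, with
hypothetical field and helper names, not actual libpq code):

/* sketch: not the real libpq structures */
typedef struct
{
    PQExpBufferData errorMessage;   /* full accumulated text, for PQerrorMessage() */
    size_t          errorReported;  /* how much of it PQgetResult has already emitted */
} ConnErrorState;

static void
attach_unreported_error(ConnErrorState *state, PGresult *res)
{
    const char *newtext = state->errorMessage.data + state->errorReported;

    if (*newtext != '\0')
        set_result_error(res, newtext); /* stand-in for the internal setter */
    state->errorReported = state->errorMessage.len;
}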

regards, tom lane




Re: do only critical work during single-user vacuum?

2022-02-03 Thread John Naylor
Thinking further about the use of emergency mode, we have this:

"If for some reason autovacuum fails to clear old XIDs from a table,
the system will begin to emit warning messages like this when the
database's oldest XIDs reach forty million transactions from the
wraparound point:

WARNING:  database "mydb" must be vacuumed within 39985967 transactions
HINT:  To avoid a database shutdown, execute a database-wide VACUUM in
that database.
"

It seems people tend not to see these warnings if they didn't already
have some kind of monitoring which would prevent them from getting
here in the first place. But if they do, the hint should mention the
emergency option here, too. This puts Justin's idea upthread in a new
light -- if the admin does notice this warning, then emergency mode
should indeed vacuum the oldest tables first, since autovacuum is not
(yet) smart enough to do that. I'll pursue that as a follow-up.

-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: Server-side base backup: why superuser, not pg_write_server_files?

2022-02-03 Thread Dagfinn Ilmari Mannsåker
Tom Lane  writes:

> I wrote:
>> The Windows animals don't like this:
>> pg_basebackup: error: connection to server at "127.0.0.1", port 59539
>> failed: FATAL: SSPI authentication failed for user "backupuser"
>
>> Not sure whether we have a standard method to get around that.
>
> Ah, right, we do.  Looks like adding something like
>
> auth_extra => [ '--create-role', 'backupuser' ]
>
> to the $node->init call would do it, or you could mess with
> invoking pg_regress --config-auth directly.

This was enough incentive for me to set up Cirrus-CI for my fork on
GitHub, and the auth_extra approach in the attached patch fixed the
test:

https://cirrus-ci.com/task/6578617030279168?logs=test_bin#L21

- ilmari

>From fec0e29ac0f57d1449229849b837438dfb0b9a07 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Thu, 3 Feb 2022 15:45:50 +
Subject: [PATCH] Fix non-superuser server-side basebackup test on Windows

Windows doesn't do trust auth for local users by default, but needs
explicitly allowing the OS user to authenticate as the extra user.
---
 src/bin/pg_basebackup/t/010_pg_basebackup.pl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index 2283a8c42d..f996c6e9b9 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -31,7 +31,7 @@
 umask(0077);
 
 # Initialize node without replication settings
-$node->init(extra => ['--data-checksums']);
+$node->init(extra => ['--data-checksums'], auth_extra => ['--create-role', 'backupuser']);
 $node->start;
 my $pgdata = $node->data_dir;
 
-- 
2.30.2



Re: Implementing Incremental View Maintenance

2022-02-03 Thread Yugo NAGATA
Hi,

On Thu, 13 Jan 2022 18:23:42 +0800
Julien Rouhaud  wrote:

> Hi,
> 
> On Thu, Nov 25, 2021 at 04:37:10PM +0900, Yugo NAGATA wrote:
> > On Wed, 24 Nov 2021 04:31:25 +
> > "r.takahash...@fujitsu.com"  wrote:
> > 
> > > 
> > > I checked the same procedure on v24 patch.
> > > But following error occurs instead of the original error.
> > > 
> > > ERROR:  relation "ivm_t_index" already exists
> > 
> > Thank you for pointing out it!
> > 
> > Hmmm, an index is created when IMMV is defined, so CREAE INDEX called
> > after this would fail... Maybe, we should not create any index automatically
> > if IMMV is created WITH NO DATA.
> > 
> > I'll fix it after some investigation.
> 
> Are you still investigating on that problem?  Also, the patchset doesn't apply
> anymore:

I attached the updated and rebased patch set.

I changed it so that a unique index is not created when an IMMV is created
WITH NO DATA. Instead, the index is created by REFRESH WITH DATA, and only
when the same one has not been created yet.

Also, I fixed the documentation to state that foreign tables and partitioned
tables are not supported, in accordance with Takahashi-san's suggestion.
 
> There isn't any answer to your following email summarizing the feature yet, so
> I'm not sure what should be the status of this patch, as there's no ideal
> category for that.  For now I'll change the patch to Waiting on Author on the
> cf app, feel free to switch it back to Needs Review if you think it's more
> suitable, at least for the design discussion need.

I changed the status to Needs Review.


Regards,
Yugo Nagata

-- 
Yugo NAGATA 




Re: Implementing Incremental View Maintenance

2022-02-03 Thread Zhihong Yu
On Thu, Feb 3, 2022 at 8:28 AM Yugo NAGATA  wrote:

> Hi,
>
> On Thu, 13 Jan 2022 18:23:42 +0800
> Julien Rouhaud  wrote:
>
> > Hi,
> >
> > On Thu, Nov 25, 2021 at 04:37:10PM +0900, Yugo NAGATA wrote:
> > > On Wed, 24 Nov 2021 04:31:25 +
> > > "r.takahash...@fujitsu.com"  wrote:
> > >
> > > >
> > > > I checked the same procedure on v24 patch.
> > > > But following error occurs instead of the original error.
> > > >
> > > > ERROR:  relation "ivm_t_index" already exists
> > >
> > > Thank you for pointing out it!
> > >
> > > Hmmm, an index is created when IMMV is defined, so CREAE INDEX called
> > > after this would fail... Maybe, we should not create any index
> automatically
> > > if IMMV is created WITH NO DATA.
> > >
> > > I'll fix it after some investigation.
> >
> > Are you still investigating on that problem?  Also, the patchset doesn't
> apply
> > anymore:
>
> I attached the updated and rebased patch set.
>
> I fixed to not create a unique index when an IMMV is created WITH NO DATA.
> Instead, the index is created by REFRESH WITH DATA only when the same one
> is not created yet.
>
> Also, I fixed the documentation to describe that foreign tables and
> partitioned
> tables are not supported according with Takahashi-san's suggestion.
>
> > There isn't any answer to your following email summarizing the feature
> yet, so
> > I'm not sure what should be the status of this patch, as there's no ideal
> > category for that.  For now I'll change the patch to Waiting on Author
> on the
> > cf app, feel free to switch it back to Needs Review if you think it's
> more
> > suitable, at least for the design discussion need.
>
> I changed the status to Needs Review.
>
>
> Hi,
Did you intend to attach the updated patch?

I don't seem to find any.

FYI


Fix CheckIndexCompatible comment

2022-02-03 Thread Yugo NAGATA
Hello,

I found an old parameter name 'heapRelation' in the comment
of CheckIndexCompatible. This parameter was removed by commit 5f173040.

Attached is a patch to remove it from the comment.

Regards,
Yugo Nagata

-- 
Yugo NAGATA 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 560dcc87a2..95cb15cbb9 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -129,7 +129,6 @@ typedef struct ReindexErrorInfo
  *		prospective index definition, such that the existing index storage
  *		could become the storage of the new index, avoiding a rebuild.
  *
- * 'heapRelation': the relation the index would apply to.
  * 'accessMethodName': name of the AM to use.
  * 'attributeList': a list of IndexElem specifying columns and expressions
  *		to index on.


should vacuum's first heap pass be read-only?

2022-02-03 Thread Robert Haas
VACUUM's first pass over the heap is implemented by a function called
lazy_scan_heap(), while the second pass is implemented by a function
called lazy_vacuum_heap_rel(). This seems to imply that the first pass
is primarily an examination of what is present, while the second pass
does the real work. This used to be more true than it now is. In
PostgreSQL 7.2, the first release that implemented concurrent vacuum,
the first heap pass could set hint bits as a side effect of calling
HeapTupleSatisfiesVacuum(), and it could freeze old xmins. However,
neither of those things wrote WAL, and you had a reasonable chance of
escaping without dirtying the page at all. By the time PostgreSQL 8.2
was released, it had been understood that making critical changes to
pages without writing WAL was not a good plan, and so freezing now
wrote WAL, but no big deal: most vacuums wouldn't freeze anything
anyway. Things really changed a lot in PostgreSQL 8.3. With the
addition of HOT, lazy_scan_heap() was made to prune the page, meaning
that the first heap pass would likely dirty a large fraction of the
pages that it touched, truncating dead tuples to line pointers and
defragmenting the page. The second heap pass would then have to dirty
the page again to mark dead line pointers unused. In the absolute
worst case, that's a very large increase in WAL generation. VACUUM
could write full page images for all of those pages while HOT-pruning
them, and then a checkpoint could happen, and then VACUUM could write
full-page images of all of them again while marking the dead line
pointers unused. I don't know whether anyone spent time and energy
worrying about this problem, but considering how much HOT improves
performance overall, it would be entirely understandable if this
didn't seem like a terribly important thing to worry about.

But maybe we should reconsider. What benefit do we get out of dirtying
the page twice like this, writing WAL each time? What if we went back
to the idea of having the first heap pass be read-only? In fact, I'm
thinking we might want to go even further and try to prevent even hint
bit changes to the page during the first pass, especially because now
we have checksums and wal_log_hints. If our vacuum cost settings are
to be believed (and I am not sure that they are), dirtying a page is 10
times as expensive as reading one from the disk. So on a large table,
we're paying 44 vacuum cost units per heap page vacuumed twice, when
we could be paying only 24 such cost units. What a bargain! The
downside is that we would be postponing, perhaps substantially, the
work that can be done immediately, namely freeing up space in the page
and updating the free space map. The former doesn't seem like a big
loss, because it can be done by anyone who visits the page anyway, and
skipped if nobody does. The latter might be a loss, because getting
the page into the freespace map sooner could prevent bloat by allowing
space to be recycled sooner. I'm not sure how serious a problem this
is. I'm curious what other people think. Would it be worth the delay
in getting pages into the FSM if it means we dirty the pages only
once? Could we have our cake and eat it too by updating the FSM with
the amount of free space that the page WOULD have if we pruned it, but
not actually do so?
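(For the record, those figures assume the default vacuum_cost_page_miss = 2
and vacuum_cost_page_dirty = 20: reading and dirtying the page in both
passes costs (2 + 20) * 2 = 44 units, while reading it twice but dirtying it
only once costs 2 + 2 + 20 = 24.)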

I'm thinking about this because of the "decoupling table and index
vacuuming" thread, which I was discussing with Dilip this morning. In
a world where table vacuuming and index vacuuming are decoupled, it
feels like we want to have only one kind of heap vacuum. It pushes us
in the direction of unifying the first and second pass, and doing all
the cleanup work at once. However, I don't know that we want to use
the approach described there in all cases. For a small table that is,
let's just say, not part of any partitioning hierarchy, I'm not sure
that using the conveyor belt approach makes a lot of sense, because
the total amount of work we want to do is so small that we should just
get it over with and not clutter up the disk with more conveyor belt
forks -- especially for people who have large numbers of small tables,
the inode consumption could be a real issue. And we won't really save
anything either. The value of decoupling operations has to do with
improving concurrency and error recovery and allowing global indexes
and a bunch of stuff that, for a small table, simply doesn't matter.
So it would be nice to fall back to an approach more like what we do
now. But then you end up with two fairly distinct code paths, one
where you want the heap phases combined and another where you want
them separated. If the first pass were a strictly read-only pass, you
could do that if there's no conveyor belt, or else read from the
conveyor belt if there is one, and then the phase where you dirty the
heap looks about the same either way.

Aside from the question of whether this is a good idea at all, I'm
also wondering what kinds of experiments we could run to try to find
out. Wh

Re: Server-side base backup: why superuser, not pg_write_server_files?

2022-02-03 Thread Andrew Dunstan


On 2/2/22 17:52, Tom Lane wrote:
> I wrote:
>> The Windows animals don't like this:
>> pg_basebackup: error: connection to server at "127.0.0.1", port 59539 
>> failed: FATAL:  SSPI authentication failed for user "backupuser"
>> Not sure whether we have a standard method to get around that.
> Ah, right, we do.  Looks like adding something like
>
> auth_extra => [ '--create-role', 'backupuser' ]
>
> to the $node->init call would do it, or you could mess with
> invoking pg_regress --config-auth directly.
>
>   



I've fixed this using the auth_extra method, which avoids a reload.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: do only critical work during single-user vacuum?

2022-02-03 Thread Justin Pryzby
On Tue, Feb 01, 2022 at 04:50:31PM -0500, John Naylor wrote:
> On Thu, Jan 27, 2022 at 8:28 PM Justin Pryzby  wrote:
> 
> > I'm sure you meant "&" here (fixed in attached patch to appease the cfbot):
> > +   if (options | VACOPT_MINIMAL)
> 
> Thanks for catching that! That copy-pasto was also masking my failure
> to process the option properly -- fixed in the attached as v5.

Thank the cfbot ;)

Actually, your most recent patch failed again without this:

-   if (params->VACOPT_EMERGENCY)
+   if (params->options & VACOPT_EMERGENCY)

> - It mentions the new command in the error hint in
> GetNewTransactionId(). I'm not sure if multi-word commands should be
> quoted like this.

Use  ?

> +  xid age or mxid age is older than 1 billion. To complete as quickly as 
> possible, an emergency
> +  vacuum will skip truncation and index cleanup, and will skip toast 
> tables whose age has not
> +  exceeded the cutoff.

Why does this specially mention toast tables ?

> +  While this option could be used while the postmaster is running, it is 
> expected that the wraparound
> +  failsafe mechanism will automatically work in the same way to prevent 
> imminent shutdown.
> +  When EMERGENCY is specified no tables may be 
> listed, since it is designed to

There should be a comma after "specified".

> +  select candidate relations from the entire database.
> +  The only other option that may be combined with 
> VERBOSE, although in single-user mode no client messages 
> are

this is missing a word?
Maybe say: May not be combined with any other option, other than VERBOSE.

Should the docs describe that the vacuum is done with cost-based delay disabled
and with vacuum_freeze_table_age=0 (and other settings)?

> + ereport(ERROR,
> + (errcode(ERRCODE_SYNTAX_ERROR),
> + errmsg("option 
> \"%s\" is incompatible with EMERGENCY", opt->defname),
> + 
> parser_errposition(pstate, opt->location)));

IMO new code should avoid using the outer parenthesis around errcode().

Maybe the errmsg should say: .. may not be specified with EMERGENCY.
EMERGENCY probably shouldn't be part of the translatable string.

+   if (strcmp(opt->defname, "emergency") != 0 &&
+   strcmp(opt->defname, "verbose") != 0 &&
+   defGetBoolean(opt))

It's wrong to call defGetBoolean(), since the options may not be boolean.
postgres=# VACUUM(EMERGENCY, INDEX_CLEANUP auto);
ERROR:  index_cleanup requires a Boolean value

I think EMERGENCY mode should disable process_toast.  It already processes
toast tables separately.  See 003.

Should probably exercise (EMERGENCY) in vacuum.sql.  See 003.

> > I still wonder if the relations should be processed in order of decreasing 
> > age.
> > An admin might have increased autovacuum_freeze_max_age up to 2e9, and your
> > query might return thousands of tables, with a wide range of sizes and ages.
> 
> While that seems like a nice property to have, it does complicate
> things, so can be left for follow-on work.

I added that in the attached 003.

-- 
Justin
>From a870303f4bd62b7c653a4bef53ed6d2748268bc0 Mon Sep 17 00:00:00 2001
From: John Naylor 
Date: Tue, 1 Feb 2022 16:50:31 -0500
Subject: [PATCH 1/3] do only critical work during single-user vacuum?

Feb 01 John Naylor
---
 doc/src/sgml/maintenance.sgml   |  12 ++--
 doc/src/sgml/ref/vacuum.sgml|  22 ++
 src/backend/access/transam/varsup.c |   4 +-
 src/backend/commands/vacuum.c   | 107 +---
 src/include/commands/vacuum.h   |   5 ++
 5 files changed, 134 insertions(+), 16 deletions(-)

diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 36f975b1e5b..5c360499504 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -629,17 +629,19 @@ HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that data
 
 
 ERROR:  database is not accepting commands to avoid wraparound data loss in database "mydb"
-HINT:  Stop the postmaster and vacuum that database in single-user mode.
+HINT:  Stop the postmaster and run "VACUUM (EMERGENCY)" in that database in single-user mode.
 
 
 The three-million-transaction safety margin exists to let the
 administrator recover without data loss, by manually executing the
-required VACUUM commands.  However, since the system will not
+required VACUUM command.  However, since the system will not
 execute commands once it has gone into the safety shutdown mode,
 the only way to do this is to stop the server and start the server in single-user
-mode to execute VACUUM.  The shutdown mode is not enforced
-in single-user mode.  See the  reference
-page for details about using single-user mode.
+mode to ex

Re: Implementing Incremental View Maintenance

2022-02-03 Thread Zhihong Yu
On Thu, Feb 3, 2022 at 8:50 AM Yugo NAGATA  wrote:

> On Thu, 3 Feb 2022 08:48:00 -0800
> Zhihong Yu  wrote:
>
> > On Thu, Feb 3, 2022 at 8:28 AM Yugo NAGATA  wrote:
> >
> > > Hi,
> > >
> > > On Thu, 13 Jan 2022 18:23:42 +0800
> > > Julien Rouhaud  wrote:
> > >
> > > > Hi,
> > > >
> > > > On Thu, Nov 25, 2021 at 04:37:10PM +0900, Yugo NAGATA wrote:
> > > > > On Wed, 24 Nov 2021 04:31:25 +
> > > > > "r.takahash...@fujitsu.com"  wrote:
> > > > >
> > > > > >
> > > > > > I checked the same procedure on v24 patch.
> > > > > > But following error occurs instead of the original error.
> > > > > >
> > > > > > ERROR:  relation "ivm_t_index" already exists
> > > > >
> > > > > Thank you for pointing out it!
> > > > >
> > > > > Hmmm, an index is created when IMMV is defined, so CREAE INDEX
> called
> > > > > after this would fail... Maybe, we should not create any index
> > > automatically
> > > > > if IMMV is created WITH NO DATA.
> > > > >
> > > > > I'll fix it after some investigation.
> > > >
> > > > Are you still investigating on that problem?  Also, the patchset
> doesn't
> > > apply
> > > > anymore:
> > >
> > > I attached the updated and rebased patch set.
> > >
> > > I fixed to not create a unique index when an IMMV is created WITH NO
> DATA.
> > > Instead, the index is created by REFRESH WITH DATA only when the same
> one
> > > is not created yet.
> > >
> > > Also, I fixed the documentation to describe that foreign tables and
> > > partitioned
> > > tables are not supported according with Takahashi-san's suggestion.
> > >
> > > > There isn't any answer to your following email summarizing the
> feature
> > > yet, so
> > > > I'm not sure what should be the status of this patch, as there's no
> ideal
> > > > category for that.  For now I'll change the patch to Waiting on
> Author
> > > on the
> > > > cf app, feel free to switch it back to Needs Review if you think it's
> > > more
> > > > suitable, at least for the design discussion need.
> > >
> > > I changed the status to Needs Review.
> > >
> > >
> > > Hi,
> > Did you intend to attach updated patch ?
> >
> > I don't seem to find any.
>
> Oops, I attached. Thanks!
>
> Hi,
For CreateIndexOnIMMV():

+   ereport(NOTICE,
+   (errmsg("could not create an index on materialized view
\"%s\" automatically",
...
+   return;
+   }

Should the return type be changed to bool so that the caller knows whether
the index creation succeeds?
If index creation is unsuccessful, should the call
to CreateIvmTriggersOnBaseTables() be skipped?

For check_ivm_restriction_walker():

+   break;
+   expression_tree_walker(node, check_ivm_restriction_walker,
NULL);
+   break;

Something is missing between the break and expression_tree_walker().

Cheers


Re: do only critical work during single-user vacuum?

2022-02-03 Thread Robert Haas
On Thu, Dec 9, 2021 at 8:56 PM Andres Freund  wrote:
> I think we should move *away* from single user mode, rather than the
> opposite. It's a substantial code burden and it's hard to use.

Yes. This thread seems to be largely devoted to the topic of making
single-user vacuum work better, but I don't see anyone asking the
question "why do we have a message that tells people to vacuum in
single user mode in the first place?". It's basically bad advice, with
one small exception that I'll talk about in a minute. Suppose we had a
message in the tree that said "HINT: Consider angering a live anaconda
to fix this problem." If that were so, the correct thing to do
wouldn't be to add a section to our documentation explaining how to
deal with angry anacondas. The correct thing to do would be to remove
the hint as bad advice that we never should have offered in the first
place. And so here. We should not try to make vacuum in single
user-mode work better or differently, or at least that shouldn't be
our primary objective. We should just stop telling people to do it. We
should probably add messages and documentation *discouraging* the use
of single user mode for recovering from wraparound trouble, exactly
the opposite of what we do now. There's nothing we can do in
single-user mode that we can't do equally well in multi-user mode. If
people try to fix wraparound problems in multi-user mode, they still
have read-only access to their database, they can use parallelism,
they can use command line utilities like vacuumdb, and they can use
psql which has line editing and allows remote access and is a way
nicer user experience than running postgres --single. We need a really
compelling reason to tell people to give up all those advantages, and
there is no such reason. It makes just as much sense as telling people
to deal with wraparound problems by angering a live anaconda.

I did say there was an exception, and it's this: the last time I
studied this issue back in 2019,[1] vacuum insisted on trying to
truncate tables even when the system is in wraparound danger. Then it
would fail, because truncating the table required allocating an XID,
which would fail if we were short on XIDs. By putting the system in
single user mode, you could continue to allocate XIDs and thus VACUUM
would work. However, if you think about this for even 10 seconds, you
can see that it's terrible. If we're so short of XIDs that we are
scared to allocate them for fear of causing an actual wraparound,
putting the system into a mode where that protection is bypassed is a
super-terrible idea. People will be able to run vacuum, yes, but if
they have too many tables, they will actually experience wraparound
and thus data loss before they process all the tables they have. What
we ought to do to solve this problem is NOT TRUNCATE when the number
of remaining XIDs is small, so that we don't consume any of the
remaining XIDs until we get the system out of wraparound danger. I
think the "failsafe" stuff Peter added in v14 fixes that, though. If
not, we should adjust it so it does. And then we should KILL WITH FIRE
the message telling people to use single user mode -- and once we do
that, the question of what the behavior ought to be when someone does
run VACUUM in single user mode becomes a lot less important.

This problem is basically self-inflicted. We have given people bad
advice (use single user mode) and then they suffer when they take it.
Ameliorating the suffering isn't the worst idea ever, but it's
basically fixing the wrong problem.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

[1] 
http://postgr.es/m/CA+Tgmob1QCMJrHwRBK8HZtGsr+6cJANRQw2mEgJ9e=d+z7c...@mail.gmail.com




Re: Server-side base backup: why superuser, not pg_write_server_files?

2022-02-03 Thread Robert Haas
On Thu, Feb 3, 2022 at 12:26 PM Andrew Dunstan  wrote:
> I've fixed this using the auth_extra method, which avoids a reload.

Thank you much.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: row filtering for logical replication

2022-02-03 Thread Andres Freund
Hi,

On 2022-02-01 13:31:36 +1100, Peter Smith wrote:
> TEST STEPS - Workload case a
> 
> 1. Run initdb pub and sub and start both postgres instances (use the nosync 
> postgresql.conf)
> 
> 2. Run psql for both instances and create tables
> CREATE TABLE test (key int, value text, data jsonb, PRIMARY KEY(key, value));
> 
> 3. create the PUBLISHER on pub instance (e.g. choose from below depending on 
> filter)
> CREATE PUBLICATION pub_1 FOR TABLE test;  
> -- 100% (no filter)
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 0);  -- 100% 
> allowed
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 25); -- 75% allowed
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 50); -- 50% allowed
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 75); -- 25% allowed
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 100);-- 0% 
> allowed
> 
> 4. create the SUBSCRIBER on sub instance
> CREATE SUBSCRIPTION sync_sub CONNECTION 'host=127.0.0.1 port=5432 
> dbname=postgres application_name=sync_sub' PUBLICATION pub_1;
> 
> 5. On pub side modify the postgresql.conf on the publisher side and restart
> \q quit psql
> edit synchronous_standby_names = 'sync_sub' 
> restart the pub instance
> 
> 6. Run psql (pub side) and perform the test run.
> \timing
> INSERT INTO test SELECT i, i::text, row_to_json(row(i)) FROM 
> generate_series(1,101)i;
> select count(*) from test;
> TRUNCATE test;
> select count(*) from test;
> repeat 6 for each test run.

I think using syncrep as the mechanism for benchmarking the decoding
side makes the picture less clear than it could be - you're measuring a lot of
things other than the decoding. E.g. the overhead of applying those changes. I
think it'd be more accurate to do something like:

/* create publications, table, etc */

-- create a slot from before the changes
SELECT pg_create_logical_replication_slot('origin', 'pgoutput');

/* the changes you're going to measure */

-- save end LSN
SELECT pg_current_wal_lsn();

-- create a slot for pg_recvlogical to consume
SELECT * FROM pg_copy_logical_replication_slot('origin', 'consume');

-- benchmark, endpos is from pg_current_wal_lsn() above
time pg_recvlogical -S consume --endpos 0/2413A720 --start -o proto_version=3 
-o publication_names=pub_1 -f /dev/null  -d postgres

-- clean up
SELECT pg_drop_replication_slot('consume');

Then repeat this with the different publications and compare the time taken
for the pg_recvlogical. That way the WAL is exactly the same, there is no
overhead of actually doing anything with the data on the other side, etc.
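
As a rough sketch, the per-publication loop could look something like this
(publication names here are made up; it assumes all publications were created
before the measured changes were written, since row filters are evaluated
against the historic catalog, and that ENDPOS was captured with
pg_current_wal_lsn() as above):

# hedged sketch only
ENDPOS=0/2413A720
for pub in pub_100 pub_75 pub_50 pub_25 pub_0; do
    psql -X -q -d postgres -c "SELECT * FROM pg_copy_logical_replication_slot('origin', 'consume');"
    echo "publication: $pub"
    time pg_recvlogical -S consume --endpos "$ENDPOS" --start \
        -o proto_version=3 -o publication_names="$pub" -f /dev/null -d postgres
    psql -X -q -d postgres -c "SELECT pg_drop_replication_slot('consume');"
done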

Greetings,

Andres Freund




Re: do only critical work during single-user vacuum?

2022-02-03 Thread John Naylor
On Thu, Feb 3, 2022 at 1:06 PM Robert Haas  wrote:
>
> On Thu, Dec 9, 2021 at 8:56 PM Andres Freund  wrote:
> > I think we should move *away* from single user mode, rather than the
> > opposite. It's a substantial code burden and it's hard to use.
>
> Yes. This thread seems to be largely devoted to the topic of making
> single-user vacuum work better, but I don't see anyone asking the
> question "why do we have a message that tells people to vacuum in
> single user mode in the first place?". It's basically bad advice, with
> one small exception that I'll talk about in a minute.

The word "advice" sounds like people have a choice, rather than the
system not accepting commands anymore. It would be much less painful
if the system closed connections and forbade all but superusers to
connect, but that sounds like a lot of work. (happy to be proven
otherwise)

-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: do only critical work during single-user vacuum?

2022-02-03 Thread Robert Haas
On Thu, Feb 3, 2022 at 1:34 PM John Naylor  wrote:
> The word "advice" sounds like people have a choice, rather than the
> system not accepting commands anymore. It would be much less painful
> if the system closed connections and forbade all but superusers to
> connect, but that sounds like a lot of work. (happy to be proven
> otherwise)

They *do* have a choice. They can continue to operate the system in
multi-user mode, they can have read access to their data, and they can
run VACUUM and other non-XID-allocating commands to fix the issue.
Sure, their application can't run commands that allocate XIDs, but
it's not going to be able to do that if they go to single-user mode
either.

I don't understand why we would want the system to stop accepting
connections other than superuser connections. That would provide
strictly less functionality and I don't understand what it would gain.
But it would still be better than going into single-user mode, which
provides even less functionality and has basically no advantages of
any kind.

Why are you convinced that the user HAS to go to single-user mode? I
don't think they have to do that, and I don't think they should want
to do that.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Support for NSS as a libpq TLS backend

2022-02-03 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Tue, Feb  1, 2022 at 01:52:09PM -0800, Andres Freund wrote:
> > There's https://hg.mozilla.org/projects/nspr/file/tip/pr/src - which is I
> > think the upstream source.
> > 
> > A project without even a bare-minimal README at the root does have a 
> > "internal
> > only" feel to it...
> 
> I agree --- it is a library --- if they don't feel the need to publish
> the API, it seems to mean they want to maintain the ability to change it
> at any time, and therefore it is inappropriate for other software to
> rely on that API.

This is really not a reasonable representation of how this library has
been maintained historically nor is there any reason to think that their
policy regarding the API has changed recently.  They do have a
documented API and that hasn't changed- it's just that it's not easily
available in web-page form any longer and that's due to something
independent of the library maintenance.  They've also done a good job
with maintaining the API as one would expect from a library and so this
really isn't a reason to avoid using it.  If there's actual specific
examples of the API not being well maintained and causing issues then
please point to them and we can discuss if that is a reason to consider
not depending on NSS/NSPR.

> This is not the same as Postgres extensions needing to read the Postgres
> source code --- they are an important but edge use case and we never saw
> the need to standardize or publish the internal functions that must be
> studied and adjusted possibly for major releases.

I agree that extensions and public libraries aren't entirely the same
but I don't think it's all that unreasonable for developers that are
using a library to look at the source code for that library when
developing against it, that's certainly something I've done for a
number of different libraries.

> This kind of feels like the Chrome JavaScript code that used to be able
> to be build separately for PL/v8, but has gotten much harder to do in
> the past few years.

This isn't at all like that case, where the maintainers made a very
clear and intentional choice to make it quite difficult for packagers to
pull v8 out to package it.  Nothing like that has happened with NSS and
there isn't any reason to think that it will based on what the
maintainers have said and what they've done across the many years that
NSS has been around.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Support for NSS as a libpq TLS backend

2022-02-03 Thread Stephen Frost
Greetings,

* Daniel Gustafsson (dan...@yesql.se) wrote:
> > On 3 Feb 2022, at 15:07, Peter Eisentraut 
> >  wrote:
> > 
> > On 28.01.22 15:30, Robert Haas wrote:
> >> I would really, really like to have an alternative to OpenSSL for PG.
> > 
> > What are the reasons people want that?  With OpenSSL 3, the main reasons -- 
> > license and FIPS support -- have gone away.
> 
> At least it will go away when OpenSSL 3 is FIPS certified, which is yet to
> happen (submitted, not processed).
> 
> I see quite a few valid reasons to want an alternative, a few off the top of 
> my
> head include:
> 
> - Using trust stores like Keychain on macOS with Secure Transport.  There is
> AFAIK something similar on Windows and NSS has its certificate databases.
> Especially on client side libpq it would be quite nice to integrate with where
> certificates already are rather than rely on files on disks.
> 
> - Not having to install OpenSSL, Schannel and Secure Transport would make life
> easier for packagers.
> 
> - Simply having an alternative.  The OpenSSL projects recent venture into
> writing transport protocols have made a lot of people worried over their
> bandwidth for fixing and supporting core features.
> 
> Just my $0.02, everyones mileage varies on these.

Yeah, agreed on all of these.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: archive modules

2022-02-03 Thread Robert Haas
On Wed, Feb 2, 2022 at 5:44 PM Nathan Bossart  wrote:
> > I would suggest s/if it eventually/only when it/
>
> Done.

Committed. I'm going to be 0% surprised if the buildfarm turns pretty
colors, but I don't know how to know what it's going to be unhappy
about except by trying it, so here goes.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Latest LLVM breaks our code again

2022-02-03 Thread Andres Freund
Hi,

On 2022-02-03 10:44:11 +0100, Fabien COELHO wrote:
> > I'm doubtful that tracking development branches of LLVM is a good
> > investment. Their development model is to do changes in-tree much more than 
> > we
> > do. Adjusting to API changes the moment they're made will often end up with
> > further changes to the same / related lines.  Once they open the relevant
> > release-NN branch, it's a different story.
> >
> > Maybe it'd make sense to disable --with-llvm on seawasp
> > and have a separate animal that tracks the newest release branch?
>
> The point of seawasp is somehow to have an early warning of upcoming
> changes, especially the --with-llvm part. LLVM indeed is a moving target,
> but they very rarely back down on their API changes…

The point is less backing down, but making iterative changes that require
changing the same lines repeatedly...


> About tracking releases: this means more setups and runs and updates, and as
> I think they do care about compatibility *within* a release we would not see
> breakages so it would not be very useful, and we would only get the actual
> breakages at LLVM release time, which may be much later.

LLVM's release branches are created well before the release. E.g 13 was afaics
branched off 2021-07-27 [1], released 2021-09-30 [2].


> For these reasons, I'm inclined to let seawasp as it is.

I find seawasp tracking the development trunk compilers useful. Just not for
--with-llvm. The latter imo *reduces* seawasp's value, because once there's an API
change we can't see whether there's e.g. a compiler codegen issue leading to
crashes or whatnot.

What I was proposing was to remove --with-llvm from seawasp, and have a
separate animal tracking the newest llvm release branch (I can run/host that
if needed).

Greetings,

Andres Freund

[1] git show $(git merge-base origin/release/13.x origin/main)
[2] git show llvmorg-13.0.0




Re: Support for NSS as a libpq TLS backend

2022-02-03 Thread Peter Eisentraut

On 03.02.22 15:53, Daniel Gustafsson wrote:

I see quite a few valid reasons to want an alternative, a few off the top of my
head include:

- Using trust stores like Keychain on macOS with Secure Transport.  There is
AFAIK something similar on Windows and NSS has its certificate databases.
Especially on client side libpq it would be quite nice to integrate with where
certificates already are rather than rely on files on disks.

- Not having to install OpenSSL, Schannel and Secure Transport would make life
easier for packagers.


Those are good reasons for Schannel and Secure Transport, less so for NSS.


- Simply having an alternative.  The OpenSSL projects recent venture into
writing transport protocols have made a lot of people worried over their
bandwidth for fixing and supporting core features.


If we want simply an alternative, we had a GnuTLS variant almost done a 
few years ago, but in the end people didn't want it enough.  It seems to 
be similar now.






Re: support for CREATE MODULE

2022-02-03 Thread Swaha Miller
Thank you for the feedback Pavel and Julien. I'll try to explain some of
the issues and points you raise to the best of my understanding.

The reason for modules is that it would serve as an organizational unit
that can allow setting permissions on those units. So, for example, all
functions in a module can be subject to setting access permissions on for
some user(s) or group(s). I didn't explain it well in the sgml docs, but
along with module syntax, I'm proposing introducing privileges to
grant/revoke on modules and routines in modules. And why modules for this
purpose? Because it's in the SQL spec, so it seems like a way to do it.

I'm adding comments inline for the list of functionality you mentioned. I
look forward to discussing this more and figuring out how to make a useful
contribution to the community.

On Wed, Feb 2, 2022 at 11:22 PM Pavel Stehule 
wrote:

>
>
> čt 3. 2. 2022 v 5:46 odesílatel Julien Rouhaud 
> napsal:
>
>> Hi,
>>
>> On Thu, Feb 03, 2022 at 05:25:27AM +0100, Pavel Stehule wrote:
>> >
>> > čt 3. 2. 2022 v 3:28 odesílatel Swaha Miller 
>> > napsal:
>> >
>> > > Hi,
>> > >
>> > > I'm following up from Jim's POC for adding MODULE to PostgreSQL. [1]
>> > >
>> > > My proposal implements modules as schema objects to be stored in a new
>> > > system catalog pg_module with new syntax for CREATE [OR REPLACE]
>> MODULE,
>> > > ALTER MODULE, DROP MODULE and for GRANT and REVOKE for privileges on
>> > > modules and module routines. I am attempting to follow the SQL spec.
>> > > However, for right now, I'm proposing to support only routines as
>> module
>> > > contents, with local temporary tables and path specifications as
>> defined
>> > > in the SQL spec, to be supported in a future submission. We could also
>> > > include support for variables depending on its status. [2]
>> >
>> > I dislike this feature. The modules are partially redundant to schemas
>> and
>> > to extensions in Postgres, and I am sure, so there is no reason to
>> > introduce this.
>> >
>> > What is the benefit against schemas and extensions?
>>
>> I agree with Pavel.  It seems that it's mainly adding another namespacing
>> layer
>> between schemas and objects, and it's likely going to create a mess.
>> That's also going to be problematic if you want to add support for module
>> variables, as you won't be able to use e.g.
>> dbname.schemaname.modulename.variablename.fieldname.
>>
>>
I haven't yet added support for variables so will need to look into the
problems with this if we're going to do that.


> Also, my understanding was that the major interest of modules (at least
>> for the
>> routines part) was the ability to make some of them private to the
>> module, but
>> it doesn't look like it's doing that, so I also don't see any real benefit
>> compared to schemas and extensions.
>>
>
>
Yes, that is indeed the goal/use-case with setting permissions with grant
and revoke. Right now, I have proposed create and reference as the kinds of
access that can be controlled on modules, and reference as the kind of
access that can be controlled on routines inside modules.


> The biggest problem is coexistence of Postgres's SEARCH_PATH  object
> identification, and local and public scopes used in MODULEs or in Oracle's
> packages.
>
>
I am not extremely familiar with Oracle's packages, but do know of them.
I'm wondering if local and public scopes for MODULE is in the SQL spec? (I
will check for that...) My thinking was to implement functionality that
conforms to the SQL spec, not try to match Oracle's package which differs
from the spec in some ways.


> I can imagine MODULES as third level of database unit object grouping with
> following functionality
>
> 1. It should support all database objects like schemas
>

Do you mean that schemas should be groupable under modules? My thinking was
to follow what the SQL spec says about what objects should be in modules,
and I started with routines as one of the objects that there are use cases
for. Such as controlling access permissions on routines at some granularity
that is not an entire schema and not individual functions/procedures.


> 2. all public objects should be accessed directly when outer schema is in
> SEARCH_PATH
>

Yes, an object inside a module is in a schema and can be accessed with
schemaname.func() as well as modulename.func() as well as
schemaname.modulename.func(). I think you are saying it should be
accessible with func() without a qualifying schemaname or modulename if the
schemaname is in the search path, and that sounds reasonable too. Unless,
of course, func() was created in a module, in which case access permissions
for the module and module contents will determine whether func() should be
directly accessible. In my current proposal, a previously created func()
can't be added to a module created later. The purpose of creating routines
inside a module (either when the module is created or after the module is
created) would be with the intent of setting access permissions 

Re: archive modules

2022-02-03 Thread Nathan Bossart
On Thu, Feb 03, 2022 at 02:11:18PM -0500, Robert Haas wrote:
> Committed. I'm going to be 0% surprised if the buildfarm turns pretty
> colors, but I don't know how to know what it's going to be unhappy
> about except by trying it, so here goes.

Thanks!  I'll keep an eye on the buildfarm and will send any new patches
that are needed.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Support for NSS as a libpq TLS backend

2022-02-03 Thread Robert Haas
On Thu, Feb 3, 2022 at 2:16 PM Peter Eisentraut
 wrote:
> If we want simply an alternative, we had a GnuTLS variant almost done a
> few years ago, but in the end people didn't want it enough.  It seems to
> be similar now.

Yeah. I think it's pretty clear that the only real downside of
committing support for GnuTLS or NSS or anything else is that we then
need to maintain that support (or eventually remove it). I don't
really see a problem if Daniel wants to commit this, set up a few
buildfarm animals, and fix stuff when it breaks. If he does that, I
don't see that we're losing anything. But, if he commits it in the
hope that other people are going to step up to do the maintenance
work, maybe that's not going to happen, or at least not without
grumbling. I'm not objecting to this being committed in the sense that
I don't ever want to see it in the tree, but I'm also not volunteering
to maintain it.

As a philosophical matter, I don't think it's great for us - or the
Internet in general - to be too dependent on OpenSSL. Software
monocultures are not great, and OpenSSL has near-constant security
updates and mediocre documentation. Now, maybe anything else we
support will end up having similar issues, or worse. But if we and
other projects are never willing to support anything but OpenSSL, then
there will never be viable alternatives to OpenSSL, because a library
that isn't actually used by the software you care about is of no use.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Adding CI to our tree

2022-02-03 Thread Andres Freund
Hi,

On 2022-02-02 21:58:28 -0600, Justin Pryzby wrote:
> FYI: I've rebased these against your cirrus/windows changes.

Did you put then on a dedicated branch, or only intermixed with other changes?


> A recent cirrus result is here; this has other stuff in the branch, so you can
> see the artifacts with HTML docs and code coverage.

I'm a bit worried about the increased storage and runtime overhead due to the
docs changes. We probably can make it a good bit cheaper though.


> https://github.com/justinpryzby/postgres/runs/5046465342


> 95793675633 cirrus: spell ccache_maxsize

Yep, will apply with a bunch of your other changes, if you answer a question
or two...


> e5286ede1b4 cirrus: avoid unnecessary double star **

Can't get excited about this, but whatever.

What I am excited about is that some of your other changes showed that we
don't need separate *_artifacts for separate directories anymore. That used to
be the case, but an array of paths is now supported. Putting log, diffs, and
regress_log in one directory will be considerably more convenient...


> 03f6de4643e cirrus: include hints how to install OS packages..

What's the idea behind

#echo 'deb http://deb.debian.org/debian bullseye main' >>/etc/apt/sources.list

That's already in sources.list, so I'm not sure what this shows?


I think it may be a bit cleaner to have the "install additional packages"
"template step" be just that, and not merge in other contents into it?


> 39cc2130e76 cirrus/linux: script test.log..

I still don't understand what this commit is trying to achieve.


> 398b7342349 cirrus/macos: uname -a and export at end of sysinfo

Shrug.


> 9d0f03d3450 wip: cirrus: code coverage

I don't think it's good to just unconditionally reference the master branch
here. It'll do bogus stuff once 15 is branched off. It works for cfbot, but
not other uses.

Perhaps we could have a cfbot special case (by checking for a repository
variable indicating the base branch) and show the incremental changes
to that branch? Or we could just check which branch has the smallest distance
in #commits?


If cfbot weren't a thing, I'd just make code coverage / docs generation a
manual task (startable by a click in the UI). But that requires permission on
the repository...


Hm. I wonder if cfbot could maintain the code not as branches as such, but as
pull requests? Those include information about what the base branch is ;)


Is looking at the .c files in the change really a reliable predictor of where
code coverage changes? I'm doubtful. Consider stuff like removing the last
user of some infrastructure or such. Or adding the first.


> bff64e8b998 cirrus: upload html docs as artifacts
> 80f52c3b172 wip: only upload changed docs

Similar to the above.


> 7f3b50f1847 pg_upgrade: show list of files copied only in vebose mode

I think that should be discussed on a different thread.


> 97d7b039b9b vcregress/ci: test modules/contrib with NO_INSTALLCHECK=1

Probably also worth breaking out into a new thread.


> 654b6375401 wip: vcsregress: add alltaptests

I assume this doesn't yet work to a meaningful degree? Last time I checked
there were quite a few tests that needed to be invoked in a specific
directory.  In the meson branch I worked around that by having a wrapper
around the invocation of individual tap tests that changes CWD.

Greetings,

Andres Freund




Re: support for CREATE MODULE

2022-02-03 Thread Pavel Stehule
čt 3. 2. 2022 v 20:21 odesílatel Swaha Miller 
napsal:

> Thank you for the feedback Pavel and Julien. I'll try to explain some of
> the issues and points you raise to the best of my understanding.
>
> The reason for modules is that it would serve as an organizational unit
> that can allow setting permissions on those units. So, for example, all
> functions in a module can be subject to setting access permissions on for
> some user(s) or group(s). I didn't explain it well in the sgml docs, but
> along with module syntax, I'm proposing introducing privileges to
> grant/revoke on modules and routines in modules. And why modules for this
> purpose? Because it's in the SQL spec, so it seems like a way to do it.
>

This part of the standard is dead - there is no strong reason to implement
it.


>
> I'm adding comments inline for the list of functionality you mentioned. I
> look forward to discussing this more and figuring out how to make a useful
> contribution to the community.
>
> On Wed, Feb 2, 2022 at 11:22 PM Pavel Stehule 
> wrote:
>
>>
>>
>> čt 3. 2. 2022 v 5:46 odesílatel Julien Rouhaud 
>> napsal:
>>
>>> Hi,
>>>
>>> On Thu, Feb 03, 2022 at 05:25:27AM +0100, Pavel Stehule wrote:
>>> >
>>> > čt 3. 2. 2022 v 3:28 odesílatel Swaha Miller 
>>> > napsal:
>>> >
>>> > > Hi,
>>> > >
>>> > > I'm following up from Jim's POC for adding MODULE to PostgreSQL. [1]
>>> > >
>>> > > My proposal implements modules as schema objects to be stored in a
>>> new
>>> > > system catalog pg_module with new syntax for CREATE [OR REPLACE]
>>> MODULE,
>>> > > ALTER MODULE, DROP MODULE and for GRANT and REVOKE for privileges on
>>> > > modules and module routines. I am attempting to follow the SQL spec.
>>> > > However, for right now, I'm proposing to support only routines as
>>> module
>>> > > contents, with local temporary tables and path specifications as
>>> defined
>>> > > in the SQL spec, to be supported in a future submission. We could
>>> also
>>> > > include support for variables depending on its status. [2]
>>> >
>>> > I dislike this feature. The modules are partially redundant to schemas
>>> and
>>> > to extensions in Postgres, and I am sure, so there is no reason to
>>> > introduce this.
>>> >
>>> > What is the benefit against schemas and extensions?
>>>
>>> I agree with Pavel.  It seems that it's mainly adding another
>>> namespacing layer
>>> between schemas and objects, and it's likely going to create a mess.
>>> That's also going to be problematic if you want to add support for module
>>> variables, as you won't be able to use e.g.
>>> dbname.schemaname.modulename.variablename.fieldname.
>>>
>>>
> I haven't yet added support for variables so will need to look into the
> problems with this if we're going to do that.
>
>
>> Also, my understanding was that the major interest of modules (at least
>>> for the
>>> routines part) was the ability to make some of them private to the
>>> module, but
>>> it doesn't look like it's doing that, so I also don't see any real
>>> benefit
>>> compared to schemas and extensions.
>>>
>>
>>
> Yes, that is indeed the goal/use-case with setting permissions with grant
> and revoke. Right now, I have proposed create and reference as the kinds of
> access that can be controlled on modules, and reference as the kind of
> access that can be controlled on routines inside modules.
>
>
>> The biggest problem is coexistence of Postgres's SEARCH_PATH  object
>> identification, and local and public scopes used in MODULEs or in Oracle's
>> packages.
>>
>>
> I am not extremely familiar with Oracle's packages, but do know of them.
> I'm wondering if local and public scopes for MODULE is in the SQL spec? (I
> will check for that...) My thinking was to implement functionality that
> conforms to the SQL spec, not try to match Oracle's package which differs
> from the spec in some ways.
>
>
>> I can imagine MODULES as third level of database unit object grouping
>> with following functionality
>>
>> 1. It should support all database objects like schemas
>>
>
> Do you mean that schemas should be groupable under modules? My thinking
> was to follow what the SQL spec says about what objects should be in
> modules, and I started with routines as one of the objects that there are
> use cases for. Such as controlling access permissions on routines at some
> granularity that is not an entire schema and not individual
> functions/procedures.
>

The SQL spec says there can be just temporary tables and routines. It is
useless. Unfortunately, SQL/PSM came too late and there has been no progress in
the last 20 years.  It is a dead horse.


>
>> 2. all public objects should be accessed directly when outer schema is in
>> SEARCH_PATH
>>
>
> Yes, an object inside a module is in a schema and can be accessed with
> schemaname.func() as well as modulename.func() as well as
> schemaname.modulename.func(). I think you are saying it should be
> accessible with func() without a qualifying schemaname or modulename if the
> schemaname is in the 

Re: archive modules

2022-02-03 Thread Robert Haas
On Thu, Feb 3, 2022 at 2:27 PM Nathan Bossart  wrote:
> On Thu, Feb 03, 2022 at 02:11:18PM -0500, Robert Haas wrote:
> > Committed. I'm going to be 0% surprised if the buildfarm turns pretty
> > colors, but I don't know how to know what it's going to be unhappy
> > about except by trying it, so here goes.
>
> Thanks!  I'll keep an eye on the buildfarm and will send any new patches
> that are needed.

Andres just pointed out to me that thorntail is unhappy:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=thorntail&dt=2022-02-03%2019%3A54%3A42

It says:

==~_~===-=-===~_~==
pgsql.build/contrib/basic_archive/log/postmaster.log
==~_~===-=-===~_~==
2022-02-03 23:17:49.019 MSK [1253623:1] FATAL:  WAL archival cannot be
enabled when wal_level is "minimal"

The notes for the machine say:

UBSan; force_parallel_mode; wal_level=minimal; OS bug breaks truncate()

So apparently we need to either skip this test when wal_level=minimal,
or force a higher wal_level to be used for this particular test. Not
sure what the existing precedents are, if any.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: archive modules

2022-02-03 Thread Nathan Bossart
On Thu, Feb 03, 2022 at 04:04:33PM -0500, Robert Haas wrote:
> So apparently we need to either skip this test when wal_level=minimal,
> or force a higher wal_level to be used for this particular test. Not
> sure what the existing precedents are, if any.

The only precedent I've found so far is test_decoding, which sets wal_level
to "logical."  Perhaps we can just set it to "replica" in
basic_archive.conf.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: archive modules

2022-02-03 Thread Robert Haas
On Thu, Feb 3, 2022 at 4:11 PM Nathan Bossart  wrote:
> On Thu, Feb 03, 2022 at 04:04:33PM -0500, Robert Haas wrote:
> > So apparently we need to either skip this test when wal_level=minimal,
> > or force a higher wal_level to be used for this particular test. Not
> > sure what the existing precedents are, if any.
>
> The only precedent I've found so far is test_decoding, which sets wal_level
> to "logical."  Perhaps we can just set it to "replica" in
> basic_archive.conf.

Yeah, that seems to make sense.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: do only critical work during single-user vacuum?

2022-02-03 Thread John Naylor
On Thu, Feb 3, 2022 at 1:42 PM Robert Haas  wrote:
>
> On Thu, Feb 3, 2022 at 1:34 PM John Naylor  
> wrote:
> > The word "advice" sounds like people have a choice, rather than the
> > system not accepting commands anymore. It would be much less painful
> > if the system closed connections and forbade all but superusers to
> > connect, but that sounds like a lot of work. (happy to be proven
> > otherwise)
>
> They *do* have a choice. They can continue to operate the system in
> multi-user mode, they can have read access to their data, and they can
> run VACUUM and other non-XID-allocating commands to fix the issue.
> Sure, their application can't run commands that allocate XIDs, but
> it's not going to be able to do that if they go to single-user mode
> either.

I just checked some client case notes where they tried just that
before getting outside help, and both SELECT and VACUUM FREEZE
commands were rejected. The failure is clearly indicated in the log.
-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: archive modules

2022-02-03 Thread Nathan Bossart
On Thu, Feb 03, 2022 at 04:15:30PM -0500, Robert Haas wrote:
> On Thu, Feb 3, 2022 at 4:11 PM Nathan Bossart  
> wrote:
>> On Thu, Feb 03, 2022 at 04:04:33PM -0500, Robert Haas wrote:
>> > So apparently we need to either skip this test when wal_level=minimal,
>> > or force a higher wal_level to be used for this particular test. Not
>> > sure what the existing precedents are, if any.
>>
>> The only precedent I've found so far is test_decoding, which sets wal_level
>> to "logical."  Perhaps we can just set it to "replica" in
>> basic_archive.conf.
> 
> Yeah, that seems to make sense.

024_archive_recovery.pl seems to do something similar.  Patch attached.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
diff --git a/contrib/basic_archive/basic_archive.conf b/contrib/basic_archive/basic_archive.conf
index b26b2d4144..db029f4b8e 100644
--- a/contrib/basic_archive/basic_archive.conf
+++ b/contrib/basic_archive/basic_archive.conf
@@ -1,3 +1,4 @@
 archive_mode = 'on'
 archive_library = 'basic_archive'
 basic_archive.archive_directory = '.'
+wal_level = 'replica'


Re: archive modules

2022-02-03 Thread Robert Haas
On Thu, Feb 3, 2022 at 4:25 PM Nathan Bossart  wrote:
> 024_archive_recovery.pl seems to do something similar.  Patch attached.

Committed. I think this is mostly an issue for pg_regress tests, as
opposed to 024_archive_recovery.pl, which is a TAP test. Maybe I'm
wrong about that, but it looks to me like most TAP tests choose what
they want explicitly, while pg_regress tests tend to inherit the
value.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: do only critical work during single-user vacuum?

2022-02-03 Thread Andres Freund
Hi,

On 2022-02-03 13:42:20 -0500, Robert Haas wrote:
> They *do* have a choice. They can continue to operate the system in
> multi-user mode, they can have read access to their data, and they can
> run VACUUM and other non-XID-allocating commands to fix the issue.
> Sure, their application can't run commands that allocate XIDs, but
> it's not going to be able to do that if they go to single-user mode
> either.

I wonder if we shouldn't add some exceptions to the xid allocation
prevention. It makes sense that we don't allow random DML. But it's e.g. often
more realistic to drop / truncate a few tables with unimportant content,
rather than spend the time vacuuming those.  We could e.g. allow xid
consumption within VACUUM, TRUNCATE, DROP TABLE / INDEX when run at the top
level for longer than we allow it for anything else.


> But it would still be better than going into single-user mode, which
> provides even less functionality and has basically no advantages of
> any kind.

Indeed. Single user is the worst response to this (and just about anything
else, really). Even just getting into the single user mode takes a while
(shutdown checkpoint). The user interface is completely different (and
awful). The buffer cache is completely cold. The system is slower because
there's no wal writer / checkpointer running.  Which basically is a list of
things one absolutely do not wants when confronted with a wraparound
situation.

Greetings,

Andres Freund




Re: archive modules

2022-02-03 Thread Nathan Bossart
On Thu, Feb 03, 2022 at 04:45:52PM -0500, Robert Haas wrote:
> On Thu, Feb 3, 2022 at 4:25 PM Nathan Bossart  
> wrote:
>> 024_archive_recovery.pl seems to do something similar.  Patch attached.
> 
> Committed. I think this is mostly an issue for pg_regress tests, as
> opposed to 024_archive_recovery.pl, which is a TAP test. Maybe I'm
> wrong about that, but it looks to me like most TAP tests choose what
> they want explicitly, while pg_regress tests tend to inherit the
> value.

Thanks!

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: do only critical work during single-user vacuum?

2022-02-03 Thread Robert Haas
On Thu, Feb 3, 2022 at 4:18 PM John Naylor  wrote:
> I just checked some client case notes where they tried just that
> before getting outside help, and both SELECT and VACUUM FREEZE
> commands were rejected. The failure is clearly indicated in the log.

It would be helpful to know how it failed - what was the error? And
then I think we should just fix whatever the problem is. As I said
before, I know TRUNCATE has been an issue in the past, and if that's
not already fixed in v14, we should. If there's other stuff, we should
fix that too. The point I'm making here, which I still believe to be
valid, is that there's nothing intrinsically better about being in
single user mode. In fact, it's clearly worse. And I don't think it's
hard to fix it so that we avoid people needing to do that in the first
place.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: do only critical work during single-user vacuum?

2022-02-03 Thread Andres Freund
Hi,

On 2022-02-03 16:18:27 -0500, John Naylor wrote:
> I just checked some client case notes where they tried just that
> before getting outside help, and both SELECT and VACUUM FREEZE
> commands were rejected.

What kind of SELECT was that? Any chance it caused a write via functions, a
view, whatnot? And what version? What was the exact error message?


VACUUM FREEZE is a *terrible* idea to run when encountering anti-wraparound
issues. I understand why people think they might need it, but basically all it
achieves is to make VACUUM do a lot more, none of it helpful to get out of the
wraparound-can't-write situation (those rows will already get frozen).

I'd plus one the addition of a HINT that tells users that FREEZE likely is a
bad idea when in wraparound land. We should allow it, because there are
situations where it might make sense, but the people that can make that
judgement know they can ignore the HINT.

Greetings,

Andres Freund




Re: do only critical work during single-user vacuum?

2022-02-03 Thread Robert Haas
On Thu, Feb 3, 2022 at 4:50 PM Andres Freund  wrote:
> I wonder if we shouldn't add some exceptions to the xid allocation
> prevention. It makes sense that we don't allow random DML. But it's e.g. often
> more realistic to drop / truncate a few tables with unimportant content,
> rather than spend the time vacuuming those.  We could e.g. allow xid
> consumption within VACUUM, TRUNCATE, DROP TABLE / INDEX when run at the top
> level for longer than we allow it for anything else.

True, although we currently don't start refusing XID allocation
altogether until only 1 million remain, IIRC. And that's cutting it
really close if we need to start consuming 1 XID per table we need to
drop. We might need to push out some of the thresholds a bit.

For the most part, I think that there's no reason why autovacuum
shouldn't be able to recover from this situation automatically, as
long as old replication slots and prepared transactions are cleaned up
and any old transactions are killed off. I don't think we're very far
from that Just Working, but we are not all there yet either. Manual
intervention to drop tables etc. is reasonable to allow a bit more
than we do now, but the big problem IMO is that the behavior when we
run short of XIDs has had very little testing and bug fixing, so
things that don't really need to break just do anyway.

> Indeed. Single user is the worst response to this (and just about anything
> else, really). Even just getting into the single user mode takes a while
> (shutdown checkpoint). The user interface is completely different (and
> awful). The buffer cache is completely cold. The system is slower because
> there's no wal writer / checkpointer running.  Which basically is a list of
> things one absolutely do not wants when confronted with a wraparound
> situation.

+1.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: do only critical work during single-user vacuum?

2022-02-03 Thread John Naylor
On Thu, Feb 3, 2022 at 4:58 PM Andres Freund  wrote:
>
> Hi,
>
> On 2022-02-03 16:18:27 -0500, John Naylor wrote:
> > I just checked some client case notes where they tried just that
> > before getting outside help, and both SELECT and VACUUM FREEZE
> > commands were rejected.
>
> What kind of SELECT was that? Any chance it caused a write via functions, a
> view, whatnot? And what version? What was the exact error message?

Looking closer, there is a function defined by an extension. I'd have
to dig further to see if writes happen. The error is exactly what
we've been talking about:

2022-01-03 22:03:23 PST ERROR: database is not accepting commands to
avoid wraparound data loss in database ""
2022-01-03 22:03:23 PST HINT: Stop the postmaster and vacuum that
database in single-user mode. You might also need to commit or roll
back old prepared transactions.

-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: fairywren is generating bogus BASE_BACKUP commands

2022-02-03 Thread Andrew Dunstan


On 1/24/22 21:36, Andres Freund wrote:
> Hi,
>
> On 2022-01-24 16:47:28 -0500, Andrew Dunstan wrote:
>> Give me what you can and I'll see what I can do. I have a couple of
>> moderately high priority items on my plate, but I will probably be able
>> to fit in some testing when those make my eyes completely glaze over.
> Steps:
>
> # install msys from https://www.msys2.org/
> # install dependencies in msys shell
> pacman -S git bison flex make ucrt64/mingw-w64-ucrt-x86_64-perl 
> ucrt64/mingw-w64-ucrt-x86_64-gcc ucrt64/mingw-w64-ucrt-x86_64-zlib 
> ucrt64/mingw-w64-ucrt-x86_64-ccache diffutils
>
>
> # start mingw ucrt64 x64 shell
> cpan install -T IPC::Run
> perl -MIPC::Run -e 1 # verify ipc run is installed
>
> cd /c/dev
> # I added --reference postgres to accelerate the cloning
> git clone https://git.postgresql.org/git/postgresql.git postgres-mingw
> cd /c/dev/postgres-mingw
>
> git revert ed52c3707bcf8858defb0d9de4b55f5c7f18fed7
> git revert 6051857fc953a62db318329c4ceec5f9668fd42a
>
> # apply attached patch
>
> # see below why out-of-tree is easier or now
> mkdir build
> cd build
> # path parameters probably not necessary, I thought I needed them earlier, 
> not sure why
> ../configure --without-readline --cache cache --enable-tap-tests 
> PROVE=/ucrt64/bin/core_perl/prove PERL=/ucrt64/bin/perl.exe CC="ccache gcc"
> make -j8 -s world-bin && make -j8 -s -C src/interfaces/ecpg/test
> make -j8 -s temp-install
>
> # pg_regress' make_temp_socketdir() otherwise picks up the wrong TMPDIR
> mkdir /c/dev/postgres-mingw/build/tmp
>
> # the TAR= ensures that tests pick up a tar accessible with a windows path
> # PG_TEST_USE_UNIX_SOCKETS=1 is required for test concurrency, otherwise 
> there are port conflicts
>
> (make -Otarget -j12 check-world NO_TEMP_INSTALL=1 PG_TEST_USE_UNIX_SOCKETS=1 
> TMPDIR=C:/dev/postgres-mingw/tmp TAR="C:\Windows\System32\tar.exe" 2>&1 && 
> echo test-world-success || echo test-world-fail) 2>&1 |tee test-world.log



OK, I have all the pieces working and I know what I need to do to adapt
fairywren. The patch you provided is not necessary any more.

(I think your TMPDIR spec is missing a /build/)

The recipe worked (mutatis mutandis) for the mingw64 toolchain as well
as for the ucrt64 toolchain. Is there a reason to prefer ucrt64?

I think the next steps are:

  * do those two reverts
  * adjust fairywren
  * get rid of perl2host

At that stage jacana will no longer be able to run TAP tests. I can do
one of these:

  * disable the TAP tests on jacana
  * migrate jacana to msys2
  * kiss jacana goodbye.

Thoughts?


> To make tests in "in-tree" builds work, a bit more hackery would be
> needed. The problem is that windows chooses binaries from the current working
> directory *before* PATH. That's a problem for things like initdb.exe or
> pg_ctl.exe that want to find postgres.exe, as that only works with the program
> in their proper location, rather than CWD.
>

Yeah, we should do something about that. For example, we could possibly
use the new install_path option of PostgreSQL::Test::Cluster::new() so
it would find these in the right location.


However, I don't need it as my animals all use vpath builds.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work

2022-02-03 Thread Nathan Bossart
On Thu, Feb 03, 2022 at 09:45:08AM +0530, Bharath Rupireddy wrote:
> On Thu, Feb 3, 2022 at 12:07 AM Nathan Bossart  
> wrote:
>> If there is a problem reading the directory, we will LOG and then exit the
>> loop.  If we didn't scan through all the entries in the directory, there is
>> a chance that we didn't fsync() all the files that need it.
> 
> Thanks. I get it. For syncing map files, we don't want to tolerate any
> errors, whereas removal of the old map files (lesser than cutoff LSN)
> can be tolerated in CheckPointLogicalRewriteHeap.

LGTM.  Andres noted upthread [0] that the comment above sscanf() about
skipping editors' lock files might not be accurate.  I don't think it's a
huge problem if sscanf() matches those files, but perhaps we can improve
the comment.

[0] https://postgr.es/m/20220120194618.hmfd4kxkng2cgryh%40alap3.anarazel.de

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Fix BUG #17335: Duplicate result rows in Gather node

2022-02-03 Thread David Rowley
On Wed, 26 Jan 2022 at 05:32, Tom Lane  wrote:
> Therefore, what I think could be useful is some very-late-stage
> assertion check (probably in createplan.c) verifying that the
> child of a Gather is parallel-aware.  Or maybe the condition
> needs to be more general than that, but anyway the idea is for
> the back end of the planner to verify that we didn't build a
> silly plan.

I had a go at writing something along these lines, but I've ended up
with something I really don't like very much.

I ended up having to write a recursive path traversal function.  It's
generic and it can be given a callback function to do whatever we like
with the Path.  The problem is that this seems like quite a bit of
code to maintain just for plan validation in Assert builds.

Currently, the patch validates 3 rules:

1) Ensure a parallel_aware path has only parallel_aware or
parallel_safe subpaths.
2) Ensure Gather is either single_copy or contains at least one
parallel_aware subnode.
3) Ensure GatherMerge contains at least one parallel_aware subnode.

I had to relax rule #1 a little as a Parallel Append can run subnodes
that are only parallel_safe and not parallel_aware.  The problem with
relaxing this rule is that it does not catch the case that this bug
report was about. I could maybe tweak that so there's a special case
for Append to allow parallel aware or safe and ensure all other nodes
have only parallel_safe subnodes. I just don't really like that
special case as it's likely to get broken/forgotten over time when we
add new nodes.

I'm unsure if just being able to enforce rules #2 and #3 make this worthwhile.

Happy to listen to other people's opinions and ideas on this.  Without
those, I'm unlikely to try to push this any further.

David


do_some_plan_validation_during_create_plan.patch
Description: Binary data


Re: Fix CheckIndexCompatible comment

2022-02-03 Thread Fujii Masao




On 2022/02/04 1:46, Yugo NAGATA wrote:

Hello,

I found an old parameter name 'heapRelation' in the comment
of CheckIndexCompatible. This parameter was removed by 5f173040.

Attached is a patch to remove it from the comment.


Thanks for the report! I agree with removing the mention of the already-dropped parameter from the 
comment. OTOH, I found that CheckIndexCompatible() now has an "oldId" parameter but there is no 
comment about it, though there are comments about the other parameters. Isn't it better to add a 
comment about "oldId"?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Stats collector's idx_blks_hit value is highly misleading in practice

2022-02-03 Thread John Naylor
On Fri, Oct 30, 2020 at 9:46 PM Tomas Vondra 
wrote:
>
> On Fri, Oct 16, 2020 at 03:35:51PM -0700, Peter Geoghegan wrote:

> >I suppose that a change like this could end up affecting other things,
> >such as EXPLAIN ANALYZE statistics. OTOH we only break out index pages
> >separately for bitmap scans at the moment, so maybe it could be fairly
> >well targeted. And, maybe this is unappealing given the current
> >statistics collector limitations. I'm not volunteering to work on it
> >right now, but it would be nice to fix this. Please don't wait for me
> >to do it.
> >
>
> It seems to me this should not be a particularly difficult patch in
> principle, so suitable for new contributors. The main challenge would be
> passing information about what page we're dealing with (internal/leaf)
> to the place actually calling pgstat_count_buffer_(read|hit). That
> happens in ReadBufferExtended, which just has no idea what page it's
> dealing with. Not sure how to do that cleanly ...

Is this a TODO candidate? What would be a succinct title for it?

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Fix CheckIndexCompatible comment

2022-02-03 Thread Nathan Bossart
On Fri, Feb 04, 2022 at 09:08:22AM +0900, Fujii Masao wrote:
> On 2022/02/04 1:46, Yugo NAGATA wrote:
>> I found an old parameter name 'heapRelation' in the comment
>> of CheckIndexCompatible. This parameter was removed by 5f173040.
>> 
>> Attached is a patch to remove it from the comment.

It looks like this parameter was removed in 5f17304.
 
> Thanks for the report! I agree to remove the mention of parameter already 
> dropped, from the comment. OTOH, I found CheckIndexCompatible() now has 
> "oldId" parameter but there is no comment about it though there are comments 
> about other parameters. Isn't it better to add the comment about "oldId"?

+1

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: do only critical work during single-user vacuum?

2022-02-03 Thread Robert Haas
On Thu, Feb 3, 2022 at 5:08 PM John Naylor  wrote:
> Looking closer, there is a function defined by an extension. I'd have
> to dig further to see if writes happen. The error is exactly what
> we've been talking about:
>
> 2022-01-03 22:03:23 PST ERROR: database is not accepting commands to
> avoid wraparound data loss in database ""
> 2022-01-03 22:03:23 PST HINT: Stop the postmaster and vacuum that
> database in single-user mode. You might also need to commit or roll
> back old prepared transactions.

That error comes from GetNewTransactionId(), so that function must
either try to execute DML or do something else which causes an XID to
be assigned. I think a plain SELECT should work just fine.
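
As a hedged illustration of how a query that looks read-only can still assign
an XID (the table and function names here are made up):

-- illustration only
CREATE TABLE access_log (ts timestamptz DEFAULT now());
CREATE FUNCTION log_access() RETURNS void
    LANGUAGE sql AS $$ INSERT INTO access_log DEFAULT VALUES $$;

SELECT log_access();              -- the function writes, so an XID is assigned
                                  -- and this fails near wraparound
SELECT count(*) FROM access_log;  -- a plain read assigns no XID and still works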

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Fix BUG #17335: Duplicate result rows in Gather node

2022-02-03 Thread Robert Haas
On Thu, Feb 3, 2022 at 7:08 PM David Rowley  wrote:
> Currently, the patch validates 3 rules:
>
> 1) Ensure a parallel_aware path has only parallel_aware or
> parallel_safe subpaths.

I think that every path that is parallel_aware must also be
parallel_safe. So checking for either parallel_aware or parallel_safe
should be equivalent to just checking parallel_safe, unless I am
confused.

I think the actual rule is: every path under a Gather or GatherMerge
must be parallel-safe.

I don't think there's any real rule about what has to be under
parallel-aware paths -- except that it would have to be all
parallel-safe stuff, because the whole thing is under a Gather
(Merge). There may seem to be such a rule, but I suspect it's just an
accident of whatever code we have now rather than anything intrinsic.

> 2) Ensure Gather is either single_copy or contains at least one
> parallel_aware subnode.

I agree that this one is a rule which we could check.

> 3) Ensure GatherMerge contains at least one parallel_aware subnode.

This one, too.
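
For illustration, in create_gather_plan() the checks could boil down to
something like this (path_tree_contains_parallel_aware() is a hypothetical
helper standing in for the generic walker in the posted patch; the field
names are real):

#ifdef USE_ASSERT_CHECKING
	/* everything under a Gather must be parallel-safe */
	Assert(best_path->subpath->parallel_safe);

	/* a Gather must be single-copy or have a parallel-aware subnode */
	Assert(best_path->single_copy ||
		   path_tree_contains_parallel_aware(best_path->subpath));
#endif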

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PATCH] reduce page overlap of GiST indexes built using sorted method

2022-02-03 Thread Alexander Korotkov
Hi!

On Wed, Jan 26, 2022 at 7:07 PM sergei sh.  wrote:
> I've fixed issues 2 -- 4 in attached version.
>
> GUC parameter has been removed for the number of pages to collect
> before splitting and fixed value of 4 is used instead, as discussed
> off-list with Andrey Borodin, Aliaksandr Kalenik, Darafei
> Praliaskouski. Benchmarking shows that using higher values has almost
> no effect on query efficiency while increasing index building time.
>
> PSA graphs for index creation and query time, "tiling" and "self-join"
> refer to queries used in previous benchmarks:
> https://github.com/mngr777/pg_index_bm2
>
> Sorted build method description has been added in GiST README.

Thank you for the revision.  This patch looks good to me.  I've
slightly adjusted comments and formatting and wrote the commit
message.

I'm going to push this if no objections.

--
Regards,
Alexander Korotkov


0001-Reduce-non-leaf-keys-overlap-in-GiST-indexes-prod-v5.patch
Description: Binary data


Re: do only critical work during single-user vacuum?

2022-02-03 Thread Andres Freund
Hi,

On 2022-02-03 17:02:15 -0500, Robert Haas wrote:
> On Thu, Feb 3, 2022 at 4:50 PM Andres Freund  wrote:
> > I wonder if we shouldn't add some exceptions to the xid allocation
> > prevention. It makes sense that we don't allow random DML. But it's e.g. 
> > often
> > more realistic to drop / truncate a few tables with unimportant content,
> > rather than spend the time vacuuming those.  We could e.g. allow xid
> > consumption within VACUUM, TRUNCATE, DROP TABLE / INDEX when run at the top
> > level for longer than we allow it for anything else.
>
> True, although we currently don't start refusing XID allocation
> altogether until only 1 million remain, IIRC. And that's cutting it
> really close if we need to start consuming 1 XID per table we need to
> drop. We might need to push out some of the thresholds a bit.

Yea, I'd have no problem leaving the "hard" limit somewhere closer to 1
million (although 100k should do just as well), but introducing a softer "only
vacuum/drop/truncate" limit a good bit before that.


> For the most part, I think that there's no reason why autovacuum
> shouldn't be able to recover from this situation automatically, as
> long as old replication slots and prepared transactions are cleaned up
> and any old transactions are killed off.

To address the "as long as" part: I think that describing better what is
holding back the horizon would be a significant usability improvement.

Imagine that instead of the generic hints in these messages:
    ereport(ERROR,
            (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
             errmsg("database is not accepting commands to avoid wraparound data loss in database \"%s\"",
                    oldest_datname),
             errhint("Stop the postmaster and vacuum that database in single-user mode.\n"
                     "You might also need to commit or roll back old prepared transactions, or drop stale replication slots.")));
and
    ereport(WARNING,
            (errmsg("oldest xmin is far in the past"),
             errhint("Close open transactions soon to avoid wraparound problems.\n"
                     "You might also need to commit or roll back old prepared transactions, or drop stale replication slots.")));

we'd actually tell the user a bit more about what is causing the
problem.

We can compute the:
1) oldest slot by xmin, with name
2) oldest walsender by xmin, with pid
3) oldest prepared transaction id by xid / xmin, with name
4) oldest in-progress transaction id by xid / xmin, with name
5) oldest database datfrozenxid, with database name

If 1-4) are close to 5), there's no point in trying to vacuum aggressively, it
won't help. So we instead can say that the xmin horizon (with a better name)
is held back by the oldest of these, with enough identifying information for
the user to actually know where to look.

In contrast, if 5) is older than 1-4), then we can tell the user which
database is the problem, as we do right now, but we can stop mentioning the
"You might also need to commit ..." bit.


Also, adding an SRF providing the above in a useful format would be great for
monitoring and for "remote debugging" of problems.
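
Most of that is already reachable from SQL today; a rough sketch (the
labels and UNION layout here are just for illustration, not a proposed
interface) could look like:

    SELECT 'replication slot' AS holder, slot_name::text AS name, age(xmin) AS xid_age
      FROM pg_replication_slots WHERE xmin IS NOT NULL
    UNION ALL
    SELECT 'walsender', pid::text, age(backend_xmin)
      FROM pg_stat_replication WHERE backend_xmin IS NOT NULL
    UNION ALL
    SELECT 'prepared transaction', gid, age(transaction)
      FROM pg_prepared_xacts
    UNION ALL
    SELECT 'backend', pid::text, age(backend_xmin)
      FROM pg_stat_activity WHERE backend_xmin IS NOT NULL
    UNION ALL
    SELECT 'database datfrozenxid', datname::text, age(datfrozenxid)
      FROM pg_database
    ORDER BY xid_age DESC;

(The catalog_xmin of logical slots matters too; omitted here for brevity.)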

Greetings,

Andres Freund




Re: support for CREATE MODULE

2022-02-03 Thread Alvaro Herrera
On 2022-Feb-03, Pavel Stehule wrote:

> The biggest problem is the coexistence of Postgres's SEARCH_PATH object
> identification with the local and public scopes used in MODULEs or in
> Oracle's packages.
> 
> I can imagine MODULES as a third level of database object grouping with
> the following functionality
> 
> 1. It should support all database objects, like schemas do

I proposed a way for modules to coexist with schemas that got no reply,
https://postgr.es/m/202106021908.ddmebx7qfdld@alvherre.pgsql

I still think that that idea is valuable; it would let us create
"private" routines, for example, which are good for encapsulation.
But the way it interacts with schemas means we don't end up with a total
mess in the namespace resolution rules.  I argued that modules would
only have functions, and maybe a few other useful object types, but not
*all* object types, because we don't need all object types to become
private.  For example, I don't think I would like to have data types or
casts to be private, so they can only be in a schema and they cannot be
in a module.

Of course, that idea of modules would also ease porting large DB-based
applications from other database systems.

What do others think?

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/




Re: fairywren is generating bogus BASE_BACKUP commands

2022-02-03 Thread Andres Freund
Hi,

On 2022-02-03 17:25:51 -0500, Andrew Dunstan wrote:
> OK, I have all the pieces working and I know what I need to do to adapt
> fairywren. The patch you provided is not necessary any more.

Cool. Are you going to post that?


> (I think your TMPDIR spec is missing a /build/)

I think I went back/forth between in-tree/out-of-tree build...


> The recipe worked (mutatis mutandis) for the mingw64 toolchain as well
> as for the ucrt64 toolchain. Is there a reason to prefer ucrt64?

There are a lot of oddities in the mingw64 target, due to targeting the much
older C runtime library (lots of bugs, missing functionality). MSVC has
targeted UCRT by default for quite a few years now. Targeting msvcrt is
basically on its way out, from what I understand.


> I think the next steps are:
> 
>   * do those two reverts
>   * adjust fairywren
>   * get rid of perl2host
> 
> At that stage jacana will no longer be able to run TAP tests. I can do
> one of these:

I guess because its install is too old?


>   * disable the TAP tests on jacana
>   * migrate jacana to msys2
>   * kiss jacana goodbye.

Having a non-server mingw animal seems like it could be useful (I think that's
just Jacana), even if server / client versions of windows have grown
closer. So I think an update to msys2 makes the most sense?


> > To make tests in "in-tree" builds work, a bit more hackery would be
> > needed. The problem is that windows chooses binaries from the current 
> > working
> > directory *before* PATH. That's a problem for things like initdb.exe or
> > pg_ctl.exe that want to find postgres.exe, as that only works with the 
> > program
> > in their proper location, rather than CWD.

> Yeah, we should do something about that. For example, we could possibly
> use the new install_path option of PostgreSQL::Test::Cluster::new() so
> it would find these in the right location.

It'd be easy enough to adjust the central invocations of initdb. I think the
bigger problem is that there are plenty of calls to initdb and pg_ctl "directly"
in the respective test scripts.


> However, I don't need it as my animals all use vpath builds.

I think it'd be fine to just error out in non-vpath builds on msvc. The
search-for-binaries behaviour is just too weird.

Greetings,

Andres Freund




Re: Add checkpoint and redo LSN to LogCheckpointEnd log message

2022-02-03 Thread Fujii Masao




On 2022/02/03 15:50, Kyotaro Horiguchi wrote:

That's one way to take it. In that case, should we log something like
"Restartpoint canceled"?


+1



By the way, a restartpoint should start only while recovering, and at
the time of the start both checkpoint.redo and the checkpoint LSN are
already in the past. We shouldn't update minRecoveryPoint after promotion,
but is there any reason for not updating checkPoint and
checkPointCopy?  If we update them after promotion, the
which-LSN-to-show problem would be gone.


I tried to find the reason by reading the past discussion, but have not found
it yet.

If we update the checkpoint and REDO LSN in pg_control in that case, don't we
also need to update the minimum recovery point in pg_control?  Otherwise the
minimum recovery point in pg_control would still indicate the old LSN that the
previous restartpoint set.
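
For reference, the control file values being discussed can be inspected
from SQL while experimenting with this (standard functions, nothing
specific to the patch):

    -- checkpoint location and its REDO location
    SELECT checkpoint_lsn, redo_lsn, checkpoint_time FROM pg_control_checkpoint();

    -- minimum recovery ending location
    SELECT min_recovery_end_lsn, min_recovery_end_timeline FROM pg_control_recovery();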

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: do only critical work during single-user vacuum?

2022-02-03 Thread Robert Haas
On Thu, Feb 3, 2022 at 8:35 PM Andres Freund  wrote:
> Yea, I'd have no problem leaving the "hard" limit somewhere closer to 1
> million (although 100k should be just as well), but introduce a softer "only
> vacuum/drop/truncate" limit a good bit before that.

+1.

> To address the "as long as" part: I think that describing better what is
> holding back the horizon would be a significant usability improvement.
>
> Imagine that instead of the generic hints in these messages:
>     ereport(ERROR,
>             (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
>              errmsg("database is not accepting commands to avoid wraparound data loss in database \"%s\"",
>                     oldest_datname),
>              errhint("Stop the postmaster and vacuum that database in single-user mode.\n"
>                      "You might also need to commit or roll back old prepared transactions, or drop stale replication slots.")));
> and
>     ereport(WARNING,
>             (errmsg("oldest xmin is far in the past"),
>              errhint("Close open transactions soon to avoid wraparound problems.\n"
>                      "You might also need to commit or roll back old prepared transactions, or drop stale replication slots.")));
>
> we'd actually tell the user a bit more about what is causing the
> problem.
>
> We can compute the:
> 1) oldest slot by xmin, with name
> 2) oldest walsender by xmin, with pid
> 3) oldest prepared transaction id by xid / xmin, with name
> 4) oldest in-progress transaction id by xid / xmin, with name
> 5) oldest database datfrozenxid, with database name
>
> If 1-4) are close to 5), there's no point in trying to vacuum aggressively, it
> won't help. So we instead can say that the xmin horizon (with a better name)
> is held back by the oldest of these, with enough identifying information for
> the user to actually know where to look.

Yes. This kind of thing strikes me as potentially a huge help. To
rephrase that in other terms, we could tell the user what the actual
problem is instead of suggesting to them that they shut down their
database just for fun. It's "just for fun" because (a) it typically
won't fix the real problem, which is most often (1) or (3) from your
list, and even if it's (2) or (4) they could just kill the session
instead of shutting down the whole database, and (b) no matter what
needs to be done, whether it's VACUUM or ROLLBACK PREPARED or
something else, they may as well do that thing in multi-user mode
rather than single-user mode, unless we as PostgreSQL developers
forgot to make that actually work.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Windows now has fdatasync()

2022-02-03 Thread Thomas Munro
I added a commitfest entry for this to try to attract Windows-hacker
reviews.  I wondered about adjusting it to run on older systems, but I
think we're about ready to drop support for Windows < 10 anyway, so
maybe I'll go and propose that separately, instead.

https://commitfest.postgresql.org/37/3530/




Re: Collecting statistics about contents of JSONB columns

2022-02-03 Thread Tomas Vondra



On 1/25/22 17:56, Mahendra Singh Thalor wrote:
>

...

For the last few days, I was trying to understand these patches, and 
based on Tomas's suggestion, I was doing some performance tests.


With the attached .SQL file, I can see that analyze is taking more time 
with these patches.


Setup:
autovacuum=off
all other settings are default.

Load the attached file with and without the patch to compare the time
taken by ANALYZE.


With json patches:
postgres=# analyze test ;
ANALYZE
Time: 28464.062 ms (00:28.464)
postgres=#
postgres=# SELECT pg_size_pretty( pg_total_relation_size('pg_catalog.pg_statistic') );
 pg_size_pretty
----------------
 328 kB
(1 row)
--

Without json patches:
postgres=# analyze test ;
ANALYZE
Time: 120.864 ms
postgres=# SELECT pg_size_pretty( pg_total_relation_size('pg_catalog.pg_statistic') );
 pg_size_pretty
----------------
 272 kB
(1 row)

I haven't found the root cause of this, but I feel that this time is due
to a loop over all the paths.
In my test data, there is a total of 951 distinct paths. While doing
analysis, first we scan all the sample rows (30000) and collect all
the distinct paths (here 951), and after that, for every single path,
we loop over all the sample rows again to collect stats for that particular
path. I feel that these loops might be taking the time.


I will run perf and will try to find out the root cause of this.



Thanks, I've been doing some performance tests too, and you're right it 
takes quite a bit of time. I wanted to compare how the timing changes 
with complexity of the JSON documents (nesting, number of keys, ...) so 
I wrote a simple python script to generate random JSON documents with 
different parameters - see the attached json-generate.py script.


It's a bit crude, but it generates synthetic documents with a chosen 
number of levels, keys per level, distinct key values, etc. The 
generated documents are loaded directly into a "json_test" database, 
into a table "test_table" with a single jsonb column called "v". 
Tweaking this to connect to a different database, or just dump the 
generated documents to a file, should be trivial.


The attached bash script runs the data generator for a couple of 
combinations, and then measures how long it takes to analyze the table, 
how large the statistics are (in a rather crude way), etc.


The results look like this (the last two columns are analyze duration in 
milliseconds, for master and with the patch):


  levels   keys   unique keys   paths   master   patched
  -------------------------------------------------------
       1      1             1       2      153       122
       1      1          1000    1001      134      1590
       1      8             8       9      157       367
       1      8          1000    1001      155      1838
       1     64            64      65      189      2298
       1     64          1000    1001       46      9322
       2      1             1       3      237       197
       2      1          1000   30580      152     46468
       2      8             8      73      245      1780

So yeah, it's significantly slower - in most cases not as much as you 
observed, but an order of magnitude slower than master. For size of the 
statistics, it's similar:


  levels   keys  unique keys   paths  table size   masterpatched
  --
   1  11   2 184320016360  24325
   1  1 10001001 1843200168191425400
   1  88   9 471040028948  88837
   1  8 10001001 6504448426943915802
   1 64   64  6530154752   209713 689842
   1 64 1000100149086464 10937755214
   2  11   3 257228824883  48727
   2  1 1000   30580 257228811422   26396365
   2  88  7323068672   164785 862258

This is measured by dumping pg_statistic for the column, so in the 
database it might be compressed etc. It's larger, but that's somewhat 
expected because we simply store more detailed stats. The size grows 
with the number of paths extracted - which is expected, of course.
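
One crude way to approximate that from SQL (reading pg_statistic needs
superuser; the test_table/v names are the ones used by the generator
script) is something like:

    SELECT pg_size_pretty(sum(pg_column_size(s)))
      FROM pg_statistic s
     WHERE starelid = 'test_table'::regclass
       AND staattnum = (SELECT attnum FROM pg_attribute
                         WHERE attrelid = 'test_table'::regclass
                           AND attname = 'v');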


If you're wondering why this doesn't show data for additional combinations 
(e.g. 2 levels, 8 keys and 1000 distinct key values), then that's the bad 
news - that takes ages (multiple minutes) and then it gets killed by the OOM 
killer because it eats gigabytes of memory.


I agree the slowness is largely due to extracting all paths and then 
processing them one by one - which means we have to loop over the tuples 
over and over. In this case there's about 850k distinct paths extracted, 
so we do ~850k loops over 30k tuples. That's gotta take time.


I don't know what exactly to do about this, but I already mentioned we 
may need to pick a subset of the paths.

Re: Collecting statistics about contents of JSONB columns

2022-02-03 Thread Tomas Vondra




On 2/4/22 03:47, Tomas Vondra wrote:

./json-generate.py 3 2 8 1000 6 1000


Sorry, this should be (different order of parameters):

./json-generate.py 3 2 1000 8 6 1000


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Windows crash / abort handling

2022-02-03 Thread David Rowley
On Thu, 3 Feb 2022 at 15:38, Andres Freund  wrote:
> I've pushed the patch this thread is about now. Let's see what the buildfarm
> says. I could only test one windows version.  Separately I've also pushed a
> patch to run the windows tests under a timeout. I hope in combination these
> patches address the hangs.

I tried this out today on a windows machine I have here. I added some
code to trigger an Assert failure, and the option of attaching a
debugger works well. Tested both from running the regression tests and
running a query manually with psql.

Tested on Windows 8.1 with VS2017.

Thanks for working on this.

David




Re: do only critical work during single-user vacuum?

2022-02-03 Thread Andres Freund
Hi,

On 2022-02-03 21:08:03 -0500, Robert Haas wrote:
> On Thu, Feb 3, 2022 at 8:35 PM Andres Freund  wrote:
> > We can compute the:
> > 1) oldest slot by xmin, with name
> > 2) oldest walsender by xmin, with pid
> > 3) oldest prepared transaction id by xid / xmin, with name
> > 4) oldest in-progress transaction id by xid / xmin, with name
> > 5) oldest database datfrozenxid, with database name
> >
> > If 1-4) are close to 5), there's no point in trying to vacuum aggressively, 
> > it
> > won't help. So we instead can say that the xmin horizon (with a better name)
> > is held back by the oldest of these, with enough identifying information for
> > the user to actually know where to look.
> 
> Yes. This kind of thing strikes me as potentially a huge help. To
> rephrase that in other terms, we could tell the user what the actual
> problem is instead of suggesting to them that they shut down their
> database just for fun. It's "just for fun" because (a) it typically
> won't fix the real problem, which is most often (1) or (3) from your
> list, and even if it's (2) or (4) they could just kill the session
> instead of shutting down the whole database

Not that it matters, but IME the leading cause is 5). Often due to autovacuum
configuration. Which reminded me of the one thing that single user mode
is actually helpful for: Being able to start a manual VACUUM.

Once autovacuum is churning along in anti-wraparound mode, with multiple
workers, it can be hard to run a manual VACUUM without waiting for autovacuum
to do its throttled thing. The only way is to start the manual VACUUM and kill
autovacuum workers whenever they're blocking the manual vacuum(s).
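
The dance typically looks something like this (the table name is made up,
and an anti-wraparound worker will simply restart, so it may need repeating):

    -- what is autovacuum currently working on?
    SELECT pid, query FROM pg_stat_activity WHERE backend_type = 'autovacuum worker';

    -- kick the worker off the table we want to vacuum ourselves
    SELECT pg_terminate_backend(pid)
      FROM pg_stat_activity
     WHERE backend_type = 'autovacuum worker'
       AND query LIKE '%big_table%';

    VACUUM (FREEZE, VERBOSE) public.big_table;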


Which reminds me: Perhaps we ought to hint about reducing / removing
autovacuum cost limits in this situation? And perhaps make autovacuum absorb
config changes while running? It's annoying that an autovac halfway into a
huge table doesn't absorb changed cost limits for example.
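
For the archives, the manual knobs look like this today; note that, as said
above, a worker already halfway through a big table won't pick the change up:

    ALTER SYSTEM SET autovacuum_vacuum_cost_delay = 0;    -- disable throttling
    ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 10000;
    SELECT pg_reload_conf();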


> (b) no matter what needs to be done, whether it's VACUUM or ROLLBACK
> PREPARED or something else, they may as well do that thing in multi-user
> mode rather than single-user mode, unless we as PostgreSQL developers forgot
> to make that actually work.

One thing that we made quite hard is rolling back prepared transactions,
because we require being connected to the same database (a lot of fun in
single-user mode with a lot of databases). We can't allow commits from a
different database, but I wonder if it's doable to allow rollbacks?
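
For context, today that means something like the following, connected to
whichever database each transaction was prepared in ('some_gid' is a
placeholder):

    SELECT gid, database, prepared, owner FROM pg_prepared_xacts ORDER BY prepared;

    -- must be run while connected to the database shown above
    ROLLBACK PREPARED 'some_gid';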

Greetings,

Andres Freund




Re: Design of pg_stat_subscription_workers vs pgstats

2022-02-03 Thread Amit Kapila
On Thu, Feb 3, 2022 at 3:25 PM Peter Eisentraut
 wrote:
>
> On 02.02.22 07:54, Amit Kapila wrote:
>
> > Sure, but is this the reason you want to store all the error info in
> > the system catalog? I agree that providing more error info could be
> > useful and also possibly the previously failed (apply) xacts info as
> > well but I am not able to see why you want to have that sort of info
> > in the catalog. I could see storing info like err_lsn/err_xid that can
> > allow to proceed to apply worker automatically or to slow down the
> > launch of errored apply worker but not all sort of other error info
> > (like err_cnt, err_code, err_message, err_time, etc.). I want to know
> > why you are insisting to make all the error info persistent via the
> > system catalog?
>
> Let's flip this around and ask, why not?
>

Because we don't necessarily need all this information after a crash,
and it isn't information about a system object that we require for
performing operations on objects. OTOH, if we need some particular
error/apply state(s), that should be okay to keep in persistent form
(in a system catalog). In the walreceiver (for standby), we don't store
the errors/conflicts in any table; they are either reported in the logs
or shared via stats. Similarly, the archiver process shares its failure
information either via stats or logs. The same is true for the
checkpointer, which also just logs the error. Now, similarly, in this
case we are sharing the error information via logs and stats.

-- 
With Regards,
Amit Kapila.




Re: warn if GUC set to an invalid shared library

2022-02-03 Thread Maciek Sakrejda
On Wed, Feb 2, 2022 at 7:39 AM David G. Johnston 
wrote:

> I would at least consider having the UX go something like:
>
> postgres=# ALTER SYSTEM SET shared_preload_libraries = not_such_library;
> ERROR:  <message indicating that the library is not present>.
> HINT: to bypass the error please add FORCE before SET
> postgres=# ALTER SYSTEM FORCE SET shared_preload_libraries =
> no_such_library;
> NOTICE: Error suppressed while setting shared_preload_libraries.
>
> That is, have the user express their desire to leave the system in a
> precarious state explicitly before actually doing so.
>

While I don't have a problem with that behavior, given that there are
currently no such facilities for asserting "yes, really" with ALTER SYSTEM,
I don't think it's worth introducing that just for this patch. A warning
seems like a reasonable first step. This can always be expanded later. I'd
rather see a warning ship than move the goalposts out of reach.
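
In the meantime, one low-tech way to catch a typo before it bites (not what
the patch implements, just a session-level sanity check with a library that
supports plain LOAD, e.g. auto_explain) is:

    -- LOAD errors out if the shared library can't be found; some libraries
    -- refuse to load outside shared_preload_libraries, but even that error
    -- shows the file exists.
    LOAD 'auto_explain';
    ALTER SYSTEM SET shared_preload_libraries = 'auto_explain';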


Re: do only critical work during single-user vacuum?

2022-02-03 Thread Justin Pryzby
On Thu, Feb 03, 2022 at 07:26:01PM -0800, Andres Freund wrote:
> Which reminds me: Perhaps we ought to hint about reducing / removing
> autovacuum cost limits in this situation? And perhaps make autovacuum absorb
> config changes while running? It's annoying that an autovac halfway into a
> huge table doesn't absorb changed cost limits for example.

I remembered this thread:

https://commitfest.postgresql.org/32/2983/
| Running autovacuum dynamic update to cost_limit and delay

https://www.postgresql.org/message-id/flat/13A6B954-5C21-4E60-BC06-751C8EA469A0%40amazon.com
https://www.postgresql.org/message-id/flat/0A3F8A3C-4328-4A4B-80CF-14CEBE0B695D%40amazon.com

-- 
Justin




Re: Adding CI to our tree

2022-02-03 Thread Justin Pryzby
On Thu, Feb 03, 2022 at 11:57:18AM -0800, Andres Freund wrote:
> On 2022-02-02 21:58:28 -0600, Justin Pryzby wrote:
> > FYI: I've rebased these against your cirrus/windows changes.
> 
> Did you put then on a dedicated branch, or only intermixed with other changes?

Yes it's intermixed (because I have too many branches, and because in this case
it's useful to show the doc builds and coverage artifacts).

> > A recent cirrus result is here; this has other stuff in the branch, so you 
> > can
> > see the artifacts with HTML docs and code coverage.
> 
> I'm a bit worried about the increased storage and runtime overhead due to the
> docs changes. We probably can make it a good bit cheaper though.

If you mean overhead of additional disk operations, it shouldn't be an issue,
since the git clone uses shared references (not even hardlinks).

If you meant storage capacity, I'm only uploading the *changed* docs as
artifacts.  The motivation being that it's a lot more convenient to look through
a single .html, and not hundreds.

> What's the idea behind
> #echo 'deb http://deb.debian.org/debian bullseye main' >>/etc/apt/sources.list
> That's already in sources.list, so I'm not sure what this shows?

At one point I thought I needed it - maybe all I needed was "apt-get update"..

> > 9d0f03d3450 wip: cirrus: code coverage
>
> I don't think it's good to just unconditionally reference the master branch
> here. It'll do bogus stuff once 15 is branched off. It works for cfbot, but
> not other uses.

That's only used for filtering changed files.  It uses git diff --cherry-pick
postgres/master..., which would *try* to avoid showing anything which is also
in master.

> Is looking at the .c files in the change really a reliable predictor of where
> code coverage changes? I'm doubtful. Consider stuff like removing the last
> user of some infrastructure or such. Or adding the first.

You're right that it isn't particularly accurate, but it's a step forward if
lots of patches use it to check/improve coverage of new code.

In addition to the HTML generated for changed .c files, it's possible to create
a coverage.gcov output for everything, which could be retrieved to generate
full HTML locally.  It's ~8MB (or 2MB after gzip).

> > bff64e8b998 cirrus: upload html docs as artifacts
> > 80f52c3b172 wip: only upload changed docs
> 
> Similar to the above.

Do you mean it's not reliable?  This is looking at which .html files have
changed (not sgml).  Surely that's reliable?

> > 654b6375401 wip: vcsregress: add alltaptests
> 
> I assume this doesn't yet work to a meaningful degree? Last time I checked
> there were quite a few tests that needed to be invoked in a specific
> directory.

It works - tap_check() does chdir().  But it's slow, and maybe it should try
to implement a check-world target.  It currently fails in 027_stream_regress.pl,
although I keep hoping that it has been fixed...
https://cirrus-ci.com/task/6116235950686208

(BTW, I just realized that that commit should also remove the recoverycheck
call.)

-- 
Justin




Re: CREATEROLE and role ownership hierarchies

2022-02-03 Thread Maciek Sakrejda
I'm chiming in a little late here, but as someone who worked on a
system to basically work around the lack of unprivileged CREATE ROLE
for a cloud provider (I worked on the Heroku Data team for several
years), I thought it might be useful to offer my perspective. This is,
of course, not the only use case, but maybe it's useful to have
something concrete. As a caveat, I don't know how current this still
is (I no longer work there, though the docs [1] seem to still describe
the same system), or if there are better ways to achieve the goals of
a service provider.

Broadly, the general use case is something like what Robert has
sketched out in his e-mails. Heroku took care of setting up the
database, archiving, replication, and any other system-level config.
It would then keep the superuser credentials private, create a
database, and a user that owned that database and had all the
permissions we could grant it without compromising the integrity of
the system. (We did not want customers to break their databases, both
to ensure a better user experience and to avoid getting paged.)
Initially, this meant customers got just the one database user because
of CREATE ROLE's limitations.

To work around that, at some point, we added an API that would CREATE
ROLE for you, accessible through a dashboard and the Heroku CLI. This
ran CREATE ROLE (or DROP ROLE) for you, but otherwise it largely let
you configure the resulting roles as you pleased (using the original
role we create for you). We wanted to avoid reinventing the wheel as
much as possible, and the customer database (including the role
configuration) was mostly a black box for us (we did manage some
predefined permissions configurations through our dashboard, but the
Postgres catalogs were the source of truth for that).

Thinking about how this would fit into a potential non-superuser
CREATE ROLE world, the sandbox superuser model discussed above covers
this pretty well, though I share some of Robert's concerns around how
this fits into existing systems.

Hope this is useful feedback. Thanks for working on this!

[1]: https://devcenter.heroku.com/articles/heroku-postgresql-credentials




Re: make MaxBackends available in _PG_init

2022-02-03 Thread Nathan Bossart
On Wed, Feb 02, 2022 at 08:15:02AM -0500, Robert Haas wrote:
> On Tue, Feb 1, 2022 at 5:36 PM Nathan Bossart  
> wrote:
>> I can work on a new patch if this is the direction we want to go.  There
>> were a couple of functions that called GetMaxBackends() repetitively that I
>> should probably fix before the patch should be seriously considered.
> 
> Sure, that sort of thing should be tidied up. It's unlikely to make
> any real difference, but as a fairly prominent PostgreSQL hacker once
> said, a cycle saved is a cycle earned.

Here is a new patch with fewer calls to GetMaxBackends().

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 2d0ad3b35b0b0d537599f0bbf8b1d6d8ec297156 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Mon, 16 Aug 2021 02:59:32 +
Subject: [PATCH v8 1/1] Disallow external access to MaxBackends.

Presently, MaxBackends is externally visible, but it may still be
uninitialized in places where it would be convenient to use (e.g.,
_PG_init()).  This change makes MaxBackends static to postinit.c to
disallow such direct access.  Instead, MaxBackends should now be
accessed via GetMaxBackends().
---
 src/backend/access/nbtree/nbtutils.c|  4 +-
 src/backend/access/transam/multixact.c  |  8 ++-
 src/backend/access/transam/twophase.c   |  3 +-
 src/backend/commands/async.c| 11 ++--
 src/backend/libpq/pqcomm.c  |  3 +-
 src/backend/postmaster/auxprocess.c |  2 +-
 src/backend/postmaster/postmaster.c |  4 +-
 src/backend/storage/ipc/dsm.c   |  2 +-
 src/backend/storage/ipc/procarray.c |  2 +-
 src/backend/storage/ipc/procsignal.c| 17 --
 src/backend/storage/ipc/sinvaladt.c |  4 +-
 src/backend/storage/lmgr/deadlock.c | 31 +-
 src/backend/storage/lmgr/lock.c | 23 
 src/backend/storage/lmgr/predicate.c| 10 ++--
 src/backend/storage/lmgr/proc.c | 17 +++---
 src/backend/utils/activity/backend_status.c | 63 +++--
 src/backend/utils/adt/lockfuncs.c   |  5 +-
 src/backend/utils/init/postinit.c   | 50 ++--
 src/include/miscadmin.h |  3 +-
 19 files changed, 161 insertions(+), 101 deletions(-)

diff --git a/src/backend/access/nbtree/nbtutils.c b/src/backend/access/nbtree/nbtutils.c
index 6a651d8397..84164748b3 100644
--- a/src/backend/access/nbtree/nbtutils.c
+++ b/src/backend/access/nbtree/nbtutils.c
@@ -2072,7 +2072,7 @@ BTreeShmemSize(void)
 	Size		size;
 
 	size = offsetof(BTVacInfo, vacuums);
-	size = add_size(size, mul_size(MaxBackends, sizeof(BTOneVacInfo)));
+	size = add_size(size, mul_size(GetMaxBackends(), sizeof(BTOneVacInfo)));
 	return size;
 }
 
@@ -2101,7 +2101,7 @@ BTreeShmemInit(void)
 		btvacinfo->cycle_ctr = (BTCycleId) time(NULL);
 
 		btvacinfo->num_vacuums = 0;
-		btvacinfo->max_vacuums = MaxBackends;
+		btvacinfo->max_vacuums = GetMaxBackends();
 	}
 	else
 		Assert(found);
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 806f2e43ba..9ee805078c 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -285,7 +285,7 @@ typedef struct MultiXactStateData
  * Last element of OldestMemberMXactId and OldestVisibleMXactId arrays.
  * Valid elements are (1..MaxOldestSlot); element 0 is never used.
  */
-#define MaxOldestSlot	(MaxBackends + max_prepared_xacts)
+#define MaxOldestSlot	(GetMaxBackends() + max_prepared_xacts)
 
 /* Pointers to the state data in shared memory */
 static MultiXactStateData *MultiXactState;
@@ -684,6 +684,7 @@ MultiXactIdSetOldestVisible(void)
 	if (!MultiXactIdIsValid(OldestVisibleMXactId[MyBackendId]))
 	{
 		MultiXactId oldestMXact;
+		int			maxOldestSlot = MaxOldestSlot;
 		int			i;
 
 		LWLockAcquire(MultiXactGenLock, LW_EXCLUSIVE);
@@ -697,7 +698,7 @@ MultiXactIdSetOldestVisible(void)
 		if (oldestMXact < FirstMultiXactId)
 			oldestMXact = FirstMultiXactId;
 
-		for (i = 1; i <= MaxOldestSlot; i++)
+		for (i = 1; i <= maxOldestSlot; i++)
 		{
 			MultiXactId thisoldest = OldestMemberMXactId[i];
 
@@ -2507,6 +2508,7 @@ GetOldestMultiXactId(void)
 {
 	MultiXactId oldestMXact;
 	MultiXactId nextMXact;
+	int			maxOldestSlot = MaxOldestSlot;
 	int			i;
 
 	/*
@@ -2525,7 +2527,7 @@ GetOldestMultiXactId(void)
 		nextMXact = FirstMultiXactId;
 
 	oldestMXact = nextMXact;
-	for (i = 1; i <= MaxOldestSlot; i++)
+	for (i = 1; i <= maxOldestSlot; i++)
 	{
 		MultiXactId thisoldest;
 
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 271a3146db..608c5149e5 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -260,6 +260,7 @@ TwoPhaseShmemInit(void)
 	{
 		GlobalTransaction gxacts;
 		int			i;
+		int			max_backends = GetMaxBackends();
 
 		Assert(!found);
 		TwoPhaseState->freeGXacts = NULL;
@@ -293,7 +294,7 @@ TwoPhaseShmemInit(void)
 			 * prepare

Re: support for CREATE MODULE

2022-02-03 Thread Julien Rouhaud
Hi,

On Thu, Feb 03, 2022 at 10:42:32PM -0300, Alvaro Herrera wrote:
> On 2022-Feb-03, Pavel Stehule wrote:
> 
> > The biggest problem is the coexistence of Postgres's SEARCH_PATH object
> > identification with the local and public scopes used in MODULEs or in
> > Oracle's packages.
> > 
> > I can imagine MODULES as a third level of database object grouping with
> > the following functionality
> > 
> > 1. It should support all database objects, like schemas do
> 
> I proposed a way for modules to coexist with schemas that got no reply,
> https://postgr.es/m/202106021908.ddmebx7qfdld@alvherre.pgsql

Ah, sorry I missed this one.

> I still think that that idea is valuable; it would let us create
> "private" routines, for example, which are good for encapsulation.
> But the way it interacts with schemas means we don't end up with a total
> mess in the namespace resolution rules.

> I argued that modules would
> only have functions, and maybe a few other useful object types, but not
> *all* object types, because we don't need all object types to become
> private.  For example, I don't think I would like to have data types or
> casts to be private, so they can only be in a schema and they cannot be
> in a module.
> 
> Of course, that idea of modules would also ease porting large DB-based
> applications from other database systems.
> 
> What do others think?

This approach seems way better as it indeed fixes the qualification issues with
the patch proposed in this thread.




Re: Adding CI to our tree

2022-02-03 Thread Andres Freund
Hi, 

On February 3, 2022 9:04:04 PM PST, Justin Pryzby  wrote:
>On Thu, Feb 03, 2022 at 11:57:18AM -0800, Andres Freund wrote:
>> On 2022-02-02 21:58:28 -0600, Justin Pryzby wrote:
>> > FYI: I've rebased these against your cirrus/windows changes.
>> 
>> What's the idea behind
>> #echo 'deb http://deb.debian.org/debian bullseye main' 
>> >>/etc/apt/sources.list
>> That's already in sources.list, so I'm not sure what this shows?
>
>At one point I thought I needed it - maybe all I needed was "apt-get update"..

Yes, that's needed. There's no old pre-fetched package list, because it'd be 
outdated anyway and would only work *sometimes* for some packages... They're 
also not small (image size heavily influences job start speed).


>> > 9d0f03d3450 wip: cirrus: code coverage
>>
>> I don't think it's good to just unconditionally reference the master branch
>> here. It'll do bogus stuff once 15 is branched off. It works for cfbot, but
>> not other uses.
>
>That's only used for filtering changed files.  It uses git diff --cherry-pick
>postgres/master..., which would *try* to avoid showing anything which is also
>in master.

The point is that master is not a relevant point of comparison when a commit in 
the 14 branch is tested.


Regards,

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.