Re: Force streaming every change in logical decoding

2022-12-23 Thread Amit Kapila
On Fri, Dec 23, 2022 at 10:31 PM Masahiko Sawada  wrote:
>
> On Fri, Dec 23, 2022 at 5:32 PM shiy.f...@fujitsu.com
>  wrote:
> >
> > On Fri, Dec 23, 2022 1:50 PM Amit Kapila 
> > >
> > > On Thu, Dec 22, 2022 at 6:18 PM shiy.f...@fujitsu.com
> > >  wrote:
> > > >
> > > >
> > > > Besides, I tried to reduce data size in streaming subscription tap 
> > > > tests by this
> > > > new GUC (see 0002 patch). But I didn't convert all streaming tap tests
> > > because I
> > > > think we also need to cover the case that there are lots of changes. 
> > > > So, 015*
> > > is
> > > > not modified. And 017* is not modified because streaming transactions 
> > > > and
> > > > non-streaming transactions are tested alternately in this test.
> > > >
> > >
> > > I think we can remove the newly added test from the patch and instead
> > > combine the 0001 and 0002 patches. I think we should leave the
> > > 022_twophase_cascade as it is because it can impact code coverage,
> > > especially the below part of the test:
> > > # 2PC PREPARE with a nested ROLLBACK TO SAVEPOINT
> > > $node_A->safe_psql(
> > > 'postgres', "
> > > BEGIN;
> > > INSERT INTO test_tab VALUES (, 'foobar');
> > > SAVEPOINT sp_inner;
> > > INSERT INTO test_tab SELECT i, md5(i::text) FROM
> > > generate_series(3, 5000) s(i);
> > >
> > > Here, we will stream first time after the subtransaction, so can
> > > impact the below part of the code in ReorderBufferStreamTXN:
> > > if (txn->snapshot_now == NULL)
> > > {
> > > ...
> > > dlist_foreach(subxact_i, &txn->subtxns)
> > > {
> > > ReorderBufferTXN *subtxn;
> > >
> > > subtxn = dlist_container(ReorderBufferTXN, node, subxact_i.cur);
> > > ReorderBufferTransferSnapToParent(txn, subtxn);
> > > }
> > > ...
> > >
> >
> > OK, I removed the modification in 022_twophase_cascade.pl and combined the
> > two patches.
>
> Thank you for updating the patch. The v6 patch looks good to me.
>

LGTM as well. I'll push this on Monday.

-- 
With Regards,
Amit Kapila.




Re: [BUG] pg_upgrade test fails from older versions.

2022-12-23 Thread Michael Paquier
On Fri, Dec 23, 2022 at 10:39:25AM -0600, Justin Pryzby wrote:
> On Fri, Dec 23, 2022 at 05:51:28PM +0900, Michael Paquier wrote:
>> FWIW, I find the use of a FOR loop with a DO block much cleaner to
>> follow in this context, so something like the attached would be able
>> to group the two queries and address your point on O(N^2).  Do you
>> like that? 
> 
> LGTM.  Thanks.

I am a bit busy for the next few days, but I may be able to get that
done next Monday.
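
For reference, the FOR-loop-in-a-DO-block shape would look roughly like
this (a sketch only, assuming the aclitem-column rewrite from the query
quoted upthread, which also carries extra filters elided here):

    DO $$
    DECLARE
        rec record;
    BEGIN
        FOR rec IN SELECT attrelid::regclass AS rel, attname AS col
                     FROM pg_attribute
                    WHERE atttypid = 'aclitem'::regtype  -- AND ...
        LOOP
            EXECUTE format('ALTER TABLE %s ALTER COLUMN %I TYPE text',
                           rec.rel, rec.col);
        END LOOP;
    END $$;

This keeps the catalog scan to a single pass, addressing the O(N^2)
concern.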
--
Michael




Re: daitch_mokotoff module

2022-12-23 Thread Dag Lem
Alvaro Herrera  writes:

> On 2022-Dec-23, Alvaro Herrera wrote:
>

[...]

> I tried downloading a list of surnames from here
> https://www.bibliotecadenombres.com/apellidos/apellidos-espanoles/
> pasted that in a text file and \copy'ed it into a table.  Then I ran
> this query
>
> select string_agg(a, ' ' order by a), daitch_mokotoff(a), count(*)
> from apellidos
> group by daitch_mokotoff(a)
> order by count(*) desc;
>
> so I have a first entry like this
>
> string_agg │ Balasco Balles Belasco Belles Blas Blasco Fallas Feliz
> Palos Pelaez Plaza Valles Vallez Velasco Velez Veliz Veloz Villas
> daitch_mokotoff │ 784000
> count   │ 18
>
> but then I have a bunch of other entries with the same code 784000 as
> alternative codes,
>
> string_agg  │ Velazco
> daitch_mokotoff │ 784500 784000
> count   │ 1
>
> string_agg  │ Palacio
> daitch_mokotoff │ 785000 784000
> count   │ 1
>
> I suppose I need to group these together somehow, and it would make more
> sense to do that if the values were arrays.
>
>
> If I scroll a bit further down and choose, say, 794000 (a relatively
> popular one), then I have this
>
> string_agg │ Barraza Barrios Barros Bras Ferraz Frias Frisco Parras
> Peraza Peres Perez Porras Varas Veras
> daitch_mokotoff │ 794000
> count   │ 14
>
> and looking for that code in the result I also get these three
>
> string_agg  │ Barca Barco Parco
> daitch_mokotoff │ 795000 794000
> count   │ 3
>
> string_agg  │ Borja
> daitch_mokotoff │ 790000 794000
> count   │ 1
>
> string_agg  │ Borjas
> daitch_mokotoff │ 794000 794400
> count   │ 1
>
> and then I see that I should also search for possible matches in codes
> 795000, 790000 and 794400, so that gives me
>
> string_agg │ Baria Baro Barrio Barro Berra Borra Feria Para Parra
> Perea Vera
> daitch_mokotoff │ 790000
> count   │ 11
>
> string_agg  │ Barriga Borge Borrego Burgo Fraga
> daitch_mokotoff │ 795000
> count   │ 5
>
> string_agg  │ Borjas
> daitch_mokotoff │ 794000 794400
> count   │ 1
>
> which look closely related (compare "Veras" in the first to "Vera" in
> the latter set.  If you ignore that pseudo-match, you're likely to miss
> possible family relationships.)
>
>
> I suppose if I were a genealogy researcher, I would be helped by having
> each of these codes behave as a separate unit, rather than me having to
> split the string into the several possible contained values.

It seems to me like you're trying to use soundex coding for something it
was never designed for.

As stated in my previous mail, soundex algorithms are designed to index
names on some representation of sound, so that alike sounding names with
different spellings will match, and as shown in the documentation
example, that is exactly what the implementation facilitates.

Daitch-Mokotoff Soundex indexes alternative sounds for the same name,
however if I understand correctly, you want to index names by single
sounds, linking all alike sounding names to the same soundex code. I
fail to see how that is useful - if you want to find matches for a name,
you simply match against all indexed names. If you only consider one
sound, you won't find all names that match.

In any case, as explained in the documentation, the implementation is
intended to be a companion to Full Text Search, thus text is the natural
representation for the soundex codes.
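
To illustrate, here is a minimal matching query along the lines of the
documented example (the table, index, and query name are assumptions for
the sketch; only daitch_mokotoff() itself comes from the patch):

    -- names to be searched, indexed on their soundex codes
    CREATE TABLE s (nm text);
    CREATE INDEX ON s USING gin (to_tsvector('simple', daitch_mokotoff(nm)));

    -- a name matches if it shares at least one soundex code with 'Veras'
    SELECT nm
      FROM s
     WHERE to_tsvector('simple', daitch_mokotoff(nm))
           @@ to_tsquery('simple',
                         replace(daitch_mokotoff('Veras'), ' ', ' | '));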

BTW Vera 790000 does not match Veras 794000, because they don't sound
the same (up to the maximum soundex code length).

Best regards

Dag Lem




Re: Infinite Interval

2022-12-23 Thread Joseph Koshakow
Hi Ashutosh,

I ended up doing some more work on this today. All of the major
features should be implemented now. Below are what I think are the
outstanding TODOs:
- Clean up error messages and error codes
- Figure out how to correctly implement interval_part for infinite
intervals. For now I pretty much copied the implementation of
timestamp_part, but I'm not convinced that's correct.
- Fix horology tests.
- Test consolidation. After looking through the interval tests, I
realized that I may have duplicated some test cases. It would probably
be best to remove those duplicate tests.
- General cleanup, remove TODOs.
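
For reference, the sort of behavior this is aiming for (a sketch only;
exact semantics and error cases are still open, per the TODOs above):

    SELECT 'infinity'::interval;                          -- now valid input
    SELECT 'infinity'::interval > interval '1000 years';  -- true
    SELECT interval 'infinity' / 2.0;                     -- still infinity
    SELECT interval 'infinity' + interval '-infinity';    -- open question: opposite signs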

Attached is my most recent patch.

- Joe Koshakow
From 380cde4061afd6eed4cde938a4c668a2c96bb58f Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Sat, 17 Dec 2022 14:21:26 -0500
Subject: [PATCH] This is WIP.

Following things are supported
1. Accepts '+/-infinity' as a valid string input for interval type.
2. Support interval_pl, interval_div
3. Tests in interval.sql for comparison operators working fine.

TODOs
1. Various TODOs in code
2. interval_pl: how to handle infinite values with opposite signs
3. timestamp, timestamptz, date and time arithmetic
4. Fix horology test.

Ashutosh Bapat
---
 src/backend/utils/adt/date.c   |  20 +
 src/backend/utils/adt/datetime.c   |   2 +
 src/backend/utils/adt/timestamp.c  | 330 -
 src/include/datatype/timestamp.h   |  22 +
 src/test/regress/expected/interval.out | 953 -
 src/test/regress/sql/interval.sql  | 182 -
 6 files changed, 1442 insertions(+), 67 deletions(-)

diff --git a/src/backend/utils/adt/date.c b/src/backend/utils/adt/date.c
index 1cf7c7652d..a2c9214bcf 100644
--- a/src/backend/utils/adt/date.c
+++ b/src/backend/utils/adt/date.c
@@ -2073,6 +2073,11 @@ time_pl_interval(PG_FUNCTION_ARGS)
 	Interval   *span = PG_GETARG_INTERVAL_P(1);
 	TimeADT		result;
 
+	if (INTERVAL_NOT_FINITE(span))
+		ereport(ERROR,
+(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
+ errmsg("TODO")));
+
 	result = time + span->time;
 	result -= result / USECS_PER_DAY * USECS_PER_DAY;
 	if (result < INT64CONST(0))
@@ -2091,6 +2096,11 @@ time_mi_interval(PG_FUNCTION_ARGS)
 	Interval   *span = PG_GETARG_INTERVAL_P(1);
 	TimeADT		result;
 
+	if (INTERVAL_NOT_FINITE(span))
+		ereport(ERROR,
+(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
+ errmsg("TODO")));
+
 	result = time - span->time;
 	result -= result / USECS_PER_DAY * USECS_PER_DAY;
 	if (result < INT64CONST(0))
@@ -2605,6 +2615,11 @@ timetz_pl_interval(PG_FUNCTION_ARGS)
 	Interval   *span = PG_GETARG_INTERVAL_P(1);
 	TimeTzADT  *result;
 
+	if (INTERVAL_NOT_FINITE(span))
+		ereport(ERROR,
+(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
+ errmsg("TODO")));
+
 	result = (TimeTzADT *) palloc(sizeof(TimeTzADT));
 
 	result->time = time->time + span->time;
@@ -2627,6 +2642,11 @@ timetz_mi_interval(PG_FUNCTION_ARGS)
 	Interval   *span = PG_GETARG_INTERVAL_P(1);
 	TimeTzADT  *result;
 
+	if (INTERVAL_NOT_FINITE(span))
+		ereport(ERROR,
+(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
+ errmsg("TODO")));
+
 	result = (TimeTzADT *) palloc(sizeof(TimeTzADT));
 
 	result->time = time->time - span->time;
diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index b5b117a8ca..1e98c6dc78 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -3634,6 +3634,8 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 			case DTK_STRING:
 			case DTK_SPECIAL:
 type = DecodeUnits(i, field[i], &uval);
+if (type == UNKNOWN_FIELD)
+	type = DecodeSpecial(i, field[i], &uval);
 if (type == IGNORE_DTF)
 	continue;
 
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index 3f2508c0c4..d108057ce5 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -79,6 +79,8 @@ static bool AdjustIntervalForTypmod(Interval *interval, int32 typmod,
 static TimestampTz timestamp2timestamptz(Timestamp timestamp);
 static Timestamp timestamptz2timestamp(TimestampTz timestamp);
 
+static void EncodeSpecialInterval(Interval *interval, char *str);
+static void negate_interval(Interval *interval, Interval *result);
 
 /* common code for timestamptypmodin and timestamptztypmodin */
 static int32
@@ -943,6 +945,14 @@ interval_in(PG_FUNCTION_ARGS)
 		 errmsg("interval out of range")));
 			break;
 
+		case DTK_LATE:
+			INTERVAL_NOEND(result);
+			break;
+
+		case DTK_EARLY:
+			INTERVAL_NOBEGIN(result);
+			break;
+
 		default:
 			elog(ERROR, "unexpected dtype %d while parsing interval \"%s\"",
  dtype, str);
@@ -965,8 +975,13 @@ interval_out(PG_FUNCTION_ARGS)
 			   *itm = &tt;
 	char		buf[MAXDATELEN + 1];
 
-	interval2itm(*span, itm);
-	EncodeInterval(itm, IntervalStyle, buf);
+	if (INTERVAL_NOT_FINITE(span))
+		EncodeSpecialInterval(span, buf);
+	else
+	{
+		interval2itm(*span, itm);
+		EncodeInterval(itm, IntervalStyle, buf);
+	}
 

Re: daitch_mokotoff module

2022-12-23 Thread Dag Lem
Alvaro Herrera  writes:

> I wonder why do you have it return the multiple alternative codes as a
> space-separated string.  Maybe an array would be more appropriate.  Even
> on your documented example use, the first thing you do is split it on
> spaces.

In the example, the *input* is split on whitespace, the returned soundex
codes are not. The splitting of the input is done in order to code each
word separately. One of the stated rules of the Daitch-Mokotoff Soundex
Coding is that "When a name consists of more than one word, it is coded
as if one word", and this may not always be desired. See
https://www.avotaynu.com/soundex.htm or
https://www.jewishgen.org/InfoFiles/soundex.html for the rules.

The intended use for the Daitch-Mokotoff soundex, as for any other
soundex algorithm, is to index names (or words) on some representation
of sound, so that alike sounding names with different spellings will
match.

In PostgreSQL, the Daitch-Mokotoff Soundex and Full Text Search makes
for a powerful combination to match alike sounding names. Full Text
Search (as any other free text search engine) works with documents, and
thus the Daitch-Mokotoff Soundex implementation produces documents
(words separated by space). As stated in the documentation: "Any
alternative soundex codes are separated by space, which makes the
returned text suited for use in Full Text Search".
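
Concretely, with codes that appear elsewhere in this thread, the
returned document for a name with alternative sounds looks like this:

    SELECT daitch_mokotoff('Borja');
     daitch_mokotoff
    -----------------
     790000 794000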

Best regards,

Dag Lem




Re: daitch_mokotoff module

2022-12-23 Thread Dag Lem
Tom Lane  writes:

> Alvaro Herrera  writes:
>> On 2022-Dec-22, Dag Lem wrote:
>>> This should hopefully fix the last Cfbot failures, by exclusion of
>>> daitch_mokotoff.h from headerscheck and cpluspluscheck.
>
>> Hmm, maybe it'd be better to move the typedefs to the .h file instead.
>
> Indeed, that sounds like exactly the wrong way to fix such a problem.
> The bar for excluding stuff from headerscheck needs to be very high.
>

OK, I've moved enough declarations back to the generated header file
again so as to avoid excluding it from headerscheck and cpluspluscheck.

Best regards,

Dag Lem

diff --git a/contrib/fuzzystrmatch/Makefile b/contrib/fuzzystrmatch/Makefile
index 0704894f88..12baf2d884 100644
--- a/contrib/fuzzystrmatch/Makefile
+++ b/contrib/fuzzystrmatch/Makefile
@@ -3,14 +3,15 @@
 MODULE_big = fuzzystrmatch
 OBJS = \
 	$(WIN32RES) \
+	daitch_mokotoff.o \
 	dmetaphone.o \
 	fuzzystrmatch.o
 
 EXTENSION = fuzzystrmatch
-DATA = fuzzystrmatch--1.1.sql fuzzystrmatch--1.0--1.1.sql
+DATA = fuzzystrmatch--1.1.sql fuzzystrmatch--1.1--1.2.sql fuzzystrmatch--1.0--1.1.sql
 PGFILEDESC = "fuzzystrmatch - similarities and distance between strings"
 
-REGRESS = fuzzystrmatch
+REGRESS = fuzzystrmatch fuzzystrmatch_utf8
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
@@ -22,3 +23,14 @@ top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 include $(top_srcdir)/contrib/contrib-global.mk
 endif
+
+# Force this dependency to be known even without dependency info built:
+daitch_mokotoff.o: daitch_mokotoff.h
+
+daitch_mokotoff.h: daitch_mokotoff_header.pl
+	perl $< $@
+
+distprep: daitch_mokotoff.h
+
+maintainer-clean:
+	rm -f daitch_mokotoff.h
diff --git a/contrib/fuzzystrmatch/daitch_mokotoff.c b/contrib/fuzzystrmatch/daitch_mokotoff.c
new file mode 100644
index 00..e809d4a39e
--- /dev/null
+++ b/contrib/fuzzystrmatch/daitch_mokotoff.c
@@ -0,0 +1,582 @@
+/*
+ * Daitch-Mokotoff Soundex
+ *
+ * Copyright (c) 2021 Finance Norway
+ * Author: Dag Lem 
+ *
+ * This implementation of the Daitch-Mokotoff Soundex System aims at high
+ * performance.
+ *
+ * - The processing of each phoneme is initiated by an O(1) table lookup.
+ * - For phonemes containing more than one character, a coding tree is traversed
+ *   to process the complete phoneme.
+ * - The (alternate) soundex codes are produced digit by digit in-place in
+ *   another tree structure.
+ *
+ *  References:
+ *
+ * https://www.avotaynu.com/soundex.htm
+ * https://www.jewishgen.org/InfoFiles/Soundex.html
+ * https://familypedia.fandom.com/wiki/Daitch-Mokotoff_Soundex
+ * https://stevemorse.org/census/soundex.html (dmlat.php, dmsoundex.php)
+ * https://github.com/apache/commons-codec/ (dmrules.txt, DaitchMokotoffSoundex.java)
+ * https://metacpan.org/pod/Text::Phonetic (DaitchMokotoff.pm)
+ *
+ * A few notes on other implementations:
+ *
+ * - All other known implementations have the same unofficial rules for "UE",
+ *   these are also adapted by this implementation (0, 1, NC).
+ * - The only other known implementation which is capable of generating all
+ *   correct soundex codes in all cases is the JOS Soundex Calculator at
+ *   https://www.jewishgen.org/jos/jossound.htm
+ * - "J" is considered (only) a vowel in dmlat.php
+ * - The official rules for "RS" are commented out in dmlat.php
+ * - Identical code digits for adjacent letters are not collapsed correctly in
+ *   dmsoundex.php when double digit codes are involved. E.g. "BESST" yields
+ *   744300 instead of 743000 as for "BEST".
+ * - "J" is considered (only) a consonant in DaitchMokotoffSoundex.java
+ * - "Y" is not considered a vowel in DaitchMokotoffSoundex.java
+ *
+ * Permission to use, copy, modify, and distribute this software and its
+ * documentation for any purpose, without fee, and without a written agreement
+ * is hereby granted, provided that the above copyright notice and this
+ * paragraph and the following two paragraphs appear in all copies.
+ *
+ * IN NO EVENT SHALL THE AUTHOR OR DISTRIBUTORS BE LIABLE TO ANY PARTY FOR
+ * DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING
+ * LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS
+ * DOCUMENTATION, EVEN IF THE AUTHOR OR DISTRIBUTORS HAVE BEEN ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ * THE AUTHOR AND DISTRIBUTORS SPECIFICALLY DISCLAIM ANY WARRANTIES,
+ * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
+ * AND FITNESS FOR A PARTICULAR PURPOSE.  THE SOFTWARE PROVIDED HEREUNDER IS
+ * ON AN "AS IS" BASIS, AND THE AUTHOR AND DISTRIBUTORS HAS NO OBLIGATIONS TO
+ * PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
+*/
+
+#include "postgres.h"
+
+#include "lib/stringinfo.h"
+#include "mb/pg_wchar.h"
+#include "utils/builtins.h"
+#include "utils/memutils.h"
+
+
+/*
+ * The soundex coding chart table is adapted from
+ * https://www.jewishgen.org/InfoFiles/Soundex.html
+ * See daitch_mokotoff_header.pl for details.
+*/
+
+/* Generated 

Re: fixing CREATEROLE

2022-12-23 Thread Robert Haas
On Thu, Dec 22, 2022 at 9:14 AM Alvaro Herrera  wrote:
> The contents looks good to me other than that problem, and I agree to
> backpatch it.

Cool. Thanks for the review.

> Why did you choose to use two dots for ellipses in some command
> s rather than three?  I know I've made that choice too on
> occassion, but there aren't many such cases and maybe we should put a
> stop to it (or a period) before it spreads too much.

Honestly, I wasn't aware that we had some other convention for it.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Error-safe user functions

2022-12-23 Thread Tom Lane
Ted Yu  writes:
> On Fri, Dec 23, 2022 at 1:22 PM Tom Lane  wrote:
>> Ted Yu  writes:
>>> +   /* See regexp.c for explanation */
>>> +   CHECK_FOR_INTERRUPTS();

>> Yes, it is.  We don't want a query-cancel transformed into a soft error.

> `ereturn(escontext` calls appear in multiple places in the patch.
> What about other callsites (w.r.t. checking interrupt) ?

What about them?  The reason this one is special is that backend/regexp
might return a failure code that's specifically "I gave up because
there's a query cancel pending".  We don't want to report that as a soft
error.  It's true that we might cancel the query for real a bit later on
even if this check weren't here, but that doesn't mean it's okay to go
down the soft error path and hope that there'll be a CHECK_FOR_INTERRUPTS
sometime before there's any visible evidence that we did the wrong thing.

regards, tom lane




Re: daitch_mokotoff module

2022-12-23 Thread Dag Lem
Andres Freund  writes:

> On 2022-12-22 14:27:54 +0100, Dag Lem wrote:
>> This should hopefully fix the last Cfbot failures, by exclusion of
>> daitch_mokotoff.h from headerscheck and cpluspluscheck.
>
> Btw, you can do the same tests as cfbot in your own repo by enabling CI
> in a github repo. See src/tools/ci/README
>

OK, thanks, I've set it up now.

Best regards,

Dag Lem




Re: Error-safe user functions

2022-12-23 Thread Ted Yu
On Fri, Dec 23, 2022 at 1:22 PM Tom Lane  wrote:

> Ted Yu  writes:
> > In makeItemLikeRegex :
>
> > +   /* See regexp.c for explanation */
> > +   CHECK_FOR_INTERRUPTS();
> > +   pg_regerror(re_result, &re_tmp, errMsg,
> > sizeof(errMsg));
> > +   ereturn(escontext, false,
>
> > Since an error is returned, I wonder if the `CHECK_FOR_INTERRUPTS` call
> is
> > still necessary.
>
> Yes, it is.  We don't want a query-cancel transformed into a soft error.
>
> regards, tom lane
>
Hi,
`ereturn(escontext` calls appear in multiple places in the patch.
What about other callsites (w.r.t. checking interrupt) ?

Cheers


Re: Error-safe user functions

2022-12-23 Thread Tom Lane
Ted Yu  writes:
> In makeItemLikeRegex :

> +   /* See regexp.c for explanation */
> +   CHECK_FOR_INTERRUPTS();
> +   pg_regerror(re_result, &re_tmp, errMsg,
> sizeof(errMsg));
> +   ereturn(escontext, false,

> Since an error is returned, I wonder if the `CHECK_FOR_INTERRUPTS` call is
> still necessary.

Yes, it is.  We don't want a query-cancel transformed into a soft error.

regards, tom lane




Re: Error-safe user functions

2022-12-23 Thread Ted Yu
On Fri, Dec 23, 2022 at 9:20 AM Andrew Dunstan  wrote:

>
> On 2022-12-22 Th 11:44, Tom Lane wrote:
> > Andrew Dunstan  writes:
> >> Yeah, I started there, but it's substantially more complex - unlike cube
> >> the jsonpath scanner calls the error routines as well as the parser.
> >> Anyway, here's a patch.
> > I looked through this and it seems generally OK.  A minor nitpick is
> > that we usually write "(Datum) 0" not "(Datum) NULL" for dont-care Datum
> > values.
>
>
> Fixed in the new version attached.
>
>
> > A slightly bigger issue is that makeItemLikeRegex still allows
> > an error to be thrown from RE_compile_and_cache if a bogus regex is
> > presented.  But that could be dealt with later.
>
>
> I'd rather fix it now while we're paying attention.
>
>
> >
> > (I wonder why this is using RE_compile_and_cache at all, really,
> > rather than some other API.  There doesn't seem to be value in
> > forcing the regex into the cache at this point.)
> >
> >
>
>
I agree. The attached uses pg_regcomp instead. I had to lift a couple of
lines from regexp.c, but not too many.
>
>
> cheers
>
>
> andrew
>
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com

Hi,
In makeItemLikeRegex :

+   /* See regexp.c for explanation */
+   CHECK_FOR_INTERRUPTS();
+   pg_regerror(re_result, &re_tmp, errMsg,
sizeof(errMsg));
+   ereturn(escontext, false,

Since an error is returned, I wonder if the `CHECK_FOR_INTERRUPTS` call is
still necessary.

 Cheers


Re: Making Vars outer-join aware

2022-12-23 Thread Tom Lane
Ted Yu  writes:
> For v8-0012-invent-join-domains.patch, in `distribute_qual_to_rels`, it
> seems that `pseudoconstant` and `root->hasPseudoConstantQuals` carry the
> same value.
> Can the variable `pseudoconstant` be omitted?

Surely not.  'pseudoconstant' tells whether the current qual clause
is pseudoconstant.  root->hasPseudoConstantQuals remembers whether
we have found any pseudoconstant qual in the query.

regards, tom lane




Re: Making Vars outer-join aware

2022-12-23 Thread Ted Yu
On Fri, Dec 23, 2022 at 10:21 AM Tom Lane  wrote:

> Here's a new edition of this patch series.
>
> I shoved some preliminary refactoring into the 0001 patch,
> notably splitting deconstruct_jointree into two passes.
> 0002-0009 cover the same ground as they did before, though
> with some differences in detail.  0010-0012 are new work
> mostly aimed at removing kluges we no longer need.
>
> There are two big areas that I would still like to improve, but
> I think we've run out of time to address them in the v16 cycle:
>
> * It'd be nice to apply the regular EquivalenceClass deduction
> mechanisms to outer-join equalities, instead of the
> reconsider_outer_join_clauses kluge.  I've made several stabs at that
> without much success.  I think that the "join domain" framework added
> in 0012 is likely to provide a workable foundation, but some more
> effort is needed.
>
> * I really want to get rid of RestrictInfo.is_pushed_down and
> RINFO_IS_PUSHED_DOWN(), because those seem exceedingly awkward
> and squishy.  I've not gotten far with finding a better
> replacement there, either.
>
> Despite the work being unfinished, I feel that this has moved us a
> long way towards outer-join handling being less of a jury-rigged
> affair.
>
> regards, tom lane
>
Hi,
For v8-0012-invent-join-domains.patch, in `distribute_qual_to_rels`, it
seems that `pseudoconstant` and `root->hasPseudoConstantQuals` carry the
same value.
Can the variable `pseudoconstant` be omitted?

Cheers


Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL

2022-12-23 Thread David Christensen
On Fri, Dec 23, 2022 at 12:57 PM David Christensen
 wrote:
>
> On Wed, Dec 21, 2022 at 5:47 AM Bharath Rupireddy
>  wrote:

[snip]

> > 2. +$node->init(extra => ['-k'], allows_streaming => 1);
> > When enabled with allows_streaming, there are a bunch of things that
> > happen to the node while initializing, I don't think we need all of
> > them for this.
>
> I think the "allows_streaming" was required to ensure the WAL files
> were preserved properly, and was the approach we ended up taking
> rather than trying to fail the archive_command or other approaches I'd
> taken earlier. I'd rather keep this if we can, unless you can propose
> a different approach that would continue to work in the same way.

Confirmed that we needed this in order to create the replication slot,
so this /is/ required for the test to work.

Enclosing v11 with yours and Michael's latest feedback.

Best,

David




Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL

2022-12-23 Thread David Christensen
On Mon, Dec 19, 2022 at 12:23 AM Michael Paquier  wrote:
>
> On Thu, Dec 15, 2022 at 05:17:46PM -0600, David Christensen wrote:
> > On Thu, Dec 15, 2022 at 12:36 AM Michael Paquier  
> > wrote:
> > This v10 should incorporate your feedback as well as Bharath's.
>
> Thanks for the new version.  I have minor comments.
>
> >> It seems to me that you could allow things to work even if the
> >> directory exists and is empty.  See for example
> >> verify_dir_is_empty_or_create() in pg_basebackup.c.
> >
> > The `pg_mkdir_p()` supports an existing directory (and I don't think
> > we want to require it to be empty first), so this only errors when it
> > can't create a directory for some reason.
>
> Sure, but things can also be made so as we don't fail if the directory
> exists and is empty?  This would be more consistent with the base
> directories created by pg_basebackup and initdb.

I guess I'm feeling a little dense here; how is this failing if there
is an existing empty directory?

> >> +$node->safe_psql('postgres', <<EOF);
> >> +SELECT 'init' FROM 
> >> pg_create_physical_replication_slot('regress_pg_waldump_slot', true, 
> >> false);
> >> +CREATE TABLE test_table AS SELECT generate_series(1,100) a;
> >> +CHECKPOINT; -- required to force FPI for next writes
> >> +UPDATE test_table SET a = a + 1;
> >> Using an EOF to execute a multi-line query would be a first.  Couldn't
> >> you use the same thing as anywhere else?  009_twophase.pl just to
> >> mention one.  (Mentioned by Bharath upthread, where he asked for an
> >> extra opinion so here it is.)
> >
> > Fair enough, while idiomatic perl to me, not a hill to die on;
> > converted to a standard multiline string.
>
> By the way, knowing that we have an option called --fullpage, could it
> be better to use --save-fullpage for the option name?

Works for me.  I think I'd just wanted to avoid reformatting the
entire usage message which is why I'd gone with the shorter version.

> +   OPF = fopen(filename, PG_BINARY_W);
> +   if (!OPF)
> +   pg_fatal("couldn't open file for output: %s", filename);
> [..]
> +   if (fwrite(page, BLCKSZ, 1, OPF) != 1)
> +   pg_fatal("couldn't write out complete full page image to file: 
> %s", filename);
> These should be more generic, as of "could not open file \"%s\"" and
> "could not write file \"%s\"" as the file name provides all the
> information about what this writes.

Sure, will update.

Best,

David




Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL

2022-12-23 Thread David Christensen
On Wed, Dec 21, 2022 at 5:47 AM Bharath Rupireddy
 wrote:
>
> On Fri, Dec 16, 2022 at 4:47 AM David Christensen
>  wrote:
> >
> > On Thu, Dec 15, 2022 at 12:36 AM Michael Paquier  
> > wrote:
> > >
> > > On Wed, Dec 14, 2022 at 04:44:34PM -0600, David Christensen wrote:
> > > > I can get one sent in tomorrow.
> >
> > This v10 should incorporate your feedback as well as Bharath's.
>
> Thanks for the patch. Here're some minor comments:
>
> 1. +my $node =  PostgreSQL::Test::Cluster->new('primary');
> Can the name be other than 'primary' because we don't create a standby
> for this test? Something like - 'node_a' or 'node_extract_fpi' or some
> other.

Sure, no issues.

> 2. +$node->init(extra => ['-k'], allows_streaming => 1);
> When enabled with allows_streaming, there are a bunch of things that
> happen to the node while initializing, I don't think we need all of
> them for this.

I think the "allows_streaming" was required to ensure the WAL files
were preserved properly, and was the approach we ended up taking
rather than trying to fail the archive_command or other approaches I'd
taken earlier. I'd rather keep this if we can, unless you can propose
a different approach that would continue to work in the same way.

> 3. +$node->init(extra => ['-k'], allows_streaming => 1);
> Can we use --data-checksums instead of -k for more readability?
> Perhaps, a comment on why we need that option helps greatly.

Yeah, can spell out; don't recall exactly why we needed it offhand,
but will confirm or remove if insignificant.

> 4.
> +page = (Page) buf.data;
> +
> +if (!XLogRecHasBlockRef(record, block_id))
> +continue;
> +
> +if (!XLogRecHasBlockImage(record, block_id))
> +continue;
> +
> +if (!RestoreBlockImage(record, block_id, page))
> +continue;
> Can you shift  page = (Page) buf.data; just before the last if
> condition RestoreBlockImage() so that it doesn't get executed for the
> other two continue statements?

Sure; since it was just setting a pointer value I didn't consider it
to be a hotspot for optimization.

Best,

David




Re: Error-safe user functions

2022-12-23 Thread Tom Lane
Andrew Dunstan  writes:
> On 2022-12-22 Th 11:44, Tom Lane wrote:
>> (I wonder why this is using RE_compile_and_cache at all, really,
>> rather than some other API.  There doesn't seem to be value in
>> forcing the regex into the cache at this point.)

> I agree. The attached uses pg_regcomp instead. I had a lift a couple of
> lines from regexp.c, but not too many.

LGTM.  No further comments.

regards, tom lane




Re: Error-safe user functions

2022-12-23 Thread Andrew Dunstan

On 2022-12-22 Th 11:44, Tom Lane wrote:
> Andrew Dunstan  writes:
>> Yeah, I started there, but it's substantially more complex - unlike cube
>> the jsonpath scanner calls the error routines as well as the parser.
>> Anyway, here's a patch.
> I looked through this and it seems generally OK.  A minor nitpick is
> that we usually write "(Datum) 0" not "(Datum) NULL" for dont-care Datum
> values.  


Fixed in the new version attached.


> A slightly bigger issue is that makeItemLikeRegex still allows
> an error to be thrown from RE_compile_and_cache if a bogus regex is
> presented.  But that could be dealt with later.


I'd rather fix it now while we're paying attention.


>
> (I wonder why this is using RE_compile_and_cache at all, really,
> rather than some other API.  There doesn't seem to be value in
> forcing the regex into the cache at this point.)
>
>   


I agree. The attached uses pg_regcomp instead. I had to lift a couple of
lines from regexp.c, but not too many.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com
diff --git a/src/backend/utils/adt/jsonpath.c b/src/backend/utils/adt/jsonpath.c
index 91af030095..6c7a5b9854 100644
--- a/src/backend/utils/adt/jsonpath.c
+++ b/src/backend/utils/adt/jsonpath.c
@@ -66,16 +66,19 @@
 #include "funcapi.h"
 #include "lib/stringinfo.h"
 #include "libpq/pqformat.h"
+#include "nodes/miscnodes.h"
 #include "miscadmin.h"
 #include "utils/builtins.h"
 #include "utils/json.h"
 #include "utils/jsonpath.h"
 
 
-static Datum jsonPathFromCstring(char *in, int len);
+static Datum jsonPathFromCstring(char *in, int len, struct Node *escontext);
 static char *jsonPathToCstring(StringInfo out, JsonPath *in,
 			   int estimated_len);
-static int	flattenJsonPathParseItem(StringInfo buf, JsonPathParseItem *item,
+static bool	flattenJsonPathParseItem(StringInfo buf, int *result,
+	 struct Node *escontext,
+	 JsonPathParseItem *item,
 	 int nestingLevel, bool insideArraySubscript);
 static void alignStringInfoInt(StringInfo buf);
 static int32 reserveSpaceForItemPointer(StringInfo buf);
@@ -95,7 +98,7 @@ jsonpath_in(PG_FUNCTION_ARGS)
 	char	   *in = PG_GETARG_CSTRING(0);
 	int			len = strlen(in);
 
-	return jsonPathFromCstring(in, len);
+	return jsonPathFromCstring(in, len, fcinfo->context);
 }
 
 /*
@@ -119,7 +122,7 @@ jsonpath_recv(PG_FUNCTION_ARGS)
 	else
 		elog(ERROR, "unsupported jsonpath version number: %d", version);
 
-	return jsonPathFromCstring(str, nbytes);
+	return jsonPathFromCstring(str, nbytes, NULL);
 }
 
 /*
@@ -165,24 +168,29 @@ jsonpath_send(PG_FUNCTION_ARGS)
  * representation of jsonpath.
  */
 static Datum
-jsonPathFromCstring(char *in, int len)
+jsonPathFromCstring(char *in, int len, struct Node *escontext)
 {
-	JsonPathParseResult *jsonpath = parsejsonpath(in, len);
+	JsonPathParseResult *jsonpath = parsejsonpath(in, len, escontext);
 	JsonPath   *res;
 	StringInfoData buf;
 
-	initStringInfo(&buf);
-	enlargeStringInfo(&buf, 4 * len /* estimation */ );
-
-	appendStringInfoSpaces(&buf, JSONPATH_HDRSZ);
+	if (SOFT_ERROR_OCCURRED(escontext))
+		return (Datum) NULL;
 
 	if (!jsonpath)
-		ereport(ERROR,
+		ereturn(escontext, (Datum) NULL,
 (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
  errmsg("invalid input syntax for type %s: \"%s\"", "jsonpath",
 		in)));
 
-	flattenJsonPathParseItem(&buf, jsonpath->expr, 0, false);
+	initStringInfo(&buf);
+	enlargeStringInfo(&buf, 4 * len /* estimation */ );
+
+	appendStringInfoSpaces(&buf, JSONPATH_HDRSZ);
+
+	if (!flattenJsonPathParseItem(&buf, NULL, escontext,
+  jsonpath->expr, 0, false))
+		return (Datum) NULL;
 
 	res = (JsonPath *) buf.data;
 	SET_VARSIZE(res, buf.len);
@@ -225,9 +233,10 @@ jsonPathToCstring(StringInfo out, JsonPath *in, int estimated_len)
  * Recursive function converting given jsonpath parse item and all its
  * children into a binary representation.
  */
-static int
-flattenJsonPathParseItem(StringInfo buf, JsonPathParseItem *item,
-		 int nestingLevel, bool insideArraySubscript)
+static bool
+flattenJsonPathParseItem(StringInfo buf,  int *result, struct Node *escontext,
+		 JsonPathParseItem *item, int nestingLevel,
+		 bool insideArraySubscript)
 {
 	/* position from beginning of jsonpath data */
 	int32		pos = buf->len - JSONPATH_HDRSZ;
@@ -295,16 +304,22 @@ flattenJsonPathParseItem(StringInfo buf, JsonPathParseItem *item,
 int32		left = reserveSpaceForItemPointer(buf);
 int32		right = reserveSpaceForItemPointer(buf);
 
-chld = !item->value.args.left ? pos :
-	flattenJsonPathParseItem(buf, item->value.args.left,
-			 nestingLevel + argNestingLevel,
-			 insideArraySubscript);
+if (!item->value.args.left)
+	chld = pos;
+else if (! flattenJsonPathParseItem(buf, &chld, escontext,
+	item->value.args.left,
+	nestingLevel + argNestingLevel,
+	insideArraySubscript))
+	return false;
 *(int32 *) (buf->data + left) = chld - pos;
 
-chld = 

Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-23 Thread Magnus Hagander
On Fri, Dec 23, 2022 at 2:06 AM Michael Paquier  wrote:

> On Thu, Dec 22, 2022 at 05:03:35PM +0530, Bharath Rupireddy wrote:
> > On Thu, Dec 22, 2022 at 4:57 PM Michael Paquier 
> wrote:
> >> As in using "sequence number" removing "file" from the docs and
> >> changing the OUT parameter name to segment_number rather than segno?
> >> Fine by me.
> >
> > +1.
>
> Okay, done this way.
>
>
Thanks!

//Magnus


Re: Force streaming every change in logical decoding

2022-12-23 Thread Masahiko Sawada
On Fri, Dec 23, 2022 at 5:32 PM shiy.f...@fujitsu.com
 wrote:
>
> On Fri, Dec 23, 2022 1:50 PM Amit Kapila 
> >
> > On Thu, Dec 22, 2022 at 6:18 PM shiy.f...@fujitsu.com
> >  wrote:
> > >
> > >
> > > Besides, I tried to reduce data size in streaming subscription tap tests 
> > > by this
> > > new GUC (see 0002 patch). But I didn't convert all streaming tap tests
> > because I
> > > think we also need to cover the case that there are lots of changes. So, 
> > > 015*
> > is
> > > not modified. And 017* is not modified because streaming transactions and
> > > non-streaming transactions are tested alternately in this test.
> > >
> >
> > I think we can remove the newly added test from the patch and instead
> > combine the 0001 and 0002 patches. I think we should leave the
> > 022_twophase_cascade as it is because it can impact code coverage,
> > especially the below part of the test:
> > # 2PC PREPARE with a nested ROLLBACK TO SAVEPOINT
> > $node_A->safe_psql(
> > 'postgres', "
> > BEGIN;
> > INSERT INTO test_tab VALUES (, 'foobar');
> > SAVEPOINT sp_inner;
> > INSERT INTO test_tab SELECT i, md5(i::text) FROM
> > generate_series(3, 5000) s(i);
> >
> > Here, we will stream first time after the subtransaction, so can
> > impact the below part of the code in ReorderBufferStreamTXN:
> > if (txn->snapshot_now == NULL)
> > {
> > ...
> > dlist_foreach(subxact_i, &txn->subtxns)
> > {
> > ReorderBufferTXN *subtxn;
> >
> > subtxn = dlist_container(ReorderBufferTXN, node, subxact_i.cur);
> > ReorderBufferTransferSnapToParent(txn, subtxn);
> > }
> > ...
> >
>
> OK, I removed the modification in 022_twophase_cascade.pl and combined the two
> patches.

Thank you for updating the patch. The v6 patch looks good to me.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: pgsql: Doc: Explain about Column List feature.

2022-12-23 Thread Alvaro Herrera
On 2022-Dec-21, Peter Smith wrote:

> On Wed, Dec 21, 2022 at 6:59 PM Alvaro Herrera  
> wrote:
>
> > Oh, I see.  It's been so long that I haven't looked at the PDFs, that I
> > failed to realize that they don't use color.  I agree that would be a
> > problem.  Maybe we can change the title to have the word:
> >
> > Warning: Combining Column Lists from Multiple 
> > Publications
> 
> That last idea LGTM. But no patch at all LGTM also.

... I hear you, but honestly that warning box with a section title looks
completely wrong to me, so I've pushed it.  Many thanks for looking.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"But static content is just dynamic content that isn't moving!"
http://smylers.hates-software.com/2007/08/15/fe244d0c.html




Re: [RFC] building postgres with meson - v13

2022-12-23 Thread Justin Pryzby
> From 680ff3f7b4da1dbf21d0c7cd87af9bb5ee8b230c Mon Sep 17 00:00:00 2001
> From: Andres Freund 
> Date: Wed, 21 Sep 2022 20:36:36 -0700
> Subject: [PATCH v17 01/23] meson: ci: wip: move compilerwarnings task to meson

>    always:
>      gcc_warning_script: |
> -      time ./configure \
> -        --cache gcc.cache \
> -        --enable-dtrace \
> -        ${LINUX_CONFIGURE_FEATURES} \
> -        CC="ccache gcc" CXX="ccache g++" CLANG="ccache clang"
> -      make -s -j${BUILD_JOBS} clean
> -      time make -s -j${BUILD_JOBS} world-bin
> +      mkdir build && cd build
> +      CC="ccache gcc" CXX="ccache g++" \
> +        meson setup \
> +          -Dwerror=true \
> +          -Dcassert=false \
> +          -Ddtrace=enabled \
> +          ${LINUX_MESON_FEATURES} \
> +          ..
> +      time ninja -j${BUILD_JOBS}

With gcc, autoconf uses -O2, so I think this should specify
buildtype=debugoptimized, or pass -Doptimization=2.  Otherwise it ends
up in release mode with -O3.




Re: [BUG] pg_upgrade test fails from older versions.

2022-12-23 Thread Justin Pryzby
On Fri, Dec 23, 2022 at 05:51:28PM +0900, Michael Paquier wrote:
> On Thu, Dec 22, 2022 at 09:27:24PM -0600, Justin Pryzby wrote:
> > This would do a single seqscan:
> > SELECT format('ALTER TABLE %I ALTER COLUMN %I TYPE TEXT',
> > attrelid::regclass, attname) FROM pg_attribute WHERE
> > atttypid='aclitem'::regtype; -- AND ...
> > \gexec
> 
> FWIW, I find the use of a FOR loop with a DO block much cleaner to
> follow in this context, so something like the attached would be able
> to group the two queries and address your point on O(N^2).  Do you
> like that? 

LGTM.  Thanks.

-- 
Justin




RE: Time delayed LR (WAS Re: logical replication restrictions)

2022-12-23 Thread Takamichi Osumi (Fujitsu)
On Thursday, December 22, 2022 3:02 PM Takamichi Osumi (Fujitsu) 
 wrote:
> Attached the updated patch.
> Again, I used one basic patch in another thread to wake up logical replication
> worker shared in [2] for TAP tests.
The v11 patch caused a cfbot failure in [1], but the failed tests look
unrelated to this feature as far as I can tell.

While waiting for another cfbot run, I'll check the detailed reason
and update the patch if necessary.

[1] - https://cirrus-ci.com/task/4580705867399168


Best Regards,
Takamichi Osumi



Re: daitch_mokotoff module

2022-12-23 Thread Tom Lane
Alvaro Herrera  writes:
> On 2022-Dec-22, Dag Lem wrote:
>> This should hopefully fix the last Cfbot failures, by exclusion of
>> daitch_mokotoff.h from headerscheck and cpluspluscheck.

> Hmm, maybe it'd be better to move the typedefs to the .h file instead.

Indeed, that sounds like exactly the wrong way to fix such a problem.
The bar for excluding stuff from headerscheck needs to be very high.

regards, tom lane




Re: Error-safe user functions

2022-12-23 Thread Andrew Dunstan


On 2022-12-22 Th 01:10, Tom Lane wrote:
> Andrew Dunstan  writes:
>> And here's another for contrib/seg
>> I'm planning to commit these two in the next day or so.
> I didn't look at the jsonpath one yet.  The seg patch passes
> an eyeball check, with one minor nit: in seg_atof,
>
> + *result = float4in_internal(value, NULL, "real", value, escontext);
>
> don't we want to use "seg" as the type_name?
>
> Even more nitpicky, in
>
> +seg_yyerror(SEG *result, struct Node *escontext, const char *message)
>  {
> + if (SOFT_ERROR_OCCURRED(escontext))
> + return;
>
> I'd be inclined to add some explanation, say
>
> +seg_yyerror(SEG *result, struct Node *escontext, const char *message)
>  {
> + /* if we already reported an error, don't overwrite it */
> + if (SOFT_ERROR_OCCURRED(escontext))
> + return;
>
>   


Thanks for the review.


Fixed both of these and pushed.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: daitch_mokotoff module

2022-12-23 Thread Alvaro Herrera
On 2022-Dec-23, Alvaro Herrera wrote:

> I wonder why do you have it return the multiple alternative codes as a
> space-separated string.  Maybe an array would be more appropriate.  Even
> on your documented example use, the first thing you do is split it on
> spaces.

I tried downloading a list of surnames from here
https://www.bibliotecadenombres.com/apellidos/apellidos-espanoles/
pasted that in a text file and \copy'ed it into a table.  Then I ran
this query

select string_agg(a, ' ' order by a), daitch_mokotoff(a), count(*)
from apellidos
group by daitch_mokotoff(a)
order by count(*) desc;

so I have a first entry like this

string_agg  │ Balasco Balles Belasco Belles Blas Blasco Fallas Feliz Palos 
Pelaez Plaza Valles Vallez Velasco Velez Veliz Veloz Villas
daitch_mokotoff │ 784000
count   │ 18

but then I have a bunch of other entries with the same code 784000 as
alternative codes,

string_agg  │ Velazco
daitch_mokotoff │ 784500 784000
count   │ 1

string_agg  │ Palacio
daitch_mokotoff │ 785000 784000
count   │ 1

I suppose I need to group these together somehow, and it would make more
sense to do that if the values were arrays.


If I scroll a bit further down and choose, say, 794000 (a relatively
popular one), then I have this

string_agg  │ Barraza Barrios Barros Bras Ferraz Frias Frisco Parras Peraza 
Peres Perez Porras Varas Veras
daitch_mokotoff │ 794000
count   │ 14

and looking for that code in the result I also get these three

string_agg  │ Barca Barco Parco
daitch_mokotoff │ 795000 794000
count   │ 3

string_agg  │ Borja
daitch_mokotoff │ 790000 794000
count   │ 1

string_agg  │ Borjas
daitch_mokotoff │ 794000 794400
count   │ 1

and then I see that I should also search for possible matches in codes
795000, 790000 and 794400, so that gives me

string_agg  │ Baria Baro Barrio Barro Berra Borra Feria Para Parra Perea 
Vera
daitch_mokotoff │ 790000
count   │ 11

string_agg  │ Barriga Borge Borrego Burgo Fraga
daitch_mokotoff │ 795000
count   │ 5

string_agg  │ Borjas
daitch_mokotoff │ 794000 794400
count   │ 1

which look closely related (compare "Veras" in the first to "Vera" in
the latter set.  If you ignore that pseudo-match, you're likely to miss
possible family relationships.)


I suppose if I were a genealogy researcher, I would be helped by having
each of these codes behave as a separate unit, rather than me having to
split the string into the several possible contained values.
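
For comparison, the per-code grouping can be approximated today by
unnesting the split codes, e.g. (a sketch against the same apellidos
table as above):

    SELECT code, string_agg(a, ' ' ORDER BY a) AS names, count(*)
      FROM apellidos,
           unnest(string_to_array(daitch_mokotoff(a), ' ')) AS code
     GROUP BY code
     ORDER BY count(*) DESC;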

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Industry suffers from the managerial dogma that for the sake of stability
and continuity, the company should be independent of the competence of
individual employees."  (E. Dijkstra)




Re: daitch_mokotoff module

2022-12-23 Thread Alvaro Herrera
I wonder why do you have it return the multiple alternative codes as a
space-separated string.  Maybe an array would be more appropriate.  Even
on your documented example use, the first thing you do is split it on
spaces.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: Avoid lost result of recursion (src/backend/optimizer/util/inherit.c)

2022-12-23 Thread Ranier Vilela
On Fri, Dec 23, 2022 at 09:10, David Rowley wrote:

>
> I've now pushed your fix plus that test.
>
Thank you all for getting involved to resolve this.

regards,
Ranier Vilela


Re: appendBinaryStringInfo stuff

2022-12-23 Thread David Rowley
On Fri, 23 Dec 2022 at 22:04, Peter Eisentraut
 wrote:
>
> On 19.12.22 23:48, David Rowley wrote:
> > On Tue, 20 Dec 2022 at 11:42, Tom Lane  wrote:
> >> I think Peter is entirely right to question whether *this* type's
> >> output function is performance-critical.  Who's got large tables with
> >> jsonpath columns?  It seems to me the type would mostly only exist
> >> as constants within queries.
> >
> > The patch touches code in the path of jsonb's output function too. I
> > don't think you could claim the same for that.
>
> Ok, let's leave the jsonb output alone.  The jsonb output code also
> won't change a lot, but there is a bunch of stuff for jsonpath on the
> horizon, so having some more robust coding style to imitate there seems
> useful.  Here is another patch set with the jsonb changes omitted.

Maybe if there's concern that inlining appendStringInfoString is going
to bloat the binary too much, then how about we just invent an inlined
version of it using some other name that we can use when performance
matters?  We could then safely replace the offending
appendBinaryStringInfos from both places without any concern for
regressing performance.

FWIW, I just did a few compilation runs of our supported versions to
see how much postgres binary grew release to release:

branch          binary size (bytes)   growth (bytes)
REL_10_STABLE         8230232                0
REL_11_STABLE         8586024           355792
REL_12_STABLE         8831664           245640
REL_13_STABLE         8990824           159160
REL_14_STABLE         9484848           494024
REL_15_STABLE         9744680           259832
master                9977896           233216
inline_asis          10004032            26136

(inline_asis = master plus an inlined appendStringInfoString)

On the other hand, if we went with inlining the existing function,
then it looks to be about 10% of the growth we saw between v14 and
v15. That seems quite large.

David




Re: daitch_mokotoff module

2022-12-23 Thread Alvaro Herrera
On 2022-Dec-22, Dag Lem wrote:

> This should hopefully fix the last Cfbot failures, by exclusion of
> daitch_mokotoff.h from headerscheck and cpluspluscheck.

Hmm, maybe it'd be better to move the typedefs to the .h file instead.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Pensar que el espectro que vemos es ilusorio no lo despoja de espanto,
sólo le suma el nuevo terror de la locura" (Perelandra, C.S. Lewis)




RE: Exit walsender before confirming remote flush in logical replication

2022-12-23 Thread Hayato Kuroda (Fujitsu)
Dear Horiguchi-san,

> Thus how about before entering an apply_delay, logrep worker sending a
> kind of crafted feedback, which reports commit_data.end_lsn as
> flushpos?  A little tweak is needed in send_feedback() but seems to
> work..

Thanks for replying! I tested your suggestion, but it did not work well...

I made a PoC based on the latest time-delayed patches [1] for the
non-streaming case. An apply worker that is delaying its application of
changes sends begin_data.final_lsn as recvpos and flushpos in
send_feedback().

The following is the feedback message I got; recv and flush were
overwritten as intended.

```
DEBUG:  sending feedback (force 1) to recv 0/1553638, write 0/1553550, flush 
0/1553638
CONTEXT:  processing remote data for replication origin "pg_16390" during 
message type "BEGIN" in transaction 730, finished at 0/1553638
```

On the walsender side, however, sentPtr was slightly larger than the
flush position reported by the subscriber.

```
(gdb) p MyWalSnd->sentPtr 
$2 = 22361760
(gdb) p MyWalSnd->flush
$3 = 22361656
(gdb) p *MyWalSnd
$4 = {pid = 28807, state = WALSNDSTATE_STREAMING, sentPtr = 22361760, 
needreload = false, write = 22361656, 
  flush = 22361656, apply = 22361424, writeLag = 20020343, flushLag = 20020343, 
applyLag = 20020343, 
  sync_standby_priority = 0, mutex = 0 '\000', latch = 0x7ff0350cbb94, 
replyTime = 725113263592095}
```

Therefore I could not shut down the publisher node while the apply was
being delayed.
Do you have any thoughts on this?

```
$ pg_ctl stop -D data_pub/
waiting for server to shut down... failed
pg_ctl: server does not shut down
```

[1]: 
https://www.postgresql.org/message-id/tycpr01mb83730a3e21e921335f6efa38ed...@tycpr01mb8373.jpnprd01.prod.outlook.com
 

Best Regards,
Hayato Kuroda
FUJITSU LIMITED





Re: Avoid lost result of recursion (src/backend/optimizer/util/inherit.c)

2022-12-23 Thread David Rowley
On Fri, 23 Dec 2022 at 17:04, Richard Guo  wrote:
> Thanks for the test!  I looked at this and found that with WCO
> constraints we can also hit the buggy code.  Based on David's test case,
> I came up with the following in the morning.

I ended up going with a WCO option test in the end.  I wanted to steer
clear of having a test that has expected broken results from the
generated column code.  Also, I just couldn't help thinking the
generated column test felt like it was just glued on the end and not
really in the correct place in the file.

I've put the new WCO test in along with the existing one.  I also
considered modifying one of the existing tests to add another
partitioning level, but I ended up staying clear of that as I felt
like it caused a bit more churn than I wanted with an existing test.
The test I put together tests for the bug and also checks the WCO
works by not updating the row that's outside of the scope of the WCO
view and updating the row that is in the scope of the view.
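
A sketch of that shape (table and view names here are made up, not the
committed test):

    CREATE TABLE part_tab (a int, b int) PARTITION BY LIST (a);
    CREATE TABLE part_tab1 PARTITION OF part_tab FOR VALUES IN (1);
    CREATE TABLE part_tab2 PARTITION OF part_tab FOR VALUES IN (2);
    INSERT INTO part_tab VALUES (1, 1), (2, 2);

    CREATE VIEW part_view AS SELECT * FROM part_tab WHERE a = 1
        WITH CHECK OPTION;

    UPDATE part_view SET b = b + 1;  -- only the (1,1) row is visible/updated
    UPDATE part_view SET a = 2;      -- must fail the WITH CHECK OPTION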

I've now pushed your fix plus that test.

It feels a bit like famine to feast when it comes to tests for this bug today.

David




Re: cirrus scripts could use make -k

2022-12-23 Thread Andres Freund
Hi,

On 2022-12-15 12:31:26 +0100, Peter Eisentraut wrote:
> I would find it useful if the cirrus scripts used make -k (and ninja -k) to
> keep building everything in the presence of errors.  For example, I once had
> some issue in code that caused a bunch of compiler warnings on Windows.  The
> cirrus build would only show me the first one, then I had to fix, reupload,
> see the next one, etc.  Are there any drawbacks to using these options in
> this context?
> 

-1 - it makes it much harder to find the first error, to the point that
the error output gets big enough that CI task output is deleted more
quickly.

I can see -k 3 or something though.

Greetings,

Andres Freund




Re: daitch_mokotoff module

2022-12-23 Thread Andres Freund
On 2022-12-22 14:27:54 +0100, Dag Lem wrote:
> This should hopefully fix the last Cfbot failures, by exclusion of
> daitch_mokotoff.h from headerscheck and cpluspluscheck.

Btw, you can do the same tests as cfbot in your own repo by enabling CI
in a github repo. See src/tools/ci/README




Exposing the lock manager's WaitForLockers() to SQL

2022-12-23 Thread Will Mortensen
Hi there,

We'd like to be able to call the lock manager's WaitForLockers() and
WaitForLockersMultiple() from SQL. Below I describe our use case, but
basically I'm wondering if this:

1. Seems like a reasonable thing to do

2. Would be of interest upstream

3. Should be done with a new pg_foo() function (taking an
   oid?), or a whole new SQL command, or something else

If this sounds promising, we may be able to code this up and submit it.

The rest of this email describes our use cases and related observations:


 Use Case Background 

Our use case is inspired by this blog post by Marco Slot (CC'ed) at
Citus Data: 
https://www.citusdata.com/blog/2018/06/14/scalable-incremental-data-aggregation/
. This describes a scheme for correctly aggregating rows given minimal
coordination with an arbitrary number of writers while keeping minimal
additional state. It relies on two simple facts:

1. INSERT/UPDATE take their ROW EXCLUSIVE lock on the target
   table before evaluating any column DEFAULT expressions,
   and thus before e.g. calling nextval() on a sequence in
   the DEFAULT expression. And of course, this lock is only
   released when the transaction commits or rolls back.

2. pg_sequence_last_value() (still undocumented!) can be
   used to obtain an instantaneous upper bound on the
   sequence values that have been returned by nextval(), even
   if the transaction that called nextval() hasn't yet
   committed.

So, assume we have a table:

create table tbl (
id bigserial,
data text
);

which is only ever modified by INSERTs that use DEFAULT for id. Then,
a client can process each row exactly once using a loop like this
(excuse the pseudo-SQL):

min_id := 0;
while true:
max_id := pg_sequence_last_value('tbl_id_seq');
wait_for_writers('tbl'::regclass);
SELECT
some_aggregation(data)
FROM tbl
WHERE id > min_id AND id <= max_id;
min_id := max_id;

In the blog post, the equivalent of wait_for_writers() is implemented
by taking and immediately releasing a SHARE ROW EXCLUSIVE lock on tbl.
It's unclear why this can't be SHARE, since it just needs to conflict
with INSERT's ROW EXCLUSIVE, but in any case it's sufficient for
correctness.
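
In SQL, that wait step is just a momentary lock acquisition, e.g.:

    BEGIN;
    -- blocks until every transaction already holding ROW EXCLUSIVE
    -- (i.e. an in-flight INSERT) has committed or rolled back
    LOCK TABLE tbl IN SHARE ROW EXCLUSIVE MODE;
    COMMIT;  -- release immediately; we only needed the wait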

(Note that this version only works if the rows committed by the
transactions that it waited for are actually visible to the SELECT, so
for example, the whole thing can't be within a Repeatable Read or
Serializable transaction.)


 Why WaitForLockers()? 

No new writer can acquire a ROW EXCLUSIVE lock as long as we're
waiting to obtain the SHARE lock, even if we only hold it for an
instant. If we have to wait a long time, because some existing writer
holds its ROW EXCLUSIVE lock for a long time, this could noticeably
reduce overall writer throughput.

But we don't actually need to obtain a lock at all--and waiting for
transactions that already hold conflicting locks is exactly what
WaitForLockers() / WaitForLockersMultiple() does. Using it instead
would prevent any interference with writers.


 Appendix: Extensions and Observations 

Aside from downgrading to SHARE mode and merely waiting instead of
locking, we propose a couple other extensions and observations related
to Citus' scheme. These only tangentially motivate our need for
WaitForLockers(), so you may stop reading here unless the overall
scheme is of interest.

== Separate client for reading sequences and waiting ==

First, in our use case each batch of rows might require extensive
processing as part of a larger operation that doesn't want to block
waiting for writers to commit. A simple extension is to separate the
processing from the determination of sequence values. In other words,
have a single client that sits in a loop:

while true:
seq_val := pg_sequence_last_value('tbl_id_seq');
WaitForLockers('tbl'::regclass, 'SHARE');
publish(seq_val);

and any number of other clients that use the series of published
sequence values to do their own independent processing (maintaining
their own additional state).

This can be extended to multiple tables with WaitForLockersMultiple():

while true:
seq_val1 := pg_sequence_last_value('tbl1_id_seq');
seq_val2 := pg_sequence_last_value('tbl2_id_seq');
WaitForLockersMultiple(
ARRAY['tbl1', 'tbl2']::regclass[], 'SHARE');
publish('tbl1', seq_val1);
publish('tbl2', seq_val2);

Which is clearly more efficient than locking or waiting for the tables
in sequence, hence the desire for that function as well.

== Latency ==

This brings us to a series of observations about latency. If some
writers take a long time to commit, some already-committed rows might
not be processed for a long time. To avoid exacerbating this when
using WaitForLockersMultiple(), which obviously has to wait for the
last writer of any specified table, it should be 

Re: Improve WALRead() to suck data directly from WAL buffers when possible

2022-12-23 Thread Bharath Rupireddy
On Mon, Dec 12, 2022 at 8:27 AM Kyotaro Horiguchi
 wrote:
>

Thanks for providing thoughts.

> At Fri, 9 Dec 2022 14:33:39 +0530, Bharath Rupireddy 
>  wrote in
> > The patch introduces concurrent readers for the WAL buffers, so far
> > only there are concurrent writers. In the patch, WALRead() takes just
> > one lock (WALBufMappingLock) in shared mode to enable concurrent
> > readers and does minimal things - checks if the requested WAL page is
> > present in WAL buffers, if so, copies the page and releases the lock.
> > I think taking just WALBufMappingLock is enough here as the concurrent
> > writers depend on it to initialize and replace a page in WAL buffers.
> >
> > I'll add this to the next commitfest.
> >
> > Thoughts?
>
> This adds copying of the whole page (at least) at every WAL *record*
> read,

In the worst case, yes, but that isn't always true. On a typical
production server with decent write traffic, the callers of WALRead()
read a full WAL page of XLOG_BLCKSZ bytes, or up to MAX_SEND_SIZE
bytes, at a time.

> fighting all WAL writers by taking WALBufMappingLock on a very
> busy page while copying. I'm a bit doubtful that it results in an
> overall improvement.

Well, the tests don't reflect that [1]; I've run an insert workload
[2]. The WAL is read from WAL buffers 99% of the time, which is
pretty cool. If you have any use cases in mind, please share them
and/or feel free to run the tests at your end.

> It seems to me almost all pread()s here happen on the file buffer,
> so it is unclear to me that copying a whole WAL page (then copying
> the target record again) wins over a pread() call that copies only
> the record to be read.

That's not always guaranteed. Imagine a typical production server with
decent write traffic and heavy analytical queries (which fill the OS
page cache with the table pages those queries access); there, the WAL
pread() calls turn into IOPS. Despite the WAL being present in WAL
buffers, customers end up paying unnecessarily for those IOPS. With the
patch, we avoid the pread() system calls that may turn into IOPS on
production servers (99% of the time for the insert use case [1][2], 95%
of the time for the pgbench use case mentioned upthread). With the
patch, the WAL buffers can act as an L1 cache, with the OS page cache
as an L2 cache (this illustration is, of course, unrelated to the
typical processor L1 and L2 caches).
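
To sketch the mechanism (a simplification, not the patch itself;
XLogCtl, xlblocks, XLogRecPtrToBufIdx and WALBufMappingLock are the
existing xlog.c internals it builds on, and the request is assumed to
fit within one page):

static bool
ReadFromWALBuffers(char *dst, XLogRecPtr startptr, Size count)
{
    int         idx = XLogRecPtrToBufIdx(startptr);
    XLogRecPtr  expectedEndPtr;
    bool        hit = false;

    /* end LSN that the buffer slot must hold to contain our page */
    expectedEndPtr = startptr - startptr % XLOG_BLCKSZ + XLOG_BLCKSZ;

    LWLockAcquire(WALBufMappingLock, LW_SHARED);
    if (XLogCtl->xlblocks[idx] == expectedEndPtr)
    {
        char       *page = XLogCtl->pages + idx * (Size) XLOG_BLCKSZ;

        memcpy(dst, page + startptr % XLOG_BLCKSZ, count);
        hit = true;
    }
    LWLockRelease(WALBufMappingLock);

    return hit;                 /* false: fall back to pread() from file */
}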

> Do you have an actual number for how frequently WAL reads go to
> disk, or the actual number for the performance gain or real I/O
> reduction this patch offers?

It might be a bit tough to generate such heavy traffic. An idea is to
ensure the WAL page/file goes out of the OS page cache before
WALRead() - these might help here - 0002 patch from
https://www.postgresql.org/message-id/CA%2BhUKGLmeyrDcUYAty90V_YTcoo5kAFfQjRQ-_1joS_%3DX7HztA%40mail.gmail.com
and tool https://github.com/klando/pgfincore.

> This patch copies the bleeding-edge WAL page without recording the
> (next) insertion point nor checking whether all in-progress insertions
> behind the target LSN have finished. Thus the copied page may have
> holes.  That being said, the sequential-reading nature and the fact
> that WAL buffers are zero-initialized may make it work for recovery,
> but I don't think this also works for replication.

WALRead() callers are smart enough to consume only the flushed bytes:
although they read the whole WAL page, they compute how many bytes are
valid.
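
For instance, the caller-side cap looks roughly like this (a sketch of
what the physical walsender effectively does before sending):

XLogRecPtr  flushptr = GetFlushRecPtr(NULL);

if (startptr + count > flushptr)
    count = flushptr - startptr;    /* never hand out unflushed bytes */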

> I remember that one of the advantages of reading the in-memory WAL
> records is that it allows walsender to pre-send the unwritten
> records. So perhaps we should manage how far the buffer is filled with
> valid content (or how far we can pre-send) in this feature.

Yes, the non-flushed WAL can be read and sent across, if one wishes, to
make replication faster by letting the primary and standbys flush in
parallel, at the cost of a bit of extra crash handling; that's
mentioned here 
https://www.postgresql.org/message-id/CALj2ACXCSM%2BsTR%3D5NNRtmSQr3g1Vnr-yR91azzkZCaCJ7u4d4w%40mail.gmail.com.
However, that can be a separate discussion.

I also want to reiterate that the patch implemented a TODO item:

 * XXX probably this should be improved to suck data directly from the
 * WAL buffers when possible.
 */
bool
WALRead(XLogReaderState *state,

[1]
PATCHED (clients / TPS):
1 1470.329907
2 1437.096329
4 2966.096948
8 5978.441040
16 11405.538255
32 22933.546058
64 43341.870038
128 73623.837068
256 104754.248661
512 115746.359530
768 106106.691455
1024 91900.079086
2048 84134.278589
4096 62580.875507

-[ RECORD 1 ]--+---
application_name   | assb1
sent_lsn   | 0/1B8106A8
write_lsn  | 0/1B8106A8
flush_lsn  | 0/1B8106A8
replay_lsn | 0/1B8106A8
write_lag  |
flush_lag  |
replay_lag |
wal_read   | 104
wal_read_bytes | 10733008
wal_read_time  | 1.845
wal_read_buffers   | 76662
wal_read_bytes_buffers | 383598808

Re: Force streaming every change in logical decoding

2022-12-23 Thread shveta malik
On Fri, Dec 23, 2022 at 2:03 PM shiy.f...@fujitsu.com 
wrote:
>
> On Fri, Dec 23, 2022 1:50 PM Amit Kapila 
> >
> > On Thu, Dec 22, 2022 at 6:18 PM shiy.f...@fujitsu.com
> >  wrote:
> > >
> > >
> > > Besides, I tried to reduce data size in streaming subscription TAP
> > > tests by this new GUC (see 0002 patch). But I didn't convert all
> > > streaming TAP tests because I think we also need to cover the case
> > > that there are lots of changes. So, 015* is not modified. And 017*
> > > is not modified because streaming transactions and non-streaming
> > > transactions are tested alternately in this test.
> > >
> >
> > I think we can remove the newly added test from the patch and instead
> > combine the 0001 and 0002 patches. I think we should leave the
> > 022_twophase_cascade as it is because it can impact code coverage,
> > especially the below part of the test:
> > # 2PC PREPARE with a nested ROLLBACK TO SAVEPOINT
> > $node_A->safe_psql(
> > 'postgres', "
> > BEGIN;
> > INSERT INTO test_tab VALUES (, 'foobar');
> > SAVEPOINT sp_inner;
> > INSERT INTO test_tab SELECT i, md5(i::text) FROM
> > generate_series(3, 5000) s(i);
> >
> > Here, we will stream first time after the subtransaction, so can
> > impact the below part of the code in ReorderBufferStreamTXN:
> > if (txn->snapshot_now == NULL)
> > {
> > ...
> > dlist_foreach(subxact_i, &txn->subtxns)
> > {
> > ReorderBufferTXN *subtxn;
> >
> > subtxn = dlist_container(ReorderBufferTXN, node, subxact_i.cur);
> > ReorderBufferTransferSnapToParent(txn, subtxn);
> > }
> > ...
> >
>
> OK, I removed the modification in 022_twophase_cascade.pl and combine the
two patches.
>
> Please see the attached patch.
> I also fixed Kuroda-san's comments[1].
>
> [1]
https://www.postgresql.org/message-id/TYAPR01MB5866CD99CF86EAC84119BC91F5E99%40TYAPR01MB5866.jpnprd01.prod.outlook.com
>
> Regards,
> Shi yu

Hello,
I ran tests (4 runs) on both versions (v5 and v6) in release mode, and
the data looks promising; the run time is reduced now:

HEAD          V5            Delta (sec)
2:20.535307   2:15.865241   4.670066
2:19.220917   2:14.445312   4.775605
2:22.492128   2:17.35755    5.134578
2:20.737309   2:15.564306   5.173003

HEAD          V6            Delta (sec)
2:20.535307   2:15.363567   5.17174
2:19.220917   2:15.079082   4.141835
2:22.492128   2:16.244139   6.247989
2:20.737309   2:16.108033   4.629276

thanks
Shveta


Re: [BUG] pg_upgrade test fails from older versions.

2022-12-23 Thread Anton A. Melnikov

Sorry, I didn't get to see the last message before replying!

On 23.12.2022 11:51, Michael Paquier wrote:


FWIW, I find the use of a FOR loop with a DO block much cleaner to
follow in this context, so something like the attached would be able
to group the two queries and address your point on O(N^2).  Do you
like that?
--
Michael


The query:

 DO $$
   DECLARE
 rec record;
   BEGIN
   FOR rec in
SELECT oid::regclass::text as rel, attname as col
 FROM pg_class c, pg_attribute a
 WHERE c.relname !~ '^pg_'
  AND c.relkind IN ('r')
   AND a.attrelid = c.oid
   AND a.atttypid = 'aclitem'::regtype
 ORDER BY 1
   LOOP
EXECUTE 'ALTER TABLE ' || quote_ident(rec.rel) || ' ALTER COLUMN ' ||
   quote_ident(rec.col) || ' SET DATA TYPE text';
   END LOOP;
   END; $$;

gives an average time of 36 ms under the same conditions.


With the best wishes!

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [BUG] pg_upgrade test fails from older versions.

2022-12-23 Thread Anton A. Melnikov

Hello!

On 23.12.2022 06:27, Justin Pryzby wrote:


This would do a single seqscan:
SELECT format('ALTER TABLE %I ALTER COLUMN %I TYPE TEXT', attrelid::regclass, 
attname) FROM pg_attribute WHERE atttypid='aclitem'::regtype; -- AND ...
\gexec



I measured how long the different variants of the query take to execute
on my PC.
For each measurement, the server was restarted with a freshly copied
regression database.
1)
DO $$
DECLARE
change_aclitem_type TEXT;
BEGIN
FOR change_aclitem_type IN
SELECT 'ALTER TABLE ' || table_schema || '.' ||
table_name || ' ALTER COLUMN ' ||
column_name || ' SET DATA TYPE text;'
AS change_aclitem_type
FROM information_schema.columns
WHERE data_type = 'aclitem' and table_schema != 'pg_catalog'
LOOP
EXECUTE change_aclitem_type;
END LOOP;
END;
$$;

2)
DO $$
  DECLARE
rec text;
col text;
  BEGIN
  FOR rec in
SELECT oid::regclass::text
FROM pg_class
WHERE relname !~ '^pg_'
  AND relkind IN ('r')
ORDER BY 1
  LOOP
FOR col in SELECT attname FROM pg_attribute
  WHERE attrelid::regclass::text = rec
  AND atttypid = 'aclitem'::regtype
LOOP
  EXECUTE 'ALTER TABLE ' || quote_ident(rec) || ' ALTER COLUMN ' ||
quote_ident(col) || ' SET DATA TYPE text';
END LOOP;
  END LOOP;
 END; $$;

3)
SELECT format('ALTER TABLE %I ALTER COLUMN %I TYPE TEXT', attrelid::regclass, 
attname) FROM pg_attribute WHERE atttypid='aclitem'::regtype;
\gexec

4) The same as 3) but in the DO block
DO $$
DECLARE
change_aclitem_type TEXT;
BEGIN
FOR change_aclitem_type IN
SELECT 'ALTER TABLE ' || attrelid::regclass || ' ALTER COLUMN ' ||
attname || ' TYPE TEXT;'
AS change_aclitem_type
FROM pg_attribute
WHERE atttypid = 'aclitem'::regtype
LOOP
EXECUTE change_aclitem_type;
END LOOP;
END;
$$;

Average execution time over three runs:

| N of query   |  1 |    2 |  3 |  4 |
|--------------|----|------|----|----|
| Avg time, ms | 58 | 1076 | 51 | 33 |

Raw results in timing.txt

Best wishes,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
1) Time: 53,699 ms
Time: 60,146 ms
Time: 60,594 ms

Avg:  58,1 ms

2) Time: 1020,832 ms
Time: 1061,554 ms (00:01,062)
Time: 1148,029 ms (00:01,148)

Avg: 1076 ms

3)  Time: 20,972 ms
regression=# \gexec
ALTER TABLE
Time: 12,601 ms
Time: 3,106 ms
sum = 36,67

Time: 22,087 ms
regression=# \gexec
ALTER TABLE
Time: 40,768 ms
Time: 3,154 ms
sum = 66,01

Time: 13,865 ms
regression=# \gexec
ALTER TABLE
Time: 34,619 ms
Time: 3,063 ms
sum = 51,55

Avg: 51,4 ms

4) Time: 25,518 ms
Time: 35,746 ms
Time: 39,232 ms

Avg: 33,4 ms


Re: Timeout when changes are filtered out by the core during logical replication

2022-12-23 Thread Amit Kapila
On Thu, Dec 22, 2022 at 6:58 PM Ashutosh Bapat
 wrote:
>
> Hi All,
> A customer ran a script dropping a few dozens of users in a transaction. 
> Before dropping a user they change the ownership of the tables owned by that 
> user to another user and revoking all the accesses from that user in the same 
> transaction. There were a few thousand tables whose privileges and ownership 
> was changed by this transaction. Since all of these changes were in catalog 
> table, those changes were filtered out in ReorderBufferProcessTXN()
> by the following code
>if (!RelationIsLogicallyLogged(relation))
> goto change_done;
>
> I tried to reproduce a similar situation through the attached TAP test. For 
> 500 users and 1000 tables, we see that the transaction takes significant time 
> but logical decoding does not take much time. So with the default 1 min WAL 
> sender and receiver timeout I could not reproduce the timeout. Beyond that 
> our TAP test itself times out.
>
> But I think there's a possibility that the logical receiver will time out 
> this way when decoding a sufficiently large transaction which takes more than 
> the timeout amount of time to decode. So I think we need to call 
> OutputPluginUpdateProgress() after a regular interval (in terms of time or 
> number of changes) to consume any feedback from the subscriber or send a 
> keep-alive message.
>

I don't think it will be a good idea to directly call
OutputPluginUpdateProgress() from reorderbuffer.c. There is already a
patch to discuss this problem [1].

> Following commit
> ```
> commit 87c1dd246af8ace926645900f02886905b889718
> Author: Amit Kapila 
> Date:   Wed May 11 10:12:23 2022 +0530
>
> Fix the logical replication timeout during large transactions.
>
>  ```
> fixed a similar problem when the changes were filtered by an output plugin, 
> but in this case the changes are not being handed over to the output plugin 
> at all. If we fix it in the core we may not need to handle it in the output 
> plugin as that commit does. The commit does not have a test case which I 
> could run to reproduce the timeout.
>

It is not evident how to write a stable test for this because it is
tricky to estimate how many changes are enough for the configured
wal_receiver_timeout to expire on all the buildfarm machines. If you
have good ideas then feel free to propose a test patch.

[1] - 
https://www.postgresql.org/message-id/OS3PR01MB62751A8063A9A75A096000D89E3F9%40OS3PR01MB6275.jpnprd01.prod.outlook.com

-- 
With Regards,
Amit Kapila.




RE: Force streaming every change in logical decoding

2022-12-23 Thread Hayato Kuroda (Fujitsu)
Dear Shi,

> I also fixed Kuroda-san's comments[1].

Thanks for updating the patch! I confirmed that my comments were
addressed and that your patch passes the subscription and test_decoding
tests. I think we can change the status to "Ready for Committer".

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



Re: appendBinaryStringInfo stuff

2022-12-23 Thread Peter Eisentraut

On 19.12.22 23:48, David Rowley wrote:

On Tue, 20 Dec 2022 at 11:42, Tom Lane  wrote:

I think Peter is entirely right to question whether *this* type's
output function is performance-critical.  Who's got large tables with
jsonpath columns?  It seems to me the type would mostly only exist
as constants within queries.


The patch touches code in the path of jsonb's output function too. I
don't think you could claim the same for that.


Ok, let's leave the jsonb output alone.  The jsonb output code also 
won't change a lot, but there is a bunch of stuff for jsonpath on the 
horizon, so having some more robust coding style to imitate there seems 
useful.  Here is another patch set with the jsonb changes omitted.
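
The fragility in question is simply that the byte count must be kept in
sync with the literal by hand; a made-up two-line example (not code
from the patch):

appendBinaryStringInfo(out, "strict ", 7);  /* silently wrong if the literal is edited */
appendStringInfoString(out, "strict ");     /* strlen() keeps text and length in sync */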
From 319c206bec80d326808a211f9edd50346edbeb8c Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 19 Dec 2022 07:01:27 +0100
Subject: [PATCH v2 1/2] Use appendStringInfoString instead of
 appendBinaryStringInfo where possible

For the jsonpath output, we don't need to squeeze out every bit of
performance, so instead use a more robust coding style.  There are
similar calls in jsonb.c, which we leave alone here since there is
indeed a performance impact for bulk exports.

Discussion: 
https://www.postgresql.org/message-id/flat/a0086cfc-ff0f-2827-20fe-52b591d2666c%40enterprisedb.com
---
 src/backend/utils/adt/jsonpath.c | 42 
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/src/backend/utils/adt/jsonpath.c b/src/backend/utils/adt/jsonpath.c
index 91af030095..a39eab9c20 100644
--- a/src/backend/utils/adt/jsonpath.c
+++ b/src/backend/utils/adt/jsonpath.c
@@ -213,7 +213,7 @@ jsonPathToCstring(StringInfo out, JsonPath *in, int estimated_len)
enlargeStringInfo(out, estimated_len);
 
if (!(in->header & JSONPATH_LAX))
-   appendBinaryStringInfo(out, "strict ", 7);
+   appendStringInfoString(out, "strict ");
 
 	jspInit(&v, in);
 	printJsonPathItem(out, &v, false, true);
@@ -510,9 +510,9 @@ printJsonPathItem(StringInfo buf, JsonPathItem *v, bool inKey,
break;
case jpiBool:
if (jspGetBool(v))
-   appendBinaryStringInfo(buf, "true", 4);
+   appendStringInfoString(buf, "true");
else
-   appendBinaryStringInfo(buf, "false", 5);
+   appendStringInfoString(buf, "false");
break;
case jpiAnd:
case jpiOr:
@@ -553,13 +553,13 @@ printJsonPathItem(StringInfo buf, JsonPathItem *v, bool inKey,
 							  operationPriority(elem.type) <=
 							  operationPriority(v->type));
 
-   appendBinaryStringInfo(buf, " like_regex ", 12);
+   appendStringInfoString(buf, " like_regex ");
 
escape_json(buf, v->content.like_regex.pattern);
 
if (v->content.like_regex.flags)
{
-   appendBinaryStringInfo(buf, " flag \"", 7);
+   appendStringInfoString(buf, " flag \"");
 
 				if (v->content.like_regex.flags & JSP_REGEX_ICASE)
appendStringInfoChar(buf, 'i');
@@ -591,13 +591,13 @@ printJsonPathItem(StringInfo buf, JsonPathItem *v, bool inKey,
appendStringInfoChar(buf, ')');
break;
case jpiFilter:
-   appendBinaryStringInfo(buf, "?(", 2);
+   appendStringInfoString(buf, "?(");
 			jspGetArg(v, &elem);
 			printJsonPathItem(buf, &elem, false, false);
appendStringInfoChar(buf, ')');
break;
case jpiNot:
-   appendBinaryStringInfo(buf, "!(", 2);
+   appendStringInfoString(buf, "!(");
 			jspGetArg(v, &elem);
 			printJsonPathItem(buf, &elem, false, false);
appendStringInfoChar(buf, ')');
@@ -606,10 +606,10 @@ printJsonPathItem(StringInfo buf, JsonPathItem *v, bool inKey,
 			appendStringInfoChar(buf, '(');
 			jspGetArg(v, &elem);
 			printJsonPathItem(buf, &elem, false, false);
-   appendBinaryStringInfo(buf, ") is unknown", 12);
+   appendStringInfoString(buf, ") is unknown");
break;
case jpiExists:
-   appendBinaryStringInfo(buf, "exists (", 8);
+   appendStringInfoString(buf, "exists (");
 			jspGetArg(v, &elem);
 			printJsonPathItem(buf, &elem, false, false);
  

Re: [BUG] pg_upgrade test fails from older versions.

2022-12-23 Thread Michael Paquier
On Thu, Dec 22, 2022 at 09:27:24PM -0600, Justin Pryzby wrote:
> This would do a single seqscan:
> SELECT format('ALTER TABLE %I ALTER COLUMN %I TYPE TEXT',
> attrelid::regclass, attname) FROM pg_attribute WHERE
> atttypid='aclitem'::regtype; -- AND ...
> \gexec

FWIW, I find the use of a FOR loop with a DO block much cleaner to
follow in this context, so something like the attached would be able
to group the two queries and address your point on O(N^2).  Do you
like that? 
--
Michael


tweak_upgrade_query.sql
Description: application/sql


signature.asc
Description: PGP signature


Re: ARRNELEMS Out-of-bounds possible errors

2022-12-23 Thread Kyotaro Horiguchi
At Fri, 23 Dec 2022 17:37:55 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> At Thu, 22 Dec 2022 12:35:58 -0300, Ranier Vilela  wrote 
> in 
> > Hi.
> > 
> > Per Coverity.
> > 
> > The commit ccff2d2
> > ,
> > changed the behavior function ArrayGetNItems,
> > with the introduction of the function ArrayGetNItemsSafe.
> > 
> > Now ArrayGetNItems may return -1, according to the comment.
> > " instead of throwing an exception. -1 is returned after an error."
> 
> If I'm reading the code correctly, it's the definition of
> ArrayGetNItems*Safe*. ArrayGetNItems() calls that function with a NULL
> escontext and the NULL turns ereturn() into ereport().

> That doesn't seem to be changed by the commit.

No.. It seems to me that the commit didn't change its behavior in that
regard.

> Of course, teaching Coverity not to issue these false warnings is
> another thing we could do, maybe.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: ARRNELEMS Out-of-bounds possible errors

2022-12-23 Thread Kyotaro Horiguchi
At Thu, 22 Dec 2022 12:35:58 -0300, Ranier Vilela  wrote 
in 
> Hi.
> 
> Per Coverity.
> 
> The commit ccff2d2
> ,
> changed the behavior function ArrayGetNItems,
> with the introduction of the function ArrayGetNItemsSafe.
> 
> Now ArrayGetNItems may return -1, according to the comment.
> " instead of throwing an exception. -1 is returned after an error."

If I'm reading the code correctly, it's the definition of
ArrayGetNItems*Safe*. ArrayGetNItems() calls that function with a NULL
escontext and the NULL turns ereturn() into ereport(). That doesn't
seem to be changed by the commit.
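
For reference, the relevant shape of the code (paraphrased from
arrayutils.c) is:

int
ArrayGetNItems(int ndim, const int *dims)
{
	return ArrayGetNItemsSafe(ndim, dims, NULL);
}

/*
 * Inside ArrayGetNItemsSafe(), failures are raised via
 *     ereturn(escontext, -1, ...);
 * and with a NULL escontext that degrades to ereport(ERROR), which does
 * not return. So ArrayGetNItems() itself can never hand -1 to a caller.
 */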

Of course, teaching Coverity not to issue these false warnings is
another thing we could do, maybe.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: File API cleanup

2022-12-23 Thread Peter Eisentraut

On 01.12.22 09:25, Peter Eisentraut wrote:
Here are a couple of patches that clean up the internal File API and 
related things a bit:


Here are two follow-up patches that clean up some stuff related to the 
earlier patch set.  I suspect these are all historically related.


0001-Remove-unnecessary-casts.patch

Some code carefully cast all data buffer arguments for data write
and read function calls to void *, even though the respective
arguments are already void *.  Remove this unnecessary clutter.

0002-Add-const-to-BufFileWrite.patch

Make data buffer argument to BufFileWrite a const pointer and bubble
this up to various callers and related APIs.  This makes the APIs
clearer and more consistent.
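
As a small usage sketch of why the const matters (an illustrative
caller, not code from the patch):

static void
write_magic(BufFile *file)
{
	static const char magic[] = "PGv1";	/* read-only data */

	/* with a const-qualified ptr parameter, no cast is needed here */
	BufFileWrite(file, magic, sizeof(magic));
}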
From cec7ec188f2473d983b048b9e1eddc7dbc3a4241 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 23 Dec 2022 08:35:42 +0100
Subject: [PATCH 1/2] Remove unnecessary casts

Some code carefully cast all data buffer arguments for data write and
read function calls to void *, even though the respective arguments
are already void *.  Remove this unnecessary clutter.
---
 src/backend/executor/nodeAgg.c |  6 +++---
 src/backend/storage/file/buffile.c |  4 ++--
 src/backend/utils/sort/logtape.c   | 18 +-
 src/backend/utils/sort/tuplesort.c |  2 +-
 src/backend/utils/sort/tuplesortvariants.c | 16 
 5 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index 30c9143183..f15bb83a1a 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -2961,10 +2961,10 @@ hashagg_spill_tuple(AggState *aggstate, HashAggSpill *spill,
 
tape = spill->partitions[partition];
 
-	LogicalTapeWrite(tape, (void *) &hash, sizeof(uint32));
+	LogicalTapeWrite(tape, &hash, sizeof(uint32));
total_written += sizeof(uint32);
 
-   LogicalTapeWrite(tape, (void *) tuple, tuple->t_len);
+   LogicalTapeWrite(tape, tuple, tuple->t_len);
total_written += tuple->t_len;
 
if (shouldFree)
@@ -3029,7 +3029,7 @@ hashagg_batch_read(HashAggBatch *batch, uint32 *hashp)
tuple->t_len = t_len;
 
 		nread = LogicalTapeRead(tape,
-								(void *) ((char *) tuple + sizeof(uint32)),
+								(char *) tuple + sizeof(uint32),
 								t_len - sizeof(uint32));
if (nread != t_len - sizeof(uint32))
ereport(ERROR,
diff --git a/src/backend/storage/file/buffile.c b/src/backend/storage/file/buffile.c
index b0b4eeb3bd..b07cca28d6 100644
--- a/src/backend/storage/file/buffile.c
+++ b/src/backend/storage/file/buffile.c
@@ -607,7 +607,7 @@ BufFileRead(BufFile *file, void *ptr, size_t size)
memcpy(ptr, file->buffer.data + file->pos, nthistime);
 
file->pos += nthistime;
-   ptr = (void *) ((char *) ptr + nthistime);
+   ptr = (char *) ptr + nthistime;
size -= nthistime;
nread += nthistime;
}
@@ -655,7 +655,7 @@ BufFileWrite(BufFile *file, void *ptr, size_t size)
file->pos += nthistime;
if (file->nbytes < file->pos)
file->nbytes = file->pos;
-   ptr = (void *) ((char *) ptr + nthistime);
+   ptr = (char *) ptr + nthistime;
size -= nthistime;
}
 }
diff --git a/src/backend/utils/sort/logtape.c b/src/backend/utils/sort/logtape.c
index c384f98e13..9db220b7ea 100644
--- a/src/backend/utils/sort/logtape.c
+++ b/src/backend/utils/sort/logtape.c
@@ -319,7 +319,7 @@ ltsReadFillBuffer(LogicalTape *lt)
datablocknum += lt->offsetBlockNumber;
 
/* Read the block */
-   ltsReadBlock(lt->tapeSet, datablocknum, (void *) thisbuf);
+   ltsReadBlock(lt->tapeSet, datablocknum, thisbuf);
if (!lt->frozen)
ltsReleaseBlock(lt->tapeSet, datablocknum);
lt->curBlockNumber = lt->nextBlockNumber;
@@ -806,7 +806,7 @@ LogicalTapeWrite(LogicalTape *lt, void *ptr, size_t size)
 
/* set the next-pointer and dump the current block. */
TapeBlockGetTrailer(lt->buffer)->next = nextBlockNumber;
-			ltsWriteBlock(lt->tapeSet, lt->curBlockNumber, (void *) lt->buffer);
+			ltsWriteBlock(lt->tapeSet, lt->curBlockNumber, lt->buffer);
 
/* initialize the prev-pointer of the next block */
 			TapeBlockGetTrailer(lt->buffer)->prev = lt->curBlockNumber;
@@ -826,7 +826,7 @@ LogicalTapeWrite(LogicalTape *lt, void *ptr, size_t size)
lt->pos += nthistime;
if (lt->nbytes < lt->pos)
lt->nbytes = lt->pos;
-   ptr = 

RE: Force streaming every change in logical decoding

2022-12-23 Thread shiy.f...@fujitsu.com
On Fri, Dec 23, 2022 1:50 PM Amit Kapila 
> 
> On Thu, Dec 22, 2022 at 6:18 PM shiy.f...@fujitsu.com
>  wrote:
> >
> >
> > Besides, I tried to reduce data size in streaming subscription TAP
> > tests by this new GUC (see 0002 patch). But I didn't convert all
> > streaming TAP tests because I think we also need to cover the case
> > that there are lots of changes. So, 015* is not modified. And 017* is
> > not modified because streaming transactions and non-streaming
> > transactions are tested alternately in this test.
> >
> 
> I think we can remove the newly added test from the patch and instead
> combine the 0001 and 0002 patches. I think we should leave the
> 022_twophase_cascade as it is because it can impact code coverage,
> especially the below part of the test:
> # 2PC PREPARE with a nested ROLLBACK TO SAVEPOINT
> $node_A->safe_psql(
> 'postgres', "
> BEGIN;
> INSERT INTO test_tab VALUES (, 'foobar');
> SAVEPOINT sp_inner;
> INSERT INTO test_tab SELECT i, md5(i::text) FROM
> generate_series(3, 5000) s(i);
> 
> Here, we will stream first time after the subtransaction, so can
> impact the below part of the code in ReorderBufferStreamTXN:
> if (txn->snapshot_now == NULL)
> {
> ...
> dlist_foreach(subxact_i, &txn->subtxns)
> {
> ReorderBufferTXN *subtxn;
> 
> subtxn = dlist_container(ReorderBufferTXN, node, subxact_i.cur);
> ReorderBufferTransferSnapToParent(txn, subtxn);
> }
> ...
> 

OK, I removed the modification in 022_twophase_cascade.pl and combine the two 
patches.

Please see the attached patch.
I also fixed Kuroda-san's comments[1].

[1] 
https://www.postgresql.org/message-id/TYAPR01MB5866CD99CF86EAC84119BC91F5E99%40TYAPR01MB5866.jpnprd01.prod.outlook.com

Regards,
Shi yu


v6-0001-Add-logical_decoding_mode-GUC.patch
Description: v6-0001-Add-logical_decoding_mode-GUC.patch
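
For anyone wanting to try the patch, a minimal usage sketch of the new
GUC (value names assumed from this thread; 'buffered' is expected to
stay the default behavior):

# postgresql.conf on the publisher
logical_decoding_mode = immediate	# stream or serialize each change right away
#logical_decoding_mode = buffered	# default: buffer until a memory threshold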


Re: Force streaming every change in logical decoding

2022-12-23 Thread Amit Kapila
On Fri, Dec 23, 2022 at 1:12 PM Hayato Kuroda (Fujitsu)
 wrote:
>
> Dear hackers,
>
> > I will check and report the test coverage if I can.
>
> I ran make coverage. PSA the screenshot that shows the results.
> According to the results, the coverage seems to be unchanged
> even though the elapsed time was reduced.
>
> Only the following lines in process_syncing_tables_for_apply() seemed
> to be not hit after patching, but I think that is a timing issue
> because we do not modify anything around there.
>
> ```
> /*
>  * Enter busy loop and wait for 
> synchronization worker to
>  * reach expected state (or die 
> trying).
>  */
> if (!started_tx)
> {
> StartTransactionCommand();
> started_tx = true;
> }
> ```
>

This part of the code is related to synchronization between apply and
sync workers which depends upon timing. So, we can ignore this
difference.

-- 
With Regards,
Amit Kapila.