Re: Deduplicate logicalrep_read_tuple()

2023-03-04 Thread Peter Smith
On Fri, Mar 3, 2023 at 10:04 PM Amit Kapila  wrote:
>
> On Fri, Mar 3, 2023 at 4:13 PM Bharath Rupireddy
>  wrote:
> >
> > On Thu, Jan 19, 2023 at 8:36 AM Peter Smith  wrote:
> > >
> > > On Wed, Jan 18, 2023 at 6:26 PM Bharath Rupireddy
> > >  wrote:
> > > >
> > > > Hi,
> > > >
> > > > logicalrep_read_tuple() duplicates code for LOGICALREP_COLUMN_TEXT and
> > > > LOGICALREP_COLUMN_BINARY introduced by commit 9de77b5. While it
> > > > doesn't hurt anyone, deduplication makes code a bit leaner by 57 bytes
> > > > [1]. I've attached a patch for $SUBJECT.
> > > >
> > > > Thoughts?
> > > >
> > >
> > > The code looks the same but there is a subtle comment difference where
> > > previously only LOGICALREP_COLUMN_BINARY case said:
> > >  /* not strictly necessary but per StringInfo practice */
> > >
> > > So if you de-duplicate the code then should that comment be modified to 
> > > say
> > > /* not strictly necessary for LOGICALREP_COLUMN_BINARY but per
> > > StringInfo practice */
> >
> > Thanks. Done so in the attached v2.
> >
>
> LGTM. Unless Peter or someone has any comments on this, I'll push this
> early next week.
>

No more comments. Patch v2 LGTM.

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Request for comment on setting binary format output per session

2023-03-04 Thread Dave Cramer
On Sat, 4 Mar 2023 at 19:06, Tom Lane  wrote:

> Jeff Davis  writes:
> > On Sat, 2023-03-04 at 18:04 -0500, Dave Cramer wrote:
> >> Most of the clients know how to decode the builtin types. I'm not
> >> sure there is a use case for binary encode types that the clients
> >> don't have a priori knowledge of.
>
> > The client could, in theory, have a priori knowledge of a non-builtin
> > type.
>
> I don't see what's "in theory" about that.  There seems plenty of
> use for binary I/O of, say, PostGIS types.  Even for built-in types,
> do we really want to encourage people to hard-wire their OIDs into
> applications?
>

How does a client read these? I'm pretty narrowly focussed. The JDBC API
doesn't really have a way to read a non built-in type.  There is a facility
to read a UDT, but the user would have to provide that transcoder. I guess
I'm curious how other clients read binary UDT's ?

>
> I don't see a big problem with driving this off a GUC, but I think
> it should be a list of type names not OIDs.  We already have plenty
> of precedent for dealing with that sort of thing; see search_path
> for the canonical example.  IIRC, there's similar caching logic
> for temp_tablespaces.
>

I have no issue with allowing names, OID's were compact, but we could
easily support both

Dave


Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment

2023-03-04 Thread Lukas Fittl
Hi,

I was debugging a planner problem on Postgres 14.4 the other day - and the
involved "bad" plan was including Memoize - though I don't necessarily
think that Memoize is to blame (and this isn't any of the problems recently
fixed in Memoize costing).

However, what I noticed whilst trying different ways to fix the plan, is
that the Memoize output was a bit hard to reason about - especially since
the plan involving Memoize was expensive to run, and so I was mostly
running EXPLAIN without ANALYZE to look at the costing.

Here is an example of the output I was looking at:

 ->  Nested Loop  (cost=1.00..971672.56 rows=119623 width=0)
   ->  Index Only Scan using table1_idx on table1
(cost=0.43..372676.50 rows=23553966 width=8)
   ->  Memoize  (cost=0.57..0.61 rows=1 width=8)
 Cache Key: table1.table2_id
 Cache Mode: logical
 ->  Index Scan using table2_idx on table2
(cost=0.56..0.60 rows=1 width=8)
   Index Cond: (id = table1.table2_id)

The other plan I was comparing with (that I wanted the planner to choose
instead), had a total cost of 1,451,807.35 -- and so I was trying to figure
out why the Nested Loop was costed as 971,672.56.

Simple math makes me expect the Nested Loop should roughly have a total
cost of14,740,595.76 here (372,676.50 + 23,553,966 * 0.61), ignoring a lot
of the smaller costs. Thus, in this example, it appears Memoize made the
plan cost significantly cheaper (roughly 6% of the regular cost).

Essentially this comes down to the "cost reduction" performed by Memoize
only being implicitly visible in the Nested Loop's total cost - and with
nothing useful on the Memoize node itself - since the rescan costs are not
shown.

I think explicitly adding the estimated cache hit ratio for Memoize nodes
might make this easier to reason about, like this:

->  Memoize  (cost=0.57..0.61 rows=1 width=8)
 Cache Key: table1.table2_id
 Cache Mode: logical
 Cache Hit Ratio Estimated: 0.94

Alternatively (or in addition) we could consider showing the "ndistinct"
value that is calculated in cost_memoize_rescan - since that's the most
significant contributor to the cache hit ratio (and you can influence that
directly by improving the ndistinct statistics).

See attached a patch that implements showing the cache hit ratio as a
discussion starter.

I'll park this in the July commitfest for now.

Thanks,
Lukas

-- 
Lukas Fittl


v1-0001-Add-Estimated-Cache-Hit-Ratio-for-Memoize-plan-no.patch
Description: Binary data


Re: Request for comment on setting binary format output per session

2023-03-04 Thread David G. Johnston
On Sat, Mar 4, 2023 at 5:07 PM Tom Lane  wrote:

> Jeff Davis  writes:
> > On Sat, 2023-03-04 at 18:04 -0500, Dave Cramer wrote:
> >> Most of the clients know how to decode the builtin types. I'm not
> >> sure there is a use case for binary encode types that the clients
> >> don't have a priori knowledge of.
>
> > The client could, in theory, have a priori knowledge of a non-builtin
> > type.
>
> I don't see what's "in theory" about that.  There seems plenty of
> use for binary I/O of, say, PostGIS types.  Even for built-in types,
> do we really want to encourage people to hard-wire their OIDs into
> applications?
>
> I don't see a big problem with driving this off a GUC, but I think
> it should be a list of type names not OIDs.  We already have plenty
> of precedent for dealing with that sort of thing; see search_path
> for the canonical example.  IIRC, there's similar caching logic
> for temp_tablespaces.
>
>
This seems slightly different since types depend upon schemas whereas
search_path is top-level and tablespaces are global.  But I agree that
names should be accepted, maybe in addition to OIDs, the latter, for core
types in particular, being a way to not have to worry about masking in
user-space.

David J.


Re: Add standard collation UNICODE

2023-03-04 Thread Tom Lane
Jeff Davis  writes:
> On Sun, 2023-03-05 at 08:27 +1300, Thomas Munro wrote:
>> It's created for UTF-8 only, and UTF-8 sorts the same way as the
>> encoded code points, when interpreted as a sequence of unsigned char
>> by memcmp(), strcmp() etc.  Seems right?

> Right, makes sense.

> Though in principle, shouldn't someone using another encoding also be
> able to use ucs_basic? I'm not sure if that's a practical problem or
> not; I'm just curious. Does ICU provide a locale for sorting by code
> point?

ISTM we could trivially allow it in LATIN1 encoding as well;
strcmp would still have the effect of sorting by unicode code points.

Given the complete lack of field demand for making it work in
other encodings, I'm unexcited about spending more effort than that.

regards, tom lane




Re: Request for comment on setting binary format output per session

2023-03-04 Thread Tom Lane
Jeff Davis  writes:
> On Sat, 2023-03-04 at 18:04 -0500, Dave Cramer wrote:
>> Most of the clients know how to decode the builtin types. I'm not
>> sure there is a use case for binary encode types that the clients
>> don't have a priori knowledge of.

> The client could, in theory, have a priori knowledge of a non-builtin
> type.

I don't see what's "in theory" about that.  There seems plenty of
use for binary I/O of, say, PostGIS types.  Even for built-in types,
do we really want to encourage people to hard-wire their OIDs into
applications?

I don't see a big problem with driving this off a GUC, but I think
it should be a list of type names not OIDs.  We already have plenty
of precedent for dealing with that sort of thing; see search_path
for the canonical example.  IIRC, there's similar caching logic
for temp_tablespaces.

regards, tom lane




Re: Request for comment on setting binary format output per session

2023-03-04 Thread Jeff Davis
On Sat, 2023-03-04 at 18:04 -0500, Dave Cramer wrote:
> Most of the clients know how to decode the builtin types. I'm not
> sure there is a use case for binary encode types that the clients
> don't have a priori knowledge of.

The client could, in theory, have a priori knowledge of a non-builtin
type.

I don't have a great solution for that, though. Maybe it's only
practical for builtin types.

Regards,
Jeff Davis





Re: Add standard collation UNICODE

2023-03-04 Thread Jeff Davis
On Sun, 2023-03-05 at 08:27 +1300, Thomas Munro wrote:
> It's created for UTF-8 only, and UTF-8 sorts the same way as the
> encoded code points, when interpreted as a sequence of unsigned char
> by memcmp(), strcmp() etc.  Seems right?

Right, makes sense.

Though in principle, shouldn't someone using another encoding also be
able to use ucs_basic? I'm not sure if that's a practical problem or
not; I'm just curious. Does ICU provide a locale for sorting by code
point?

Regards,
Jeff Davis





Re: Date-Time dangling unit fix

2023-03-04 Thread Joseph Koshakow
On Sat, Mar 4, 2023 at 4:05 PM Tom Lane  wrote:
>
>I started to look at this, and soon noticed that while we have test
cases
>matching this sort of date input, there is no documentation for it.
The
>code claims it's an "ISO" (presumably ISO 8601) format, and maybe it is
>because it looks a lot like the ISO 8601 format for intervals
(durations).
>But I don't have a copy of ISO 8601, and some googling fails to find
any
>indication that anybody else believes this is a valid datetime format.
>Wikipedia for example documents a lot of variants of ISO 8601 [1],
>but nothing that looks like this.
>
>I wonder if we should just rip this code out instead of fixing it.
>I suspect its real-world usage is not different from zero.  We'd
>have to keep the "Jnnn" Julian-date case, though, so maybe there's
>little to be saved.
>
>If we do keep it, there's documentation work to be done.  But the
>first bit of doco I'd want to see is a pointer to a standard.

I also don't have a copy of ISO 8601 and wasn't able to find anything
about this variant on Google. I did find this comment in datetime.c

/*
* Was this an "ISO date" with embedded field labels? An
* example is "y2001m02d04" - thomas 2001-02-04
*/

which comes from this commit [1], which was authored by Thomas Lockhart
(presumably the same thomas from the comment). I've CC'ed Thomas in
case the email still exists and they happen to remember. The commit
message mentions ISO, but not the variant mentioned in the comment.
The mailing list thread can be found here [2], but it doesn't provide
much more information. I also found the following thread [3], which
happens to have you in it in case you remember it, which seemed to be
the motivation for commit [1]. It only contains the following line
about ISO:

> o support for "ISO variants" on input, including embedded "T" preceeding
the time fields

All that seems to imply the "y2001m02d04" ISO variant was never really
discussed in much detail and it's probably fine to remove it. Though,
it has been around for 22 years which makes it a bit scary to remove.

- Joe Koshakow

[1]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=6f58115dddfa8ca63004c4784f57ef660422861d
[2]
https://www.postgresql.org/message-id/flat/3BB433D5.3CB4164E%40fourpalms.org
[3]
https://www.postgresql.org/message-id/flat/3B970FF8.B9990807%40fourpalms.org#c57d83c80d295bfa19887c92122369c3


Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

2023-03-04 Thread Tom Lane
Andres Freund  writes:
> Just pushed the actual pg_stat_io view, the splitting of the tablespace test,
> and the pg_stat_io tests.

One of the test cases is flapping a bit:

diff -U3 
/home/pg/build-farm-15/buildroot/HEAD/pgsql.build/src/test/regress/expected/stats.out
 
/home/pg/build-farm-15/buildroot/HEAD/pgsql.build/src/test/regress/results/stats.out
--- 
/home/pg/build-farm-15/buildroot/HEAD/pgsql.build/src/test/regress/expected/stats.out
   2023-03-04 21:30:05.891579466 +0100
+++ 
/home/pg/build-farm-15/buildroot/HEAD/pgsql.build/src/test/regress/results/stats.out
2023-03-04 21:34:26.745552661 +0100
@@ -1201,7 +1201,7 @@
 SELECT :io_sum_shared_after_reads > :io_sum_shared_before_reads;
  ?column? 
 --
- t
+ f
 (1 row)
 
 DROP TABLE test_io_shared;

There are two instances of this today [1][2], and I've seen it before
but failed to note down where.

regards, tom lane

[1] 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grison=2023-03-04%2021%3A19%3A39
[2] 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mule=2023-03-04%2020%3A30%3A05




Re: Allow tests to pass in OpenSSL FIPS mode

2023-03-04 Thread Tom Lane
Peter Eisentraut  writes:
> [ v2-0001-Remove-incidental-md5-function-uses-from-main-reg.patch ]

I've gone through this and have a modest suggestion: let's invent some
wrapper functions around encode(sha256()) to reduce the cosmetic diffs
and consequent need for closer study of patch changes.  In the attached
I called them "notmd5()", but I'm surely not wedded to that name.

This also accounts for some relatively recent additions to stats_ext.sql
that introduced yet more uses of md5().  This passes for me on a
FIPS-enabled Fedora system, with the exception of md5.sql and
password.sql.  I agree that the right thing for md5.sql is just to add
a variant expected-file.  password.sql could perhaps use some refactoring
so that we don't have two large expected-files to manage.

The only other place that perhaps needs discussion is rowsecurity.sql,
which has some surprisingly large changes: not only do the random
strings change, but there are rowcount differences in some results.
I believe this is because there are RLS policy checks and view conditions
that actually examine the contents of the "md5" strings, eg

CREATE POLICY p1 ON s1 USING (a in (select x from s2 where y like '%2f%'));

My recommendation is to just accept those changes as OK and move on.
I doubt that anybody checked the existing results line-by-line either.

So, once we've done something about md5.sql and password.sql, I think
this is committable.

regards, tom lane

diff --git a/src/test/regress/expected/arrays.out b/src/test/regress/expected/arrays.out
index 0ff54a18de..1c46da1e0b 100644
--- a/src/test/regress/expected/arrays.out
+++ b/src/test/regress/expected/arrays.out
@@ -2303,14 +2303,14 @@ insert into src
 create type textandtext as (c1 text, c2 text);
 create temp table dest (f1 textandtext[]);
 insert into dest select array[row(f1,f1)::textandtext] from src;
-select length(md5((f1[1]).c2)) from dest;
+select length(notmd5((f1[1]).c2)) from dest;
  length 
 
  32
 (1 row)
 
 delete from src;
-select length(md5((f1[1]).c2)) from dest;
+select length(notmd5((f1[1]).c2)) from dest;
  length 
 
  32
@@ -2318,7 +2318,7 @@ select length(md5((f1[1]).c2)) from dest;
 
 truncate table src;
 drop table src;
-select length(md5((f1[1]).c2)) from dest;
+select length(notmd5((f1[1]).c2)) from dest;
  length 
 
  32
diff --git a/src/test/regress/expected/brin.out b/src/test/regress/expected/brin.out
index 73fa38396e..b3aac15ecc 100644
--- a/src/test/regress/expected/brin.out
+++ b/src/test/regress/expected/brin.out
@@ -530,7 +530,7 @@ EXPLAIN (COSTS OFF) SELECT * FROM brin_test WHERE b = 1;
 CREATE TABLE brintest_3 (a text, b text, c text, d text);
 -- long random strings (~2000 chars each, so ~6kB for min/max on two
 -- columns) to trigger toasting
-WITH rand_value AS (SELECT string_agg(md5(i::text),'') AS val FROM generate_series(1,60) s(i))
+WITH rand_value AS (SELECT string_agg(notmd5(i::text),'') AS val FROM generate_series(1,60) s(i))
 INSERT INTO brintest_3
 SELECT val, val, val, val FROM rand_value;
 CREATE INDEX brin_test_toast_idx ON brintest_3 USING brin (b, c);
@@ -545,7 +545,7 @@ VACUUM brintest_3;
 -- retry insert with a different random-looking (but deterministic) value
 -- the value is different, and so should replace either min or max in the
 -- brin summary
-WITH rand_value AS (SELECT string_agg(md5((-i)::text),'') AS val FROM generate_series(1,60) s(i))
+WITH rand_value AS (SELECT string_agg(notmd5((-i)::text),'') AS val FROM generate_series(1,60) s(i))
 INSERT INTO brintest_3
 SELECT val, val, val, val FROM rand_value;
 -- now try some queries, accessing the brin index
diff --git a/src/test/regress/expected/brin_multi.out b/src/test/regress/expected/brin_multi.out
index f3309f433f..28d136f59c 100644
--- a/src/test/regress/expected/brin_multi.out
+++ b/src/test/regress/expected/brin_multi.out
@@ -29,7 +29,7 @@ INSERT INTO brintest_multi SELECT
 	(four + 1.0)/(hundred+1),
 	odd::float8 / (tenthous + 1),
 	format('%s:00:%s:00:%s:00', to_hex(odd), to_hex(even), to_hex(hundred))::macaddr,
-	substr(md5(unique1::text), 1, 16)::macaddr8,
+	substr(notmd5(unique1::text), 1, 16)::macaddr8,
 	inet '10.2.3.4/24' + tenthous,
 	cidr '10.2.3/24' + tenthous,
 	date '1995-08-15' + tenthous,
@@ -179,7 +179,7 @@ INSERT INTO brinopers_multi VALUES
 	('macaddr8col', 'macaddr8',
 	 '{>, >=, =, <=, <}',
 	 '{b1:d1:0e:7b:af:a4:42:12, d9:35:91:bd:f7:86:0e:1e, 72:8f:20:6c:2a:01:bf:57, 23:e8:46:63:86:07:ad:cb, 13:16:8e:6a:2e:6c:84:b4}',
-	 '{33, 15, 1, 13, 6}'),
+	 '{31, 17, 1, 11, 4}'),
 	('inetcol', 'inet',
 	 '{=, <, <=, >, >=}',
 	 '{10.2.14.231/24, 255.255.255.255, 255.255.255.255, 0.0.0.0, 0.0.0.0}',
@@ -327,7 +327,7 @@ INSERT INTO brintest_multi SELECT
 	(four + 1.0)/(hundred+1),
 	odd::float8 / (tenthous + 1),
 	format('%s:00:%s:00:%s:00', to_hex(odd), to_hex(even), to_hex(hundred))::macaddr,
-	substr(md5(unique1::text), 1, 16)::macaddr8,
+	substr(notmd5(unique1::text), 1, 16)::macaddr8,
 	

Re: Request for comment on setting binary format output per session

2023-03-04 Thread Dave Cramer
Dave Cramer


On Sat, 4 Mar 2023 at 11:35, Jeff Davis  wrote:

> On Thu, 2023-03-02 at 09:13 -0500, Dave Cramer wrote:
> > I'd like to open up this discussion again so that we can
> > move forward. I prefer the GUC as it is relatively simple and as
> > Peter mentioned it works, but I'm not married to the idea.
>
> It's not very friendly to extensions, where the types are not
> guaranteed to have stable OIDs. Did you consider any proposals that
> work with type names?
>

I had not.
Most of the clients know how to decode the builtin types. I'm not sure
there is a use case for binary encode types that the clients don't have a
priori knowledge of.

Dave

>
> Regards,
> Jeff Davis
>
>


Re: Date-Time dangling unit fix

2023-03-04 Thread Tom Lane
Joseph Koshakow  writes:
> On Mon, Dec 12, 2022 at 10:55 AM Joseph Koshakow  wrote:
>> I just found another class of this bug that the submitted patch does
>> not fix. If the units are at the beginning of the string, then they are
>> also ignored. For example, `date 'm d y2020m11d3'` is also valid. I
>> think the fix here is to check and make sure that ptype is 0 before
>> reassigning the value to a non-zero number. I'll send an updated patch
>> with this tonight.

> Attached is the described patch.

I started to look at this, and soon noticed that while we have test cases
matching this sort of date input, there is no documentation for it.  The
code claims it's an "ISO" (presumably ISO 8601) format, and maybe it is
because it looks a lot like the ISO 8601 format for intervals (durations).
But I don't have a copy of ISO 8601, and some googling fails to find any
indication that anybody else believes this is a valid datetime format.
Wikipedia for example documents a lot of variants of ISO 8601 [1],
but nothing that looks like this.

I wonder if we should just rip this code out instead of fixing it.
I suspect its real-world usage is not different from zero.  We'd
have to keep the "Jnnn" Julian-date case, though, so maybe there's
little to be saved.

If we do keep it, there's documentation work to be done.  But the
first bit of doco I'd want to see is a pointer to a standard.

regards, tom lane

[1] https://en.wikipedia.org/wiki/ISO_8601




Re: Date-time extraneous fields with reserved keywords

2023-03-04 Thread Joseph Koshakow
On Sat, Mar 4, 2023 at 2:48 PM Tom Lane  wrote:
>
>Right.  So really we ought to move the ValidateDate call as
>well as the next half-dozen lines about "mer" down into
>the subsequent "do additional checking" stanza.  It's all
>only relevant to normal date specs.
>
>BTW, looking at the set of RESERV tokens in datetktbl[],
>it looks to me like this change renders the final "default:"
>case unreachable, so probably we could just make that an error.

Please see the attached patch with these changes.

- Joe Koshakow
From 64a71ed287aa9611c22eaa6e2cbb7e080d93be79 Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Sun, 11 Dec 2022 16:08:43 -0500
Subject: [PATCH] Handle extraneous fields in date-time input

DecodeDateTime sometimest allowed extraneous fields to be included with
reserved keywords. For example `date '1995-08-06 epoch'` would be
parsed successfully, but the date was ignored. This commit fixes the
issue so an error is returned instead.
---
 src/backend/utils/adt/datetime.c  | 35 ++-
 src/test/regress/expected/date.out| 33 +
 src/test/regress/expected/timestamp.out   | 33 +
 src/test/regress/expected/timestamptz.out | 33 +
 src/test/regress/sql/date.sql | 10 +++
 src/test/regress/sql/timestamp.sql| 10 +++
 src/test/regress/sql/timestamptz.sql  | 10 +++
 7 files changed, 150 insertions(+), 14 deletions(-)

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 01660637a2..0c1207223c 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -1431,8 +1431,15 @@ DecodeDateTime(char **field, int *ftype, int nf,
 	*tzp = 0;
 break;
 
-			default:
+			case DTK_EPOCH:
+			case DTK_LATE:
+			case DTK_EARLY:
+tmask = (DTK_DATE_M | DTK_TIME_M | DTK_M(TZ));
 *dtype = val;
+break;
+
+			default:
+return DTERR_BAD_FORMAT;
 		}
 
 		break;
@@ -1567,22 +1574,22 @@ DecodeDateTime(char **field, int *ftype, int nf,
 		fmask |= tmask;
 	}			/* end loop over fields */
 
-	/* do final checking/adjustment of Y/M/D fields */
-	dterr = ValidateDate(fmask, isjulian, is2digits, bc, tm);
-	if (dterr)
-		return dterr;
-
-	/* handle AM/PM */
-	if (mer != HR24 && tm->tm_hour > HOURS_PER_DAY / 2)
-		return DTERR_FIELD_OVERFLOW;
-	if (mer == AM && tm->tm_hour == HOURS_PER_DAY / 2)
-		tm->tm_hour = 0;
-	else if (mer == PM && tm->tm_hour != HOURS_PER_DAY / 2)
-		tm->tm_hour += HOURS_PER_DAY / 2;
-
 	/* do additional checking for full date specs... */
 	if (*dtype == DTK_DATE)
 	{
+		/* do final checking/adjustment of Y/M/D fields */
+		dterr = ValidateDate(fmask, isjulian, is2digits, bc, tm);
+		if (dterr)
+			return dterr;
+
+		/* handle AM/PM */
+		if (mer != HR24 && tm->tm_hour > HOURS_PER_DAY / 2)
+			return DTERR_FIELD_OVERFLOW;
+		if (mer == AM && tm->tm_hour == HOURS_PER_DAY / 2)
+			tm->tm_hour = 0;
+		else if (mer == PM && tm->tm_hour != HOURS_PER_DAY / 2)
+			tm->tm_hour += HOURS_PER_DAY / 2;
+
 		if ((fmask & DTK_DATE_M) != DTK_DATE_M)
 		{
 			if ((fmask & DTK_TIME_M) == DTK_TIME_M)
diff --git a/src/test/regress/expected/date.out b/src/test/regress/expected/date.out
index f5949f3d17..c874f06546 100644
--- a/src/test/regress/expected/date.out
+++ b/src/test/regress/expected/date.out
@@ -1532,3 +1532,36 @@ select make_time(10, 55, 100.1);
 ERROR:  time field value out of range: 10:55:100.1
 select make_time(24, 0, 2.1);
 ERROR:  time field value out of range: 24:00:2.1
+-- test errors with reserved keywords
+SELECT date '1995-08-06 epoch';
+ERROR:  invalid input syntax for type date: "1995-08-06 epoch"
+LINE 1: SELECT date '1995-08-06 epoch';
+^
+SELECT date '1995-08-06 infinity';
+ERROR:  invalid input syntax for type date: "1995-08-06 infinity"
+LINE 1: SELECT date '1995-08-06 infinity';
+^
+SELECT date '1995-08-06 -infinity';
+ERROR:  invalid input syntax for type date: "1995-08-06 -infinity"
+LINE 1: SELECT date '1995-08-06 -infinity';
+^
+SELECT date 'epoch 1995-08-06';
+ERROR:  invalid input syntax for type date: "epoch 1995-08-06"
+LINE 1: SELECT date 'epoch 1995-08-06';
+^
+SELECT date 'infinity 1995-08-06';
+ERROR:  invalid input syntax for type date: "infinity 1995-08-06"
+LINE 1: SELECT date 'infinity 1995-08-06';
+^
+SELECT date '-infinity 1995-08-06';
+ERROR:  invalid input syntax for type date: "-infinity 1995-08-06"
+LINE 1: SELECT date '-infinity 1995-08-06';
+^
+SELECT date 'now infinity';
+ERROR:  invalid input syntax for type date: "now infinity"
+LINE 1: SELECT date 'now infinity';
+^
+SELECT date '-infinity infinity';
+ERROR:  invalid input syntax for type date: "-infinity infinity"
+LINE 1: SELECT date '-infinity infinity';
+^
diff --git 

Re: Latches vs lwlock contention

2023-03-04 Thread Thomas Munro
On Sat, Jan 28, 2023 at 3:39 AM vignesh C  wrote:
> On Tue, 1 Nov 2022 at 16:40, Thomas Munro  wrote:
> > Here's an attempt at that.  There aren't actually any cases of uses of
> > this stuff in critical sections here, so perhaps I shouldn't bother
> > with that part.  The part I'd most like some feedback on is the
> > heavyweight lock bits.  I'll add this to the commitfest.
>
> The patch does not apply on top of HEAD as in [1], please post a rebased 
> patch:

Rebased.  I dropped the CV patch for now.
From d678e740c61ed5408ad254d2ca5f8e2fff7d2cd1 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Tue, 1 Nov 2022 23:29:38 +1300
Subject: [PATCH v3 1/6] Allow palloc_extended(NO_OOM) in critical sections.

Commit 4a170ee9e0e banned palloc() and similar in critical sections, because an
allocation failure would produce a panic.  Make an exception for allocation
with NULL on failure, for code that has a backup plan.

Discussion: https://postgr.es/m/CA%2BhUKGJHudZMG_vh6GiPB61pE%2BGgiBk5jxzd7inijqx5nEZLCw%40mail.gmail.com
---
 src/backend/utils/mmgr/mcxt.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index 0b00802df7..eb18b89368 100644
--- a/src/backend/utils/mmgr/mcxt.c
+++ b/src/backend/utils/mmgr/mcxt.c
@@ -1123,7 +1123,8 @@ MemoryContextAllocExtended(MemoryContext context, Size size, int flags)
 	void	   *ret;
 
 	Assert(MemoryContextIsValid(context));
-	AssertNotInCriticalSection(context);
+	if ((flags & MCXT_ALLOC_NO_OOM) == 0)
+		AssertNotInCriticalSection(context);
 
 	if (!((flags & MCXT_ALLOC_HUGE) != 0 ? AllocHugeSizeIsValid(size) :
 		  AllocSizeIsValid(size)))
@@ -1278,7 +1279,8 @@ palloc_extended(Size size, int flags)
 	MemoryContext context = CurrentMemoryContext;
 
 	Assert(MemoryContextIsValid(context));
-	AssertNotInCriticalSection(context);
+	if ((flags & MCXT_ALLOC_NO_OOM) == 0)
+		AssertNotInCriticalSection(context);
 
 	if (!((flags & MCXT_ALLOC_HUGE) != 0 ? AllocHugeSizeIsValid(size) :
 		  AllocSizeIsValid(size)))
@@ -1509,7 +1511,8 @@ repalloc_extended(void *pointer, Size size, int flags)
 		  AllocSizeIsValid(size)))
 		elog(ERROR, "invalid memory alloc request size %zu", size);
 
-	AssertNotInCriticalSection(context);
+	if ((flags & MCXT_ALLOC_NO_OOM) == 0)
+		AssertNotInCriticalSection(context);
 
 	/* isReset must be false already */
 	Assert(!context->isReset);
-- 
2.39.2

From a3917505d7bd7cec7e98010d05bc016ddcd0dbac Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Wed, 26 Oct 2022 15:51:45 +1300
Subject: [PATCH v3 2/6] Provide SetLatches() for batched deferred latches.

If we have a way to buffer a set of wakeup targets and process them at a
later time, we can:

* move SetLatch() system calls out from under LWLocks, so that locks can
  be released faster; this is especially interesting in cases where the
  target backends will immediately try to acquire the same lock, or
  generally when the lock is heavily contended

* possibly gain some micro-opimization from issuing only two memory
  barriers for the whole batch of latches, not two for each latch to be
  set

* prepare for future IPC mechanisms which might allow us to wake
  multiple backends in a single system call

Individual users of this facility will follow in separate patches.

Discussion: https://postgr.es/m/CA%2BhUKGKmO7ze0Z6WXKdrLxmvYa%3DzVGGXOO30MMktufofVwEm1A%40mail.gmail.com
---
 src/backend/storage/ipc/latch.c  | 214 ---
 src/include/storage/latch.h  |  13 ++
 src/tools/pgindent/typedefs.list |   1 +
 3 files changed, 154 insertions(+), 74 deletions(-)

diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
index f4123e7de7..96dd47575a 100644
--- a/src/backend/storage/ipc/latch.c
+++ b/src/backend/storage/ipc/latch.c
@@ -591,6 +591,85 @@ WaitLatchOrSocket(Latch *latch, int wakeEvents, pgsocket sock,
 	return ret;
 }
 
+/*
+ * Set multiple latches at the same time.
+ * Note: modifies input array.
+ */
+static void
+SetLatchV(Latch **latches, int nlatches)
+{
+	/* Flush any other changes out to main memory just once. */
+	pg_memory_barrier();
+
+	/* Keep only latches that are not already set, and set them. */
+	for (int i = 0; i < nlatches; ++i)
+	{
+		Latch	   *latch = latches[i];
+
+		if (!latch->is_set)
+			latch->is_set = true;
+		else
+			latches[i] = NULL;
+	}
+
+	pg_memory_barrier();
+
+	/* Wake the remaining latches that might be sleeping. */
+	for (int i = 0; i < nlatches; ++i)
+	{
+		Latch	   *latch = latches[i];
+
+		/*
+		 * See if anyone's waiting for the latch. It can be the current
+		 * process if we're in a signal handler. We use the self-pipe or
+		 * SIGURG to ourselves to wake up WaitEventSetWaitBlock() without
+		 * races in that case. If it's another process, send a signal.
+		 *
+		 * Fetch owner_pid only once, in case the latch is concurrently
+		 * getting owned or disowned. XXX: This assumes that pid_t is atomic,
+		 * which isn't guaranteed 

Re: Date-time extraneous fields with reserved keywords

2023-03-04 Thread Tom Lane
Joseph Koshakow  writes:
> On Sat, Mar 4, 2023 at 1:56 PM Tom Lane  wrote:
>> Why do you want to skip ValidateDate in some cases?  If we've not
>> had to do that before, I don't see why it's a good idea now.

> This goes back to the abstraction break of
> setting tmask without updating tm. Certain
> validations will check that if a field is set in
> fmask (which is an accumulation of tmask from
> every iteration) then it's value in tm is valid.

Ah.  Another way could be to fill tm with something that would
satisfy ValidateDate, but that seems pretty silly.

> As far as I can tell dtype always equals DTK_DATE
> except when the timestamp/date is 'epoch',
> 'infinity', '-infinity', and none of the
> validations apply to those date/timestamps.

Right.  So really we ought to move the ValidateDate call as
well as the next half-dozen lines about "mer" down into
the subsequent "do additional checking" stanza.  It's all
only relevant to normal date specs.

BTW, looking at the set of RESERV tokens in datetktbl[],
it looks to me like this change renders the final "default:"
case unreachable, so probably we could just make that an error.

regards, tom lane




Re: Date-time extraneous fields with reserved keywords

2023-03-04 Thread Joseph Koshakow
On Sat, Mar 4, 2023 at 1:56 PM Tom Lane  wrote:
>
>I think we should tread very carefully about disallowing inputs that
>have been considered acceptable for 25 years.  I agree with disallowing
>numeric fields along with 'epoch' and 'infinity', but for example
>this seems perfectly useful and sensible:
>
># select timestamptz 'today 12:34';
>  timestamptz
>
> 2023-03-04 12:34:00-05
>(1 row)

Yeah, that makes sense. I'll leave it as is with
the explicit case for 'epoch', 'infinity', and
'-infinity'.

>Why do you want to skip ValidateDate in some cases?  If we've not
>had to do that before, I don't see why it's a good idea now.

This goes back to the abstraction break of
setting tmask without updating tm. Certain
validations will check that if a field is set in
fmask (which is an accumulation of tmask from
every iteration) then it's value in tm is valid.
For example:

if (fmask & DTK_M(YEAR))
{
// ...
else
{
/* there is no year zero in AD/BC notation */
if (tm->tm_year <= 0)
return DTERR_FIELD_OVERFLOW;
}
}

As far as I can tell dtype always equals DTK_DATE
except when the timestamp/date is 'epoch',
'infinity', '-infinity', and none of the
validations apply to those date/timestamps.
Though, I think you're right this is probably
not a good idea. I'll try and brainstorm a
different approach, unless you have some ideas.


Re: Add standard collation UNICODE

2023-03-04 Thread Thomas Munro
On Sun, Mar 5, 2023 at 7:30 AM Jeff Davis  wrote:
> Sorting by codepoint should be encoding-independent (i.e. decode to
> codepoint first); but the C collation is just strcmp, which is
> encoding-dependent. So is UCS_BASIC wrong today?

It's created for UTF-8 only, and UTF-8 sorts the same way as the
encoded code points, when interpreted as a sequence of unsigned char
by memcmp(), strcmp() etc.  Seems right?




Re: [Proposal] Allow pg_dump to include all child tables with the root table

2023-03-04 Thread Tom Lane
Gilles Darold  writes:
> But I disagree the use of --table-with-childs and 
> --exclude-table-with-childs because we already have the --table and 
> --exclude-table, and it will add lot of code where we just need a switch 
> to include children tables.

I quite dislike the idea of a separate --with-whatever switch, because
it will (presumably) apply to all of your --table and --exclude-table
switches, where you may need it to apply to just some of them.
Spelling considerations aside, attaching the property to the
individual switches seems far superior.  And I neither believe that
this would add a lot of code, nor accept that as an excuse even if
it's true.

As noted, "childs" is bad English and "partitions" is flat out wrong
(unless you change it to recurse only to partitions, which doesn't
seem like a better definition).  We could go with
--[exclude-]table-and-children, or maybe
--[exclude-]table-and-child-tables, but those are getting into
carpal-tunnel-syndrome-inducing territory :-(.  I lack a better
naming suggestion offhand.

regards, tom lane




Re: Date-time extraneous fields with reserved keywords

2023-03-04 Thread Tom Lane
Joseph Koshakow  writes:
>   - I'm not sure if we should hard code in those
>   three specific reserved keywords or set tmask
>   in the default case.

I think we should tread very carefully about disallowing inputs that
have been considered acceptable for 25 years.  I agree with disallowing
numeric fields along with 'epoch' and 'infinity', but for example
this seems perfectly useful and sensible:

# select timestamptz 'today 12:34';
  timestamptz   

 2023-03-04 12:34:00-05
(1 row)

> Any thoughts?

Why do you want to skip ValidateDate in some cases?  If we've not
had to do that before, I don't see why it's a good idea now.

regards, tom lane




Re: Add standard collation UNICODE

2023-03-04 Thread Jeff Davis
On Wed, 2023-03-01 at 11:09 +0100, Peter Eisentraut wrote:

> When collation support was added to PostgreSQL, we added UCS_BASIC, 
> since that could easily be mapped to the C locale.

Sorting by codepoint should be encoding-independent (i.e. decode to
codepoint first); but the C collation is just strcmp, which is
encoding-dependent. So is UCS_BASIC wrong today?

(Aside: I wonder whether we should differentiate between the libc
provider, which uses strcoll(), and the provider of non-localized
comparisons that just use strcmp(). That would be a better reflection
of what the code actually does.)

> With ICU support, we can provide the UNICODE collation, since it's
> just 
> the root locale.

+1

>   I suppose one hesitation was that ICU was not a 
> standard feature, so this would create variations in the default
> catalog 
> contents, or something like that.

It looks like the way you've handled this is by inserting the collation
with collprovider=icu even if built without ICU support. I think that's
a new case, so we need to make sure it throws reasonable user-facing
errors.

I do like your approach though because, if someone is using a standard
collation, I think "not built with ICU" (feature not supported) is a
better error than "collation doesn't exist". It also effectively
reserves the name "unicode".


-- 
Jeff Davis
PostgreSQL Contributor Team - AWS






Re: Missing free_var() at end of accum_sum_final()?

2023-03-04 Thread Joel Jacobson
On Fri, Mar 3, 2023, at 16:11, Dean Rasheed wrote:
> So IMO the results just don't justify such an extensive set of
> changes, and I think we should abandon this fixed buffer approach.

I agree. I was hoping it would be possible to reduce the invasiveness,
but I think it's difficult and probably not worth it.

> Having said that, I think the results from the COPY test are worth
> looking at more closely. Your results seem to suggest around a 5%
> improvement. On my machine it was only around 3%, but that still might
> be worth having, if it didn't involve such invasive changes throughout
> the rest of the code.
>
> As an experiment, I took another look at my earlier patch, making
> make_result() construct the result using the same allocated memory as
> the variable's digit buffer (if allocated). That eliminates around a
> third of all free_var() calls from numeric.c, and for most functions,
> it saves both a palloc() and a pfree(). In the case of numeric_in(), I
> realised that it's possible to go further, by reusing the decimal
> digits buffer for the NumericVar's digits, and then later for the
> final Numeric result. Also, by carefully aligning things, it's
> possible to arrange it so that the final make_result() doesn't need to
> copy/move the digits at all. With that I get something closer to a 15%
> improvement in the COPY test, which is definitely worth having.

Nice! Patch LGTM.

> I didn't do all the tests that you did though, so it would be
> interesting to see how it fares in those.

SELECT count(*) FROM generate_series(1::numeric, 1000::numeric);
Time: 1196.801 ms (00:01.197) -- HEAD
Time: 1278.376 ms (00:01.278) -- make-result-using-vars-buf-v2.patch

TRUNCATE foo; COPY foo FROM '/tmp/random-numerics.csv';
Time: 8450.551 ms (00:08.451) -- HEAD
Time: 7176.838 ms (00:07.177) -- make-result-using-vars-buf-v2.patch

SELECT SUM(n1+n2+n3+n4) FROM t;
Time: 643.961 ms -- HEAD
Time: 620.303 ms -- make-result-using-vars-buf-v2.patch

SELECT max(a + b + '17'::numeric + c)
FROM
(SELECT generate_series(1::numeric, 1000::numeric)) aa(a),
(SELECT generate_series(1::numeric, 100::numeric)) bb(b),
(SELECT generate_series(1::numeric, 10::numeric)) cc(c);
Time: 141.070 ms -- HEAD
Time: 139.562 ms -- make-result-using-vars-buf-v2.patch

SELECT SUM(amount*1.25 + 0.5) FROM ledger;
Time: 933.461 ms -- HEAD
Time: 862.619 ms -- make-result-using-vars-buf-v2.patch

Looks like a win in all cases except the first one.

Great work!

/Joel




Re: Date-time extraneous fields with reserved keywords

2023-03-04 Thread Joseph Koshakow
Attached is the described patch. I have two notes
after implementing it:
  - It feels like a bit of an abstraction break to
  set tmask without actually setting any fields in
  tm.
  - I'm not sure if we should hard code in those
  three specific reserved keywords or set tmask
  in the default case.

Any thoughts?

- Joe Koshakow
From 78d8f39db8df68502369ffd9edd6f6e38f4dadb8 Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Sun, 11 Dec 2022 16:08:43 -0500
Subject: [PATCH] Handle extraneous fields in date-time input

DecodeDateTime sometimest allowed extraneous fields to be included with
reserved keywords. For example `date '1995-08-06 epoch'` would be
parsed successfully, but the date was ignored. This commit fixes the
issue so an error is returned instead.
---
 src/backend/utils/adt/datetime.c  | 18 ++---
 src/test/regress/expected/date.out| 33 +++
 src/test/regress/expected/timestamp.out   | 33 +++
 src/test/regress/expected/timestamptz.out | 33 +++
 src/test/regress/sql/date.sql | 10 +++
 src/test/regress/sql/timestamp.sql| 10 +++
 src/test/regress/sql/timestamptz.sql  | 10 +++
 7 files changed, 143 insertions(+), 4 deletions(-)

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 01660637a2..6f82465fd1 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -1431,6 +1431,13 @@ DecodeDateTime(char **field, int *ftype, int nf,
 	*tzp = 0;
 break;
 
+			case DTK_EPOCH:
+			case DTK_LATE:
+			case DTK_EARLY:
+tmask = (DTK_DATE_M | DTK_TIME_M | DTK_M(TZ));
+*dtype = val;
+break;
+
 			default:
 *dtype = val;
 		}
@@ -1567,10 +1574,13 @@ DecodeDateTime(char **field, int *ftype, int nf,
 		fmask |= tmask;
 	}			/* end loop over fields */
 
-	/* do final checking/adjustment of Y/M/D fields */
-	dterr = ValidateDate(fmask, isjulian, is2digits, bc, tm);
-	if (dterr)
-		return dterr;
+	if (*dtype == DTK_DATE)
+	{
+		/* do final checking/adjustment of Y/M/D fields */
+		dterr = ValidateDate(fmask, isjulian, is2digits, bc, tm);
+		if (dterr)
+			return dterr;
+	}
 
 	/* handle AM/PM */
 	if (mer != HR24 && tm->tm_hour > HOURS_PER_DAY / 2)
diff --git a/src/test/regress/expected/date.out b/src/test/regress/expected/date.out
index f5949f3d17..c874f06546 100644
--- a/src/test/regress/expected/date.out
+++ b/src/test/regress/expected/date.out
@@ -1532,3 +1532,36 @@ select make_time(10, 55, 100.1);
 ERROR:  time field value out of range: 10:55:100.1
 select make_time(24, 0, 2.1);
 ERROR:  time field value out of range: 24:00:2.1
+-- test errors with reserved keywords
+SELECT date '1995-08-06 epoch';
+ERROR:  invalid input syntax for type date: "1995-08-06 epoch"
+LINE 1: SELECT date '1995-08-06 epoch';
+^
+SELECT date '1995-08-06 infinity';
+ERROR:  invalid input syntax for type date: "1995-08-06 infinity"
+LINE 1: SELECT date '1995-08-06 infinity';
+^
+SELECT date '1995-08-06 -infinity';
+ERROR:  invalid input syntax for type date: "1995-08-06 -infinity"
+LINE 1: SELECT date '1995-08-06 -infinity';
+^
+SELECT date 'epoch 1995-08-06';
+ERROR:  invalid input syntax for type date: "epoch 1995-08-06"
+LINE 1: SELECT date 'epoch 1995-08-06';
+^
+SELECT date 'infinity 1995-08-06';
+ERROR:  invalid input syntax for type date: "infinity 1995-08-06"
+LINE 1: SELECT date 'infinity 1995-08-06';
+^
+SELECT date '-infinity 1995-08-06';
+ERROR:  invalid input syntax for type date: "-infinity 1995-08-06"
+LINE 1: SELECT date '-infinity 1995-08-06';
+^
+SELECT date 'now infinity';
+ERROR:  invalid input syntax for type date: "now infinity"
+LINE 1: SELECT date 'now infinity';
+^
+SELECT date '-infinity infinity';
+ERROR:  invalid input syntax for type date: "-infinity infinity"
+LINE 1: SELECT date '-infinity infinity';
+^
diff --git a/src/test/regress/expected/timestamp.out b/src/test/regress/expected/timestamp.out
index c64bcb7c12..c2159c2cec 100644
--- a/src/test/regress/expected/timestamp.out
+++ b/src/test/regress/expected/timestamp.out
@@ -2125,3 +2125,36 @@ select * from generate_series('2020-01-01 00:00'::timestamp,
   '2020-01-02 03:00'::timestamp,
   '0 hour'::interval);
 ERROR:  step size cannot equal zero
+-- test errors with reserved keywords
+SELECT timestamp '1995-08-06 01:01:01 epoch';
+ERROR:  invalid input syntax for type timestamp: "1995-08-06 01:01:01 epoch"
+LINE 1: SELECT timestamp '1995-08-06 01:01:01 epoch';
+ ^
+SELECT timestamp '1995-08-06 01:01:01 infinity';
+ERROR:  invalid input syntax for type timestamp: "1995-08-06 01:01:01 infinity"
+LINE 1: SELECT timestamp '1995-08-06 01:01:01 infinity';
+

Re: libpq-fe.h should compile *entirely* standalone

2023-03-04 Thread Tom Lane
Andrew Dunstan  writes:
> On 2023-03-03 Fr 13:46, Tom Lane wrote:
>> This is actually moving the inclusion-check goalposts quite far,
>> but HEAD seems to pass cleanly, and again we can always adjust later.
>> Any objections?

> LGTM

Pushed, thanks for looking.

regards, tom lane




Re: zstd compression for pg_dump

2023-03-04 Thread Justin Pryzby
On Fri, Mar 03, 2023 at 01:38:05PM -0800, Jacob Champion wrote:
> > > With this particular dataset, I don't see much improvement with
> > > zstd:long.
> >
> > Yeah.  I this could be because either 1) you already got very good
> > comprssion without looking at more data; and/or 2) the neighboring data
> > is already very similar, maybe equally or more similar, than the further
> > data, from which there's nothing to gain.
> 
> What kinds of improvements do you see with your setup? I'm wondering
> when we would suggest that people use it.

On customer data, I see small improvements - below 10%.

But on my first two tries, I made synthetic data sets where it's a lot:

$ ./src/bin/pg_dump/pg_dump -d pryzbyj -Fp -Z zstd:long |wc -c
286107
$ ./src/bin/pg_dump/pg_dump -d pryzbyj -Fp -Z zstd:long=0 |wc -c
1709695

That's just 6 identical tables like:
pryzbyj=# CREATE TABLE t1 AS SELECT generate_series(1,99);

In this case, "custom" format doesn't see that benefit, because the
greatest similarity is across tables, which don't share compressor
state.  But I think the note that I wrote in the docs about that should
be removed - custom format could see a big benefit, as long as the table
is big enough, and there's more similarity/repetition at longer
distances.

Here's one where custom format *does* benefit, due to long-distance
repetition within a single table.  The data is contrived, but the schema
of ID => data is not.  What's notable isn't how compressible the data
is, but how much *more* compressible it is with long-distance matching.

pryzbyj=# CREATE TABLE t1 AS SELECT i,array_agg(j) FROM 
generate_series(1,444)i,generate_series(1,9)j GROUP BY 1;
$ ./src/bin/pg_dump/pg_dump -d pryzbyj -Fc -Z zstd:long=1 |wc -c
82023
$ ./src/bin/pg_dump/pg_dump -d pryzbyj -Fc -Z zstd:long=0 |wc -c
1048267

-- 
Justin




Re: Request for comment on setting binary format output per session

2023-03-04 Thread Jeff Davis
On Thu, 2023-03-02 at 09:13 -0500, Dave Cramer wrote:
> I'd like to open up this discussion again so that we can
> move forward. I prefer the GUC as it is relatively simple and as
> Peter mentioned it works, but I'm not married to the idea. 

It's not very friendly to extensions, where the types are not
guaranteed to have stable OIDs. Did you consider any proposals that
work with type names?

Regards,
Jeff Davis





Re: Date-time extraneous fields with reserved keywords

2023-03-04 Thread Joseph Koshakow
On Sat, Mar 4, 2023 at 11:23 AM Keisuke Kuroda 
wrote:
>
>Good catch.
>Of the reserved words that are special values of type Date/Time,
>'now', 'today', 'tomorrow', 'yesterday', and 'allballs',
>I get an error even before applying the patch.

Thanks for pointing this out. After taking a look
at the code, 'now', 'today', 'tomorrow',
'yesterday', and 'allballs' all set the
appropriate tmask field which is what causes them
to error.

  case DTK_NOW:
tmask = (DTK_DATE_M | DTK_TIME_M | DTK_M(TZ));

  case DTK_YESTERDAY:
tmask = DTK_DATE_M;

  case DTK_TODAY:
tmask = DTK_DATE_M;

  case DTK_TOMORROW:
tmask = DTK_DATE_M;

  case DTK_ZULU:
tmask = (DTK_TIME_M | DTK_M(TZ));


while 'epoch', 'infinity', and '-infinity' do not
set tmask (note the default below handles all of
these fields)

  default:
  *dtype = val;

So I think a better fix here would be to also set
tmask for those three reserved keywords.


>One thing I noticed is that the following SQL
>returns normal results even after applying the patch.
>
>postgres=# select timestamp 'epoch 01:01:01';
>  timestamp
>-
> 1970-01-01 00:00:00
>(1 row)
>
>When 'epoch','infinity','-infinity' and time are specified together,
>the time specified in the SQL is not included in result.
>I think it might be better to assume that this pattern is also an
error.
>What do you think?

I agree this pattern should also be an error. I
think that the tmask approach will cause an error
for this pattern as well.

Thanks,
Joe Koshakow


Re: [HACKERS] make async slave to wait for lsn to be replayed

2023-03-04 Thread Kartyshov Ivan

Here I made new patch of feature, discussed above.

WAIT FOR procedure - waits for certain lsn on pause
==
Synopsis
==
SELECT pg_wait_lsn(‘LSN’, timeout) returns boolean

Where timeout = 0, will wait infinite without timeout
And if timeout = 1, then just check if lsn was replayed

How to use it
==

Greg Stark  wrote:

That said, I'm not a fan of the specific function names. Remember that
we have polymorphic functions so you could probably just have an
option argument:


If you have any example, I will be glade to see them. Ьy searches have
not been fruitful.

Michael Paquier wrote:

While looking at all the patches proposed, I have noticed that all the
approaches proposed force a wakeup of the waiters in the redo loop of
the startup process for each record, before reading the next record.
It strikes me that there is some interaction with custom resource
managers here, where it is possible to poke at the waiters not for
each record, but after reading some specific records.  Something
out-of-core would not be as responsive as the per-record approach,
still responsive enough that the waiters wait on input for an
acceptable amount of time, depending on the frequency of the records
generated by a primary to wake them up.  Just something that popped
into my mind while looking a bit at the threads.


I`ll work on this idea to have less impact on the redo system.

On 2023-03-02 13:33, Peter Eisentraut wrote:

But I wonder how a client is going to get the LSN.  How would all of
this be used by a client?

As I wrote earlier main purpose of the feature is to achieve
read-your-writes-consistency, while using async replica for reads and
primary for writes. In that case lsn of last modification is stored
inside application.


I'm tempted to think this could be a protocol-layer facility.  Every
query automatically returns the current LSN, and every query can also
send along an LSN to wait for, and the client library would just keep
track of the LSN for (what it thinks of as) the connection.  So you
get some automatic serialization without having to modify your client
code.

Yes it sounds very tempted. But I think community will be against it.

--
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Companydiff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index dbe9394762..422bb1ed82 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -43,6 +43,7 @@
 #include "backup/basebackup.h"
 #include "catalog/pg_control.h"
 #include "commands/tablespace.h"
+#include "commands/wait.h"
 #include "common/file_utils.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -1752,6 +1753,15 @@ PerformWalRecovery(void)
 break;
 			}
 
+			/*
+			 * If we replayed an LSN that someone was waiting for,
+			 * set latches in shared memory array to notify the waiter.
+			 */
+			if (XLogRecoveryCtl->lastReplayedEndRecPtr >= GetMinWaitedLSN())
+			{
+WaitSetLatch(XLogRecoveryCtl->lastReplayedEndRecPtr);
+			}
+
 			/* Else, try to fetch the next WAL record */
 			record = ReadRecord(xlogprefetcher, LOG, false, replayTLI);
 		} while (record != NULL);
diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile
index 48f7348f91..d8f6965d8c 100644
--- a/src/backend/commands/Makefile
+++ b/src/backend/commands/Makefile
@@ -61,6 +61,7 @@ OBJS = \
 	vacuum.o \
 	vacuumparallel.o \
 	variable.o \
-	view.o
+	view.o \
+	wait.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/commands/wait.c b/src/backend/commands/wait.c
new file mode 100644
index 00..3465673f0a
--- /dev/null
+++ b/src/backend/commands/wait.c
@@ -0,0 +1,275 @@
+/*-
+ *
+ * wait.c
+ *	  Implements wait lsn, which allows waiting for events such as
+ *	  LSN having been replayed on replica.
+ *
+ * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
+ * Portions Copyright (c) 2020, Regents of PostgresPro
+ *
+ * IDENTIFICATION
+ *	  src/backend/commands/wait.c
+ *
+ *-
+ */
+#include "postgres.h"
+
+#include 
+
+#include "access/xact.h"
+#include "access/xlogrecovery.h"
+#include "access/xlogdefs.h"
+#include "commands/wait.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "storage/backendid.h"
+#include "storage/pmsignal.h"
+#include "storage/proc.h"
+#include "storage/shmem.h"
+#include "storage/sinvaladt.h"
+#include "storage/spin.h"
+#include "utils/builtins.h"
+#include "utils/pg_lsn.h"
+#include "utils/timestamp.h"
+
+/* Add to shared memory array */
+static void AddWaitedLSN(XLogRecPtr lsn_to_wait);
+
+/* Shared memory structure */
+typedef struct
+{
+	int			backend_maxid;
+	pg_atomic_uint64 min_lsn; /* XLogRecPtr of minimal waited for LSN */
+	slock_t		mutex;
+	/* LSNs that 

Re: libpq-fe.h should compile *entirely* standalone

2023-03-04 Thread Andrew Dunstan


On 2023-03-03 Fr 13:46, Tom Lane wrote:

I wrote:

We can easily do better, as attached, but I wonder which other
headers should get the same treatment.

After a bit of further research I propose the attached.  I'm not
sure exactly what subset of ECPG headers is meant to be exposed
to clients, but we can adjust these patterns if new info emerges.

This is actually moving the inclusion-check goalposts quite far,
but HEAD seems to pass cleanly, and again we can always adjust later.
Any objections?




LGTM


cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

2023-03-04 Thread Amit Kapila
On Fri, Mar 3, 2023 at 1:09 PM Önder Kalacı  wrote:
>
>>
>> 5.
>>
>> +   /*
>> +* If index scans are disabled, use a sequential scan.
>> +*
>> +* Note that we do not use index scans below when enable_indexscan is
>> +* false. Allowing primary key or replica identity even when index 
>> scan is
>> +* disabled is the legacy behaviour. So we hesitate to move the below
>> +* enable_indexscan check to be done earlier in this function.
>> +*/
>> +   if (!enable_indexscan)
>> +   return InvalidOid;
>>
>> Since the document of enable_indexscan says "Enables or disables the query
>> planner's use of index-scan plan types. The default is on.", and we don't use
>> planner here, so I am not sure should we allow/disallow index scan in apply
>> worker based on this GUC.
>>
>
> Given Amit's suggestion on [1], I'm planning to drop this check altogether, 
> and
> rely on table storage parameters.
>

This still seems to be present in the latest version. I think we can
just remove this and then add the additional check as suggested by you
as part of the second patch.

Few other comments on latest version:
==
1.
+/*
+ * Returns true if the index is usable for replica identity full. For details,
+ * see FindUsableIndexForReplicaIdentityFull.
+ */
+bool
+IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo)
+{
+ bool is_btree = (indexInfo->ii_Am == BTREE_AM_OID);
+ bool is_partial = (indexInfo->ii_Predicate != NIL);
+ bool is_only_on_expression = IsIndexOnlyOnExpression(indexInfo);
+
+ if (is_btree && !is_partial && !is_only_on_expression)
+ {
+ return true;
...
...
+/*
+ * Returns the oid of an index that can be used via the apply worker. The index
+ * should be btree, non-partial and have at least one column reference (e.g.,
+ * should not consist of only expressions). The limitations arise from
+ * RelationFindReplTupleByIndex(), which is designed to handle PK/RI and these
+ * limitations are inherent to PK/RI.

By these two, the patch infers that it picks an index that adheres to
the limitations of PK/RI. Apart from unique, the other properties of
RI are "not partial, not deferrable, and include only columns marked
NOT NULL". See ATExecReplicaIdentity() for corresponding checks. We
don't try to ensure the last two from the list. It is fine to do so if
we document the reasons for the same in comments or we can even try to
enforce the remaining restrictions as well. For example, it should be
okay to allow NULL column values because we anyway compare the entire
tuple after getting the value from the index.

2.
+ {
+ /*
+ * This attribute is an expression, and
+ * FindUsableIndexForReplicaIdentityFull() was called earlier
+ * when the index for subscriber was selected. There, the indexes
+ * comprising *only* expressions have already been eliminated.
+ *
+ * Also, because PK/RI can't include expressions we
+ * sanity check the index is neither of those kinds.
+ */
+ Assert(!IdxIsRelationIdentityOrPK(rel, idxrel->rd_id));

This comment doesn't make much sense after you have moved the
corresponding Assert in RelationFindReplTupleByIndex(). Either we
should move or remove this Assert as well or at least update the
comments to reflect the latest code.

3. When FindLogicalRepUsableIndex() is invoked from
logicalrep_partition_open(), the current memory context would be
LogicalRepPartMapContext which would be a long-lived context and we
allocate memory for indexes in FindLogicalRepUsableIndex() which can
accumulate over a period of time. So, I think it would be better to
switch to the old context in logicalrep_partition_open() before
invoking FindLogicalRepUsableIndex() provided that is not a long-lived
context.

-- 
With Regards,
Amit Kapila.




Re: Fix comments in gistxlogDelete, xl_heap_freeze_page and xl_btree_delete

2023-03-04 Thread Drouvot, Bertrand

Hi,

On 3/3/23 6:53 PM, Robert Haas wrote:

On Fri, Mar 3, 2023 at 11:28 AM Drouvot, Bertrand
 wrote:

Thanks for having looked at it!


+1. Committed.



Thanks!

Not a big deal, but the commit message that has been used is not 100% accurate.

Indeed, for gistxlogDelete, that's the other way around (as
compare to what the commit message says).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com