Re: Cleaning up perl code

2024-05-20 Thread Dagfinn Ilmari Mannsåker
Alexander Lakhin  writes:

> Hello hackers,
>
> Please look at a bunch of unused variables and a couple of other defects
> I found in the perl code, maybe you'll find them worth fixing:

Nice cleanup!  Did you use some static analysis tool, or did you look for
them manually?  If I add [Variables::ProhibitUnusedVariables] to
src/tools/perlcheck/perlcriticrc, it finds a few more, see the attached
patch.

The scripts parsing errcodes.txt really should be refactored to use
a common module, but that's a patch for another day.
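For anyone curious what such a shared parsing helper might contain, here is a rough Python sketch of the matching logic the three generator scripts have in common. The regex mirrors the one in the patch below; the sample line is illustrative, not copied from errcodes.txt:

```python
import re

# A five-character SQLSTATE, a type letter (E/W/S), an errcode macro,
# and an optional condition name, separated by whitespace.
LINE_RE = re.compile(r'^(\S{5})\s+([EWS])\s+(\S+)(?:\s+(\S+))?')

def parse_errcode_line(line):
    """Return (sqlstate, type, macro, condition-or-None), or raise."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"unparseable errcodes.txt line: {line!r}")
    return match.groups()

# Illustrative line in the errcodes.txt style (not copied from the file):
print(parse_errcode_line("22012  E  ERRCODE_DIVISION_BY_ZERO  division_by_zero"))
```

Each script would then skip comments, blank lines, and `Section:` headers before calling the shared parser, and filter on the type letter as it does today.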

- ilmari

>From 6b096a39753338bb91add5fcf1ed963024e58c15 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Mon, 20 May 2024 19:55:20 +0100
Subject: [PATCH] Prohibit unused variables

---
 src/pl/plpgsql/src/generate-plerrcodes.pl | 6 ++
 src/pl/plpython/generate-spiexceptions.pl | 6 ++
 src/pl/tcl/generate-pltclerrcodes.pl  | 6 ++
 src/tools/perlcheck/perlcriticrc  | 2 ++
 src/tools/pgindent/pgindent   | 2 +-
 5 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/src/pl/plpgsql/src/generate-plerrcodes.pl b/src/pl/plpgsql/src/generate-plerrcodes.pl
index 1c662bc967..e969a4b33e 100644
--- a/src/pl/plpgsql/src/generate-plerrcodes.pl
+++ b/src/pl/plpgsql/src/generate-plerrcodes.pl
@@ -23,10 +23,8 @@
 	# Skip section headers
 	next if /^Section:/;
 
-	die unless /^([^\s]{5})\s+([EWS])\s+([^\s]+)(?:\s+)?([^\s]+)?/;
-
-	(my $sqlstate, my $type, my $errcode_macro, my $condition_name) =
-	  ($1, $2, $3, $4);
+	my ($type, $errcode_macro, $condition_name) =
+		/^[^\s]{5}\s+([EWS])\s+([^\s]+)(?:\s+)?([^\s]+)?/ or die;
 
 	# Skip non-errors
 	next unless $type eq 'E';
diff --git a/src/pl/plpython/generate-spiexceptions.pl b/src/pl/plpython/generate-spiexceptions.pl
index f0c5142be3..984017f212 100644
--- a/src/pl/plpython/generate-spiexceptions.pl
+++ b/src/pl/plpython/generate-spiexceptions.pl
@@ -23,10 +23,8 @@
 	# Skip section headers
 	next if /^Section:/;
 
-	die unless /^([^\s]{5})\s+([EWS])\s+([^\s]+)(?:\s+)?([^\s]+)?/;
-
-	(my $sqlstate, my $type, my $errcode_macro, my $condition_name) =
-	  ($1, $2, $3, $4);
+	my ($type, $errcode_macro, $condition_name) =
+		/^[^\s]{5}\s+([EWS])\s+([^\s]+)(?:\s+)?([^\s]+)?/ or die;
 
 	# Skip non-errors
 	next unless $type eq 'E';
diff --git a/src/pl/tcl/generate-pltclerrcodes.pl b/src/pl/tcl/generate-pltclerrcodes.pl
index fcac4d00a6..58eb6afefe 100644
--- a/src/pl/tcl/generate-pltclerrcodes.pl
+++ b/src/pl/tcl/generate-pltclerrcodes.pl
@@ -23,10 +23,8 @@
 	# Skip section headers
 	next if /^Section:/;
 
-	die unless /^([^\s]{5})\s+([EWS])\s+([^\s]+)(?:\s+)?([^\s]+)?/;
-
-	(my $sqlstate, my $type, my $errcode_macro, my $condition_name) =
-	  ($1, $2, $3, $4);
+	my ($type, $errcode_macro, $condition_name) =
+		/^[^\s]{5}\s+([EWS])\s+([^\s]+)(?:\s+)?([^\s]+)?/ or die;
 
 	# Skip non-errors
 	next unless $type eq 'E';
diff --git a/src/tools/perlcheck/perlcriticrc b/src/tools/perlcheck/perlcriticrc
index 4739e9f4f1..6053dfcc2a 100644
--- a/src/tools/perlcheck/perlcriticrc
+++ b/src/tools/perlcheck/perlcriticrc
@@ -15,6 +15,8 @@ verbose = %f: %m at line %l, column %c.  %e.  ([%p] Severity: %s)\n
 
 # Note: for policy descriptions see https://metacpan.org/dist/Perl-Critic
 
+[Variables::ProhibitUnusedVariables]
+severity = 5
 
 # allow octal constants with leading zeros
 [-ValuesAndExpressions::ProhibitLeadingZeros]
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index 48d83bc434..063ec8ce63 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -22,7 +22,7 @@ my $indent_opts =
 my $devnull = File::Spec->devnull;
 
 my ($typedefs_file, $typedef_str, @excludes,
-	$indent, $build, $diff,
+	$indent, $diff,
 	$check, $help, @commits,);
 
 $help = 0;
-- 
2.39.2



Re: Improving information_schema._pg_expandarray()

2024-05-13 Thread Dagfinn Ilmari Mannsåker
[ I got distracted while writing this follow-up and only just found it
  in my list of unsent Gnus buffers, and now it's probably too late to
  make it for 17, but here it is anyway while I remember. ]

Tom Lane  writes:

> I happened to notice that information_schema._pg_expandarray(),
> which has the nigh-unreadable definition
>
> AS 'select $1[s],
> s operator(pg_catalog.-) pg_catalog.array_lower($1,1) 
> operator(pg_catalog.+) 1
> from pg_catalog.generate_series(pg_catalog.array_lower($1,1),
> pg_catalog.array_upper($1,1),
> 1) as g(s)';
>
> can now be implemented using unnest():
>
> AS 'SELECT * FROM pg_catalog.unnest($1) WITH ORDINALITY';
>
> It seems to be slightly more efficient this way, but the main point
> is to make it more readable.

I didn't spot this until it got committed, but it got me wondering what
eliminating the wrapper function completely would look like, so I
whipped up the attached.  It instead calls UNNEST() laterally in the
queries, which has the side benefit of getting rid of several
subselects, one of which was particularly confusing.  In one place the
lateral form eliminated the need for WITH ORDINALITY as well.
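For readers less familiar with it, `WITH ORDINALITY` simply pairs each unnested element with its 1-based position. A loose Python analogy of the rewritten `_pg_index_position` logic (illustrative only, not how the backend executes the query):

```python
def unnest_with_ordinality(arr):
    """Loose analogue of SQL's unnest(arr) WITH ORDINALITY:
    (element, ordinal) pairs with 1-based ordinals."""
    return [(elem, i) for i, elem in enumerate(arr, start=1)]

def index_position(indkey, tcol):
    """Sketch of _pg_index_position: the position of table column
    number tcol within the index's column list, or None (NULL)."""
    for elem, ordinal in unnest_with_ordinality(indkey):
        if elem == tcol:
            return ordinal
    return None

print(index_position([3, 1, 4], 1))  # table column 1 is second in the index
```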


- ilmari

>From 9d1b2f2d16f10903d975a3bb7551a38c5ce62e15 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Thu, 28 Dec 2023 23:47:33 +
Subject: [PATCH] Eliminate information_schema._pg_expandarray completely

Commit 58054de2d0847c09ef091956f72ae5e9fb9a176e made it into a simple
wrapper around unnest, but we can simplify things further by calling
unnest directly in the queries.
---
 src/backend/catalog/information_schema.sql | 104 +
 src/backend/utils/adt/arrayfuncs.c |   3 -
 src/test/regress/expected/psql.out |  30 +++---
 src/test/regress/sql/psql.sql  |   2 +-
 4 files changed, 58 insertions(+), 81 deletions(-)

diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
index c4145131ce..cf25f5d1bc 100644
--- a/src/backend/catalog/information_schema.sql
+++ b/src/backend/catalog/information_schema.sql
@@ -39,22 +39,15 @@ SET search_path TO information_schema;
  * A few supporting functions first ...
  */
 
-/* Expand any 1-D array into a set with integers 1..N */
-CREATE FUNCTION _pg_expandarray(IN anyarray, OUT x anyelement, OUT n int)
-RETURNS SETOF RECORD
-LANGUAGE sql STRICT IMMUTABLE PARALLEL SAFE
-ROWS 100 SUPPORT pg_catalog.array_unnest_support
-AS 'SELECT * FROM pg_catalog.unnest($1) WITH ORDINALITY';
-
 /* Given an index's OID and an underlying-table column number, return the
  * column's position in the index (NULL if not there) */
 CREATE FUNCTION _pg_index_position(oid, smallint) RETURNS int
 LANGUAGE sql STRICT STABLE
 BEGIN ATOMIC
-SELECT (ss.a).n FROM
-  (SELECT information_schema._pg_expandarray(indkey) AS a
-   FROM pg_catalog.pg_index WHERE indexrelid = $1) ss
-  WHERE (ss.a).x = $2;
+SELECT ik.icol
+FROM pg_catalog.pg_index,
+ pg_catalog.unnest(indkey) WITH ORDINALITY ik(tcol, icol)
+WHERE indexrelid = $1 AND ik.tcol = $2;
 END;
 
 CREATE FUNCTION _pg_truetypid(pg_attribute, pg_type) RETURNS oid
@@ -1079,37 +1072,32 @@ GRANT SELECT ON enabled_roles TO PUBLIC;
 
 CREATE VIEW key_column_usage AS
 SELECT CAST(current_database() AS sql_identifier) AS constraint_catalog,
-   CAST(nc_nspname AS sql_identifier) AS constraint_schema,
+   CAST(nc.nspname AS sql_identifier) AS constraint_schema,
CAST(conname AS sql_identifier) AS constraint_name,
CAST(current_database() AS sql_identifier) AS table_catalog,
-   CAST(nr_nspname AS sql_identifier) AS table_schema,
+   CAST(nr.nspname AS sql_identifier) AS table_schema,
CAST(relname AS sql_identifier) AS table_name,
CAST(a.attname AS sql_identifier) AS column_name,
-   CAST((ss.x).n AS cardinal_number) AS ordinal_position,
+   CAST(ck.icol AS cardinal_number) AS ordinal_position,
CAST(CASE WHEN contype = 'f' THEN
-   _pg_index_position(ss.conindid, ss.confkey[(ss.x).n])
+   _pg_index_position(c.conindid, c.confkey[ck.icol])
  ELSE NULL
 END AS cardinal_number)
  AS position_in_unique_constraint
 FROM pg_attribute a,
- (SELECT r.oid AS roid, r.relname, r.relowner,
- nc.nspname AS nc_nspname, nr.nspname AS nr_nspname,
- c.oid AS coid, c.conname, c.contype, c.conindid,
- c.confkey, c.confrelid,
- _pg_expandarray(c.conkey) AS x
-  FROM pg_namespace nr, pg_class r, pg_namespace nc,
-   pg_constraint c
-  WHERE nr.oid = r.relnamespace
-AND r.oid = c.conrelid
-AND nc.oid = c.connamespace
-AND c.contype IN ('p', 'u', 'f')

Re: Allowing additional commas between columns, and at the end of the SELECT clause

2024-05-13 Thread Dagfinn Ilmari Mannsåker
Tom Lane  writes:

> =?utf-8?Q?Dagfinn_Ilmari_Manns=C3=A5ker?=  writes:
>> Matthias van de Meent  writes:
>>> Single trailing commas are a feature that's more and more common in
>>> languages, yes, but arbitrary excess commas are new to me. Could you
>>> provide some examples of popular languages which have that, as I can't
>>> think of any.
>
>> The only one I can think of is Perl, which I'm not sure counts as
>> popular any more.  JavaScript allows consecutive commas in array
>> literals, but they're not no-ops, they create empty array slots:
>
> I'm fairly down on this idea for SQL, because I think it creates
> ambiguity for the ROW() constructor syntax.  That is:
>
>   (x,y) is understood to be shorthand for ROW(x,y)
>
>   (x) is not ROW(x), it's just x
>
>   (x,) means what?

Python has a similar issue: (x, y) is a tuple, but (x) is just x, and
they use the trailing comma to disambiguate, so (x,) creates a
single-item tuple.  AFAIK it's the only place where the trailing comma
is significant.
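A quick illustration of the Python behaviour described above:

```python
x = 42

assert (x) == 42                 # just a parenthesized expression
assert not isinstance((x), tuple)
assert (x,) == (42,)             # the comma makes it a one-tuple
assert isinstance((x,), tuple)

# Elsewhere a trailing comma is allowed but insignificant:
assert (1, 2,) == (1, 2)
print(type((x)).__name__, type((x,)).__name__)
```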

> I realize the original proposal intended to restrict the legality of
> excess commas to only a couple of places, but to me that just flags
> it as a kluge.  ROW(...) ought to work pretty much the same as a
> SELECT list.

Yeah, a more principled approach would be to not special-case target
lists, but to allow one (and only one) trailing comma everywhere:
select, order by, group by, array constructors, row constructors,
everything that looks like a function call, etc.

> As already mentioned, if you can get some variant of this through the
> SQL standards process, we'll probably adopt it.  But I doubt that we
> want to get out front of the committee in this area.

Agreed.

>   regards, tom lane

- ilmari




Re: Allowing additional commas between columns, and at the end of the SELECT clause

2024-05-13 Thread Dagfinn Ilmari Mannsåker
Matthias van de Meent  writes:

> On Mon, 13 May 2024 at 10:42, Artur Formella  
> wrote:
>> Motivation:
>> Commas of this type are allowed in many programming languages, in some
>> it is even recommended to use them at the ends of lists or objects.
>
> Single trailing commas are a feature that's more and more common in
> languages, yes, but arbitrary excess commas are new to me. Could you
> provide some examples of popular languages which have that, as I can't
> think of any.

The only one I can think of is Perl, which I'm not sure counts as
popular any more.  JavaScript allows consecutive commas in array
literals, but they're not no-ops, they create empty array slots:

❯ js
Welcome to Node.js v18.19.0.
Type ".help" for more information.
> [1,,2,,]
[ 1, <1 empty item>, 2, <1 empty item> ]
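For contrast, Python accepts a single trailing comma in list literals but rejects consecutive commas outright rather than creating holes; a small sketch:

```python
# A single trailing comma in a list literal is fine:
ok = eval("[1, 2,]")
assert ok == [1, 2]

# Consecutive commas are a syntax error, not empty slots:
try:
    compile("[1,,2]", "<example>", "eval")
    raised = False
except SyntaxError:
    raised = True
assert raised
print(ok, raised)
```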

- ilmari




Re: Is there an undocumented Syntax Check in Meson?

2024-05-09 Thread Dagfinn Ilmari Mannsåker
Andres Freund  writes:

> Hi,
>
> On 2024-05-09 20:53:27 +0100, Dagfinn Ilmari Mannsåker wrote:
>> Andres Freund  writes:
>> > On 2024-05-09 20:12:38 +0100, Dagfinn Ilmari Mannsåker wrote:
>> >> Attached is a patch which adds a check-docs target for meson, which
>> >> takes 0.3s on my laptop.
>> >> +checkdocs = custom_target('check-docs',
>> >> +  input: 'postgres.sgml',
>> >> +  output: 'check-docs',
>> >> +  depfile: 'postgres-full.xml.d',
>> >> +  command: [xmllint,  '--nonet', '--valid', '--noout',
>> >> +'--path', '@OUTDIR@', '@INPUT@'],
>> >> +  depends: doc_generated,
>> >> +  build_by_default: false,
>> >> +)
>> >> +alias_target('check-docs', checkdocs)
>> >
>> > Isn't the custom target redundant with postgres_full_xml? I.e. you could 
>> > just
>> > have the alias_target depend on postgres_full_xml?
>> 
>> We could, but that would actually rebuild postgres-full.xml, not just
>> check the syntax (but that only takes 0.1-0.2s longer),
>
> I don't think this is performance critical enough to worry about 0.1s. If
> anything I think the performance argument goes the other way round - doing the
> validation work multiple times is a waste of time...

These targets are both build_by_default: false, so the checkdocs target
won't also be built when building the docs.  But reusing the
postgres_full_xml target will save half a second of the half-minute
build time if you then go on to actually build the docs without changing
anything after the syntax check passes.

>> and only run if the docs have been modified since it was last built (which I
>> guess is fine, since if you haven't modified the docs you can't have
>> introduced any syntax errors).
>
> That actually seems good to me.
>
>
>> It's already possible to run that target directly, i.e.
>> 
>> ninja doc/src/sgml/postgres-full.xml
>> 
>> We could just document that in the list of meson doc targets, but a
>> shortcut alias would roll off the fingers more easily and be more
>> discoverable.
>
> Agreed.

Here's a v2 patch that does it that way.  Should we document that
check-docs actually builds postgres-full.xml, and if so, where?

> Greetings,
>
> Andres Freund

- ilmari


>From 20b5a8e96ec88022b63062f9e4d74d46f09cedd6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Thu, 9 May 2024 19:59:46 +0100
Subject: [PATCH v2] Add a check-docs target for meson

This is just an alias for the `postgres-full.xml` target, but is
more discoverable and quicker to type.  This allows checking the
syntax of the docs much faster than actually building them, similar to
`make -C doc/src/sgml check`.
---
 doc/src/sgml/meson.build   | 2 ++
 doc/src/sgml/targets-meson.txt | 1 +
 2 files changed, 3 insertions(+)

diff --git a/doc/src/sgml/meson.build b/doc/src/sgml/meson.build
index e418de83a7..3eb0b99462 100644
--- a/doc/src/sgml/meson.build
+++ b/doc/src/sgml/meson.build
@@ -112,6 +112,8 @@ postgres_full_xml = custom_target('postgres-full.xml',
 docs += postgres_full_xml
 alldocs += postgres_full_xml
 
+# Shortcut to the above for checking doc syntax
+alias_target('check-docs', postgres_full_xml)
 
 if xsltproc_bin.found()
   xsltproc_flags = [
diff --git a/doc/src/sgml/targets-meson.txt b/doc/src/sgml/targets-meson.txt
index bd470c87a7..ba426707be 100644
--- a/doc/src/sgml/targets-meson.txt
+++ b/doc/src/sgml/targets-meson.txt
@@ -26,6 +26,7 @@ Documentation Targets:
   doc/src/sgml/postgres-US.pdf  Build documentation in PDF format, with US letter pages
   doc/src/sgml/postgres.htmlBuild documentation in single-page HTML format
   alldocs   Build documentation in all supported formats
+  check-docsCheck the syntax of the documentation
 
 Installation Targets:
   install   Install postgres, excluding documentation
-- 
2.39.2



Re: Is there an undocumented Syntax Check in Meson?

2024-05-09 Thread Dagfinn Ilmari Mannsåker
Andres Freund  writes:

> Hi,
>
> On 2024-05-09 20:12:38 +0100, Dagfinn Ilmari Mannsåker wrote:
>> Attached is a patch which adds a check-docs target for meson, which
>> takes 0.3s on my laptop.
>
> Nice.
>
>
>> +checkdocs = custom_target('check-docs',
>> +  input: 'postgres.sgml',
>> +  output: 'check-docs',
>> +  depfile: 'postgres-full.xml.d',
>> +  command: [xmllint,  '--nonet', '--valid', '--noout',
>> +'--path', '@OUTDIR@', '@INPUT@'],
>> +  depends: doc_generated,
>> +  build_by_default: false,
>> +)
>> +alias_target('check-docs', checkdocs)
>
> Isn't the custom target redundant with postgres_full_xml? I.e. you could just
> have the alias_target depend on postgres_full_xml?

We could, but that would actually rebuild postgres-full.xml, not just
check the syntax (but that only takes 0.1-0.2s longer), and only run if
the docs have been modified since it was last built (which I guess is
fine, since if you haven't modified the docs you can't have introduced
any syntax errors).

It's already possible to run that target directly, i.e.

ninja doc/src/sgml/postgres-full.xml

We could just document that in the list of meson doc targets, but a
shortcut alias would roll off the fingers more easily and be more
discoverable.

> Greetings,
>
> Andres Freund

- ilmari




Re: open items

2024-05-09 Thread Dagfinn Ilmari Mannsåker
Robert Haas  writes:

> * Register ALPN protocol id with IANA. From the mailing list thread,
> it is abundantly clear that IANA is in no hurry to finish dealing with
> what seems to be a completely pro forma request from our end. I think
> we just have to be patient.

This appears to have been approved without anyone mentioning it on the
list, and the registry now lists "postgresql" at the bottom:

https://www.iana.org/assignments/tls-extensiontype-values/tls-extensiontype-values.xhtml#alpn-protocol-ids

- ilmari




Re: Is there an undocumented Syntax Check in Meson?

2024-05-09 Thread Dagfinn Ilmari Mannsåker
"David G. Johnston"  writes:

> $subject
>
> Make has one:
> https://www.postgresql.org/docs/current/docguide-build.html#DOCGUIDE-BUILD-SYNTAX-CHECK
>
> This needs updating:
> https://www.postgresql.org/docs/current/docguide-build-meson.html
>
> I've been using "ninja html" which isn't shown here.

The /devel/ version has a link to the full list of doc targets:

https://www.postgresql.org/docs/devel/install-meson.html#TARGETS-MESON-DOCUMENTATION

Attached is a patch which adds a check-docs target for meson, which
takes 0.3s on my laptop.

- ilmari

>From d54104493b9d97b95a890e47b395723d9b152447 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Thu, 9 May 2024 19:59:46 +0100
Subject: [PATCH] Add a check-docs target for meson

This allows checking the syntax of the docs much faster than
rebuilding them all, similar to `make -C doc/src/sgml check`.
---
 doc/src/sgml/meson.build   | 10 ++
 doc/src/sgml/targets-meson.txt |  1 +
 2 files changed, 11 insertions(+)

diff --git a/doc/src/sgml/meson.build b/doc/src/sgml/meson.build
index e418de83a7..51eed7b31d 100644
--- a/doc/src/sgml/meson.build
+++ b/doc/src/sgml/meson.build
@@ -112,6 +112,16 @@ postgres_full_xml = custom_target('postgres-full.xml',
 docs += postgres_full_xml
 alldocs += postgres_full_xml
 
+checkdocs = custom_target('check-docs',
+  input: 'postgres.sgml',
+  output: 'check-docs',
+  depfile: 'postgres-full.xml.d',
+  command: [xmllint,  '--nonet', '--valid', '--noout',
+'--path', '@OUTDIR@', '@INPUT@'],
+  depends: doc_generated,
+  build_by_default: false,
+)
+alias_target('check-docs', checkdocs)
 
 if xsltproc_bin.found()
   xsltproc_flags = [
diff --git a/doc/src/sgml/targets-meson.txt b/doc/src/sgml/targets-meson.txt
index bd470c87a7..ba426707be 100644
--- a/doc/src/sgml/targets-meson.txt
+++ b/doc/src/sgml/targets-meson.txt
@@ -26,6 +26,7 @@ Documentation Targets:
   doc/src/sgml/postgres-US.pdf  Build documentation in PDF format, with US letter pages
   doc/src/sgml/postgres.htmlBuild documentation in single-page HTML format
   alldocs   Build documentation in all supported formats
+  check-docsCheck the syntax of the documentation
 
 Installation Targets:
   install   Install postgres, excluding documentation
-- 
2.39.2



Re: Is there an undocumented Syntax Check in Meson?

2024-05-09 Thread Dagfinn Ilmari Mannsåker
"David G. Johnston"  writes:

> $subject
>
> Make has one:
> https://www.postgresql.org/docs/current/docguide-build.html#DOCGUIDE-BUILD-SYNTAX-CHECK
>
> This needs updating:
> https://www.postgresql.org/docs/current/docguide-build-meson.html
>
> I've been using "ninja html" which isn't shown here.  Also, as a sanity
> check, running that command takes my system 1 minute.  Any idea what
> percentile that falls into?

My laptop (8-core i7-11800H @ 2.30GHz) takes 22s to do `ninja html`
after `ninja clean`.

> David J.

- ilmari




Re: First draft of PG 17 release notes

2024-05-09 Thread Dagfinn Ilmari Mannsåker
Dagfinn Ilmari Mannsåker  writes:

> Bruce Momjian  writes:
>
>> I have committed the first draft of the PG 17 release notes;  you can
>> see the results here:
>>
>>  https://momjian.us/pgsql_docs/release-17.html
>
> My name is listed twice in the "Improve psql tab completion" item.

You can move one of them to "Track DEALLOCATE in pg_stat_statements",
which Michael and I co-authored.

- ilmari




Re: First draft of PG 17 release notes

2024-05-09 Thread Dagfinn Ilmari Mannsåker
Bruce Momjian  writes:

> I have committed the first draft of the PG 17 release notes;  you can
> see the results here:
>
>   https://momjian.us/pgsql_docs/release-17.html

My name is listed twice in the "Improve psql tab completion" item.

- ilmari




Re: Expand applicability of aggregate's sortop optimization

2024-05-08 Thread Dagfinn Ilmari Mannsåker
Matthias van de Meent  writes:

> PFA the small patch that implements this.

I don't have enough knowledge to have an opinion on most of the patch
other than it looks okay at a glance, but the list API usage could be
updated to more modern variants:

> diff --git a/src/backend/optimizer/plan/planagg.c 
> b/src/backend/optimizer/plan/planagg.c
> index afb5445b77..d8479fe286 100644
> --- a/src/backend/optimizer/plan/planagg.c
> +++ b/src/backend/optimizer/plan/planagg.c
> @@ -253,6 +253,16 @@ can_minmax_aggs(PlannerInfo *root, List **context)
>   if (list_length(aggref->args) != 1)
>   return false;   /* it couldn't be MIN/MAX */
>  
> + /*
> +  * We might implement the optimization when a FILTER clause is 
> present
> +  * by adding the filter to the quals of the generated subquery. 
>  For
> +  * now, just punt.
> +  */
> + if (aggref->aggfilter != NULL)
> + return false;
> +
> + curTarget = (TargetEntry *) linitial(aggref->args);

This could be linitial_node(TargetEntry, aggref->args).

> + if (list_length(aggref->aggorder) > 1)
> + return false;
> +
> + orderClause = castNode(SortGroupClause, 
> linitial(aggref->aggorder));

This could be linitial_node(SortGroupClause, aggref->aggorder).

> + if (orderClause->sortop != aggsortop)
> + {
> + List   *btclasses;
> + ListCell *cell;
> + boolmatch = false;
> +
> + btclasses = 
> get_op_btree_interpretation(orderClause->sortop);
> +
> + foreach(cell, btclasses)
> + {
> + OpBtreeInterpretation *interpretation;
> + interpretation = (OpBtreeInterpretation 
> *) lfirst(cell);

This could be foreach_ptr(OpBtreeInterpretation, interpretation, btclasses),
which also eliminates the need for the explicit `ListCell *` variable
and lfirst() call.

- ilmari




Re: Doc anchors for COPY formats

2024-04-25 Thread Dagfinn Ilmari Mannsåker
Daniel Gustafsson  writes:

>> On 25 Apr 2024, at 13:23, Dagfinn Ilmari Mannsåker  wrote:
>
>> An IRC conversation just now made me notice that it would be handy to
>> have stable links for the descriptions of the various COPY formats, per
>> the attached patch.
>
> No objections, that seems like a perfectly reasonable idea.  Maybe we should set an
> xreflabel while at it for completeness' sake?

xreflabel only affects the link text when using <xref>, but there are no
links to these sections in the docs, so I don't see much point.  One
thing that could make sense would be to link to the File Formats section
from the FORMAT keyword docs, like the attached.

- ilmari

>From 8921e4b6d86d29210b9d59511def16ba34865f62 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Thu, 25 Apr 2024 12:09:14 +0100
Subject: [PATCH v2] Add anchors for COPY format descriptions

And link the section from the FORMAT keyword description.
---
 doc/src/sgml/ref/copy.sgml | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 1ce19668d8..f52ad1e34f 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -220,6 +220,7 @@
   csv (Comma Separated Values),
   or binary.
   The default is text.
+  See  below for details.
  
 

@@ -627,10 +628,10 @@
 
  
 
- 
+ 
   File Formats
 
-  
+  
Text Format
 

@@ -770,7 +771,7 @@

   
 
-  
+  
CSV Format
 

@@ -855,7 +856,7 @@
 
   
 
-  
+  
Binary Format
 

-- 
2.39.2



Doc anchors for COPY formats

2024-04-25 Thread Dagfinn Ilmari Mannsåker
Hi hackers,

An IRC conversation just now made me notice that it would be handy to
have stable links for the descriptions of the various COPY formats, per
the attached patch.

- ilmari

>From 5e59273370a39cd46627b89145a7c06dd6f00f7d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Thu, 25 Apr 2024 12:09:14 +0100
Subject: [PATCH] Add anchors for COPY format descriptions

---
 doc/src/sgml/ref/copy.sgml | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 1ce19668d8..6851293854 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -630,7 +630,7 @@
  
   File Formats
 
-  
+  
Text Format
 

@@ -770,7 +770,7 @@

   
 
-  
+  
CSV Format
 

@@ -855,7 +855,7 @@
 
   
 
-  
+  
Binary Format
 

-- 
2.39.2



Re: Idea Feedback: psql \h misses -> Offers Links?

2024-04-18 Thread Dagfinn Ilmari Mannsåker
Kirk Wolak  writes:

> On Thu, Apr 18, 2024 at 2:37 PM Peter Eisentraut 
> wrote:
>
>> On 17.04.24 19:47, Kirk Wolak wrote:
>> > *Example:*
>> > \h current_setting
>> > No help available for "current_setting".
>> > Try \h with no arguments to see available help.
>> >
>> > https://www.postgresql.org/search/?u=%2Fdocs%2F16%2F=current_setting
>> > 
>>
>> One problem is that this search URL does not actually produce any useful
>> information about current_setting.
>
> I see what you mean, but doesn't that imply our web search feature is
> weak?  That's the full name of an existing function, and it's in the index.
> But it cannot be found if searched from the website?

While I do think we could do a better job of providing links directly to
the documentation of functions and config parameters, I wouldn't say
that the search result is _completely_ useless in this case.  The first
hit is https://www.postgresql.org/docs/16/functions-admin.html, which is
where current_setting() is documented (it's even the first function on
that page, but that's just luck in this case).

- ilmari




Re: documentation structure

2024-04-18 Thread Dagfinn Ilmari Mannsåker
Corey Huinker  writes:

>>
>> I haven't dealt with variadic yet, since the two styles are visually
>> different, not just markup (<optional> renders as [...]).
>>
>> The two styles for variadic are what I call caller-style:
>>
>>concat ( val1 "any" [, val2 "any" [, ...] ] )
>>format(formatstr text [, formatarg "any" [, ...] ])
>>
>
> While this style is obviously clumsier for us to compose, it does avoid
> relying on the user understanding what the word variadic means. Searching
> through online documentation of the python *args parameter, the word
> variadic never comes up, the closest they get is "variable length
> argument". I realize that python is not SQL, but I think it's a good point
> of reference for what concepts the average reader is likely to know.

Yeah, we can't expect everyone wanting to call a built-in function to
know how they would define an equivalent one themselves. In that case I
propose marking it up like this:

<function>format</function> (
    <parameter>formatstr</parameter> <type>text</type>
    <optional>, <parameter>formatarg</parameter> <type>"any"</type>
    <optional>, ...</optional></optional> )
<returnvalue>text</returnvalue>


> Looking at the patch, I think it is good, though I'd consider doing some
> indentation for the nested <optional>s to allow the author to do more
> visual tag-matching. The ']'s were sufficiently visually distinct that we
> didn't really need or want nesting, but <optional> is just another tag to
> my eyes in a sea of tags.

The requisite nesting when there are multiple optional parameters makes
it annoying to wrap and indent it "properly" per XML convention, but how
about something like this, with each parameter on a line of its own, and
all the closing </optional> tags on one line?

<function>regexp_substr</function> (
    <parameter>string</parameter> <type>text</type>,
    <parameter>pattern</parameter> <type>text</type>
    <optional>, <parameter>start</parameter> <type>integer</type>
    <optional>, <parameter>N</parameter> <type>integer</type>
    <optional>, <parameter>flags</parameter> <type>text</type>
    <optional>, <parameter>subexpr</parameter> <type>integer</type>
    </optional></optional></optional></optional> )
<returnvalue>text</returnvalue>

A lot of functions mostly follow this style, except they tend to put the
first parameter on the same line of the function namee, even when that
makes the line overly long. I propose going the other way, with each
parameter on a line of its own, even if the first one would fit after
the function name, except the whole parameter list fits after the
function name.

Also, when there's only one optional argument, or they're independently
optional, not nested, the <optional> tag should go on the same line as
the parameter.

<function>substring</function> (
    <parameter>bits</parameter> <type>bit</type>
    <optional> FROM <parameter>start</parameter>
    <type>integer</type></optional>
    <optional> FOR <parameter>count</parameter>
    <type>integer</type></optional> )
<returnvalue>bit</returnvalue>


I'm not quite sure what to do with things like json_object, which have even
more complex nesting of optional parameters, but I do think the current
200+ character lines are too long.

- ilmari




Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

2024-04-18 Thread Dagfinn Ilmari Mannsåker
Alexander Lakhin  writes:

> Hi Alexander,
>
> 18.04.2024 13:35, Alexander Korotkov wrote:
>>
>> The revised patchset is attached.
>> 1) I've split the fix for the CommandCounterIncrement() issue and the
>> fix for relation persistence issue into a separate patch.
>> 2) I've validated that the lock on the new partition is held in
>> createPartitionTable() after ProcessUtility() as pointed out by
>> Robert.  So, no need to place the lock again.
>> 3) Added fix for problematic error message as a separate patch [1].
>> 4) Added rename "salemans" => "salesmen" for tests as a separate patch.
>>
>> I think these fixes are reaching committable shape, but I'd like
>> someone to check it before I push.
>
> I think the feature implementation should also provide tab completion for
> SPLIT/MERGE.
> (ALTER TABLE t S
> fills in only SET now.)

Here's a patch for that.  One thing I noticed while testing it was that
the tab completion for partitions (Query_for_partition_of_table) shows
all the schemas in the DB, even ones that don't contain any partitions
of the table being altered.

- ilmari

>From 26db03b7a7675aa7dbff1f18ee084296caa1e181 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Thu, 18 Apr 2024 17:47:22 +0100
Subject: [PATCH] =?UTF-8?q?Add=20tab=20completion=20for=20ALTER=20TABLE=20?=
 =?UTF-8?q?=E2=80=A6=20SPLIT|MERGE=20PARTITION(S)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 src/bin/psql/tab-complete.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 6fee3160f0..97cd5d9f62 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2353,6 +2353,7 @@ psql_completion(const char *text, int start, int end)
 	  "OWNER TO", "SET", "VALIDATE CONSTRAINT",
 	  "REPLICA IDENTITY", "ATTACH PARTITION",
 	  "DETACH PARTITION", "FORCE ROW LEVEL SECURITY",
+	  "SPLIT PARTITION", "MERGE PARTITIONS (",
 	  "OF", "NOT OF");
 	/* ALTER TABLE xxx ADD */
 	else if (Matches("ALTER", "TABLE", MatchAny, "ADD"))
@@ -2609,10 +2610,10 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("FROM (", "IN (", "WITH (");
 
 	/*
-	 * If we have ALTER TABLE  DETACH PARTITION, provide a list of
+	 * If we have ALTER TABLE  DETACH|SPLIT PARTITION, provide a list of
 	 * partitions of .
 	 */
-	else if (Matches("ALTER", "TABLE", MatchAny, "DETACH", "PARTITION"))
+	else if (Matches("ALTER", "TABLE", MatchAny, "DETACH|SPLIT", "PARTITION"))
 	{
 		set_completion_reference(prev3_wd);
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_partition_of_table);
@@ -2620,6 +2621,19 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("ALTER", "TABLE", MatchAny, "DETACH", "PARTITION", MatchAny))
 		COMPLETE_WITH("CONCURRENTLY", "FINALIZE");
 
+	/* ALTER TABLE  SPLIT PARTITION  */
+	else if (Matches("ALTER", "TABLE", MatchAny, "SPLIT", "PARTITION", MatchAny))
+		COMPLETE_WITH("INTO ( PARTITION");
+
+	/* ALTER TABLE  MERGE PARTITIONS ( */
+	else if (Matches("ALTER", "TABLE", MatchAny, "MERGE", "PARTITIONS", "("))
+	{
+		set_completion_reference(prev4_wd);
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_partition_of_table);
+	}
+	else if (Matches("ALTER", "TABLE", MatchAny, "MERGE", "PARTITIONS", "(*)"))
+		COMPLETE_WITH("INTO");
+
 	/* ALTER TABLE  OF */
 	else if (Matches("ALTER", "TABLE", MatchAny, "OF"))
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_composite_datatypes);
-- 
2.39.2



Re: documentation structure

2024-04-17 Thread Dagfinn Ilmari Mannsåker
Andres Freund  writes:

> Hi,
>
> On 2024-04-17 12:07:24 +0100, Dagfinn Ilmari Mannsåker wrote:
>> Andres Freund  writes:
>> > I think the manual work for writing signatures in sgml is not 
>> > insignificant,
>> > nor is the volume of sgml for them. Manually maintaining the signatures 
>> > makes
>> > it impractical to significantly improve the presentation - which I don't 
>> > think
>> > is all that great today.
>> 
>> And it's very inconsistent.  For example, some functions use <optional>
>> tags for optional parameters, others use square brackets, and some use
>> VARIADIC to indicate variadic parameters, others use
>> ellipses (sometimes in <optional> tags or brackets).
>
> That seems almost inevitably the outcome of many people having to manually
> infer the recommended semantics, for writing something boring but nontrivial,
> from a 30k line file.

As Corey mentioned elsethread, having a markup style guide (maybe a
comment at the top of the file?) would be nice.

>> > And the lack of argument names in the pg_proc entries is occasionally 
>> > fairly
>> > annoying, because a \df+ doesn't provide enough information to use 
>> > functions.
>> 
>> I was also annoyed by this the other day (specifically wrt. the boolean
>> arguments to pg_ls_dir),
>
> My bane is regexp_match et al, I have given up on remembering the argument
> order.

There's a thread elsewhere about those specifically, but I can't be
bothered to find the link right now.

>> and started whipping up a Perl script to parse func.sgml and generate
>> missing proargnames values for pg_proc.dat, which is how I discovered the
>> above.
>
> Nice.
>
>> The script currently has a pile of hacky regexes to cope with that,
>> so I'd be happy to submit a doc patch to turn it into actual markup to get
>> rid of that, if people think that's a worthwhile use of time and won't clash
>> with any other plans for the documentation.
>
> I guess it's a bit hard to say without knowing how voluminous the changes
> would be. If we end up rewriting the whole file the tradeoff is less clear
> than if it's a dozen inconsistent entries.

It turned out to not be that many that used [] for optional parameters,
see the attached patch. 

I haven't dealt with variadic yet, since the two styles are visually
different, not just markup (<optional>...</optional> renders as [...]).

The two styles for variadic are what I call caller-style:

   concat ( val1 "any" [, val2 "any" [, ...] ] )
   format(formatstr text [, formatarg "any" [, ...] ])

which shows more clearly how you'd call it, versus definition-style:

   num_nonnulls ( VARIADIC "any" )
   jsonb_extract_path ( from_json jsonb, VARIADIC path_elems text[] )

which matches the CREATE FUNCTION statement.  I don't have a strong
opinion on which we should use, but we should be consistent.

> Greetings,
>
> Andres Freund

- ilmari

>From f71e0669eb25b205bd5065f15657ba6d749261f3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Wed, 17 Apr 2024 16:00:52 +0100
Subject: [PATCH] func.sgml: Consistently use <optional> to indicate optional
 parameters

Some functions were using square brackets instead.
---
 doc/src/sgml/func.sgml | 54 +-
 1 file changed, 27 insertions(+), 27 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 8dfb42ad4d..afaaf61d69 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -3036,7 +3036,7 @@
  concat
 
 concat ( val1 "any"
- [, val2 "any" [, ...] ] )
+ , val2 "any" [, ...]  )
 text


@@ -3056,7 +3056,7 @@
 
 concat_ws ( sep text,
 val1 "any"
-[, val2 "any" [, ...] ] )
+, val2 "any" [, ...]  )
 text


@@ -3076,7 +3076,7 @@
  format
 
 format ( formatstr text
-[, formatarg "any" [, ...] ] )
+, formatarg "any" [, ...]  )
 text


@@ -3170,7 +3170,7 @@
  parse_ident
 
 parse_ident ( qualified_identifier text
-[, strict_mode boolean DEFAULT true ] )
+, strict_mode boolean DEFAULT true  )
 text[]


@@ -3309,8 +3309,8 @@
  regexp_count
 
 regexp_count ( string text, pattern text
- [, start integer
- [, flags text ] ] )
+ , start integer
+ , flags text   )
 integer


@@ -3331,11 +3331,11 @@
  regexp_instr
 
 regexp_instr ( string text, pattern text
- [, start integer
- [, N integer
- [, en

Re: documentation structure

2024-04-17 Thread Dagfinn Ilmari Mannsåker
Andres Freund  writes:

> Definitely shouldn't be the same in all cases, but I think there's a decent
> number of cases where they can be the same. The differences between the two is
> often minimal today.
>
> Entirely randomly chosen example:
>
> { oid => '2825',
>   descr => 'slope of the least-squares-fit linear equation determined by the 
> (X, Y) pairs',
>   proname => 'regr_slope', prokind => 'a', proisstrict => 'f',
>   prorettype => 'float8', proargtypes => 'float8 float8',
>   prosrc => 'aggregate_dummy' },
>
> and
>
>   
>
> 
>  regression slope
> 
> 
>  regr_slope
> 
> regr_slope ( Y 
> double precision, X double 
> precision )
> double precision
>
>
> Computes the slope of the least-squares-fit linear equation determined
> by the (X, Y)
> pairs.
>
>Yes
>   
>
>
> The description is quite similar, but the pg_proc entry lacks argument names. 
>
>
>> This sounds to me like it would be a painful exercise with not a
>> lot of benefit in the end.
>
> I think the manual work for writing signatures in sgml is not insignificant,
> nor is the volume of sgml for them. Manually maintaining the signatures makes
> it impractical to significantly improve the presentation - which I don't think
> is all that great today.

And it's very inconsistent.  For example, some functions use <optional>
tags for optional parameters, others use square brackets, and some use
VARIADIC to indicate variadic parameters, others use
ellipses (sometimes in <optional> tags or brackets).

> And the lack of argument names in the pg_proc entries is occasionally fairly
> annoying, because a \df+ doesn't provide enough information to use functions.

I was also annoyed by this the other day (specifically wrt. the boolean
arguments to pg_ls_dir), and started whipping up a Perl script to parse
func.sgml and generate missing proargnames values for pg_proc.dat, which
is how I discovered the above.  The script currently has a pile of hacky
regexes to cope with that, so I'd be happy to submit a doc patch to turn
it into actual markup to get rid of that, if people think that's a
worthwhile use of time and won't clash with any other plans for the
documentation.

> It'd also be quite useful if clients could render more of the documentation
> for functions. People are used to language servers providing full
> documentation for functions etc...

A more user-friendly version of \df+ (maybe spelled \hf, for symmetry
with \h for commands?) would certainly be nice.

> Greetings,
>
> Andres Freund

- ilmari




Re: IPC::Run::time[r|out] vs our TAP tests

2024-04-05 Thread Dagfinn Ilmari Mannsåker
Tom Lane  writes:

> Erik Wienhold  writes:
>> Libedit 20191025-3.1 is the first version where ":{?VERB" works as
>> expected.  The previous release 20190324-3.1 still produces the escaped
>> output that Michael found.  That narrows down the changes to everything
>> between [1] (changed on 2019-03-24 but not included in 20190324-3.1) and
>> [2] (both inclusive).
>
> Hm.  I just installed NetBSD 8.2 in a VM, and it passes this test:
>
> # +++ tap install-check in src/bin/psql +++
> t/001_basic.pl ... ok 
> t/010_tab_completion.pl .. ok
> t/020_cancel.pl .. ok   
> All tests successful.
>
> So it seems like the bug does not exist in any currently-supported
> NetBSD release.  Debian has been known to ship obsolete libedit
> versions, though.

Both the current (bookworm/12) and previous (bullseye/11) versions of
Debian have new enough libedits to not be affected by this bug:

libedit| 3.1-20181209-1 | oldoldstable   | source
libedit| 3.1-20191231-2 | oldstable  | source
libedit| 3.1-20221030-2 | stable | source
libedit| 3.1-20230828-1 | testing| source
libedit| 3.1-20230828-1 | unstable   | source
libedit| 3.1-20230828-1 | unstable-debug | source

But in bullseye they decided that OpenSSL is a system library as far as
the GPL is concerned, so they link psql directly to readline.

And even before then their psql wrapper would LD_PRELOAD readline
instead if installed, so approximately nobody actually ever used psql
with libedit on Debian.

>   regards, tom lane

- ilmari




Re: Using the %m printf format more

2024-04-04 Thread Dagfinn Ilmari Mannsåker
On Thu, 4 Apr 2024, at 03:35, Michael Paquier wrote:
> On Fri, Mar 22, 2024 at 01:58:24PM +0000, Dagfinn Ilmari Mannsåker wrote:
>> Here's an updated patch that adds such a comment.  I'll add it to the
>> commitfest later today unless someone commits it before then.
>
> I see no problem to do that now rather than later.  So, done to make
> pg_regress able to use %m.

Thanks!

-- 
- ilmari




Re: Using the %m printf format more

2024-03-22 Thread Dagfinn Ilmari Mannsåker
Dagfinn Ilmari Mannsåker  writes:

> Michael Paquier  writes:
>
>> On Wed, Mar 13, 2024 at 02:33:52PM +0100, Peter Eisentraut wrote:
>>> The 0002 patch looks sensible.  It would be good to fix that, otherwise it
>>> could have some confusing outcomes in the future.
>>
>> You mean if we begin to use %m in future callers of
>> emit_tap_output_v(), hypothetically?  That's a fair argument.
>
> Yeah, developers would rightfully expect to be able to use %m with
> anything that takes a printf format string.  Case in point: when I was
> first doing the conversion I did change the bail() and diag() calls in
> pg_regress to %m, and only while I was preparing the patch for
> submission did I think to check the actual implementation to see if it
> was safe to do so.
>
> The alternative would be to document that you can't use %m with these
> functions, which is silly IMO, given how simple the fix is.
>
> One minor improvement I can think of is to add a comment by the
> save_errno declaration noting that it's needed in order to support %m.

Here's an updated patch that adds such a comment.  I'll add it to the
commitfest later today unless someone commits it before then.

> - ilmari

>From 4312d746a722582202a013c7199f5c42e88db951 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Mon, 11 Mar 2024 11:08:14 +
Subject: [PATCH v2] Save errno in emit_tap_output_v() and use %m in callers
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This defends against the fprintf() calls before vprintf(…, fmt, …)
clobbering errno, thus breaking %m.
---
 src/test/regress/pg_regress.c | 84 ++-
 1 file changed, 34 insertions(+), 50 deletions(-)

diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index 76f01c73f5..ea94d874b0 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -341,6 +341,14 @@ emit_tap_output_v(TAPtype type, const char *fmt, va_list argp)
 {
 	va_list		argp_logfile;
 	FILE	   *fp;
+	int			save_errno;
+
+	/*
+	 * The fprintf() calls used to output TAP protocol elements might clobber
+	 * errno, so we need to save it and restore it before vfprintf()-ing the
+	 * user's format string, in case it contains %m placeholders.
+	 */
+	save_errno = errno;
 
 	/*
 	 * Diagnostic output will be hidden by prove unless printed to stderr. The
@@ -379,9 +387,13 @@ emit_tap_output_v(TAPtype type, const char *fmt, va_list argp)
 		if (logfile)
 			fprintf(logfile, "# ");
 	}
+	errno = save_errno;
 	vfprintf(fp, fmt, argp);
 	if (logfile)
+	{
+		errno = save_errno;
 		vfprintf(logfile, fmt, argp_logfile);
+	}
 
 	/*
 	 * If we are entering into a note with more details to follow, register
@@ -492,10 +504,7 @@ make_temp_sockdir(void)
 
 	temp_sockdir = mkdtemp(template);
 	if (temp_sockdir == NULL)
-	{
-		bail("could not create directory \"%s\": %s",
-			 template, strerror(errno));
-	}
+		bail("could not create directory \"%s\": %m", template);
 
 	/* Stage file names for remove_temp().  Unsafe in a signal handler. */
 	UNIXSOCK_PATH(sockself, port, temp_sockdir);
@@ -616,8 +625,7 @@ load_resultmap(void)
 		/* OK if it doesn't exist, else complain */
 		if (errno == ENOENT)
 			return;
-		bail("could not open file \"%s\" for reading: %s",
-			 buf, strerror(errno));
+		bail("could not open file \"%s\" for reading: %m", buf);
 	}
 
 	while (fgets(buf, sizeof(buf), f))
@@ -1046,10 +1054,7 @@ config_sspi_auth(const char *pgdata, const char *superuser_name)
 #define CW(cond)	\
 	do { \
 		if (!(cond)) \
-		{ \
-			bail("could not write to file \"%s\": %s", \
- fname, strerror(errno)); \
-		} \
+			bail("could not write to file \"%s\": %m", fname); \
 	} while (0)
 
 	res = snprintf(fname, sizeof(fname), "%s/pg_hba.conf", pgdata);
@@ -1064,8 +1069,7 @@ config_sspi_auth(const char *pgdata, const char *superuser_name)
 	hba = fopen(fname, "w");
 	if (hba == NULL)
 	{
-		bail("could not open file \"%s\" for writing: %s",
-			 fname, strerror(errno));
+		bail("could not open file \"%s\" for writing: %m", fname);
 	}
 	CW(fputs("# Configuration written by config_sspi_auth()\n", hba) >= 0);
 	CW(fputs("host all all 127.0.0.1/32  sspi include_realm=1 map=regress\n",
@@ -1079,8 +1083,7 @@ config_sspi_auth(const char *pgdata, const char *superuser_name)
 	ident = fopen(fname, "w");
 	if (ident == NULL)
 	{
-		bail("could not open file \"%s\" for writing: %s",
-			 fname, strerror(errno));
+		bail("could not open file \"%s\" for writing: %m", fname);
 	}
 	CW(fputs("# Configuration written by config_sspi_auth()\n", ident) &

Re: What about Perl autodie?

2024-03-18 Thread Dagfinn Ilmari Mannsåker
Daniel Gustafsson  writes:

>> On 18 Mar 2024, at 07:27, Peter Eisentraut  wrote:
>
>> After some pondering, I figured the exclude list is better.  
>
> Agreed.
>
>> So here is a squashed patch, also with a complete commit message.
>
> Looks good from a read-through.  It would have been nice to standardize on
> using one of "|| die" and "or die" consistently but that's clearly not for 
> this
> body of work.

"or die" is generally the preferred form, since || has higher precedence
than comma, so it's easy to make mistakes if you don't parenthesise the
function args, like:

   open my $fh, '>', $filename || die "can't open $filename: $!";

which will only fail if $filename is falsy (i.e. undef, "", or "0").
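A quick demonstration of the difference (a sketch assuming a perl binary in
PATH and a path that doesn't exist on the system):

```shell
# With ||, the check binds to the filename argument, which is truthy,
# so the failed open() goes completely unnoticed and the script
# carries on as if nothing happened:
perl -e 'open my $fh, "<", "/nonexistent/file" || die; print "no error\n"'

# With the low-precedence "or", it is open()'s return value that gets
# tested, so the failure is reported as intended:
perl -e 'open my $fh, "<", "/nonexistent/file" or die "open failed: $!"'
```

The first command prints "no error" and exits successfully; the second dies
with the actual open() error.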

- ilmari




Re: Using the %m printf format more

2024-03-14 Thread Dagfinn Ilmari Mannsåker
Michael Paquier  writes:

> On Wed, Mar 13, 2024 at 02:33:52PM +0100, Peter Eisentraut wrote:
>> The 0002 patch looks sensible.  It would be good to fix that, otherwise it
>> could have some confusing outcomes in the future.
>
> You mean if we begin to use %m in future callers of
> emit_tap_output_v(), hypothetically?  That's a fair argument.

Yeah, developers would rightfully expect to be able to use %m with
anything that takes a printf format string.  Case in point: when I was
first doing the conversion I did change the bail() and diag() calls in
pg_regress to %m, and only while I was preparing the patch for
submission did I think to check the actual implementation to see if it
was safe to do so.

The alternative would be to document that you can't use %m with these
functions, which is silly IMO, given how simple the fix is.

One minor improvement I can think of is to add a comment by the
save_errno declaration noting that it's needed in order to support %m.

- ilmari




Re: Using the %m printf format more

2024-03-13 Thread Dagfinn Ilmari Mannsåker
Michael Paquier  writes:

> On Mon, Mar 11, 2024 at 11:19:16AM +0000, Dagfinn Ilmari Mannsåker wrote:
>> On closer look, fprintf() is actually the only errno-clobbering function
>> it calls, I was just hedging my bets in that statement.
>
> This makes the whole simpler, so I'd be OK with that.  I am wondering
> if others have opinions to offer about that.

If no one chimes in in the next couple of days I'll add it to the
commitfest so it doesn't get lost.

> I've applied v2-0001 for now, as it is worth on its own and it shaves
> a bit of code.

Thanks!

- ilmari




Re: Using the %m printf format more

2024-03-11 Thread Dagfinn Ilmari Mannsåker
Michael Paquier  writes:

> On Wed, Mar 06, 2024 at 07:11:19PM +0000, Dagfinn Ilmari Mannsåker wrote:
>
>> The attached patch does so everywhere appropriate. One place where it's
>> not appropriate is the TAP-emitting functions in pg_regress, since those
>> call fprintf()
>
> I am not really following your argument with pg_regress.c and
> fprintf().  d6c55de1f99a should make that possible even in the case of
> emit_tap_output_v(), no? 

The problem isn't that emit_tap_output_v() doesn't support %m, which it
does, but that it potentially calls fprintf() to output TAP protocol
elements such as "\n" and "# " before it calls vprintf(…, fmt, …), and
those calls might clobber errno.  An option is to make it save errno at
the start and restore it before the vprintf() calls, as in the second
attached patch.
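Reduced to a minimal sketch (the real emit_tap_output_v() is more involved,
and the helper name and message below are made up for illustration; shown
with strerror(errno) rather than %m so it is portable to any libc):

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Sketch of the save/restore pattern: the prefix fprintf() may clobber
 * errno, so save it on entry and restore it immediately before the
 * fprintf() that evaluates the caller's message.
 */
static void
emit_diag(FILE *fp, const char *prefix, const char *msg)
{
	int			save_errno = errno;

	fprintf(fp, "%s", prefix);	/* protocol element; may change errno */

	errno = save_errno;			/* restore before formatting the message */
	fprintf(fp, "%s: %s\n", msg, strerror(errno));
}
```

Here only one fprintf() precedes the message, but the same restore is
needed before each output call that formats the user's string.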

>> and other potentially errno-modifying functions before
>> evaluating the format string.
>
> Sure.

On closer look, fprintf() is actually the only errno-clobbering function
it calls, I was just hedging my bets in that statement.

- ilmari

>From 3727341aad84fbd275acb24f92591cc734fdd6a7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Wed, 6 Mar 2024 16:58:33 +
Subject: [PATCH v2 1/2] Use %m printf format instead of strerror(errno) where
 appropriate

Now that all live branches (12-) support %m in printf formats, let's
use it everywhere appropriate.

Particularly, we're _not_ converting the TAP output calls in
pg_regress, since those functions call fprintf() and friends before
evaluating the format string, which can clobber errno.
---
 src/backend/postmaster/postmaster.c   | 21 ++--
 src/backend/postmaster/syslogger.c|  2 +-
 src/backend/utils/misc/guc.c  |  9 +-
 src/bin/initdb/findtimezone.c |  4 +-
 src/bin/pg_ctl/pg_ctl.c   | 74 +++---
 src/bin/pg_dump/compress_gzip.c   |  2 +-
 src/bin/pg_dump/compress_none.c   |  5 +-
 src/bin/pg_upgrade/check.c| 41 +++-
 src/bin/pg_upgrade/controldata.c  |  6 +-
 src/bin/pg_upgrade/exec.c | 14 ++-
 src/bin/pg_upgrade/file.c | 98 +--
 src/bin/pg_upgrade/function.c |  3 +-
 src/bin/pg_upgrade/option.c   | 10 +-
 src/bin/pg_upgrade/parallel.c | 12 +--
 src/bin/pg_upgrade/pg_upgrade.c   |  4 +-
 src/bin/pg_upgrade/relfilenumber.c|  5 +-
 src/bin/pg_upgrade/tablespace.c   |  4 +-
 src/bin/pg_upgrade/version.c  |  9 +-
 src/common/psprintf.c |  4 +-
 src/interfaces/ecpg/preproc/ecpg.c| 12 +--
 src/port/path.c   |  3 +-
 src/test/isolation/isolationtester.c  |  2 +-
 .../modules/libpq_pipeline/libpq_pipeline.c   |  4 +-
 src/tools/ifaddrs/test_ifaddrs.c  |  2 +-
 24 files changed, 158 insertions(+), 192 deletions(-)

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index f3c09e8dc0..af8a1efe66 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -1375,12 +1375,12 @@ PostmasterMain(int argc, char *argv[])
 
 			/* Make PID file world readable */
 			if (chmod(external_pid_file, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH) != 0)
-write_stderr("%s: could not change permissions of external PID file \"%s\": %s\n",
-			 progname, external_pid_file, strerror(errno));
+write_stderr("%s: could not change permissions of external PID file \"%s\": %m\n",
+			 progname, external_pid_file);
 		}
 		else
-			write_stderr("%s: could not write external PID file \"%s\": %s\n",
-		 progname, external_pid_file, strerror(errno));
+			write_stderr("%s: could not write external PID file \"%s\": %m\n",
+		 progname, external_pid_file);
 
 		on_proc_exit(unlink_external_pid_file, 0);
 	}
@@ -1589,8 +1589,8 @@ checkControlFile(void)
 	{
 		write_stderr("%s: could not find the database system\n"
 	 "Expected to find it in the directory \"%s\",\n"
-	 "but could not open file \"%s\": %s\n",
-	 progname, DataDir, path, strerror(errno));
+	 "but could not open file \"%s\": %m\n",
+	 progname, DataDir, path);
 		ExitPostmaster(2);
 	}
 	FreeFile(fp);
@@ -6277,15 +6277,13 @@ read_backend_variables(char *id, Port **port, BackgroundWorker **worker)
 	fp = AllocateFile(id, PG_BINARY_R);
 	if (!fp)
 	{
-		write_stderr("could not open backend variables file \"%s\": %s\n",
-	 id, strerror(errno));
+		write_stderr("could not open backend variables file \"%s\": %m\n", id);
 		exit(1);
 	}
 
	if (fread(&param, sizeof(param), 1

Using the %m printf format more

2024-03-06 Thread Dagfinn Ilmari Mannsåker
Hi hackers,

I just noticed that commit d93627bc added a bunch of pg_fatal() calls
using %s and strerror(errno), which could be written more concisely as
%m.  I'm assuming this was done because the surrounding code also uses
this pattern, and hadn't been changed to use %m when support for that
was added to snprintf.c to avoid backporting hazards.  However, that
support was in v12, which is now the oldest still-supported back branch,
so we can safely make that change.

The attached patch does so everywhere appropriate. One place where it's
not appropriate is the TAP-emitting functions in pg_regress, since those
call fprintf() and other potentially errno-modifying functions before
evaluating the format string.

- ilmari

>From 17ad27665821decf9116723fe8873bd256e9d75d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Wed, 6 Mar 2024 16:58:33 +
Subject: [PATCH] Use %m printf format instead of strerror(errno) where
 appropriate

Now that all live branches (12-) support %m in printf formats, let's
use it everywhere appropriate.

Particularly, we're _not_ converting the TAP output calls in
pg_regress, since those functions call fprintf() and friends before
evaluating the format string, which can clobber errno.
---
 src/backend/postmaster/postmaster.c   | 21 ++--
 src/backend/postmaster/syslogger.c|  2 +-
 src/backend/utils/misc/guc.c  |  9 +-
 src/bin/initdb/findtimezone.c |  4 +-
 src/bin/pg_ctl/pg_ctl.c   | 74 +++---
 src/bin/pg_dump/compress_gzip.c   |  2 +-
 src/bin/pg_dump/compress_none.c   |  5 +-
 src/bin/pg_upgrade/check.c| 41 +++-
 src/bin/pg_upgrade/controldata.c  |  6 +-
 src/bin/pg_upgrade/exec.c | 14 ++-
 src/bin/pg_upgrade/file.c | 98 +--
 src/bin/pg_upgrade/function.c |  3 +-
 src/bin/pg_upgrade/option.c   | 10 +-
 src/bin/pg_upgrade/parallel.c | 12 +--
 src/bin/pg_upgrade/pg_upgrade.c   |  4 +-
 src/bin/pg_upgrade/relfilenumber.c|  5 +-
 src/bin/pg_upgrade/tablespace.c   |  4 +-
 src/bin/pg_upgrade/version.c  |  9 +-
 src/common/psprintf.c |  4 +-
 src/interfaces/ecpg/preproc/ecpg.c| 12 +--
 src/test/isolation/isolationtester.c  |  2 +-
 .../modules/libpq_pipeline/libpq_pipeline.c   |  4 +-
 src/tools/ifaddrs/test_ifaddrs.c  |  2 +-
 23 files changed, 157 insertions(+), 190 deletions(-)

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index f3c09e8dc0..af8a1efe66 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -1375,12 +1375,12 @@ PostmasterMain(int argc, char *argv[])
 
 			/* Make PID file world readable */
 			if (chmod(external_pid_file, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH) != 0)
-write_stderr("%s: could not change permissions of external PID file \"%s\": %s\n",
-			 progname, external_pid_file, strerror(errno));
+write_stderr("%s: could not change permissions of external PID file \"%s\": %m\n",
+			 progname, external_pid_file);
 		}
 		else
-			write_stderr("%s: could not write external PID file \"%s\": %s\n",
-		 progname, external_pid_file, strerror(errno));
+			write_stderr("%s: could not write external PID file \"%s\": %m\n",
+		 progname, external_pid_file);
 
 		on_proc_exit(unlink_external_pid_file, 0);
 	}
@@ -1589,8 +1589,8 @@ checkControlFile(void)
 	{
 		write_stderr("%s: could not find the database system\n"
 	 "Expected to find it in the directory \"%s\",\n"
-	 "but could not open file \"%s\": %s\n",
-	 progname, DataDir, path, strerror(errno));
+	 "but could not open file \"%s\": %m\n",
+	 progname, DataDir, path);
 		ExitPostmaster(2);
 	}
 	FreeFile(fp);
@@ -6277,15 +6277,13 @@ read_backend_variables(char *id, Port **port, BackgroundWorker **worker)
 	fp = AllocateFile(id, PG_BINARY_R);
 	if (!fp)
 	{
-		write_stderr("could not open backend variables file \"%s\": %s\n",
-	 id, strerror(errno));
+		write_stderr("could not open backend variables file \"%s\": %m\n", id);
 		exit(1);
 	}
 
	if (fread(&param, sizeof(param), 1, fp) != 1)
 	{
-		write_stderr("could not read from backend variables file \"%s\": %s\n",
-	 id, strerror(errno));
+		write_stderr("could not read from backend variables file \"%s\": %m\n", id);
 		exit(1);
 	}
 
@@ -6293,8 +6291,7 @@ read_backend_variables(char *id, Port **port, BackgroundWorker **worker)
 	FreeFile(fp);
 	if (unlink(id) != 0)
 	{
-		write_stderr("could not remove file \"%s\": %s\n",
-	 id, strerror(errno));
+		write_stderr("could not remove file \"%s\": %m\n", id);
 		exit(1);
 	}
 #else
diff --git a/src/backend/postmaster/syslogger.c b/src/backend/postmaster/syslogger.c
index c2a6a226e7..d9d042f562 100644
--- 

Re: What about Perl autodie?

2024-02-08 Thread Dagfinn Ilmari Mannsåker
Daniel Gustafsson  writes:

>> On 8 Feb 2024, at 16:53, Tom Lane  wrote:
>
>> 2. Don't wait, migrate them all now.  This would mean requiring
>> Perl 5.10.1 or later to run the TAP tests, even in back branches.
>> 
>> I think #2 might not be all that radical.  We have nothing older
>> than 5.14.0 in the buildfarm, so we don't really have much grounds
>> for claiming that 5.8.3 will work today.  And 5.10.1 came out in
>> 2009, so how likely is it that anyone cares anymore?
>
> I would vote for this option, if we don't run the trailing edge anywhere where
> breakage is visible to developers then it is like you say, far from guaranteed
> to work.

The oldest Perl I'm aware of on a still-supported (fsvo) OS is RHEL 6,
which shipped 5.10.1 and has Extended Life-cycle Support until
2024-06-30.

For comparison, last year at the Perl Toolchain Summit in Lyon we
decided that toolchain modules (the modules needed to build, test and
install CPAN distributions) are only required to support versions of Perl
up to 10 years old, i.e. currently 5.18 (but there's a one-time
exemption to keep it to 5.16 until RHEL 7 goes out of maintenance
support on 2024-06-30).

- ilmari




Re: 2024-02-08 release announcement draft

2024-02-08 Thread Dagfinn Ilmari Mannsåker
jian he  writes:

> On Thu, Feb 8, 2024 at 1:17 PM Tom Lane  wrote:
>>
>> "Jonathan S. Katz"  writes:
>> > On 2/6/24 3:19 AM, jian he wrote:
>> >> On Tue, Feb 6, 2024 at 12:43 PM Jonathan S. Katz  
>> >> wrote:
>> >>> * In PL/pgSQL, support SQL commands that are `CREATE FUNCTION`/`CREATE
>> >>> PROCEDURE` with SQL-standard bodies.
>>
>> >> https://git.postgresql.org/cgit/postgresql.git/commit/?id=57b440ec115f57ff6e6a3f0df26063822e3123d2
>> >> says only for plpgsql routine or DO block.
>>
>> > I had some trouble wordsmithing this, but the commit title is pretty
>> > clear to me so I opted for that.
>>
>> Your text seems fine to me.  I'm confused about the objection here:
>> exactly what uses of plpgsql aren't in a routine or DO block?
>>
>> regards, tom lane
>
> I guess I was confused with cases like this
> `
> create function test1() returns int language plpgsql
> begin atomic
> select unique1 from tenk1 limit 1;
> end ;
> `
> looking back, the original text seems fine.

The word "routine" is the SQL standard's common term for both functions
and procedures.

- ilmari




Re: doc patch: Spell I/O consistently

2024-02-06 Thread Dagfinn Ilmari Mannsåker
Michael Paquier  writes:

> On Mon, Feb 05, 2024 at 03:59:27PM +0000, Dagfinn Ilmari Mannsåker wrote:
>> I just noticed that a couple of places in the docs spell I/O as IO or
>> even io when not referring to literal table/view/column names or values
>> therein.  Here's a patch to fix them.
>
> Makes sense to me to be consistent in these sections of the docs, so
> applied.  Thanks!

Thanks!

- ilmari




doc patch: Spell I/O consistently

2024-02-05 Thread Dagfinn Ilmari Mannsåker
Hi Hackers,

I just noticed that a couple of places in the docs spell I/O as IO or
even io when not referring to literal table/view/column names or values
therein.  Here's a patch to fix them.

- ilmari


>From ed5f9ce738dd6356d5d68e4cfed95d8d98d2cde5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Mon, 5 Feb 2024 15:52:08 +
Subject: [PATCH] doc: Spell I/O consistently

The pg_stat_io and pg_stat_copy_progress view docs spelled I/O as IO
or even io in some places when not referring to literal names or
string values.
---
 doc/src/sgml/monitoring.sgml | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index d9b8b37585..5cf9363ac8 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -2584,7 +2584,7 @@
   vacuum: I/O operations performed outside of shared
   buffers while vacuuming and analyzing permanent relations. Temporary
   table vacuums use the same local buffer pool as other temporary table
-  IO operations and are tracked in context
+  I/O operations and are tracked in context
   normal.
  
 
@@ -2860,7 +2860,7 @@
 Columns tracking I/O time will only be non-zero when
  is enabled. The user should be
 careful when referencing these columns in combination with their
-corresponding IO operations in case track_io_timing
+corresponding I/O operations in case track_io_timing
 was not enabled for the entire time since the last stats reset.

   
@@ -5734,7 +5734,7 @@
type text
   
   
-   The io type that the data is read from or written to:
+   The I/O type that the data is read from or written to:
FILE, PROGRAM,
PIPE (for COPY FROM STDIN and
COPY TO STDOUT), or CALLBACK
-- 
2.39.2



Re: Patch: Add parse_type Function

2024-02-05 Thread Dagfinn Ilmari Mannsåker
Jim Jones  writes:

> 1) The function causes a crash when called with a NULL parameter:
>
> SELECT * FROM parse_type(NULL);
> server closed the connection unexpectedly
>     This probably means the server terminated abnormally
>     before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> The connection to the server was lost. Attempting reset: Failed.
>
> You have to check it in the beginning of function and either return an
> error message or just NULL, e.g
>
>     if (PG_ARGISNULL(0))
>     PG_RETURN_NULL();

Instead of handling this in the function body, the function should be
declared as strict.  This is in fact the default in pg_proc.h,

/* strict with respect to NULLs? */
bool		proisstrict BKI_DEFAULT(t);

so removing the explicit proisstrict => 'f' from the pg_proc.dat entry
will fix it with no additional code.
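For illustration, a hypothetical pg_proc.dat entry (the oid and argument
types here are invented, not taken from the actual patch) showing the
one-line change:

```
# Hypothetical pg_proc.dat entry; once the explicit override is
# removed, proisstrict defaults to 't' and the function is never
# invoked with NULL arguments.
{ oid => '9999', descr => 'parse a type name',
  proname => 'parse_type', prorettype => 'record',
  proargtypes => 'text',
  # proisstrict => 'f',   # <- delete this line; the default is 't'
  prosrc => 'parse_type' },
```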

> 2) Add a function call with NULL in the regression tests
>
> SELECT * FROM parse_type(NULL);
>
> 3) Add the expected behaviour of calling the function with NULL in the
> docs (error message or null)

Once the function is declared strict, I don't think either of these is
necessary: function strictness is tested elsewhere, and it's the default
behaviour.  The only functions that explicitly say they return NULL on
NULL inputs are quote_literal (because you might expect it to return the
string 'NULL', but there's quote_nullable for that) and xmlexists (which
I don't see any particular reason for).

- ilmari




Re: Bytea PL/Perl transform

2024-01-30 Thread Dagfinn Ilmari Mannsåker
Pavel Stehule  writes:

> út 30. 1. 2024 v 17:46 odesílatel Dagfinn Ilmari Mannsåker <
> ilm...@ilmari.org> napsal:
>
>> Pavel Stehule  writes:
>>
>> > út 30. 1. 2024 v 17:18 odesílatel Dagfinn Ilmari Mannsåker <
>> > ilm...@ilmari.org> napsal:
>> >
>> >> Pavel Stehule  writes:
>> >>
>> >> > út 30. 1. 2024 v 16:43 odesílatel Dagfinn Ilmari Mannsåker <
>> >> > ilm...@ilmari.org> napsal:
>> >> >
>> >> >> Pavel Stehule  writes:
>> >> >>
>> >> >> > I inserted perl reference support - hstore_plperl and json_plperl
>> does
>> >> >> it.
>> >> >> >
>> >> >> > +<->/* Dereference references recursively. */
>> >> >> > +<->while (SvROK(in))
>> >> >> > +<-><-->in = SvRV(in);
>> >> >>
>> >> >> That code in hstore_plperl and json_plperl is only relevant because
>> they
>> >> >> deal with non-scalar values (hashes for hstore, and also arrays for
>> >> >> json) which must be passed as references.  The recursive nature of
>> the
>> >> >> dereferencing is questionable, and masked the bug fixed by commit
>> >> >> 1731e3741cbbf8e0b4481665d7d523bc55117f63.
>> >> >>
>> >> >> bytea_plperl only deals with scalars (specifically strings), so
>> should
>> >> >> not concern itself with references.  In fact, this code breaks
>> returning
>> >> >> objects with overloaded stringification, for example:
>> >> >>
>> >> >> CREATE FUNCTION plperlu_overload() RETURNS bytea LANGUAGE plperlu
>> >> >>   TRANSFORM FOR TYPE bytea
>> >> >>   AS $$
>> >> >> package StringOverload { use overload '""' => sub { "stuff" }; }
>> >> >> return bless {}, "StringOverload";
>> >> >>   $$;
>> >> >>
>> >> >> This makes the server crash with an assertion failure from Perl
>> because
>> >> >> SvPVbyte() was passed a non-scalar value:
>> >> >>
>> >> >> postgres: ilmari regression_bytea_plperl [local] SELECT: sv.c:2865:
>> >> >> Perl_sv_2pv_flags:
>> >> >> Assertion `SvTYPE(sv) != SVt_PVAV && SvTYPE(sv) != SVt_PVHV &&
>> >> SvTYPE(sv)
>> >> >> != SVt_PVFM' failed.
>> >> >>
>> >> >> If I remove the dereferencing loop it succeeds:
>> >> >>
>> >> >> SELECT encode(plperlu_overload(), 'escape') AS string;
>> >> >>  string
>> >> >> 
>> >> >>  stuff
>> >> >> (1 row)
>> >> >>
>> >> >> Attached is a v2 patch which removes the dereferencing and includes
>> the
>> >> >> above example as a test.
>> >> >>
>> >> >
>> >> > But without dereference it returns bad value.
>> >>
>> >> Where exactly does it return a bad value?  The existing tests pass, and
>> >> the one I included shows that it does the right thing in that case too.
>> >> If you pass it an unblessed reference it returns the stringified version
>> >> of that, as expected.
>> >>
>> >
>> > ugly test code
>> >
>> > (2024-01-30 13:44:28) postgres=# CREATE or replace FUNCTION
>> > perl_inverse_bytes(bytea) RETURNS bytea
>> > TRANSFORM FOR TYPE bytea
>> > AS $$ my $bytes =  pack 'H*', '0123'; my $ref = \$bytes;
>>
>> You are returning a reference, not a string.
>>
>
> I know, but for this case, should not be raised an error?

I don't think so, as I explained in my previous reply:

> There's no reason to ban references, that would break every Perl
> programmer's expectations.

To elaborate on this: when a function is defined to return a string
(which bytea effectively is, as far as Perl is concerned), I as a Perl
programmer would expect PL/Perl to just stringify whatever value I
returned, according to the usual Perl rules.
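To illustrate those rules outside PL/Perl (a standalone sketch, not taken
from the patch):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# An unblessed reference stringifies to its type and address.
my $str = "stuff";
my $ref = \$str;
print "$ref\n";    # prints something like "SCALAR(0x55d3...)"; address varies

# An object whose class overloads '""' stringifies via that overload.
package StringOverload { use overload '""' => sub { "stuff" } }
my $obj = bless {}, 'StringOverload';
print "$obj\n";    # prints "stuff"
```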

I also said:

> If we really want to be strict, we should at least allow references to
> objects that overload stringification, as they are explicitly designed
> to be well-behaved as strings.  But that would be a lot of extra code
> for very little benefit over just letting Perl stringify everything.

By "a lot of code", I mean everything `string_amg`-related in the
amagic_applies() function
(https://github.com/Perl/perl5/blob/v5.38.0/gv.c#L3401-L3545).  We can't
just call it: it's only available since Perl 5.38 (released last year),
and we support Perl versions all the way back to 5.14.

- ilmari




Re: Bytea PL/Perl transform

2024-01-30 Thread Dagfinn Ilmari Mannsåker
Pavel Stehule  writes:

> út 30. 1. 2024 v 17:18 odesílatel Dagfinn Ilmari Mannsåker <
> ilm...@ilmari.org> napsal:
>
>> Pavel Stehule  writes:
>>
>> > út 30. 1. 2024 v 16:43 odesílatel Dagfinn Ilmari Mannsåker <
>> > ilm...@ilmari.org> napsal:
>> >
>> >> Pavel Stehule  writes:
>> >>
>> >> > I inserted perl reference support - hstore_plperl and json_plperl does
>> >> it.
>> >> >
>> >> > +<->/* Dereference references recursively. */
>> >> > +<->while (SvROK(in))
>> >> > +<-><-->in = SvRV(in);
>> >>
>> >> That code in hstore_plperl and json_plperl is only relevant because they
>> >> deal with non-scalar values (hashes for hstore, and also arrays for
>> >> json) which must be passed as references.  The recursive nature of the
>> >> dereferencing is questionable, and masked the bug fixed by commit
>> >> 1731e3741cbbf8e0b4481665d7d523bc55117f63.
>> >>
>> >> bytea_plperl only deals with scalars (specifically strings), so should
>> >> not concern itself with references.  In fact, this code breaks returning
>> >> objects with overloaded stringification, for example:
>> >>
>> >> CREATE FUNCTION plperlu_overload() RETURNS bytea LANGUAGE plperlu
>> >>   TRANSFORM FOR TYPE bytea
>> >>   AS $$
>> >> package StringOverload { use overload '""' => sub { "stuff" }; }
>> >> return bless {}, "StringOverload";
>> >>   $$;
>> >>
>> >> This makes the server crash with an assertion failure from Perl because
>> >> SvPVbyte() was passed a non-scalar value:
>> >>
>> >> postgres: ilmari regression_bytea_plperl [local] SELECT: sv.c:2865:
>> >> Perl_sv_2pv_flags:
>> >> Assertion `SvTYPE(sv) != SVt_PVAV && SvTYPE(sv) != SVt_PVHV &&
>> SvTYPE(sv)
>> >> != SVt_PVFM' failed.
>> >>
>> >> If I remove the dereferencing loop it succeeds:
>> >>
>> >> SELECT encode(plperlu_overload(), 'escape') AS string;
>> >>  string
>> >> 
>> >>  stuff
>> >> (1 row)
>> >>
>> >> Attached is a v2 patch which removes the dereferencing and includes the
>> >> above example as a test.
>> >>
>> >
>> > But without dereference it returns bad value.
>>
>> Where exactly does it return a bad value?  The existing tests pass, and
>> the one I included shows that it does the right thing in that case too.
>> If you pass it an unblessed reference it returns the stringified version
>> of that, as expected.
>>
>
> ugly test code
>
> (2024-01-30 13:44:28) postgres=# CREATE or replace FUNCTION
> perl_inverse_bytes(bytea) RETURNS bytea
> TRANSFORM FOR TYPE bytea
> AS $$ my $bytes =  pack 'H*', '0123'; my $ref = \$bytes;

You are returning a reference, not a string.

> return $ref;
> $$ LANGUAGE plperlu;
> CREATE FUNCTION
> (2024-01-30 13:44:33) postgres=# select perl_inverse_bytes(''), ' '::bytea;
> ┌──┬───┐
> │  perl_inverse_bytes  │ bytea │
> ╞══╪═══╡
> │ \x5343414c41522830783130656134333829 │ \x20  │
> └──┴───┘
> (1 row)

~=# select encode('\x5343414c41522830783130656134333829', 'escape');
┌───┐
│  encode   │
├───┤
│ SCALAR(0x10ea438) │
└───┘

This is how Perl stringifies references in the absence of overloading.
Return the byte string directly from your function and it will do the
right thing:

CREATE FUNCTION plperlu_bytes() RETURNS bytea LANGUAGE plperlu
 TRANSFORM FOR TYPE bytea
 AS $$ return pack  'H*', '0123'; $$;

SELECT plperlu_bytes();
 plperlu_bytes
---
 \x0123
(1 row)


- ilmari




Re: Bytea PL/Perl transform

2024-01-30 Thread Dagfinn Ilmari Mannsåker
Pavel Stehule  writes:

> út 30. 1. 2024 v 16:43 odesílatel Dagfinn Ilmari Mannsåker <
> ilm...@ilmari.org> napsal:
>
>> Pavel Stehule  writes:
>>
>> > I inserted perl reference support - hstore_plperl and json_plperl does
>> it.
>> >
>> > +<->/* Dereference references recursively. */
>> > +<->while (SvROK(in))
>> > +<-><-->in = SvRV(in);
>>
>> That code in hstore_plperl and json_plperl is only relevant because they
>> deal with non-scalar values (hashes for hstore, and also arrays for
>> json) which must be passed as references.  The recursive nature of the
>> dereferencing is questionable, and masked the bug fixed by commit
>> 1731e3741cbbf8e0b4481665d7d523bc55117f63.
>>
>> bytea_plperl only deals with scalars (specifically strings), so should
>> not concern itself with references.  In fact, this code breaks returning
>> objects with overloaded stringification, for example:
>>
>> CREATE FUNCTION plperlu_overload() RETURNS bytea LANGUAGE plperlu
>>   TRANSFORM FOR TYPE bytea
>>   AS $$
>> package StringOverload { use overload '""' => sub { "stuff" }; }
>> return bless {}, "StringOverload";
>>   $$;
>>
>> This makes the server crash with an assertion failure from Perl because
>> SvPVbyte() was passed a non-scalar value:
>>
>> postgres: ilmari regression_bytea_plperl [local] SELECT: sv.c:2865:
>> Perl_sv_2pv_flags:
>> Assertion `SvTYPE(sv) != SVt_PVAV && SvTYPE(sv) != SVt_PVHV && SvTYPE(sv)
>> != SVt_PVFM' failed.
>>
>> If I remove the dereferencing loop it succeeds:
>>
>> SELECT encode(plperlu_overload(), 'escape') AS string;
>>  string
>> 
>>  stuff
>> (1 row)
>>
>> Attached is a v2 patch which removes the dereferencing and includes the
>> above example as a test.
>>
>
> But without dereference it returns bad value.

Where exactly does it return a bad value?  The existing tests pass, and
the one I included shows that it does the right thing in that case too.
If you pass it an unblessed reference it returns the stringified version
of that, as expected.

CREATE FUNCTION plperl_reference() RETURNS bytea LANGUAGE plperl
 TRANSFORM FOR TYPE bytea
 AS $$ return []; $$;

SELECT encode(plperl_reference(), 'escape') string;
string
---
 ARRAY(0x559a3109f0a8)
(1 row)

This would also crash if the dereferencing loop was left in place.

> Maybe there should be a check so references cannot be returned? Probably is
> not safe pass pointers between Perl and Postgres.

There's no reason to ban references, that would break every Perl
programmer's expectations.  And there are no pointers being passed,
SvPVbyte() returns the stringified form of whatever's passed in, which
is well-behaved for both blessed and unblessed references.

If we really want to be strict, we should at least allow references to
objects that overload stringification, as they are explicitly designed
to be well-behaved as strings.  But that would be a lot of extra code
for very little benefit over just letting Perl stringify everything.

- ilmari




Re: Bytea PL/Perl transform

2024-01-30 Thread Dagfinn Ilmari Mannsåker
Pavel Stehule  writes:

> I inserted perl reference support - hstore_plperl and json_plperl does it.
>
> +<->/* Dereference references recursively. */
> +<->while (SvROK(in))
> +<-><-->in = SvRV(in);

That code in hstore_plperl and json_plperl is only relevant because they
deal with non-scalar values (hashes for hstore, and also arrays for
json) which must be passed as references.  The recursive nature of the
dereferencing is questionable, and masked the bug fixed by commit
1731e3741cbbf8e0b4481665d7d523bc55117f63.

bytea_plperl only deals with scalars (specifically strings), so should
not concern itself with references.  In fact, this code breaks returning
objects with overloaded stringification, for example:

CREATE FUNCTION plperlu_overload() RETURNS bytea LANGUAGE plperlu
  TRANSFORM FOR TYPE bytea
  AS $$
package StringOverload { use overload '""' => sub { "stuff" }; }
return bless {}, "StringOverload";
  $$;

This makes the server crash with an assertion failure from Perl because
SvPVbyte() was passed a non-scalar value:

postgres: ilmari regression_bytea_plperl [local] SELECT: sv.c:2865: 
Perl_sv_2pv_flags:
Assertion `SvTYPE(sv) != SVt_PVAV && SvTYPE(sv) != SVt_PVHV && SvTYPE(sv) != 
SVt_PVFM' failed.

If I remove the dereferencing loop it succeeds:

SELECT encode(plperlu_overload(), 'escape') AS string;
 string

 stuff
(1 row)

Attached is a v2 patch which removes the dereferencing and includes the
above example as a test.

- ilmari

>From aabaf4f5932f59de2fed48bbba7339807a1f04bd Mon Sep 17 00:00:00 2001
From: "ok...@github.com" 
Date: Tue, 30 Jan 2024 10:31:00 +0100
Subject: [PATCH v2] Add bytea transformation for plperl

---
 contrib/Makefile  |  4 +-
 contrib/bytea_plperl/.gitignore   |  4 ++
 contrib/bytea_plperl/Makefile | 39 ++
 contrib/bytea_plperl/bytea_plperl--1.0.sql| 19 +++
 contrib/bytea_plperl/bytea_plperl.c   | 44 
 contrib/bytea_plperl/bytea_plperl.control |  7 +++
 contrib/bytea_plperl/bytea_plperlu--1.0.sql   | 19 +++
 contrib/bytea_plperl/bytea_plperlu.control|  6 +++
 .../bytea_plperl/expected/bytea_plperl.out| 49 ++
 .../bytea_plperl/expected/bytea_plperlu.out   | 49 ++
 contrib/bytea_plperl/meson.build  | 51 +++
 contrib/bytea_plperl/sql/bytea_plperl.sql | 31 +++
 contrib/bytea_plperl/sql/bytea_plperlu.sql| 31 +++
 contrib/meson.build   |  1 +
 doc/src/sgml/plperl.sgml  | 16 ++
 15 files changed, 368 insertions(+), 2 deletions(-)
 create mode 100644 contrib/bytea_plperl/.gitignore
 create mode 100644 contrib/bytea_plperl/Makefile
 create mode 100644 contrib/bytea_plperl/bytea_plperl--1.0.sql
 create mode 100644 contrib/bytea_plperl/bytea_plperl.c
 create mode 100644 contrib/bytea_plperl/bytea_plperl.control
 create mode 100644 contrib/bytea_plperl/bytea_plperlu--1.0.sql
 create mode 100644 contrib/bytea_plperl/bytea_plperlu.control
 create mode 100644 contrib/bytea_plperl/expected/bytea_plperl.out
 create mode 100644 contrib/bytea_plperl/expected/bytea_plperlu.out
 create mode 100644 contrib/bytea_plperl/meson.build
 create mode 100644 contrib/bytea_plperl/sql/bytea_plperl.sql
 create mode 100644 contrib/bytea_plperl/sql/bytea_plperlu.sql

diff --git a/contrib/Makefile b/contrib/Makefile
index da4e2316a3..56c628df00 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -77,9 +77,9 @@ ALWAYS_SUBDIRS += sepgsql
 endif
 
 ifeq ($(with_perl),yes)
-SUBDIRS += bool_plperl hstore_plperl jsonb_plperl
+SUBDIRS += bool_plperl bytea_plperl hstore_plperl jsonb_plperl
 else
-ALWAYS_SUBDIRS += bool_plperl hstore_plperl jsonb_plperl
+ALWAYS_SUBDIRS += bool_plperl bytea_plperl hstore_plperl jsonb_plperl
 endif
 
 ifeq ($(with_python),yes)
diff --git a/contrib/bytea_plperl/.gitignore b/contrib/bytea_plperl/.gitignore
new file mode 100644
index 00..5dcb3ff972
--- /dev/null
+++ b/contrib/bytea_plperl/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/contrib/bytea_plperl/Makefile b/contrib/bytea_plperl/Makefile
new file mode 100644
index 00..1814d2f418
--- /dev/null
+++ b/contrib/bytea_plperl/Makefile
@@ -0,0 +1,39 @@
+# contrib/bytea_plperl/Makefile
+
+MODULE_big = bytea_plperl
+OBJS = \
+	$(WIN32RES) \
+	bytea_plperl.o
+PGFILEDESC = "bytea_plperl - bytea transform for plperl"
+
+PG_CPPFLAGS = -I$(top_srcdir)/src/pl/plperl
+
+EXTENSION = bytea_plperlu bytea_plperl
+DATA = bytea_plperlu--1.0.sql bytea_plperl--1.0.sql
+
+REGRESS = bytea_plperl bytea_plperlu
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/bytea_plperl
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
+
+# We must link libperl explicitly
+ifeq ($(PORTNAME), win32)

Re: System username in pg_stat_activity

2024-01-10 Thread Dagfinn Ilmari Mannsåker
Magnus Hagander  writes:

> On Wed, Jan 10, 2024 at 1:44 PM Aleksander Alekseev
>  wrote:
>>
>> It hurts my sense of beauty that usename and authname are of different
>> types. But if I'm the only one, maybe we can close our eyes on this.
>> Also I suspect that placing usename and authname in a close proximity
>> can be somewhat confusing. Perhaps adding authname as the last column
>> of the view will solve both nitpicks?
>
> But it should probably actually be name, given that's the underlying
> datatype. I kept changing it around and ended up half way in
> between...

https://www.postgresql.org/docs/current/functions-info.html#FUNCTIONS-INFO-SESSION-TABLE
(and pg_typeof(system_user)) says it's text.  Which makes sense, since
it can easily be longer than 63 bytes, e.g. in the case of a TLS client
certificate DN.
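
A quick way to confirm that from psql (a sketch; output as of current
releases):

```sql
SELECT pg_typeof(system_user);
--  pg_typeof
-- -----------
--  text
```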

- ilmari




Re: Adding a pg_get_owned_sequence function?

2024-01-09 Thread Dagfinn Ilmari Mannsåker
Tom Lane  writes:

> Peter Eisentraut  writes:
>> Would it work to change the signature of pg_get_serial_sequence to
>>  pg_get_serial_sequence(table anyelement, column text) -> anyelement
>> and then check inside the function code whether text or regclass was passed?
>
> Probably not very well, because then we'd get no automatic coercion of
> inputs that were not either type.
>
> Maybe it would work to have both
>
> pg_get_serial_sequence(table text, column text) -> text
> pg_get_serial_sequence(table regclass, column text) -> regclass

I think it would be more correct to use name for the column argument
type, rather than text.

> but I wonder if that would create any situations where the parser
> couldn't choose between these candidates.

According to my earlier testing¹, the parser prefers the text version
for unknown-type arguments, and further testing shows that that's also
the case for other types with implicit casts to text such as varchar and
name.  The regclass version gets chosen for oid and (big)int, because of
the implicit cast from (big)int to oid and oid to regclass.

The only case I could foresee failing would be types that have implicit
casts to both text and regtype.  It turns out that varchar does have
both, but the parser still picks the text version without complaint.
Does it prefer the binary-coercible cast over the one that requires
calling a conversion function?

>   regards, tom lane

- ilmari

[1]: https://postgr.es/m/87v8cfo32v@wibble.ilmari.org




Re: Adding a pg_get_owned_sequence function?

2024-01-08 Thread Dagfinn Ilmari Mannsåker
vignesh C  writes:

> On Tue, 24 Oct 2023 at 22:00, Nathan Bossart  wrote:
>>
>> On Tue, Sep 12, 2023 at 03:53:28PM +0100, Dagfinn Ilmari Mannsåker wrote:
>> > Tom Lane  writes:
>> >> It's possible that we could get away with just summarily changing
>> >> the argument type from text to regclass.  ISTR that we did exactly
>> >> that with nextval() years ago, and didn't get too much push-back.
>> >> But we couldn't do the same for the return type.  Also, this
>> >> approach does nothing for the concern about the name being
>> >> misleading.
>> >
>> > Maybe we should go all the way the other way, and call it
>> > pg_get_identity_sequence() and claim that "serial" is a legacy form of
>> > identity columns?
>>
>> Hm.  Could we split it into two functions, pg_get_owned_sequence() and
>> pg_get_identity_sequence()?  I see that commit 3012061 [0] added support
>> for identity columns to pg_get_serial_sequence() because folks expected
>> that to work, so maybe that's a good reason to keep them together.  If we
>> do elect to keep them combined, I'd be okay with renaming it to
>> pg_get_identity_sequence() along with your other proposed changes.

We can't make pg_get_serial_sequence(text, text) not work on identity
columns any more, that would break existing users, and making the new
function not work on serial columns would make it harder for people to
migrate to it.  If you're suggesting adding two new functions,
pg_get_identity_sequence(regclass, name) and
pg_get_serial_sequence(regclass, name)¹, which only work on the type of
column corresponding to the name, I don't think that's great for
usability or migratability either.

> I have changed the status of the patch to "Waiting on Author" as
> Nathan's comments have not yet been followed up.

Thanks for the nudge, here's an updated patch that together with the
above addresses them.  I've changed the commitfest entry back to "Needs
review".

> Regards,
> Vignesh

- ilmari

>From 75ed637ec4ed84ac92eb7385fd295b7d5450a450 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Fri, 9 Jun 2023 19:55:58 +0100
Subject: [PATCH v2] Add pg_get_identity_sequence function
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The docs say `pg_get_serial_sequence` should have been called
`pg_get_owned_sequence`, but identity columns' sequences are not owned
in the sense of `ALTER SEQUENCE … SET OWNED BY`, so instead call it
`pg_get_identity_sequence`.  This gives us the opportunity to change
the return and table argument types to `regclass` and the column
argument to `name`, instead of using `text` everywhere.  This matches
what's in catalogs, and requires less explaining than the rules for
`pg_get_serial_sequence`.
---
 doc/src/sgml/func.sgml | 37 +++---
 src/backend/utils/adt/ruleutils.c  | 69 +++---
 src/include/catalog/pg_proc.dat|  3 ++
 src/test/regress/expected/identity.out |  6 +++
 src/test/regress/expected/sequence.out |  6 +++
 src/test/regress/sql/identity.sql  |  1 +
 src/test/regress/sql/sequence.sql  |  1 +
 7 files changed, 98 insertions(+), 25 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index cec21e42c0..ff4fa5a0c4 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -24589,13 +24589,13 @@
   

 
- pg_get_serial_sequence
+ pg_get_identity_sequence
 
-pg_get_serial_sequence ( table text, column text )
-text
+pg_get_identity_sequence ( table regclass, column name )
+regclass


-Returns the name of the sequence associated with a column,
+Returns the sequence associated with identity or serial column,
 or NULL if no sequence is associated with the column.
 If the column is an identity column, the associated sequence is the
 sequence internally created for that column.
@@ -24604,10 +24604,31 @@
 it is the sequence created for that serial column definition.
 In the latter case, the association can be modified or removed
 with ALTER SEQUENCE OWNED BY.
-(This function probably should have been
-called pg_get_owned_sequence; its current name
-reflects the fact that it has historically been used with serial-type
-columns.)  The first parameter is a table name with optional
+The result is suitable for passing to the sequence functions (see
+).
+   
+   
+A typical use is in reading the current value of the sequence for an
+identity or serial column, for example:
+
+SELECT currval(pg_get_identity_sequence('sometable', 'id'));
+
+   

Re: Assorted typo fixes

2024-01-03 Thread Dagfinn Ilmari Mannsåker
Robert Haas  writes:

> On Mon, Jan 1, 2024 at 6:05 PM Dagfinn Ilmari Mannsåker
>  wrote:
>> Thanks. I've fixed the commit message (and elaborated it a bit more why
>> I think it's a valid and safe fix).
>
> Regarding 0001:
>
> - AIUI, check_decls.m4 is copied from an upstream project, so I don't
> think we should tinker with it.

It contains modified versions of a few macros from Autoconf's
general.m4¹, specifically _AC_UNDECLARED_WARNING (since renamed to
_AC_UNDECLARED_BUILTIN upstream) and _AC_CHECK_DECL_BODY.  That has
since been updated² to spell François' name correctly, so I think we
should follow suit (and maybe also check if our override is even still
necessary).

[1]: 
http://git.savannah.gnu.org/gitweb/?p=autoconf.git;a=history;f=lib/autoconf/general.m4;hb=HEAD
[2]: 
http://git.savannah.gnu.org/gitweb/?p=autoconf.git;a=commitdiff;h=8a228e9d58363ad3ebdb89a05bd77568d1d863b7

> - I'm not convinced by encrypter->encryptor
> - I'm not convinced by multidimension-aware->multidimensional-aware

I don't feel particularly strongy about these.

> - I'm not convinced by cachable->cacheable

If nothing else, consistency.  There are 13 occurrences of "cacheable"
and only three of "cachable" in the tree.

> - You corrected restorting to restarting, but I'm wondering if Andres
> intended restoring?

Yeah, re-reading the sentence that's clearly meant to be "restoring".

> Committed the rest of 0001.
>
> 0002-0005 look OK to me, so I committed those as well.

Thanks!

> With regard to 0006, we typically use indexes rather than indices as
> the plural of "index", although exceptions exist.

We (mostly) use indexes when referring to database indexes (as in btree,
gist, etc.), but indices when referring to offsets in arrays, which is
what this variable is about.

- ilmari




Re: Assorted typo fixes

2024-01-01 Thread Dagfinn Ilmari Mannsåker
Shubham Khanna  writes:

> I was reviewing the Patch and came across a minor issue that the Patch
> does not apply on the current Head. Please provide the updated version
> of the patch.

Thanks for the heads-up. Commit 5ccb3bb13dcbedc30d015fc06d306d5106701e16
removed one of the instances of "data struture" fixed by the patch.

Rebased patch set attached.  I also squashed the check_decls.m4 change
into the main comment typos commit.

> Also, I found one typo:
> 0008-ecpg-fix-typo-in-get_dtype-return-value-for-ECPGd_co.patch All
> the other enum values return a string mathing the enum label, but this
> has had a trailing r since the function was added in commit
> 339a5bbfb17ecd171ebe076c5bf016c4e66e2c0a
>
>  Here 'mathing' should be 'matching'.

Thanks. I've fixed the commit message (and elaborated it a bit more why
I think it's a valid and safe fix).

> Thanks and Regards,
> Shubham Khanna.

- ilmari

>From 5ccb3bb13dcbedc30d015fc06d306d5106701e16 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Fri, 22 Dec 2023 01:46:27 +
Subject: [PATCH v2 01/12] Fix typos in comments

---
 config/check_decls.m4  |  2 +-
 contrib/bloom/bloom.h  |  2 +-
 contrib/pgcrypto/expected/pgp-compression.out  |  2 +-
 contrib/pgcrypto/openssl.c |  2 +-
 contrib/pgcrypto/pgp-encrypt.c |  2 +-
 contrib/pgcrypto/sql/pgp-compression.sql   |  2 +-
 contrib/postgres_fdw/expected/postgres_fdw.out |  2 +-
 contrib/postgres_fdw/sql/postgres_fdw.sql  |  2 +-
 src/backend/access/brin/brin.c |  6 +++---
 src/backend/access/common/heaptuple.c  |  2 +-
 src/backend/access/gist/gistbuild.c|  2 +-
 src/backend/access/heap/heapam.c   |  4 ++--
 src/backend/access/nbtree/nbtree.c |  2 +-
 src/backend/catalog/namespace.c|  2 +-
 src/backend/catalog/pg_constraint.c|  2 +-
 src/backend/commands/event_trigger.c   |  2 +-
 src/backend/executor/execMain.c|  2 +-
 src/backend/optimizer/plan/initsplan.c |  4 ++--
 src/backend/utils/adt/rangetypes.c |  2 +-
 src/backend/utils/cache/catcache.c |  2 +-
 src/backend/utils/sort/tuplesortvariants.c | 10 +-
 src/backend/utils/time/combocid.c  |  2 +-
 src/bin/pg_rewind/t/001_basic.pl   |  2 +-
 src/bin/pg_rewind/t/004_pg_xlog_symlink.pl |  2 +-
 src/bin/pg_rewind/t/007_standby_source.pl  |  2 +-
 src/bin/pg_rewind/t/009_growing_files.pl   |  2 +-
 src/include/pg_config_manual.h |  2 +-
 src/test/isolation/specs/stats.spec| 16 
 src/test/recovery/t/029_stats_restart.pl   |  2 +-
 src/test/regress/expected/boolean.out  |  2 +-
 src/test/regress/expected/brin_multi.out   |  4 ++--
 src/test/regress/expected/join.out |  2 +-
 src/test/regress/sql/boolean.sql   |  2 +-
 src/test/regress/sql/brin_multi.sql|  4 ++--
 src/test/regress/sql/join.sql  |  2 +-
 35 files changed, 52 insertions(+), 52 deletions(-)

diff --git a/config/check_decls.m4 b/config/check_decls.m4
index f1b90c5430..2dfcfe13fb 100644
--- a/config/check_decls.m4
+++ b/config/check_decls.m4
@@ -31,7 +31,7 @@
 # respectively.  If not, see .
 
 # Written by David MacKenzie, with help from
-# Franc,ois Pinard, Karl Berry, Richard Pixley, Ian Lance Taylor,
+# François Pinard, Karl Berry, Richard Pixley, Ian Lance Taylor,
 # Roland McGrath, Noah Friedman, david d zuhn, and many others.
 
 
diff --git a/contrib/bloom/bloom.h b/contrib/bloom/bloom.h
index 330811ec60..7c4407b9ec 100644
--- a/contrib/bloom/bloom.h
+++ b/contrib/bloom/bloom.h
@@ -127,7 +127,7 @@ typedef struct BloomMetaPageData
 	FreeBlockNumberArray notFullPage;
 } BloomMetaPageData;
 
-/* Magic number to distinguish bloom pages among anothers */
+/* Magic number to distinguish bloom pages from others */
 #define BLOOM_MAGICK_NUMBER (0xDBAC0DED)
 
 /* Number of blocks numbers fit in BloomMetaPageData */
diff --git a/contrib/pgcrypto/expected/pgp-compression.out b/contrib/pgcrypto/expected/pgp-compression.out
index d4c57feba3..67e2dce897 100644
--- a/contrib/pgcrypto/expected/pgp-compression.out
+++ b/contrib/pgcrypto/expected/pgp-compression.out
@@ -60,7 +60,7 @@ WITH random_string AS
   -- This generates a random string of 16366 bytes.  This is chosen
   -- as random so that it does not get compressed, and the decompression
   -- would work on a string with the same length as the origin, making the
-  -- test behavior more predictible.  lpad() ensures that the generated
+  -- test behavior more predictable.  lpad() ensures that the generated
   -- hexadecimal value is completed by extra zero characters if random()
   -- has generated a value strictly lower than 16.
   SELECT string_agg(decode(lpad(to_hex((random()*256)::int), 2, '0'), 

Assorted typo fixes

2023-12-27 Thread Dagfinn Ilmari Mannsåker
Hi folks,

I was playing around with the `typos` tool
(https://github.com/crate-ci/typos), and thought I'd run it on the
postgres repo for fun.  After a bit of tweaking to get rid of most false
positives (see separately attached .typos.toml file), it came up with a
useful set of suggestions, some of which I applied verbatim, others
which needed a bit more rewording.

Attached is a series of patches.  The first one are what I consider
obvious, unambiguous fixes to code comments.  The subsequent ones are
fixes for actual code (variable, function, type names) and docs, one
patch per class of typo.  As far as I can tell, none of the code changes
(except the ECPG one, see below) affect anything exported, so this
should not cause any compatibility issues for extensions.

The ECPG change affects the generated C code, but from my reading of the
callers in descriptor.c and ecpg.trailer, any code that would have
caused it to encounter the affected enum value would fail to compile, so
either the case is not possible, or nobody actually uses whatever syntax
is affected (I don't know enough about ECPG to tell without spending far
too much time digging in the code).

- ilmari

>From d5abde91b33bbdaa99c6c1e7a6334affa370e9c6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Fri, 22 Dec 2023 01:46:27 +
Subject: [PATCH 01/13] Fix typos in comments

---
 contrib/bloom/bloom.h  |  2 +-
 contrib/pgcrypto/expected/pgp-compression.out  |  2 +-
 contrib/pgcrypto/openssl.c |  2 +-
 contrib/pgcrypto/pgp-encrypt.c |  2 +-
 contrib/pgcrypto/sql/pgp-compression.sql   |  2 +-
 contrib/postgres_fdw/expected/postgres_fdw.out |  2 +-
 contrib/postgres_fdw/sql/postgres_fdw.sql  |  2 +-
 src/backend/access/brin/brin.c |  6 +++---
 src/backend/access/common/heaptuple.c  |  2 +-
 src/backend/access/gist/gistbuild.c|  2 +-
 src/backend/access/heap/heapam.c   |  4 ++--
 src/backend/access/nbtree/nbtree.c |  2 +-
 src/backend/catalog/namespace.c|  2 +-
 src/backend/catalog/pg_constraint.c|  2 +-
 src/backend/commands/event_trigger.c   |  2 +-
 src/backend/executor/execMain.c|  2 +-
 src/backend/optimizer/plan/initsplan.c |  4 ++--
 src/backend/utils/adt/rangetypes.c |  2 +-
 src/backend/utils/cache/catcache.c |  2 +-
 src/backend/utils/sort/tuplesortvariants.c | 12 ++--
 src/backend/utils/time/combocid.c  |  2 +-
 src/bin/pg_rewind/t/001_basic.pl   |  2 +-
 src/bin/pg_rewind/t/004_pg_xlog_symlink.pl |  2 +-
 src/bin/pg_rewind/t/007_standby_source.pl  |  2 +-
 src/bin/pg_rewind/t/009_growing_files.pl   |  2 +-
 src/include/pg_config_manual.h |  2 +-
 src/test/isolation/specs/stats.spec| 16 
 src/test/recovery/t/029_stats_restart.pl   |  2 +-
 src/test/regress/expected/boolean.out  |  2 +-
 src/test/regress/expected/brin_multi.out   |  4 ++--
 src/test/regress/expected/join.out |  2 +-
 src/test/regress/sql/boolean.sql   |  2 +-
 src/test/regress/sql/brin_multi.sql|  4 ++--
 src/test/regress/sql/join.sql  |  2 +-
 34 files changed, 52 insertions(+), 52 deletions(-)

diff --git a/contrib/bloom/bloom.h b/contrib/bloom/bloom.h
index 330811ec60..7c4407b9ec 100644
--- a/contrib/bloom/bloom.h
+++ b/contrib/bloom/bloom.h
@@ -127,7 +127,7 @@ typedef struct BloomMetaPageData
 	FreeBlockNumberArray notFullPage;
 } BloomMetaPageData;
 
-/* Magic number to distinguish bloom pages among anothers */
+/* Magic number to distinguish bloom pages from others */
 #define BLOOM_MAGICK_NUMBER (0xDBAC0DED)
 
 /* Number of blocks numbers fit in BloomMetaPageData */
diff --git a/contrib/pgcrypto/expected/pgp-compression.out b/contrib/pgcrypto/expected/pgp-compression.out
index d4c57feba3..67e2dce897 100644
--- a/contrib/pgcrypto/expected/pgp-compression.out
+++ b/contrib/pgcrypto/expected/pgp-compression.out
@@ -60,7 +60,7 @@ WITH random_string AS
   -- This generates a random string of 16366 bytes.  This is chosen
   -- as random so that it does not get compressed, and the decompression
   -- would work on a string with the same length as the origin, making the
-  -- test behavior more predictible.  lpad() ensures that the generated
+  -- test behavior more predictable.  lpad() ensures that the generated
   -- hexadecimal value is completed by extra zero characters if random()
   -- has generated a value strictly lower than 16.
   SELECT string_agg(decode(lpad(to_hex((random()*256)::int), 2, '0'), 'hex'), '') as bytes
diff --git a/contrib/pgcrypto/openssl.c b/contrib/pgcrypto/openssl.c
index 4a913bd04f..8259de5e39 100644
--- a/contrib/pgcrypto/openssl.c
+++ b/contrib/pgcrypto/openssl.c
@@ -460,7 +460,7 @@ bf_init(PX_Cipher *c, const uint8 *key, unsigned klen, const 

Re: On non-Windows, hard depend on uselocale(3)

2023-11-15 Thread Dagfinn Ilmari Mannsåker
Tom Lane  writes:

> "Tristan Partin"  writes:
>> I would like to propose removing HAVE_USELOCALE, and just have WIN32, 
>> which means that Postgres would require uselocale(3) on anything that 
>> isn't WIN32.
>
> You would need to do some research and try to prove that that won't
> be a problem on any modern platform.  Presumably it once was a problem,
> or we'd not have bothered with a configure check.
>
> (Some git archaeology might yield useful info about when and why
> we added the check.)

For reference, the Perl effort to use the POSIX.1-2008 thread-safe
locale APIs has revealed several platform-specific bugs that caused it
to disable them on FreeBSD and macOS:

https://github.com/perl/perl5/commit/9cbc12c368981c56d4d8e40cc9417ac26bec2c35
https://github.com/perl/perl5/commit/dd4eb78c55aab441aec1639b1dd49f88bd960831

and work around bugs on others (e.g. OpenBSD):

https://github.com/perl/perl5/commit/0f3830f3997cf7ef1531bad26d2e0f13220dd862

But Perl actually makes use of per-thread locales, because it has a
separate interpreter per thread, each of which can have a different
locale active.  Since Postgres isn't actually multi-threaded (yet),
these concerns might not apply to the same degree.

>   regards, tom lane

- ilmari




Re: retire MemoryContextResetAndDeleteChildren backwards compatibility macro

2023-11-14 Thread Dagfinn Ilmari Mannsåker
Alvaro Herrera  writes:

> On 2023-Nov-13, Nathan Bossart wrote:
>
>> Shall we retire this backwards compatibility macro at this point?  A search
>> of https://codesearch.debian.net/ does reveal a few external uses, so we
>> could alternatively leave it around and just update Postgres to stop using
>> it, but I don't think it would be too burdensome for extension authors to
>> fix if we removed it completely.
>
> Let's leave the macro around and just remove its uses in PGDG-owned
> code.  Having the macro around hurts nothing, and we can remove it in 15
> years or so.

Is there a preprocessor symbol that is defined when building Postgres
itself (and extensions in /contrib/), but not third-party extensions (or
vice versa)?  If so, the macro could be guarded by that, so that uses
don't accidentally sneak back in.

There's also __attribute__((deprecated)) (and __declspec(deprecated)
for MSVC), but that can AFAIK only be attached to functions and
variables, not macros, so it would have to be changed to a static inline
function.

- ilmari




Re: Query execution in Perl TAP tests needs work

2023-10-18 Thread Dagfinn Ilmari Mannsåker
Tom Lane  writes:

> Robert Haas  writes:
>> On Sat, Sep 2, 2023 at 2:42 PM Andrew Dunstan  wrote:
>>> How much burden is it? Would anyone actually mind?
>
>> ... At the same time, fallbacks can be a problem too,
>> because then you can end up with things that work one way on one
>> developer's machine (or BF machine) and another way on another
>> developer's machine (or BF machine) and it's not obvious that the
>> reason for the difference is that one machine is using a fallback and
>> the other is not.
>
> I agree with this worry.
>
>> In terms of what that faster and better thing might be, AFAICS, there
>> are two main options. First, we could proceed with the approach you've
>> tried here, namely requiring everybody to get Platypus::FFI. I find
>> that it's included in MacPorts on my machine, which is at least
>> somewhat of a good sign that perhaps this is fairly widely available.
>
> I did a bit of research on this on my favorite platforms, and did
> not like the results:
>
> RHEL8: does not seem to be packaged at all.
>
> Fedora 37: available, but the dependencies are a bit much:
>
> $ sudo yum install perl-FFI-Platypus
> Last metadata expiration check: 2:07:42 ago on Wed Oct 18 08:05:40 2023.
> Dependencies resolved.
> 
>  Package Architecture Version   Repository 
> Size
> 
> Installing:
>  perl-FFI-Platypus   x86_64   2.08-1.fc37   updates   417 
> k
> Installing dependencies:
>  libgfortran x86_64   12.3.1-1.fc37 updates   904 
> k
>  libquadmath x86_64   12.3.1-1.fc37 updates   206 
> k
>  libquadmath-devel   x86_64   12.3.1-1.fc37 updates48 
> k
>  perl-FFI-CheckLib   noarch   0.29-2.fc37   updates29 
> k
> Installing weak dependencies:
>  gcc-gfortranx86_64   12.3.1-1.fc37 updates12 
> M
>
> Transaction Summary
> 
> Install  6 Packages
>
> Total download size: 14 M
> Installed size: 37 M
> Is this ok [y/N]: 
>
> gfortran?   Really??  I mean, I don't care about the disk space,
> but this is not promising for anyone who has to build it themselves.

The Fortran support for FFI::Platypus is in a separate CPAN distribution
(FFI-Platypus-Lang-Fortran), so that must be some quirk of the Fedora
packaging and not a problem for people building it themselves.  They
just need libffi and a working Perl/CPAN setup.

On Debian the only things besides Perl and core perl modules it
(build-)depends on are libffi, Capture::Tiny, FFI::CheckLib (which
depends on File::Which), Test2::Suite and pkg-config.

- ilmari




Re: Tab completion for AT TIME ZONE

2023-10-12 Thread Dagfinn Ilmari Mannsåker
Michael Paquier  writes:

> On Fri, Apr 14, 2023 at 12:05:25PM +0200, Jim Jones wrote:
>> The patch applies cleanly and it does what it is proposing. - and it's IMHO
>> a very nice addition.
>> 
>> I've marked the CF entry as "Ready for Committer".
>
> +/* ... AT TIME ZONE ... */
> + else if (TailMatches("AT"))
> + COMPLETE_WITH("TIME ZONE");
> + else if (TailMatches("AT", "TIME"))
> + COMPLETE_WITH("ZONE");
> + else if (TailMatches("AT", "TIME", "ZONE"))
> + COMPLETE_WITH_TIMEZONE_NAME();
>
> This style will for the completion of timezone values even if "AT" is
> the first word of a query.  Shouldn't this be more selective by making
> sure that we are at least in the context of a SELECT query?

It's valid anywhere an expression is, which is a lot more places than
just SELECT queries.  Off the top of my head I can think of WITH,
INSERT, UPDATE, VALUES, CALL, CREATE TABLE, CREATE INDEX.

As I mentioned upthread, the only place in the grammar where the word AT
occurs is in AT TIME ZONE, so there's no ambiguity.  Also, it doesn't
complete time zone names after AT, it completes the literal words TIME
ZONE, and you have to then hit tab again to get a list of time zones.
If we (or the SQL committee) were to invent more operators that start
with the word AT, we can add those to the first if clause above and
complete with the appropriate values after each one separately.

- ilmari




Re: typo in couple of places

2023-10-06 Thread Dagfinn Ilmari Mannsåker
vignesh C  writes:

> Hi,
>
> I noticed a couple of typos in code. "the the" should have been "the",
> attached patch has the changes for the same.

This made me curious about other duplicate word occurrences, and after a
few minutes of increasingly elaborate regexing to exclude false
positives, I found a few more (plus, as a bonus, a _missing_ "the").  See the
attached patch (which includes your original one, for completeness).

> Regards,
> Vignesh

- ilmari

>From 758e890b3c259f242f3ff6fe4b0bfc48d8d48dcd Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Fri, 6 Oct 2023 12:55:52 +0100
Subject: [PATCH] Fix duplicate words in docs and code comments

And add a missing "the" in a couple of places.
---
 .cirrus.star  | 2 +-
 contrib/citext/expected/citext_utf8.out   | 2 +-
 contrib/citext/expected/citext_utf8_1.out | 2 +-
 contrib/citext/sql/citext_utf8.sql| 2 +-
 doc/src/sgml/logical-replication.sgml | 2 +-
 doc/src/sgml/ref/create_subscription.sgml | 2 +-
 src/backend/access/nbtree/nbtsearch.c | 8 
 src/backend/access/transam/xlogreader.c   | 2 +-
 src/backend/executor/nodeHashjoin.c   | 2 +-
 src/test/regress/expected/tuplesort.out   | 4 ++--
 src/test/regress/sql/tuplesort.sql| 4 ++--
 11 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/.cirrus.star b/.cirrus.star
index d2d6ceca20..36233872d1 100644
--- a/.cirrus.star
+++ b/.cirrus.star
@@ -46,7 +46,7 @@ def main():
 
 def config_from(config_src):
 """return contents of config file `config_src`, surrounded by markers
-indicating start / end of the the included file
+indicating start / end of the included file
 """
 
 config_contents = fs.read(config_src)
diff --git a/contrib/citext/expected/citext_utf8.out b/contrib/citext/expected/citext_utf8.out
index 6630e09a4d..5d988dcd48 100644
--- a/contrib/citext/expected/citext_utf8.out
+++ b/contrib/citext/expected/citext_utf8.out
@@ -2,7 +2,7 @@
  * This test must be run in a database with UTF-8 encoding
  * and a Unicode-aware locale.
  *
- * Also disable this file for ICU, because the test for the the
+ * Also disable this file for ICU, because the test for the
  * Turkish dotted I is not correct for many ICU locales. citext always
  * uses the default collation, so it's not easy to restrict the test
  * to the "tr-TR-x-icu" collation where it will succeed.
diff --git a/contrib/citext/expected/citext_utf8_1.out b/contrib/citext/expected/citext_utf8_1.out
index 3caa7a00d4..7065a5da19 100644
--- a/contrib/citext/expected/citext_utf8_1.out
+++ b/contrib/citext/expected/citext_utf8_1.out
@@ -2,7 +2,7 @@
  * This test must be run in a database with UTF-8 encoding
  * and a Unicode-aware locale.
  *
- * Also disable this file for ICU, because the test for the the
+ * Also disable this file for ICU, because the test for the
  * Turkish dotted I is not correct for many ICU locales. citext always
  * uses the default collation, so it's not easy to restrict the test
  * to the "tr-TR-x-icu" collation where it will succeed.
diff --git a/contrib/citext/sql/citext_utf8.sql b/contrib/citext/sql/citext_utf8.sql
index 1f51df134b..34b232d64e 100644
--- a/contrib/citext/sql/citext_utf8.sql
+++ b/contrib/citext/sql/citext_utf8.sql
@@ -2,7 +2,7 @@
  * This test must be run in a database with UTF-8 encoding
  * and a Unicode-aware locale.
  *
- * Also disable this file for ICU, because the test for the the
+ * Also disable this file for ICU, because the test for the
  * Turkish dotted I is not correct for many ICU locales. citext always
  * uses the default collation, so it's not easy to restrict the test
  * to the "tr-TR-x-icu" collation where it will succeed.
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index fbf8ad669e..3b2fa1129e 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -1788,7 +1788,7 @@
   
 
   
-   To create a subscription, the user must have the privileges of the
+   To create a subscription, the user must have the privileges of
the pg_create_subscription role, as well as
CREATE privileges on the database.
   
diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index 71652fd918..c1bafbfa06 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -51,7 +51,7 @@
   
 
   
-   To be able to create a subscription, you must have the privileges of the
+   To be able to create a subscription, you must have the privileges of
the pg_create_subscription role, as well as
CREATE privileges on the current database.
   
diff --git a/src/backend/access/nbtree/nbtsearch.c b/src/backend/access/nbtree/nbtsearch.c
index c47eaed0e9..efc5284e5b 100644
--- a/src/backend/access/nbtree/nbtsearch.c
+++ b/src/backend/access/nbtree/nbtsearch.c
@@ -1664,8 +1664,8 @@ _bt_readpage(IndexScanDesc scan, ScanDirection 

Re: Allow deleting enumerated values from an existing enumerated data type

2023-10-02 Thread Dagfinn Ilmari Mannsåker
Vik Fearing  writes:

> On 9/29/23 03:17, Tom Lane wrote:
>> Vik Fearing  writes:
>>> On 9/28/23 20:46, Tom Lane wrote:
 We went through all these points years ago when the enum feature
 was first developed, as I recall.  Nobody thought that the ability
 to remove an enum value was worth the amount of complexity it'd
 entail.
>> 
>>> This issue comes up regularly (although far from often).  Do we want to
>>> put some comments right where would-be implementors would be sure to see it?
>> Perhaps.  I'd be kind of inclined to leave the "yet" out of "not yet
>> implemented" in the error message, as that wording sounds like we just
>> haven't got round to it.
>
> I see your point, but should we be dissuading people who might want to
> work on solving those problems?  I intentionally did not document that 
> this syntax exists so the only people seeing the message are those who
> just try it, and those wanting to write a patch like Danil did.
>
> No one except you has said anything about this patch.  I think it would
> be good to commit it, wordsmithing aside.

FWIW I'm +1 on this patch, and with Tom on dropping the "yet".  To me it
makes it sound like we intend to implement it soon (fsvo).

- ilmari




Re: Replace (stat())[7] in TAP tests with -s

2023-10-02 Thread Dagfinn Ilmari Mannsåker
"Drouvot, Bertrand"  writes:

> Hi hackers,
>
> Please find attached a tiny patch to $SUBJECT.
>
> It:
>
>  - provides more consistency to the way we get files size in TAP tests
>  - seems more elegant that relying on a hardcoded result position

I approve of removing use of the list form of stat, it's a horrible API.

If we weren't already using -s everywhere else, I would prefer
File::stat, which makes stat (in scalar context) return an object with
methods for the fields, so you'd do stat($file)->size.  It's included in
Perl core since 5.4, and we're already using it in several places for
other fields (mode and ino at least).

I see another use of stat array positions (for mtime) in
src/tools/msvc/Solution.pm, but that's on the chopping block, so not
much point in fixing.

- ilmari




Re: Adding a pg_get_owned_sequence function?

2023-09-12 Thread Dagfinn Ilmari Mannsåker
Tom Lane  writes:

> Peter Eisentraut  writes:
>> Would it work to just overload pg_get_serial_sequence with regclass 
>> argument types?
>
> Probably not; the parser would have no principled way to resolve
> pg_get_serial_sequence('foo', 'bar') as one or the other.  I'm
> not sure offhand if it would throw error or just choose one, but
> if it just chooses one it'd likely be the text variant.

That works fine, and it prefers the text version:

~=# create function pg_get_serial_sequence(tbl regclass, col name)
returns regclass strict stable parallel safe
return pg_get_serial_sequence(tbl::text, col::text)::regclass;
CREATE FUNCTION

~=# select pg_typeof(pg_get_serial_sequence('identest', 'id'));
┌───┐
│ pg_typeof │
├───┤
│ text  │
└───┘
(1 row)

~=# select pg_typeof(pg_get_serial_sequence('identest', 'id'::name));
┌───┐
│ pg_typeof │
├───┤
│ regclass  │
└───┘
(1 row)

> It's possible that we could get away with just summarily changing
> the argument type from text to regclass.  ISTR that we did exactly
> that with nextval() years ago, and didn't get too much push-back.
> But we couldn't do the same for the return type.  Also, this
> approach does nothing for the concern about the name being
> misleading.

Maybe we should go all the way the other way, and call it
pg_get_identity_sequence() and claim that "serial" is a legacy form of
identity columns?

- ilmari




Re: Replace some cstring_to_text to cstring_to_text_with_len

2023-08-31 Thread Dagfinn Ilmari Mannsåker
Peter Eisentraut  writes:

> On 31.08.23 16:10, Ranier Vilela wrote:
>> Em qui., 31 de ago. de 2023 às 09:51, Andrew Dunstan
>> mailto:and...@dunslane.net>> escreveu:
>> 
>> On 2023-08-31 Th 07:41, John Naylor wrote:
>>>
>>> On Thu, Aug 31, 2023 at 6:07 PM Ranier Vilela >> > wrote:
>>> >
>>> > Em qui., 31 de ago. de 2023 às 00:22, Michael Paquier
>>> mailto:mich...@paquier.xyz>> escreveu:
>>> >>
>>> >> On Wed, Aug 30, 2023 at 03:00:13PM -0300, Ranier Vilela wrote:
>>> >> > cstring_to_text has a small overhead, because call strlen for
>>> >> > pointer to char parameter.
>>> >> >
>>> >> > Is it worth the effort to avoid this, where do we know the
>>> size of the
>>> >> > parameter?
>>> >>
>>> >> Are there workloads where this matters?
>>> >
>>> > None, but note this change has the same spirit of 8b26769bc.
>>>
>>> - return cstring_to_text("");
>>> + return cstring_to_text_with_len("", 0);
>>>
>>> This looks worse, so we'd better be getting something in return.
>> 
>> I agree this is a bit ugly. I wonder if we'd be better off creating
>> a function that returned an empty text value, so we'd just avoid
>> converting the empty cstring altogether and say:
>>    return empty_text();
>> Hi,
>> Thanks for the suggestion,  I agreed.
>> New patch is attached.
>
> I think these patches make the code uniformly uglier and harder to
> understand.
>
> If a performance benefit could be demonstrated, then making
> cstring_to_text() an inline function could be sensible.  But I wouldn't 
> go beyond that.

I haven't benchmarked it yet, but here's a patch that inlines them and
changes callers of cstring_to_text_with_len() with a literal string and
constant length to cstring_to_text().

On an x86-64 Linux build (meson with -Dbuildtype=debugoptimized
-Dcassert=true), the inlining increases the size of the text section of
the postgres binary from 9719722 bytes to 9750557, i.e. an increase of
30KiB or 0.3%, while the change to cstring_to_text() makes zero
difference (as expected from my investigation).

- ilmari

>From e332e5229dafd94e44ad8608c5fa2d3c58d80e0e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Thu, 31 Aug 2023 19:11:55 +0100
Subject: [PATCH] Inline cstring_to_text(_with_len) functions

This lets the compiler optimise out the strlen() and memcpy() calls
when they are called with a literal string or constant length.

Thus, convert cstring_to_text_with_length("literal", 7) calls to plain
cstring_to_text("literal") to avoid having to count string lengths
manually.
---
 contrib/btree_gin/btree_gin.c |  2 +-
 contrib/hstore/hstore_io.c|  4 ++--
 src/backend/utils/adt/json.c  |  4 ++--
 src/backend/utils/adt/jsonfuncs.c |  4 ++--
 src/backend/utils/adt/varlena.c   | 32 +
 src/include/utils/builtins.h  | 34 +--
 6 files changed, 40 insertions(+), 40 deletions(-)

diff --git a/contrib/btree_gin/btree_gin.c b/contrib/btree_gin/btree_gin.c
index c50d68ce18..59db475791 100644
--- a/contrib/btree_gin/btree_gin.c
+++ b/contrib/btree_gin/btree_gin.c
@@ -347,7 +347,7 @@ GIN_SUPPORT(cidr, true, leftmostvalue_inet, network_cmp)
 static Datum
 leftmostvalue_text(void)
 {
-	return PointerGetDatum(cstring_to_text_with_len("", 0));
+	return PointerGetDatum(cstring_to_text(""));
 }
 
 GIN_SUPPORT(text, true, leftmostvalue_text, bttextcmp)
diff --git a/contrib/hstore/hstore_io.c b/contrib/hstore/hstore_io.c
index 999ddad76d..7e3b52fbef 100644
--- a/contrib/hstore/hstore_io.c
+++ b/contrib/hstore/hstore_io.c
@@ -1347,7 +1347,7 @@ hstore_to_json_loose(PG_FUNCTION_ARGS)
 dst;
 
 	if (count == 0)
-		PG_RETURN_TEXT_P(cstring_to_text_with_len("{}", 2));
+		PG_RETURN_TEXT_P(cstring_to_text("{}"));
 
 	initStringInfo();
 	initStringInfo();
@@ -1402,7 +1402,7 @@ hstore_to_json(PG_FUNCTION_ARGS)
 dst;
 
 	if (count == 0)
-		PG_RETURN_TEXT_P(cstring_to_text_with_len("{}", 2));
+		PG_RETURN_TEXT_P(cstring_to_text("{}"));
 
 	initStringInfo();
 	initStringInfo();
diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c
index 2c620809b2..cd2b494259 100644
--- a/src/backend/utils/adt/json.c
+++ b/src/backend/utils/adt/json.c
@@ -1290,7 +1290,7 @@ json_build_object(PG_FUNCTION_ARGS)
 Datum
 json_build_object_noargs(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_TEXT_P(cstring_to_text_with_len("{}", 2));
+	PG_RETURN_TEXT_P(cstring_to_text("{}"));
 }
 
 Datum
@@ -1346,7 +1346,7 @@ json_build_array(PG_FUNCTION_ARGS)
 Datum
 json_build_array_noargs(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_TEXT_P(cstring_to_text_with_len("[]", 2));
+	PG_RETURN_TEXT_P(cstring_to_text("[]"));
 }
 
 /*
diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
index a4bfa5e404..b8845635ac 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -1800,8 +1800,8 @@ 

Re: Replace some cstring_to_text to cstring_to_text_with_len

2023-08-31 Thread Dagfinn Ilmari Mannsåker
Ranier Vilela  writes:

> Em qui., 31 de ago. de 2023 às 12:12, Dagfinn Ilmari Mannsåker <
> ilm...@ilmari.org> escreveu:
>
>> Ranier Vilela  writes:
>>
>> > Em qui., 31 de ago. de 2023 às 10:12, Dagfinn Ilmari Mannsåker <
>> > ilm...@ilmari.org> escreveu:
>> >
>> >> Andrew Dunstan  writes:
>> >>
>> >> > On 2023-08-31 Th 07:41, John Naylor wrote:
>> >> >>
>> >> >> On Thu, Aug 31, 2023 at 6:07 PM Ranier Vilela 
>> >> wrote:
>> >> >> >
>> >> >> > Em qui., 31 de ago. de 2023 às 00:22, Michael Paquier
>> >> >>  escreveu:
>> >> >> >>
>> >> >> >> On Wed, Aug 30, 2023 at 03:00:13PM -0300, Ranier Vilela wrote:
>> >> >> >> > cstring_to_text has a small overhead, because call strlen for
>> >> >> >> > pointer to char parameter.
>> >> >> >> >
>> >> >> >> > Is it worth the effort to avoid this, where do we know the size
>> >> >> of the
>> >> >> >> > parameter?
>> >> >> >>
>> >> >> >> Are there workloads where this matters?
>> >> >> >
>> >> >> > None, but note this change has the same spirit of 8b26769bc.
>> >> >>
>> >> >> - return cstring_to_text("");
>> >> >> + return cstring_to_text_with_len("", 0);
>> >> >>
>> >> >> This looks worse, so we'd better be getting something in return.
>> >> >
>> >> >
>> >> > I agree this is a bit ugly. I wonder if we'd be better off creating a
>> >> > function that returned an empty text value, so we'd just avoid
>> >> > converting the empty cstring altogether and say:
>> >> >
>> >> >   return empty_text();
>> >>
>> >> Or we could generalise it for any string literal (of which there are
>> >> slightly more¹ non-empty than empty in calls to
>> >> cstring_to_text(_with_len)):
>> >>
>> >> #define literal_to_text(str) cstring_to_text_with_len("" str "",
>> >> sizeof(str)-1)
>> >>
>> > I do not agree, I think this will get worse.
>>
>> How exactly will it get worse?  It's exactly equivalent to
>> cstring_to_text_with_len("", 0), since sizeof() is a compile-time
>> construct, and the string concatenation makes it fail if the argument is
>> not a literal string.
>>
> I think that concatenation makes the strings twice bigger, doesn't it?

No, it's just taking advantage of the fact that C string literals can be
split into multiple pieces separated by whitespace (like in SQL, but
without requiring a newline between them).  E.g. "foo" "bar" is exactly
equivalent to "foobar" after parsing.  The reason to use it in the macro
is to make it a syntax error if the argument is not a literal string but
instead a string variable, because in that case the sizeof() would
return the size of the pointer, not the string.

- ilmari




Re: Replace some cstring_to_text to cstring_to_text_with_len

2023-08-31 Thread Dagfinn Ilmari Mannsåker
Ranier Vilela  writes:

> Em qui., 31 de ago. de 2023 às 10:12, Dagfinn Ilmari Mannsåker <
> ilm...@ilmari.org> escreveu:
>
>> Andrew Dunstan  writes:
>>
>> > On 2023-08-31 Th 07:41, John Naylor wrote:
>> >>
>> >> On Thu, Aug 31, 2023 at 6:07 PM Ranier Vilela 
>> wrote:
>> >> >
>> >> > Em qui., 31 de ago. de 2023 às 00:22, Michael Paquier
>> >>  escreveu:
>> >> >>
>> >> >> On Wed, Aug 30, 2023 at 03:00:13PM -0300, Ranier Vilela wrote:
>> >> >> > cstring_to_text has a small overhead, because call strlen for
>> >> >> > pointer to char parameter.
>> >> >> >
>> >> >> > Is it worth the effort to avoid this, where do we know the size
>> >> of the
>> >> >> > parameter?
>> >> >>
>> >> >> Are there workloads where this matters?
>> >> >
>> >> > None, but note this change has the same spirit of 8b26769bc.
>> >>
>> >> - return cstring_to_text("");
>> >> + return cstring_to_text_with_len("", 0);
>> >>
>> >> This looks worse, so we'd better be getting something in return.
>> >
>> >
>> > I agree this is a bit ugly. I wonder if we'd be better off creating a
>> > function that returned an empty text value, so we'd just avoid
>> > converting the empty cstring altogether and say:
>> >
>> >   return empty_text();
>>
>> Or we could generalise it for any string literal (of which there are
>> slightly more¹ non-empty than empty in calls to
>> cstring_to_text(_with_len)):
>>
>> #define literal_to_text(str) cstring_to_text_with_len("" str "",
>> sizeof(str)-1)
>>
> I do not agree, I think this will get worse.

How exactly will it get worse?  It's exactly equivalent to
cstring_to_text_with_len("", 0), since sizeof() is a compile-time
construct, and the string concatenation makes it fail if the argument is
not a literal string.

Whether we want an even-more-optimised version for an empty text value
is another matter, but I doubt it'd be worth it.  Another option would
be to make cstring_to_text(_with_len) static inline functions, which
lets the compiler eliminate the memcpy() call when len == 0.

In fact, after playing around a bit (https://godbolt.org/z/x51aYGadh),
it seems like GCC, Clang and MSVC all eliminate the strlen() and
memcpy() calls for cstring_to_text("") under -O2 if it's static inline.

- ilmari




Re: Adding a pg_get_owned_sequence function?

2023-08-31 Thread Dagfinn Ilmari Mannsåker
On Fri, 9 Jun 2023, at 20:19, Dagfinn Ilmari Mannsåker wrote:

> Hi hackers,
>
> I've always been annoyed by the fact that pg_get_serial_sequence takes
> the table and returns the sequence as strings rather than regclass. And
> since identity columns were added, the name is misleading as well (which
> is even acknowledged in the docs, together with a suggestion for a
> better name).
> 
> So, instead of making excuses in the documentation, I thought why not
> add a new function which addresses all of these issues, and document the
> old one as a backward-compatibilty wrapper?
> 
> Please see the attached patch for my stab at this.

This didn't get any replies, so I've created a commitfest entry to make sure it 
doesn't get lost:

https://commitfest.postgresql.org/44/4535/

-- 
- ilmari




Re: Replace some cstring_to_text to cstring_to_text_with_len

2023-08-31 Thread Dagfinn Ilmari Mannsåker
Andrew Dunstan  writes:

> On 2023-08-31 Th 07:41, John Naylor wrote:
>>
>> On Thu, Aug 31, 2023 at 6:07 PM Ranier Vilela  wrote:
>> >
>> > Em qui., 31 de ago. de 2023 às 00:22, Michael Paquier
>>  escreveu:
>> >>
>> >> On Wed, Aug 30, 2023 at 03:00:13PM -0300, Ranier Vilela wrote:
>> >> > cstring_to_text has a small overhead, because call strlen for
>> >> > pointer to char parameter.
>> >> >
>> >> > Is it worth the effort to avoid this, where do we know the size
>> of the
>> >> > parameter?
>> >>
>> >> Are there workloads where this matters?
>> >
>> > None, but note this change has the same spirit of 8b26769bc.
>>
>> - return cstring_to_text("");
>> + return cstring_to_text_with_len("", 0);
>>
>> This looks worse, so we'd better be getting something in return.
>
>
> I agree this is a bit ugly. I wonder if we'd be better off creating a
> function that returned an empty text value, so we'd just avoid 
> converting the empty cstring altogether and say:
>
>   return empty_text();

Or we could generalise it for any string literal (of which there are
slightly more¹ non-empty than empty in calls to
cstring_to_text(_with_len)):

#define literal_to_text(str) cstring_to_text_with_len("" str "", sizeof(str)-1)

[1]: 

~/src/postgresql $ rg 'cstring_to_text(_with_len)?\("[^"]+"' | wc -l
17
~/src/postgresql $ rg 'cstring_to_text(_with_len)?\(""' | wc -l
15

- ilmari




Re: Make all Perl warnings fatal

2023-08-25 Thread Dagfinn Ilmari Mannsåker
Alvaro Herrera  writes:

> On 2023-Aug-10, Peter Eisentraut wrote:
>
>> I wanted to figure put if we can catch these more reliably, in the style of
>> -Werror.  AFAICT, there is no way to automatically turn all warnings into
>> fatal errors.  But there is a way to do it per script, by replacing
>> 
>> use warnings;
>> 
>> by
>> 
>> use warnings FATAL => 'all';
>> 
>> See attached patch to try it out.
>
> BTW in case we do find that there's some unforeseen problem and we want
> to roll back, it would be great to have a way to disable this without
> having to edit every single Perl file again later.  However, I didn't
> find a way to do it -- I thought about creating a separate PgWarnings.pm
> file that would do the "use warnings FATAL => 'all'" dance and which
> every other Perl file would use or include; but couldn't make it work.
> Maybe some Perl expert knows a good answer to this.

Like most pragmas (all-lower-case module names), `warnings` affects the
currently-compiling lexical scope, so to have a module like PgWarnings
inject it into the module that uses it, you'd call warnings->import in
its import method (which gets called when the `use PgWarnings;`
statement is compiled), e.g.:

package PgWarnings;

sub import {
warnings->import(FATAL => 'all');
}

I wouldn't bother with a whole module just for that, but if we have a
group of pragmas or modules we always want to enable/import and have the
ability to change this set without having to edit all the files, it's
quite common to have a ProjectName::Policy module that does that.  For
example, to exclude warnings that are unsafe, pointless, or impossible
to fatalise (cf. https://metacpan.org/pod/strictures#CATEGORY-SELECTIONS):

package PostgreSQL::Policy;

sub import {
strict->import;
warnings->import(
FATAL => 'all',
NONFATAL => qw(exec internal malloc recursion),
);
warnings->unimport(qw(once));
}

Now that we require Perl 5.14, we might want to consider enabling its
feature bundle as well, with:

feature->import(':5.14')

Although the only features of note that adds are:

   - say: the `say` function, like `print` but appends a newline

   - state: `state` variables, like `my` but only initialised the first
 time the function they're in is called, and the value persists
 between calls (like function-scoped `static` variables in C)

   - unicode_strings: use unicode semantics for characters in the
 128-255 range, regardless of internal representation

> Maybe the BEGIN block of each file can `eval` a new PgWarnings.pm that
> emits the "use warnings" line?

That's ugly as sin, and thankfully not necessary.

-ilmari




Re: Adding argument names to aggregate functions

2023-08-24 Thread Dagfinn Ilmari Mannsåker
On Thu, 24 Aug 2023, at 11:05, Daniel Gustafsson wrote:
>> On 4 Aug 2023, at 01:36, Nathan Bossart  wrote:
>> 
>> On Wed, Jul 19, 2023 at 09:38:12PM +0200, Daniel Gustafsson wrote:
>>> Great, thanks!  I had a quick look at this while rebasing (as well as your
>>> updated patch) and it seems like a good idea to add this.  Unless there are
>>> objections I will look at getting this in.
>> 
>> Hey Daniel, are you still planning on committing this?  I can pick it up if
>> you are busy.
>
> Finally unburied this from the post-vacation pile on the TODO list and pushed
> it after another once-over.

Thanks!

-- 
- ilmari




Re: Ignore 2PC transaction GIDs in query jumbling

2023-08-18 Thread Dagfinn Ilmari Mannsåker
Michael Paquier  writes:

> On Tue, Aug 15, 2023 at 08:49:37AM +0900, Michael Paquier wrote:
>> Hmm.  One issue with the patch is that we finish by considering
>> DEALLOCATE ALL and DEALLOCATE $1 as the same things, compiling the
>> same query IDs.  The difference is made in the Nodes by assigning NULL
>> to the name but we would now ignore it.  Wouldn't it be better to add
>> an extra field to DeallocateStmt to track separately the named
>> deallocate queries and ALL in monitoring?
>
> In short, I would propose something like that, with a new boolean
> field in DeallocateStmt that's part of the jumbling.
>
> Dagfinn, Julien, what do you think about the attached?

I don't have a particularly strong opinion on whether we should
distinguish DEALLOCATE ALL from DEALLOCATE  (call it +0.5), but
this seems like a reasonable way to do it unless we want to invent a
query_jumble_ignore_unless_null attribute (which seems like overkill for
this one case).

- ilmari

> Michael
>
> From ad91672367f408663824b36b6dd517513d7f0a2a Mon Sep 17 00:00:00 2001
> From: Michael Paquier 
> Date: Fri, 18 Aug 2023 09:12:58 +0900
> Subject: [PATCH v2] Track DEALLOCATE statements in pg_stat_activity
>
> Treat the statement name as a constant when jumbling.  A boolean field
> is added to DeallocateStmt to make a distinction between the ALL and
> named cases.
> ---
>  src/include/nodes/parsenodes.h|  8 +++-
>  src/backend/parser/gram.y |  8 
>  .../pg_stat_statements/expected/utility.out   | 41 +++
>  .../pg_stat_statements/pg_stat_statements.c   |  8 +---
>  contrib/pg_stat_statements/sql/utility.sql| 13 ++
>  5 files changed, 70 insertions(+), 8 deletions(-)
>
> diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
> index 2565348303..2b356336eb 100644
> --- a/src/include/nodes/parsenodes.h
> +++ b/src/include/nodes/parsenodes.h
> @@ -3925,8 +3925,12 @@ typedef struct ExecuteStmt
>  typedef struct DeallocateStmt
>  {
>   NodeTag type;
> - char   *name;   /* The name of the plan to 
> remove */
> - /* NULL means DEALLOCATE ALL */
> + /* The name of the plan to remove.  NULL if DEALLOCATE ALL. */
> + char   *name pg_node_attr(query_jumble_ignore);
> + /* true if DEALLOCATE ALL */
> + boolisall;
> + /* token location, or -1 if unknown */
> + int location pg_node_attr(query_jumble_location);
>  } DeallocateStmt;
>  
>  /*
> diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
> index b3bdf947b6..df34e96f89 100644
> --- a/src/backend/parser/gram.y
> +++ b/src/backend/parser/gram.y
> @@ -11936,6 +11936,8 @@ DeallocateStmt: DEALLOCATE name
>   DeallocateStmt *n = 
> makeNode(DeallocateStmt);
>  
>   n->name = $2;
> + n->isall = false;
> + n->location = @2;
>   $$ = (Node *) n;
>   }
>   | DEALLOCATE PREPARE name
> @@ -11943,6 +11945,8 @@ DeallocateStmt: DEALLOCATE name
>   DeallocateStmt *n = 
> makeNode(DeallocateStmt);
>  
>   n->name = $3;
> + n->isall = false;
> + n->location = @3;
>   $$ = (Node *) n;
>   }
>   | DEALLOCATE ALL
> @@ -11950,6 +11954,8 @@ DeallocateStmt: DEALLOCATE name
>   DeallocateStmt *n = 
> makeNode(DeallocateStmt);
>  
>   n->name = NULL;
> + n->isall = true;
> + n->location = -1;
>   $$ = (Node *) n;
>   }
>   | DEALLOCATE PREPARE ALL
> @@ -11957,6 +11963,8 @@ DeallocateStmt: DEALLOCATE name
>   DeallocateStmt *n = 
> makeNode(DeallocateStmt);
>  
>   n->name = NULL;
> + n->isall = true;
> + n->location = -1;
>   $$ = (Node *) n;
>   }
>   ;
> diff --git a/contrib/pg_stat_statements/expected/utility.out 
> b/contrib/pg_stat_statements/expected/utility.out
> index 93735d5d85..f331044f3e 100644
> --- a/contrib/pg_stat_statements/expected/utility.out
> +++ b/contrib/pg_stat_statements/expected/utility.out
> @@ -472,6 

Re: Ignore 2PC transaction GIDs in query jumbling

2023-08-14 Thread Dagfinn Ilmari Mannsåker
Michael Paquier  writes:

> On Sun, Aug 13, 2023 at 02:48:22PM +0800, Julien Rouhaud wrote:
>> On Sun, Aug 13, 2023 at 03:25:33PM +0900, Michael Paquier wrote:
>>> Perhaps not as much, actually, because I was just reminded that
>>> DEALLOCATE is something that pg_stat_statements ignores.  So this
>>> makes harder the introduction of tests.
>> 
>> Maybe it's time to revisit that?  According to [1] the reason why
>> pg_stat_statements currently ignores DEALLOCATE is because it indeed bloated
>> the entries but also because at that time the suggestion for jumbling
>> only this one was really hackish.
>
> Good point.  The argument of the other thread does not really hold
> much these days now that query jumbling can happen for all the utility
> nodes.
>
>> Now that we do have a sensible approach to jumble utility statements, I think
>> it would be beneficial to be able to track those, for instance to easily
>> diagnose a driver that doesn't rely on the extended protocol.
>
> Fine by me.  Would you like to write a patch?  I've begun typing an
> embryon of patch a few days ago, and did not look yet at the rest.
> Please see the attached.

As far as I could tell the only thing missing was removing
DeallocateStmt from the list of unhandled utility statement types (and
updating comments to match).  Updated patch attached.

- ilmari

>From 7f11e362ef8c097a78463435a4bd1ab1031ea233 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Mon, 14 Aug 2023 11:59:44 +0100
Subject: [PATCH] Track DEALLOCATE statements in pg_stat_activity

Treat the statement name as a constant when jumbling.
---
 .../pg_stat_statements/expected/utility.out   | 40 +++
 .../pg_stat_statements/pg_stat_statements.c   |  8 +---
 contrib/pg_stat_statements/sql/utility.sql| 13 ++
 src/backend/parser/gram.y |  4 ++
 src/include/nodes/parsenodes.h|  6 ++-
 5 files changed, 63 insertions(+), 8 deletions(-)

diff --git a/contrib/pg_stat_statements/expected/utility.out b/contrib/pg_stat_statements/expected/utility.out
index 93735d5d85..3681374869 100644
--- a/contrib/pg_stat_statements/expected/utility.out
+++ b/contrib/pg_stat_statements/expected/utility.out
@@ -472,6 +472,46 @@ SELECT pg_stat_statements_reset();
  
 (1 row)
 
+-- Execution statements
+SELECT 1 as a;
+ a 
+---
+ 1
+(1 row)
+
+PREPARE stat_select AS SELECT $1 AS a;
+EXECUTE stat_select (1);
+ a 
+---
+ 1
+(1 row)
+
+DEALLOCATE stat_select;
+PREPARE stat_select AS SELECT $1 AS a;
+EXECUTE stat_select (2);
+ a 
+---
+ 2
+(1 row)
+
+DEALLOCATE PREPARE stat_select;
+DEALLOCATE ALL;
+DEALLOCATE PREPARE ALL;
+SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
+ calls | rows | query 
+---+--+---
+ 4 |0 | DEALLOCATE $1
+ 2 |2 | PREPARE stat_select AS SELECT $1 AS a
+ 1 |1 | SELECT $1 as a
+ 1 |1 | SELECT pg_stat_statements_reset()
+(4 rows)
+
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--
+ 
+(1 row)
+
 -- SET statements.
 -- These use two different strings, still they count as one entry.
 SET work_mem = '1MB';
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 55b957d251..06b65aeef5 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -104,8 +104,7 @@ static const uint32 PGSS_PG_MAJOR_VERSION = PG_VERSION_NUM / 100;
  * ignores.
  */
 #define PGSS_HANDLED_UTILITY(n)		(!IsA(n, ExecuteStmt) && \
-	!IsA(n, PrepareStmt) && \
-	!IsA(n, DeallocateStmt))
+	!IsA(n, PrepareStmt))
 
 /*
  * Extension version number, for supporting older extension versions' objects
@@ -830,8 +829,7 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate)
 
 	/*
 	 * Clear queryId for prepared statements related utility, as those will
-	 * inherit from the underlying statement's one (except DEALLOCATE which is
-	 * entirely untracked).
+	 * inherit from the underlying statement's one.
 	 */
 	if (query->utilityStmt)
 	{
@@ -1116,8 +1114,6 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 	 * calculated from the query tree) would be used to accumulate costs of
 	 * ensuing EXECUTEs.  This would be confusing, and inconsistent with other
 	 * cases where planning time is not included at all.
-	 *
-	 * Likewise, we don't track execution of DEALLOCATE.
 	 */
 	if (pgss_track_utility && pgss_enabled(exec_nested_level) &&
 		PGSS_HANDLED_UTILITY(parsetree))
diff --git a/contrib/pg_stat_statements/sql/utility.sql b/contrib/pg_stat_statements/sql/utility.sql
index 87666d9135..5f7d4a467f 100644
--- a/contrib/pg_stat_statements/sql/utility.sql
+++ b/contrib/pg_stat_statements/sql/utility.sql
@@ -237,6 +237,19 @@ DROP DOMAIN domain_stats;
 SELECT calls, rows, query 

Obsolete reference to pg_relation in comment

2023-07-26 Thread Dagfinn Ilmari Mannsåker
Hi hackers,

Triggered by a discussion on IRC, I noticed that there's a stray
reference to pg_relation in a comment that was added long after it was
renamed to pg_class.  Here's a patch to bring that up to speed.

- ilmari

>From e395f8cb293f674f45eb3847534de07c7124e738 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Wed, 26 Jul 2023 18:31:51 +0100
Subject: [PATCH] Fix obsolete reference to pg_relation in comment

pg_relation was renamed to pg_class in 1991, but this comment (added
in 2004) missed the memo.
---
 src/backend/storage/large_object/inv_api.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/storage/large_object/inv_api.c b/src/backend/storage/large_object/inv_api.c
index 84e543e731..a56912700b 100644
--- a/src/backend/storage/large_object/inv_api.c
+++ b/src/backend/storage/large_object/inv_api.c
@@ -59,7 +59,7 @@ bool		lo_compat_privileges;
 
 /*
  * All accesses to pg_largeobject and its index make use of a single Relation
- * reference, so that we only need to open pg_relation once per transaction.
+ * reference, so that we only need to open pg_class once per transaction.
  * To avoid problems when the first such reference occurs inside a
  * subtransaction, we execute a slightly klugy maneuver to assign ownership of
  * the Relation reference to TopTransactionResourceOwner.
-- 
2.39.2



Re: [PATCH] Check more invariants during syscache initialization

2023-07-26 Thread Dagfinn Ilmari Mannsåker
Aleksander Alekseev  writes:

> Hi Zhang,
>
>> That reminds me to have a look at other code, and a grep search for
>> `oid != 0` shows there are several files using oid != 0.
>>
>> ```
>> .//src/bin/pg_resetwal/pg_resetwal.c: if (set_oid != 0)
>> .//src/bin/pg_resetwal/pg_resetwal.c: if (set_oid != 0)
>> .//src/bin/pg_dump/pg_backup_tar.c: if (oid != 0)
>> .//src/bin/pg_dump/pg_backup_custom.c: while (oid != 0)
>> .//src/bin/pg_dump/pg_backup_custom.c: while (oid != 0)
>> ```
>> That is another story… I would like to provide a patch if it's worth it.
>
> Good catch. Please do so.

Shouldn't these be calling `OidIsValid(…)`, not comparing directly to
`InvalidOid`?

~/src/postgresql (master $)$ git grep '[Oo]id [!=]= 0' | wc -l
18
~/src/postgresql (master $)$ git grep '[!=]= InvalidOid' | wc -l
296
~/src/postgresql (master $)$ git grep 'OidIsValid' | wc -l
1462

- ilmari




Re: Adding argument names to aggregate functions

2023-07-19 Thread Dagfinn Ilmari Mannsåker
Daniel Gustafsson  writes:

> This patch no longer applied but had a fairly trivial conflict so I've 
> attached
> a rebased v3 addressing the conflict in the hopes of getting this further.

Thanks for the heads-up!  Turns out the conflict was due to the new
json(b)_object_agg(_unique)(_strict) functions, which should also have
proargnames added.  Here's an updated patch that does that.

- ilmari

>From 2da3bada4f2a9425cbaa925a51f78773e4e16dfb Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Mon, 27 Feb 2023 13:06:29 +
Subject: [PATCH v4] Add argument names to multi-argument aggregates

This makes it easier to see which way around the arguments go when
using \dfa.  This is particularly relevant for string_agg(), but add
it to json(b)_object_agg() too for good measure.
---
 src/include/catalog/pg_proc.dat | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 6996073989..3e283671dc 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -4992,7 +4992,7 @@
 { oid => '3538', descr => 'concatenate aggregate input into a string',
   proname => 'string_agg', prokind => 'a', proisstrict => 'f',
   prorettype => 'text', proargtypes => 'text text',
-  prosrc => 'aggregate_dummy' },
+  proargnames => '{value,delimiter}', prosrc => 'aggregate_dummy' },
 { oid => '3543', descr => 'aggregate transition function',
   proname => 'bytea_string_agg_transfn', proisstrict => 'f',
   prorettype => 'internal', proargtypes => 'internal bytea bytea',
@@ -5004,7 +5004,7 @@
 { oid => '3545', descr => 'concatenate aggregate input into a bytea',
   proname => 'string_agg', prokind => 'a', proisstrict => 'f',
   prorettype => 'bytea', proargtypes => 'bytea bytea',
-  prosrc => 'aggregate_dummy' },
+  proargnames => '{value,delimiter}', prosrc => 'aggregate_dummy' },
 
 # To ASCII conversion
 { oid => '1845', descr => 'encode text from DB encoding to ASCII text',
@@ -8953,21 +8953,22 @@
 { oid => '3197', descr => 'aggregate input into a json object',
   proname => 'json_object_agg', prokind => 'a', proisstrict => 'f',
   provolatile => 's', prorettype => 'json', proargtypes => 'any any',
-  prosrc => 'aggregate_dummy' },
+  proargnames => '{key,value}', prosrc => 'aggregate_dummy' },
 { oid => '6280', descr => 'aggregate non-NULL input into a json object',
   proname => 'json_object_agg_strict', prokind => 'a', proisstrict => 'f',
   provolatile => 's', prorettype => 'json', proargtypes => 'any any',
-  prosrc => 'aggregate_dummy' },
+  proargnames => '{key,value}', prosrc => 'aggregate_dummy' },
 { oid => '6281',
   descr => 'aggregate input into a json object with unique keys',
   proname => 'json_object_agg_unique', prokind => 'a', proisstrict => 'f',
   provolatile => 's', prorettype => 'json', proargtypes => 'any any',
-  prosrc => 'aggregate_dummy' },
+  proargnames => '{key,value}', prosrc => 'aggregate_dummy' },
 { oid => '6282',
   descr => 'aggregate non-NULL input into a json object with unique keys',
   proname => 'json_object_agg_unique_strict', prokind => 'a',
   proisstrict => 'f', provolatile => 's', prorettype => 'json',
-  proargtypes => 'any any', prosrc => 'aggregate_dummy' },
+  proargtypes => 'any any', proargnames => '{key,value}',
+  prosrc => 'aggregate_dummy' },
 { oid => '3198', descr => 'build a json array from any inputs',
   proname => 'json_build_array', provariadic => 'any', proisstrict => 'f',
   provolatile => 's', prorettype => 'json', proargtypes => 'any',
@@ -9881,22 +9882,22 @@
   prosrc => 'jsonb_object_agg_finalfn' },
 { oid => '3270', descr => 'aggregate inputs into jsonb object',
   proname => 'jsonb_object_agg', prokind => 'a', proisstrict => 'f',
-  prorettype => 'jsonb', proargtypes => 'any any',
+  prorettype => 'jsonb', proargtypes => 'any any', proargnames => '{key,value}',
   prosrc => 'aggregate_dummy' },
 { oid => '6288', descr => 'aggregate non-NULL inputs into jsonb object',
   proname => 'jsonb_object_agg_strict', prokind => 'a', proisstrict => 'f',
-  prorettype => 'jsonb', proargtypes => 'any any',
+  prorettype => 'jsonb', proargtypes => 'any any', proargnames => '{key,value}',
   prosrc => 'aggregate_dummy' },
 { oid => '6289',
   descr => 'aggregate inputs into jsonb object checking key uniqueness',
   proname => 'jsonb_object_agg_unique', prokind => 'a', proisstrict => 'f',
-  prorettype => 'jsonb', proargtypes => 'any any',
+  prorettype => 'jsonb', proargtypes => 'any any', proargnames => '{key,value}',
   prosrc => 'aggregate_dummy' },
 { oid => '6290',
   descr => 'aggregate non-NULL inputs into jsonb object checking key uniqueness',
   proname => 'jsonb_object_agg_unique_strict', prokind => 'a',
   proisstrict => 'f', prorettype => 'jsonb', proargtypes => 'any any',
-  prosrc => 'aggregate_dummy' },
+  proargnames => '{key,value}', prosrc => 'aggregate_dummy' },
 { oid => '3271', descr => 'build 

Re: [PATCH] Using named captures in Catalog::ParseHeader()

2023-06-30 Thread Dagfinn Ilmari Mannsåker
Michael Paquier  writes:

> On Wed, Jun 14, 2023 at 10:30:23AM +0100, Dagfinn Ilmari Mannsåker wrote:
>> Thanks for the review!
>
> v17 is now open, so applied this one.

Thanks for committing!

- ilmari




Re: Tab completion for CREATE SCHEMAAUTHORIZATION

2023-06-30 Thread Dagfinn Ilmari Mannsåker
Michael Paquier  writes:

> On Tue, May 09, 2023 at 12:26:16PM +0900, Michael Paquier wrote:
>> That looks pretty much OK to me.  One tiny comment I have is that this
>> lacks brackets for the inner blocks, so I have added some in the v4
>> attached.
>
> The indentation was a bit wrong, so fixed it, and applied on HEAD.

Thanks!

- ilmari




Re: [PATCH] Honor PG_TEST_NOCLEAN for tempdirs

2023-06-27 Thread Dagfinn Ilmari Mannsåker
Andrew Dunstan  writes:

> On 2023-06-26 Mo 19:55, Jacob Champion wrote:
>> Hello,
>>
>> I was running the test_pg_dump extension suite, and I got annoyed that
>> I couldn't keep it from deleting its dump artifacts after a successful
>> run. Here's a patch to make use of PG_TEST_NOCLEAN (which currently
>> covers the test cluster's base directory) with the Test::Utils
>> tempdirs too.
>>
>> (Looks like this idea was also discussed last year [1]; let me know if
>> I missed any more recent suggestions.)
>
>
> -        CLEANUP => 1);
> +        CLEANUP => not defined $ENV{'PG_TEST_NOCLEAN'});
>
>
> This doesn't look quite right. If PG_TEST_NOCLEAN had a value of 0 we
> would still do the cleanup. I would probably use something like:
>
>     CLEANUP => $ENV{'PG_TEST_NOCLEAN'} // 1
>
> i.e. if it's not defined at all or has a value of undef, do the cleanup,
> otherwise use the value.

If the environment varible were used as a boolean, it should be

CLEANUP => not $ENV{PG_TEST_NOCLEAN}

since `not undef` returns true with no warning, and the senses of the
two flags are inverted.

However, the docs
(https://www.postgresql.org/docs/16/regress-tap.html#REGRESS-TAP-VARS)
say "If the environment variable PG_TEST_NOCLEAN is set", not "is set to
a true value", and the existing test in PostgreSQL::Test::Cluster's END
block is:

# skip clean if we are requested to retain the basedir
next if defined $ENV{'PG_TEST_NOCLEAN'};
  
So the original `not defined` test is consistent with that.
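To make the difference between the three spellings concrete, here's a
quick truth table (1 = clean up the tempdir; note how the `// 1` variant
ends up cleaning when NOCLEAN=1, which is the inversion I mean):

```perl
use strict;
use warnings;

# Compare the three proposed spellings for PG_TEST_NOCLEAN being
# unset, "0", and "1".
for my $val (undef, '0', '1')
{
	local %ENV = %ENV;
	if (defined $val) { $ENV{PG_TEST_NOCLEAN} = $val }
	else              { delete $ENV{PG_TEST_NOCLEAN} }

	my $not_defined = defined $ENV{PG_TEST_NOCLEAN} ? 0 : 1;
	my $not_bool    = $ENV{PG_TEST_NOCLEAN}         ? 0 : 1;
	my $dslash      = $ENV{PG_TEST_NOCLEAN} // 1;

	printf "%-5s  not-defined:%d  not:%d  //1:%s\n",
	  defined $val ? "'$val'" : 'unset',
	  $not_defined, $not_bool, $dslash;
}
```

which prints

    unset  not-defined:1  not:1  //1:1
    '0'    not-defined:0  not:1  //1:0
    '1'    not-defined:0  not:0  //1:1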

Tangentially, even though the above line contradicts it, the general
perl style is to not unnecessarily quote hash keys or words before `=>`:

~/src/postgresql $ rg -P -t perl '\{\s*\w+\s*\}' | wc -l
1662
~/src/postgresql $ rg -P -t perl '\{\s*(["'\''])\w+\1\s*\}' | wc -l
155
~/src/postgresql $ rg -P -t perl '\w+\s*=>' | wc -l
3842
~/src/postgresql $ rg -P -t perl '(["'\''])\w+\1\s*=>' | wc -l
310
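For the avoidance of doubt, the quotes are purely redundant for plain
identifiers; they're only needed for keys that aren't simple words:

```perl
use strict;
use warnings;

# A bare identifier before => or inside {} is auto-quoted:
my %h = (key => 1);              # identical to ('key' => 1)
print $h{key}, "\n";             # identical to $h{'key'}; prints 1

# Quotes are required only when the key isn't a plain identifier:
$h{'some-key'} = 2;              # '-' isn't allowed in a bare key
$h{"got $h{key}"} = 3;           # or when interpolation is wanted
print scalar keys %h, "\n";      # prints 3
```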

- ilmari




Re: Bytea PL/Perl transform

2023-06-23 Thread Dagfinn Ilmari Mannsåker
Andrew Dunstan  writes:

> On 2023-06-22 Th 16:56, Greg Sabino Mullane wrote:
>>
>> * Do all of these transforms need to be their own contrib modules? So
>> much duplicated code across contrib/*_plperl already (and *plpython 
>> too for that matter) ...
>>
>>
>
> Yeah, that's a bit of a mess. Not sure what we can do about it now.

Would it be possible to move the functions and other objects to a new
combined extension, and make the existing ones depend on that?

I see ALTER EXTENSION has both ADD and DROP subcommands which don't
affect the object itself, only the extension membership.  The challenge
would be getting the ordering right between the upgrade/install scripts
dropping the objects from the existing extension and adding them to the
new extension.

- ilmari




Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

2023-06-22 Thread Dagfinn Ilmari Mannsåker
Tom Lane  writes:

> Tommy Pavlicek  writes:
>
>> Additionally, I wasn't sure whether it was preferred to fail or succeed on
>> ALTERs that have no effect, such as adding hashes on an operator that
>> already allows them or disabling hashes on one that does not. I chose to
>> raise an error when this happens, on the thinking it was more explicit and
>> made the code simpler, even though the end result would be what the user
>> wanted.
>
> You could argue that both ways I guess.  We definitely need to raise error
> if the command tries to change an existing nondefault setting, since that
> might break things as per previous discussion.  But perhaps rejecting
> an attempt to set the existing setting is overly nanny-ish.  Personally
> I think I'd lean to "don't throw an error if we don't have to", but I'm
> not strongly set on that position.
>
> (Don't we have existing precedents that apply here?  I can't offhand
> think of any existing ALTER commands that would reject no-op requests,
> but maybe that's not a direct precedent.)

Since it only supports adding these operations if they don't already
exist, should it not be ALTER OPERATOR ADD , not SET ?

That makes it natural to add an IF NOT EXISTS clause, like ALTER TABLE
ADD COLUMN has, to make it a no-op instead of an error.

>   regards, tom lane

- ilmari




Re: pgindent vs. pgperltidy command-line arguments

2023-06-20 Thread Dagfinn Ilmari Mannsåker
Andrew Dunstan  writes:

> I'm intending to add some of the new pgindent features to
> pgperltidy. Preparatory to that here's a rewrite of pgperltidy in perl -
> no new features yet but it does remove the hardcoded path, and requires
> you to pass in one or more files / directories as arguments.

Good idea, here's some comments.

> #!/usr/bin/perl
>
> # Copyright (c) 2023, PostgreSQL Global Development Group
>
> # src/tools/pgindent/pgperltidy
>
> use strict;
> use warnings;
>
> use File::Find;
>
> my $perltidy = $ENV{PERLTIDY} || 'perltidy';
>
> my @files;
>
> die "No directories or files specified" unless @ARGV;

It's not really useful to have the file name and line in errors like
this, adding a "\n" to the end of the message suppresses that.

> sub is_perl_exec
> {
>   my $name = shift;
>   my $out = `file $name 2>/dev/null`;
>   return $out =~ /:.*perl[0-9]*\b/i;
> }

> my $wanted = sub {
>
>   my $name = $File::Find::name;
>   my ($dev, $ino, $mode, $nlink, $uid, $gid);
>
>   # check it's a plain file and either it has a perl extension (.p[lm])
>   # or it's executable and `file` thinks it's a perl script.
>
>   (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))
> && -f _
> && (/\.p[lm]$/ || ((($mode & 0100) == 0100) && is_perl_exec($_)))
> && push(@files, $name);
> };

The core File::stat and Fcntl modules can make this neater:

use File::stat;
use Fcntl ':mode';
 
my $wanted = sub {
	my $st;
	push @files, $File::Find::name
	  if ($st = lstat($_))
	  && -f $st
	  && (/\.p[lm]$/ || (($st->mode & S_IXUSR) && is_perl_exec($_)));
};

> File::Find::find({ wanted => $wanted }, @ARGV);
>
> my $list = join(" ", @files);
>
> system "$perltidy --profile=src/tools/pgindent/perltidyrc $list";

It's better to use the list form of system, to avoid shell escaping
issues.  Also, since this is the last thing in the script we might as
well exec it instead:

exec $perltidy, '--profile=src/tools/pgindent/perltidyrc', @files;

- ilmari




Re: [PATCH] Missing dep on Catalog.pm in meson rules

2023-06-16 Thread Dagfinn Ilmari Mannsåker
Andres Freund  writes:

> Hi,
>
> On 2023-06-16 16:20:14 -0400, Andrew Dunstan wrote:
>> Unless I'm misunderstanding, this doesn't look terribly feasible to me. You
>> can only get at %INC by loading the module, which in many cases will have
>> side effects.
>
> I was envisioning using %INC after the use/require block - I don't think our
> scripts load additional modules after that point?

I was thinking of a module for writing depfile entries that would append
`values %INC` to the list of source files for each target specified by
the script.

>> And then you would also need to filter out things loaded that
>> are not our artefacts (e.g. Catalog.pm loads File::Compare).
>
> I don't think we would need to filter the output. This would just be for a
> build dependency file. I don't see a problem with rerunning genbki.pl et al 
> after
> somebody updates File::Compare?

As long as meson doesn't object to dep files outside the source tree.
Otherwise, an option would be to pass in @SOURCE_ROOT@ and only include
`grep /^\Q$source_root\E\b/, values %INC` in the depfile.
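Something like this untested sketch of what I have in mind (the module
and function names are invented for illustration, not actual build-system
API):

```perl
package PostgreSQL::DepFile;    # hypothetical module name

use strict;
use warnings;

# Append a Make/ninja-style depfile entry listing every Perl module
# the running script has loaded (values %INC) as dependencies of
# $target.  If $source_root is given, restrict the list to modules
# inside the source tree, so a system module update doesn't force a
# rebuild.
sub write_deps
{
	my ($depfile, $target, $source_root) = @_;
	my @deps = defined $source_root
	  ? grep { /^\Q$source_root\E\b/ } values %INC
	  : values %INC;
	open my $fh, '>>', $depfile or die "can't open $depfile: $!\n";
	print $fh "$target: @deps\n";
	close $fh or die "can't close $depfile: $!\n";
}

1;
```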

> Greetings,
>
> Andres Freund

- ilmari




Re: [PATCH] Using named captures in Catalog::ParseHeader()

2023-06-14 Thread Dagfinn Ilmari Mannsåker
Michael Paquier  writes:

> On Thu, Jun 01, 2023 at 01:12:22PM +0100, Dagfinn Ilmari Mannsåker wrote:
>> While I was rewriting the regexes I noticed that they were inconsistent
>> about whether they accepted whitespace in the parameter lists, so I took
>> the liberty to make them consistently allow whitespace after the opening
>> paren and the commas, which is what most of them already did.
>
> That's the business with \s* in CATALOG.  Is that right?  Indeed,
> that's more consistent.

Yes, \s* means "zero or more whitespace characters".
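For anyone following along, here's a toy example of the named-capture
style (the pattern is a simplified stand-in, not the actual Catalog.pm
regex), with \s* allowing optional whitespace after the paren and commas:

```perl
use strict;
use warnings;

my $line = 'CATALOG(pg_class, 1259, RelationRelationId)';

# Named captures (Perl 5.10+) label the groups and populate %+,
# instead of relying on positional $1/$2/$3.
if ($line =~ /^CATALOG\(\s* (?<name>\w+)
              \s*,\s* (?<oid>\d+)
              \s*,\s* (?<oid_macro>\w+) \s*\)/x)
{
	print "name=$+{name} oid=$+{oid} macro=$+{oid_macro}\n";
	# prints: name=pg_class oid=1259 macro=RelationRelationId
}
```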

>> I've verified that the generated postgres.bki is identical to before,
>> and all tests pass.
>
> I find that pretty cool.  Nice.  Patch looks OK from here.

Thanks for the review!

- ilmari




Re: [PATCH] Using named captures in Catalog::ParseHeader()

2023-06-13 Thread Dagfinn Ilmari Mannsåker
Dagfinn Ilmari Mannsåker  writes:

> However, now that we require Perl 5.14, we can use the named capture
> feature (introduced in Perl 5.10) to make that a lot clearer, as in the
> attached patch.

Added to the open commitfest: https://commitfest.postgresql.org/43/4361/

- ilmari




Adding a pg_get_owned_sequence function?

2023-06-09 Thread Dagfinn Ilmari Mannsåker
Hi hackers,

I've always been annoyed by the fact that pg_get_serial_sequence takes
the table and returns the sequence as strings rather than regclass. And
since identity columns were added, the name is misleading as well (which
is even acknowledged in the docs, together with a suggestion for a
better name).

So, instead of making excuses in the documentation, I thought why not
add a new function which addresses all of these issues, and document the
old one as a backward-compatibility wrapper?

Please see the attached patch for my stab at this.

- ilmari

>From 7fd37c15920b7d2e87edef4351932559e2c9ef4f Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Fri, 9 Jun 2023 19:55:58 +0100
Subject: [PATCH] Add pg_get_owned_sequence function

This is the name the docs say `pg_get_serial_sequence` should have
had, and gives us the opportunity to change the return and table
argument types to `regclass` and the column argument to `name`,
instead of using `text` everywhere.  This matches what's in catalogs,
and requires less explaining than the rules for
`pg_get_serial_sequence`.
---
 doc/src/sgml/func.sgml | 46 -
 src/backend/utils/adt/ruleutils.c  | 69 +++---
 src/include/catalog/pg_proc.dat|  3 ++
 src/test/regress/expected/identity.out |  6 +++
 src/test/regress/expected/sequence.out |  6 +++
 src/test/regress/sql/identity.sql  |  1 +
 src/test/regress/sql/sequence.sql  |  1 +
 7 files changed, 102 insertions(+), 30 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 5a47ce4343..687a8480e6 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -24327,6 +24327,35 @@

   
 
+  
+   
+
+ pg_get_owned_sequence
+
+pg_get_owned_sequence ( table regclass, column name )
+regclass
+   
+   
+Returns the sequence associated with a column, or NULL if no sequence
+is associated with the column.  If the column is an identity column,
+the associated sequence is the sequence internally created for that
+column.  For columns created using one of the serial types
+(serial, smallserial, bigserial),
+it is the sequence created for that serial column definition.  In the
+latter case, the association can be modified or removed
+with ALTER SEQUENCE OWNED BY.  The result is
+suitable for passing to the sequence functions (see
+).
+   
+   
+A typical use is in reading the current value of the sequence for an
+identity or serial column, for example:
+
+SELECT currval(pg_get_owned_sequence('sometable', 'id'));
+
+   
+  
+
   

 
@@ -24336,19 +24365,10 @@
 text


-Returns the name of the sequence associated with a column,
-or NULL if no sequence is associated with the column.
-If the column is an identity column, the associated sequence is the
-sequence internally created for that column.
-For columns created using one of the serial types
-(serial, smallserial, bigserial),
-it is the sequence created for that serial column definition.
-In the latter case, the association can be modified or removed
-with ALTER SEQUENCE OWNED BY.
-(This function probably should have been
-called pg_get_owned_sequence; its current name
-reflects the fact that it has historically been used with serial-type
-columns.)  The first parameter is a table name with optional
+A backwards-compatibility wrapper
+for pg_get_owned_sequence, which
+uses text for the table and sequence names instead of
+regclass.  The first parameter is a table name with optional
 schema, and the second parameter is a column name.  Because the first
 parameter potentially contains both schema and table names, it is
 parsed per usual SQL rules, meaning it is lower-cased by default.
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index d3a973d86b..b20a1e7583 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -518,6 +518,7 @@ static char *generate_qualified_type_name(Oid typid);
 static text *string_to_text(char *str);
 static char *flatten_reloptions(Oid relid);
 static void get_reloptions(StringInfo buf, Datum reloptions);
+static Oid pg_get_owned_sequence_internal(Oid tableOid, char *column);
 
 #define only_marker(rte)  ((rte)->inh ? "" : "ONLY ")
 
@@ -2763,6 +2764,28 @@ pg_get_userbyid(PG_FUNCTION_ARGS)
 }
 
 
+/*
+ * pg_get_owned_sequence
+ *		Get the sequence used by an identity or serial column.
+ */
+Datum
+pg_get_owned_sequence(PG_FUNCTION_ARGS)
+{
+	Oid			tableOid = PG_GETARG_OID(0);
+	char	   *column = NameStr(*PG_GETARG_NAME(1));
+	Oid			sequenceId;
+
+	sequenceId = 

Re: Error in calculating length of encoded base64 string

2023-06-09 Thread Dagfinn Ilmari Mannsåker
Tom Lane  writes:

> Gurjeet Singh  writes:
>> On Thu, Jun 8, 2023 at 7:35 AM Tom Lane  wrote:
>>> This bug is very ancient, dating to commit 79d78bb26 which
>>> added encode.c.  (The other instances were presumably copied
>>> from there.)  Still, it doesn't quite seem worth back-patching.
>
>> Is it worth investing time in trying to unify these 3 occurrences of
>> base64 length (and possibly other relevant) code to one place? If yes,
>> I can volunteer for it.
>
> I wondered about that too.  It seems really silly that we made
> a copy in src/common and did not replace the others with calls
> to that.

Also, while we're at it, how about some unit tests that both encode and
calculate the encoded length of strings of various lengths and check
that they match?
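The kind of check I have in mind, sketched here with Perl's core
MIME::Base64 rather than the C code in src/common (the invariant is the
same: encoding n bytes yields 4 * ceil(n / 3) characters):

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64);
use POSIX qw(ceil);

# Encode strings of every length from 0 to 64 bytes and check that
# the actual encoded length matches the computed one.
for my $n (0 .. 64)
{
	my $encoded  = encode_base64('x' x $n, '');    # '' = no line breaks
	my $expected = 4 * ceil($n / 3);
	die "length mismatch for n=$n" unless length($encoded) == $expected;
}
print "all lengths match\n";
```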

- ilmari




Re: [PATCH] Missing dep on Catalog.pm in meson rules

2023-06-01 Thread Dagfinn Ilmari Mannsåker
On Thu, 1 Jun 2023, at 22:16, Tristan Partin wrote:
> Could you create a variable for the file instead of calling files() 3
> times?
>
>> catalog_pm = files('path/to/Catalog.pm')

Sure, but which meson.build file should it go in? I know nothing about meson 
variable scoping.

> -- 
> Tristan Partin
> Neon (https://neon.tech)

-- 
- ilmari




[PATCH] Missing dep on Catalog.pm in meson rules

2023-06-01 Thread Dagfinn Ilmari Mannsåker
Hi Hackers,

While hacking on Catalog.pm (over in
https://postgr.es/m/87y1l3s7o9.fsf%40wibble.ilmari.org) I noticed that
ninja wouldn't rebuild postgres.bki on changes to the module.  Here's a
patch that adds it to depend_files for the targets I could find that
invoke scripts that use it.

- ilmari

>From 63e8cdbd2509feeb493bf7b52362e8b429e8279c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Thu, 1 Jun 2023 13:27:41 +0100
Subject: [PATCH] meson: declare dependency on Catalog.pm for targets that use
 it

---
 src/include/catalog/meson.build | 2 +-
 src/include/nodes/meson.build   | 1 +
 src/include/utils/meson.build   | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/include/catalog/meson.build b/src/include/catalog/meson.build
index 3179be09d3..01181ae1fc 100644
--- a/src/include/catalog/meson.build
+++ b/src/include/catalog/meson.build
@@ -111,7 +111,7 @@ generated_catalog_headers = custom_target('generated_catalog_headers',
   output: output_files,
   install_dir: output_install,
   input: input,
-  depend_files: bki_data_f,
+  depend_files: bki_data_f + files('../../backend/catalog/Catalog.pm'),
   build_by_default: true,
   install: true,
   command: [
diff --git a/src/include/nodes/meson.build b/src/include/nodes/meson.build
index 9a8e85c4a5..dafad003ed 100644
--- a/src/include/nodes/meson.build
+++ b/src/include/nodes/meson.build
@@ -50,6 +50,7 @@ node_support_install = [
 generated_nodes = custom_target('nodetags.h',
   input: node_support_input,
   output: node_support_output,
+  depend_files: files('../../backend/catalog/Catalog.pm'),
   command: [
 perl, files('../../backend/nodes/gen_node_support.pl'),
 '-o', '@OUTDIR@',
diff --git a/src/include/utils/meson.build b/src/include/utils/meson.build
index 2fed1e3f8e..3cbe21350c 100644
--- a/src/include/utils/meson.build
+++ b/src/include/utils/meson.build
@@ -44,6 +44,7 @@ fmgrtab_output = ['fmgroids.h', 'fmgrprotos.h', 'fmgrtab.c']
 fmgrtab_target = custom_target('fmgrtab',
   input: '../catalog/pg_proc.dat',
   output : fmgrtab_output,
+  depend_files: files('../../backend/catalog/Catalog.pm'),
   command: [perl, '-I', '@SOURCE_ROOT@/src/backend/catalog/', files('../../backend/utils/Gen_fmgrtab.pl'), '--include-path=@SOURCE_ROOT@/src/include', '--output=@OUTDIR@', '@INPUT@'],
   install: true,
   install_dir: [dir_include_server / 'utils', dir_include_server / 'utils', false],
-- 
2.39.2



[PATCH] Using named captures in Catalog::ParseHeader()

2023-06-01 Thread Dagfinn Ilmari Mannsåker
Hi Hackers,

Peter's patch set for autogenerating syscache info
(https://postgr.es/m/75ae5875-3abc-dafc-8aec-73247ed41cde%40eisentraut.org)
touched on one of my least favourite parts of Catalog.pm: the
parenthesis-counting nightmare that is the parsing of catalog header
directives.

However, now that we require Perl 5.14, we can use the named capture
feature (introduced in Perl 5.10) to make that a lot clearer, as in the
attached patch.

While I was rewriting the regexes I noticed that they were inconsistent
about whether they accepted whitespace in the parameter lists, so I took
the liberty to make them consistently allow whitespace after the opening
paren and the commas, which is what most of them already did.

I've verified that the generated postgres.bki is identical to before,
and all tests pass.

- ilmari

>From dd7ed959c93438045f9ff270feab33bb7b965c29 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Wed, 31 May 2023 14:00:58 +0100
Subject: [PATCH] Use named captures in Catalog::ParseHeader()

Now that we require Perl 5.14 we can use named captures and the %+
hash instead of having to count parenthesis groups manually.
---
 src/backend/catalog/Catalog.pm | 95 --
 1 file changed, 55 insertions(+), 40 deletions(-)

diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
index 84aaeb002a..b15f513183 100644
--- a/src/backend/catalog/Catalog.pm
+++ b/src/backend/catalog/Catalog.pm
@@ -91,73 +91,88 @@ sub ParseHeader
 		# Push the data into the appropriate data structure.
 		# Caution: when adding new recognized OID-defining macros,
 		# also update src/include/catalog/renumber_oids.pl.
-		if (/^DECLARE_TOAST\(\s*(\w+),\s*(\d+),\s*(\d+)\)/)
-		{
-			push @{ $catalog{toasting} },
-			  { parent_table => $1, toast_oid => $2, toast_index_oid => $3 };
-		}
-		elsif (
-			/^DECLARE_TOAST_WITH_MACRO\(\s*(\w+),\s*(\d+),\s*(\d+),\s*(\w+),\s*(\w+)\)/
+		if (/^DECLARE_TOAST\(\s*
+			 (?<parent_table>\w+),\s*
+			 (?<toast_oid>\d+),\s*
+			 (?<toast_index_oid>\d+)\s*
+			 \)/x
 		  )
 		{
-			push @{ $catalog{toasting} },
-			  {
-parent_table => $1,
-toast_oid => $2,
-toast_index_oid => $3,
-toast_oid_macro => $4,
-toast_index_oid_macro => $5
-			  };
+			push @{ $catalog{toasting} }, {%+};
 		}
 		elsif (
-			/^DECLARE_(UNIQUE_)?INDEX(_PKEY)?\(\s*(\w+),\s*(\d+),\s*(\w+),\s*(.+)\)/
+			/^DECLARE_TOAST_WITH_MACRO\(\s*
+			 (?<parent_table>\w+),\s*
+			 (?<toast_oid>\d+),\s*
+			 (?<toast_index_oid>\d+),\s*
+			 (?<toast_oid_macro>\w+),\s*
+			 (?<toast_index_oid_macro>\w+)\s*
+			 \)/x
+		  )
+		{
+			push @{ $catalog{toasting} }, {%+};
+		}
+		elsif (
+			/^DECLARE_(UNIQUE_)?INDEX(_PKEY)?\(\s*
+			 (?<index_name>\w+),\s*
+			 (?<index_oid>\d+),\s*
+			 (?<index_oid_macro>\w+),\s*
+			 (?<index_decl>.+)\s*
+			 \)/x
 		  )
 		{
 			push @{ $catalog{indexing} },
 			  {
 is_unique => $1 ? 1 : 0,
 is_pkey => $2 ? 1 : 0,
-index_name => $3,
-index_oid => $4,
-index_oid_macro => $5,
-index_decl => $6
-			  };
-		}
-		elsif (/^DECLARE_OID_DEFINING_MACRO\(\s*(\w+),\s*(\d+)\)/)
-		{
-			push @{ $catalog{other_oids} },
-			  {
-other_name => $1,
-other_oid => $2
+%+,
 			  };
 		}
 		elsif (
-			/^DECLARE_(ARRAY_)?FOREIGN_KEY(_OPT)?\(\s*\(([^)]+)\),\s*(\w+),\s*\(([^)]+)\)\)/
+			/^DECLARE_OID_DEFINING_MACRO\(\s*
+			 (?<other_name>\w+),\s*
+			 (?<other_oid>\d+)\s*
+			 \)/x
+		  )
+		{
+			push @{ $catalog{other_oids} }, {%+};
+		}
+		elsif (
+			/^DECLARE_(ARRAY_)?FOREIGN_KEY(_OPT)?\(\s*
+			 \((?<fk_cols>[^)]+)\),\s*
+			 (?<pk_table>\w+),\s*
+			 \((?<pk_cols>[^)]+)\)\s*
+			 \)/x
 		  )
 		{
 			push @{ $catalog{foreign_keys} },
 			  {
 is_array => $1 ? 1 : 0,
 is_opt => $2 ? 1 : 0,
-fk_cols => $3,
-pk_table => $4,
-pk_cols => $5
+%+,
 			  };
 		}
-		elsif (/^CATALOG\((\w+),(\d+),(\w+)\)/)
+		elsif (
+			/^CATALOG\(\s*
+			 (?<catname>\w+),\s*
+			 (?<relation_oid>\d+),\s*
+			 (?<relation_oid_macro>\w+)\s*
+			 \)/x
+		  )
 		{
-			$catalog{catname} = $1;
-			$catalog{relation_oid} = $2;
-			$catalog{relation_oid_macro} = $3;
+			@catalog{ keys %+ } = values %+;
 
 			$catalog{bootstrap} = /BKI_BOOTSTRAP/ ? ' bootstrap' : '';
 			$catalog{shared_relation} =
 			  /BKI_SHARED_RELATION/ ? ' shared_relation' : '';
-			if (/BKI_ROWTYPE_OID\((\d+),(\w+)\)/)
+			if (/BKI_ROWTYPE_OID\(\s*
+ (?<rowtype_oid>\d+),\s*
+ (?<rowtype_oid_macro>\w+)\s*
+ \)/x
+			  )
 			{
-$catalog{rowtype_oid} = $1;
-$catalog{rowtype_oid_clause} = " rowtype_oid $1";
-$catalog{rowtype_oid_macro} = $2;
+@catalog{ keys %+ } = values %+;
+$catalog{rowtype_oid_clause} = " rowtype_oid $+{rowtype_oid}";
 			}
 			else
 			{
-- 
2.39.2



Re: generate syscache info automatically

2023-05-31 Thread Dagfinn Ilmari Mannsåker
Peter Eisentraut  writes:

> The idea was mentioned in [0].  genbki.pl already knows everything about
> system catalog indexes.  If we add a "please also make a syscache for 
> this one" flag to the catalog metadata, we can have genbki.pl produce
> the tables in syscache.c and syscache.h automatically.

+1 on this worthwhile reduction of manual work.  Tangentially, it
reminded me of one of my least favourite parts of Catalog.pm, the
regexes in ParseHeader():


> diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
> index 84aaeb002a..a727d692b7 100644
> --- a/src/backend/catalog/Catalog.pm
> +++ b/src/backend/catalog/Catalog.pm
> @@ -110,7 +110,7 @@ sub ParseHeader
> };
>   }
>   elsif (
> - 
> /^DECLARE_(UNIQUE_)?INDEX(_PKEY)?\(\s*(\w+),\s*(\d+),\s*(\w+),\s*(.+)\)/
> + 
> /^DECLARE_(UNIQUE_)?INDEX(_PKEY)?\(\s*(\w+),\s*(\d+),\s*(\w+),\s*(\w+),\s*(.+)\)/
> )
>   {
>   push @{ $catalog{indexing} },
> @@ -120,7 +120,8 @@ sub ParseHeader
>   index_name => $3,
>   index_oid => $4,
>   index_oid_macro => $5,
> - index_decl => $6
> + table_name => $6,
> + index_decl => $7
> };
>   }
>   elsif (/^DECLARE_OID_DEFINING_MACRO\(\s*(\w+),\s*(\d+)\)/)


Now that we require Perl 5.14, we could replace this parenthesis-
counting nightmare with named captures (introduced in Perl 5.10), which
would make the above change look like this instead (context expanded to
show the whole elsif block):

elsif (
/^DECLARE_(UNIQUE_)?INDEX(_PKEY)?\(\s*
 (?<index_name>\w+),\s*
 (?<index_oid>\d+),\s*
 (?<index_oid_macro>\w+),\s*
+(?<table_name>\w+),\s*
 (?<index_decl>.+)
 \)/x
  )
{
push @{ $catalog{indexing} },
  {
is_unique => $1 ? 1 : 0,
is_pkey => $2 ? 1 : 0,
%+,
  };
}

For other patterns without the optional bits in the keyword, it becomes
even simpler, e.g.

if (/^DECLARE_TOAST\(\s*
 (?<parent_table>\w+),\s*
 (?<toast_oid>\d+),\s*
 (?<toast_index_oid>\d+)\s*
 \)/x
  )
{
push @{ $catalog{toasting} }, {%+};
}
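The same named-capture idea exists in most regex engines. As a hedged illustration (not part of the patch), here is the `DECLARE_TOAST` pattern transliterated to Python's `re` module, where `(?P<name>...)` plus `groupdict()` play the role of Perl's `(?<name>...)` and `%+`; the OID values are illustrative:

```python
import re

# Python spells named captures (?P<name>...); groupdict() returns them
# as a dict, much like Perl's %+ hash.
declare_toast = re.compile(r"""^DECLARE_TOAST\(\s*
    (?P<parent_table>\w+),\s*
    (?P<toast_oid>\d+),\s*
    (?P<toast_index_oid>\d+)\s*
    \)""", re.VERBOSE)

m = declare_toast.match("DECLARE_TOAST(pg_proc, 2836, 2837)")
print(m.groupdict())
# {'parent_table': 'pg_proc', 'toast_oid': '2836', 'toast_index_oid': '2837'}
```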


I'd be happy to submit a patch to do this for all the ParseHeader()
regexes (in a separate thread) if others agree this is an improvement.

- ilmari




Re: PG 16 draft release notes ready

2023-05-30 Thread Dagfinn Ilmari Mannsåker
Bruce Momjian  writes:

> On Sat, May 27, 2023 at 09:34:37PM -0400, Bruce Momjian wrote:
>> > > This is controlled by auto_explain.log_parameter_max_length, and by
>> > > default query parameters will be logged with no length
>> > > restriction. SHOULD THIS BE MORE CLEARLY IDENTIFIED AS CONTROLLING THE
>> > > EXECUTION OF PREPARED STATEMENTS?
>> > 
>> > This is wrong, the logging applies to all query parameters, not just for
>> > prepared statements (and has nothing to do with controlling the
>> > execution thereof).  That was just the only way to test it when it was
>> > written, because psql's \bind command didn't exist yet then.
>> 
>> I see your point.  How is this?
>> 
>>  Allow auto_explain to log query parameters used by parameterized
>>  statements (Dagfinn Ilmari Mannsåker)
>> 
>>  This affects queries using server-side PREPARE/EXECUTE
>>  and client-side parse/bind.  Logging is controlled by
>>  auto_explain.log_parameter_max_length;  by default query
>>  parameters will be logged with no length restriction.
>
> Done, attached patch applied.

That works for me. Thanks!

- ilmari




Re: PG 16 draft release notes ready

2023-05-25 Thread Dagfinn Ilmari Mannsåker
Bruce Momjian  writes:

> I have completed the first draft of the PG 16 release notes.  You can
> see the output here:
>
>   https://momjian.us/pgsql_docs/release-16.html
>
> I will adjust it to the feedback I receive;  that URL will quickly show
> all updates.

The bit about auto_explain and query parameters says:

> Allow auto_explain to log query parameters used in executing prepared
> statements (Dagfinn Ilmari Mannsåker)
>
> This is controlled by auto_explain.log_parameter_max_length, and by
> default query parameters will be logged with no length
> restriction. SHOULD THIS BE MORE CLEARLY IDENTIFIED AS CONTROLLING THE
> EXECUTION OF PREPARED STATEMENTS?

This is wrong, the logging applies to all query parameters, not just for
prepared statements (and has nothing to do with controlling the
execution thereof).  That was just the only way to test it when it was
written, because psql's \bind command didn't exist yet then.

Should we perhaps add some tests for that, like the attached?

- ilmari

>From d3630f299fc2d2d9f9eb3addd426f98e5196100d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Thu, 25 May 2023 21:13:11 +0100
Subject: [PATCH] Test auto_explain parameter logging with protocol-level bind
 parameters

When auto_explain.log_parameter_max_length was added, psql didn't have
the \bind command for extended query protocol yet, so the test could
only use prepared statements.
---
 contrib/auto_explain/t/001_auto_explain.pl | 31 ++
 1 file changed, 31 insertions(+)

diff --git a/contrib/auto_explain/t/001_auto_explain.pl b/contrib/auto_explain/t/001_auto_explain.pl
index abb422f8de..d2a0078546 100644
--- a/contrib/auto_explain/t/001_auto_explain.pl
+++ b/contrib/auto_explain/t/001_auto_explain.pl
@@ -106,6 +106,21 @@ sub query_log
 	qr/Query Parameters:/,
 	"query parameters not logged when disabled, text mode");
 
+# bind parameters
+$log_contents = query_log($node,
+	q{SELECT * FROM pg_proc WHERE proname = $1 AND prokind = $2 \bind ascii f \g}
+);
+
+like(
+	$log_contents,
+	qr/Query Text: SELECT \* FROM pg_proc WHERE proname = \$1 AND prokind = \$2/,
+	"query text with parameters logged, text mode");
+
+like(
+	$log_contents,
+	qr/Query Parameters: \$1 = 'ascii', \$2 = 'f'/,
+	"query parameters logged, text mode");
+
 # Query Identifier.
 # Logging enabled.
 $log_contents = query_log(
@@ -172,6 +187,22 @@ sub query_log
 	qr/"Node Type": "Index Scan"[^}]*"Index Name": "pg_class_relname_nsp_index"/s,
 	"index scan logged, json mode");
 
+# query with bind parameters in JSON format.
+$log_contents = query_log(
+	$node,
+	q{SELECT * FROM pg_class WHERE relname = $1 AND relkind = $2 \bind pg_proc r \x},
+	{ "auto_explain.log_format" => "json" });
+
+like(
+	$log_contents,
+	qr/"Query Text": "SELECT \* FROM pg_class WHERE relname = \$1 AND relkind = \$2 "/,
+	"query text with parameters logged, json mode");
+
+like(
+	$log_contents,
+	qr/"Query Parameters": "\$1 = 'pg_proc', \$2 = 'r'"/,
+	"query parameters logged, json mode");
+
 # Check that PGC_SUSET parameters can be set by non-superuser if granted,
 # otherwise not
 
-- 
2.39.2



Re: Implement generalized sub routine find_in_log for tap test

2023-05-25 Thread Dagfinn Ilmari Mannsåker
vignesh C  writes:

> Hi,
>
> The recovery tap test has 4 implementations of find_in_log sub routine
> for various uses, I felt we can generalize these and have a single
> function for the same. The attached patch is an attempt to have a
> generalized sub routine find_in_log which can be used by all of them.
> Thoughts?

+1 on factoring out this common code. Just a few comments on the implementation.


> diff --git a/src/test/perl/PostgreSQL/Test/Utils.pm 
> b/src/test/perl/PostgreSQL/Test/Utils.pm
> index a27fac83d2..5c9b2f6c03 100644
> --- a/src/test/perl/PostgreSQL/Test/Utils.pm
> +++ b/src/test/perl/PostgreSQL/Test/Utils.pm
> @@ -67,6 +67,7 @@ our @EXPORT = qw(
>slurp_file
>append_to_file
>string_replace_file
> +  find_in_log
>check_mode_recursive
>chmod_recursive
>check_pg_config
> @@ -579,6 +580,28 @@ sub string_replace_file
> 
>  =pod
>
> +
> +=item find_in_log(node, pattern, offset)
> +
> +Find pattern in logfile of node after offset byte.
> +
> +=cut
> +
> +sub find_in_log
> +{
> + my ($node, $pattern, $offset) = @_;
> +
> + $offset = 0 unless defined $offset;
> + my $log = PostgreSQL::Test::Utils::slurp_file($node->logfile);

Since this function is in the same package, there's no need to qualify
it with the full name.  I know the callers you copied it from did, but
they wouldn't have had to either, since it's exported by default (in the
@EXPORT array above), unless the use statement has an explicit argument
list that excludes it.

> + return 0 if (length($log) <= 0 || length($log) <= $offset);
> +
> + $log = substr($log, $offset);

Also, the existing callers don't seem to have got the memo that
slurp_file() takes an optional offset parameter, which will cause it to
seek to that position before slurping the file, which is more efficient
than reading the whole file in and substr-ing it.  There's not much
point in the length checks either, since regex-matching against an empty
string is very cheap (and if the provided pattern can match the empty
string the whole function call is rather pointless).

> + return $log =~ m/$pattern/;
> +}

All in all, it could be simplified to:

sub find_in_log {
my ($node, $pattern, $offset) = @_;

return slurp_file($node->logfile, $offset) =~ $pattern;
}
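For comparison, here is what the seek-then-match approach looks like outside Perl. A hypothetical Python sketch (`log_contains` and the sample log are invented for illustration, not taken from the test framework):

```python
import os
import re
import tempfile

def log_contains(logfile, pattern, offset=0):
    # Seek straight to the offset instead of reading the whole file
    # and slicing it, mirroring slurp_file()'s optional offset.
    with open(logfile, "rb") as f:
        f.seek(offset)
        text = f.read().decode("utf-8", errors="replace")
    return re.search(pattern, text) is not None

# usage: write a tiny fake log file, then probe it
fd, path = tempfile.mkstemp(suffix=".log")
with os.fdopen(fd, "w") as f:
    f.write("boring line\nERROR: deadlock detected\n")

try:
    print(log_contains(path, r"deadlock detected"))    # True
    print(log_contains(path, r"deadlock", offset=30))  # False: offset is past the match
finally:
    os.unlink(path)
```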

However, none of the other functions in ::Utils know anything about node
objects, which makes me think it should be a method on the node itself
(i.e. in PostgreSQL::Test::Cluster) instead.  Also, I think log_contains
would be a better name, since it just returns a boolean.  The name
find_in_log makes me think it would return the log lines matching the
pattern, or the position of the match in the file.

In that case, the slurp_file() call would have to be fully qualified,
since ::Cluster uses an empty import list to avoid polluting the method
namespace with imported functions.

- ilmari




Re: Large files for relations

2023-05-12 Thread Dagfinn Ilmari Mannsåker
Thomas Munro  writes:

> On Fri, May 12, 2023 at 8:16 AM Jim Mlodgenski  wrote:
>> On Mon, May 1, 2023 at 9:29 PM Thomas Munro  wrote:
>>> I am not aware of any modern/non-historic filesystem[2] that can't do
>>> large files with ease.  Anyone know of anything to worry about on that
>>> front?
>>
>> There is some trouble in the ambiguity of what we mean by "modern" and
>> "large files". There are still a large number of users of ext4 where
>> the max file size is 16TB. Switching to a single large file per
>> relation would effectively cut the max table size in half for those
>> users. How would a user with say a 20TB table running on ext4 be
>> impacted by this change?
[…]
> A less aggressive version of the plan would be that we just keep the
> segment code for the foreseeable future with no planned cut off, and
> we make all of those "piggy back" transformations that I showed in the
> patch set optional.  For example, I had it so that CLUSTER would
> quietly convert your relation to large format, if it was still in
> segmented format (might as well if you're writing all the data out
> anyway, right?), but perhaps that could depend on a GUC.  Likewise for
> base backup.  Etc.  Then someone concerned about hitting the 16TB
> limit on ext4 could opt out.  Or something like that.  It seems funny
> though, that's exactly the user who should want this feature (they
> have 16,000 relation segment files).

If we're going to have to keep the segment code for the foreseeable
future anyway, could we not get most of the benefit by increasing the
segment size to something like 1TB?  The vast majority of tables would
fit in one file, and there would be less risk of hitting filesystem
limits.
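To put rough numbers on that, a back-of-the-envelope sketch in Python (the 1 GiB figure is the default build's segment size, RELSEG_SIZE × BLCKSZ; the 16 TiB ext4 limit is the figure quoted upthread):

```python
GiB = 1024 ** 3
TiB = 1024 ** 4

# Default PostgreSQL build: 1 GiB segments (131072 blocks of 8 KiB each).
seg_default = 131072 * 8192
assert seg_default == GiB

table = 16 * TiB  # a table at the quoted ext4 max-file-size limit

print(table // seg_default)  # 16384 segment files at the default 1 GiB
print(table // TiB)          # 16 segment files with 1 TiB segments
```

So a 16 TiB relation drops from roughly 16,000 segment files (as mentioned upthread) to 16, while each segment stays comfortably under the filesystem's per-file limit.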

- ilmari




Re: Discussion: psql \et -> edit the trigger function

2023-05-10 Thread Dagfinn Ilmari Mannsåker
Kirk Wolak  writes:

> We already have
> \ef
> \ev
>
> The use case here is simply that it saves me from:
\d <table>
> [scroll through all the fields]
> [often scroll right]
> select function name
> \ef [paste function name]
>
> and tab completion is much narrower

I think it would make more sense to model it on the filtering letters
available for \df:

  \df[anptw][S+] [FUNCPTRN [TYPEPTRN ...]]
   list [only agg/normal/procedure/trigger/window] functions


I just noticed that tab completion after e.g. \dft does not take the
function type restriction into account, so the solution for \ef
should be made to work for both. I wonder if it would even be possible
to share the tab completion filtering conditions with the actual
implementation of \df.

Also, I notice that \df only tab completes functions (i.e. not
procedures), although it actually returns all routines.

> When doing conversions and reviews all of this stuff has to be reviewed.
> Oftentimes, renamed, touched.
>
> I am 100% willing to write the code, docs, etc. but would appreciate
> feedback.

I'm happy to assist with and review at least the tab completion parts of
this effort.

> Kirk...

- ilmari




Re: Tab completion for CREATE SCHEMAAUTHORIZATION

2023-05-08 Thread Dagfinn Ilmari Mannsåker
Michael Paquier  writes:

> On Tue, May 02, 2023 at 01:19:49PM +0100, Dagfinn Ilmari Mannsåker wrote:
>> Dagfinn Ilmari Mannsåker  writes:
>>> This is for completing the word CREATE itself after CREATE SCHEMA
>>> [[] AUTHORIZATION] .  The things that can come after that
>>> are already handled generically earlier in the function:
>>>
>>> /* CREATE */
>>> /* complete with something you can create */
>>> else if (TailMatches("CREATE"))
>>> matches = rl_completion_matches(text, create_command_generator);
>>>
>>> create_command_generator uses the words_after_create array, which lists
>>> all the things that can be created.
>
> You are right.  I have completely forgotten that this code path would
> append everything that supports CREATE for a CREATE SCHEMA command :)
>
>> But, looking closer at the docs, only tables, views, indexes, sequences
>> and triggers can be created as part of a CREATE SCHEMA statement. Maybe
>> we should add a HeadMatches("CREATE", "SCHEMA") exception in the above?
>
> Yes, it looks like we are going to need an exception and append only
> the keywords that are supported, or we will end up recommending mostly
> things that are not accepted by the parser.

Here's an updated v3 patch with that.  While adding that, I noticed that
CREATE UNLOGGED only tab-completes TABLE and MATERIALIZED VIEW, not
SEQUENCE, so I added that (and removed MATERIALIZED VIEW when part of
CREATE SCHEMA).

- ilmari

>From 31856bf5397253da76cbce9ccb590855a44da30d Mon Sep 17 00:00:00 2001
From: tanghy 
Date: Mon, 9 Aug 2021 18:47:12 +0100
Subject: [PATCH v3] Add tab completion for CREATE SCHEMA

 - AUTHORIZATION both in addition to and after a schema name
 - list of owner roles after AUTHORIZATION
 - CREATE and GRANT after any otherwise-complete command
 - Only the allowed object types after CREATE

Also adjust completion after CREATE UNLOGGED:

 - Add SEQUENCE
 - Don't suggest MATERIALIZED VIEW in CREATE TABLE
---
 src/bin/psql/tab-complete.c | 40 ++---
 1 file changed, 33 insertions(+), 7 deletions(-)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index bd04244969..66b3f00c1b 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1031,9 +1031,13 @@ static const SchemaQuery Query_for_trigger_of_table = {
 "   FROM pg_catalog.pg_roles "\
 "  WHERE rolname LIKE '%s'"
 
+/* add these to Query_for_list_of_roles in OWNER contexts */
+#define Keywords_for_list_of_owner_roles \
+"CURRENT_ROLE", "CURRENT_USER", "SESSION_USER"
+
 /* add these to Query_for_list_of_roles in GRANT contexts */
 #define Keywords_for_list_of_grant_roles \
-"PUBLIC", "CURRENT_ROLE", "CURRENT_USER", "SESSION_USER"
+Keywords_for_list_of_owner_roles, "PUBLIC"
 
 #define Query_for_all_table_constraints \
 "SELECT conname "\
@@ -1785,7 +1789,13 @@ psql_completion(const char *text, int start, int end)
 /* CREATE */
 	/* complete with something you can create */
 	else if (TailMatches("CREATE"))
-		matches = rl_completion_matches(text, create_command_generator);
+		/* only some object types can be created as part of CREATE SCHEMA */
+		if (HeadMatches("CREATE", "SCHEMA"))
+			COMPLETE_WITH("TABLE", "VIEW", "INDEX", "SEQUENCE", "TRIGGER",
+		  /* for INDEX and TABLE/SEQUENCE, respectively */
+		  "UNIQUE", "UNLOGGED");
+		else
+			matches = rl_completion_matches(text, create_command_generator);
 
 	/* complete with something you can create or replace */
 	else if (TailMatches("CREATE", "OR", "REPLACE"))
@@ -3154,6 +3164,20 @@ psql_completion(const char *text, int start, int end)
 	else if (TailMatches("AS", "ON", "SELECT|UPDATE|INSERT|DELETE", "TO"))
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables);
 
+/* CREATE SCHEMA [ <name> ] [ AUTHORIZATION <role> ] */
+	else if (Matches("CREATE", "SCHEMA"))
+		COMPLETE_WITH_QUERY_PLUS(Query_for_list_of_schemas,
+ "AUTHORIZATION");
+	else if (Matches("CREATE", "SCHEMA", "AUTHORIZATION") ||
+			 Matches("CREATE", "SCHEMA", MatchAny, "AUTHORIZATION"))
+		COMPLETE_WITH_QUERY_PLUS(Query_for_list_of_roles,
+ Keywords_for_list_of_owner_roles);
+	else if (Matches("CREATE", "SCHEMA", "AUTHORIZATION", MatchAny) ||
+			 Matches("CREATE", "SCHEMA", MatchAny, "AUTHORIZATION", MatchAny))
+		COMPLETE_WITH("CREATE", "GRANT");
+	else if (Matches("CREATE", "SCHEMA", MatchAny))
+	

Re: Tab completion for CREATE SCHEMAAUTHORIZATION

2023-05-02 Thread Dagfinn Ilmari Mannsåker
Dagfinn Ilmari Mannsåker  writes:

> Michael Paquier  writes:
>
>> On Sat, Apr 15, 2023 at 11:06:25AM +0900, Michael Paquier wrote:
>>> Thanks, I'll look at it.
>>
>> +   else if (Matches("CREATE", "SCHEMA", "AUTHORIZATION", MatchAny) ||
>> +Matches("CREATE", "SCHEMA", MatchAny, "AUTHORIZATION", 
>> MatchAny))
>> +   COMPLETE_WITH("CREATE", "GRANT");
>> +   else if (Matches("CREATE", "SCHEMA", MatchAny))
>> +   COMPLETE_WITH("AUTHORIZATION", "CREATE", "GRANT");
>>
>> I had this grammar under my eyes a few days ago for a different patch,
>> and there are much more objects types that can be appended to a CREATE
>> SCHEMA, like triggers, sequences, tables or views, so this is
>> incomplete, isn't it?
>
> This is for completing the word CREATE itself after CREATE SCHEMA
> [[] AUTHORIZATION] .  The things that can come after that
> are already handled generically earlier in the function:
>
> /* CREATE */
> /* complete with something you can create */
> else if (TailMatches("CREATE"))
> matches = rl_completion_matches(text, create_command_generator);
>
> create_command_generator uses the words_after_create array, which lists
> all the things that can be created.

But, looking closer at the docs, only tables, views, indexes, sequences
and triggers can be created as part of a CREATE SCHEMA statement. Maybe
we should add a HeadMatches("CREATE", "SCHEMA") exception in the above?

- ilmari




Re: Tab completion for CREATE SCHEMAAUTHORIZATION

2023-05-02 Thread Dagfinn Ilmari Mannsåker
Michael Paquier  writes:

> On Sat, Apr 15, 2023 at 11:06:25AM +0900, Michael Paquier wrote:
>> Thanks, I'll look at it.
>
> +   else if (Matches("CREATE", "SCHEMA", "AUTHORIZATION", MatchAny) ||
> +Matches("CREATE", "SCHEMA", MatchAny, "AUTHORIZATION", MatchAny))
> +   COMPLETE_WITH("CREATE", "GRANT");
> +   else if (Matches("CREATE", "SCHEMA", MatchAny))
> +   COMPLETE_WITH("AUTHORIZATION", "CREATE", "GRANT");
>
> I had this grammar under my eyes a few days ago for a different patch,
> and there are much more objects types that can be appended to a CREATE
> SCHEMA, like triggers, sequences, tables or views, so this is
> incomplete, isn't it?

This is for completing the word CREATE itself after CREATE SCHEMA
[[] AUTHORIZATION] .  The things that can come after that
are already handled generically earlier in the function:

/* CREATE */
/* complete with something you can create */
else if (TailMatches("CREATE"))
matches = rl_completion_matches(text, create_command_generator);

create_command_generator uses the words_after_create array, which lists
all the things that can be created.

- ilmari




Re: constants for tar header offsets

2023-04-18 Thread Dagfinn Ilmari Mannsåker
Robert Haas  writes:

> On Tue, Apr 18, 2023 at 12:06 PM Tom Lane  wrote:
>> Hmm, you're right: I checked the POSIX.1-2018 spec as well, and
>> it agrees that the prefix field is 155 bytes long.  Perhaps just
>> add another comment line indicating that 12 bytes remain unassigned?
>
> OK. Here's v2, with that change and a few others.

It still has magic numbers for the sizes of the fields, should those
also be named constants?

- ilmari




Re: Adding argument names to aggregate functions

2023-04-18 Thread Dagfinn Ilmari Mannsåker
Jim Jones  writes:

> On 18.04.23 10:58, I wrote:
>> On 14.04.23 12:03, Dagfinn Ilmari Mannsåker wrote:
>>> Thanks for the heads-up, here's a rebased patch. I've also formatted
>>> the lines to match what reformat_dat_file.pl wants.  It also wanted to
>>> reformat a bunch of other entries, but I left those alone.
>>>
>>> - ilmari
>>
>> The patch applies cleanly now and \df shows the argument names:
>>
>> postgres=# \df string_agg
>>                                     List of functions
>>    Schema   |    Name    | Result data type |     Argument data types      | Type
>> ------------+------------+------------------+------------------------------+------
>>  pg_catalog | string_agg | bytea            | value bytea, delimiter bytea | agg
>>  pg_catalog | string_agg | text             | value text, delimiter text   | agg
>> (2 rows)
>>
>> postgres=# \df json_object_agg
>>                                    List of functions
>>    Schema   |      Name       | Result data type |  Argument data types   | Type
>> ------------+-----------------+------------------+------------------------+------
>>  pg_catalog | json_object_agg | json             | key "any", value "any" | agg
>> (1 row)
>>
>>
>> I'm wondering if there are some sort of guidelines that dictate when
>> to name an argument or not. It would be nice to have one for future 
>> reference.

I seemed to recall a patch to add argument names to a bunch of functions
in the past, thinking that might have some guidance, but can't for the
life of me find it now.

>> I will mark the CF entry as "Read for Committer" and let the
>> committers decide if it's best to first create a guideline for that or 
>> not.
>>
>> Best, Jim
>>
> I just saw that the patch is failing[1] on "macOS - Ventura -
> Meson". Not sure if it is related to this patch though ..
>
> [1]
> https://api.cirrus-ci.com/v1/artifact/task/5881376021413888/meson_log/build/meson-logs/meson-log.txt

Link to the actual job:

https://cirrus-ci.com/task/5881376021413888

The failure was:

[09:54:38.727] 216/262 postgresql:recovery / recovery/031_recovery_conflict 
ERROR 198.73s exit status 60

Looking at its log:

https://api.cirrus-ci.com/v1/artifact/task/5881376021413888/testrun/build/testrun/recovery/031_recovery_conflict/log/regress_log_031_recovery_conflict

we see:

timed out waiting for match: (?^:User was holding a relation lock for too long) 
at /Users/admin/pgsql/src/test/recovery/t/031_recovery_conflict.pl line 311.

That looks indeed completely unrelated to this patch.

- ilmari




Re: Tab completion for CREATE SCHEMAAUTHORIZATION

2023-04-14 Thread Dagfinn Ilmari Mannsåker
Michael Paquier  writes:

> On Wed, Aug 11, 2021 at 10:16:15AM +0900, Michael Paquier wrote:
>> +   else if (Matches("CREATE", "SCHEMA", "AUTHORIZATION"))
>> +   COMPLETE_WITH_QUERY(Query_for_list_of_owner_roles);
>> +   else if (Matches("CREATE", "SCHEMA", MatchAny, "AUTHORIZATION"))
>> +   COMPLETE_WITH_QUERY(Query_for_list_of_owner_roles);
>> +   else if (Matches("CREATE", "SCHEMA", MatchAny, "AUTHORIZATION", 
>> MatchAny))
>> +   COMPLETE_WITH("CREATE", "GRANT");
>> +   else if (Matches("CREATE", "SCHEMA", MatchAny))
>> +   COMPLETE_WITH("AUTHORIZATION", "CREATE", "GRANT");
>> Looks like you forgot the case "CREATE SCHEMA AUTHORIZATION MatchAny"
>> that should be completed by GRANT and CREATE.
>
> This patch has been waiting on author for more than a couple of weeks,
> so I have marked it as returned with feedback in the CF app.  Please
> feel free to resubmit if you are able to work more on that.

Looks like I completely dropped the ball on this one, sorry.  Here's a
rebased patch which uses the new COMPLETE_WITH_QUERY_PLUS functionality
added in commit 02b8048ba5dc36238f3e7c3c58c5946220298d71.


- ilmari


>From 4c4833dfdd2bb01cd35715223433f961e5ec004c Mon Sep 17 00:00:00 2001
From: tanghy 
Date: Mon, 9 Aug 2021 18:47:12 +0100
Subject: [PATCH v2] Add tab completion for CREATE SCHEMA

 - AUTHORIZATION both in addition to and after a schema name
 - list of owner roles after AUTHORIZATION
 - CREATE and GRANT after any otherwise-complete command
---
 src/bin/psql/tab-complete.c | 24 
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5825b2a195..222dd617a2 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1031,9 +1031,13 @@ static const SchemaQuery Query_for_trigger_of_table = {
 "   FROM pg_catalog.pg_roles "\
 "  WHERE rolname LIKE '%s'"
 
+/* add these to Query_for_list_of_roles in OWNER contexts */
+#define Keywords_for_list_of_owner_roles \
+"CURRENT_ROLE", "CURRENT_USER", "SESSION_USER"
+
 /* add these to Query_for_list_of_roles in GRANT contexts */
 #define Keywords_for_list_of_grant_roles \
-"PUBLIC", "CURRENT_ROLE", "CURRENT_USER", "SESSION_USER"
+Keywords_for_list_of_owner_roles, "PUBLIC"
 
 #define Query_for_all_table_constraints \
 "SELECT conname "\
@@ -3154,6 +3158,20 @@ psql_completion(const char *text, int start, int end)
 	else if (TailMatches("AS", "ON", "SELECT|UPDATE|INSERT|DELETE", "TO"))
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables);
 
+/* CREATE SCHEMA [ <name> ] [ AUTHORIZATION <role> ] */
+	else if (Matches("CREATE", "SCHEMA"))
+		COMPLETE_WITH_QUERY_PLUS(Query_for_list_of_schemas,
+ "AUTHORIZATION");
+	else if (Matches("CREATE", "SCHEMA", "AUTHORIZATION") ||
+			 Matches("CREATE", "SCHEMA", MatchAny, "AUTHORIZATION"))
+		COMPLETE_WITH_QUERY_PLUS(Query_for_list_of_roles,
+ Keywords_for_list_of_owner_roles);
+	else if (Matches("CREATE", "SCHEMA", "AUTHORIZATION", MatchAny) ||
+			 Matches("CREATE", "SCHEMA", MatchAny, "AUTHORIZATION", MatchAny))
+		COMPLETE_WITH("CREATE", "GRANT");
+	else if (Matches("CREATE", "SCHEMA", MatchAny))
+		COMPLETE_WITH("AUTHORIZATION", "CREATE", "GRANT");
+
 /* CREATE SEQUENCE --- is allowed inside CREATE SCHEMA, so use TailMatches */
 	else if (TailMatches("CREATE", "SEQUENCE", MatchAny) ||
 			 TailMatches("CREATE", "TEMP|TEMPORARY", "SEQUENCE", MatchAny))
@@ -4263,9 +4281,7 @@ psql_completion(const char *text, int start, int end)
 /* OWNER TO  - complete with available roles */
 	else if (TailMatches("OWNER", "TO"))
 		COMPLETE_WITH_QUERY_PLUS(Query_for_list_of_roles,
- "CURRENT_ROLE",
- "CURRENT_USER",
- "SESSION_USER");
+ Keywords_for_list_of_owner_roles);
 
 /* ORDER BY */
 	else if (TailMatches("FROM", MatchAny, "ORDER"))
-- 
2.39.2



Re: Adding argument names to aggregate functions

2023-04-14 Thread Dagfinn Ilmari Mannsåker
Jim Jones  writes:

> On 12.04.23 19:53, Dagfinn Ilmari Mannsåker wrote:
>> Dagfinn Ilmari Mannsåker  writes:
>>
>>> Hi hackers,
>>>
>>> I'm sure I'm not the only one who can never remember which way around
>>> the value and delimiter arguments go for string_agg() and has to look it
>>> up in the manual every time. To make it more convenient, here's a patch
>>> that adds proargnames to its pg_proc entries so that it can be seen with
>>> a quick \df in psql.
>> Added to the 2023-07 commitfest:
>>
>> https://commitfest.postgresql.org/43/4275/
>>
>> - ilmari
>
> +1 for adding the argument names.
>
> The patch needs a rebase though.. it no longer applies :
>
> $ git apply
> ~/Downloads/0001-Add-argument-names-to-multi-argument-aggregates.patch
> error: patch failed: src/include/catalog/pg_proc.dat:8899
> error: src/include/catalog/pg_proc.dat: patch does not apply

Thanks for the heads-up, here's a rebased patch.  I've also formatted
the lines to match what reformat_dat_file.pl wants.  It also wanted to
reformat a bunch of other entries, but I left those alone.

- ilmari

>From a6ff997fcea7aa7201318cb94db0173ea6efdf02 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Mon, 27 Feb 2023 13:06:29 +
Subject: [RFC PATCH v2] Add argument names to multi-argument aggregates

This makes it easier to see which way around the arguments go when
using \dfa.  This is particularly relevant for string_agg(), but add
it to json(b)_object_agg() too for good measure.
---
 src/include/catalog/pg_proc.dat | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index b516cee8bd..b2db8d07e1 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5017,7 +5017,7 @@
 { oid => '3538', descr => 'concatenate aggregate input into a string',
   proname => 'string_agg', prokind => 'a', proisstrict => 'f',
   prorettype => 'text', proargtypes => 'text text',
-  prosrc => 'aggregate_dummy' },
+  proargnames => '{value,delimiter}', prosrc => 'aggregate_dummy' },
 { oid => '3543', descr => 'aggregate transition function',
   proname => 'bytea_string_agg_transfn', proisstrict => 'f',
   prorettype => 'internal', proargtypes => 'internal bytea bytea',
@@ -5029,7 +5029,7 @@
 { oid => '3545', descr => 'concatenate aggregate input into a bytea',
   proname => 'string_agg', prokind => 'a', proisstrict => 'f',
   prorettype => 'bytea', proargtypes => 'bytea bytea',
-  prosrc => 'aggregate_dummy' },
+  proargnames => '{value,delimiter}', prosrc => 'aggregate_dummy' },
 
 # To ASCII conversion
 { oid => '1845', descr => 'encode text from DB encoding to ASCII text',
@@ -8978,7 +8978,7 @@
 { oid => '3197', descr => 'aggregate input into a json object',
   proname => 'json_object_agg', prokind => 'a', proisstrict => 'f',
   provolatile => 's', prorettype => 'json', proargtypes => 'any any',
-  prosrc => 'aggregate_dummy' },
+  proargnames => '{key,value}', prosrc => 'aggregate_dummy' },
 { oid => '8955', descr => 'aggregate non-NULL input into a json object',
   proname => 'json_object_agg_strict', prokind => 'a', proisstrict => 'f',
   provolatile => 's', prorettype => 'json', proargtypes => 'any any',
@@ -9906,7 +9906,7 @@
   prosrc => 'jsonb_object_agg_finalfn' },
 { oid => '3270', descr => 'aggregate inputs into jsonb object',
   proname => 'jsonb_object_agg', prokind => 'a', proisstrict => 'f',
-  prorettype => 'jsonb', proargtypes => 'any any',
+  prorettype => 'jsonb', proargtypes => 'any any', proargnames => '{key,value}',
   prosrc => 'aggregate_dummy' },
 { oid => '8963', descr => 'aggregate non-NULL inputs into jsonb object',
   proname => 'jsonb_object_agg_strict', prokind => 'a', proisstrict => 'f',
-- 
2.39.2



Re: Tab completion for AT TIME ZONE

2023-04-14 Thread Dagfinn Ilmari Mannsåker
Hi Jim,

Thanks for having a look at my patch, but please don't top post on
PostgreSQL lists.

Jim Jones  writes:

> Hi,
>
> On 12.04.23 19:53, Dagfinn Ilmari Mannsåker wrote:
>> Dagfinn Ilmari Mannsåker  writes:
>>
>>> Hi hackers,
>>>
>>> A while back we added support for completing time zone names after SET
>>> TIMEZONE, but we failed to do the same for the AT TIME ZONE operator.
>>> Here's a trivial patch for that.
>>
>
> Is this supposed to provide tab completion for the AT TIME ZONE operator
> like in this query?
>
> SELECT '2023-04-14 08:00:00' AT TIME ZONE 'Europe/Lisbon';
>
> The patch applied cleanly but I'm afraid I cannot reproduce the intended
> behaviour:
>
> postgres=# SELECT '2023-04-14 08:00:00' AT
>
> postgres=# SELECT '2023-04-14 08:00:00' AT T
>
> postgres=# SELECT '2023-04-14 08:00:00' AT TIME Z
>
> Perhaps I'm testing it in the wrong place?

It doesn't tab complete the AT TIME ZONE operator itself, just the
timezone name after it, so this should work:

# SELECT now() AT TIME ZONE 

or

# SELECT now() AT TIME ZONE am


However, looking more closely at the grammar, the word AT only occurs in
AT TIME ZONE, so we could complete the operator itself as well.  Updated
patch attached.

> Best, Jim

- ilmari

>From 365844db04d27c5bcd1edf8a9d0d44353bc34631 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Wed, 29 Mar 2023 11:16:01 +0100
Subject: [PATCH v2] psql: tab completion for AT TIME ZONE

Commit 7fa3db367986160dee2b2b0bbfb61e1a51d486fd added support for
completing time zone names, use that after the AT TIME ZONE operator.

Also complete the operator itself, since it's the only thing in the
grammar starting with AT.
---
 src/bin/psql/tab-complete.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5825b2a195..e3870c68e9 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -4657,6 +4657,14 @@ psql_completion(const char *text, int start, int end)
 	else if (TailMatches("JOIN"))
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_selectables);
 
+/* ... AT TIME ZONE ... */
+	else if (TailMatches("AT"))
+		COMPLETE_WITH("TIME ZONE");
+	else if (TailMatches("AT", "TIME"))
+		COMPLETE_WITH("ZONE");
+	else if (TailMatches("AT", "TIME", "ZONE"))
+		COMPLETE_WITH_TIMEZONE_NAME();
+
 /* Backslash commands */
 /* TODO:  \dc \dd \dl */
 	else if (TailMatchesCS("\\?"))
-- 
2.39.2



Re: doc: add missing "id" attributes to extension packaging page

2023-04-13 Thread Dagfinn Ilmari Mannsåker
Peter Eisentraut  writes:

> On 06.04.23 16:19, Brar Piening wrote:
>> 001-make_html_ids_discoverable_v5.postgresql.patch which needs to be
>> applied to the postgresql repository. It adds the XSLT to generate the
>> id links and the CSS to hide/display them. I've added comments as
>> suggested above.
>> 002-add-discoverable-id-style_v1.pgweb.patch which needs to be applied
>> to the pgweb repository. It adds the CSS to the official documentation site.
>
> The first patch has been committed.
>
> The second patch should be sent to pgsql-www for integrating into the
> web site.
>
> Side project: I noticed that these new hover links don't appear in the
> single-page HTML output (make postgres.html), even though the generated 
> HTML source code looks correct.  Maybe someone has an idea there.

Another side note: I notice the links don't appear on  elements
(e.g. https://www.postgresql.org/docs/devel/sql-select.html#SQL-WITH),
only .

- ilmari




Re: Adding argument names to aggregate functions

2023-04-12 Thread Dagfinn Ilmari Mannsåker
Dagfinn Ilmari Mannsåker  writes:

> Hi hackers,
>
> I'm sure I'm not the only one who can never remember which way around
> the value and delimiter arguments go for string_agg() and has to look it
> up in the manual every time. To make it more convenient, here's a patch
> that adds proargnames to its pg_proc entries so that it can be seen with
> a quick \df in psql.

Added to the 2023-07 commitfest:

https://commitfest.postgresql.org/43/4275/

- ilmari




Re: Tab completion for AT TIME ZONE

2023-04-12 Thread Dagfinn Ilmari Mannsåker
Dagfinn Ilmari Mannsåker  writes:

> Hi hackers,
>
> A while back we added support for completing time zone names after SET
> TIMEZONE, but we failed to do the same for the AT TIME ZONE operator.
> Here's a trivial patch for that.

Added to the 2023-07 commitfest:

https://commitfest.postgresql.org/43/4274/

- ilmari




Re: Direct I/O

2023-04-12 Thread Dagfinn Ilmari Mannsåker
Andrew Dunstan  writes:

> On 2023-04-12 We 10:23, Dagfinn Ilmari Mannsåker wrote:
>> Andrew Dunstan  writes:
>>
>>> On 2023-04-12 We 01:48, Thomas Munro wrote:
>>>> On Wed, Apr 12, 2023 at 3:04 PM Thomas Munro   
>>>> wrote:
>>>>> On Wed, Apr 12, 2023 at 2:56 PM Christoph Berg   wrote:
>>>>>> I'm hitting a panic in t_004_io_direct. The build is running on
>>>>>> overlayfs on tmpfs/ext4 (upper/lower) which is probably a weird
>>>>>> combination but has worked well for building everything over the last
>>>>>> decade. On Debian unstable:
>>>>>>
>>>>>> PANIC:  could not open file "pg_wal/00010001": Invalid 
>>>>>> argument
>>>>> ... I have a new idea:  perhaps it is possible to try
>>>>> to open a file with O_DIRECT from perl, and if it fails like that,
>>>>> skip the test.  Looking into that now.
>>>> I think I have that working OK.  Any Perl hackers want to comment on
>>>> my use of IO::File (copied from examples on the internet that showed
>>>> how to use O_DIRECT)?  I am not much of a perl hacker but according to
>>>> my package manager, IO/File.pm came with perl itself.  And the Fcntl
>>>> eval trick that I copied from File::stat, and the perl-critic
>>>> suppression that requires?
>>>
>>> I think you can probably replace a lot of the magic here by simply saying
>>>
>>>
>>> if (Fcntl->can("O_DIRECT")) ...
>> Fcntl->can() is true for all constants that Fcntl knows about, whether
or not they are defined for your OS. `defined &O_DIRECT` is the simplest
check; see my other reply to Thomas.
>>
>>
>
> My understanding was that Fcntl only exported constants known to the
> OS. That's certainly what its docco suggests, e.g.:
>
>     By default your system's F_* and O_* constants (eg, F_DUPFD and
>     O_CREAT) and the FD_CLOEXEC constant are exported into your namespace.

It's a bit more magical than that (this is Perl after all).  They are
all exported (which implicitly creates stubs visible to `->can()`,
similarly to forward declarations like `sub O_FOO;`), but only the
defined ones (`#ifdef O_FOO` is true) are defined (`defined &O_FOO` is
true).  The rest fall through to an AUTOLOAD¹ function that throws an
exception for undefined ones.

Here's an example (Fcntl knows O_RAW, but Linux does not define it):

$ perl -E '
use strict; use Fcntl;
say "can", main->can("O_RAW") ? "" : "not";
say defined &O_RAW ? "" : "not ", "defined";
say O_RAW;'
can
not defined
Your vendor has not defined Fcntl macro O_RAW, used at -e line 4

While O_DIRECT is defined:

$ perl -E '
use strict; use Fcntl;
say "can", main->can("O_DIRECT") ? "" : "not";
say defined &O_DIRECT ? "" : "not ", "defined";
say O_DIRECT;'
can
defined
16384

And O_FOO is unknown to Fcntl (the parens on `O_FOO()` are to make it
not a bareword, which would be a compile error under `use strict;` when
the sub doesn't exist at all):

$ perl -E '
use strict; use Fcntl;
say "can", main->can("O_FOO") ? "" : "not";
say defined &O_FOO ? "" : "not ", "defined";
say O_FOO();'
cannot
not defined
Undefined subroutine &main::O_FOO called at -e line 4.

> cheers
>
>
> andrew

- ilmari

[1] https://perldoc.perl.org/perlsub#Autoloading




Re: Direct I/O

2023-04-12 Thread Dagfinn Ilmari Mannsåker
Andrew Dunstan  writes:

> On 2023-04-12 We 01:48, Thomas Munro wrote:
>> On Wed, Apr 12, 2023 at 3:04 PM Thomas Munro  wrote:
>>> On Wed, Apr 12, 2023 at 2:56 PM Christoph Berg  wrote:
>>>> I'm hitting a panic in t_004_io_direct. The build is running on
>>>> overlayfs on tmpfs/ext4 (upper/lower) which is probably a weird
>>>> combination but has worked well for building everything over the last
>>>> decade. On Debian unstable:
>>>>
>>>> PANIC:  could not open file "pg_wal/00010001": Invalid
>>>> argument
>>> ... I have a new idea:  perhaps it is possible to try
>>> to open a file with O_DIRECT from perl, and if it fails like that,
>>> skip the test.  Looking into that now.
>> I think I have that working OK.  Any Perl hackers want to comment on
>> my use of IO::File (copied from examples on the internet that showed
>> how to use O_DIRECT)?  I am not much of a perl hacker but according to
>> my package manager, IO/File.pm came with perl itself.  And the Fcntl
>> eval trick that I copied from File::stat, and the perl-critic
>> suppression that requires?
>
>
> I think you can probably replace a lot of the magic here by simply saying
>
>
> if (Fcntl->can("O_DIRECT")) ...

Fcntl->can() is true for all constants that Fcntl knows about, whether
or not they are defined for your OS. `defined &O_DIRECT` is the simplest
check; see my other reply to Thomas.

> cheers
>
>
> andrew

- ilmari




Re: Direct I/O

2023-04-12 Thread Dagfinn Ilmari Mannsåker
Thomas Munro  writes:

> I think I have that working OK.  Any Perl hackers want to comment on
> my use of IO::File (copied from examples on the internet that showed
> how to use O_DIRECT)?  I am not much of a perl hacker but according to
> my package manager, IO/File.pm came with perl itself.

Indeed, and it has been since perl 5.003_07, released in 1996.  And Fcntl
has known about O_DIRECT since perl 5.6.0, released in 2000, so we can
safely use both.

> And the Fcntl eval trick that I copied from File::stat, and the
> perl-critic suppression that requires?
[…]
> + no strict 'refs';## no critic (ProhibitNoStrict)
> + my $val = eval { &{'Fcntl::O_DIRECT'} };
> + if (defined $val)

This trick is only needed in File::stat because it's constructing the
symbol name dynamically.  And because Fcntl by default exports all the
O_* and F_* constants it knows about, we can simply do:

if (defined &O_DIRECT)
> + {
> + use Fcntl qw(O_DIRECT);

The `use Fcntl;` above will already have imported this, so this is
redundant.

- ilmari




Tab completion for AT TIME ZONE

2023-03-29 Thread Dagfinn Ilmari Mannsåker
Hi hackers,

A while back we added support for completing time zone names after SET
TIMEZONE, but we failed to do the same for the AT TIME ZONE operator.
Here's a trivial patch for that.

- ilmari

>From 57138e851552a99174d00a0e48ce55ca3170df75 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Wed, 29 Mar 2023 11:16:01 +0100
Subject: [PATCH] psql: tab-complete time zones after AT TIME ZONE

Commit 7fa3db367986160dee2b2b0bbfb61e1a51d486fd added support for
completing time zone names, use that after the AT TIME ZONE operator.
---
 src/bin/psql/tab-complete.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index e38a49e8bd..9bb8ae0dc3 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -4655,6 +4655,10 @@ psql_completion(const char *text, int start, int end)
 	else if (TailMatches("JOIN"))
 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_selectables);
 
+/* ... AT TIME ZONE ... */
+	else if (TailMatches("AT", "TIME", "ZONE"))
+		COMPLETE_WITH_TIMEZONE_NAME();
+
 /* Backslash commands */
 /* TODO:  \dc \dd \dl */
 	else if (TailMatchesCS("\\?"))
-- 
2.39.2



Re: MacOS: xsltproc fails with "warning: failed to load external entity"

2023-03-27 Thread Dagfinn Ilmari Mannsåker
Daniel Gustafsson  writes:

>> On 27 Mar 2023, at 10:41, Julien Rouhaud  wrote:
>> On Mon, Mar 27, 2023 at 10:32:52AM +0200, Daniel Gustafsson wrote:
>>> On 27 Mar 2023, at 10:24, Julien Rouhaud  wrote:
>>>
>>>> FTR the documented XML_CATALOG_FILES environment variable is only valid for
>>>> Intel based machines, as homebrew installs everything in a different
>>>> location for M1...
>>>>
>>>> I'm attaching a patch to make that distinction, hoping that no one else
>>>> will have to waste time trying to figure out how to get it working on
>>>> such hardware.
>>> 
>>> LGTM apart from the double // in the export which is easy enough to fix 
>>> before
>>> pushing.
>>> 
>>> +export XML_CATALOG_FILES=/opt/homebrew//etc/xml/catalog
>> 
>> Oh, I didn't notice it.  Apparently apple's find isn't smart enough to trim 
>> a /
>> when fed with a directory with a trailing /
>
> Applied with a tiny bit of changes to make it look like the rest of the
> paragraph more. Thanks!

Doesn't this apply to Apple Silicon generally, not just M1? M2 already
exists, and M3 etc. will presumably also appear at some point. The
linked Homebrew issue refers to Apple Silicon, not any specific models.
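
For anyone hitting this on either architecture, a sketch (assuming
Homebrew's standard prefixes) that avoids hard-coding the prefix at all:

```shell
# Ask Homebrew for its prefix (/opt/homebrew on Apple Silicon,
# /usr/local on Intel); fall back to /usr/local if brew is missing.
XML_CATALOG_FILES="$(brew --prefix 2>/dev/null || echo /usr/local)/etc/xml/catalog"
export XML_CATALOG_FILES
echo "$XML_CATALOG_FILES"
```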

- ilmari




Re: Making background psql nicer to use in tap tests

2023-03-17 Thread Dagfinn Ilmari Mannsåker
Andrew Dunstan  writes:

> On 2023-03-17 Fr 10:08, Daniel Gustafsson wrote:
>>> Why is $restart_before_query a package/class level value instead of
>>> an instance value? And why can we only ever set it to 1 but not back
>>> again? Maybe we don't want to, but it looks odd.
>> It was mostly a POC to show what I meant with the functionality.  I think 
>> there
>> should be a way to turn it off (set it to zero) even though I doubt it will 
>> be
>> used much.
>
>
> A common idiom is to have a composite getter/setter method for object
> properties something like this
>
>
>sub settingname
>{
>   my ($self, $arg) = @_;
>   $self->{settingname} = $arg if defined $arg;
>   return $self->{settingname};
>}

Or, if undef is a valid value:


sub settingname
{
my $self = shift;
$self->{settingname} = shift if @_;
return $self->{settingname};
}

- ilmari




Adding argument names to aggregate functions

2023-02-27 Thread Dagfinn Ilmari Mannsåker
Hi hackers,

I'm sure I'm not the only one who can never remember which way around
the value and delimiter arguments go for string_agg() and has to look it
up in the manual every time. To make it more convenient, here's a patch
that adds proargnames to its pg_proc entries so that it can be seen with
a quick \df in psql.

I also added names to json(b)_object_agg() for good measure, even though
they're more obvious. The remaining built-in multi-argument aggregate
functions are the stats-related ones, where it's all just Y/X (but why
in that order?), so I didn't think it was necessary. If others feel more
strongly, I can add those too.

- ilmari

>From 73f323d5e97dca2e2452f5be199864a8358559c5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Mon, 27 Feb 2023 13:06:29 +
Subject: [PATCH] Add argument names to multi-argument aggregates

This makes it easier to see which way around the arguments go when
using \dfa.  This is particularly relevant for string_agg(), but add
it to json(b)_object_agg() too for good measure.
---
 src/include/catalog/pg_proc.dat | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index e2a7642a2b..f96d29278f 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -4988,6 +4988,7 @@
 { oid => '3538', descr => 'concatenate aggregate input into a string',
   proname => 'string_agg', prokind => 'a', proisstrict => 'f',
   prorettype => 'text', proargtypes => 'text text',
+  proargnames => '{value,delimiter}',
   prosrc => 'aggregate_dummy' },
 { oid => '3543', descr => 'aggregate transition function',
   proname => 'bytea_string_agg_transfn', proisstrict => 'f',
@@ -5000,6 +5001,7 @@
 { oid => '3545', descr => 'concatenate aggregate input into a bytea',
   proname => 'string_agg', prokind => 'a', proisstrict => 'f',
   prorettype => 'bytea', proargtypes => 'bytea bytea',
+  proargnames => '{value,delimiter}',
   prosrc => 'aggregate_dummy' },
 
 # To ASCII conversion
@@ -8899,6 +8901,7 @@
 { oid => '3197', descr => 'aggregate input into a json object',
   proname => 'json_object_agg', prokind => 'a', proisstrict => 'f',
   provolatile => 's', prorettype => 'json', proargtypes => 'any any',
+  proargnames => '{key,value}',
   prosrc => 'aggregate_dummy' },
 { oid => '3198', descr => 'build a json array from any inputs',
   proname => 'json_build_array', provariadic => 'any', proisstrict => 'f',
@@ -9791,6 +9794,7 @@
 { oid => '3270', descr => 'aggregate inputs into jsonb object',
   proname => 'jsonb_object_agg', prokind => 'a', proisstrict => 'f',
   prorettype => 'jsonb', proargtypes => 'any any',
+  proargnames => '{key,value}',
   prosrc => 'aggregate_dummy' },
 { oid => '3271', descr => 'build a jsonb array from any inputs',
   proname => 'jsonb_build_array', provariadic => 'any', proisstrict => 'f',
-- 
2.39.1



  1   2   3   4   >