Re: [PATCH] Accept IP addresses in server certificate SANs

2022-03-22 Thread Kyotaro Horiguchi
At Tue, 22 Mar 2022 20:42:37 +, Jacob Champion wrote:
> Thanks, looks like I had some old header dependencies left over from
> several versions ago. Fixed in v9.

Thanks!  Looks perfect.

> v9 contains the bare minimum but I don't think it's quite enough. How
> much of the behavior (and edge cases) do you think we should detail
> here? All of it?

I tried to write out the doc part.  What do you think about it?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 3998b1781b..13e3e63768 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -8342,16 +8342,31 @@ ldap://ldap.acme.com/cn=dbserver,cn=hosts?pgconnectinfo?base?(objectclass=*)
 
   
In verify-full mode, the host name is matched against the
-   certificate's Subject Alternative Name attribute(s), or against the
-   Common Name attribute if no Subject Alternative Name of type dNSName is
+   certificate's Subject Alternative Name attribute(s) (SAN), or against the
+   Common Name attribute if no SAN of type dNSName is
present.  If the certificate's name attribute starts with an asterisk
(*), the asterisk will be treated as
a wildcard, which will match all characters except a dot
(.). This means the certificate will not match subdomains.
If the connection is made using an IP address instead of a host name, the
-   IP address will be matched (without doing any DNS lookups).
+   IP address will be matched (without doing any DNS lookups) against SANs of
+   type iPAddress or dNSName.  If no
+   iPAddress SAN is present and no
+   matching dNSName SAN is present, the host IP address is
+   matched against the Common Name attribute.
   

+  
+
+  For backward compatibility with earlier versions of PostgreSQL, the host
+  IP address is verified in a manner different from
+  RFC 6125 (https://tools.ietf.org/html/rfc6125).
+  The host IP address is always matched against dNSName
+  SANs as well as iPAddress SANs, and can be matched
+  against the Common Name attribute under a certain condition.
+   
+  
+
   
To allow server certificate verification, one or more root certificates
must be placed in the file ~/.postgresql/root.crt
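
As an aside, a minimal client-side sketch of the behavior being
documented, connecting by IP address with full certificate
verification (the IP address, database name, and certificate path
here are hypothetical):

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
	/*
	 * With sslmode=verify-full and a host given as an IP address, libpq
	 * matches that IP against iPAddress (and dNSName) SANs in the server
	 * certificate, falling back to the Common Name as described above.
	 */
	PGconn	   *conn = PQconnectdb("host=192.0.2.10 dbname=postgres "
								   "sslmode=verify-full "
								   "sslrootcert=/home/user/.postgresql/root.crt");

	if (PQstatus(conn) != CONNECTION_OK)
		fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
	PQfinish(conn);
	return 0;
}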


Re: Window Function "Run Conditions"

2022-03-22 Thread David G. Johnston
On Tue, Mar 22, 2022 at 3:39 PM David Rowley  wrote:

> On Thu, 17 Mar 2022 at 17:04, Corey Huinker 
> wrote:
> > It seems like this effort would aid in implementing what some other
> databases implement via the QUALIFY clause, which is to window functions
> what HAVING is to aggregate functions.
> > example:
> https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#qualify_clause
>
> Isn't that just syntactic sugar?  You could get the same by adding a
> subquery with a WHERE clause that filters rows evaluated after the
> window clause.
>
>
I'd like some of that syntactic sugar please.  It goes nicely with my
HAVING syntactic coffee.
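
For concreteness, a sketch of the rewrite David describes, using a
hypothetical table t with a timestamp column ts:

-- BigQuery-style QUALIFY (not PostgreSQL syntax):
--   SELECT *, row_number() OVER (ORDER BY ts) AS rn
--   FROM t
--   QUALIFY rn <= 10;

-- The equivalent in PostgreSQL today: filter in an outer query, after
-- the window function has been evaluated.
SELECT *
FROM (SELECT *, row_number() OVER (ORDER BY ts) AS rn
      FROM t) AS q
WHERE rn <= 10;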

David J.


Re: make MaxBackends available in _PG_init

2022-03-22 Thread Julien Rouhaud
Hi,

Sorry for showing up this late, but I'm a bit confused by the new situation.

Unless I'm missing something, the new situation is that the system is supposed
to prevent access to MaxBackends during s_p_l_pg_init, for reasons I totally
agree with, but without doing anything for extensions that actually need to
access it at that time.  So what are extensions supposed to do now if they do
need the information during their _PG_init() / RequestAddinShmemSpace()?

Note that I have two such extensions.  They actually only need it to give
access to the queryid in the background workers since it's otherwise not
available, but there's at least pg_wait_sampling extension that needs the value
to request shmem for other needs.

One funny note is that my extensions aren't using MaxBackends but instead
compute it based on MaxConnections, autovacuum_max_workers and so on.  So those
two extensions aren't currently broken, or more accurately not more than they
previously were.  Is there any reason not to protect all the variables that
contribute to MaxBackends?
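
For reference, the computation such extensions do mirrors what
InitializeMaxBackends() does internally.  A minimal sketch, assuming
the usual GUC globals and valid only for as long as the core formula
does not change:

#include "postgres.h"
#include "miscadmin.h"
#include "postmaster/autovacuum.h"
#include "replication/walsender.h"

/*
 * Sketch: recompute what MaxBackends will be, from the GUCs that
 * contribute to it, for use during _PG_init() when MaxBackends itself
 * is not yet accessible.  Mirrors InitializeMaxBackends().
 */
static int
GuessMaxBackends(void)
{
	return MaxConnections + autovacuum_max_workers + 1 +
		max_worker_processes + max_wal_senders;
}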




Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

2022-03-22 Thread Bharath Rupireddy
On Tue, Mar 22, 2022 at 8:12 PM Andres Freund  wrote:
> > Do you mean like this?
> > ereport(LOG,
> > /* translator: the placeholders show checkpoint options */
> > (errmsg("%s starting:%s%s%s%s%s%s%s%s",
> > restartpoint ? _("restartpoint") : _("checkpoint"),
> > (flags & CHECKPOINT_IS_SHUTDOWN) ? " shutdown" : "",
> > (flags & CHECKPOINT_END_OF_RECOVERY) ? " end-of-recovery" : "",
> > (flags & CHECKPOINT_IMMEDIATE) ? " immediate" : "",
> > (flags & CHECKPOINT_FORCE) ? " force" : "",
> > (flags & CHECKPOINT_WAIT) ? " wait" : "",
> > (flags & CHECKPOINT_CAUSE_XLOG) ? " wal" : "",
> > (flags & CHECKPOINT_CAUSE_TIME) ? " time" : "",
> > (flags & CHECKPOINT_FLUSH_ALL) ? " flush-all" : "")));
>
> Yes.

Done that way, see
v7-0001-Deduplicate-checkpoint-restartpoint-starting-comp.patch.

> > I think the reason in this case might be that some flag names with hyphens
> > and spaces before words may not have the right/matching words in all
> > languages. What happens if we choose to translate/not translate the entire
> > message?
>
> If individual words aren't translated the "original" word would be used.

Interestingly, the translated message for "checkpoint/restartpoint
complete" is empty. Maybe that is because it contains untranslatable strings?

#: access/transam/xlog.c:8752
#, c-format
msgid "restartpoint complete: wrote %d buffers (%.1f%%); %d WAL
file(s) added, %d removed, %d recycled; write=%ld.%03d s,
sync=%ld.%03d s, total=%ld.%03d s; sync files=%d, longest=%ld.%03d s,
average=%ld.%03d s; distance=%d kB, estimate=%d kB"
msgstr ""

> > > Both seem still very long. I still am doubtful this level of detail is
> > > appropriate. Seems more like a thing for a tracepoint or such. How about 
> > > just
> > > printing the time for the logical decoding operations in aggregate, 
> > > without
> > > breaking down into files, adding LSNs etc?
> >
> > The distinction that the patch makes right now is for snapshot and
> > rewrite mapping files and it makes sense to have them separately.
>
> -1. The line also needs to be readable...

IMHO, that's subjective. Even now, the existing
"checkpoint/restartpoint complete" message has a good amount of info
which makes it unreadable for some.

The number of logical decoding files (snapshot and mapping) the
checkpoint processed is a good metric to have in server logs, along
with the time it took to remove/sync them (see the example output at [1]). Thoughts?

[1]
2022-03-23 04:13:06.050 UTC [1322043] LOG:  checkpoint complete: wrote
506 buffers (3.1%); 0 WAL file(s) added, 0 removed, 0 recycled;
write=0.010 s, sync=0.193 s, total=0.287 s; sync files=8,
longest=0.189 s, average=0.025 s; distance=12866 kB, estimate=12866
kB; logical snapshot file(s) removed=3, time=0.001 s
2022-03-23 04:14:33.937 UTC [1322043] LOG:  checkpoint complete: wrote
37 buffers (0.2%); 0 WAL file(s) added, 0 removed, 1 recycled;
write=0.004 s, sync=0.012 s, total=0.060 s; sync files=21,
longest=0.007 s, average=0.001 s; distance=1857 kB, estimate=11765 kB;
logical snapshot file(s) removed=3, time=0.001 s; logical rewrite
mapping file(s) removed=0, synced=28, time=0.001 s
2022-03-23 04:15:04.306 UTC [1322043] LOG:  checkpoint complete: wrote
32 buffers (0.2%); 0 WAL file(s) added, 0 removed, 0 recycled;
write=0.004 s, sync=0.008 s, total=0.078 s; sync files=11,
longest=0.006 s, average=0.001 s; distance=109 kB, estimate=10600 kB;
logical snapshot file(s) removed=4, time=0.001 s; logical rewrite
mapping file(s) removed=28, synced=28, time=0.001 s

Regards,
Bharath Rupireddy.


v7-0001-Deduplicate-checkpoint-restartpoint-starting-comp.patch
Description: Binary data


v7-0001-Add-checkpoint-stats-of-snapshot-and-mapping-file.patch
Description: Binary data


Re: Optimize external TOAST storage

2022-03-22 Thread Michael Paquier
On Tue, Mar 22, 2022 at 02:42:53PM -0700, Nathan Bossart wrote:
> On Tue, Mar 22, 2022 at 04:34:05PM -0400, Robert Haas wrote:
>> Then, too, I'm not very confident about the usefulness of EXTENDED,
>> EXTERNAL, and MAIN. I think it's useful to be able to categorically
>> disable compression (as EXTERNAL does), but you can't categorically
>> disable out-of-line storage because the value can be bigger than the
>> page, so MAIN vs. EXTENDED is just changing the threshold for the use
>> of out-of-line storage. However, it does so in a way that IMHO is not
>> particularly intuitive, which goes back to my earlier point about the
>> algorithm seeming wacky, and it's in any case unclear why we should
>> offer exactly two possible thresholds and not any of the intermediate
>> values.
> 
> I agree with all of this.  Adding configurability for each constant might
> help folks in the short term, but using these knobs probably requires quite
> a bit of expertise in Postgres internals as well as a good understanding of
> your data.  I think we ought to revisit TOAST configurability from a user
> perspective.  IOW what can be chosen automatically, and how do we enable
> users to effectively configure the settings that cannot be chosen
> automatically?  IMO this is a worthwhile conversation to have as long as it
> doesn't stray too far into the "let's rewrite TOAST" realm.  I think there
> is always going to be some level of complexity with stuff like TOAST, but
> there are probably all sorts of ways to simplify/improve it also.

I agree with this feeling.  TOAST already has too many configuration
parameters, each with its own way of behaving slightly differently.
If we could reduce this number rather than increase it, so much the better.
I would also be curious to see how relevant those parameters remain
once more compression options are possible for TOAST values.
--
Michael


signature.asc
Description: PGP signature


Re: Adding CI to our tree

2022-03-22 Thread Justin Pryzby
On Fri, Mar 18, 2022 at 03:45:03PM -0700, Andres Freund wrote:
> Pushed 0001, 0002. Only change I made was to add

Thanks - is there any reason not to do the MSVC compiler warnings one, too?

I see that it'll warn about issues with at least 3 patches (including one of
yours).

-- 
Justin




Re: Allow file inclusion in pg_hba and pg_ident files

2022-03-22 Thread Michael Paquier
On Wed, Mar 23, 2022 at 10:16:34AM +0800, Julien Rouhaud wrote:
> Yeah, I thought about it but didn't rename it given your concerns about git
> history.  I'm fine either way.

Oh, OK.  The diff created by 0001 is still a reasonable size even with
the struct rename, so let's stick with it.
--
Michael


signature.asc
Description: PGP signature


Re: freeing bms explicitly

2022-03-22 Thread Zhihong Yu
On Tue, Mar 22, 2022 at 8:45 PM Amit Kapila  wrote:

> On Tue, Mar 22, 2022 at 3:39 AM Zhihong Yu  wrote:
> >
> > On Mon, Mar 21, 2022 at 3:05 PM Tom Lane  wrote:
> >>
> >> Zhihong Yu  writes:
> >> >> I was looking at calls to bms_free() in PG code.
> >> >> e.g. src/backend/commands/publicationcmds.c line 362
> >> >>  bms_free(bms);
> >> >> The above is just an example, there're other calls to bms_free().
> >> >> Since the bms is allocated from some execution context, I wonder why
> this
> >> >> call is needed.
> >> >>
> >> >> When the underlying execution context wraps up, isn't the bms freed ?
> >>
> >> Yeah, that's kind of pointless --- and the pfree(rfnode) after it is
> even
> >> more pointless, since it'll free only the top node of that expression
> >> tree.  Not to mention the string returned by TextDatumGetCString, and
> >> whatever might be leaked during the underlying catalog accesses.
> >>
> >> If we were actually worried about transient space consumption of this
> >> function, it'd be necessary to do a lot more than this.  It doesn't
> >> look to me like it's worth worrying about though -- it doesn't seem
> >> like it could be hit more than once per query in normal cases.
> >>
> >> regards, tom lane
> >
> >
> > Thanks Tom for replying.
> >
> > What do you think of the following patch ?
> >
>
> Your patch looks good to me. I have found one more similar instance in
> the same file and changed that as well accordingly. Let me know what
> you think of the attached?
>
> --
> With Regards,
> Amit Kapila.
>

Hi, Amit:
The patch looks good to me.

Cheers


Re: Temporary tables versus wraparound... again

2022-03-22 Thread Greg Stark
Here's a rebased patch. I split the test into a separate patch that I
would lean toward dropping. But at least it applies now.

I did look into pg_stat_get_backend_pid() and I guess it would work,
but going through the stats mechanism seems like going the long way
around: we're already looking at the backendId info directly here, we
just weren't grabbing the pid.

I did make a small change: I renamed the checkTempNamespaceStatus()
function to checkTempNamespaceStatusAndPid(). It seems unlikely there
are any external consumers of this function (the only internal
consumer is autovacuum.c), but just in case, the rename protects any
external modules from silently breaking due to the added parameter.
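
For readers following along, a sketch of what the renamed signature
presumably looks like; the exact out parameter is my guess, not taken
from the patch:

/*
 * As before, report whether the given namespace is a temporary-table
 * namespace and whether it is in use, but additionally return the PID
 * of the owning backend through *pid (hypothetical out parameter).
 */
extern TempNamespaceStatus checkTempNamespaceStatusAndPid(Oid namespaceId,
														  int *pid);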
From eb6ec2edfcb10aafc3874262276638932a97add7 Mon Sep 17 00:00:00 2001
From: Greg Stark 
Date: Tue, 22 Mar 2022 15:54:59 -0400
Subject: [PATCH v3 2/2] Add test for truncating temp tables advancing
 relfrozenxid

This test depends on other transactions not running at the same time
so that relfrozenxid can advance so it has to be moved to its own
schedule.
---
 src/test/regress/expected/temp.out | 21 +
 src/test/regress/parallel_schedule | 10 +++---
 src/test/regress/sql/temp.sql  | 19 +++
 3 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/src/test/regress/expected/temp.out b/src/test/regress/expected/temp.out
index a5b3ed34a3..1fee5521af 100644
--- a/src/test/regress/expected/temp.out
+++ b/src/test/regress/expected/temp.out
@@ -82,6 +82,27 @@ SELECT * FROM temptest;
 -
 (0 rows)
 
+DROP TABLE temptest;
+-- Test that ON COMMIT DELETE ROWS resets the relfrozenxid when the
+-- table is truncated. This requires this test not be run in parallel
+-- with other tests as concurrent transactions will hold back the
+-- globalxmin
+CREATE TEMP TABLE temptest(col int) ON COMMIT DELETE ROWS;
+SELECT relpages, reltuples, relallvisible, relfrozenxid FROM pg_class where oid = 'temptest'::regclass \gset old_
+BEGIN;
+INSERT INTO temptest (select generate_series(1,1000));
+ANALYZE temptest; -- update relpages, reltuples
+COMMIT;
+SELECT relpages, reltuples, relallvisible, relfrozenxid FROM pg_class where oid = 'temptest'::regclass \gset new_
+SELECT :old_relpages  = :new_relpages  AS pages_reset,
+   :old_reltuples = :new_reltuples AS tuples_reset,
+   :old_relallvisible = :new_relallvisible AS allvisible_reset,
+   :old_relfrozenxid <> :new_relfrozenxid  AS frozenxid_advanced;
+ pages_reset | tuples_reset | allvisible_reset | frozenxid_advanced 
+-+--+--+
+ t   | t| t| t
+(1 row)
+
 DROP TABLE temptest;
 -- Test ON COMMIT DROP
 BEGIN;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 6d8f524ae9..f919c2f978 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -116,10 +116,14 @@ test: json jsonb json_encoding jsonpath jsonpath_encoding jsonb_jsonpath
 # --
 # Another group of parallel tests
 # with depends on create_misc
-# NB: temp.sql does a reconnect which transiently uses 2 connections,
-# so keep this parallel group to at most 19 tests
 # --
-test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
+test: plancache limit plpgsql copy2 domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
+
+# --
+# Run this alone because it transiently uses 2 connections and also
+# tests relfrozenxid advances when truncating temp tables
+# --
+test: temp
 
 # --
 # Another group of parallel tests
diff --git a/src/test/regress/sql/temp.sql b/src/test/regress/sql/temp.sql
index 424d12b283..5f0c39b5e7 100644
--- a/src/test/regress/sql/temp.sql
+++ b/src/test/regress/sql/temp.sql
@@ -79,6 +79,25 @@ SELECT * FROM temptest;
 
 DROP TABLE temptest;
 
+-- Test that ON COMMIT DELETE ROWS resets the relfrozenxid when the
+-- table is truncated. This requires this test not be run in parallel
+-- with other tests as concurrent transactions will hold back the
+-- globalxmin
+CREATE TEMP TABLE temptest(col int) ON COMMIT DELETE ROWS;
+
+SELECT relpages, reltuples, relallvisible, relfrozenxid FROM pg_class where oid = 'temptest'::regclass \gset old_
+BEGIN;
+INSERT INTO temptest (select generate_series(1,1000));
+ANALYZE temptest; -- update relpages, reltuples
+COMMIT;
+SELECT relpages, reltuples, relallvisible, relfrozenxid FROM pg_class where oid = 'temptest'::regclass \gset new_
+SELECT :old_relpages  = :new_relpages  AS pages_reset,
+   :old_reltuples = :new_reltuples AS tuples_reset,
+   :old_relallvisible = :new_relallvisible AS allvisible_reset,
+   :old_relfrozenxid <> :new_relfrozenxid  AS frozenxid_advanced;
+
+DROP TABLE temptest;
+
 -- Test ON COMMIT 

Re: freeing bms explicitly

2022-03-22 Thread Amit Kapila
On Tue, Mar 22, 2022 at 3:39 AM Zhihong Yu  wrote:
>
> On Mon, Mar 21, 2022 at 3:05 PM Tom Lane  wrote:
>>
>> Zhihong Yu  writes:
>> >> I was looking at calls to bms_free() in PG code.
>> >> e.g. src/backend/commands/publicationcmds.c line 362
>> >>  bms_free(bms);
>> >> The above is just an example, there're other calls to bms_free().
>> >> Since the bms is allocated from some execution context, I wonder why this
>> >> call is needed.
>> >>
>> >> When the underlying execution context wraps up, isn't the bms freed ?
>>
>> Yeah, that's kind of pointless --- and the pfree(rfnode) after it is even
>> more pointless, since it'll free only the top node of that expression
>> tree.  Not to mention the string returned by TextDatumGetCString, and
>> whatever might be leaked during the underlying catalog accesses.
>>
>> If we were actually worried about transient space consumption of this
>> function, it'd be necessary to do a lot more than this.  It doesn't
>> look to me like it's worth worrying about though -- it doesn't seem
>> like it could be hit more than once per query in normal cases.
>>
>> regards, tom lane
>
>
> Thanks Tom for replying.
>
> What do you think of the following patch ?
>

Your patch looks good to me. I have found one more similar instance in
the same file and changed that as well accordingly. Let me know what
you think of the attached?
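
For context, the reason such frees are unnecessary: a Bitmapset is
palloc'd in the current memory context, and everything in that context
is released wholesale when the context is reset or deleted.  A minimal
standalone sketch:

#include "postgres.h"
#include "nodes/bitmapset.h"
#include "utils/memutils.h"

static void
bitmapset_lifetime_demo(void)
{
	MemoryContext cxt = AllocSetContextCreate(CurrentMemoryContext,
											  "demo context",
											  ALLOCSET_DEFAULT_SIZES);
	MemoryContext oldcxt = MemoryContextSwitchTo(cxt);
	Bitmapset  *bms = bms_add_member(NULL, 42);	/* palloc'd in cxt */

	MemoryContextSwitchTo(oldcxt);

	/* No bms_free(bms) needed: this releases it and everything else. */
	MemoryContextDelete(cxt);
}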

-- 
With Regards,
Amit Kapila.


v2-0001-Remove-some-useless-free-calls.patch
Description: Binary data


Re: [PATCH] Add native windows on arm64 support

2022-03-22 Thread Thomas Munro
On Tue, Mar 22, 2022 at 11:30 PM Julien Rouhaud  wrote:
> On Tue, Mar 22, 2022 at 09:37:46AM +, Niyas Sait wrote:
> > Yes, we could look into providing a build machine. Do you have any
> > reference to what the CI system looks like now for PostgresSQL and how to
> > add new workers etc.?
>
> It's all documented at
> https://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto.

It seems likely there will be more and more Windows/ARM users, so yeah
having a machine to test that combination would be great.  I wonder if
ASLR isn't breaking for you by good luck only...




Re: Window Function "Run Conditions"

2022-03-22 Thread David Rowley
On Wed, 23 Mar 2022 at 11:24, David Rowley  wrote:
> I think it's safer to just disable the optimisation when there are
> multiple window clauses.  Multiple matching clauses are merged
> already, so it's perfectly valid to have multiple window functions,
> it's just they must share the same window clause.  I don't think
> that's terrible as with the major use case that I have in mind for
> this, the window function is only added to limit the number of rows.
> In most cases I can imagine, there'd be no reason to have an
> additional window function with different frame options.

I've not looked into the feasibility of it, but I had a thought that
we could just accumulate all the run-conditions in a new field in the
PlannerInfo then just tag them onto the top-level WindowAgg when
building the plan.

I'm just not sure it would be any more useful than what the v3 patch
is currently doing, as intermediate WindowAggs would still need to
process all rows.  I think it would only save the window function
evaluation of the top-level WindowAgg for rows that don't match the
run-condition.  All the supported window functions are quite cheap, so
it's not a huge saving. I'd bet there would be example cases where it
would be measurable though.
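
To make the intended use case concrete, here is a sketch of the kind
of query the run condition targets, with a hypothetical table t.  Once
rn exceeds 10, the top-level WindowAgg can stop pulling rows instead
of scanning all of t:

-- With a monotonically increasing window function such as
-- row_number(), the planner can attach a run condition (rn <= 10)
-- to the WindowAgg, letting execution stop early.
SELECT *
FROM (SELECT *, row_number() OVER (ORDER BY id) AS rn
      FROM t) AS q
WHERE rn <= 10;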

David




Re: Window Function "Run Conditions"

2022-03-22 Thread David Rowley
 On Wed, 23 Mar 2022 at 12:50, Zhihong Yu  wrote:
> The following code seems to be common between if / else blocks (w.r.t. 
> wfunc_left) of find_window_run_conditions().

> It would be nice if this code can be shared.

I remember thinking about that and deciding that I didn't want to
overcomplicate the if conditions for the strategy tests. I'd thought
these would have become:

if ((wfunc_left && (strategy == BTLessStrategyNumber ||
strategy == BTLessEqualStrategyNumber)) ||
 (!wfunc_left && (strategy == BTGreaterStrategyNumber ||
  strategy == BTGreaterEqualStrategyNumber)))

which I didn't think was very readable. That caused me to keep it separate.

On reflection, we can just leave the strategy checks as they are, then
add the additional code for checking wfunc_left when checking the
res->monotonic, i.e:

if ((wfunc_left && (res->monotonic & MONOTONICFUNC_INCREASING)) ||
   (!wfunc_left && (res->monotonic & MONOTONICFUNC_DECREASING)))

I think that's more readable than doubling up the strategy checks, so
I've done it that way in the attached.

>
> +   WindowClause *wclause = (WindowClause *)
> +   list_nth(subquery->windowClause,
> +            wfunc->winref - 1);
>
> The code would be more readable if list_nth() is indented.

That's just the way pgindent put it.

> +   /* Check the left side of the OpExpr */
>
> It seems the code for checking left / right is the same. It would be better 
> to extract and reuse the code.

I've moved some of that code into find_window_run_conditions() which
removes about 10 lines of code.

Updated patch attached. Thanks for looking.

David
From db393b01ce6dd48f3a49d367a28d7804510bd1f6 Mon Sep 17 00:00:00 2001
From: David Rowley 
Date: Sat, 8 May 2021 23:43:25 +1200
Subject: [PATCH v3] Teach planner and executor about monotonic window funcs

Window functions such as row_number() always return a value higher than
the previously returned one in any given partition.  If a query were to
only request the first few row numbers, then traditionally we would
continue evaluating the WindowAgg node until all tuples are exhausted.
However, if someone, say, only wanted the row numbers <= 10, we could
just stop once we get number 11.

Here we implement means to do just that.  This is done by way of adding a
prosupport function to several of the built-in window functions and adding
supporting code to allow the support function to inform the planner if
the function is monotonically increasing, monotonically decreasing, both
or neither.  The planner is then able to make use of that information and
possibly allow the executor to short-circuit execution by way of adding a
"run condition" to the WindowAgg to instruct it to run while this
condition is true and stop when it becomes false.

This optimization is only possible when the subquery contains only a
single window clause.  The problem with having multiple clauses is that at
the time when we're looking for run conditions, we don't yet know the
order that the window clauses will be evaluated.  We cannot add a run
condition to a window clause that is evaluated first as we may stop
execution and not output rows that are required for a window clause which
will be evaluated later.

Here we add prosupport functions to allow this to work for: row_number(),
rank(), dense_rank(), count(*) and count(expr).

Author: David Rowley
Reviewed-by: Andy Fan, Zhihong Yu
Discussion: 
https://postgr.es/m/caaphdvqvp3at8++yf8ij06sdcoo1s_b2yoat9d4nf+mobzs...@mail.gmail.com
---
 src/backend/commands/explain.c  |   4 +
 src/backend/executor/nodeWindowAgg.c|  28 +-
 src/backend/nodes/copyfuncs.c   |   4 +
 src/backend/nodes/equalfuncs.c  |   2 +
 src/backend/nodes/outfuncs.c|   4 +
 src/backend/nodes/readfuncs.c   |   4 +
 src/backend/optimizer/path/allpaths.c   | 327 +++-
 src/backend/optimizer/plan/createplan.c |   7 +-
 src/backend/optimizer/plan/setrefs.c|   8 +
 src/backend/utils/adt/int8.c|  45 
 src/backend/utils/adt/windowfuncs.c |  63 +
 src/include/catalog/pg_proc.dat |  35 ++-
 src/include/nodes/execnodes.h   |   4 +
 src/include/nodes/nodes.h   |   3 +-
 src/include/nodes/parsenodes.h  |   2 +
 src/include/nodes/plannodes.h   |  20 ++
 src/include/nodes/supportnodes.h|  61 -
 src/test/regress/expected/window.out| 315 +++
 src/test/regress/sql/window.sql | 165 
 19 files changed, 1083 insertions(+), 18 deletions(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 9f632285b6..92ca4acf50 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -1985,6 +1985,10 @@ ExplainNode(PlanState *planstate, List *ancestors,
show_instrumentation_count("Rows Removed by 
Filter", 1,

Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

2022-03-22 Thread Kyotaro Horiguchi
At Wed, 23 Mar 2022 11:51:25 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> The two places emit different outputs but the only difference is the
> delimiter between two blockrefs. (By the way, the current code forgets
> to insert a delimiter there).  So even if the function took "bool
> is_waldump", it is used only when appending a line delimiter.  It
> would be nicer if the "bool is_waldump" were "char *delimiter".
> Others might think differently, though..
> 
> So, the function looks like this.
> 
> StringInfo XLogBlockRefInfos(XLogReaderState *record, char *delimiter,
>   uint32 _len);

By the way, xlog_block_info@xlogrecovery.c has a subset of the
function.  So the function could be shared with the callers of
xlog_block_info, but I'm not sure that is not too much...

StringInfo XLogBlockRefInfos(XLogReaderState *record, char *delimiter,
                             bool fpw_detail, uint32 _len);

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: cpluspluscheck vs ICU

2022-03-22 Thread Thomas Munro
On Wed, Mar 23, 2022 at 3:23 PM Andres Freund  wrote:
> On 2022-03-22 17:20:24 -0700, Andres Freund wrote:
> > I was about to propose adding headerscheck / cpluspluscheck to the CI file 
> > so
> > that cfbot can catch future issues.
>
> The attached patch does so, with ICU disabled to avoid the problems discussed
> in the thread. Example run:
> https://cirrus-ci.com/task/6326161696358400?logs=headers_headerscheck#L0
>
> Unless somebody sees a reason not to, I'm planning to commit this soon.

LGTM.




Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

2022-03-22 Thread Kyotaro Horiguchi
At Tue, 22 Mar 2022 11:00:06 -0700, Andres Freund  wrote in 
> Hi,
> 
> On 2022-03-22 21:57:51 +0530, Bharath Rupireddy wrote:
> > > This is probably close to an order of magnitude slower than pg_waldump
> > > --stats. Which imo renders this largely useless.
> > 
> > Yeah that's true. Do you suggest having pg_get_wal_stats() a
> > c-function like in v8 patch [1]?
> 
> Yes.
>
> > SEe some numbers at [2] with pg_get_wal_stats using
> > pg_get_wal_records_info and pg_get_wal_records_info as a direct
> > c-function like in v8 patch [1]. A direct c-function always fares
> > better (84 msec vs 1400msec).
> 
> That indeed makes the view as is pretty much useless. And it'd probably be
> worse in a workload with longer records / many FPIs.

FWIW, agreed.  The SQL version is too slow...


> > > > + for (block_id = 0; block_id <= record->max_block_id; block_id++)
> > > > + {
> > >
> > > To me duplicating this much code from waldump seems like a bad idea from a
> > > maintainability POV.
> > 
> > Even if we were to put the above code from pg_walinspect and
> > pg_waldump into, say, walutils.c or some other existing file, we would
> > have to do an if (pg_walinspect) appendStringInfo() else if (pg_waldump)
> > printf() sort of thing there; isn't that clumsy?
> 
> Why is that needed? Just use the stringinfo in both places? You're outputting
> the exact same thing in both places right now. There's already a stringinfo in
> XLogDumpDisplayRecord() these days (there wasn't back when pg_xlogddump was
> written), so you could just convert at least the relevant printfs in
> XLogDumpDisplayRecord().

> > Also, unnecessary if
> > conditions would need to be executed for every record. For maintainability,
> > I added a note in pg_walinspect.c and pg_waldump.c to consider fixing
> > things in both places (of course this might sound like the dumbest way of
> > doing it, but IMHO it's sensible, given the if (pg_walinspect) / else
> > if (pg_waldump) sort of code that we would need in the common
> > functions). Thoughts?
> 
> IMO we shouldn't merge this with as much duplication as there is right now,
> the notes don't change that it's a PITA to maintain.

The two places emit different outputs but the only difference is the
delimiter between two blockrefs. (By the way, the current code forgets
to insert a delimiter there).  So even if the function took "bool
is_waldump", it is used only when appending a line delimiter.  It
would be nicer if the "bool is_waldump" were "char *delimiter".
Others might think differently, though..

So, the function looks like this.

StringInfo XLogBlockRefInfos(XLogReaderState *record, char *delimiter,
                             uint32 _len);


> > Yeah. It is to handle some edge cases to print the WAL up to end_lsn
> > and avoid waiting in read_local_xlog_page. I will change it.
> > 
> > Actually, there's an open point as specified in [3]. Any thoughts on it?
> 
> Seems more user-friendly to wait - it's otherwise hard for a user to know what
> LSN to put in.

I'm not sure it is user-friendly for the function to "freeze" for a
reason that is unclear to the user.  Even if some results have
accumulated before the wait, all of them vanish when the user types
Ctrl-C to break out of the "freeze".

About the usefulness of the waiting behavior, it depends on what we
think the function's major use cases are.  Robert (AFAIU) thinks of it
as a simple WAL dumper intended for use in some automated
mechanism.  The start/end LSNs simply identify the records to emit.
No warnings/errors and no waits except for apparently invalid inputs.

I think of it as a means to manually inspect WAL through a SQL
interface, but I don't have a strong opinion on the waiting behavior.
(I can avoid that by giving a valid LSN pair to the function
if I don't want it to "freeze".)


Anyway, the opinions here on the interface are described as follows.

A. as a diag interface for human use.

  1. If the whole region is filled with records, return them all.
  2. If start-LSN is too far in the past, start from the first available record.

  3-1. If start-LSN is in the future, wait for the record to come.
  4-1. If end-LSN is in the future, wait for new records.
  5-1. If end-LSN is too far in the past, error out?

B. as a simple WAL dumper

  1. If the whole region is filled with records, return them all.
  2. If start-LSN is too far in the past, start from the first available record.

  3-2. If start-LSN is in the future, return nothing.
  4-2. If end-LSN is in the future, end with the last available record.
  5-2. If end-LSN is too far in the past, return nothing.

1 and 2 are uncontroversial.
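
For illustration, here is roughly how the function would be invoked in
the cases above; the function name follows the patch under discussion,
and the exact signature and LSN values are placeholders:

-- Case 1: both LSNs within available WAL; all records in the range
-- are returned.
SELECT count(*) FROM pg_get_wal_records_info('0/1000000', '0/2000000');

-- Under interpretation A, an end-LSN in the future makes this wait
-- for new WAL to arrive; under interpretation B it would simply stop
-- at the last available record.
SELECT count(*) FROM pg_get_wal_records_info('0/1000000', '0/FFFFFFFF');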

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: cpluspluscheck vs ICU

2022-03-22 Thread Andres Freund
Hi,

On 2022-03-22 17:20:24 -0700, Andres Freund wrote:
> I was about to propose adding headerscheck / cpluspluscheck to the CI file so
> that cfbot can catch future issues.

The attached patch does so, with ICU disabled to avoid the problems discussed
in the thread. Example run:
https://cirrus-ci.com/task/6326161696358400?logs=headers_headerscheck#L0

Unless somebody sees a reason not to, I'm planning to commit this soon.

Greetings,

Andres Freund
>From fbc2c30f2420706bc2db64fa193b809b88ddff0b Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Tue, 22 Mar 2022 16:52:03 -0700
Subject: [PATCH v1] ci: test headerscheck, cpluspluscheck as part of
 CompilerWarnings task.

---
 .cirrus.yml | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/.cirrus.yml b/.cirrus.yml
index e5335fede76..171bd29cf03 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -576,5 +576,28 @@ task:
   make -s -j${BUILD_JOBS} clean
   time make -s -j${BUILD_JOBS} -C doc
 
+  ###
+  # Verify headerscheck / cpluspluscheck succeed
+  #
+  # - Don't use ccache, the files are uncacheable, polluting ccache's
+  #   cache
+  # - Use -fmax-errors, as particularly cpluspluscheck can be very verbose
+  # - XXX have to disable ICU to avoid errors:
+  #   https://postgr.es/m/20220323002024.f2g6tivduzrktgfa%40alap3.anarazel.de
+  # - XXX: the -Wno-register avoids verbose warnings:
+  #   https://postgr.es/m/20220308181837.aun3tdtdvao4vb7o%40alap3.anarazel.de
+  ###
+  always:
+headers_headerscheck_script: |
+  time ./configure \
+${LINUX_CONFIGURE_FEATURES} \
+--without-icu \
+--quiet \
+CC="gcc" CXX"=g++" CLANG="clang"
+  make -s -j${BUILD_JOBS} clean
+  time make -s headerscheck EXTRAFLAGS='-fmax-errors=10'
+headers_cpluspluscheck_script: |
+  time make -s cpluspluscheck EXTRAFLAGS='-Wno-register -fmax-errors=10'
+
   always:
 upload_caches: ccache
-- 
2.35.1.354.g715d08a9e5



Re: Column Filtering in Logical Replication

2022-03-22 Thread Amit Kapila
On Wed, Mar 23, 2022 at 12:54 AM Alvaro Herrera  wrote:
>
> On 2022-Mar-19, Tomas Vondra wrote:
>
> > @@ -174,7 +182,13 @@ ALTER PUBLICATION noinsert SET (publish = 'update, delete');
> >
> > Add some tables to the publication:
> >  
> > -ALTER PUBLICATION mypublication ADD TABLE users, departments;
> > +ALTER PUBLICATION mypublication ADD TABLE users (user_id, firstname), departments;
> > +
> > +
> > +  
> > +   Change the set of columns published for a table:
> > +
> > +ALTER PUBLICATION mypublication SET TABLE users (user_id, firstname, lastname), TABLE departments;
> >  
> >
> >
>
> Hmm, it seems to me that if you've removed the feature to change the set
> of columns published for a table, then the second example should be
> removed as well.
>

As per my understanding, the removed feature is "Alter Publication ...
Alter Table ...". The example here "Alter Publication ... Set Table
.." should still work as mentioned in my email[1].

[1] - 
https://www.postgresql.org/message-id/CAA4eK1L6YTcx%3DyJfdudr-y98Wcn4rWX4puHGAa2nxSCRb3fzQw%40mail.gmail.com

-- 
With Regards,
Amit Kapila.




Re: Allow file inclusion in pg_hba and pg_ident files

2022-03-22 Thread Julien Rouhaud
Hi,

On Wed, Mar 23, 2022 at 11:03:46AM +0900, Michael Paquier wrote:
> 
> Pushing forward with 0001 by the end of the CF is the part that has no
> controversy IMO, and I have no objections to it.  Now, after looking
> at this part, I found a few things, as of:
> - HbaToken, the set of elements in the lists of TokenizedAuthLine, is
> a weird name to use, as this layer gets used by both pg_hba.conf and
> pg_ident.conf before transforming them into HbaLine and
> IdentLine respectively.  While exposing this part of the internals, I think
> that we'd better rename it to AuthToken at least.  This impacts the
> names of some routines internal to hba.c that copy and create
> AuthTokens.

Yeah, I thought about it but didn't rename it given your concerns about git
history.  I'm fine either way.

> - s/gethba_options/get_hba_options/, to be consistent with
> fill_hba_view() and other things.
> - The comment at the top of tokenize_auth_file() needed a refresh.
> 
> That's mostly cosmetic, and the rest of the code moved is identical.
> So at the end this part looks rather commitable to me.

Looks good to me, thanks.

> I have not been able to test 0002 in detail, but it looks rather
> sane to me at a quick glance, and it is simple.  The argument
> about more TAP tests applies to it, though, even if there is one SQL
> test to check the function execution.  It is probably better to not
> consider 0003 and 0004 for this CF.

No objection to moving 0003 and 0004 to the next commitfest.




Re: Allow file inclusion in pg_hba and pg_ident files

2022-03-22 Thread Michael Paquier
On Tue, Mar 22, 2022 at 09:38:00PM +0800, Julien Rouhaud wrote:
> On Tue, Mar 22, 2022 at 03:21:20PM +0300, Aleksander Alekseev wrote:
>> Since v3-0002 adds a new view and alters pg_proc.dat shouldn't it also
>> increase CATALOG_VERSION_NO? Not sure if we generally do this in the
>> patches or expect the committer to make the change manually.
> 
> It's better to not include any catversion bump, otherwise the patches will
> rot very fast.  The author can mention the need for a catversion bump in the
> commit message, and I usually do so, but I apparently forgot.

Yeah, committers take care of that.  You would just expose yourself to
more noise in the CF bot for no gain, as a catversion bump is useful
after a patch has been merged so that users are able to know when a
cluster needs to be pg_upgrade'd or initdb'd because the catalogs
created and the code being run are incompatible.
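
For reference, the bump itself is a one-line change in
src/include/catalog/catversion.h; the value below is hypothetical but
follows the usual yyyymmddN convention:

/* src/include/catalog/catversion.h */
#define CATALOG_VERSION_NO	202203231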

>> All in all, the patchset seems to be in good shape and I don't have
>> anything but some little nitpicks. It passes `make installcheck` and I
>> verified manually that the file inclusion 1) works 2) write proper error
>> messages to the logfile when the included file doesn't exist or has wrong
>> permissions.
> 
> Thanks!

Pushing forward with 0001 by the end of the CF is the part that has no
controversy IMO, and I have no objections to it.  Now, after looking
at this part, I found a few things, as of:
- HbaToken, the set of elements in the lists of TokenizedAuthLine, is
a weird name to use, as this layer gets used by both pg_hba.conf and
pg_ident.conf before transforming them into HbaLine and
IdentLine respectively.  While exposing this part of the internals, I think
that we'd better rename it to AuthToken at least.  This impacts the
names of some routines internal to hba.c that copy and create
AuthTokens.
- s/gethba_options/get_hba_options/, to be consistent with
fill_hba_view() and other things.
- The comment at the top of tokenize_auth_file() needed a refresh.

That's mostly cosmetic, and the rest of the code moved is identical.
So at the end this part looks rather commitable to me.

I have not been able to test 0002 in detail, but it looks rather
sane to me at a quick glance, and it is simple.  The argument
about more TAP tests applies to it, though, even if there is one SQL
test to check the function execution.  It is probably better to not
consider 0003 and 0004 for this CF.
--
Michael
From dbee2a5b673b3bac2e4df0ec966912c29416fc2c Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Wed, 23 Mar 2022 10:56:15 +0900
Subject: [PATCH v4] Extract view processing code from hba.c

This file is already quite big and a following commit will add yet
another view, so let's move all the view-related code in hba.c into a new
adt/hbafuncs.c.

Author: Julien Rouhaud
Reviewed-by: FIXME
Discussion: https://postgr.es/m/20220223045959.35ipdsvbxcstrhya%40jrouhaud
---
 src/include/libpq/hba.h  |  31 ++
 src/backend/libpq/hba.c  | 517 +++
 src/backend/utils/adt/Makefile   |   1 +
 src/backend/utils/adt/hbafuncs.c | 428 +
 src/tools/pgindent/typedefs.list |   4 +-
 5 files changed, 511 insertions(+), 470 deletions(-)
 create mode 100644 src/backend/utils/adt/hbafuncs.c

diff --git a/src/include/libpq/hba.h b/src/include/libpq/hba.h
index 8d9f3821b1..13ecb329f8 100644
--- a/src/include/libpq/hba.h
+++ b/src/include/libpq/hba.h
@@ -132,6 +132,34 @@ typedef struct IdentLine
 	regex_t		re;
 } IdentLine;
 
+/*
+ * A single string token lexed from an authentication configuration file
+ * (pg_ident.conf or pg_hba.conf), together with whether the token has
+ * been quoted.
+ */
+typedef struct AuthToken
+{
+	char	   *string;
+	bool		quoted;
+} AuthToken;
+
+/*
+ * TokenizedAuthLine represents one line lexed from an authentication
+ * configuration file.  Each item in the "fields" list is a sub-list of
+ * AuthTokens.  We don't emit a TokenizedAuthLine for empty or all-comment
+ * lines, so "fields" is never NIL (nor are any of its sub-lists).
+ *
+ * Exception: if an error occurs during tokenization, we might have
+ * fields == NIL, in which case err_msg != NULL.
+ */
+typedef struct TokenizedAuthLine
+{
+	List	   *fields;			/* List of lists of AuthTokens */
+	int			line_num;		/* Line number */
+	char	   *raw_line;		/* Raw line text */
+	char	   *err_msg;		/* Error message if any */
+} TokenizedAuthLine;
+
 /* kluge to avoid including libpq/libpq-be.h here */
 typedef struct Port hbaPort;
 
@@ -142,6 +170,9 @@ extern void hba_getauthmethod(hbaPort *port);
 extern int	check_usermap(const char *usermap_name,
 		  const char *pg_role, const char *auth_user,
 		  bool case_sensitive);
+extern HbaLine *parse_hba_line(TokenizedAuthLine *tok_line, int elevel);
 extern bool pg_isblank(const char c);
+extern MemoryContext tokenize_auth_file(const char *filename, FILE *file,
+		List **tok_lines, int elevel);
 
 #endif			/* HBA_H */
diff --git a/src/backend/libpq/hba.c 

Re: SQL/JSON: functions

2022-03-22 Thread Tom Lane
I wrote:
> Andres Freund  writes:
>> There's also
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jabiru=2022-03-22%2022%3A25%3A26
>> that started failing with
>> ../../preproc/ecpg --regression -I./../../include -I. -o test1.c test1.pgc
>> test1.pgc:12: ERROR: syntax error at or near "int"
>> with this commit.

> Yeah, I was just scratching my head about that.

I have a possibly-far-fetched theory: the ecpg grammar builder has
certainly never been validated against a backend grammar that
contains unused rules or unused nonterminals.  Maybe it generates
a subtly incorrect .y file in such cases, and on this particular
platform that results in bad code generated by bison, and ensuing
bogus syntax errors.

The lack of other failures weakens this theory.  Notably, I failed
to duplicate the problem on florican's host, which has the same
nominal bison version 3.7.6.  But it wouldn't surprise me a bit
to find that OpenBSD is carrying some private patches for their
build, so maybe that matters?

In any case, I think it's a bit pointless to chase these issues
with respect to this intermediate state of the JSON patch.
Let's merge in the next step, get to a state that does not
generate build warnings, and see what happens then.

regards, tom lane




Re: Probable CF bot degradation

2022-03-22 Thread Justin Pryzby
On Wed, Mar 23, 2022 at 12:44:09PM +1300, Thomas Munro wrote:
> On Mon, Mar 21, 2022 at 12:46 PM Thomas Munro  wrote:
> > On Mon, Mar 21, 2022 at 12:23 PM Thomas Munro  
> > wrote:
> > > On Mon, Mar 21, 2022 at 1:58 AM Matthias van de Meent 
> > >  wrote:
> > > > Additionally, are there plans to validate commits of the main branch
> > > > before using them as a base for CF entries, so that "bad" commits on
> > > > master won't impact CFbot results as easy?
> > >
> > > How do you see this working?
> >
> > [Now with more coffee on board]  Oh, right, I see, you're probably
> > thinking that we could look at
> > https://github.com/postgres/postgres/commits/master and take the most
> > recent passing commit as a base.  Hmm, interesting idea.
> 
> A nice case in point today: everything is breaking on Windows due to a
> commit in master, which could easily be avoided by looking back a
> certain distance for a passing commit from postgres/postgres to use as
> a base.  Let me see if this is easy to fix...
> 
> https://www.postgresql.org/message-id/20220322231311.GK28503%40telsasoft.com

I suggest not to make it too sophisticated.  If something is broken, the CI
should show that rather than presenting a misleading conclusion.

Maybe you could keep track of how many consecutive, *new* failures there've
been (which were passing on the previous run for that task, for that patch) and
delay if it's more than (say) 5.  For bonus points, queue a rerun of all the
failed tasks once something passes.

If you create a page to show the queue, maybe it should show the history of
results, too.  And maybe there should be a history of results for each patch.

If you implement interactive buttons, maybe it could allow re-queueing some
recent failures (add to end of queue).

-- 
Justin




Re: shared-memory based stats collector - v66

2022-03-22 Thread Melanie Plageman
On Thu, Mar 17, 2022 at 3:36 AM Andres Freund  wrote:
> I've attached a substantially improved version of the shared memory stats
> patch.
...
>   - a lot of the pg_stat_ views like bgwriter, pg_stat_database have zero
> coverage today

Attached are some tests including tests that reset of stats works for
all views having a reset timestamp as well as a basic test for at least
one column in all of the following stats views:
pg_stat_archiver, pg_stat_bgwriter, pg_stat_wal, pg_stat_slru,
pg_stat_replication_slots, pg_stat_database

It might be nice to have a test for one of the columns fetched from the
PgStatBgwriter data structure, since those and the Checkpointer stats are
stored separately despite being displayed in the same view currently.
But, alas...

- Melanie
From 37e5ba3b7743309b00c81dbfe65cfd481d4859a6 Mon Sep 17 00:00:00 2001
From: Melanie Plageman 
Date: Tue, 22 Mar 2022 16:53:32 -0400
Subject: [PATCH] test

---
 src/test/isolation/expected/stats.out   |  35 
 src/test/isolation/specs/stats.spec |  31 +++
 src/test/recovery/t/006_logical_decoding.pl |  63 ++
 src/test/regress/expected/stats.out | 208 
 src/test/regress/sql/stats.sql  | 107 ++
 5 files changed, 444 insertions(+)

diff --git a/src/test/isolation/expected/stats.out b/src/test/isolation/expected/stats.out
index ca553be3bf..7a156e1da2 100644
--- a/src/test/isolation/expected/stats.out
+++ b/src/test/isolation/expected/stats.out
@@ -1895,3 +1895,38 @@ name  |pg_stat_get_function_calls|total_above_zero|self_above_zero
 test_stat_func| 5|t   |t  
 (1 row)
 
+
+starting permutation: s1_slru_save_stats s1_listen s1_big_notify s1_ff s1_slru_check_stats
+step s1_slru_save_stats: 
+	INSERT INTO test_slru_stats VALUES('Notify', 'blks_zeroed',
+(SELECT blks_zeroed FROM pg_stat_slru WHERE name = 'Notify'));
+
+step s1_listen: LISTEN big_notify;
+step s1_big_notify: SELECT pg_notify('big_notify',
+repeat('0', current_setting('block_size')::int / 2)) FROM generate_series(1, 2);
+
+pg_notify
+-
+ 
+ 
+(2 rows)
+
+s1: NOTIFY "big_notify" with payload 

Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2022-03-22 Thread Nathan Bossart
On Tue, Mar 22, 2022 at 04:13:47PM -0700, Andres Freund wrote:
> Just skimming this thread quickly, I really have no idea what this is trying
> to achieve and the commit message doesn't help either...  I didn't read the
> referenced thread, but I shouldn't have to, to get a basic idea.

Ah, my bad.  I should've made sure the context was carried over better.  I
updated the commit message with some basic information about the intent.
Please let me know if there is anything else that needs to be cleared up.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 739ec6e42183280d57d867157cfe1b37d127ca54 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 22 Mar 2022 15:35:34 -0700
Subject: [PATCH v5 1/1] Disallow setting XMAX_COMMITTED and XMAX_LOCK_ONLY
 together.

Even though Postgres doesn't set both the XMAX_COMMITTED and
XMAX_LOCK_ONLY infomask bits on the same tuple header, the heap
visibility logic still accepts it.  However, other functions like
compute_new_xmax_infomask() seem to assume that this bit pattern
is not possible.

This change marks this bit pattern as disallowed, removes the heap
visibility logic that handles it, and adds checks like those for
other disallowed infomask bit combinations (e.g., XMAX_COMMITTED
and XMAX_IS_MULTI).  Besides simplifying the heap visibility logic
a bit, this change aims to reduce ambiguity about the legal tuple
header states.

Note that this change also disallows XMAX_COMMITTED together with
the special pre-v9.3 locked-only bit pattern that
HEAP_XMAX_IS_LOCKED_ONLY checks for.  This locked-only bit pattern
may still be present on servers pg_upgraded from pre-v9.3 versions.
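
For reference, the pre-v9.3 bit pattern referred to above is the
second arm of this macro from src/include/access/htup_details.h:

/*
 * A tuple is only locked (i.e. not updated by its Xmax) if the
 * HEAP_XMAX_LOCK_ONLY bit is set; or, for pg_upgrade's sake, if the
 * Xmax is not a multi and the EXCL_LOCK bit is set.
 */
#define HEAP_XMAX_IS_LOCKED_ONLY(infomask) \
	(((infomask) & HEAP_XMAX_LOCK_ONLY) || \
	 (((infomask) & (HEAP_XMAX_IS_MULTI | HEAP_LOCK_MASK)) == HEAP_XMAX_EXCL_LOCK))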
---
 contrib/amcheck/verify_heapam.c | 19 
 src/backend/access/heap/README.tuplock  |  2 +-
 src/backend/access/heap/heapam.c| 10 +++
 src/backend/access/heap/heapam_visibility.c | 97 ++---
 src/backend/access/heap/hio.c   |  2 +
 5 files changed, 96 insertions(+), 34 deletions(-)

diff --git a/contrib/amcheck/verify_heapam.c b/contrib/amcheck/verify_heapam.c
index e5f7355dcb..f30b9c234f 100644
--- a/contrib/amcheck/verify_heapam.c
+++ b/contrib/amcheck/verify_heapam.c
@@ -665,6 +665,25 @@ check_tuple_header(HeapCheckContext *ctx)
 		 */
 	}
 
+	if ((ctx->tuphdr->t_infomask & HEAP_XMAX_COMMITTED) &&
+		(ctx->tuphdr->t_infomask & HEAP_XMAX_LOCK_ONLY))
+	{
+		report_corruption(ctx,
+		  pstrdup("locked-only should not be marked committed"));
+
+		/* As above, do not skip further checks. */
+	}
+
+	/* also check for pre-v9.3 lock-only bit pattern */
+	if ((ctx->tuphdr->t_infomask & HEAP_XMAX_COMMITTED) &&
+		HEAP_XMAX_IS_LOCKED_ONLY(ctx->tuphdr->t_infomask))
+	{
+		report_corruption(ctx,
+		  pstrdup("tuple with HEAP_XMAX_EXCL_LOCK set and HEAP_XMAX_IS_MULTI unset should not be marked committed"));
+
+		/* As above, do not skip further checks. */
+	}
+
 	if (infomask & HEAP_HASNULL)
 		expected_hoff = MAXALIGN(SizeofHeapTupleHeader + BITMAPLEN(ctx->natts));
 	else
diff --git a/src/backend/access/heap/README.tuplock b/src/backend/access/heap/README.tuplock
index 6441e8baf0..4e565e60ab 100644
--- a/src/backend/access/heap/README.tuplock
+++ b/src/backend/access/heap/README.tuplock
@@ -152,4 +152,4 @@ The following infomask bits are applicable:
   whether the XMAX is a TransactionId or a MultiXactId.
 
 We currently never set the HEAP_XMAX_COMMITTED when the HEAP_XMAX_IS_MULTI bit
-is set.
+or the HEAP_XMAX_LOCK_ONLY bit is set.
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 3746336a09..3f0b4cd754 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -5151,6 +5151,16 @@ l5:
 		MultiXactStatus status;
 		MultiXactStatus new_status;
 
+		/*
+		 * Currently we don't allow XMAX_LOCK_ONLY and XMAX_COMMITTED to both be
+		 * set in a tuple header, so cross-check.
+		 *
+		 * Note that this also checks for the special locked-only bit pattern
+		 * that may be present on servers that were pg_upgraded from pre-v9.3
+		 * versions.
+		 */
+		Assert(!HEAP_XMAX_IS_LOCKED_ONLY(old_infomask));
+
 		if (old_infomask2 & HEAP_KEYS_UPDATED)
 			status = MultiXactStatusUpdate;
 		else
diff --git a/src/backend/access/heap/heapam_visibility.c b/src/backend/access/heap/heapam_visibility.c
index ff0b8a688d..7d6fb66b0d 100644
--- a/src/backend/access/heap/heapam_visibility.c
+++ b/src/backend/access/heap/heapam_visibility.c
@@ -78,6 +78,31 @@
 #include "utils/snapmgr.h"
 
 
+/*
+ * InfomaskAssertions()
+ *
+ * Checks for invalid infomask bit patterns.
+ */
+static inline void
+InfomaskAssertions(HeapTupleHeader tuple)
+{
+	/*
+	 * XMAX_COMMITTED and XMAX_LOCK_ONLY cannot both be set in a tuple header.
+	 *
+	 * Note that this also checks for the special locked-only bit pattern that
+	 * may be present on servers that were pg_upgraded from pre-v9.3 versions.
+	 */
+	Assert(!((tuple->t_infomask & HEAP_XMAX_COMMITTED) &&
+			 

Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2022-03-22 Thread Japin Li


On Wed, 23 Mar 2022 at 01:22, Maxim Orlov  wrote:
> Hi!
>
> Here is v24. Changes are:
> - correct commit messages for 0001 and 0002
> - use uint64 for SLRU page numbering instead of int64 in v23
> - fix code formatting and indent
> - and minor fixes in slru.c
>
> Reviews are very welcome!
>

Thanks for updating the patches. I have a little comment on 0002.

+errmsg_internal("found xmax %llu" " (infomask 
0x%04x) not frozen, not multi, not normal",
+(unsigned long 
long) xid, tuple->t_infomask)));

IMO, we can remove the double quotes inside the sentence and merge the two string literals:

 errmsg_internal("found xmax %llu (infomask 
0x%04x) not frozen, not multi, not normal",
 (unsigned long 
long) xid, tuple->t_infomask)));

--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




cpluspluscheck vs ICU

2022-03-22 Thread Andres Freund
Hi,

I was about to propose adding headerscheck / cpluspluscheck to the CI file so
that cfbot can catch future issues. Unfortunately running cpluspluscheck with
ICU enabled is, um, not fun: There's 30k lines of error output.

/home/andres/src/postgresql/src/tools/pginclude/cpluspluscheck /home/andres/src/postgresql /home/andres/build/postgres/dev-assert/vpath
In file included from /usr/include/c++/12/bits/stl_algobase.h:60,
 from /usr/include/c++/12/memory:63,
 from /usr/include/unicode/localpointer.h:45,
 from /usr/include/unicode/unorm2.h:34,
 from /usr/include/unicode/unorm.h:25,
 from /usr/include/unicode/ucol.h:17,
 from /home/andres/src/postgresql/src/include/utils/pg_locale.h:19,
 from /home/andres/src/postgresql/src/include/tsearch/ts_locale.h:20,
 from /tmp/cpluspluscheck.H59Y6V/test.cpp:3:
/usr/include/c++/12/bits/functexcept.h:101:3: error: conflicting declaration of C function ‘void std::__throw_ios_failure(const char*, int)’
  101 |   __throw_ios_failure(const char*, int) __attribute__((__noreturn__));
      |   ^~~
/usr/include/c++/12/bits/functexcept.h:98:3: note: previous declaration ‘void std::__throw_ios_failure(const char*)’
   98 |   __throw_ios_failure(const char*) __attribute__((__noreturn__));
  |   ^~~
In file included from /usr/include/c++/12/bits/stl_algobase.h:63:
/usr/include/c++/12/ext/numeric_traits.h:50:3: error: template with C linkage
   50 |   template
  |   ^~~~
/tmp/cpluspluscheck.H59Y6V/test.cpp:1:1: note: ‘extern "C"’ linkage started here
1 | extern "C" {
  | ^~
...

with one warning for each declaration in numeric_traits.h, I think.

So, there's two questions:
1) How can we prevent this problem when ICU support is enabled?
2) Can we prevent such absurdly long error output?

For 2), perhaps we should just specify EXTRAFLAGS=-fmax-errors=10 in the
cpluspluscheck invocation, or add it in cpluspluscheck itself?

For 1), I don't immediately see a minimal solution other than ignoring it in
cpluspluscheck, similar to pg_trace.h/probes.h.

A different / complimentary approach could be to add -Wc++-compat to the
headerscheck invocation. Both gcc and clang understand that.

But neither of these really gets to the heart of the problem. There's still no
way for C++ code to include pg_locale.h correctly. And in contrast to
pg_trace.h/probes.h pg_locale.h is somewhat important.


This isn't a new problem, afaics.


Perhaps we should strive to remove the use of ICU headers from within our
headers? The members of pg_locale are just pointers and could thus be void *,
and HAVE_UCOL_STRCOLLUTF8 could be computed at configure time or such.
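
A sketch of that idea: keep the ICU types out of the header by storing
opaque pointers and casting inside the .c files.  The struct below is
abbreviated from pg_locale.h, not the actual committed layout:

/* In pg_locale.h, with no ICU headers needed: */
struct pg_locale_struct
{
	char		provider;
	bool		deterministic;
	union
	{
		struct
		{
			const char *locale;
			void	   *ucol;	/* really a UCollator *, cast in pg_locale.c */
		}			icu;
	}			info;
};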

Greetings,

Andres Freund




Re: New Object Access Type hooks

2022-03-22 Thread Andrew Dunstan


On 3/22/22 20:07, Tom Lane wrote:
> Andrew Dunstan  writes:
>> On 3/22/22 18:18, Tom Lane wrote:
>>> Now, if our attitude to the OAT hooks is that we are going to
>>> sprinkle some at random and whether they are useful is someone
>>> else's problem, then maybe these are not interesting concerns.
>> So this was a pre-existing problem that the test has exposed? I don't
>> think we can just say "you deal with it", and if I understand you right
>> you don't think that either.
> Yeah, my point exactly: the placement of those hooks needs to be rethought.
> I'm guessing what we ought to do is let the cached namespace OID list
> get built without interference, and then allow the OAT hook to filter
> it when it's read.
>
>> I could make the buildfarm quiet again by resetting NO_INSTALLCHECK
>> temporarily.
> I was able to reproduce it under "make check" as long as I had
> LANG set to one of the troublesome values, so I'm not real sure
> that that'll be enough.
>
>   


The buildfarm only runs installcheck under different locales/encodings.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: New Object Access Type hooks

2022-03-22 Thread Tom Lane
Andrew Dunstan  writes:
> On 3/22/22 18:18, Tom Lane wrote:
>> Now, if our attitude to the OAT hooks is that we are going to
>> sprinkle some at random and whether they are useful is someone
>> else's problem, then maybe these are not interesting concerns.

> So this was a pre-existing problem that the test has exposed? I don't
> think we can just say "you deal with it", and if I understand you right
> you don't think that either.

Yeah, my point exactly: the placement of those hooks needs to be rethought.
I'm guessing what we ought to do is let the cached namespace OID list
get built without interference, and then allow the OAT hook to filter
it when it's read.
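
A rough sketch of that shape, with a hypothetical helper in namespace.c
(the real thing would also have to deal with caching the filtered copy):

    /*
     * recomputeNamespacePath() builds the cached OID list with no hook
     * calls; readers of the path run it through this filter instead.
     */
    static List *
    filter_path_by_oat_hook(List *cachedPath)
    {
        List       *result = NIL;
        ListCell   *lc;

        foreach(lc, cachedPath)
        {
            Oid         nspid = lfirst_oid(lc);

            /* invoke the OAT hook at read time, not at cache-build time */
            if (InvokeNamespaceSearchHook(nspid, false))
                result = lappend_oid(result, nspid);
        }

        return result;
    }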

> I could make the buildfarm quiet again by resetting NO_INSTALLCHECK
> temporarily.

I was able to reproduce it under "make check" as long as I had
LANG set to one of the troublesome values, so I'm not real sure
that that'll be enough.

regards, tom lane




Re: SQL/JSON: functions

2022-03-22 Thread Andrew Dunstan


On 3/22/22 18:31, Tom Lane wrote:
> I wrote:
>> That patch is 0-for-three on my buildfarm animals.
> ... the reason being that they use -Werror, and this patch spews
> a bunch of warnings.  This is not acceptable, especially not in
> the middle of a CF when other people are trying to get work done.
> Please revert.
>
>   



reverted.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: New Object Access Type hooks

2022-03-22 Thread Andrew Dunstan


On 3/22/22 18:18, Tom Lane wrote:
> Andrew Dunstan  writes:
>> Fixed
> Now that we got past the hard failures, we can see that the test
> falls over with (some?) non-default encodings, as for instance
> here:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2022-03-22%2020%3A23%3A13
>
> I can replicate that by running the test under LANG=en_US.iso885915.
> What I think is happening is:
>
> (1) Rather unwisely, the relevant InvokeNamespaceSearchHook calls
> appear in recomputeNamespacePath.  That means that their timing
> depends heavily on accidents of caching.
>
> (2) If we decide that we need an encoding conversion to talk to
> the client, there'll be a lookup for the conversion function
> early during session startup.  That will cause the namespace
> search path to get computed then, before the test module has been
> loaded and certainly before the audit GUC has been turned on.
>
> (3) At the point where the test expects some audit notices
> to come out, nothing happens because the search path is
> already validated.
>
> I'm inclined to think that (1) is a seriously bad idea,
> not only because of this instability, but because
>
> (a) the namespace cache logic is unlikely to cause the search-path
> cache to get invalidated when something happens that might cause an
> OAT hook to wish to change its decision, and
>
> (b) this placement means that the hook is invoked during cache loading
> operations that are likely to be super-sensitive to any additional
> catalog accesses a hook might wish to do.  (I await the results of the
> CLOBBER_CACHE_ALWAYS animals with trepidation.)
>
> Now, if our attitude to the OAT hooks is that we are going to
> sprinkle some at random and whether they are useful is someone
> else's problem, then maybe these are not interesting concerns.


So this was a pre-existing problem that the test has exposed? I don't
think we can just say "you deal with it", and if I understand you right
you don't think that either.

I could make the buildfarm quiet again by resetting NO_INSTALLCHECK
temporarily.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Window Function "Run Conditions"

2022-03-22 Thread Zhihong Yu
On Tue, Mar 22, 2022 at 3:24 PM David Rowley  wrote:

> On Thu, 26 Aug 2021 at 14:54, Andy Fan  wrote:
> >
> > On Thu, Aug 19, 2021 at 2:35 PM David Rowley 
> wrote:
> > >
> > > On Thu, 19 Aug 2021 at 00:20, Andy Fan 
> wrote:
> > > > In the current master, the result is:
> > > >
> > > >  empno | salary | c | dr
> > > > -------+--------+---+----
> > > >      8 |   6000 | 4 |  1
> > >
> > > > In the patched version, the result is:
> > > >
> > > >  empno | salary | c | dr
> > > > -------+--------+---+----
> > > >      8 |   6000 | 1 |  1
> > >
> > > Thanks for taking it for a spin.
> > >
> > > That's a bit unfortunate.  I don't immediately see how to fix it other
> > > than to restrict the optimisation to only apply when there's a single
> > > WindowClause. It might be possible to relax it further and only apply
> > > if it's the final window clause to be evaluated, but in those cases,
> > > the savings are likely to be much less anyway as some previous
> > > WindowAgg will have exhausted all rows from its subplan.
> >
> > I am trying to hack the select_active_windows function to make
> > sure the WindowClause with Run Condition clause to be the last one
> > to evaluate (we also need to consider more than 1 window func has
> > run condition), at that time, the run condition clause is ready already.
> >
> > However there are two troubles in this direction: a).  This may conflict
> > with "the windows that need the same sorting are adjacent in the list."
> > b). "when two or more windows are order-equivalent then all peer rows
> > must be presented in the same order in all of them. .. (See General Rule
> 4 of
> >  in SQL2008 - SQL2016.)"
> >
> > In summary, I am not sure if it is correct to change the execution Order
> > of WindowAgg freely.
>
> Thanks for looking at that.
>
> My current thoughts are that it just feels a little too risky to
> adjust the comparison function that sorts the window clauses to pay
> attention to the run-condition.
>
> We would need to ensure that there's just a single window function
> with a run condition as it wouldn't be valid for there to be multiple.
> It would be easy enough to ensure we only push quals into just 1
> window clause, but that and meddling with the evaluation order has
> trade-offs.  To do that properly, we'd likely want to consider the
> costs when deciding which window clause would benefit from having
> quals pushed the most.  Plus, if we start messing with the evaluation
> order then we'd likely really want some sort of costing to check if
> pushing a qual and adjusting the evaluation order is actually cheaper
> than not pushing the qual and keeping the clauses in the order that
> requires the minimum number of sorts.   The planner is not really
> geared up for costing things like that properly, that's why we just
> assume the order with the least sorts is best. In reality that's often
> not going to be true as an index may exist and we might want to
> evaluate a clause first if we could get rid of a sort and index scan
> instead. Sorting the window clauses based on their SortGroupClause is
> just the best we can do for now at that stage in planning.
>
> I think it's safer to just disable the optimisation when there are
> multiple window clauses.  Multiple matching clauses are merged
> already, so it's perfectly valid to have multiple window functions,
> it's just they must share the same window clause.  I don't think
> that's terrible as with the major use case that I have in mind for
> this, the window function is only added to limit the number of rows.
> In most cases I can imagine, there'd be no reason to have an
> additional window function with different frame options.
>
> I've attached an updated patch.
>
Hi,
The following code seems to be common between if / else blocks (w.r.t.
wfunc_left) of find_window_run_conditions().

+   op = get_opfamily_member(opinfo->opfamily_id,
+opinfo->oplefttype,
+opinfo->oprighttype,
+newstrategy);
+
+   newopexpr = (OpExpr *) make_opclause(op,
+opexpr->opresulttype,
+opexpr->opretset,
+otherexpr,
+(Expr *) wfunc,
+opexpr->opcollid,
+opexpr->inputcollid);
+   newopexpr->opfuncid = get_opcode(op);
+
+   *keep_original = true;
+   runopexpr = newopexpr;

It would be nice if this code can be shared.
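
As a sketch, the duplicated construction could move into a helper along
these lines (assuming opinfo is the OpBtreeInterpretation the function
already looks up; the helper name and the wfunc_left flag are
illustrative, not from the patch):

    static OpExpr *
    build_runcondition_opexpr(OpBtreeInterpretation *opinfo, OpExpr *opexpr,
                              WindowFunc *wfunc, Expr *otherexpr,
                              int16 newstrategy, bool wfunc_left)
    {
        Oid         op = get_opfamily_member(opinfo->opfamily_id,
                                             opinfo->oplefttype,
                                             opinfo->oprighttype,
                                             newstrategy);
        OpExpr     *newopexpr;

        newopexpr = (OpExpr *) make_opclause(op,
                                             opexpr->opresulttype,
                                             opexpr->opretset,
                                             wfunc_left ? (Expr *) wfunc : otherexpr,
                                             wfunc_left ? otherexpr : (Expr *) wfunc,
                                             opexpr->opcollid,
                                             opexpr->inputcollid);
        newopexpr->opfuncid = get_opcode(op);

        return newopexpr;
    }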

+   WindowClause *wclause = (WindowClause *)
+   list_nth(subquery->windowClause,
+wfunc->winref - 1);

The code would be more readable if the list_nth() call were indented.

+   /* Check the left side of the OpExpr */


Re: SQL/JSON: functions

2022-03-22 Thread Tom Lane
Andres Freund  writes:
> There's also
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jabiru&dt=2022-03-22%2022%3A25%3A26
> that started failing with
> ../../preproc/ecpg --regression -I./../../include -I. -o test1.c test1.pgc
> test1.pgc:12: ERROR: syntax error at or near "int"
> with this commit.

Yeah, I was just scratching my head about that.  It's not clear how
that could be related; but jabiru doesn't seem to have a history of
random failures, so maybe it's real.

regards, tom lane




Re: Probable CF bot degradation

2022-03-22 Thread Thomas Munro
On Mon, Mar 21, 2022 at 12:46 PM Thomas Munro  wrote:
> On Mon, Mar 21, 2022 at 12:23 PM Thomas Munro  wrote:
> > On Mon, Mar 21, 2022 at 1:58 AM Matthias van de Meent
> >  wrote:
> > > Additionally, are there plans to validate commits of the main branch
> > > before using them as a base for CF entries, so that "bad" commits on
> > > master won't impact CFbot results as easy?
> >
> > How do you see this working?
>
> [Now with more coffee on board]  Oh, right, I see, you're probably
> thinking that we could look at
> https://github.com/postgres/postgres/commits/master and take the most
> recent passing commit as a base.  Hmm, interesting idea.

A nice case in point today: everything is breaking on Windows due to a
commit in master, which could easily be avoided by looking back a
certain distance for a passing commit from postgres/postgres to use as
a base. Let me see if this is easy to fix...

https://www.postgresql.org/message-id/20220322231311.GK28503%40telsasoft.com




Re: New Object Access Type hooks

2022-03-22 Thread Tom Lane
Justin Pryzby  writes:
> If I'm not wrong, this is still causing issues at least on cfbot/windows, even
> since f0206d99.

That's probably a variant of the encoding dependency I described
upthread.

regards, tom lane




Re: MDAM techniques and Index Skip Scan patch

2022-03-22 Thread Tom Lane
Andres Freund  writes:
> On 2022-03-22 16:55:49 -0400, Tom Lane wrote:
>> BTW, I've had a bee in my bonnet for a long time about whether some of
>> nbtree's scan setup work could be done once during planning, rather than
>> over again during each indexscan start.

> It does show up in simple-index-lookup heavy workloads. Not as a major thing,
> but it's there. And it's just architecturally displeasing :)
> Are you thinking of just moving the setup stuff in nbtree (presumably parts of
> _bt_first() / _bt_preprocess_keys()) or also stuff in
> ExecIndexBuildScanKeys()?

Didn't really have specifics in mind.  The key stumbling block is
that some (not all) of the work depends on knowing the specific
values of the indexqual comparison keys, so while you could do
that work in advance for constant keys, you'd still have to be
prepared to do work at scan start for non-constant keys.  I don't
have a clear idea about how to factorize that effectively.

A couple of other random ideas in this space:

* I suspect that a lot of this work overlaps with the efforts that
btcostestimate makes along the way to getting a cost estimate.
So it's interesting to wonder whether we could refactor so that
btcostestimate is integrated with this hypothetical plan-time key
preprocessing and doesn't duplicate work.

* I think that we run through most or all of that preprocessing
logic even for internal catalog accesses, where we know darn well
how the keys are set up.  We ought to think harder about how we
could short-circuit pointless work in those code paths.

I don't think any of this is an essential prerequisite to getting
something done for loose index scans, which ISTM ought to be the first
point of attack for v16.  Loose index scans per se shouldn't add much
to the key preprocessing costs.  But these ideas likely would be
useful to look into before anyone starts on the more complicated
preprocessing that would be needed for the ideas in the MDAM paper.

regards, tom lane




Re: SQL/JSON: functions

2022-03-22 Thread Andres Freund
Hi,

On 2022-03-22 18:31:39 -0400, Tom Lane wrote:
> I wrote:
> > That patch is 0-for-three on my buildfarm animals.
> 
> ... the reason being that they use -Werror, and this patch spews
> a bunch of warnings.  This is not acceptable, especially not in
> the middle of a CF when other people are trying to get work done.
> Please revert.

There's also 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jabiru&dt=2022-03-22%2022%3A25%3A26
that started failing with
../../preproc/ecpg --regression -I./../../include -I. -o test1.c test1.pgc
test1.pgc:12: ERROR: syntax error at or near "int"
with this commit.

And the bison warnings I complained about on -committers:
https://www.postgresql.org/message-id/2022033319.so4ajcki7wwaujin%40alap3.anarazel.de

Greetings,

Andres Freund




Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2022-03-22 Thread Andres Freund
Hi,

On 2022-03-22 16:06:40 -0700, Nathan Bossart wrote:
> On Thu, Mar 17, 2022 at 04:45:28PM -0700, Nathan Bossart wrote:
> > I think this one requires some more work, and it needn't be a priority for
> > v15, so I've adjusted the commitfest entry to v16 and moved it to the next
> > commitfest.
> 
> Here is a new patch.  The main differences from v3 are in
> heapam_visibility.c.  Specifically, instead of trying to work the infomask
> checks into the visibility logic, I added a new function that does a couple
> of assertions.  This function is called at the beginning of each visibility
> function.
> 
> What do folks think?  The options I've considered are 1) not adding any
> such checks to heapam_visibility.c, 2) only adding assertions like the
> attached patch, or 3) actually using elog(ERROR, ...) when the invalid bit
> patterns are detected.  AFAICT (1) is more in line with existing invalid
> bit patterns (e.g., XMAX_COMMITTED + XMAX_IS_MULTI).  There are a couple of
> scattered assertions, but most code paths don't check for it.  (2) adds
> additional checks, but only for --enable-cassert builds.  (3) would add
> checks even for non-assert builds, but there would presumably be some
> performance cost involved.

> From 2d6b372cf61782e0fd52590b57b1c914b0ed7a4c Mon Sep 17 00:00:00 2001
> From: Nathan Bossart 
> Date: Tue, 22 Mar 2022 15:35:34 -0700
> Subject: [PATCH v4 1/1] disallow XMAX_COMMITTED + XMAX_LOCK_ONLY

Just skimming this thread quickly, I really have no idea what this is trying
to achieve and the commit message doesn't help either...  I didn't read the
referenced thread, but I shouldn't have to, to get a basic idea.

Greetings,

Andres Freund




Re: New Object Access Type hooks

2022-03-22 Thread Justin Pryzby
If I'm not wrong, this is still causing issues at least on cfbot/windows, even
since f0206d99.

https://cirrus-ci.com/task/5266352712712192
https://cirrus-ci.com/task/5061218867085312
https://cirrus-ci.com/task/5663822005403648
https://cirrus-ci.com/task/5744257246953472

https://cirrus-ci.com/task/5744257246953472
[22:26:50.939] test test_oat_hooks   ... FAILED  173 ms

https://api.cirrus-ci.com/v1/artifact/task/5744257246953472/log/src/test/modules/test_oat_hooks/regression.diffs
diff -w -U3 
c:/cirrus/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out 
c:/cirrus/src/test/modules/test_oat_hooks/results/test_oat_hooks.out
--- c:/cirrus/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out   
2022-03-22 22:13:23.386895200 +
+++ c:/cirrus/src/test/modules/test_oat_hooks/results/test_oat_hooks.out
2022-03-22 22:26:51.104419600 +
@@ -15,12 +15,6 @@
 NOTICE:  in process utility: superuser finished CreateRoleStmt
 CREATE TABLE regress_test_table (t text);
 NOTICE:  in process utility: superuser attempting CreateStmt
-NOTICE:  in object access: superuser attempting namespace search (subId=0) [no 
report on violation, allowed]
-LINE 1: CREATE TABLE regress_test_table (t text);
- ^
-NOTICE:  in object access: superuser finished namespace search (subId=0) [no 
report on violation, allowed]
-LINE 1: CREATE TABLE regress_test_table (t text);
- ^
 NOTICE:  in object access: superuser attempting create (subId=0) [explicit]




Re: New Object Access Type hooks

2022-03-22 Thread Andres Freund
Hi,

On 2022-03-22 18:41:45 -0400, Tom Lane wrote:
> Mark Dilger  writes:
> >> On Mar 22, 2022, at 3:20 PM, Andres Freund  wrote:
> >> Seems like it might actually be good to test that object access hooks work
> >> well in a parallel worker. How about going the other way and explicitly 
> >> setting
> >> force_parallel_mode = disabled for parts of the test and to enabled for
> >> others?
> 
> > Wouldn't we get differing numbers of NOTICE messages depending on how
> > many parallel workers there are?  Or would you propose setting the
> > number of workers to a small, fixed value?

Yes.


> The value would have to be "1", else you are going to have issues
> with notices from different workers being interleaved differently
> from run to run.

Yea. Possibly one could work around those with some effort (using multiple
notification channels maybe), but there seems little to glean from multiple
workers that a single worker wouldn't show.


> You might have that anyway, due to interleaving of leader and worker
> messages.

That part could perhaps be addressed by setting parallel_leader_participation
= 0.

Greetings,

Andres Freund




Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2022-03-22 Thread Nathan Bossart
On Thu, Mar 17, 2022 at 04:45:28PM -0700, Nathan Bossart wrote:
> I think this one requires some more work, and it needn't be a priority for
> v15, so I've adjusted the commitfest entry to v16 and moved it to the next
> commitfest.

Here is a new patch.  The main differences from v3 are in
heapam_visibility.c.  Specifically, instead of trying to work the infomask
checks into the visibility logic, I added a new function that does a couple
of assertions.  This function is called at the beginning of each visibility
function.

What do folks think?  The options I've considered are 1) not adding any
such checks to heapam_visibility.c, 2) only adding assertions like the
attached patch, or 3) actually using elog(ERROR, ...) when the invalid bit
patterns are detected.  AFAICT (1) is more in line with existing invalid
bit patterns (e.g., XMAX_COMMITTED + XMAX_IS_MULTI).  There are a couple of
scattered assertions, but most code paths don't check for it.  (2) adds
additional checks, but only for --enable-cassert builds.  (3) would add
checks even for non-assert builds, but there would presumably be some
performance cost involved.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 2d6b372cf61782e0fd52590b57b1c914b0ed7a4c Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 22 Mar 2022 15:35:34 -0700
Subject: [PATCH v4 1/1] disallow XMAX_COMMITTED + XMAX_LOCK_ONLY

---
 contrib/amcheck/verify_heapam.c | 19 
 src/backend/access/heap/README.tuplock  |  2 +-
 src/backend/access/heap/heapam.c| 10 +++
 src/backend/access/heap/heapam_visibility.c | 97 ++---
 src/backend/access/heap/hio.c   |  2 +
 5 files changed, 96 insertions(+), 34 deletions(-)

diff --git a/contrib/amcheck/verify_heapam.c b/contrib/amcheck/verify_heapam.c
index e5f7355dcb..f30b9c234f 100644
--- a/contrib/amcheck/verify_heapam.c
+++ b/contrib/amcheck/verify_heapam.c
@@ -665,6 +665,25 @@ check_tuple_header(HeapCheckContext *ctx)
 		 */
 	}
 
+	if ((ctx->tuphdr->t_infomask & HEAP_XMAX_COMMITTED) &&
+		(ctx->tuphdr->t_infomask & HEAP_XMAX_LOCK_ONLY))
+	{
+		report_corruption(ctx,
+		  pstrdup("locked-only should not be marked committed"));
+
+		/* As above, do not skip further checks. */
+	}
+
+	/* also check for pre-v9.3 lock-only bit pattern */
+	if ((ctx->tuphdr->t_infomask & HEAP_XMAX_COMMITTED) &&
+		HEAP_XMAX_IS_LOCKED_ONLY(ctx->tuphdr->t_infomask))
+	{
+		report_corruption(ctx,
+		  pstrdup("tuple with HEAP_XMAX_EXCL_LOCK set and HEAP_XMAX_IS_MULTI unset should not be marked committed"));
+
+		/* As above, do not skip further checks. */
+	}
+
 	if (infomask & HEAP_HASNULL)
 		expected_hoff = MAXALIGN(SizeofHeapTupleHeader + BITMAPLEN(ctx->natts));
 	else
diff --git a/src/backend/access/heap/README.tuplock b/src/backend/access/heap/README.tuplock
index 6441e8baf0..4e565e60ab 100644
--- a/src/backend/access/heap/README.tuplock
+++ b/src/backend/access/heap/README.tuplock
@@ -152,4 +152,4 @@ The following infomask bits are applicable:
   whether the XMAX is a TransactionId or a MultiXactId.
 
 We currently never set the HEAP_XMAX_COMMITTED when the HEAP_XMAX_IS_MULTI bit
-is set.
+or the HEAP_XMAX_LOCK_ONLY bit is set.
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 3746336a09..3f0b4cd754 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -5151,6 +5151,16 @@ l5:
 		MultiXactStatus status;
 		MultiXactStatus new_status;
 
+		/*
+		 * Currently we don't allow XMAX_LOCK_ONLY and XMAX_COMMITTED to both be
+		 * set in a tuple header, so cross-check.
+		 *
+		 * Note that this also checks for the special locked-only bit pattern
+		 * that may be present on servers that were pg_upgraded from pre-v9.3
+		 * versions.
+		 */
+		Assert(!HEAP_XMAX_IS_LOCKED_ONLY(old_infomask));
+
 		if (old_infomask2 & HEAP_KEYS_UPDATED)
 			status = MultiXactStatusUpdate;
 		else
diff --git a/src/backend/access/heap/heapam_visibility.c b/src/backend/access/heap/heapam_visibility.c
index ff0b8a688d..7d6fb66b0d 100644
--- a/src/backend/access/heap/heapam_visibility.c
+++ b/src/backend/access/heap/heapam_visibility.c
@@ -78,6 +78,31 @@
 #include "utils/snapmgr.h"
 
 
+/*
+ * InfomaskAssertions()
+ *
+ * Checks for invalid infomask bit patterns.
+ */
+static inline void
+InfomaskAssertions(HeapTupleHeader tuple)
+{
+	/*
+	 * XMAX_COMMITTED and XMAX_LOCK_ONLY cannot both be set in a tuple header.
+	 *
+	 * Note that this also checks for the special locked-only bit pattern that
+	 * may be present on servers that were pg_upgraded from pre-v9.3 versions.
+	 */
+	Assert(!((tuple->t_infomask & HEAP_XMAX_COMMITTED) &&
+			 HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask)));
+
+	/*
+	 * XMAX_COMMITTED and XMAX_IS_MULTI cannot both be set in a tuple header.
+	 */
+	Assert(!((tuple->t_infomask & HEAP_XMAX_COMMITTED) &&
+			 (tuple->t_infomask & HEAP_XMAX_IS_MULTI)));
+}
+
+
 /*
  * SetHintBits()
  *
@@ -113,6 +138,8 @@ static inline 

Re: MDAM techniques and Index Skip Scan patch

2022-03-22 Thread Andres Freund
Hi,

On 2022-03-22 16:55:49 -0400, Tom Lane wrote:
> 4. I find each of the above ideas to be far more attractive than
> optimizing SELECT-DISTINCT-that-matches-an-index, so I don't really
> understand why the current patchsets seem to be driven largely
> by that single use-case.

It's something causing plenty pain in production environments... Obviously
it'd be even better if the optimization also triggered in cases like
  SELECT some_indexed_col FROM blarg GROUP BY some_indexed_col
which seems to be what ORMs like to generate.


> BTW, I've had a bee in my bonnet for a long time about whether some of
> nbtree's scan setup work could be done once during planning, rather than
> over again during each indexscan start.

It does show up in simple-index-lookup heavy workloads. Not as a major thing,
but it's there. And it's just architecturally displeasing :)

Are you thinking of just moving the setup stuff in nbtree (presumably parts of
_bt_first() / _bt_preprocess_keys()) or also stuff in
ExecIndexBuildScanKeys()?

The latter does show up a bit more heavily in profiles than nbtree specific
setup, and given that it's generic executor type stuff, seems even more
amenable to being moved to plan time.

Greetings,

Andres Freund




Re: [PoC] Federated Authn/z with OAUTHBEARER

2022-03-22 Thread Jacob Champion
On Tue, 2022-03-22 at 14:48 -0700, samay sharma wrote:
> Thank you for porting this on top of the pluggable auth methods API.
> I've addressed the feedback around other backend changes in my latest
> patch, but the client side changes still remain. I had a few
> questions to understand them better.
> 
> (a) What specifically do the client side changes in the patch implement?

Hi Samay,

The client-side changes are an implementation of the OAuth 2.0 Device
Authorization Grant [1] in libpq. The majority of the OAuth logic is
handled by the third-party iddawc library.

The server tells the client what OIDC provider to contact, and then
libpq prompts you to log into that provider on your
smartphone/browser/etc. using a one-time code. After you give libpq
permission to act on your behalf, the Bearer token gets sent to libpq
via a direct connection, and libpq forwards it to the server so that
the server can determine whether you're allowed in.

> (b) Are the changes you made on the client side specific to OAUTH or
> are they about making SASL more generic?

The original patchset included changes to make SASL more generic. Many
of those changes have since been merged, and the remaining code is
mostly OAuth-specific, but there are still improvements to be made.
(And there's some JSON crud to sift through in the first couple of
patches. I'm still mad that the OAUTHBEARER spec requires clients to
parse JSON in the first place.)

> As an additional question,
> if someone wanted to implement something similar on top of your
> patch, would they still have to make client side changes?

Any new SASL mechanisms require changes to libpq at this point. You
need to implement a new pg_sasl_mech, modify pg_SASL_init() to select
the mechanism correctly, and add whatever connection string options you
need, along with the associated state in pg_conn. Patch 0004 has all
the client-side magic for OAUTHBEARER.
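
For reference, a new mechanism roughly follows the shape of libpq's
internal SASL callback struct (pg_fe_sasl_mech); the OAuth names below
are placeholders for illustration, not the actual patch contents:

    static void *oauth_init(PGconn *conn, const char *password,
                            const char *sasl_mechanism);
    static void oauth_exchange(void *state, char *input, int inputlen,
                               char **output, int *outputlen,
                               bool *done, bool *success);
    static bool oauth_channel_bound(void *state);
    static void oauth_free(void *state);

    /* selected alongside the SCRAM mechanism in pg_SASL_init() */
    static const pg_fe_sasl_mech pg_oauth_mech = {
        oauth_init,
        oauth_exchange,
        oauth_channel_bound,
        oauth_free,
    };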

--Jacob

[1] https://datatracker.ietf.org/doc/html/rfc8628


Re: New Object Access Type hooks

2022-03-22 Thread Tom Lane
Mark Dilger  writes:
>> On Mar 22, 2022, at 3:20 PM, Andres Freund  wrote:
>> Seems like it might actually be good to test that object access hooks work
>> well in a parallel worker. How about going the other way and explicitly 
>> setting
>> force_parallel_mode = disabled for parts of the test and to enabled for
>> others?

> Wouldn't we get differing numbers of NOTICE messages depending on how
> many parallel workers there are?  Or would you propose setting the
> number of workers to a small, fixed value?

The value would have to be "1", else you are going to have issues
with notices from different workers being interleaved differently
from run to run.  You might have that anyway, due to interleaving
of leader and worker messages.

regards, tom lane




Re: Window Function "Run Conditions"

2022-03-22 Thread David Rowley
On Thu, 17 Mar 2022 at 17:04, Corey Huinker  wrote:
> It seems like this effort would aid in implementing what some other databases 
> implement via the QUALIFY clause, which is to window functions what HAVING is 
> to aggregate functions.
> example: 
> https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#qualify_clause

Isn't that just syntactic sugar?  You could get the same by adding a
subquery with a WHERE clause that filters rows after the
window clause has been evaluated.

David




Re: New Object Access Type hooks

2022-03-22 Thread Mark Dilger



> On Mar 22, 2022, at 3:20 PM, Andres Freund  wrote:
> 
> Seems like it might actually be good to test that object access hooks work
> well in a parallel worker. How about going the other way and explicitly 
> setting
> force_parallel_mode = disabled for parts of the test and to enabled for
> others?

Wouldn't we get differing numbers of NOTICE messages depending on how many 
parallel workers there are?  Or would you propose setting the number of workers 
to a small, fixed value?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Window Function "Run Conditions"

2022-03-22 Thread David Rowley
On Wed, 16 Mar 2022 at 10:24, Greg Stark  wrote:
>
> This looks like an awesome addition.

Thanks

> I have one technical questions...
>
> Is it possible to actually transform the row_number case into a LIMIT
> clause or make the planner support for this case equivalent to it (in
> which case we can replace the LIMIT clause planning to transform into
> a window function)?

Currently, I have only coded it to support monotonically increasing
and decreasing functions. Putting a <=  type condition on a
row_number() function with no PARTITION BY clause I think is logically
the same as a LIMIT clause, but that's not the case for rank() and
dense_rank().  There may be multiple peer rows with the same rank in
those cases. We'd have no way to know what the LIMIT should be set to.
I don't really want to just do this for row_number().

> The reason I ask is because the Limit plan node is actually quite a
> bit more optimized than the general window function plan node. It
> calculates cost estimates based on the limit and can support Top-N
> sort nodes.

This is true.  There's perhaps no reason why an additional property
could not be added to allow the prosupport function to optionally set
*exactly* the maximum number of rows that could match the condition.
e.g. for select * from (select *,row_number() over (order by c) rn
from ..) w where rn <= 10; that could be set to 10, and if we used
rank() instead of row_number(), it could just be left unset.
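
For concreteness, a prosupport handler in the style of the patch might
look like this; the monotonicity request follows the patch's naming,
while the max-rows property is purely hypothetical:

    Datum
    window_row_number_support(PG_FUNCTION_ARGS)
    {
        Node       *rawreq = (Node *) PG_GETARG_POINTER(0);

        if (IsA(rawreq, SupportRequestWFuncMonotonic))
        {
            SupportRequestWFuncMonotonic *req =
                (SupportRequestWFuncMonotonic *) rawreq;

            /* row_number() never decreases within a partition */
            req->monotonic = MONOTONICFUNC_INCREASING;

            /*
             * Hypothetical extra property: for a "rn <= 10" qual we know
             * at most 10 rows can match, so a future field could report
             * that exactly, e.g. req->max_matching_rows = 10;
             */
            PG_RETURN_POINTER(req);
        }

        PG_RETURN_POINTER(NULL);
    }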

I think this is probably worth thinking about at some future date. I
don't really want to make it part of this effort. I also don't think
I'm doing anything here that would need to be undone to make that
work.

David




Re: SQL/JSON: functions

2022-03-22 Thread Tom Lane
I wrote:
> That patch is 0-for-three on my buildfarm animals.

... the reason being that they use -Werror, and this patch spews
a bunch of warnings.  This is not acceptable, especially not in
the middle of a CF when other people are trying to get work done.
Please revert.

regards, tom lane




Re: Window Function "Run Conditions"

2022-03-22 Thread David Rowley
On Thu, 26 Aug 2021 at 14:54, Andy Fan  wrote:
>
> On Thu, Aug 19, 2021 at 2:35 PM David Rowley  wrote:
> >
> > On Thu, 19 Aug 2021 at 00:20, Andy Fan  wrote:
> > > In the current master, the result is:
> > >
> > >  empno | salary | c | dr
> > > -------+--------+---+----
> > >      8 |   6000 | 4 |  1
> >
> > > In the patched version, the result is:
> > >
> > >  empno | salary | c | dr
> > > -------+--------+---+----
> > >      8 |   6000 | 1 |  1
> >
> > Thanks for taking it for a spin.
> >
> > That's a bit unfortunate.  I don't immediately see how to fix it other
> > than to restrict the optimisation to only apply when there's a single
> > WindowClause. It might be possible to relax it further and only apply
> > if it's the final window clause to be evaluated, but in those cases,
> > the savings are likely to be much less anyway as some previous
> > WindowAgg will have exhausted all rows from its subplan.
>
> I am trying to hack the select_active_windows function to make
> sure the WindowClause with Run Condition clause to be the last one
> to evaluate (we also need to consider more than 1 window func has
> run condition), at that time, the run condition clause is ready already.
>
> However there are two troubles in this direction: a).  This may conflict
> with "the windows that need the same sorting are adjacent in the list."
> b). "when two or more windows are order-equivalent then all peer rows
> must be presented in the same order in all of them. .. (See General Rule 4 of
>  in SQL2008 - SQL2016.)"
>
> In summary, I am not sure if it is correct to change the execution Order
> of WindowAgg freely.

Thanks for looking at that.

My current thoughts are that it just feels a little too risky to
adjust the comparison function that sorts the window clauses to pay
attention to the run-condition.

We would need to ensure that there's just a single window function
with a run condition as it wouldn't be valid for there to be multiple.
It would be easy enough to ensure we only push quals into just 1
window clause, but that and meddling with the evaluation order has
trade-offs.  To do that properly, we'd likely want to consider the
costs when deciding which window clause would benefit from having
quals pushed the most.  Plus, if we start messing with the evaluation
order then we'd likely really want some sort of costing to check if
pushing a qual and adjusting the evaluation order is actually cheaper
than not pushing the qual and keeping the clauses in the order that
requires the minimum number of sorts.   The planner is not really
geared up for costing things like that properly, that's why we just
assume the order with the least sorts is best. In reality that's often
not going to be true as an index may exist and we might want to
evaluate a clause first if we could get rid of a sort and index scan
instead. Sorting the window clauses based on their SortGroupClause is
just the best we can do for now at that stage in planning.

I think it's safer to just disable the optimisation when there are
multiple window clauses.  Multiple matching clauses are merged
already, so it's perfectly valid to have multiple window functions,
it's just they must share the same window clause.  I don't think
that's terrible as with the major use case that I have in mind for
this, the window function is only added to limit the number of rows.
In most cases I can imagine, there'd be no reason to have an
additional window function with different frame options.

I've attached an updated patch.
From a130b09b7acf5e98785e4560b98f4aca2fce4080 Mon Sep 17 00:00:00 2001
From: David Rowley 
Date: Sat, 8 May 2021 23:43:25 +1200
Subject: [PATCH v2] Teach planner and executor about monotonic window funcs

Window functions such as row_number() always return a value higher than
the one previously in any given partition.  If a query were to only
request the first few row numbers, then traditionally we would continue
evaluating the WindowAgg node until all tuples are exhausted.  However, if
someone, say, only wanted all row numbers <= 10, then we could just stop
once we get number 11.

Here we implement means to do just that.  This is done by way of adding a
prosupport function to various of the built-in window functions and adding
supporting code to allow the support function to inform the planner if
the function is monotonically increasing, monotonically decreasing, both
or neither.  The planner is then able to make use of that information and
possibly allow the executor to short-circuit execution by way of adding a
"run condition" to the WindowAgg to instruct it to run while this
condition is true and stop when it becomes false.

This optimization is only possible when the subquery contains only a
single window clause.  The problem with having multiple clauses is that at
the time when we're looking for run conditions, we don't yet know the
order that the window clauses will be evaluated.  We cannot add a run
condition to a window clause 

Re: SQL/JSON: functions

2022-03-22 Thread Tom Lane
Andrew Dunstan  writes:
> I have committed the first of these. That will break the cfbot for this
> and the json_table patches. The remainder should be committed in the
> following days.

That patch is 0-for-three on my buildfarm animals.

regards, tom lane




Re: New Object Access Type hooks

2022-03-22 Thread Andres Freund
Hi,

On 2022-03-22 13:47:27 -0400, Andrew Dunstan wrote:
> OK, I have pushed that.

Seems like it might actually be good to test that object access hooks work
well in a parallel worker. How about going the other way and explicitly setting
force_parallel_mode = disabled for parts of the test and to enabled for
others?

- Andres




Re: New Object Access Type hooks

2022-03-22 Thread Tom Lane
Andrew Dunstan  writes:
> Fixed

Now that we got past the hard failures, we can see that the test
falls over with (some?) non-default encodings, as for instance
here:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2022-03-22%2020%3A23%3A13

I can replicate that by running the test under LANG=en_US.iso885915.
What I think is happening is:

(1) Rather unwisely, the relevant InvokeNamespaceSearchHook calls
appear in recomputeNamespacePath.  That means that their timing
depends heavily on accidents of caching.

(2) If we decide that we need an encoding conversion to talk to
the client, there'll be a lookup for the conversion function
early during session startup.  That will cause the namespace
search path to get computed then, before the test module has been
loaded and certainly before the audit GUC has been turned on.

(3) At the point where the test expects some audit notices
to come out, nothing happens because the search path is
already validated.

I'm inclined to think that (1) is a seriously bad idea,
not only because of this instability, but because

(a) the namespace cache logic is unlikely to cause the search-path
cache to get invalidated when something happens that might cause an
OAT hook to wish to change its decision, and

(b) this placement means that the hook is invoked during cache loading
operations that are likely to be super-sensitive to any additional
catalog accesses a hook might wish to do.  (I await the results of the
CLOBBER_CACHE_ALWAYS animals with trepidation.)

Now, if our attitude to the OAT hooks is that we are going to
sprinkle some at random and whether they are useful is someone
else's problem, then maybe these are not interesting concerns.

regards, tom lane




Re: SQL/JSON: functions

2022-03-22 Thread Andrew Dunstan


On 3/5/22 09:39, Andrew Dunstan wrote:
>
> here's a new set of patches, omitting the GUC patch and with the
> beginnings of some message cleanup - there's more work to do there.
>
>
>
> This patchset restores the RETURNING clause for JSON() and JSON_SCALAR()
> but without the GUC
>
>


I have committed the first of these. That will break the cfbot for this
and the json_table patches. The remainder should be committed in the
following days.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-22 Thread Imseih (AWS), Sami
>Can the leader pass a callback that checks PVIndStats to ambulkdelete
>and amvacuumcleanup callbacks? I think that in the passed callback, the
>leader checks if the number of processed indexes and updates its
>progress information if the current progress needs to be updated.

Thanks for the suggestion.

I looked at this option a bit today and found that passing the callback 
will also require signature changes to the ambulkdelete and 
amvacuumcleanup routines. 

This will also require us to check after x pages have been 
scanned inside vacuumscan and vacuumcleanup. After x pages
the callback can then update the leader's progress.
I am not sure if adding additional complexity to the scan/cleanup path
is justified for what this patch is attempting to do.
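
For concreteness, the shape being discussed might look roughly like
this; every name here is illustrative rather than taken from the patch:

    /* hypothetical callback threaded through ambulkdelete/amvacuumcleanup */
    typedef void (*VacProgressCallback) (void *state);

    static void
    leader_update_progress(void *state)
    {
        PVIndStats *indstats = (PVIndStats *) state;

        /* refresh the leader's pg_stat_progress_vacuum row; the progress
         * constant and struct field are assumed names */
        pgstat_progress_update_param(PROGRESS_VACUUM_INDEXES_PROCESSED,
                                     indstats->nindexes_completed);
    }

The index AM would invoke the callback every x pages, which is exactly
where the lag described below comes from.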

There will also be a lag in the leader updating the progress, as it
must scan x pages before updating. Obviously, the more
pages there are to scan, the longer the lag will be.

Would like to hear your thoughts on the above.

Regards,

Sami Imseih
Amazon Web Services.







Re: [PoC] Federated Authn/z with OAUTHBEARER

2022-03-22 Thread samay sharma
Hi Jacob,

Thank you for porting this on top of the pluggable auth methods API. I've
addressed the feedback around other backend changes in my latest patch, but
the client side changes still remain. I had a few questions to understand
them better.

(a) What specifically do the client side changes in the patch implement?
(b) Are the changes you made on the client side specific to OAUTH or are
they about making SASL more generic? As an additional question, if someone
wanted to implement something similar on top of your patch, would they
still have to make client side changes?

Regards,
Samay

On Fri, Mar 4, 2022 at 11:13 AM Jacob Champion  wrote:

> Hi all,
>
> v3 rebases this patchset over the top of Samay's pluggable auth
> provider API [1], included here as patches 0001-3. The final patch in
> the set ports the server implementation from a core feature to a
> contrib module; to switch between the two approaches, simply leave out
> that final patch.
>
> There are still some backend changes that must be made to get this
> working, as pointed out in 0009, and obviously libpq support still
> requires code changes.
>
> --Jacob
>
> [1]
> https://www.postgresql.org/message-id/flat/CAJxrbyxTRn5P8J-p%2BwHLwFahK5y56PhK28VOb55jqMO05Y-DJw%40mail.gmail.com
>


Re: Optimize external TOAST storage

2022-03-22 Thread Nathan Bossart
On Tue, Mar 22, 2022 at 04:34:05PM -0400, Robert Haas wrote:
> We seem to have a shortage of "others" showing up with opinions on
> this topic, but I guess I'm not very confident about the general
> utility of such a setting. Just to be clear, I'm also not very
> confident about the usefulness of the existing settings for
> controlling TOAST. Why is it useful default behavior to try to get
> rows down to 2kB by default, rather than 1.8kB or 3kB? Even more, why
> don't we try to compress primarily based on the length of individual
> attributes and then compress further only if the resulting tuple
> doesn't fit into a page at all? There doesn't seem to be anything
> magic about fitting tuples into a quarter-page, yet the code acts as
> though that's the primary goal - and then, when that didn't turn out
> to work well in all cases, we added a user-settable parameter
> (toast_tuple_target) to let you say you really want tuples in table X
> to fit into a third of a page or a fifth of a page instead of a
> quarter. And there's some justification for that: a proposal to
> fundamentally change the algorithm would likely have gotten bogged
> down for years, and the parameter no doubt lets you solve some
> problems. Yet if the whole algorithm is wrong, and I think maybe it
> is, then being able to change the constants is not really getting us
> where we need to be.
> 
> Then, too, I'm not very confident about the usefulness of EXTENDED,
> EXTERNAL, and MAIN. I think it's useful to be able to categorically
> disable compression (as EXTERNAL does), but you can't categorically
> disable out-of-line storage because the value can be bigger than the
> page, so MAIN vs. EXTENDED is just changing the threshold for the use
> of out-of-line storage. However, it does so in a way that IMHO is not
> particularly intuitive, which goes back to my earlier point about the
> algorithm seeming wacky, and it's in any case unclear why we should
> offer exactly two possible thresholds and not any of the intermediate
> values.

I agree with all of this.  Adding configurability for each constant might
help folks in the short term, but using these knobs probably requires quite
a bit of expertise in Postgres internals as well as a good understanding of
your data.  I think we ought to revisit TOAST configurability from a user
perspective.  IOW what can be chosen automatically, and how do we enable
users to effectively configure the settings that cannot be chosen
automatically?  IMO this is a worthwhile conversation to have as long as it
doesn't stray too far into the "let's rewrite TOAST" realm.  I think there
is always going to be some level of complexity with stuff like TOAST, but
there are probably all sorts of ways to simplify/improve it also.

> Maybe the conclusion here is that more thought is needed before
> changing anything in this area

You've certainly got me thinking more about this.  If the scope of this
work is going to expand, a few months before the first v16 commitfest is
probably the right time for it.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: SQL/JSON: JSON_TABLE

2022-03-22 Thread Andrew Dunstan


On 3/22/22 11:27, Andrew Dunstan wrote:
> On 3/22/22 10:53, Alvaro Herrera wrote:
>> On 2022-Mar-22, Andrew Dunstan wrote:
>>
>>> I'm planning on pushing the functions patch set this week and json-table
>>> next week.
>> I think it'd be a good idea to push the patches one by one and let a day
>> between each.  That would make it easier to chase buildfarm issues
>> individually, and make sure they are fixed before the next step.
>> Each patch in each series is already big enough.
>
>
> OK, can do it that way. First one will be later today then.
>
>

OK, pushed the first of the functions patches. That means the cfbot will
break on these two CF items, but it's way too much trouble to have to
remake the patch sets every day, so we can just live without the cfbot
on these for a bit. I will of course test before committing and fix any
bitrot.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: MDAM techniques and Index Skip Scan patch

2022-03-22 Thread Tom Lane
Peter Geoghegan  writes:
> Like many difficult patches, the skip scan patch is not so much
> troubled by problems with the implementation as it is troubled by
> *ambiguity* about the design. Particularly concerning how skip scan
> meshes with existing designs, as well as future designs --
> particularly designs for other MDAM techniques. I've started this
> thread to have a big picture conversation about how to think about
> these things.

Peter asked me off-list to spend some time thinking about the overall
direction we ought to be pursuing here.  I have done that, and here
are a few modest suggestions.

1. Usually I'm in favor of doing this sort of thing in an index AM
agnostic way, but here I don't see much point.  All of the ideas at
stake rely fundamentally on having a lexicographically-ordered multi
column index; but we don't have any of those except btree, nor do
I think we're likely to get any soon.  This motivates the general
tenor of my remarks below, which is "do it in access/nbtree/ not in
the planner".

2. The MDAM paper Peter cited is really interesting.  You can see
fragments of those ideas in our existing btree code, particularly in
the scan setup stuff that detects redundant or contradictory keys and
determines a scan start strategy.  The special handling we implemented
awhile ago for ScalarArrayOp index quals is also a subset of what they
were talking about.  It seems to me that if we wanted to implement more
of those ideas, the relevant work should almost all be done in nbtree
proper.  The planner would need only minor adjustments: btcostestimate
would have to be fixed to understand the improvements, and there are
some rules in indxpath.c that prevent us from passing "too complicated"
sets of indexquals to the AM, which would need to be relaxed or removed
altogether.

3. "Loose" indexscan (i.e., sometimes re-descend from the tree root
to find the next index entry) is again something that seems like it's
mainly nbtree's internal problem.  Loose scan is interesting if we
have index quals for columns that are after the first column that lacks
an equality qual, otherwise not.  I've worried in the past that we'd
need planner/statistical support to figure out whether a loose scan
is likely to be useful compared to just plowing ahead in the index.
However, that seems to be rendered moot by the idea used in the current
patchsets, ie scan till we find that we'll have to step off the current
page, and re-descend at that point.  (When and if we find that that
heuristic is inadequate, we could work on passing some statistical data
forward.  But we don't need any in the v1 patch.)  Again, we need some
work in btcostestimate to understand how the index scan cost will be
affected, but I still don't see any pressing need for major planner
changes or plan tree contents changes.

4. I find each of the above ideas to be far more attractive than
optimizing SELECT-DISTINCT-that-matches-an-index, so I don't really
understand why the current patchsets seem to be driven largely
by that single use-case.  I wouldn't even bother with that for the
initial patch.  When we do get around to it, it still doesn't need
major planner support, I think --- again fixing the cost estimation
is the bulk of the work.  Munro's original 2014 patch showed that we
don't really need all that much to get the planner to build such a
plan; the problem is to convince it that that plan will be cheap.

In short: I would throw out just about all the planner infrastructure
that's been proposed so far.  It looks bulky, expensive, and
drastically undercommented, and I don't think it's buying us anything
of commensurate value.  The part of the planner that actually needs
serious thought is btcostestimate, which has been woefully neglected in
both of the current patchsets.

BTW, I've had a bee in my bonnet for a long time about whether some of
nbtree's scan setup work could be done once during planning, rather than
over again during each indexscan start.  This issue might become more
pressing if the work becomes significantly more complicated/expensive,
which these ideas might cause.  But it's a refinement that could be
left for later --- and in any case, the responsibility would still
fundamentally be nbtree's.  I don't think the planner would do more
than call some AM routine that could add decoration to an IndexScan
plan node.

Now ... where did I put my flameproof vest?

regards, tom lane




Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2022-03-22 Thread Robert Haas
On Mon, Mar 21, 2022 at 2:23 PM Robert Haas  wrote:
> On Mon, Mar 21, 2022 at 11:21 AM Dilip Kumar  wrote:
> > I tried to debug the case but I realized that somehow
> > CHECK_FOR_INTERRUPTS() is not calling the
> > AcceptInvalidationMessages() and I could not find the same while
> > looking into the code as well.   While debugging I noticed that
> > AcceptInvalidationMessages() is called multiple times but that is only
> > through LockRelationId() but while locking the relation we had already
> > closed the previous smgr because at a time we keep only one smgr open.
> > And that's the reason it is not hitting the issue which we think it
> > could. Is there any condition under which it will call
> > AcceptInvalidationMessages() through CHECK_FOR_INTERRUPTS() ? because
> > I could not see while debugging as well as in code.
>
> Yeah, I think the reason you can't find it is that it's not there. I
> was confused in what I wrote earlier. I think we only process sinval
> catchups when we're idle, not at every CHECK_FOR_INTERRUPTS(). And I
> think the reason for that is precisely that it would be hard to write
> correct code otherwise, since invalidations might then get processed
> in a lot more places. So ... I guess all we really need to do here is
> avoid assuming that the results of smgropen() are valid across any
> code that might acquire relation locks. Which I think the code already
> does.

So I talked to Andres and Thomas about this and they told me that I
was right to worry about this problem. Over on the thread about "wrong
fds used for refilenodes after pg_upgrade relfilenode changes
Reply-To:" there is a plan to make use ProcSignalBarrier to make smgr
objects disappear, and ProcSignalBarrier can be processed at any
CHECK_FOR_INTERRUPTS(), so then we'd have a problem here. Commit
f10f0ae420ee62400876ab34dca2c09c20dcd030 established a policy that you
should always re-fetch the smgr object instead of reusing one you've
already got, and even before that it was known to be unsafe to keep
them around for any period of time, because anything that opened a
relation, including a syscache lookup, could potentially accept
invalidations. So most of our code is already hardened against the
possibility of smgr objects disappearing. I have a feeling there may
be some that isn't, but it would be good if this patch didn't
introduce more such code at the same time that patch is trying to
introduce more ways to get rid of smgr objects. It was suggested to me
that what this patch ought to be doing is calling
CreateFakeRelcacheEntry() and then using RelationGetSmgr(fakerel)
every time we need the SmgrRelation, without ever keeping it around
for any amount of code. That way, if the smgr relation gets closed out
from under us at a CHECK_FOR_INTERRUPTS(), we'll just recreate it at
the next RelationGetSmgr() call.
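
A minimal sketch of the suggested pattern, assuming an invented
block-copy loop (CreateFakeRelcacheEntry() and RelationGetSmgr() are
the real APIs; the surrounding function is illustrative):

    static void
    copy_fork_sketch(RelFileNode dst, BlockNumber nblocks)
    {
        Relation    fakerel = CreateFakeRelcacheEntry(dst);
        PGAlignedBlock buf;

        for (BlockNumber blkno = 0; blkno < nblocks; blkno++)
        {
            /* ... fill buf.data with the page to write ... */

            /* re-fetch the SMgrRelation on every use; never cache it */
            smgrextend(RelationGetSmgr(fakerel), MAIN_FORKNUM, blkno,
                       buf.data, true);

            /* smgr handles may be closed here; the next iteration's
             * RelationGetSmgr() call simply reopens them */
            CHECK_FOR_INTERRUPTS();
        }

        FreeFakeRelcacheEntry(fakerel);
    }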

Andres also noted that he thinks the patch performs redundant cleanup,
because of the fact that it uses RelationCreateStorage. That will
arrange to remove files on abort, but createdb() also has its own
mechanism for that. It doesn't seem like a thing to do twice in two
different ways.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PATCH] Accept IP addresses in server certificate SANs

2022-03-22 Thread Jacob Champion
On Tue, 2022-03-22 at 13:32 +0900, Kyotaro Horiguchi wrote:
> At Fri, 18 Mar 2022 16:38:57 +0900 (JST), Kyotaro Horiguchi 
>  wrote in
> > 
> > fe-secure-common.c doesn't need netinet/in.h.
> > 
> > 
> > +++ b/src/include/utils/inet.h
> > ..
> > +#include "common/inet-common.h"
> > 
> > I'm not sure about the project policy on #include practice, but I
> > think it is the common practice not to include headers that are not
> > required by the file itself.  In this case, fe-secure-common.h itself
> > doesn't need the include.  Instead, fe-secure-openssl.c and
> > fe-secure-common.c needs the include.

Thanks, looks like I had some old header dependencies left over from
several versions ago. Fixed in v9.

> I noticed that this doesn't contain doc changes.
> 
> https://www.postgresql.org/docs/current/libpq-ssl.html
> 
> > In verify-full mode, the host name is matched against the
> > certificate's Subject Alternative Name attribute(s), or against the
> > Common Name attribute if no Subject Alternative Name of type dNSName
> > is present. If the certificate's name attribute starts with an
> > asterisk (*), the asterisk will be treated as a wildcard, which will
> > match all characters except a dot (.). This means the certificate will
> > not match subdomains. If the connection is made using an IP address
> > instead of a host name, the IP address will be matched (without doing
> > any DNS lookups).
> 
> This refers to dNSName, so we should revise this so that it describes
> the new behavior.

v9 contains the bare minimum but I don't think it's quite enough. How
much of the behavior (and edge cases) do you think we should detail
here? All of it?

Thanks,
--Jacob
commit 6278176f7f417f0afecceb71807eb480e20769ef
Author: Jacob Champion 
Date:   Tue Mar 22 13:21:48 2022 -0700

squash! libpq: add OAUTHBEARER SASL mechanism

feedback:
- clean up header deps
- add a mention in the docs

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 3998b1781b..bdbfed1aa5 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -8343,7 +8343,8 @@ 
ldap://ldap.acme.com/cn=dbserver,cn=hosts?pgconnectinfo?base?(objectclass=*)
   
In verify-full mode, the host name is matched against the
certificate's Subject Alternative Name attribute(s), or against the
-   Common Name attribute if no Subject Alternative Name of type 
dNSName is
+   Common Name attribute if no Subject Alternative Name of type 
dNSName
+   (or iPAddress, if the connection uses an IP address) is
present.  If the certificate's name attribute starts with an asterisk
(*), the asterisk will be treated as
a wildcard, which will match all characters except a 
dot
diff --git a/src/interfaces/libpq/fe-secure-common.c 
b/src/interfaces/libpq/fe-secure-common.c
index 4d78715756..9c408df369 100644
--- a/src/interfaces/libpq/fe-secure-common.c
+++ b/src/interfaces/libpq/fe-secure-common.c
@@ -19,9 +19,9 @@
 
 #include "postgres_fe.h"
 
-#include 
 #include 
 
+#include "common/inet-common.h"
 #include "fe-secure-common.h"
 
 #include "libpq-int.h"
diff --git a/src/interfaces/libpq/fe-secure-common.h 
b/src/interfaces/libpq/fe-secure-common.h
index 20ff9ba5db..d18db7138c 100644
--- a/src/interfaces/libpq/fe-secure-common.h
+++ b/src/interfaces/libpq/fe-secure-common.h
@@ -16,7 +16,6 @@
 #ifndef FE_SECURE_COMMON_H
 #define FE_SECURE_COMMON_H
 
-#include "common/inet-common.h"
 #include "libpq-fe.h"
 
 extern int pq_verify_peer_name_matches_certificate_name(PGconn *conn,
diff --git a/src/interfaces/libpq/fe-secure-openssl.c 
b/src/interfaces/libpq/fe-secure-openssl.c
index 06b166b7fd..b70bd2eb19 100644
--- a/src/interfaces/libpq/fe-secure-openssl.c
+++ b/src/interfaces/libpq/fe-secure-openssl.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 
+#include "common/inet-common.h"
 #include "libpq-fe.h"
 #include "fe-auth.h"
 #include "fe-secure-common.h"
From 84a107da33b7d901cb318f8c2fd140644fbc5100 Mon Sep 17 00:00:00 2001
From: Jacob Champion 
Date: Wed, 24 Nov 2021 14:46:11 -0800
Subject: [PATCH v9 1/3] Move inet_net_pton() to src/port

This will be helpful for IP address verification in libpq.
---
 src/backend/utils/adt/Makefile  |  1 -
 src/include/port.h  |  3 +++
 src/include/utils/builtins.h|  4 
 src/port/Makefile   |  1 +
 src/{backend/utils/adt => port}/inet_net_pton.c | 16 +++-
 src/tools/msvc/Mkvcbuild.pm |  2 +-
 6 files changed, 20 insertions(+), 7 deletions(-)
 rename src/{backend/utils/adt => port}/inet_net_pton.c (96%)
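
For illustration, matching a connection host given as an IP address against a
raw iPAddress SAN boils down to parsing the host into binary form and
comparing octets. A minimal standalone sketch, assuming plain inet_pton()
rather than the pg_inet_net_pton() the patch actually moves into src/port
(names here are illustrative, not libpq's code):

#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>
#include <arpa/inet.h>

/*
 * Illustrative only (not libpq's code): match a host given as an IP
 * address against one raw iPAddress SAN.  X.509 stores these SANs as
 * 4 (IPv4) or 16 (IPv6) network-order bytes, so after parsing the host
 * the comparison is a plain octet compare, with no DNS lookups.
 */
static bool
ip_matches_san(const char *host, const unsigned char *san, size_t san_len)
{
    unsigned char addr[16];

    if (san_len == 4 && inet_pton(AF_INET, host, addr) == 1)
        return memcmp(addr, san, 4) == 0;
    if (san_len == 16 && inet_pton(AF_INET6, host, addr) == 1)
        return memcmp(addr, san, 16) == 0;

    return false;               /* family mismatch or unparseable host */
}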


Re: New Object Access Type hooks

2022-03-22 Thread Andrew Dunstan


On 3/22/22 14:01, Tom Lane wrote:
> Andrew Dunstan  writes:
>> OK, I have pushed that.
> It seems like you could remove the NO_INSTALLCHECK restriction
> too.  You already removed the comment defending it, and it
> seems to work fine as an installcheck now if I remove that
> locally.
>
> Other nitpicks:
>
> * the IsParallelWorker test could use a comment
>
> * I notice a typo "permisisons" in test_oat_hooks.sql
>
>   



Fixed


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: MDAM techniques and Index Skip Scan patch

2022-03-22 Thread Thomas Munro
On Tue, Mar 22, 2022 at 2:34 PM Andres Freund  wrote:
> IMO it's pretty clear that having "duelling" patches below one CF entry is a
> bad idea. I think they should be split, with inactive approaches marked as
> returned with feeback or whatnot.

I have the impression that this thread is getting some value from
having a CF entry, as a multi-person collaboration where people are
trading ideas and also making progress that no one wants to mark as
returned, but it's vexing for people managing the CF because it's not
really proposed for 15.  Perhaps what we lack is a new status, "Work
In Progress" or something?




Re: Optimize external TOAST storage

2022-03-22 Thread Robert Haas
On Thu, Mar 17, 2022 at 4:05 PM Nathan Bossart  wrote:
> On Thu, Mar 17, 2022 at 01:04:17PM -0400, Robert Haas wrote:
> > Right, so perhaps the ultimate thing here would be a more fine-grained
> > knob than SET STORAGE EXTERNAL -- something that allows you to specify
> > that you want to compress only when it really helps. While some people
> > might find that useful, I think the current patch is less ambitious,
> > and I think that's OK. It just wants to save something in the cases
> > where it's basically free. Unfortunately we've learned that it's never
> > *entirely* free because making the last TOAST chunk longer can always
> > cost you something, even if it gets longer by only 1 byte. But for
> > larger values it's hard for that to be significant.
>
> I guess I think we should be slightly more ambitious.  One idea could be to
> create a default_toast_compression_ratio GUC with a default of 0.95.  This
> means that, by default, a compressed attribute must be 0.95x or less of the
> size of the uncompressed attribute to be stored compressed.  Like
> default_toast_compression, this could also be overridden at the column
> level with something like
>
> ALTER TABLE mytbl ALTER mycol SET COMPRESSION lz4 RATIO 0.9;
>
> If the current "basically free" patch is intended for v15, then maybe all
> this extra configurability stuff could wait for a bit, especially if we can
> demonstrate a decent read performance boost with a more conservative
> setting.  However, I don't see anything terribly complicated about the
> proposed configurability changes (assuming my proposal makes some amount of
> sense to you and others).

We seem to have a shortage of "others" showing up with opinions on
this topic, but I guess I'm not very confident about the general
utility of such a setting. Just to be clear, I'm also not very
confident about the usefulness of the existing settings for
controlling TOAST. Why is it useful default behavior to try to get
rows down to 2kB by default, rather than 1.8kB or 3kB? Even more, why
don't we try to compress primarily based on the length of individual
attributes and then compress further only if the resulting tuple
doesn't fit into a page at all? There doesn't seem to be anything
magic about fitting tuples into a quarter-page, yet the code acts as
though that's the primary goal - and then, when that didn't turn out
to work well in all cases, we added a user-settable parameter
(toast_tuple_target) to let you say you really want tuples in table X
to fit into a third of a page or a fifth of a page instead of a
quarter. And there's some justification for that: a proposal to
fundamentally change the algorithm would likely have gotten bogged
down for years, and the parameter no doubt lets you solve some
problems. Yet if the whole algorithm is wrong, and I think maybe it
is, then being able to change the constants is not really getting us
where we need to be.

Then, too, I'm not very confident about the usefulness of EXTENDED,
EXTERNAL, and MAIN. I think it's useful to be able to categorically
disable compression (as EXTERNAL does), but you can't categorically
disable out-of-line storage because the value can be bigger than the
page, so MAIN vs. EXTENDED is just changing the threshold for the use
of out-of-line storage. However, it does so in a way that IMHO is not
particularly intuitive, which goes back to my earlier point about the
algorithm seeming wacky, and it's in any case unclear why we should
offer exactly two possible thresholds and not any of the intermediate
values.

I think I'm prepared to concede that the setting you are proposing
here is easier to understand than either of those. Saying "store this
value compressed only if you can thereby reduce the size by at least
10%" is an understandable directive which I think will make sense even
to people who know very little about PostgreSQL or databases
generally, as long as they have some vague idea how compression works.
However, I guess I am not confident that the setting would get much
use, or that it is the best way to measure what we care about. For
instance, if my value is 19kB in length, saving 10% isn't very
impactful, because it's still the same number of chunks, and the only
benefit I gain is that the last chunk is smaller, which is probably
not very impactful because I'm still going to have 8 full chunks. I
will gain more if the value is bigger or smaller. Going up, if I
increase the size to 19MB or 190MB, I'm really saving a very
significant amount of cache space and, potentially, I/O. If I was
happy with a 10% savings at 19kB, I will likely be happy with a
considerably smaller economy here. Going down, if I decrease the size
to 1.9kB, I'm not storing any full-sized chunks any more, so reducing
the size of the one partial chunk that I'm storing seems like it could
have a real impact, provided of course that other toasted values in
the same table are similarly small. If the whole TOAST table is full
of just partial chunks, 
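
For what it's worth, the ratio test Nathan proposes upthread is tiny in code
terms. A minimal sketch, assuming a hypothetical default_toast_compression_ratio
GUC; neither the helper nor the GUC exists today:

/*
 * Hypothetical helper: keep the compressed copy only if it is at most
 * "ratio" times the raw size.  ratio = 1.0 roughly reproduces today's
 * "any savings at all" behavior; 0.95 is Nathan's proposed default.
 */
static bool
toast_compression_worthwhile(Size rawsize, Size compressed_size, double ratio)
{
    return (double) compressed_size <= (double) rawsize * ratio;
}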

Re: MDAM techniques and Index Skip Scan patch

2022-03-22 Thread Dmitry Dolgov
> On Mon, Mar 21, 2022 at 06:34:09PM -0700, Andres Freund wrote:
>
> > I don't mind having all of the alternatives under the same CF item, only
> > one being tested seems to be only a small limitation of cfbot.
>
> IMO it's pretty clear that having "duelling" patches below one CF entry is a
> bad idea. I think they should be split, with inactive approaches marked as
> returned with feeback or whatnot.

On the other hand even for patches with dependencies (i.e. the patch A
depends on the patch B) different CF items cause a lot of confusion for
reviewers. I guess for various flavours of the same patch it would be
even worse. But I don't have a strong opinion here.

> Either way, currently this patch fails on cfbot due to a new GUC:
> https://api.cirrus-ci.com/v1/artifact/task/5134905372835840/log/src/test/recovery/tmp_check/regression.diffs
> https://cirrus-ci.com/task/5134905372835840

This seems to be easy to solve.
>From 5bae9fdf8b74e5996b606e78f8b2a5fb327e011b Mon Sep 17 00:00:00 2001
From: Dmitrii Dolgov <9erthali...@gmail.com>
Date: Mon, 17 May 2021 11:47:07 +0200
Subject: [PATCH v41 1/6] Unique expressions

Extend PlannerInfo and Path structures with the list of relevant unique
expressions. It specifies which keys must be unique on the query
level, and allows this to be leveraged at the path level. At the moment
only distinctClause makes use of such a mechanism, which enables potential
use of index skip scan.

Originally proposed by David Rowley, based on the UniqueKey patch
implementation from Andy Fan; contains a few bits of the previous version
from Jesper Pedersen, Floris Van Nee.
---
 src/backend/nodes/list.c| 31 +
 src/backend/optimizer/path/Makefile |  3 +-
 src/backend/optimizer/path/pathkeys.c   | 62 +
 src/backend/optimizer/path/uniquekeys.c | 92 +
 src/backend/optimizer/plan/planner.c| 36 +-
 src/backend/optimizer/util/pathnode.c   | 32 ++---
 src/include/nodes/pathnodes.h   |  5 ++
 src/include/nodes/pg_list.h |  1 +
 src/include/optimizer/pathnode.h|  1 +
 src/include/optimizer/paths.h   |  9 +++
 10 files changed, 261 insertions(+), 11 deletions(-)
 create mode 100644 src/backend/optimizer/path/uniquekeys.c

diff --git a/src/backend/nodes/list.c b/src/backend/nodes/list.c
index f843f861ef..a53a50f372 100644
--- a/src/backend/nodes/list.c
+++ b/src/backend/nodes/list.c
@@ -1653,3 +1653,34 @@ list_oid_cmp(const ListCell *p1, const ListCell *p2)
 		return 1;
 	return 0;
 }
+
+/*
+ * Return true iff every entry in "members" list is also present
+ * in the "target" list.
+ */
+bool
+list_is_subset(const List *members, const List *target)
+{
+	const ListCell	*lc1, *lc2;
+
+	Assert(IsPointerList(members));
+	Assert(IsPointerList(target));
+	check_list_invariants(members);
+	check_list_invariants(target);
+
+	foreach(lc1, members)
+	{
+		bool found = false;
+		foreach(lc2, target)
+		{
+			if (equal(lfirst(lc1), lfirst(lc2)))
+			{
+found = true;
+break;
+			}
+		}
+		if (!found)
+			return false;
+	}
+	return true;
+}
diff --git a/src/backend/optimizer/path/Makefile b/src/backend/optimizer/path/Makefile
index 1e199ff66f..7b9820c25f 100644
--- a/src/backend/optimizer/path/Makefile
+++ b/src/backend/optimizer/path/Makefile
@@ -21,6 +21,7 @@ OBJS = \
 	joinpath.o \
 	joinrels.o \
 	pathkeys.o \
-	tidpath.o
+	tidpath.o \
+	uniquekeys.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index 86a35cdef1..e2be1fbf90 100644
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
@@ -29,6 +29,7 @@
 #include "utils/lsyscache.h"
 
 
+static bool pathkey_is_unique(PathKey *new_pathkey, List *pathkeys);
 static bool pathkey_is_redundant(PathKey *new_pathkey, List *pathkeys);
 static bool matches_boolean_partition_clause(RestrictInfo *rinfo,
 			 RelOptInfo *partrel,
@@ -96,6 +97,29 @@ make_canonical_pathkey(PlannerInfo *root,
 	return pk;
 }
 
+/*
+ * pathkey_is_unique
+ *		Checks if the new pathkey's equivalence class is the same as that of
+ *		any existing member of the pathkey list.
+ */
+static bool
+pathkey_is_unique(PathKey *new_pathkey, List *pathkeys)
+{
+	EquivalenceClass *new_ec = new_pathkey->pk_eclass;
+	ListCell   *lc;
+
	/* If the same EC is already in the list, then not unique */
+	foreach(lc, pathkeys)
+	{
+		PathKey*old_pathkey = (PathKey *) lfirst(lc);
+
+		if (new_ec == old_pathkey->pk_eclass)
+			return false;
+	}
+
+	return true;
+}
+
 /*
  * pathkey_is_redundant
  *	   Is a pathkey redundant with one already in the given list?
@@ -1152,6 +1176,44 @@ make_pathkeys_for_sortclauses(PlannerInfo *root,
 	return pathkeys;
 }
 
+/*
+ * make_pathkeys_for_uniquekeyclauses
+ *		Generate a pathkeys list to be used for uniquekey clauses
+ */
+List *
+make_pathkeys_for_uniquekeys(PlannerInfo *root,
+			 List *sortclauses,
+			 List 

Re: Column Filtering in Logical Replication

2022-03-22 Thread Alvaro Herrera
On 2022-Mar-19, Tomas Vondra wrote:

> @@ -174,7 +182,13 @@ ALTER PUBLICATION noinsert SET (publish = 'update, 
> delete');
>
> Add some tables to the publication:
>  
> -ALTER PUBLICATION mypublication ADD TABLE users, departments;
> +ALTER PUBLICATION mypublication ADD TABLE users (user_id, firstname), departments;
> +
> +
> +  
> +   Change the set of columns published for a table:
> +
> +ALTER PUBLICATION mypublication SET TABLE users (user_id, firstname, lastname), TABLE departments;
>  
>  
>

Hmm, it seems to me that if you've removed the feature to change the set
of columns published for a table, then the second example should be
removed as well.

> +/*
> + * Transform the publication column lists expression for all the relations
> + * in the list.
> + *
> + * XXX The name is a bit misleading, because we don't really transform
> + * anything here - we merely check the column list is compatible with the
> + * definition of the publication (with publish_via_partition_root=false)
> + * we only allow column lists on the leaf relations. So maybe rename it?
> + */
> +static void
> +TransformPubColumnList(List *tables, const char *queryString,
> +bool pubviaroot)
> +{

I agree with renaming this function.  Maybe CheckPubRelationColumnList() ?

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"This is a foot just waiting to be shot"(Andrew Dunstan)




Re: LockAcquireExtended() dontWait vs weaker lock levels than already held

2022-03-22 Thread Robert Haas
On Tue, Mar 22, 2022 at 3:01 PM Andres Freund  wrote:
> We clearly already know how to compute whether a lock is "included" in
> something we already hold - after all ProcSleep() successfully does so.
>
> Isn't it a pretty trivial test? Seems like it'd boil down to something like

I don't mind you fixing the behavior. I just couldn't pass up an
opportunity to complain about the structure of lock.c.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PATCH] Proof of concept for GUC improvements

2022-03-22 Thread Andres Freund
On 2022-03-22 13:15:34 -0500, David Christensen wrote:
> > On Mar 21, 2022, at 7:53 PM, Tom Lane  wrote:
> > If we'd done it like this from the beginning, it'd have been
> > great, but retrofitting it now is a lot less appealing.
>
> Yeah, agreed on this. As far as I’m concerned we can reject.

Updated CF entry to say so...




Re: LockAcquireExtended() dontWait vs weaker lock levels than already held

2022-03-22 Thread Andres Freund
Hi,

On 2022-03-22 14:20:55 -0400, Robert Haas wrote:
> On Tue, Mar 22, 2022 at 1:43 PM Andres Freund  wrote:
> > When LockAcquireExtended(dontWait=false) acquires a lock where we already 
> > hold
> > stronger lock and somebody else is also waiting for that lock, it goes 
> > through
> > a fairly circuitous path to acquire the lock:
> >
> > A conflicting lock is detected: if (lockMethodTable->conflictTab[lockmode] 
> > & lock->waitMask)
> > LockAcquireExtended() -> WaitOnLock() -> ProcSleep()
> > ProcSleep() follows this special path:
> >  * Special case: if I find I should go in front of some waiter, 
> > check to
> >  * see if I conflict with already-held locks or the requests before 
> > that
> >  * waiter.  If not, then just grant myself the requested lock 
> > immediately.
> > and grants the lock.
>
> I think this happens because lock.c is trying to imagine a world in
> which we don't know anything a priori about which locks are stronger
> or weaker than others and everything is deduced from the conflict
> matrix. I think at some point in time someone believed that we might
> use different conflict matrixes for different lock types. With an
> arbitrary conflict matrix, "stronger" and "weaker" aren't even
> necessarily well-defined ideas: A could conflict with B, B with C, and
> C with A, or something crazy like that. It seems rather unlikely to me
> that we'd ever do such a thing at this point. In fact, there are a lot
> of things in lock.c that we'd probably do differently if we were doing
> that work over.

We clearly already know how to compute whether a lock is "included" in
something we already hold - after all ProcSleep() successfully does so.

Isn't it a pretty trivial test? Seems like it'd boil down to something like

acquireMask = lockMethodTable->conflictTab[lockmode];
if ((MyProc->heldLocks & acquireMask) == acquireMask)
    /* already hold lock conflicting with it, grant the new lock to myself */
else
    /* current behaviour */

LockCheckConflicts() mostly knows how to deal with this. It's just that we don't
even use LockCheckConflicts() if a lock acquisition conflicts with waitMask:

/*
 * If lock requested conflicts with locks requested by waiters, must join
 * wait queue.  Otherwise, check for conflict with already-held locks.
 * (That's last because most complex check.)
 */
if (lockMethodTable->conflictTab[lockmode] & lock->waitMask)
    found_conflict = true;
else
    found_conflict = LockCheckConflicts(lockMethodTable, lockmode,
                                        lock, proclock);

Yes, there's more deadlocks that can be solved by queue reordering, but the
simple cases that ProcSleep() handles don't seem problematic to solve in
lock.c directly either...
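
Putting those two fragments together, the restructured check inside
LockAcquireExtended() might look roughly like this (an untested sketch, not
a patch):

LOCKMASK    acquireMask = lockMethodTable->conflictTab[lockmode];

if ((MyProc->heldLocks & acquireMask) == acquireMask)
    found_conflict = false;     /* already hold lock(s) conflicting with it,
                                 * grant the new lock to ourselves */
else if (acquireMask & lock->waitMask)
    found_conflict = true;      /* must queue behind existing waiters */
else
    found_conflict = LockCheckConflicts(lockMethodTable, lockmode,
                                        lock, proclock);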

Greetings,

Andres Freund




Re: LogwrtResult contended spinlock

2022-03-22 Thread Alvaro Herrera
So I've been wondering about this block at the bottom of XLogWrite:

/*
 * Make sure that the shared 'request' values do not fall behind the
 * 'result' values.  This is not absolutely essential, but it saves some
 * code in a couple of places.
 */
{
    SpinLockAcquire(&XLogCtl->info_lck);
    if (XLogCtl->LogwrtRqst.Write < LogwrtResult.Write)
        XLogCtl->LogwrtRqst.Write = LogwrtResult.Write;
    if (XLogCtl->LogwrtRqst.Flush < LogwrtResult.Flush)
        XLogCtl->LogwrtRqst.Flush = LogwrtResult.Flush;
    SpinLockRelease(&XLogCtl->info_lck);
}

I just noticed that my 0001 makes the comment a lie: it is now quite
possible that 'result' is advanced beyond 'request'.  Before the patch
that never happened because they were both advanced in the region locked
by the spinlock.

I think we could still maintain this promise if we just moved this
entire block before the first pg_atomic_monotonic_advance_u64 setting
XLogCtl->LogwrtResult.Write.  Or we could halve the whole block, and put
one acquire/test/set/release stanza before each monotonic increase of
the corresponding variable.


However, I wonder if this is still necessary.  This code was added in
4d14fe0048c (March 2001) and while everything else was quite different
back then, this hasn't changed at all.  I can't quite figure out what
are those "couple of places" that would need additional code if this
block is just removed.  I tried running the tests (including
wal_consistency_checking), and nothing breaks.  Reading the code
surrounding the other accesses of XLogCtl->LogwrtRqst, there's nothing
that looks to me like it depends on these values not lagging behind
LogwrtResult.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"If you have nothing to say, maybe you need just the right tool to help you
not say it."   (New York Times, about Microsoft PowerPoint)




Re: LockAcquireExtended() dontWait vs weaker lock levels than already held

2022-03-22 Thread Robert Haas
On Tue, Mar 22, 2022 at 1:43 PM Andres Freund  wrote:
> When LockAcquireExtended(dontWait=false) acquires a lock where we already hold
> stronger lock and somebody else is also waiting for that lock, it goes through
> a fairly circuitous path to acquire the lock:
>
> A conflicting lock is detected: if (lockMethodTable->conflictTab[lockmode] & 
> lock->waitMask)
> LockAcquireExtended() -> WaitOnLock() -> ProcSleep()
> ProcSleep() follows this special path:
>  * Special case: if I find I should go in front of some waiter, check 
> to
>  * see if I conflict with already-held locks or the requests before 
> that
>  * waiter.  If not, then just grant myself the requested lock 
> immediately.
> and grants the lock.

I think this happens because lock.c is trying to imagine a world in
which we don't know anything a priori about which locks are stronger
or weaker than others and everything is deduced from the conflict
matrix. I think at some point in time someone believed that we might
use different conflict matrixes for different lock types. With an
arbitrary conflict matrix, "stronger" and "weaker" aren't even
necessarily well-defined ideas: A could conflict with B, B with C, and
C with A, or something crazy like that. It seems rather unlikely to me
that we'd ever do such a thing at this point. In fact, there are a lot
of things in lock.c that we'd probably do differently if we were doing
that work over.

--
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PATCH] Proof of concept for GUC improvements

2022-03-22 Thread David Christensen


> On Mar 21, 2022, at 7:53 PM, Tom Lane  wrote:
> 
> Andres Freund  writes:
>> My impression is that there's not a lot of enthusiasm for the concept? If
>> that's true we maybe ought to mark the CF entry as rejected?
> 
> Yeah, I'm kind of leaning that way too.  I don't see how we can
> incorporate the symbolic values into any existing display paths
> without breaking applications that expect the old output.
> That being the case, it seems like we'd have "two ways to do it"
> indefinitely, which would add enough confusion that I'm not
> sure there's a net gain.  In particular, I foresee novice questions
> along the lines of "I set foo to disabled, why is it showing
> as zero?"

Yeah, my main motivation here was trying to have fewer special values in 
the config files, but I guess it would effectively be making everything 
an enum while still relying on knowing just what the specific magic 
values are, so not really a net gain in this department. 

For the record, I thought this would have a fairly low chance of getting in; 
I was mainly curious what level of effort it would take to get something like 
this going, and wanted some feedback on the actual idea. 

> If we'd done it like this from the beginning, it'd have been
> great, but retrofitting it now is a lot less appealing.

Yeah, agreed on this. As far as I’m concerned we can reject. 

Thanks,

David



Re: pg14 psql broke \d datname.nspname.relname

2022-03-22 Thread Robert Haas
On Mon, Mar 21, 2022 at 9:32 PM Mark Dilger
 wrote:
> [ new patch version ]

This patch adds three new arguments to processSQLNamePattern() and
documents one of them. It adds three new parameters to
patternToSQLRegex() as well, and documents none of them. I think that
the text of the comment might need some updating too, in particular
the sentence "Additional dots in the name portion are not treated as
special."

There are no comments explaining the left_is_literal stuff. It appears
that your intention here is that if the pattern string supplied by the
user contains any of *?|+()[]{}.^\ not surrounded by double-quotes, we
signal the caller. Some callers then use this to issue a complaint
that the database name must be a literal. To me, this behavior doesn't
really make sense. If something is a literal, that means we're not
going to interpret the special characters that it contains. Here, we
are interpreting the special characters just so we can complain that
they exist. It seems to me that a simpler solution would be to not
interpret them at all. I attach a patch showing what I mean by that.
It just rips out the dbname_is_literal stuff in favor of doing nothing
at all. To put the whole thing another way, if the user types "\d
}.public.ft", your code wants to complain about the fact that the user
is trying to use regular expression characters in a place where they
are not allowed to do that. I argue that we should instead just be
comparing "}" against the database name and see whether it happens to
match.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


0002-Remove-dbname_is_literal-stuff.patch
Description: Binary data


0001-Reject-patterns-with-too-many-parts-or-wrong-db.patch
Description: Binary data


Re: New Object Access Type hooks

2022-03-22 Thread Tom Lane
Andrew Dunstan  writes:
> OK, I have pushed that.

It seems like you could remove the NO_INSTALLCHECK restriction
too.  You already removed the comment defending it, and it
seems to work fine as an installcheck now if I remove that
locally.

Other nitpicks:

* the IsParallelWorker test could use a comment

* I notice a typo "permisisons" in test_oat_hooks.sql

regards, tom lane




Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

2022-03-22 Thread Andres Freund
Hi,

On 2022-03-22 21:57:51 +0530, Bharath Rupireddy wrote:
> On Sat, Mar 19, 2022 at 5:18 AM Andres Freund  wrote:
> > On 2022-03-17 13:25:35 +0530, Bharath Rupireddy wrote:
> > > +--
> > > +-- pg_get_raw_wal_record()
> >
> > What is raw about the function?
> 
> It right now gives data starting from the output of XLogReadRecord
> up to XLogRecGetTotalLen(xlogreader) in length. Given that XLogReadRecord
> returns a pointer to the decoded record's header, I'm not sure it's
> the right choice. Actually, this function's intention (not an immediate
> use case, though) is to feed the WAL record to another function and
> then, say, repair a corrupted page given a base data page.
> 
> As I said upthread, I'm open to removing this function for now, when a
> realistic need comes we can add it back. It also raised some concerns
> around the security and permissions. Thoughts?

I'm ok with having it with appropriate permissions, I just don't like the
name.


> > Why "CALLED ON NULL INPUT"? It doesn't make sense to call the function with 
> > a
> > NULL lsn, does it?  Also, that's the default, why is it restated here?
> 
> pg_get_wal_records_info needed that option (with end_lsn being the
> default, providing WAL info up to the end of WAL). Also, we can emit a
> better error message ("invalid WAL start LSN") instead of a generic one.
> I wanted to keep the error message and code the same across all the
> functions, hence the CALLED ON NULL INPUT option for pg_get_raw_wal_record.

I think it should be strict if it behaves strict. I fail to see what
consistency in error messages is worth here. And I'd probably just create two
different functions for begin and begin & end LSN and mark those as strict as
well.
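
For illustration, the till-end-of-WAL variant could then be a separate
one-argument function declared STRICT in SQL; a sketch with an illustrative
name, not the patch's code:

/*
 * Sketch only (illustrative name, not the patch's code): with STRICT in
 * the SQL declaration, NULL inputs never reach the C function, so the
 * argument can be fetched unconditionally and validation only has to
 * deal with real LSN values.
 */
Datum
pg_get_wal_records_info_till_end_of_wal(PG_FUNCTION_ARGS)
{
    XLogRecPtr  start_lsn = PG_GETARG_LSN(0);

    /* ... validate start_lsn and emit one row per record up to the
     * current flush pointer ... */
    PG_RETURN_NULL();           /* placeholder; the real thing is an SRF */
}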


> > > +REVOKE EXECUTE ON FUNCTION pg_get_wal_record_info(pg_lsn) FROM PUBLIC;
> > > +GRANT EXECUTE ON FUNCTION pg_get_wal_record_info(pg_lsn) TO pg_monitor;
> >
> > I don't think it's appropriate for pg_monitor to see all the data in the 
> > WAL.
> 
> How about pg_read_server_files or some other?

That seems more appropriate.


> > > +-- pg_get_wal_stats()
> >
> > This seems like an exceedingly expensive way to compute this. Not just 
> > because
> > of doing the grouping, window etc, but also because it's serializing the
> > "data" field from pg_get_wal_records_info() just to never use it. With any
> > appreciatable amount of data the return value pg_get_wal_records_info() will
> > be serialized into a on-disk tuplestore.
> >
> > This is probably close to an order of magnitude slower than pg_waldump
> > --stats. Which imo renders this largely useless.
> 
> Yeah that's true. Do you suggest having pg_get_wal_stats() as a
> C function like in the v8 patch [1]?

Yes.


> See some numbers at [2] with pg_get_wal_stats using
> pg_get_wal_records_info and pg_get_wal_records_info as a direct
> c-function like in v8 patch [1]. A direct c-function always fares
> better (84 msec vs 1400 msec).

That indeed makes the view as is pretty much useless. And it'd probably be
worse in a workload with longer records / many FPIs.


> > > +void
> > > +_PG_init(void)
> >
> > > +void
> > > +_PG_fini(void)
> >
> > Why have this stuff if it's not used?
> 
> I kept it as a placeholder for future code additions, for instance
> test_decoding.c and ssl_passphrase_func.c have empty _PG_init(),
> _PG_fini(). If okay, I can add a note there like "placeholder for now",
> otherwise I can remove it.

That's not comparable, the test_decoding case has it as a placeholder because
it serves as a template to create further output plugins. Something not the
case here. So please remove.


> > > + for (block_id = 0; block_id <= record->max_block_id; block_id++)
> > > + {
> >
> > To me duplicating this much code from waldump seems like a bad idea from a
> > maintainability POV.
> 
> Even if we were to put the above code from pg_walinspect and
> pg_waldump into, say, walutils.c or some other existing file, we would
> have to do an if (pg_walinspect) appendStringInfo() else if (pg_waldump)
> printf() sort of thing there; isn't that clumsy?

Why is that needed? Just use the stringinfo in both places? You're outputting
the exact same thing in both places right now. There's already a stringinfo in
XLogDumpDisplayRecord() these days (there wasn't back when pg_xlogddump was
written), so you could just convert at least the relevant printfs in
XLogDumpDisplayRecord().
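
As a rough illustration of that deduplication (names and shape are mine, not
existing code), both tools could share a helper that appends block references
to a caller-supplied StringInfo, assuming the current xlogreader macros:

/*
 * Hypothetical shared helper: append the block references of the
 * current record to "buf".  pg_waldump would print buf->data, while
 * pg_walinspect would return it as a text column.
 */
static void
append_block_refs(XLogReaderState *record, StringInfo buf)
{
    for (int block_id = 0; block_id <= XLogRecMaxBlockId(record); block_id++)
    {
        RelFileNode rnode;
        ForkNumber  forknum;
        BlockNumber blkno;

        if (!XLogRecHasBlockRef(record, block_id))
            continue;

        XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blkno);
        appendStringInfo(buf, ", blkref #%d: rel %u/%u/%u fork %d blk %u",
                         block_id, rnode.spcNode, rnode.dbNode,
                         rnode.relNode, forknum, blkno);
    }
}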


> Also, unnecessary if
> conditions would need to be executed for every record. For maintainability,
> I added a note in pg_walinspect.c and pg_waldump.c to consider fixing
> things in both places (of course this might sound like the dumbest way of
> doing it, but IMHO it's sensible, given the if(pg_walinspect)-else
> if(pg_waldump) sorts of code that we need to do in the common
> functions). Thoughts?

IMO we shouldn't merge this with as much duplication as there is right now,
the notes don't change that it's a PITA to maintain.


> Yeah. It is to handle some edge cases to print the WAL up to end_lsn
> and avoid waiting in read_local_xlog_page. I will 

Re: New Object Access Type hooks

2022-03-22 Thread Andrew Dunstan


On 3/22/22 13:08, Tom Lane wrote:
> Andrew Dunstan  writes:
>> On 3/22/22 12:58, Tom Lane wrote:
>>> I only suggested removing the error check in _PG_init, not
>>> changing the way the test works.
>> Mark and I discussed this offline, and decided there was no requirement
>> for the module to be preloaded. Do you have a different opinion?
> No, I was actually about to make the same point: it seems to me there
> are arguable use-cases for loading it shared, loading it per-session
> (perhaps via ALTER USER SET or ALTER DATABASE SET to target particular
> users/DBs), or even manually LOADing it.  So the module code should
> not be prejudging how it's used.
>
> On reflection, I withdraw my complaint about changing the way the
> test script loads the module.  Getting rid of the need for a custom
> .conf file simplifies the test module, and that seems good.
> So I'm on board with Mark's patch now.
>
>   



OK, I have pushed that.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





LockAcquireExtended() dontWait vs weaker lock levels than already held

2022-03-22 Thread Andres Freund
Hi,

When LockAcquireExtended(dontWait=false) acquires a lock where we already hold
stronger lock and somebody else is also waiting for that lock, it goes through
a fairly circuitous path to acquire the lock:

A conflicting lock is detected: if (lockMethodTable->conflictTab[lockmode] & lock->waitMask)
LockAcquireExtended() -> WaitOnLock() -> ProcSleep()
ProcSleep() follows this special path:
 * Special case: if I find I should go in front of some waiter, check to
 * see if I conflict with already-held locks or the requests before that
 * waiter.  If not, then just grant myself the requested lock immediately.
and grants the lock.


However, in dontWait mode, there's no such path. Which means that
LockAcquireExtended() returns LOCKACQUIRE_NOT_AVAIL despite the fact that
dontWait=false would succeed in granting us the lock.

This seems decidedly suboptimal.

For one, the code flow to acquire a lock we already hold in some form is
unnecessarily hard to understand and expensive. There's no comment in
LockAcquireExtended() explaining that WaitOnLock() might immediately grant us
the lock in that case, we emit bogus TRACE_POSTGRESQL_LOCK_WAIT_START() etc.

For another, LockAcquireExtended(dontWait=true) returning spuriously is quite
confusing behaviour, and quite plausibly could cause bugs in fairly random
places.


Not planning to do anything about this for now, but I did want something I can
find if I hit such a problem in the future...


Greetings,

Andres Freund




Re: New Object Access Type hooks

2022-03-22 Thread Tom Lane
Andrew Dunstan  writes:
> On 3/22/22 12:58, Tom Lane wrote:
>> I only suggested removing the error check in _PG_init, not
>> changing the way the test works.

> Mark and I discussed this offline, and decided there was no requirement
> for the module to be preloaded. Do you have a different opinion?

No, I was actually about to make the same point: it seems to me there
are arguable use-cases for loading it shared, loading it per-session
(perhaps via ALTER USER SET or ALTER DATABASE SET to target particular
users/DBs), or even manually LOADing it.  So the module code should
not be prejudging how it's used.

On reflection, I withdraw my complaint about changing the way the
test script loads the module.  Getting rid of the need for a custom
.conf file simplifies the test module, and that seems good.
So I'm on board with Mark's patch now.

regards, tom lane




Re: New Object Access Type hooks

2022-03-22 Thread Mark Dilger



> On Mar 22, 2022, at 9:58 AM, Tom Lane  wrote:
> 
>> Ok, done as you suggest:
> 
> I only suggested removing the error check in _PG_init, not
> changing the way the test works.

I should have been more explicit and said, "done as y'all suggest".  The "To" 
line of that email was to you and Andrew.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: New Object Access Type hooks

2022-03-22 Thread Andrew Dunstan


On 3/22/22 12:58, Tom Lane wrote:
> Mark Dilger  writes:
>> On Mar 22, 2022, at 9:33 AM, Tom Lane  wrote:
>>> As a quick-n-dirty fix to avoid reverting the entire test module,
>>> perhaps just delete this error check for now.
>> Ok, done as you suggest:
> I only suggested removing the error check in _PG_init, not
> changing the way the test works.
>
>   



Mark and I discussed this offline, and decided there was no requirement
for the module to be preloaded. Do you have a different opinion?


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: New Object Access Type hooks

2022-03-22 Thread Tom Lane
Mark Dilger  writes:
> On Mar 22, 2022, at 9:33 AM, Tom Lane  wrote:
>> As a quick-n-dirty fix to avoid reverting the entire test module,
>> perhaps just delete this error check for now.

> Ok, done as you suggest:

I only suggested removing the error check in _PG_init, not
changing the way the test works.

regards, tom lane




Re: New Object Access Type hooks

2022-03-22 Thread Mark Dilger


> On Mar 22, 2022, at 9:33 AM, Tom Lane  wrote:
> 
> Andrew Dunstan  writes:
>> On 3/22/22 11:26, Mark Dilger wrote:
>>> culicidae is complaining:
>>> 2022-03-22 14:53:27.198 UTC [2167008][not initialized][:0] FATAL:  
>>> test_oat_hooks must be loaded via shared_preload_libraries
> 
>> That seems quite weird. I'm not sure how it's getting loaded at all if
>> not via shared_preload_libraries
> 
> After checking culicidae's config, I've duplicated this failure
> by building with EXEC_BACKEND defined.  So I'd opine that there
> is something broken about the method test_oat_hooks uses to
> decide if it was loaded via shared_preload_libraries or not.
> (Note that the failures appear to be coming out of auxiliary
> processes such as the checkpointer.)
> 
> As a quick-n-dirty fix to avoid reverting the entire test module,
> perhaps just delete this error check for now.

Ok, done as you suggest:



v2-0001-Fix-buildfarm-test-failures-in-test_oat_hooks.patch
Description: Binary data


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: New Object Access Type hooks

2022-03-22 Thread Tom Lane
Andrew Dunstan  writes:
> On 3/22/22 11:26, Mark Dilger wrote:
>> culicidae is complaining:
>> 2022-03-22 14:53:27.198 UTC [2167008][not initialized][:0] FATAL:  
>> test_oat_hooks must be loaded via shared_preload_libraries

> That seems quite weird. I'm not sure how it's getting loaded at all if
> not via shared_preload_libraries

After checking culicidae's config, I've duplicated this failure
by building with EXEC_BACKEND defined.  So I'd opine that there
is something broken about the method test_oat_hooks uses to
decide if it was loaded via shared_preload_libraries or not.
(Note that the failures appear to be coming out of auxiliary
processes such as the checkpointer.)

As a quick-n-dirty fix to avoid reverting the entire test module,
perhaps just delete this error check for now.

regards, tom lane
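
For context, the common idiom for that check is the flag below; assuming
test_oat_hooks uses it, the failure mode under EXEC_BACKEND would be that
each child process runs _PG_init() again with the flag unset:

#include "postgres.h"
#include "fmgr.h"
#include "miscadmin.h"

PG_MODULE_MAGIC;

void _PG_init(void);

void
_PG_init(void)
{
    /*
     * Common idiom: this flag is only true while the postmaster is
     * processing shared_preload_libraries.  Under EXEC_BACKEND every
     * child process loads the library again and runs _PG_init() with
     * the flag false, so a hard FATAL here fires spuriously (as seen
     * from the checkpointer on culicidae).
     */
    if (!process_shared_preload_libraries_in_progress)
        ereport(FATAL,
                (errmsg("test_oat_hooks must be loaded via shared_preload_libraries")));
}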




Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

2022-03-22 Thread Bharath Rupireddy
On Sat, Mar 19, 2022 at 5:18 AM Andres Freund  wrote:
>
> Hi,
>
> First look at this patch, so I might be repeating stuff already commented on /
> discussed.

Thanks for taking a look at the patch.

> On 2022-03-17 13:25:35 +0530, Bharath Rupireddy wrote:
> > +--
> > +-- pg_get_raw_wal_record()
>
> What is raw about the function?

It right now gives data starting from the output of XLogReadRecord
up to XLogRecGetTotalLen(xlogreader) in length. Given that XLogReadRecord
returns a pointer to the decoded record's header, I'm not sure it's
the right choice. Actually, this function's intention (not an immediate
use case, though) is to feed the WAL record to another function and
then, say, repair a corrupted page given a base data page.

As I said upthread, I'm open to removing this function for now, when a
realistic need comes we can add it back. It also raised some concerns
around the security and permissions. Thoughts?

> Why "CALLED ON NULL INPUT"? It doesn't make sense to call the function with a
> NULL lsn, does it?  Also, that's the default, why is it restated here?

pg_get_wal_records_info needed that option (with end_lsn being the
default, providing WAL info up to the end of WAL). Also, we can emit a
better error message ("invalid WAL start LSN") instead of a generic one.
I wanted to keep the error message and code the same across all the
functions, hence the CALLED ON NULL INPUT option for pg_get_raw_wal_record.

> > +REVOKE EXECUTE ON FUNCTION pg_get_wal_record_info(pg_lsn) FROM PUBLIC;
> > +GRANT EXECUTE ON FUNCTION pg_get_wal_record_info(pg_lsn) TO pg_monitor;
>
> I don't think it's appropriate for pg_monitor to see all the data in the WAL.

How about pg_read_server_files or some other?

> > +-- pg_get_wal_stats()
>
> This seems like an exceedingly expensive way to compute this. Not just because
> of doing the grouping, window etc, but also because it's serializing the
> "data" field from pg_get_wal_records_info() just to never use it. With any
> appreciatable amount of data the return value pg_get_wal_records_info() will
> be serialized into a on-disk tuplestore.
>
> This is probably close to an order of magnitude slower than pg_waldump
> --stats. Which imo renders this largely useless.

Yeah that's true. Do you suggest having pg_get_wal_stats() as a
C function like in the v8 patch [1]?

See some numbers at [2] with pg_get_wal_stats using
pg_get_wal_records_info and pg_get_wal_records_info as a direct
c-function like in v8 patch [1]. A direct c-function always fares
better (84 msec vs 1400 msec).

> The column names don't seem great either. "tfpil"?

"total fpi length" - tfpil wanted to keep it short - it's just an
internal column name isn't it? The actual column name the user sees is
fpi_length.

> > +void
> > +_PG_init(void)
>
> > +void
> > +_PG_fini(void)
>
> Why have this stuff if it's not used?

I kept it as a placeholder for future code additions, for instance
test_decoding.c and ssl_passphrase_func.c have empty _PG_init(),
_PG_fini(). If okay, I can add a note there like "placeholder for now",
otherwise I can remove it.

> > +static XLogRecPtr
> > +ValidateInputLSN(XLogRecPtr lsn)
>
> > +static XLogRecPtr
> > +ValidateStartAndEndLSNs(XLogRecPtr start_lsn, XLogRecPtr end_lsn)
> > +{
>
> These two functions are largely redundant, that doesn't seem great.

I will modify it in the next version.

> > +Datum
> > +pg_get_raw_wal_record(PG_FUNCTION_ARGS)
>
> Most of this has another copy in pg_get_wal_record_info(). Can more of this be
> deduplicated?

I will do, if we decide on whether or not to have the
pg_get_raw_wal_record function at all? Please see my comments above.

> > + initStringInfo();
> > + desc->rm_desc(, record);
> > + appendStringInfo(_desc, "%s", temp.data);
> > + pfree(temp.data);
> > + initStringInfo(_blk_ref);
>
> This seems unnecessarily wasteful. You serialize into one stringinfo, just to
> then copy that stringinfo into another stringinfo. Just to then allocate yet
> another stringinfo.

Yeah, I will remove it. Looks like all the rm_desc callbacks append to
the passed-in buffer and don't reset it, so we should be good.

> > + /* Block references (detailed format). */
>
> This comment seems copied from pg_waldump, but doesn't make sense here,
> because there's no short format.

Yes, I will remove it.

> > + for (block_id = 0; block_id <= record->max_block_id; block_id++)
> > + {
>
> To me duplicating this much code from waldump seems like a bad idea from a
> maintainability POV.

Even if we were to put the above code from pg_walinspect and
pg_waldump into, say, walutils.c or some other existing file, we would
have to do an if (pg_walinspect) appendStringInfo() else if (pg_waldump)
printf() sort of thing there; isn't that clumsy? Also, unnecessary if
conditions would need to be executed for every record. For maintainability,
I added a note in pg_walinspect.c and pg_waldump.c to consider fixing
things in both places (of course this might sound like the dumbest way of
doing it, but IMHO it's sensible, given the 

Re: New Object Access Type hooks

2022-03-22 Thread Tom Lane
Mark Dilger  writes:
> The problem is coming from the REGRESS_exec_check_perms, which was included 
> in the patch to demonstrate when the other hooks fired relative to the 
> ExecutorCheckPerms_hook, but since it is causing problems, I can submit a 
> patch with that removed.  Give me a couple minutes

Maybe better to suppress the audit messages if in a parallel worker?

regards, tom lane
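
A minimal sketch of that suppression; emit_audit_notice() is a hypothetical
stand-in for however the module reports, not its actual function:

#include "postgres.h"
#include "access/parallel.h"

/* Hypothetical wrapper: stay quiet inside parallel workers so leader
 * and workers don't both NOTICE about the same permission check. */
static void
emit_audit_notice(const char *msg)
{
    if (IsParallelWorker())
        return;

    ereport(NOTICE, (errmsg("%s", msg)));
}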




Re: New Object Access Type hooks

2022-03-22 Thread Mark Dilger


> On Mar 22, 2022, at 9:11 AM, Mark Dilger  wrote:
> 
> Give me a couple minutes



v1-0001-Fix-build-farm-failures-in-test_oat_hooks.patch
Description: Binary data

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: Partial aggregates pushdown

2022-03-22 Thread Alexander Pyhalov

Tomas Vondra wrote 2022-03-22 15:28:

On 3/22/22 01:49, Andres Freund wrote:

On 2022-01-17 15:27:53 +0300, Alexander Pyhalov wrote:

Alexander Pyhalov wrote 2022-01-17 15:26:

Updated patch.


Sorry, missed attachment.


Needs another update: http://cfbot.cputube.org/patch_37_3369.log

Marked as waiting on author.



TBH I'm still not convinced this is the right approach. I've voiced this
opinion before, but to reiterate the main arguments:

1) It's not clear to me how this could get extended to aggregates with
more complex aggregate states, to support e.g. avg() and similar fairly
common aggregates.


Hi.
Yes, I'm also not sure how to proceed with aggregates with complex state.
Likely it needs a separate function to export their state, but then we should
somehow ensure that this function exists and our 'importer' can handle its
result.  Note that for now we have no mechanism in postgres_fdw to find out
the remote server version at planning stage.


2) I'm not sure relying on aggpartialpushdownsafe without any version
checks etc. is sufficient. I mean, how would we know the remote node has
the same idea of representing the aggregate state. I wonder how this
aligns with assumptions we do e.g. for functions etc.


It doesn't seem to be a problem to me, as for now we don't care about the
remote node's internal aggregate state representation.  We currently get
just the aggregate result from the remote node. For aggregates with
'internal' stype we call the converter locally, and it converts the
external result from the aggregate return type to the local node's
internal representation.



Aside from that, there's a couple review comments:

1) should not remove the comment in foreign_expr_walker


Fixed.



2) comment in deparseAggref is obsolete/inaccurate


Fixed.



3) comment for partial_agg_ok should probably explain when we consider
aggregate OK to be pushed down


Expanded comment.


4) I'm not sure why get_rcvd_attinmeta comment talks about "return type
bytea" and "real input type".


Expanded comment.  Tupdesc can be retrieved from node->ss.ss_ScanTupleSlot,
and so we expect to see bytea (as should be produced by partial aggregation).
But when we scan data, we get the aggregate output type (which matches the
converter input type), so attinmeta should be fixed.
If we deal with an aggregate which doesn't have a converter, partial_agg_ok()
ensures that agg->aggfnoid's return type matches agg->aggtranstype.



5) Talking about "partial" aggregates is a bit confusing, because that
suggests this is related to actual "partial aggregates". But it's not.


What should we call them? It's about pushing "Partial count()" or
"Partial sum()" to the remote server, so why is it not related to partial
aggregates? Do you mean that it's not about parallel aggregate processing?



6) Can add_foreign_grouping_paths do without the new 'partial'
parameter? Clearly, it can be deduced from extra->patype, no?


Fixed this.



7) There's no docs for PARTIALCONVERTERFUNC / PARTIAL_PUSHDOWN_SAFE in
CREATE AGGREGATE sgml docs.


Added documentation. I'd appreciate advice on how it should be extended.



8) I don't think "serialize" in the converter functions is the right
term, considering those functions are not "serializing" anything. If
anything, it's the remote node that is serializing the agg state and 
the

local not is deserializing it. Or maybe I just misunderstand where are
the converter functions executed?


The converter function transforms the aggregate result to the serialized
internal representation which is expected from a partial aggregate. I mean,
it converts the aggregate result type to the internal representation and
then effectively executes the serialization code
(i.e. converter(x) == serialize(to_internal(x))).
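
For the trivial case where the transition type equals the output type, such
a converter collapses to the identity. A sketch with hypothetical names; an
'internal'-state aggregate would instead rebuild and serialize its transition
state here:

#include "postgres.h"
#include "fmgr.h"

PG_FUNCTION_INFO_V1(count_result_converter);

/*
 * Hypothetical converter: count()'s transition state is already int8,
 * so the remote aggregate result can be handed to the local finalize
 * step unchanged.  An aggregate with stype 'internal' would instead
 * run to_internal() followed by its serialization function here.
 */
Datum
count_result_converter(PG_FUNCTION_ARGS)
{
    PG_RETURN_INT64(PG_GETARG_INT64(0));
}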

--
Best regards,
Alexander Pyhalov,
Postgres ProfessionalFrom 7ad4eacf017a4fd3793f0949ef43ccc8292bf3f6 Mon Sep 17 00:00:00 2001
From: Alexander Pyhalov 
Date: Thu, 14 Oct 2021 17:30:34 +0300
Subject: [PATCH] Partial aggregates push down

---
 contrib/postgres_fdw/deparse.c|  63 +-
 .../postgres_fdw/expected/postgres_fdw.out| 185 -
 contrib/postgres_fdw/postgres_fdw.c   | 187 +-
 contrib/postgres_fdw/sql/postgres_fdw.sql |  27 ++-
 doc/src/sgml/ref/create_aggregate.sgml|  27 +++
 src/backend/catalog/pg_aggregate.c|  28 ++-
 src/backend/commands/aggregatecmds.c  |  23 ++-
 src/backend/utils/adt/numeric.c   |  96 +
 src/bin/pg_dump/pg_dump.c |  20 +-
 src/include/catalog/pg_aggregate.dat  | 110 ++-
 src/include/catalog/pg_aggregate.h|  10 +-
 src/include/catalog/pg_proc.dat   |   6 +
 src/test/regress/expected/oidjoins.out|   1 +
 13 files changed, 702 insertions(+), 81 deletions(-)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index bf12eac0288..272e2391d7f 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -197,6 +197,7 @@ 

Re: LogwrtResult contended spinlock

2022-03-22 Thread Alvaro Herrera
On 2022-Mar-22, Tom Lane wrote:

> I looked briefly at 0001, and I've got to say that I disagree with
> your decision to rearrange the representation of the local LogwrtResult
> copy.  It clutters the patch tremendously and makes it hard to
> understand what the actual functional change is.  Moreover, I'm
> not entirely convinced that it's a notational improvement in the
> first place.
> 
> Perhaps it'd help if you split 0001 into two steps, one to do the
> mechanical change of the representation and then a second patch that
> converts the shared variable to atomics.  Since you've moved around
> the places that read the shared variable, that part is subtler than
> one could wish and really needs to be studied on its own.

Hmm, I did it the other way around: first change to use atomics, then
the mechanical change.  I think that makes the usefulness of the change
more visible, because before the atomics are in use, treating the combined
struct as a unit remains sensible.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
Officer Krupke, what are we to do?
Gee, officer Krupke, Krup you! (West Side Story, "Gee, Officer Krupke")
>From dd9b53878faeedba921ea7027e98ddbb433e8647 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 2 Feb 2021 14:03:43 -0300
Subject: [PATCH v7 1/3] Change XLogCtl->LogwrtResult to use atomic ops
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Currently, access to LogwrtResult is protected by a spinlock.  This
becomes severely contended in some scenarios, such as with a largish
replication flock: walsenders all calling GetFlushRecPtr repeatedly
cause the processor to heat up to the point where eggs can be fried on top.

This can be reduced to a non-problem by replacing XLogCtl->LogwrtResult
with a struct containing a pair of atomically accessed variables.

In addition, this commit splits the process-local copy of these values
(kept in the freestanding LogwrtResult struct) into two separate
LogWriteResult and LogFlushResult.  This is not strictly necessary, but
it makes it clearer that these are updated separately, each on their own
schedule.

Author: Álvaro Herrera 
Discussion: https://postgr.es/m/20200831182156.GA3983@alvherre.pgsql
---
 src/backend/access/transam/xlog.c | 85 +++
 src/include/port/atomics.h| 29 +++
 src/tools/pgindent/typedefs.list  |  1 +
 3 files changed, 72 insertions(+), 43 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 4ac3871c74..6f2eb494fe 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -285,16 +285,13 @@ static bool doPageWrites;
  *
  * LogwrtRqst indicates a byte position that we need to write and/or fsync
  * the log up to (all records before that point must be written or fsynced).
- * LogwrtResult indicates the byte positions we have already written/fsynced.
- * These structs are identical but are declared separately to indicate their
- * slightly different functions.
+ * LogWrtResult indicates the byte positions we have already written/fsynced.
+ * These structs are similar but are declared separately to indicate their
+ * slightly different functions; in addition, the latter is read and written
+ * using atomic operations.
  *
- * To read XLogCtl->LogwrtResult, you must hold either info_lck or
- * WALWriteLock.  To update it, you need to hold both locks.  The point of
- * this arrangement is that the value can be examined by code that already
- * holds WALWriteLock without needing to grab info_lck as well.  In addition
- * to the shared variable, each backend has a private copy of LogwrtResult,
- * which is updated when convenient.
+ * In addition to the shared variable, each backend has a private copy of
+ * LogwrtResult, each member of which is separately updated when convenient.
  *
  * The request bookkeeping is simpler: there is a shared XLogCtl->LogwrtRqst
  * (protected by info_lck), but we don't need to cache any copies of it.
@@ -317,6 +314,12 @@ static bool doPageWrites;
  *--
  */
 
+typedef struct XLogwrtAtomic
+{
+	pg_atomic_uint64 Write;		/* last byte + 1 written out */
+	pg_atomic_uint64 Flush;		/* last byte + 1 flushed */
+} XLogwrtAtomic;
+
 typedef struct XLogwrtRqst
 {
 	XLogRecPtr	Write;			/* last byte + 1 to write out */
@@ -480,6 +483,7 @@ typedef struct XLogCtlData
 {
 	XLogCtlInsert Insert;
 
+	XLogwrtAtomic LogwrtResult; /* uses atomics */
 	/* Protected by info_lck: */
 	XLogwrtRqst LogwrtRqst;
 	XLogRecPtr	RedoRecPtr;		/* a recent copy of Insert->RedoRecPtr */
@@ -497,12 +501,6 @@ typedef struct XLogCtlData
 	pg_time_t	lastSegSwitchTime;
 	XLogRecPtr	lastSegSwitchLSN;
 
-	/*
-	 * Protected by info_lck and WALWriteLock (you must hold either lock to
-	 * read it, but both to update)
-	 */
-	XLogwrtResult LogwrtResult;
-
 	/*
 	 * Latest initialized page in the cache (last byte position + 1).
 	 *
@@ -622,7 

Re: New Object Access Type hooks

2022-03-22 Thread Mark Dilger



> On Mar 22, 2022, at 9:09 AM, Andrew Dunstan  wrote:
> 
> If I can't come up with a very quick fix I'll revert it.

The problem is coming from the REGRESS_exec_check_perms, which was included in 
the patch to demonstrate when the other hooks fired relative to the 
ExecutorCheckPerms_hook, but since it is causing problems, I can submit a patch 
with that removed.  Give me a couple minutes


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: New Object Access Type hooks

2022-03-22 Thread Andrew Dunstan


On 3/22/22 12:02, Tom Lane wrote:
> Andrew Dunstan  writes:
>> That seems quite weird. I'm not sure how it's getting loaded at all if
>> not via shared_preload_libraries
> Some other animals are showing this:
>
> diff -U3 
> /home/postgres/pgsql/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out
>  
> /home/postgres/pgsql/src/test/modules/test_oat_hooks/results/test_oat_hooks.out
> --- 
> /home/postgres/pgsql/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out
>   2022-03-22 11:57:40.224991011 -0400
> +++ 
> /home/postgres/pgsql/src/test/modules/test_oat_hooks/results/test_oat_hooks.out
>2022-03-22 11:59:59.998983366 -0400
> @@ -48,6 +48,8 @@
>  SELECT * FROM regress_test_table;
>  NOTICE:  in executor check perms: superuser attempting execute
>  NOTICE:  in executor check perms: superuser finished execute
> +NOTICE:  in executor check perms: superuser attempting execute
> +NOTICE:  in executor check perms: superuser finished execute
>   t 
>  ---
>  (0 rows)
> @@ -95,6 +97,8 @@
>^
>  NOTICE:  in executor check perms: non-superuser attempting execute
>  NOTICE:  in executor check perms: non-superuser finished execute
> +NOTICE:  in executor check perms: non-superuser attempting execute
> +NOTICE:  in executor check perms: non-superuser finished execute
>   t 
>  ---
>  (0 rows)
> @@ -168,6 +172,8 @@
>^
>  NOTICE:  in executor check perms: superuser attempting execute
>  NOTICE:  in executor check perms: superuser finished execute
> +NOTICE:  in executor check perms: superuser attempting execute
> +NOTICE:  in executor check perms: superuser finished execute
>   t 
>  ---
>  (0 rows)
>
>
> I can duplicate that by adding "force_parallel_mode = regress"
> to test_oat_hooks.conf, so a fair bet is that the duplication
> comes from executing the same hook in both leader and worker.
>
>   



OK, thanks. My test didn't include that one setting :-(


If I can't come up with a very quick fix I'll revert it.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: New Object Access Type hooks

2022-03-22 Thread Tom Lane
Andrew Dunstan  writes:
> That seems quite weird. I'm not sure how it's getting loaded at all if
> not via shared_preload_libraries

Some other animals are showing this:

diff -U3 
/home/postgres/pgsql/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out
 /home/postgres/pgsql/src/test/modules/test_oat_hooks/results/test_oat_hooks.out
--- 
/home/postgres/pgsql/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out
2022-03-22 11:57:40.224991011 -0400
+++ 
/home/postgres/pgsql/src/test/modules/test_oat_hooks/results/test_oat_hooks.out 
2022-03-22 11:59:59.998983366 -0400
@@ -48,6 +48,8 @@
 SELECT * FROM regress_test_table;
 NOTICE:  in executor check perms: superuser attempting execute
 NOTICE:  in executor check perms: superuser finished execute
+NOTICE:  in executor check perms: superuser attempting execute
+NOTICE:  in executor check perms: superuser finished execute
  t 
 ---
 (0 rows)
@@ -95,6 +97,8 @@
   ^
 NOTICE:  in executor check perms: non-superuser attempting execute
 NOTICE:  in executor check perms: non-superuser finished execute
+NOTICE:  in executor check perms: non-superuser attempting execute
+NOTICE:  in executor check perms: non-superuser finished execute
  t 
 ---
 (0 rows)
@@ -168,6 +172,8 @@
   ^
 NOTICE:  in executor check perms: superuser attempting execute
 NOTICE:  in executor check perms: superuser finished execute
+NOTICE:  in executor check perms: superuser attempting execute
+NOTICE:  in executor check perms: superuser finished execute
  t 
 ---
 (0 rows)


I can duplicate that by adding "force_parallel_mode = regress"
to test_oat_hooks.conf, so a fair bet is that the duplication
comes from executing the same hook in both leader and worker.

regards, tom lane




Re: Mingw task for Cirrus CI

2022-03-22 Thread Melih Mutlu
Hi Andres,

Rebased it.
I also removed the temp installation task and used the
NoDefaultCurrentDirectoryInExePath env variable instead.

Best,
Melih
From e8a1dae0ec10efd8a967070e0d412d2bf875fa93 Mon Sep 17 00:00:00 2001
From: Melih Mutlu 
Date: Mon, 21 Feb 2022 14:46:05 +0300
Subject: [PATCH] Added Windows with MinGW environment in Cirrus CI

---
 .cirrus.yml | 79 +++--
 1 file changed, 65 insertions(+), 14 deletions(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index e5335fede7..1ed40347cf 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -23,7 +23,6 @@ env:
   CHECKFLAGS: -Otarget
   PROVE_FLAGS: --timer
   PGCTLTIMEOUT: 120 # avoids spurious failures during parallel tests
-  TEMP_CONFIG: ${CIRRUS_WORKING_DIR}/src/tools/ci/pg_ci_base.conf
   PG_TEST_EXTRA: kerberos ldap ssl
 
 
@@ -334,13 +333,30 @@ task:
 cores_script: src/tools/ci/cores_backtrace.sh macos "${HOME}/cores"
 
 
+WINDOWS_ENVIRONMENT_BASE: &WINDOWS_ENVIRONMENT_BASE
+env:
+  # Half the allowed per-user CPU cores
+  CPUS: 4
+  # The default working dir is in a directory msbuild complains about
+  CIRRUS_WORKING_DIR: "c:/cirrus"
+  TEMP_CONFIG: ${CIRRUS_WORKING_DIR}/src/tools/ci/pg_ci_base.conf
+
+  # Avoids port conflicts between concurrent tap test runs
+  PG_TEST_USE_UNIX_SOCKETS: 1
+
+only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*windows.*'
+
+sysinfo_script: |
+  chcp
+  systeminfo
+  powershell -Command get-psdrive -psprovider filesystem
+  set
+
 task:
+  << : *WINDOWS_ENVIRONMENT_BASE
   name: Windows - Server 2019, VS 2019
 
   env:
-# Half the allowed per-user CPU cores
-CPUS: 4
-
 # Our windows infrastructure doesn't have test concurrency above the level
 # of a single vcregress test target. Due to that, it's useful to run prove
 # with multiple jobs. For the other tasks it isn't, because two sources
@@ -350,15 +366,11 @@ task:
 # likely can be improved upon further.
 PROVE_FLAGS: -j10 --timer
 
-# The default cirrus working dir is in a directory msbuild complains about
-CIRRUS_WORKING_DIR: "c:/cirrus"
 # Avoid re-installing over and over
 NO_TEMP_INSTALL: 1
 # git's tar doesn't deal with drive letters, see
 # https://postgr.es/m/b6782dc3-a7b0-ed56-175f-f8f54cb08d67%40dunslane.net
 TAR: "c:/windows/system32/tar.exe"
-# Avoids port conflicts between concurrent tap test runs
-PG_TEST_USE_UNIX_SOCKETS: 1
 PG_REGRESS_SOCK_DIR: "c:/cirrus/"
 # -m enables parallelism
 # verbosity:minimal + Summary reduce verbosity, while keeping a summary of
@@ -389,12 +401,6 @@ task:
 cpu: $CPUS
 memory: 4G
 
-  sysinfo_script: |
-chcp
-systeminfo
-powershell -Command get-psdrive -psprovider filesystem
-set
-
   setup_additional_packages_script: |
 REM choco install -y --no-progress ...
 
@@ -454,6 +460,51 @@ task:
   path: "crashlog-*.txt"
   type: text/plain
 
+task:
+  << : *WINDOWS_ENVIRONMENT_BASE
+  name: Windows - Server 2019, MinGW64
+  windows_container:
+image: $CONTAINER_REPO/windows_ci_mingw64:latest
+cpu: $CPUS
+memory: 4G
+  env:
+CCACHE_DIR: C:/msys64/ccache
+BUILD_DIR: "%CIRRUS_WORKING_DIR%/build"
+
+  ccache_cache:
+folder: ${CCACHE_DIR}
+
+  mingw_info_script:
+- C:\msys64\usr\bin\bash.exe -lc "where gcc"
+- C:\msys64\usr\bin\bash.exe -lc "gcc --version"
+- C:\msys64\usr\bin\bash.exe -lc "where perl"
+- C:\msys64\usr\bin\bash.exe -lc "perl --version"
+
+  configure_script:
+- C:\msys64\usr\bin\bash.exe -lc "mkdir %BUILD_DIR% &&
+  cd %BUILD_DIR% &&
+  %CIRRUS_WORKING_DIR%/configure
+--host=x86_64-w64-mingw32
+--enable-cassert
+--enable-tap-tests
+--with-icu
+--with-libxml
+--with-libxslt
+--with-lz4
+--enable-debug
+CC='ccache gcc'
+CXX='ccache g++'"
+
+  build_script:
+C:\msys64\usr\bin\bash.exe -lc "cd %BUILD_DIR% && make -s world-bin -j${CPUS}"
+
+  upload_caches: ccache
+
+  tests_script:
+  - set "NoDefaultCurrentDirectoryInExePath=0"
+  - C:\msys64\usr\bin\bash.exe -lc "cd %BUILD_DIR% && make -s ${CHECK} ${CHECKFLAGS} -j${CPUS} TMPDIR=%BUILD_DIR%/tmp_install"
+
+  on_failure: *on_failure
 
 task:
   name: CompilerWarnings
-- 
2.35.1.windows.2



Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2022-03-22 Thread Robert Haas
On Tue, Mar 22, 2022 at 11:42 AM Andres Freund  wrote:
> I might have missed it because I just skimmed the patch. But I still think it
> should contain a comment detailing why accessing catalogs from another
> database is safe in this instance, and perhaps a comment or three in places
> that could break it (e.g. snapshot computation, horizon stuff).

Please see the function header comment for ScanSourceDatabasePgClass.
I don't quite see how changes in those places would break this, but if
you want to be more specific perhaps I will see the light?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: preserving db/ts/relfilenode OIDs across pg_upgrade (was Re: storing an explicit nonce)

2022-03-22 Thread Robert Haas
On Mon, Mar 21, 2022 at 8:52 PM Andres Freund  wrote:
> I noticed this still has an open CF entry: 
> https://commitfest.postgresql.org/37/3296/
> I assume it can be marked as committed?

Yeah, done now. But don't forget that we still need to do something on
the "wrong fds used for refilenodes after pg_upgrade relfilenode
changes Reply-To:" thread, and I think the ball is in your court
there.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: New Object Access Type hooks

2022-03-22 Thread Andrew Dunstan


On 3/22/22 11:26, Mark Dilger wrote:
>
>> On Mar 22, 2022, at 8:14 AM, Julien Rouhaud  wrote:
>>
>> Hi,
>>
>> On Tue, Mar 22, 2022 at 10:41:05AM -0400, Andrew Dunstan wrote:
>>> Pushed with slight adjustments - the LOAD was unnecessary as was the
>>> setting of client_min_messages - the latter would have made buildfarm
>>> animals unhappy.
>> For the record this just failed on my buildfarm animal:
> https://brekka.postgresql.org/cgi-bin/show_stage_log.pl?nm=lapwing&dt=2022-03-22%2014%3A40%3A10&stg=misc-check.
> culicidae is complaining:
>
> ==~_~===-=-===~_~== 
> pgsql.build/src/test/modules/test_oat_hooks/log/postmaster.log 
> ==~_~===-=-===~_~==
> 2022-03-22 14:53:27.175 UTC [2166986][postmaster][:0] LOG:  starting 
> PostgreSQL 15devel on x86_64-pc-linux-gnu, compiled by gcc (Debian 11.2.0-18) 
> 11.2.0, 64-bit
> 2022-03-22 14:53:27.175 UTC [2166986][postmaster][:0] LOG:  listening on Unix 
> socket "/tmp/pg_regress-RiE7x8/.s.PGSQL.6280"
> 2022-03-22 14:53:27.198 UTC [2167008][not initialized][:0] FATAL:  
> test_oat_hooks must be loaded via shared_preload_libraries
> 2022-03-22 14:53:27.202 UTC [2167006][not initialized][:0] FATAL:  
> test_oat_hooks must be loaded via shared_preload_libraries
> 2022-03-22 14:53:27.203 UTC [2167009][not initialized][:0] FATAL:  
> test_oat_hooks must be loaded via shared_preload_libraries
> 2022-03-22 14:53:27.204 UTC [2166986][postmaster][:0] LOG:  checkpointer 
> process (PID 2167006) exited with exit code 1
> 2022-03-22 14:53:27.204 UTC [2166986][postmaster][:0] LOG:  terminating any 
> other active server processes
> 2022-03-22 14:53:27.204 UTC [2166986][postmaster][:0] LOG:  shutting down 
> because restart_after_crash is off
> 2022-03-22 14:53:27.206 UTC [2166986][postmaster][:0] LOG:  database system 
> is shut down
> ==~_~===-=-===~_~== 
> pgsql.build/src/test/modules/test_rls_hooks/log/initdb.log ==~_~===-=-===~_~==
>
>


That seems quite weird. I'm not sure how it's getting loaded at all if
not via shared_preload_libraries
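
The FATAL itself presumably comes from a guard along these lines in the
module's _PG_init() -- a sketch of the common preload-only pattern, not
necessarily the exact test_oat_hooks code:

    #include "postgres.h"
    #include "fmgr.h"
    #include "miscadmin.h"

    PG_MODULE_MAGIC;

    void _PG_init(void);

    void
    _PG_init(void)
    {
        /* Refuse to load except via process_shared_preload_libraries(). */
        if (!process_shared_preload_libraries_in_progress)
            ereport(FATAL,
                    (errmsg("test_oat_hooks must be loaded via shared_preload_libraries")));

        /* ... install the object access hooks here ... */
    }

Given which animals fail, the question is why that flag is apparently not
set when those child processes (checkpointer et al.) load the library --
perhaps something specific to EXEC_BACKEND builds, where each server
process re-loads shared libraries itself.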


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2022-03-22 Thread Andres Freund
Hi,

On 2022-03-22 11:23:16 -0400, Robert Haas wrote:
> From 116bcdb6174a750b7ef7ae05ef6f39cebaf9bcf5 Mon Sep 17 00:00:00 2001
> From: Robert Haas 
> Date: Tue, 22 Mar 2022 11:22:26 -0400
> Subject: [PATCH v1] Add new block-by-block strategy for CREATE DATABASE.

I might have missed it because I just skimmed the patch. But I still think it
should contain a comment detailing why accessing catalogs from another
database is safe in this instance, and perhaps a comment or three in places
that could break it (e.g. snapshot computation, horizon stuff).

Greetings,

Andres Freund




Re: [PATCH] pgbench: add multiconnect option

2022-03-22 Thread David Christensen
On Sat, Mar 19, 2022 at 11:43 AM Fabien COELHO  wrote:

>
> Hi Sami,
>
> > Pgbench is a simple benchmark tool by design, and I wonder if adding
> > a multiconnect feature will cause pgbench to be used incorrectly.
>
> Maybe, but I do not see how it would be worse than what pgbench already
> allows.
>

I agree that pgbench is simple; perhaps too simple to measure much more
than basic query flows.  What pgbench does have in its favor is being
distributed with the core distribution.

I think there is definitely space for a more sophisticated benchmarking
tool that exercises more realistic query patterns and scenarios.  Whether
that is distributed with the core is another question.

David


Re: refactoring basebackup.c (zstd workers)

2022-03-22 Thread Robert Haas
On Mon, Mar 21, 2022 at 2:41 PM Dagfinn Ilmari Mannsåker
 wrote:
> This is no longer accurate. How about something like "Specifies
> details of the chosen compression method"?

Good catch. v5 attached.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


v5-0001-Replace-BASE_BACKUP-COMPRESSION_LEVEL-option-with.patch
Description: Binary data


Re: Broken make dependency somewhere near win32ver.o?

2022-03-22 Thread Andres Freund
Hi,

On 2022-03-22 18:09:08 +1300, Thomas Munro wrote:
> On Tue, Mar 22, 2022 at 4:14 PM Andres Freund  wrote:
> > The problem looks to be that pg_dumpall doesn't have a dependency on OBJs,
> > which in turn is what contains the dependency on WIN32RES, which in turn
> > contains win32ver.o. So the build succeeds if pg_dump/restores's 
> > dependencies
> > are built first, but not if pg_dumpall starts to be built before that...
> >
> > Seems we just need to add $(WIN32RES) to pg_dumpall: ?
> 
> Ah, yeah, that looks right.  I don't currently have a mingw setup to
> test, but clearly $(WIN32RES) is passed to $(CC) despite not being
> listed as a dependency.

Pushed a fix for that. Ended up doing it for all branches, although I was
debating with myself about doing so.

I did a quick search and didn't find other cases of this problem.
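
For reference, the fix amounts to adding $(WIN32RES) to the pg_dumpall
rule in src/bin/pg_dump/Makefile, roughly like this (a sketch; the object
list is abbreviated and may not match the tree exactly):

    # Make pg_dumpall depend on the Windows resource file directly, rather
    # than relying on pg_dump's dependencies having built it already.
    pg_dumpall: pg_dumpall.o dumputils.o $(WIN32RES) | submake-libpq submake-libpgport
    	$(CC) $(CFLAGS) pg_dumpall.o dumputils.o $(WIN32RES) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)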

Greetings,

Andres Freund




Re: SQL/JSON: JSON_TABLE

2022-03-22 Thread Andrew Dunstan


On 3/22/22 10:55, Daniel Gustafsson wrote:
>> On 22 Mar 2022, at 16:31, Andrew Dunstan  wrote:
>> I'm planning on pushing the functions patch set this week and json-table
>> next week.
> My comments from 30827b3c-edf6-4d41-bbf1-298181874...@yesql.se have yet to
> be addressed (or responded to at all) in this patchset.  I'll paste the
> ones which still apply to make it easier:


Thanks for reminding me. I will make sure these are attended to.


cheers


andrew



--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: SQL/JSON: JSON_TABLE

2022-03-22 Thread Andrew Dunstan


On 3/22/22 10:53, Alvaro Herrera wrote:
> On 2022-Mar-22, Andrew Dunstan wrote:
>
>> I'm planning on pushing the functions patch set this week and json-table
>> next week.
> I think it'd be a good idea to push the patches one by one and let a day
> between each.  That would make it easier to chase buildfarm issues
> individually, and make sure they are fixed before the next step.
> Each patch in each series is already big enough.



OK, can do it that way. First one will be later today then.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: New Object Access Type hooks

2022-03-22 Thread Mark Dilger



> On Mar 22, 2022, at 8:14 AM, Julien Rouhaud  wrote:
> 
> Hi,
> 
> On Tue, Mar 22, 2022 at 10:41:05AM -0400, Andrew Dunstan wrote:
>> 
>> Pushed with slight adjustments - the LOAD was unnecessary as was the
>> setting of client_min_messages - the latter would have made buildfarm
>> animals unhappy.
> 
> For the record this just failed on my buildfarm animal:
> https://brekka.postgresql.org/cgi-bin/show_stage_log.pl?nm=lapwing&dt=2022-03-22%2014%3A40%3A10&stg=misc-check.

culicidae is complaining:

==~_~===-=-===~_~== 
pgsql.build/src/test/modules/test_oat_hooks/log/postmaster.log 
==~_~===-=-===~_~==
2022-03-22 14:53:27.175 UTC [2166986][postmaster][:0] LOG:  starting PostgreSQL 
15devel on x86_64-pc-linux-gnu, compiled by gcc (Debian 11.2.0-18) 11.2.0, 
64-bit
2022-03-22 14:53:27.175 UTC [2166986][postmaster][:0] LOG:  listening on Unix 
socket "/tmp/pg_regress-RiE7x8/.s.PGSQL.6280"
2022-03-22 14:53:27.198 UTC [2167008][not initialized][:0] FATAL:  
test_oat_hooks must be loaded via shared_preload_libraries
2022-03-22 14:53:27.202 UTC [2167006][not initialized][:0] FATAL:  
test_oat_hooks must be loaded via shared_preload_libraries
2022-03-22 14:53:27.203 UTC [2167009][not initialized][:0] FATAL:  
test_oat_hooks must be loaded via shared_preload_libraries
2022-03-22 14:53:27.204 UTC [2166986][postmaster][:0] LOG:  checkpointer 
process (PID 2167006) exited with exit code 1
2022-03-22 14:53:27.204 UTC [2166986][postmaster][:0] LOG:  terminating any 
other active server processes
2022-03-22 14:53:27.204 UTC [2166986][postmaster][:0] LOG:  shutting down 
because restart_after_crash is off
2022-03-22 14:53:27.206 UTC [2166986][postmaster][:0] LOG:  database system is 
shut down
==~_~===-=-===~_~== pgsql.build/src/test/modules/test_rls_hooks/log/initdb.log 
==~_~===-=-===~_~==


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






