date:20240314

Re: Add basic tests for the low-level backup method.

2024-03-14 Thread David Steele


On 3/15/24 18:32, Michael Paquier wrote:

On Fri, Mar 15, 2024 at 06:23:15PM +1300, David Steele wrote:

Well, this is what we recommend in the docs, i.e. using bin mode to save
backup_label, so it seems OK to me.


Indeed, I didn't notice that this is actually documented, so what I
did took the right angle.  French flair, perhaps..


This seems like a reasonable explanation to me.

-David

Re: Add basic tests for the low-level backup method.

2024-03-14 Thread Michael Paquier

On Fri, Mar 15, 2024 at 06:23:15PM +1300, David Steele wrote:
> Well, this is what we recommend in the docs, i.e. using bin mode to save
> backup_label, so it seems OK to me.

Indeed, I didn't notice that this is actually documented, so what I
did took the right angle.  French flair, perhaps..
--
Michael


signature.asc
Description: PGP signature

Re: Add basic tests for the low-level backup method.

2024-03-14 Thread Michael Paquier

On Fri, Mar 15, 2024 at 08:38:47AM +0900, Michael Paquier wrote:
> That's why these tests are not that easy, they can be racy.  I've run
> the test 5~10 times in the CI this time to gain more confidence, and
> saw zero failures with the stability fixes in place including Windows.
> I've applied it now, as I can still monitor the buildfarm for a few
> more days.  Let's see what happens, but that should be better.

So, it looks like the buildfarm is clear.  sidewinder has reported a
green state, and the recent runs of the CFbot across all the patches
are looking stable as well on all platforms.  There are still a few
buildfarm members on Windows that will take time more time before
running.
--
Michael


signature.asc
Description: PGP signature

Re: Add basic tests for the low-level backup method.

2024-03-14 Thread David Steele


On 3/15/24 12:38, Michael Paquier wrote:

On Fri, Mar 15, 2024 at 09:40:38AM +1300, David Steele wrote:

Is the missing test in meson the reason we did not see test failures for
Windows in CI?


The test has to be listed in src/test/recovery/meson.build or the CI
would ignore it.


Right -- I will keep this in mind for the future.


The second LOG is something that can be acted on.  I've added some
debugging to the parsing of the backup_label file in the backend, and
noticed that the first fscanf() for START WAL LOCATION is failing
because the last %c is detected as \r rather than \n.  Tweaking the
contents stored from pg_backend_stop() with a sed won't help, because
the issue is that we write the CRLFs with append_to_file, and the
startup process cannot cope with that.  The simplest method I can
think of is to use binmode, as of the attached.


Yeah, that makes sense.


I am wondering if there is a better trick here that would not require
changes in the backend to make the backup_label parsing more flexible,
though.


Well, this is what we recommend in the docs, i.e. using bin mode to save 
backup_label, so it seems OK to me.



I am attaching an updated patch with all that fixed, which is stable
in the CI and any tests I've run.  Do you have any comments about


These changes look good to me. Sure wish we had an easier to way to test
commits in the build farm.


That's why these tests are not that easy, they can be racy.  I've run
the test 5~10 times in the CI this time to gain more confidence, and
saw zero failures with the stability fixes in place including Windows.
I've applied it now, as I can still monitor the buildfarm for a few
more days.  Let's see what happens, but that should be better.


At least sidewinder is happy now -- and the build farm in general as far 
as I can see.


Thank you for your help on this!
-David

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-14 Thread Bharath Rupireddy

On Wed, Mar 13, 2024 at 9:38 AM Amit Kapila  wrote:
>
> BTW, is XID the based parameter 'max_slot_xid_age' not have similarity
> with 'max_slot_wal_keep_size'? I think it will impact the rows we
> removed based on xid horizons. Don't we need to consider it while
> vacuum computing the xid horizons in ComputeXidHorizons() similar to
> what we do for WAL w.r.t 'max_slot_wal_keep_size'?

I'm having a hard time understanding why we'd need something up there
in ComputeXidHorizons(). Can you elaborate it a bit please?

What's proposed with max_slot_xid_age is that during checkpoint we
look at slot's xmin and catalog_xmin, and the current system txn id.
Then, if the XID age of (xmin, catalog_xmin) and current_xid crosses
max_slot_xid_age, we invalidate the slot.  Let me illustrate how all
this works:

1. Setup a primary and standby with hot_standby_feedback set to on on
standby. For instance, check my scripts at [1].

2. Stop the standby to make the slot inactive on the primary. Check
the slot is holding xmin of 738.
./pg_ctl -D sbdata -l logfilesbdata stop

postgres=# SELECT * FROM pg_replication_slots;
-[ RECORD 1 ]---+-
slot_name   | sb_repl_slot
plugin  |
slot_type   | physical
datoid  |
database|
temporary   | f
active  | f
active_pid  |
xmin| 738
catalog_xmin|
restart_lsn | 0/300
confirmed_flush_lsn |
wal_status  | reserved
safe_wal_size   |
two_phase   | f
conflict_reason |
failover| f
synced  | f

3. Start consuming the XIDs on the primary with the following script
for instance
./psql -d postgres -p 5432
DROP TABLE tab_int;
CREATE TABLE tab_int (a int);

do $$
begin
  for i in 1..268435 loop
-- use an exception block so that each iteration eats an XID
begin
  insert into tab_int values (i);
exception
  when division_by_zero then null;
end;
  end loop;
end$$;

4. Make some dead rows in the table.
update tab_int set a = a+1;
delete from tab_int where a%4=0;

postgres=# SELECT n_dead_tup, n_tup_ins, n_tup_upd, n_tup_del FROM
pg_stat_user_tables WHERE relname = 'tab_int';
-[ RECORD 1 ]--
n_dead_tup | 335544
n_tup_ins  | 268435
n_tup_upd  | 268435
n_tup_del  | 67109

5. Try vacuuming to delete the dead rows, observe 'tuples: 0 removed,
536870 remain, 335544 are dead but not yet removable'. The dead rows
can't be removed because the inactive slot is holding an xmin, see
'removable cutoff: 738, which was 268441 XIDs old when operation
ended'.

postgres=# vacuum verbose tab_int;
INFO:  vacuuming "postgres.public.tab_int"
INFO:  finished vacuuming "postgres.public.tab_int": index scans: 0
pages: 0 removed, 2376 remain, 2376 scanned (100.00% of total)
tuples: 0 removed, 536870 remain, 335544 are dead but not yet removable
removable cutoff: 738, which was 268441 XIDs old when operation ended
frozen: 0 pages from table (0.00% of total) had 0 tuples frozen
index scan not needed: 0 pages from table (0.00% of total) had 0 dead
item identifiers removed
avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
buffer usage: 4759 hits, 0 misses, 0 dirtied
WAL usage: 0 records, 0 full page images, 0 bytes
system usage: CPU: user: 0.07 s, system: 0.00 s, elapsed: 0.07 s
VACUUM

6. Now, repeat the above steps but with setting max_slot_xid_age =
20 on the primary.

7. Do a checkpoint to invalidate the slot.
postgres=# checkpoint;
CHECKPOINT
postgres=# SELECT * FROM pg_replication_slots;
-[ RECORD 1 ]---+-
slot_name   | sb_repl_slot
plugin  |
slot_type   | physical
datoid  |
database|
temporary   | f
active  | f
active_pid  |
xmin| 738
catalog_xmin|
restart_lsn | 0/300
confirmed_flush_lsn |
wal_status  | lost
safe_wal_size   |
two_phase   | f
conflicting |
failover| f
synced  | f
invalidation_reason | xid_aged

8. And, then vacuum the table, observe 'tuples: 335544 removed, 201326
remain, 0 are dead but not yet removable'.

postgres=# vacuum verbose tab_int;
INFO:  vacuuming "postgres.public.tab_int"
INFO:  finished vacuuming "postgres.public.tab_int": index scans: 0
pages: 0 removed, 2376 remain, 2376 scanned (100.00% of total)
tuples: 335544 removed, 201326 remain, 0 are dead but not yet removable
removable cutoff: 269179, which was 0 XIDs old when operation ended
new relfrozenxid: 269179, which is 268441 XIDs ahead of previous value
frozen: 1189 pages from table (50.04% of total) had 201326 tuples frozen
index scan not needed: 0 pages from table (0.00% of total) had 0 dead
item identifiers removed
avg read rate: 0.000 MB/s, avg write rate: 193.100 MB/s
buffer usage: 4760 hits, 0 misses, 2381 dirtied
WAL usage: 5942 records, 2378 full page images, 8343275 bytes
system usage: CPU: user: 0.09 s, system: 0.00 s, elapsed: 0.09 s
VACUUM

[1]
cd

Re: POC, WIP: OR-clause support for indexes

2024-03-14 Thread Andrei Lepikhov


On 14/3/2024 17:39, Alexander Korotkov wrote:

Thank you, Andrei.  Looks like a very undesirable side effect.  Do you
have any idea why it happens?  Partition pruning should work correctly
for both transformed and non-transformed quals, why does
transformation hurt it?
Now we have the v23-0001-* patch with all issues resolved. The last one 
which caused execution stage pruning was about necessity to evaluate 
SAOP expression right after transformation. In previous version the core 
executed it on transformed expressions.


> As you can see this case is not related to partial indexes.  Just no
> index selective for the whole query.  However, splitting scan by the
> OR qual lets use a combination of two selective indexes.
Thanks for the case. I will try to resolve it.

--
regards,
Andrei Lepikhov
Postgres Professional
From 156c00c820a38e5e1856f07363af87b3109b5d77 Mon Sep 17 00:00:00 2001
From: Alena Rybakina 
Date: Fri, 2 Feb 2024 22:01:09 +0300
Subject: [PATCH 1/2] Transform OR clauses to ANY expression.

Replace (expr op C1) OR (expr op C2) ... with expr op ANY(ARRAY[C1, C2, ...]) 
on the
preliminary stage of optimization when we are still working with the
expression tree.
Here C is a constant expression, 'expr' is non-constant expression, 'op' is
an operator which returns boolean result and has a commuter (for the case of
reverse order of constant and non-constant parts of the expression,
like 'CX op expr').
Sometimes it can lead to not optimal plan. But we think it is better to have
array of elements instead of a lot of OR clauses. Here is a room for further
optimizations on decomposing that array into more optimal parts.
Authors: Alena Rybakina , Andrey Lepikhov 

Reviewed-by: Peter Geoghegan , Ranier Vilela 
Reviewed-by: Alexander Korotkov , Robert Haas 

Reviewed-by: jian he 
---
 .../postgres_fdw/expected/postgres_fdw.out|   8 +-
 doc/src/sgml/config.sgml  |  17 +
 src/backend/nodes/queryjumblefuncs.c  |  27 ++
 src/backend/optimizer/prep/prepqual.c | 374 +-
 src/backend/utils/misc/guc_tables.c   |  11 +
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/include/nodes/queryjumble.h   |   1 +
 src/include/optimizer/optimizer.h |   2 +
 src/test/regress/expected/create_index.out| 156 +++-
 src/test/regress/expected/join.out|  62 ++-
 src/test/regress/expected/partition_prune.out | 215 +-
 src/test/regress/expected/stats_ext.out   |  12 +-
 src/test/regress/expected/sysviews.out|   3 +-
 src/test/regress/expected/tidscan.out |  23 +-
 src/test/regress/sql/create_index.sql |  35 ++
 src/test/regress/sql/join.sql |  10 +
 src/test/regress/sql/partition_prune.sql  |  22 ++
 src/test/regress/sql/tidscan.sql  |   6 +
 src/tools/pgindent/typedefs.list  |   2 +
 19 files changed, 929 insertions(+), 58 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out 
b/contrib/postgres_fdw/expected/postgres_fdw.out
index 58a603ac56..a965b43cc6 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -8838,18 +8838,18 @@ insert into utrtest values (2, 'qux');
 -- Check case where the foreign partition is a subplan target rel
 explain (verbose, costs off)
 update utrtest set a = 1 where a = 1 or a = 2 returning *;
- QUERY PLAN
 
-
+  QUERY PLAN   
   
+--
  Update on public.utrtest
Output: utrtest_1.a, utrtest_1.b
Foreign Update on public.remp utrtest_1
Update on public.locp utrtest_2
->  Append
  ->  Foreign Update on public.remp utrtest_1
-   Remote SQL: UPDATE public.loct SET a = 1 WHERE (((a = 1) OR (a 
= 2))) RETURNING a, b
+   Remote SQL: UPDATE public.loct SET a = 1 WHERE ((a = ANY 
('{1,2}'::integer[]))) RETURNING a, b
  ->  Seq Scan on public.locp utrtest_2
Output: 1, utrtest_2.tableoid, utrtest_2.ctid, NULL::record
-   Filter: ((utrtest_2.a = 1) OR (utrtest_2.a = 2))
+   Filter: (utrtest_2.a = ANY ('{1,2}'::integer[]))
 (10 rows)
 
 -- The new values are concatenated with ' triggered !'
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 65a6e6c408..2de6ae301a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5472,6 +5472,23 @@ ANY num_sync ( 
+  enable_or_transformation (boolean)
+   
+enable_or_transformation configuration 
parameter
+   
+  
+  
+   
+Enables or disables the query

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-14 Thread shveta malik

On Thu, Mar 14, 2024 at 7:58 PM Bharath Rupireddy
 wrote:
>
> On Thu, Mar 14, 2024 at 12:24 PM Amit Kapila  wrote:
> >
> > On Wed, Mar 13, 2024 at 9:24 PM Bharath Rupireddy
> > >
> > > Yes, there will be some sort of duplicity if we emit conflict_reason
> > > as a text field. However, I still think the better way is to turn
> > > conflict_reason text to conflict boolean and set it to true only on
> > > rows_removed and wal_level_insufficient invalidations. When conflict
> > > boolean is true, one (including all the tests that we've added
> > > recently) can look for invalidation_reason text field for the reason.
> > > This sounds reasonable to me as opposed to we just mentioning in the
> > > docs that "if invalidation_reason is rows_removed or
> > > wal_level_insufficient it's the reason for conflict with recovery".

+1 on maintaining both conflicting and invalidation_reason

> > Fair point. I think we can go either way. Bertrand, Nathan, and
> > others, do you have an opinion on this matter?
>
> While we wait to hear from others on this, I'm attaching the v9 patch
> set implementing the above idea (check 0001 patch). Please have a
> look. I'll come back to the other review comments soon.

Thanks for the patch. JFYI, patch09 does not apply to HEAD, some
recent commit caused the conflict.

Some trivial comments on patch001 (yet to review other patches)

1)
info.c:

- "%s as caught_up, conflict_reason IS NOT NULL as invalid "
+ "%s as caught_up, invalidation_reason IS NOT NULL as invalid "

Can we revert back to 'conflicting as invalid' since it is a query for
logical slots only.

2)
040_standby_failover_slots_sync.pl:

- q{SELECT conflict_reason IS NULL AND synced AND NOT temporary FROM
pg_replication_slots WHERE slot_name = 'lsub1_slot';}
+ q{SELECT invalidation_reason IS NULL AND synced AND NOT temporary
FROM pg_replication_slots WHERE slot_name = 'lsub1_slot';}

Here too, can we have 'NOT conflicting' instead of '
invalidation_reason IS NULL' as it is a logical slot test.

thanks
Shveta

Re: Add publisher and subscriber to glossary documentation.

2024-03-14 Thread Euler Taveira

On Fri, Mar 15, 2024, at 1:14 AM, Amit Kapila wrote:
> I think node should mean instance for both physical and logical
> replication, otherwise, it would be confusing. We need both the usages
> as a particular publication/subscription is defined at the database
> level but the server on which we define those is referred to as a
> node/instance.

If you are creating a subscription that connects to the same instance
(replication between 2 databases in the same cluster), your definition is not
correct and Alvaro's definition is accurate. The node definition is closely
linked to the connection string. While the physical replication does not
specify a database (meaning "any database" referring to an instance), the
logical replication requires a database.

--
Euler Taveira
EDB   https://www.enterprisedb.com/

Re: Skip collecting decoded changes of already-aborted transactions

2024-03-14 Thread Ajin Cherian

On Fri, Mar 15, 2024 at 3:17 PM Masahiko Sawada 
wrote:

>
> I resumed working on this item. I've attached the new version patch.
>
> I rebased the patch to the current HEAD and updated comments and
> commit messages. The patch is straightforward and I'm somewhat
> satisfied with it, but I'm thinking of adding some tests for it.
>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com


I just had a look at the patch, the patch no longer applies because of a
removal of a header in a recent commit. Overall the patch looks fine, and I
didn't find any issues. Some cosmetic comments:
in ReorderBufferCheckTXNAbort()
+ /* Quick return if we've already knew the transaction status */
+ if (txn->aborted)
+ return true;

knew/know

/*
+ * If logical_replication_mode is "immediate", we don't check the
+ * transaction status so the caller always process this transaction.
+ */
+ if (debug_logical_replication_streaming ==
DEBUG_LOGICAL_REP_STREAMING_IMMEDIATE)
+ return false;

/process/processes

regards,
Ajin Cherian
Fujitsu Australia

Re: Add publisher and subscriber to glossary documentation.

2024-03-14 Thread Amit Kapila

On Thu, Mar 14, 2024 at 7:51 PM Alvaro Herrera  wrote:
>
> On 2024-Mar-14, Shlok Kyal wrote:
>
> > Andrew Atkinson wrote:
> >
> > > Anyway, hopefully these examples show “node” and “database” are
> > > mixed and perhaps others agree using one consistently might help the
> > > goals of the docs.
> >
> > For me the existing content looks good, I felt let's keep it as it is
> > unless others feel differently.
>
> Actually it's these small terminology glitches that give me pause.  If
> we're going to have terms that are interchangeable (in this case "node"
> and "database"), then they should be always interchangeable, not just in
> some unspecified cases.  Maybe the idea of using "node" (which sounds
> like something that's instance-wide) is wrong for logical replication,
> which is necessarily something that happens database-locally.
>
> Then again, maybe defining "node" as something that exists at a
> database-local level when used in the context of logical replication is
> sufficient.  In that case, it would be better to avoid defining it as a
> synonym of "instance".  Then the terms are not always interchangeable,
> but it's clear when they are and when they aren't.
>
> "Node: in replication, each of the endpoints to which or
> from which data is replicated.  In the context of physical replication,
> each node is an instance.  In the context of logical replication, each
> node is a database".
>

I think node should mean instance for both physical and logical
replication, otherwise, it would be confusing. We need both the usages
as a particular publication/subscription is defined at the database
level but the server on which we define those is referred to as a
node/instance.

One of the usages pointed out by Andrew: "The subscriber database..."
[1] is unclear but I feel we can use node there as well instead of
database.

[1] - 
https://www.postgresql.org/docs/current/logical-replication-subscription.html

-- 
With Regards,
Amit Kapila.

Re: speed up a logical replica setup

2024-03-14 Thread Euler Taveira

On Wed, Mar 13, 2024, at 10:09 AM, Shlok Kyal wrote:
> Added a top-up patch v28-0005 to fix this issue.
> I am not changing the version as v28-0001 to v28-0004 is the same as above.

Thanks for your review!

I'm posting a new patch (v29) that merges the previous patches (v28-0002 and
v28-0003). I applied the fix provided by Hayato [1]. It was an oversight during
a rebase. I also included the patch proposed by Shlok [2] that stops the target
server on error if it is running.

Tomas suggested in [3] that maybe the PID should be replaced with something
else that has more entropy. Instead of PID, it uses a random number for
replication slot and subscription. There is also a concern about converting
multiple standbys that will have the same publication name. It added the same
random number to the publication name so it doesn't fail because the
publication already exists. Documentation was changed based on Tomas feedback.

The user name was always included in the subscriber connection string. Let's
have the libpq to choose it. While on it, a new routine (get_sub_conninfo)
contains the code to build the subscriber connection string.

As I said in [4], there wasn't a way to inform a different configuration file.
If your cluster has a postgresql.conf outside PGDATA, when pg_createsubscriber
starts the server it will fail. The new --config-file option let you inform the
postgresql.conf location and the server is started just fine.

I also did some changes in the start_standby_server routine. I replaced the
strcat and snprintf with appendPQExpBuffer that has been used to store the
pg_ctl command.

[1]
https://www.postgresql.org/message-id/TYCPR01MB12077FD21BB186C5A685C0BF3F52A2%40TYCPR01MB12077.jpnprd01.prod.outlook.com
[2]
https://www.postgresql.org/message-id/CANhcyEW6-dH28gLbFc5XpDTJ6JPizU%2Bt5g-aKUWJBf5W_Zriqw%40mail.gmail.com
[3]
https://www.postgresql.org/message-id/6423dfeb-a729-45d3-b71e-7bf1b3adb0c9%40enterprisedb.com
[4]
https://www.postgresql.org/message-id/d898faad-f6d7-4b0d-b816-b9dcdf490685%40app.fastmail.com

1 2 >

1 - 100 of 149 matches

Mail list logo