Thanks a lot for the fast reply!
On 03.11.2022 18:29, Tom Lane wrote:
If we were just adding a
query or two to an existing scenario, that could be okay; but spinning
up and syncing a whole new primary and standby database is *expensive*
when you multiply it by the number of times developers and buildfarm
animals are going to run the tests in the future.
On 15.11.2022 04:59, Tom Lane wrote:
"Anton A. Melnikov" <aamelni...@inbox.ru> writes:
On 02.11.2022 21:02, Tom Lane wrote:
I don't think the cost of that test case is justified by the tiny
probability that it'd ever catch anything.
Seems it is possible to do a test without these remarks. The attached
test uses existing nodes and checks the specific error message.
My opinion remains unchanged: this would be a very poor use of test
cycles.
Sorry, i didn't fully understand what is required and
added some functions to the test that spend extra cpu time. But i found
that it is possible to make a test according to previous remarks by adding
only a few extra queries to an existent test without any additional syncing.
Experimentally, i could not observe any significant difference in
CPU usage due to this test addition.
The difference in the CPU usage was less than standard error.
See 100_bugs-CPU-time.txt attached.
Additionally
i've tried to reduce overall number of nodes previously
used in this test in a similar way.
Optimizing existing tests isn't an answer to that. We could
install those optimizations without adding a new test case.
Oh sure! Here is a separate patch for this optimization:
https://www.postgresql.org/message-id/eb7aa992-c2d7-6ce7-4942-0c784231a362%40inbox.ru
On the contrary with previous case in that one the difference in the CPU usage
during 100_bugs.pl is essential; it decreases approximately by 30%.
With the best wishes!
--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
commit 2449158ba576d7c6d97852d14f85dadb3aced262
Author: Anton A. Melnikov <a.melni...@postgrespro.ru>
Date: Wed Nov 16 11:46:54 2022 +0300
Add test for syntax error in the function in a a logical replication
worker. See dea834938.
diff --git a/src/test/subscription/t/100_bugs.pl b/src/test/subscription/t/100_bugs.pl
index 6247aa7730..eb4ab6d359 100644
--- a/src/test/subscription/t/100_bugs.pl
+++ b/src/test/subscription/t/100_bugs.pl
@@ -69,6 +69,51 @@ $node_publisher->wait_for_catchup('sub1');
pass('index predicates do not cause crash');
+# https://www.postgresql.org/message-id/flat/adf0452f-8c6b-7def-d35e-ab516c80088e%40inbox.ru
+
+# The bug was that when a syntax error occurred in a SQL-language or PL/pgSQL-language
+# CREATE FUNCTION or DO command executed in a logical replication worker,
+# we'd suffer a null pointer dereference or assertion failure.
+
+$node_subscriber->safe_psql('postgres', q{
+ create or replace procedure rebuild_test(
+ ) as
+ $body$
+ declare
+ l_code text:= E'create or replace function public.test_selector(
+ ) returns setof public.tab1 as
+ \$body\$
+ select * from error_name
+ \$body\$ language sql;';
+ begin
+ execute l_code;
+ perform public.test_selector();
+ end
+ $body$ language plpgsql;
+ create or replace function test_trg()
+ returns trigger as
+ $body$
+ declare
+ begin
+ call public.rebuild_test();
+ return NULL;
+ end
+ $body$ language plpgsql;
+ create trigger test_trigger after insert on tab1 for each row
+ execute function test_trg();
+ alter table tab1 enable replica trigger test_trigger;
+});
+
+# This would crash on the subscriber if not fixed
+$node_publisher->safe_psql('postgres', "INSERT INTO tab1 VALUES (3, 4)");
+
+my $result = $node_subscriber->wait_for_log(
+ "ERROR: relation \"error_name\" does not exist at character"
+);
+
+ok($result,
+ "ERROR: Logical decoding doesn't fail on function error");
+
$node_publisher->stop('fast');
$node_subscriber->stop('fast');
########## src/test/subscription/100_bugs.pl ########################
Before adding test:
Files=1, Tests=7, 14 wallclock secs ( 0.02 usr 0.00 sys + 7.96 cusr 2.24
csys = 10.22 CPU)
Files=1, Tests=7, 15 wallclock secs ( 0.02 usr 0.00 sys + 8.21 cusr 2.13
csys = 10.36 CPU)
Files=1, Tests=7, 14 wallclock secs ( 0.01 usr 0.01 sys + 8.69 cusr 2.17
csys = 10.88 CPU)
Files=1, Tests=7, 15 wallclock secs ( 0.01 usr 0.00 sys + 8.34 cusr 2.22
csys = 10.57 CPU)
Files=1, Tests=7, 15 wallclock secs ( 0.01 usr 0.00 sys + 8.64 cusr 2.19
csys = 10.84 CPU)
Files=1, Tests=7, 14 wallclock secs ( 0.02 usr 0.00 sys + 8.18 cusr 2.20
csys = 10.40 CPU)
Files=1, Tests=7, 13 wallclock secs ( 0.01 usr 0.00 sys + 8.02 cusr 2.23
csys = 10.26 CPU)
Files=1, Tests=7, 14 wallclock secs ( 0.01 usr 0.01 sys + 8.20 cusr 2.14
csys = 10.36 CPU)
Files=1, Tests=7, 13 wallclock secs ( 0.02 usr 0.00 sys + 8.04 cusr 2.19
csys = 10.25 CPU)
Files=1, Tests=7, 14 wallclock secs ( 0.01 usr 0.00 sys + 8.20 cusr 2.09
csys = 10.30 CPU)
Average CPU usage = 10.44+-0.07s
After adding test:
Files=1, Tests=8, 14 wallclock secs ( 0.02 usr 0.00 sys + 8.13 cusr 2.13
csys = 10.28 CPU)
Files=1, Tests=8, 13 wallclock secs ( 0.01 usr 0.00 sys + 7.99 cusr 2.11
csys = 10.11 CPU)
Files=1, Tests=8, 15 wallclock secs ( 0.01 usr 0.01 sys + 8.23 cusr 2.28
csys = 10.53 CPU)
Files=1, Tests=8, 16 wallclock secs ( 0.02 usr 0.00 sys + 8.36 cusr 2.28
csys = 10.66 CPU)
Files=1, Tests=8, 13 wallclock secs ( 0.01 usr 0.00 sys + 8.20 cusr 2.21
csys = 10.42 CPU)
Files=1, Tests=8, 14 wallclock secs ( 0.02 usr 0.00 sys + 8.07 cusr 2.20
csys = 10.29 CPU)
Files=1, Tests=8, 14 wallclock secs ( 0.02 usr 0.00 sys + 8.38 cusr 2.07
csys = 10.47 CPU)
Files=1, Tests=8, 14 wallclock secs ( 0.01 usr 0.00 sys + 8.07 cusr 2.18
csys = 10.26 CPU)
Files=1, Tests=8, 16 wallclock secs ( 0.01 usr 0.00 sys + 8.45 cusr 2.24
csys = 10.70 CPU)
Files=1, Tests=8, 14 wallclock secs ( 0.01 usr 0.00 sys + 8.17 cusr 2.09
csys = 10.27 CPU)
Average CPU usage = 10.40+-0.06s