Re: [BUG] Logical replica crash if there was an error in a function.

Anton A. Melnikov Wed, 16 Nov 2022 06:53:05 -0800

Thanks a lot for the fast reply!

On 03.11.2022 18:29, Tom Lane wrote:

If we were just adding a
query or two to an existing scenario, that could be okay; but spinning
up and syncing a whole new primary and standby database is *expensive*
when you multiply it by the number of times developers and buildfarm
animals are going to run the tests in the future.


On 15.11.2022 04:59, Tom Lane wrote:

"Anton A. Melnikov" <aamelni...@inbox.ru> writes:

On 02.11.2022 21:02, Tom Lane wrote:

I don't think the cost of that test case is justified by the tiny
probability that it'd ever catch anything.

Seems it is possible to do a test without these remarks. The attached
test uses existing nodes and checks the specific error message.


My opinion remains unchanged: this would be a very poor use of test
cycles.


Sorry, i didn't fully understand what is required and
added some functions to the test that spend extra cpu time. But i found
that it is possible to make a test according to previous remarks by adding
only a few extra queries to an existent test without any additional syncing.

Experimentally, i could not observe any significant difference in
CPU usage due to this test addition.
The difference in the CPU usage was less than standard error.
See 100_bugs-CPU-time.txt attached.

Additionally
i've tried to reduce overall number of nodes previously
used in this test in a similar way.


Optimizing existing tests isn't an answer to that.  We could
install those optimizations without adding a new test case.


Oh sure! Here is a separate patch for this optimization:
https://www.postgresql.org/message-id/eb7aa992-c2d7-6ce7-4942-0c784231a362%40inbox.ru
On the contrary with previous case in that one the difference in the CPU usage
during 100_bugs.pl is essential; it decreases approximately by 30%.


With the best wishes!

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 2449158ba576d7c6d97852d14f85dadb3aced262
Author: Anton A. Melnikov <a.melni...@postgrespro.ru>
Date:   Wed Nov 16 11:46:54 2022 +0300

    Add test for syntax error in the function in a a logical replication
    worker. See dea834938.

diff --git a/src/test/subscription/t/100_bugs.pl b/src/test/subscription/t/100_bugs.pl
index 6247aa7730..eb4ab6d359 100644
--- a/src/test/subscription/t/100_bugs.pl
+++ b/src/test/subscription/t/100_bugs.pl
@@ -69,6 +69,51 @@ $node_publisher->wait_for_catchup('sub1');
 
 pass('index predicates do not cause crash');
 
+# https://www.postgresql.org/message-id/flat/adf0452f-8c6b-7def-d35e-ab516c80088e%40inbox.ru
+
+# The bug was that when a syntax error occurred in a SQL-language or PL/pgSQL-language
+# CREATE FUNCTION or DO command executed in a logical replication worker,
+# we'd suffer a null pointer dereference or assertion failure.
+
+$node_subscriber->safe_psql('postgres', q{
+	create or replace procedure rebuild_test(
+	) as
+	$body$
+	declare
+		l_code  text:= E'create or replace function public.test_selector(
+	) returns setof public.tab1 as
+	\$body\$
+		select * from  error_name
+	\$body\$ language sql;';
+	begin
+		execute l_code;
+		perform public.test_selector();
+	end
+	$body$ language plpgsql;
+	create or replace function test_trg()
+	returns trigger as
+	$body$
+		declare
+		begin
+			call public.rebuild_test();
+			return NULL;
+		end
+	$body$ language plpgsql;
+	create trigger test_trigger after insert on tab1 for each row
+								execute function test_trg();
+	alter table tab1 enable replica trigger test_trigger;
+});
+
+# This would crash on the subscriber if not fixed
+$node_publisher->safe_psql('postgres', "INSERT INTO tab1 VALUES (3, 4)");
+
+my $result = $node_subscriber->wait_for_log(
+	"ERROR:  relation \"error_name\" does not exist at character"
+);
+
+ok($result,
+	"ERROR: Logical decoding doesn't fail on function error");
+
 $node_publisher->stop('fast');
 $node_subscriber->stop('fast');

########## src/test/subscription/100_bugs.pl ########################
Before adding test:
Files=1, Tests=7, 14 wallclock secs ( 0.02 usr  0.00 sys +  7.96 cusr  2.24 
csys = 10.22 CPU)
Files=1, Tests=7, 15 wallclock secs ( 0.02 usr  0.00 sys +  8.21 cusr  2.13 
csys = 10.36 CPU)
Files=1, Tests=7, 14 wallclock secs ( 0.01 usr  0.01 sys +  8.69 cusr  2.17 
csys = 10.88 CPU)
Files=1, Tests=7, 15 wallclock secs ( 0.01 usr  0.00 sys +  8.34 cusr  2.22 
csys = 10.57 CPU)
Files=1, Tests=7, 15 wallclock secs ( 0.01 usr  0.00 sys +  8.64 cusr  2.19 
csys = 10.84 CPU)
Files=1, Tests=7, 14 wallclock secs ( 0.02 usr  0.00 sys +  8.18 cusr  2.20 
csys = 10.40 CPU)
Files=1, Tests=7, 13 wallclock secs ( 0.01 usr  0.00 sys +  8.02 cusr  2.23 
csys = 10.26 CPU)
Files=1, Tests=7, 14 wallclock secs ( 0.01 usr  0.01 sys +  8.20 cusr  2.14 
csys = 10.36 CPU)
Files=1, Tests=7, 13 wallclock secs ( 0.02 usr  0.00 sys +  8.04 cusr  2.19 
csys = 10.25 CPU)
Files=1, Tests=7, 14 wallclock secs ( 0.01 usr  0.00 sys +  8.20 cusr  2.09 
csys = 10.30 CPU)
Average CPU usage = 10.44+-0.07s

After adding test:
Files=1, Tests=8, 14 wallclock secs ( 0.02 usr  0.00 sys +  8.13 cusr  2.13 
csys = 10.28 CPU)
Files=1, Tests=8, 13 wallclock secs ( 0.01 usr  0.00 sys +  7.99 cusr  2.11 
csys = 10.11 CPU)
Files=1, Tests=8, 15 wallclock secs ( 0.01 usr  0.01 sys +  8.23 cusr  2.28 
csys = 10.53 CPU)
Files=1, Tests=8, 16 wallclock secs ( 0.02 usr  0.00 sys +  8.36 cusr  2.28 
csys = 10.66 CPU)
Files=1, Tests=8, 13 wallclock secs ( 0.01 usr  0.00 sys +  8.20 cusr  2.21 
csys = 10.42 CPU)
Files=1, Tests=8, 14 wallclock secs ( 0.02 usr  0.00 sys +  8.07 cusr  2.20 
csys = 10.29 CPU)
Files=1, Tests=8, 14 wallclock secs ( 0.02 usr  0.00 sys +  8.38 cusr  2.07 
csys = 10.47 CPU)
Files=1, Tests=8, 14 wallclock secs ( 0.01 usr  0.00 sys +  8.07 cusr  2.18 
csys = 10.26 CPU)
Files=1, Tests=8, 16 wallclock secs ( 0.01 usr  0.00 sys +  8.45 cusr  2.24 
csys = 10.70 CPU)
Files=1, Tests=8, 14 wallclock secs ( 0.01 usr  0.00 sys +  8.17 cusr  2.09 
csys = 10.27 CPU)
Average CPU usage = 10.40+-0.06s

Re: [BUG] Logical replica crash if there was an error in a function.

Reply via email to