On Tue, Oct 18, 2022 at 5:22 PM Dilip Kumar <dilipbal...@gmail.com> wrote:
>
> On Thu, Oct 6, 2022 at 1:37 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> > >
> > While looking at the v35 patch, I realized that there are some cases where
> > the logical replication gets stuck depending on partitioned table
> > structure. For instance, there are the following tables, publication, and
> > subscription:
> >
> > * On publisher
> > create table p (c int) partition by list (c);
> > create table c1 partition of p for values in (1);
> > create table c2 (c int);
> > create publication test_pub for table p, c1, c2 with
> > (publish_via_partition_root = 'true');
> >
> > * On subscriber
> > create table p (c int) partition by list (c);
> > create table c1 partition of p for values in (2);
> > create table c2 partition of p for values in (1);
> > create subscription test_sub connection 'port=5551 dbname=postgres'
> > publication test_pub with (streaming = 'parallel', copy_data =
> > 'false');
> >
> > Note that while both the publisher and the subscriber have tables with
> > the same names, the partition structure is different and rows go to a
> > different table on the subscriber (e.g., row c=1 will go to c2 table on
> > the subscriber). If two concurrent transactions are executed as follows,
> > the apply worker (i.e., the leader apply worker) waits for a lock on c2
> > held by its parallel apply worker:
> >
> > * TX-1
> > BEGIN;
> > INSERT INTO p SELECT 1 FROM generate_series(1, 10000); --- changes are
> > streamed
> >
> > * TX-2
> > BEGIN;
> > TRUNCATE c2; --- wait for a lock on c2
> >
> > * TX-1
> > INSERT INTO p SELECT 1 FROM generate_series(1, 10000);
> > COMMIT;
> >
> > This might not be a common case in practice but it could mean that
> > there is a restriction on how partitioned tables should be structured
> > on the publisher and the subscriber when using streaming = 'parallel'.
> > When this happens, since the logical replication cannot move forward,
> > the users need to disable parallel-apply mode or increase
> > logical_decoding_work_mem. We could describe this limitation in the
> > doc but it would be hard for users to detect problematic table
> > structure.
>
> Interesting case. So I think the root of the problem is the same as
> what we have when a column is marked unique on the subscriber but not
> on the publisher. In short, two transactions which are independent of
> each other on the publisher are dependent on each other on the
> subscriber side because the table definition is different on the
> subscriber. So can't we handle this case in the same way by marking
> this table unsafe for parallel-apply?
>
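
To make the quoted scenario concrete, here is a minimal Python sketch of the routing mismatch. It is only a toy model, not code from the patch: the dictionaries mirror the list-partition bounds from the example, and the names `route`, `mismatched_routes`, and `hidden_dependencies` are hypothetical helpers introduced for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

# Toy model (assumption): leaf partition name -> list values it accepts.
Partitions = Dict[str, List[int]]

# Partition layout of "p" taken from the example in the quoted mail.
PUBLISHER_P: Partitions = {"c1": [1]}
SUBSCRIBER_P: Partitions = {"c1": [2], "c2": [1]}
# On the publisher, c2 is a standalone table that is also published.
PUBLISHER_STANDALONE: Set[str] = {"c2"}


def route(partitions: Partitions, value: int) -> Optional[str]:
    """Return the leaf partition that accepts `value`, or None if none does."""
    for leaf, accepted in partitions.items():
        if value in accepted:
            return leaf
    return None


def mismatched_routes(
    pub: Partitions, sub: Partitions
) -> Dict[int, Tuple[Optional[str], Optional[str]]]:
    """Map each publisher value to (publisher leaf, subscriber leaf) when they differ."""
    mismatches: Dict[int, Tuple[Optional[str], Optional[str]]] = {}
    for accepted in pub.values():
        for value in accepted:
            src, dst = route(pub, value), route(sub, value)
            if src != dst:
                mismatches[value] = (src, dst)
    return mismatches


def hidden_dependencies(
    pub: Partitions, sub: Partitions, standalone: Set[str]
) -> Set[str]:
    """Standalone publisher tables that rows inserted via p land in on the subscriber."""
    targets = {dst for _, dst in mismatched_routes(pub, sub).values() if dst is not None}
    return targets & standalone


if __name__ == "__main__":
    print(mismatched_routes(PUBLISHER_P, SUBSCRIBER_P))
    print(hidden_dependencies(PUBLISHER_P, SUBSCRIBER_P, PUBLISHER_STANDALONE))
```

In this model the row c=1 is stored in c1 on the publisher but in c2 on the subscriber, so an insert into p and a TRUNCATE of c2, which are independent on the publisher, touch the same table on the subscriber. A real check would have to read the catalogs on both nodes, which is part of why such structures are hard for users to spot.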

Yes, we can do that. I think Hou-San has already dealt with it that way in
his latest patch [1]. See his response in the email [1]: "Disallow
replicating from or to a partitioned table in parallel streaming mode".

[1] - https://www.postgresql.org/message-id/OS0PR01MB57160760B34E1655718F4D1994249%40OS0PR01MB5716.jpnprd01.prod.outlook.com

--
With Regards,
Amit Kapila.