Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
On Tue, Nov 21, 2023 at 6:56 PM Tomas Vondra wrote:
>
> On 11/21/23 14:16, Amit Kapila wrote:
> > On Tue, Nov 21, 2023 at 5:17 PM Tomas Vondra wrote:
> >>
> >
> > It seems there is some inconsistency in what you have written for
> > client backends/tablesync worker vs. apply worker. The above text
> > seems to be saying that the client backend and table sync worker are
> > waiting on a "subscription row in pg_subscription" and the apply
> > worker is operating on "pg_subscription_rel". So, if that is true then
> > they shouldn't get stuck.
> >
> > I think here client backend and tablesync worker seems to be blocked
> > for a lock on pg_subscription_rel.
> >
>
> Not really, they are all locking the subscription. All the locks are on
> classid=6100, which is pg_subscription:
>
> test=# select 6100::regclass;
>     regclass
> -----------------
>  pg_subscription
> (1 row)
>
> The thing is, the tablesync workers call UpdateSubscriptionRelState,
> which locks the pg_subscription catalog at the very beginning:
>
>     LockSharedObject(SubscriptionRelationId, ...);
>
> So that's the issue. I haven't explored why it's done this way, and
> there's no comment explaining locking the subscriptions is needed ...
>

I think it prevents concurrent drop of rel during the REFRESH operation.

> >> The tablesync workers can't proceed because their lock request is stuck
> >> behind the AccessExclusiveLock request.
> >>
> >> And the apply worker can't proceed, because it's waiting for status
> >> update from the tablesync workers.
> >>
> >
> > This part is not clear to me because
> > wait_for_relation_state_change()->GetSubscriptionRelState() seems to
> > be releasing the lock while closing the relation. Am I missing
> > something?
> >
>
> I think you're missing the fact that GetSubscriptionRelState() acquires
> and releases the lock on pg_subscription_rel, but that's not the lock
> causing the issue. The problem is the lock on the pg_subscription row.
>

Okay. IIUC, what's going on here is that the apply worker acquires
AccessShareLock on pg_subscription to update rel state for one of the
tables, say tbl-1, and then, for another table, say tbl-2, it starts
waiting for a state change via wait_for_relation_state_change(). I
think the fix here is to commit the transaction before we go for the
wait. I guess we need something along the lines of what is proposed in
[1], though we solved the problem in that thread in some other way.

[1] - https://www.postgresql.org/message-id/1412708.1674417574%40sss.pgh.pa.us

--
With Regards,
Amit Kapila.
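The commit-before-wait idea can be illustrated outside PostgreSQL with a small, purely hypothetical Python simulation. A `threading.Lock` stands in for the lock on the pg_subscription row and an `Event` for the relation state reaching SYNCDONE; none of these names are PostgreSQL APIs, and the model collapses lock modes into a single exclusive lock:

```python
import threading
import time

# Stand-ins (NOT PostgreSQL code): the Lock models the lock on the
# pg_subscription row, the Event models the rel state reaching SYNCDONE.
catalog_lock = threading.Lock()
state_synced = threading.Event()

def tablesync_worker():
    # UpdateSubscriptionRelState analogue: must take the catalog lock
    # before it can flip the state.
    with catalog_lock:
        state_synced.set()

def apply_worker():
    catalog_lock.acquire()        # update rel state for tbl-1 under the lock
    catalog_lock.release()        # the proposed fix: "commit" drops the lock
    state_synced.wait(timeout=5)  # ...before waiting for tbl-2's state change

a = threading.Thread(target=apply_worker)
a.start()
time.sleep(0.1)
t = threading.Thread(target=tablesync_worker)
t.start()
a.join()
t.join()
print("synced:", state_synced.is_set())  # True: the wait can finish
```

Deleting the `release()` call (i.e. holding the "lock" across the wait, as the unpatched code effectively does via the open transaction) makes the tablesync thread block on the lock, so the Event is never set and the wait only ends by timeout.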
Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
On 11/22/23 11:38, Amit Kapila wrote:
> On Tue, Nov 21, 2023 at 6:56 PM Tomas Vondra wrote:
>>
>> On 11/21/23 14:16, Amit Kapila wrote:
>>> On Tue, Nov 21, 2023 at 5:17 PM Tomas Vondra wrote:
>>>
>>> It seems there is some inconsistency in what you have written for
>>> client backends/tablesync worker vs. apply worker. The above text
>>> seems to be saying that the client backend and table sync worker are
>>> waiting on a "subscription row in pg_subscription" and the apply
>>> worker is operating on "pg_subscription_rel". So, if that is true then
>>> they shouldn't get stuck.
>>>
>>> I think here client backend and tablesync worker seems to be blocked
>>> for a lock on pg_subscription_rel.
>>>
>>
>> Not really, they are all locking the subscription. All the locks are on
>> classid=6100, which is pg_subscription:
>>
>> test=# select 6100::regclass;
>>     regclass
>> -----------------
>>  pg_subscription
>> (1 row)
>>
>> The thing is, the tablesync workers call UpdateSubscriptionRelState,
>> which locks the pg_subscription catalog at the very beginning:
>>
>>     LockSharedObject(SubscriptionRelationId, ...);
>>
>> So that's the issue. I haven't explored why it's done this way, and
>> there's no comment explaining locking the subscriptions is needed ...
>>
>
> I think it prevents concurrent drop of rel during the REFRESH operation.
>

Yes. Or maybe some other concurrent DDL on the relations included in the
subscription.

>>>> The tablesync workers can't proceed because their lock request is stuck
>>>> behind the AccessExclusiveLock request.
>>>>
>>>> And the apply worker can't proceed, because it's waiting for status
>>>> update from the tablesync workers.
>>>>
>>>
>>> This part is not clear to me because
>>> wait_for_relation_state_change()->GetSubscriptionRelState() seems to
>>> be releasing the lock while closing the relation. Am I missing
>>> something?
>>>
>>
>> I think you're missing the fact that GetSubscriptionRelState() acquires
>> and releases the lock on pg_subscription_rel, but that's not the lock
>> causing the issue. The problem is the lock on the pg_subscription row.
>>
>
> Okay. IIUC, what's going on here is that the apply worker acquires
> AccessShareLock on pg_subscription to update rel state for one of the
> tables say tbl-1, and then for another table say tbl-2, it started
> waiting for a state change via wait_for_relation_state_change(). I
> think here the fix is to commit the transaction before we go for a
> wait. I guess we need something along the lines of what is proposed in
> [1] though we have solved the problem in that thread in some other
> way..
>

Possibly. I haven't checked if the commit might have some unexpected
consequences, but I can confirm I can no longer reproduce the deadlock
with the patch applied.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
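The queueing behavior described above can be sketched with a toy model of a heavyweight-lock manager (hypothetical code, not PostgreSQL internals): a new request waits if it conflicts with any granted lock or with any request already waiting, which is how the pending AccessExclusiveLock from the client backend ends up blocking later AccessShareLock requests from the tablesync workers even though only an AccessShareLock is currently held. The first PID is invented; the other two are the ones reported earlier in the thread:

```python
# Toy conflict rule for the two modes involved here: only an
# AccessExclusiveLock conflicts with anything.
def conflicts(a, b):
    return "AccessExclusiveLock" in (a, b)

class LockTag:
    """One lockable object, e.g. the pg_subscription row (classid 6100)."""
    def __init__(self):
        self.granted = []  # list of (pid, mode) holding the lock
        self.waiting = []  # FIFO queue of (pid, mode) waiting for it

    def request(self, pid, mode):
        # Wait if we conflict with a granted lock OR with an earlier waiter.
        blocked = (any(conflicts(mode, m) for _, m in self.granted) or
                   any(conflicts(mode, m) for _, m in self.waiting))
        if blocked:
            self.waiting.append((pid, mode))
            return "waits"
        self.granted.append((pid, mode))
        return "granted"

sub = LockTag()
print(sub.request(2691800, "AccessShareLock"))      # apply worker: granted
print(sub.request(2691941, "AccessExclusiveLock"))  # ALTER SUBSCRIPTION: waits
print(sub.request(2691837, "AccessShareLock"))      # tablesync worker: waits
```

The third request does not conflict with anything *granted*, yet it still queues, which is exactly why the tablesync workers get stuck behind the ALTER SUBSCRIPTION backend.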
Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
On Wed, Nov 22, 2023 at 4:51 PM Tomas Vondra wrote:
>
> On 11/22/23 11:38, Amit Kapila wrote:
> >
> > Okay. IIUC, what's going on here is that the apply worker acquires
> > AccessShareLock on pg_subscription to update rel state for one of the
> > tables say tbl-1, and then for another table say tbl-2, it started
> > waiting for a state change via wait_for_relation_state_change(). I
> > think here the fix is to commit the transaction before we go for a
> > wait. I guess we need something along the lines of what is proposed in
> > [1] though we have solved the problem in that thread in some other
> > way..
> >
>
> Possibly. I haven't checked if the commit might have some unexpected
> consequences, but I can confirm I can no longer reproduce the deadlock
> with the patch applied.
>

Thanks for the verification. Offhand, I don't see any problem with
doing a commit at that place but will try to think some more about it.
I think we may want to call pgstat_report_stat(false) after commit to
avoid a long delay in stats.

I haven't verified but I think this will be a problem in back-branches
as well.

--
With Regards,
Amit Kapila.
Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
On 11/23/23 10:24, Amit Kapila wrote:
> On Wed, Nov 22, 2023 at 4:51 PM Tomas Vondra wrote:
>>
>> On 11/22/23 11:38, Amit Kapila wrote:
>>>
>>> Okay. IIUC, what's going on here is that the apply worker acquires
>>> AccessShareLock on pg_subscription to update rel state for one of the
>>> tables say tbl-1, and then for another table say tbl-2, it started
>>> waiting for a state change via wait_for_relation_state_change(). I
>>> think here the fix is to commit the transaction before we go for a
>>> wait. I guess we need something along the lines of what is proposed in
>>> [1] though we have solved the problem in that thread in some other
>>> way..
>>>
>>
>> Possibly. I haven't checked if the commit might have some unexpected
>> consequences, but I can confirm I can no longer reproduce the deadlock
>> with the patch applied.
>>
>
> Thanks for the verification. Offhand, I don't see any problem with
> doing a commit at that place but will try to think some more about it.
> I think we may want to call pgstat_report_stat(false) after commit to
> avoid a long delay in stats.
>

Makes sense.

> I haven't verified but I think this will be a problem in
> back-branches as well.
>

Yes. I haven't tried but I don't see why backbranches wouldn't have the
same issue.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
Hi,

I tried to reproduce the issue and was able to reproduce it with the
scripts shared by Tomas. I tried testing it from PG 17 down to PG 11,
and the issue is reproducible on each version.

Next, I will test with the patch in the thread shared by Amit.

Thanks,
Shlok Kumar Kyal
Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
Hi,

> I tried to reproduce the issue and was able to reproduce it with
> scripts shared by Tomas.
> I tried testing it from PG17 to PG 11. This issue is reproducible for
> each version.
>
> Next I would try to test with the patch in the thread shared by Amit.

I have created the v1 patch to resolve the issue. I have tested the
patch on HEAD down to PG 12, and the same patch applies to all the
versions. The changes are similar to the one posted in the thread
https://www.postgresql.org/message-id/1412708.1674417574%40sss.pgh.pa.us

Thanks and Regards,
Shlok Kyal

v1-0001-Deadlock-when-apply-worker-tablesync-worker-and-c.patch
Description: Binary data
Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
On Fri, Nov 24, 2023 at 5:05 PM Shlok Kyal wrote:
>
> > I tried to reproduce the issue and was able to reproduce it with
> > scripts shared by Tomas.
> > I tried testing it from PG17 to PG 11. This issue is reproducible for
> > each version.
> >
> > Next I would try to test with the patch in the thread shared by Amit.
>
> I have created the v1 patch to resolve the issue. Have tested the
> patch on HEAD to PG12.
> The same patch applies to all the versions. The changes are similar to
> the one posted in the thread
> https://www.postgresql.org/message-id/1412708.1674417574%40sss.pgh.pa.us
>

+ (it's quite likely we hold lock on
+ * pg_replication_origin, which the sync worker will need
+ * to update).

This part of the comment is stale and doesn't hold true. You need to
update the reason based on the latest problem discovered in this
thread.

I think you can compare the timing of regression tests in subscription,
with and without the patch, to show there is no regression. And probably
some tests with a large number of tables for sync with very little data.

--
With Regards,
Amit Kapila.
Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
Hi,

> thread. I think you can compare the timing of regression tests in
> subscription, with and without the patch to show there is no
> regression. And probably some tests with a large number of tables for
> sync with very little data.

I have tested the regression test timings for subscription with and
without the patch. I also did the timing test for sync of subscription
with the publisher for 100 and 1000 tables respectively.
I have attached the test script; the results of the timing test are as
follows:

Time taken for test to run in Linux VM

Summary               | Subscription Test (sec) | 100 tables in pub and sub (sec) | 1000 tables in pub and sub (sec)
----------------------+-------------------------+---------------------------------+---------------------------------
Without patch Release | 95.564                  | 7.877                           | 58.919
With patch Release    | 96.513                  | 6.533                           | 45.807

Time taken for test to run in another Linux VM

Summary               | Subscription Test (sec) | 100 tables in pub and sub (sec) | 1000 tables in pub and sub (sec)
----------------------+-------------------------+---------------------------------+---------------------------------
Without patch Release | 109.8145                | 6.4675                          | 83.001
With patch Release    | 113.162                 | 7.947                           | 87.113

Time taken for test to run in Performance Machine Linux

Summary               | Subscription Test (sec) | 100 tables in pub and sub (sec) | 1000 tables in pub and sub (sec)
----------------------+-------------------------+---------------------------------+---------------------------------
Without patch Release | 115.871                 | 6.656                           | 81.157
With patch Release    | 115.922                 | 6.7305                          | 81.1525

Thoughts?

Thanks and Regards,
Shlok Kyal

034_tmp.pl
Description: Binary data
Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
On Sat, Dec 2, 2023 at 9:52 PM Shlok Kyal wrote:
>
> > thread. I think you can compare the timing of regression tests in subscription, with and without the patch to show there is no regression. And probably some tests with a large number of tables for sync with very little data.
>
> I have tested the regression test timings for subscription with and without patch. I also did the timing test for sync of subscription with the publisher for 100 and 1000 tables respectively.
> I have attached the test script and results of the timing test are as follows:
>
> Time taken for test to run in Linux VM
>
> Summary               | Subscription Test (sec) | 100 tables in pub and sub (sec) | 1000 tables in pub and sub (sec)
> ----------------------+-------------------------+---------------------------------+---------------------------------
> Without patch Release | 95.564                  | 7.877                           | 58.919
> With patch Release    | 96.513                  | 6.533                           | 45.807
>
> Time taken for test to run in another Linux VM
>
> Summary               | Subscription Test (sec) | 100 tables in pub and sub (sec) | 1000 tables in pub and sub (sec)
> ----------------------+-------------------------+---------------------------------+---------------------------------
> Without patch Release | 109.8145                | 6.4675                          | 83.001
> With patch Release    | 113.162                 | 7.947                           | 87.113
>

So, on some machines, it may increase the test timing although not too
much. I think the reason is probably doing the work in multiple
transactions for multiple relations. I am wondering: instead of
committing and starting a new transaction before
wait_for_relation_state_change(), what if we do it inside that function
just before we decide to wait? It is quite possible that in many cases
we don't need any wait at all.

--
With Regards,
Amit Kapila.
Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
On 12/4/23 12:37, Amit Kapila wrote:
> On Sat, Dec 2, 2023 at 9:52 PM Shlok Kyal wrote:
>>
>>> thread. I think you can compare the timing of regression tests in subscription, with and without the patch to show there is no regression. And probably some tests with a large number of tables for sync with very little data.
>>
>> I have tested the regression test timings for subscription with and without patch. I also did the timing test for sync of subscription with the publisher for 100 and 1000 tables respectively.
>> I have attached the test script and results of the timing test are as follows:
>>
>> Time taken for test to run in Linux VM
>>
>> Summary               | Subscription Test (sec) | 100 tables in pub and sub (sec) | 1000 tables in pub and sub (sec)
>> ----------------------+-------------------------+---------------------------------+---------------------------------
>> Without patch Release | 95.564                  | 7.877                           | 58.919
>> With patch Release    | 96.513                  | 6.533                           | 45.807
>>
>> Time taken for test to run in another Linux VM
>>
>> Summary               | Subscription Test (sec) | 100 tables in pub and sub (sec) | 1000 tables in pub and sub (sec)
>> ----------------------+-------------------------+---------------------------------+---------------------------------
>> Without patch Release | 109.8145                | 6.4675                          | 83.001
>> With patch Release    | 113.162                 | 7.947                           | 87.113
>>
>
> So, on some machines, it may increase the test timing although not too much. I think the reason is probably doing the work in multiple transactions for multiple relations. I am wondering that instead of committing and starting a new transaction before wait_for_relation_state_change(), what if we do it inside that function just before we decide to wait? It is quite possible that in many cases we don't need any wait at all.
>

I'm not sure what you mean by "do it". What should the function do?

As for the test results, I very much doubt the differences are not
caused simply by random timing variations, or something like that. And I
don't understand what "Performance Machine Linux" is, considering those
timings are slower than the other two machines.

Also, even if it was a bit slower, does it really matter? I mean, the
current code is wrong, can lead to infinite duration if it happens to
hit the deadlock. And it's a one-time action, so I don't think it's very
sensitive in terms of performance.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
On Mon, Dec 4, 2023 at 5:30 PM Tomas Vondra wrote:
>
> On 12/4/23 12:37, Amit Kapila wrote:
> > On Sat, Dec 2, 2023 at 9:52 PM Shlok Kyal wrote:
> >>
> >>> thread. I think you can compare the timing of regression tests in subscription, with and without the patch to show there is no regression. And probably some tests with a large number of tables for sync with very little data.
> >>
> >> I have tested the regression test timings for subscription with and without patch. I also did the timing test for sync of subscription with the publisher for 100 and 1000 tables respectively.
> >> I have attached the test script and results of the timing test are as follows:
> >>
> >> Time taken for test to run in Linux VM
> >>
> >> Summary               | Subscription Test (sec) | 100 tables in pub and sub (sec) | 1000 tables in pub and sub (sec)
> >> ----------------------+-------------------------+---------------------------------+---------------------------------
> >> Without patch Release | 95.564                  | 7.877                           | 58.919
> >> With patch Release    | 96.513                  | 6.533                           | 45.807
> >>
> >> Time taken for test to run in another Linux VM
> >>
> >> Summary               | Subscription Test (sec) | 100 tables in pub and sub (sec) | 1000 tables in pub and sub (sec)
> >> ----------------------+-------------------------+---------------------------------+---------------------------------
> >> Without patch Release | 109.8145                | 6.4675                          | 83.001
> >> With patch Release    | 113.162                 | 7.947                           | 87.113
> >>
> >
> > So, on some machines, it may increase the test timing although not too much. I think the reason is probably doing the work in multiple transactions for multiple relations. I am wondering that instead of committing and starting a new transaction before wait_for_relation_state_change(), what if we do it inside that function just before we decide to wait? It is quite possible that in many cases we don't need any wait at all.
> >
>
> I'm not sure what you mean by "do it". What should the function do?
>

I mean to commit the open transaction at the below place in
wait_for_relation_state_change():

wait_for_relation_state_change()
{
    ...
    -- commit the xact
    WaitLatch();
    ...
}

Then start a new one after the wait is over. This is just to test
whether it improves the difference in regression test timing.

> As for the test results, I very much doubt the differences are not caused simply by random timing variations, or something like that. And I don't understand what "Performance Machine Linux" is, considering those timings are slower than the other two machines.
>
> Also, even if it was a bit slower, does it really matter? I mean, the current code is wrong, can lead to infinite duration if it happens to hit the deadlock. And it's a one-time action, I don't think it's a very sensitive in terms of performance.
>

Yeah, I see that point, but I am trying to evaluate whether we can avoid
an increase in regression test timing, at least for HEAD. The tests are
done in release mode, so I suspect there could be a slightly bigger gap
in debug mode (we can check that once though), which may hit developers
running regressions quite often in their development environments. Now,
if there is no easy way to avoid the increase in regression test timing,
we still have to fix the problem, so we have to take the hit of some
increase in time.

--
With Regards,
Amit Kapila.
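The alternative described above can be sketched in Python (illustrative only; `get_state`, `commit_and_restart_xact`, and `wait_latch` are stand-ins, not PostgreSQL functions): the transaction is given up only on the iterations where the worker actually has to sleep, so the common no-wait case keeps its single transaction:

```python
def wait_for_relation_state(get_state, expected, commit_and_restart_xact,
                            wait_latch, max_iters=100):
    """Return True once get_state() reports the expected state."""
    for _ in range(max_iters):
        if get_state() == expected:
            return True                 # common case: no commit, no wait
        commit_and_restart_xact()       # release locks only when sleeping
        wait_latch()
    return False

# Tiny driver: the state reaches SYNCDONE after two checks.
states = iter(["INIT", "CATCHUP", "SYNCDONE"])
current = {"s": next(states)}

def advance():                          # plays the role of WaitLatch()
    current["s"] = next(states, current["s"])

commits = []
ok = wait_for_relation_state(lambda: current["s"], "SYNCDONE",
                             lambda: commits.append(1), advance)
print(ok, len(commits))  # True 2: committed only for the two actual waits
```

If the state is already SYNCDONE on entry, the function returns on the first check without any commit at all, which is the case this variant tries to optimize.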
Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
Hi,

> As for the test results, I very much doubt the differences are not
> caused simply by random timing variations, or something like that. And I
> don't understand what "Performance Machine Linux" is, considering those
> timings are slower than the other two machines.

The machine has Total Memory of 755.536 GB, 120 CPUs and RHEL 7
Operating System.
Also find the detailed info of the performance machine attached.

Thanks and Regards,
Shlok Kyal

MemTotal: 755.536 GB
MemFree: 748.281 GB
MemAvailable: 748.581 GB
Buffers: 0.002 GB
Cached: 1.366 GB
SwapCached: 0.000 GB
Active: 0.834 GB
Inactive: 0.745 GB
Active(anon): 0.221 GB
Inactive(anon): 0.028 GB
Active(file): 0.614 GB
Inactive(file): 0.717 GB
Unevictable: 0.000 GB
Mlocked: 0.000 GB
SwapTotal: 4.000 GB
SwapFree: 4.000 GB
Dirty: 0.000 GB
Writeback: 0.000 GB
AnonPages: 0.212 GB
Mapped: 0.142 GB
Shmem: 0.038 GB
Slab: 0.468 GB
SReclaimable: 0.139 GB
SUnreclaim: 0.329 GB
KernelStack: 0.020 GB
PageTables: 0.023 GB
NFS_Unstable: 0.000 GB
Bounce: 0.000 GB
WritebackTmp: 0.000 GB
CommitLimit: 381.768 GB
Committed_AS: 0.681 GB
VmallocTotal: 32768.000 GB
VmallocUsed: 1.852 GB
VmallocChunk: 32189.748 GB
Percpu: 0.035 GB
HardwareCorrupted: 0.000 GB
AnonHugePages: 0.025 GB
CmaTotal: 0.000 GB
CmaFree: 0.000 GB
HugePages_Total: 0.000 GB
HugePages_Free: 0.000 GB
HugePages_Rsvd: 0.000 GB
HugePages_Surp: 0.000 GB
Hugepagesize: 0.002 GB
DirectMap4k: 0.314 GB
DirectMap2M: 6.523 GB
DirectMap1G: 763.000 GB

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 62
model name      : Intel(R) Xeon(R) CPU E7-4890 v2 @ 2.80GHz
stepping        : 7
microcode       : 0x715
cpu MHz         : 1762.646
cache size      : 38400 KB
physical id     : 0
siblings        : 30
core id         : 0
cpu cores       : 15
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm epb intel_ppin ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts md_clear spec_ctrl intel_stibp flush_l1d
bogomips        : 5586.07
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

NAME="Red Hat Enterprise Linux Server"
VERSION="7.8 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="7.8"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.8 (Maipo)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.8:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.8
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.8"
Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
On 12/5/23 08:14, Shlok Kyal wrote:
> Hi,
>
>> As for the test results, I very much doubt the differences are not caused simply by random timing variations, or something like that. And I don't understand what "Performance Machine Linux" is, considering those timings are slower than the other two machines.
>
> The machine has Total Memory of 755.536 GB, 120 CPUs and RHEL 7 Operating System
> Also find the detailed info of the performance machine attached.
>

Thanks for the info. I don't think the tests really benefit from this
much resources; I would be rather surprised if it was faster beyond 8
cores or so. The CPU frequency likely matters much more. Which probably
explains why this machine was the slowest.

Also, I wonder how much the results vary between the runs. I suppose you
only did a single run for each, right?

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
On Tue, 5 Dec 2023 at 17:18, Tomas Vondra wrote:
>
> On 12/5/23 08:14, Shlok Kyal wrote:
> > Hi,
> >
> >> As for the test results, I very much doubt the differences are not caused simply by random timing variations, or something like that. And I don't understand what "Performance Machine Linux" is, considering those timings are slower than the other two machines.
> >
> > The machine has Total Memory of 755.536 GB, 120 CPUs and RHEL 7 Operating System
> > Also find the detailed info of the performance machine attached.
> >
>
> Thanks for the info. I don't think the tests really benefit from this much resources, I would be rather surprised if it was faster beyond 8 cores or so. The CPU frequency likely matters much more. Which probably explains why this machine was the slowest.
>
> Also, I wonder how much the results vary between the runs. I suppose you only did a single run for each, right?

I did 10 runs for each of the cases and reported the median result in
the previous email. I have documented the results of the runs and have
attached the same here.

Thanks and Regards,
Shlok Kyal

Performance_Test.xlsx
Description: MS-Excel 2007 spreadsheet
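For reference, taking the median over repeated benchmark runs (as done above) damps exactly the random timing variation Tomas worries about; a minimal sketch, with made-up numbers rather than the measurements from this thread:

```python
from statistics import median

# Ten hypothetical wall-clock timings (seconds) for one test case.
runs = [95.2, 96.1, 95.8, 97.3, 95.9, 96.4, 95.5, 96.0, 95.7, 96.2]

# With an even run count, the median is the mean of the two middle values,
# so the 97.3 outlier barely moves the reported number.
print(median(runs))
```

Unlike the mean, a single slow outlier run (e.g. 97.3 above) has almost no effect on the reported value.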
Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
Hi,

> I mean to commit the open transaction at the below place in wait_for_relation_state_change()
>
> wait_for_relation_state_change()
> {
> ...
> -- commit the xact
> WaitLatch();
> ...
> }
>
> Then start after the wait is over. This is just to test whether it improves the difference in regression test timing.

I tried the above approach and observed that the performance of this
approach is nearly the same as the previous approach.

For Linux VM:

Summary            | Subscription Test (sec) | 100 tables in pub and sub (sec) | 1000 tables in pub and sub (sec)
-------------------+-------------------------+---------------------------------+---------------------------------
old patch          | 107.4545                | 6.911                           | 77.918
alternate approach | 108.3985                | 6.9835                          | 78.111

For Performance Machine:

Summary            | Subscription Test (sec) | 100 tables in pub and sub (sec) | 1000 tables in pub and sub (sec)
-------------------+-------------------------+---------------------------------+---------------------------------
old patch          | 115.922                 | 6.7305                          | 81.1525
alternate approach | 115.8215                | 6.7685                          | 81.2335

I have attached the patch for this approach as 'alternate_approach.patch'.

Since the performance is the same, I think that the previous approach is
better, as in this approach we are using CommitTransactionCommand() and
StartTransactionCommand() inside a 'for' loop.

I also fixed the comment in the previous approach and attached it here
as 'v2-0001-Deadlock-when-apply-worker-tablesync-worker-and-c.patch'.

Thanks and Regards,
Shlok Kyal

From 11072d138d900227b963b54d1a3626cf256db721 Mon Sep 17 00:00:00 2001
From: Shlok Kyal
Date: Fri, 24 Nov 2023 16:37:14 +0530
Subject: [PATCH v2] Deadlock when apply worker, tablesync worker and client
 backend run parallelly.

Apply worker holds a lock on the table pg_subscription_rel and waits
for notification from the tablesync workers where the relation is
synced, which happens through updating the pg_subscription_rel row.
And that wait happens in wait_for_relation_state_change, which simply
checks the row in a loop, with a sleep by WaitLatch().

Unfortunately, the tablesync workers can't update the row because the
client backend executing ALTER SUBSCRIPTION ... REFRESH PUBLICATION
sneaked in, and waits for an AccessExclusiveLock. So the tablesync
workers are stuck in the queue and can't proceed.

The client backend is waiting for a lock held by the apply worker. The
tablesync workers can't proceed because their lock request is stuck
behind the AccessExclusiveLock request. And the apply worker is waiting
for status update from the tablesync workers. And the deadlock is
undetected, because the apply worker is not waiting on a lock, but
sleeping on a latch.

We have resolved the issue by releasing the lock by committing the
transaction before the apply worker goes to wait state.
---
 src/backend/replication/logical/tablesync.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index df3c42eb5d..b71234d998 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -542,15 +542,26 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
 				LWLockRelease(LogicalRepWorkerLock);
 
 				/*
-				 * Enter busy loop and wait for synchronization worker to
-				 * reach expected state (or die trying).
+				 * If we have a transaction, we must commit it to release
+				 * any locks we have. Then start a new transaction so we
+				 * can examine catalog state.
 				 */
-				if (!started_tx)
+				if (started_tx)
+				{
+					CommitTransactionCommand();
+					pgstat_report_stat(false);
+					StartTransactionCommand();
+				}
+				else
 				{
 					StartTransactionCommand();
 					started_tx = true;
 				}
+
+				/*
+				 * Enter busy loop and wait for synchronization worker to
+				 * reach expected state (or die trying).
+				 */
 				wait_for_relation_state_change(rstate->relid,
 											   SUBREL_STATE_SYNCDONE);
-- 
2.34.1

From ce261399d22a7fd9ee7ccdd3b9d1cc7f4f72ce4b Mon Sep 17 00:00:00 2001
From: Shlok Kyal
Date: Thu, 7 Dec 2023 10:55:26 +0530
Subject: [PATCH] Deadlock when apply worker, tablesync worker and client
 backend run parallelly.

Apply worker holds a lock on the table pg_subscription_rel and waits
for notification from the tablesync workers where the relation is
synced, which happens through updating the pg_subscription_rel row.
And that wait happens in wait_for_relation_state_change, which simply
checks the row in a loop, with a sleep by WaitLatch().

Unfortunately, the tablesync workers can't update the row because the
client backend executing ALTER SUBSCRIPTION ... REFRESH PUBLICATION
sneaked in, and waits for an AccessExclusiveLock. So the tablesync
workers are stuck in the queue and can't proceed. The
Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
On Thu, Dec 7, 2023 at 11:21 AM Shlok Kyal wrote:
>
> > I mean to commit the open transaction at the below place in wait_for_relation_state_change()
> >
> > wait_for_relation_state_change()
> > {
> > ...
> > -- commit the xact
> > WaitLatch();
> > ...
> > }
> >
> > Then start after the wait is over. This is just to test whether it improves the difference in regression test timing.
>
> I tried the above approach and observed that the performance of this approach is nearly same as the previous approach.
>

Then let's go with the original patch only. BTW, it took almost the same
time (105 wallclock secs) in my environment (CentOS VM) to run tests in
src/test/subscription both with and without the patch. I took a median
of five runs. I have slightly adjusted the comments and commit message
in the attached. If you are fine with this, we can commit and backpatch
this.

--
With Regards,
Amit Kapila.

v3-0001-Fix-an-undetected-deadlock-due-to-apply-worker.patch
Description: Binary data
Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
Hi,

> Then let's go with the original patch only. BTW, it took almost the same time (105 wallclock secs) in my environment (CentOs VM) to run tests in src/test/subscription both with and without the patch. I took a median of five runs. I have slightly adjusted the comments and commit message in the attached. If you are fine with this, we can commit and backpatch this.

I have tested the patch for all the branches from PG 17 to PG 12. The
same patch applies cleanly on all branches. Also, the same patch
resolves the issue on all the branches. I ran all the tests and all the
tests passed on each branch.

I also reviewed the patch and it looks good to me.

Thanks and Regards,
Shlok Kyal
Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
On Fri, Dec 8, 2023 at 7:16 PM Shlok Kyal wrote:
>
> > Then let's go with the original patch only. BTW, it took almost the same time (105 wallclock secs) in my environment (CentOs VM) to run tests in src/test/subscription both with and without the patch. I took a median of five runs. I have slightly adjusted the comments and commit message in the attached. If you are fine with this, we can commit and backpatch this.
>
> I have tested the patch for all the branches from PG 17 to PG 12.
> The same patch applies cleanly on all branches. Also, the same patch
> resolves the issue on all the branches.
> I ran all the tests and all the tests passed on each branch.
>
> I also reviewed the patch and it looks good to me.
>

Thanks, I could also reproduce the issue on back branches (tried till
12), and the fix works. I'll push this on Monday.

--
With Regards,
Amit Kapila.
Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
On Sat, Dec 9, 2023 at 12:16 PM Amit Kapila wrote:
>
> Thanks, I could also reproduce the issue on back branches (tried till
> 12), and the fix works. I'll push this on Monday.
>

Peter sent one minor suggestion (to write the check differently for
easier understanding) offlist, which I addressed, and pushed the patch.

--
With Regards,
Amit Kapila.
Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
On 12/11/23 07:12, Amit Kapila wrote:
> On Sat, Dec 9, 2023 at 12:16 PM Amit Kapila wrote:
>>
>> Thanks, I could also reproduce the issue on back branches (tried till
>> 12), and the fix works. I'll push this on Monday.
>>
>
> Peter sent one minor suggestion (to write the check differently for
> easier understanding) offlist which I addressed and pushed the patch.
>

Thanks for taking care of fixing this!

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
On Tue, Nov 21, 2023 at 5:17 PM Tomas Vondra wrote:
>
> I decided to do some stress-testing of the built-in logical replication,
> as part of the sequence decoding work. And I soon ran into an undetected
> deadlock related to ALTER SUBSCRIPTION ... REFRESH PUBLICATION :-(
>
> The attached bash script triggers that in a couple seconds for me. The
> script looks complicated, but most of the code is waiting for sync to
> complete, catchup, and that sort of thing.
>
> What the script does is pretty simple:
>
> 1) initialize two clusters, set them up as a publisher/subscriber pair
>
> 2) create some number of tables, add them to the publication and wait
>    for the sync to complete
>
> 3) start two pgbench runs in the background, modifying the publication
>    (one removes+adds all tables in a single transaction, one does that
>    with a transaction per table)
>
> 4) run refresh.sh which does ALTER SUBSCRIPTION ... REFRESH PUBLICATION
>    in a loop (now that I think about it, it could be another pgbench
>    script, but well ...)
>
> 5) some consistency checks, but the lockup happens earlier so this does
>    not really matter
>
> After a small number of refresh cycles (for me it's usually a couple
> dozen), we end up with a couple of stuck locks (I shortened the backend
> type string a bit, for formatting reasons):
>
>   test=# select a.pid, classid, objid, backend_type, query
>            from pg_locks l join pg_stat_activity a on (a.pid = l.pid)
>           where not granted;
>
>     pid    | classid | objid |   backend_type   |         query
>   ---------+---------+-------+------------------+-----------------------
>    2691941 |    6100 | 16785 | client backend   | ALTER SUBSCRIPTION s
>                                                    REFRESH PUBLICATION
>    2691837 |    6100 | 16785 | tablesync worker |
>    2691936 |    6100 | 16785 | tablesync worker |
>   (3 rows)
>
> All these backends wait for 6100/16785, which is the subscription row in
> pg_subscription. The tablesync workers are requesting AccessShareLock,
> the client backend however asks for AccessExclusiveLock.
>
> The entry is currently locked by:
>
>   test=# select a.pid, mode, backend_type from pg_locks l
>            join pg_stat_activity a on (a.pid = l.pid)
>           where classid=6100 and objid=16785 and granted;
>
>     pid    |      mode       |           backend_type
>   ---------+-----------------+----------------------------------
>    2690477 | AccessShareLock | logical replication apply worker
>   (1 row)
>
> But the apply worker is not waiting for any locks, so what's going on?
>
> Well, the problem is that the apply worker is waiting for notification
> from the tablesync workers that the relation is synced, which happens
> through updating the pg_subscription_rel row. And that wait happens in
> wait_for_relation_state_change, which simply checks the row in a loop,
> with a sleep by WaitLatch().
>
> Unfortunately, the tablesync workers can't update the row because the
> client backend executing ALTER SUBSCRIPTION ... REFRESH PUBLICATION
> sneaked in, and waits for an AccessExclusiveLock. So the tablesync
> workers are stuck in the queue and can't proceed.
>
> The client backend can't proceed, because it's waiting for a lock held
> by the apply worker.
>

It seems there is some inconsistency in what you have written for the
client backend/tablesync workers vs. the apply worker. The above text
seems to be saying that the client backend and tablesync workers are
waiting on a "subscription row in pg_subscription" and the apply
worker is operating on "pg_subscription_rel". So, if that is true,
then they shouldn't get stuck.

I think the client backend and tablesync workers here seem to be
blocked on a lock on pg_subscription_rel.

> The tablesync workers can't proceed because their lock request is stuck
> behind the AccessExclusiveLock request.
>
> And the apply worker can't proceed, because it's waiting for status
> update from the tablesync workers.
>

This part is not clear to me because
wait_for_relation_state_change()->GetSubscriptionRelState() seems to
be releasing the lock while closing the relation. Am I missing
something?

--
With Regards,
Amit Kapila.
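The queuing behavior described above can be illustrated with a toy model. This is not PostgreSQL's lock manager code, just a simplified sketch (with a two-mode conflict table and the pids from the thread) of the rule that matters here: a new request must wait if it conflicts with the granted locks *or* with any request already queued ahead of it. That is why the tablesync workers' AccessShareLock requests queue up behind the client backend's pending AccessExclusiveLock, even though the lock is currently held only in AccessShareLock mode.

```python
def conflicts(mode_a, mode_b):
    # Simplified conflict table: AccessExclusiveLock conflicts with
    # everything; AccessShareLock conflicts only with AccessExclusiveLock.
    return "AccessExclusiveLock" in (mode_a, mode_b)

class LockObject:
    """One lockable object, e.g. the pg_subscription row (6100/16785)."""

    def __init__(self):
        self.granted = []  # (pid, mode) currently holding the lock
        self.queue = []    # (pid, mode) waiting, in arrival order

    def request(self, pid, mode):
        # A request waits if it conflicts with a granted lock OR with
        # any earlier waiter; otherwise it is granted immediately.
        blockers = [m for _, m in self.granted + self.queue
                    if conflicts(mode, m)]
        if blockers:
            self.queue.append((pid, mode))
            return False   # must wait
        self.granted.append((pid, mode))
        return True        # granted

sub = LockObject()

# 1) The apply worker already holds AccessShareLock.
assert sub.request(2690477, "AccessShareLock")

# 2) ALTER SUBSCRIPTION ... REFRESH PUBLICATION asks for
#    AccessExclusiveLock: conflicts with the granted ASL, so it waits.
assert not sub.request(2691941, "AccessExclusiveLock")

# 3) The tablesync workers ask for AccessShareLock: no conflict with the
#    granted ASL, but they DO conflict with the queued AEL, so they get
#    stuck in the queue behind it.
assert not sub.request(2691837, "AccessShareLock")
assert not sub.request(2691936, "AccessShareLock")

print([pid for pid, _ in sub.queue])
```

Note that real PostgreSQL locking has more modes and special cases (e.g. a backend that already holds the lock is granted without queuing); the sketch only shows the fair-queuing rule that produces the three stuck waiters seen in pg_locks.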
Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
On 11/21/23 14:16, Amit Kapila wrote:
> On Tue, Nov 21, 2023 at 5:17 PM Tomas Vondra wrote:
>>
>> I decided to do some stress-testing of the built-in logical replication,
>> as part of the sequence decoding work. And I soon ran into an undetected
>> deadlock related to ALTER SUBSCRIPTION ... REFRESH PUBLICATION :-(
>>
>> The attached bash script triggers that in a couple seconds for me. The
>> script looks complicated, but most of the code is waiting for sync to
>> complete, catchup, and that sort of thing.
>>
>> What the script does is pretty simple:
>>
>> 1) initialize two clusters, set them up as a publisher/subscriber pair
>>
>> 2) create some number of tables, add them to the publication and wait
>>    for the sync to complete
>>
>> 3) start two pgbench runs in the background, modifying the publication
>>    (one removes+adds all tables in a single transaction, one does that
>>    with a transaction per table)
>>
>> 4) run refresh.sh which does ALTER SUBSCRIPTION ... REFRESH PUBLICATION
>>    in a loop (now that I think about it, it could be another pgbench
>>    script, but well ...)
>>
>> 5) some consistency checks, but the lockup happens earlier so this does
>>    not really matter
>>
>> After a small number of refresh cycles (for me it's usually a couple
>> dozen), we end up with a couple of stuck locks (I shortened the backend
>> type string a bit, for formatting reasons):
>>
>>   test=# select a.pid, classid, objid, backend_type, query
>>            from pg_locks l join pg_stat_activity a on (a.pid = l.pid)
>>           where not granted;
>>
>>     pid    | classid | objid |   backend_type   |         query
>>   ---------+---------+-------+------------------+-----------------------
>>    2691941 |    6100 | 16785 | client backend   | ALTER SUBSCRIPTION s
>>                                                    REFRESH PUBLICATION
>>    2691837 |    6100 | 16785 | tablesync worker |
>>    2691936 |    6100 | 16785 | tablesync worker |
>>   (3 rows)
>>
>> All these backends wait for 6100/16785, which is the subscription row in
>> pg_subscription. The tablesync workers are requesting AccessShareLock,
>> the client backend however asks for AccessExclusiveLock.
>>
>> The entry is currently locked by:
>>
>>   test=# select a.pid, mode, backend_type from pg_locks l
>>            join pg_stat_activity a on (a.pid = l.pid)
>>           where classid=6100 and objid=16785 and granted;
>>
>>     pid    |      mode       |           backend_type
>>   ---------+-----------------+----------------------------------
>>    2690477 | AccessShareLock | logical replication apply worker
>>   (1 row)
>>
>> But the apply worker is not waiting for any locks, so what's going on?
>>
>> Well, the problem is that the apply worker is waiting for notification
>> from the tablesync workers that the relation is synced, which happens
>> through updating the pg_subscription_rel row. And that wait happens in
>> wait_for_relation_state_change, which simply checks the row in a loop,
>> with a sleep by WaitLatch().
>>
>> Unfortunately, the tablesync workers can't update the row because the
>> client backend executing ALTER SUBSCRIPTION ... REFRESH PUBLICATION
>> sneaked in, and waits for an AccessExclusiveLock. So the tablesync
>> workers are stuck in the queue and can't proceed.
>>
>> The client backend can't proceed, because it's waiting for a lock held
>> by the apply worker.
>>
>
> It seems there is some inconsistency in what you have written for
> client backends/tablesync worker vs. apply worker. The above text
> seems to be saying that the client backend and table sync worker are
> waiting on a "subscription row in pg_subscription" and the apply
> worker is operating on "pg_subscription_rel". So, if that is true then
> they shouldn't get stuck.
>
> I think here client backend and tablesync worker seems to be blocked
> for a lock on pg_subscription_rel.
>

Not really, they are all locking the subscription. All the locks are on
classid=6100, which is pg_subscription:

  test=# select 6100::regclass;
      regclass
  -----------------
   pg_subscription
  (1 row)

The thing is, the tablesync workers call UpdateSubscriptionRelState,
which locks the pg_subscription catalog at the very beginning:

  LockSharedObject(SubscriptionRelationId, ...);

So that's the issue. I haven't explored why it's done this way, and
there's no comment explaining why locking the subscription is needed ...

>> The tablesync workers can't proceed because their lock request is stuck
>> behind the AccessExclusiveLock request.
>>
>> And the apply worker can't proceed, because it's waiting for status
>> update from the tablesync workers.
>>
>
> This part is not clear to me because
> wait_for_relation_state_change()->GetSubscriptionRelState() seems to
> be releasing the lock while closing the relation. Am I missing
> something?
>

I think you're missing the fact that GetSubscriptionRelState() acquires
and releases the lock on pg_subscription_rel, but that's not the lock
causing the issue. The problem is the lock on the pg_subscription row.
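The reason this deadlock goes *undetected* follows from the exchange above: PostgreSQL's deadlock detector only walks wait-for edges known to the lock manager, and the apply worker's wait is a WaitLatch() poll on the pg_subscription_rel row, which never appears in pg_locks. A toy sketch (not PostgreSQL's deadlock-checker code; the edge sets are a hand-built summary of this thread) shows how the cycle is invisible when one edge is missing:

```python
# Wait-for edges the lock manager can see (waiter -> blocker).
lock_waits = {
    "client backend":   "apply worker",    # AEL request behind apply's ASL
    "tablesync worker": "client backend",  # ASL queued behind pending AEL
}

# The edge the lock manager canNOT see: the apply worker polls the
# pg_subscription_rel row via WaitLatch(), it holds no ungranted lock.
latch_waits = {
    "apply worker": "tablesync worker",
}

def find_cycle(start, edges):
    """Follow wait-for edges from `start`; return the cycle if one exists."""
    seen, node = [], start
    while node in edges:
        if node in seen:
            return seen[seen.index(node):]
        seen.append(node)
        node = edges[node]
    return None  # chain ends at a node that is not waiting

# With only the lock-manager edges, the chain dead-ends at the apply
# worker (which "isn't waiting"), so no deadlock is ever reported.
assert find_cycle("tablesync worker", lock_waits) is None

# Adding the latch wait closes the loop: a real, but invisible, deadlock.
assert find_cycle("tablesync worker", {**lock_waits, **latch_waits}) is not None
```

This is why the fix has to break the cycle structurally (changing what gets locked, as the committed patch does) rather than relying on deadlock detection to resolve it.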