On Thu, Feb 2, 2023 at 12:05 PM houzj.f...@fujitsu.com <houzj.f...@fujitsu.com> wrote: > > On Tuesday, January 31, 2023 1:07 AM vignesh C <vignes...@gmail.com> wrote: > > On Mon, 30 Jan 2023 at 17:30, vignesh C <vignes...@gmail.com> wrote: > > > > I also tried to test the time of "src/test/subscription/t/002_types.pl" > before and after the patch(change the lock level) and Tom's patch(split > transaction) like what Vignesh has shared on -hackers. > > I run about 100 times for each case. Tom's and the lock level patch > behave similarly on my machines[1]. > > HEAD: 3426 ~ 6425 ms > HEAD + Tom: 3404 ~ 3462 ms > HEAD + Vignesh: 3419 ~ 3474 ms > HEAD + Tom + Vignesh: 3408 ~ 3454 ms > > Even apart from the testing time reduction, reducing the lock level and lock > the specific object can also help improve the lock contention which user(that > use the exposed function) , table sync worker and apply worker can also > benefit > from it. So, I think pushing the patch to change the lock level makes sense. > > And the patch looks good to me. >
Thanks for the tests. I also see a reduction in test time variability with Vignesh's patch. I think we can release the locks in case the origin is concurrently dropped as in the attached patch. I am planning to commit this patch tomorrow unless there are more comments or objections. > While on it, after pushing the patch, I think there is another case might also > worth to be improved, that is the table sync and apply worker try to drop the > same origin which might cause some delay. This is another case(different from > the deadlock), so I feel we can try to improve this in another patch. > Right, I think that case could be addressed by Tom's patch to some extent but I am thinking we should also try to analyze if we can completely avoid the need to remove origins from both processes. One idea could be to introduce another relstate something like PRE_SYNCDONE and set it in a separate transaction before we set the state as SYNCDONE and remove the slot and origin in tablesync worker. Now, if the tablesync worker errors out due to some reason during the second transaction, it can remove the slot and origin after restart by checking the state. However, it would add another relstate which may not be the best way to address this problem. Anyway, that can be accomplished as a separate patch. -- With Regards, Amit Kapila.
v4-0001-Optimize-the-origin-drop-functionality.patch
Description: Binary data