On Tue, Jun 26, 2018 at 6:25 AM, Amit Khandekar <amitdkhan...@gmail.com> wrote: > Added this into the July 2018 commitfest : > > https://commitfest.postgresql.org/18/1696/
It seems to me that it would probably be better to separate this into two patches, because I think there are really two separate issues. With regard to the lack of proper subtransaction handling, I think it would be best if we could avoid introducing a new AtSubStart function. I wrote a patch for this issue that works that uses a slightly different kind of stack than yours, which I have attached to this email, and it creates stack levels lazily so that it doesn't need an AtSubStart function. It would probably also be possible to adapt your patch to create new stack levels on demand instead of at subtransaction start. I'm not sure which approach is better, but I do think it would be best not to use your patch as you have it now, because that does unnecessary work at the beginning and end of every subtransaction if there is an ALTER SUBSCRIPTION command pending at an outer level, even though the subtransaction may never touch the subscription state. As for the other part of your fix, which I think basically boils down to comparing the final states instead of just looking at what got changed, the logic looks complicated and I don't think I fully understand it, but here are a few comments. + subrelids = GetSubscriptionRelids(sub->oid, + committed_subrels ? + CurrentMemoryContext : TopTransactionContext); This looks ugly and dangerous. In the first place, if GetSubscriptionRelids() needs to work in one of several memory contexts, the best thing would probably be for the caller to be responsible for saving and restoring the memory context as needed, rather than passing it as an argument. Secondly, it's not very clear why we need to do this. The comment says we have to do it, but it doesn't give a reason. + * Merge the current list into the immediate parent. + * So say, parent has sub1(tab1, tab2), sub2(tab2, tab3), + * and current on_commit_workers has sub2(tab4) and sub3(tab1), + * then the merged list will have : + * sub1(tab1, tab2), sub2(tab4), sub3(tab1) I don't think this is very clear. Also, why is that the correct answer? Why not sub2(tab2, tab3, tab4)? + foreach(lc, on_commit_stop_workers) + { + SubscriptionRels *subrels = lfirst(lc); + ListCell *lc1; + + /* Search this subrel in the subrels of the top of stack. */ + foreach(lc1, subtrans_stop_workers->stop_workers) This might be very expensive if both lists are long. I guess that's probably not very likely. You would need to have modify a lot of subscriptions and then, within a subtransaction, modify a lot of subscriptions again. + foreach(lc, committed_subrels_list) + { + SubscriptionRels *subrels = (SubscriptionRels *) lfirst(lc); + + if (sub->oid == subrels->subid) + { + committed_subrels = subrels; + break; + } + } This looks to be O(n^2) in the number of subscriptions modified in a single transaction. Like Peter, I'm not entirely sure how much we should worry about this problem. However, if we're going to worry about it, I wonder if we should worry harder and try to come up with a better data structure than a bunch of linked lists. Maybe a hash table indexed by subid? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
apply-launcher-subtrans-rmh.patch
Description: Binary data