Hi Jon, I understood your concern.
I have checked the possibility of merging patch #1, #4 and #5 as one. However, just merging the three patch is insufficient, and at least #2 seems necessary too, otherwise, another deadlock still exists due to two premature 'return's in subcsrb_report_overlap(). Even if we merged them as one, it will lose my initial purpose of dividing the series as so small patches. Although each patch is made a small change, it's often related to a policy adjustment of locking or holding refcount. Moreover, as our locking policy associated with topserver becomes complex, I want to use the comments in each patch header to record what policy has been adjusted. In the future, the information can guide whether our changes comply with the adjusted policy or not. In fact, all changes contained in the series is not big. But if we merged them as one, all useful messages will be lost forever. Additionally, "net-next" tree reaches 4.10-rc8, and "net" tree is 4.10-rc7 now. I saw today there was one developer who submitted a patch to net-next and David also accepted it. However, if John's testing proved the series is okay tomorrow, probably I can send the series to net-next tomorrow. Even for the worst case, we cannot submit the series until net-next is open again. But I have checked nobody would maintain 4.10 as a stable version. So even if there is a big long time gap, it seems not to cause a series issue. Regards, Ying -----Original Message----- From: Jon Maloy [mailto:jon.ma...@ericsson.com] Sent: Tuesday, February 21, 2017 7:12 PM To: Xue, Ying; Parthasarathy Bhuvaragan; thompa....@gmail.com Cc: tipc-discussion@lists.sourceforge.net Subject: RE: [net 0/5] solve two deadlock issues Hi Ying, These are good design changes, that definitely should go in asap. However, I feel deeply uncomfortable with such a big change going into 'net', especially since our previous, exceptionally large, contribution now has turned out to have problems. I wonder if we could not get away with something simpler for 'net'. Looking closer at your series, it seems to me that only patches ## 1, 4, and the lock removal part of #5 are really needed to solve the problem we have at hand now. Why not merge those into one patch and deliver this to 'net', while reference count redesign parts can go into net-next ? Regards ///jon > -----Original Message----- > From: Ying Xue [mailto:ying....@windriver.com] > Sent: Monday, February 20, 2017 06:39 AM > To: Jon Maloy <jon.ma...@ericsson.com>; Parthasarathy Bhuvaragan > <parthasarathy.bhuvara...@ericsson.com>; thompa....@gmail.com > Cc: tipc-discussion@lists.sourceforge.net > Subject: [net 0/5] solve two deadlock issues > > Commit d094c4d5f5 ("tipc: add subscription refcount to avoid invalid > delete") accidently introduce the following deadlock scenarios: > > CPU1: CPU2: > ---------- ---------------- > tipc_nametbl_publish > spin_lock_bh(&tn->nametbl_lock) > tipc_nametbl_insert_publ > tipc_nameseq_insert_publ > tipc_subscrp_report_overlap > tipc_subscrp_get > tipc_subscrp_send_event > tipc_close_conn > tipc_subscrb_release_cb > tipc_subscrb_delete > tipc_subscrp_put > tipc_subscrp_put > tipc_subscrp_kref_release > tipc_nametbl_unsubscribe > spin_lock_bh(&tn->nametbl_lock) > <<grab nametbl_lock again>> > > CPU1: CPU2: > ---------- ---------------- > tipc_nametbl_stop > spin_lock_bh(&tn->nametbl_lock) > tipc_purge_publications > tipc_nameseq_remove_publ > tipc_subscrp_report_overlap > tipc_subscrp_get > tipc_subscrp_send_event > tipc_close_conn > tipc_subscrb_release_cb > tipc_subscrb_delete > tipc_subscrp_put > tipc_subscrp_put > tipc_subscrp_kref_release > tipc_nametbl_unsubscribe > spin_lock_bh(&tn->nametbl_lock) > <<grab nametbl_lock again>> > > The root cause of two deadlocks is that we have to hold nametbl lock > when subscription is freed in tipc_subscrp_kref_release(). In order to > eliminate the need of taking nametbl lock in > tipc_subscrp_kref_release(), the functions protected by nametbl lock > in tipc_subscrp_kref_release() are moved to other places step by step in the > series. > > Ying Xue (5): > tipc: advance the time of deleting subscription from > subscriber->subscrp_list > tipc: adjust the policy of holding subscription kref > tipc: adjust policy that sub->timer holds subscription kref > tipc: advance the time of calling tipc_nametbl_unsubscribe > tipc: remove unnecessary increasement of subscription refcount > > net/tipc/name_table.c | 2 ++ > net/tipc/subscr.c | 32 ++++++++++++++------------------ > net/tipc/subscr.h | 3 +++ > 3 files changed, 19 insertions(+), 18 deletions(-) > > -- > 2.7.4 ------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot _______________________________________________ tipc-discussion mailing list tipc-discussion@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/tipc-discussion