> -----Original Message----- > From: Ying Xue [mailto:[email protected]] > Sent: Tuesday, December 05, 2017 06:53 > To: Jon Maloy <[email protected]>; [email protected] > Cc: [email protected] > Subject: Re: [net-next PATCH 1/2] tipc: prevent swork item from queuing > workqueue after connection is closed > > On 12/05/2017 01:45 AM, Jon Maloy wrote: > >> diff --git a/net/tipc/server.c b/net/tipc/server.c index > >> acaef80..571e137 > >> 100644 > >> --- a/net/tipc/server.c > >> +++ b/net/tipc/server.c > >> @@ -449,16 +449,12 @@ int tipc_conn_sendmsg(struct tipc_server *s, > >> int conid, { > >> struct outqueue_entry *e; > >> struct tipc_conn *con; > >> + struct sock *sk; > >> > >> con = tipc_conn_lookup(s, conid); > >> if (!con) > >> return -EINVAL; > >> > >> - if (!test_bit(CF_CONNECTED, &con->flags)) { > >> - conn_put(con); > >> - return 0; > >> - } > >> - > >> e = tipc_alloc_entry(data, len); > >> if (!e) { > >> conn_put(con); > >> @@ -472,8 +468,17 @@ int tipc_conn_sendmsg(struct tipc_server *s, int > >> conid, > > > > You cannot be sure that con is valid here. You need to add the > !kref_get_unless_zero(&con->kref)) in the lookup function, as I had to do. > > This was caused by observed crashes. > > > > Thanks! This is a very good catch point!! I was never aware of such corner > case. > > > > >> list_add_tail(&e->list, &con->outqueue); > >> spin_unlock_bh(&con->outqueue_lock); > >> > >> - if (!queue_work(s->send_wq, &con->swork)) > >> + sk = con->sock->sk; > >> + > > You don't know if the socket still exists here. We may be in the middle of a > tipc_conn_close(). > > Furthermore, in the case of a kernel subscriber, there *is* no socket. > > But I am not sure this patch is needed at all ?? > > Sorry, I missed the case where con->sock is null. But the change is really > necessary for another case where con->sock is not null. If we don't hold > sk_callback_lock read lock and don't identify whether > tipc_unregister_callbac() has been called, but directly queue swork item to > sending workqueue, this will open another risk window that a new tx work > item might be queued through tipc_conn_sendmsg() again after all tx/rx > pending work items have been flushed in tipc_close_conn() conducted in > patch #2. As a result, the context of freeing con instance might be not in > tipc_server_stop() any more, which means the nametbl_lock soft lockup at > module exit might occur again.
If you in tipc_conn_kref_release() remove the connection item from the idr list *before* you flush the queues and remove the socket, there is no risk that this will happen. If tipc_conn_sendmsg() obtains a pointer, we can be certain that the queue has not been flushed yet, and that this will not happen until conn_put() is called. And, you won't need to use the callback lock or touch the socket at all here. Regards ///jon ------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot _______________________________________________ tipc-discussion mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/tipc-discussion
