On Wed, Jun 11, 2014 at 04:12:15PM -0700, Eric W. Biederman wrote:
> "Paul E. McKenney" <paul...@linux.vnet.ibm.com> writes:
>
> > On Wed, Jun 11, 2014 at 01:46:08PM -0700, Eric W. Biederman wrote:
> >> On the chance it is dropping the old nsproxy which calls synchronize_rcu
> >> in switch_task_namespaces that is causing you problems I have attached
> >> a patch that changes from rcu_read_lock to task_lock for code that
> >> calls task_nsproxy from a different task.  The code should be safe
> >> and it should be an unquestionable performance improvement but I have
> >> only compile tested it.
> >>
> >> If you can try the patch it will tell us if the problem is the rcu
> >> access in switch_task_namespaces (the only one I am aware of in network
> >> namespace creation) or if the problem rcu case is somewhere else.
> >>
> >> If nothing else knowing which rcu accesses are causing the slow down
> >> seems important at the end of the day.
> >>
> >> Eric
> >>
> >
> > If this is the culprit, another approach would be to use workqueues from
> > RCU callbacks.  The following (untested, probably does not even build)
> > patch illustrates one such approach.
>
> For reference the only reason we are using rcu_lock today for nsproxy is
> an old lock ordering problem that does not exist anymore.
>
> I can say that in some workloads setns is a bit heavy today because of
> the synchronize_rcu and setns is more important than I had previously
> thought because pthreads break the classic unix ability to do things in
> your process after fork() (sigh).
>
> Today daemonize is gone, and notifying the parent process with a signal
> relies on task_active_pid_ns which does not use nsproxy.  So the old
> lock ordering problem/race is gone.
>
> Below is the description of what was happening when the code switched
> from task_lock to rcu_read_lock to protect nsproxy.

OK, never mind, then!  ;-)

                                                        Thanx, Paul

> commit cf7b708c8d1d7a27736771bcf4c457b332b0f818
> Author: Pavel Emelyanov <xe...@openvz.org>
> Date:   Thu Oct 18 23:39:54 2007 -0700
>
>     Make access to task's nsproxy lighter
>
>     When someone wants to deal with some other task's namespaces it has to
>     lock the task and then to get the desired namespace if the one exists.
>     This is slow on read-only paths and may be impossible in some cases.
>
>     E.g. Oleg recently noticed a race between unshare() and the (sent for
>     review in cgroups) pid namespaces - when the task notifies the parent it
>     has to know the parent's namespace, but taking the task_lock() is
>     impossible there - the code is under write locked tasklist lock.
>
>     On the other hand switching the namespace on task (daemonize) and
>     releasing the namespace (after the last task exit) is rather rare
>     operation and we can sacrifice its speed to solve the issues above.
>
>     The access to other task namespaces is proposed to be performed
>     like this:
>
>          rcu_read_lock();
>          nsproxy = task_nsproxy(tsk);
>          if (nsproxy != NULL) {
>                  / *
>                    * work with the namespaces here
>                    * e.g. get the reference on one of them
>                    * /
>          } / *
>              * NULL task_nsproxy() means that this task is
>              * almost dead (zombie)
>              * /
>          rcu_read_unlock();
>
>     This patch has passed the review by Eric and Oleg :) and,
>     of course, tested.
>
>     [c...@fr.ibm.com: fix unshare()]
>     [ebied...@xmission.com: Update get_net_ns_by_pid]
>     Signed-off-by: Pavel Emelyanov <xe...@openvz.org>
>     Signed-off-by: Eric W. Biederman <ebied...@xmission.com>
>     Cc: Oleg Nesterov <o...@tv-sign.ru>
>     Cc: Paul E. McKenney <paul...@linux.vnet.ibm.com>
>     Cc: Serge Hallyn <se...@us.ibm.com>
>     Signed-off-by: Cedric Le Goater <c...@fr.ibm.com>
>     Signed-off-by: Andrew Morton <a...@linux-foundation.org>
>     Signed-off-by: Linus Torvalds <torva...@linux-foundation.org>
>
> Eric
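
For readers following the thread, here is a rough userspace model of the
task_lock-based access that Eric's (unposted here) patch moves toward. It is
only a sketch and not the kernel code: a pthread mutex stands in for
task_lock(), and every model_* name and struct below is hypothetical. The
point it tries to show is that once readers and the writer serialize on the
same per-task lock, the writer gets the old nsproxy back without waiting for
a grace period.

```c
/* Userspace model only.  Nothing here is kernel API; all model_* names
 * are hypothetical stand-ins chosen for this illustration. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

struct model_net_ns {
	atomic_int refcount;	/* shared by several tasks, hence atomic */
	int id;
};

struct model_nsproxy {
	struct model_net_ns *net_ns;
};

struct model_task {
	pthread_mutex_t alloc_lock;	/* plays the role of task_lock() */
	struct model_nsproxy *nsproxy;	/* NULL once the task is almost dead */
};

/* Reader: take a reference on another task's net namespace while
 * holding that task's lock, so the nsproxy cannot be swapped under us. */
static struct model_net_ns *model_get_net_ns(struct model_task *tsk)
{
	struct model_net_ns *ns = NULL;

	assert(tsk != NULL);
	pthread_mutex_lock(&tsk->alloc_lock);
	if (tsk->nsproxy != NULL) {
		ns = tsk->nsproxy->net_ns;
		atomic_fetch_add(&ns->refcount, 1);
	}
	pthread_mutex_unlock(&tsk->alloc_lock);
	return ns;	/* NULL means the task had no nsproxy left */
}

/* Writer: install a new nsproxy under the same lock and hand back the
 * old one.  No reader can still be inside the critical section with the
 * old pointer, so the caller may drop it right away. */
static struct model_nsproxy *
model_switch_task_namespaces(struct model_task *tsk, struct model_nsproxy *new)
{
	struct model_nsproxy *old;

	assert(tsk != NULL);
	pthread_mutex_lock(&tsk->alloc_lock);
	old = tsk->nsproxy;
	tsk->nsproxy = new;
	pthread_mutex_unlock(&tsk->alloc_lock);
	return old;
}
```

A caller would initialize alloc_lock with PTHREAD_MUTEX_INITIALIZER, call
model_get_net_ns() from any thread, and treat a NULL return the same way the
commit message treats a NULL task_nsproxy(). The trade-off relative to the
rcu_read_lock scheme quoted above is that readers now take a lock, in
exchange for a writer that never blocks in synchronize_rcu.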