On Wed, Feb 26, 2014 at 10:44:32PM +0900, Tetsuo Handa wrote:
> Thank you for reviewing, Paul.

No problem, but you do also need to address Lai Jiangshan's and
Peter Zijlstra's questions about the purpose of this patch.  I was
looking at it only from a "does it work?" viewpoint.

> Paul E. McKenney wrote:
> > > +/**
> > > + * set_task_comm - Change task_struct->comm with tracer and perf hooks called.
> > > + *
> > > + * @tsk: Pointer to "struct task_struct".
> > > + * @buf: New comm name.
> > > + *
> > > + * Returns nothing.
> > > + */
> > > +void set_task_comm(struct task_struct *tsk, char *buf)
> > > +{
> > > + /*
> > > +  * Question: Do we need to use task_lock()/task_unlock() ?
> > 
> > The have-memory path through do_set_task_comm() does tolerate multiple
> > CPUs concurrently setting the comm of a given task, but the no-memory
> > path does not.  I advise keeping the locking, at least unless/until
> > some workload is having lots of CPUs concurrently changing a given
> > task's comm.  For some good reason, I hasten to add!  ;-)
> > 
> That question meant whether trace_task_rename(tsk, buf) needs to be serialized
> by tsk->alloc_lock. If trace_task_rename(tsk, buf) doesn't need to be
> serialized by tsk->alloc_lock, we can remove task_lock()/task_unlock().

I was attempting to answer that question.  ;-)
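
To make that suggestion concrete, keeping the locking would look something
like the sketch below.  It reuses do_set_task_comm() from your patch plus the
existing trace/perf hooks, and is only an illustration, not a tested change:

void set_task_comm(struct task_struct *tsk, char *buf)
{
        task_lock(tsk);
        /* The tracer still sees renames serialized by tsk->alloc_lock. */
        trace_task_rename(tsk, buf);
        do_set_task_comm(tsk, buf);
        task_unlock(tsk);
        perf_event_comm(tsk);
}

Of course this keeps do_set_task_comm() under a spinlock, which is why it
would still need GFP_ATOMIC.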

> > Another reason to get rid of the lock is to allow do_set_task_comm()
> > to sleep waiting for memory to become available, but I am not sure that
> > this is a good approach.
> 
> I used GFP_ATOMIC because I don't know whether there are callers that depend
> on "changing task->comm does not sleep"; until today, changing task->comm never
> slept.

OK...

> > > +void do_set_task_comm(struct task_struct *tsk, const char *buf)
> > > +{
> > > + struct rcu_comm *comm;
> > > +
> > > + comm = kmem_cache_zalloc(commname_cachep, GFP_ATOMIC | __GFP_NOWARN);
> > > + if (likely(comm)) {
> > > +         atomic_set(&comm->usage, 1);
> > > +         strncpy(comm->name, buf, TASK_COMM_LEN - 1);
> > > +         smp_wmb(); /* Behave like rcu_assign_pointer(). */
> > > +         comm = xchg(&tsk->rcu_comm, comm);
> > 
> > The value-returning xchg() implies a full memory barrier, so the
> > smp_wmb() is not needed.
> > 
> I see.
> 
> > > +         put_commname(comm);
> > > + } else {
> > > +         /*
> > > +          * Memory allocation failed. We have to tolerate loss of
> > > +          * consistency.
> > > +          *
> > > +          * Question 1: Do we want to reserve some amount of static
> > > +          * "struct rcu_comm" arrays for use upon allocation failures?
> > > +          *
> > > +          * Question 2: Do we prefer unchanged comm name over
> > > +          * overwritten comm name because comm name is copy-on-write?
> > > +          */
> > > +         WARN_ONCE(1, "Failed to allocate memory for comm name.\n");
> > > +         rcu_read_lock();
> > > +         do {
> > > +                 comm = rcu_dereference(tsk->rcu_comm);
> > > +         } while (!atomic_read(&comm->usage));
> > 
> > So the idea here is to avoid races with someone who might be freeing
> > this same ->rcu_comm structure, right?  If we see a non-zero reference,
> > they might still free it, but that would be OK because we are in an
> > RCU read-side critical section, so it won't actually be freed until
> > we get done.  And our change is being overwritten by someone else's
> > in that case, but that is OK because it could happen anyway.
> > 
> Right.
> 
> > So the loop shouldn't go through many cycles, especially if memory
> > remains low.
> > 
> > If I am correct, might be worth a comment.  Doubly so if I am wrong.  ;-)
> > 
> You are correct.
> 
> What about Q1 and Q2, as in the code below?

I suggest reviewing the LKML thread that Peter Zijlstra pointed you at
before digging too much further into this.  Especially this one:

        https://lkml.org/lkml/2011/5/18/408

                                                        Thanx, Paul

>   /* Usage is initialized with ATOMIC_INIT(-1). */
>   static struct rcu_comm static_rcu_comm[CONFIG_RESERVED_COMMBUFFER];
> 
>   static void free_commname(struct rcu_head *rcu)
>   {
>       struct rcu_comm *comm = container_of(rcu, struct rcu_comm, rcu);
>       if (comm < &static_rcu_comm[0] ||
>           comm >= &static_rcu_comm[CONFIG_RESERVED_COMMBUFFER])
>               kmem_cache_free(commname_cachep, comm);
>       else
>               atomic_set(&comm->usage, -1);
>   }
> 
>   void do_set_task_comm(struct task_struct *tsk, const char *buf)
>   {
>       struct rcu_comm *comm;
>   
>       comm = kmem_cache_zalloc(commname_cachep, GFP_ATOMIC | __GFP_NOWARN);
>       if (likely(comm))
>               atomic_set(&comm->usage, 1);
>       else {
>               int i;
>   
>               /* Memory allocation failed. Try static comm name buffer. */
>               for (i = 0; i < CONFIG_RESERVED_COMMBUFFER; i++) {
>                       comm = &static_rcu_comm[i];
>                       if (atomic_read(&comm->usage) != -1)
>                               continue;
>                       if (atomic_add_return(2, &comm->usage) == 1)
>                               goto found;
>                       put_commname(comm);
>                       put_commname(comm);
>               }
>               /* No comm name buffer available. Keep current comm name. */
>               WARN_ONCE(1, "Failed to allocate memory for comm name.\n");
>               return;
>       }
>   found:
>       strncpy(comm->name, buf, TASK_COMM_LEN - 1);
>       comm = xchg(&tsk->rcu_comm, comm);
>       put_commname(comm);
>   }
> 
> Since we can preallocate rcu_comm using GFP_KERNEL and return -ENOMEM (e.g.
> prepare_bprm_creds() for the execve() case, copy_from_user() for the
> prctl(PR_SET_NAME) and comm_write() (i.e. /proc/$pid/comm) cases), there will
> be few users of static_rcu_comm[] (e.g. kthreadd()).
> 
> 
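
For reference, the put_commname() used above would presumably be something
along the following lines, given the usage counter and the rcu_head embedded
in struct rcu_comm.  Again, this is only a sketch, not part of the posted
patch:

static void put_commname(struct rcu_comm *comm)
{
        /* Last reference dropped: defer the free until readers are done. */
        if (atomic_dec_and_test(&comm->usage))
                call_rcu(&comm->rcu, free_commname);
}
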
> > > @@ -1191,6 +1195,11 @@ static struct task_struct *copy_process(unsigned long clone_flags,
> > >   p = dup_task_struct(current);
> > >   if (!p)
> > >           goto fork_out;
> > > + rcu_read_lock();
> > > + do {
> > > +         p->rcu_comm = rcu_dereference(current->rcu_comm);
> > > + } while (!atomic_inc_not_zero(&p->rcu_comm->usage));
> > 
> > OK, loop until we get a reference on the comm...  But
> > I suggest an rcu_read_unlock() followed by an rcu_read_lock() as the
> > first thing in the loop.  Just to prevent adding an RCU CPU stall to
> > our woes if we cannot get a reference.
> > 
> I see. I will use
> 
>   p->rcu_comm = NULL;
>   do {
>       struct rcu_comm *comm;
>       rcu_read_lock();
>       comm = rcu_dereference(current->rcu_comm);
>       if (atomic_inc_not_zero(&comm->usage))
>               p->rcu_comm = comm;
>       rcu_read_unlock();
>   } while (!p->rcu_comm);
> 
> in the next version (if we use RCU approach).
> 
> > And here the two tasks (parent and child) share the name until one
> > or the other either changes its name or exits.  OK.
> > 
> Yes, this is a copy-on-write approach.
> 
> Another approach could use heuristic atomicity: change from "char comm[16]"
> to "long comm[16/sizeof(long)]" and read/write using "long". Since reads and
> writes of a "long" are atomic, this reduces the possibility of readers seeing
> a mixture of the current comm name and the new comm name.
> 
> The horizontal axis of the map below is the strlen() of the current comm name
> and the vertical axis is the strlen() of the comm name to update to. If
> sizeof(long) == 8, 193 out of 256 patterns can be updated atomically without
> any locks. Moreover, since the strlen() of at least one of the current and new
> comm names tends to be shorter than sizeof(long) (e.g. "bash" -> "ls",
> "rc.sysinit" -> "cat", "bash" -> "avahi-daemon"), updates of the comm name
> will more likely be atomic without any locks.
> 
>     | 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
>   --+-----------------------------------------------
>    0| o  o  o  o  o  o  o  o  o  o  o  o  o  o  o  o
>    1| o  o  o  o  o  o  o  o  o  o  o  o  o  o  o  o
>    2| o  o  o  o  o  o  o  o  o  o  o  o  o  o  o  o
>    3| o  o  o  o  o  o  o  o  o  o  o  o  o  o  o  o
>    4| o  o  o  o  o  o  o  o  o  o  o  o  o  o  o  o
>    5| o  o  o  o  o  o  o  o  o  o  o  o  o  o  o  o
>    6| o  o  o  o  o  o  o  o  o  o  o  o  o  o  o  o
>    7| o  o  o  o  o  o  o  o  o  o  o  o  o  o  o  o
>    8| o  o  o  o  o  o  o  o  o  x  x  x  x  x  x  x
>    9| o  o  o  o  o  o  o  o  x  x  x  x  x  x  x  x
>   10| o  o  o  o  o  o  o  o  x  x  x  x  x  x  x  x
>   11| o  o  o  o  o  o  o  o  x  x  x  x  x  x  x  x
>   12| o  o  o  o  o  o  o  o  x  x  x  x  x  x  x  x
>   13| o  o  o  o  o  o  o  o  x  x  x  x  x  x  x  x
>   14| o  o  o  o  o  o  o  o  x  x  x  x  x  x  x  x
>   15| o  o  o  o  o  o  o  o  x  x  x  x  x  x  x  x
> 
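
A rough sketch of that alternative, just to illustrate the idea.  The helper
name is made up, ACCESS_ONCE() is used as in the current tree, and ->comm is
assumed to have been changed to an array of long as you propose:

static void set_task_comm_wordwise(struct task_struct *tsk, const char *buf)
{
        union {
                char s[TASK_COMM_LEN];
                long l[TASK_COMM_LEN / sizeof(long)];
        } tmp = { };
        int i;

        strncpy(tmp.s, buf, TASK_COMM_LEN - 1);
        /* Each long is stored atomically, so readers never see a torn word. */
        for (i = 0; i < TASK_COMM_LEN / sizeof(long); i++)
                ACCESS_ONCE(tsk->comm[i]) = tmp.l[i];
}
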
> > > + rcu_read_unlock();
> > > 
> > >   ftrace_graph_init_task(p);
> > >   get_seccomp_filter(p);
> 
