On Fri, Sep 11, 2015 at 05:59:01PM +0200, Oleg Nesterov wrote:
> On 09/10, Paul E. McKenney wrote:
> >
> > On Thu, Sep 10, 2015 at 03:59:42PM +0200, Oleg Nesterov wrote:
> > > On 09/09, Paul E. McKenney wrote:
> > > >
> > > > This is obsolete, but its replacement is the same patch.
> > >
> > > fbe3b97183f84155d81e506b1aa7d2ce986f7a36 in linux-rcu.git#experimental
> > > I guess?
> > >
> > > > Oleg, Davidlohr, am I missing something on how percpu_rwsem or
> > > > locktorture work?
> > >
> > > No, I think the patch is fine. Thanks for doing this! I was going to
> > > send something like this change too. And in fact I am still thinking
> > > about another test which plays with rcu_sync only, but probably we
> > > need some cleanups first (and we need them anyway). I'll try to do
> > > this a bit later.
> >
> > I would welcome an rcu_sync-specific torture patch!
> 
> I want it much more than you ;) I have already warned you, I'll send
> more rcu_sync patches. The current code is actually a very early draft
> which was written during the discussion with Peter long ago. I sent
> it unchanged because a) it was already reviewed and b) I tested it a
> bit in the past.
> 
> We can greatly simplify this code and at the same time make it more
> useful. Actually I already have the patches. The 1st one removes
> rcu_sync->cb_state and gp_ops->sync(). This makes the state machine
> almost self-obvious and allows other improvements. See the resulting
> (pseudo) code at the end.
> 
> But again, I'll try very much to write the test before I send the patch.

That sounds very good!  ;-)

> Until then, let me send this trivial cleanup. The CONFIG_PROVE_RCU
> code looks trivial but imo really annoying. And it is not complete,
> so let's document this at least. Plus rcu_lockdep_assert() looks more
> consistent.
> 
> 
> > > > +void torture_percpu_rwsem_init(void)
> > > > +{
> > > > +       BUG_ON(percpu_init_rwsem(&pcpu_rwsem));
> > > > +}
> > > > +
> > >
> > > Aha, we don't really need this... I mean we can use the static initialiser
> > > which can also be used by uprobes and cgroups. I'll try to send the patch
> > > tomorrow.
> >
> > Very good, please do!
> 
> Hmm. I am a liar. I won't send this patch today, at least.
> 
> The change I had in mind is very simple,
> 
>       #define DECLARE_PERCPU_RWSEM(sem)                               \
>               static DEFINE_PER_CPU(unsigned int, sem##_counters);    \
>               struct percpu_rw_semaphore sem = {                      \
>                       .fast_read_ctr = &sem##_counters,               \
>                       ...                                             \
>               }
>                       
> and yes, uprobes and cgroups can use it.
> 
> But somehow I missed that we can't use it to define a _static_ sem,
> 
>       static DECLARE_PERCPU_RWSEM(sem);
> 
> obviously won't work. And damn, I am embarrassed to admit that I spent several
> hours trying to invent something but failed. Perhaps we can add 2 helpers,
> DECLARE_PERCPU_RWSEM_GLOBAL() and DECLARE_PERCPU_RWSEM_STATIC().

That is indeed what we do for SRCU for the same reason, DEFINE_SRCU()
and DEFINE_STATIC_SRCU(), but with a common __DEFINE_SRCU() doing the
actual work.

                                                        Thanx, Paul

> Oleg.
> 
> -------------------------------------------------------------------------------
> static const struct {
>       void (*call)(struct rcu_head *, void (*)(struct rcu_head *));
>       void (*wait)(void);     // TODO: remove this
> #ifdef CONFIG_PROVE_RCU
>       int  (*held)(void);
> #endif
> } gp_ops[] = {
>       ...
> };
> 
> // Writer-side grace-period states:
> //   GP_IDLE   - no writers; readers may use the fast path
> //   GP_ENTER  - rcu_sync_enter() called; a GP is in flight
> //   GP_PASSED - that GP completed; readers are on the slow path
> //   GP_EXIT   - last rcu_sync_exit() called; a GP is in flight back to idle
> //   GP_REPLAY - another exit() raced with GP_EXIT; one more GP is needed
> enum { GP_IDLE = 0, GP_ENTER, GP_PASSED, GP_EXIT, GP_REPLAY };
> 
> #define       rss_lock        gp_wait.lock
> 
> // !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
> // XXX code must be removed when we split rcu_sync_enter() into start + wait
> // !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
> 
> static void rcu_sync_func(struct rcu_head *rcu)
> {
>       struct rcu_sync *rsp = container_of(rcu, struct rcu_sync, cb_head);
>       unsigned long flags;
> 
>       BUG_ON(rsp->gp_state == GP_IDLE);
>       BUG_ON(rsp->gp_state == GP_PASSED);
> 
>       spin_lock_irqsave(&rsp->rss_lock, flags);
>       if (rsp->gp_count) {
>               /*
> 		 * A new rcu_sync_enter() came in while this GP was in
> 		 * flight; it has now passed, so wake up the waiters.
>                */
>               rsp->gp_state = GP_PASSED;
>               wake_up_locked(&rsp->gp_wait);
>       } else if (rsp->gp_state == GP_REPLAY) {
>               /*
>                * A new rcu_sync_exit() has happened; requeue the callback
>                * to catch a later GP.
>                */
>               rsp->gp_state = GP_EXIT;
>               gp_ops[rsp->gp_type].call(&rsp->cb_head, rcu_sync_func);
>       } else {
>               /*
> 		 * We're at least a GP after rcu_sync_exit(); everybody will now
> 		 * have observed the write side critical section. Let 'em rip!
>                */
>               BUG_ON(rsp->gp_state == GP_ENTER);      // XXX
>               rsp->gp_state = GP_IDLE;
>       }
>       spin_unlock_irqrestore(&rsp->rss_lock, flags);
> }
> 
> static void rcu_sync_call(struct rcu_sync *rsp)
> {
>       // TODO:
>       // This is called by might_sleep() code outside of ->rss_lock,
>       // we can avoid ->call() in some cases (say rcu_blocking_is_gp())
>       gp_ops[rsp->gp_type].call(&rsp->cb_head, rcu_sync_func);
> }
> 
> void rcu_sync_enter(struct rcu_sync *rsp)
> {
>       int gp_count, gp_state;
> 
>       spin_lock_irq(&rsp->rss_lock);
>       gp_count = rsp->gp_count++;
>       gp_state = rsp->gp_state;
>       if (gp_state == GP_IDLE)
>               rsp->gp_state = GP_ENTER;
>       spin_unlock_irq(&rsp->rss_lock);
> 
>       BUG_ON(gp_count != 0 && gp_state == GP_IDLE);
>       BUG_ON(gp_count == 0 && gp_state == GP_PASSED);
>       BUG_ON(gp_count == 0 && gp_state == GP_ENTER); // XXX
> 
>       if (gp_state == GP_IDLE)
>               rcu_sync_call(rsp);
> 
>       wait_event(rsp->gp_wait, rsp->gp_state != GP_ENTER);
>       BUG_ON(rsp->gp_state < GP_PASSED);
> }
> 
> void rcu_sync_exit(struct rcu_sync *rsp)
> {
> 	bool need_call = false;
> 
>       BUG_ON(rsp->gp_state == GP_IDLE);
>       BUG_ON(rsp->gp_state == GP_ENTER);      // XXX
> 
>       spin_lock_irq(&rsp->rss_lock);
>       if (!--rsp->gp_count) {
>               if (rsp->gp_state == GP_PASSED) {
>                       need_call = true;
>                       rsp->gp_state = GP_EXIT;
>               } else if (rsp->gp_state == GP_EXIT) {
>                       rsp->gp_state = GP_REPLAY;
>               }
>       }
>       spin_unlock_irq(&rsp->rss_lock);
> 
> 	// We do not care if another enter() or even exit() comes in after
> 	// spin_unlock(): the callback queued below re-checks ->gp_count and
> 	// ->gp_state under ->rss_lock, so a late enter() is handled by the
> 	// GP_PASSED path and a late exit() by the GP_REPLAY path.
>       if (need_call)
>               rcu_sync_call(rsp);
> }
> 
> void rcu_sync_dtor(struct rcu_sync *rsp)
> {
>       int gp_state;
> 
>       BUG_ON(rsp->gp_count);
>       BUG_ON(rsp->gp_state == GP_ENTER);      // XXX
>       BUG_ON(rsp->gp_state == GP_PASSED);
> 
>       spin_lock_irq(&rsp->rss_lock);
>       if (rsp->gp_state == GP_REPLAY)
>               rsp->gp_state = GP_EXIT;
>       gp_state = rsp->gp_state;
>       spin_unlock_irq(&rsp->rss_lock);
> 
>       // TODO: add another wake_up_locked() into rcu_sync_func(),
>       // use wait_event + spin_lock_wait, remove gp_ops->wait().
> 
>       if (gp_state != GP_IDLE) {
>               gp_ops[rsp->gp_type].wait();
>               BUG_ON(rsp->gp_state != GP_IDLE);
>       }
> }
> 
