On Sat, Jul 16, 2016 at 07:10:07PM +0200, Oleg Nesterov wrote:
> Peter, Paul, could you review? Do you see any hole?
> 
> Why. Firstly, note that the state machine was greatly simplified, and
> rsp->cb_state has gone, we have a single "state" variable, gp_state.
> Note also the ->sync() op has gone (and actually ->wait() too, see the
> "TODO" comment).

Yes, that seems like a nice simplification.

> GP_IDLE       - owned by __rcu_sync_enter() which can only move this state to
> 
> GP_ENTER - owned by rcu-callback which moves it to
> 
> GP_PASSED - owned by the last rcu_sync_exit() which moves it to
> 
> GP_EXIT - owned by rcu-callback which moves it back to GP_IDLE.
> 
> Yes, this is a bit simplified, we also have GP_REPLAY, but hopefully
> clear.
> 
> And, there is another possible transition, GP_ENTER -> GP_IDLE, because
> not it is possible to call __rcu_sync_enter() and rcu_sync_exit() in any
> state (except obviously they should be balanced), and they do not block.
> 
> The only blocking call is __rcu_sync_wait() which actually waits for GP.
> Obviously should only be called if gp_count != 0, iow after __rcu_sync_enter.

So I'm a complete moron today, to aid the failing brain I draw pictures.

I think I ended up with this:

 .----> GP_IDLE <--------------.
 |        |                    |
 |        | __rcu_sync_enter() | <GP> / rcu_sync_exit()
 |        v                    |
 |      GP_ENTER --------------'
 |        |
 |  <GP>  |
 |        v
 |      GP_PASSED <---------.
 |        |                 |
 |        | rcu_sync_exit() | <GP> / __rcu_sync_enter()
 |        v                 |
 `----- GP_EXIT ------------'
          ^
    <GP>  | __rcu_sync_enter() + rcu_sync_exit()
          v
        GP_RETRY


> enum { GP_IDLE = 0, GP_ENTER, GP_PASSED, GP_EXIT, GP_REPLAY };
> 
> static void rcu_sync_func(struct rcu_head *rcu);
> 
> static void rcu_sync_call(struct rcu_sync *rsp)
> {
>       // TODO: THIS IS SUBOPTIMAL. We want to call it directly
>       // if rcu_blocking_is_gp() == T, but it has might_sleep().

Not sure I get that comment..

>       gp_ops[rsp->gp_type].call(&rsp->cb_head, rcu_sync_func);
> }
> 
> static void rcu_sync_func(struct rcu_head *rcu)
> {
>       struct rcu_sync *rsp = container_of(rcu, struct rcu_sync, cb_head);
>       unsigned long flags;
> 

Right, those are 'stable' states and must be advanced through explicit
calls to __rcu_sync_enter()/rcu_sync_exit() respectively.

>       BUG_ON(rsp->gp_state == GP_IDLE);
>       BUG_ON(rsp->gp_state == GP_PASSED);
> 
>       spin_lock_irqsave(&rsp->rss_lock, flags);
>       if (rsp->gp_count) {
>               /*
>                * We're at least a GP after the first __rcu_sync_enter().
>                */
>               rsp->gp_state = GP_PASSED;

So we can end up here in two ways afaict.

The simple way: someone called __rcu_sync_enter(), we go IDLE -> ENTER
with a raised count. Once the GP passes, we get here, observe the raised
count and advance to PASSED.

The more involved way: we were EXIT and someone calls __rcu_sync_enter()
to raise the count again. The callback from rcu_sync_exit() was still
pending and once we get here we observe the raised count and voila.

Now, since state != IDLE, I suppose this is valid, but it does hurt my
brain.

Else, !count:

>       } else if (rsp->gp_state == GP_REPLAY) {

Fairly straight forward: during EXIT someone did __rcu_sync_enter +
rcu_sync_exit() and we need to wait longer still.

>               /*
>                * A new rcu_sync_exit() has happened; requeue the callback
>                * to catch a later GP.
>                */
>               rsp->gp_state = GP_EXIT;
>               rcu_sync_call(rsp);
>       } else {

Again two ways here, simple: we were EXIT, GP passed, we let em rip.

But its also possible, as you noted, that someone called rcu_sync_exit()
during ENTER and we ended up with !count which gets us back to IDLE.

>               /*
>                * We're at least a GP after the last rcu_sync_exit();
>                * eveybody will now have observed the write side critical
>                * section. Let 'em rip!.
>                *
>                * OR. ->gp_state can be still GP_ENTER if __rcu_sync_wait()
>                * wasn't called after __rcu_sync_enter(), abort.
>                */
>               rsp->gp_state = GP_IDLE;
>       }
>       spin_unlock_irqrestore(&rsp->rss_lock, flags);
> }

Agreed on the rest.

So I think its solid, but you failed to mention one state transition,
which seems ok in any case.


Reply via email to