On Sat, Jul 16, 2016 at 07:10:07PM +0200, Oleg Nesterov wrote: > Peter, Paul, could you review? Do you see any hole? > > Why. Firstly, note that the state machine was greatly simplified, and > rsp->cb_state has gone, we have a single "state" variable, gp_state. > Note also the ->sync() op has gone (and actually ->wait() too, see the > "TODO" comment).
Yes, that seems like a nice simplification. > GP_IDLE - owned by __rcu_sync_enter() which can only move this state to > > GP_ENTER - owned by rcu-callback which moves it to > > GP_PASSED - owned by the last rcu_sync_exit() which moves it to > > GP_EXIT - owned by rcu-callback which moves it back to GP_IDLE. > > Yes, this is a bit simplified, we also have GP_REPLAY, but hopefully > clear. > > And, there is another possible transition, GP_ENTER -> GP_IDLE, because > not it is possible to call __rcu_sync_enter() and rcu_sync_exit() in any > state (except obviously they should be balanced), and they do not block. > > The only blocking call is __rcu_sync_wait() which actually waits for GP. > Obviously should only be called if gp_count != 0, iow after __rcu_sync_enter. So I'm a complete moron today, to aid the failing brain I draw pictures. I think I ended up with this: .----> GP_IDLE <--------------. | | | | | __rcu_sync_enter() | <GP> / rcu_sync_exit() | v | | GP_ENTER --------------' | | | <GP> | | v | GP_PASSED <---------. | | | | | rcu_sync_exit() | <GP> / __rcu_sync_enter() | v | `----- GP_EXIT ------------' ^ <GP> | __rcu_sync_enter() + rcu_sync_exit() v GP_RETRY > enum { GP_IDLE = 0, GP_ENTER, GP_PASSED, GP_EXIT, GP_REPLAY }; > > static void rcu_sync_func(struct rcu_head *rcu); > > static void rcu_sync_call(struct rcu_sync *rsp) > { > // TODO: THIS IS SUBOPTIMAL. We want to call it directly > // if rcu_blocking_is_gp() == T, but it has might_sleep(). Not sure I get that comment.. > gp_ops[rsp->gp_type].call(&rsp->cb_head, rcu_sync_func); > } > > static void rcu_sync_func(struct rcu_head *rcu) > { > struct rcu_sync *rsp = container_of(rcu, struct rcu_sync, cb_head); > unsigned long flags; > Right, those are 'stable' states and must be advanced through explicit calls to __rcu_sync_enter()/rcu_sync_exit() respectively. > BUG_ON(rsp->gp_state == GP_IDLE); > BUG_ON(rsp->gp_state == GP_PASSED); > > spin_lock_irqsave(&rsp->rss_lock, flags); > if (rsp->gp_count) { > /* > * We're at least a GP after the first __rcu_sync_enter(). > */ > rsp->gp_state = GP_PASSED; So we can end up here in two ways afaict. The simple way: someone called __rcu_sync_enter(), we go IDLE -> ENTER with a raised count. Once the GP passes, we get here, observe the raised count and advance to PASSED. The more involved way: we were EXIT and someone calls __rcu_sync_enter() to raise the count again. The callback from rcu_sync_exit() was still pending and once we get here we observe the raised count and voila. Now, since state != IDLE, I suppose this is valid, but it does hurt my brain. Else, !count: > } else if (rsp->gp_state == GP_REPLAY) { Fairly straight forward: during EXIT someone did __rcu_sync_enter + rcu_sync_exit() and we need to wait longer still. > /* > * A new rcu_sync_exit() has happened; requeue the callback > * to catch a later GP. > */ > rsp->gp_state = GP_EXIT; > rcu_sync_call(rsp); > } else { Again two ways here, simple: we were EXIT, GP passed, we let em rip. But its also possible, as you noted, that someone called rcu_sync_exit() during ENTER and we ended up with !count which gets us back to IDLE. > /* > * We're at least a GP after the last rcu_sync_exit(); > * eveybody will now have observed the write side critical > * section. Let 'em rip!. > * > * OR. ->gp_state can be still GP_ENTER if __rcu_sync_wait() > * wasn't called after __rcu_sync_enter(), abort. > */ > rsp->gp_state = GP_IDLE; > } > spin_unlock_irqrestore(&rsp->rss_lock, flags); > } Agreed on the rest. So I think its solid, but you failed to mention one state transition, which seems ok in any case.