Hi Paul,

Thanks a lot for your comments. My replies are inline:

On Mon, Jun 25, 2018 at 10:19:20AM -0700, Paul E. McKenney wrote:
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> When I traced rdtp->dynticks_nesting, I could only find its
> value to be either a 0 or a 1. However looking back at old kernel
> sources, it appears that these can be nested because of so called
> “half-interrupts”. I believe these are basically interrupts
> that cause a transition to usermode due to usermode upcalls
> (usermode helper subsystem). So a nesting situation could be
> something like: 1. Transition from idle to process context which
> makes dynticks_nesting == 1. Next, an interrupt comes in which
> makes a usermode upcall. This usermode call now makes a system
> call causing entry back into process context, which increments
> the dynticks_nesting counter to 2. Such a crazy situation is
> perhaps possible.
>
> The half-interrupts can instead cause ->dynticks_nmi_nesting to either
> fail to return to zero or to go negative, depending on which half of

Actually, in the above paragraph I was referring to a "half interrupt"
messing up dynticks_nesting, not dynticks_nmi_nesting. I know that the
latter can be messed up too, but I wasn't referring to
dynticks_nmi_nesting in this part of the article. I was thinking more in
terms of the comment in:
https://elixir.bootlin.com/linux/v3.19.8/source/kernel/rcu/rcu.h#L34

/*
 * Process-level increment to ->dynticks_nesting field. This allows for
 * architectures that use half-interrupts and half-exceptions from
 * process context.
 ...
 */

In my hypothetical example above that you quoted from my notes, I was
trying to reason about how taking a half-interrupt in process context
can cause dynticks_nesting to increase to 2.
Thinking some more though, I am not sure how the hypothetical example I
mentioned above can cause this ;) since the transition to usermode from
the half-interrupt should have corrected the dynticks_nesting counter
through the call chain rcu_user_enter->rcu_eqs_enter?

> the interrupt was present. I don't immediately recall the reason for
> allowing nested process-level entry/exit. Might be another place to
> put a WARN_ON_ONCE(), as eliminating this capability would save another
> conditional branch.

Sure, sounds good to me.

>
> Any time the rdtp->dynticks counter’s second-lowest most bit
> is not set, we are in an EQS, and if it’s set, then we are not
> (second lowest because lowest is reserved for something else as
> of v4.18-rc1). This function is not useful to check if we’re
> in an EQS from a timer tick though, because it’s possible the
> timer tick interrupt entry caused an EQS exit which updated
> the counter. IOW, the ‘dynticks’ counter is not capable of
> checking if we had already exited the EQS before. To check if
> we were in an EQS or not from the timer tick, we instead must
> use dynticks_nesting counter. More on that later. The above
> function is probably just useful to make sure that interrupt
> entry/exit is properly updating the dynticks counter, and also
> to make sure from non-interrupt context that RCU is in an EQS
> (see rcu_gp_fqs function).
>
> You lost me on this one. There is rcu_is_cpu_rrupt_from_idle(), but
> I am not sure what you are trying to achieve here, so I am not sure
> whether this function does what you want.

Sorry about that. Let me try to explain in detail why I wrote the above
paragraph when talking about rdtp->dynticks. I was trying to work out
how the RCU code determines whether the CPU is idle. From the code, it
appears that there are 2 ways it does so:

1. By calling rcu_is_cpu_rrupt_from_idle(), which checks the
   dynticks_nesting counter. If the counter is 0, then the CPU was idle
   at the time of the check. This is how rcu_check_callbacks() knows
   that the CPU was idle.

2. By checking the evenness of the dynticks counter. If it's even, we
   were idle (or perhaps in usermode, but I think that extra inference
   doesn't hurt). This is done in rcu_dynticks_curr_cpu_in_eqs().

So basically, there are 2 different counters that seem to serve the same
purpose as far as determining whether we're in an idle EQS state goes.
Right?

Then I was trying to see why we can't just use method 2 in
rcu_check_callbacks() to determine if the "timer interrupt was taken
while the CPU was idle": rcu_check_callbacks() could simply call
rcu_dynticks_curr_cpu_in_eqs(). I was trying to convince myself why that
wouldn't work.

I concluded that it wouldn't work because the timer interrupt that led
to the rcu_check_callbacks() call would have tainted the dynticks
counter, since it would have called rcu_nmi_enter() during interrupt
entry. So there's no way to know if the CPU was really idle at the time
of the interrupt if we were to rely on rcu_dynticks_curr_cpu_in_eqs()
for that. Hence we would need to rely on method 1 for the "did I take an
interrupt while I was idle" determination in rcu_check_callbacks(),
which uses the dynticks_nesting counter.

Does that make sense?

>
> When dynticks_nesting is decremented to 0 (the outermost
> process-context nesting level exit causes an eqs-entry), the
> dynticks_nmi_nesting is reset to
>
> I think you want "0." at the end of this sentence. Or maybe my browser
> is messing things up.

Yes, the "0." was on the next line, but I moved it back to the previous
line so it's easier to read. Thanks for letting me know.

Thanks!

- Joel
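
For reference, the difference between the two idle checks discussed
above can be shown with a small userspace sketch. This is only a toy
model under stated assumptions, not the kernel's implementation: every
model_* name is hypothetical, only the counter update on interrupt entry
is modeled, and the bit layout follows the v4.18-rc1 description quoted
earlier (lowest bit reserved, second-lowest bit clear while in an EQS).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Toy userspace model, NOT kernel code. All model_* names are
 * hypothetical; the bit layout is an assumption taken from the
 * v4.18-rc1 description in this thread.
 */
#define MODEL_CTRL_CTR 0x2u

struct model_dynticks {
	long nesting;		/* stands in for ->dynticks_nesting */
	unsigned int ctr;	/* stands in for ->dynticks */
};

/* Method 2: in an EQS iff the second-lowest counter bit is clear. */
static bool model_in_eqs(const struct model_dynticks *d)
{
	return !(d->ctr & MODEL_CTRL_CTR);
}

/* Interrupt entry: leaving an EQS bumps the counter; nesting is untouched. */
static void model_irq_enter(struct model_dynticks *d)
{
	if (model_in_eqs(d))
		d->ctr += MODEL_CTRL_CTR;
}

/* Method 1: the process-level nesting count still says "idle". */
static bool model_irq_from_idle(const struct model_dynticks *d)
{
	return d->nesting == 0;
}

static void model_demo(void)
{
	struct model_dynticks idle = { .nesting = 0, .ctr = 0 };
	struct model_dynticks busy = { .nesting = 1, .ctr = MODEL_CTRL_CTR };

	model_irq_enter(&idle);	/* timer tick arrives on an idle CPU */
	model_irq_enter(&busy);	/* timer tick arrives in process context */

	/* Method 2 gives the same answer for both CPUs after irq entry... */
	assert(!model_in_eqs(&idle) && !model_in_eqs(&busy));
	/* ...while method 1 still tells them apart. */
	assert(model_irq_from_idle(&idle) && !model_irq_from_idle(&busy));

	printf("idle cpu: in_eqs=%d from_idle=%d\n",
	       model_in_eqs(&idle), model_irq_from_idle(&idle));
	printf("busy cpu: in_eqs=%d from_idle=%d\n",
	       model_in_eqs(&busy), model_irq_from_idle(&busy));
}
```

Calling model_demo() from a main() exercises both cases. In this model,
once the interrupt entry has run, the counter-based check can no longer
distinguish the CPU that was idle from the one that was not, which is
the reason given above for relying on the nesting counter in the tick
path.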