Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code

2014-04-07 Thread Paul E. McKenney
On Mon, Apr 07, 2014 at 08:54:52PM +0200, Frederic Weisbecker wrote:
> On Mon, Apr 07, 2014 at 11:11:55AM -0700, Paul E. McKenney wrote:
> > > On the upstream code, the first read of full_sysidle_state after exiting 
> > > idle is not
> > > performed by an atomic operation. So I wonder if this is right to put this
> > > in the atomic section.
> > > 
> > > I don't know the language enough to tell if it has no effect but I'm just
> > > worried that it gets badly intepreted. Like the above read plus the below
> > > conditional write in the same atomic section gets packed in a kind of 
> > > cmpxchg_if_above() ?
> > > 
> > > Which is what we want to avoid if the value is not above 
> > > RCU_SYSIDLE_SHORT after
> > > a non atomic read.
> > 
> > Given that cmpxchg() is being used to emulate exactly that atomic
> > operation, I feel good about this Promela translation.
> 
> Right but the very first read that checks if full_sysidle_state is > 
> RCU_SYSIDLE_SHORT
> is done in a non-atomic way. Only when it verifies the condition do we use 
> cmpxchg()
> 
>while (oldstate > RCU_SYSIDLE_SHORT) {
>newoldstate = cmpxchg(&full_sysidle_state,
>   oldstate, RCU_SYSIDLE_NOT);
>if (oldstate == newoldstate && oldstate == RCU_SYSIDLE_FULL_NOTED) 
> {
> rcu_kick_nohz_cpu(tick_do_timer_cpu);
> return; /* We cleared it, done! */
>}
>oldstate = newoldstate;
>}
> 
> Now the way you wrote it PROMELA looked like we use cmpxchg() (or some close 
> relative)
> right away from the first read. It's hard to translate with real world 
> operations so I'm
> inventing cmpxchg_if_above() which has the same atomic and full barrier 
> properties
> than cmpxchg() expect that it only exchanges old value with new value if old 
> is
> above the last parameter:
> 
> oldstate = cmpxchg_if_above(oldstate, RCU_SYSIDLE_NOT, 
> RCU_SYSIDLE_SHORT);
> if (oldstate == RCU_SYSIDLE_FULL_NOTED)
> rcu_kick_nohz_cpu(tick_do_timer_cpu);
> return; /* We cleared it, done! */
> 
> 
> > If the value is different at the time of the cmpxchg(), the cmpxchg() fails.
> 
> Right but it might not do a cmpxchg() if oldval is <= SHORT.
> 
> > I suppose I could write it as follows instead:
> > 
> > /* Get state out of full-system idle states. */
> > oldstate = full_sysidle_state;
> > do
> > :: 1 ->
> > atomic {
> > if
> > :: oldstate > RCU_SYSIDLE_SHORT &&
> >oldstate == full_sysidle_state ->
> > full_sysidle_state = RCU_SYSIDLE_NOT;
> > break;
> > :: else ->
> > oldstate = full_sysidle_state;
> > fi;
> > }
> > od;
> > 
> > Here the "if" emulates the cmpxchg() instruction and the rest emulates
> > the loop.  Both representations get the same result when
> 
> Ok if they have the same result and implement the first read in a non atomic
> way then it's all good. The PROMELA syntax has just been confusing to me in
> this regard.

Actually, my first attempt above wasn't quite right.  This one is better.
So, yes, Promela can be confusing.  ;-)

/* Get state out of full-system idle states. */
oldstate = full_sysidle_state;
do
:: 1 ->
atomic {
if
:: oldstate > RCU_SYSIDLE_SHORT &&
   oldstate == full_sysidle_state ->
full_sysidle_state = RCU_SYSIDLE_NOT;
break;
:: oldstate > RCU_SYSIDLE_SHORT &&
   oldstate != full_sysidle_state ->
oldstate = full_sysidle_state;
:: oldstate <= RCU_SYSIDLE_SHORT -> break;
fi;
}
od;

And this passes as well.

> > > > if
> > > > :: oldstate > RCU_SYSIDLE_SHORT ->
> > > > full_sysidle_state = RCU_SYSIDLE_NOT;
> > > > :: else -> skip;
> > > > fi;
> > > > }
> > > > 
> > > > /* If needed, wake up the timekeeper. */
> > > > if
> > > > :: oldstate == RCU_SYSIDLE_FULL_NOTED ->
> > > > wakeup_timekeeper = 1;
> > > > :: else -> skip;
> > > > fi;
> > > > 
> > > > /* Mark ourselves fully awake and operational. */
> > > >  

Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code

2014-04-07 Thread Frederic Weisbecker
On Mon, Apr 07, 2014 at 11:11:55AM -0700, Paul E. McKenney wrote:
> > On the upstream code, the first read of full_sysidle_state after exiting 
> > idle is not
> > performed by an atomic operation. So I wonder if this is right to put this
> > in the atomic section.
> > 
> > I don't know the language enough to tell if it has no effect but I'm just
> > worried that it gets badly intepreted. Like the above read plus the below
> > conditional write in the same atomic section gets packed in a kind of 
> > cmpxchg_if_above() ?
> > 
> > Which is what we want to avoid if the value is not above RCU_SYSIDLE_SHORT 
> > after
> > a non atomic read.
> 
> Given that cmpxchg() is being used to emulate exactly that atomic
> operation, I feel good about this Promela translation.

Right but the very first read that checks if full_sysidle_state is > 
RCU_SYSIDLE_SHORT
is done in a non-atomic way. Only when it verifies the condition do we use 
cmpxchg()

   while (oldstate > RCU_SYSIDLE_SHORT) {
   newoldstate = cmpxchg(&full_sysidle_state,
  oldstate, RCU_SYSIDLE_NOT);
   if (oldstate == newoldstate && oldstate == RCU_SYSIDLE_FULL_NOTED) {
rcu_kick_nohz_cpu(tick_do_timer_cpu);
return; /* We cleared it, done! */
   }
   oldstate = newoldstate;
   }

Now the way you wrote it PROMELA looked like we use cmpxchg() (or some close 
relative)
right away from the first read. It's hard to translate with real world 
operations so I'm
inventing cmpxchg_if_above() which has the same atomic and full barrier 
properties
than cmpxchg() expect that it only exchanges old value with new value if old is
above the last parameter:

oldstate = cmpxchg_if_above(oldstate, RCU_SYSIDLE_NOT, 
RCU_SYSIDLE_SHORT);
if (oldstate == RCU_SYSIDLE_FULL_NOTED)
rcu_kick_nohz_cpu(tick_do_timer_cpu);
return; /* We cleared it, done! */


> If the value is different at the time of the cmpxchg(), the cmpxchg() fails.

Right but it might not do a cmpxchg() if oldval is <= SHORT.

> I suppose I could write it as follows instead:
> 
>   /* Get state out of full-system idle states. */
>   oldstate = full_sysidle_state;
>   do
>   :: 1 ->
>   atomic {
>   if
>   :: oldstate > RCU_SYSIDLE_SHORT &&
>  oldstate == full_sysidle_state ->
>   full_sysidle_state = RCU_SYSIDLE_NOT;
>   break;
>   :: else ->
>   oldstate = full_sysidle_state;
>   fi;
>   }
>   od;
> 
> Here the "if" emulates the cmpxchg() instruction and the rest emulates
> the loop.  Both representations get the same result when

Ok if they have the same result and implement the first read in a non atomic
way then it's all good. The PROMELA syntax has just been confusing to me in
this regard.

> 
> > >   if
> > >   :: oldstate > RCU_SYSIDLE_SHORT ->
> > >   full_sysidle_state = RCU_SYSIDLE_NOT;
> > >   :: else -> skip;
> > >   fi;
> > >   }
> > > 
> > >   /* If needed, wake up the timekeeper. */
> > >   if
> > >   :: oldstate == RCU_SYSIDLE_FULL_NOTED ->
> > >   wakeup_timekeeper = 1;
> > >   :: else -> skip;
> > >   fi;
> > > 
> > >   /* Mark ourselves fully awake and operational. */
> > >   am_setup[me] = 1;
> > > 
> > >   /* We are fully awake, so timekeeper must not be asleep. */
> > >   assert(full_sysidle_state < RCU_SYSIDLE_FULL);
> > > 
> > >   /* Running in kernel in the following loop. */
> > >   do
> > >   :: 1 -> skip;
> > >   :: 1 -> break;
> > >   od;
> > >   od
> > > }
> > > 
> > > /*
> > >  * Are all the workers in dyntick-idle state?
> > >  */
> > > #define check_idle() \
> > >   i = 0; \
> > >   idle = 1; \
> > >   do \
> > >   :: i < NUM_WORKERS -> \
> > >   if \
> > >   :: am_busy[i] == 1 -> idle = 0; \
> > >   :: else -> skip; \
> > >   fi; \
> > >   i++; \
> > >   :: i >= NUM_WORKERS -> break; \
> > >   od
> > > 
> > > /*
> > >  * Timekeeping CPU.
> > >  */
> > > proctype timekeeper()
> > > {
> > >   byte i;
> > >   byte idle;
> > >   byte curstate;
> > >   byte newstate;
> > > 
> > >   do
> > >   :: 1 ->
> > >   /* Capture current state. */
> > >   check_idle();
> > >   curstate = full_sysidle_state;
> > >   newstate = curstate;
> > > 
> > >   /* Check for acceptance state. */
> > >   if
> > >   :: idle == 0 ->
> > > progress_idle:
> > 
> > Is this some kind of label? I can't find t

Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code

2014-04-07 Thread Paul E. McKenney
On Fri, Apr 04, 2014 at 01:43:16AM +0200, Frederic Weisbecker wrote:
> On Sun, Mar 30, 2014 at 04:08:56PM -0700, Paul E. McKenney wrote:
> > For whatever it is worth, the following model claims safety and progress
> > for the sysidle state machine.
> > 
> > Thoughts?
> 
> I'm going to get fun of myself by risking a review of this. Warning,
> I don't speak promelian, so I may well write non-sense :)

Actually, you did find one real mismatch and one arguable one.  ;-)

> > Thanx, Paul
> > 
> > 
> > sysidle.sh
> > 
> > spin -a sysidle.spin
> > cc -DNP -o pan pan.c
> > # Fair scheduling to focus progress checks in timekeeper.
> > ./pan -f -l -m128 -w22
> > 
> > 
> > sysidle.spin
> > 
> > /*
> >  * Promela model for CONFIG_NO_HZ_FULL_SYSIDLE=y in the Linux kernel.
> >  * This model assumes that the dyntick-idle bit manipulation works based
> >  * on long usage, and substitutes a per-thread boolean "am_busy[]" array
> >  * for the Linux kernel's dyntick-idle masks.  The focus of this model
> >  * is therefore on the state machine itself.  Checks for both safety and
> >  * forward progress.
> >  *
> >  * This program is free software; you can redistribute it and/or modify
> >  * it under the terms of the GNU General Public License as published by
> >  * the Free Software Foundation; either version 2 of the License, or
> >  * (at your option) any later version.
> >  *
> >  * This program is distributed in the hope that it will be useful,
> >  * but WITHOUT ANY WARRANTY; without even the implied warranty of
> >  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> >  * GNU General Public License for more details.
> >  *
> >  * You should have received a copy of the GNU General Public License
> >  * along with this program; if not, you can access it online at
> >  * http://www.gnu.org/licenses/gpl-2.0.html.
> >  *
> >  * Copyright IBM Corporation, 2014
> >  *
> >  * Author: Paul E. McKenney 
> >  */
> > 
> > #define NUM_WORKERS 3
> > 
> > byte wakeup_timekeeper = 0; /* Models rcu_kick_nohz_cpu(). */
> > 
> > #define RCU_SYSIDLE_NOT 0   /* Some CPU is not idle. */
> > #define RCU_SYSIDLE_SHORT   1   /* All CPUs idle for brief period. */
> > #define RCU_SYSIDLE_LONG2   /* All CPUs idle for long enough. */
> > #define RCU_SYSIDLE_FULL3   /* All CPUs idle, ready for sysidle. */
> > #define RCU_SYSIDLE_FULL_NOTED  4   /* Actually entered sysidle 
> > state. */
> > 
> > byte full_sysidle_state = RCU_SYSIDLE_NOT;
> > 
> > byte am_busy[NUM_WORKERS];  /* Busy is similar to "not dyntick-idle". */
> > byte am_setup[NUM_WORKERS]; /* Setup means timekeeper knows I am not idle. 
> > */
> > 
> > /*
> >  * Non-timekeeping CPU going into and out of dyntick-idle state.
> >  */
> > proctype worker(byte me)
> > {
> > byte oldstate;
> > 
> > do
> > :: 1 ->
> > /* Go idle. */
> > am_setup[me] = 0;
> > am_busy[me] = 0;
> > 
> > /* Dyntick-idle in the following loop. */
> > do
> > :: 1 -> skip;
> > :: 1 -> break;
> > od;
> > 
> > /* Exit idle loop, model getting out of dyntick idle state. */
> > am_busy[me] = 1;
> > 
> > /* Get state out of full-system idle states. */
> > atomic {
> > oldstate = full_sysidle_state;
> 
> On the upstream code, the first read of full_sysidle_state after exiting idle 
> is not
> performed by an atomic operation. So I wonder if this is right to put this
> in the atomic section.
> 
> I don't know the language enough to tell if it has no effect but I'm just
> worried that it gets badly intepreted. Like the above read plus the below
> conditional write in the same atomic section gets packed in a kind of 
> cmpxchg_if_above() ?
> 
> Which is what we want to avoid if the value is not above RCU_SYSIDLE_SHORT 
> after
> a non atomic read.

Given that cmpxchg() is being used to emulate exactly that atomic
operation, I feel good about this Promela translation.  If the value is
different at the time of the cmpxchg(), the cmpxchg() fails.  I suppose
I could write it as follows instead:

/* Get state out of full-system idle states. */
oldstate = full_sysidle_state;
do
:: 1 ->
atomic {
if
:: oldstate > RCU_SYSIDLE_SHORT &&
   oldstate == full_sysidle_state ->
full_sysidle_state = RCU_SYSIDLE_NOT;
br

Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code

2014-04-03 Thread Frederic Weisbecker
On Sun, Mar 30, 2014 at 04:08:56PM -0700, Paul E. McKenney wrote:
> For whatever it is worth, the following model claims safety and progress
> for the sysidle state machine.
> 
> Thoughts?

I'm going to get fun of myself by risking a review of this. Warning,
I don't speak promelian, so I may well write non-sense :)

> 
>   Thanx, Paul
> 
> 
> sysidle.sh
> 
> spin -a sysidle.spin
> cc -DNP -o pan pan.c
> # Fair scheduling to focus progress checks in timekeeper.
> ./pan -f -l -m128 -w22
> 
> 
> sysidle.spin
> 
> /*
>  * Promela model for CONFIG_NO_HZ_FULL_SYSIDLE=y in the Linux kernel.
>  * This model assumes that the dyntick-idle bit manipulation works based
>  * on long usage, and substitutes a per-thread boolean "am_busy[]" array
>  * for the Linux kernel's dyntick-idle masks.  The focus of this model
>  * is therefore on the state machine itself.  Checks for both safety and
>  * forward progress.
>  *
>  * This program is free software; you can redistribute it and/or modify
>  * it under the terms of the GNU General Public License as published by
>  * the Free Software Foundation; either version 2 of the License, or
>  * (at your option) any later version.
>  *
>  * This program is distributed in the hope that it will be useful,
>  * but WITHOUT ANY WARRANTY; without even the implied warranty of
>  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>  * GNU General Public License for more details.
>  *
>  * You should have received a copy of the GNU General Public License
>  * along with this program; if not, you can access it online at
>  * http://www.gnu.org/licenses/gpl-2.0.html.
>  *
>  * Copyright IBM Corporation, 2014
>  *
>  * Author: Paul E. McKenney 
>  */
> 
> #define NUM_WORKERS 3
> 
> byte wakeup_timekeeper = 0; /* Models rcu_kick_nohz_cpu(). */
> 
> #define RCU_SYSIDLE_NOT   0   /* Some CPU is not idle. */
> #define RCU_SYSIDLE_SHORT 1   /* All CPUs idle for brief period. */
> #define RCU_SYSIDLE_LONG  2   /* All CPUs idle for long enough. */
> #define RCU_SYSIDLE_FULL  3   /* All CPUs idle, ready for sysidle. */
> #define RCU_SYSIDLE_FULL_NOTED4   /* Actually entered sysidle 
> state. */
> 
> byte full_sysidle_state = RCU_SYSIDLE_NOT;
> 
> byte am_busy[NUM_WORKERS];  /* Busy is similar to "not dyntick-idle". */
> byte am_setup[NUM_WORKERS]; /* Setup means timekeeper knows I am not idle. */
> 
> /*
>  * Non-timekeeping CPU going into and out of dyntick-idle state.
>  */
> proctype worker(byte me)
> {
>   byte oldstate;
> 
>   do
>   :: 1 ->
>   /* Go idle. */
>   am_setup[me] = 0;
>   am_busy[me] = 0;
> 
>   /* Dyntick-idle in the following loop. */
>   do
>   :: 1 -> skip;
>   :: 1 -> break;
>   od;
> 
>   /* Exit idle loop, model getting out of dyntick idle state. */
>   am_busy[me] = 1;
> 
>   /* Get state out of full-system idle states. */
>   atomic {
>   oldstate = full_sysidle_state;

On the upstream code, the first read of full_sysidle_state after exiting idle 
is not
performed by an atomic operation. So I wonder if this is right to put this
in the atomic section.

I don't know the language enough to tell if it has no effect but I'm just
worried that it gets badly intepreted. Like the above read plus the below
conditional write in the same atomic section gets packed in a kind of 
cmpxchg_if_above() ?

Which is what we want to avoid if the value is not above RCU_SYSIDLE_SHORT after
a non atomic read.

>   if
>   :: oldstate > RCU_SYSIDLE_SHORT ->
>   full_sysidle_state = RCU_SYSIDLE_NOT;
>   :: else -> skip;
>   fi;
>   }
> 
>   /* If needed, wake up the timekeeper. */
>   if
>   :: oldstate == RCU_SYSIDLE_FULL_NOTED ->
>   wakeup_timekeeper = 1;
>   :: else -> skip;
>   fi;
> 
>   /* Mark ourselves fully awake and operational. */
>   am_setup[me] = 1;
> 
>   /* We are fully awake, so timekeeper must not be asleep. */
>   assert(full_sysidle_state < RCU_SYSIDLE_FULL);
> 
>   /* Running in kernel in the following loop. */
>   do
>   :: 1 -> skip;
>   :: 1 -> break;
>   od;
>   od
> }
> 
> /*
>  * Are all the workers in dyntick-idle state?
>  */
> #define check_idle() \
>   i = 0; \
>   idle = 1; \

Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code

2014-04-03 Thread Paul E. McKenney
On Thu, Apr 03, 2014 at 05:53:49PM +, Mathieu Desnoyers wrote:
> - Original Message -
> > From: "Paul E. McKenney" 
> > To: "Mathieu Desnoyers" 
> > Cc: fweis...@gmail.com, pet...@infradead.org, linux-kernel@vger.kernel.org
> > Sent: Thursday, April 3, 2014 1:41:03 PM
> > Subject: Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> > 
> > On Thu, Apr 03, 2014 at 05:33:58PM +, Mathieu Desnoyers wrote:
> > > - Original Message -
> > > > From: "Paul E. McKenney" 
> > > > To: "Mathieu Desnoyers" 
> > > > Cc: fweis...@gmail.com, pet...@infradead.org,
> > > > linux-kernel@vger.kernel.org
> > > > Sent: Thursday, April 3, 2014 1:21:58 PM
> > > > Subject: Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> > > > 
> > > > On Thu, Apr 03, 2014 at 02:57:55AM +, Mathieu Desnoyers wrote:
> > > > > - Original Message -
> > > > > > From: "Paul E. McKenney" 
> > > > > > To: "Mathieu Desnoyers" 
> > > > > > Cc: fweis...@gmail.com, pet...@infradead.org,
> > > > > > linux-kernel@vger.kernel.org
> > > > > > Sent: Tuesday, April 1, 2014 10:50:44 PM
> > > > > > Subject: Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> > > > > > 
> > > > > [...]
> > > > > > > Here is the experiment I did on this latest version: I just 
> > > > > > > enabled
> > > > > > > the
> > > > > > > safety checking (not progress). I commented these lines out
> > > > > > > (again):
> > > > > > > 
> > > > > > > /* If needed, wake up the timekeeper. */
> > > > > > > if
> > > > > > > //:: oldstate == RCU_SYSIDLE_FULL_NOTED ->
> > > > > > > //  wakeup_timekeeper = 1;
> > > > > > > :: else -> skip;
> > > > > > > fi;
> > > > > > > 
> > > > > > > compile and run with:
> > > > > > > 
> > > > > > > spin -a sysidle.spin
> > > > > > > gcc -o pan pan.c
> > > > > > > ./pan -m128 -w22
> > > > > > > 
> > > > > > > My expectation would be to trigger some kind of assertion that the
> > > > > > > timekeeper is not active while the worker is running.
> > > > > > > Unfortunately,
> > > > > > > nothing triggers.
> > > > > > 
> > > > > > That is expected behavior.  Failure to wake up the timekeeper is set
> > > > > > up as a progress criterion.
> > > > > > 
> > > > > > > I see 3 possible solutions: we could either add assertions into
> > > > > > > other branches of the timekeeper, or add assertions into the 
> > > > > > > worker
> > > > > > > thread. Another way to do it would be to express the assertions as
> > > > > > > negation of a LTL formula based on state variables.
> > > > > > 
> > > > > > I did try both a hand-coded "never" clause and LTL formulas without
> > > > > > success. You have more experience with them, so perhaps you could
> > > > > > make
> > > > > > something work.  The need is that if all worker threads go non-idle
> > > > > > indefinitely starting from the RCU_SYSIDLE_FULL_NOTED state, the
> > > > > > wakeup_timekeeper must eventually be set to 1, and then the
> > > > > > timekeeper
> > > > > > must start running, for example, the wakeup_timekeeper value must
> > > > > > revert
> > > > > > to zero.
> > > > > > 
> > > > > > The problem I had is that the "never" claims seem set up to catch
> > > > > > some
> > > > > > finite behavior after a possibly-infinite prefix.  In this case, we
> > > > > > instead need to catch some infinite behavior after a
> > > > > > possibly-infinite
> > > > > > prefix.
> > > > > > 
> > > > > > > Thoughts ?
> > > > > > 
> > > > > > Me, I eventually switched ove

Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code

2014-04-03 Thread Mathieu Desnoyers
- Original Message -
> From: "Paul E. McKenney" 
> To: "Mathieu Desnoyers" 
> Cc: fweis...@gmail.com, pet...@infradead.org, linux-kernel@vger.kernel.org
> Sent: Thursday, April 3, 2014 1:41:03 PM
> Subject: Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> 
> On Thu, Apr 03, 2014 at 05:33:58PM +, Mathieu Desnoyers wrote:
> > - Original Message -
> > > From: "Paul E. McKenney" 
> > > To: "Mathieu Desnoyers" 
> > > Cc: fweis...@gmail.com, pet...@infradead.org,
> > > linux-kernel@vger.kernel.org
> > > Sent: Thursday, April 3, 2014 1:21:58 PM
> > > Subject: Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> > > 
> > > On Thu, Apr 03, 2014 at 02:57:55AM +, Mathieu Desnoyers wrote:
> > > > - Original Message -
> > > > > From: "Paul E. McKenney" 
> > > > > To: "Mathieu Desnoyers" 
> > > > > Cc: fweis...@gmail.com, pet...@infradead.org,
> > > > > linux-kernel@vger.kernel.org
> > > > > Sent: Tuesday, April 1, 2014 10:50:44 PM
> > > > > Subject: Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> > > > > 
> > > > [...]
> > > > > > Here is the experiment I did on this latest version: I just enabled
> > > > > > the
> > > > > > safety checking (not progress). I commented these lines out
> > > > > > (again):
> > > > > > 
> > > > > > /* If needed, wake up the timekeeper. */
> > > > > > if
> > > > > > //:: oldstate == RCU_SYSIDLE_FULL_NOTED ->
> > > > > > //  wakeup_timekeeper = 1;
> > > > > > :: else -> skip;
> > > > > > fi;
> > > > > > 
> > > > > > compile and run with:
> > > > > > 
> > > > > > spin -a sysidle.spin
> > > > > > gcc -o pan pan.c
> > > > > > ./pan -m128 -w22
> > > > > > 
> > > > > > My expectation would be to trigger some kind of assertion that the
> > > > > > timekeeper is not active while the worker is running.
> > > > > > Unfortunately,
> > > > > > nothing triggers.
> > > > > 
> > > > > That is expected behavior.  Failure to wake up the timekeeper is set
> > > > > up as a progress criterion.
> > > > > 
> > > > > > I see 3 possible solutions: we could either add assertions into
> > > > > > other branches of the timekeeper, or add assertions into the worker
> > > > > > thread. Another way to do it would be to express the assertions as
> > > > > > negation of a LTL formula based on state variables.
> > > > > 
> > > > > I did try both a hand-coded "never" clause and LTL formulas without
> > > > > success. You have more experience with them, so perhaps you could
> > > > > make
> > > > > something work.  The need is that if all worker threads go non-idle
> > > > > indefinitely starting from the RCU_SYSIDLE_FULL_NOTED state, the
> > > > > wakeup_timekeeper must eventually be set to 1, and then the
> > > > > timekeeper
> > > > > must start running, for example, the wakeup_timekeeper value must
> > > > > revert
> > > > > to zero.
> > > > > 
> > > > > The problem I had is that the "never" claims seem set up to catch
> > > > > some
> > > > > finite behavior after a possibly-infinite prefix.  In this case, we
> > > > > instead need to catch some infinite behavior after a
> > > > > possibly-infinite
> > > > > prefix.
> > > > > 
> > > > > > Thoughts ?
> > > > > 
> > > > > Me, I eventually switched over to using progress detection.  Which
> > > > > might
> > > > > be a bit unconventional, but it did have the virtue of making the
> > > > > model complain when it should complain.  ;-)
> > > > 
> > > > Here is my attempt at simplifying the model. I use LTL formulas as
> > > > checks
> > > > for the two things I think really matter here: having timekeeping
> > > > eventually
> > > > active when threads 

Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code

2014-04-03 Thread Paul E. McKenney
On Thu, Apr 03, 2014 at 05:33:58PM +, Mathieu Desnoyers wrote:
> - Original Message -
> > From: "Paul E. McKenney" 
> > To: "Mathieu Desnoyers" 
> > Cc: fweis...@gmail.com, pet...@infradead.org, linux-kernel@vger.kernel.org
> > Sent: Thursday, April 3, 2014 1:21:58 PM
> > Subject: Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> > 
> > On Thu, Apr 03, 2014 at 02:57:55AM +, Mathieu Desnoyers wrote:
> > > - Original Message -
> > > > From: "Paul E. McKenney" 
> > > > To: "Mathieu Desnoyers" 
> > > > Cc: fweis...@gmail.com, pet...@infradead.org,
> > > > linux-kernel@vger.kernel.org
> > > > Sent: Tuesday, April 1, 2014 10:50:44 PM
> > > > Subject: Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> > > > 
> > > [...]
> > > > > Here is the experiment I did on this latest version: I just enabled 
> > > > > the
> > > > > safety checking (not progress). I commented these lines out (again):
> > > > > 
> > > > > /* If needed, wake up the timekeeper. */
> > > > > if
> > > > > //:: oldstate == RCU_SYSIDLE_FULL_NOTED ->
> > > > > //  wakeup_timekeeper = 1;
> > > > > :: else -> skip;
> > > > > fi;
> > > > > 
> > > > > compile and run with:
> > > > > 
> > > > > spin -a sysidle.spin
> > > > > gcc -o pan pan.c
> > > > > ./pan -m128 -w22
> > > > > 
> > > > > My expectation would be to trigger some kind of assertion that the
> > > > > timekeeper is not active while the worker is running. Unfortunately,
> > > > > nothing triggers.
> > > > 
> > > > That is expected behavior.  Failure to wake up the timekeeper is set
> > > > up as a progress criterion.
> > > > 
> > > > > I see 3 possible solutions: we could either add assertions into
> > > > > other branches of the timekeeper, or add assertions into the worker
> > > > > thread. Another way to do it would be to express the assertions as
> > > > > negation of a LTL formula based on state variables.
> > > > 
> > > > I did try both a hand-coded "never" clause and LTL formulas without
> > > > success. You have more experience with them, so perhaps you could make
> > > > something work.  The need is that if all worker threads go non-idle
> > > > indefinitely starting from the RCU_SYSIDLE_FULL_NOTED state, the
> > > > wakeup_timekeeper must eventually be set to 1, and then the timekeeper
> > > > must start running, for example, the wakeup_timekeeper value must revert
> > > > to zero.
> > > > 
> > > > The problem I had is that the "never" claims seem set up to catch some
> > > > finite behavior after a possibly-infinite prefix.  In this case, we
> > > > instead need to catch some infinite behavior after a possibly-infinite
> > > > prefix.
> > > > 
> > > > > Thoughts ?
> > > > 
> > > > Me, I eventually switched over to using progress detection.  Which might
> > > > be a bit unconventional, but it did have the virtue of making the
> > > > model complain when it should complain.  ;-)
> > > 
> > > Here is my attempt at simplifying the model. I use LTL formulas as checks
> > > for the two things I think really matter here: having timekeeping
> > > eventually
> > > active when threads are running, and having timekeeping eventually 
> > > inactive
> > > when threads are idle. Hopefully my Promela is not too rusty:
> > 
> > Well, this was the first time I had ever tried using LTL or never claims,
> > so you are ahead of me either way.  ;-)
> > 
> > > 1) When, at any point in the trail, a worker is setup, then we want to
> > >be sure that at some point in the future the timer is necessarily
> > >running:
> > > 
> > > timer_active.ltl:
> > > [] (am_setup_0 -> (<> timekeeper_is_running))
> > 
> > OK, am_setup_0 implies that the timekeeper will eventually be running.
> > For two threads, this presumably becomes:
> > 
> > [] ((am_setup_0 || am_setup_1) -> (<> timekeeper_is_running))
> 
> Yes.
> 
> > 
> > > 2) When,

Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code

2014-04-03 Thread Mathieu Desnoyers
- Original Message -
> From: "Paul E. McKenney" 
> To: "Mathieu Desnoyers" 
> Cc: fweis...@gmail.com, pet...@infradead.org, linux-kernel@vger.kernel.org
> Sent: Thursday, April 3, 2014 1:21:58 PM
> Subject: Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> 
> On Thu, Apr 03, 2014 at 02:57:55AM +, Mathieu Desnoyers wrote:
> > - Original Message -
> > > From: "Paul E. McKenney" 
> > > To: "Mathieu Desnoyers" 
> > > Cc: fweis...@gmail.com, pet...@infradead.org,
> > > linux-kernel@vger.kernel.org
> > > Sent: Tuesday, April 1, 2014 10:50:44 PM
> > > Subject: Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> > > 
> > [...]
> > > > Here is the experiment I did on this latest version: I just enabled the
> > > > safety checking (not progress). I commented these lines out (again):
> > > > 
> > > > /* If needed, wake up the timekeeper. */
> > > > if
> > > > //:: oldstate == RCU_SYSIDLE_FULL_NOTED ->
> > > > //  wakeup_timekeeper = 1;
> > > > :: else -> skip;
> > > > fi;
> > > > 
> > > > compile and run with:
> > > > 
> > > > spin -a sysidle.spin
> > > > gcc -o pan pan.c
> > > > ./pan -m128 -w22
> > > > 
> > > > My expectation would be to trigger some kind of assertion that the
> > > > timekeeper is not active while the worker is running. Unfortunately,
> > > > nothing triggers.
> > > 
> > > That is expected behavior.  Failure to wake up the timekeeper is set
> > > up as a progress criterion.
> > > 
> > > > I see 3 possible solutions: we could either add assertions into
> > > > other branches of the timekeeper, or add assertions into the worker
> > > > thread. Another way to do it would be to express the assertions as
> > > > negation of a LTL formula based on state variables.
> > > 
> > > I did try both a hand-coded "never" clause and LTL formulas without
> > > success. You have more experience with them, so perhaps you could make
> > > something work.  The need is that if all worker threads go non-idle
> > > indefinitely starting from the RCU_SYSIDLE_FULL_NOTED state, the
> > > wakeup_timekeeper must eventually be set to 1, and then the timekeeper
> > > must start running, for example, the wakeup_timekeeper value must revert
> > > to zero.
> > > 
> > > The problem I had is that the "never" claims seem set up to catch some
> > > finite behavior after a possibly-infinite prefix.  In this case, we
> > > instead need to catch some infinite behavior after a possibly-infinite
> > > prefix.
> > > 
> > > > Thoughts ?
> > > 
> > > Me, I eventually switched over to using progress detection.  Which might
> > > be a bit unconventional, but it did have the virtue of making the
> > > model complain when it should complain.  ;-)
> > 
> > Here is my attempt at simplifying the model. I use LTL formulas as checks
> > for the two things I think really matter here: having timekeeping
> > eventually
> > active when threads are running, and having timekeeping eventually inactive
> > when threads are idle. Hopefully my Promela is not too rusty:
> 
> Well, this was the first time I had ever tried using LTL or never claims,
> so you are ahead of me either way.  ;-)
> 
> > 1) When, at any point in the trail, a worker is setup, then we want to
> >be sure that at some point in the future the timer is necessarily
> >running:
> > 
> > timer_active.ltl:
> > [] (am_setup_0 -> (<> timekeeper_is_running))
> 
> OK, am_setup_0 implies that the timekeeper will eventually be running.
> For two threads, this presumably becomes:
> 
> [] ((am_setup_0 || am_setup_1) -> (<> timekeeper_is_running))

Yes.

> 
> > 2) When, at any point in the trail, a worker is not setup, then we
> >want to be sure that at some point in the future, either this
> >thread is setup again, or the timekeeper reaches a not running
> >state:
> > 
> > timer_inactive.ltl:
> > [] (!am_setup_0 -> (<> (!timekeeper_is_running || am_setup_0)))
> 
> And here the two-thread version should be something like this:
> 
> [] (!(am_setup_0 || am_setup_1) -> (<> (!timekeeper_is_running || am_setup_0
> || am_setup_1)))


Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code

2014-04-03 Thread Paul E. McKenney
On Thu, Apr 03, 2014 at 02:57:55AM +, Mathieu Desnoyers wrote:
> - Original Message -
> > From: "Paul E. McKenney" 
> > To: "Mathieu Desnoyers" 
> > Cc: fweis...@gmail.com, pet...@infradead.org, linux-kernel@vger.kernel.org
> > Sent: Tuesday, April 1, 2014 10:50:44 PM
> > Subject: Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> > 
> [...]
> > > Here is the experiment I did on this latest version: I just enabled the
> > > safety checking (not progress). I commented these lines out (again):
> > > 
> > > /* If needed, wake up the timekeeper. */
> > > if
> > > //:: oldstate == RCU_SYSIDLE_FULL_NOTED ->
> > > //  wakeup_timekeeper = 1;
> > > :: else -> skip;
> > > fi;
> > > 
> > > compile and run with:
> > > 
> > > spin -a sysidle.spin
> > > gcc -o pan pan.c
> > > ./pan -m128 -w22
> > > 
> > > My expectation would be to trigger some kind of assertion that the
> > > timekeeper is not active while the worker is running. Unfortunately,
> > > nothing triggers.
> > 
> > That is expected behavior.  Failure to wake up the timekeeper is set
> > up as a progress criterion.
> > 
> > > I see 3 possible solutions: we could either add assertions into
> > > other branches of the timekeeper, or add assertions into the worker
> > > thread. Another way to do it would be to express the assertions as
> > > negation of a LTL formula based on state variables.
> > 
> > I did try both a hand-coded "never" clause and LTL formulas without
> > success. You have more experience with them, so perhaps you could make
> > something work.  The need is that if all worker threads go non-idle
> > indefinitely starting from the RCU_SYSIDLE_FULL_NOTED state, the
> > wakeup_timekeeper must eventually be set to 1, and then the timekeeper
> > must start running, for example, the wakeup_timekeeper value must revert
> > to zero.
> > 
> > The problem I had is that the "never" claims seem set up to catch some
> > finite behavior after a possibly-infinite prefix.  In this case, we
> > instead need to catch some infinite behavior after a possibly-infinite
> > prefix.
> > 
> > > Thoughts ?
> > 
> > Me, I eventually switched over to using progress detection.  Which might
> > be a bit unconventional, but it did have the virtue of making the
> > model complain when it should complain.  ;-)
> 
> Here is my attempt at simplifying the model. I use LTL formulas as checks
> for the two things I think really matter here: having timekeeping eventually
> active when threads are running, and having timekeeping eventually inactive
> when threads are idle. Hopefully my Promela is not too rusty:

Well, this was the first time I had ever tried using LTL or never claims,
so you are ahead of me either way.  ;-)

> 1) When, at any point in the trail, a worker is setup, then we want to
>be sure that at some point in the future the timer is necessarily
>running:
> 
> timer_active.ltl:
> [] (am_setup_0 -> (<> timekeeper_is_running))

OK, am_setup_0 implies that the timekeeper will eventually be running.
For two threads, this presumably becomes:

[] ((am_setup_0 || am_setup_1) -> (<> timekeeper_is_running))

> 2) When, at any point in the trail, a worker is not setup, then we
>want to be sure that at some point in the future, either this
>thread is setup again, or the timekeeper reaches a not running
>state:
> 
> timer_inactive.ltl:
> [] (!am_setup_0 -> (<> (!timekeeper_is_running || am_setup_0)))

And here the two-thread version should be something like this:

[] (!(am_setup_0 || am_setup_1) -> (<> (!timekeeper_is_running || am_setup_0 || 
am_setup_1)))

It would be nice if never claims allowed local variables, as that would
allow looping over the am_setup array.  Or maybe I just haven't figured
out how to tell spin that a given variable should not be considered to
be part of the global state.

> sysidle.sh:
> #!/bin/sh
> 
> spin -f "!(`cat timer_active.ltl`)" > pan.ltl
> #spin -f "!(`cat timer_inactive.ltl`)" > pan.ltl
> spin -a -X -N  pan.ltl sysidle.spin
> #spin -DINJECT_MISSING_WAKEUP -a -X -N  pan.ltl sysidle.spin

I definitely didn't use "-X".  The "spin --help" says "-[XYZ] reserved
for use by xspin interface", so maybe it doesn't matter.  ;-)

> gcc -o pan pan.c
> ./pan -f -a -m128 -w22
> 
> #view trail

Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code

2014-04-02 Thread Mathieu Desnoyers
- Original Message -
> From: "Paul E. McKenney" 
> To: "Mathieu Desnoyers" 
> Cc: fweis...@gmail.com, pet...@infradead.org, linux-kernel@vger.kernel.org
> Sent: Tuesday, April 1, 2014 10:50:44 PM
> Subject: Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> 
[...]
> > Here is the experiment I did on this latest version: I just enabled the
> > safety checking (not progress). I commented these lines out (again):
> > 
> > /* If needed, wake up the timekeeper. */
> > if
> > //:: oldstate == RCU_SYSIDLE_FULL_NOTED ->
> > //  wakeup_timekeeper = 1;
> > :: else -> skip;
> > fi;
> > 
> > compile and run with:
> > 
> > spin -a sysidle.spin
> > gcc -o pan pan.c
> > ./pan -m128 -w22
> > 
> > My expectation would be to trigger some kind of assertion that the
> > timekeeper is not active while the worker is running. Unfortunately,
> > nothing triggers.
> 
> That is expected behavior.  Failure to wake up the timekeeper is set
> up as a progress criterion.
> 
> > I see 3 possible solutions: we could either add assertions into
> > other branches of the timekeeper, or add assertions into the worker
> > thread. Another way to do it would be to express the assertions as
> > negation of a LTL formula based on state variables.
> 
> I did try both a hand-coded "never" clause and LTL formulas without
> success. You have more experience with them, so perhaps you could make
> something work.  The need is that if all worker threads go non-idle
> indefinitely starting from the RCU_SYSIDLE_FULL_NOTED state, the
> wakeup_timekeeper must eventually be set to 1, and then the timekeeper
> must start running, for example, the wakeup_timekeeper value must revert
> to zero.
> 
> The problem I had is that the "never" claims seem set up to catch some
> finite behavior after a possibly-infinite prefix.  In this case, we
> instead need to catch some infinite behavior after a possibly-infinite
> prefix.
> 
> > Thoughts ?
> 
> Me, I eventually switched over to using progress detection.  Which might
> be a bit unconventional, but it did have the virtue of making the
> model complain when it should complain.  ;-)

Here is my attempt at simplifying the model. I use LTL formulas as checks
for the two things I think really matter here: having timekeeping eventually
active when threads are running, and having timekeeping eventually inactive
when threads are idle. Hopefully my Promela is not too rusty:

1) When, at any point in the trail, a worker is setup, then we want to
   be sure that at some point in the future the timer is necessarily
   running:

timer_active.ltl:
[] (am_setup_0 -> (<> timekeeper_is_running))

2) When, at any point in the trail, a worker is not setup, then we
   want to be sure that at some point in the future, either this
   thread is setup again, or the timekeeper reaches a not running
   state:

timer_inactive.ltl:
[] (!am_setup_0 -> (<> (!timekeeper_is_running || am_setup_0)))

sysidle.sh:
#!/bin/sh

spin -f "!(`cat timer_active.ltl`)" > pan.ltl
#spin -f "!(`cat timer_inactive.ltl`)" > pan.ltl
spin -a -X -N  pan.ltl sysidle.spin
#spin -DINJECT_MISSING_WAKEUP -a -X -N  pan.ltl sysidle.spin
gcc -o pan pan.c
./pan -f -a -m128 -w22

#view trail with:
# spin -v -t -N pan.ltl sysidle.spin

sysidle.spin:

/*
 * Promela model for CONFIG_NO_HZ_FULL_SYSIDLE=y in the Linux kernel.
 * This model assumes that the dyntick-idle bit manipulation works based
 * on long usage, and substitutes a per-thread boolean "am_busy[]" array
 * for the Linux kernel's dyntick-idle masks.  The focus of this model
 * is therefore on the state machine itself.  Models timekeeper "running"
 * state with respect to worker thread idle state.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, you can access it online at
 * http://www.gnu.org/licenses/gpl-2.0.html.
 *
 * Copyright IBM Corporation, 2014
 * Copyright EfficiOS, 2014
 *
 * Author: Paul E. McKenney 
 * Mathieu Desnoyers 
 */

// adapt LTL formulas before changing NUM_WORKERS
//#define NUM_WO

Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code

2014-04-01 Thread Paul E. McKenney
On Wed, Apr 02, 2014 at 12:35:04AM +, Mathieu Desnoyers wrote:
> - Original Message -
> > From: "Paul E. McKenney" 
> > To: "Mathieu Desnoyers" 
> > Cc: fweis...@gmail.com, pet...@infradead.org, linux-kernel@vger.kernel.org
> > Sent: Monday, March 31, 2014 1:23:19 PM
> > Subject: Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> > 
> > On Mon, Mar 31, 2014 at 08:38:13AM -0700, Paul E. McKenney wrote:
> > > On Mon, Mar 31, 2014 at 02:02:23PM +, Mathieu Desnoyers wrote:
> > > > - Original Message -
> > > > > From: "Paul E. McKenney" 
> > > > > To: "Mathieu Desnoyers" 
> > > > > Cc: fweis...@gmail.com, pet...@infradead.org,
> > > > > linux-kernel@vger.kernel.org
> > > > > Sent: Sunday, March 30, 2014 11:54:58 PM
> > > > > Subject: Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> > > > > 
> > > > > On Mon, Mar 31, 2014 at 03:29:25AM +, Mathieu Desnoyers wrote:
> > > > > > - Original Message -
> > > > > > > From: "Paul E. McKenney" 
> > > > > > > To: fweis...@gmail.com, "mathieu desnoyers"
> > > > > > > , pet...@infradead.org
> > > > > > > Cc: linux-kernel@vger.kernel.org
> > > > > > > Sent: Sunday, March 30, 2014 7:08:56 PM
> > > > > > > Subject: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> > > > > > > 
> > > > > > > For whatever it is worth, the following model claims safety and
> > > > > > > progress
> > > > > > > for the sysidle state machine.
> > > > > > > 
> > > > > > > Thoughts?
> > > > > > 
> > > > > > Hi Paul,
> > > > > > 
> > > > > > Please keep in mind that it's been a while since I've looked at
> > > > > > Promela
> > > > > > proofs, but a couple of questions come to mind. First, how is the
> > > > > > execution
> > > > > > script below checking for progress in any way ? I remember that we
> > > > > > used
> > > > > > the negation of a "_np" LTL claim to check for progress in the past.
> > > > > > Moreover, I'd be much more comfortable seeing ifdefs in the code 
> > > > > > that
> > > > > > inject
> > > > > > known failures, and let them be triggered by error-triggering runs 
> > > > > > in
> > > > > > the
> > > > > > scripts (with e.g. -DINJECT_ERROR_XYZ), to confirm that this model
> > > > > > actually
> > > > > > catches known issues.
> > > > > 
> > > > > Well, if I comment out the four "progress_" labels, it complains about
> > > > > a non-progress cycle.  So at least spin does pay attention to these.
> > > > > ;-)
> > > > > 
> > > > > If I put the "progress_" labels back in, but change the check so as to
> > > > > prevent the state machine from ever entering RCU_SYSIDLE_FULL_NOTED,
> > > > > it cranks through 14M states before complaining about a non-progress
> > > > > cycle,
> > > > > as expected.
> > > > > 
> > > > > If I revert back to pristine state, and then comment out the 
> > > > > statements
> > > > > that reverts state back to RCU_SYSIDLE_NOT when exiting idle, an
> > > > > assert()
> > > > > triggers, as expected.
> > > > > 
> > > > > > If we can show that the model can detect a few failure modes, both
> > > > > > for
> > > > > > safety
> > > > > > and progress, then my confidence level in the model will start to
> > > > > > improve.
> > > > > > ;-)
> > > > > 
> > > > > Well, there probably is a bug in there somewhere, Murphy being who he
> > > > > is.
> > > > > ;-)
> > > > 
> > > > One failure mode your model seems to miss is when I comment the wakeup:
> > > > 
> > > > /* If needed, wake up the timekeeper. */
> > > > if
> > > > #ifndef INJECT_NO_WAKEUP
> > > > :: oldstate == RCU_SYSIDLE_FULL_NOTED -

Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code

2014-04-01 Thread Mathieu Desnoyers
- Original Message -
> From: "Paul E. McKenney" 
> To: "Mathieu Desnoyers" 
> Cc: fweis...@gmail.com, pet...@infradead.org, linux-kernel@vger.kernel.org
> Sent: Monday, March 31, 2014 1:23:19 PM
> Subject: Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> 
> On Mon, Mar 31, 2014 at 08:38:13AM -0700, Paul E. McKenney wrote:
> > On Mon, Mar 31, 2014 at 02:02:23PM +, Mathieu Desnoyers wrote:
> > > - Original Message -
> > > > From: "Paul E. McKenney" 
> > > > To: "Mathieu Desnoyers" 
> > > > Cc: fweis...@gmail.com, pet...@infradead.org,
> > > > linux-kernel@vger.kernel.org
> > > > Sent: Sunday, March 30, 2014 11:54:58 PM
> > > > Subject: Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> > > > 
> > > > On Mon, Mar 31, 2014 at 03:29:25AM +, Mathieu Desnoyers wrote:
> > > > > - Original Message -
> > > > > > From: "Paul E. McKenney" 
> > > > > > To: fweis...@gmail.com, "mathieu desnoyers"
> > > > > > , pet...@infradead.org
> > > > > > Cc: linux-kernel@vger.kernel.org
> > > > > > Sent: Sunday, March 30, 2014 7:08:56 PM
> > > > > > Subject: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> > > > > > 
> > > > > > For whatever it is worth, the following model claims safety and
> > > > > > progress
> > > > > > for the sysidle state machine.
> > > > > > 
> > > > > > Thoughts?
> > > > > 
> > > > > Hi Paul,
> > > > > 
> > > > > Please keep in mind that it's been a while since I've looked at
> > > > > Promela
> > > > > proofs, but a couple of questions come to mind. First, how is the
> > > > > execution
> > > > > script below checking for progress in any way ? I remember that we
> > > > > used
> > > > > the negation of a "_np" LTL claim to check for progress in the past.
> > > > > Moreover, I'd be much more comfortable seeing ifdefs in the code that
> > > > > inject
> > > > > known failures, and let them be triggered by error-triggering runs in
> > > > > the
> > > > > scripts (with e.g. -DINJECT_ERROR_XYZ), to confirm that this model
> > > > > actually
> > > > > catches known issues.
> > > > 
> > > > Well, if I comment out the four "progress_" labels, it complains about
> > > > a non-progress cycle.  So at least spin does pay attention to these.
> > > > ;-)
> > > > 
> > > > If I put the "progress_" labels back in, but change the check so as to
> > > > prevent the state machine from ever entering RCU_SYSIDLE_FULL_NOTED,
> > > > it cranks through 14M states before complaining about a non-progress
> > > > cycle,
> > > > as expected.
> > > > 
> > > > If I revert back to pristine state, and then comment out the statements
> > > > that reverts state back to RCU_SYSIDLE_NOT when exiting idle, an
> > > > assert()
> > > > triggers, as expected.
> > > > 
> > > > > If we can show that the model can detect a few failure modes, both
> > > > > for
> > > > > safety
> > > > > and progress, then my confidence level in the model will start to
> > > > > improve.
> > > > > ;-)
> > > > 
> > > > Well, there probably is a bug in there somewhere, Murphy being who he
> > > > is.
> > > > ;-)
> > > 
> > > One failure mode your model seems to miss is when I comment the wakeup:
> > > 
> > > /* If needed, wake up the timekeeper. */
> > > if
> > > #ifndef INJECT_NO_WAKEUP
> > > :: oldstate == RCU_SYSIDLE_FULL_NOTED ->
> > > wakeup_timekeeper = 1;
> > > #endif /* #ifndef INJECT_NO_WAKEUP */
> > > :: else -> skip;
> > > fi;
> > > 
> > > Somehow, I feel I am doing something very nasty to the algorithm,
> > > but the model checker fails to find any kind of progress or safety
> > > issue. Any idea why ?
> > 
> > Hmmm...  One reason is because I am not modeling the timekeeper thread
> > as waiting for

Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code

2014-03-31 Thread Paul E. McKenney
On Mon, Mar 31, 2014 at 08:38:13AM -0700, Paul E. McKenney wrote:
> On Mon, Mar 31, 2014 at 02:02:23PM +, Mathieu Desnoyers wrote:
> > - Original Message -
> > > From: "Paul E. McKenney" 
> > > To: "Mathieu Desnoyers" 
> > > Cc: fweis...@gmail.com, pet...@infradead.org, linux-kernel@vger.kernel.org
> > > Sent: Sunday, March 30, 2014 11:54:58 PM
> > > Subject: Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> > > 
> > > On Mon, Mar 31, 2014 at 03:29:25AM +, Mathieu Desnoyers wrote:
> > > > - Original Message -
> > > > > From: "Paul E. McKenney" 
> > > > > To: fweis...@gmail.com, "mathieu desnoyers"
> > > > > , pet...@infradead.org
> > > > > Cc: linux-kernel@vger.kernel.org
> > > > > Sent: Sunday, March 30, 2014 7:08:56 PM
> > > > > Subject: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> > > > > 
> > > > > For whatever it is worth, the following model claims safety and 
> > > > > progress
> > > > > for the sysidle state machine.
> > > > > 
> > > > > Thoughts?
> > > > 
> > > > Hi Paul,
> > > > 
> > > > Please keep in mind that it's been a while since I've looked at Promela
> > > > proofs, but a couple of questions come to mind. First, how is the 
> > > > execution
> > > > script below checking for progress in any way ? I remember that we used
> > > > the negation of a "_np" LTL claim to check for progress in the past.
> > > > Moreover, I'd be much more comfortable seeing ifdefs in the code that
> > > > inject
> > > > known failures, and let them be triggered by error-triggering runs in 
> > > > the
> > > > scripts (with e.g. -DINJECT_ERROR_XYZ), to confirm that this model 
> > > > actually
> > > > catches known issues.
> > > 
> > > Well, if I comment out the four "progress_" labels, it complains about
> > > a non-progress cycle.  So at least spin does pay attention to these.  ;-)
> > > 
> > > If I put the "progress_" labels back in, but change the check so as to
> > > prevent the state machine from ever entering RCU_SYSIDLE_FULL_NOTED,
> > > it cranks through 14M states before complaining about a non-progress 
> > > cycle,
> > > as expected.
> > > 
> > > If I revert back to pristine state, and then comment out the statements
> > > that reverts state back to RCU_SYSIDLE_NOT when exiting idle, an assert()
> > > triggers, as expected.
> > > 
> > > > If we can show that the model can detect a few failure modes, both for
> > > > safety
> > > > and progress, then my confidence level in the model will start to 
> > > > improve.
> > > > ;-)
> > > 
> > > Well, there probably is a bug in there somewhere, Murphy being who he is.
> > > ;-)
> > 
> > One failure mode your model seems to miss is when I comment the wakeup:
> > 
> > /* If needed, wake up the timekeeper. */
> > if
> > #ifndef INJECT_NO_WAKEUP
> > :: oldstate == RCU_SYSIDLE_FULL_NOTED ->
> > wakeup_timekeeper = 1;
> > #endif /* #ifndef INJECT_NO_WAKEUP */
> > :: else -> skip;
> > fi;
> > 
> > Somehow, I feel I am doing something very nasty to the algorithm,
> > but the model checker fails to find any kind of progress or safety
> > issue. Any idea why ?
> 
> Hmmm...  One reason is because I am not modeling the timekeeper thread
> as waiting for the wakeup.  Let me see what I can do about that...

And here is an updated model.  I now make it loop waiting for the
wakeup_timekeeper variable to transition to 1.  And I learned that
Promela ignores "progress" labels within atomic statements...

The previous error injections still trigger asserts.

Thanx, Paul



/*
 * Promela model for CONFIG_NO_HZ_FULL_SYSIDLE=y in the Linux kernel.
 * This model assumes that the dyntick-idle bit manipulation works based
 * on long usage, and substitutes a per-thread boolean "am_busy[]" array
 * for the Linux kernel's dyntick-idle masks.  The focus of this model
 * is therefore on the state machine itself.  Checks for both safety and
 * forward progress.
 *
 * T

Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code

2014-03-31 Thread Paul E. McKenney
On Mon, Mar 31, 2014 at 02:02:23PM +, Mathieu Desnoyers wrote:
> - Original Message -
> > From: "Paul E. McKenney" 
> > To: "Mathieu Desnoyers" 
> > Cc: fweis...@gmail.com, pet...@infradead.org, linux-kernel@vger.kernel.org
> > Sent: Sunday, March 30, 2014 11:54:58 PM
> > Subject: Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> > 
> > On Mon, Mar 31, 2014 at 03:29:25AM +, Mathieu Desnoyers wrote:
> > > - Original Message -
> > > > From: "Paul E. McKenney" 
> > > > To: fweis...@gmail.com, "mathieu desnoyers"
> > > > , pet...@infradead.org
> > > > Cc: linux-kernel@vger.kernel.org
> > > > Sent: Sunday, March 30, 2014 7:08:56 PM
> > > > Subject: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> > > > 
> > > > For whatever it is worth, the following model claims safety and progress
> > > > for the sysidle state machine.
> > > > 
> > > > Thoughts?
> > > 
> > > Hi Paul,
> > > 
> > > Please keep in mind that it's been a while since I've looked at Promela
> > > proofs, but a couple of questions come to mind. First, how is the 
> > > execution
> > > script below checking for progress in any way ? I remember that we used
> > > the negation of a "_np" LTL claim to check for progress in the past.
> > > Moreover, I'd be much more comfortable seeing ifdefs in the code that
> > > inject
> > > known failures, and let them be triggered by error-triggering runs in the
> > > scripts (with e.g. -DINJECT_ERROR_XYZ), to confirm that this model 
> > > actually
> > > catches known issues.
> > 
> > Well, if I comment out the four "progress_" labels, it complains about
> > a non-progress cycle.  So at least spin does pay attention to these.  ;-)
> > 
> > If I put the "progress_" labels back in, but change the check so as to
> > prevent the state machine from ever entering RCU_SYSIDLE_FULL_NOTED,
> > it cranks through 14M states before complaining about a non-progress cycle,
> > as expected.
> > 
> > If I revert back to pristine state, and then comment out the statements
> > that reverts state back to RCU_SYSIDLE_NOT when exiting idle, an assert()
> > triggers, as expected.
> > 
> > > If we can show that the model can detect a few failure modes, both for
> > > safety
> > > and progress, then my confidence level in the model will start to improve.
> > > ;-)
> > 
> > Well, there probably is a bug in there somewhere, Murphy being who he is.
> > ;-)
> 
> One failure mode your model seems to miss is when I comment the wakeup:
> 
> /* If needed, wake up the timekeeper. */
> if
> #ifndef INJECT_NO_WAKEUP
> :: oldstate == RCU_SYSIDLE_FULL_NOTED ->
> wakeup_timekeeper = 1;
> #endif /* #ifndef INJECT_NO_WAKEUP */
> :: else -> skip;
> fi;
> 
> Somehow, I feel I am doing something very nasty to the algorithm,
> but the model checker fails to find any kind of progress or safety
> issue. Any idea why ?

Hmmm...  One reason is because I am not modeling the timekeeper thread
as waiting for the wakeup.  Let me see what I can do about that...

Thanx, Paul

> Thanks,
> 
> Mathieu
> 
> 
> > 
> > Thanx, Paul
> > 
> > > Thanks,
> > > 
> > > Mathieu
> > > 
> > > > 
> > > > Thanx, Paul
> > > > 
> > > > 
> > > > sysidle.sh
> > > > 
> > > > spin -a sysidle.spin
> > > > cc -DNP -o pan pan.c
> > > > # Fair scheduling to focus progress checks in timekeeper.
> > > > ./pan -f -l -m128 -w22
> > > > 
> > > > 
> > > > sysidle.spin
> > > > 
> > > > /*
> > > >  * Promela model for CONFIG_NO_HZ_FULL_SYSIDLE=y in the Linux kernel.
> > > >  * This model assumes that the dyntick-idle bit manipulation works based
> > > >  * on long usage, and subst

Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code

2014-03-31 Thread Mathieu Desnoyers


- Original Message -
> From: "Paul E. McKenney" 
> To: "Mathieu Desnoyers" 
> Cc: fweis...@gmail.com, pet...@infradead.org, linux-kernel@vger.kernel.org
> Sent: Sunday, March 30, 2014 11:54:58 PM
> Subject: Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> 
> On Mon, Mar 31, 2014 at 03:29:25AM +, Mathieu Desnoyers wrote:
> > - Original Message -
> > > From: "Paul E. McKenney" 
> > > To: fweis...@gmail.com, "mathieu desnoyers"
> > > , pet...@infradead.org
> > > Cc: linux-kernel@vger.kernel.org
> > > Sent: Sunday, March 30, 2014 7:08:56 PM
> > > Subject: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> > > 
> > > For whatever it is worth, the following model claims safety and progress
> > > for the sysidle state machine.
> > > 
> > > Thoughts?
> > 
> > Hi Paul,
> > 
> > Please keep in mind that it's been a while since I've looked at Promela
> > proofs, but a couple of questions come to mind. First, how is the execution
> > script below checking for progress in any way ? I remember that we used
> > the negation of a "_np" LTL claim to check for progress in the past.
> > Moreover, I'd be much more comfortable seeing ifdefs in the code that
> > inject
> > known failures, and let them be triggered by error-triggering runs in the
> > scripts (with e.g. -DINJECT_ERROR_XYZ), to confirm that this model actually
> > catches known issues.
> 
> Well, if I comment out the four "progress_" labels, it complains about
> a non-progress cycle.  So at least spin does pay attention to these.  ;-)
> 
> If I put the "progress_" labels back in, but change the check so as to
> prevent the state machine from ever entering RCU_SYSIDLE_FULL_NOTED,
> it cranks through 14M states before complaining about a non-progress cycle,
> as expected.
> 
> If I revert back to pristine state, and then comment out the statements
> that reverts state back to RCU_SYSIDLE_NOT when exiting idle, an assert()
> triggers, as expected.
> 
> > If we can show that the model can detect a few failure modes, both for
> > safety
> > and progress, then my confidence level in the model will start to improve.
> > ;-)
> 
> Well, there probably is a bug in there somewhere, Murphy being who he is.
> ;-)

One failure mode your model seems to miss is when I comment the wakeup:

/* If needed, wake up the timekeeper. */
if
#ifndef INJECT_NO_WAKEUP
:: oldstate == RCU_SYSIDLE_FULL_NOTED ->
wakeup_timekeeper = 1;
#endif /* #ifndef INJECT_NO_WAKEUP */
:: else -> skip;
fi;

Somehow, I feel I am doing something very nasty to the algorithm,
but the model checker fails to find any kind of progress or safety
issue. Any idea why ?

Thanks,

Mathieu


> 
>   Thanx, Paul
> 
> > Thanks,
> > 
> > Mathieu
> > 
> > > 
> > >   Thanx, Paul
> > > 
> > > 
> > > sysidle.sh
> > > 
> > > spin -a sysidle.spin
> > > cc -DNP -o pan pan.c
> > > # Fair scheduling to focus progress checks in timekeeper.
> > > ./pan -f -l -m128 -w22
> > > 
> > > 
> > > sysidle.spin
> > > 
> > > /*
> > >  * Promela model for CONFIG_NO_HZ_FULL_SYSIDLE=y in the Linux kernel.
> > >  * This model assumes that the dyntick-idle bit manipulation works based
> > >  * on long usage, and substitutes a per-thread boolean "am_busy[]" array
> > >  * for the Linux kernel's dyntick-idle masks.  The focus of this model
> > >  * is therefore on the state machine itself.  Checks for both safety and
> > >  * forward progress.
> > >  *
> > >  * This program is free software; you can redistribute it and/or modify
> > >  * it under the terms of the GNU General Public License as published by
> > >  * the Free Software Foundation; either version 2 of the License, or
> > >  * (at your option) any later version.
> > >  *
> > >  * This program is distributed in the hope that it will be useful,
> > >  * but WITHOUT ANY WARRANTY; without even the implied warranty of
> >

Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code

2014-03-30 Thread Paul E. McKenney
On Mon, Mar 31, 2014 at 03:29:25AM +, Mathieu Desnoyers wrote:
> - Original Message -
> > From: "Paul E. McKenney" 
> > To: fweis...@gmail.com, "mathieu desnoyers" 
> > , pet...@infradead.org
> > Cc: linux-kernel@vger.kernel.org
> > Sent: Sunday, March 30, 2014 7:08:56 PM
> > Subject: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> > 
> > For whatever it is worth, the following model claims safety and progress
> > for the sysidle state machine.
> > 
> > Thoughts?
> 
> Hi Paul,
> 
> Please keep in mind that it's been a while since I've looked at Promela
> proofs, but a couple of questions come to mind. First, how is the execution
> script below checking for progress in any way ? I remember that we used
> the negation of a "_np" LTL claim to check for progress in the past.
> Moreover, I'd be much more comfortable seeing ifdefs in the code that inject
> known failures, and let them be triggered by error-triggering runs in the
> scripts (with e.g. -DINJECT_ERROR_XYZ), to confirm that this model actually
> catches known issues.

Well, if I comment out the four "progress_" labels, it complains about
a non-progress cycle.  So at least spin does pay attention to these.  ;-)

If I put the "progress_" labels back in, but change the check so as to
prevent the state machine from ever entering RCU_SYSIDLE_FULL_NOTED,
it cranks through 14M states before complaining about a non-progress cycle,
as expected.

If I revert back to pristine state, and then comment out the statements
that reverts state back to RCU_SYSIDLE_NOT when exiting idle, an assert()
triggers, as expected.

> If we can show that the model can detect a few failure modes, both for safety
> and progress, then my confidence level in the model will start to improve. ;-)

Well, there probably is a bug in there somewhere, Murphy being who he is.  ;-)

Thanx, Paul

> Thanks,
> 
> Mathieu
> 
> > 
> > Thanx, Paul
> > 
> > 
> > sysidle.sh
> > 
> > spin -a sysidle.spin
> > cc -DNP -o pan pan.c
> > # Fair scheduling to focus progress checks in timekeeper.
> > ./pan -f -l -m128 -w22
> > 
> > 
> > sysidle.spin
> > 
> > /*
> >  * Promela model for CONFIG_NO_HZ_FULL_SYSIDLE=y in the Linux kernel.
> >  * This model assumes that the dyntick-idle bit manipulation works based
> >  * on long usage, and substitutes a per-thread boolean "am_busy[]" array
> >  * for the Linux kernel's dyntick-idle masks.  The focus of this model
> >  * is therefore on the state machine itself.  Checks for both safety and
> >  * forward progress.
> >  *
> >  * This program is free software; you can redistribute it and/or modify
> >  * it under the terms of the GNU General Public License as published by
> >  * the Free Software Foundation; either version 2 of the License, or
> >  * (at your option) any later version.
> >  *
> >  * This program is distributed in the hope that it will be useful,
> >  * but WITHOUT ANY WARRANTY; without even the implied warranty of
> >  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> >  * GNU General Public License for more details.
> >  *
> >  * You should have received a copy of the GNU General Public License
> >  * along with this program; if not, you can access it online at
> >  * http://www.gnu.org/licenses/gpl-2.0.html.
> >  *
> >  * Copyright IBM Corporation, 2014
> >  *
> >  * Author: Paul E. McKenney 
> >  */
> > 
> > #define NUM_WORKERS 3
> > 
> > byte wakeup_timekeeper = 0; /* Models rcu_kick_nohz_cpu(). */
> > 
> > #define RCU_SYSIDLE_NOT 0   /* Some CPU is not idle. */
> > #define RCU_SYSIDLE_SHORT   1   /* All CPUs idle for brief period. */
> > #define RCU_SYSIDLE_LONG2   /* All CPUs idle for long enough. */
> > #define RCU_SYSIDLE_FULL3   /* All CPUs idle, ready for sysidle. */
> > #define RCU_SYSIDLE_FULL_NOTED  4   /* Actually entered sysidle 
> > state. */
> > 
> > byte full_sysidle_state = RCU_SYSIDLE_NOT;
> > 
> > byte am_busy[NUM_WORKERS];  /* Busy is similar to "not dyntick-idle". */
> > byte am_setup[NUM_WORKERS]; /* Setup means timekeeper knows I am not idle. 
> > */
> > 
> > /*
> >  * Non-timekeeping CPU going into and out of dyntick-idle state.
> >  */
> > proctype worker(byte me)
> > {
> > byte oldstate;
> > 
> > do
> > :: 1 ->
> > /* Go idle. */
> > am_setup[me] = 0;
> > am_busy[me] = 0;
> > 
> > /* Dyntick-idle in the following loop. */
> > do
> > :: 1 -> skip;
> > :: 1 -> break;
> > od;
> > 
> > /* Exit idle loop, model getting out of dyntick idle state.

Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code

2014-03-30 Thread Mathieu Desnoyers
- Original Message -
> From: "Mathieu Desnoyers" 
> To: paul...@linux.vnet.ibm.com
> Cc: fweis...@gmail.com, pet...@infradead.org, linux-kernel@vger.kernel.org
> Sent: Sunday, March 30, 2014 11:29:25 PM
> Subject: Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> 
> - Original Message -
> > From: "Paul E. McKenney" 
> > To: fweis...@gmail.com, "mathieu desnoyers"
> > , pet...@infradead.org
> > Cc: linux-kernel@vger.kernel.org
> > Sent: Sunday, March 30, 2014 7:08:56 PM
> > Subject: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> > 
> > For whatever it is worth, the following model claims safety and progress
> > for the sysidle state machine.
> > 
> > Thoughts?
> 
> Hi Paul,
> 
> Please keep in mind that it's been a while since I've looked at Promela
> proofs, but a couple of questions come to mind. First, how is the execution
> script below checking for progress in any way ? I remember that we used
> the negation of a "_np" LTL claim to check for progress in the past.
> Moreover, I'd be much more comfortable seeing ifdefs in the code that inject
> known failures, and let them be triggered by error-triggering runs in the
> scripts (with e.g. -DINJECT_ERROR_XYZ), to confirm that this model actually
> catches known issues.
> 
> If we can show that the model can detect a few failure modes, both for safety
> and progress, then my confidence level in the model will start to improve.
> ;-)

Digging through the spin documentation, "-l" is indeed another way besides
the _np LTL negation to find non-progress cycles.

Still, injecting known failures would let us learn a lot about the model
error detection abilities.

Thanks!

Mathieu


> 
> Thanks,
> 
> Mathieu
> 
> > 
> > Thanx, Paul
> > 
> > 
> > sysidle.sh
> > 
> > spin -a sysidle.spin
> > cc -DNP -o pan pan.c
> > # Fair scheduling to focus progress checks in timekeeper.
> > ./pan -f -l -m128 -w22
> > 
> > 
> > sysidle.spin
> > 
> > /*
> >  * Promela model for CONFIG_NO_HZ_FULL_SYSIDLE=y in the Linux kernel.
> >  * This model assumes that the dyntick-idle bit manipulation works based
> >  * on long usage, and substitutes a per-thread boolean "am_busy[]" array
> >  * for the Linux kernel's dyntick-idle masks.  The focus of this model
> >  * is therefore on the state machine itself.  Checks for both safety and
> >  * forward progress.
> >  *
> >  * This program is free software; you can redistribute it and/or modify
> >  * it under the terms of the GNU General Public License as published by
> >  * the Free Software Foundation; either version 2 of the License, or
> >  * (at your option) any later version.
> >  *
> >  * This program is distributed in the hope that it will be useful,
> >  * but WITHOUT ANY WARRANTY; without even the implied warranty of
> >  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> >  * GNU General Public License for more details.
> >  *
> >  * You should have received a copy of the GNU General Public License
> >  * along with this program; if not, you can access it online at
> >  * http://www.gnu.org/licenses/gpl-2.0.html.
> >  *
> >  * Copyright IBM Corporation, 2014
> >  *
> >  * Author: Paul E. McKenney 
> >  */
> > 
> > #define NUM_WORKERS 3
> > 
> > byte wakeup_timekeeper = 0; /* Models rcu_kick_nohz_cpu(). */
> > 
> > #define RCU_SYSIDLE_NOT 0   /* Some CPU is not idle. */
> > #define RCU_SYSIDLE_SHORT   1   /* All CPUs idle for brief period. */
> > #define RCU_SYSIDLE_LONG2   /* All CPUs idle for long enough. */
> > #define RCU_SYSIDLE_FULL3   /* All CPUs idle, ready for sysidle. */
> > #define RCU_SYSIDLE_FULL_NOTED  4   /* Actually entered sysidle 
> > state. */
> > 
> > byte full_sysidle_state = RCU_SYSIDLE_NOT;
> > 
> > byte am_busy[NUM_WORKERS];  /* Busy is similar to "not dyntick-idle". */
> > byte am_setup[NUM_WORKERS]; /* Setup means timekeeper knows I am not idle.
> > */
> > 
> > /*
> >  * Non-timekeeping CPU going into and out of dyntick-idle state.
> >  */
> > proctype worker(byte me)
> 

Re: Promela/spin model for NO_HZ_FULL_SYSIDLE code

2014-03-30 Thread Mathieu Desnoyers
- Original Message -
> From: "Paul E. McKenney" 
> To: fweis...@gmail.com, "mathieu desnoyers" , 
> pet...@infradead.org
> Cc: linux-kernel@vger.kernel.org
> Sent: Sunday, March 30, 2014 7:08:56 PM
> Subject: Promela/spin model for NO_HZ_FULL_SYSIDLE code
> 
> For whatever it is worth, the following model claims safety and progress
> for the sysidle state machine.
> 
> Thoughts?

Hi Paul,

Please keep in mind that it's been a while since I've looked at Promela
proofs, but a couple of questions come to mind. First, how is the execution
script below checking for progress in any way ? I remember that we used
the negation of a "_np" LTL claim to check for progress in the past.
Moreover, I'd be much more comfortable seeing ifdefs in the code that inject
known failures, and let them be triggered by error-triggering runs in the
scripts (with e.g. -DINJECT_ERROR_XYZ), to confirm that this model actually
catches known issues.

If we can show that the model can detect a few failure modes, both for safety
and progress, then my confidence level in the model will start to improve. ;-)

Thanks,

Mathieu

> 
>   Thanx, Paul
> 
> 
> sysidle.sh
> 
> spin -a sysidle.spin
> cc -DNP -o pan pan.c
> # Fair scheduling to focus progress checks in timekeeper.
> ./pan -f -l -m128 -w22
> 
> 
> sysidle.spin
> 
> /*
>  * Promela model for CONFIG_NO_HZ_FULL_SYSIDLE=y in the Linux kernel.
>  * This model assumes that the dyntick-idle bit manipulation works based
>  * on long usage, and substitutes a per-thread boolean "am_busy[]" array
>  * for the Linux kernel's dyntick-idle masks.  The focus of this model
>  * is therefore on the state machine itself.  Checks for both safety and
>  * forward progress.
>  *
>  * This program is free software; you can redistribute it and/or modify
>  * it under the terms of the GNU General Public License as published by
>  * the Free Software Foundation; either version 2 of the License, or
>  * (at your option) any later version.
>  *
>  * This program is distributed in the hope that it will be useful,
>  * but WITHOUT ANY WARRANTY; without even the implied warranty of
>  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>  * GNU General Public License for more details.
>  *
>  * You should have received a copy of the GNU General Public License
>  * along with this program; if not, you can access it online at
>  * http://www.gnu.org/licenses/gpl-2.0.html.
>  *
>  * Copyright IBM Corporation, 2014
>  *
>  * Author: Paul E. McKenney 
>  */
> 
> #define NUM_WORKERS 3
> 
> byte wakeup_timekeeper = 0; /* Models rcu_kick_nohz_cpu(). */
> 
> #define RCU_SYSIDLE_NOT   0   /* Some CPU is not idle. */
> #define RCU_SYSIDLE_SHORT 1   /* All CPUs idle for brief period. */
> #define RCU_SYSIDLE_LONG  2   /* All CPUs idle for long enough. */
> #define RCU_SYSIDLE_FULL  3   /* All CPUs idle, ready for sysidle. */
> #define RCU_SYSIDLE_FULL_NOTED4   /* Actually entered sysidle 
> state. */
> 
> byte full_sysidle_state = RCU_SYSIDLE_NOT;
> 
> byte am_busy[NUM_WORKERS];  /* Busy is similar to "not dyntick-idle". */
> byte am_setup[NUM_WORKERS]; /* Setup means timekeeper knows I am not idle. */
> 
> /*
>  * Non-timekeeping CPU going into and out of dyntick-idle state.
>  */
> proctype worker(byte me)
> {
>   byte oldstate;
> 
>   do
>   :: 1 ->
>   /* Go idle. */
>   am_setup[me] = 0;
>   am_busy[me] = 0;
> 
>   /* Dyntick-idle in the following loop. */
>   do
>   :: 1 -> skip;
>   :: 1 -> break;
>   od;
> 
>   /* Exit idle loop, model getting out of dyntick idle state. */
>   am_busy[me] = 1;
> 
>   /* Get state out of full-system idle states. */
>   atomic {
>   oldstate = full_sysidle_state;
>   if
>   :: oldstate > RCU_SYSIDLE_SHORT ->
>   full_sysidle_state = RCU_SYSIDLE_NOT;
>   :: else -> skip;
>   fi;
>   }
> 
>   /* If needed, wake up the timekeeper. */
>   if
>   :: oldstate == RCU_SYSIDLE_FULL_NOTED ->
>   wakeup_timekeeper = 1;
>   :: else -> skip;
>   fi;
> 
>   /* Mark ourselves fully awake and operational. */
>   am_setup[me] = 1;
> 
>   /* We are fully awake, so timekeeper must not be asleep. */
>   assert(full_sysidle_state < RCU_SYSIDLE_FULL);
> 
>   /* Running in ke