On Mon, Jul 02, 2018 at 04:05:54PM +0100, Peter Maydell wrote:
> On 26 June 2018 at 21:00, Guenter Roeck <li...@roeck-us.net> wrote:
> > Here is a log, with debug messages added to ptimer code.
> 
> Thanks for the logging. (I am currently also messing about
> with buildroot to try to reproduce -- I have a kernel that
> boots, but since it has no initial filesystem it hits the
> "stop because I can't find init" before anything interesting
> happens.)
> 
> > Without my patch:
> >
> > ptimer_tick(0x555556edac70): enabled=1 delta=250000 limit=250000
> > ptimer_reload(555556edac70): frac=0 period=40 delta=250000 limit=250000 
> > adjust=1
> > ptimer_tick(0x555556edac70): enabled=1 delta=250000 limit=250000
> > ptimer_reload(555556edac70): frac=0 period=40 delta=250000 limit=250000 
> > adjust=1
> > [    0.497942] clocksource: Switched to clocksource mps2-clksrc^M
> > ptimer_tick(0x555556edac70): enabled=1 delta=250000 limit=250000
> > ptimer_reload(555556edac70): frac=0 period=40 delta=250000 limit=250000 
> > adjust=1
> >
> > # up to here timer is in periodic mode and fires on a regular basis
> >
> > # prepare to switch to one-shot mode
> > ptimer_reload(555556edac70): frac=0 period=40 delta=0 limit=0 adjust=0
> >
> > # set timer for one-shot mode, start
> > ptimer_set_count(0x555556edac70, 92004
> > ptimer_reload(555556edac70): frac=0 period=40 delta=92004 limit=0 adjust=0
> >
> > # one-shot mode fires
> > ptimer_tick(0x555556edac70): enabled=1 delta=92004 limit=0
> >
> > # timer is in periodic mode, limit is 0 (from one-shot mode)
> > # reload propagates limit -> delta, making it 0
> > ptimer_reload(555556edac70): frac=0 period=40 delta=0 limit=0 adjust=-1
> >
> > # and the timer is disabled as result.
> > 0x555556edac70: Timer with delta zero, disabling
> >
> > ptimer_set_count(0x555556edac70, 199340
> > ptimer_reload(555556edac70): frac=0 period=40 delta=199340 limit=0 adjust=0
> > [    0.514458] NET: Registered protocol family 2^M
> > ptimer_tick(0x555556edac70): enabled=1 delta=199340 limit=0
> > ptimer_reload(555556edac70): frac=0 period=40 delta=0 limit=0 adjust=-1
> > 0x555556edac70: Timer with delta zero, disabling
> > ptimer_set_count(0x555556edac70, 240488
> > ptimer_reload(555556edac70): frac=0 period=40 delta=240488 limit=0 adjust=0
> > [    0.526096] tcp_listen_portaddr_hash hash table entries: 512 (order: 0, 
> > 4096
> > bytes)^M
> > [    0.526533] TCP established hash table entries: 1024 (order: 0, 4096 
> > bytes)^M
> > [    0.526943] TCP bind hash table entries: 1024 (order: 0, 4096 bytes)^M
> > ptimer_tick(0x555556edac70): enabled=1 delta=240488 limit=0
> > ptimer_reload(555556edac70): frac=0 period=40 delta=0 limit=0 adjust=-1
> > 0x555556edac70: Timer with delta zero, disabling
> 
> OK, but I can't see anything actually going wrong here. Once
> the guest sets the RELOAD register to 0, it gets one-shot timers, and
> needs to write to VALUE each time to restart the timer. We seem to get
> calls to set_count which set different counts, and a a tick for each
> set_count, so that looks like it's working to me. (But see below...)
> 

Sometimes the timer doesn't fire or is not reloaded when expected, causing
the system to stall. It also seemed to me that the timer fires at the wrong
time, ie at the period instead of the selected one-time value, but I may
misinterpret the log.

> > And so on. The system may finish booting or lock up. Which one it is seems
> > to be random.
> >
> > Same log with my patch applied:
> >
> > ptimer_tick(0x555556edac70): enabled=1 delta=250000 limit=250000
> > ptimer_reload(555556edac70): frac=0 period=40 delta=250000 limit=250000 
> > adjust=1
> >
> > # switch to one-shot mode
> > ptimer_reload(555556edac70): frac=0 period=40 delta=0 limit=0 adjust=0
> >
> > @ set counter, start
> > ptimer_set_count(0x555556edac70, 24
> > ptimer_reload(555556edac70): frac=0 period=40 delta=24 limit=0 adjust=0
> >
> > # tick
> > ptimer_tick(0x555556edac70): enabled=2 delta=24 limit=0
> >
> > # update counter, restart
> > ptimer_set_count(0x555556edac70, 209498
> > ptimer_reload(555556edac70): frac=0 period=40 delta=209498 limit=0 adjust=0
> > [    0.509883] NET: Registered protocol family 2^M
> >
> > # tick
> > ptimer_tick(0x555556edac70): enabled=2 delta=209498 limit=0
> >
> > # update counter, restart
> > ptimer_set_count(0x555556edac70, 217500
> > ptimer_reload(555556edac70): frac=0 period=40 delta=217500 limit=0 adjust=0
> >
> > # tick
> > ptimer_tick(0x555556edac70): enabled=2 delta=217500 limit=0
> 
> This doesn't seem to be behaving any differently (apart from
> having silenced the fprintf about the zero delta).
> 
> My reading of the code in ptimer_tick() is that the behaviour
> for "this was setup as a one-shot" is in effect the same as
> for "this was setup as periodic but the reload value is 0"
> is the same: it sets enabled to 0 and delta to 0.
> 

To me the second log looks more predictive than the first, but that
may just be me.

Again, in my observation the boot sometimes stalls without my
patch. Maybe there is a better fix, but the code as-is doesn't
work for me. All I can say is that the current code does not set
one-shot mode when it should, and that seems wrong.

You could of course take out the zero delta warning. Personally
I would caution against that - whenever I have seen that warning,
it suggested that something is wrong with the emulation, and it
helped me track down the problem. Your call, of course. Also,
I don't think taking out the warning would fix my stall problem.

Thanks,
Guenter

> One thing I do notice reading the code is that we don't
> actually do anything to re-enable the timer on a write to
> the VALUE register. We call ptimer_set_count(), but that will
> not reenable the timer if it was disabled -- the cmsdk timer
> code needs to call ptimer_run() if necessary, I think. But
> that bug should manifest whether we're setting the timer up
> as a one-shot or as a periodic with a 0 limit, because
> ptimer_tick() will disable the timer when it fires whichever
> way we do that.
> 
> thanks
> -- PMM

Reply via email to