Hi Stefan, On Wed, Aug 31, 2022 at 2:53 PM Tony Dinh <mibo...@gmail.com> wrote: > > Hi Stefan, > > On Wed, Aug 31, 2022 at 8:08 AM Tony Dinh <mibo...@gmail.com> wrote: > > > > Hi Stefan, > > > > On Wed, Aug 31, 2022 at 12:22 AM Stefan Roese <s...@denx.de> wrote: > > > > > > Hi Tony, > > > > > > On 31.08.22 08:30, Tony Dinh wrote: > > > > Hi Stefan, > > > > > > > > On Tue, Aug 30, 2022 at 10:08 PM Stefan Roese <s...@denx.de> wrote: > > > >> > > > >> Hi Tony, > > > >> > > > >> On 31.08.22 07:02, Stefan Roese wrote: > > > >>> Hi Tony, > > > >>> > > > >>> On 31.08.22 00:15, Tony Dinh wrote: > > > >>>> Hi Stefan, > > > >>>> > > > >>>> On Tue, Aug 30, 2022 at 4:53 AM Stefan Roese <s...@denx.de> wrote: > > > >>>>> > > > >>>>> This patchset enhaces the recently added Orion Timer driver to > > > >>>>> support > > > >>>>> all other Kirkwood & 32bit MVEBU Armada platforms. Additionally, > > > >>>>> this > > > >>>>> timer support is then enabled per default for those platforms, so > > > >>>>> that > > > >>>>> the board config files don't need to be changed. Also necessary is > > > >>>>> some dts hacking, so that the timer DT node is available in early > > > >>>>> U-Boot stages. > > > >>>>> > > > >>>>> I've successfully tested this patchset on an Armada XP board. > > > >>>>> Additional > > > >>>>> test on other boards and platforms are very welcome and necessary. > > > >>>> > > > >>>> I've run some tests with the following 2 Kirkwood boards: Cloud > > > >>>> Engines Pogo V4 88F6192 (with CONFIG_DM_RTC and > > > >>>> CONFIG_RTC_EMULATION), > > > >>>> and Marvell Sheevaplug 88F6281 (with CONFIG_DM_RTC and > > > >>>> CONFIG_RTC_MV). > > > >>>> > > > >>>> It seems that it was either frozen or the timer did not expire at > > > >>>> some > > > >>>> subsequent sleep commands. Sometime it happened at 2nd command, some > > > >>>> time at a later sleep command. For example, > > > >>>> > > > >>>> === Pogo V4 (the 1st sleep command works correctly at 10 seconds on > > > >>>> my > > > >>>> stopwatch) > > > >>>> > > > >>>> U-Boot 2022.10-rc3-00048-g66ccd87a9c-dirty (Aug 30 2022 - 13:38:24 > > > >>>> -0700) > > > >>>> Pogoplug V4 > > > >>>> > > > >>>> Hit any key to stop autoboot: 0 > > > >>>> Pogo_V4> sleep 10 > > > >>>> Pogo_V4> sleep 31.5 > > > >>>> <frozen here> > > > >>> > > > >>> Does the cmd interface support fractial numbers? Please test again > > > >>> with > > > >>> 31 or other integral numbers. > > > >>> > > > >>>> === Sheevaplug (RTC battery is old, so the date was not updated, but > > > >>>> the clock seems OK) > > > >>>> > > > >>>> U-Boot 2022.10-rc3-00048-g66ccd87a9c-dirty (Aug 30 2022 - 14:14:24 > > > >>>> -0700) > > > >>>> Marvell-Sheevaplug > > > >>>> Hit any key to stop autoboot: 0 > > > >>>> => date > > > >>>> Date: 2000-01-01 (Saturday) Time: 0:02:55 > > > >>>> => sleep 10 > > > >>>> => date > > > >>>> Date: 2000-01-01 (Saturday) Time: 0:03:18 > > > >>>> => sleep 10 > > > >>>> => sleep 20.1 > > > >>>> <frozen here> > > > >>>> > > > >>>> Please let me know what I can do (i.e. perhaps running a debug > > > >>>> patch). > > > >>> > > > >>> Please see above. I assume that the fractional numbers result in very > > > >>> long numbers internally, which result in a frozen / hanging system. > > > >> > > > >> I just tested fractional numbers on another board and hey, it just > > > >> works. Learned something new. So we seem to have a problem here. Let > > > >> me see, if I can find something. > > > > > > > > I've added debug printfs and possibly tracked down this issue. Seems > > > > like in the 2nd call to sleep, get_timer(0) did not reset the start > > > > number. > > > > > > > > cmd/sleep.c > > > > static int do_sleep(struct cmd_tbl *cmdtp, int flag, int argc, > > > > char *const argv[]) > > > > { > > > > ulong start = get_timer(0); > > > > > > > > "do_sleep got a timer start = 2015" is the 1st call to sleep 5. > > > > "do_sleep got a timer start = 16304" is the 2nd call to sleep 10. > > > > > > > > <BEGIN log> > > > > Pogo_V4> sleep 5 > > > > do_sleep got a timer start = 2015 > > > > do_sleep delay = 5000 > > > > do_sleep delay = 5000 > > > > do_sleep sleeping... > > > > do_sleep start 2015 curent 100 > > > > do_sleep start 2015 curent 200 > > > > do_sleep start 2015 curent 300 > > > > <snip> > > > > do_sleep start 2015 curent 4900 > > > > do_sleep end of sleep ... current = 5000 > > > > Pogo_V4> > > > > > > > > Pogo_V4> sleep 10 > > > > do_sleep got a timer start = 16304 > > > > do_sleep delay = 10000 > > > > do_sleep delay = 10000 > > > > do_sleep sleeping... > > > > <snip> > > > > > > > > <END log> > > > > > > > > So somewhere in the DM timer, "start" got accumulated. I think each > > > > get_timer(0) should be a different timer instance. It looks like the > > > > same timer instance is used again and again, causing the "start "to > > > > grow bigger, and at one point it might just overflow. > > > > > > Frankly I don't really understand the problem you describe above. What > > > do you mean with "timer instance"? get_timer(0) will return different > > > values, depending on when you call this function. So where exactly is > > > the problem with the 2nd "sleep 10" above? > > > > Please ignore what I said above! I misunderstood what get_timer() > > does. I'll try again on another KIrkwood board to see if the behavior > > will be the same. > > I've run the tests again with the Seagate GoFlex Home board (Kirkwood > 88F6281). Below is the log, with my annotated comments starting with > ****. > In these tests, current = get_timer(start) is inside the while loop. > And I printed out the start and current values when (current % 100 == > 0). > > <BEGIN LOG> > U-Boot 2022.10-rc3-00048-g66ccd87a9c-dirty (Aug 31 2022 - 13:06:04 -0700) > Seagate GoFlex Home > > SoC: Kirkwood 88F6281_A1 > DRAM: 128 MiB > Core: 12 devices, 10 uclasses, devicetree: separate > NAND: orion_timer_probe successful > 256 MiB > Loading Environment from NAND... OK > In: serial > Out: serial > Err: serial > Net: eth0: ethernet-controller@72000 > Hit any key to stop autoboot: 0 > > GoFlexHome> date > Date: 2022-08-31 (Wednesday) Time: 20:43:03 > > GoFlexHome> sleep 5 > do_sleep got a timer start = 1244 > do_sleep delay = 5000 > do_sleep delay = 5000 > do_sleep sleeping... > do_sleep start 1244 current 100 > <snip> > do_sleep start 1244 current 2300 > <snip> > do_sleep start 1244 current 3700 > <snip> > do_sleep start 1244 current 4800 > do_sleep start 1244 current 4900 > do_sleep end of sleep ... current = 5000 > > **** working as intended > > GoFlexHome> sleep 10 > do_sleep got a timer start = 11428 > do_sleep delay = 10000 > do_sleep delay = 10000 > do_sleep sleeping... > do_sleep start 11428 current 100 > <snip> > do_sleep start 11428 current 2300 > <snip> > do_sleep start 11428 current 9900 > do_sleep end of sleep ... current = 10000 > > **** working as intended > > GoFlexHome> sleep 20.5 > do_sleep got a timer start = 15031 > do_sleep delay = 20000 > do_sleep delay = 20500 > do_sleep sleeping... > do_sleep start 15031 current 100 > <snip> > do_sleep start 15031 current 6400 > do_sleep end of sleep ... current = 4294952265 > > *** Something strange happened here. current should be 6500, but it > seems to have garbage. So the loop exits prematurely. > > GoFlexHome> sleep 20.5 > do_sleep got a timer start = 8339 > do_sleep delay = 20000 > do_sleep delay = 20500 > do_sleep sleeping... > do_sleep start 8339 current 100 > do_sleep start 8339 current 200 > <snip> > do_sleep start 8339 current 12300 > do_sleep start 8339 current 12400 > do_sleep start 8339 current 12500 > do_sleep start 8339 current 12600 > do_sleep start 8339 current 12700 > do_sleep start 8339 current 12800 > do_sleep start 8339 current 12900 > do_sleep start 8339 current 13000 > do_sleep start 8339 current 13100 > > *** It was frozen right here. Control-C could not interrupt the loop, > so my guess > *** either it was stuck during the call to get_timer(). Or the current value > *** overflowed and crashed the system. > > <END LOG> > > Finally, I removed the patch set, and rerun the sleep tests on this > board. They all worked fine, I did not see anything strange. >
Some ideas. The get_timer() function looks wrong assigning an uint64_t to ulong. lib/time.c static uint64_t notrace tick_to_time(uint64_t tick) uint64_t notrace get_ticks(void) uint64_t __weak notrace get_ticks(void) ulong __weak get_timer(ulong base) { return tick_to_time(get_ticks()) - base; } Most of the timer infrastructure is using uint64_t. I'm seeing this __weak function get_timer was invoked in Kirkwood boards. Both in sleep and timer commands. Thanks, Tony