On Fri, 14 Oct 2016 11:41:21 -0700 Douglas Anderson <diand...@chromium.org> wrote:

> We've got a delay loop waiting for secondary CPUs.  That loop uses
> loops_per_jiffy.  However, loops_per_jiffy doesn't actually mean how
> many tight loops make up a jiffy on all architectures.  It is quite
> common to see things like this in the boot log:
>   Calibrating delay loop (skipped), value calculated using timer
>   frequency.. 48.00 BogoMIPS (lpj=24000)
> 
> In my case I was seeing lots of cases where other CPUs timed out
> entering the debugger only to print their stack crawls shortly after
> the kdb> prompt was written.
> 
> It appears that other code with similar loops (like __spin_lock_debug)
> adds an extra __delay(1) in there which makes it work better.
> Presumably the __delay(1) is very safe.  At least on modern ARM/ARM64
> systems it will just do a CP15 instruction, which should be safe.  On
> older ARM systems it will fall back to an actual delay loop, or perhaps
> another memory-mapped timer.  On other platforms it must be safe too or
> it wouldn't be used in __spin_lock_debug.
> 
> Note that we use __delay(100) instead of __delay(1) so we can get a
> little closer to a more accurate timeout on systems where __delay() is
> backed by a timer.  It's better to have a more accurate timeout and the
> only penalty is that we might wait an extra 99 "loops" before we enter
> the debugger.
> 
> --- a/kernel/debug/debug_core.c
> +++ b/kernel/debug/debug_core.c
> @@ -598,11 +598,11 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
>       /*
>        * Wait for the other CPUs to be notified and be waiting for us:
>        */
> -     time_left = loops_per_jiffy * HZ;
> +     time_left = DIV_ROUND_UP(loops_per_jiffy * HZ, 100);
>       while (kgdb_do_roundup && --time_left &&
>              (atomic_read(&masters_in_kgdb) + atomic_read(&slaves_in_kgdb)) !=
>                  online_cpus)
> -             cpu_relax();
> +             __delay(100);
>       if (!time_left)
>               pr_crit("Timed out waiting for secondary CPUs.\n");
>  

This is all rather vague, isn't it?

Can the code be redone using ndelay() or udelay()?  That way we should
be able to get predictable, arch-independent, cpu-freq-independent
delay periods.
