Jason Wessel wrote:
> Sergei Shtylyov wrote:

>>    The mainline code has the same TASKLET_STATE_SCHED but check and 
>> BUG, yet it didn't seem to give the trace -- I'll investigate today...

>>> Is this during the boot cycle or attaching afterwards?

>>    The former -- it's caused by the 'kgdbwait' option.

> Sergei,

> I did not previously understand the test case, because you had not 
> indicated that this was kgdboe with a kgdwait.

    It is not limited to KGDBoE.

> I inspected the code a bit, and you might consider trying the attached 
> patches.  The issue here is that the breakpoint will get scheduled 
> multiple times because the check exception stack with the init is part 

    No, it gets scheduled only *once*. I tried to put an accent on the fact 
that on CPU1 it gets executed with tasklet state of 0 -- which is a BUG even 
for the mainline code. Also, when the tasklet is executed on CPU0 *for real* 
we see:

__tasklet_common_schedule called on CPU0 with t = ffffffff80810ba0, head = 
ffffffff808ec5c8, nr = 5

    Mark the address in 'head' -- it's tasklet list address in the 
.percpu.data section. But when the taskle is executed for real, we see:

tasklet_action: &tasklet_vec = ffff810002c155c8

which is the address of CPU0 data allocated by setup_per_cpu_areas().
Later, when there's false execution on CPU1 (I've missed this piece in the 
previols mail), we cee:

tasklet_action: &tasklet_vec = ffff810002c1ddc8

which is the address of CPU1 data allocated by setup_per_cpu_areas().  Note 
the difference in the list addresses at the time of tasklet scheduling and 
execution.

> of the breakpoint() routine.  This is why you are seeing it on multiple 
> CPUs.  This check should be done in early or late init only, else 

    The check seems alright but it's just too early to use taslkets it seems.

> can have this sort of race condition.  Another way to circumvent the 
> problem is to have an atomic inc the breakpoint() routine and a dec in 
> the tasklet, so you can defer a breakpoint by scheduling it, if can not 
> be done immediately.   The important part is to make sure the arch 
> handlers were installed in the early init, which was clearly broken.

    Which arch handlers? kgdb_arch_init() only registers die chain notifier on 
x86_64.

> The attached changes do not implement the inc and dec I mentioned 
> earlier.  As a result this possibly breaks the rs232 early debug or 
> pushes it off to be a late_init, but the correct sequence of setting up 
> the die notification will happen on kgdboe for sure.

> Perhaps you can experiment a bit further, and report back as I do not 
> have your exact setup with SMP on x86_64 at the moment?

    Well, this would explain why this has gone unnoticed so far. :-)

> The patches attached have not been totally validated and are part of 
> some continuing uprev work.  I'll post final versions for review after I 
> hear back about your results or setup a similar scenario of test equipment.

    Well, I'll try them but not at all sure that changes anything.

> Thanks,
> Jason.

> ------------------------------------------------------------------------
>  kernel/kgdb.c |   23 +++++++++++------------
>  1 file changed, 11 insertions(+), 12 deletions(-)
> 
> Index: linux-2.6.21-standard/kernel/kgdb.c
> ===================================================================
> --- linux-2.6.21-standard.orig/kernel/kgdb.c
> +++ linux-2.6.21-standard/kernel/kgdb.c
> @@ -1788,17 +1788,6 @@ late_initcall(kgdb_late_entry);
>   */
>  void breakpoint(void)
>  {
> -     if (kgdb_initialized != 1) {
> -             kgdb_early_entry();
> -             if (kgdb_initialized == 1)
> -                     printk(KERN_CRIT "Waiting for connection from remote "
> -                            "gdb...\n");
> -             else {
> -                     printk(KERN_CRIT "KGDB cannot initialize I/O yet.\n");
> -                     return;
> -             }
> -     }
> -
>       atomic_set(&kgdb_setting_breakpoint, 1);
>       wmb();
>       BREAKPOINT();
> @@ -1883,7 +1872,17 @@ static int __init opt_kgdb_enter(char *s
>       if (kgdb_initialized)
>               return 0;
>  
> -     /* Call breakpoint() which will take care of init. */
> +     if (kgdb_initialized != 1) {
> +             kgdb_early_entry();
> +             if (kgdb_initialized == 1)
> +                     printk(KERN_CRIT "Waiting for connection from remote "
> +                            "gdb...\n");
> +             else {
> +                     printk(KERN_CRIT "KGDB cannot initialize I/O yet.\n");
> +                     return;
> +             }
> +     }
> +
>       breakpoint();

    I don't see how this might help.

>       return 0;

> ------------------------------------------------------------------------
>  kernel/kgdb.c |   11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> Index: linux-2.6.21-standard/kernel/kgdb.c
> ===================================================================
> --- linux-2.6.21-standard.orig/kernel/kgdb.c
> +++ linux-2.6.21-standard/kernel/kgdb.c
> @@ -1682,20 +1682,21 @@ DECLARE_TASKLET(kgdb_tasklet_breakpoint,
>   */
>  static void __init kgdb_early_entry(void)
>  {
> +     /* Let the architecture do any setup that it needs to. */
> +     kgdb_arch_init();
> +
>       /*
>        * Don't try and do anything until the architecture is able to
>        * setup the exception stack.  In this case, it is up to the
>        * architecture to hook in and look at us when they are ready.
>        */
> +
>       if(!CHECK_EXCEPTION_STACK()) {
>               kgdb_initialized = -1;
> -             tasklet_schedule(&kgdb_tasklet_breakpoint);
> +             /* any kind of break point is deferred to late_init */
>               return;
>       }
>  
> -     /* Let the architecture do any setup that it needs to. */
> -     kgdb_arch_init();
> -
>       /* Now try the I/O. */
>       /* For early entry kgdb_io_ops.init must be defined */
>       if (!kgdb_io_ops.init || kgdb_io_ops.init()) {
> @@ -1879,7 +1880,7 @@ static int __init opt_kgdb_enter(char *s
>                              "gdb...\n");
>               else {
>                       printk(KERN_CRIT "KGDB cannot initialize I/O yet.\n");
> -                     return;
> +                     return 0;
>               }
>       }
>  

WBR, Sergei

PS: it seems that SourceForge started eating my mails as well... :-(

-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
Kgdb-bugreport mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/kgdb-bugreport

Reply via email to