On Wed, 2012-09-12 at 07:02 +0000, Zhang, Lin-Bao (ESSN-MCXS-Linux
Kernel R&D) wrote:
> Hi all, 
> This defect can be observed when the x2apic setting in BIOS is set to
> "auto" and the BIOS has virtual wire mode enabled on a power up. This
> defect was found on a 2.6.32 based kernel.

I assume you are able to reproduce the issue with the latest kernel
as well?

What virtual wire mode is it?

Virtual wire mode A (where the PIC output is connected to LINT0 of the
Local APIC) doesn't go through interrupt-remapping, and virtual wire
mode B (where the PIC output is routed through an IO-APIC RTE) will be
completely disabled, because all the IO-APIC RTEs set up by the BIOS
are masked by the Linux kernel from the time we enable
interrupt-remapping until the kernel has properly reconfigured the
IO-APIC RTEs.

So I am at a loss to understand what is causing this.
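
To make the reasoning above concrete, here is a toy model of the two
delivery paths. It is only a sketch: every type and function name in it
is hypothetical (none of this is kernel code), and the REMAP_FAULT case
is my assumption about what an unmasked mode-B interrupt would hit if
it reached interrupt-remapping without a valid entry.

```c
#include <stdbool.h>

/* Toy model only: hypothetical names, not kernel APIs. */
enum vw_mode { VW_MODE_A, VW_MODE_B };

struct platform {
	enum vw_mode mode;
	bool pic_masked;        /* all 8259A lines masked */
	bool ioapic_rte_masked; /* BIOS-programmed IO-APIC RTEs masked */
	bool ir_enabled;        /* interrupt-remapping enabled */
	bool irte_valid;        /* a valid remapping entry exists */
};

enum outcome { DELIVERED, BLOCKED, REMAP_FAULT };

/* Decide what happens to an interrupt raised by the 8259A PIC. */
static enum outcome pic_interrupt(const struct platform *p)
{
	if (p->pic_masked)
		return BLOCKED;

	/* Mode A: PIC -> LINT0 of the local APIC, bypasses remapping. */
	if (p->mode == VW_MODE_A)
		return DELIVERED;

	/* Mode B: PIC -> IO-APIC RTE, so a masked RTE stops it here. */
	if (p->ioapic_rte_masked)
		return BLOCKED;

	/* Assumption: an unmasked RTE with no valid entry would fault. */
	if (p->ir_enabled && !p->irte_valid)
		return REMAP_FAULT;

	return DELIVERED;
}
```

In this model, with the RTEs masked and remapping enabled, mode A
always delivers and mode B is always blocked, so neither path reaches
the fault case.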

> 
> The kernel code (smpboot.c, apic.c) does not mask 8259A interrupts
> before changing and initializing the new VT-d table when x2apic
> virtual wire mode is enabled on power up. The Linux kernel expects
> virtual wire mode to be disabled when booting and enables it when
> interrupts are masked.
> 
> The BIOS code builds a simple VT-d table on power up. While the Linux
> kernel boots, it first builds an empty VT-d table and uses it. After
> some time, the Linux kernel initializes the IO-APIC redirect
> table, and then initializes the VT-d entries. In the window between
> initializing the redirect table and the VT-d entries, the 8259A
> interrupts are not masked. If an interrupt occurs in this window, the
> Linux kernel will not find a valid entry for this interrupt. The
> kernel treats this as a fatal error and panics. If the error never
> gets cleared, the Linux kernel continuously prints this error:
> "NMI: IOCK error (debug interrupt?) for reason"

Not sure why we get an NMI instead of a vt-d fault. Perhaps the vt-d
fault is also getting reported via NMI on this platform?

Does your tested kernel have this fix?
commit 254e42006c893f45bca48f313536fcba12206418
Author: Suresh Siddha <[email protected]>
Date:   Mon Dec 6 12:26:30 2010 -0800

    x86, vt-d: Quirk for masking vtd spec errors to platform error handling logic

Will you be able to provide the failing kernel log so that I can better
understand the issue?

thanks,
suresh

> The fix for this defect is to mask 8259A interrupts before changing
> the VT-d table and initializing the VT-d entries, then unmask them
> after completing the redirect table entries.
> 
> 
> Signed-off-by: Zhang, Lin-Bao <[email protected]>
> Tested-by: Nigel Croxon <[email protected]>
> 
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index 24deb30..299172c 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -1556,7 +1556,6 @@ void __init enable_IR_x2apic(void)
>         }
> 
>         local_irq_save(flags);
> -       legacy_pic->mask_all();
>         mask_ioapic_entries();
> 
>         if (x2apic_preenabled && nox2apic)
> @@ -1603,7 +1602,6 @@ void __init enable_IR_x2apic(void)
>  skip_x2apic:
>         if (ret < 0) /* IR enabling failed */
>                 restore_ioapic_entries();
> -       legacy_pic->restore_mask();
>         local_irq_restore(flags);
>  }
> 
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 7c5a8c3..95fee01 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -1000,7 +1000,7 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
>                 zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
>         }
>         set_cpu_sibling_map(0);
> -
> +       mask_8259A();
> 
>         if (smp_sanity_check(max_cpus) < 0) {
>                 pr_info("SMP disabled\n");
> @@ -1037,6 +1037,8 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
>                 apic->setup_portio_remap();
> 
>         smpboot_setup_io_apic();
> +       unmask_8259A();
> +
>         /*
>          * Set up local APIC timer on boot CPU.
>          */
> 
> 
> 
> -- Bob(Zhang LinBao)
> Confucius said: "Do not worry that others do not know you; worry that you do not know others."
> "If not us, who ? if not now, when ?"
> ESSN-MCBS Linux kernel engineer
> 
> 


