On Sun, Jan 08, 2017 at 03:20:20AM +0100, Rafael J. Wysocki wrote:
>  drivers/iommu/amd_iommu_init.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Index: linux-pm/drivers/iommu/amd_iommu_init.c
> ===================================================================
> --- linux-pm.orig/drivers/iommu/amd_iommu_init.c
> +++ linux-pm/drivers/iommu/amd_iommu_init.c
> @@ -2230,7 +2230,7 @@ static int __init early_amd_iommu_init(v
>        */
>       ret = check_ivrs_checksum(ivrs_base);
>       if (ret)
> -             return ret;
> +             goto out;
>  
>       amd_iommu_target_ivhd_type = get_highest_supported_ivhd_type(ivrs_base);
>       DUMP_printk("Using IVHD type %#x\n", amd_iommu_target_ivhd_type);

Good catch, this one needs to be applied regardless.

It doesn't fix my issue, though.

But I think I have it - I went and applied the well-proven debugging
technique of sprinkling printks around. Here's what I'm seeing:

early_amd_iommu_init()
|-> acpi_put_table(ivrs_base);
|-> acpi_tb_put_table(table_desc);
|-> acpi_tb_invalidate_table(table_desc);
|-> acpi_tb_release_table(...)
|-> acpi_os_unmap_memory
|-> acpi_os_unmap_iomem
|-> acpi_os_map_cleanup
|-> synchronize_rcu_expedited   <-- the kernel/rcu/tree_exp.h version
                                    with CONFIG_PREEMPT_RCU=y

Now that function goes and sends IPIs, i.e., it calls schedule_work(),
but this is too early: we haven't even done workqueue_init() yet.
Actually, from looking at the callstack, we do
kernel_init_freeable->native_smp_prepare_cpus() and workqueue_init()
comes next.

And this makes sense because the splat rIP points to __queue_work() but
we haven't done workqueue_init() yet.
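
To make the ordering problem concrete, here is a minimal userspace sketch.
It is only a model: every model_* name is made up for illustration and is
not a kernel API, and the NULL check exists only so the model can report
the too-early call instead of crashing.

```c
/* Userspace model only: none of the model_* names are kernel APIs. */
#include <stdbool.h>
#include <stddef.h>

struct model_wq {
	int pending;		/* number of queued work items */
};

static struct model_wq model_wq_storage;
static struct model_wq *model_wq;	/* NULL until model_workqueue_init() */

/* Stands in for workqueue_init(): makes the queue usable. */
static void model_workqueue_init(void)
{
	model_wq_storage.pending = 0;
	model_wq = &model_wq_storage;
}

/*
 * Stands in for schedule_work(). In the boot described above, the real
 * path splats in __queue_work(); the model returns false instead.
 */
static bool model_schedule_work(void)
{
	if (!model_wq)
		return false;	/* too early: queue not set up yet */
	model_wq->pending++;
	return true;
}
```

Calling model_schedule_work() before model_workqueue_init() fails, and the
same call succeeds afterwards, which is the situation the callstack shows.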

So that acpi_put_table() is happening too early. Looks like AMD IOMMU
should not put the table but WTH do I know?!

In any case, commenting out:

        acpi_put_table(ivrs_base);
        ivrs_base = NULL;

at the end of early_amd_iommu_init() makes the box boot again.

-- 
Regards/Gruss,
    Boris.
