On Thu, Feb 11, 2021 at 10:42:49AM -0700, Dave Jiang wrote: > When the auxiliary device code is built into the kernel, it can be executed > before the auxiliary bus is registered. This causes bus->p to be not > allocated and triggers a NULL pointer dereference when the auxiliary bus > device gets added with bus_add_device(). Call the auxiliary_bus_init() > under driver_init() so the bus is initialized before devices. > > Below is the kernel splat for the bug: > [ 1.948215] BUG: kernel NULL pointer dereference, address: 0000000000000060 > [ 1.950670] #PF: supervisor read access in kernel mode > [ 1.950670] #PF: error_code(0x0000) - not-present page > [ 1.950670] PGD 0 > [ 1.950670] Oops: 0000 1 SMP NOPTI > [ 1.950670] CPU: 0 PID: 1 Comm: swapper/0 Not tainted > 5.10.0-intel-nextsvmtest+ #2205 > [ 1.950670] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS > rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 > [ 1.950670] RIP: 0010:bus_add_device+0x64/0x140 > [ 1.950670] Code: 00 49 8b 75 20 48 89 df e8 59 a1 ff ff 41 89 c4 85 c0 75 7b > 48 8b 53 50 48 85 d2 75 03 48 8b 13 49 8b 85 a0 00 00 00 48 89 de <48> 8 > 78 60 48 83 c7 18 e8 ef d9 a9 ff 41 89 c4 85 c0 75 45 48 8b > [ 1.950670] RSP: 0000:ff46032ac001baf8 EFLAGS: 00010246 > [ 1.950670] RAX: 0000000000000000 RBX: ff4597f7414aa680 RCX: 0000000000000000 > [ 1.950670] RDX: ff4597f74142bbc0 RSI: ff4597f7414aa680 RDI: ff4597f7414aa680 > [ 1.950670] RBP: ff46032ac001bb10 R08: 0000000000000044 R09: 0000000000000228 > [ 1.950670] R10: ff4597f741141b30 R11: ff4597f740182a90 R12: 0000000000000000 > [ 1.950670] R13: ffffffffa5e936c0 R14: 0000000000000000 R15: 0000000000000000 > [ 1.950670] FS: 0000000000000000(0000) GS:ff4597f7bba00000(0000) > knlGS:0000000000000000 > [ 1.950670] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 1.950670] CR2: 0000000000000060 CR3: 000000002140c001 CR4: 0000000000f71ef0 > [ 1.950670] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 1.950670] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 > [ 1.950670] PKRU: 55555554 > [ 1.950670] Call Trace: > [ 1.950670] device_add+0x3ee/0x850 > [ 1.950670] __auxiliary_device_add+0x47/0x60 > [ 1.950670] idxd_pci_probe+0xf77/0x1180 > [ 1.950670] local_pci_probe+0x4a/0x90 > [ 1.950670] pci_device_probe+0xff/0x1b0 > [ 1.950670] really_probe+0x1cf/0x440 > [ 1.950670] ? rdinit_setup+0x31/0x31 > [ 1.950670] driver_probe_device+0xe8/0x150 > [ 1.950670] device_driver_attach+0x58/0x60 > [ 1.950670] __driver_attach+0x8f/0x150 > [ 1.950670] ? device_driver_attach+0x60/0x60 > [ 1.950670] ? device_driver_attach+0x60/0x60 > [ 1.950670] bus_for_each_dev+0x79/0xc0 > [ 1.950670] ? kmem_cache_alloc_trace+0x323/0x430 > [ 1.950670] driver_attach+0x1e/0x20 > [ 1.950670] bus_add_driver+0x154/0x1f0 > [ 1.950670] driver_register+0x70/0xc0 > [ 1.950670] __pci_register_driver+0x54/0x60 > [ 1.950670] idxd_init_module+0xe2/0xfc > [ 1.950670] ? idma64_platform_driver_init+0x19/0x19 > [ 1.950670] do_one_initcall+0x4a/0x1e0 > [ 1.950670] kernel_init_freeable+0x1fc/0x25c > [ 1.950670] ? rest_init+0xba/0xba > [ 1.950670] kernel_init+0xe/0x116 > [ 1.950670] ret_from_fork+0x1f/0x30 > [ 1.950670] Modules linked in: > [ 1.950670] CR2: 0000000000000060 > [ 1.950670] --[ end trace cd7d1b226d3ca901 ]-- > > Fixes: 7de3697e9cbd ("Add auxiliary bus support") > Reported-by: Jacob Pan <jacob.jun....@intel.com> > Acked-by: Dave Ertman <david.m.ert...@intel.com> > Reviewed-by: Dan Williams <dan.j.willi...@intel.com> > Signed-off-by: Dave Jiang <dave.ji...@intel.com> > --- > > v4: > - Remove remaining module bits as it's not a kernel module. (GregKH) > v3: > - Change init function to return void. (GregKH) > v2: > - Call in driver_init() to ensure aux bus gets init before devices. (GregKH) > > drivers/base/base.h | 5 +++++ > drivers/base/auxiliary.c | 18 +++--------------- > drivers/base/init.c | 1 + > 3 files changed, 9 insertions(+), 15 deletions(-) > > diff --git a/drivers/base/base.h b/drivers/base/base.h > index f5600a83124f..52b3d7b75c27 100644 > --- a/drivers/base/base.h > +++ b/drivers/base/base.h > @@ -119,6 +119,11 @@ static inline int hypervisor_init(void) { return 0; } > extern int platform_bus_init(void); > extern void cpu_dev_init(void); > extern void container_dev_init(void); > +#ifdef CONFIG_AUXILIARY_BUS > +extern void auxiliary_bus_init(void); > +#else > +static inline void auxiliary_bus_init(void) { } > +#endif > > struct kobject *virtual_device_parent(struct device *dev); > > diff --git a/drivers/base/auxiliary.c b/drivers/base/auxiliary.c > index 8336535f1e11..adc199dfba3c 100644 > --- a/drivers/base/auxiliary.c > +++ b/drivers/base/auxiliary.c > @@ -15,6 +15,7 @@ > #include <linux/pm_runtime.h> > #include <linux/string.h> > #include <linux/auxiliary_bus.h> > +#include "base.h" > > static const struct auxiliary_device_id *auxiliary_match_id(const struct > auxiliary_device_id *id, > const struct > auxiliary_device *auxdev) > @@ -260,20 +261,7 @@ void auxiliary_driver_unregister(struct auxiliary_driver > *auxdrv) > } > EXPORT_SYMBOL_GPL(auxiliary_driver_unregister); > > -static int __init auxiliary_bus_init(void) > +void __init auxiliary_bus_init(void) > { > - return bus_register(&auxiliary_bus_type); > + WARN_ON(bus_register(&auxiliary_bus_type)); > } > - > -static void __exit auxiliary_bus_exit(void) > -{ > - bus_unregister(&auxiliary_bus_type); > -} > - > -module_init(auxiliary_bus_init); > -module_exit(auxiliary_bus_exit); > - > -MODULE_LICENSE("GPL v2"); > -MODULE_DESCRIPTION("Auxiliary Bus"); > -MODULE_AUTHOR("David Ertman <david.m.ert...@intel.com>"); > -MODULE_AUTHOR("Kiran Patil <kiran.pa...@intel.com>"); > diff --git a/drivers/base/init.c b/drivers/base/init.c > index 908e6520e804..a9f57c22fb9e 100644 > --- a/drivers/base/init.c > +++ b/drivers/base/init.c > @@ -32,6 +32,7 @@ void __init driver_init(void) > */ > of_core_init(); > platform_bus_init(); > + auxiliary_bus_init(); > cpu_dev_init(); > memory_dev_init(); > container_dev_init(); > -- > 2.26.2 >
This is already in my tree, you got an email about it. Do the module removal as a separate patch as it has nothing to do with this one as the code could never be built as a module, your patch did not change it, so you are doing 2 different things here. {sigh} Intel owes me a new bottle of whisky. greg k-h