Re: [PATCH/RFC] driver core: Postpone DMA tear-down until after devres release
On 26/03/2019 12:31, Geert Uytterhoeven wrote: Hi John, CC robh On Tue, Mar 26, 2019 at 12:42 PM John Garry wrote: Memory is incorrectly freed using the direct ops, as dma_map_ops = NULL. Oops... After reversing the order of the calls to arch_teardown_dma_ops() and devres_release_all(), dma_map_ops is still valid, and the DMA memory is now released using __iommu_free_attrs(): +sata_rcar ee30.sata: dmam_release:32: size 2048 vaddr ff8012145000 dma_handle 0x0xf000 attrs 0x0 +sata_rcar ee30.sata: dma_free_attrs:289: size 2048, ops = iommu_dma_ops +sata_rcar ee30.sata: dma_free_attrs:311: calling __iommu_free_attrs() --- drivers/base/dd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 8ac10af17c0043a3..d62487d024559620 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -968,9 +968,9 @@ static void __device_release_driver(struct device *dev, struct device *parent) drv->remove(dev); device_links_driver_cleanup(dev); - arch_teardown_dma_ops(dev); devres_release_all(dev); + arch_teardown_dma_ops(dev); dev->driver = NULL; Hi guys, Could there still be the same problem in the error path of really_probe(): static int really_probe(struct device *dev, struct device_driver *drv) { [...] goto done; probe_failed: arch_teardown_dma_ops(dev); dma_failed: if (dev->bus) blocking_notifier_call_chain(&dev->bus->p->bus_notifier, BUS_NOTIFY_DRIVER_NOT_BOUND, dev); pinctrl_bind_failed: device_links_no_driver(dev); devres_release_all(dev); driver_sysfs_remove(dev); dev->driver = NULL; dev_set_drvdata(dev, NULL); We seem to be able to call arch_teardown_dma_ops() prior to devres_release_all() if we reach probe_failed label. Yes, this looks like another instance of the same problem. And test_remove doesn't expose this, as it doesn't exercise the full cycle. Hi Geert, OK, thanks. There is another devres_release_all() call in device_release(). Not sure if there is another potential problem there also... As for this issue, I'll look to fix unless anyone else doing it. Cheers, John Gr{oetje,eeting}s, Geert ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH/RFC] driver core: Postpone DMA tear-down until after devres release
Hi John, CC robh On Tue, Mar 26, 2019 at 12:42 PM John Garry wrote: > > Memory is incorrectly freed using the direct ops, as dma_map_ops = NULL. > > Oops... > > > > After reversing the order of the calls to arch_teardown_dma_ops() and > > devres_release_all(), dma_map_ops is still valid, and the DMA memory is > > now released using __iommu_free_attrs(): > > > > +sata_rcar ee30.sata: dmam_release:32: size 2048 vaddr > > ff8012145000 dma_handle 0x0xf000 attrs 0x0 > > +sata_rcar ee30.sata: dma_free_attrs:289: size 2048, ops = > > iommu_dma_ops > > +sata_rcar ee30.sata: dma_free_attrs:311: calling > > __iommu_free_attrs() > > --- > > drivers/base/dd.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/base/dd.c b/drivers/base/dd.c > > index 8ac10af17c0043a3..d62487d024559620 100644 > > --- a/drivers/base/dd.c > > +++ b/drivers/base/dd.c > > @@ -968,9 +968,9 @@ static void __device_release_driver(struct device *dev, > > struct device *parent) > > drv->remove(dev); > > > > device_links_driver_cleanup(dev); > > - arch_teardown_dma_ops(dev); > > > > devres_release_all(dev); > > + arch_teardown_dma_ops(dev); > > dev->driver = NULL; > > Hi guys, > > Could there still be the same problem in the error path of really_probe(): > > static int really_probe(struct device *dev, struct device_driver *drv) > { > > [...] > > goto done; > > probe_failed: > arch_teardown_dma_ops(dev); > dma_failed: > if (dev->bus) > blocking_notifier_call_chain(&dev->bus->p->bus_notifier, > BUS_NOTIFY_DRIVER_NOT_BOUND, dev); > pinctrl_bind_failed: > device_links_no_driver(dev); > devres_release_all(dev); > driver_sysfs_remove(dev); > dev->driver = NULL; > dev_set_drvdata(dev, NULL); > > We seem to be able to call arch_teardown_dma_ops() prior to > devres_release_all() if we reach probe_failed label. Yes, this looks like another instance of the same problem. And test_remove doesn't expose this, as it doesn't exercise the full cycle. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH/RFC] driver core: Postpone DMA tear-down until after devres release
Memory is incorrectly freed using the direct ops, as dma_map_ops = NULL. Oops... After reversing the order of the calls to arch_teardown_dma_ops() and devres_release_all(), dma_map_ops is still valid, and the DMA memory is now released using __iommu_free_attrs(): +sata_rcar ee30.sata: dmam_release:32: size 2048 vaddr ff8012145000 dma_handle 0x0xf000 attrs 0x0 +sata_rcar ee30.sata: dma_free_attrs:289: size 2048, ops = iommu_dma_ops +sata_rcar ee30.sata: dma_free_attrs:311: calling __iommu_free_attrs() --- drivers/base/dd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 8ac10af17c0043a3..d62487d024559620 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -968,9 +968,9 @@ static void __device_release_driver(struct device *dev, struct device *parent) drv->remove(dev); device_links_driver_cleanup(dev); - arch_teardown_dma_ops(dev); devres_release_all(dev); + arch_teardown_dma_ops(dev); dev->driver = NULL; Hi guys, Could there still be the same problem in the error path of really_probe(): static int really_probe(struct device *dev, struct device_driver *drv) { [...] goto done; probe_failed: arch_teardown_dma_ops(dev); dma_failed: if (dev->bus) blocking_notifier_call_chain(&dev->bus->p->bus_notifier, BUS_NOTIFY_DRIVER_NOT_BOUND, dev); pinctrl_bind_failed: device_links_no_driver(dev); devres_release_all(dev); driver_sysfs_remove(dev); dev->driver = NULL; dev_set_drvdata(dev, NULL); We seem to be able to call arch_teardown_dma_ops() prior to devres_release_all() if we reach probe_failed label. We have seen this crash when our driver probe fails on a dev branch based on v5.1-rc1: [ 87.896707] hisi_sas_v3_hw :74:02.0: Adding to iommu group 2 [ 87.909765] scsi host1: hisi_sas_v3_hw [ 89.127958] hisi_sas_v3_hw :74:02.0: evaluate _DSM failed [ 89.134043] BUG: Bad page state in process swapper/0 pfn:313f5 [ 89.139965] page:7ec4fd40 count:1 mapcount:0 mapping: index:0x0 [ 89.147960] flags: 0xfffe0001000(reserved) [ 89.152398] raw: 0fffe0001000 7ec4fd48 7ec4fd48 [ 89.160130] raw: 0001 [ 89.167861] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set [ 89.174290] bad because of flags: 0x1000(reserved) [ 89.179070] Modules linked in: [ 89.182117] CPU: 49 PID: 1 Comm: swapper/0 Not tainted 5.1.0-rc1-43081-g22d97fd-dirty #1433 [ 89.190453] Hardware name: Huawei D06/D06, BIOS Hisilicon D06 UEFI RC0 - V1.12.01 01/29/2019 [ 89.198876] Call trace: [ 89.201316] dump_backtrace+0x0/0x118 [ 89.204965] show_stack+0x14/0x1c [ 89.208272] dump_stack+0xa4/0xc8 [ 89.211576] bad_page+0xe4/0x13c [ 89.214791] free_pages_check_bad+0x4c/0xc0 [ 89.218961] __free_pages_ok+0x30c/0x340 [ 89.222871] __free_pages+0x30/0x44 [ 89.226347] __dma_direct_free_pages+0x30/0x38 [ 89.230777] dma_direct_free+0x24/0x38 [ 89.234513] dma_free_attrs+0x9c/0xd8 [ 89.238161] dmam_release+0x20/0x28 [ 89.241640] release_nodes+0x17c/0x220 [ 89.245375] devres_release_all+0x34/0x54 [ 89.249371] really_probe+0xc4/0x2c8 [ 89.252933] driver_probe_device+0x58/0xfc [ 89.257016] device_driver_attach+0x68/0x70 [ 89.261185] __driver_attach+0x94/0xdc [ 89.264921] bus_for_each_dev+0x5c/0xb4 [ 89.268744] driver_attach+0x20/0x28 [ 89.272306] bus_add_driver+0x14c/0x200 [ 89.276128] driver_register+0x6c/0x124 [ 89.279953] __pci_register_driver+0x48/0x50 [ 89.284213] sas_v3_pci_driver_init+0x20/0x28 [ 89.288557] do_one_initcall+0x40/0x25c [ 89.292381] kernel_init_freeable+0x2b8/0x3c0 [ 89.296727] kernel_init+0x10/0x100 [ 89.300202] ret_from_fork+0x10/0x18 [ 89.303773] Disabling lock debugging due to kernel taint [ 89.309076] BUG: Bad page state in process swapper/0 pfn:313f6 [ 89.314988] page:7ec4fd80 count:1 mapcount:0 mapping: index:0x0 [ 89.322983] flags: 0xfffe0001000(reserved) [ 89.327417] raw: 0fffe0001000 7ec4fd88 7ec4fd88 [ 89.335149] raw: 0001 Thanks, John dev_set_drvdata(dev, NULL); if (dev->pm_domain && dev->pm_domain->dismiss) ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH/RFC] driver core: Postpone DMA tear-down until after devres release
On Thu, Mar 07, 2019 at 02:52:55PM +, Robin Murphy wrote: > Hi John, > > On 07/03/2019 14:45, John Garry wrote: > [...] > > Hi guys, > > > > Any idea what happened to this fix? > > It's been in -next for a while (commit 376991db4b64) - I assume it will land > shortly and hit stable thereafter, at which point somebody gets to sort out > the manual backport past 4.20. It's already in Linus's tree, I'll go try to do the stable backporting now... greg k-h ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH/RFC] driver core: Postpone DMA tear-down until after devres release
On 07/03/2019 14:52, Robin Murphy wrote: Hi John, Hi Robin, ok, fine. It's a pain bisecting another rmmod issue with it... Cheers, On 07/03/2019 14:45, John Garry wrote: [...] Hi guys, Any idea what happened to this fix? It's been in -next for a while (commit 376991db4b64) - I assume it will land shortly and hit stable thereafter, at which point somebody gets to sort out the manual backport past 4.20. Robin. I have this on 5.0: [0.00] Linux version 5.0.0 (john@htsatcamb-server) (gcc version 4.8.5 (Linaro GCC 4.8-2015.06)) #121 SMP PREEMPT Thu Mar 7 14:28:39 GMT 2019 [0.00] Kernel command line: BOOT_IMAGE=/john/Image rdinit=/init crashkernel=256M@32M earlycon console=ttyAMA0,115200 acpi=force pcie_aspm=off scsi_mod.use_blk_mq=y no_console_suspend pcie-hisi.disable=1 ... [ 26.806856] pci_bus 000c:20: 2-byte config write to 000c:20:00.0 offset 0x44 may corrupt adjacent RW1C bits [ 26.817521] pcieport 0002:f8:00.0: can't derive routing for PCI INT B [ 26.837167] pci_bus 000c:20: 2-byte config write to 000c:20:00.0 offset 0x44 may corrupt adjacent RW1C bits [ 26.850091] serial 0002:f9:00.1: PCI INT B: no GSI [ 26.879364] serial 0002:f9:00.1: enabling device ( -> 0003) [ 26.913131] 0002:f9:00.1: ttyS3 at I/O 0x1008 (irq = 0, base_baud = 115200) is a ST16650V2 [ 26.992897] rtc-efi rtc-efi: setting system clock to 2019-03-07T14:41:48 UTC (1551969708) [ 27.009380] ALSA device list: [ 27.015326] No soundcards found. [ 27.022549] Freeing unused kernel memory: 1216K [ 27.055567] Run /init as init process root@(none)$ cd /sys/devices/platform/HISI0162:01 root@(none)$ echo HISI0162:01 > driver/unbind [ 36.488040] hisi_sas_v2_hw HISI0162:01: dev[9:1] is gone [ 36.561077] hisi_sas_v2_hw HISI0162:01: dev[8:1] is gone [ 36.621061] hisi_sas_v2_hw HISI0162:01: dev[7:1] is gone [ 36.693074] hisi_sas_v2_hw HISI0162:01: dev[6:1] is gone [ 36.753066] hisi_sas_v2_hw HISI0162:01: dev[5:1] is gone [ 36.764276] sd 0:0:3:0: [sdd] Synchronizing SCSI cache [ 36.821106] hisi_sas_v2_hw HISI0162:01: dev[4:1] is gone [ 36.889048] hisi_sas_v2_hw HISI0162:01: dev[3:1] is gone [ 36.993002] hisi_sas_v2_hw HISI0162:01: dev[1:1] is gone [ 37.004276] sd 0:0:0:0: [sda] Synchronizing SCSI cache [ 37.014768] sd 0:0:0:0: [sda] Stopping disk [ 37.709094] hisi_sas_v2_hw HISI0162:01: dev[2:5] is gone [ 37.721231] hisi_sas_v2_hw HISI0162:01: dev[0:2] is gone [ 37.803774] BUG: Bad page state in process sh pfn:11356 [ 37.81] page:7e44d580 count:1 mapcount:0 mapping: index:0x0 [ 37.830525] flags: 0xfffc0001000(reserved) [ 37.839443] raw: 0fffc0001000 7e44d588 7e44d588 [ 37.854998] raw: 0001 [ 37.870552] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set [ 37.883485] bad because of flags: 0x1000(reserved) [ 37.893098] Modules linked in: [ 37.899221] CPU: 5 PID: 2691 Comm: sh Not tainted 5.0.0 #121 [ 37.910578] Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018 [ 37.928924] Call trace: [ 37.933825] dump_backtrace+0x0/0x150 [ 37.941166] show_stack+0x14/0x1c [ 37.947808] dump_stack+0x8c/0xb0 [ 37.954451] bad_page+0xe4/0x144 [ 37.960918] free_pages_check_bad+0x7c/0x84 [ 37.969307] __free_pages_ok+0x284/0x290 [ 37.977173] __free_pages+0x30/0x44 [ 37.984163] __dma_direct_free_pages+0x68/0x6c [ 37.993076] dma_direct_free+0x24/0x38 [ 38.000591] dma_free_attrs+0x84/0xc0 [ 38.007930] dmam_release+0x20/0x28 [ 38.014924] release_nodes+0x128/0x1f8 [ 38.022439] devres_release_all+0x34/0x4c [ 38.030478] device_release_driver_internal+0x190/0x208 [ 38.040963] device_release_driver+0x14/0x1c [ 38.049526] unbind_store+0xbc/0xf4 [ 38.056517] drv_attr_store+0x20/0x30 [ 38.063859] sysfs_kf_write+0x44/0x4c [ 38.071200] kernfs_fop_write+0xd0/0x1c4 [ 38.079065] __vfs_write+0x2c/0x158 [ 38.086055] vfs_write+0xa8/0x19c [ 38.092696] ksys_write+0x44/0xa0 [ 38.099338] __arm64_sys_write+0x1c/0x24 [ 38.107203] el0_svc_common+0xb0/0x100 [ 38.114718] el0_svc_handler+0x70/0x88 [ 38.122232] el0_svc+0x8/0x7c0 [ 38.128356] Disabling lock debugging due to kernel taint [ 38.139019] BUG: Bad page state in process sh pfn:11355 [ 38.149682] page:7e44d540 count:0 mapcount:0 mapping: index:0x0 [ 38.165760] flags: 0xfffc0001000(reserved) [ 38.174676] raw: 0fffc0001000 7e44d548 7e44d548 [ 38.190230] raw: [ 38.205783] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set [ 38.218716] bad because of flags: 0x1000(reserved) [ 38.228329] Modules linked in: [ 38.234451] CPU: 5 PID: 2691 Comm: sh Tainted: GB 5.0.0 #121 [ 38.248604] Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D
Re: [PATCH/RFC] driver core: Postpone DMA tear-down until after devres release
Hi John, On 07/03/2019 14:45, John Garry wrote: [...] Hi guys, Any idea what happened to this fix? It's been in -next for a while (commit 376991db4b64) - I assume it will land shortly and hit stable thereafter, at which point somebody gets to sort out the manual backport past 4.20. Robin. I have this on 5.0: [ 0.00] Linux version 5.0.0 (john@htsatcamb-server) (gcc version 4.8.5 (Linaro GCC 4.8-2015.06)) #121 SMP PREEMPT Thu Mar 7 14:28:39 GMT 2019 [ 0.00] Kernel command line: BOOT_IMAGE=/john/Image rdinit=/init crashkernel=256M@32M earlycon console=ttyAMA0,115200 acpi=force pcie_aspm=off scsi_mod.use_blk_mq=y no_console_suspend pcie-hisi.disable=1 ... [ 26.806856] pci_bus 000c:20: 2-byte config write to 000c:20:00.0 offset 0x44 may corrupt adjacent RW1C bits [ 26.817521] pcieport 0002:f8:00.0: can't derive routing for PCI INT B [ 26.837167] pci_bus 000c:20: 2-byte config write to 000c:20:00.0 offset 0x44 may corrupt adjacent RW1C bits [ 26.850091] serial 0002:f9:00.1: PCI INT B: no GSI [ 26.879364] serial 0002:f9:00.1: enabling device ( -> 0003) [ 26.913131] 0002:f9:00.1: ttyS3 at I/O 0x1008 (irq = 0, base_baud = 115200) is a ST16650V2 [ 26.992897] rtc-efi rtc-efi: setting system clock to 2019-03-07T14:41:48 UTC (1551969708) [ 27.009380] ALSA device list: [ 27.015326] No soundcards found. [ 27.022549] Freeing unused kernel memory: 1216K [ 27.055567] Run /init as init process root@(none)$ cd /sys/devices/platform/HISI0162:01 root@(none)$ echo HISI0162:01 > driver/unbind [ 36.488040] hisi_sas_v2_hw HISI0162:01: dev[9:1] is gone [ 36.561077] hisi_sas_v2_hw HISI0162:01: dev[8:1] is gone [ 36.621061] hisi_sas_v2_hw HISI0162:01: dev[7:1] is gone [ 36.693074] hisi_sas_v2_hw HISI0162:01: dev[6:1] is gone [ 36.753066] hisi_sas_v2_hw HISI0162:01: dev[5:1] is gone [ 36.764276] sd 0:0:3:0: [sdd] Synchronizing SCSI cache [ 36.821106] hisi_sas_v2_hw HISI0162:01: dev[4:1] is gone [ 36.889048] hisi_sas_v2_hw HISI0162:01: dev[3:1] is gone [ 36.993002] hisi_sas_v2_hw HISI0162:01: dev[1:1] is gone [ 37.004276] sd 0:0:0:0: [sda] Synchronizing SCSI cache [ 37.014768] sd 0:0:0:0: [sda] Stopping disk [ 37.709094] hisi_sas_v2_hw HISI0162:01: dev[2:5] is gone [ 37.721231] hisi_sas_v2_hw HISI0162:01: dev[0:2] is gone [ 37.803774] BUG: Bad page state in process sh pfn:11356 [ 37.81] page:7e44d580 count:1 mapcount:0 mapping: index:0x0 [ 37.830525] flags: 0xfffc0001000(reserved) [ 37.839443] raw: 0fffc0001000 7e44d588 7e44d588 [ 37.854998] raw: 0001 [ 37.870552] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set [ 37.883485] bad because of flags: 0x1000(reserved) [ 37.893098] Modules linked in: [ 37.899221] CPU: 5 PID: 2691 Comm: sh Not tainted 5.0.0 #121 [ 37.910578] Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018 [ 37.928924] Call trace: [ 37.933825] dump_backtrace+0x0/0x150 [ 37.941166] show_stack+0x14/0x1c [ 37.947808] dump_stack+0x8c/0xb0 [ 37.954451] bad_page+0xe4/0x144 [ 37.960918] free_pages_check_bad+0x7c/0x84 [ 37.969307] __free_pages_ok+0x284/0x290 [ 37.977173] __free_pages+0x30/0x44 [ 37.984163] __dma_direct_free_pages+0x68/0x6c [ 37.993076] dma_direct_free+0x24/0x38 [ 38.000591] dma_free_attrs+0x84/0xc0 [ 38.007930] dmam_release+0x20/0x28 [ 38.014924] release_nodes+0x128/0x1f8 [ 38.022439] devres_release_all+0x34/0x4c [ 38.030478] device_release_driver_internal+0x190/0x208 [ 38.040963] device_release_driver+0x14/0x1c [ 38.049526] unbind_store+0xbc/0xf4 [ 38.056517] drv_attr_store+0x20/0x30 [ 38.063859] sysfs_kf_write+0x44/0x4c [ 38.071200] kernfs_fop_write+0xd0/0x1c4 [ 38.079065] __vfs_write+0x2c/0x158 [ 38.086055] vfs_write+0xa8/0x19c [ 38.092696] ksys_write+0x44/0xa0 [ 38.099338] __arm64_sys_write+0x1c/0x24 [ 38.107203] el0_svc_common+0xb0/0x100 [ 38.114718] el0_svc_handler+0x70/0x88 [ 38.122232] el0_svc+0x8/0x7c0 [ 38.128356] Disabling lock debugging due to kernel taint [ 38.139019] BUG: Bad page state in process sh pfn:11355 [ 38.149682] page:7e44d540 count:0 mapcount:0 mapping: index:0x0 [ 38.165760] flags: 0xfffc0001000(reserved) [ 38.174676] raw: 0fffc0001000 7e44d548 7e44d548 [ 38.190230] raw: [ 38.205783] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set [ 38.218716] bad because of flags: 0x1000(reserved) [ 38.228329] Modules linked in: [ 38.234451] CPU: 5 PID: 2691 Comm: sh Tainted: G B 5.0.0 #121 [ 38.248604] Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018 [ 38.266949] Call trace: [ 38.271844] dump_backtrace+0x0/0x150 [
Re: [PATCH/RFC] driver core: Postpone DMA tear-down until after devres release
On 11/02/2019 10:22, Robin Murphy wrote: On 08/02/2019 18:55, Geert Uytterhoeven wrote: Hi Robin, On Fri, Feb 8, 2019 at 6:55 PM Robin Murphy wrote: On 08/02/2019 16:40, Joerg Roedel wrote: On Thu, Feb 07, 2019 at 08:36:53PM +0100, Geert Uytterhoeven wrote: diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 8ac10af17c0043a3..d62487d024559620 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -968,9 +968,9 @@ static void __device_release_driver(struct device *dev, struct device *parent) drv->remove(dev); device_links_driver_cleanup(dev); -arch_teardown_dma_ops(dev); devres_release_all(dev); +arch_teardown_dma_ops(dev); dev->driver = NULL; dev_set_drvdata(dev, NULL); if (dev->pm_domain && dev->pm_domain->dismiss) Thanks for the fix! Should it also be tagged for stable and get a Fixes FTR, Greg has added it to driver-core-testing, with a CC to stable. So I see, great! tag? I know it only triggers with a fix in v5.0-rc, but still... I think so: Fixes: 09515ef5ddad ("of/acpi: Configure dma operations at probe time for platform/amba/pci bus devices") Thanks! It won't backport cleanly due to commit dc3c05504d38849f ("dma-mapping: remove dma_deconfigure") in v4.20, though. Ah yes - backports beyond that should simply be a case of moving the dma_deconfigure() wrapper in the same manner. Hi guys, Any idea what happened to this fix? I have this on 5.0: [0.00] Linux version 5.0.0 (john@htsatcamb-server) (gcc version 4.8.5 (Linaro GCC 4.8-2015.06)) #121 SMP PREEMPT Thu Mar 7 14:28:39 GMT 2019 [0.00] Kernel command line: BOOT_IMAGE=/john/Image rdinit=/init crashkernel=256M@32M earlycon console=ttyAMA0,115200 acpi=force pcie_aspm=off scsi_mod.use_blk_mq=y no_console_suspend pcie-hisi.disable=1 ... [ 26.806856] pci_bus 000c:20: 2-byte config write to 000c:20:00.0 offset 0x44 may corrupt adjacent RW1C bits [ 26.817521] pcieport 0002:f8:00.0: can't derive routing for PCI INT B [ 26.837167] pci_bus 000c:20: 2-byte config write to 000c:20:00.0 offset 0x44 may corrupt adjacent RW1C bits [ 26.850091] serial 0002:f9:00.1: PCI INT B: no GSI [ 26.879364] serial 0002:f9:00.1: enabling device ( -> 0003) [ 26.913131] 0002:f9:00.1: ttyS3 at I/O 0x1008 (irq = 0, base_baud = 115200) is a ST16650V2 [ 26.992897] rtc-efi rtc-efi: setting system clock to 2019-03-07T14:41:48 UTC (1551969708) [ 27.009380] ALSA device list: [ 27.015326] No soundcards found. [ 27.022549] Freeing unused kernel memory: 1216K [ 27.055567] Run /init as init process root@(none)$ cd /sys/devices/platform/HISI0162:01 root@(none)$ echo HISI0162:01 > driver/unbind [ 36.488040] hisi_sas_v2_hw HISI0162:01: dev[9:1] is gone [ 36.561077] hisi_sas_v2_hw HISI0162:01: dev[8:1] is gone [ 36.621061] hisi_sas_v2_hw HISI0162:01: dev[7:1] is gone [ 36.693074] hisi_sas_v2_hw HISI0162:01: dev[6:1] is gone [ 36.753066] hisi_sas_v2_hw HISI0162:01: dev[5:1] is gone [ 36.764276] sd 0:0:3:0: [sdd] Synchronizing SCSI cache [ 36.821106] hisi_sas_v2_hw HISI0162:01: dev[4:1] is gone [ 36.889048] hisi_sas_v2_hw HISI0162:01: dev[3:1] is gone [ 36.993002] hisi_sas_v2_hw HISI0162:01: dev[1:1] is gone [ 37.004276] sd 0:0:0:0: [sda] Synchronizing SCSI cache [ 37.014768] sd 0:0:0:0: [sda] Stopping disk [ 37.709094] hisi_sas_v2_hw HISI0162:01: dev[2:5] is gone [ 37.721231] hisi_sas_v2_hw HISI0162:01: dev[0:2] is gone [ 37.803774] BUG: Bad page state in process sh pfn:11356 [ 37.81] page:7e44d580 count:1 mapcount:0 mapping: index:0x0 [ 37.830525] flags: 0xfffc0001000(reserved) [ 37.839443] raw: 0fffc0001000 7e44d588 7e44d588 [ 37.854998] raw: 0001 [ 37.870552] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set [ 37.883485] bad because of flags: 0x1000(reserved) [ 37.893098] Modules linked in: [ 37.899221] CPU: 5 PID: 2691 Comm: sh Not tainted 5.0.0 #121 [ 37.910578] Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018 [ 37.928924] Call trace: [ 37.933825] dump_backtrace+0x0/0x150 [ 37.941166] show_stack+0x14/0x1c [ 37.947808] dump_stack+0x8c/0xb0 [ 37.954451] bad_page+0xe4/0x144 [ 37.960918] free_pages_check_bad+0x7c/0x84 [ 37.969307] __free_pages_ok+0x284/0x290 [ 37.977173] __free_pages+0x30/0x44 [ 37.984163] __dma_direct_free_pages+0x68/0x6c [ 37.993076] dma_direct_free+0x24/0x38 [ 38.000591] dma_free_attrs+0x84/0xc0 [ 38.007930] dmam_release+0x20/0x28 [ 38.014924] release_nodes+0x128/0x1f8 [ 38.022439] devres_release_all+0x34/0x4c [ 38.030478] device_release_driver_internal+0x190/0x208 [ 38.040963] device_release_driver+0x14/0x1c [ 38.049526] unbind_store+0xbc/0xf4 [ 38.056517] drv_attr_store+0x20/0x30 [ 38
Re: [PATCH/RFC] driver core: Postpone DMA tear-down until after devres release
On 08/02/2019 18:55, Geert Uytterhoeven wrote: Hi Robin, On Fri, Feb 8, 2019 at 6:55 PM Robin Murphy wrote: On 08/02/2019 16:40, Joerg Roedel wrote: On Thu, Feb 07, 2019 at 08:36:53PM +0100, Geert Uytterhoeven wrote: diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 8ac10af17c0043a3..d62487d024559620 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -968,9 +968,9 @@ static void __device_release_driver(struct device *dev, struct device *parent) drv->remove(dev); device_links_driver_cleanup(dev); -arch_teardown_dma_ops(dev); devres_release_all(dev); +arch_teardown_dma_ops(dev); dev->driver = NULL; dev_set_drvdata(dev, NULL); if (dev->pm_domain && dev->pm_domain->dismiss) Thanks for the fix! Should it also be tagged for stable and get a Fixes FTR, Greg has added it to driver-core-testing, with a CC to stable. So I see, great! tag? I know it only triggers with a fix in v5.0-rc, but still... I think so: Fixes: 09515ef5ddad ("of/acpi: Configure dma operations at probe time for platform/amba/pci bus devices") Thanks! It won't backport cleanly due to commit dc3c05504d38849f ("dma-mapping: remove dma_deconfigure") in v4.20, though. Ah yes - backports beyond that should simply be a case of moving the dma_deconfigure() wrapper in the same manner. Thanks, Robin. There aren't many drivers using dmam_alloc_*(), let alone which would also find themselves behind an IOMMU on an Arm system, but it turns out I actually have another one which can reproduce the BUG() with 5.0-rc. SATA core uses dmam_alloc_*(). I've tried a 4.12 kernel with a bit of instrumentation[1] and sure enough the devres-managed buffer is freed with the wrong ops[2] even then. How it manages not to blow up more catastrophically I have no idea... I guess at best it just leaks the buffers and IOMMU mappings, and at worst quietly frees random other pages instead. May depend on the actual ops, and whether CMA is used or not. Gr{oetje,eeting}s, Geert ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH/RFC] driver core: Postpone DMA tear-down until after devres release
Hi Robin, On Fri, Feb 8, 2019 at 6:55 PM Robin Murphy wrote: > On 08/02/2019 16:40, Joerg Roedel wrote: > > On Thu, Feb 07, 2019 at 08:36:53PM +0100, Geert Uytterhoeven wrote: > >> diff --git a/drivers/base/dd.c b/drivers/base/dd.c > >> index 8ac10af17c0043a3..d62487d024559620 100644 > >> --- a/drivers/base/dd.c > >> +++ b/drivers/base/dd.c > >> @@ -968,9 +968,9 @@ static void __device_release_driver(struct device > >> *dev, struct device *parent) > >> drv->remove(dev); > >> > >> device_links_driver_cleanup(dev); > >> -arch_teardown_dma_ops(dev); > >> > >> devres_release_all(dev); > >> +arch_teardown_dma_ops(dev); > >> dev->driver = NULL; > >> dev_set_drvdata(dev, NULL); > >> if (dev->pm_domain && dev->pm_domain->dismiss) > > > > Thanks for the fix! Should it also be tagged for stable and get a Fixes FTR, Greg has added it to driver-core-testing, with a CC to stable. > > tag? I know it only triggers with a fix in v5.0-rc, but still... > > I think so: > > Fixes: 09515ef5ddad ("of/acpi: Configure dma operations at probe time > for platform/amba/pci bus devices") Thanks! It won't backport cleanly due to commit dc3c05504d38849f ("dma-mapping: remove dma_deconfigure") in v4.20, though. > There aren't many drivers using dmam_alloc_*(), let alone which would > also find themselves behind an IOMMU on an Arm system, but it turns out > I actually have another one which can reproduce the BUG() with 5.0-rc. SATA core uses dmam_alloc_*(). > I've tried a 4.12 kernel with a bit of instrumentation[1] and sure > enough the devres-managed buffer is freed with the wrong ops[2] even > then. How it manages not to blow up more catastrophically I have no > idea... I guess at best it just leaks the buffers and IOMMU mappings, > and at worst quietly frees random other pages instead. May depend on the actual ops, and whether CMA is used or not. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH/RFC] driver core: Postpone DMA tear-down until after devres release
On 08/02/2019 16:40, Joerg Roedel wrote: Hi Geert, On Thu, Feb 07, 2019 at 08:36:53PM +0100, Geert Uytterhoeven wrote: diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 8ac10af17c0043a3..d62487d024559620 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -968,9 +968,9 @@ static void __device_release_driver(struct device *dev, struct device *parent) drv->remove(dev); device_links_driver_cleanup(dev); - arch_teardown_dma_ops(dev); devres_release_all(dev); + arch_teardown_dma_ops(dev); dev->driver = NULL; dev_set_drvdata(dev, NULL); if (dev->pm_domain && dev->pm_domain->dismiss) Thanks for the fix! Should it also be tagged for stable and get a Fixes tag? I know it only triggers with a fix in v5.0-rc, but still... I think so: Fixes: 09515ef5ddad ("of/acpi: Configure dma operations at probe time for platform/amba/pci bus devices") There aren't many drivers using dmam_alloc_*(), let alone which would also find themselves behind an IOMMU on an Arm system, but it turns out I actually have another one which can reproduce the BUG() with 5.0-rc. I've tried a 4.12 kernel with a bit of instrumentation[1] and sure enough the devres-managed buffer is freed with the wrong ops[2] even then. How it manages not to blow up more catastrophically I have no idea... I guess at best it just leaks the buffers and IOMMU mappings, and at worst quietly frees random other pages instead. Robin. -- [1] diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h index 4f3eecedca2d..f4dbaa5598e3 100644 --- a/include/linux/dma-mapping.h +++ b/include/linux/dma-mapping.h @@ -491,6 +491,7 @@ static inline void *dma_alloc_attrs(struct device *dev, size_t size, return NULL; cpu_addr = ops->alloc(dev, size, dma_handle, flag, attrs); + dev_info(dev, "alloc %lx %lx\n", (unsigned long)cpu_addr, (unsigned long)ops); debug_dma_alloc_coherent(dev, size, *dma_handle, cpu_addr); return cpu_addr; } @@ -512,6 +513,7 @@ static inline void dma_free_attrs(struct device *dev, size_t size, debug_dma_free_coherent(dev, size, cpu_addr, dma_handle); ops->free(dev, size, cpu_addr, dma_handle, attrs); + dev_info(dev, "free %lx %lx\n", (unsigned long)cpu_addr, (unsigned long)ops); } static inline void *dma_alloc_coherent(struct device *dev, size_t size, - [2] / # echo ':03:00.0' > /sys/bus/pci/drivers/sata_sil24/bind [ 107.417252] sata_sil24 :03:00.0: alloc 0a6f9000 089b8090 [ 107.424397] sata_sil24 :03:00.0: alloc 0a719000 089b8090 [ 107.432216] scsi host0: sata_sil24 [ 107.436134] scsi host1: sata_sil24 [ 107.439853] ata7: SATA max UDMA/100 host m128@0x50084000 port 0x5008 irq 51 [ 107.447228] ata8: SATA max UDMA/100 host m128@0x50084000 port 0x50082000 irq 51 / # echo ':03:00.0' > /sys/bus/pci/drivers/sata_sil24/unbind ... [ 112.048654] sata_sil24 :03:00.0: free 0a719000 089b8120 [ 112.055579] sata_sil24 :03:00.0: free 0a6f9000 089b8120 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH/RFC] driver core: Postpone DMA tear-down until after devres release
Hi Geert, On Thu, Feb 07, 2019 at 08:36:53PM +0100, Geert Uytterhoeven wrote: > diff --git a/drivers/base/dd.c b/drivers/base/dd.c > index 8ac10af17c0043a3..d62487d024559620 100644 > --- a/drivers/base/dd.c > +++ b/drivers/base/dd.c > @@ -968,9 +968,9 @@ static void __device_release_driver(struct device *dev, > struct device *parent) > drv->remove(dev); > > device_links_driver_cleanup(dev); > - arch_teardown_dma_ops(dev); > > devres_release_all(dev); > + arch_teardown_dma_ops(dev); > dev->driver = NULL; > dev_set_drvdata(dev, NULL); > if (dev->pm_domain && dev->pm_domain->dismiss) Thanks for the fix! Should it also be tagged for stable and get a Fixes tag? I know it only triggers with a fix in v5.0-rc, but still... Regards, Joerg ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH/RFC] driver core: Postpone DMA tear-down until after devres release
On 07/02/2019 19:36, Geert Uytterhoeven wrote: When unbinding the (IOMMU-enabled) R-Car SATA device on Salvator-XS (R-Car H3 ES2.0), in preparation of rebinding against vfio-platform for device pass-through for virtualization: echo ee30.sata > /sys/bus/platform/drivers/sata_rcar/unbind the kernel crashes with: Unable to handle kernel paging request at virtual address ffbf029c Mem abort info: ESR = 0x9606 Exception class = DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x0006 CM = 0, WnR = 0 swapper pgtable: 4k pages, 39-bit VAs, pgdp = 7e8c586c [ffbf029c] pgd=00073bfc6003, pud=00073bfc6003, pmd= Internal error: Oops: 9606 [#1] SMP Modules linked in: CPU: 0 PID: 1098 Comm: bash Not tainted 5.0.0-rc5-salvator-x-00452-g37596f884f4318ef #287 Hardware name: Renesas Salvator-X 2nd version board based on r8a7795 ES2.0+ (DT) pstate: 6045 (nZCv daif +PAN -UAO) pc : __free_pages+0x8/0x58 lr : __dma_direct_free_pages+0x50/0x5c sp : ff801268baa0 x29: ff801268baa0 x28: x27: ffc6f9c60bf0 x26: ffc6f9c60bf0 x25: ffc6f9c60810 x24: x23: f000 x22: ff8012145000 x21: 0800 x20: ffbf029fffc8 x19: x18: ffc6f86c42c8 x17: x16: 0070 x15: 0003 x14: x13: ff801103d7f8 x12: 0028 x11: ff807604 x10: 9ad8 x9 : ff80110126d0 x8 : ffc6f7563000 x7 : 6b6b6b6b6b6b6b6b x6 : 0018 x5 : ff8011cf3cc8 x4 : 4000 x3 : 0008 x2 : 0001 x1 : x0 : ffbf029fffc8 Process bash (pid: 1098, stack limit = 0xc38e3e32) Call trace: __free_pages+0x8/0x58 __dma_direct_free_pages+0x50/0x5c arch_dma_free+0x1c/0x98 dma_direct_free+0x14/0x24 dma_free_attrs+0x9c/0xdc dmam_release+0x18/0x20 release_nodes+0x25c/0x28c devres_release_all+0x48/0x4c device_release_driver_internal+0x184/0x1f0 device_release_driver+0x14/0x1c unbind_store+0x70/0xb8 drv_attr_store+0x24/0x34 sysfs_kf_write+0x4c/0x64 kernfs_fop_write+0x154/0x1c4 __vfs_write+0x34/0x164 vfs_write+0xb4/0x16c ksys_write+0x5c/0xbc __arm64_sys_write+0x14/0x1c el0_svc_common+0x98/0x114 el0_svc_handler+0x1c/0x24 el0_svc+0x8/0xc Code: d51b4234 17fa a9bf7bfd 910003fd (b9403404) ---[ end trace 8c564cdd3a1a840f ]--- While I've bisected this to commit e8e683ae9a736407 ("iommu/of: Fix probe-deferral"), and reverting that commit on post-v5.0-rc4 kernels does fix the problem, this turned out to be a red herring. On arm64, arch_teardown_dma_ops() resets dev->dma_ops to NULL. Hence if a driver has used a managed DMA allocation API, the allocated DMA memory will be freed using the direct DMA ops, while it may have been allocated using a custom DMA ops (iommu_dma_ops in this case). Fix this by reversing the order of the calls to devres_release_all() and arch_teardown_dma_ops(). Signed-off-by: Geert Uytterhoeven --- Question: Is this safe on arm32, which calls arm_teardown_iommu_dma_ops() instead of resetting dev->dma_ops? Should be - the principle is the same, and even if it did break that would only be indicative of a separate bug being hidden by this one. This fix looks entirely valid and correct to me: Reviewed-by: Robin Murphy --- Adding some debug code, and comparing before/after commit e8e683ae9a736407: sata_rcar ee30.sata: of_iommu_configure:227: err = -517 -sata_rcar ee30.sata: of_iommu_configure:230: calling iommu_probe_device() -sata_rcar ee30.sata: iommu_probe_device:122: Calling ipmmu_add_device -sata_rcar ee30.sata: ipmmu_add_device:893 -sata_rcar ee30.sata: of_iommu_configure:232: iommu_probe_device() returned -19 -sata_rcar ee30.sata: dma_alloc_attrs:257: size 2048, ops = (null) -sata_rcar ee30.sata: __dma_direct_alloc_pages:104: size 4096 -sata_rcar ee30.sata: __dma_direct_alloc_pages:115: trying dma_alloc_from_contiguous() -sata_rcar ee30.sata: dma_alloc_from_contiguous:202: cma_alloc(1) returned page ffbf00d20e00 -sata_rcar ee30.sata: dma_alloc_attrs:271: allocated using dma_direct_alloc() -sata_rcar ee30.sata: dmam_alloc_attrs:98: size 2048 vaddr ff8011e95000 dma_handle 0x0x7c04 attrs 0x0 -scsi host0: sata_rcar -ata1: SATA max UDMA/133 irq 172 -ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) DMA memory used to be allocated from CMA, dma_map_ops = NULL. The SATA driver was probed and initialized. +sata_
Re: [PATCH/RFC] driver core: Postpone DMA tear-down until after devres release
On Thu, Feb 7, 2019 at 8:42 PM Geert Uytterhoeven wrote: > > When unbinding the (IOMMU-enabled) R-Car SATA device on Salvator-XS > (R-Car H3 ES2.0), in preparation of rebinding against vfio-platform for > device pass-through for virtualization: > > echo ee30.sata > /sys/bus/platform/drivers/sata_rcar/unbind > > the kernel crashes with: > > Unable to handle kernel paging request at virtual address ffbf029c > Mem abort info: > ESR = 0x9606 > Exception class = DABT (current EL), IL = 32 bits > SET = 0, FnV = 0 > EA = 0, S1PTW = 0 > Data abort info: > ISV = 0, ISS = 0x0006 > CM = 0, WnR = 0 > swapper pgtable: 4k pages, 39-bit VAs, pgdp = 7e8c586c > [ffbf029c] pgd=00073bfc6003, pud=00073bfc6003, > pmd= > Internal error: Oops: 9606 [#1] SMP > Modules linked in: > CPU: 0 PID: 1098 Comm: bash Not tainted > 5.0.0-rc5-salvator-x-00452-g37596f884f4318ef #287 > Hardware name: Renesas Salvator-X 2nd version board based on r8a7795 > ES2.0+ (DT) > pstate: 6045 (nZCv daif +PAN -UAO) > pc : __free_pages+0x8/0x58 > lr : __dma_direct_free_pages+0x50/0x5c > sp : ff801268baa0 > x29: ff801268baa0 x28: > x27: ffc6f9c60bf0 x26: ffc6f9c60bf0 > x25: ffc6f9c60810 x24: > x23: f000 x22: ff8012145000 > x21: 0800 x20: ffbf029fffc8 > x19: x18: ffc6f86c42c8 > x17: x16: 0070 > x15: 0003 x14: > x13: ff801103d7f8 x12: 0028 > x11: ff807604 x10: 9ad8 > x9 : ff80110126d0 x8 : ffc6f7563000 > x7 : 6b6b6b6b6b6b6b6b x6 : 0018 > x5 : ff8011cf3cc8 x4 : 4000 > x3 : 0008 x2 : 0001 > x1 : x0 : ffbf029fffc8 > Process bash (pid: 1098, stack limit = 0xc38e3e32) > Call trace: > __free_pages+0x8/0x58 > __dma_direct_free_pages+0x50/0x5c > arch_dma_free+0x1c/0x98 > dma_direct_free+0x14/0x24 > dma_free_attrs+0x9c/0xdc > dmam_release+0x18/0x20 > release_nodes+0x25c/0x28c > devres_release_all+0x48/0x4c > device_release_driver_internal+0x184/0x1f0 > device_release_driver+0x14/0x1c > unbind_store+0x70/0xb8 > drv_attr_store+0x24/0x34 > sysfs_kf_write+0x4c/0x64 > kernfs_fop_write+0x154/0x1c4 > __vfs_write+0x34/0x164 > vfs_write+0xb4/0x16c > ksys_write+0x5c/0xbc > __arm64_sys_write+0x14/0x1c > el0_svc_common+0x98/0x114 > el0_svc_handler+0x1c/0x24 > el0_svc+0x8/0xc > Code: d51b4234 17fa a9bf7bfd 910003fd (b9403404) > ---[ end trace 8c564cdd3a1a840f ]--- > > While I've bisected this to commit e8e683ae9a736407 ("iommu/of: Fix > probe-deferral"), and reverting that commit on post-v5.0-rc4 kernels > does fix the problem, this turned out to be a red herring. > > On arm64, arch_teardown_dma_ops() resets dev->dma_ops to NULL. > Hence if a driver has used a managed DMA allocation API, the allocated > DMA memory will be freed using the direct DMA ops, while it may have > been allocated using a custom DMA ops (iommu_dma_ops in this case). > > Fix this by reversing the order of the calls to devres_release_all() and > arch_teardown_dma_ops(). > > Signed-off-by: Geert Uytterhoeven Reviewed-by: Rafael J. Wysocki > --- > Question: > Is this safe on arm32, which calls arm_teardown_iommu_dma_ops() instead > of resetting dev->dma_ops? > > --- > Adding some debug code, and comparing before/after commit > e8e683ae9a736407: > > sata_rcar ee30.sata: of_iommu_configure:227: err = -517 > -sata_rcar ee30.sata: of_iommu_configure:230: calling > iommu_probe_device() > -sata_rcar ee30.sata: iommu_probe_device:122: Calling ipmmu_add_device > -sata_rcar ee30.sata: ipmmu_add_device:893 > -sata_rcar ee30.sata: of_iommu_configure:232: iommu_probe_device() > returned -19 > -sata_rcar ee30.sata: dma_alloc_attrs:257: size 2048, ops = > (null) > -sata_rcar ee30.sata: __dma_direct_alloc_pages:104: size 4096 > -sata_rcar ee30.sata: __dma_direct_alloc_pages:115: trying > dma_alloc_from_contiguous() > -sata_rcar ee30.sata: dma_alloc_from_contiguous:202: cma_alloc(1) > returned page ffbf00d20e00 > -sata_rcar ee30.sata: dma_alloc_attrs:271: allocated using > dma_direct_alloc() > -sata_rcar ee30.sata: dmam_alloc_attrs:98: size 2048 vaddr > ff8011e95000 dma_handle 0x0x7c04 attrs 0x0 > -scsi host0: sata_rcar > -ata1: SATA max UDMA/133 irq 172 > -ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) > > DMA memory used to be allocated from CMA, dma_map_ops = NULL. > The SATA driver was probed and initialized. > > +sata_rcar ee30.sata: of_io
Re: [PATCH/RFC] driver core: Postpone DMA tear-down until after devres release
Yes, we should not reset the DMA ops before releasing all resources: Acked-by: Christoph Hellwig ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH/RFC] driver core: Postpone DMA tear-down until after devres release
When unbinding the (IOMMU-enabled) R-Car SATA device on Salvator-XS (R-Car H3 ES2.0), in preparation of rebinding against vfio-platform for device pass-through for virtualization: echo ee30.sata > /sys/bus/platform/drivers/sata_rcar/unbind the kernel crashes with: Unable to handle kernel paging request at virtual address ffbf029c Mem abort info: ESR = 0x9606 Exception class = DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x0006 CM = 0, WnR = 0 swapper pgtable: 4k pages, 39-bit VAs, pgdp = 7e8c586c [ffbf029c] pgd=00073bfc6003, pud=00073bfc6003, pmd= Internal error: Oops: 9606 [#1] SMP Modules linked in: CPU: 0 PID: 1098 Comm: bash Not tainted 5.0.0-rc5-salvator-x-00452-g37596f884f4318ef #287 Hardware name: Renesas Salvator-X 2nd version board based on r8a7795 ES2.0+ (DT) pstate: 6045 (nZCv daif +PAN -UAO) pc : __free_pages+0x8/0x58 lr : __dma_direct_free_pages+0x50/0x5c sp : ff801268baa0 x29: ff801268baa0 x28: x27: ffc6f9c60bf0 x26: ffc6f9c60bf0 x25: ffc6f9c60810 x24: x23: f000 x22: ff8012145000 x21: 0800 x20: ffbf029fffc8 x19: x18: ffc6f86c42c8 x17: x16: 0070 x15: 0003 x14: x13: ff801103d7f8 x12: 0028 x11: ff807604 x10: 9ad8 x9 : ff80110126d0 x8 : ffc6f7563000 x7 : 6b6b6b6b6b6b6b6b x6 : 0018 x5 : ff8011cf3cc8 x4 : 4000 x3 : 0008 x2 : 0001 x1 : x0 : ffbf029fffc8 Process bash (pid: 1098, stack limit = 0xc38e3e32) Call trace: __free_pages+0x8/0x58 __dma_direct_free_pages+0x50/0x5c arch_dma_free+0x1c/0x98 dma_direct_free+0x14/0x24 dma_free_attrs+0x9c/0xdc dmam_release+0x18/0x20 release_nodes+0x25c/0x28c devres_release_all+0x48/0x4c device_release_driver_internal+0x184/0x1f0 device_release_driver+0x14/0x1c unbind_store+0x70/0xb8 drv_attr_store+0x24/0x34 sysfs_kf_write+0x4c/0x64 kernfs_fop_write+0x154/0x1c4 __vfs_write+0x34/0x164 vfs_write+0xb4/0x16c ksys_write+0x5c/0xbc __arm64_sys_write+0x14/0x1c el0_svc_common+0x98/0x114 el0_svc_handler+0x1c/0x24 el0_svc+0x8/0xc Code: d51b4234 17fa a9bf7bfd 910003fd (b9403404) ---[ end trace 8c564cdd3a1a840f ]--- While I've bisected this to commit e8e683ae9a736407 ("iommu/of: Fix probe-deferral"), and reverting that commit on post-v5.0-rc4 kernels does fix the problem, this turned out to be a red herring. On arm64, arch_teardown_dma_ops() resets dev->dma_ops to NULL. Hence if a driver has used a managed DMA allocation API, the allocated DMA memory will be freed using the direct DMA ops, while it may have been allocated using a custom DMA ops (iommu_dma_ops in this case). Fix this by reversing the order of the calls to devres_release_all() and arch_teardown_dma_ops(). Signed-off-by: Geert Uytterhoeven --- Question: Is this safe on arm32, which calls arm_teardown_iommu_dma_ops() instead of resetting dev->dma_ops? --- Adding some debug code, and comparing before/after commit e8e683ae9a736407: sata_rcar ee30.sata: of_iommu_configure:227: err = -517 -sata_rcar ee30.sata: of_iommu_configure:230: calling iommu_probe_device() -sata_rcar ee30.sata: iommu_probe_device:122: Calling ipmmu_add_device -sata_rcar ee30.sata: ipmmu_add_device:893 -sata_rcar ee30.sata: of_iommu_configure:232: iommu_probe_device() returned -19 -sata_rcar ee30.sata: dma_alloc_attrs:257: size 2048, ops = (null) -sata_rcar ee30.sata: __dma_direct_alloc_pages:104: size 4096 -sata_rcar ee30.sata: __dma_direct_alloc_pages:115: trying dma_alloc_from_contiguous() -sata_rcar ee30.sata: dma_alloc_from_contiguous:202: cma_alloc(1) returned page ffbf00d20e00 -sata_rcar ee30.sata: dma_alloc_attrs:271: allocated using dma_direct_alloc() -sata_rcar ee30.sata: dmam_alloc_attrs:98: size 2048 vaddr ff8011e95000 dma_handle 0x0x7c04 attrs 0x0 -scsi host0: sata_rcar -ata1: SATA max UDMA/133 irq 172 -ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) DMA memory used to be allocated from CMA, dma_map_ops = NULL. The SATA driver was probed and initialized. +sata_rcar ee30.sata: of_iommu_configure:234: skipping iommu_probe_device() SATA driver initialization is now deferred. Later, it is retried: +sata_rcar ee30.sata: of_iommu_configure:227: err = 0 +sata_rcar ee30.sata: of_iommu_configure:230: calling iommu_probe_device() +sata_rcar ee30.sata: iommu_probe_device:122: Cal