Re: [PATCH v1 0/3] driver core: Set fw_devlink=on take II
On Wed, Mar 3, 2021 at 2:21 AM Michael Walle wrote: > > Am 2021-03-03 10:28, schrieb Saravana Kannan: > > On Wed, Mar 3, 2021 at 12:59 AM Michael Walle wrote: > >> > >> Am 2021-03-02 23:47, schrieb Saravana Kannan: > >> > On Tue, Mar 2, 2021 at 2:42 PM Saravana Kannan > >> > wrote: > >> >> > >> >> On Tue, Mar 2, 2021 at 2:24 PM Michael Walle wrote: > >> >> > > >> >> > Am 2021-03-02 22:11, schrieb Saravana Kannan: > >> >> > > I think Patch 1 should fix [4] without [5]. Can you test the series > >> >> > > please? > >> >> > > >> >> > Mh, I'm on latest linux-next (next-20210302) and I've applied patch > >> >> > 3/3 > >> >> > and > >> >> > reverted commit 7007b745a508 ("PCI: layerscape: Convert to > >> >> > builtin_platform_driver()"). I'd assumed that PCIe shouldn't be > >> >> > working, > >> >> > right? But it is. Did I miss something? > >> >> > >> >> You need to revert [5]. > >> > > >> > My bad. You did revert it. Ah... I wonder if it was due to > >> > fw_devlink.strict that I added. To break PCI again, also set > >> > fw_devlink.strict=1 in the kernel command line. > >> > >> Indeed, adding fw_devlink.strict=1 will break PCI again. But if > >> I then apply 1/3 and 2/3 again, PCI is still broken. Just to be clear: > >> I'm keeping the fw_devlink.strict=1 parameter. > > > > Thanks for your testing! I assume you are also setting fw_devlink=on? > > I've applied patch 3/3 and added nothing to the commandline, so yes. > > > Hmmm... ok. In the working case, does your PCI probe before IOMMU? If > > yes, then your results make sense. > > Yes that was the conclusion last time. That the probe is deferred and > the __init section is already discarded when there might a second > try of the probe. Long response below, but the TL;DR is: The real fix for your case was the implementation of fw_devlink.strict and NOT Patch 1 of this series. So, sorry for wasting your test effort. During the earlier debugging (for take I), this is what I thought: With fw_devlink=permissive, your boot sequence was (Case 1): 1. IOMMU probe 2. PCI builtin_platform_driver_probe() attempt - Driver core sets up PCI with IOMMU - PCI probe succeeds. - PCI works with IOMMU. < Remember this point. And with fw_devlink=on, I thought the IOMMU probe order was unnecessarily changed and caused this (Case 2): 1. IOMMU probe reordered for some reason to be attempted before its suppliers. Gets deferred. 2. PCI probe attempt - fw_devlink + device links defers the probe because IOMMU isn't ready. - builtin_platform_driver_probe() replaces drv->probe with platform_probe_fail() 3. IOMMU deferred probe succeeds eventually. 4. PCI deferred probe is attempted - platform_probe_fail() which is a stub just returns -ENXIO And if this was the case, patch 1 in this series would have fixed it by removing unnecessary reordering of probes. But what was really happening was (after I went through your logs again and looked at the code): With fw_devlink=permissive, your boot sequence was really (Case 3): 1. PCI builtin_platform_driver_probe() attempt - Driver core does NOT set up PCI with IOMMU - PCI probe succeeds. - PCI works without IOMMU. < Remember this point. 2. IOMMU probes And with fw_devlink=on what was happening was (Case 4): 1. PCI builtin_platform_driver_probe() attempt - fw_devlink + device links defers the probe because it thinks IOMMU is mandatory and isn't ready. - builtin_platform_driver_probe() replaces drv->probe with platform_probe_fail() 2. IOMMU probes. 3. PCI deferred probe is attempted - platform_probe_fail() which is a stub just returns -ENXIO 4. PCI is broken now. In your case IOMMU is not mandatory and PCI works without IOMMU even when fw_devlink=off/permissive. So the real fix for your case is the addition of fw_devlink.strict and defaulting it to 0. Because of my misunderstanding of your case, I didn't realize I already fixed your case and I thought Patch 1 in this series would fix your case. Patch 1 in this series is still important for other reasons, just not for you. > So I guess, Patch 1/3 and Patch 2/3 doesn't fix that and the drivers > still need to be converted to builtin_platform_driver(), right? So there is no real issue between fw_devlink=on and builtin_platform_driver_probe() anymore. At least none that I know of or has been reported. If you really want your PCI to work _with_ IOMMU, then builtin_platform_driver_probe() is wrong even with fw_devlink=off. And if you wanted PCI to work with IOMMU support and fw_devlink wasn't available, you'll have to play initcall chicken with the IOMMU driver or implement some IOMMU check + deferred probing in your PCI probe function. However, with fw_devlink=on, all you have to do is set fw_devlink=on and fw_devlink.strict=1 and use builtin_platform_driver() and not have to care about initcall orders or figure out how to defer when IOMMU isn't ready yet. -Saravana
Re: [PATCH v1 0/3] driver core: Set fw_devlink=on take II
On Wed, Mar 3, 2021 at 2:03 AM Geert Uytterhoeven wrote: > > Hi Saravana, > > On Wed, Mar 3, 2021 at 10:24 AM Saravana Kannan wrote: > > On Wed, Mar 3, 2021 at 1:22 AM Geert Uytterhoeven > > wrote: > > > On Tue, Mar 2, 2021 at 10:11 PM Saravana Kannan > > > wrote: > > > > This series fixes the last few remaining issues reported when > > > > fw_devlink=on > > > > by default. > > > > > > [...] > > > > > > Thanks for your series! > > > > > > > Geert/Marek, > > > > > > > > As far as I know, there shouldn't have any more issues you reported that > > > > are still left unfixed after this series. Please correct me if I'm > > > > wrong or > > > > if you find new issues. > > > > > > While this fixes the core support, there may still be driver fixes left > > > that were not developed in time for the v5.12-rc1 merge window. > > > Personally, I'm aware of "soc: renesas: rmobile-sysc: Mark fwnode > > > when PM domain is added", which I have queued for v5.13[1]. > > > There may be other fixes for other platforms. > > > > Right, I intended this series for 5.13. Is that what you are trying to say > > too? > > OK, v5.13 is fine for me. > It wasn't clear to me if you intended (the last patch of) this series to > be merged for v5.12-rcX or v5.13. The entire series is meant for 5.13. I don't want to land the Patch 1/3 in 5.12 in case it causes some regression. And 2/3 isn't urgent. -Saravana > > Gr{oetje,eeting}s, > > Geert > > -- > Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- > ge...@linux-m68k.org > > In personal conversations with technical people, I call myself a hacker. But > when I'm talking to journalists I just say "programmer" or something like > that. > -- Linus Torvalds
Re: [PATCH v1 0/3] driver core: Set fw_devlink=on take II
Am 2021-03-03 10:28, schrieb Saravana Kannan: On Wed, Mar 3, 2021 at 12:59 AM Michael Walle wrote: Am 2021-03-02 23:47, schrieb Saravana Kannan: > On Tue, Mar 2, 2021 at 2:42 PM Saravana Kannan > wrote: >> >> On Tue, Mar 2, 2021 at 2:24 PM Michael Walle wrote: >> > >> > Am 2021-03-02 22:11, schrieb Saravana Kannan: >> > > I think Patch 1 should fix [4] without [5]. Can you test the series >> > > please? >> > >> > Mh, I'm on latest linux-next (next-20210302) and I've applied patch 3/3 >> > and >> > reverted commit 7007b745a508 ("PCI: layerscape: Convert to >> > builtin_platform_driver()"). I'd assumed that PCIe shouldn't be working, >> > right? But it is. Did I miss something? >> >> You need to revert [5]. > > My bad. You did revert it. Ah... I wonder if it was due to > fw_devlink.strict that I added. To break PCI again, also set > fw_devlink.strict=1 in the kernel command line. Indeed, adding fw_devlink.strict=1 will break PCI again. But if I then apply 1/3 and 2/3 again, PCI is still broken. Just to be clear: I'm keeping the fw_devlink.strict=1 parameter. Thanks for your testing! I assume you are also setting fw_devlink=on? I've applied patch 3/3 and added nothing to the commandline, so yes. Hmmm... ok. In the working case, does your PCI probe before IOMMU? If yes, then your results make sense. Yes that was the conclusion last time. That the probe is deferred and the __init section is already discarded when there might a second try of the probe. So I guess, Patch 1/3 and Patch 2/3 doesn't fix that and the drivers still need to be converted to builtin_platform_driver(), right? If your PCI does probe after IOMMU and uses IOMMU, then I'm not sure what else could be changing the order of the device probing. In any case, glad that the default case works and we have a fix merged even for .strict=1. -Saravana -michael
Re: [PATCH v1 0/3] driver core: Set fw_devlink=on take II
Hi Saravana, On Wed, Mar 3, 2021 at 10:24 AM Saravana Kannan wrote: > On Wed, Mar 3, 2021 at 1:22 AM Geert Uytterhoeven > wrote: > > On Tue, Mar 2, 2021 at 10:11 PM Saravana Kannan > > wrote: > > > This series fixes the last few remaining issues reported when > > > fw_devlink=on > > > by default. > > > > [...] > > > > Thanks for your series! > > > > > Geert/Marek, > > > > > > As far as I know, there shouldn't have any more issues you reported that > > > are still left unfixed after this series. Please correct me if I'm wrong > > > or > > > if you find new issues. > > > > While this fixes the core support, there may still be driver fixes left > > that were not developed in time for the v5.12-rc1 merge window. > > Personally, I'm aware of "soc: renesas: rmobile-sysc: Mark fwnode > > when PM domain is added", which I have queued for v5.13[1]. > > There may be other fixes for other platforms. > > Right, I intended this series for 5.13. Is that what you are trying to say > too? OK, v5.13 is fine for me. It wasn't clear to me if you intended (the last patch of) this series to be merged for v5.12-rcX or v5.13. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: [PATCH v1 0/3] driver core: Set fw_devlink=on take II
On Wed, Mar 3, 2021 at 1:22 AM Geert Uytterhoeven wrote: > > Hi Saravana, > > On Tue, Mar 2, 2021 at 10:11 PM Saravana Kannan wrote: > > This series fixes the last few remaining issues reported when fw_devlink=on > > by default. > > [...] > > Thanks for your series! > > > Geert/Marek, > > > > As far as I know, there shouldn't have any more issues you reported that > > are still left unfixed after this series. Please correct me if I'm wrong or > > if you find new issues. > > While this fixes the core support, there may still be driver fixes left > that were not developed in time for the v5.12-rc1 merge window. > Personally, I'm aware of "soc: renesas: rmobile-sysc: Mark fwnode > when PM domain is added", which I have queued for v5.13[1]. > There may be other fixes for other platforms. Right, I intended this series for 5.13. Is that what you are trying to say too? -Saravana > > [1] > https://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel.git/commit/?h=renesas-drivers-for-v5.13&id=fb13bbd6c90ee4fb983c0e9a341bd2832a3857cf > > Gr{oetje,eeting}s, > > Geert > > -- > Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- > ge...@linux-m68k.org > > In personal conversations with technical people, I call myself a hacker. But > when I'm talking to journalists I just say "programmer" or something like > that. > -- Linus Torvalds
Re: [PATCH v1 0/3] driver core: Set fw_devlink=on take II
Am 2021-03-02 23:47, schrieb Saravana Kannan: On Tue, Mar 2, 2021 at 2:42 PM Saravana Kannan wrote: On Tue, Mar 2, 2021 at 2:24 PM Michael Walle wrote: > > Am 2021-03-02 22:11, schrieb Saravana Kannan: > > I think Patch 1 should fix [4] without [5]. Can you test the series > > please? > > Mh, I'm on latest linux-next (next-20210302) and I've applied patch 3/3 > and > reverted commit 7007b745a508 ("PCI: layerscape: Convert to > builtin_platform_driver()"). I'd assumed that PCIe shouldn't be working, > right? But it is. Did I miss something? You need to revert [5]. My bad. You did revert it. Ah... I wonder if it was due to fw_devlink.strict that I added. To break PCI again, also set fw_devlink.strict=1 in the kernel command line. Indeed, adding fw_devlink.strict=1 will break PCI again. But if I then apply 1/3 and 2/3 again, PCI is still broken. Just to be clear: I'm keeping the fw_devlink.strict=1 parameter. -michael
Re: [PATCH v1 0/3] driver core: Set fw_devlink=on take II
Hi Saravana, On Tue, Mar 2, 2021 at 10:11 PM Saravana Kannan wrote: > This series fixes the last few remaining issues reported when fw_devlink=on > by default. [...] Thanks for your series! > Geert/Marek, > > As far as I know, there shouldn't have any more issues you reported that > are still left unfixed after this series. Please correct me if I'm wrong or > if you find new issues. While this fixes the core support, there may still be driver fixes left that were not developed in time for the v5.12-rc1 merge window. Personally, I'm aware of "soc: renesas: rmobile-sysc: Mark fwnode when PM domain is added", which I have queued for v5.13[1]. There may be other fixes for other platforms. [1] https://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel.git/commit/?h=renesas-drivers-for-v5.13&id=fb13bbd6c90ee4fb983c0e9a341bd2832a3857cf Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: [PATCH v1 0/3] driver core: Set fw_devlink=on take II
On Wed, Mar 3, 2021 at 12:59 AM Michael Walle wrote: > > Am 2021-03-02 23:47, schrieb Saravana Kannan: > > On Tue, Mar 2, 2021 at 2:42 PM Saravana Kannan > > wrote: > >> > >> On Tue, Mar 2, 2021 at 2:24 PM Michael Walle wrote: > >> > > >> > Am 2021-03-02 22:11, schrieb Saravana Kannan: > >> > > I think Patch 1 should fix [4] without [5]. Can you test the series > >> > > please? > >> > > >> > Mh, I'm on latest linux-next (next-20210302) and I've applied patch 3/3 > >> > and > >> > reverted commit 7007b745a508 ("PCI: layerscape: Convert to > >> > builtin_platform_driver()"). I'd assumed that PCIe shouldn't be working, > >> > right? But it is. Did I miss something? > >> > >> You need to revert [5]. > > > > My bad. You did revert it. Ah... I wonder if it was due to > > fw_devlink.strict that I added. To break PCI again, also set > > fw_devlink.strict=1 in the kernel command line. > > Indeed, adding fw_devlink.strict=1 will break PCI again. But if > I then apply 1/3 and 2/3 again, PCI is still broken. Just to be clear: > I'm keeping the fw_devlink.strict=1 parameter. Thanks for your testing! I assume you are also setting fw_devlink=on? Hmmm... ok. In the working case, does your PCI probe before IOMMU? If yes, then your results make sense. If your PCI does probe after IOMMU and uses IOMMU, then I'm not sure what else could be changing the order of the device probing. In any case, glad that the default case works and we have a fix merged even for .strict=1. -Saravana
Re: [PATCH v1 0/3] driver core: Set fw_devlink=on take II
On Tue, Mar 2, 2021 at 2:42 PM Saravana Kannan wrote: > > On Tue, Mar 2, 2021 at 2:24 PM Michael Walle wrote: > > > > Am 2021-03-02 22:11, schrieb Saravana Kannan: > > > I think Patch 1 should fix [4] without [5]. Can you test the series > > > please? > > > > Mh, I'm on latest linux-next (next-20210302) and I've applied patch 3/3 > > and > > reverted commit 7007b745a508 ("PCI: layerscape: Convert to > > builtin_platform_driver()"). I'd assumed that PCIe shouldn't be working, > > right? But it is. Did I miss something? > > You need to revert [5]. My bad. You did revert it. Ah... I wonder if it was due to fw_devlink.strict that I added. To break PCI again, also set fw_devlink.strict=1 in the kernel command line. -Saravana
Re: [PATCH v1 0/3] driver core: Set fw_devlink=on take II
On Tue, Mar 2, 2021 at 2:24 PM Michael Walle wrote: > > Am 2021-03-02 22:11, schrieb Saravana Kannan: > > I think Patch 1 should fix [4] without [5]. Can you test the series > > please? > > Mh, I'm on latest linux-next (next-20210302) and I've applied patch 3/3 > and > reverted commit 7007b745a508 ("PCI: layerscape: Convert to > builtin_platform_driver()"). I'd assumed that PCIe shouldn't be working, > right? But it is. Did I miss something? You need to revert [5]. -Saravana > > Anyway, I've also applied Patch 1/3 and 2/3 and it still works. But I > guess that doesn't say much. > > -michael
Re: [PATCH v1 0/3] driver core: Set fw_devlink=on take II
Am 2021-03-02 22:11, schrieb Saravana Kannan: I think Patch 1 should fix [4] without [5]. Can you test the series please? Mh, I'm on latest linux-next (next-20210302) and I've applied patch 3/3 and reverted commit 7007b745a508 ("PCI: layerscape: Convert to builtin_platform_driver()"). I'd assumed that PCIe shouldn't be working, right? But it is. Did I miss something? Anyway, I've also applied Patch 1/3 and 2/3 and it still works. But I guess that doesn't say much. -michael
[PATCH v1 0/3] driver core: Set fw_devlink=on take II
This series fixes the last few remaining issues reported when fw_devlink=on by default. Patch 1 is just [6] pulled in without changes into this series. It reduces some unnecessary probe reordering caused by a combination of fw_devlink and existing device link code. This fixes some issue caused by fw_devlink=on with respect to DMAs and IOMMUs [1]. Patch 2 fixes a warning [2] present in code unrelated to fw_devlink. It was just exposed by fw_devlink. Jon, Patch 2 should address the issues you reported[2] even without [3]. Could you test this series please? Michael, I think Patch 1 should fix [4] without [5]. Can you test the series please? Geert/Marek, As far as I know, there shouldn't have any more issues you reported that are still left unfixed after this series. Please correct me if I'm wrong or if you find new issues. [1] - https://lore.kernel.org/lkml/camuhmduvvr8jes51_8_ypoicr-nwad_2nklyukwey8mbxx9...@mail.gmail.com/ [2] - https://lore.kernel.org/lkml/56f7d032-ba5a-a8c7-23de-2969d98c5...@nvidia.com/ [3] - https://lore.kernel.org/lkml/5176f496-facb-d7b0-9f4e-a9e4b8974...@nvidia.com/ [4] - https://lore.kernel.org/lkml/4b9ae679b6f76d2f7e340e2ec229d...@walle.cc/ [5] - https://lore.kernel.org/lkml/20210120105246.23218-1-mich...@walle.cc/ [6] - https://lore.kernel.org/lkml/20210217235130.1744843-1-sarava...@google.com/ Cc: Michael Walle Cc: Jon Hunter Cc: Marek Szyprowski Cc: Geert Uytterhoeven Cc: Guenter Roeck Thanks, Saravana Saravana Kannan (3): driver core: Avoid pointless deferred probe attempts driver core: Update device link status properly for device_bind_driver() Revert "Revert "driver core: Set fw_devlink=on by default"" drivers/base/base.h| 1 + drivers/base/core.c| 37 - drivers/base/dd.c | 10 +- include/linux/device.h | 4 4 files changed, 50 insertions(+), 2 deletions(-) -- 2.30.1.766.gb4fecdf3b7-goog