Re: [PATCH v1 0/2] Make fw_devlink=on more forgiving

2021-02-02 Thread Saravana Kannan
On Tue, Feb 2, 2021 at 12:12 AM Marek Szyprowski
 wrote:
>
> Hi Saravana,
>
> On 01.02.2021 10:02, Saravana Kannan wrote:
> > On Mon, Feb 1, 2021 at 12:05 AM Marek Szyprowski
> >  wrote:
> >> On 30.01.2021 05:08, Saravana Kannan wrote:
> >>> On Fri, Jan 29, 2021 at 8:03 PM Saravana Kannan  
> >>> wrote:
>  This patch series solves two general issues with fw_devlink=on
> 
>  Patch 1/2 addresses the issue of firmware nodes that look like they'll
>  have struct devices created for them, but will never actually have
>  struct devices added for them. For example, DT nodes with a compatible
>  property that don't have devices added for them.
> 
>  Patch 2/2 address (for static kernels) the issue of optional suppliers
>  that'll never have a driver registered for them. So, if the device could
>  have probed with fw_devlink=permissive with a static kernel, this patch
>  should allow those devices to probe with a fw_devlink=on. This doesn't
>  solve it for the case where modules are enabled because there's no way
>  to tell if a driver will never be registered or it's just about to be
>  registered. I have some other ideas for that, but it'll have to come
>  later thinking about it a bit.
> 
>  These two patches might remove the need for several other patches that
>  went in as fixes for commit e590474768f1 ("driver core: Set
>  fw_devlink=on by default"), but I think all those fixes are good
>  changes. So I think we should leave those in.
> 
>  Marek, Geert,
> 
>  Can you try this series on a static kernel with your OF_POPULATED
>  changes reverted? I just want to make sure these patches can identify
>  and fix those cases.
> 
>  Tudor,
> 
>  You should still make the clock driver fix (because it's a bug), but I
>  think this series will fix your issue too (even without the clock driver
>  fix). Can you please give this a shot?
> >>> Marek, Geert, Tudor,
> >>>
> >>> Forgot to say that this will probably fix your issues only in a static
> >>> kernel. So please try this with a static kernel. If you can also try
> >>> and confirm that this does not fix the issue for a modular kernel,
> >>> that'd be good too.
> >> I've checked those patches on top of linux next-20210129 with
> >> c09a3e6c97f0 ("soc: samsung: pm_domains: Convert to regular platform
> >> driver") commit reverted.
> > Hi Marek,
> >
> > Thanks for testing!
> >
> >> Sadly it doesn't help.
> > That sucks. I even partly "tested" it out on my platform (that needs
> > CONFIG_MODULES) by commenting out the CONFIG_MODULES check. And I saw
> > some device links getting dropped.
>
> Well, my fault. I've missed the fact that I have to disable
> CONFIG_MODULES to let it work. This is not really a fix for my case,
> because the exynos_defconfig has modules enabled (mainly for WiFi and
> media drivers). However disabling the CONFIG_MODULES indeed helped a
> bit. Most of the devices got finally probed. There are only 4 left in
> the deferred_devices list:
>
> sound
> 12e2.sysmmu
> 12d0.hdmi
> 12c1.mixer
>
> The last two (12c1.mixer and 12d0.hdmi) are consumers of the
> 12e2.sysmmu, which is a consumer of the 10023c20.power-domain. That
> power domain in turn is a consumer (child) of another power domain
> (10023c80.power-domain):
>
> # dmesg | grep 10023c20.power-domain
> [0.354435] platform 10023c20.power-domain: Linked as a consumer to
> 10023c80.power-domain
> [0.489573] platform 12d0.hdmi: Linked as a consumer to
> 10023c20.power-domain
> [0.497143] platform 12c1.mixer: Linked as a consumer to
> 10023c20.power-domain
> [0.580874] platform 12e2.sysmmu: Linked as a consumer to
> 10023c20.power-domain
> [0.601655] platform 12e2.sysmmu: probe deferral - supplier
> 10023c20.power-domain not ready
> [2.744884] platform 12c1.mixer: probe deferral - supplier
> 10023c20.power-domain not ready
> [2.766726] platform 12d0.hdmi: probe deferral - supplier
> 10023c20.power-domain not ready
>
> ...
>
> So a dependency chain of 2 power domains is still not resolved properly.
>
> I didn't have time to check what's wrong with the sound node. Simple
> grepping of the messages for the 'sound' string don't give any results.
> The above tests has been done on the Odroid U3 board
> (arch/arm/boot/dts/exynos4412-odroidu3.dts).

Thanks for testing again! This actually gave me valuable info. The
problem is that 10023c20.power-domain (let's call it PD-B) never gets
added to the deferred probe list (because that only happens once a
driver is registered). So, it never gets to drop it's dependency on
10023c20.power-domain (let's call it PD-A). So, once all drivers are
registered and the SMMU checks if it needs to drop the device link to
PD-B, it sees that PD-B is still waiting on suppliers. So it could be
that the PD-B would probe in the future. But PD-B never probes in the
future.

Anyway, I 

Re: [PATCH v1 0/2] Make fw_devlink=on more forgiving

2021-02-02 Thread Saravana Kannan
On Mon, Feb 1, 2021 at 11:55 PM Geert Uytterhoeven  wrote:
>
> Hi Saravana,
>
> On Tue, Feb 2, 2021 at 4:01 AM Saravana Kannan  wrote:
> > On Mon, Feb 1, 2021 at 2:40 AM Geert Uytterhoeven  
> > wrote:
> > > On Sat, Jan 30, 2021 at 5:09 AM Saravana Kannan  
> > > wrote:
> > > > On Fri, Jan 29, 2021 at 8:03 PM Saravana Kannan  
> > > > wrote:
> > > > > This patch series solves two general issues with fw_devlink=on
> > > > >
> > > > > Patch 1/2 addresses the issue of firmware nodes that look like they'll
> > > > > have struct devices created for them, but will never actually have
> > > > > struct devices added for them. For example, DT nodes with a compatible
> > > > > property that don't have devices added for them.
> > > > >
> > > > > Patch 2/2 address (for static kernels) the issue of optional suppliers
> > > > > that'll never have a driver registered for them. So, if the device 
> > > > > could
> > > > > have probed with fw_devlink=permissive with a static kernel, this 
> > > > > patch
> > > > > should allow those devices to probe with a fw_devlink=on. This doesn't
> > > > > solve it for the case where modules are enabled because there's no way
> > > > > to tell if a driver will never be registered or it's just about to be
> > > > > registered. I have some other ideas for that, but it'll have to come
> > > > > later thinking about it a bit.
> > > > >
> > > > > These two patches might remove the need for several other patches that
> > > > > went in as fixes for commit e590474768f1 ("driver core: Set
> > > > > fw_devlink=on by default"), but I think all those fixes are good
> > > > > changes. So I think we should leave those in.
> > > > >
> > > > > Marek, Geert,
> > > > >
> > > > > Can you try this series on a static kernel with your OF_POPULATED
> > > > > changes reverted? I just want to make sure these patches can identify
> > > > > and fix those cases.
> > > > >
> > > > > Tudor,
> > > > >
> > > > > You should still make the clock driver fix (because it's a bug), but I
> > > > > think this series will fix your issue too (even without the clock 
> > > > > driver
> > > > > fix). Can you please give this a shot?
> > > >
> > > > Marek, Geert, Tudor,
> > > >
> > > > Forgot to say that this will probably fix your issues only in a static
> > > > kernel. So please try this with a static kernel. If you can also try
> > > > and confirm that this does not fix the issue for a modular kernel,
> > > > that'd be good too.
> > >
> > > Thanks for your series!
> > >
> > > For the modular case, this series has no impact, as expected (i.e. fails
> > > to boot, no I/O devices probed).
> > > With modules disabled, both r8a7791/koelsch and r8a77951/salvator-xs
> > > seem to boot fine, except for one issue on koelsch:
> >
> > Thanks a lot for testing the series!
> >
> > Regarding the koelsch issue, do you not see it with your OF_POPULATED
> > fix for rcar-sysc driver? But only see if you revert it and use this
> > series?
>
> I've just rechecked, and with fw_devlink=on, and my OF_POPULATED
> fir for rcar-sysc, i2c-demux-pinctrl works, both with modules enabled
> and disabled.

Thanks Geert! My guess is that with your OF_POPULATED changes the
"i2c-parents" of i2c-demux-pinctrl don't get probe deferred and
therefore i2c-demux-pinctrl probes after them and everything goes
well.

I guess that goes to show this series can't be the magic bullet even
with patch 2/3 -- especially for top level DT nodes that never have
devices created.

The other odd thing I noticed is that i2c-demux-pinctrl seems to
return -ENODEV when I think it should do -EPROBE_DEFER. In
i2c_demux_activate_master():

ret = of_changeset_apply(&priv->chan[new_chan].chgset);
if (ret)
goto err;

adap = of_find_i2c_adapter_by_node(priv->chan[new_chan].parent_np);
if (!adap) {
ret = -ENODEV;
goto err_with_revert;
}

If I understand the code correctly, it's assuming the selected parent
will probe successfully as soon as its status=ok change is done. Which
is not guaranteed for many reasons (driver not registered, async
probing, stuff like fw_devlink, etc).

Thanks,
Saravana


Re: [PATCH v1 0/2] Make fw_devlink=on more forgiving

2021-02-02 Thread Marek Szyprowski
Hi Saravana,

On 01.02.2021 10:02, Saravana Kannan wrote:
> On Mon, Feb 1, 2021 at 12:05 AM Marek Szyprowski
>  wrote:
>> On 30.01.2021 05:08, Saravana Kannan wrote:
>>> On Fri, Jan 29, 2021 at 8:03 PM Saravana Kannan  
>>> wrote:
 This patch series solves two general issues with fw_devlink=on

 Patch 1/2 addresses the issue of firmware nodes that look like they'll
 have struct devices created for them, but will never actually have
 struct devices added for them. For example, DT nodes with a compatible
 property that don't have devices added for them.

 Patch 2/2 address (for static kernels) the issue of optional suppliers
 that'll never have a driver registered for them. So, if the device could
 have probed with fw_devlink=permissive with a static kernel, this patch
 should allow those devices to probe with a fw_devlink=on. This doesn't
 solve it for the case where modules are enabled because there's no way
 to tell if a driver will never be registered or it's just about to be
 registered. I have some other ideas for that, but it'll have to come
 later thinking about it a bit.

 These two patches might remove the need for several other patches that
 went in as fixes for commit e590474768f1 ("driver core: Set
 fw_devlink=on by default"), but I think all those fixes are good
 changes. So I think we should leave those in.

 Marek, Geert,

 Can you try this series on a static kernel with your OF_POPULATED
 changes reverted? I just want to make sure these patches can identify
 and fix those cases.

 Tudor,

 You should still make the clock driver fix (because it's a bug), but I
 think this series will fix your issue too (even without the clock driver
 fix). Can you please give this a shot?
>>> Marek, Geert, Tudor,
>>>
>>> Forgot to say that this will probably fix your issues only in a static
>>> kernel. So please try this with a static kernel. If you can also try
>>> and confirm that this does not fix the issue for a modular kernel,
>>> that'd be good too.
>> I've checked those patches on top of linux next-20210129 with
>> c09a3e6c97f0 ("soc: samsung: pm_domains: Convert to regular platform
>> driver") commit reverted.
> Hi Marek,
>
> Thanks for testing!
>
>> Sadly it doesn't help.
> That sucks. I even partly "tested" it out on my platform (that needs
> CONFIG_MODULES) by commenting out the CONFIG_MODULES check. And I saw
> some device links getting dropped.

Well, my fault. I've missed the fact that I have to disable 
CONFIG_MODULES to let it work. This is not really a fix for my case, 
because the exynos_defconfig has modules enabled (mainly for WiFi and 
media drivers). However disabling the CONFIG_MODULES indeed helped a 
bit. Most of the devices got finally probed. There are only 4 left in 
the deferred_devices list:

sound
12e2.sysmmu
12d0.hdmi
12c1.mixer

The last two (12c1.mixer and 12d0.hdmi) are consumers of the 
12e2.sysmmu, which is a consumer of the 10023c20.power-domain. That 
power domain in turn is a consumer (child) of another power domain 
(10023c80.power-domain):

# dmesg | grep 10023c20.power-domain
[    0.354435] platform 10023c20.power-domain: Linked as a consumer to 
10023c80.power-domain
[    0.489573] platform 12d0.hdmi: Linked as a consumer to 
10023c20.power-domain
[    0.497143] platform 12c1.mixer: Linked as a consumer to 
10023c20.power-domain
[    0.580874] platform 12e2.sysmmu: Linked as a consumer to 
10023c20.power-domain
[    0.601655] platform 12e2.sysmmu: probe deferral - supplier 
10023c20.power-domain not ready
[    2.744884] platform 12c1.mixer: probe deferral - supplier 
10023c20.power-domain not ready
[    2.766726] platform 12d0.hdmi: probe deferral - supplier 
10023c20.power-domain not ready

...

So a dependency chain of 2 power domains is still not resolved properly.

I didn't have time to check what's wrong with the sound node. Simple 
grepping of the messages for the 'sound' string don't give any results. 
The above tests has been done on the Odroid U3 board 
(arch/arm/boot/dts/exynos4412-odroidu3.dts).

Best regards

-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland



Re: [PATCH v1 0/2] Make fw_devlink=on more forgiving

2021-02-01 Thread Geert Uytterhoeven
Hi Saravana,

On Tue, Feb 2, 2021 at 4:01 AM Saravana Kannan  wrote:
> On Mon, Feb 1, 2021 at 2:40 AM Geert Uytterhoeven  
> wrote:
> > On Sat, Jan 30, 2021 at 5:09 AM Saravana Kannan  
> > wrote:
> > > On Fri, Jan 29, 2021 at 8:03 PM Saravana Kannan  
> > > wrote:
> > > > This patch series solves two general issues with fw_devlink=on
> > > >
> > > > Patch 1/2 addresses the issue of firmware nodes that look like they'll
> > > > have struct devices created for them, but will never actually have
> > > > struct devices added for them. For example, DT nodes with a compatible
> > > > property that don't have devices added for them.
> > > >
> > > > Patch 2/2 address (for static kernels) the issue of optional suppliers
> > > > that'll never have a driver registered for them. So, if the device could
> > > > have probed with fw_devlink=permissive with a static kernel, this patch
> > > > should allow those devices to probe with a fw_devlink=on. This doesn't
> > > > solve it for the case where modules are enabled because there's no way
> > > > to tell if a driver will never be registered or it's just about to be
> > > > registered. I have some other ideas for that, but it'll have to come
> > > > later thinking about it a bit.
> > > >
> > > > These two patches might remove the need for several other patches that
> > > > went in as fixes for commit e590474768f1 ("driver core: Set
> > > > fw_devlink=on by default"), but I think all those fixes are good
> > > > changes. So I think we should leave those in.
> > > >
> > > > Marek, Geert,
> > > >
> > > > Can you try this series on a static kernel with your OF_POPULATED
> > > > changes reverted? I just want to make sure these patches can identify
> > > > and fix those cases.
> > > >
> > > > Tudor,
> > > >
> > > > You should still make the clock driver fix (because it's a bug), but I
> > > > think this series will fix your issue too (even without the clock driver
> > > > fix). Can you please give this a shot?
> > >
> > > Marek, Geert, Tudor,
> > >
> > > Forgot to say that this will probably fix your issues only in a static
> > > kernel. So please try this with a static kernel. If you can also try
> > > and confirm that this does not fix the issue for a modular kernel,
> > > that'd be good too.
> >
> > Thanks for your series!
> >
> > For the modular case, this series has no impact, as expected (i.e. fails
> > to boot, no I/O devices probed).
> > With modules disabled, both r8a7791/koelsch and r8a77951/salvator-xs
> > seem to boot fine, except for one issue on koelsch:
>
> Thanks a lot for testing the series!
>
> Regarding the koelsch issue, do you not see it with your OF_POPULATED
> fix for rcar-sysc driver? But only see if you revert it and use this
> series?

I've just rechecked, and with fw_devlink=on, and my OF_POPULATED
fir for rcar-sysc, i2c-demux-pinctrl works, both with modules enabled
and disabled.

> > dmesg:
> >
> > +i2c-demux-pinctrl i2c-12: failed to setup demux-adapter 0 (-19)
> > +i2c-demux-pinctrl i2c-13: failed to setup demux-adapter 0 (-19)
> > +i2c-demux-pinctrl i2c-14: failed to setup demux-adapter 0 (-19)
> >
> > -  #0: rsnd-dai.0-ak4642-hifi
> > +  No soundcards found.
> >
> > regulator_summary:
> >
> > -13-0050-vcc   00mA 0mV 0mV
> > -13-0039-dvdd-3v   10mA 0mV 0mV
> > -13-0039-bgvdd 10mA 0mV 0mV
> > -13-0039-pvdd  10mA 0mV 0mV
> > -13-0039-dvdd  10mA 0mV 0mV
> > -13-0039-avdd  10mA 0mV 0mV
> >
> > pm_genpd_summary:
> >
> > -/devices/platform/soc/e6518000.i2c  suspended  0
> > -/devices/platform/soc/e653.i2c  suspended  0
> > -/devices/platform/soc/e652.i2c  suspended  0
> >
> > These are all symptoms of the same issue: i2c buses and devices are not
> > probed, due to the use of the i2c demuxer.
> > I guess the fw_devlink tracker doesn't consider "i2c-parent" links?
>
> No, it doesn't parse "i2c-parent". Ugh... looked at it. It's going to
> be a problem to parse because it requires the parents to be disbled in
> DT and then fixes them up during run time. fw_devlink can handle DT
> overlay changing a specific node, but the problem is that the consumer
> DT node doesn't get changed. So the i2c-parent will first be parsed,
> fw_devlink will notice they are disabled, so it'll ignore them. Then
> those nodes are enabled, but the i2c-parent isn't reparsed because the
> consumer isn't updated.

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH v1 0/2] Make fw_devlink=on more forgiving

2021-02-01 Thread Saravana Kannan
On Mon, Feb 1, 2021 at 2:40 AM Geert Uytterhoeven  wrote:
>
> Hi Saravana,
>
> On Sat, Jan 30, 2021 at 5:09 AM Saravana Kannan  wrote:
> > On Fri, Jan 29, 2021 at 8:03 PM Saravana Kannan  
> > wrote:
> > > This patch series solves two general issues with fw_devlink=on
> > >
> > > Patch 1/2 addresses the issue of firmware nodes that look like they'll
> > > have struct devices created for them, but will never actually have
> > > struct devices added for them. For example, DT nodes with a compatible
> > > property that don't have devices added for them.
> > >
> > > Patch 2/2 address (for static kernels) the issue of optional suppliers
> > > that'll never have a driver registered for them. So, if the device could
> > > have probed with fw_devlink=permissive with a static kernel, this patch
> > > should allow those devices to probe with a fw_devlink=on. This doesn't
> > > solve it for the case where modules are enabled because there's no way
> > > to tell if a driver will never be registered or it's just about to be
> > > registered. I have some other ideas for that, but it'll have to come
> > > later thinking about it a bit.
> > >
> > > These two patches might remove the need for several other patches that
> > > went in as fixes for commit e590474768f1 ("driver core: Set
> > > fw_devlink=on by default"), but I think all those fixes are good
> > > changes. So I think we should leave those in.
> > >
> > > Marek, Geert,
> > >
> > > Can you try this series on a static kernel with your OF_POPULATED
> > > changes reverted? I just want to make sure these patches can identify
> > > and fix those cases.
> > >
> > > Tudor,
> > >
> > > You should still make the clock driver fix (because it's a bug), but I
> > > think this series will fix your issue too (even without the clock driver
> > > fix). Can you please give this a shot?
> >
> > Marek, Geert, Tudor,
> >
> > Forgot to say that this will probably fix your issues only in a static
> > kernel. So please try this with a static kernel. If you can also try
> > and confirm that this does not fix the issue for a modular kernel,
> > that'd be good too.
>
> Thanks for your series!
>
> For the modular case, this series has no impact, as expected (i.e. fails
> to boot, no I/O devices probed).
> With modules disabled, both r8a7791/koelsch and r8a77951/salvator-xs
> seem to boot fine, except for one issue on koelsch:

Thanks a lot for testing the series!

Regarding the koelsch issue, do you not see it with your OF_POPULATED
fix for rcar-sysc driver? But only see if you revert it and use this
series?

>
> dmesg:
>
> +i2c-demux-pinctrl i2c-12: failed to setup demux-adapter 0 (-19)
> +i2c-demux-pinctrl i2c-13: failed to setup demux-adapter 0 (-19)
> +i2c-demux-pinctrl i2c-14: failed to setup demux-adapter 0 (-19)
>
> -  #0: rsnd-dai.0-ak4642-hifi
> +  No soundcards found.
>
> regulator_summary:
>
> -13-0050-vcc   00mA 0mV 0mV
> -13-0039-dvdd-3v   10mA 0mV 0mV
> -13-0039-bgvdd 10mA 0mV 0mV
> -13-0039-pvdd  10mA 0mV 0mV
> -13-0039-dvdd  10mA 0mV 0mV
> -13-0039-avdd  10mA 0mV 0mV
>
> pm_genpd_summary:
>
> -/devices/platform/soc/e6518000.i2c  suspended  0
> -/devices/platform/soc/e653.i2c  suspended  0
> -/devices/platform/soc/e652.i2c  suspended  0
>
> These are all symptoms of the same issue: i2c buses and devices are not
> probed, due to the use of the i2c demuxer.
> I guess the fw_devlink tracker doesn't consider "i2c-parent" links?

No, it doesn't parse "i2c-parent". Ugh... looked at it. It's going to
be a problem to parse because it requires the parents to be disbled in
DT and then fixes them up during run time. fw_devlink can handle DT
overlay changing a specific node, but the problem is that the consumer
DT node doesn't get changed. So the i2c-parent will first be parsed,
fw_devlink will notice they are disabled, so it'll ignore them. Then
those nodes are enabled, but the i2c-parent isn't reparsed because the
consumer isn't updated.

> Note that I only tested this on R-Car Gen2 and Gen3.
> I did not test this on Renesas SH/R-Mobile or RZ/A SoCs.

Thanks for any testing you can do :)

So overall, this series seems to be helping, but doesn't cover 100% of
the cases. So I suppose this is still a useful series. I'll be happy
to take any Tested-by or Reviewed-by.

-Saravana


Re: [PATCH v1 0/2] Make fw_devlink=on more forgiving

2021-02-01 Thread Marc Zyngier

Hi Saravana,

Thanks for this.

On 2021-01-30 04:03, Saravana Kannan wrote:

This patch series solves two general issues with fw_devlink=on

Patch 1/2 addresses the issue of firmware nodes that look like they'll
have struct devices created for them, but will never actually have
struct devices added for them. For example, DT nodes with a compatible
property that don't have devices added for them.

Patch 2/2 address (for static kernels) the issue of optional suppliers
that'll never have a driver registered for them. So, if the device 
could

have probed with fw_devlink=permissive with a static kernel, this patch
should allow those devices to probe with a fw_devlink=on. This doesn't
solve it for the case where modules are enabled because there's no way
to tell if a driver will never be registered or it's just about to be
registered. I have some other ideas for that, but it'll have to come
later thinking about it a bit.

These two patches might remove the need for several other patches that
went in as fixes for commit e590474768f1 ("driver core: Set
fw_devlink=on by default"), but I think all those fixes are good
changes. So I think we should leave those in.

Marek, Geert,

Can you try this series on a static kernel with your OF_POPULATED
changes reverted? I just want to make sure these patches can identify
and fix those cases.

Tudor,

You should still make the clock driver fix (because it's a bug), but I
think this series will fix your issue too (even without the clock 
driver

fix). Can you please give this a shot?

Marc,

Can you try this series with the gpiolib fix reverted please? I'm 
pretty

sure this will fix that case.


Almost. The board boots and behaves as expected, except that a few 
devices

such as the SD card are unusable (probably because the corresponding
suppliers are still not identified as being available:

# find /sys -name waiting_for_supplier| xargs grep .| grep -v :0
/sys/devices/platform/vcc3v0-sd/waiting_for_supplier:1
/sys/devices/platform/vbus-typec/waiting_for_supplier:1
/sys/devices/platform/sdio-pwrseq/waiting_for_supplier:1
/sys/devices/platform/ir-receiver/waiting_for_supplier:1

With the GPIO patch that I reverted, no device is waiting for
a supplier.

Let me know if I can further help.

M.
--
Jazz is not dead. It just smells funny...


Re: [PATCH v1 0/2] Make fw_devlink=on more forgiving

2021-02-01 Thread Geert Uytterhoeven
Hi Saravana,

On Sat, Jan 30, 2021 at 5:09 AM Saravana Kannan  wrote:
> On Fri, Jan 29, 2021 at 8:03 PM Saravana Kannan  wrote:
> > This patch series solves two general issues with fw_devlink=on
> >
> > Patch 1/2 addresses the issue of firmware nodes that look like they'll
> > have struct devices created for them, but will never actually have
> > struct devices added for them. For example, DT nodes with a compatible
> > property that don't have devices added for them.
> >
> > Patch 2/2 address (for static kernels) the issue of optional suppliers
> > that'll never have a driver registered for them. So, if the device could
> > have probed with fw_devlink=permissive with a static kernel, this patch
> > should allow those devices to probe with a fw_devlink=on. This doesn't
> > solve it for the case where modules are enabled because there's no way
> > to tell if a driver will never be registered or it's just about to be
> > registered. I have some other ideas for that, but it'll have to come
> > later thinking about it a bit.
> >
> > These two patches might remove the need for several other patches that
> > went in as fixes for commit e590474768f1 ("driver core: Set
> > fw_devlink=on by default"), but I think all those fixes are good
> > changes. So I think we should leave those in.
> >
> > Marek, Geert,
> >
> > Can you try this series on a static kernel with your OF_POPULATED
> > changes reverted? I just want to make sure these patches can identify
> > and fix those cases.
> >
> > Tudor,
> >
> > You should still make the clock driver fix (because it's a bug), but I
> > think this series will fix your issue too (even without the clock driver
> > fix). Can you please give this a shot?
>
> Marek, Geert, Tudor,
>
> Forgot to say that this will probably fix your issues only in a static
> kernel. So please try this with a static kernel. If you can also try
> and confirm that this does not fix the issue for a modular kernel,
> that'd be good too.

Thanks for your series!

For the modular case, this series has no impact, as expected (i.e. fails
to boot, no I/O devices probed).
With modules disabled, both r8a7791/koelsch and r8a77951/salvator-xs
seem to boot fine, except for one issue on koelsch:

dmesg:

+i2c-demux-pinctrl i2c-12: failed to setup demux-adapter 0 (-19)
+i2c-demux-pinctrl i2c-13: failed to setup demux-adapter 0 (-19)
+i2c-demux-pinctrl i2c-14: failed to setup demux-adapter 0 (-19)

-  #0: rsnd-dai.0-ak4642-hifi
+  No soundcards found.

regulator_summary:

-13-0050-vcc   00mA 0mV 0mV
-13-0039-dvdd-3v   10mA 0mV 0mV
-13-0039-bgvdd 10mA 0mV 0mV
-13-0039-pvdd  10mA 0mV 0mV
-13-0039-dvdd  10mA 0mV 0mV
-13-0039-avdd  10mA 0mV 0mV

pm_genpd_summary:

-/devices/platform/soc/e6518000.i2c  suspended  0
-/devices/platform/soc/e653.i2c  suspended  0
-/devices/platform/soc/e652.i2c  suspended  0

These are all symptoms of the same issue: i2c buses and devices are not
probed, due to the use of the i2c demuxer.
I guess the fw_devlink tracker doesn't consider "i2c-parent" links?

Note that I only tested this on R-Car Gen2 and Gen3.
I did not test this on Renesas SH/R-Mobile or RZ/A SoCs.

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH v1 0/2] Make fw_devlink=on more forgiving

2021-02-01 Thread Saravana Kannan
On Mon, Feb 1, 2021 at 12:05 AM Marek Szyprowski
 wrote:
>
> Hi Saravana,
>
> On 30.01.2021 05:08, Saravana Kannan wrote:
> > On Fri, Jan 29, 2021 at 8:03 PM Saravana Kannan  
> > wrote:
> >> This patch series solves two general issues with fw_devlink=on
> >>
> >> Patch 1/2 addresses the issue of firmware nodes that look like they'll
> >> have struct devices created for them, but will never actually have
> >> struct devices added for them. For example, DT nodes with a compatible
> >> property that don't have devices added for them.
> >>
> >> Patch 2/2 address (for static kernels) the issue of optional suppliers
> >> that'll never have a driver registered for them. So, if the device could
> >> have probed with fw_devlink=permissive with a static kernel, this patch
> >> should allow those devices to probe with a fw_devlink=on. This doesn't
> >> solve it for the case where modules are enabled because there's no way
> >> to tell if a driver will never be registered or it's just about to be
> >> registered. I have some other ideas for that, but it'll have to come
> >> later thinking about it a bit.
> >>
> >> These two patches might remove the need for several other patches that
> >> went in as fixes for commit e590474768f1 ("driver core: Set
> >> fw_devlink=on by default"), but I think all those fixes are good
> >> changes. So I think we should leave those in.
> >>
> >> Marek, Geert,
> >>
> >> Can you try this series on a static kernel with your OF_POPULATED
> >> changes reverted? I just want to make sure these patches can identify
> >> and fix those cases.
> >>
> >> Tudor,
> >>
> >> You should still make the clock driver fix (because it's a bug), but I
> >> think this series will fix your issue too (even without the clock driver
> >> fix). Can you please give this a shot?
> > Marek, Geert, Tudor,
> >
> > Forgot to say that this will probably fix your issues only in a static
> > kernel. So please try this with a static kernel. If you can also try
> > and confirm that this does not fix the issue for a modular kernel,
> > that'd be good too.
>
> I've checked those patches on top of linux next-20210129 with
> c09a3e6c97f0 ("soc: samsung: pm_domains: Convert to regular platform
> driver") commit reverted.

Hi Marek,

Thanks for testing!

> Sadly it doesn't help.

That sucks. I even partly "tested" it out on my platform (that needs
CONFIG_MODULES) by commenting out the CONFIG_MODULES check. And I saw
some device links getting dropped.

> All devices that belong

By belong, I assume you meant "are consumers"?

> to the Exynos power domains are never probed and stay endlessly on the
> deferred devices list. I've used static kernel build - the one from
> exynos_defconfig.

Can you enable the dev_dbg in __device_link_del() (the SRCU variant)?
Hopefully at least some of the device links would be dropped?

If the PD device link is not dropped, I wonder why this condition is
not hitting for consumers of the PD.

if (fw_devlink_def_probe_retry &&
link->flags & DL_FLAG_INFERRED &&
!device_links_probe_blocked_by(link->supplier)) {
device_link_drop_managed(link);
continue;
}

Could you try logging dev, link->supplier and
device_links_probe_blocked_by() return value. That should tell when a
consumer is waiting on a PD, why the PD might appear as waiting on
something else. I can't imagine the DL_FLAG_INFERRED being cleared
(it'll only happen when a driver/framework explicitly creates a device
link). Remind me again where the DT for this board is? Does the PD
depend on something else?

One other possibility is that some of the consumers of the PD could be
using the *_platform_driver_probe() macro/function that never
reattempts a probe. So even though this patch might drop the device
links, the consumer never tries again.

-Saravana


Re: [PATCH v1 0/2] Make fw_devlink=on more forgiving

2021-02-01 Thread Marek Szyprowski
Hi Saravana,

On 30.01.2021 05:08, Saravana Kannan wrote:
> On Fri, Jan 29, 2021 at 8:03 PM Saravana Kannan  wrote:
>> This patch series solves two general issues with fw_devlink=on
>>
>> Patch 1/2 addresses the issue of firmware nodes that look like they'll
>> have struct devices created for them, but will never actually have
>> struct devices added for them. For example, DT nodes with a compatible
>> property that don't have devices added for them.
>>
>> Patch 2/2 address (for static kernels) the issue of optional suppliers
>> that'll never have a driver registered for them. So, if the device could
>> have probed with fw_devlink=permissive with a static kernel, this patch
>> should allow those devices to probe with a fw_devlink=on. This doesn't
>> solve it for the case where modules are enabled because there's no way
>> to tell if a driver will never be registered or it's just about to be
>> registered. I have some other ideas for that, but it'll have to come
>> later thinking about it a bit.
>>
>> These two patches might remove the need for several other patches that
>> went in as fixes for commit e590474768f1 ("driver core: Set
>> fw_devlink=on by default"), but I think all those fixes are good
>> changes. So I think we should leave those in.
>>
>> Marek, Geert,
>>
>> Can you try this series on a static kernel with your OF_POPULATED
>> changes reverted? I just want to make sure these patches can identify
>> and fix those cases.
>>
>> Tudor,
>>
>> You should still make the clock driver fix (because it's a bug), but I
>> think this series will fix your issue too (even without the clock driver
>> fix). Can you please give this a shot?
> Marek, Geert, Tudor,
>
> Forgot to say that this will probably fix your issues only in a static
> kernel. So please try this with a static kernel. If you can also try
> and confirm that this does not fix the issue for a modular kernel,
> that'd be good too.

I've checked those patches on top of linux next-20210129 with 
c09a3e6c97f0 ("soc: samsung: pm_domains: Convert to regular platform 
driver") commit reverted. Sadly it doesn't help. All devices that belong 
to the Exynos power domains are never probed and stay endlessly on the 
deferred devices list. I've used static kernel build - the one from 
exynos_defconfig.

Best regards

-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland



Re: [PATCH v1 0/2] Make fw_devlink=on more forgiving

2021-01-31 Thread Saravana Kannan
On Fri, Jan 29, 2021 at 8:03 PM Saravana Kannan  wrote:
>
> This patch series solves two general issues with fw_devlink=on
>
> Patch 1/2 addresses the issue of firmware nodes that look like they'll
> have struct devices created for them, but will never actually have
> struct devices added for them. For example, DT nodes with a compatible
> property that don't have devices added for them.
>
> Patch 2/2 address (for static kernels) the issue of optional suppliers
> that'll never have a driver registered for them. So, if the device could
> have probed with fw_devlink=permissive with a static kernel, this patch
> should allow those devices to probe with a fw_devlink=on. This doesn't
> solve it for the case where modules are enabled because there's no way
> to tell if a driver will never be registered or it's just about to be
> registered. I have some other ideas for that, but it'll have to come
> later thinking about it a bit.
>
> These two patches might remove the need for several other patches that
> went in as fixes for commit e590474768f1 ("driver core: Set
> fw_devlink=on by default"), but I think all those fixes are good
> changes. So I think we should leave those in.
>
> Marek, Geert,
>
> Can you try this series on a static kernel with your OF_POPULATED
> changes reverted? I just want to make sure these patches can identify
> and fix those cases.
>
> Tudor,
>
> You should still make the clock driver fix (because it's a bug), but I
> think this series will fix your issue too (even without the clock driver
> fix). Can you please give this a shot?
>
> Marc,
>
> Can you try this series with the gpiolib fix reverted please? I'm pretty
> sure this will fix that case.
>
> Linus,
>
> This series very likely removes the need for the gpiolib patch (we can
> wait for Marc to confirm). I'm split on whether we should leave it in or
> not. Thoughts?

Actually, thinking more about this, we should keep the gpiolib patch.
It'll ensure the suspend/resume order is always correct.

This series basically gives up on creating device links to firmware
nodes that don't have a corresponding device added. The gpiolib patch
makes sure the nodes have a device that corresponds to them. So device
links will get created to the gpio_device and will make sure the
parent of the gpio_device doesn't suspend before the consumers of the
gpio.

-Saravana


[PATCH v1 0/2] Make fw_devlink=on more forgiving

2021-01-30 Thread Saravana Kannan
This patch series solves two general issues with fw_devlink=on

Patch 1/2 addresses the issue of firmware nodes that look like they'll
have struct devices created for them, but will never actually have
struct devices added for them. For example, DT nodes with a compatible
property that don't have devices added for them.

Patch 2/2 address (for static kernels) the issue of optional suppliers
that'll never have a driver registered for them. So, if the device could
have probed with fw_devlink=permissive with a static kernel, this patch
should allow those devices to probe with a fw_devlink=on. This doesn't
solve it for the case where modules are enabled because there's no way
to tell if a driver will never be registered or it's just about to be
registered. I have some other ideas for that, but it'll have to come
later thinking about it a bit.

These two patches might remove the need for several other patches that
went in as fixes for commit e590474768f1 ("driver core: Set
fw_devlink=on by default"), but I think all those fixes are good
changes. So I think we should leave those in.

Marek, Geert,

Can you try this series on a static kernel with your OF_POPULATED
changes reverted? I just want to make sure these patches can identify
and fix those cases.

Tudor,

You should still make the clock driver fix (because it's a bug), but I
think this series will fix your issue too (even without the clock driver
fix). Can you please give this a shot?

Marc,

Can you try this series with the gpiolib fix reverted please? I'm pretty
sure this will fix that case.

Linus,

This series very likely removes the need for the gpiolib patch (we can
wait for Marc to confirm). I'm split on whether we should leave it in or
not. Thoughts?

Thanks,
Saravana

Saravana Kannan (2):
  driver core: fw_devlink: Detect supplier devices that will never be
added
  driver core: fw_devlink: Handle missing drivers for optional suppliers

 drivers/base/base.h |   2 +
 drivers/base/core.c | 134 +---
 drivers/base/dd.c   |   5 ++
 3 files changed, 121 insertions(+), 20 deletions(-)

-- 
2.30.0.365.g02bc693789-goog



Re: [PATCH v1 0/2] Make fw_devlink=on more forgiving

2021-01-29 Thread Saravana Kannan
On Fri, Jan 29, 2021 at 8:03 PM Saravana Kannan  wrote:
>
> This patch series solves two general issues with fw_devlink=on
>
> Patch 1/2 addresses the issue of firmware nodes that look like they'll
> have struct devices created for them, but will never actually have
> struct devices added for them. For example, DT nodes with a compatible
> property that don't have devices added for them.
>
> Patch 2/2 address (for static kernels) the issue of optional suppliers
> that'll never have a driver registered for them. So, if the device could
> have probed with fw_devlink=permissive with a static kernel, this patch
> should allow those devices to probe with a fw_devlink=on. This doesn't
> solve it for the case where modules are enabled because there's no way
> to tell if a driver will never be registered or it's just about to be
> registered. I have some other ideas for that, but it'll have to come
> later thinking about it a bit.
>
> These two patches might remove the need for several other patches that
> went in as fixes for commit e590474768f1 ("driver core: Set
> fw_devlink=on by default"), but I think all those fixes are good
> changes. So I think we should leave those in.
>
> Marek, Geert,
>
> Can you try this series on a static kernel with your OF_POPULATED
> changes reverted? I just want to make sure these patches can identify
> and fix those cases.
>
> Tudor,
>
> You should still make the clock driver fix (because it's a bug), but I
> think this series will fix your issue too (even without the clock driver
> fix). Can you please give this a shot?

Marek, Geert, Tudor,

Forgot to say that this will probably fix your issues only in a static
kernel. So please try this with a static kernel. If you can also try
and confirm that this does not fix the issue for a modular kernel,
that'd be good too.

-Saravana

>
> Marc,
>
> Can you try this series with the gpiolib fix reverted please? I'm pretty
> sure this will fix that case.
>
> Linus,
>
> This series very likely removes the need for the gpiolib patch (we can
> wait for Marc to confirm). I'm split on whether we should leave it in or
> not. Thoughts?
>
> Thanks,
> Saravana
>
> Saravana Kannan (2):
>   driver core: fw_devlink: Detect supplier devices that will never be
> added
>   driver core: fw_devlink: Handle missing drivers for optional suppliers
>
>  drivers/base/base.h |   2 +
>  drivers/base/core.c | 134 +---
>  drivers/base/dd.c   |   5 ++
>  3 files changed, 121 insertions(+), 20 deletions(-)
>
> --
> 2.30.0.365.g02bc693789-goog
>