Hi Bhupesh, On 2023-07-05 06:48, Bhupesh Sharma wrote: > Hi Jonas, > > On Mon, 3 Jul 2023 at 01:18, Jonas Karlman <jo...@kwiboo.se> wrote: >> >> On 2023-07-02 20:47, Bhupesh Sharma wrote: >>> Since function 'phy_get_counts()' can return NULL, >>> add handling for that error path inside callers of >>> this function. >> >> Do you have any example where this can/has happened? > > Yes, it happened while I was writing a UFS Host Controller driver for > u-boot and tried to initialize the UFS PHY via the generic u-boot PHY > CLASS utility functions, namely: > > /* get phy */ > ret = generic_phy_get_by_name(hba->dev, "ufsphy", &phy); > ... > > /* phy initialization */ > ret = generic_phy_init(&phy); > ... > /* power on phy */ > ret = generic_phy_power_on(&phy); >
This looks like a proper call order that should not cause counts ending up as NULL. If this call order still cause issue for your driver there must be some other underlying issue that also needs to be fixed. >> Counts should never be NULL in a normal working call flow. I am a bit >> worried that this patch may hide some other bug or use of an >> uninitialized phy struct. > > I agree, but I found that if the UFS Host Controller driver would mess > up the phy call sequences while it was in a 'power_up_sequence', > instead of using the standard UCLASS_PHY > 'generic_phy_get_by_index_nodev' flow, it might get a counts value > NULL and then the PHY driver would silently crash while setting / > accessing members of 'counts'. > > int generic_phy_init(struct phy *phy) > { > counts = phy_get_counts(phy); > ... > if (counts->init_count > 0) { > // crash > .. I fully understand that this will cause a crash here, the issue is that this should never happen in the first place as long as the documentation for the phy struct is followed: Clients provide storage for phy handles. The content of the structure is managed solely by the PHY API and PHY drivers. A phy struct is initialized by "get"ing the phy struct. The phy struct is passed to all other phy APIs to identify which PHY port to operate upon. The possible reasons I can see for counts ending up as NULL is: - Use of a phy struct not initialized by generic_phy_get_by_index_nodev. - phy->id or phy->dev is somehow changed from "get"ing the phy struct and when it is passed to generic_phy_ functions. - Some other issue or bad behavior in phy driver. - of_xlate ops, device_get_supply_regulator or phy_alloc_counts fails in generic_phy_get_by_index_nodev and phy struct is later passed to generic_phy_ functions. The last one seem like a bug, phy->dev is left assigned when generic_phy_get_by_index_nodev return an error. This could be fixed with the following diff: --- a/drivers/phy/phy-uclass.c +++ b/drivers/phy/phy-uclass.c @@ -195,6 +195,7 @@ int generic_phy_get_by_index_nodev(ofnode node, int index, struct phy *phy) return 0; err: + phy->dev = NULL; return ret; } If we also need a null check for counts, I think a return value of -EINVAL is more appropriate, the phy struct passed to the function is in an invalid state, ENODEV should have been returned when "get"ing the phy struct. > >> The phy struct is initialized in generic_phy_get_by_index_nodev. That >> function should fail when counts cannot be allocated. >> >> Maybe generic_phy_get_by_index_nodev should unset phy->dev if it fails, >> or generic_phy_valid should be extended to also check phy->counts. >> That way generic_phy_valid would return false and these functions >> should return earlier before trying to use counts. > > I agree, this error handling should be here for counts being > uninitialized, whether we do it per-function or at the top level.. As > I can see several users of the flow I described in the u-boot code > itself (for setting up a PHY), for e.g.: > > board/sunxi/board.c > drivers/usb/cdns3/core.c These drivers seem to follow a proper call order and use appropriate return value checks. They may possible be affected by the issue mentioned above. Regards, Jonas > > etc.. > > Thanks, > Bhupesh