Re: [PATCH v1] phy: qcom-qmp: Raise qcom_qmp_phy_enable() polling delay

2019-06-24 Thread Doug Anderson
Hi,

On Thu, Jun 13, 2019 at 8:28 AM Marc Gonzalez  wrote:
>
> readl_poll_timeout() calls usleep_range() to sleep between reads.
> usleep_range() doesn't work efficiently for tiny values.
>
> Raise the polling delay in qcom_qmp_phy_enable() to bring it in line
> with the delay in qcom_qmp_phy_com_init().
>
> Signed-off-by: Marc Gonzalez 
> ---
> Vivek, do you remember why you didn't use the same delay value in
> qcom_qmp_phy_enable) and qcom_qmp_phy_com_init() ?
> ---
>  drivers/phy/qualcomm/phy-qcom-qmp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/phy/qualcomm/phy-qcom-qmp.c 
> b/drivers/phy/qualcomm/phy-qcom-qmp.c
> index bb522b915fa9..34ff6434da8f 100644
> --- a/drivers/phy/qualcomm/phy-qcom-qmp.c
> +++ b/drivers/phy/qualcomm/phy-qcom-qmp.c
> @@ -1548,7 +1548,7 @@ static int qcom_qmp_phy_enable(struct phy *phy)
> status = pcs + cfg->regs[QPHY_PCS_READY_STATUS];
> mask = cfg->mask_pcs_ready;
>
> -   ret = readl_poll_timeout(status, val, val & mask, 1,
> +   ret = readl_poll_timeout(status, val, val & mask, 10,
>  PHY_INIT_COMPLETE_TIMEOUT);

I would agree that the existing code is almost certainly wrong, since,
as you said, trying to sleep for 1 us is likely pointless.  I quickly
coded up a test and ran it on sdm845-cheza.  It looked like this:

--

  ktime_t a, b, c;

  a = ktime_get();
  b = ktime_get();
  usleep_range(1, 1);
  c = ktime_get();

  pr_info("DOUG: %d ns, %d ns\n", (int)ktime_to_ns(ktime_sub(b, a)),
  (int)ktime_to_ns(ktime_sub(c, b)));

--

At bootup I got:

[4.121247] DOUG: 52 ns, 9479 ns
[4.144990] DOUG: 52 ns, 9636 ns
[4.328168] DOUG: 0 ns, 11667 ns
[4.332659] DOUG: 52 ns, 7136 ns
[4.358833] DOUG: 0 ns,  ns
[4.362095] DOUG: 52 ns, 8229 ns

So basically the existing code is already waiting 5-10 us between
polls but it's spending all of that time context switching.  Changing
the above to:

  usleep_range(5, 10);

Give me instead:

[4.120781] DOUG: 52 ns, 16927 ns
[4.144626] DOUG: 53 ns, 17447 ns
[4.327932] DOUG: 52 ns, 11302 ns
[4.332501] DOUG: 0 ns, 7395 ns
[4.357912] DOUG: 0 ns, 6823 ns
[4.361175] DOUG: 52 ns, 9063 ns

...and that seems fine to me.

--

Thus:

Reviewed-by: Douglas Anderson 


Re: [PATCH v1] phy: qcom-qmp: Raise qcom_qmp_phy_enable() polling delay

2019-06-24 Thread Marc Gonzalez
On 20/06/2019 08:25, Kishon Vijay Abraham I wrote:

> On 14/06/19 6:08 PM, Marc Gonzalez wrote:
> 
>> The issue is usleep_range() being misused ^_^
>>
>> Although usleep_range() takes unsigned longs as parameters, it is
>> not appropriate over the entire 0-2^64 range.
>>
>> a) It should not be used with tiny values, because the cost of programming
>> the timer interrupt, and processing the resulting IRQ would dominate.
>>
>> b) It should not be used with large values (above 200/HZ) because
>> msleep() is more efficient, and is acceptable for these ranges.
> 
> Documentation/timers/timers-howto.txt has all the information on the various
> kernel delay/sleep mechanisms. For < ~10us, it recommends to use udelay
> (readx_poll_timeout_atomic). Depending on the actual timeout to be used, the
> delay mechanism in timers-howto.txt should be used.

Hello Kishon,

I believe the proposed patch does the right thing:

a) polling for the ready bit is not done in atomic context,
therefore we don't need to busy-loop

b) since we're ultimately calling usleep_range(), we should
pass an appropriate parameter, such as max_us = 10
(instead of max_us = 1, which is outside usleep_range spec)

Maybe it would help if someone reviewed this patch.

Regards.


Re: [PATCH v1] phy: qcom-qmp: Raise qcom_qmp_phy_enable() polling delay

2019-06-20 Thread Kishon Vijay Abraham I
Hi,

On 14/06/19 6:08 PM, Marc Gonzalez wrote:
> + Doug (who is familiar with usleep_range quirks)
> 
> On 14/06/2019 11:50, Vivek Gautam wrote:
> 
>> On 6/13/2019 5:02 PM, Marc Gonzalez wrote:
>>
>>> readl_poll_timeout() calls usleep_range() to sleep between reads.
>>> usleep_range() doesn't work efficiently for tiny values.
>>>
>>> Raise the polling delay in qcom_qmp_phy_enable() to bring it in line
>>> with the delay in qcom_qmp_phy_com_init().
>>>
>>> Signed-off-by: Marc Gonzalez 
>>> ---
>>> Vivek, do you remember why you didn't use the same delay value in
>>> qcom_qmp_phy_enable) and qcom_qmp_phy_com_init() ?
>>
>> phy_qcom_init() thingy came from the PCIE phy driver from downstream
>> msm-3.18 PCIE did something as below:
> 
> FWIW and IMO, drivers/pci/host/pci-msm.c is a good example of how not to write
> a device driver. It's huge (7000+ lines) because it handles multiple platforms
> via ifdefs, and lumps everything together (phy, core IP, SoC specific glue)
> in a single file.
> 
>> -
>> do {
>>      if (pcie_phy_is_ready(dev))
>>      break;
>>      retries++;
>>      usleep_range(REFCLK_STABILIZATION_DELAY_US_MIN,
>>   REFCLK_STABILIZATION_DELAY_US_MAX);
>> } while (retries < PHY_READY_TIMEOUT_COUNT);
>>
>> REFCLK_STABILIZATION_DELAY_US_MIN/MAX ==> 1000/1005
>> PHY_READY_TIMEOUT_COUNT ==> 10
>> -
> 
> https://source.codeaurora.org/quic/la/kernel/msm-4.4/tree/drivers/pci/host/pci-msm.c?h=LE.UM.1.3.r3.25#n4624
> 
> https://source.codeaurora.org/quic/la/kernel/msm-4.4/tree/drivers/pci/host/pci-msm.c?h=LE.UM.1.3.r3.25#n1721
> 
> readl_relaxed(dev->phy + PCIE_N_PCS_STATUS(dev->rc_idx, dev->common_phy)) & 
> BIT(6)
> is equivalent to:
> the check in qcom_qmp_phy_enable()
> 
> readl_relaxed(dev->phy + PCIE_COM_PCS_READY_STATUS) & 0x1
> is equivalent to:
> the check in qcom_qmp_phy_com_init()
> 
> I'll take a closer look, using some printks, to narrow down the run-time
> execution path.
> 
>> phy_enable() from the usb phy driver from downstream.
>>   /* Wait for PHY initialization to be done */
>>   do {
>>   if (readl_relaxed(phy->base +
>>   phy->phy_reg[USB3_PHY_PCS_STATUS]) & PHYSTATUS)
>>   usleep_range(1, 2);
>> else
>> break;
>>   } while (--init_timeout_usec);
>>
>> init_timeout_usec ==> 1000
>> -
>> USB never had a COM_PHY status bit.
>>
>> So clearly the resolutions were different.
>>
>> Does this change solve an issue at hand?
> 
> The issue is usleep_range() being misused ^_^
> 
> Although usleep_range() takes unsigned longs as parameters, it is
> not appropriate over the entire 0-2^64 range.
> 
> a) It should not be used with tiny values, because the cost of programming
> the timer interrupt, and processing the resulting IRQ would dominate.
> 
> b) It should not be used with large values (above 200/HZ) because
> msleep() is more efficient, and is acceptable for these ranges.

Documentation/timers/timers-howto.txt has all the information on the various
kernel delay/sleep mechanisms. For < ~10us, it recommends to use udelay
(readx_poll_timeout_atomic). Depending on the actual timeout to be used, the
delay mechanism in timers-howto.txt should be used.

Thanks
Kishon


Re: [PATCH v1] phy: qcom-qmp: Raise qcom_qmp_phy_enable() polling delay

2019-06-14 Thread Marc Gonzalez
+ Doug (who is familiar with usleep_range quirks)

On 14/06/2019 11:50, Vivek Gautam wrote:

> On 6/13/2019 5:02 PM, Marc Gonzalez wrote:
> 
>> readl_poll_timeout() calls usleep_range() to sleep between reads.
>> usleep_range() doesn't work efficiently for tiny values.
>>
>> Raise the polling delay in qcom_qmp_phy_enable() to bring it in line
>> with the delay in qcom_qmp_phy_com_init().
>>
>> Signed-off-by: Marc Gonzalez 
>> ---
>> Vivek, do you remember why you didn't use the same delay value in
>> qcom_qmp_phy_enable) and qcom_qmp_phy_com_init() ?
> 
> phy_qcom_init() thingy came from the PCIE phy driver from downstream
> msm-3.18 PCIE did something as below:

FWIW and IMO, drivers/pci/host/pci-msm.c is a good example of how not to write
a device driver. It's huge (7000+ lines) because it handles multiple platforms
via ifdefs, and lumps everything together (phy, core IP, SoC specific glue)
in a single file.

> -
> do {
>      if (pcie_phy_is_ready(dev))
>      break;
>      retries++;
>      usleep_range(REFCLK_STABILIZATION_DELAY_US_MIN,
>   REFCLK_STABILIZATION_DELAY_US_MAX);
> } while (retries < PHY_READY_TIMEOUT_COUNT);
> 
> REFCLK_STABILIZATION_DELAY_US_MIN/MAX ==> 1000/1005
> PHY_READY_TIMEOUT_COUNT ==> 10
> -

https://source.codeaurora.org/quic/la/kernel/msm-4.4/tree/drivers/pci/host/pci-msm.c?h=LE.UM.1.3.r3.25#n4624

https://source.codeaurora.org/quic/la/kernel/msm-4.4/tree/drivers/pci/host/pci-msm.c?h=LE.UM.1.3.r3.25#n1721

readl_relaxed(dev->phy + PCIE_N_PCS_STATUS(dev->rc_idx, dev->common_phy)) & 
BIT(6)
is equivalent to:
the check in qcom_qmp_phy_enable()

readl_relaxed(dev->phy + PCIE_COM_PCS_READY_STATUS) & 0x1
is equivalent to:
the check in qcom_qmp_phy_com_init()

I'll take a closer look, using some printks, to narrow down the run-time
execution path.

> phy_enable() from the usb phy driver from downstream.
>   /* Wait for PHY initialization to be done */
>   do {
>   if (readl_relaxed(phy->base +
>   phy->phy_reg[USB3_PHY_PCS_STATUS]) & PHYSTATUS)
>   usleep_range(1, 2);
> else
> break;
>   } while (--init_timeout_usec);
> 
> init_timeout_usec ==> 1000
> -
> USB never had a COM_PHY status bit.
> 
> So clearly the resolutions were different.
> 
> Does this change solve an issue at hand?

The issue is usleep_range() being misused ^_^

Although usleep_range() takes unsigned longs as parameters, it is
not appropriate over the entire 0-2^64 range.

a) It should not be used with tiny values, because the cost of programming
the timer interrupt, and processing the resulting IRQ would dominate.

b) It should not be used with large values (above 200/HZ) because
msleep() is more efficient, and is acceptable for these ranges.

Regards.


Re: [PATCH v1] phy: qcom-qmp: Raise qcom_qmp_phy_enable() polling delay

2019-06-14 Thread Vivek Gautam

Hi Marc,

On 6/13/2019 5:02 PM, Marc Gonzalez wrote:

readl_poll_timeout() calls usleep_range() to sleep between reads.
usleep_range() doesn't work efficiently for tiny values.

Raise the polling delay in qcom_qmp_phy_enable() to bring it in line
with the delay in qcom_qmp_phy_com_init().

Signed-off-by: Marc Gonzalez 
---
Vivek, do you remember why you didn't use the same delay value in
qcom_qmp_phy_enable) and qcom_qmp_phy_com_init() ?


phy_qcom_init() thingy came from the PCIE phy driver from downstream 
msm-3.18

PCIE did something as below:

-
do {
    if (pcie_phy_is_ready(dev))
    break;
    retries++;
    usleep_range(REFCLK_STABILIZATION_DELAY_US_MIN,
 REFCLK_STABILIZATION_DELAY_US_MAX);
} while (retries < PHY_READY_TIMEOUT_COUNT);

REFCLK_STABILIZATION_DELAY_US_MIN/MAX ==> 1000/1005
PHY_READY_TIMEOUT_COUNT ==> 10
-


phy_enable() from the usb phy driver from downstream.
 /* Wait for PHY initialization to be done */
 do {
 if (readl_relaxed(phy->base +
 phy->phy_reg[USB3_PHY_PCS_STATUS]) & PHYSTATUS)
 usleep_range(1, 2);
else
break;
 } while (--init_timeout_usec);

init_timeout_usec ==> 1000
-
USB never had a COM_PHY status bit.

So clearly the resolutions were different.

Does this change solves an issue at hand?


---
  drivers/phy/qualcomm/phy-qcom-qmp.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/phy/qualcomm/phy-qcom-qmp.c 
b/drivers/phy/qualcomm/phy-qcom-qmp.c
index bb522b915fa9..34ff6434da8f 100644
--- a/drivers/phy/qualcomm/phy-qcom-qmp.c
+++ b/drivers/phy/qualcomm/phy-qcom-qmp.c
@@ -1548,7 +1548,7 @@ static int qcom_qmp_phy_enable(struct phy *phy)
status = pcs + cfg->regs[QPHY_PCS_READY_STATUS];
mask = cfg->mask_pcs_ready;
  
-	ret = readl_poll_timeout(status, val, val & mask, 1,

+   ret = readl_poll_timeout(status, val, val & mask, 10,
 PHY_INIT_COMPLETE_TIMEOUT);
if (ret) {
dev_err(qmp->dev, "phy initialization timed-out\n");