Re: [PATCH v2 0/5] PCI: dwc: improve msi handling

2020-10-07 Thread Vidya Sagar




On 10/6/2020 12:06 PM, Jisheng Zhang wrote:

External email: Use caution opening links or attachments


On Tue, 6 Oct 2020 11:56:34 +0530 Vidya Sagar wrote:




Hi,


Hi,


I would like to verify this series along with the other series "PCI:
dwc: fix two MSI issues" on Tegra194. I tried to apply both series on
linux-next and on Lorenzo's pci/dwc branch, but there seem to be
non-trivial conflicts. Could you please tell me which branch I can use
to apply these series cleanly?


This is a fix, so I expected the series to be picked up for v5.9; the
series is therefore based on v5.9-rcN.

Could you please try v5 (https://lkml.org/lkml/2020/9/29/2511) on v5.9-rc7?
I tried this series on top of v5.9-rc7 and it worked as expected on the
Tegra194 platform. However, I couldn't cleanly apply the other series 'PCI:
dwc: fix two MSI issues' on top. Could you please rebase it?


Thanks,
Vidya Sagar



Thanks


FWIW, I acknowledge that the existing code does leak the MSI target page
every time the system goes through a suspend-resume sequence on Tegra194.

Thanks,
Vidya Sagar

On 9/24/2020 4:35 PM, Jisheng Zhang wrote:


Improve the MSI code:
1. Add proper error handling.
2. Move dw_pcie_msi_init() from each user into the DesignWare host core
to fix the MSI page leak in the resume path.

Since v1:
- add proper error-handling patches.
- fix the MSI page leak by moving dw_pcie_msi_init() from each user
  into the DesignWare host core
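The resume-path leak described above can be sketched as a toy model
(illustrative C with made-up names, not the driver's actual code): calling
MSI init again on every resume allocates a fresh page while the previous
one is never unmapped or freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model of the resume-path leak; all names are hypothetical. */
static void *msi_page;  /* stands in for the MSI target page */
static int pages_live;  /* pages currently allocated */

/* Old behaviour: every call grabs a fresh page, so the page from the
 * previous suspend/resume cycle is leaked. */
static void msi_init_leaky(void)
{
	msi_page = malloc(4096);  /* alloc_page() stand-in */
	pages_live++;
}

/* One way to plug the leak: make the init idempotent.  (The series
 * instead moves the init into the DesignWare host so it runs once;
 * the effect on the leak is the same.) */
static void msi_init_once(void)
{
	if (msi_page)
		return;
	msi_page = malloc(4096);
	pages_live++;
}

/* Probe once, then re-run init on every resume, as the old code did. */
static int probe_then_resume(void (*init)(void), int resumes)
{
	msi_page = NULL;
	pages_live = 0;
	init();
	for (int i = 0; i < resumes; i++)
		init();
	return pages_live;
}
```

Three resumes leave four live pages with the leaky variant, but only one
with the idempotent one.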


Jisheng Zhang (5):
PCI: dwc: Call dma_unmap_page() before freeing the msi page
PCI: dwc: Check alloc_page() return value
PCI: dwc: Rename dw_pcie_free_msi to dw_pcie_msi_deinit
PCI: dwc: Skip PCIE_MSI_INTR0* programming if MSI is disabled
PCI: dwc: Move dw_pcie_msi_init() from each users to designware host

   drivers/pci/controller/dwc/pci-dra7xx.c   |  1 +
   drivers/pci/controller/dwc/pci-exynos.c   |  2 -
   drivers/pci/controller/dwc/pci-imx6.c |  3 --
   drivers/pci/controller/dwc/pci-meson.c|  8 
   drivers/pci/controller/dwc/pcie-artpec6.c | 10 -
   .../pci/controller/dwc/pcie-designware-host.c | 43 +--
   .../pci/controller/dwc/pcie-designware-plat.c |  3 --
   drivers/pci/controller/dwc/pcie-designware.h  |  9 +++-
   drivers/pci/controller/dwc/pcie-histb.c   |  3 --
   drivers/pci/controller/dwc/pcie-kirin.c   |  3 --
   drivers/pci/controller/dwc/pcie-qcom.c|  3 --
   drivers/pci/controller/dwc/pcie-spear13xx.c   |  1 -
   drivers/pci/controller/dwc/pcie-tegra194.c|  2 -
   drivers/pci/controller/dwc/pcie-uniphier.c|  9 +---
   14 files changed, 38 insertions(+), 62 deletions(-)

--
2.28.0





Re: [PATCH v2 0/5] PCI: dwc: improve msi handling

2020-10-06 Thread Jisheng Zhang
On Tue, 6 Oct 2020 11:56:34 +0530 Vidya Sagar wrote:

> 
> 
> Hi,

Hi,

> I would like to verify this series along with the other series "PCI:
> dwc: fix two MSI issues" on Tegra194. I tried to apply these series on
> both linux-next and Lorenzo's pci/dwc branches but there seem to be non
> trivial conflicts. Could you please tell me which branch I can use and
> apply these series cleanly?

This is a fix, so I expected the series to be picked up for v5.9; the
series is therefore based on v5.9-rcN.

Could you please try v5 (https://lkml.org/lkml/2020/9/29/2511) on v5.9-rc7?


Thanks

> FWIW, I acknowledge that the existing code does leak MSI target page
> every time system goes through suspend-resume sequence on Tegra194.
> 
> Thanks,
> Vidya Sagar
> 
> On 9/24/2020 4:35 PM, Jisheng Zhang wrote:
> >
> >
> > Improve the msi code:
> > 1. Add proper error handling.
> > 2. Move dw_pcie_msi_init() from each users to designware host to solve
> > msi page leakage in resume path.
> >
> > Since v1:
> >- add proper error handling patches.
> >- solve the msi page leakage by moving dw_pcie_msi_init() from each
> >  users to designware host
> >
> >
> > Jisheng Zhang (5):
> >PCI: dwc: Call dma_unmap_page() before freeing the msi page
> >PCI: dwc: Check alloc_page() return value
> >PCI: dwc: Rename dw_pcie_free_msi to dw_pcie_msi_deinit
> >PCI: dwc: Skip PCIE_MSI_INTR0* programming if MSI is disabled
> >PCI: dwc: Move dw_pcie_msi_init() from each users to designware host
> >
> >   drivers/pci/controller/dwc/pci-dra7xx.c   |  1 +
> >   drivers/pci/controller/dwc/pci-exynos.c   |  2 -
> >   drivers/pci/controller/dwc/pci-imx6.c |  3 --
> >   drivers/pci/controller/dwc/pci-meson.c|  8 
> >   drivers/pci/controller/dwc/pcie-artpec6.c | 10 -
> >   .../pci/controller/dwc/pcie-designware-host.c | 43 +--
> >   .../pci/controller/dwc/pcie-designware-plat.c |  3 --
> >   drivers/pci/controller/dwc/pcie-designware.h  |  9 +++-
> >   drivers/pci/controller/dwc/pcie-histb.c   |  3 --
> >   drivers/pci/controller/dwc/pcie-kirin.c   |  3 --
> >   drivers/pci/controller/dwc/pcie-qcom.c|  3 --
> >   drivers/pci/controller/dwc/pcie-spear13xx.c   |  1 -
> >   drivers/pci/controller/dwc/pcie-tegra194.c|  2 -
> >   drivers/pci/controller/dwc/pcie-uniphier.c|  9 +---
> >   14 files changed, 38 insertions(+), 62 deletions(-)
> >
> > --
> > 2.28.0
> >  



Re: [PATCH v2 0/5] PCI: dwc: improve msi handling

2020-10-06 Thread Vidya Sagar

Hi,
I would like to verify this series along with the other series "PCI:
dwc: fix two MSI issues" on Tegra194. I tried to apply both series on
linux-next and on Lorenzo's pci/dwc branch, but there seem to be
non-trivial conflicts. Could you please tell me which branch I can use
to apply these series cleanly?
FWIW, I acknowledge that the existing code does leak the MSI target page
every time the system goes through a suspend-resume sequence on Tegra194.


Thanks,
Vidya Sagar

On 9/24/2020 4:35 PM, Jisheng Zhang wrote:


Improve the MSI code:
1. Add proper error handling.
2. Move dw_pcie_msi_init() from each user into the DesignWare host core
to fix the MSI page leak in the resume path.

Since v1:
- add proper error-handling patches.
- fix the MSI page leak by moving dw_pcie_msi_init() from each user
  into the DesignWare host core


Jisheng Zhang (5):
   PCI: dwc: Call dma_unmap_page() before freeing the msi page
   PCI: dwc: Check alloc_page() return value
   PCI: dwc: Rename dw_pcie_free_msi to dw_pcie_msi_deinit
   PCI: dwc: Skip PCIE_MSI_INTR0* programming if MSI is disabled
   PCI: dwc: Move dw_pcie_msi_init() from each users to designware host

  drivers/pci/controller/dwc/pci-dra7xx.c   |  1 +
  drivers/pci/controller/dwc/pci-exynos.c   |  2 -
  drivers/pci/controller/dwc/pci-imx6.c |  3 --
  drivers/pci/controller/dwc/pci-meson.c|  8 
  drivers/pci/controller/dwc/pcie-artpec6.c | 10 -
  .../pci/controller/dwc/pcie-designware-host.c | 43 +--
  .../pci/controller/dwc/pcie-designware-plat.c |  3 --
  drivers/pci/controller/dwc/pcie-designware.h  |  9 +++-
  drivers/pci/controller/dwc/pcie-histb.c   |  3 --
  drivers/pci/controller/dwc/pcie-kirin.c   |  3 --
  drivers/pci/controller/dwc/pcie-qcom.c|  3 --
  drivers/pci/controller/dwc/pcie-spear13xx.c   |  1 -
  drivers/pci/controller/dwc/pcie-tegra194.c|  2 -
  drivers/pci/controller/dwc/pcie-uniphier.c|  9 +---
  14 files changed, 38 insertions(+), 62 deletions(-)

--
2.28.0



Re: [PATCH v2 0/5] PCI: dwc: improve msi handling

2020-09-29 Thread Marc Zyngier

On 2020-09-29 19:02, Jon Hunter wrote:

On 29/09/2020 18:25, Marc Zyngier wrote:

On 2020-09-29 14:22, Jon Hunter wrote:

Hi Jisheng,

On 29/09/2020 11:48, Jisheng Zhang wrote:

Hi Jon,

On Fri, 25 Sep 2020 09:53:45 +0100 Jon Hunter wrote:



On 24/09/2020 12:05, Jisheng Zhang wrote:

Improve the msi code:
1. Add proper error handling.
2. Move dw_pcie_msi_init() from each users to designware host to solve
msi page leakage in resume path.


Apologies if this is slightly off topic, but I have been meaning to ask
about MSIs and PCI. On Tegra194 which uses the DWC PCI driver, whenever we
hotplug CPUs we see the following warnings ...

 [  79.068351] WARNING KERN IRQ70: set affinity failed(-22).
 [  79.068362] WARNING KERN IRQ71: set affinity failed(-22).



I tried to reproduce this issue on Synaptics SoC, but can't reproduce it.
Per my understanding of the code in kernel/irq/cpuhotplug.c, this warning
happened when we migrate irqs away from the offline cpu, this implicitly
implies that before this point the irq has bind to the offline cpu, but how
could this happen given current dw_pci_msi_set_affinity() implementation
always return -EINVAL


By default the smp_affinity should be set so that all CPUs can be
interrupted ...

$ cat /proc/irq/70/smp_affinity
0xff

In my case there are 8 CPUs and so 0xff implies that the interrupt can
be triggered on any of the 8 CPUs.

Do you see the set_affinity callback being called for the DWC irqchip in
migrate_one_irq()?


The problem is common to all MSI implementations that end up muxing
all the end-point MSIs into a single interrupt. With these systems,
you cannot set the affinity of individual MSIs (they don't target a
CPU, they target another interrupt... braindead). Only the mux
interrupt can have its affinity changed.

So returning -EINVAL is the right thing to do.
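Marc's point, that only the mux line has a real CPU target, can be modeled
in a few lines (a rough userspace sketch with invented names, not kernel
code): per-MSI affinity has nothing to program, so the set_affinity hook
rejects the request, while retargeting the mux line moves every muxed MSI
at once.

```c
#include <assert.h>
#include <errno.h>

/* Toy model: many endpoint MSIs are demultiplexed from one parent
 * "mux" interrupt, so an individual MSI has no CPU target of its
 * own -- only the mux line does.  Names are hypothetical. */
struct mux_chip {
	unsigned int mux_cpu;  /* the only real affinity */
};

/* Per-MSI set_affinity has nothing to program; returning -EINVAL
 * mirrors what dw_pci_msi_set_affinity() does. */
static int msi_set_affinity(struct mux_chip *chip, unsigned int hwirq,
			    unsigned int cpu)
{
	(void)chip; (void)hwirq; (void)cpu;
	return -EINVAL;  /* the MSI targets the mux, not a CPU */
}

/* The mux line itself can be retargeted. */
static int mux_set_affinity(struct mux_chip *chip, unsigned int cpu)
{
	chip->mux_cpu = cpu;
	return 0;
}

/* Every endpoint MSI is delivered wherever the mux line points. */
static unsigned int msi_delivery_cpu(const struct mux_chip *chip,
				     unsigned int hwirq)
{
	(void)hwirq;
	return chip->mux_cpu;
}
```

Moving the mux to CPU 3 moves delivery of IRQ 70 and IRQ 71 together, even
though setting either one's affinity individually fails with -EINVAL (-22),
which is exactly the warning seen on CPU hotplug.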


Right, so if that is the case, then surely there should be some way to
avoid these warnings because they are not relevant?


I don't think there is a way to do this, because the core code
doesn't (and cannot) know the exact interrupt topology.

The only alternative would be to change the affinity of the mux
interrupt when a MSI affinity changes, but that tends to break
userspace (irqbalance, for example).

M.
--
Jazz is not dead. It just smells funny...


Re: [PATCH v2 0/5] PCI: dwc: improve msi handling

2020-09-29 Thread Jon Hunter


On 29/09/2020 18:25, Marc Zyngier wrote:
> On 2020-09-29 14:22, Jon Hunter wrote:
>> Hi Jisheng,
>>
>> On 29/09/2020 11:48, Jisheng Zhang wrote:
>>> Hi Jon,
>>>
>>> On Fri, 25 Sep 2020 09:53:45 +0100 Jon Hunter wrote:
>>>

 On 24/09/2020 12:05, Jisheng Zhang wrote:
> Improve the msi code:
> 1. Add proper error handling.
> 2. Move dw_pcie_msi_init() from each users to designware host to solve
> msi page leakage in resume path.

 Apologies if this is slightly off topic, but I have been meaning to ask
 about MSIs and PCI. On Tegra194 which uses the DWC PCI driver,
 whenever we
 hotplug CPUs we see the following warnings ...

  [  79.068351] WARNING KERN IRQ70: set affinity failed(-22).
  [  79.068362] WARNING KERN IRQ71: set affinity failed(-22).

>>>
>>> I tried to reproduce this issue on Synaptics SoC, but can't reproduce
>>> it.
>>> Per my understanding of the code in kernel/irq/cpuhotplug.c, this
>>> warning
>>> happened when we migrate irqs away from the offline cpu, this implicitly
>>> implies that before this point the irq has bind to the offline cpu,
>>> but how
>>> could this happen given current dw_pci_msi_set_affinity() implementation
>>> always return -EINVAL
>>
>> By default the smp_affinity should be set so that all CPUs can be
>> interrupted ...
>>
>> $ cat /proc/irq/70/smp_affinity
>> 0xff
>>
>> In my case there are 8 CPUs and so 0xff implies that the interrupt can
>> be triggered on any of the 8 CPUs.
>>
>> Do you see the set_affinity callback being called for the DWC irqchip in
>> migrate_one_irq()?
> 
> The problem is common to all MSI implementations that end up muxing
> all the end-point MSIs into a single interrupt. With these systems,
> you cannot set the affinity of individual MSIs (they don't target a
> CPU, they target another interrupt... braindead). Only the mux
> interrupt can have its affinity changed.
> 
> So returning -EINVAL is the right thing to do.

Right, so if that is the case, then surely there should be some way to
avoid these warnings because they are not relevant?

Cheers
Jon

-- 
nvpublic


Re: [PATCH v2 0/5] PCI: dwc: improve msi handling

2020-09-29 Thread Marc Zyngier

On 2020-09-29 14:22, Jon Hunter wrote:

Hi Jisheng,

On 29/09/2020 11:48, Jisheng Zhang wrote:

Hi Jon,

On Fri, 25 Sep 2020 09:53:45 +0100 Jon Hunter wrote:



On 24/09/2020 12:05, Jisheng Zhang wrote:

Improve the msi code:
1. Add proper error handling.
2. Move dw_pcie_msi_init() from each users to designware host to solve
msi page leakage in resume path.


Apologies if this is slightly off topic, but I have been meaning to ask
about MSIs and PCI. On Tegra194 which uses the DWC PCI driver, whenever we
hotplug CPUs we see the following warnings ...

 [  79.068351] WARNING KERN IRQ70: set affinity failed(-22).
 [  79.068362] WARNING KERN IRQ71: set affinity failed(-22).



I tried to reproduce this issue on Synaptics SoC, but can't reproduce it.
Per my understanding of the code in kernel/irq/cpuhotplug.c, this warning
happened when we migrate irqs away from the offline cpu, this implicitly
implies that before this point the irq has bind to the offline cpu, but how
could this happen given current dw_pci_msi_set_affinity() implementation
always return -EINVAL


By default the smp_affinity should be set so that all CPUs can be
interrupted ...

$ cat /proc/irq/70/smp_affinity
0xff

In my case there are 8 CPUs and so 0xff implies that the interrupt can
be triggered on any of the 8 CPUs.

Do you see the set_affinity callback being called for the DWC irqchip in
migrate_one_irq()?


The problem is common to all MSI implementations that end up muxing
all the end-point MSIs into a single interrupt. With these systems,
you cannot set the affinity of individual MSIs (they don't target a
CPU, they target another interrupt... braindead). Only the mux
interrupt can have its affinity changed.

So returning -EINVAL is the right thing to do.

 M.
--
Jazz is not dead. It just smells funny...


Re: [PATCH v2 0/5] PCI: dwc: improve msi handling

2020-09-29 Thread Jon Hunter
Hi Jisheng,

On 29/09/2020 11:48, Jisheng Zhang wrote:
> Hi Jon,
> 
> On Fri, 25 Sep 2020 09:53:45 +0100 Jon Hunter wrote:
> 
>>
>> On 24/09/2020 12:05, Jisheng Zhang wrote:
>>> Improve the msi code:
>>> 1. Add proper error handling.
>>> 2. Move dw_pcie_msi_init() from each users to designware host to solve
>>> msi page leakage in resume path.  
>>
>> Apologies if this is slightly off topic, but I have been meaning to ask
>> about MSIs and PCI. On Tegra194 which uses the DWC PCI driver, whenever we
>> hotplug CPUs we see the following warnings ...
>>
>>  [  79.068351] WARNING KERN IRQ70: set affinity failed(-22).
>>  [  79.068362] WARNING KERN IRQ71: set affinity failed(-22).
>>
> 
> I tried to reproduce this issue on Synaptics SoC, but can't reproduce it.
> Per my understanding of the code in kernel/irq/cpuhotplug.c, this warning
> happened when we migrate irqs away from the offline cpu, this implicitly
> implies that before this point the irq has bind to the offline cpu, but how
> could this happen given current dw_pci_msi_set_affinity() implementation
> always return -EINVAL

By default the smp_affinity should be set so that all CPUs can be
interrupted ...

$ cat /proc/irq/70/smp_affinity
0xff

In my case there are 8 CPUs and so 0xff implies that the interrupt can
be triggered on any of the 8 CPUs.
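For reference, the hex mask shown above is just a CPU bitmap; a small
helper (illustrative userspace C, not kernel code) expands it into the
range form that /proc/irq/N/smp_affinity_list reports:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Expand an smp_affinity hex mask (e.g. 0xff) into a cpulist string
 * (e.g. "0-7"), the same information /proc/irq/N/smp_affinity_list
 * shows.  Illustrative helper only. */
static void mask_to_cpulist(unsigned long mask, char *buf, size_t len)
{
	size_t off = 0;
	int cpu = 0;

	buf[0] = '\0';
	while (mask) {
		if (mask & 1) {
			int first = cpu;

			while ((mask >> 1) & 1) {  /* extend the run */
				mask >>= 1;
				cpu++;
			}
			off += snprintf(buf + off, len - off,
					off ? ",%d" : "%d", first);
			if (cpu > first)
				off += snprintf(buf + off, len - off,
						"-%d", cpu);
		}
		mask >>= 1;
		cpu++;
	}
}
```

0xff expands to "0-7": any of the eight CPUs may be targeted. A sparse
mask such as 0x5 would expand to "0,2".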

Do you see the set_affinity callback being called for the DWC irqchip in
migrate_one_irq()?

Cheers
Jon

-- 
nvpublic


Re: [PATCH v2 0/5] PCI: dwc: improve msi handling

2020-09-29 Thread Jisheng Zhang
Hi Jon,

On Fri, 25 Sep 2020 09:53:45 +0100 Jon Hunter wrote:

> 
> On 24/09/2020 12:05, Jisheng Zhang wrote:
> > Improve the msi code:
> > 1. Add proper error handling.
> > 2. Move dw_pcie_msi_init() from each users to designware host to solve
> > msi page leakage in resume path.  
> 
> Apologies if this is slightly off topic, but I have been meaning to ask
> about MSIs and PCI. On Tegra194 which uses the DWC PCI driver, whenever we
> hotplug CPUs we see the following warnings ...
> 
>  [  79.068351] WARNING KERN IRQ70: set affinity failed(-22).
>  [  79.068362] WARNING KERN IRQ71: set affinity failed(-22).
> 

I tried to reproduce this issue on a Synaptics SoC, but I can't reproduce it.
Per my understanding of the code in kernel/irq/cpuhotplug.c, this warning
is emitted when we migrate IRQs away from an offline CPU; that implicitly
implies the IRQ was bound to the offline CPU before this point, but how
could that happen, given that the current dw_pci_msi_set_affinity()
implementation always returns -EINVAL?

thanks


Re: [PATCH v2 0/5] PCI: dwc: improve msi handling

2020-09-28 Thread Jon Hunter


On 27/09/2020 09:28, Jisheng Zhang wrote:

...

> I see: msi_domain_set_affinity() calls parent->chip->irq_set_affinity
> without checking. Grepping the irqchip and PCI directories, I found that
> when the MSI is based on some cascaded interrupt mechanism, they all
> point irq_set_affinity to irq_chip_set_affinity_parent(), so I believe
> the patch below works:
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c 
> b/drivers/pci/controller/dwc/pcie-designware-host.c
> index bf25d783b5c5..093fba616736 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -137,12 +137,6 @@ static void dw_pci_setup_msi_msg(struct irq_data *d, 
> struct msi_msg *msg)
>   (int)d->hwirq, msg->address_hi, msg->address_lo);
>  }
>  
> -static int dw_pci_msi_set_affinity(struct irq_data *d,
> -const struct cpumask *mask, bool force)
> -{
> - return -EINVAL;
> -}
> -
>  static void dw_pci_bottom_mask(struct irq_data *d)
>  {
>   struct pcie_port *pp = irq_data_get_irq_chip_data(d);
> @@ -197,7 +191,7 @@ static struct irq_chip dw_pci_msi_bottom_irq_chip = {
>   .name = "DWPCI-MSI",
>   .irq_ack = dw_pci_bottom_ack,
>   .irq_compose_msi_msg = dw_pci_setup_msi_msg,
> - .irq_set_affinity = dw_pci_msi_set_affinity,
> + .irq_set_affinity = irq_chip_set_affinity_parent,
>   .irq_mask = dw_pci_bottom_mask,
>   .irq_unmask = dw_pci_bottom_unmask,
>  };
> 


Unfortunately, this still crashes ...

[   11.521674] Unable to handle kernel NULL pointer dereference at virtual 
address 0018
[   11.530324] Mem abort info:
[   11.533089]   ESR = 0x9604
[   11.536105]   EC = 0x25: DABT (current EL), IL = 32 bits
[   11.541333]   SET = 0, FnV = 0
[   11.544344]   EA = 0, S1PTW = 0
[   11.547441] Data abort info:
[   11.550279]   ISV = 0, ISS = 0x0004
[   11.554056]   CM = 0, WnR = 0
[   11.557007] user pgtable: 4k pages, 48-bit VAs, pgdp=000467341000
[   11.56] [0018] pgd=, p4d=
[   11.570024] Internal error: Oops: 9604 [#1] PREEMPT SMP
[   11.575517] Modules linked in: crct10dif_ce pwm_tegra snd_hda_core 
phy_tegra194_p2u lm90 pcie_tegra194 tegra_bpmp_thermal pwm_fan ip_tables 
x_tables ipv6
[   11.589046] CPU: 3 PID: 148 Comm: kworker/3:1 Not tainted 
5.9.0-rc4-9-g6fdf18edb995-dirty #7
[   11.597669] Hardware name: NVIDIA Jetson AGX Xavier Developer Kit (DT)
[   11.604110] Workqueue: events deferred_probe_work_func
[   11.609174] pstate: 60c00089 (nZCv daIf +PAN +UAO BTYPE=--)
[   11.614657] pc : irq_chip_set_affinity_parent+0x4/0x30
[   11.619735] lr : msi_domain_set_affinity+0x44/0xc0
[   11.624448] sp : 800012d4b390
[   11.627744] x29: 800012d4b390 x28: 0003e7234c20 
[   11.632983] x27: 0003e913e460 x26:  
[   11.638231] x25: 800011d7e890 x24: 800011d7e8b8 
[   11.643466] x23:  x22: 0003e913e400 
[   11.648701] x21: 0003e913e460 x20: 0003e913e460 
[   11.653932] x19: 800011b19000 x18:  
[   11.659160] x17:  x16:  
[   11.664390] x15: 0001 x14: 0040 
[   11.669636] x13: 0228 x12: 0030 
[   11.674864] x11: 0101010101010101 x10: 0040 
[   11.680111] x9 :  x8 : 0004 
[   11.685363] x7 :  x6 : 00ff 
[   11.690596] x5 :  x4 :  
[   11.695843] x3 : 8000100d89a8 x2 :  
[   11.701058] x1 : 800011d7e8d8 x0 :  
[   11.706288] Call trace:
[   11.708708]  irq_chip_set_affinity_parent+0x4/0x30
[   11.713431]  irq_do_set_affinity+0x4c/0x178
[   11.717540]  irq_setup_affinity+0x124/0x1b0
[   11.721650]  irq_startup+0x6c/0x118
[   11.725081]  __setup_irq+0x810/0x8a0
[   11.728580]  request_threaded_irq+0xdc/0x188
[   11.732793]  pcie_pme_probe+0x98/0x110
[   11.736481]  pcie_port_probe_service+0x34/0x60
[   11.740848]  really_probe+0x110/0x400
[   11.75]  driver_probe_device+0x54/0xb8
[   11.748482]  __device_attach_driver+0x90/0xc0
[   11.752758]  bus_for_each_drv+0x70/0xc8
[   11.756526]  __device_attach+0xec/0x150
[   11.760306]  device_initial_probe+0x10/0x18
[   11.764413]  bus_probe_device+0x94/0xa0
[   11.768203]  device_add+0x464/0x730
[   11.771630]  device_register+0x1c/0x28
[   11.775311]  pcie_port_device_register+0x2d0/0x3e8
[   11.780017]  pcie_portdrv_probe+0x34/0xd8
[   11.783957]  local_pci_probe+0x3c/0xa0
[   11.787647]  pci_device_probe+0x128/0x1c8
[   11.791588]  really_probe+0x110/0x400
[   11.795179]  driver_probe_device+0x54/0xb8
[   11.799202]  __device_attach_driver+0x90/0xc0
[   11.803480]  bus_for_each_drv+0x70/0xc8
[   11.807244]  __device_attach+0xec/0x150
[   11.811009]  device_attach+0x10/0x18
[   11.814518]  pci_bus_add_device+0x4c/0xb0
[   11.818456]  pci_bus_add_devices+0x44/0x90
[   

Re: [PATCH v2 0/5] PCI: dwc: improve msi handling

2020-09-27 Thread Jisheng Zhang
Hi,

On Fri, 25 Sep 2020 16:13:02 +0100 Jon Hunter wrote:

> 
> Hi Jisheng,
> 
> On 25/09/2020 10:27, Jisheng Zhang wrote:
> 
> ...
> 
> >> Could you please try below patch?
> >>
> >>
> >> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c 
> >> b/drivers/pci/controller/dwc/pcie-designware-host.c
> >> index bf25d783b5c5..7e5dc54d060e 100644
> >> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> >> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> >> @@ -197,7 +197,6 @@ static struct irq_chip dw_pci_msi_bottom_irq_chip = {
> >> .name = "DWPCI-MSI",
> >> .irq_ack = dw_pci_bottom_ack,
> >> .irq_compose_msi_msg = dw_pci_setup_msi_msg,
> >> -   .irq_set_affinity = dw_pci_msi_set_affinity,
> >> .irq_mask = dw_pci_bottom_mask,
> >> .irq_unmask = dw_pci_bottom_unmask,
> >>  };  
> >
> > A complete patch w/o compiler warning:
> >
> > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c 
> > b/drivers/pci/controller/dwc/pcie-designware-host.c
> > index bf25d783b5c5..18f719cfed0b 100644
> > --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> > @@ -137,12 +137,6 @@ static void dw_pci_setup_msi_msg(struct irq_data *d, 
> > struct msi_msg *msg)
> >   (int)d->hwirq, msg->address_hi, msg->address_lo);
> >  }
> >
> > -static int dw_pci_msi_set_affinity(struct irq_data *d,
> > -const struct cpumask *mask, bool force)
> > -{
> > - return -EINVAL;
> > -}
> > -
> >  static void dw_pci_bottom_mask(struct irq_data *d)
> >  {
> >   struct pcie_port *pp = irq_data_get_irq_chip_data(d);
> > @@ -197,7 +191,6 @@ static struct irq_chip dw_pci_msi_bottom_irq_chip = {
> >   .name = "DWPCI-MSI",
> >   .irq_ack = dw_pci_bottom_ack,
> >   .irq_compose_msi_msg = dw_pci_setup_msi_msg,
> > - .irq_set_affinity = dw_pci_msi_set_affinity,
> >   .irq_mask = dw_pci_bottom_mask,
> >   .irq_unmask = dw_pci_bottom_unmask,
> >  };
> >  
> 
> 
> Thanks I was not expecting this to work because ...
> 
>  int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
>  bool force)
>  {
>  struct irq_desc *desc = irq_data_to_desc(data);
>  struct irq_chip *chip = irq_data_get_irq_chip(data);
>  int ret;
> 
>  if (!chip || !chip->irq_set_affinity)
>  return -EINVAL;
> 
> However, with your patch Tegra crashes on boot ...
> 
> [   11.613853] Unable to handle kernel NULL pointer dereference at virtual 
> address 
> [   11.622500] Mem abort info:
> [   11.622515]   ESR = 0x8604
> [   11.622524]   EC = 0x21: IABT (current EL), IL = 32 bits
> [   11.622540]   SET = 0, FnV = 0
> [   11.636544]   EA = 0, S1PTW = 0
> [   11.636554] user pgtable: 4k pages, 48-bit VAs, pgdp=00046a28e000
> [   11.636559] [] pgd=, p4d=
> [   11.652652] Internal error: Oops: 8604 [#1] PREEMPT SMP
> [   11.652658] Modules linked in: pwm_tegra phy_tegra194_p2u crct10dif_ce 
> lm90 pwm_fan tegra_bpmp_thermal pcie_tegra194 ip_tables x_tables ipv6
> [   11.670525] CPU: 3 PID: 138 Comm: kworker/3:3 Not tainted 5.9.0-rc4-dirty 
> #12
> [   11.670534] Hardware name: NVIDIA Jetson AGX Xavier Developer Kit (DT)
> [   11.683967] Workqueue: events deferred_probe_work_func
> [   11.683974] pstate: 60c00089 (nZCv daIf +PAN +UAO BTYPE=--)
> [   11.683985] pc : 0x0
> [   11.696669] lr : msi_domain_set_affinity+0x44/0xc0
> [   11.696672] sp : 800012bcb390
> [   11.696680] x29: 800012bcb390 x28: 0003e3033c20
> [   11.709891] x27: 0003e76cfe58 x26: 
> [   11.709900] x25: 800011d7e850 x24: 800011d7e878
> [   11.709908] x23:  x22: 0003e76cfe00
> [   11.709914] x21: 0003e76cfe58 x20: 0003e76cfe58
> [   11.709921] x19: 800011b19000 x18: 
> [   11.709927] x17:  x16: 
> [   11.741262] x15: 800011b19948 x14: 0040
> [   11.741267] x13: 0228 x12: 0030
> [   11.741272] x11: 0101010101010101 x10: 0040
> [   11.741277] x9 :  x8 : 0004
> [   11.741281] x7 :  x6 : 00ff
> [   11.767374] x5 :  x4 : 
> [   11.767379] x3 :  x2 : 
> [   11.767384] x1 : 800011d7e898 x0 : 0003e262bf00
> [   11.767406] Call trace:
> [   11.767410]  0x0
> [   11.767424]  irq_do_set_affinity+0x4c/0x178
> [   11.791400]  irq_setup_affinity+0x124/0x1b0
> [   11.791423]  irq_startup+0x6c/0x118
> [   11.791434]  __setup_irq+0x810/0x8a0
> [   11.802510]  request_threaded_irq+0xdc/0x188
> [   11.802517]  pcie_pme_probe+0x98/0x110
> [   11.802536]  pcie_port_probe_service+0x34/0x60
> [   11.814799]  really_probe+0x110/0x400
> [   11.814809]  driver_probe_device+0x54/0xb8
> [   

Re: [PATCH v2 0/5] PCI: dwc: improve msi handling

2020-09-25 Thread Jon Hunter
Hi Jisheng,

On 25/09/2020 10:27, Jisheng Zhang wrote:

...

>> Could you please try below patch?
>>
>>
>> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c 
>> b/drivers/pci/controller/dwc/pcie-designware-host.c
>> index bf25d783b5c5..7e5dc54d060e 100644
>> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
>> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
>> @@ -197,7 +197,6 @@ static struct irq_chip dw_pci_msi_bottom_irq_chip = {
>> .name = "DWPCI-MSI",
>> .irq_ack = dw_pci_bottom_ack,
>> .irq_compose_msi_msg = dw_pci_setup_msi_msg,
>> -   .irq_set_affinity = dw_pci_msi_set_affinity,
>> .irq_mask = dw_pci_bottom_mask,
>> .irq_unmask = dw_pci_bottom_unmask,
>>  };
> 
> A complete patch w/o compiler warning:
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c 
> b/drivers/pci/controller/dwc/pcie-designware-host.c
> index bf25d783b5c5..18f719cfed0b 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -137,12 +137,6 @@ static void dw_pci_setup_msi_msg(struct irq_data *d, 
> struct msi_msg *msg)
>   (int)d->hwirq, msg->address_hi, msg->address_lo);
>  }
>  
> -static int dw_pci_msi_set_affinity(struct irq_data *d,
> -const struct cpumask *mask, bool force)
> -{
> - return -EINVAL;
> -}
> -
>  static void dw_pci_bottom_mask(struct irq_data *d)
>  {
>   struct pcie_port *pp = irq_data_get_irq_chip_data(d);
> @@ -197,7 +191,6 @@ static struct irq_chip dw_pci_msi_bottom_irq_chip = {
>   .name = "DWPCI-MSI",
>   .irq_ack = dw_pci_bottom_ack,
>   .irq_compose_msi_msg = dw_pci_setup_msi_msg,
> - .irq_set_affinity = dw_pci_msi_set_affinity,
>   .irq_mask = dw_pci_bottom_mask,
>   .irq_unmask = dw_pci_bottom_unmask,
>  };
> 


Thanks. I was not expecting this to work, because ...

 int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
 bool force)
 {
 struct irq_desc *desc = irq_data_to_desc(data);
 struct irq_chip *chip = irq_data_get_irq_chip(data);
 int ret;
 
 if (!chip || !chip->irq_set_affinity)
 return -EINVAL;

However, with your patch Tegra crashes on boot ...

[   11.613853] Unable to handle kernel NULL pointer dereference at virtual 
address 
[   11.622500] Mem abort info:
[   11.622515]   ESR = 0x8604
[   11.622524]   EC = 0x21: IABT (current EL), IL = 32 bits
[   11.622540]   SET = 0, FnV = 0
[   11.636544]   EA = 0, S1PTW = 0
[   11.636554] user pgtable: 4k pages, 48-bit VAs, pgdp=00046a28e000
[   11.636559] [] pgd=, p4d=
[   11.652652] Internal error: Oops: 8604 [#1] PREEMPT SMP
[   11.652658] Modules linked in: pwm_tegra phy_tegra194_p2u crct10dif_ce lm90 
pwm_fan tegra_bpmp_thermal pcie_tegra194 ip_tables x_tables ipv6
[   11.670525] CPU: 3 PID: 138 Comm: kworker/3:3 Not tainted 5.9.0-rc4-dirty #12
[   11.670534] Hardware name: NVIDIA Jetson AGX Xavier Developer Kit (DT)
[   11.683967] Workqueue: events deferred_probe_work_func
[   11.683974] pstate: 60c00089 (nZCv daIf +PAN +UAO BTYPE=--)
[   11.683985] pc : 0x0
[   11.696669] lr : msi_domain_set_affinity+0x44/0xc0
[   11.696672] sp : 800012bcb390
[   11.696680] x29: 800012bcb390 x28: 0003e3033c20 
[   11.709891] x27: 0003e76cfe58 x26:  
[   11.709900] x25: 800011d7e850 x24: 800011d7e878 
[   11.709908] x23:  x22: 0003e76cfe00 
[   11.709914] x21: 0003e76cfe58 x20: 0003e76cfe58 
[   11.709921] x19: 800011b19000 x18:  
[   11.709927] x17:  x16:  
[   11.741262] x15: 800011b19948 x14: 0040 
[   11.741267] x13: 0228 x12: 0030 
[   11.741272] x11: 0101010101010101 x10: 0040 
[   11.741277] x9 :  x8 : 0004 
[   11.741281] x7 :  x6 : 00ff 
[   11.767374] x5 :  x4 :  
[   11.767379] x3 :  x2 :  
[   11.767384] x1 : 800011d7e898 x0 : 0003e262bf00 
[   11.767406] Call trace:
[   11.767410]  0x0
[   11.767424]  irq_do_set_affinity+0x4c/0x178
[   11.791400]  irq_setup_affinity+0x124/0x1b0
[   11.791423]  irq_startup+0x6c/0x118
[   11.791434]  __setup_irq+0x810/0x8a0
[   11.802510]  request_threaded_irq+0xdc/0x188
[   11.802517]  pcie_pme_probe+0x98/0x110
[   11.802536]  pcie_port_probe_service+0x34/0x60
[   11.814799]  really_probe+0x110/0x400
[   11.814809]  driver_probe_device+0x54/0xb8
[   11.822438]  __device_attach_driver+0x90/0xc0
[   11.822463]  bus_for_each_drv+0x70/0xc8
[   11.822471]  __device_attach+0xec/0x150
[   11.834307]  device_initial_probe+0x10/0x18
[   11.834311]  bus_probe_device+0x94/0xa0
[   11.834315]  device_add+0x464/0x730
[   11.834338]  
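The difference between the earlier -22 warnings and this crash comes down
to which caller invokes the hook: irq_do_set_affinity() NULL-checks
chip->irq_set_affinity (hence -EINVAL, i.e. -22), while
msi_domain_set_affinity() forwards to the parent chip's hook without a
check, so deleting the callback turns the call into a jump to address 0
("pc : 0x0" in the oops). A toy model of the guarded path (illustrative C,
not kernel code):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy model; names are hypothetical.  The guarded caller below mirrors
 * the !chip->irq_set_affinity check in irq_do_set_affinity(); an
 * unguarded caller like msi_domain_set_affinity() would just do
 * chip->irq_set_affinity(cpu), which is a call through a NULL pointer
 * once the callback is deleted. */
struct chip_model {
	int (*irq_set_affinity)(unsigned int cpu);
};

/* Guarded path: a missing hook degrades to -EINVAL (-22), which is
 * exactly the "set affinity failed(-22)" warning, not a crash. */
static int do_set_affinity(const struct chip_model *chip, unsigned int cpu)
{
	if (!chip || !chip->irq_set_affinity)
		return -EINVAL;
	return chip->irq_set_affinity(cpu);
}
```

With the callback absent, the guarded caller returns -EINVAL instead of
branching to address 0, which is why dropping the hook is only safe for
callers that perform this check.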

Re: [PATCH v2 0/5] PCI: dwc: improve msi handling

2020-09-25 Thread Jisheng Zhang
On Fri, 25 Sep 2020 17:17:12 +0800
Jisheng Zhang  wrote:

> 
> 
> Hi Jon,
> 
> On Fri, 25 Sep 2020 09:53:45 +0100 Jon Hunter wrote:
> 
> 
> >
> > On 24/09/2020 12:05, Jisheng Zhang wrote:  
> > > Improve the msi code:
> > > 1. Add proper error handling.
> > > 2. Move dw_pcie_msi_init() from each users to designware host to solve
> > > msi page leakage in resume path.  
> >
> > Apologies if this is slightly off topic, but I have been meaning to ask
> > about MSIs and PCI. On Tegra194 which uses the DWC PCI driver, whenever we
> > hotplug CPUs we see the following warnings ...
> >
> >  [  79.068351] WARNING KERN IRQ70: set affinity failed(-22).
> >  [  79.068362] WARNING KERN IRQ71: set affinity failed(-22).
> >
> > These interrupts are the MSIs ...
> >
> > 70:  0  0  0  0  0  0   
> >0  0   PCI-MSI 134217728 Edge  PCIe PME, aerdrv
> > 71:  0  0  0  0  0  0   
> >0  0   PCI-MSI 134742016 Edge  ahci[0001:01:00.0]
> >
> > This caused because ...
> >
> >  static int dw_pci_msi_set_affinity(struct irq_data *d,
> > const struct cpumask *mask, bool force)
> >  {
> >  return -EINVAL;
> >  }
> >
> > Now the above is not unique to the DWC PCI host driver, it appears that
> > most PCIe drivers also do the same. However, I am curious if there is
> > any way to avoid the above warnings given that setting the affinity does
> > not appear to be supported in anyway AFAICT.
> >  
> 
> 
> Could you please try below patch?
> 
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c 
> b/drivers/pci/controller/dwc/pcie-designware-host.c
> index bf25d783b5c5..7e5dc54d060e 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -197,7 +197,6 @@ static struct irq_chip dw_pci_msi_bottom_irq_chip = {
> .name = "DWPCI-MSI",
> .irq_ack = dw_pci_bottom_ack,
> .irq_compose_msi_msg = dw_pci_setup_msi_msg,
> -   .irq_set_affinity = dw_pci_msi_set_affinity,
> .irq_mask = dw_pci_bottom_mask,
> .irq_unmask = dw_pci_bottom_unmask,
>  };

Here is a complete patch that also removes the now-unused function, avoiding a compiler warning:

diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index bf25d783b5c5..18f719cfed0b 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -137,12 +137,6 @@ static void dw_pci_setup_msi_msg(struct irq_data *d, struct msi_msg *msg)
(int)d->hwirq, msg->address_hi, msg->address_lo);
 }
 
-static int dw_pci_msi_set_affinity(struct irq_data *d,
-  const struct cpumask *mask, bool force)
-{
-   return -EINVAL;
-}
-
 static void dw_pci_bottom_mask(struct irq_data *d)
 {
struct pcie_port *pp = irq_data_get_irq_chip_data(d);
@@ -197,7 +191,6 @@ static struct irq_chip dw_pci_msi_bottom_irq_chip = {
.name = "DWPCI-MSI",
.irq_ack = dw_pci_bottom_ack,
.irq_compose_msi_msg = dw_pci_setup_msi_msg,
-   .irq_set_affinity = dw_pci_msi_set_affinity,
.irq_mask = dw_pci_bottom_mask,
.irq_unmask = dw_pci_bottom_unmask,
 };


Re: [PATCH v2 0/5] PCI: dwc: improve msi handling

2020-09-25 Thread Jisheng Zhang
Hi Jon,

On Fri, 25 Sep 2020 09:53:45 +0100 Jon Hunter wrote:


> 
> On 24/09/2020 12:05, Jisheng Zhang wrote:
> > Improve the msi code:
> > 1. Add proper error handling.
> > 2. Move dw_pcie_msi_init() from each user to the designware host to fix
> > the msi page leak in the resume path.
> 
> Apologies if this is slightly off topic, but I have been meaning to ask
> about MSIs and PCI. On Tegra194 which uses the DWC PCI driver, whenever we
> hotplug CPUs we see the following warnings ...
> 
>  [  79.068351] WARNING KERN IRQ70: set affinity failed(-22).
>  [  79.068362] WARNING KERN IRQ71: set affinity failed(-22).
> 
> These interrupts are the MSIs ...
> 
> 70:  0  0  0  0  0  0  0  0   PCI-MSI 134217728 Edge  PCIe PME, aerdrv
> 71:  0  0  0  0  0  0  0  0   PCI-MSI 134742016 Edge  ahci[0001:01:00.0]
> 
> This is caused because ...
> 
>  static int dw_pci_msi_set_affinity(struct irq_data *d,
> const struct cpumask *mask, bool force)
>  {
>  return -EINVAL;
>  }
> 
> Now the above is not unique to the DWC PCI host driver, it appears that
> most PCIe drivers also do the same. However, I am curious if there is
> any way to avoid the above warnings given that setting the affinity does
> not appear to be supported in any way AFAICT.
> 


Could you please try the patch below?


diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index bf25d783b5c5..7e5dc54d060e 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -197,7 +197,6 @@ static struct irq_chip dw_pci_msi_bottom_irq_chip = {
.name = "DWPCI-MSI",
.irq_ack = dw_pci_bottom_ack,
.irq_compose_msi_msg = dw_pci_setup_msi_msg,
-   .irq_set_affinity = dw_pci_msi_set_affinity,
.irq_mask = dw_pci_bottom_mask,
.irq_unmask = dw_pci_bottom_unmask,
 };



Re: [PATCH v2 0/5] PCI: dwc: improve msi handling

2020-09-25 Thread Jon Hunter


On 24/09/2020 12:05, Jisheng Zhang wrote:
> Improve the msi code:
> 1. Add proper error handling.
> 2. Move dw_pcie_msi_init() from each user to the designware host to fix
> the msi page leak in the resume path.

Apologies if this is slightly off topic, but I have been meaning to ask
about MSIs and PCI. On Tegra194 which uses the DWC PCI driver, whenever we
hotplug CPUs we see the following warnings ...

 [  79.068351] WARNING KERN IRQ70: set affinity failed(-22).
 [  79.068362] WARNING KERN IRQ71: set affinity failed(-22).

These interrupts are the MSIs ...

70:  0  0  0  0  0  0  0  0   PCI-MSI 134217728 Edge  PCIe PME, aerdrv
71:  0  0  0  0  0  0  0  0   PCI-MSI 134742016 Edge  ahci[0001:01:00.0]

This is caused because ...

 static int dw_pci_msi_set_affinity(struct irq_data *d,
const struct cpumask *mask, bool force)
 {
 return -EINVAL;
 }

Now the above is not unique to the DWC PCI host driver, it appears that
most PCIe drivers also do the same. However, I am curious if there is
any way to avoid the above warnings given that setting the affinity does
not appear to be supported in any way AFAICT.

Cheers
Jon 

-- 
nvpublic


[PATCH v2 0/5] PCI: dwc: improve msi handling

2020-09-24 Thread Jisheng Zhang
Improve the msi code:
1. Add proper error handling.
2. Move dw_pcie_msi_init() from each user to the designware host to fix
the msi page leak in the resume path.

Since v1:
  - add proper error handling patches.
  - solve the msi page leak by moving dw_pcie_msi_init() from each
    user to the designware host


Jisheng Zhang (5):
  PCI: dwc: Call dma_unmap_page() before freeing the msi page
  PCI: dwc: Check alloc_page() return value
  PCI: dwc: Rename dw_pcie_free_msi to dw_pcie_msi_deinit
  PCI: dwc: Skip PCIE_MSI_INTR0* programming if MSI is disabled
  PCI: dwc: Move dw_pcie_msi_init() from each users to designware host

 drivers/pci/controller/dwc/pci-dra7xx.c   |  1 +
 drivers/pci/controller/dwc/pci-exynos.c   |  2 -
 drivers/pci/controller/dwc/pci-imx6.c |  3 --
 drivers/pci/controller/dwc/pci-meson.c|  8 
 drivers/pci/controller/dwc/pcie-artpec6.c | 10 -
 .../pci/controller/dwc/pcie-designware-host.c | 43 +--
 .../pci/controller/dwc/pcie-designware-plat.c |  3 --
 drivers/pci/controller/dwc/pcie-designware.h  |  9 +++-
 drivers/pci/controller/dwc/pcie-histb.c   |  3 --
 drivers/pci/controller/dwc/pcie-kirin.c   |  3 --
 drivers/pci/controller/dwc/pcie-qcom.c|  3 --
 drivers/pci/controller/dwc/pcie-spear13xx.c   |  1 -
 drivers/pci/controller/dwc/pcie-tegra194.c|  2 -
 drivers/pci/controller/dwc/pcie-uniphier.c|  9 +---
 14 files changed, 38 insertions(+), 62 deletions(-)

-- 
2.28.0