Re: [Linux-stm32] [PATCH v2 00/14] Introduce STM32MP1 RCC in secured mode

2021-03-11 Thread Alex G.
2-As you suggest, create a new "secure" dtb per boards (Not my wish for maintenance perspectives). I agree with Alex (G) that the "secure" option should be opt-in. That way existing setups remain working and no extra requirements are imposed on MP1 users. Esp

Re: [PATCH v2 00/14] Introduce STM32MP1 RCC in secured mode

2021-03-09 Thread Alex G.
On 1/26/21 3:01 AM, gabriel.fernan...@foss.st.com wrote: From: Gabriel Fernandez Platform STM32MP1 can be used in configuration where some clocks and IP resets can relate as secure resources. These resources are moved from a RCC clock/reset handle to a SCMI clock/reset_domain handle. The RCC

Re: Issues with "PCI/LINK: Report degraded links via link bandwidth notification"

2021-02-02 Thread Alex G.
On 2/2/21 2:16 PM, Bjorn Helgaas wrote: On Tue, Feb 02, 2021 at 01:50:20PM -0600, Alex G. wrote: On 1/29/21 3:56 PM, Bjorn Helgaas wrote: On Thu, Jan 28, 2021 at 06:07:36PM -0600, Alex G. wrote: On 1/28/21 5:51 PM, Sinan Kaya wrote: On 1/28/2021 6:39 PM, Bjorn Helgaas wrote: AFAICT

Re: Issues with "PCI/LINK: Report degraded links via link bandwidth notification"

2021-02-02 Thread Alex G.
On 1/29/21 3:56 PM, Bjorn Helgaas wrote: On Thu, Jan 28, 2021 at 06:07:36PM -0600, Alex G. wrote: On 1/28/21 5:51 PM, Sinan Kaya wrote: On 1/28/2021 6:39 PM, Bjorn Helgaas wrote: AFAICT, this thread petered out with no resolution. If the bandwidth change notifications are important

Re: Issues with "PCI/LINK: Report degraded links via link bandwidth notification"

2021-01-28 Thread Alex G.
On 1/28/21 5:51 PM, Sinan Kaya wrote: On 1/28/2021 6:39 PM, Bjorn Helgaas wrote: AFAICT, this thread petered out with no resolution. If the bandwidth change notifications are important to somebody, please speak up, preferably with a patch that makes the notifications disabled by default and

Re: [PATCH v2 1/2] drm/bridge: sii902x: Enable I/O and core VCC supplies if present

2020-10-20 Thread Alex G.
On 10/20/20 2:16 AM, Sam Ravnborg wrote: Hi Alex. [snip] diff --git a/drivers/gpu/drm/bridge/sii902x.c b/drivers/gpu/drm/bridge/sii902x.c index 33fd33f953ec..d15e9f2c0d8a 100644 --- a/drivers/gpu/drm/bridge/sii902x.c +++ b/drivers/gpu/drm/bridge/sii902x.c @@ -17,6 +17,7 @@ #include

Re: [PATCH v2 1/2] drm/bridge: sii902x: Enable I/O and core VCC supplies if present

2020-10-19 Thread Alex G.
On 9/28/20 12:30 PM, Alexandru Gagniuc wrote: On the SII9022, the IOVCC and CVCC12 supplies must reach the correct voltage before the reset sequence is initiated. On most boards, this assumption is true at boot-up, so initialization succeeds. However, when we try to initialize the chip with

Re: [PATCH 1/2] drm/bridge: sii902x: Enable I/O and core VCC supplies if present

2020-09-28 Thread Alex G.
On 9/26/20 1:49 PM, Sam Ravnborg wrote: Hi Alexandru On Thu, Sep 24, 2020 at 03:05:05PM -0500, Alexandru Gagniuc wrote: On the SII9022, the IOVCC and CVCC12 supplies must reach the correct voltage before the reset sequence is initiated. On most boards, this assumption is true at boot-up, so

Re: [PATCH 4/5] PCI: only return true when dev io state is really changed

2020-09-25 Thread Alex G.
Hi Ethan, On 9/24/20 9:34 PM, Ethan Zhao wrote: When uncorrectable error happens, AER driver and DPC driver interrupt handlers likely call pcie_do_recovery()->pci_walk_bus()->report_frozen_detected() with pci_channel_io_frozen the same time. If pci_dev_set_io_state() return true even if

Re: [PATCH 1/2] drm/bridge: sii902x: Enable I/O and core VCC supplies if present

2020-09-24 Thread Alex G.
On 9/24/20 3:22 PM, Fabio Estevam wrote: Hi Fabio, On Thu, Sep 24, 2020 at 5:16 PM Alexandru Gagniuc wrote: + ret = regulator_enable(sii902x->cvcc12); + if (ret < 0) { + dev_err(dev, "Failed to enable cvcc12 supply: %d\n", ret); +

Re: [PATCH v3 3/3] PCI: pciehp: Add dmi table for in-band presence disabled

2019-10-21 Thread Alex G.
On 10/21/19 1:19 PM, Stuart Hayes wrote: On 10/21/19 8:47 AM, Mika Westerberg wrote: On Thu, Oct 17, 2019 at 03:32:56PM -0400, Stuart Hayes wrote: Some systems have in-band presence detection disabled for hot-plug PCI slots, but do not report this in the slot capabilities 2 (SLTCAP2)

Re: [PATCH 0/3] PCI: pciehp: Do not turn off slot if presence comes up after link

2019-10-02 Thread Alex G.
On 10/1/19 11:13 PM, Lukas Wunner wrote: On Tue, Oct 01, 2019 at 05:14:16PM -0400, Stuart Hayes wrote: This patch set is based on a patch set [1] submitted many months ago by Alexandru Gagniuc, who is no longer working on it. [1] https://patchwork.kernel.org/cover/10909167/ [v3,0/4] PCI:

Re: [PATCH 3/3] PCI: pciehp: Add dmi table for in-band presence disabled

2019-10-01 Thread Alex G.
On 10/1/19 4:14 PM, Stuart Hayes wrote: Some systems have in-band presence detection disabled for hot-plug PCI slots, but do not report this in the slot capabilities 2 (SLTCAP2) register. On these systems, presence detect can become active well after the link is reported to be active, which

Re: [PATCH] Revert "PCI/LINK: Report degraded links via link bandwidth notification"

2019-04-29 Thread Alex G
On 4/29/19 1:56 PM, Bjorn Helgaas wrote: From: Bjorn Helgaas This reverts commit e8303bb7a75c113388badcc49b2a84b4121c1b3e. e8303bb7a75c added logging whenever a link changed speed or width to a state that is considered degraded. Unfortunately, it cannot differentiate signal integrity-related

Re: [PATCH] PCI: Add link_change error handler and vfio-pci user

2019-04-24 Thread Alex G
On 4/24/19 12:19 PM, Alex Williamson wrote: On Wed, 24 Apr 2019 16:45:45 + wrote: On 4/23/2019 5:42 PM, Alex Williamson wrote: diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 7e12d0163863..233cd4b5b6e8 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -2403,6

Re: [PATCH] PCI/LINK: Account for BW notification in vector calculation

2019-04-23 Thread Alex G
On 4/22/19 5:43 PM, Alex Williamson wrote: On systems that don't support any PCIe services other than bandwidth notification, pcie_message_numbers() can return zero vectors, causing the vector reallocation in pcie_port_enable_irq_vec() to retry with zero, which fails, resulting in fallback to

Re: [PATCH] PCI/LINK: Account for BW notification in vector calculation

2019-04-23 Thread Alex G
On 4/23/19 12:10 PM, Bjorn Helgaas wrote: On Tue, Apr 23, 2019 at 09:33:53AM -0500, Alex G wrote: On 4/22/19 7:33 PM, Alex Williamson wrote: There is nothing wrong happening here that needs to fill logs. I thought maybe if I enabled notification of autonomous bandwidth changes that it might

Re: [PATCH] PCI/LINK: Account for BW notification in vector calculation

2019-04-23 Thread Alex G
On 4/23/19 11:22 AM, Alex Williamson wrote: Nor should pci-core decide what link speed changes are intended or errors. Minimally we should be enabling drivers to receive this feedback. Thanks, Not errors. pci core reports that a link speed change event has occurred. Period. Alex

Re: [PATCH] PCI/LINK: Account for BW notification in vector calculation

2019-04-23 Thread Alex G
On 4/23/19 10:34 AM, Alex Williamson wrote: On Tue, 23 Apr 2019 09:33:53 -0500 Alex G wrote: On 4/22/19 7:33 PM, Alex Williamson wrote: On Mon, 22 Apr 2019 19:05:57 -0500 Alex G wrote: echo :07:00.0:pcie010 | sudo tee /sys/bus/pci_express/drivers/pcie_bw_notification/unbind That's

Re: [PATCH] PCI/LINK: Account for BW notification in vector calculation

2019-04-23 Thread Alex G
On 4/22/19 7:33 PM, Alex Williamson wrote: On Mon, 22 Apr 2019 19:05:57 -0500 Alex G wrote: echo :07:00.0:pcie010 | sudo tee /sys/bus/pci_express/drivers/pcie_bw_notification/unbind That's a bad solution for users, this is meaningless tracking of a device whose driver is actively

Re: [PATCH] PCI/LINK: Account for BW notification in vector calculation

2019-04-22 Thread Alex G
On 4/22/19 5:43 PM, Alex Williamson wrote: [ 329.725607] vfio-pci :07:00.0: 32.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s x16 link at :00:02.0 (capable of 64.000 Gb/s with 5 GT/s x16 link) [ 708.151488] vfio-pci :07:00.0: 32.000 Gb/s available PCIe bandwidth, limited by

Re: [PATCH v1 1/3] PCI / ACPI: Do not export pci_get_hp_params()

2019-04-22 Thread Alex G
On 4/22/19 3:58 PM, Bjorn Helgaas wrote: On Fri, Feb 08, 2019 at 10:24:11AM -0600, Alexandru Gagniuc wrote: This is only used within drivers/pci, and there is no reason to make it available outside of the PCI core. Signed-off-by: Alexandru Gagniuc Applied the whole series to pci/hotplug for

Fixing the GHES driver vs not causing issues in the first place

2019-03-29 Thread Alex G.
The issue of dying inside the GHES driver has popped up a few times before. I've looked into fixing this before, but we didn't quite come to agreement because the verbiage in the ACPI spec is vague: "When a fatal uncorrected error occurs, the system is restarted to prevent propagation

Re: [PATCH v2] PCI/LINK: bw_notification: Do not leave interrupt handler NULL

2019-03-25 Thread Alex G.
On 3/25/19 5:25 PM, Bjorn Helgaas wrote: On Fri, Mar 22, 2019 at 07:36:51PM -0500, Alexandru Gagniuc wrote: A threaded IRQ with a NULL handler does not work with level-triggered interrupts. request_threaded_irq() will return an error: genirq: Threaded irq requested with handler=NULL and

Re: [PATCH] PCI/LINK: Request a one-shot IRQ with NULL handler

2019-03-25 Thread Alex G.
Hi Borislav, Thanks for the update. We've settled on a different fix [1], since Lukas was not happy with IRQF_ONESHOT [2]. Alex [1] https://lore.kernel.org/linux-pci/20190323003700.7294-1-mr.nuke...@gmail.com/ [2] https://lore.kernel.org/linux-pci/20190318043314.noyj6t4sh26sp...@wunner.de/

Re: [PATCH v3] PCI/MSI: Don't touch MSI bits when the PCI device is disconnected

2019-03-20 Thread Alex G
On 3/20/19 4:44 PM, Linus Torvalds wrote: On Wed, Mar 20, 2019 at 1:52 PM Bjorn Helgaas wrote: AFAICT, the consensus there was that it would be better to find some sort of platform solution instead of dealing with it in individual drivers. The PCI core isn't really a driver, but I think the

Re: [PATCH] PCI/LINK: bw_notification: Do not leave interrupt handler NULL

2019-03-20 Thread Alex G.
On 3/20/19 8:46 AM, Bjorn Helgaas wrote: Hi Alexandru, On Mon, Mar 18, 2019 at 08:12:04PM -0500, Alexandru Gagniuc wrote: A threaded IRQ with a NULL handler does not work with level-triggered interrupts. request_threaded_irq() will return an error: genirq: Threaded irq requested with

Re: [GIT PULL] PCI changes for v5.1

2019-03-17 Thread Alex G
On 3/17/19 4:18 PM, Linus Torvalds wrote: On Fri, Mar 8, 2019 at 9:31 AM Bjorn Helgaas wrote: - Report PCIe links that become degraded at run-time (Alexandru Gagniuc) Gaah. Only now as I'm about to do the rc1 release am I looking at new runtime warnings, and noticing that this causes

Re: [PATCH v2] PCI: pciehp: Report degraded links via link bandwidth notification

2018-12-27 Thread Alex G.
On 12/7/18 12:20 PM, Alexandru Gagniuc wrote: A warning is generated when a PCIe device is probed with a degraded link, but there was no similar mechanism to warn when the link becomes degraded after probing. The Link Bandwidth Notification provides this mechanism. Use the link bandwidth

Re: [PATCH v2] PCI/MSI: Don't touch MSI bits when the PCI device is disconnected

2018-11-05 Thread Alex G.
ping On 09/18/2018 05:15 PM, Alexandru Gagniuc wrote: When a PCI device is gone, we don't want to send IO to it if we can avoid it. We expose functionality via the irq_chip structure. As users of that structure may not know about the underlying PCI device, it's our responsibility to guard

Re: [PATCH] PCI/MSI: Don't touch MSI bits when the PCI device is disconnected

2018-08-29 Thread Alex G.
Should I resubmit this rebased on 4.19-rc*, or just leave this patch as is? Alex On 07/30/2018 04:21 PM, Alexandru Gagniuc wrote: When a PCI device is gone, we don't want to send IO to it if we can avoid it. We expose functionality via the irq_chip structure. As users of that structure may not

Re: [PATCH] PCI/AER: Do not clear AER bits if we don't own AER

2018-08-09 Thread Alex G.
On 08/09/2018 02:18 PM, Bjorn Helgaas wrote: On Thu, Aug 09, 2018 at 02:00:23PM -0500, Alex G. wrote: On 08/09/2018 01:29 PM, Bjorn Helgaas wrote: On Thu, Aug 09, 2018 at 04:46:32PM +, alex_gagn...@dellteam.com wrote: On 08/09/2018 09:16 AM, Bjorn Helgaas wrote: (snip_

Re: [PATCH] PCI/AER: Do not clear AER bits if we don't own AER

2018-08-09 Thread Alex G.
On 08/09/2018 01:29 PM, Bjorn Helgaas wrote: On Thu, Aug 09, 2018 at 04:46:32PM +, alex_gagn...@dellteam.com wrote: On 08/09/2018 09:16 AM, Bjorn Helgaas wrote: (snip_ enable_ecrc_checking() disable_ecrc_checking() I don't immediately see how this would affect FFS, but the

Re: [PATCH v3] PCI/AER: Do not clear AER bits if we don't own AER

2018-08-07 Thread Alex G.
On 08/07/2018 08:14 PM, Bjorn Helgaas wrote: On Mon, Jul 30, 2018 at 06:35:31PM -0500, Alexandru Gagniuc wrote: When we don't own AER, we shouldn't touch the AER error bits. Clearing error bits willy-nilly might cause firmware to miss some errors. In theory, these bits get cleared by FFS, or

Re: [PATCH v5] PCI: Check for PCIe downtraining conditions

2018-07-31 Thread Alex G.
On 07/31/2018 01:40 AM, Tal Gilboa wrote: [snip] @@ -2240,6 +2258,9 @@ static void pci_init_capabilities(struct pci_dev *dev)   /* Advanced Error Reporting */   pci_aer_init(dev); +    /* Check link and detect downtrain errors */ +    pcie_check_upstream_link(dev); This is called for

Re: [PATCH v2] PCI/AER: Do not clear AER bits if we don't own AER

2018-07-24 Thread Alex G.
On 07/23/2018 11:52 AM, Alexandru Gagniuc wrote: When we don't own AER, we shouldn't touch the AER error bits. Clearing error bits willy-nilly might cause firmware to miss some errors. In theory, these bits get cleared by FFS, or via ACPI _HPX method. These mechanisms are not subject to the

Re: [PATCH v5] PCI: Check for PCIe downtraining conditions

2018-07-23 Thread Alex G.
On 07/23/2018 05:14 PM, Jakub Kicinski wrote: On Tue, 24 Jul 2018 00:52:22 +0300, Tal Gilboa wrote: On 7/24/2018 12:01 AM, Jakub Kicinski wrote: On Mon, 23 Jul 2018 15:03:38 -0500, Alexandru Gagniuc wrote: PCIe downtraining happens when both the device and PCIe port are capable of a larger

Re: [PATCH v3] PCI: Check for PCIe downtraining conditions

2018-07-23 Thread Alex G.
On 07/23/2018 12:21 AM, Tal Gilboa wrote: On 7/19/2018 6:49 PM, Alex G. wrote: On 07/18/2018 08:38 AM, Tal Gilboa wrote: On 7/16/2018 5:17 PM, Bjorn Helgaas wrote: [+cc maintainers of drivers that already use pcie_print_link_status() and GPU folks] [snip] +    /* Multi-function PCIe

Re: [PATCH] PCI/AER: Do not clear AER bits if we don't own AER

2018-07-19 Thread Alex G.
On 07/19/2018 11:58 AM, Sinan Kaya wrote: On 7/19/2018 8:55 AM, Alex G. wrote: I find the intent clearer if we check it here rather than having to do the mental parsing of the state of aer_cap. I don't feel too strong about my comment to be honest. This was a style/maintenance comment

Re: [PATCH] PCI/AER: Do not clear AER bits if we don't own AER

2018-07-19 Thread Alex G.
On 07/17/2018 10:41 AM, Sinan Kaya wrote: On 7/17/2018 8:31 AM, Alexandru Gagniuc wrote: +    if (pcie_aer_get_firmware_first(dev)) +    return -EIO; Can you move this to closer to the caller pci_aer_init()? I could move it there. although pci_cleanup_aer_error_status_regs() is

Re: [PATCH v3] PCI: Check for PCIe downtraining conditions

2018-07-19 Thread Alex G.
On 07/18/2018 08:38 AM, Tal Gilboa wrote: On 7/16/2018 5:17 PM, Bjorn Helgaas wrote: [+cc maintainers of drivers that already use pcie_print_link_status() and GPU folks] [snip] +    /* Multi-function PCIe share the same link/status. */ +    if ((PCI_FUNC(dev->devfn) != 0) ||

Re: [PATCH v3] PCI: Check for PCIe downtraining conditions

2018-07-19 Thread Alex G.
On 07/18/2018 04:53 PM, Bjorn Helgaas wrote: [+cc Mike (hfi1)] On Mon, Jul 16, 2018 at 10:28:35PM +, alex_gagn...@dellteam.com wrote: On 7/16/2018 4:17 PM, Bjorn Helgaas wrote: ... The easiest way to detect this is with pcie_print_link_status(), since the bottleneck is usually the link

Re: [PATCH v3] PCI/AER: Fix aerdrv loading with "pcie_ports=native" parameter

2018-07-03 Thread Alex G.
On 07/03/2018 11:38 AM, Bjorn Helgaas wrote: > On Mon, Jul 02, 2018 at 11:16:01AM -0500, Alexandru Gagniuc wrote: >> According to the documentation, "pcie_ports=native", linux should use >> native AER and DPC services. While that is true for the _OSC method >> parsing, this is not the only

Re: [PATCH v2] PCI/AER: Fix aerdrv loading with "pcie_ports=native" parameter

2018-07-02 Thread Alex G.
On 07/02/2018 08:16 AM, Bjorn Helgaas wrote: > On Sat, Jun 30, 2018 at 11:39:00PM -0500, Alex G wrote: >> On 06/30/2018 04:31 PM, Bjorn Helgaas wrote: >>> [+cc Borislav, linux-acpi, since this involves APEI/HEST] >> >> Borislav is not the relevant maintainer he

Re: [PATCH v2] PCI/AER: Fix aerdrv loading with "pcie_ports=native" parameter

2018-06-30 Thread Alex G
On 06/30/2018 04:31 PM, Bjorn Helgaas wrote: [+cc Borislav, linux-acpi, since this involves APEI/HEST] Borislav is not the relevant maintainer here, since we're not contingent on APEI handling. I think Keith has a lot more experience with this part of the kernel. On Tue, Jun 19, 2018 at

Re: [PATCH] PCI: DPC: Clear AER status bits before disabling port containment

2018-06-26 Thread Alex G.
On 06/19/2018 04:57 PM, Bjorn Helgaas wrote: > On Wed, May 16, 2018 at 05:12:21PM -0600, Keith Busch wrote: >> On Wed, May 16, 2018 at 06:44:22PM -0400, Sinan Kaya wrote: >>> On 5/16/2018 5:33 PM, Alexandru Gagniuc wrote: AER status bits are sticky, and they survive system resets.

Re: [PATCH v2] PCI: Check for PCIe downtraining conditions

2018-06-01 Thread Alex G.
On 06/01/2018 10:10 AM, Sinan Kaya wrote: > On 6/1/2018 11:06 AM, Alex G. wrote: >> On 06/01/2018 10:03 AM, Sinan Kaya wrote: >>> On 6/1/2018 11:01 AM, Alexandru Gagniuc wrote: >>>> + /* Multi-function PCIe share the same link/status. */ >>

Re: [PATCH v2] PCI: Check for PCIe downtraining conditions

2018-06-01 Thread Alex G.
On 06/01/2018 10:12 AM, Andy Shevchenko wrote: > On Fri, Jun 1, 2018 at 6:01 PM, Alexandru Gagniuc > wrote: >> PCIe downtraining happens when both the device and PCIe port are >> capable of a larger bus width or higher speed than negotiated. >> Downtraining might be indicative of other problems

Re: [PATCH v2] PCI: Check for PCIe downtraining conditions

2018-06-01 Thread Alex G.
On 06/01/2018 10:03 AM, Sinan Kaya wrote: > On 6/1/2018 11:01 AM, Alexandru Gagniuc wrote: >> +/* Multi-function PCIe share the same link/status. */ >> +if (PCI_FUNC(dev->devfn) != 0) >> +return; > > How about virtual functions? I have almost no clue about those. Is your

Re: [PATCH] PCI: Check for PCIe downtraining conditions

2018-05-31 Thread Alex G.
On 05/31/2018 12:27 PM, Alex G. wrote: > On 05/31/2018 12:11 PM, Sinan Kaya wrote: >> On 5/31/2018 12:49 PM, Alex G. wrote: >>>>bw_cap = pcie_bandwidth_capable(dev, _cap, _cap); >>>>bw_avail = pcie_bandwidth_available(dev, _dev, , , >>>>

Re: [PATCH] PCI: Check for PCIe downtraining conditions

2018-05-31 Thread Alex G.
On 05/31/2018 10:30 AM, Sinan Kaya wrote: > On 5/31/2018 11:05 AM, Alexandru Gagniuc wrote: >> +if (dev_cur_speed < max_link_speed) >> +pci_warn(dev, "PCIe downtrain: link speed is %s (%s capable)", >> + pcie_bus_speed_name(dev_cur_speed), >> +

Re: [PATCH] PCI: Check for PCIe downtraining conditions

2018-05-31 Thread Alex G.
On 05/31/2018 12:11 PM, Sinan Kaya wrote: > On 5/31/2018 12:49 PM, Alex G. wrote: >>> bw_cap = pcie_bandwidth_capable(dev, _cap, _cap); >>> bw_avail = pcie_bandwidth_available(dev, _dev, , , >>> *parent*); >> That's confusing. I'd expect _capable() a

Re: [PATCH] PCI: Check for PCIe downtraining conditions

2018-05-31 Thread Alex G.
On 05/31/2018 11:49 AM, Alex G. wrote: > > > On 05/31/2018 11:13 AM, Sinan Kaya wrote: >> On 5/31/2018 12:01 PM, Alex G. wrote: >>>> PCI: Add pcie_print_link_status() to log link speed and whether it's >>>> limited >>> This o

Re: [PATCH] PCI: Check for PCIe downtraining conditions

2018-05-31 Thread Alex G.
On 05/31/2018 11:13 AM, Sinan Kaya wrote: > On 5/31/2018 12:01 PM, Alex G. wrote: >>> PCI: Add pcie_print_link_status() to log link speed and whether it's >>> limited >> This one, I have, but it's not what I need. This looks at the available >> ban

Re: [PATCH] PCI: Check for PCIe downtraining conditions

2018-05-31 Thread Alex G.
On 05/31/2018 10:54 AM, Sinan Kaya wrote: > On 5/31/2018 11:46 AM, Alex G. wrote: >>> https://lkml.org/lkml/2018/3/30/553 >> Oh, pcie_get_speed_cap()/pcie_get_width_cap() seems to handle the >> capability. Not seeing one for status and speed name. >> >>>

Re: [PATCH] PCI: Check for PCIe downtraining conditions

2018-05-31 Thread Alex G.
On 05/31/2018 10:38 AM, Sinan Kaya wrote: > On 5/31/2018 11:29 AM, alex_gagn...@dellteam.com wrote: >> On 5/31/2018 10:28 AM, Sinan Kaya wrote: >>> On 5/31/2018 11:05 AM, Alexandru Gagniuc wrote: +static void pcie_max_link_cap(struct pci_dev *dev, enum pci_bus_speed *speed, +

Re: [PATCH 1/5] PCI/AER: Define and allocate aer_stats structure for AER capable devices

2018-05-23 Thread Alex G.
On 05/23/2018 09:32 AM, Jes Sorensen wrote: > On 05/23/2018 10:26 AM, Matthew Wilcox wrote: >> On Wed, May 23, 2018 at 10:20:10AM -0400, Jes Sorensen wrote: +++ b/drivers/pci/pcie/aer/aerdrv_stats.c @@ -0,0 +1,64 @@ +// SPDX-License-Identifier: GPL-2.0 >>> >>> Fix the formatting

Re: [PATCH 1/5] PCI/AER: Define and allocate aer_stats structure for AER capable devices

2018-05-23 Thread Alex G.
On 05/23/2018 09:20 AM, Jes Sorensen wrote: > On 05/22/2018 06:28 PM, Rajat Jain wrote: >> new file mode 100644 >> index ..b9f251992209 >> --- /dev/null >> +++ b/drivers/pci/pcie/aer/aerdrv_stats.c >> @@ -0,0 +1,64 @@ >> +// SPDX-License-Identifier: GPL-2.0 > > Fix the formatting

Re: [PATCH 5/5] Documentation/PCI: Add details of PCI AER statistics

2018-05-22 Thread Alex G.
On 05/22/2018 05:28 PM, Rajat Jain wrote: > Add the PCI AER statistics details to > Documentation/PCI/pcieaer-howto.txt > > Signed-off-by: Rajat Jain > --- > Documentation/PCI/pcieaer-howto.txt | 35 + > 1 file changed, 35 insertions(+) > > diff --git

Re: [PATCH 2/5] PCI/AER: Add sysfs stats for AER capable devices

2018-05-22 Thread Alex G.
On 05/22/2018 05:28 PM, Rajat Jain wrote: > Add the following AER sysfs stats to represent the counters for each > kind of error as seen by the device: > > dev_total_cor_errs > dev_total_fatal_errs > dev_total_nonfatal_errs > > Signed-off-by: Rajat Jain > --- > drivers/pci/pci-sysfs.c

Re: [PATCH v6 1/2] acpi: apei: Rename ghes_severity() to ghes_cper_severity()

2018-05-22 Thread Alex G.
On 05/22/2018 01:45 PM, Luck, Tony wrote: > On Tue, May 22, 2018 at 01:19:34PM -0500, Alex G. wrote: >> Firmware started passing "fatal" GHES headers with the explicit intent of >> crashing an OS. At the same time, we've learnt how to handle these errors in >> a nu

Re: [PATCH v6 1/2] acpi: apei: Rename ghes_severity() to ghes_cper_severity()

2018-05-22 Thread Alex G.
On 05/22/2018 01:13 PM, Rafael J. Wysocki wrote: (snip) Of course, you are free to have a differing opinion and I don't have to convince you about my point. You need to convince me about your point to get the patch in through my tree, which you haven't done so far. My point is that crossing

Re: [PATCH v6 1/2] acpi: apei: Rename ghes_severity() to ghes_cper_severity()

2018-05-22 Thread Alex G.
On 05/22/2018 01:10 PM, Rafael J. Wysocki wrote: On Tue, May 22, 2018 at 7:57 PM, Luck, Tony wrote: On Tue, May 22, 2018 at 04:54:26PM +0200, Borislav Petkov wrote: I especially don't want to have the case where a PCIe error is *really* fatal and then we noodle in some handlers debating about

Re: [PATCH v6 1/2] acpi: apei: Rename ghes_severity() to ghes_cper_severity()

2018-05-22 Thread Alex G.
On 05/22/2018 12:57 PM, Luck, Tony wrote: On Tue, May 22, 2018 at 04:54:26PM +0200, Borislav Petkov wrote: I especially don't want to have the case where a PCIe error is *really* fatal and then we noodle in some handlers debating about the severity because it got marked as recoverable

Re: [PATCH v6 1/2] acpi: apei: Rename ghes_severity() to ghes_cper_severity()

2018-05-22 Thread Alex G.
On 05/22/2018 09:54 AM, Borislav Petkov wrote: > On Tue, May 22, 2018 at 09:39:15AM -0500, Alex G. wrote: >> No, the problem is with the current approach, not with mine. The problem >> is trying to handle the error outside of the existing handler. That's a >> no-no, IMO.
