Re: [PATCH] powerpc/eeh: Fix enabling bridge MMIO windows

2018-04-18 Thread Russell Currey
On Wed, 2018-04-11 at 13:37 +1000, Michael Neuling wrote:
> On boot we save the configuration space of PCIe bridges. We do this
> so
> when we get an EEH event and everything gets reset that we can
> restore
> them.
> 
> Unfortunately we save this state before we've enabled the MMIO space
> on the bridges. Hence if we have to reset the bridge when we come
> back
> MMIO is not enabled and we end up taking an PE freeze when the driver
> starts accessing again.
> 
> This patch forces the memory/MMIO and bus mastering on when restoring
> bridges on EEH. Ideally we'd do this correctly by saving the
> configuration space writes later, but that will have to come later in
> a larger EEH rewrite. For now we have this simple fix.
> 
> The original bug can be triggered on a boston machine by doing:
>   echo 0x8000 >
> /sys/kernel/debug/powerpc/PCI0001/err_injct_outbound
> On boston, this PHB has a PCIe switch on it.  Without this patch,
> you'll see two EEH events, 1 expected and 1 the failure we are fixing
> here. The second EEH event causes the anything under the PHB to
> disappear (i.e. the i40e eth).
> 
> With this patch, only 1 EEH event occurs and devices properly
> recover.
> 
> Reported-by: Pridhiviraj Paidipeddi 
> Signed-off-by: Michael Neuling 
> Cc: sta...@vger.kernel.org

Acked-by: Russell Currey 


[PATCH] powerpc/eeh: Fix enabling bridge MMIO windows

2018-04-10 Thread Michael Neuling
On boot we save the configuration space of PCIe bridges. We do this so
when we get an EEH event and everything gets reset that we can restore
them.

Unfortunately we save this state before we've enabled the MMIO space
on the bridges. Hence if we have to reset the bridge when we come back
MMIO is not enabled and we end up taking an PE freeze when the driver
starts accessing again.

This patch forces the memory/MMIO and bus mastering on when restoring
bridges on EEH. Ideally we'd do this correctly by saving the
configuration space writes later, but that will have to come later in
a larger EEH rewrite. For now we have this simple fix.

The original bug can be triggered on a boston machine by doing:
  echo 0x8000 > /sys/kernel/debug/powerpc/PCI0001/err_injct_outbound
On boston, this PHB has a PCIe switch on it.  Without this patch,
you'll see two EEH events, 1 expected and 1 the failure we are fixing
here. The second EEH event causes the anything under the PHB to
disappear (i.e. the i40e eth).

With this patch, only 1 EEH event occurs and devices properly recover.

Reported-by: Pridhiviraj Paidipeddi 
Signed-off-by: Michael Neuling 
Cc: sta...@vger.kernel.org
---
 arch/powerpc/kernel/eeh_pe.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index 2d4956e97a..ee5a67d57a 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -807,7 +807,8 @@ static void eeh_restore_bridge_bars(struct eeh_dev *edev)
eeh_ops->write_config(pdn, 15*4, 4, edev->config_space[15]);
 
/* PCI Command: 0x4 */
-   eeh_ops->write_config(pdn, PCI_COMMAND, 4, edev->config_space[1]);
+   eeh_ops->write_config(pdn, PCI_COMMAND, 4, edev->config_space[1] |
+ PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER);
 
/* Check the PCIe link is ready */
eeh_bridge_check_link(edev);
-- 
2.14.1