RE: [PATCH 4/4] edac/85xx: PCI/PCIE error interrupt edac support.

2011-08-03 Thread Xie Shaohui-B21989
Hi all,

Any concerns about this patch?


Best Regards, 
Shaohui Xie 



RE: [PATCH 4/4] edac/85xx: PCI/PCIE error interrupt edac support.

2011-07-25 Thread Xie Shaohui-B21989
I've verified that this patch applies to the 'next' branch of
galak/powerpc.git without changes.


Best Regards, 
Shaohui Xie 


>-Original Message-
>From: Xie Shaohui-B21989
>Sent: Thursday, July 21, 2011 6:33 PM
>To: linuxppc-dev@lists.ozlabs.org
>Cc: Gala Kumar-B11780; mm-comm...@vger.kernel.org; avoront...@mvista.com;
>da...@davemloft.net; grant.lik...@secretlab.ca; a...@linux-foundation.org;
>Jiang Kai-B18973; Kumar Gala; Xie Shaohui-B21989
>Subject: [PATCH 4/4] edac/85xx: PCI/PCIE error interrupt edac support.
>
>From: Kai.Jiang 
>
>Add PCIe error interrupt EDAC support for mpc85xx and p4080.
>mpc85xx uses the legacy interrupt reporting mechanism: the error
>interrupts are reported directly to the MPIC. The p4080, by contrast,
>attaches most of its error interrupts to interrupt 0 and reports them to
>the MPIC via interrupt 0. This patch handles both cases.
>
>Because the error management register offsets and definitions differ
>between PCI and PCIe, use the ccsr_pci structure to merge the PCI and
>PCIe EDAC code into one.
>
>Signed-off-by: Kai.Jiang 
>Signed-off-by: Kumar Gala 
>Signed-off-by: Shaohui Xie 
>---
> drivers/edac/mpc85xx_edac.c |  239 --
>
> drivers/edac/mpc85xx_edac.h |   17 +--
> 2 files changed, 188 insertions(+), 68 deletions(-)
>
>diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
>index b048a5f..dde156f 100644
>--- a/drivers/edac/mpc85xx_edac.c
>+++ b/drivers/edac/mpc85xx_edac.c
>@@ -1,5 +1,6 @@
> /*
>  * Freescale MPC85xx Memory Controller kenel module
>+ * Copyright (c) 2011 Freescale Semiconductor, Inc.
>  *
>  * Author: Dave Jiang 
>  *
>@@ -21,6 +22,8 @@
>
> #include 
> #include 
>+#include 
>+#include 
> #include "edac_module.h"
> #include "edac_core.h"
> #include "mpc85xx_edac.h"
>@@ -34,14 +37,6 @@ static int edac_mc_idx;
> static u32 orig_ddr_err_disable;
> static u32 orig_ddr_err_sbe;
>
>-/*
>- * PCI Err defines
>- */
>-#ifdef CONFIG_PCI
>-static u32 orig_pci_err_cap_dr;
>-static u32 orig_pci_err_en;
>-#endif
>-
> static u32 orig_l2_err_disable;
> #ifdef CONFIG_FSL_SOC_BOOKE
> static u32 orig_hid1[2];
>@@ -151,37 +146,52 @@ static void mpc85xx_pci_check(struct edac_pci_ctl_info *pci)
> {
> 	struct mpc85xx_pci_pdata *pdata = pci->pvt_info;
> 	u32 err_detect;
>+	struct ccsr_pci *reg = pdata->pci_reg;
>+
>+	err_detect = in_be32(&pdata->pci_reg->pex_err_dr);
>+
>+	if (pdata->pcie_flag) {
>+		printk(KERN_ERR "PCIE error(s) detected\n");
>+		printk(KERN_ERR "PCIE ERR_DR register: 0x%08x\n", err_detect);
>+		printk(KERN_ERR "PCIE ERR_CAP_STAT register: 0x%08x\n",
>+		       in_be32(&reg->pex_err_cap_stat));
>+		printk(KERN_ERR "PCIE ERR_CAP_R0 register: 0x%08x\n",
>+		       in_be32(&reg->pex_err_cap_r0));
>+		printk(KERN_ERR "PCIE ERR_CAP_R1 register: 0x%08x\n",
>+		       in_be32(&reg->pex_err_cap_r1));
>+		printk(KERN_ERR "PCIE ERR_CAP_R2 register: 0x%08x\n",
>+		       in_be32(&reg->pex_err_cap_r2));
>+		printk(KERN_ERR "PCIE ERR_CAP_R3 register: 0x%08x\n",
>+		       in_be32(&reg->pex_err_cap_r3));
>+	} else {
>+		/* master aborts can happen during PCI config cycles */
>+		if (!(err_detect & ~(PCI_EDE_MULTI_ERR | PCI_EDE_MST_ABRT))) {
>+			out_be32(&reg->pex_err_dr, err_detect);
>+			return;
>+		}
>
>-	err_detect = in_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_DR);
>-
>-	/* master aborts can happen during PCI config cycles */
>-	if (!(err_detect & ~(PCI_EDE_MULTI_ERR | PCI_EDE_MST_ABRT))) {
>-		out_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_DR, err_detect);
>-		return;
>+		printk(KERN_ERR "PCI error(s) detected\n");
>+		printk(KERN_ERR "PCI/X ERR_DR register: 0x%08x\n", err_detect);
>+		printk(KERN_ERR "PCI/X ERR_ATTRIB register: 0x%08x\n",
>+		       in_be32(&reg->pex_err_attrib));
>+		printk(KERN_ERR "PCI/X ERR_ADDR register: 0x%08x\n",
>+		       in_be32(&reg->pex_err_disr));
>+		printk(KERN_ERR "PCI/X ERR_EXT_ADDR register: 0x%08x\n",
>+		       in_be32(&reg->pex_err_ext_addr));
>+		printk(KERN_ERR "PCI/X ERR_DL register: 0x%08x\n",
>+		       in_be32(&reg->pex_err_dl));
>+		printk(KERN_ERR "PCI/X ERR_DH register: 0x%08x\n",
>+		       in_be32(&reg->pex_err_dh));
>+
>+		if (err_detect & PCI_EDE_PERR_MASK)
>+			edac_pci_handle_pe(pci, pci->ctl_name);
>+
>+		if ((err_detect & ~PCI_EDE_MULTI_ERR) & ~PCI_EDE_PERR_MASK)
>+			edac_pci_handle_npe(pci, pci->ctl_name);
> 	}
>
>-	printk(KERN_ERR "PCI error(s) detected\n");
>-	printk(KERN_ERR "PCI/X ERR_DR register: %#08x\n", err_detect);
>-
>-	printk(KERN_ERR "PCI/X ERR_ATTRIB register: %#08x\n",
>- in_
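
The PCI (non-PCIe) branch of mpc85xx_pci_check() quoted above first drops
events that consist only of a master abort and/or the multiple-error flag,
then splits what remains into parity and non-parity errors. A minimal
userspace sketch of that bit logic follows. The EX_* bit values are
placeholders assumed for illustration (the real PCI_EDE_* definitions live in
mpc85xx_edac.h and are not shown in this thread), and ex_classify() is a
hypothetical helper, not a function from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit values, assumed for this sketch only. */
#define EX_EDE_MULTI_ERR 0x80000000u /* "more than one error" flag */
#define EX_EDE_MST_ABRT  0x00000040u /* master abort */
#define EX_EDE_PERR_MASK 0x00000700u /* all parity-error bits */

/* What the driver would do for a given ERR_DR value. */
#define EX_IGNORE    0u /* acknowledge and return, nothing reported */
#define EX_PARITY    1u /* would call edac_pci_handle_pe() */
#define EX_NONPARITY 2u /* would call edac_pci_handle_npe() */

/* Mirror the masking logic of the PCI branch and return a bitmask of actions. */
static unsigned int ex_classify(uint32_t err_detect)
{
	unsigned int actions = EX_IGNORE;

	/* The placeholder masks must not overlap, or the split is meaningless. */
	assert((EX_EDE_PERR_MASK & (EX_EDE_MULTI_ERR | EX_EDE_MST_ABRT)) == 0);

	/* Master aborts can happen during PCI config cycles: ignore them
	 * when no other error bit is set. */
	if (!(err_detect & ~(EX_EDE_MULTI_ERR | EX_EDE_MST_ABRT)))
		return EX_IGNORE;

	if (err_detect & EX_EDE_PERR_MASK)
		actions |= EX_PARITY;

	/* Anything left besides the multi-error flag and the parity bits. */
	if ((err_detect & ~EX_EDE_MULTI_ERR) & ~EX_EDE_PERR_MASK)
		actions |= EX_NONPARITY;

	return actions;
}
```

Note that, with this logic, a master abort that arrives together with a
parity error is no longer filtered out: the event is reported both as a
parity error and as a non-parity error.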

Re: [PATCH 4/4] edac/85xx: PCI/PCIE error interrupt edac support.

2011-07-22 Thread Stijn Devriendt
On Thu, Jul 21, 2011 at 12:33 PM, Shaohui Xie  wrote:
> From: Kai.Jiang 
>
> Add PCIe error interrupt EDAC support for mpc85xx and p4080.
> mpc85xx uses the legacy interrupt reporting mechanism: the error
> interrupts are reported directly to the MPIC. The p4080, by contrast,
> attaches most of its error interrupts to interrupt 0 and reports them
> to the MPIC via interrupt 0. This patch handles both cases.
>
> Because the error management register offsets and definitions differ
> between PCI and PCIe, use the ccsr_pci structure to merge the PCI and
> PCIe EDAC code into one.
>

This code was posted a couple of months ago, if I'm not mistaken.
I'm currently testing it on a P2020-based design.

One of the failures I'm trying to cope with is a PCIe device that does not
send back a completion with data, e.g. a userspace process reads memory
through a memory map, but the PCIe device does not respond.
In this case the P2020 will stall due to core_fault_in being asserted.

If configured, this interrupt handler will be called, but it does nothing to
cure the root cause (e.g. kill the process). The end result is that the
processor still hangs.
I've been hacking my way around the kernel for a while and have ended up a
lot closer to a working solution for recovering from such a failure.

The issue I'm facing now is that the PIC can be configured to send the
interrupt as a critical interrupt to either of the two cores, but that may
not be the core that is running the process that initiated the read.
I've done two test runs and both killed the right process, but I'd like to
make sure that this wasn't by accident.
Bottom line: what mechanisms are in place (or are required) to ensure
that the right process (on the same core or on another core) is killed
regardless of how the PIC is configured?

Regards,
Stijn
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
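
The concern raised in the last message, that the core taking the critical
interrupt may not be the core whose process issued the failing read, can be
illustrated with a small userspace model. Everything here is hypothetical:
struct ex_core, the pending_read_pid bookkeeping and both ex_victim_*()
helpers are assumptions made for the sketch. Nothing in this thread says the
kernel or the patch tracks outstanding PCIe reads this way.

```c
#include <assert.h>

#define EX_NCORES 2
#define EX_NO_PID (-1)

/* Hypothetical per-core state at the moment the error interrupt fires. */
struct ex_core {
	int current_pid;      /* task running on this core */
	int pending_read_pid; /* task with an outstanding PCIe read, or EX_NO_PID */
};

/* Naive policy: blame whatever is running on the core that took the interrupt. */
static int ex_victim_naive(const struct ex_core *cores, int irq_core)
{
	assert(irq_core >= 0 && irq_core < EX_NCORES);
	return cores[irq_core].current_pid;
}

/* Tracking policy: find the task that recorded an outstanding read,
 * regardless of which core the interrupt was routed to. */
static int ex_victim_tracked(const struct ex_core *cores)
{
	for (int i = 0; i < EX_NCORES; i++)
		if (cores[i].pending_read_pid != EX_NO_PID)
			return cores[i].pending_read_pid;
	return EX_NO_PID;
}
```

In this model, if pid 200 on core 1 issued the read while the PIC routes the
interrupt to core 0, the naive policy picks the unrelated task on core 0; it
only picks the right one when the routing happens to match. That would be
consistent with two test runs succeeding by chance.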