RE: [PATCH 4/4] edac/85xx: PCI/PCIE error interrupt edac support.
Hi all,

Any concerns of this patch?

Best Regards,
Shaohui Xie

>-----Original Message-----
>From: Xie Shaohui-B21989
>Sent: Tuesday, July 26, 2011 2:52 PM
>To: linuxppc-dev@lists.ozlabs.org; Kumar Gala
>Cc: mm-comm...@vger.kernel.org; avoront...@mvista.com; da...@davemloft.net;
>grant.lik...@secretlab.ca; a...@linux-foundation.org; Jiang Kai-B18973
>Subject: RE: [PATCH 4/4] edac/85xx: PCI/PCIE error interrupt edac support.
>
>I've verified this patch can apply for galak/powerpc.git 'next' branch
>with no change.
>
>
>Best Regards,
>Shaohui Xie
>
>
>>-----Original Message-----
>>From: Xie Shaohui-B21989
>>Sent: Thursday, July 21, 2011 6:33 PM
>>To: linuxppc-dev@lists.ozlabs.org
>>Cc: Gala Kumar-B11780; mm-comm...@vger.kernel.org; avoront...@mvista.com;
>>da...@davemloft.net; grant.lik...@secretlab.ca; a...@linux-foundation.org;
>>Jiang Kai-B18973; Kumar Gala; Xie Shaohui-B21989
>>Subject: [PATCH 4/4] edac/85xx: PCI/PCIE error interrupt edac support.
>>
>>From: Kai.Jiang
>>
>>Add pcie error interrupt edac support for mpc85xx and p4080.
>>mpc85xx uses the legacy interrupt report mechanism - the error interrupts
>>are reported directly to mpic. While, p4080 attaches most of error
>>interrupts to interrupt 0. And report error interrupt to mpic via
>>interrupt 0. This patch can handle both of them.
>>
>>
>>Due to the error management register offset and definition
>>
>>difference between pci and pcie, use ccsr_pci structure to merge pci and
>>pcie edac code into one.
>>
>>Signed-off-by: Kai.Jiang
>>Signed-off-by: Kumar Gala
>>Signed-off-by: Shaohui Xie
>>---
>> drivers/edac/mpc85xx_edac.c |  239 --
>> drivers/edac/mpc85xx_edac.h |   17 +--
>> 2 files changed, 188 insertions(+), 68 deletions(-)
>>
>>diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
>>index b048a5f..dde156f 100644
>>--- a/drivers/edac/mpc85xx_edac.c
>>+++ b/drivers/edac/mpc85xx_edac.c
>>@@ -1,5 +1,6 @@
>> /*
>>  * Freescale MPC85xx Memory Controller kenel module
>>+ * Copyright (c) 2011 Freescale Semiconductor, Inc.
>>  *
>>  * Author: Dave Jiang
>>  *
>>@@ -21,6 +22,8 @@
>>
>> #include
>> #include
>>+#include
>>+#include
>> #include "edac_module.h"
>> #include "edac_core.h"
>> #include "mpc85xx_edac.h"
>>@@ -34,14 +37,6 @@ static int edac_mc_idx;
>> static u32 orig_ddr_err_disable;
>> static u32 orig_ddr_err_sbe;
>>
>>-/*
>>- * PCI Err defines
>>- */
>>-#ifdef CONFIG_PCI
>>-static u32 orig_pci_err_cap_dr;
>>-static u32 orig_pci_err_en;
>>-#endif
>>-
>> static u32 orig_l2_err_disable;
>> #ifdef CONFIG_FSL_SOC_BOOKE
>> static u32 orig_hid1[2];
>>@@ -151,37 +146,52 @@ static void mpc85xx_pci_check(struct edac_pci_ctl_info *pci)
>> {
>> 	struct mpc85xx_pci_pdata *pdata = pci->pvt_info;
>> 	u32 err_detect;
>>+	struct ccsr_pci *reg = pdata->pci_reg;
>>+
>>+	err_detect = in_be32(&pdata->pci_reg->pex_err_dr);
>>+
>>+	if (pdata->pcie_flag) {
>>+		printk(KERN_ERR "PCIE error(s) detected\n");
>>+		printk(KERN_ERR "PCIE ERR_DR register: 0x%08x\n", err_detect);
>>+		printk(KERN_ERR "PCIE ERR_CAP_STAT register: 0x%08x\n",
>>+			in_be32(&reg->pex_err_cap_stat));
>>+		printk(KERN_ERR "PCIE ERR_CAP_R0 register: 0x%08x\n",
>>+			in_be32(&reg->pex_err_cap_r0));
>>+		printk(KERN_ERR "PCIE ERR_CAP_R1 register: 0x%08x\n",
>>+			in_be32(&reg->pex_err_cap_r1));
>>+		printk(KERN_ERR "PCIE ERR_CAP_R2 register: 0x%08x\n",
>>+			in_be32(&reg->pex_err_cap_r2));
>>+		printk(KERN_ERR "PCIE ERR_CAP_R3 register: 0x%08x\n",
>>+			in_be32(&reg->pex_err_cap_r3));
>>+	} else {
>>+		/* master aborts can happen during PCI config cycles */
>>+		if (!(err_detect & ~(PCI_EDE_MULTI_ERR | PCI_EDE_MST_ABRT))) {
>>+			out_be32(&reg->pex_err_dr, err_detect);
>>+			return;
>>+		}
>>
>>-	err_detect = in_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_DR);
>>-
>>-	/* master aborts can happen during PCI config cycles */
>>-	if (!(err_detect & ~(PCI_EDE_MULTI_ERR | PCI_EDE_M
RE: [PATCH 4/4] edac/85xx: PCI/PCIE error interrupt edac support.
I've verified this patch can apply for galak/powerpc.git 'next' branch
with no change.

Best Regards,
Shaohui Xie

>-----Original Message-----
>From: Xie Shaohui-B21989
>Sent: Thursday, July 21, 2011 6:33 PM
>To: linuxppc-dev@lists.ozlabs.org
>Cc: Gala Kumar-B11780; mm-comm...@vger.kernel.org; avoront...@mvista.com;
>da...@davemloft.net; grant.lik...@secretlab.ca; a...@linux-foundation.org;
>Jiang Kai-B18973; Kumar Gala; Xie Shaohui-B21989
>Subject: [PATCH 4/4] edac/85xx: PCI/PCIE error interrupt edac support.
>
>From: Kai.Jiang
>
>Add pcie error interrupt edac support for mpc85xx and p4080.
>mpc85xx uses the legacy interrupt report mechanism - the error interrupts
>are reported directly to mpic. While, p4080 attaches most of error
>interrupts to interrupt 0. And report error interrupt to mpic via
>interrupt 0. This patch can handle both of them.
>
>
>Due to the error management register offset and definition
>
>difference between pci and pcie, use ccsr_pci structure to merge pci and
>pcie edac code into one.
>
>Signed-off-by: Kai.Jiang
>Signed-off-by: Kumar Gala
>Signed-off-by: Shaohui Xie
>---
> drivers/edac/mpc85xx_edac.c |  239 --
> drivers/edac/mpc85xx_edac.h |   17 +--
> 2 files changed, 188 insertions(+), 68 deletions(-)
>
>diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
>index b048a5f..dde156f 100644
>--- a/drivers/edac/mpc85xx_edac.c
>+++ b/drivers/edac/mpc85xx_edac.c
>@@ -1,5 +1,6 @@
> /*
>  * Freescale MPC85xx Memory Controller kenel module
>+ * Copyright (c) 2011 Freescale Semiconductor, Inc.
>  *
>  * Author: Dave Jiang
>  *
>@@ -21,6 +22,8 @@
>
> #include
> #include
>+#include
>+#include
> #include "edac_module.h"
> #include "edac_core.h"
> #include "mpc85xx_edac.h"
>@@ -34,14 +37,6 @@ static int edac_mc_idx;
> static u32 orig_ddr_err_disable;
> static u32 orig_ddr_err_sbe;
>
>-/*
>- * PCI Err defines
>- */
>-#ifdef CONFIG_PCI
>-static u32 orig_pci_err_cap_dr;
>-static u32 orig_pci_err_en;
>-#endif
>-
> static u32 orig_l2_err_disable;
> #ifdef CONFIG_FSL_SOC_BOOKE
> static u32 orig_hid1[2];
>@@ -151,37 +146,52 @@ static void mpc85xx_pci_check(struct edac_pci_ctl_info *pci)
> {
> 	struct mpc85xx_pci_pdata *pdata = pci->pvt_info;
> 	u32 err_detect;
>+	struct ccsr_pci *reg = pdata->pci_reg;
>+
>+	err_detect = in_be32(&pdata->pci_reg->pex_err_dr);
>+
>+	if (pdata->pcie_flag) {
>+		printk(KERN_ERR "PCIE error(s) detected\n");
>+		printk(KERN_ERR "PCIE ERR_DR register: 0x%08x\n", err_detect);
>+		printk(KERN_ERR "PCIE ERR_CAP_STAT register: 0x%08x\n",
>+			in_be32(&reg->pex_err_cap_stat));
>+		printk(KERN_ERR "PCIE ERR_CAP_R0 register: 0x%08x\n",
>+			in_be32(&reg->pex_err_cap_r0));
>+		printk(KERN_ERR "PCIE ERR_CAP_R1 register: 0x%08x\n",
>+			in_be32(&reg->pex_err_cap_r1));
>+		printk(KERN_ERR "PCIE ERR_CAP_R2 register: 0x%08x\n",
>+			in_be32(&reg->pex_err_cap_r2));
>+		printk(KERN_ERR "PCIE ERR_CAP_R3 register: 0x%08x\n",
>+			in_be32(&reg->pex_err_cap_r3));
>+	} else {
>+		/* master aborts can happen during PCI config cycles */
>+		if (!(err_detect & ~(PCI_EDE_MULTI_ERR | PCI_EDE_MST_ABRT))) {
>+			out_be32(&reg->pex_err_dr, err_detect);
>+			return;
>+		}
>
>-	err_detect = in_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_DR);
>-
>-	/* master aborts can happen during PCI config cycles */
>-	if (!(err_detect & ~(PCI_EDE_MULTI_ERR | PCI_EDE_MST_ABRT))) {
>-		out_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_DR, err_detect);
>-		return;
>+		printk(KERN_ERR "PCI error(s) detected\n");
>+		printk(KERN_ERR "PCI/X ERR_DR register: 0x%08x\n", err_detect);
>+		printk(KERN_ERR "PCI/X ERR_ATTRIB register: 0x%08x\n",
>+			in_be32(&reg->pex_err_attrib));
>+		printk(KERN_ERR "PCI/X ERR_ADDR register: 0x%08x\n",
>+			in_be32(&reg->pex_err_disr));
>+		printk(KERN_ERR "PCI/X ERR_EXT_ADDR register: 0x%08x\n",
>+			in_be32(&reg->pex_err_ext_addr));
>+		printk(KERN_ERR "PCI/X ERR_DL register: 0x%08x\n",
>+			in_be32(&reg->pex_err_dl));
>+		printk(KERN_ERR "PCI/X ERR_DH register: 0x%08x\n",
>+			in_be32(&reg->pex_err_dh));
>+
>+		if (err_detect & PCI_EDE_PERR_MASK)
>+			edac_pci_handle_pe(pci, pci->ctl_name);
>+
>+		if ((err_detect & ~PCI_EDE_MULTI_ERR) & ~PCI_EDE_PERR_MASK)
>+			edac_pci_handle_npe(pci, pci->ctl_name);
> 	}
>
>-	printk(KERN_ERR "PCI error(s) detected\n");
>-	printk(KERN_ERR "PCI/X ERR_DR register: %#08x\n", err_detect);
>-
>-	printk(KERN_ERR "PCI/X ERR_ATTRIB register: %#08x\n",
>-	       in_
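
For readers following the quoted hunk, the decision logic that mpc85xx_pci_check() applies to the plain PCI ERR_DR value can be sketched as a standalone userspace C function. This is only a sketch under assumptions: the bit values below are illustrative placeholders rather than values taken from the patch, and classify_pci_err() and the ACT_* flags are hypothetical names standing in for the calls to edac_pci_handle_pe() and edac_pci_handle_npe().

```c
#include <stdint.h>

/* Illustrative bit positions only (assumed, not taken from the patch). */
#define PCI_EDE_MST_ABRT   0x00000040u
#define PCI_EDE_TGT_PERR   0x00000080u
#define PCI_EDE_MST_PERR   0x00000100u
#define PCI_EDE_ADDR_PERR  0x00000400u
#define PCI_EDE_MULTI_ERR  0x80000000u
#define PCI_EDE_PERR_MASK  (PCI_EDE_TGT_PERR | PCI_EDE_MST_PERR | \
			    PCI_EDE_ADDR_PERR)

/* Hypothetical result flags; the driver calls EDAC handlers instead. */
#define ACT_IGNORE 0u	/* nothing to report, just clear ERR_DR */
#define ACT_PE     1u	/* would call edac_pci_handle_pe() */
#define ACT_NPE    2u	/* would call edac_pci_handle_npe() */

/* Mirror of the non-PCIe branch: decide what to report for err_detect. */
static unsigned int classify_pci_err(uint32_t err_detect)
{
	unsigned int actions = ACT_IGNORE;

	/* A lone master abort (with or without the multi-error flag) can
	 * occur during PCI config cycles, so it is not reported. */
	if (!(err_detect & ~(PCI_EDE_MULTI_ERR | PCI_EDE_MST_ABRT)))
		return ACT_IGNORE;

	/* Any parity error bit counts as a parity error. */
	if (err_detect & PCI_EDE_PERR_MASK)
		actions |= ACT_PE;

	/* Any remaining bit other than the multi-error flag counts as a
	 * non-parity error. */
	if ((err_detect & ~PCI_EDE_MULTI_ERR) & ~PCI_EDE_PERR_MASK)
		actions |= ACT_NPE;

	return actions;
}
```

With these placeholder values, classify_pci_err(PCI_EDE_MST_ABRT) yields ACT_IGNORE, while a target parity error combined with a master abort yields ACT_PE | ACT_NPE, because the master abort is only suppressed when it is the sole error bit.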
Re: [PATCH 4/4] edac/85xx: PCI/PCIE error interrupt edac support.
On Thu, Jul 21, 2011 at 12:33 PM, Shaohui Xie wrote:
> From: Kai.Jiang
>
> Add pcie error interrupt edac support for mpc85xx and p4080.
> mpc85xx uses the legacy interrupt report mechanism - the error
> interrupts are reported directly to mpic. While, p4080 attaches
> most of error interrupts to interrupt 0. And report error interrupt
> to mpic via interrupt 0. This patch can handle both of them.
>
> Due to the error management register offset and definition
>
> difference between pci and pcie, use ccsr_pci structure to merge
> pci and pcie edac code into one.
>

This code was posted a couple of months ago, if I'm not mistaken. I'm
currently testing it on a P2020-based design.

One of the failures I'm trying to cope with is a PCIe device that does
not send back a completion with data, e.g. a userspace process reads
memory through a memory map, but the PCIe device is not responding. In
this case the P2020 will stall because core_fault_in is asserted. If
configured, this interrupt will be called, but it does nothing to cure
the root cause (e.g. kill the process). The end result is that the
processor still hangs.

I've been hacking my way around the kernel for a while and have ended up
a lot closer to a working solution for recovering from such a failure.
The issue I'm facing now is that the PIC can be configured to send the
interrupt as a critical interrupt to one of the two cores, but that may
not be the core that is running the process that initiated the read.
I've done 2 test runs and both killed the right process, but I'd like to
make sure that it's not by accident.

Bottom line: what mechanisms are in place (or are required) to ensure
that the right process (on the same core or on another core) is killed
regardless of how the PIC is configured?

Regards,
Stijn