Re: read the memory mapped address - pcie - kernel hangs
Thanks, I'll check it out.

On Sat, 11 Jan, 2020, 4:33 AM Onur Atilla wrote:
> Hi,
>
> you may also want to have a look at the AXI Timeout Block (ATB) to
> prevent system/core locks due to a missing ACK of a slave. If given by
> the HW, ATB generates an alternative response in case the slave fails to
> respond within a given time. It may also trigger an interrupt to help
> handle/debug the error.
>
> Regards,
> Onur

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
Re: read the memory mapped address - pcie - kernel hangs
On 10.01.20 15:58, Muni Sekhar wrote:
> Thank you so much for sharing valuable information, will work on this.

Hi,

you may also want to have a look at the AXI Timeout Block (ATB) to
prevent system/core locks due to a missing ACK of a slave. If given by
the HW, ATB generates an alternative response in case the slave fails to
respond within a given time. It may also trigger an interrupt to help
handle/debug the error.

Regards,
Onur
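
To make the timeout idea above concrete, here is a small behavioural model in plain C. It is only a sketch and not the actual ATB hardware or any Xilinx API: `slave_model`, `guarded_read`, the cycle counting and the 0xDEADBEEF filler value are all hypothetical names and choices made for illustration. Only the response names OKAY and SLVERR are borrowed from AXI. The point it shows is that a watchdog in front of the slave turns "no ACK ever" into a bounded-time error response, so the requester (and ultimately the PCIe completion) is never left waiting forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Response codes named after the AXI OKAY and SLVERR responses. */
enum bus_resp { RESP_OKAY, RESP_SLVERR };

/* Hypothetical slave model: answers after ack_delay cycles, or never if stuck. */
struct slave_model {
    unsigned ack_delay;
    bool stuck;
    uint32_t data;
};

struct read_result {
    enum bus_resp resp;
    uint32_t data;
    bool timed_out;
    unsigned cycles;   /* cycles waited before a response was produced */
};

/*
 * Issue one read through a timeout guard. If the slave does not answer
 * within timeout_cycles, the guard substitutes an error response instead
 * of leaving the requester blocked.
 */
struct read_result guarded_read(const struct slave_model *s, unsigned timeout_cycles)
{
    assert(s != NULL && timeout_cycles > 0);

    /* Default outcome: the substituted error response after a timeout. */
    struct read_result r = { RESP_SLVERR, 0xDEADBEEFu, true, timeout_cycles };

    for (unsigned cycle = 0; cycle < timeout_cycles; cycle++) {
        if (!s->stuck && cycle >= s->ack_delay) {
            r.resp = RESP_OKAY;
            r.data = s->data;
            r.timed_out = false;
            r.cycles = cycle;
            return r;
        }
    }
    return r;   /* slave never acknowledged in time */
}
```

In a real design the `timed_out` case is where the interrupt mentioned above would be raised, so that software can log which access failed.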
Re: read the memory mapped address - pcie - kernel hangs
On Fri, Jan 10, 2020 at 4:46 PM Primoz Beltram wrote:
>
> Hi,
> I have also read the other replies to this topic.
> I have seen and debugged such deadlock problems with FPGA-based PCIe
> endpoint devices (Xilinx chips), and usually (if it was not a signal
> integrity problem) the problem was wrong AXI master/slave bus handling
> in the FPGA design.
> I guess you have the Xilinx PCIe endpoint IP core attached as an AXI
> master to the FPGA-internal AXI bus (giving access to AXI slaves inside
> the FPGA design).
> If the FPGA code in your design does not handle AXI master read/write
> requests correctly, e.g. an FPGA AXI slave does not generate the bus ACK
> in the correct way, the PCIe bus will stay locked (no PCIe completion is
> sent back), resulting in a complete system lock. Some PCIe root chips
> have diagnostic LEDs to help decode PCIe problems.
> From your note about doing two 32-bit reads on a 64-bit CPU, I would
> guess the problem is in the handling of the AXI transfer size signals in
> the FPGA slave code.
> I would suggest that you check the code in the FPGA design. You can use
> FPGA test bench simulation to check the behaviour of AXI read/write
> requests originated by the PCIe endpoint.
> Xilinx provides test bench simulation code for their PCIe IPs.
> They also provide a PCIe root port model, so you can simulate AXI
> read/write accesses as they would come from CPU I/O memory requests via
> PCIe TLPs.

Thank you so much for sharing valuable information, will work on this.

> WBR Primoz

--
Thanks,
Sekhar
Re: read the memory mapped address - pcie - kernel hangs
Hi,

I have also read the other replies to this topic.

I have seen and debugged such deadlock problems with FPGA-based PCIe
endpoint devices (Xilinx chips), and usually (if it was not a signal
integrity problem) the problem was wrong AXI master/slave bus handling
in the FPGA design.

I guess you have the Xilinx PCIe endpoint IP core attached as an AXI
master to the FPGA-internal AXI bus (giving access to AXI slaves inside
the FPGA design). If the FPGA code in your design does not handle AXI
master read/write requests correctly, e.g. an FPGA AXI slave does not
generate the bus ACK in the correct way, the PCIe bus will stay locked
(no PCIe completion is sent back), resulting in a complete system lock.
Some PCIe root chips have diagnostic LEDs to help decode PCIe problems.

From your note about doing two 32-bit reads on a 64-bit CPU, I would
guess the problem is in the handling of the AXI transfer size signals in
the FPGA slave code.

I would suggest that you check the code in the FPGA design. You can use
FPGA test bench simulation to check the behaviour of AXI read/write
requests originated by the PCIe endpoint. Xilinx provides test bench
simulation code for their PCIe IPs. They also provide a PCIe root port
model, so you can simulate AXI read/write accesses as they would come
from CPU I/O memory requests via PCIe TLPs.

WBR Primoz

On 8. 01. 20 20:00, Muni Sekhar wrote:
> Hi All,
>
> I have a module with a Xilinx FPGA. It implements UART(s), SPI(s),
> parallel I/O and interfaces them to the host CPU via the PCI Express
> bus. I see that my system freezes without capturing a crash dump for
> certain tests. I debugged this issue and tracked it down to the
> 'readl()' in the interrupt handler code.
>
> The ISR first reads the Interrupt Status register using 'readl()', as
> given below:
>
>     status = readl(ctrl->reg + INT_STATUS);
>
> It then clears the pending interrupts using 'writel()', as given below:
>
>     writel(status, ctrl->reg + INT_STATUS);
>
> I've noticed a kernel hang if the INT_STATUS register is read again
> after clearing the pending interrupts.
>
> My system freezes only after executing the same ISR code for millions
> of interrupts. Basically, reading the memory-mapped register in the ISR
> results in this behavior. If I comment out
> "status = readl(ctrl->reg + INT_STATUS);" after clearing the pending
> interrupts, then the system is stable.
>
> As a temporary workaround I avoided reading the INT_STATUS register
> after clearing the pending bits, and this code change works fine.
>
> Can someone clarify why the kernel hangs without a crash dump if I read
> the INT_STATUS register using readl() after clearing the pending bits
> with writel()?
>
> To read memory-mapped I/O, the kernel provides the read{b,w,l,q}()
> APIs. If the PCIe card is not responsive, can a call to readl() from
> interrupt context make the system freeze?
>
> Thanks for any suggestions and solutions to this problem!
>
> A snippet of the ISR code is given below:
> https://pastebin.com/as2tSPwE
>
> static irqreturn_t pcie_isr(int irq, void *data)
> {
>         struct test_device *ctrl = (struct test_device *)data;
>         u32 status;
>         …
>
>         status = readl(ctrl->reg + INT_STATUS);
>         /*
>          * Check to see if it was our interrupt
>          */
>         if (!(status & 0x000C))
>                 return IRQ_NONE;
>
>         /* Clear the interrupt */
>         writel(status, ctrl->reg + INT_STATUS);
>
>         if (status & 0x0004) {
>                 /*
>                  * Tx interrupt pending.
>                  */
>         }
>
>         if (status & 0x0008) {
>                 /* Rx interrupt Pending */
>                 /* The system freezes if I read again the INT_STATUS
>                    register as given below */
>                 status = readl(ctrl->reg + INT_STATUS);
>         }
>         ..
>         return IRQ_HANDLED;
> }
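
The workaround described in the original post (read INT_STATUS once, clear it, then work from the saved copy) can be sketched in user space as follows. This is an illustration under assumptions, not the driver itself: `mock_dev`, `mock_readl`, `mock_writel`, `isr_once` and the `ISR_*` result codes are hypothetical stand-ins for the real device, readl()/writel() and the irqreturn_t values, and the register is assumed to be write-1-to-clear because the ISR clears it by writing the status value back. The all-ones check is also an assumption: a failed PCIe read commonly shows up as 0xFFFFFFFF, but that depends on the platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT_TX   0x0004u
#define INT_RX   0x0008u
#define INT_MASK (INT_TX | INT_RX)   /* 0x000C in the original ISR */
#define ALL_ONES 0xFFFFFFFFu         /* assumed pattern of a failed read */

/* Hypothetical stand-in for the device: one write-1-to-clear status register. */
struct mock_dev {
    uint32_t int_status;
    unsigned reads;        /* number of register reads performed */
    unsigned tx_handled;
    unsigned rx_handled;
};

/* Mock of readl(): counts accesses so the single-read property is visible. */
uint32_t mock_readl(struct mock_dev *d)
{
    d->reads++;
    return d->int_status;
}

/* Mock of writel() on a write-1-to-clear register. */
void mock_writel(struct mock_dev *d, uint32_t v)
{
    d->int_status &= ~v;
}

enum isr_result { ISR_NONE, ISR_HANDLED, ISR_DEVICE_GONE };

/* One pass of the handler: exactly one status read per interrupt. */
enum isr_result isr_once(struct mock_dev *d)
{
    assert(d != NULL);

    uint32_t status = mock_readl(d);

    if (status == ALL_ONES)          /* device likely not responding */
        return ISR_DEVICE_GONE;
    if (!(status & INT_MASK))        /* not our interrupt */
        return ISR_NONE;

    mock_writel(d, status);          /* clear exactly the bits we saw */

    /* Dispatch from the saved copy; do not read INT_STATUS again. */
    if (status & INT_TX)
        d->tx_handled++;
    if (status & INT_RX)
        d->rx_handled++;

    return ISR_HANDLED;
}
```

Working from the saved copy avoids the extra non-posted read that triggers the hang, and any bit that is set after the clear simply raises a new interrupt. It does not explain why the second read never completes; that still points at the FPGA side, as discussed above.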