Hello,

On Fri, Apr 19, 2024 at 04:12:46PM +1000, Michael Ellerman wrote:
> Gaurav Batra <gba...@linux.ibm.com> writes:
> > At the time of LPAR reboot, partition firmware provides Open Firmware
> > property ibm,dma-window for the PE. This property is provided on the PCI
> > bus the PE is attached to.
> 
> AFAICS you're actually describing a bug that happens during boot *up*?
> 
> Describing it as "reboot" makes me think you're talking about the
> shutdown path. I think that will confuse people, me at least :)

There is probably an assumption that it must have been running
previously for the errors to happen in the first place, but given that
the error state persists for a day, it may be a very long 'reboot'.

Thanks

Michal
> 
> cheers
> 
> > There are exceptions where the partition firmware might not provide this
> > property for the PE at the time of LPAR reboot. One such scenario is
> > when the firmware has frozen the PE due to some error condition. The
> > PE stays frozen for 24 hours or until the whole system is reinitialized.
> >
> > Within this time frame, if the LPAR is rebooted, the frozen PE will be
> > presented to the LPAR but the ibm,dma-window property could be missing.
> >
> > Today, under these circumstances, the LPAR oopses with a NULL pointer
> > dereference when configuring the PCI bus the PE is attached to.
> >
> > BUG: Kernel NULL pointer dereference on read at 0x000000c8
> > Faulting instruction address: 0xc0000000001024c0
> > Oops: Kernel access of bad area, sig: 7 [#1]
> > LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
> > Modules linked in:
> > Supported: Yes
> > CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.4.0-150600.9-default #1
> > Hardware name: IBM,9043-MRX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NM1060_023) hv:phyp pSeries
> > NIP:  c0000000001024c0 LR: c0000000001024b0 CTR: c000000000102450
> > REGS: c0000000037db5c0 TRAP: 0300   Not tainted  (6.4.0-150600.9-default)
> > MSR:  8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>  CR: 28000822  XER: 00000000
> > CFAR: c00000000010254c DAR: 00000000000000c8 DSISR: 00080000 IRQMASK: 0
> > ...
> > NIP [c0000000001024c0] pci_dma_bus_setup_pSeriesLP+0x70/0x2a0
> > LR [c0000000001024b0] pci_dma_bus_setup_pSeriesLP+0x60/0x2a0
> > Call Trace:
> >     pci_dma_bus_setup_pSeriesLP+0x60/0x2a0 (unreliable)
> >     pcibios_setup_bus_self+0x1c0/0x370
> >     __of_scan_bus+0x2f8/0x330
> >     pcibios_scan_phb+0x280/0x3d0
> >     pcibios_init+0x88/0x12c
> >     do_one_initcall+0x60/0x320
> >     kernel_init_freeable+0x344/0x3e4
> >     kernel_init+0x34/0x1d0
> >     ret_from_kernel_user_thread+0x14/0x1c
> >
> > Fixes: b1fc44eaa9ba ("pseries/iommu/ddw: Fix kdump to work in absence of ibm,dma-window")
> > Signed-off-by: Gaurav Batra <gba...@linux.ibm.com>
> > ---
> >  arch/powerpc/platforms/pseries/iommu.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
> > index e8c4129697b1..e808d5b1fa49 100644
> > --- a/arch/powerpc/platforms/pseries/iommu.c
> > +++ b/arch/powerpc/platforms/pseries/iommu.c
> > @@ -786,8 +786,16 @@ static void pci_dma_bus_setup_pSeriesLP(struct pci_bus *bus)
> >      * parent bus. During reboot, there will be ibm,dma-window property to
> >      * define DMA window. For kdump, there will at least be default window or DDW
> >      * or both.
> > +    * There is an exception to the above. In case the PE goes into frozen
> > +    * state, firmware may not provide ibm,dma-window property at the time
> > +    * of LPAR reboot.
> >      */
> >  
> > +   if (!pdn) {
> > +           pr_debug("  no ibm,dma-window property !\n");
> > +           return;
> > +   }
> > +
> >     ppci = PCI_DN(pdn);
> >  
> >     pr_debug("  parent is %pOF, iommu_table: 0x%p\n",
> >
> > base-commit: 2c71fdf02a95b3dd425b42f28fd47fb2b1d22702
> > -- 
> > 2.39.3 (Apple Git-146)
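
For illustration only, the guard added by the quoted patch can be sketched
outside the kernel as below. This is a simplified user-space model, not the
kernel code: struct node, find_dma_window_node(), bus_setup() and demo() are
hypothetical names, and the sketch assumes that the lookup for a node carrying
the DMA window yields NULL when the property is absent, as in the frozen-PE
case described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a device-tree node; not the kernel's type. */
struct node {
    struct node *parent;
    const char *dma_window;   /* NULL when the property is absent */
    int table_id;             /* stand-in for per-node IOMMU data */
};

/* Walk towards the root; return the first node with a window, or NULL. */
static struct node *find_dma_window_node(struct node *dn)
{
    for (; dn != NULL; dn = dn->parent)
        if (dn->dma_window != NULL)
            return dn;
    return NULL;
}

/* Return the table id, or -1 when no window was found. */
static int bus_setup(struct node *dn)
{
    struct node *pdn = find_dma_window_node(dn);

    if (pdn == NULL)          /* guard: pdn->table_id would fault here */
        return -1;
    return pdn->table_id;
}

/* Small usage example covering both outcomes. */
static void demo(void)
{
    struct node root  = { NULL,  NULL, 1 };
    struct node child = { &root, NULL, 2 };

    assert(bus_setup(&child) == -1);   /* no window anywhere: setup skipped */

    root.dma_window = "window";
    assert(bus_setup(&child) == 1);    /* window found on the parent */
}
```

Returning early mirrors the patch's choice to skip DMA setup for that bus
rather than treat the missing property as fatal.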
