RE: No check of the size passed to unmap_single in swiotlb
> -Original Message- > From: Konrad Rzeszutek Wilk [mailto:konrad.w...@oracle.com] > Sent: Tuesday, November 21, 2017 12:27 AM > To: Eric Yang ; io...@lists.linux-foundation.org > Cc: linux-kernel@vger.kernel.org; Greg Kroah-Hartman > ; Andrew Morton ; > Andrey Ryabinin ; David Miller > ; Ingo Molnar ; Geert > Uytterhoeven ; Al Viro ; > Kees Cook ; Daniel Borkmann > > Subject: Re: No check of the size passed to unmap_single in swiotlb > > On Mon, Nov 20, 2017 at 08:17:14AM +, Eric Yang wrote: > > Hi all, > > Hi! Hi Konrad, > > > > During debug a device only support 32bits DMA(Qualcomm Atheros AP) in our > LS1043A 64bits ARM SOC, we found that the invoke of dma_unmap_single --> > swiotlb_tbl_unmap_single will unmap the passed "size" anyway even when the > "size" is incorrect. > > > > If the size is larger than it should, the extra entries in io_tlb_orig_addr > > array > will be refilled by INVALID_PHYS_ADDR, and it will cause the bounce buffer > copy > not happen when the one who really used the mis-freed entries doing DMA data > transfers, and this will cause further unknow behaviors. > > > > Here we just fix it temporarily by adding a judge of the "size" in the > swiotlb_tbl_unmap_single, if it is larger than it deserves, just unmap the > right > size only. Like the code: > > Did the DMA debug API (CONFIG_DMA_API_DEBUG) help in figuring this issue as > well? > I have just enabled this config and move off the kernel fix and test, there seems the debug API has no output for this size incorrect issue. For I only enable the config and no other operations and my kernel is 4.9, do I missed something? I confirm that the debug API is working, for there are other drivers triggered its warning output like: caam_jr 171.jr: DMA-API: device driver tries to free DMA memory it has not allocated [device address=0x68004584] > > > > [yangyu@titan dash-lts]$ git diff ./lib/swiotlb.c diff --git > > a/lib/swiotlb.c b/lib/swiotlb.c index ad1d2962d129..58c97ede9d78 > > 100644 > > --- a/lib/swiotlb.c > > +++ b/lib/swiotlb.c > > @@ -591,7 +591,10 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, > phys_addr_t tlb_addr, > > */ > > for (i = index + nslots - 1; i >= index; i--) { > > io_tlb_list[i] = ++count; > > - io_tlb_orig_addr[i] = INVALID_PHYS_ADDR; > > + if(io_tlb_orig_addr[i] != orig_addr) > > + printk("==size wrong, ally down ally > > down!===\n"); > > + else > > + io_tlb_orig_addr[i] = INVALID_PHYS_ADDR; > > } > > /* > > * Step 2: merge the returned slots with the preceding > > slots, > > > > Although pass a right size of DMA buffer is the responsibility of the > > drivers, but > Is it useful to add some size check code to prevent real damage happen? > > > > Regards, > > Eric
RE: No check of the size passed to unmap_single in swiotlb
> -Original Message- > From: Robin Murphy [mailto:robin.mur...@arm.com] > Sent: Wednesday, November 22, 2017 9:15 PM > To: Eric Yang ; Konrad Rzeszutek Wilk > ; io...@lists.linux-foundation.org > Cc: Daniel Borkmann ; Kees Cook > ; Geert Uytterhoeven ; > Greg Kroah-Hartman ; linux- > ker...@vger.kernel.org; David Miller ; Al Viro > ; Andrey Ryabinin ; > Andrew Morton ; Ingo Molnar > > Subject: Re: No check of the size passed to unmap_single in swiotlb > > On 22/11/17 03:23, Eric Yang wrote: > > > > > >> -Original Message- > >> From: Konrad Rzeszutek Wilk [mailto:konrad.w...@oracle.com] > >> Sent: Tuesday, November 21, 2017 12:27 AM > >> To: Eric Yang ; io...@lists.linux-foundation.org > >> Cc: linux-kernel@vger.kernel.org; Greg Kroah-Hartman > >> ; Andrew Morton > >> ; Andrey Ryabinin > >> ; David Miller ; Ingo > >> Molnar ; Geert Uytterhoeven > >> ; Al Viro ; Kees > >> Cook ; Daniel Borkmann > >> Subject: Re: No check of the size passed to unmap_single in swiotlb > >> > >> On Mon, Nov 20, 2017 at 08:17:14AM +, Eric Yang wrote: > >>> Hi all, > >> > >> Hi! > > > > Hi Konrad, > >>> > >>> During debug a device only support 32bits DMA(Qualcomm Atheros AP) > >>> in our > >> LS1043A 64bits ARM SOC, we found that the invoke of dma_unmap_single > >> --> swiotlb_tbl_unmap_single will unmap the passed "size" anyway > >> even when the "size" is incorrect. > >>> > >>> If the size is larger than it should, the extra entries in > >>> io_tlb_orig_addr array > >> will be refilled by INVALID_PHYS_ADDR, and it will cause the bounce > >> buffer copy not happen when the one who really used the mis-freed > >> entries doing DMA data transfers, and this will cause further unknow > behaviors. > >>> > >>> Here we just fix it temporarily by adding a judge of the "size" in > >>> the > >> swiotlb_tbl_unmap_single, if it is larger than it deserves, just > >> unmap the right size only. Like the code: > >> > >> Did the DMA debug API (CONFIG_DMA_API_DEBUG) help in figuring this > >> issue as well? > >> > > > > I have just enabled this config and move off the kernel fix and test, > > there seems the debug API has no output for this size incorrect issue. > > For I only enable the config and no other operations and my kernel is 4.9, > > do I > missed something? > > > > I confirm that the debug API is working, for there are other drivers > > triggered its warning output > > like: > > > > caam_jr 171.jr: DMA-API: device driver tries to free DMA memory it > > has not allocated [device address=0x68004584] > > By default, dma-debug will only log the first error it sees - you can adjust > this > with the "num_errors" or "all_errors" sysfs controls, and use the driver > filter to > hide errors from other drivers if necessary. > > Robin. > Hi Robin, I've caught the size mismatch warning with the debug API, thanks a lot for the help. Regards, Eric > >>> > >>> [yangyu@titan dash-lts]$ git diff ./lib/swiotlb.c diff --git > >>> a/lib/swiotlb.c b/lib/swiotlb.c index ad1d2962d129..58c97ede9d78 > >>> 100644 > >>> --- a/lib/swiotlb.c > >>> +++ b/lib/swiotlb.c > >>> @@ -591,7 +591,10 @@ void swiotlb_tbl_unmap_single(struct device > >>> *hwdev, > >> phys_addr_t tlb_addr, > >>> */ > >>> for (i = index + nslots - 1; i >= index; i--) { > >>> io_tlb_list[i] = ++count; > >>> - io_tlb_orig_addr[i] = INVALID_PHYS_ADDR; > >>> + if(io_tlb_orig_addr[i] != orig_addr) > >>> + printk("==size wrong, ally down ally > >>> down!===\n"); > >>> + else > >>> + io_tlb_orig_addr[i] = INVALID_PHYS_ADDR; > >>> } > >>> /* > >>> * Step 2: merge the returned slots with the > >>> preceding slots, > >>> > >>> Although pass a right size of DMA buffer is the responsibility of > >>> the drivers, but > >> Is it useful to add some size check code to prevent real damage happen? > >>> > >>> Regards, > >>> Eric > > ___ > > iommu mailing list > > io...@lists.linux-foundation.org > > https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flis > > > ts.linuxfoundation.org%2Fmailman%2Flistinfo%2Fiommu&data=02%7C01%7Cyu. > > > yang_3%40nxp.com%7Cf3851e3c8ae24f341b3608d531ab1165%7C686ea1d3bc > 2b4c6f > > > a92cd99c5c301635%7C0%7C0%7C636469533155971955&sdata=7PnqUlgRlJq2H > sXmTf > > CmEUuWaBfUhzSd33UoK4li1Ro%3D&reserved=0 > >
RE: No check of the size passed to unmap_single in swiotlb
> -Original Message- > From: Robin Murphy [mailto:robin.mur...@arm.com] > Sent: Tuesday, November 21, 2017 12:50 AM > To: Konrad Rzeszutek Wilk ; Eric Yang > ; io...@lists.linux-foundation.org > Cc: Daniel Borkmann ; Kees Cook > ; Geert Uytterhoeven ; > Greg Kroah-Hartman ; linux- > ker...@vger.kernel.org; David Miller ; Al Viro > ; Andrey Ryabinin ; > Andrew Morton ; Ingo Molnar > > Subject: Re: No check of the size passed to unmap_single in swiotlb > > On 20/11/17 16:26, Konrad Rzeszutek Wilk wrote: > > On Mon, Nov 20, 2017 at 08:17:14AM +, Eric Yang wrote: > >> Hi all, > > > > Hi! > >> > >> During debug a device only support 32bits DMA(Qualcomm Atheros AP) in > our LS1043A 64bits ARM SOC, we found that the invoke of dma_unmap_single - > -> swiotlb_tbl_unmap_single will unmap the passed "size" anyway even when > the "size" is incorrect. > >> > >> If the size is larger than it should, the extra entries in > >> io_tlb_orig_addr array > will be refilled by INVALID_PHYS_ADDR, and it will cause the bounce buffer > copy > not happen when the one who really used the mis-freed entries doing DMA data > transfers, and this will cause further unknow behaviors. > >> > >> Here we just fix it temporarily by adding a judge of the "size" in the > swiotlb_tbl_unmap_single, if it is larger than it deserves, just unmap the > right > size only. Like the code: > > > > Did the DMA debug API (CONFIG_DMA_API_DEBUG) help in figuring this issue > as well? > > > >> > >> [yangyu@titan dash-lts]$ git diff ./lib/swiotlb.c diff --git > >> a/lib/swiotlb.c b/lib/swiotlb.c index ad1d2962d129..58c97ede9d78 > >> 100644 > >> --- a/lib/swiotlb.c > >> +++ b/lib/swiotlb.c > >> @@ -591,7 +591,10 @@ void swiotlb_tbl_unmap_single(struct device > *hwdev, phys_addr_t tlb_addr, > >> */ > >> for (i = index + nslots - 1; i >= index; i--) { > >> io_tlb_list[i] = ++count; > >> - io_tlb_orig_addr[i] = INVALID_PHYS_ADDR; > >> + if(io_tlb_orig_addr[i] != orig_addr) > >> + printk("==size wrong, ally down ally > >> down!===\n"); > >> + else > >> + io_tlb_orig_addr[i] = INVALID_PHYS_ADDR; > >> } > >> /* > >> * Step 2: merge the returned slots with the > >> preceding slots, > >> > >> Although pass a right size of DMA buffer is the responsibility of the > >> drivers, > but Is it useful to add some size check code to prevent real damage happen? > > There doesn't seem to be much good reason for SWIOTLB to be more special > than other DMA API backends, and not all of them have enough internal state to > be able to make such a check. It's also not necessarily possible to "prevent > damage" anyway - if a driver does pass a bogus size for dma_unmap_single(..., > DMA_FROM_DEVICE), SWIOTLB might be able to keep itself internally consistent, > but it still can't prevent the arch code in the middle from invalidating the > wrong > cache lines and potentially corrupting adjacent memory. > > In short, trying to work around broken drivers is a much worse idea than just > fixing those drivers, and that's what we already have dma-debug for. > > Robin. Hi Robin, I agree that hacking kernel to fix broken drivers is not acceptable, actually we spent days to fight driver supplier with this, they do not want to change their code and want fix it directly in kernel. I tried Dma-debug yesterday, it works very well, but I think only the size mismatch check may not be enough for the map entry corrupt situation, some run-time warning may be better when the real corruption happen. For most of the dma-api backend, the size mismatch may do no harm at all, and even in SWIOTLB itself when the bounce buffer is not used, the size mismatch do no harm either. In our case, the same buggy driver works well when board has 2G DDR, but panic frequently in 4G DDR because of the use of bounce buffer and these corrupted map entries. it is hard to catch this kind of bugs, for when the corruption happen, the kernel has all kind of reasons to panic, but not even one may directly point to the real source. Add the warning messages is a big convenience for figure this kind of issues, at least to me and the AP driver supplier, such warnings may save weeks of hard debug time. Regards, Eric
No check of the size passed to unmap_single in swiotlb
Hi all, During debug a device only support 32bits DMA(Qualcomm Atheros AP) in our LS1043A 64bits ARM SOC, we found that the invoke of dma_unmap_single --> swiotlb_tbl_unmap_single will unmap the passed "size" anyway even when the "size" is incorrect. If the size is larger than it should, the extra entries in io_tlb_orig_addr array will be refilled by INVALID_PHYS_ADDR, and it will cause the bounce buffer copy not happen when the one who really used the mis-freed entries doing DMA data transfers, and this will cause further unknow behaviors. Here we just fix it temporarily by adding a judge of the "size" in the swiotlb_tbl_unmap_single, if it is larger than it deserves, just unmap the right size only. Like the code: [yangyu@titan dash-lts]$ git diff ./lib/swiotlb.c diff --git a/lib/swiotlb.c b/lib/swiotlb.c index ad1d2962d129..58c97ede9d78 100644 --- a/lib/swiotlb.c +++ b/lib/swiotlb.c @@ -591,7 +591,10 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr, */ for (i = index + nslots - 1; i >= index; i--) { io_tlb_list[i] = ++count; - io_tlb_orig_addr[i] = INVALID_PHYS_ADDR; + if(io_tlb_orig_addr[i] != orig_addr) + printk("==size wrong, ally down ally down!===\n"); + else + io_tlb_orig_addr[i] = INVALID_PHYS_ADDR; } /* * Step 2: merge the returned slots with the preceding slots, Although pass a right size of DMA buffer is the responsibility of the drivers, but Is it useful to add some size check code to prevent real damage happen? Regards, Eric
RE: No check of the size passed to unmap_single in swiotlb
Hi Robin, > -Original Message- > From: Robin Murphy [mailto:robin.mur...@arm.com] > Sent: Tuesday, November 28, 2017 10:18 PM > To: Eric Yang ; Konrad Rzeszutek Wilk > ; io...@lists.linux-foundation.org > Cc: Daniel Borkmann ; Kees Cook > ; Geert Uytterhoeven ; > Greg Kroah-Hartman ; linux- > ker...@vger.kernel.org; David Miller ; Al Viro > ; Andrey Ryabinin ; > Andrew Morton ; Ingo Molnar > > Subject: Re: No check of the size passed to unmap_single in swiotlb > > Hi Eric, > > On 23/11/17 09:08, Eric Yang wrote: > > > > > >> -Original Message- From: Robin Murphy > >> [mailto:robin.mur...@arm.com] Sent: Tuesday, November 21, 2017 > >> 12:50 AM To: Konrad Rzeszutek Wilk ; Eric > >> Yang ; io...@lists.linux-foundation.org Cc: > >> Daniel Borkmann ; Kees Cook > >> ; Geert Uytterhoeven > >> ; Greg Kroah-Hartman > >> ; linux- ker...@vger.kernel.org; David > >> Miller ; Al Viro ; > >> Andrey Ryabinin ; Andrew Morton > >> ; Ingo Molnar > >> Subject: Re: No check of the size passed to unmap_single in swiotlb > >> > >> On 20/11/17 16:26, Konrad Rzeszutek Wilk wrote: > >>> On Mon, Nov 20, 2017 at 08:17:14AM +, Eric Yang wrote: > >>>> Hi all, > >>> > >>> Hi! > >>>> > >>>> During debug a device only support 32bits DMA(Qualcomm Atheros > >>>> AP) in > >> our LS1043A 64bits ARM SOC, we found that the invoke of > >> dma_unmap_single - -> swiotlb_tbl_unmap_single will unmap the passed > >> "size" anyway even when the "size" is incorrect. > >>>> > >>>> If the size is larger than it should, the extra entries in > >>>> io_tlb_orig_addr array > >> will be refilled by INVALID_PHYS_ADDR, and it will cause the bounce > >> buffer copy not happen when the one who really used the mis-freed > >> entries doing DMA data transfers, and this will cause further unknow > >> behaviors. > >>>> > >>>> Here we just fix it temporarily by adding a judge of the "size" > >>>> in the > >> swiotlb_tbl_unmap_single, if it is larger than it deserves, just > >> unmap the right size only. Like the code: > >>> > >>> Did the DMA debug API (CONFIG_DMA_API_DEBUG) help in figuring this > >>> issue > >> as well? > >>> > >>>> > >>>> [yangyu@titan dash-lts]$ git diff ./lib/swiotlb.c diff --git > >>>> a/lib/swiotlb.c b/lib/swiotlb.c index > >>>> ad1d2962d129..58c97ede9d78 100644 --- a/lib/swiotlb.c +++ > >>>> b/lib/swiotlb.c @@ -591,7 +591,10 @@ void > >>>> swiotlb_tbl_unmap_single(struct device > >> *hwdev, phys_addr_t tlb_addr, > >>>> */ for (i = index + nslots - 1; i >= index; i--) { io_tlb_list[i] = > >>>> ++count; - io_tlb_orig_addr[i] = INVALID_PHYS_ADDR; + > >>>> if(io_tlb_orig_addr[i] != orig_addr) + printk("==size wrong, > >>>> ally down ally down!===\n"); + > >>>> else + io_tlb_orig_addr[i] = > >>>> INVALID_PHYS_ADDR; } /* * Step 2: merge the returned slots with the > >>>> preceding slots, > >>>> > >>>> Although pass a right size of DMA buffer is the responsibility of > >>>> the drivers, > >> but Is it useful to add some size check code to prevent real damage > >> happen? > >> > >> There doesn't seem to be much good reason for SWIOTLB to be more > >> special than other DMA API backends, and not all of them have enough > >> internal state to be able to make such a check. It's also not > >> necessarily possible to "prevent damage" anyway - if a driver does > >> pass a bogus size for dma_unmap_single(..., DMA_FROM_DEVICE), SWIOTLB > >> might be able to keep itself internally consistent, but it still > >> can't prevent the arch code in the middle from invalidating the wrong > >> cache lines and potentially corrupting adjacent memory. > >> > >> In short, trying to work around broken drivers is a much worse idea > >> than just fixing those drivers, and that's what we already have > >> dma-debug for. > >> > >> Robin. > > > > Hi Robin, > > > > I agree that hacking kernel to fix broken drivers is not acceptable, > > actually we spent days to fight driver supplier with this, they do > > not want