[PATCH] dma-debug: change allocation mode from GFP_NOWAIT to GFP_ATOMIC
We observed the error "cacheline tracking ENOMEM, dma-debug disabled"
during a light system load (copying some files). The reason for this
error is that the dma_active_cacheline radix tree uses GFP_NOWAIT
allocation - so it can't access the emergency memory reserves and it
fails as soon as anybody reaches the watermark.

This patch changes GFP_NOWAIT to GFP_ATOMIC, so that it can access the
emergency memory reserves.

Signed-off-by: Mikulas Patocka

---
 kernel/dma/debug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/kernel/dma/debug.c
===================================================================
--- linux-2.6.orig/kernel/dma/debug.c	2022-03-30 14:39:00.0 +0200
+++ linux-2.6/kernel/dma/debug.c	2022-05-09 16:32:07.0 +0200
@@ -448,7 +448,7 @@ void debug_dma_dump_mappings(struct devi
  * other hand, consumes a single dma_debug_entry, but inserts 'nents'
  * entries into the tree.
  */
-static RADIX_TREE(dma_active_cacheline, GFP_NOWAIT);
+static RADIX_TREE(dma_active_cacheline, GFP_ATOMIC);
 static DEFINE_SPINLOCK(radix_lock);
 #define ACTIVE_CACHELINE_MAX_OVERLAP	((1 << RADIX_TREE_MAX_TAGS) - 1)
 #define CACHELINE_PER_PAGE_SHIFT	(PAGE_SHIFT - L1_CACHE_SHIFT)

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] parisc iommu: fix panic due to trying to allocate too large region
When using the Promise TX2+ SATA controller on PA-RISC, the system often
crashes with a kernel panic, for example just writing data with the dd
utility will make it crash.

Kernel panic - not syncing: drivers/parisc/sba_iommu.c: I/O MMU @ a000 is out of mapping resources

CPU: 0 PID: 18442 Comm: mkspadfs Not tainted 4.4.0-rc2 #2
Backtrace:
 [<4021497c>] show_stack+0x14/0x20
 [<40410bf0>] dump_stack+0x88/0x100
 [<4023978c>] panic+0x124/0x360
 [<40452c18>] sba_alloc_range+0x698/0x6a0
 [<40453150>] sba_map_sg+0x260/0x5b8
 [<0c18dbb4>] ata_qc_issue+0x264/0x4a8 [libata]
 [<0c19535c>] ata_scsi_translate+0xe4/0x220 [libata]
 [<0c19a93c>] ata_scsi_queuecmd+0xbc/0x320 [libata]
 [<40499bbc>] scsi_dispatch_cmd+0xfc/0x130
 [<4049da34>] scsi_request_fn+0x6e4/0x970
 [<403e95a8>] __blk_run_queue+0x40/0x60
 [<403e9d8c>] blk_run_queue+0x3c/0x68
 [<4049a534>] scsi_run_queue+0x2a4/0x360
 [<4049be68>] scsi_end_request+0x1a8/0x238
 [<4049de84>] scsi_io_completion+0xfc/0x688
 [<40493c74>] scsi_finish_command+0x17c/0x1d0

The cause of the crash is not exhaustion of the IOMMU space - there are
plenty of free pages. The function sba_alloc_range is called with size
0x11000, thus the pages_needed variable is 0x11. The function
sba_search_bitmap is called with bits_wanted 0x11 and boundary size 0x10
(because dma_get_seg_boundary(dev) returns 0x). The function
sba_search_bitmap attempts to allocate 17 pages that must not cross a
16-page boundary - it can't satisfy this requirement
(iommu_is_span_boundary always returns true) and fails even if there are
many free entries in the IOMMU space.

How did it happen that we try to allocate 17 pages that must not cross a
16-page boundary? The cause is in the function iommu_coalesce_chunks.
This function tries to coalesce adjacent entries in the scatterlist.
The function does several checks to decide whether it may coalesce one
entry with the next; one of those checks is this:

	if (startsg->length + dma_len > max_seg_size)
		break;

When it finishes coalescing adjacent entries, it allocates the mapping:

	sg_dma_len(contig_sg) = dma_len;
	dma_len = ALIGN(dma_len + dma_offset, IOVP_SIZE);
	sg_dma_address(contig_sg) =
		PIDE_FLAG
		| (iommu_alloc_range(ioc, dev, dma_len) << IOVP_SHIFT)
		| dma_offset;

It is possible that (startsg->length + dma_len > max_seg_size) is false
(we are just near the 0x1 max_seg_size boundary), so the function
decides to coalesce this entry with the next entry. When the coalescing
succeeds, the function performs

	dma_len = ALIGN(dma_len + dma_offset, IOVP_SIZE);

And now, because of the non-zero dma_offset, dma_len is greater than
0x1. iommu_alloc_range (a pointer to sba_alloc_range) is called and it
attempts to allocate 17 pages for a device that must not cross a 16-page
boundary.

To fix the bug, we must make sure that dma_len, after the addition of
dma_offset and alignment, doesn't cross the segment boundary. I.e.
change

	if (startsg->length + dma_len > max_seg_size)
		break;

to

	if (ALIGN(dma_len + dma_offset + startsg->length, IOVP_SIZE) > max_seg_size)
		break;

This patch makes this change (it precalculates max_seg_boundary at the
beginning of the function iommu_coalesce_chunks). I also added a check
that the mapping length doesn't exceed dma_get_seg_boundary(dev) (it is
not needed for the Promise TX2+ SATA controller, but it may be needed
for other devices that have dma_get_seg_boundary lower than
dma_get_max_seg_size).
Signed-off-by: Mikulas Patocka
Cc: sta...@vger.kernel.org

---
 drivers/parisc/iommu-helpers.h | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

Index: linux-4.4-rc3/drivers/parisc/iommu-helpers.h
===================================================================
--- linux-4.4-rc3.orig/drivers/parisc/iommu-helpers.h	2015-11-30 17:52:10.0 +0100
+++ linux-4.4-rc3/drivers/parisc/iommu-helpers.h	2015-11-30 20:19:53.0 +0100
@@ -104,7 +104,11 @@ iommu_coalesce_chunks(struct ioc *ioc, s
 	struct scatterlist *contig_sg;	   /* contig chunk head */
 	unsigned long dma_offset, dma_len; /* start/len of DMA stream */
 	unsigned int n_mappings = 0;
-	unsigned int max_seg_size = dma_get_max_seg_size(dev);
+	unsigned int max_seg_size = min(dma_get_max_seg_size(dev),
+					(unsigned)DMA_CHUNK_SIZE);
+	unsigned int max_seg_boundary = dma_get_seg_boundary(dev) + 1;
+	if (max_seg_boundary)	/* check if the addition above didn't overflow */
+		max_seg_size = min(max_seg_size, max_seg_boundary);
 
 	while (nents > 0) {
@@ -138,14 +142,11 @@ iommu_coalesce_chunks(struct ioc *ioc,
Re: [dm-devel] AMD-Vi IO_PAGE_FAULTs and ata3.00: failed command: READ FPDMA QUEUED errors since Linux 4.0
On Fri, 9 Oct 2015, Andreas Hartmann wrote:

> Hello Jörg,
> 
> On 10/09/2015 at 04:59 PM, Joerg Roedel wrote:
> > On Fri, Oct 09, 2015 at 11:15:05AM +0200, Andreas Hartmann wrote:
> >> v4.3-rc4 isn't usable at all for me as long as it hangs the machine on
> >> the necessary PCI passthrough for VMs (I need them).
> > 
> > If the fix I just sent you works, could you please test this again with
> > a (patched) v4.3-rc4 kernel?
> 
> Your IOMMU-patch works fine - but the ata-problem can be seen here, too.
> Same behavior as with 4.1.10.

Could you try another ata disk? (copy the whole filesystem to it and run
the same test)

It may be a bug in the disk's firmware.

Mikulas
Re: [dm-devel] AMD-Vi IO_PAGE_FAULTs and ata3.00: failed command: READ FPDMA QUEUED errors since Linux 4.0
On Tue, 29 Sep 2015, Joerg Roedel wrote:

> On Sun, Sep 20, 2015 at 08:50:40AM +0200, Andreas Hartmann wrote:
> > > I would submit this bug to the maintainers of AMD-Vi. They understand
> > > the hardware, so they should tell why large I/O requests result in
> > > IO_PAGE_FAULTs.
> > > 
> > > It is probably a bug either in the AMD-Vi driver or in the hardware.
> > 
> > Until now, I didn't hear anything from the maintainers of AMD-Vi.
> 
> What do you mean by this? I've been commenting on this issue in the
> past and I thought we agreed that this is no issue of the IOMMU driver.
> 
> If it were, bisection should lead to a commit that breaks it, but there
> are no commits between v3.18 and v3.19 in the AMD IOMMU driver touching
> the DMA-API path.
> 
> 	Joerg

I don't know why you are so certain that the bug is not in the AMD-Vi
IOMMU. There was a patch (34b48db66e08ca1c1bc07cf305d672ac940268dc) that
increased the default block request size. That patch triggers AMD-Vi
page faults. The bug may be in the ATA driver, in the ATA controller, or
in the AMD-Vi driver or hardware. I didn't see anything in that thread
that proves that the bug is not in the AMD-Vi IOMMU.

The bug probably existed even before kernel 3.19, but it was masked by
the fact that the I/O request size was artificially capped. Bisecting
probably won't find it, as it may have existed all along.

Mikulas
Re: [dm-devel] AMD-Vi IO_PAGE_FAULTs and ata3.00: failed command: READ FPDMA QUEUED errors since Linux 4.0
On Sun, 2 Aug 2015, Andreas Hartmann wrote:

> On 08/01/2015 at 04:20 PM Andreas Hartmann wrote:
> > On 07/28/2015 at 09:29 PM, Mike Snitzer wrote:
> > [...]
> >> Mikulas was saying to bisect what is causing ATA to fail.
> > 
> > Some good news and some bad news. The good news first:
> > 
> > Your patchset
> > 
> > f3396c58fd8442850e759843457d78b6ec3a9589,
> > cf2f1abfbd0dba701f7f16ef619e4d2485de3366,
> > 7145c241a1bf2841952c3e297c4080b357b3e52d,
> > 94f5e0243c48aa01441c987743dc468e2d6eaca2,
> > dc2676210c425ee8e5cb1bec5bc84d004ddf4179,
> > 0f5d8e6ee758f7023e4353cca75d785b2d4f6abe,
> > b3c5fd3052492f1b8d060799d4f18be5a5438add
> > 
> > seems to work fine w/ 3.18.19 !!
> > 
> > Why did I test it with 3.18.x now? Because I suddenly got two ata errors
> > (ata1 and ata2) with clean 3.19.8 (w/o the AMD-Vi IO_PAGE_FAULTs) during
> > normal operation. This means: 3.19 must already be broken, too.
> > 
> > Therefore, I applied your patchset to 3.18.x and it seems to work like a
> > charm - I don't get any AMD-Vi IO_PAGE_FAULTs on boot and no ata errors
> > (until now).
> > 
> > Next I did: I tried to bisect between 3.18 and 3.19 with your patchset
> > applied, because w/ this patchset applied, the problem can be seen
> > easily and directly on boot. Unfortunately, this does work only a few
> > git bisect rounds until I got stuck because of interferences with your
> > extra patches applied:
> 
> [Resolved the problems written at the last post.]
> 
> Bisecting ended here:
> 
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=34b48db66e08ca1c1bc07cf305d672ac940268dc
> 
> block: remove artifical max_hw_sectors cap
> 
> Removing this patch on 3.19 and 4.1 makes things work again. Didn't
> test 4.0, but I think it's the same. No more AMD-Vi IO_PAGE_FAULTS with
> that patch reverted.
> 
> Please check why this patch triggers AMD-Vi IO_PAGE_FAULTS.
> 
> Thanks,
> kind regards,
> Andreas Hartmann

I would submit this bug to the maintainers of AMD-Vi. They understand
the hardware, so they should tell why large I/O requests result in
IO_PAGE_FAULTs.

It is probably a bug either in the AMD-Vi driver or in the hardware.

Mikulas