Re: [PATCH] xen/swiotlb: correct the check for xen_destroy_contiguous_region
On 4/28/20 10:25 AM, Konrad Rzeszutek Wilk wrote: > On Tue, Apr 28, 2020 at 12:19:41PM +0200, Jürgen Groß wrote: >> On 28.04.20 10:25, Peng Fan wrote: > > Adding Joe Jin. > > Joe, didn't you have some ideas on how this could be implemented? > >>>> Subject: Re: [PATCH] xen/swiotlb: correct the check for >>>> xen_destroy_contiguous_region >>>> >>>> On 28.04.20 09:33, peng@nxp.com wrote: >>>>> From: Peng Fan >>>>> >>>>> When booting xen on i.MX8QM, met: >>>>> " >>>>> [3.602128] Unable to handle kernel paging request at virtual address >>>> 00272d40 >>>>> [3.610804] Mem abort info: >>>>> [3.613905] ESR = 0x9604 >>>>> [3.617332] EC = 0x25: DABT (current EL), IL = 32 bits >>>>> [3.623211] SET = 0, FnV = 0 >>>>> [3.626628] EA = 0, S1PTW = 0 >>>>> [3.630128] Data abort info: >>>>> [3.633362] ISV = 0, ISS = 0x0004 >>>>> [3.637630] CM = 0, WnR = 0 >>>>> [3.640955] [00272d40] user address but active_mm is >>>> swapper >>>>> [3.647983] Internal error: Oops: 9604 [#1] PREEMPT SMP >>>>> [3.654137] Modules linked in: >>>>> [3.677285] Hardware name: Freescale i.MX8QM MEK (DT) >>>>> [3.677302] Workqueue: events deferred_probe_work_func >>>>> [3.684253] imx6q-pcie 5f00.pcie: PCI host bridge to bus :00 >>>>> [3.688297] pstate: 6005 (nZCv daif -PAN -UAO) >>>>> [3.688310] pc : xen_swiotlb_free_coherent+0x180/0x1c0 >>>>> [3.693993] pci_bus :00: root bus resource [bus 00-ff] >>>>> [3.701002] lr : xen_swiotlb_free_coherent+0x44/0x1c0 >>>>> " >>>>> >>>>> In xen_swiotlb_alloc_coherent, if !(dev_addr + size - 1 <= dma_mask) >>>>> or range_straddles_page_boundary(phys, size) are true, it will create >>>>> contiguous region. So when free, we need to free contiguous region use >>>>> upper check condition. >>>> >>>> No, this will break PV guests on x86. >>> >>> Could you share more details why alloc and free not matching for the check? >> >> xen_create_contiguous_region() is needed only in case: >> >> - the bus address is not within dma_mask, or >> - the memory region is not physically contiguous (can happen only for >> PV guests) >> >> In any case it should arrange for the memory to be suitable for the >> DMA operation, so to be contiguous and within dma_mask afterwards. So >> xen_destroy_contiguous_region() should only ever called for areas >> which match above criteria, as otherwise we can be sure >> xen_create_contiguous_region() was not used for making the area DMA-able >> in the beginning. I agreed with Juergen's explanation, That is my understanding. Peng, if panic caused by (dev_addr + size - 1 > dma_mask), you should check how you get the addr, if memory created by xen_create_contiguous_region(), memory must be with in [0 - dma_mask]. Thanks, Joe >> >> And this is very important in the PV case, as in those guests the page >> tables are containing the host-PFNs, not the guest-PFNS, and >> xen_create_contiguous_region() will fiddle with host- vs. guest-PFN >> arrangements, and xen_destroy_contiguous_region() is reverting this >> fiddling. Any call of xen_destroy_contiguous_region() for an area it >> was not intended to be called for might swap physical pages beneath >> random virtual addresses, which was the reason for this test to be >> added by me. >> >> >> Juergen >> >>> >>> Thanks, >>> Peng. >>> >>>> >>>> I think there is something wrong with your setup in combination with the >>>> ARM >>>> xen_create_contiguous_region() implementation. >>>> >>>> Stefano? >>>> >>>> >>>> Juergen >>>> >>>>> >>>>> Signed-off-by: Peng Fan >>>>> --- >>>>>drivers/xen/swiotlb-xen.c | 4 ++-- >>>>>1 file changed, 2 insertions(+), 2 deletions(-) >>>>> >>>>> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c >>>>> index b6d27762c6f8..ab96e468584f 100644 >>>>> --- a/drivers/xen/swiotlb-xen.c >>>>> +++ b/drivers/xen/swiotlb-xen.c >>>>> @@ -346,8 +346,8 @@ xen_swiotlb_free_coherent(struct device *hwdev, >>>> size_t size, void *vaddr, >>>>> /* Convert the size to actually allocated. */ >>>>> size = 1UL << (order + XEN_PAGE_SHIFT); >>>>> >>>>> - if (!WARN_ON((dev_addr + size - 1 > dma_mask) || >>>>> - range_straddles_page_boundary(phys, size)) && >>>>> + if (((dev_addr + size - 1 > dma_mask) || >>>>> + range_straddles_page_boundary(phys, size)) && >>>>> TestClearPageXenRemapped(virt_to_page(vaddr))) >>>>> xen_destroy_contiguous_region(phys, order); >>>>> >>>>> >>> >> ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/1] swiotlb: save io_tlb_used to local variable before leaving critical section
I'm good to have this patch, which helps identify the cause of failure is fragmentation or it really been used up. On 4/12/19 4:38 AM, Dongli Zhang wrote: > When swiotlb is full, the kernel would print io_tlb_used. However, the > result might be inaccurate at that time because we have left the critical > section protected by spinlock. > > Therefore, we backup the io_tlb_used into local variable before leaving > critical section. > > Fixes: 83ca25948940 ("swiotlb: dump used and total slots when swiotlb buffer > is full") > Suggested-by: Håkon Bugge > Signed-off-by: Dongli Zhang Reviewed-by: Joe Jin Thanks, Joe ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 1/2] swiotlb: add debugfs to track swiotlb buffer usage
On 12/10/18 12:00 PM, Tim Chen wrote: >> @@ -528,6 +538,9 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, >> dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes)\n", >> size); >> return SWIOTLB_MAP_ERROR; >> found: >> +#ifdef CONFIG_DEBUG_FS >> +io_tlb_used += nslots; >> +#endif > One nit I have about this patch is there are too many CONFIG_DEBUG_FS. > > For example here, instead of io_tlb_used, we can have a macro defined, > perhaps something like inc_iotlb_used(nslots). It can be placed in the > same section that swiotlb_create_debugfs is defined so there's a single > place where all the CONFIG_DEBUG_FS stuff is located. > > Then define inc_iotlb_used to be null when we don't have > CONFIG_DEBUG_FS. > Dongli had removed above ifdef/endif on his next patch, "[PATCH v2 2/2] swiotlb: checking whether swiotlb buffer is full with io_tlb_used" Thanks, Joe ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 1/2] swiotlb: add debugfs to track swiotlb buffer usage
On 12/9/18 4:37 PM, Dongli Zhang wrote: > The device driver will not be able to do dma operations once swiotlb buffer > is full, either because the driver is using so many IO TLB blocks inflight, > or because there is memory leak issue in device driver. To export the > swiotlb buffer usage via debugfs would help the user estimate the size of > swiotlb buffer to pre-allocate or analyze device driver memory leak issue. > > Signed-off-by: Dongli Zhang Reviewed-by: Joe Jin > --- > Changed since v1: > * init debugfs with late_initcall (suggested by Robin Murphy) > * create debugfs entries with debugfs_create_ulong(suggested by Robin > Murphy) > > kernel/dma/swiotlb.c | 50 ++ > 1 file changed, 50 insertions(+) > > diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c > index 045930e..3979c2c 100644 > --- a/kernel/dma/swiotlb.c > +++ b/kernel/dma/swiotlb.c > @@ -35,6 +35,9 @@ > #include > #include > #include > +#ifdef CONFIG_DEBUG_FS > +#include > +#endif > > #include > #include > @@ -73,6 +76,13 @@ static phys_addr_t io_tlb_start, io_tlb_end; > */ > static unsigned long io_tlb_nslabs; > > +#ifdef CONFIG_DEBUG_FS > +/* > + * The number of used IO TLB block > + */ > +static unsigned long io_tlb_used; > +#endif > + > /* > * This is a free list describing the number of free entries available from > * each index > @@ -528,6 +538,9 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, > dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes)\n", > size); > return SWIOTLB_MAP_ERROR; > found: > +#ifdef CONFIG_DEBUG_FS > + io_tlb_used += nslots; > +#endif > spin_unlock_irqrestore(_tlb_lock, flags); > > /* > @@ -588,6 +601,10 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, > phys_addr_t tlb_addr, >*/ > for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) != > IO_TLB_SEGSIZE -1) && io_tlb_list[i]; i--) > io_tlb_list[i] = ++count; > + > +#ifdef CONFIG_DEBUG_FS > + io_tlb_used -= nslots; > +#endif > } > spin_unlock_irqrestore(_tlb_lock, flags); > } > @@ -883,3 +900,36 @@ const struct dma_map_ops swiotlb_dma_ops = { > .dma_supported = dma_direct_supported, > }; > EXPORT_SYMBOL(swiotlb_dma_ops); > + > +#ifdef CONFIG_DEBUG_FS > + > +static int __init swiotlb_create_debugfs(void) > +{ > + static struct dentry *d_swiotlb_usage; > + struct dentry *ent; > + > + d_swiotlb_usage = debugfs_create_dir("swiotlb", NULL); > + > + if (!d_swiotlb_usage) > + return -ENOMEM; > + > + ent = debugfs_create_ulong("io_tlb_nslabs", 0400, > +d_swiotlb_usage, _tlb_nslabs); > + if (!ent) > + goto fail; > + > + ent = debugfs_create_ulong("io_tlb_used", 0400, > + d_swiotlb_usage, _tlb_used); > + if (!ent) > + goto fail; > + > + return 0; > + > +fail: > + debugfs_remove_recursive(d_swiotlb_usage); > + return -ENOMEM; > +} > + > +late_initcall(swiotlb_create_debugfs); > + > +#endif > -- Oracle <http://www.oracle.com> Joe Jin | Software Development Director ORACLE | Linux and Virtualization 500 Oracle Parkway Redwood City, CA US 94065 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2 2/2] swiotlb: checking whether swiotlb buffer is full with io_tlb_used
On 12/9/18 4:37 PM, Dongli Zhang wrote: > This patch uses io_tlb_used to help check whether swiotlb buffer is full. > io_tlb_used is no longer used for only debugfs. It is also used to help > optimize swiotlb_tbl_map_single(). > > Suggested-by: Joe Jin > Signed-off-by: Dongli Zhang Reviewed-by: Joe Jin > --- > kernel/dma/swiotlb.c | 10 -- > 1 file changed, 4 insertions(+), 6 deletions(-) > > diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c > index 3979c2c..9300341 100644 > --- a/kernel/dma/swiotlb.c > +++ b/kernel/dma/swiotlb.c > @@ -76,12 +76,10 @@ static phys_addr_t io_tlb_start, io_tlb_end; > */ > static unsigned long io_tlb_nslabs; > > -#ifdef CONFIG_DEBUG_FS > /* > * The number of used IO TLB block > */ > static unsigned long io_tlb_used; > -#endif > > /* > * This is a free list describing the number of free entries available from > @@ -489,6 +487,10 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, >* request and allocate a buffer from that IO TLB pool. >*/ > spin_lock_irqsave(_tlb_lock, flags); > + > + if (unlikely(nslots > io_tlb_nslabs - io_tlb_used)) > + goto not_found; > + > index = ALIGN(io_tlb_index, stride); > if (index >= io_tlb_nslabs) > index = 0; > @@ -538,9 +540,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, > dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes)\n", > size); > return SWIOTLB_MAP_ERROR; > found: > -#ifdef CONFIG_DEBUG_FS > io_tlb_used += nslots; > -#endif > spin_unlock_irqrestore(_tlb_lock, flags); > > /* > @@ -602,9 +602,7 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, > phys_addr_t tlb_addr, > for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) != > IO_TLB_SEGSIZE -1) && io_tlb_list[i]; i--) > io_tlb_list[i] = ++count; > > -#ifdef CONFIG_DEBUG_FS > io_tlb_used -= nslots; > -#endif > } > spin_unlock_irqrestore(_tlb_lock, flags); > } > -- Oracle <http://www.oracle.com> Joe Jin | Software Development Director ORACLE | Linux and Virtualization 500 Oracle Parkway Redwood City, CA US 94065 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH RFC 1/1] swiotlb: add debugfs to track swiotlb buffer usage
On 12/6/18 9:49 PM, Dongli Zhang wrote: > > > On 12/07/2018 12:12 AM, Joe Jin wrote: >> Hi Dongli, >> >> Maybe move d_swiotlb_usage declare into swiotlb_create_debugfs(): > > I assume the call of swiotlb_tbl_map_single() might be frequent in some > situations, e.g., when 'swiotlb=force'. > > That's why I declare the d_swiotlb_usage out of any functions and use "if > (unlikely(!d_swiotlb_usage))". This is reasonable. Thanks, Joe > > I think "if (unlikely(!d_swiotlb_usage))" incur less performance overhead than > calling swiotlb_create_debugfs() every time to confirm if debugfs is created. > I > would declare d_swiotlb_usage statically inside swiotlb_create_debugfs() if > the > performance overhead is acceptable (it is trivial indeed). > > > That is the reason I tag the patch with RFC because I am not sure if the > on-demand creation of debugfs is fine with maintainers/reviewers. If swiotlb > pages are never allocated, we would not be able to see the debugfs entry. > > I would prefer to limit the modification within swiotlb and to not taint any > other files. > > The drawback is there is no place to create or delete the debugfs entry > because > swiotlb buffer could be initialized and uninitialized at very early stage. > >> >> void swiotlb_create_debugfs(void) >> { >> #ifdef CONFIG_DEBUG_FS >> static struct dentry *d_swiotlb_usage = NULL; >> >> if (d_swiotlb_usage) >> return; >> >> d_swiotlb_usage = debugfs_create_dir("swiotlb", NULL); >> >> if (!d_swiotlb_usage) >> return; >> >> debugfs_create_file("usage", 0600, d_swiotlb_usage, >> NULL, _usage_fops); >> #endif >> } >> >> And for io_tlb_used, possible add a check at the begin of >> swiotlb_tbl_map_single(), >> if there were not any free slots or not enough slots, return fail directly? > > This would optimize the slots allocation path. I will follow this in next > version after I got more suggestions and confirmations from maintainers. > > > Thank you very much! > > Dongli Zhang > >> >> Thanks, >> Joe >> On 12/5/18 7:59 PM, Dongli Zhang wrote: >>> The device driver will not be able to do dma operations once swiotlb buffer >>> is full, either because the driver is using so many IO TLB blocks inflight, >>> or because there is memory leak issue in device driver. To export the >>> swiotlb buffer usage via debugfs would help the user estimate the size of >>> swiotlb buffer to pre-allocate or analyze device driver memory leak issue. >>> >>> As the swiotlb can be initialized at very early stage when debugfs cannot >>> register successfully, this patch creates the debugfs entry on demand. >>> >>> Signed-off-by: Dongli Zhang >>> --- >>> kernel/dma/swiotlb.c | 57 >>> >>> 1 file changed, 57 insertions(+) >>> >>> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c >>> index 045930e..d3c8aa4 100644 >>> --- a/kernel/dma/swiotlb.c >>> +++ b/kernel/dma/swiotlb.c >>> @@ -35,6 +35,9 @@ >>> #include >>> #include >>> #include >>> +#ifdef CONFIG_DEBUG_FS >>> +#include >>> +#endif >>> >>> #include >>> #include >>> @@ -73,6 +76,13 @@ static phys_addr_t io_tlb_start, io_tlb_end; >>> */ >>> static unsigned long io_tlb_nslabs; >>> >>> +#ifdef CONFIG_DEBUG_FS >>> +/* >>> + * The number of used IO TLB block >>> + */ >>> +static unsigned long io_tlb_used; >>> +#endif >>> + >>> /* >>> * This is a free list describing the number of free entries available from >>> * each index >>> @@ -100,6 +110,41 @@ static DEFINE_SPINLOCK(io_tlb_lock); >>> >>> static int late_alloc; >>> >>> +#ifdef CONFIG_DEBUG_FS >>> + >>> +static struct dentry *d_swiotlb_usage; >>> + >>> +static int swiotlb_usage_show(struct seq_file *m, void *v) >>> +{ >>> + seq_printf(m, "%lu\n%lu\n", io_tlb_used, io_tlb_nslabs); >>> + return 0; >>> +} >>> + >>> +static int swiotlb_usage_open(struct inode *inode, struct file *filp) >>> +{ >>> + return single_open(filp, swiotlb_usage_show, NULL); >>> +} >>> + >>> +static const struc
Re: [PATCH RFC 1/1] swiotlb: add debugfs to track swiotlb buffer usage
Hi Dongli, Maybe move d_swiotlb_usage declare into swiotlb_create_debugfs(): void swiotlb_create_debugfs(void) { #ifdef CONFIG_DEBUG_FS static struct dentry *d_swiotlb_usage = NULL; if (d_swiotlb_usage) return; d_swiotlb_usage = debugfs_create_dir("swiotlb", NULL); if (!d_swiotlb_usage) return; debugfs_create_file("usage", 0600, d_swiotlb_usage, NULL, _usage_fops); #endif } And for io_tlb_used, possible add a check at the begin of swiotlb_tbl_map_single(), if there were not any free slots or not enough slots, return fail directly? Thanks, Joe On 12/5/18 7:59 PM, Dongli Zhang wrote: > The device driver will not be able to do dma operations once swiotlb buffer > is full, either because the driver is using so many IO TLB blocks inflight, > or because there is memory leak issue in device driver. To export the > swiotlb buffer usage via debugfs would help the user estimate the size of > swiotlb buffer to pre-allocate or analyze device driver memory leak issue. > > As the swiotlb can be initialized at very early stage when debugfs cannot > register successfully, this patch creates the debugfs entry on demand. > > Signed-off-by: Dongli Zhang > --- > kernel/dma/swiotlb.c | 57 > > 1 file changed, 57 insertions(+) > > diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c > index 045930e..d3c8aa4 100644 > --- a/kernel/dma/swiotlb.c > +++ b/kernel/dma/swiotlb.c > @@ -35,6 +35,9 @@ > #include > #include > #include > +#ifdef CONFIG_DEBUG_FS > +#include > +#endif > > #include > #include > @@ -73,6 +76,13 @@ static phys_addr_t io_tlb_start, io_tlb_end; > */ > static unsigned long io_tlb_nslabs; > > +#ifdef CONFIG_DEBUG_FS > +/* > + * The number of used IO TLB block > + */ > +static unsigned long io_tlb_used; > +#endif > + > /* > * This is a free list describing the number of free entries available from > * each index > @@ -100,6 +110,41 @@ static DEFINE_SPINLOCK(io_tlb_lock); > > static int late_alloc; > > +#ifdef CONFIG_DEBUG_FS > + > +static struct dentry *d_swiotlb_usage; > + > +static int swiotlb_usage_show(struct seq_file *m, void *v) > +{ > + seq_printf(m, "%lu\n%lu\n", io_tlb_used, io_tlb_nslabs); > + return 0; > +} > + > +static int swiotlb_usage_open(struct inode *inode, struct file *filp) > +{ > + return single_open(filp, swiotlb_usage_show, NULL); > +} > + > +static const struct file_operations swiotlb_usage_fops = { > + .open = swiotlb_usage_open, > + .read = seq_read, > + .llseek = seq_lseek, > + .release= single_release, > +}; > + > +void swiotlb_create_debugfs(void) > +{ > + d_swiotlb_usage = debugfs_create_dir("swiotlb", NULL); > + > + if (!d_swiotlb_usage) > + return; > + > + debugfs_create_file("usage", 0600, d_swiotlb_usage, > + NULL, _usage_fops); > +} > + > +#endif > + > static int __init > setup_io_tlb_npages(char *str) > { > @@ -449,6 +494,11 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, > pr_warn_once("%s is active and system is using DMA bounce > buffers\n", >sme_active() ? "SME" : "SEV"); > > +#ifdef CONFIG_DEBUG_FS > + if (unlikely(!d_swiotlb_usage)) > + swiotlb_create_debugfs(); > +#endif > + > mask = dma_get_seg_boundary(hwdev); > > tbl_dma_addr &= mask; > @@ -528,6 +578,9 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, > dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes)\n", > size); > return SWIOTLB_MAP_ERROR; > found: > +#ifdef CONFIG_DEBUG_FS > + io_tlb_used += nslots; > +#endif > spin_unlock_irqrestore(_tlb_lock, flags); > > /* > @@ -588,6 +641,10 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, > phys_addr_t tlb_addr, >*/ > for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) != > IO_TLB_SEGSIZE -1) && io_tlb_list[i]; i--) > io_tlb_list[i] = ++count; > + > +#ifdef CONFIG_DEBUG_FS > + io_tlb_used -= nslots; > +#endif > } > spin_unlock_irqrestore(_tlb_lock, flags); > } > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu