[PATCH 11/13 v2] ARM: ixp4xx: Drop custom DMA coherency and bouncing
The new PCI driver does not need any of this stuff, so just drop it.

Cc: iommu@lists.linux-foundation.org
Reviewed-by: Christoph Hellwig
Signed-off-by: Linus Walleij
---
ChangeLog v1->v2:
- Pick up Christoph's Reviewed-by and add proper CC for iommu
- Resending with the rest
---
 arch/arm/Kconfig              |  5 ---
 arch/arm/mach-ixp4xx/common.c | 57 ---
 kernel/dma/mapping.c          |  2 --
 3 files changed, 64 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 3a95203236d2..ec0dbaf73a81 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -217,9 +217,6 @@ config ARCH_MAY_HAVE_PC_FDC
 config ARCH_SUPPORTS_UPROBES
 	def_bool y

-config ARCH_HAS_DMA_SET_COHERENT_MASK
-	bool
-
 config GENERIC_ISA_DMA
 	bool

@@ -381,10 +378,8 @@ config ARCH_IOP32X
 config ARCH_IXP4XX
 	bool "IXP4xx-based"
 	depends on MMU
-	select ARCH_HAS_DMA_SET_COHERENT_MASK
 	select ARCH_SUPPORTS_BIG_ENDIAN
 	select CPU_XSCALE
-	select DMABOUNCE if PCI
 	select GENERIC_IRQ_MULTI_HANDLER
 	select GPIO_IXP4XX
 	select GPIOLIB
diff --git a/arch/arm/mach-ixp4xx/common.c b/arch/arm/mach-ixp4xx/common.c
index 4e51514ace6d..310e1602fbfc 100644
--- a/arch/arm/mach-ixp4xx/common.c
+++ b/arch/arm/mach-ixp4xx/common.c
@@ -30,7 +30,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -330,59 +329,3 @@ void ixp4xx_restart(enum reboot_mode mode, const char *cmd)
 		*IXP4XX_OSWE = IXP4XX_WDT_RESET_ENABLE | IXP4XX_WDT_COUNT_ENABLE;
 	}
 }
-
-#ifdef CONFIG_PCI
-static int ixp4xx_needs_bounce(struct device *dev, dma_addr_t dma_addr, size_t size)
-{
-	return (dma_addr + size) > SZ_64M;
-}
-
-static int ixp4xx_platform_notify_remove(struct device *dev)
-{
-	if (dev_is_pci(dev))
-		dmabounce_unregister_dev(dev);
-
-	return 0;
-}
-#endif
-
-/*
- * Setup DMA mask to 64MB on PCI devices and 4 GB on all other things.
- */
-static int ixp4xx_platform_notify(struct device *dev)
-{
-	dev->dma_mask = &dev->coherent_dma_mask;
-
-#ifdef CONFIG_PCI
-	if (dev_is_pci(dev)) {
-		dev->coherent_dma_mask = DMA_BIT_MASK(28); /* 64 MB */
-		dmabounce_register_dev(dev, 2048, 4096, ixp4xx_needs_bounce);
-		return 0;
-	}
-#endif
-
-	dev->coherent_dma_mask = DMA_BIT_MASK(32);
-	return 0;
-}
-
-int dma_set_coherent_mask(struct device *dev, u64 mask)
-{
-	if (dev_is_pci(dev))
-		mask &= DMA_BIT_MASK(28); /* 64 MB */
-
-	if ((mask & DMA_BIT_MASK(28)) == DMA_BIT_MASK(28)) {
-		dev->coherent_dma_mask = mask;
-		return 0;
-	}
-
-	return -EIO;/* device wanted sub-64MB mask */
-}
-EXPORT_SYMBOL(dma_set_coherent_mask);
-
-void __init ixp4xx_init_early(void)
-{
-	platform_notify = ixp4xx_platform_notify;
-#ifdef CONFIG_PCI
-	platform_notify_remove = ixp4xx_platform_notify_remove;
-#endif
-}
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 9478eccd1c8e..559461a826ba 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -745,7 +745,6 @@ int dma_set_mask(struct device *dev, u64 mask)
 }
 EXPORT_SYMBOL(dma_set_mask);

-#ifndef CONFIG_ARCH_HAS_DMA_SET_COHERENT_MASK
 int dma_set_coherent_mask(struct device *dev, u64 mask)
 {
 	/*
@@ -761,7 +760,6 @@ int dma_set_coherent_mask(struct device *dev, u64 mask)
 	return 0;
 }
 EXPORT_SYMBOL(dma_set_coherent_mask);
-#endif

 size_t dma_max_mapping_size(struct device *dev)
 {
-- 
2.34.1
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
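For readers who have not seen the removed logic before, the following is a minimal user-space sketch of the two checks the patch deletes: the 64 MiB bounce threshold and the 28-bit coherent mask. It is not kernel code. DMA_BIT_MASK() is written out the way the kernel defines it, SZ_64M is the usual 64 MiB constant, and addr_fits_mask() is a hypothetical helper added only for illustration. Note that a 28-bit mask spans 256 MiB, so the mask and the bounce threshold are not the same limit, even though the removed comments label both as 64 MB.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same expression the kernel uses for DMA_BIT_MASK(). */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* 64 MiB, the value of the kernel's SZ_64M constant. */
#define SZ_64M 0x04000000ULL

/*
 * User-space model of the removed ixp4xx_needs_bounce(): a buffer must be
 * bounced when it ends beyond the first 64 MiB of bus address space.
 */
static bool needs_bounce(uint64_t dma_addr, size_t size)
{
	return (dma_addr + size) > SZ_64M;
}

/*
 * Hypothetical helper (not a kernel function): true when every set bit of
 * addr is covered by mask.
 */
static bool addr_fits_mask(uint64_t addr, uint64_t mask)
{
	return (addr & ~mask) == 0;
}
```

With these definitions, a one-page buffer ending exactly at 64 MiB does not bounce, while an address at 80 MiB (0x05000000) fits the 28-bit mask yet would still have been bounced.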
Re: [PATCH v4 05/11] iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit
On 2/7/22 15:02, Fenghua Yu wrote:
...
> Get rid of the refcounting mechanisms and replace/rename the interfaces
> to reflect this new approach.
...
>  .../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |  5 +--
>  drivers/iommu/intel/iommu.c                   |  4 +-
>  drivers/iommu/intel/svm.c                     |  9 -
>  drivers/iommu/ioasid.c                        | 39 ++-
>  drivers/iommu/iommu-sva-lib.c                 | 39 ++-
>  drivers/iommu/iommu-sva-lib.h                 |  1 -
>  include/linux/ioasid.h                        | 12 +-
>  include/linux/sched/mm.h                      | 16
>  kernel/fork.c                                 |  1 +
>  9 files changed, 38 insertions(+), 88 deletions(-)

Given the heavily non-x86 diffstat here, I was hoping to see some acks
from folks that this might affect, especially on the ARM side. Is
everyone OK with this?
Re: [PATCH v4 00/11] Re-enable ENQCMD and PASID MSR
Hi, Thomas,

On Mon, Feb 07, 2022 at 03:02:43PM -0800, Fenghua Yu wrote:
> Problems in the old code to manage SVM (Shared Virtual Memory) devices
> and the PASID (Process Address Space ID) led to that code being
> disabled.
>
> Subsequent discussions resulted in a far simpler approach:
>
> 1) PASID life cycle is from first allocation by a process until that
>    process exits.
> 2) All tasks begin with PASID disabled
> 3) The #GP fault handler tries to fix faulting ENQCMD instructions very
>    early (thus avoiding complexities of the XSAVE infrastructure)
>
> Change Log:
> v4:
> - Update commit message in patch #4 (Thomas).
> - Update commit message in patch #5 (Thomas).
> - Add "Reviewed-by: Thomas Gleixner" in patch #1-#3
>   and patch #6-#9 (Thomas).
> - Rebased to 5.17-rc3.

A friendly reminder. Any comment on this series? Will you pick up this
series in tip?

Thank you very much!

-Fenghua
Re: [PATCH] dma-mapping: benchmark: Extract a common header file for map_benchmark definition
On 2/10/22 9:22 PM, Song Bao Hua (Barry Song) wrote:
>> -----Original Message-----
>> From: tiantao (H)
>> Sent: Friday, February 11, 2022 4:15 PM
>> To: Song Bao Hua (Barry Song); sh...@kernel.org; chenxiang (M)
>> Cc: iommu@lists.linux-foundation.org; linux-kselft...@vger.kernel.org;
>> linux...@openeuler.org
>> Subject: [PATCH] dma-mapping: benchmark: Extract a common header file for
>> map_benchmark definition
>>
>> kernel/dma/map_benchmark.c and selftests/dma/dma_map_benchmark.c
>> have duplicate map_benchmark definitions, which tends to lead to
>> inconsistent changes to map_benchmark on both sides. Extract a
>> common header file to avoid this problem.
>>
>> Signed-off-by: Tian Tao
>
> +To: Christoph
>
> Looks like the right cleanup. This will help decrease the maintenance
> overhead in the future. Other similar selftests tools are also
> doing this.
>
> Acked-by: Barry Song

+1 on this cleanup making this code maintainable. We are moving in
the direction of cleaning up defines in selftests for the same
reason.

Let's just make sure this works on older kernels. We do support
mainline kselftest on stable releases.

With that:

Reviewed-by: Shuah Khan

thanks,
-- Shuah
[PATCH v5 5/6] drivers: virtio_mem: use pageblock size as the minimum virtio_mem size.
From: Zi Yan

alloc_contig_range() now only needs to be aligned to pageblock_order,
so drop the virtio_mem size requirement that it be the max of
pageblock_order and MAX_ORDER.

Signed-off-by: Zi Yan
---
 drivers/virtio/virtio_mem.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 38becd8d578c..2307e65d18c2 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -2476,13 +2476,12 @@ static int virtio_mem_init_hotplug(struct virtio_mem *vm)
 				      VIRTIO_MEM_DEFAULT_OFFLINE_THRESHOLD);

 	/*
-	 * We want subblocks to span at least MAX_ORDER_NR_PAGES and
-	 * pageblock_nr_pages pages. This:
+	 * We want subblocks to span at least pageblock_nr_pages pages.
+	 * This:
 	 * - Is required for now for alloc_contig_range() to work reliably -
 	 *   it doesn't properly handle smaller granularity on ZONE_NORMAL.
 	 */
-	sb_size = max_t(uint64_t, MAX_ORDER_NR_PAGES,
-			pageblock_nr_pages) * PAGE_SIZE;
+	sb_size = pageblock_nr_pages * PAGE_SIZE;
 	sb_size = max_t(uint64_t, vm->device_block_size, sb_size);

 	if (sb_size < memory_block_size_bytes() && !force_bbm) {
-- 
2.34.1
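To make the effect of this hunk concrete, here is a small user-space sketch of the subblock size computation before and after the change. The constants are assumptions chosen for illustration (4 KiB pages, pageblock_order 9, MAX_ORDER 11, as was common on x86-64 at the time); they do not come from the patch, and sb_size_old()/sb_size_new() are hypothetical names.

```c
#include <stdint.h>

/* Assumed example values, not taken from the patch. */
#define PAGE_SIZE          4096ULL
#define PAGEBLOCK_NR_PAGES 512ULL   /* pageblock_order = 9 */
#define MAX_ORDER_NR_PAGES 1024ULL  /* 1 << (MAX_ORDER - 1), MAX_ORDER = 11 */

static uint64_t max_u64(uint64_t a, uint64_t b)
{
	return a > b ? a : b;
}

/* Subblock size as computed before the patch. */
static uint64_t sb_size_old(uint64_t device_block_size)
{
	uint64_t sb = max_u64(MAX_ORDER_NR_PAGES, PAGEBLOCK_NR_PAGES) * PAGE_SIZE;

	return max_u64(device_block_size, sb);
}

/* Subblock size as computed after the patch. */
static uint64_t sb_size_new(uint64_t device_block_size)
{
	uint64_t sb = PAGEBLOCK_NR_PAGES * PAGE_SIZE;

	return max_u64(device_block_size, sb);
}
```

Under these assumptions, a device block size of 1 MiB gives a 4 MiB subblock before the patch and a 2 MiB subblock after it; a device block size of 8 MiB dominates in both cases.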
[PATCH v5 0/6] Use pageblock_order for cma and alloc_contig_range alignment.
From: Zi Yan

Hi all,

This patchset tries to remove the MAX_ORDER-1 alignment requirement for CMA
and alloc_contig_range(). It prepares for my upcoming changes to make
MAX_ORDER adjustable at boot time[1]. It is on top of mmotm-2022-02-08-15-31.

Changelog
===
V5
---
1. Moved isolation address alignment handling in start_isolate_page_range().
2. Rewrote and simplified how alloc_contig_range() works at pageblock
   granularity (Patch 3). Only two pageblock migratetypes need to be saved and
   restored. start_isolate_page_range() might need to migrate pages in this
   version, but it prevents the caller from worrying about
   max(MAX_ORDER_NR_PAGES, pageblock_nr_pages) alignment after the page range
   is isolated.

V4
---
1. Dropped two irrelevant patches on non-lru compound page handling, as
   it is not supported upstream.
2. Renamed migratetype_has_fallback() to migratetype_is_mergeable().
3. Always check whether two pageblocks can be merged in
   __free_one_page() when order is >= pageblock_order, as the case (not
   mergeable pageblocks are isolated, CMA, and HIGHATOMIC) becomes more common.
4. Moving has_unmovable_pages() is now a separate patch.
5. Removed MAX_ORDER-1 alignment requirement in the comment in virtio_mem code.

Description
===

The MAX_ORDER - 1 alignment requirement comes from the fact that
alloc_contig_range() isolates pageblocks to remove free memory from the buddy
allocator, but isolating only a subset of pageblocks within a page spanning
across multiple pageblocks causes free page accounting issues. An isolated
page might not be put into the right free list, since the code assumes the
migratetype of the first pageblock as the whole free page migratetype. This is
based on the discussion at [2].

To remove the requirement, this patchset:
1. isolates pages at pageblock granularity instead of
   max(MAX_ORDER_NR_PAGES, pageblock_nr_pages);
2. splits free pages across the specified range or migrates in-use pages
   across the specified range then splits the freed page to avoid free page
   accounting issues (it happens when multiple pageblocks within a single page
   have different migratetypes);
3. only checks unmovable pages within the range instead of MAX_ORDER - 1
   aligned range during isolation to avoid alloc_contig_range() failure when
   pageblocks within a MAX_ORDER - 1 aligned range are allocated separately;
4. returns pages not in the range as it did before.

One optimization might come later:
1. make MIGRATE_ISOLATE a separate bit to be able to restore the original
   migratetypes when isolation fails in the middle of the range.

Feel free to give comments and suggestions. Thanks.

[1] https://lore.kernel.org/linux-mm/20210805190253.2795604-1-zi@sent.com/
[2] https://lore.kernel.org/linux-mm/d19fb078-cb9b-f60f-e310-fdeea1b94...@redhat.com/

Zi Yan (6):
  mm: page_isolation: move has_unmovable_pages() to mm/page_isolation.c
  mm: page_isolation: check specified range for unmovable pages
  mm: make alloc_contig_range work at pageblock granularity
  mm: cma: use pageblock_order as the single alignment
  drivers: virtio_mem: use pageblock size as the minimum virtio_mem size.
  arch: powerpc: adjust fadump alignment to be pageblock aligned.

 arch/powerpc/include/asm/fadump-internal.h |   4 +-
 drivers/virtio/virtio_mem.c                |   7 +-
 include/linux/mmzone.h                     |   5 +-
 include/linux/page-isolation.h             |  16 +-
 kernel/dma/contiguous.c                    |   2 +-
 mm/cma.c                                   |   6 +-
 mm/internal.h                              |   3 +
 mm/memory_hotplug.c                        |   3 +-
 mm/page_alloc.c                            | 371 ++---
 mm/page_isolation.c                        | 172 +-
 10 files changed, 367 insertions(+), 222 deletions(-)

-- 
2.34.1
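The difference between the two isolation granularities can be illustrated with a short user-space sketch. It only models the arithmetic of widening a requested PFN range to a granule boundary; it is not the patchset's code. The granule sizes (512 and 1024 pages) are assumed example values, and struct pfn_range and isolation_range() are hypothetical names.

```c
/* Assumed example values: pageblock_order = 9, MAX_ORDER = 11. */
#define PAGEBLOCK_NR_PAGES 512UL
#define MAX_ORDER_NR_PAGES 1024UL

/* Half-open PFN range [start, end). */
struct pfn_range {
	unsigned long start;
	unsigned long end;
};

/* Round x down to a multiple of a (a must be a power of two). */
static unsigned long align_down(unsigned long x, unsigned long a)
{
	return x & ~(a - 1);
}

/* Round x up to a multiple of a (a must be a power of two). */
static unsigned long align_up(unsigned long x, unsigned long a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Widen [start, end) so that both ends sit on a granule boundary. */
static struct pfn_range isolation_range(unsigned long start,
					unsigned long end,
					unsigned long granule)
{
	struct pfn_range r = {
		.start = align_down(start, granule),
		.end = align_up(end, granule),
	};

	return r;
}
```

For a request of [1500, 2100), the pageblock granule widens the range to [1024, 2560), while the larger granule widens it to [1024, 3072), so one extra pageblock would be isolated under the old rule.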
[PATCH v5 6/6] arch: powerpc: adjust fadump alignment to be pageblock aligned.
From: Zi Yan

CMA only requires pageblock alignment now. Change CMA alignment in
fadump too.

Signed-off-by: Zi Yan
---
 arch/powerpc/include/asm/fadump-internal.h | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/fadump-internal.h b/arch/powerpc/include/asm/fadump-internal.h
index 52189928ec08..fbfca85b4200 100644
--- a/arch/powerpc/include/asm/fadump-internal.h
+++ b/arch/powerpc/include/asm/fadump-internal.h
@@ -20,9 +20,7 @@
 #define memblock_num_regions(memblock_type)	(memblock.memblock_type.cnt)

 /* Alignment per CMA requirement. */
-#define FADUMP_CMA_ALIGNMENT	(PAGE_SIZE << \
-				 max_t(unsigned long, MAX_ORDER - 1,	\
-				 pageblock_order))
+#define FADUMP_CMA_ALIGNMENT	(PAGE_SIZE << pageblock_order)

 /* FAD commands */
 #define FADUMP_REGISTER		1
-- 
2.34.1
[PATCH v5 2/6] mm: page_isolation: check specified range for unmovable pages
From: Zi Yan

Enable set_migratetype_isolate() to check specified sub-range for
unmovable pages during isolation. Page isolation is done
at max(MAX_ORDER_NR_PAGES, pageblock_nr_pages) granularity, but not all
pages within that granularity are intended to be isolated. For example,
alloc_contig_range(), which uses page isolation, allows ranges without
alignment. This commit makes unmovable page check only look for
interesting pages, so that page isolation can succeed for any
non-overlapping ranges.

Signed-off-by: Zi Yan
---
 include/linux/page-isolation.h | 12 +
 mm/page_alloc.c                | 15 +--
 mm/page_isolation.c            | 46 +-
 3 files changed, 41 insertions(+), 32 deletions(-)

diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index e14eddf6741a..4ef7be6def83 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -15,6 +15,18 @@ static inline bool is_migrate_isolate(int migratetype)
 {
 	return migratetype == MIGRATE_ISOLATE;
 }
+static inline unsigned long pfn_max_align_down(unsigned long pfn)
+{
+	return ALIGN_DOWN(pfn, max_t(unsigned long, MAX_ORDER_NR_PAGES,
+				     pageblock_nr_pages));
+}
+
+static inline unsigned long pfn_max_align_up(unsigned long pfn)
+{
+	return ALIGN(pfn, max_t(unsigned long, MAX_ORDER_NR_PAGES,
+				pageblock_nr_pages));
+}
+
 #else
 static inline bool has_isolate_pageblock(struct zone *zone)
 {
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e2c6a67fc386..62ef78f3d771 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8963,18 +8963,6 @@ void *__init alloc_large_system_hash(const char *tablename,
 }

 #ifdef CONFIG_CONTIG_ALLOC
-static unsigned long pfn_max_align_down(unsigned long pfn)
-{
-	return pfn & ~(max_t(unsigned long, MAX_ORDER_NR_PAGES,
-			     pageblock_nr_pages) - 1);
-}
-
-static unsigned long pfn_max_align_up(unsigned long pfn)
-{
-	return ALIGN(pfn, max_t(unsigned long, MAX_ORDER_NR_PAGES,
-				pageblock_nr_pages));
-}
-
 #if defined(CONFIG_DYNAMIC_DEBUG) || \
 	(defined(CONFIG_DYNAMIC_DEBUG_CORE) && defined(DYNAMIC_DEBUG_MODULE))
 /* Usage: See admin-guide/dynamic-debug-howto.rst */
@@ -9119,8 +9107,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 	 * put back to page allocator so that buddy can use them.
 	 */

-	ret = start_isolate_page_range(pfn_max_align_down(start),
-				       pfn_max_align_up(end), migratetype, 0);
+	ret = start_isolate_page_range(start, end, migratetype, 0);
 	if (ret)
 		return ret;

diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index b34f1310aeaa..64d093ab83ec 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -16,7 +16,8 @@
 #include

 /*
- * This function checks whether pageblock includes unmovable pages or not.
+ * This function checks whether pageblock within [start_pfn, end_pfn) includes
+ * unmovable pages or not.
  *
  * PageLRU check without isolation or lru_lock could race so that
  * MIGRATE_MOVABLE block might include unmovable pages. And __PageMovable
@@ -29,11 +30,14 @@
  *
  */
 static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
-				 int migratetype, int flags)
+				 int migratetype, int flags,
+				 unsigned long start_pfn, unsigned long end_pfn)
 {
-	unsigned long iter = 0;
-	unsigned long pfn = page_to_pfn(page);
-	unsigned long offset = pfn % pageblock_nr_pages;
+	unsigned long first_pfn = max(page_to_pfn(page), start_pfn);
+	unsigned long pfn = first_pfn;
+	unsigned long last_pfn = min(ALIGN(pfn + 1, pageblock_nr_pages), end_pfn);
+
+	page = pfn_to_page(pfn);

 	if (is_migrate_cma_page(page)) {
 		/*
@@ -47,8 +51,8 @@ static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
 		return page;
 	}

-	for (; iter < pageblock_nr_pages - offset; iter++) {
-		page = pfn_to_page(pfn + iter);
+	for (pfn = first_pfn; pfn < last_pfn; pfn++) {
+		page = pfn_to_page(pfn);

 		/*
 		 * Both, bootmem allocations and memory holes are marked
@@ -85,7 +89,7 @@ static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
 			}

 			skip_pages = compound_nr(head) - (page - head);
-			iter += skip_pages - 1;
+			pfn += skip_pages - 1;
 			continue;
 		}

@@ -97,7 +101,7 @@ static struct page *has_unmovable_pages(struct zone *zone, struct page *page,
 		 */
 		if (!page_ref_count(page)) {
 			if (PageBuddy(pag
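The window that the reworked has_unmovable_pages() scans can be modelled in user space as follows. This sketch only reproduces the first_pfn/last_pfn arithmetic visible in the hunk above; the page checks themselves are left out. The pageblock size of 512 pages is an assumed example value, and scan_window() and struct pfn_range are hypothetical names.

```c
/* Assumed example value: pageblock_order = 9. */
#define PAGEBLOCK_NR_PAGES 512UL

/* Half-open PFN range [start, end). */
struct pfn_range {
	unsigned long start;
	unsigned long end;
};

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Round x up to a multiple of a (a must be a power of two). */
static unsigned long align_up(unsigned long x, unsigned long a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Model of the clamp in the patch: start at the later of the pageblock's
 * first page and start_pfn, stop at the earlier of the pageblock end and
 * end_pfn.
 */
static struct pfn_range scan_window(unsigned long page_pfn,
				    unsigned long start_pfn,
				    unsigned long end_pfn)
{
	struct pfn_range w;

	w.start = max_ul(page_pfn, start_pfn);
	w.end = min_ul(align_up(w.start + 1, PAGEBLOCK_NR_PAGES), end_pfn);
	return w;
}
```

For a requested range of [1500, 2100), the pageblock starting at PFN 1024 is scanned over [1500, 1536), the one at 1536 over [1536, 2048), and the one at 2048 over [2048, 2100), so pages outside the request are never inspected.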
[PATCH v5 3/6] mm: make alloc_contig_range work at pageblock granularity
From: Zi Yan

alloc_contig_range() worked at MAX_ORDER-1 granularity to avoid merging
pageblocks with different migratetypes. It might unnecessarily convert
extra pageblocks at the beginning and at the end of the range. Change
alloc_contig_range() to work at pageblock granularity.

Special handling is needed for free pages and in-use pages across the
boundaries of the range specified by alloc_contig_range(), because these
partially isolated pages cause free page accounting issues. The free
pages will be split and freed into separate migratetype lists; the
in-use pages will be migrated, then the freed pages will be handled.

Signed-off-by: Zi Yan
---
 include/linux/page-isolation.h |   2 +-
 mm/internal.h                  |   3 +
 mm/memory_hotplug.c            |   3 +-
 mm/page_alloc.c                | 235 +
 mm/page_isolation.c            |  33 -
 5 files changed, 211 insertions(+), 65 deletions(-)

diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 4ef7be6def83..78ff940cc169 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -54,7 +54,7 @@ int move_freepages_block(struct zone *zone, struct page *page,
  */
 int
 start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
-			 unsigned migratetype, int flags);
+			 unsigned migratetype, int flags, gfp_t gfp_flags);

 /*
  * Changes MIGRATE_ISOLATE to MIGRATE_MOVABLE.
diff --git a/mm/internal.h b/mm/internal.h
index 0d240e876831..509cbdc25992 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -319,6 +319,9 @@ isolate_freepages_range(struct compact_control *cc,
 int
 isolate_migratepages_range(struct compact_control *cc,
 			   unsigned long low_pfn, unsigned long end_pfn);
+
+int
+isolate_single_pageblock(unsigned long boundary_pfn, gfp_t gfp_flags, int isolate_before_boundary);
 #endif
 int find_suitable_fallback(struct free_area *area, unsigned int order,
 			int migratetype, bool only_stealable, bool *can_steal);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index ce68098832aa..82406d2f3e46 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1863,7 +1863,8 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 	/* set above range as isolated */
 	ret = start_isolate_page_range(start_pfn, end_pfn,
 				       MIGRATE_MOVABLE,
-				       MEMORY_OFFLINE | REPORT_FAILURE);
+				       MEMORY_OFFLINE | REPORT_FAILURE,
+				       GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL);
 	if (ret) {
 		reason = "failure to isolate range";
 		goto failed_removal_pcplists_disabled;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 62ef78f3d771..7a4fa21aea5c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8985,7 +8985,7 @@ static inline void alloc_contig_dump_pages(struct list_head *page_list)
 #endif

 /* [start, end) must belong to a single zone. */
-static int __alloc_contig_migrate_range(struct compact_control *cc,
+int __alloc_contig_migrate_range(struct compact_control *cc,
 					unsigned long start, unsigned long end)
 {
 	/* This function is based on compact_zone() from compaction.c. */
@@ -9043,6 +9043,167 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
 	return 0;
 }

+/**
+ * split_free_page() -- split a free page at split_pfn_offset
+ * @free_page:		the original free page
+ * @order:		the order of the page
+ * @split_pfn_offset:	split offset within the page
+ *
+ * It is used when the free page crosses two pageblocks with different migratetypes
+ * at split_pfn_offset within the page. The split free page will be put into
+ * separate migratetype lists afterwards. Otherwise, the function achieves
+ * nothing.
+ */
+static inline void split_free_page(struct page *free_page,
+				int order, unsigned long split_pfn_offset)
+{
+	struct zone *zone = page_zone(free_page);
+	unsigned long free_page_pfn = page_to_pfn(free_page);
+	unsigned long pfn;
+	unsigned long flags;
+	int free_page_order;
+
+	spin_lock_irqsave(&zone->lock, flags);
+	del_page_from_free_list(free_page, zone, order);
+	for (pfn = free_page_pfn;
+	     pfn < free_page_pfn + (1UL << order);) {
+		int mt = get_pfnblock_migratetype(pfn_to_page(pfn), pfn);
+
+		free_page_order = order_base_2(split_pfn_offset);
+		__free_one_page(pfn_to_page(pfn), pfn, zone, free_page_order,
+				mt, FPI_NONE);
+		pfn += 1UL << free_page_order;
+		split_pfn_offset -= (1UL << free_page_order);
+		/* we have done the first part, now switch to second part
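The idea described in the split_free_page() kernel-doc comment, cutting one free buddy page at an offset so that each piece can go to its own migratetype list, can be illustrated with a user-space sketch. This is not the patch's loop, which is cut off above; it is a separate, simplified decomposition written for illustration. It breaks a block of 2^order pages into naturally aligned power-of-two chunks, none of which straddles the split offset. struct chunk and split_at() are hypothetical names.

```c
#include <stddef.h>

/* One naturally aligned piece of the original block. */
struct chunk {
	unsigned long offset;  /* page offset within the original block */
	unsigned int order;    /* piece covers 2^order pages */
};

/*
 * Decompose a block of 2^order pages into buddy-style chunks so that no
 * chunk crosses the page offset "split". Writes at most max_out chunks to
 * out and returns how many were written.
 */
static size_t split_at(unsigned int order, unsigned long split,
		       struct chunk *out, size_t max_out)
{
	unsigned long total = 1UL << order;
	unsigned long pos = 0;
	size_t n = 0;

	while (pos < total && n < max_out) {
		/* Before the split point, chunks must end at or before it. */
		unsigned long limit = (pos < split) ? split : total;
		unsigned int k = 0;

		/* Grow the chunk while it stays aligned and within limit. */
		while (k < order &&
		       (pos & ((1UL << (k + 1)) - 1)) == 0 &&
		       pos + (1UL << (k + 1)) <= limit)
			k++;

		out[n].offset = pos;
		out[n].order = k;
		n++;
		pos += 1UL << k;
	}

	return n;
}
```

For an order-3 block (8 pages) split at offset 3, this produces four chunks at offsets 0, 2, 3 and 4 with orders 1, 0, 0 and 2. The first three pages and the last five could then be freed under different migratetypes.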
[PATCH v5 4/6] mm: cma: use pageblock_order as the single alignment
From: Zi Yan

Now alloc_contig_range() works at pageblock granularity. Change CMA
allocation, which uses alloc_contig_range(), to use pageblock_order
alignment.

Signed-off-by: Zi Yan
---
 include/linux/mmzone.h  | 5 +
 kernel/dma/contiguous.c | 2 +-
 mm/cma.c                | 6 ++
 mm/page_alloc.c         | 4 ++--
 4 files changed, 6 insertions(+), 11 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 3fff6deca2c0..da38c8436493 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -54,10 +54,7 @@ enum migratetype {
 	 *
 	 * The way to use it is to change migratetype of a range of
 	 * pageblocks to MIGRATE_CMA which can be done by
-	 * __free_pageblock_cma() function. What is important though
-	 * is that a range of pageblocks must be aligned to
-	 * MAX_ORDER_NR_PAGES should biggest page be bigger than
-	 * a single pageblock.
+	 * __free_pageblock_cma() function.
 	 */
 	MIGRATE_CMA,
 #endif
diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index 3d63d91cba5c..ac35b14b0786 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -399,7 +399,7 @@ static const struct reserved_mem_ops rmem_cma_ops = {

 static int __init rmem_cma_setup(struct reserved_mem *rmem)
 {
-	phys_addr_t align = PAGE_SIZE << max(MAX_ORDER - 1, pageblock_order);
+	phys_addr_t align = PAGE_SIZE << pageblock_order;
 	phys_addr_t mask = align - 1;
 	unsigned long node = rmem->fdt_node;
 	bool default_cma = of_get_flat_dt_prop(node, "linux,cma-default", NULL);
diff --git a/mm/cma.c b/mm/cma.c
index 766f1b82b532..b2e927fab7b5 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -187,8 +187,7 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 		return -EINVAL;

 	/* ensure minimal alignment required by mm core */
-	alignment = PAGE_SIZE <<
-			max_t(unsigned long, MAX_ORDER - 1, pageblock_order);
+	alignment = PAGE_SIZE << pageblock_order;

 	/* alignment should be aligned with order_per_bit */
 	if (!IS_ALIGNED(alignment >> PAGE_SHIFT, 1 << order_per_bit))
@@ -275,8 +274,7 @@ int __init cma_declare_contiguous_nid(phys_addr_t base,
 		 * migratetype page by page allocator's buddy algorithm. In the case,
 		 * you couldn't get a contiguous memory, which is not what we want.
 		 */
-		alignment = max(alignment, (phys_addr_t)PAGE_SIZE <<
-			  max_t(unsigned long, MAX_ORDER - 1, pageblock_order));
+		alignment = max(alignment, (phys_addr_t)PAGE_SIZE << pageblock_order);
 		if (fixed && base & (alignment - 1)) {
 			ret = -EINVAL;
 			pr_err("Region at %pa must be aligned to %pa bytes\n",
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7a4fa21aea5c..ac9432e63ce1 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -9214,8 +9214,8 @@ int isolate_single_pageblock(unsigned long boundary_pfn, gfp_t gfp_flags,
  *			be either of the two.
  * @gfp_mask:	GFP mask to use during compaction
  *
- * The PFN range does not have to be pageblock or MAX_ORDER_NR_PAGES
- * aligned. The PFN range must belong to a single zone.
+ * The PFN range does not have to be pageblock aligned. The PFN range must
+ * belong to a single zone.
  *
  * The first thing this routine does is attempt to MIGRATE_ISOLATE all
  * pageblocks in the range. Once isolated, the pageblocks should not
-- 
2.34.1
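As a rough illustration of what the relaxed alignment means for a fixed CMA region, the sketch below reproduces the `base & (alignment - 1)` test from the cma.c hunk in user space, with the alignment computed the old and the new way. The page size and orders are assumed example values (4 KiB pages, pageblock_order 9, MAX_ORDER 11) and are not part of the patch; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed example values, not taken from the patch. */
#define PAGE_SIZE       4096ULL
#define PAGEBLOCK_ORDER 9
#define MAX_ORDER       11

/* Minimum CMA alignment before the patch. */
static uint64_t cma_alignment_old(void)
{
	int order = (MAX_ORDER - 1 > PAGEBLOCK_ORDER) ? MAX_ORDER - 1
						       : PAGEBLOCK_ORDER;

	return PAGE_SIZE << order;
}

/* Minimum CMA alignment after the patch. */
static uint64_t cma_alignment_new(void)
{
	return PAGE_SIZE << PAGEBLOCK_ORDER;
}

/* Same test as "base & (alignment - 1)" in the hunk, inverted. */
static bool base_is_aligned(uint64_t base, uint64_t alignment)
{
	return (base & (alignment - 1)) == 0;
}
```

Under these assumptions the minimum alignment drops from 4 MiB to 2 MiB, so a fixed base such as 0x10200000 would be rejected by the old check but accepted by the new one.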
[PATCH v5 1/6] mm: page_isolation: move has_unmovable_pages() to mm/page_isolation.c
From: Zi Yan

has_unmovable_pages() is only used in mm/page_isolation.c. Move it from
mm/page_alloc.c and make it static.

Signed-off-by: Zi Yan
Reviewed-by: Oscar Salvador
---
 include/linux/page-isolation.h |   2 -
 mm/page_alloc.c                | 119 -
 mm/page_isolation.c            | 119 +
 3 files changed, 119 insertions(+), 121 deletions(-)

diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 572458016331..e14eddf6741a 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -33,8 +33,6 @@ static inline bool is_migrate_isolate(int migratetype)
 #define MEMORY_OFFLINE	0x1
 #define REPORT_FAILURE	0x2

-struct page *has_unmovable_pages(struct zone *zone, struct page *page,
-				 int migratetype, int flags);
 void set_pageblock_migratetype(struct page *page, int migratetype);
 int move_freepages_block(struct zone *zone, struct page *page,
 				int migratetype, int *num_movable);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index cface1d38093..e2c6a67fc386 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8962,125 +8962,6 @@ void *__init alloc_large_system_hash(const char *tablename,
 	return table;
 }

-/*
- * This function checks whether pageblock includes unmovable pages or not.
- *
- * PageLRU check without isolation or lru_lock could race so that
- * MIGRATE_MOVABLE block might include unmovable pages. And __PageMovable
- * check without lock_page also may miss some movable non-lru pages at
- * race condition. So you can't expect this function should be exact.
- *
- * Returns a page without holding a reference. If the caller wants to
- * dereference that page (e.g., dumping), it has to make sure that it
- * cannot get removed (e.g., via memory unplug) concurrently.
- *
- */
-struct page *has_unmovable_pages(struct zone *zone, struct page *page,
-				 int migratetype, int flags)
-{
-	unsigned long iter = 0;
-	unsigned long pfn = page_to_pfn(page);
-	unsigned long offset = pfn % pageblock_nr_pages;
-
-	if (is_migrate_cma_page(page)) {
-		/*
-		 * CMA allocations (alloc_contig_range) really need to mark
-		 * isolate CMA pageblocks even when they are not movable in fact
-		 * so consider them movable here.
-		 */
-		if (is_migrate_cma(migratetype))
-			return NULL;
-
-		return page;
-	}
-
-	for (; iter < pageblock_nr_pages - offset; iter++) {
-		page = pfn_to_page(pfn + iter);
-
-		/*
-		 * Both, bootmem allocations and memory holes are marked
-		 * PG_reserved and are unmovable. We can even have unmovable
-		 * allocations inside ZONE_MOVABLE, for example when
-		 * specifying "movablecore".
-		 */
-		if (PageReserved(page))
-			return page;
-
-		/*
-		 * If the zone is movable and we have ruled out all reserved
-		 * pages then it should be reasonably safe to assume the rest
-		 * is movable.
-		 */
-		if (zone_idx(zone) == ZONE_MOVABLE)
-			continue;
-
-		/*
-		 * Hugepages are not in LRU lists, but they're movable.
-		 * THPs are on the LRU, but need to be counted as #small pages.
-		 * We need not scan over tail pages because we don't
-		 * handle each tail page individually in migration.
-		 */
-		if (PageHuge(page) || PageTransCompound(page)) {
-			struct page *head = compound_head(page);
-			unsigned int skip_pages;
-
-			if (PageHuge(page)) {
-				if (!hugepage_migration_supported(page_hstate(head)))
-					return page;
-			} else if (!PageLRU(head) && !__PageMovable(head)) {
-				return page;
-			}
-
-			skip_pages = compound_nr(head) - (page - head);
-			iter += skip_pages - 1;
-			continue;
-		}
-
-		/*
-		 * We can't use page_count without pin a page
-		 * because another CPU can free compound page.
-		 * This check already skips compound tails of THP
-		 * because their page->_refcount is zero at all time.
-		 */
-		if (!page_ref_count(page)) {
-			if (PageBuddy(page))
-				iter += (1 << buddy_order(page)) - 1;
-			continue;
-		}
-
-		/*
-		 * The HWPoisoned page may be
Re: [PATCH v1 00/10] iommu/vt-d: Some Intel IOMMU cleanups
On Mon, Feb 07, 2022 at 02:41:32PM +0800, Lu Baolu wrote:
> Hi folks,
>
> After a long time of evolution, the drivers/iommu/intel/iommu.c becomes
> fat and a bit messy. This series tries to cleanup and refactor the
> driver to make it more concise. Your comments are very appreciated.

I wanted to take a closer look at what you are trying to do with rcu,
but these patches don't apply. Please always send patches against a
well known tree like v5.17-rc or the iommu tree, or something.

Anyhow, I think you should split the last 4 patches out of this series
and send them separately.

Jason
Re: Error when running fio against nvme-of rdma target (mlx5 driver)
On 2022-02-10 23:58, Martin Oliveira wrote:
> On 2/9/22 1:41 AM, Chaitanya Kulkarni wrote:
>> On 2/8/22 6:50 PM, Martin Oliveira wrote:
>>> Hello,
>>>
>>> We have been hitting an error when running IO over our nvme-of setup,
>>> using the mlx5 driver and we are wondering if anyone has seen anything
>>> similar/has any suggestions.
>>>
>>> Both initiator and target are AMD EPYC 7502 machines connected over RDMA
>>> using a Mellanox MT28908. Target has 12 NVMe SSDs which are exposed as a
>>> single NVMe fabrics device, one physical SSD per namespace.
>>
>> Thanks for reporting this, if you can bisect the problem on your setup
>> it will help others to help you better.
>>
>> -ck
>
> Hi Chaitanya,
>
> I went back to a kernel as old as 4.15 and the problem was still there,
> so I don't know of a good commit to start from.
>
> I also learned that I can reproduce this with as little as 3 cards and I
> updated the firmware on the Mellanox cards to the latest version.
>
> I'd be happy to try any tests if someone has any suggestions.

The IOMMU is probably your friend here - one thing that might be worth
trying is capturing the iommu:map and iommu:unmap tracepoints to see if
the address reported in subsequent IOMMU faults was previously mapped as
a valid DMA address (be warned that there will likely be a *lot* of
trace generated). With 5.13 or newer, booting with "iommu.forcedac=1"
should also make it easier to tell real DMA IOVAs from rogue physical
addresses or other nonsense, as real DMA addresses should then look more
like 0x24d08000.

That could at least help narrow down whether it's some kind of
use-after-free race or a completely bogus address creeping in somehow.

Robin.
Re: [PATCH] iommu: explicitly check for NULL in iommu_dma_get_resv_regions()
> On 2022-02-09 14:09, Aleksandr Fedorov wrote:
>> iommu_dma_get_resv_regions() assumes that iommu_fwspec field for
>> corresponding device is set which is not always true. Since
>> iommu_dma_get_resv_regions() seems to be a future-proof generic API
>> that can be used by any iommu driver, add an explicit check for NULL.
>
> Except it's not a "generic" interface for drivers to call at random,
> it's a helper for retrieving common firmware-based information
> specifically for drivers already using the fwspec mechanism for common
> firmware bindings. If any driver calls this with a device *without* a
> valid fwnode, it deserves to crash because it's done something
> fundamentally wrong.
>
> I concur that it's not exactly obvious that "non-IOMMU-specific" means
> "based on common firmware bindings, thus implying fwspec".

Thanks for the explanations, yes, this was the misunderstanding on my
part. Maybe add a comment?

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index d85d54f2b549..ce5e7d4d054a 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -379,6 +379,9 @@ void iommu_put_dma_cookie(struct iommu_domain *domain)
  * for general non-IOMMU-specific reservations. Currently, this covers GICv3
  * ITS region reservation on ACPI based ARM platforms that may require HW MSI
  * reservation.
+ *
+ * Note that this helper is meant to be used only by drivers that are already
+ * using the fwspec mechanism for common firmware bindings.
  */
 void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list)
 {