On Tue, Jun 10, 2025 at 06:18:09PM +0200, Marek Szyprowski wrote:
> Dear All,
> 
> On 04.06.2025 05:21, Alistair Popple wrote:
> > The PFN_MAP flag is no longer used for anything, so remove it.
> > The PFN_SG_CHAIN and PFN_SG_LAST flags never appear to have been
> > used so also remove them. The last user of PFN_SPECIAL was removed
> > by 653d7825c149 ("dcssblk: mark DAX broken, remove FS_DAX_LIMITED
> > support").
> >
> > Signed-off-by: Alistair Popple <apop...@nvidia.com>
> > Acked-by: David Hildenbrand <da...@redhat.com>
> > Reviewed-by: Christoph Hellwig <h...@lst.de>
> > Reviewed-by: Jason Gunthorpe <j...@nvidia.com>
> > Cc: gerald.schae...@linux.ibm.com
> > Cc: dan.j.willi...@intel.com
> > Cc: j...@ziepe.ca
> > Cc: wi...@infradead.org
> > Cc: da...@redhat.com
> > Cc: linux-kernel@vger.kernel.org
> > Cc: nvd...@lists.linux.dev
> > Cc: jhubb...@nvidia.com
> > Cc: h...@lst.de
> > Cc: zhang.l...@gmail.com
> > Cc: de...@rivosinc.com
> > Cc: bj...@kernel.org
> > Cc: balb...@nvidia.com
> > Cc: lorenzo.stoa...@oracle.com
> > Cc: j...@groves.net
> >
> > ---
> >
> > Splitting this off from the rest of my series[1] as a separate clean-up
> > for consideration for the v6.16 merge window as suggested by Christoph.
> >
> > [1] - 
> > https://lore.kernel.org/linux-mm/cover.541c2702181b7461b84f1a6967a3f0e823023fcc.1748500293.git-series.apop...@nvidia.com/
> > ---
> >   include/linux/pfn_t.h             | 31 +++----------------------------
> >   mm/memory.c                       |  2 --
> >   tools/testing/nvdimm/test/iomap.c |  4 ----
> >   3 files changed, 3 insertions(+), 34 deletions(-)
> 
> This patch landed in today's linux-next as commit 28be5676b4a3 ("mm: 
> remove PFN_MAP, PFN_SPECIAL, PFN_SG_CHAIN and PFN_SG_LAST"). In my tests 
> I've noticed that it breaks operation of all RISC-V 64bit boards on my 
> test farm (VisionFive2, BananaPiF3 as well as QEMU's Virt machine). I've 
> isolated the changes responsible for this issue, see the inline comments 
> in the patch below. Here is an example of the issues observed in the 
> logs from those machines:

Thanks for the report. I'm really confused by this because this change should
just be removal of dead code - nothing sets any of the removed PFN_* flags
AFAICT.

I don't have access to any RISC-V hardwdare but you say this reproduces under
qemu - what do you run on the system to cause the error? Is it just a simple
boot and load a module or are you running selftests or something else?

Also if possible it would be good to know if you still see the issue
after applying the full series (https://lore.kernel.org/linux-mm/
cover.541c2702181b7461b84f1a6967a3f0e823023fcc.1748500293.git-series.apopple@nvi
dia.com/).

> BUG: Bad page map in process modprobe  pte:20682653 pmd:20f23c01
> page: refcount:1 mapcount:-1 mapping:0000000000000000 index:0x0 pfn:0x81a09
> flags: 0x2004(referenced|reserved|zone=0)
> raw: 0000000000002004 ff1c000000068248 ff1c000000068248 0000000000000000
> raw: 0000000000000000 0000000000000000 00000001fffffffe 0000000000000000
> page dumped because: bad pte
> addr:00007fff84619000 vm_flags:04044411 anon_vma:0000000000000000 
> mapping:0000000000000000 index:0
> file:(null) fault:special_mapping_fault mmap:0x0 mmap_prepare: 0x0 
> read_folio:0x0
> CPU: 1 UID: 0 PID: 58 Comm: modprobe Not tainted 
> 6.16.0-rc1-next-20250610+ #15719 NONE
> Hardware name: riscv-virtio,qemu (DT)
> Call Trace:
> [<ffffffff80016152>] dump_backtrace+0x1c/0x24
> [<ffffffff8000147a>] show_stack+0x28/0x34
> [<ffffffff8000f61e>] dump_stack_lvl+0x5e/0x86
> [<ffffffff8000f65a>] dump_stack+0x14/0x1c
> [<ffffffff80234b7e>] print_bad_pte+0x1b4/0x1ee
> [<ffffffff8023854a>] unmap_page_range+0x4da/0xf74
> [<ffffffff80239042>] unmap_single_vma.constprop.0+0x5e/0x90
> [<ffffffff8023913a>] unmap_vmas+0xc6/0x1c4
> [<ffffffff80244a70>] exit_mmap+0xb6/0x418
> [<ffffffff80021dc4>] mmput+0x56/0xf2
> [<ffffffff8002b84e>] do_exit+0x182/0x80e
> [<ffffffff8002c02a>] do_group_exit+0x24/0x70
> [<ffffffff8002c092>] pid_child_should_wake+0x0/0x54
> [<ffffffff80b66112>] do_trap_ecall_u+0x29c/0x4cc
> [<ffffffff80b747e6>] handle_exception+0x146/0x152
> Disabling lock debugging due to kernel taint
> 
> 
> > diff --git a/include/linux/pfn_t.h b/include/linux/pfn_t.h
> > index 2d9148221e9a..46afa12eb33b 100644
> > --- a/include/linux/pfn_t.h
> > +++ b/include/linux/pfn_t.h
> > @@ -5,26 +5,13 @@
> >   
> >   /*
> >    * PFN_FLAGS_MASK - mask of all the possible valid pfn_t flags
> > - * PFN_SG_CHAIN - pfn is a pointer to the next scatterlist entry
> > - * PFN_SG_LAST - pfn references a page and is the last scatterlist entry
> >    * PFN_DEV - pfn is not covered by system memmap by default
> > - * PFN_MAP - pfn has a dynamic page mapping established by a device driver
> > - * PFN_SPECIAL - for CONFIG_FS_DAX_LIMITED builds to allow XIP, but not
> > - *          get_user_pages
> >    */
> >   #define PFN_FLAGS_MASK (((u64) (~PAGE_MASK)) << (BITS_PER_LONG_LONG - 
> > PAGE_SHIFT))
> > -#define PFN_SG_CHAIN (1ULL << (BITS_PER_LONG_LONG - 1))
> > -#define PFN_SG_LAST (1ULL << (BITS_PER_LONG_LONG - 2))
> >   #define PFN_DEV (1ULL << (BITS_PER_LONG_LONG - 3))
> > -#define PFN_MAP (1ULL << (BITS_PER_LONG_LONG - 4))
> > -#define PFN_SPECIAL (1ULL << (BITS_PER_LONG_LONG - 5))
> >   
> >   #define PFN_FLAGS_TRACE \
> > -   { PFN_SPECIAL,  "SPECIAL" }, \
> > -   { PFN_SG_CHAIN, "SG_CHAIN" }, \
> > -   { PFN_SG_LAST,  "SG_LAST" }, \
> > -   { PFN_DEV,      "DEV" }, \
> > -   { PFN_MAP,      "MAP" }
> > +   { PFN_DEV,      "DEV" }
> >   
> >   static inline pfn_t __pfn_to_pfn_t(unsigned long pfn, u64 flags)
> >   {
> > @@ -46,7 +33,7 @@ static inline pfn_t phys_to_pfn_t(phys_addr_t addr, u64 
> > flags)
> >   
> >   static inline bool pfn_t_has_page(pfn_t pfn)
> >   {
> > -   return (pfn.val & PFN_MAP) == PFN_MAP || (pfn.val & PFN_DEV) == 0;
> > +   return (pfn.val & PFN_DEV) == 0;
> >   }
> >   
> >   static inline unsigned long pfn_t_to_pfn(pfn_t pfn)
> > @@ -100,7 +87,7 @@ static inline pud_t pfn_t_pud(pfn_t pfn, pgprot_t pgprot)
> >   #ifdef CONFIG_ARCH_HAS_PTE_DEVMAP
> >   static inline bool pfn_t_devmap(pfn_t pfn)
> >   {
> > -   const u64 flags = PFN_DEV|PFN_MAP;
> > +   const u64 flags = PFN_DEV;
> >   
> >     return (pfn.val & flags) == flags;
> >   }
> 
> The above change causes the stability issues on RISC-V based boards. To 
> get them working again with today's linux-next I had to apply the 
> following change:
> 
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 6ff7dd305639..f502860f2a76 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -46,7 +46,6 @@ config RISCV
>          select ARCH_HAS_PREEMPT_LAZY
>          select ARCH_HAS_PREPARE_SYNC_CORE_CMD
>          select ARCH_HAS_PTDUMP if MMU
> -       select ARCH_HAS_PTE_DEVMAP if 64BIT && MMU
>          select ARCH_HAS_PTE_SPECIAL
>          select ARCH_HAS_SET_DIRECT_MAP if MMU
>          select ARCH_HAS_SET_MEMORY if MMU
> 
> I'm not sure if this is really the desired solution and frankly speaking 
> I don't understand the code behind the 'devmap' related bits. I can help 
> testing other patches that will fix this issue properly.
> 
> 
> > @@ -116,16 +103,4 @@ pmd_t pmd_mkdevmap(pmd_t pmd);
> >   pud_t pud_mkdevmap(pud_t pud);
> >   #endif
> >   #endif /* CONFIG_ARCH_HAS_PTE_DEVMAP */
> > -
> > -#ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
> > -static inline bool pfn_t_special(pfn_t pfn)
> > -{
> > -   return (pfn.val & PFN_SPECIAL) == PFN_SPECIAL;
> > -}
> > -#else
> > -static inline bool pfn_t_special(pfn_t pfn)
> > -{
> > -   return false;
> > -}
> > -#endif /* CONFIG_ARCH_HAS_PTE_SPECIAL */
> >   #endif /* _LINUX_PFN_T_H_ */
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 49199410805c..cc85f814bc1c 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -2569,8 +2569,6 @@ static bool vm_mixed_ok(struct vm_area_struct *vma, 
> > pfn_t pfn, bool mkwrite)
> >             return true;
> >     if (pfn_t_devmap(pfn))
> >             return true;
> > -   if (pfn_t_special(pfn))
> > -           return true;
> >     if (is_zero_pfn(pfn_t_to_pfn(pfn)))
> >             return true;
> >     return false;
> > diff --git a/tools/testing/nvdimm/test/iomap.c 
> > b/tools/testing/nvdimm/test/iomap.c
> > index e4313726fae3..ddceb04b4a9a 100644
> > --- a/tools/testing/nvdimm/test/iomap.c
> > +++ b/tools/testing/nvdimm/test/iomap.c
> > @@ -137,10 +137,6 @@ EXPORT_SYMBOL_GPL(__wrap_devm_memremap_pages);
> >   
> >   pfn_t __wrap_phys_to_pfn_t(phys_addr_t addr, unsigned long flags)
> >   {
> > -   struct nfit_test_resource *nfit_res = get_nfit_res(addr);
> > -
> > -   if (nfit_res)
> > -           flags &= ~PFN_MAP;
> >           return phys_to_pfn_t(addr, flags);
> >   }
> >   EXPORT_SYMBOL(__wrap_phys_to_pfn_t);
> 
> Best regards
> -- 
> Marek Szyprowski, PhD
> Samsung R&D Institute Poland
> 

Reply via email to