Re: arm32: panic in move_freepages (Was [PATCH v2 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid())
On Sun, Apr 25, 2021 at 03:51:56PM +0800, Kefeng Wang wrote:
> On 2021/4/25 15:19, Mike Rapoport wrote:
> > On Fri, Apr 23, 2021 at 04:11:16PM +0800, Kefeng Wang wrote:
> > > I tested this patchset (plus arm32 change, like arm64 does) based on
> > > lts 5.10, add some debug log, the useful info shows below, if we
> > > enable HOLES_IN_ZONE, no panic, any idea, thanks.
> >
> > Are there any changes on top of 5.10 except for pfn_valid() patch?
> > Do you see this panic on 5.10 without the changes?
>
> Yes, there are some BSP support for arm board based on 5.10, with or
> without your patch will get same panic, the panic pfn=de600 in the range
> of [dcc00,de00] which is freed by free_memmap, start_pfn = dcc00, dcc0
> end_pfn = de700, de70
>
> we see the PC is at PageLRU, same reason like arm64 panic log,
>
> "PageBuddy in move_freepages returns false
> Then we call PageLRU, the macro calls PF_HEAD which is compound_page()
> compound_page reads page->compound_head, it is 0x, so it
> returns 0xfffe - and accessing this address causes crash"
>
> > Can you see stack backtrace beyond move_freepages_block?
>
> I do some oom test, so the log is about memory allocate,
>
> [] (move_freepages_block) from []
> (steal_suitable_fallback+0x174/0x1f4)
> [] (steal_suitable_fallback) from []
> (get_page_from_freelist+0x490/0x9a4)

Hmm, this is called with a page from the free list; having a page from a
freed part of the memory map passed to steal_suitable_fallback() means
that there is an issue with the creation of the free list.

Can you please add "memblock=debug" to the kernel command line and post
the log?
> [] (get_page_from_freelist) from []
> (__alloc_pages_nodemask+0x188/0xc08)
> [] (__alloc_pages_nodemask) from []
> (alloc_zeroed_user_highpage_movable+0x14/0x3c)
> [] (alloc_zeroed_user_highpage_movable) from []
> (handle_mm_fault+0x254/0xac8)
> [] (handle_mm_fault) from [] (do_page_fault+0x228/0x2f4)
> [] (do_page_fault) from [] (do_DataAbort+0x48/0xd0)
> [] (do_DataAbort) from [] (__dabt_usr+0x40/0x60)
>
> Zone ranges:
>   Normal   [mem 0x80a0-0xb01f]
>   HighMem  [mem 0xb020-0xefff]
> Movable zone start for each node
> Early memory node ranges
>   node 0: [mem 0x80a0-0x855f]
>   node 0: [mem 0x86a0-0x87df]
>   node 0: [mem 0x8bd0-0x8c4f]
>   node 0: [mem 0x8e30-0x8ecf]
>   node 0: [mem 0x90d0-0xbfff]
>   node 0: [mem 0xcc00-0xdc9f]
>   node 0: [mem 0xde70-0xde9f]
>   node 0: [mem 0xe080-0xe0bf]
>   node 0: [mem 0xf4b0-0xf6ff]
>   node 0: [mem 0xfda0-0xefff]
>
> free_memmap, start_pfn = 85800, 8580 end_pfn = 86a00, 86a0
> free_memmap, start_pfn = 8c800, 8c80 end_pfn = 8e300, 8e30
> free_memmap, start_pfn = 8f000, 8f00 end_pfn = 9, 9000
> free_memmap, start_pfn = dcc00, dcc0 end_pfn = de700, de70
> free_memmap, start_pfn = dec00, dec0 end_pfn = e, e000
> free_memmap, start_pfn = e0c00, e0c0 end_pfn = e4000, e400
> free_memmap, start_pfn = f7000, f700 end_pfn = f8000, f800
>
> move_freepages: start_pfn/end_pfn [de601, de7ff], [de60, de7ff000]
> pfn = de600 pfn2phy = de60, page = ef3cc000, page-flags =
>
> 8<--- cut here ---
> Unable to handle kernel paging request at virtual address fffe
> pgd = 5dd50df5
> [fffe] *pgd=a861, *pte=, *ppte=
> Internal error: Oops: 37 [#1] SMP ARM
> Modules linked in: gmac(O)
> CPU: 2 PID: 635 Comm: test-oom Tainted: G O 5.10.0+ #31
> Hardware name: Hisilicon A9
> PC is at move_freepages_block+0x150/0x278
> LR is at move_freepages_block+0x150/0x278
> pc : [] lr : [] psr: 200e0393
> sp : c4179cf8 ip : fp : 0001
> r10: c4179d58 r9 : 000de7ff r8 :
> r7 : c0863280 r6 : 000de600 r5 : 000de600 r4 : ef3cc000
> r3 : r2 : r1 : ef5d069c r0 : fffe
> Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user
> Control: 1ac5387d Table: 83b0c04a DAC:
> Process test-oom (pid: 635, stack limit = 0x25d667df)

--
Sincerely yours,
Mike.

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
Re: arm32: panic in move_freepages (Was [PATCH v2 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid())
On 2021/4/25 15:19, Mike Rapoport wrote:
> On Fri, Apr 23, 2021 at 04:11:16PM +0800, Kefeng Wang wrote:
> > I tested this patchset (plus arm32 change, like arm64 does) based on
> > lts 5.10, add some debug log, the useful info shows below, if we
> > enable HOLES_IN_ZONE, no panic, any idea, thanks.
>
> Are there any changes on top of 5.10 except for pfn_valid() patch?
> Do you see this panic on 5.10 without the changes?

Yes, there are some BSP support for arm board based on 5.10, with or
without your patch will get same panic, the panic pfn=de600 in the range
of [dcc00,de00] which is freed by free_memmap, start_pfn = dcc00, dcc0
end_pfn = de700, de70

we see the PC is at PageLRU, same reason like arm64 panic log,

"PageBuddy in move_freepages returns false
Then we call PageLRU, the macro calls PF_HEAD which is compound_page()
compound_page reads page->compound_head, it is 0x, so it
returns 0xfffe - and accessing this address causes crash"

> Can you see stack backtrace beyond move_freepages_block?

I do some oom test, so the log is about memory allocate,

[] (move_freepages_block) from [] (steal_suitable_fallback+0x174/0x1f4)
[] (steal_suitable_fallback) from [] (get_page_from_freelist+0x490/0x9a4)
[] (get_page_from_freelist) from [] (__alloc_pages_nodemask+0x188/0xc08)
[] (__alloc_pages_nodemask) from [] (alloc_zeroed_user_highpage_movable+0x14/0x3c)
[] (alloc_zeroed_user_highpage_movable) from [] (handle_mm_fault+0x254/0xac8)
[] (handle_mm_fault) from [] (do_page_fault+0x228/0x2f4)
[] (do_page_fault) from [] (do_DataAbort+0x48/0xd0)
[] (do_DataAbort) from [] (__dabt_usr+0x40/0x60)

Zone ranges:
  Normal   [mem 0x80a0-0xb01f]
  HighMem  [mem 0xb020-0xefff]
Movable zone start for each node
Early memory node ranges
  node 0: [mem 0x80a0-0x855f]
  node 0: [mem 0x86a0-0x87df]
  node 0: [mem 0x8bd0-0x8c4f]
  node 0: [mem 0x8e30-0x8ecf]
  node 0: [mem 0x90d0-0xbfff]
  node 0: [mem 0xcc00-0xdc9f]
  node 0: [mem 0xde70-0xde9f]
  node 0: [mem 0xe080-0xe0bf]
  node 0: [mem 0xf4b0-0xf6ff]
  node 0: [mem 0xfda0-0xefff]

free_memmap, start_pfn = 85800, 8580 end_pfn = 86a00, 86a0
free_memmap, start_pfn = 8c800, 8c80 end_pfn = 8e300, 8e30
free_memmap, start_pfn = 8f000, 8f00 end_pfn = 9, 9000
free_memmap, start_pfn = dcc00, dcc0 end_pfn = de700, de70
free_memmap, start_pfn = dec00, dec0 end_pfn = e, e000
free_memmap, start_pfn = e0c00, e0c0 end_pfn = e4000, e400
free_memmap, start_pfn = f7000, f700 end_pfn = f8000, f800

move_freepages: start_pfn/end_pfn [de601, de7ff], [de60, de7ff000]
pfn = de600 pfn2phy = de60, page = ef3cc000, page-flags =

8<--- cut here ---
Unable to handle kernel paging request at virtual address fffe
pgd = 5dd50df5
[fffe] *pgd=a861, *pte=, *ppte=
Internal error: Oops: 37 [#1] SMP ARM
Modules linked in: gmac(O)
CPU: 2 PID: 635 Comm: test-oom Tainted: G O 5.10.0+ #31
Hardware name: Hisilicon A9
PC is at move_freepages_block+0x150/0x278
LR is at move_freepages_block+0x150/0x278
pc : [] lr : [] psr: 200e0393
sp : c4179cf8 ip : fp : 0001
r10: c4179d58 r9 : 000de7ff r8 :
r7 : c0863280 r6 : 000de600 r5 : 000de600 r4 : ef3cc000
r3 : r2 : r1 : ef5d069c r0 : fffe
Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user
Control: 1ac5387d Table: 83b0c04a DAC:
Process test-oom (pid: 635, stack limit = 0x25d667df)
Re: [PATCH v14 00/13] SMMUv3 Nested Stage Setup (IOMMU part)
> > I have worked around the issue by filtering out the request if the pfn
> > is not valid in __clean_dcache_guest_page(). As the patch wasn't
> > posted in the community, reverted it as well.
>
> That's papering over the real issue, and this mapping path needs fixing
> as it was only ever expected to be called for CoW.
>
> Can you please try the following patch and let me know if that fixes
> the issue for good?

Hi Marc,

Thank you for the patch. This patch fixed the crash for me.
For the formal patch, please add:

Tested-by: Sumit Gupta

> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 77cb2d28f2a4..b62dd40a4083 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1147,7 +1147,8 @@ int kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte)
>  	 * We've moved a page around, probably through CoW, so let's treat it
>  	 * just like a translation fault and clean the cache to the PoC.
>  	 */
> -	clean_dcache_guest_page(pfn, PAGE_SIZE);
> +	if (!kvm_is_device_pfn(pfn))
> +		clean_dcache_guest_page(pfn, PAGE_SIZE);
>  	handle_hva_to_gpa(kvm, hva, end, &kvm_set_spte_handler, &pte);
>  	return 0;
>  }
>
> --
> Without deviation from the norm, progress is not possible.
Re: arm32: panic in move_freepages (Was [PATCH v2 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid())
On Fri, Apr 23, 2021 at 04:11:16PM +0800, Kefeng Wang wrote:
> I tested this patchset (plus arm32 change, like arm64 does) based on
> lts 5.10, add some debug log, the useful info shows below, if we
> enable HOLES_IN_ZONE, no panic, any idea, thanks.

Are there any changes on top of 5.10 except for pfn_valid() patch?
Do you see this panic on 5.10 without the changes?

Can you see stack backtrace beyond move_freepages_block?

> Zone ranges:
>   Normal   [mem 0x80a0-0xb01f]
>   HighMem  [mem 0xb020-0xefff]
> Movable zone start for each node
> Early memory node ranges
>   node 0: [mem 0x80a0-0x855f]
>   node 0: [mem 0x86a0-0x87df]
>   node 0: [mem 0x8bd0-0x8c4f]
>   node 0: [mem 0x8e30-0x8ecf]
>   node 0: [mem 0x90d0-0xbfff]
>   node 0: [mem 0xcc00-0xdc9f]
>   node 0: [mem 0xde70-0xde9f]
>   node 0: [mem 0xe080-0xe0bf]
>   node 0: [mem 0xf4b0-0xf6ff]
>   node 0: [mem 0xfda0-0xefff]
>
> free_memmap, start_pfn = 85800, 8580 end_pfn = 86a00, 86a0
> free_memmap, start_pfn = 8c800, 8c80 end_pfn = 8e300, 8e30
> free_memmap, start_pfn = 8f000, 8f00 end_pfn = 9, 9000
> free_memmap, start_pfn = dcc00, dcc0 end_pfn = de700, de70
> free_memmap, start_pfn = dec00, dec0 end_pfn = e, e000
> free_memmap, start_pfn = e0c00, e0c0 end_pfn = e4000, e400
> free_memmap, start_pfn = f7000, f700 end_pfn = f8000, f800
>
> move_freepages: start_pfn/end_pfn [de601, de7ff], [de60, de7ff000]
> pfn = de600 pfn2phy = de60, page = ef3cc000, page-flags =
>
> 8<--- cut here ---
> Unable to handle kernel paging request at virtual address fffe
> pgd = 5dd50df5
> [fffe] *pgd=a861, *pte=, *ppte=
> Internal error: Oops: 37 [#1] SMP ARM
> Modules linked in: gmac(O)
> CPU: 2 PID: 635 Comm: test-oom Tainted: G O 5.10.0+ #31
> Hardware name: Hisilicon A9
> PC is at move_freepages_block+0x150/0x278
> LR is at move_freepages_block+0x150/0x278
> pc : [] lr : [] psr: 200e0393
> sp : c4179cf8 ip : fp : 0001
> r10: c4179d58 r9 : 000de7ff r8 :
> r7 : c0863280 r6 : 000de600 r5 : 000de600 r4 : ef3cc000
> r3 : r2 : r1 : ef5d069c r0 : fffe
> Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user
> Control: 1ac5387d Table: 83b0c04a DAC:
> Process test-oom (pid: 635, stack limit = 0x25d667df)

--
Sincerely yours,
Mike.
Re: [PATCH v2 0/4] arm64: drop pfn_valid_within() and simplify pfn_valid()
On Thu, Apr 22, 2021 at 11:28:24PM +0800, Kefeng Wang wrote:
> On 2021/4/22 15:29, Mike Rapoport wrote:
> > On Thu, Apr 22, 2021 at 03:00:20PM +0800, Kefeng Wang wrote:
> > > On 2021/4/21 14:51, Mike Rapoport wrote:
> > > > From: Mike Rapoport
> > > >
> > > > Hi,
> > > >
> > > > These patches aim to remove CONFIG_HOLES_IN_ZONE and essentially
> > > > hardwire pfn_valid_within() to 1.
> > > >
> > > > The idea is to mark NOMAP pages as reserved in the memory map and
> > > > restore the intended semantics of pfn_valid() to designate
> > > > availability of struct page for a pfn.
> > > >
> > > > With this the core mm will be able to cope with the fact that it
> > > > cannot use NOMAP pages and the holes created by NOMAP ranges
> > > > within MAX_ORDER blocks will be treated correctly even without
> > > > the need for pfn_valid_within.
> > > >
> > > > The patches are only boot tested on qemu-system-aarch64 so I'd
> > > > really appreciate memory stress tests on real hardware.
> > > >
> > > > If this actually works we'll be one step closer to drop custom
> > > > pfn_valid() on arm64 altogether.
> > >
> > > Hi Mike, I have a question, without HOLES_IN_ZONE, the
> > > pfn_valid_within() in move_freepages_block()->move_freepages()
> > > will be optimized, if there are holes in zone, the 'struct page'
> > > (memory map) for pfn range of hole will be freed by free_memmap(),
> > > and then the page traverse in the zone (with holes) from
> > > move_freepages() will meet the wrong page, then it could panic at
> > > PageLRU(page) test, check link [1],
> >
> > First, the HOLES_IN_ZONE name is hugely misleading; this
> > configuration option has nothing to do with memory holes, but rather
> > it is there to deal with holes or undefined struct pages in the
> > memory map, when these holes can be inside a MAX_ORDER_NR_PAGES
> > region.
> >
> > In general pfn walkers use pfn_valid() and pfn_valid_within() to
> > avoid accessing *missing* struct pages, like those that are freed at
> > free_memmap(). But on arm64 these tests also filter out the nomap
> > entries because their struct pages are not initialized.
> >
> > The panic you refer to happened because there was an uninitialized
> > struct page in the middle of a MAX_ORDER_NR_PAGES region because it
> > corresponded to nomap memory.
> >
> > With these changes I make sure that such pages will be properly
> > initialized as PageReserved and the pfn walkers will be able to rely
> > on the memory map.
> >
> > Note also, that free_memmap() aligns the parts being freed on
> > MAX_ORDER boundaries, so there will be no missing parts in the memory
> > map within a MAX_ORDER_NR_PAGES region.
>
> Ok, thanks, we met a same panic like the link on arm32 (without
> HOLES_IN_ZONE), the scheme for arm64 could be suit for arm32, right?

In general yes. You just need to make sure that usage of pfn_valid() in
arch/arm does not presume that it tests something beyond availability of
struct page for a pfn.

> I will try the patchset with some changes on arm32 and give some
> feedback.
>
> Again, the stupid question, where will mark the region of memblock with
> MEMBLOCK_NOMAP flag?

Not sure I understand the question. The memory regions with "nomap"
property in the device tree will be marked MEMBLOCK_NOMAP.

> > > "The idea is to mark NOMAP pages as reserved in the memory map", I
> > > see the patch2 check memblock_is_nomap() in memory region of
> > > memblock, but it seems that memblock_mark_nomap() is not called
> > > (maybe I missed), then memmap_init_reserved_pages() won't work, so
> > > should the HOLES_IN_ZONE still be needed for generic mm code?
> > >
> > > [1] https://lore.kernel.org/linux-arm-kernel/541193a6-2bce-f042-5bb2-88913d5f1...@arm.com/

--
Sincerely yours,
Mike.