Shiyang Ruan wrote:
> Changes since v1:
>  1. Added a snippet of the warning message and some of the failed cases
>  2. Separated the patch for easier review
>  3. Added page->share and its helper functions
>  4. Included the patch[1] that removes the restrictions of fsdax and reflink
> [1]
> https://lore.kernel.org/linux-xfs/1663234002-17-1-git-send-email-ruansy.f...@fujitsu.com/
>
> Many testcases failed in dax+reflink mode with warning message in dmesg.
> Such as generic/051,075,127. The warning message is like this:
> [ 775.509337] ------------[ cut here ]------------
> [ 775.509636] WARNING: CPU: 1 PID: 16815 at fs/dax.c:386
> dax_insert_entry.cold+0x2e/0x69
> [ 775.510151] Modules linked in: auth_rpcgss oid_registry nfsv4 algif_hash
> af_alg af_packet nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject
> nft_ct nft_chain_nat iptable_nat nf_nat nf_conntrack nf_defrag_ipv6
> nf_defrag_ipv4 ip_set nf_tables nfnetlink ip6table_filter ip6_tables
> iptable_filter ip_tables x_tables dax_pmem nd_pmem nd_btt sch_fq_codel
> configfs xfs libcrc32c fuse
> [ 775.524288] CPU: 1 PID: 16815 Comm: fsx Kdump: loaded Tainted: G W
> 6.1.0-rc4+ #164 eb34e4ee4200c7cbbb47de2b1892c5a3e027fd6d
> [ 775.524904] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch
> Linux 1.16.0-3-3 04/01/2014
> [ 775.525460] RIP: 0010:dax_insert_entry.cold+0x2e/0x69
> [ 775.525797] Code: c7 c7 18 eb e0 81 48 89 4c 24 20 48 89 54 24 10 e8 73 6d
> ff ff 48 83 7d 18 00 48 8b 54 24 10 48 8b 4c 24 20 0f 84 e3 e9 b9 ff <0f> 0b
> e9 dc e9 b9 ff 48 c7 c6 a0 20 c3 81 48 c7 c7 f0 ea e0 81 48
> [ 775.526708] RSP: 0000:ffffc90001d57b30 EFLAGS: 00010082
> [ 775.527042] RAX: 000000000000002a RBX: 0000000000000000 RCX:
> 0000000000000042
> [ 775.527396] RDX: ffffea000a0f6c80 RSI: ffffffff81dfab1b RDI:
> 00000000ffffffff
> [ 775.527819] RBP: ffffea000a0f6c40 R08: 0000000000000000 R09:
> ffffffff820625e0
> [ 775.528241] R10: ffffc90001d579d8 R11: ffffffff820d2628 R12:
> ffff88815fc98320
> [ 775.528598] R13: ffffc90001d57c18 R14: 0000000000000000 R15:
> 0000000000000001
> [ 775.528997] FS: 00007f39fc75d740(0000) GS:ffff88817bc80000(0000)
> knlGS:0000000000000000
> [ 775.529474] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 775.529800] CR2: 00007f39fc772040 CR3: 0000000107eb6001 CR4:
> 00000000003706e0
> [ 775.530214] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [ 775.530592] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> 0000000000000400
> [ 775.531002] Call Trace:
> [ 775.531230] <TASK>
> [ 775.531444] dax_fault_iter+0x267/0x6c0
> [ 775.531719] dax_iomap_pte_fault+0x198/0x3d0
> [ 775.532002] __xfs_filemap_fault+0x24a/0x2d0 [xfs
> aa8d25411432b306d9554da38096f4ebb86bdfe7]
> [ 775.532603] __do_fault+0x30/0x1e0
> [ 775.532903] do_fault+0x314/0x6c0
> [ 775.533166] __handle_mm_fault+0x646/0x1250
> [ 775.533480] handle_mm_fault+0xc1/0x230
> [ 775.533810] do_user_addr_fault+0x1ac/0x610
> [ 775.534110] exc_page_fault+0x63/0x140
> [ 775.534389] asm_exc_page_fault+0x22/0x30
> [ 775.534678] RIP: 0033:0x7f39fc55820a
> [ 775.534950] Code: 00 01 00 00 00 74 99 83 f9 c0 0f 87 7b fe ff ff c5 fe 6f
> 4e 20 48 29 fe 48 83 c7 3f 49 8d 0c 10 48 83 e7 c0 48 01 fe 48 29 f9 <f3> a4
> c4 c1 7e 7f 00 c4 c1 7e 7f 48 20 c5 f8 77 c3 0f 1f 44 00 00
> [ 775.535839] RSP: 002b:00007ffc66a08118 EFLAGS: 00010202
> [ 775.536157] RAX: 00007f39fc772001 RBX: 0000000000042001 RCX:
> 00000000000063c1
> [ 775.536537] RDX: 0000000000006400 RSI: 00007f39fac42050 RDI:
> 00007f39fc772040
> [ 775.536919] RBP: 0000000000006400 R08: 00007f39fc772001 R09:
> 0000000000042000
> [ 775.537304] R10: 0000000000000001 R11: 0000000000000246 R12:
> 0000000000000001
> [ 775.537694] R13: 00007f39fc772000 R14: 0000000000006401 R15:
> 0000000000000003
> [ 775.538086] </TASK>
> [ 775.538333] ---[ end trace 0000000000000000 ]---
>
> This also affects dax+noreflink mode if we run the test after a
> dax+reflink test. So, the most urgent thing is solving the warning
> messages.
>
> With these fixes, most warning messages in dax_associate_entry() are
> gone. But honestly, generic/388 still fails randomly with the warning.
> That case shuts down XFS while fsstress is running, and does so many
> times. I think the reason is that DAX pages in use cannot be
> invalidated in time when the fs is shut down. The next time such a DAX
> page is associated, it still holds the mapping value set last time.
> I'll keep working on it.

This one also sounds like it is going to be relevant for CXL PMEM, and
the improvements to the reference counting. CXL has a facility where the
driver asserts that no more writes are in-flight to the device so that
the device can assert a clean shutdown. Part of that will be making sure
that page access ends at fs shutdown.

> The warning message in dax_writeback_one() can also be fixed because of
> the dax unshare.
>
>
> Shiyang Ruan (8):
>   fsdax: introduce page->share for fsdax in reflink mode
>   fsdax: invalidate pages when CoW
>   fsdax: zero the edges if source is HOLE or UNWRITTEN
>   fsdax,xfs: set the shared flag when file extent is shared
>   fsdax: dedupe: iter two files at the same time
>   xfs: use dax ops for zero and truncate in fsdax mode
>   fsdax,xfs: port unshare to fsdax
>   xfs: remove restrictions for fsdax and reflink
>
>  fs/dax.c                   | 220 +++++++++++++++++++++++++------------
>  fs/xfs/xfs_ioctl.c         |   4 -
>  fs/xfs/xfs_iomap.c         |   6 +-
>  fs/xfs/xfs_iops.c          |   4 -
>  fs/xfs/xfs_reflink.c       |   8 +-
>  include/linux/dax.h        |   2 +
>  include/linux/mm_types.h   |   5 +-
>  include/linux/page-flags.h |   2 +-
>  8 files changed, 166 insertions(+), 85 deletions(-)
>
> --
> 2.38.1
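
For readers following the thread, the stale-association problem described in the quoted cover letter can be illustrated with a small userspace sketch in C. This is only a toy model under stated assumptions, not the code in fs/dax.c: `struct toy_page`, `toy_associate()` and `toy_disassociate()` are hypothetical names, and the `share` counter only loosely mirrors the page->share idea from the series. It shows why a page whose association was never torn down (for example because the fs was shut down first) trips a check on the next association, while a shared extent is counted instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a DAX page. NOT the kernel's struct page. */
struct toy_page {
    const void *mapping;  /* owning file mapping, NULL when unused */
    unsigned long share;  /* number of files sharing this page (reflink) */
};

/*
 * Associate a page with a file mapping.
 * Returns false in the situation where, in this model, a warning
 * would fire: the page is not shared but still carries a mapping
 * left over from an earlier association.
 */
static bool toy_associate(struct toy_page *p, const void *mapping, bool shared)
{
    if (shared) {
        /* Shared extent: count owners instead of recording one mapping. */
        p->share++;
        return true;
    }
    if (p->mapping != NULL)
        return false;     /* stale association was never cleared */
    p->mapping = mapping;
    return true;
}

/* Drop one association: one share reference, or the single mapping. */
static void toy_disassociate(struct toy_page *p)
{
    if (p->share > 0) {
        p->share--;
        return;
    }
    p->mapping = NULL;
}
```

In this model, skipping `toy_disassociate()` before reusing a page makes the next non-shared `toy_associate()` return false, which corresponds to the randomly recurring warning the cover letter attributes to pages not being invalidated in time at shutdown.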