Fan,

Many thanks, it helps a lot. Previously I forgot to create a new dax device (daxctl create-device region0).

Question: why do we need to create dax0.1? Why isn't dax0.0 associated with the newly added DCD region?

Ira,

Let me try to report a kernel panic.

kernel: dcd-2024-04-17
qemu: dcd-2024-04-17

QEMU command line:
    <qemu:arg value='-device'/>
    <qemu:arg value='cxl-type3,bus=cxl-rp-hb0rp0,persistent-memdev=cxl-pmem0,lsa=cxl-pmem-lsa0,id=pmem-dcmem,volatile-dc-memdev=cxl-dcmem0,num-dc-regions=4'/>
    <qemu:arg value='-object'/>
    <qemu:arg value='memory-backend-file,id=cxl-dcmem0,share=on,mem-path=/home/lizhijian/images/cxldcmem0.raw,size=2048M'/>
    <qemu:arg value='-object'/>
    <qemu:arg value='memory-backend-file,id=cxl-pmem0,share=on,mem-path=/home/lizhijian/images/cxlpmem0.raw,size=2048M'/>
    <qemu:arg value='-object'/>
    <qemu:arg value='memory-backend-file,id=cxl-pmem-lsa0,share=on,mem-path=/home/lizhijian/images/cxlpmem-lsa0.raw,size=4K'/>
    <qemu:arg value='-M'/>
    <qemu:arg value='cxl=on,cxl-fmw.0.targets.0=pxb-cxl.0,cxl-fmw.0.size=4G,cxl-fmw.0.interleave-granularity=8k'/>

Reproducer:
1. guest: ./create-dc.sh
2. host: virsh qemu-monitor-command rdma-server-cxl-persistent-dcd $(cat cxl-add-dcd.json)
3. guest: daxctl create-device region0 # will create dax0.1
4. daxctl reconfigure-device --mode=system-ram --force dax0.1 -u # kernel panic

=====================
# cat ./create-dc.sh
#!/bin/bash
set -ex

region=$(cat /sys/bus/cxl/devices/decoder0.0/create_dc_region)
echo $region > /sys/bus/cxl/devices/decoder0.0/create_dc_region
echo 256 > /sys/bus/cxl/devices/$region/interleave_granularity
echo 1 > /sys/bus/cxl/devices/$region/interleave_ways

echo "dc0" > /sys/bus/cxl/devices/decoder2.0/mode
echo 0x10000000 > /sys/bus/cxl/devices/decoder2.0/dpa_size

echo 0x10000000 > /sys/bus/cxl/devices/$region/size
echo "decoder2.0" > /sys/bus/cxl/devices/$region/target0
echo 1 > /sys/bus/cxl/devices/$region/commit
echo $region > /sys/bus/cxl/drivers/cxl_region/bind

=========================
# cat cxl-add-dcd.json
{ "execute": "cxl-add-dynamic-capacity",
  "arguments": {
      "path": "/machine/peripheral/pmem-dcmem",
      "hid": 0,
      "selection-policy": 2,
      "region-id": 0,
      "tag": "",
      "extents": [
      {
          "offset": 0,
          "len": 268435456
      }
      ]
  }
}

[ 126.909297] Demotion targets for Node 0: preferred: 1, fallback: 1
[ 126.911186] Demotion targets for Node 1: null
[ 126.913808] BUG: kernel NULL pointer dereference, address: 0000000000000468
[ 126.915431] #PF: supervisor read access in kernel mode
[ 126.917156] #PF: error_code(0x0000) - not-present page
[ 126.918976] PGD 8000000006771067 P4D 8000000006771067 PUD e777067 PMD 0
[ 126.920587] Oops: 0000 [#1] PREEMPT SMP PTI
[ 126.921714] CPU: 0 PID: 1101 Comm: daxctl Kdump: loaded Not tainted 6.9.0-rc3-lizhijian+ #489
[ 126.924914] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 126.928620] RIP: 0010:cxl_region_perf_attrs_callback+0x25/0x110 [cxl_core]
[ 126.930316] Code: 90 90 90 90 90 0f 1f 44 00 00 41 56 41 55 41 54 55 53 8b 6a 24 83 fd ff 74 20 48 83 fe 01 75 1a 48 8b 87 58 ff ff ff 48 89 fb <48> 8b b8 68 04 00 00 e8 cf a2 f4 e0 39 c5 74 13 45 31 e4 5b 44 89
[ 126.934920] RSP: 0018:ffffc900007cbc58 EFLAGS: 00010246
[ 126.936994] RAX: 0000000000000000 RBX: ffff888007534d60 RCX: 0000000000000020
[ 126.939378] RDX: ffffc900007cbcf8 RSI: 0000000000000001 RDI: ffff888007534d60
[ 126.942721] RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000001
[ 126.944762] R10: ffff88807fc31d80 R11: 0000000000000000 R12: 0000000000000000
[ 126.946900] R13: 0000000000000001 R14: ffffc900007cbcf8 R15: ffff888007534d60
[ 126.948871] FS: 00007fb2ab918880(0000) GS:ffff88807fc00000(0000) knlGS:0000000000000000
[ 126.951241] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 126.952722] CR2: 0000000000000468 CR3: 000000000aaf0003 CR4: 00000000001706f0
[ 126.954623] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 126.956768] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 126.958887] Call Trace:
[ 126.959814] <TASK>
[ 126.960569] ? __die+0x20/0x70
[ 126.961645] ? page_fault_oops+0x15a/0x450
[ 126.962930] ? search_module_extables+0x33/0x90
[ 126.964374] ? fixup_exception+0x22/0x310
[ 126.965693] ? exc_page_fault+0x68/0x200
[ 126.967371] ? asm_exc_page_fault+0x22/0x30
[ 126.968713] ? cxl_region_perf_attrs_callback+0x25/0x110 [cxl_core]
[ 126.972508] notifier_call_chain+0x40/0x110
[ 126.974380] blocking_notifier_call_chain+0x43/0x60
[ 126.975788] online_pages+0x24c/0x2d0
[ 126.977008] memory_subsys_online+0x233/0x290
[ 126.978338] device_online+0x64/0x90
[ 126.979440] state_store+0xae/0xc0
[ 126.980510] kernfs_fop_write_iter+0x143/0x200
[ 126.981734] vfs_write+0x3a6/0x570
[ 126.982851] ksys_write+0x65/0xf0
[ 126.984006] do_syscall_64+0x6d/0x140
[ 126.985309] entry_SYSCALL_64_after_hwframe+0x71/0x79
[ 126.986927] RIP: 0033:0x7fb2abc777a7
[ 126.987983] Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[ 126.992770] RSP: 002b:00007ffebec70b98 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 126.994874] RAX: ffffffffffffffda RBX: 000000000040e1f0 RCX: 00007fb2abc777a7
[ 126.996906] RDX: 000000000000000f RSI: 00007fb2abdb6434 RDI: 0000000000000004
[ 126.998911] RBP: 00007ffebec70bd0 R08: 0000000000000000 R09: 00007ffebec70640
[ 127.000879] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000403840
[ 127.003572] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 127.005543] </TASK>

Thanks
Zhijian


On 17/05/2024 01:12, fan wrote:
> On Tue, May 14, 2024 at 02:16:51AM +0000, Zhijian Li (Fujitsu) wrote:
>> Hi Fan
>>
>>
>> Do you have a newer instruction to play with the DCD. It seems that
>> the instruction in RFC[0] doesn't work for current code.
>>
>> [0] https://lore.kernel.org/all/20230511175609.2091136-1-fan...@samsung.com/
>>
>
> For the testing, the only thing that has been changed for this series is
> the QMP interface for add/release DC extents.
>
> https://lore.kernel.org/linux-cxl/d708f7c8-2598-4a17-9cbb-935c6ae2a...@fujitsu.com/T/#m05066f0098e976fb1c4b05db5e7ff7ca1bf27b1e
>
> 1. Add dynamic capacity extents:
>
> For example, the command to add two continuous extents (each 128MiB long)
> to region 0 (starting at DPA offset 0) looks like below:
>
> { "execute": "qmp_capabilities" }
>
> { "execute": "cxl-add-dynamic-capacity",
>   "arguments": {
>       "path": "/machine/peripheral/cxl-dcd0",
>       "hid": 0,
>       "selection-policy": 2,
>       "region-id": 0,
>       "tag": "",
>       "extents": [
>       {
>           "offset": 0,
>           "len": 134217728
>       },
>       {
>           "offset": 134217728,
>           "len": 134217728
>       }
>       ]
>   }
> }
>
> 2. Release dynamic capacity extents:
>
> For example, the command to release an extent of size 128MiB from region 0
> (DPA offset 128MiB) looks like below:
>
> { "execute": "cxl-release-dynamic-capacity",
>   "arguments": {
>       "path": "/machine/peripheral/cxl-dcd0",
>       "hid": 0,
>       "flags": 1,
>       "region-id": 0,
>       "tag": "",
>       "extents": [
>       {
>           "offset": 134217728,
>           "len": 134217728
>       }
>       ]
>   }
> }
>
> btw, I have a wiki page to explain how to test CXL DCD with a tool I
> wrote.
> https://github.com/moking/moking.github.io/wiki/cxl%E2%80%90test%E2%80%90tool:-A-tool-to-ease-CXL-test-with-QEMU-setup%E2%80%90%E2%80%90Using-DCD-test-as-an-example
>
> Let me know if you need more info for testing.
>
>
> Fan
>
>>
>>
>> On 19/04/2024 07:10, nifan....@gmail.com wrote:
>>> A git tree of this series can be found here (with one extra commit on top
>>> for printing out accepted/pending extent list):
>>> https://github.com/moking/qemu/tree/dcd-v7
>>>
>>> v6->v7:
>>>
>>> 1. Fixed the dvsec range register issue mentioned in the cover letter
>>>    in v6.
>>>    Only relevant bits are set to mark the device ready (Patch 6).
>>>    (Jonathan)
>>> 2. Moved the if statement in cxl_setup_memory from Patch 6 to Patch 4.
>>>    (Jonathan)
>>> 3. Used MIN instead of if statement to get record_count in Patch 7.
>>>    (Jonathan)
>>> 4. Added "Reviewed-by" tag to Patch 7.
>>> 5. Modified cxl_dc_extent_release_dry_run so the updated extent list can be
>>>    reused in cmd_dcd_release_dyn_cap to simplify the process in Patch 8.
>>>    (Jørgen)
>>> 6. Added comments to indicate further "TODO" items in
>>>    cmd_dcd_add_dyn_cap_rsp.
>>>    (Jonathan)
>>> 7. Avoided irrelevant code reformat in Patch 8. (Jonathan)
>>> 8. Modified QMP interfaces for adding/releasing DC extents to allow passing
>>>    tags, selection policy, flags in the interface. (Jonathan, Gregory)
>>> 9. Redesigned the pending list so extents in the same requests are grouped
>>>    together. A new data structure is introduced to represent "extent
>>>    group"
>>>    in pending list. (Jonathan)
>>> 10. Added support in QMP interface for "More" flag.
>>> 11. Check "Forced removal" flag for release request and not let it pass
>>>     through.
>>> 12. Removed the dynamic capacity log type from CxlEventLog definition in
>>>     cxl.json
>>>     to avoid the side effect it may introduce to inject error to DC event
>>>     log.
>>>     (Jonathan)
>>> 13. Hard coded the event log type to dynamic capacity event log in QMP
>>>     interfaces. (Jonathan)
>>> 14. Adding space in between "-1]". (Jonathan)
>>> 15. Some minor comment fixes.
>>>
>>> The code is tested with similar setup and has passed similar tests as listed
>>> in the cover letter of v5[1] and v6[2].
>>> Also, the code is tested with the latest DCD kernel patchset[3].
>>>
>>> [1] Qemu DCD patchset v5:
>>> https://lore.kernel.org/linux-cxl/20240304194331.1586191-1-nifan....@gmail.com/T/#t
>>> [2] Qemu DCD patchset v6:
>>> https://lore.kernel.org/linux-cxl/20240325190339.696686-1-nifan....@gmail.com/T/#t
>>> [3] DCD kernel patches:
>>> https://lore.kernel.org/linux-cxl/20240324-dcd-type2-upstream-v1-0-b7b00d623...@intel.com/T/#m11c571e21c4fe17c7d04ec5c2c7bc7cbf2cd07e3
>>>
>>>
>>> Fan Ni (12):
>>>   hw/cxl/cxl-mailbox-utils: Add dc_event_log_size field to output
>>>     payload of identify memory device command
>>>   hw/cxl/cxl-mailbox-utils: Add dynamic capacity region representative
>>>     and mailbox command support
>>>   include/hw/cxl/cxl_device: Rename mem_size as static_mem_size for
>>>     type3 memory devices
>>>   hw/mem/cxl_type3: Add support to create DC regions to type3 memory
>>>     devices
>>>   hw/mem/cxl-type3: Refactor ct3_build_cdat_entries_for_mr to take mr
>>>     size instead of mr as argument
>>>   hw/mem/cxl_type3: Add host backend and address space handling for DC
>>>     regions
>>>   hw/mem/cxl_type3: Add DC extent list representative and get DC extent
>>>     list mailbox support
>>>   hw/cxl/cxl-mailbox-utils: Add mailbox commands to support add/release
>>>     dynamic capacity response
>>>   hw/cxl/events: Add qmp interfaces to add/release dynamic capacity
>>>     extents
>>>   hw/mem/cxl_type3: Add DPA range validation for accesses to DC regions
>>>   hw/cxl/cxl-mailbox-utils: Add superset extent release mailbox support
>>>   hw/mem/cxl_type3: Allow to release extent superset in QMP interface
>>>
>>>  hw/cxl/cxl-mailbox-utils.c  | 620 ++++++++++++++++++++++++++++++++++-
>>>  hw/mem/cxl_type3.c          | 633 +++++++++++++++++++++++++++++++++---
>>>  hw/mem/cxl_type3_stubs.c    |  20 ++
>>>  include/hw/cxl/cxl_device.h |  81 ++++-
>>>  include/hw/cxl/cxl_events.h |  18 +
>>>  qapi/cxl.json               |  69 ++++
>>>  6 files changed, 1396 insertions(+), 45 deletions(-)
>>>
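
For anyone reproducing this, the cxl-add-dynamic-capacity request quoted in this thread can be generated rather than hand-written, which makes it easier to keep offsets and lengths consistent with the region size set in create-dc.sh. The following is only a minimal sketch under stated assumptions: the helper names (contiguous_extents, add_dc_request, check_extents) are hypothetical and not part of QEMU or any tool mentioned here, the field names are copied from the JSON in this thread, and the overlap/size check is a local sanity check on the script side, not a statement about what QEMU or the kernel validates.

```python
import json

MIB = 1024 * 1024


def contiguous_extents(start_offset, extent_len, count):
    """Return `count` back-to-back extents of `extent_len` bytes each."""
    return [
        {"offset": start_offset + i * extent_len, "len": extent_len}
        for i in range(count)
    ]


def add_dc_request(path, region_id, extents, hid=0, selection_policy=2, tag=""):
    """Build a request dict with the same fields as the JSON quoted in the thread."""
    return {
        "execute": "cxl-add-dynamic-capacity",
        "arguments": {
            "path": path,
            "hid": hid,
            "selection-policy": selection_policy,
            "region-id": region_id,
            "tag": tag,
            "extents": extents,
        },
    }


def check_extents(extents, region_size):
    """Local sanity check (an assumption of this sketch, not QEMU behaviour):
    reject extents that overlap each other or run past `region_size`."""
    end = 0
    for ext in sorted(extents, key=lambda e: e["offset"]):
        if ext["offset"] < end:
            raise ValueError(f"extent at offset {ext['offset']} overlaps the previous one")
        end = ext["offset"] + ext["len"]
    if end > region_size:
        raise ValueError(f"extents end at {end}, past region size {region_size}")


if __name__ == "__main__":
    region_size = 0x10000000  # 256 MiB, the value written to the region size in create-dc.sh
    extents = contiguous_extents(0, 128 * MIB, 2)  # two 128 MiB extents from DPA offset 0
    check_extents(extents, region_size)
    # Device path taken from the example in the thread; adjust to the id used on the QEMU command line.
    print(json.dumps(add_dc_request("/machine/peripheral/cxl-dcd0", 0, extents)))
```

The printed JSON can be saved to a file and passed to the monitor the same way as cxl-add-dcd.json in step 2 of the reproducer.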