Hi Dan,
Bisection shows this issue was introduced by the patch below:
commit f8f6ae5d077a9bdaf5cbf2ac960a5d1a04b47482
Author: Jason Gunthorpe <j...@ziepe.ca>
Date: Sun Nov 1 17:08:00 2020 -0800
mm: always have io_remap_pfn_range() set pgprot_decrypted()
The purpose of io_remap_pfn_range() is to map IO memory, such as a
memory mapped IO exposed through a PCI BAR. IO devices do not
understand encryption, so this memory must always be decrypted.
Automatically call pgprot_decrypted() as part of the generic
implementation.
This fixes a bug where enabling AMD SME causes subsystems, such as RDMA,
using io_remap_pfn_range() to expose BAR pages to user space to fail.
The CPU will encrypt access to those BAR pages instead of passing
unencrypted IO directly to the device.
Places not mapping IO should use remap_pfn_range().
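For reference, after this commit the generic io_remap_pfn_range() becomes a thin wrapper that forces the decrypted attribute; a sketch of the pattern (based on my reading of include/linux/mm.h around that commit, not a verbatim quote):

```c
/* Generic fallback, used when the arch does not define its own
 * io_remap_pfn_range(). IO memory must never be mapped encrypted,
 * so the decrypted pgprot is applied unconditionally here. */
#ifndef io_remap_pfn_range
static inline int io_remap_pfn_range(struct vm_area_struct *vma,
				     unsigned long addr, unsigned long pfn,
				     unsigned long size, pgprot_t prot)
{
	return remap_pfn_range(vma, addr, pfn, size, pgprot_decrypted(prot));
}
#endif
```

On a devdax mapping this is suspicious, since device-DAX is real memory rather than IO, and mapping it with the decrypted attribute would be wrong on SME systems.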
On 11/9/20 10:38 AM, Yi Zhang wrote:
Hello
I found this regression during a devdax fio test on 5.10.0-rc3; could anyone
help check it? Thanks.
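For context, a devdax fio job of the kind used here looks roughly like the following (illustrative only; the exact job file and /dev/daxX.Y device from this run are not shown, so the parameters below are assumptions):

```ini
; hypothetical devdax fio job -- device path and sizes are assumptions
[devdax-test]
ioengine=dev-dax
filename=/dev/dax0.0
rw=randrw
bs=4k
size=1g
```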
[ 303.441089] memmap_init_zone_device initialised 2063872 pages in 34ms
[ 303.501085] memmap_init_zone_device initialised 2063872 pages in 34ms
[ 303.556891] memmap_init_zone_device initialised 2063872 pages in 24ms
[ 303.612790] memmap_init_zone_device initialised 2063872 pages in 24ms
[ 326.779920] perf: interrupt took too long (2714 > 2500), lowering kernel.perf_event_max_sample_rate to 73000
[ 334.857133] perf: interrupt took too long (3737 > 3392), lowering kernel.perf_event_max_sample_rate to 53000
[ 366.202597] memmap_init_zone_device initialised 1835008 pages in 21ms
[ 366.255031] memmap_init_zone_device initialised 1835008 pages in 22ms
[ 366.317048] memmap_init_zone_device initialised 1835008 pages in 31ms
[ 366.377970] memmap_init_zone_device initialised 1835008 pages in 32ms
[ 368.785285] BUG: Bad page state in process kworker/41:0 pfn:891066
[ 368.818471] page:00000000581ab220 refcount:0 mapcount:-1024 mapping:0000000000000000 index:0x0 pfn:0x891066
[ 368.865117] flags: 0x57ffffc0000000()
[ 368.882138] raw: 0057ffffc0000000 dead000000000100 dead000000000122 0000000000000000
[ 368.917429] raw: 0000000000000000 0000000000000000 00000000fffffbff 0000000000000000
[ 368.952788] page dumped because: nonzero mapcount
[ 368.974190] Modules linked in: rfkill sunrpc vfat fat dm_multipath
intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp
coretemp mgag200 ipmi_ssif i2c_algo_bit kvm_intel drm_kms_helper syscopyarea
acpi_ipmi sysfillrect kvm sysimgblt ipmi_si fb_sys_fops iTCO_wdt
iTCO_vendor_support ipmi_devintf drm irqbypass crct10dif_pclmul ipmi_msghandler
crc32_pclmul i2c_i801 ghash_clmulni_intel dax_pmem_compat rapl device_dax
i2c_smbus intel_cstate ioatdma intel_uncore joydev hpilo dax_pmem_core pcspkr
acpi_tad hpwdt lpc_ich dca acpi_power_meter ip_tables xfs sr_mod cdrom sd_mod
t10_pi sg nd_pmem nd_btt ahci nfit bnx2x libahci libata tg3 libnvdimm hpsa mdio
libcrc32c scsi_transport_sas wmi crc32c_intel dm_mirror dm_region_hash dm_log
dm_mod
[ 369.281195] CPU: 41 PID: 3258 Comm: kworker/41:0 Tainted: G S 5.10.0-rc3 #1
[ 369.321037] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 10/05/2016
[ 369.363640] Workqueue: mm_percpu_wq vmstat_update
[ 369.385044] Call Trace:
[ 369.388275] perf: interrupt took too long (5477 > 4671), lowering kernel.perf_event_max_sample_rate to 36000
[ 369.396225] dump_stack+0x57/0x6a
[ 369.411391] bad_page.cold.114+0x9b/0xa0
[ 369.429316] free_pcppages_bulk+0x538/0x760
[ 369.448465] drain_zone_pages+0x1f/0x30
[ 369.466027] refresh_cpu_vm_stats+0x1ea/0x2b0
[ 369.485972] vmstat_update+0xf/0x50
[ 369.502064] process_one_work+0x1a4/0x340
[ 369.520412] ? process_one_work+0x340/0x340
[ 369.539510] worker_thread+0x30/0x370
[ 369.555744] ? process_one_work+0x340/0x340
[ 369.574765] kthread+0x116/0x130
[ 369.589612] ? kthread_park+0x80/0x80
[ 369.606231] ret_from_fork+0x22/0x30
[ 369.622910] Disabling lock debugging due to kernel taint
[ 393.619285] perf: interrupt took too long (6874 > 6846), lowering kernel.perf_event_max_sample_rate to 29000
[ 397.904036] BUG: Bad page state in process kworker/57:1 pfn:189525
[ 397.936971] page:00000000be782875 refcount:0 mapcount:-1024 mapping:0000000000000000 index:0x0 pfn:0x189525
[ 397.984722] flags: 0x17ffffc0000000()
[ 398.002324] raw: 0017ffffc0000000 dead000000000100 dead000000000122 0000000000000000
[ 398.039032] raw: 0000000000000000 0000000000000000 00000000fffffbff 0000000000000000
[ 398.075804] page dumped because: nonzero mapcount
[ 398.098130] Modules linked in: rfkill sunrpc vfat fat dm_multipath
intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp
coretemp mgag200 ipmi_ssif i2c_algo_bit kvm_intel drm_kms_helper syscopyarea
acpi_ipmi sysfillrect kvm sysimgblt ipmi_si fb_sys_fops iTCO_wdt
iTCO_vendor_support ipmi_devintf drm irqbypass crct10dif_pclmul ipmi_msghandler
crc32_pclmul i2c_i801 ghash_clmulni_intel dax_pmem_compat rapl device_dax
i2c_smbus intel_cstate ioatdma intel_uncore joydev hpilo dax_pmem_core pcspkr
acpi_tad hpwdt lpc_ich dca acpi_power_meter ip_tables xfs sr_mod cdrom sd_mod
t10_pi sg nd_pmem nd_btt ahci nfit bnx2x libahci libata tg3 libnvdimm hpsa mdio
libcrc32c scsi_transport_sas wmi crc32c_intel dm_mirror dm_region_hash dm_log
dm_mod
[ 398.413042] CPU: 57 PID: 587 Comm: kworker/57:1 Tainted: G S B 5.10.0-rc3 #1
[ 398.455914] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 10/05/2016
[ 398.496657] Workqueue: mm_percpu_wq vmstat_update
[ 398.518938] Call Trace:
[ 398.530673] dump_stack+0x57/0x6a
[ 398.546463] bad_page.cold.114+0x9b/0xa0
[ 398.564977] free_pcppages_bulk+0x538/0x760
[ 398.584697] drain_zone_pages+0x1f/0x30
[ 398.602907] refresh_cpu_vm_stats+0x1ea/0x2b0
[ 398.623681] vmstat_update+0xf/0x50
[ 398.640415] process_one_work+0x1a4/0x340
[ 398.659517] ? process_one_work+0x340/0x340
[ 398.678659] worker_thread+0x30/0x370
[ 398.695506] ? process_one_work+0x340/0x340
[ 398.715204] kthread+0x116/0x130
[ 398.730572] ? kthread_park+0x80/0x80
[ 398.747761] ret_from_fork+0x22/0x30
Best Regards,
Yi Zhang
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org