Re: amd-staging-drm-next breaks suspend
Tested amd-staging-drm-next (HEAD e5c18a35031963eb22bfabf84cce3545da56a8ee) and suspend/resume works despite the warnings. So the amdgpu_gart_bind warning did not cause the problem.

On Thursday, 20.01.2022 at 01:52 +, Kim, Jonathan wrote:
> [Public]
>
> This should fix the issue by getting rid of the unneeded flag check
> during gart bind:
> https://patchwork.freedesktop.org/patch/469907/
>
> Thanks,
>
> Jon
>
> > -----Original Message-----
> > From: amd-gfx On Behalf Of Bert Karwatzki
> > Sent: January 19, 2022 8:12 PM
> > To: Alex Deucher
> > Cc: Chris Hixon; Zhuo, Qingqing (Lillian); Das, Nirmoy; amd-gfx@lists.freedesktop.org; Scott Bruce; Limonciello, Mario; Kazlauskas, Nicholas
> > Subject: Re: amd-staging-drm-next breaks suspend
> >
> > [CAUTION: External Email]
> >
> > Unfortunately this does not work either:
> >
> > [ 0.859998] [ cut here ]
> > [ 0.859998] trying to bind memory to uninitialized GART !
> > [ 0.860003] WARNING: CPU: 13 PID: 235 at drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c:254 amdgpu_gart_bind+0x29/0x40 [amdgpu]
> > [ 0.860099] Modules linked in: amdgpu(+) drm_ttm_helper ttm gpu_sched i2c_algo_bit drm_kms_helper syscopyarea hid_sensor_hub sysfillrect mfd_core sysimgblt hid_generic fb_sys_fops cec xhci_pci xhci_hcd nvme drm r8169 nvme_core psmouse crc32c_intel realtek amd_sfh usbcore i2c_hid_acpi mdio_devres t10_pi crc_t10dif i2c_hid i2c_piix4 crct10dif_generic libphy crct10dif_common hid backlight i2c_designware_platform i2c_designware_core
> > [ 0.860113] CPU: 13 PID: 235 Comm: systemd-udevd Not tainted 5.13.0+ #15
> > [ 0.860115] Hardware name: Micro-Star International Co., Ltd.
> > Alpha > > 15 B5EEK/MS-158L, BIOS E158LAMS.107 11/10/2021 > > [ 0.860116] RIP: 0010:amdgpu_gart_bind+0x29/0x40 [amdgpu] > > [ 0.860210] Code: 00 80 bf 34 25 00 00 00 74 14 4c 8b 8f 20 25 > > 00 00 > > 4d 85 c9 74 05 e9 16 ff ff ff 31 c0 c3 48 c7 c7 08 06 7d c0 e8 8e > > cc 31 > > e2 <0f> 0b b8 ea ff ff ff c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f > > 40 > > [ 0.860212] RSP: 0018:bb9e80b6f968 EFLAGS: 00010286 > > [ 0.860213] RAX: RBX: 0067 RCX: > > a3080968 > > [ 0.860214] RDX: RSI: efff RDI: > > a3028960 > > [ 0.860215] RBP: 947c91e49a80 R08: R09: > > bb9e80b6f798 > > [ 0.860215] R10: bb9e80b6f790 R11: a30989a8 R12: > > > > [ 0.860216] R13: 947c8a74 R14: 947c8a74 R15: > > > > [ 0.860216] FS: 7f60a3c918c0() > > GS:947f5e94() > > knlGS: > > [ 0.860217] CS: 0010 DS: ES: CR0: 80050033 > > [ 0.860218] CR2: 7f60a4213480 CR3: 000135ee2000 CR4: > > 00550ee0 > > [ 0.860218] PKRU: 5554 > > [ 0.860219] Call Trace: > > [ 0.860221] amdgpu_ttm_gart_bind+0x74/0xc0 [amdgpu] > > [ 0.860305] amdgpu_ttm_alloc_gart+0x13e/0x190 [amdgpu] > > [ 0.860385] amdgpu_bo_create_reserved.part.0+0xf3/0x1b0 > > [amdgpu] > > [ 0.860465] ? amdgpu_ttm_debugfs_init+0x110/0x110 [amdgpu] > > [ 0.860554] amdgpu_bo_create_kernel+0x36/0xa0 [amdgpu] > > [ 0.860641] amdgpu_ttm_init.cold+0x167/0x181 [amdgpu] > > [ 0.860784] gmc_v10_0_sw_init+0x2d7/0x430 [amdgpu] > > [ 0.860889] amdgpu_device_init.cold+0x147f/0x1ad7 [amdgpu] > > [ 0.861007] ? acpi_ns_get_node+0x4a/0x55 > > [ 0.861011] ? acpi_get_handle+0x89/0xb2 > > [ 0.861012] amdgpu_driver_load_kms+0x55/0x290 [amdgpu] > > [ 0.861098] amdgpu_pci_probe+0x181/0x250 [amdgpu] > > [ 0.861188] pci_device_probe+0xcd/0x140 > > [ 0.861191] really_probe+0xed/0x460 > > [ 0.861193] driver_probe_device+0xe3/0x150 > > [ 0.861195] device_driver_attach+0x9c/0xb0 > > [ 0.861196] __driver_attach+0x8a/0x150 > > [ 0.861197] ? device_driver_attach+0xb0/0xb0 > > [ 0.861198] ? 
device_driver_attach+0xb0/0xb0 > > [ 0.861198] bus_for_each_dev+0x73/0xb0 > > [ 0.861200] bus_add_driver+0x121/0x1e0 > > [ 0.861201] driver_register+0x8a/0xe0 > > [ 0.861202] ? 0xc1117000 > > [ 0.861203] do_one_initcall+0x47/0x180 > > [ 0.861205] ? do_init_module+0x19/0x230 > > [ 0.861208] ? kmem_cache_alloc+0x182/0x260 > > [ 0.861210] do_init_module+0x51/0x230 > &g
Re: amd-staging-drm-next breaks suspend
The WARN_ON is still triggered because gart.ptr is empty in amdgpu_gart_bind().

On 1/20/2022 10:56 AM, Chen, Guchun wrote:
> [Public]
>
> [ 1.310551] trying to bind memory to uninitialized GART !
>
> This is only a warning; it should not break suspend/resume. There is a fix
> for this on drm-next, "drm/amdgpu: remove gart.ready flag"; please give it a try.
> If you still observe the suspend issue, I guess it is caused by another regression.
> In that case, could you please bisect it?
>
> Regards,
> Guchun
>
> -----Original Message-----
> From: amd-gfx On Behalf Of Bert Karwatzki
> Sent: Thursday, January 20, 2022 5:52 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Chris Hixon; Zhuo, Qingqing (Lillian); Scott Bruce; Limonciello, Mario; Alex Deucher; Kazlauskas, Nicholas
> Subject: amd-staging-drm-next breaks suspend
>
> I just tested amd-staging-drm-next with HEAD
> f1b2924ee6929cb431440e6f961f06eb65d52beb:
> Going into suspend leads to a hang again:
> This is probably caused by
> [ 1.310551] trying to bind memory to uninitialized GART !
> and/or
> [ 3.976438] trying to bind memory to uninitialized GART !
>
> Here's the complete dmesg:
> [ 0.00] Linux version 5.13.0+ (bert@lisa) (gcc (Debian 11.2.0-14) 11.2.0, GNU ld (GNU Binutils for Debian) 2.37.50.20220106) #4 SMP Wed Jan 19 22:19:19 CET 2022
> [ 0.00] Command line: BOOT_IMAGE=/boot/vmlinuz-5.13.0+ root=UUID=78dcbf14-902d-49c0-9d4d-b7ad84550d9a ro mt7921e.disable_aspm=1 quiet
> [ 0.00] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
> [ 0.00] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
> [ 0.00] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
> [ 0.00] x86/fpu: Supporting XSAVE feature 0x200: 'Protection Keys User registers'
> [ 0.00] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
> [ 0.00] x86/fpu: xstate_offset[9]: 832, xstate_sizes[9]: 8
> [ 0.00] x86/fpu: Enabled xstate features 0x207, context size is 840 bytes, using 'compacted' format.
> [ 0.00] BIOS-provided physical RAM map: > [ 0.00] BIOS-e820: [mem 0x-0x0009] > usable > [ 0.00] BIOS-e820: [mem 0x000a-0x000f] > reserved > [ 0.00] BIOS-e820: [mem 0x0010-0x09bfefff] > usable > [ 0.00] BIOS-e820: [mem 0x09bff000-0x0a000fff] > reserved > [ 0.00] BIOS-e820: [mem 0x0a001000-0x0a1f] > usable > [ 0.00] BIOS-e820: [mem 0x0a20-0x0a20efff] ACPI > NVS > [ 0.00] BIOS-e820: [mem 0x0a20f000-0xe9e1] > usable > [ 0.00] BIOS-e820: [mem 0xe9e2-0xeb33efff] > reserved > [ 0.00] BIOS-e820: [mem 0xeb33f000-0xeb39efff] ACPI > data > [ 0.00] BIOS-e820: [mem 0xeb39f000-0xeb556fff] ACPI > NVS > [ 0.00] BIOS-e820: [mem 0xeb557000-0xed17cfff] > reserved > [ 0.00] BIOS-e820: [mem 0xed17d000-0xed1fefff] type > 20 > [ 0.00] BIOS-e820: [mem 0xed1ff000-0xedff] > usable > [ 0.00] BIOS-e820: [mem 0xee00-0xf7ff] > reserved > [ 0.00] BIOS-e820: [mem 0xfd00-0xfdff] > reserved > [ 0.00] BIOS-e820: [mem 0xfeb8-0xfec01fff] > reserved > [ 0.00] BIOS-e820: [mem 0xfec1-0xfec10fff] > reserved > [ 0.00] BIOS-e820: [mem 0xfed0-0xfed00fff] > reserved > [ 0.00] BIOS-e820: [mem 0xfed4-0xfed44fff] > reserved > [ 0.00] BIOS-e820: [mem 0xfed8-0xfed8] > reserved > [ 0.00] BIOS-e820: [mem 0xfedc4000-0xfedc9fff] > reserved > [ 0.00] BIOS-e820: [mem 0xfedcc000-0xfedcefff] > reserved > [ 0.00] BIOS-e820: [mem 0xfedd5000-0xfedd5fff] > reserved > [ 0.00] BIOS-e820: [mem 0xff00-0x] > reserved > [ 0.00] BIOS-e820: [mem 0x0001-0x0003ee2f] > usable > [ 0.00] BIOS-e820: [mem 0x0003ee30-0x00040fff] > reserved > [ 0.00] NX (Execute Disable) protection: active > [ 0.00] efi: EFI v2.70 by American Megatrends > [ 0.00] efi: ACPI=0xeb54 ACPI 2.0=0xeb540014 > TPMFinalLog=0xeb50c000 SMBIOS=0xed02 SMBIOS 3.0=0xed01f000 > MEMATTR=0xe6fa3018 ESRT=0xe87cb918 MOKvar=0xe6fa > [ 0.00] SMBIOS 3.3.0 present. > [ 0.00] DMI: Micro-Star International Co., Ltd. 
Alpha 15 B5EEK/MS- > 158L, BIOS E158LAMS.107 11/10/2021 > [ 0.00] tsc: Fast TSC calibration using PIT > [ 0.00] tsc: Detected 3194.034 MHz processor > [ 0.000125] e820: update [mem 0x-0x0fff] usable ==> > reserved > [ 0.000126] e820: remove [mem 0x000a-0x000f] usable > [ 0.000131] last_pfn = 0x3ee300 max_arch_pfn = 0x4 > [ 0.000363] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT > [ 0.000577] e820: update [mem
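The remark about gart.ptr at the top of this message can be pictured with a small user-space model. This is only a sketch under stated assumptions: it assumes that, once the ready flag is gone, the bind path still warns and fails when the table pointer is NULL. `struct gart_ptr_model`, `warn_on_model()` and `gart_bind_with_ptr_check()` are hypothetical names for illustration, not the driver's code or the kernel's WARN_ON macro, and the -EINVAL return value is likewise an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's GART state (not the real struct). */
struct gart_ptr_model {
	void *ptr; /* models gart.ptr: NULL until the table has been mapped */
};

/* Simplified user-space imitation of a WARN_ON-style check:
 * logs when the condition holds and returns the condition. */
static int warn_on_model(int cond, const char *what)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", what);
	return cond;
}

/* Assumed shape after the flag is removed: no ready flag,
 * but an empty table pointer still triggers the warning. */
int gart_bind_with_ptr_check(const struct gart_ptr_model *g)
{
	if (warn_on_model(g->ptr == NULL, "gart.ptr is empty"))
		return -EINVAL; /* assumed error code for this sketch */
	return 0; /* a real driver would write page-table entries here */
}
```

In this model, a bind attempted before the table is mapped still warns, which would be consistent with a warning that no longer carries the "uninitialized GART" text.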
RE: amd-staging-drm-next breaks suspend
[Public]

[ 1.310551] trying to bind memory to uninitialized GART !

This is only a warning; it should not break suspend/resume. There is a fix for this on drm-next, "drm/amdgpu: remove gart.ready flag"; please give it a try.
If you still observe the suspend issue, I guess it is caused by another regression. In that case, could you please bisect it?

Regards,
Guchun

-----Original Message-----
From: amd-gfx On Behalf Of Bert Karwatzki
Sent: Thursday, January 20, 2022 5:52 AM
To: amd-gfx@lists.freedesktop.org
Cc: Chris Hixon; Zhuo, Qingqing (Lillian); Scott Bruce; Limonciello, Mario; Alex Deucher; Kazlauskas, Nicholas
Subject: amd-staging-drm-next breaks suspend

I just tested amd-staging-drm-next with HEAD f1b2924ee6929cb431440e6f961f06eb65d52beb:
Going into suspend leads to a hang again:
This is probably caused by
[ 1.310551] trying to bind memory to uninitialized GART !
and/or
[ 3.976438] trying to bind memory to uninitialized GART !

Here's the complete dmesg:
[ 0.00] Linux version 5.13.0+ (bert@lisa) (gcc (Debian 11.2.0-14) 11.2.0, GNU ld (GNU Binutils for Debian) 2.37.50.20220106) #4 SMP Wed Jan 19 22:19:19 CET 2022
[ 0.00] Command line: BOOT_IMAGE=/boot/vmlinuz-5.13.0+ root=UUID=78dcbf14-902d-49c0-9d4d-b7ad84550d9a ro mt7921e.disable_aspm=1 quiet
[ 0.00] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[ 0.00] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[ 0.00] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[ 0.00] x86/fpu: Supporting XSAVE feature 0x200: 'Protection Keys User registers'
[ 0.00] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
[ 0.00] x86/fpu: xstate_offset[9]: 832, xstate_sizes[9]: 8
[ 0.00] x86/fpu: Enabled xstate features 0x207, context size is 840 bytes, using 'compacted' format.
[ 0.00] BIOS-provided physical RAM map: [ 0.00] BIOS-e820: [mem 0x-0x0009] usable [ 0.00] BIOS-e820: [mem 0x000a-0x000f] reserved [ 0.00] BIOS-e820: [mem 0x0010-0x09bfefff] usable [ 0.00] BIOS-e820: [mem 0x09bff000-0x0a000fff] reserved [ 0.00] BIOS-e820: [mem 0x0a001000-0x0a1f] usable [ 0.00] BIOS-e820: [mem 0x0a20-0x0a20efff] ACPI NVS [ 0.00] BIOS-e820: [mem 0x0a20f000-0xe9e1] usable [ 0.00] BIOS-e820: [mem 0xe9e2-0xeb33efff] reserved [ 0.00] BIOS-e820: [mem 0xeb33f000-0xeb39efff] ACPI data [ 0.00] BIOS-e820: [mem 0xeb39f000-0xeb556fff] ACPI NVS [ 0.00] BIOS-e820: [mem 0xeb557000-0xed17cfff] reserved [ 0.00] BIOS-e820: [mem 0xed17d000-0xed1fefff] type 20 [ 0.00] BIOS-e820: [mem 0xed1ff000-0xedff] usable [ 0.00] BIOS-e820: [mem 0xee00-0xf7ff] reserved [ 0.00] BIOS-e820: [mem 0xfd00-0xfdff] reserved [ 0.00] BIOS-e820: [mem 0xfeb8-0xfec01fff] reserved [ 0.00] BIOS-e820: [mem 0xfec1-0xfec10fff] reserved [ 0.00] BIOS-e820: [mem 0xfed0-0xfed00fff] reserved [ 0.00] BIOS-e820: [mem 0xfed4-0xfed44fff] reserved [ 0.00] BIOS-e820: [mem 0xfed8-0xfed8] reserved [ 0.00] BIOS-e820: [mem 0xfedc4000-0xfedc9fff] reserved [ 0.00] BIOS-e820: [mem 0xfedcc000-0xfedcefff] reserved [ 0.00] BIOS-e820: [mem 0xfedd5000-0xfedd5fff] reserved [ 0.00] BIOS-e820: [mem 0xff00-0x] reserved [ 0.00] BIOS-e820: [mem 0x0001-0x0003ee2f] usable [ 0.00] BIOS-e820: [mem 0x0003ee30-0x00040fff] reserved [ 0.00] NX (Execute Disable) protection: active [ 0.00] efi: EFI v2.70 by American Megatrends [ 0.00] efi: ACPI=0xeb54 ACPI 2.0=0xeb540014 TPMFinalLog=0xeb50c000 SMBIOS=0xed02 SMBIOS 3.0=0xed01f000 MEMATTR=0xe6fa3018 ESRT=0xe87cb918 MOKvar=0xe6fa [ 0.00] SMBIOS 3.3.0 present. [ 0.00] DMI: Micro-Star International Co., Ltd. 
Alpha 15 B5EEK/MS- 158L, BIOS E158LAMS.107 11/10/2021 [ 0.00] tsc: Fast TSC calibration using PIT [ 0.00] tsc: Detected 3194.034 MHz processor [ 0.000125] e820: update [mem 0x-0x0fff] usable ==> reserved [ 0.000126] e820: remove [mem 0x000a-0x000f] usable [ 0.000131] last_pfn = 0x3ee300 max_arch_pfn = 0x4 [ 0.000363] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT [ 0.000577] e820: update [mem 0xf000-0x] usable ==> reserved [ 0.000582] last_pfn = 0xee000 max_arch_pfn = 0x4 [ 0.003213] esrt: Reserving ESRT space from 0xe87cb918 to 0xe87cb950. [ 0.003217] e820: update [mem 0xe87cb000-0xe87cbfff] usable ==> reserved [ 0.003225] e820: update [mem 0xe6fa-0xe6fa2fff] usable ==> reserved [ 0.003235] Using GB pages for direct mapping [
RE: amd-staging-drm-next breaks suspend
[Public]

This should fix the issue by getting rid of the unneeded flag check during gart bind:
https://patchwork.freedesktop.org/patch/469907/

Thanks,

Jon

> -----Original Message-----
> From: amd-gfx On Behalf Of Bert Karwatzki
> Sent: January 19, 2022 8:12 PM
> To: Alex Deucher
> Cc: Chris Hixon; Zhuo, Qingqing (Lillian); Das, Nirmoy; amd-gfx@lists.freedesktop.org; Scott Bruce; Limonciello, Mario; Kazlauskas, Nicholas
> Subject: Re: amd-staging-drm-next breaks suspend
>
> [CAUTION: External Email]
>
> Unfortunately this does not work either:
>
> [0.859998] [ cut here ]
> [0.859998] trying to bind memory to uninitialized GART !
> [0.860003] WARNING: CPU: 13 PID: 235 at drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c:254 amdgpu_gart_bind+0x29/0x40 [amdgpu]
> [0.860099] Modules linked in: amdgpu(+) drm_ttm_helper ttm gpu_sched i2c_algo_bit drm_kms_helper syscopyarea hid_sensor_hub sysfillrect mfd_core sysimgblt hid_generic fb_sys_fops cec xhci_pci xhci_hcd nvme drm r8169 nvme_core psmouse crc32c_intel realtek amd_sfh usbcore i2c_hid_acpi mdio_devres t10_pi crc_t10dif i2c_hid i2c_piix4 crct10dif_generic libphy crct10dif_common hid backlight i2c_designware_platform i2c_designware_core
> [0.860113] CPU: 13 PID: 235 Comm: systemd-udevd Not tainted 5.13.0+ #15
> [0.860115] Hardware name: Micro-Star International Co., Ltd.
Alpha > 15 B5EEK/MS-158L, BIOS E158LAMS.107 11/10/2021 > [0.860116] RIP: 0010:amdgpu_gart_bind+0x29/0x40 [amdgpu] > [0.860210] Code: 00 80 bf 34 25 00 00 00 74 14 4c 8b 8f 20 25 00 00 > 4d 85 c9 74 05 e9 16 ff ff ff 31 c0 c3 48 c7 c7 08 06 7d c0 e8 8e cc 31 > e2 <0f> 0b b8 ea ff ff ff c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 > [0.860212] RSP: 0018:bb9e80b6f968 EFLAGS: 00010286 > [0.860213] RAX: RBX: 0067 RCX: > a3080968 > [0.860214] RDX: RSI: efff RDI: > a3028960 > [0.860215] RBP: 947c91e49a80 R08: R09: > bb9e80b6f798 > [0.860215] R10: bb9e80b6f790 R11: a30989a8 R12: > > [0.860216] R13: 947c8a74 R14: 947c8a74 R15: > > [0.860216] FS: 7f60a3c918c0() GS:947f5e94() > knlGS: > [0.860217] CS: 0010 DS: ES: CR0: 80050033 > [0.860218] CR2: 7f60a4213480 CR3: 000135ee2000 CR4: > 00550ee0 > [0.860218] PKRU: 5554 > [0.860219] Call Trace: > [0.860221] amdgpu_ttm_gart_bind+0x74/0xc0 [amdgpu] > [0.860305] amdgpu_ttm_alloc_gart+0x13e/0x190 [amdgpu] > [0.860385] amdgpu_bo_create_reserved.part.0+0xf3/0x1b0 [amdgpu] > [0.860465] ? amdgpu_ttm_debugfs_init+0x110/0x110 [amdgpu] > [0.860554] amdgpu_bo_create_kernel+0x36/0xa0 [amdgpu] > [0.860641] amdgpu_ttm_init.cold+0x167/0x181 [amdgpu] > [0.860784] gmc_v10_0_sw_init+0x2d7/0x430 [amdgpu] > [0.860889] amdgpu_device_init.cold+0x147f/0x1ad7 [amdgpu] > [0.861007] ? acpi_ns_get_node+0x4a/0x55 > [0.861011] ? acpi_get_handle+0x89/0xb2 > [0.861012] amdgpu_driver_load_kms+0x55/0x290 [amdgpu] > [0.861098] amdgpu_pci_probe+0x181/0x250 [amdgpu] > [0.861188] pci_device_probe+0xcd/0x140 > [0.861191] really_probe+0xed/0x460 > [0.861193] driver_probe_device+0xe3/0x150 > [0.861195] device_driver_attach+0x9c/0xb0 > [0.861196] __driver_attach+0x8a/0x150 > [0.861197] ? device_driver_attach+0xb0/0xb0 > [0.861198] ? device_driver_attach+0xb0/0xb0 > [0.861198] bus_for_each_dev+0x73/0xb0 > [0.861200] bus_add_driver+0x121/0x1e0 > [0.861201] driver_register+0x8a/0xe0 > [0.861202] ? 0xc1117000 > [0.861203] do_one_initcall+0x47/0x180 > [0.861205] ? 
do_init_module+0x19/0x230 > [0.861208] ? kmem_cache_alloc+0x182/0x260 > [0.861210] do_init_module+0x51/0x230 > [0.861211] __do_sys_finit_module+0xb1/0x110 > [0.861213] do_syscall_64+0x40/0xb0 > [0.861216] entry_SYSCALL_64_after_hwframe+0x44/0xae > [0.861218] RIP: 0033:0x7f60a4149679 > [0.861220] Code: 48 8d 3d 9a a1 0c 00 0f 05 eb a5 66 0f 1f 44 00 00 > 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f > 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c7 57 0c 00 f7 d8 64 89 01 48 > [0.861221] RSP: 002b:7ffe25f17ea8 EFLAGS: 0246 ORIG_RAX: > 0139 > [0.861223] RAX: ffda RBX: 56004a10a660 RCX: > 7f60a4149679 > [0.861224] RDX: RSI: 7f60a42e9eed RDI: > 0016 > [0.861224] RBP: 0002 R08: R09: &g
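For readers following the thread, the flag check discussed above can be pictured with a small user-space model. This is a sketch under stated assumptions, not driver code: `struct gart_model`, `gart_bind_with_flag_check()` and `decode_imm32()` are hypothetical names. The only details taken from the thread are the gart.ready flag, the warning text, and the bytes `b8 ea ff ff ff` in the first Code: line of the trace, which load the 32-bit value 0xffffffea, i.e. -22 (-EINVAL on Linux).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's GART state (not the real struct). */
struct gart_model {
	bool ready; /* models the gart.ready flag mentioned in the thread */
};

/*
 * Model of a bind path guarded by a ready flag: binding before the flag
 * is set prints the warning seen in dmesg and fails with -EINVAL.
 */
int gart_bind_with_flag_check(const struct gart_model *g)
{
	if (!g->ready) {
		fprintf(stderr, "trying to bind memory to uninitialized GART !\n");
		return -EINVAL;
	}
	return 0; /* a real driver would write page-table entries here */
}

/* Interpret a 32-bit immediate (as in "b8 ea ff ff ff") as a signed value. */
int decode_imm32(uint32_t imm)
{
	return (int)(int32_t)imm;
}
```

In this model an early bind, made while the flag is still unset, logs the warning and returns -EINVAL; that would fit the "failed to bind 1 pages" error reported right after the trace elsewhere in the thread.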
Re: amd-staging-drm-next breaks suspend
Unfortunately this does not work either: [0.859998] [ cut here ] [0.859998] trying to bind memory to uninitialized GART ! [0.860003] WARNING: CPU: 13 PID: 235 at drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c:254 amdgpu_gart_bind+0x29/0x40 [amdgpu] [0.860099] Modules linked in: amdgpu(+) drm_ttm_helper ttm gpu_sched i2c_algo_bit drm_kms_helper syscopyarea hid_sensor_hub sysfillrect mfd_core sysimgblt hid_generic fb_sys_fops cec xhci_pci xhci_hcd nvme drm r8169 nvme_core psmouse crc32c_intel realtek amd_sfh usbcore i2c_hid_acpi mdio_devres t10_pi crc_t10dif i2c_hid i2c_piix4 crct10dif_generic libphy crct10dif_common hid backlight i2c_designware_platform i2c_designware_core [0.860113] CPU: 13 PID: 235 Comm: systemd-udevd Not tainted 5.13.0+ #15 [0.860115] Hardware name: Micro-Star International Co., Ltd. Alpha 15 B5EEK/MS-158L, BIOS E158LAMS.107 11/10/2021 [0.860116] RIP: 0010:amdgpu_gart_bind+0x29/0x40 [amdgpu] [0.860210] Code: 00 80 bf 34 25 00 00 00 74 14 4c 8b 8f 20 25 00 00 4d 85 c9 74 05 e9 16 ff ff ff 31 c0 c3 48 c7 c7 08 06 7d c0 e8 8e cc 31 e2 <0f> 0b b8 ea ff ff ff c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 [0.860212] RSP: 0018:bb9e80b6f968 EFLAGS: 00010286 [0.860213] RAX: RBX: 0067 RCX: a3080968 [0.860214] RDX: RSI: efff RDI: a3028960 [0.860215] RBP: 947c91e49a80 R08: R09: bb9e80b6f798 [0.860215] R10: bb9e80b6f790 R11: a30989a8 R12: [0.860216] R13: 947c8a74 R14: 947c8a74 R15: [0.860216] FS: 7f60a3c918c0() GS:947f5e94() knlGS: [0.860217] CS: 0010 DS: ES: CR0: 80050033 [0.860218] CR2: 7f60a4213480 CR3: 000135ee2000 CR4: 00550ee0 [0.860218] PKRU: 5554 [0.860219] Call Trace: [0.860221] amdgpu_ttm_gart_bind+0x74/0xc0 [amdgpu] [0.860305] amdgpu_ttm_alloc_gart+0x13e/0x190 [amdgpu] [0.860385] amdgpu_bo_create_reserved.part.0+0xf3/0x1b0 [amdgpu] [0.860465] ? 
amdgpu_ttm_debugfs_init+0x110/0x110 [amdgpu]
[0.860554] amdgpu_bo_create_kernel+0x36/0xa0 [amdgpu]
[0.860641] amdgpu_ttm_init.cold+0x167/0x181 [amdgpu]
[0.860784] gmc_v10_0_sw_init+0x2d7/0x430 [amdgpu]
[0.860889] amdgpu_device_init.cold+0x147f/0x1ad7 [amdgpu]
[0.861007] ? acpi_ns_get_node+0x4a/0x55
[0.861011] ? acpi_get_handle+0x89/0xb2
[0.861012] amdgpu_driver_load_kms+0x55/0x290 [amdgpu]
[0.861098] amdgpu_pci_probe+0x181/0x250 [amdgpu]
[0.861188] pci_device_probe+0xcd/0x140
[0.861191] really_probe+0xed/0x460
[0.861193] driver_probe_device+0xe3/0x150
[0.861195] device_driver_attach+0x9c/0xb0
[0.861196] __driver_attach+0x8a/0x150
[0.861197] ? device_driver_attach+0xb0/0xb0
[0.861198] ? device_driver_attach+0xb0/0xb0
[0.861198] bus_for_each_dev+0x73/0xb0
[0.861200] bus_add_driver+0x121/0x1e0
[0.861201] driver_register+0x8a/0xe0
[0.861202] ? 0xc1117000
[0.861203] do_one_initcall+0x47/0x180
[0.861205] ? do_init_module+0x19/0x230
[0.861208] ? kmem_cache_alloc+0x182/0x260
[0.861210] do_init_module+0x51/0x230
[0.861211] __do_sys_finit_module+0xb1/0x110
[0.861213] do_syscall_64+0x40/0xb0
[0.861216] entry_SYSCALL_64_after_hwframe+0x44/0xae
[0.861218] RIP: 0033:0x7f60a4149679
[0.861220] Code: 48 8d 3d 9a a1 0c 00 0f 05 eb a5 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c7 57 0c 00 f7 d8 64 89 01 48
[0.861221] RSP: 002b:7ffe25f17ea8 EFLAGS: 0246 ORIG_RAX: 0139
[0.861223] RAX: ffda RBX: 56004a10a660 RCX: 7f60a4149679
[0.861224] RDX: RSI: 7f60a42e9eed RDI: 0016
[0.861224] RBP: 0002 R08: R09: 56004a105980
[0.861225] R10: 0016 R11: 0246 R12: 7f60a42e9eed
[0.861225] R13: R14: 56004a0efdd0 R15: 56004a10a660
[0.861226] ---[ end trace 0319f26df48f8ef0 ]---
[0.861228] [drm:amdgpu_ttm_gart_bind [amdgpu]] *ERROR* failed to bind 1 pages at 0x0040
[0.861540] amdgpu :03:00.0: amdgpu: a9dfe17c bind failed

On Wednesday, 19.01.2022 at 19:54 -0500, Alex Deucher wrote:
> On Wed, Jan 19, 2022 at 7:48 PM Bert Karwatzki wrote:
> >
> > Bisected the error and found the first bad commit to be
> > d015e9861e55928a78137a2c95897bc50637fc47 is the first bad commit
> > commit d015e9861e55928a78137a2c95897bc50637fc47
> > Author: Jonathan Kim
> > Date: Thu Dec 9 16:48:56 2021 -0500
> >
> >     drm/amdgpu: improve debug VRAM access performance using sdma
> >
> >     For better performance during VRAM access for debugged processes, do
> >     read/write copies over SDMA.
> >
> >     In
Re: amd-staging-drm-next breaks suspend
On Wed, Jan 19, 2022 at 7:48 PM Bert Karwatzki wrote:
>
> Bisected the error and found the first bad commit to be
> d015e9861e55928a78137a2c95897bc50637fc47 is the first bad commit
> commit d015e9861e55928a78137a2c95897bc50637fc47
> Author: Jonathan Kim
> Date: Thu Dec 9 16:48:56 2021 -0500
>
>     drm/amdgpu: improve debug VRAM access performance using sdma
>
>     For better performance during VRAM access for debugged processes, do
>     read/write copies over SDMA.
>
>     In order to fulfill post mortem debugging on a broken device, fallback to
>     stable MMIO access when gpu recovery is disabled or when job submission
>     time outs are set to max. Failed SDMA access should automatically fall
>     back to MMIO access.
>
>     Use a pre-allocated GTT bounce buffer pre-mapped into GART to avoid
>     page-table updates and TLB flushes on access.
>
>     Signed-off-by: Jonathan Kim
>     Reviewed-by: Felix Kuehling
>
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 78 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 4 ++
>  2 files changed, 82 insertions(+)

Should be fixed with:
https://patchwork.freedesktop.org/patch/470069/

Alex

>
> On Thursday, 20.01.2022 at 00:22 +0100, Bert Karwatzki wrote:
> > Reverting commit 72f686438de13f121c52f58d7445570a33dfdc61 does not
> > change the errors:
> > [1.310550] [ cut here ]
> > [1.310551] trying to bind memory to uninitialized GART !
> > [1.310556] WARNING: CPU: 9 PID: 252 at > > drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c:254 > > amdgpu_gart_bind+0x2e/0x40 > > [amdgpu] > > [1.310659] Modules linked in: amdgpu(+) gpu_sched i2c_algo_bit > > drm_ttm_helper hid_sensor_hub ttm hid_generic nvme drm_kms_helper > > nvme_core cec xhci_pci t10_pi r8169 rc_core crc32_pclmul crc_t10dif > > i2c_hid_acpi realtek xhci_hcd psmouse crc32c_intel crct10dif_generic > > i2c_hid amd_sfh mdio_devres crct10dif_pclmul drm i2c_piix4 usbcore > > libphy crct10dif_common wmi button battery video fjes(-) hid > > [1.310672] CPU: 9 PID: 252 Comm: systemd-udevd Not tainted > > 5.13.0+ > > #4 > > [1.310673] Hardware name: Micro-Star International Co., Ltd. > > Alpha > > 15 B5EEK/MS-158L, BIOS E158LAMS.107 11/10/2021 > > [1.310674] RIP: 0010:amdgpu_gart_bind+0x2e/0x40 [amdgpu] > > [1.310762] Code: 00 80 bf 34 25 00 00 00 74 14 4c 8b 8f 20 25 00 > > 00 > > 4d 85 c9 74 05 e9 01 ff ff ff 31 c0 c3 48 c7 c7 68 36 dd c0 e8 86 db > > 19 > > e8 <0f> 0b b8 ea ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 > > 00 > > [1.310763] RSP: 0018:b19d00c33920 EFLAGS: 00010282 > > [1.310764] RAX: RBX: 0067 RCX: > > a9abb208 > > [1.310765] RDX: RSI: efff RDI: > > a9a63200 > > [1.310766] RBP: 985ce2a796c0 R08: R09: > > b19d00c33748 > > [1.310766] R10: b19d00c33740 R11: a9ad3248 R12: > > > > [1.310766] R13: 985cd45a R14: 985cd45a R15: > > > > [1.310767] FS: 7f69fabdc8c0() GS:985f9e64() > > knlGS: > > [1.310768] CS: 0010 DS: ES: CR0: 80050033 > > [1.310768] CR2: 7f69fabc5dca CR3: 0001139ec000 CR4: > > 00750ee0 > > [1.310769] PKRU: 5554 > > [1.310770] Call Trace: > > [1.310772] amdgpu_ttm_gart_bind+0x79/0xc0 [amdgpu] > > [1.310858] amdgpu_ttm_alloc_gart+0x146/0x1a0 [amdgpu] > > [1.310942] amdgpu_bo_create_reserved.part.0+0xf8/0x1b0 [amdgpu] > > [1.311025] ? 
amdgpu_ttm_debugfs_init+0x110/0x110 [amdgpu] > > [1.311145] amdgpu_bo_create_kernel+0x3b/0xa0 [amdgpu] > > [1.311229] amdgpu_ttm_init.cold+0x165/0x17f [amdgpu] > > [1.311349] gmc_v10_0_sw_init+0x2dc/0x430 [amdgpu] > > [1.311455] amdgpu_device_init.cold+0x1544/0x1b54 [amdgpu] > > [1.311570] ? acpi_ns_get_node+0x4f/0x5a > > [1.311574] ? acpi_get_handle+0x8e/0xb7 > > [1.311576] amdgpu_driver_load_kms+0x67/0x320 [amdgpu] > > [1.311664] amdgpu_pci_probe+0x1bc/0x290 [amdgpu] > > [1.311750] local_pci_probe+0x42/0x80 > > [1.311753] ? __cond_resched+0x16/0x40 > > [1.311755] pci_device_probe+0xfd/0x1b0 > > [1.311756] really_probe+0xf2/0x460 > > [1.311759] driver_probe_device+0xe8/0x160 > > [1.311760] device_driver_attach+0xa1/0xb0 > > [1.311761] __driver_attach+0x8f/0x150 > > [1.311763] ? device_driver_attach+0xb0/0xb0 > > [1.311764] ? device_driver_attach+0xb0/0xb0 > > [1.311765] bus_for_each_dev+0x78/0xc0 > > [1.311766] bus_add_driver+0x12b/0x1e0 > > [1.311768] driver_register+0x8f/0xe0 > > [1.311769] ? 0xc1828000 > > [1.311770] do_one_initcall+0x44/0x1d0 > > [1.311772] ? kmem_cache_alloc_trace+0x103/0x240 > > [1.311775] do_init_module+0x5c/0x270 > > [1.311777] __do_sys_finit_module+0xb1/0x110 > > [1.311779]
Re: amd-staging-drm-next breaks suspend
Bisected the error and found the first bad commit to be
d015e9861e55928a78137a2c95897bc50637fc47 is the first bad commit
commit d015e9861e55928a78137a2c95897bc50637fc47
Author: Jonathan Kim
Date: Thu Dec 9 16:48:56 2021 -0500

    drm/amdgpu: improve debug VRAM access performance using sdma

    For better performance during VRAM access for debugged processes, do
    read/write copies over SDMA.

    In order to fulfill post mortem debugging on a broken device, fallback to
    stable MMIO access when gpu recovery is disabled or when job submission
    time outs are set to max. Failed SDMA access should automatically fall
    back to MMIO access.

    Use a pre-allocated GTT bounce buffer pre-mapped into GART to avoid
    page-table updates and TLB flushes on access.

    Signed-off-by: Jonathan Kim
    Reviewed-by: Felix Kuehling

 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 78 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 4 ++
 2 files changed, 82 insertions(+)

On Thursday, 20.01.2022 at 00:22 +0100, Bert Karwatzki wrote:
> Reverting commit 72f686438de13f121c52f58d7445570a33dfdc61 does not
> change the errors:
> [ 1.310550] [ cut here ]
> [ 1.310551] trying to bind memory to uninitialized GART !
> [ 1.310556] WARNING: CPU: 9 PID: 252 at drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c:254 amdgpu_gart_bind+0x2e/0x40 [amdgpu]
> [ 1.310659] Modules linked in: amdgpu(+) gpu_sched i2c_algo_bit drm_ttm_helper hid_sensor_hub ttm hid_generic nvme drm_kms_helper nvme_core cec xhci_pci t10_pi r8169 rc_core crc32_pclmul crc_t10dif i2c_hid_acpi realtek xhci_hcd psmouse crc32c_intel crct10dif_generic i2c_hid amd_sfh mdio_devres crct10dif_pclmul drm i2c_piix4 usbcore libphy crct10dif_common wmi button battery video fjes(-) hid
> [ 1.310672] CPU: 9 PID: 252 Comm: systemd-udevd Not tainted 5.13.0+ #4
> [ 1.310673] Hardware name: Micro-Star International Co., Ltd.
> Alpha > 15 B5EEK/MS-158L, BIOS E158LAMS.107 11/10/2021 > [ 1.310674] RIP: 0010:amdgpu_gart_bind+0x2e/0x40 [amdgpu] > [ 1.310762] Code: 00 80 bf 34 25 00 00 00 74 14 4c 8b 8f 20 25 00 > 00 > 4d 85 c9 74 05 e9 01 ff ff ff 31 c0 c3 48 c7 c7 68 36 dd c0 e8 86 db > 19 > e8 <0f> 0b b8 ea ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 > 00 > [ 1.310763] RSP: 0018:b19d00c33920 EFLAGS: 00010282 > [ 1.310764] RAX: RBX: 0067 RCX: > a9abb208 > [ 1.310765] RDX: RSI: efff RDI: > a9a63200 > [ 1.310766] RBP: 985ce2a796c0 R08: R09: > b19d00c33748 > [ 1.310766] R10: b19d00c33740 R11: a9ad3248 R12: > > [ 1.310766] R13: 985cd45a R14: 985cd45a R15: > > [ 1.310767] FS: 7f69fabdc8c0() GS:985f9e64() > knlGS: > [ 1.310768] CS: 0010 DS: ES: CR0: 80050033 > [ 1.310768] CR2: 7f69fabc5dca CR3: 0001139ec000 CR4: > 00750ee0 > [ 1.310769] PKRU: 5554 > [ 1.310770] Call Trace: > [ 1.310772] amdgpu_ttm_gart_bind+0x79/0xc0 [amdgpu] > [ 1.310858] amdgpu_ttm_alloc_gart+0x146/0x1a0 [amdgpu] > [ 1.310942] amdgpu_bo_create_reserved.part.0+0xf8/0x1b0 [amdgpu] > [ 1.311025] ? amdgpu_ttm_debugfs_init+0x110/0x110 [amdgpu] > [ 1.311145] amdgpu_bo_create_kernel+0x3b/0xa0 [amdgpu] > [ 1.311229] amdgpu_ttm_init.cold+0x165/0x17f [amdgpu] > [ 1.311349] gmc_v10_0_sw_init+0x2dc/0x430 [amdgpu] > [ 1.311455] amdgpu_device_init.cold+0x1544/0x1b54 [amdgpu] > [ 1.311570] ? acpi_ns_get_node+0x4f/0x5a > [ 1.311574] ? acpi_get_handle+0x8e/0xb7 > [ 1.311576] amdgpu_driver_load_kms+0x67/0x320 [amdgpu] > [ 1.311664] amdgpu_pci_probe+0x1bc/0x290 [amdgpu] > [ 1.311750] local_pci_probe+0x42/0x80 > [ 1.311753] ? __cond_resched+0x16/0x40 > [ 1.311755] pci_device_probe+0xfd/0x1b0 > [ 1.311756] really_probe+0xf2/0x460 > [ 1.311759] driver_probe_device+0xe8/0x160 > [ 1.311760] device_driver_attach+0xa1/0xb0 > [ 1.311761] __driver_attach+0x8f/0x150 > [ 1.311763] ? device_driver_attach+0xb0/0xb0 > [ 1.311764] ? 
device_driver_attach+0xb0/0xb0 > [ 1.311765] bus_for_each_dev+0x78/0xc0 > [ 1.311766] bus_add_driver+0x12b/0x1e0 > [ 1.311768] driver_register+0x8f/0xe0 > [ 1.311769] ? 0xc1828000 > [ 1.311770] do_one_initcall+0x44/0x1d0 > [ 1.311772] ? kmem_cache_alloc_trace+0x103/0x240 > [ 1.311775] do_init_module+0x5c/0x270 > [ 1.311777] __do_sys_finit_module+0xb1/0x110 > [ 1.311779] do_syscall_64+0x40/0xb0 > [ 1.311781] entry_SYSCALL_64_after_hwframe+0x44/0xae > [ 1.311783] RIP: 0033:0x7f69fb094679 > [ 1.311785] Code: 48 8d 3d 9a a1 0c 00 0f 05 eb a5 66 0f 1f 44 00 > 00 > 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 > 0f > 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c7 57 0c 00 f7 d8 64 89 01 > 48 > [
Re: amd-staging-drm-next breaks suspend
Reverting commit 72f686438de13f121c52f58d7445570a33dfdc61 does not change the errors: [1.310550] [ cut here ] [1.310551] trying to bind memory to uninitialized GART ! [1.310556] WARNING: CPU: 9 PID: 252 at drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c:254 amdgpu_gart_bind+0x2e/0x40 [amdgpu] [1.310659] Modules linked in: amdgpu(+) gpu_sched i2c_algo_bit drm_ttm_helper hid_sensor_hub ttm hid_generic nvme drm_kms_helper nvme_core cec xhci_pci t10_pi r8169 rc_core crc32_pclmul crc_t10dif i2c_hid_acpi realtek xhci_hcd psmouse crc32c_intel crct10dif_generic i2c_hid amd_sfh mdio_devres crct10dif_pclmul drm i2c_piix4 usbcore libphy crct10dif_common wmi button battery video fjes(-) hid [1.310672] CPU: 9 PID: 252 Comm: systemd-udevd Not tainted 5.13.0+ #4 [1.310673] Hardware name: Micro-Star International Co., Ltd. Alpha 15 B5EEK/MS-158L, BIOS E158LAMS.107 11/10/2021 [1.310674] RIP: 0010:amdgpu_gart_bind+0x2e/0x40 [amdgpu] [1.310762] Code: 00 80 bf 34 25 00 00 00 74 14 4c 8b 8f 20 25 00 00 4d 85 c9 74 05 e9 01 ff ff ff 31 c0 c3 48 c7 c7 68 36 dd c0 e8 86 db 19 e8 <0f> 0b b8 ea ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 [1.310763] RSP: 0018:b19d00c33920 EFLAGS: 00010282 [1.310764] RAX: RBX: 0067 RCX: a9abb208 [1.310765] RDX: RSI: efff RDI: a9a63200 [1.310766] RBP: 985ce2a796c0 R08: R09: b19d00c33748 [1.310766] R10: b19d00c33740 R11: a9ad3248 R12: [1.310766] R13: 985cd45a R14: 985cd45a R15: [1.310767] FS: 7f69fabdc8c0() GS:985f9e64() knlGS: [1.310768] CS: 0010 DS: ES: CR0: 80050033 [1.310768] CR2: 7f69fabc5dca CR3: 0001139ec000 CR4: 00750ee0 [1.310769] PKRU: 5554 [1.310770] Call Trace: [1.310772] amdgpu_ttm_gart_bind+0x79/0xc0 [amdgpu] [1.310858] amdgpu_ttm_alloc_gart+0x146/0x1a0 [amdgpu] [1.310942] amdgpu_bo_create_reserved.part.0+0xf8/0x1b0 [amdgpu] [1.311025] ? 
amdgpu_ttm_debugfs_init+0x110/0x110 [amdgpu] [1.311145] amdgpu_bo_create_kernel+0x3b/0xa0 [amdgpu] [1.311229] amdgpu_ttm_init.cold+0x165/0x17f [amdgpu] [1.311349] gmc_v10_0_sw_init+0x2dc/0x430 [amdgpu] [1.311455] amdgpu_device_init.cold+0x1544/0x1b54 [amdgpu] [1.311570] ? acpi_ns_get_node+0x4f/0x5a [1.311574] ? acpi_get_handle+0x8e/0xb7 [1.311576] amdgpu_driver_load_kms+0x67/0x320 [amdgpu] [1.311664] amdgpu_pci_probe+0x1bc/0x290 [amdgpu] [1.311750] local_pci_probe+0x42/0x80 [1.311753] ? __cond_resched+0x16/0x40 [1.311755] pci_device_probe+0xfd/0x1b0 [1.311756] really_probe+0xf2/0x460 [1.311759] driver_probe_device+0xe8/0x160 [1.311760] device_driver_attach+0xa1/0xb0 [1.311761] __driver_attach+0x8f/0x150 [1.311763] ? device_driver_attach+0xb0/0xb0 [1.311764] ? device_driver_attach+0xb0/0xb0 [1.311765] bus_for_each_dev+0x78/0xc0 [1.311766] bus_add_driver+0x12b/0x1e0 [1.311768] driver_register+0x8f/0xe0 [1.311769] ? 0xc1828000 [1.311770] do_one_initcall+0x44/0x1d0 [1.311772] ? kmem_cache_alloc_trace+0x103/0x240 [1.311775] do_init_module+0x5c/0x270 [1.311777] __do_sys_finit_module+0xb1/0x110 [1.311779] do_syscall_64+0x40/0xb0 [1.311781] entry_SYSCALL_64_after_hwframe+0x44/0xae [1.311783] RIP: 0033:0x7f69fb094679 [1.311785] Code: 48 8d 3d 9a a1 0c 00 0f 05 eb a5 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c7 57 0c 00 f7 d8 64 89 01 48 [1.311786] RSP: 002b:7ffce4131708 EFLAGS: 0246 ORIG_RAX: 0139 [1.311788] RAX: ffda RBX: 55d71350a3a0 RCX: 7f69fb094679 [1.311788] RDX: RSI: 7f69fb234eed RDI: 0013 [1.311789] RBP: 0002 R08: R09: 55d7134f3930 [1.311789] R10: 0013 R11: 0246 R12: 7f69fb234eed [1.311790] R13: R14: 55d7134da0f0 R15: 55d71350a3a0 [1.311791] ---[ end trace ff47998e3140e95d ]--- [1.311793] [drm:amdgpu_ttm_gart_bind [amdgpu]] *ERROR* failed to bind 1 pages at 0x0040 [1.312100] amdgpu :03:00.0: amdgpu: 989bdfac bind failed and using https://patchwork.freedesktop.org/patch/469907/ 
gives this message: [1.311502] [ cut here ] [1.311502] WARNING: CPU: 9 PID: 221 at drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c:244 amdgpu_gart_bind+0x16/0x20 [amdgpu] [1.311602] Modules linked in: amdgpu(+) gpu_sched i2c_algo_bit drm_ttm_helper hid_sensor_hub ttm hid_generic nvme xhci_pci drm_kms_helper nvme_core t10_pi xhci_hcd crc_t10dif r8169 cec crct10dif_generic i2c_hid_acpi amd_sfh rc_core crct10dif_pclmul realtek i2c_hid crc32_pclmul
Re: amd-staging-drm-next breaks suspend
On 1/19/2022 10:59 PM, Limonciello, Mario wrote:
[Public]

-----Original Message-----
From: Bert Karwatzki
Sent: Wednesday, January 19, 2022 15:52
To: amd-gfx@lists.freedesktop.org
Cc: Limonciello, Mario; Kazlauskas, Nicholas; Zhuo, Qingqing (Lillian); Scott Bruce; Alex Deucher; Chris Hixon
Subject: amd-staging-drm-next breaks suspend

I just tested amd-staging-drm-next with HEAD f1b2924ee6929cb431440e6f961f06eb65d52beb:
Going into suspend leads to a hang again:
This is probably caused by
[ 1.310551] trying to bind memory to uninitialized GART !
and/or
[ 3.976438] trying to bind memory to uninitialized GART !

Could you please also try https://patchwork.freedesktop.org/patch/469907/ ?

Regards,
Nirmoy

+@Das, Nirmoy
The only thing that touched that file recently was 72f686438de13f121c52f58d7445570a33dfdc61
Could you see if backing that out helps?

Here's the complete dmesg:
[ 0.00] Linux version 5.13.0+ (bert@lisa) (gcc (Debian 11.2.0-14) 11.2.0, GNU ld (GNU Binutils for Debian) 2.37.50.20220106) #4 SMP Wed Jan 19 22:19:19 CET 2022 [ 0.00] Command line: BOOT_IMAGE=/boot/vmlinuz-5.13.0+ root=UUID=78dcbf14-902d-49c0-9d4d-b7ad84550d9a ro mt7921e.disable_aspm=1 quiet [ 0.00] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers' [ 0.00] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers' [ 0.00] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers' [ 0.00] x86/fpu: Supporting XSAVE feature 0x200: 'Protection Keys User registers' [ 0.00] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256 [ 0.00] x86/fpu: xstate_offset[9]: 832, xstate_sizes[9]: 8 [ 0.00] x86/fpu: Enabled xstate features 0x207, context size is 840 bytes, using 'compacted' format.
[ 0.00] BIOS-provided physical RAM map: [ 0.00] BIOS-e820: [mem 0x-0x0009] usable [ 0.00] BIOS-e820: [mem 0x000a-0x000f] reserved [ 0.00] BIOS-e820: [mem 0x0010-0x09bfefff] usable [ 0.00] BIOS-e820: [mem 0x09bff000-0x0a000fff] reserved [ 0.00] BIOS-e820: [mem 0x0a001000-0x0a1f] usable [ 0.00] BIOS-e820: [mem 0x0a20-0x0a20efff] ACPI NVS [ 0.00] BIOS-e820: [mem 0x0a20f000-0xe9e1] usable [ 0.00] BIOS-e820: [mem 0xe9e2-0xeb33efff] reserved [ 0.00] BIOS-e820: [mem 0xeb33f000-0xeb39efff] ACPI data [ 0.00] BIOS-e820: [mem 0xeb39f000-0xeb556fff] ACPI NVS [ 0.00] BIOS-e820: [mem 0xeb557000-0xed17cfff] reserved [ 0.00] BIOS-e820: [mem 0xed17d000-0xed1fefff] type 20 [ 0.00] BIOS-e820: [mem 0xed1ff000-0xedff] usable [ 0.00] BIOS-e820: [mem 0xee00-0xf7ff] reserved [ 0.00] BIOS-e820: [mem 0xfd00-0xfdff] reserved [ 0.00] BIOS-e820: [mem 0xfeb8-0xfec01fff] reserved [ 0.00] BIOS-e820: [mem 0xfec1-0xfec10fff] reserved [ 0.00] BIOS-e820: [mem 0xfed0-0xfed00fff] reserved [ 0.00] BIOS-e820: [mem 0xfed4-0xfed44fff] reserved [ 0.00] BIOS-e820: [mem 0xfed8-0xfed8] reserved [ 0.00] BIOS-e820: [mem 0xfedc4000-0xfedc9fff] reserved [ 0.00] BIOS-e820: [mem 0xfedcc000-0xfedcefff] reserved [ 0.00] BIOS-e820: [mem 0xfedd5000-0xfedd5fff] reserved [ 0.00] BIOS-e820: [mem 0xff00-0x] reserved [ 0.00] BIOS-e820: [mem 0x0001-0x0003ee2f] usable [ 0.00] BIOS-e820: [mem 0x0003ee30-0x00040fff] reserved [ 0.00] NX (Execute Disable) protection: active [ 0.00] efi: EFI v2.70 by American Megatrends [ 0.00] efi: ACPI=0xeb54 ACPI 2.0=0xeb540014 TPMFinalLog=0xeb50c000 SMBIOS=0xed02 SMBIOS 3.0=0xed01f000 MEMATTR=0xe6fa3018 ESRT=0xe87cb918 MOKvar=0xe6fa [ 0.00] SMBIOS 3.3.0 present. [ 0.00] DMI: Micro-Star International Co., Ltd. 
Alpha 15 B5EEK/MS- 158L, BIOS E158LAMS.107 11/10/2021 [ 0.00] tsc: Fast TSC calibration using PIT [ 0.00] tsc: Detected 3194.034 MHz processor [ 0.000125] e820: update [mem 0x-0x0fff] usable ==> reserved [ 0.000126] e820: remove [mem 0x000a-0x000f] usable [ 0.000131] last_pfn = 0x3ee300 max_arch_pfn = 0x4 [ 0.000363] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT [ 0.000577] e820: update [mem 0xf000-0x] usable ==> reserved [ 0.000582] last_pfn = 0xee000 max_arch_pfn = 0x4 [ 0.003213] esrt: Reserving ESRT space from 0xe87cb918 to 0xe87cb950. [ 0.003217] e820: update [mem 0xe87cb000-0xe87cbfff] usable ==> reserved [ 0.003225] e820: update [mem 0xe6fa-0xe6fa2fff] usable ==> reserved [ 0.003235] Using GB pages for direct mapping [ 0.003498] Secure boot disabled [ 0.003499] RAMDISK: [mem
RE: amd-staging-drm-next breaks suspend
[Public]

> -----Original Message-----
> From: Bert Karwatzki
> Sent: Wednesday, January 19, 2022 15:52
> To: amd-gfx@lists.freedesktop.org
> Cc: Limonciello, Mario; Kazlauskas, Nicholas; Zhuo, Qingqing (Lillian); Scott Bruce; Alex Deucher; Chris Hixon
> Subject: amd-staging-drm-next breaks suspend
>
> I just tested amd-staging-drm-next with HEAD
> f1b2924ee6929cb431440e6f961f06eb65d52beb:
> Going into suspend leads to a hang again:
> This is probably caused by
> [ 1.310551] trying to bind memory to uninitialized GART !
> and/or
> [ 3.976438] trying to bind memory to uninitialized GART !
>

+@Das, Nirmoy

The only thing that touched that file recently was 72f686438de13f121c52f58d7445570a33dfdc61
Could you see if backing that out helps?

> Here's the complete dmesg:
> [ 0.00] Linux version 5.13.0+ (bert@lisa) (gcc (Debian 11.2.0-14) > 11.2.0, GNU ld (GNU Binutils for Debian) 2.37.50.20220106) #4 SMP Wed > Jan 19 22:19:19 CET 2022 > [ 0.00] Command line: BOOT_IMAGE=/boot/vmlinuz-5.13.0+ > root=UUID=78dcbf14-902d-49c0-9d4d-b7ad84550d9a ro > mt7921e.disable_aspm=1 quiet > [ 0.00] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating > point registers' > [ 0.00] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers' > [ 0.00] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers' > [ 0.00] x86/fpu: Supporting XSAVE feature 0x200: 'Protection Keys > User registers' > [ 0.00] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256 > [ 0.00] x86/fpu: xstate_offset[9]: 832, xstate_sizes[9]: 8 > [ 0.00] x86/fpu: Enabled xstate features 0x207, context size is 840 > bytes, using 'compacted' format.
> [ 0.00] BIOS-provided physical RAM map: > [ 0.00] BIOS-e820: [mem 0x-0x0009] > usable > [ 0.00] BIOS-e820: [mem 0x000a-0x000f] > reserved > [ 0.00] BIOS-e820: [mem 0x0010-0x09bfefff] > usable > [ 0.00] BIOS-e820: [mem 0x09bff000-0x0a000fff] > reserved > [ 0.00] BIOS-e820: [mem 0x0a001000-0x0a1f] > usable > [ 0.00] BIOS-e820: [mem 0x0a20-0x0a20efff] ACPI > NVS > [ 0.00] BIOS-e820: [mem 0x0a20f000-0xe9e1] > usable > [ 0.00] BIOS-e820: [mem 0xe9e2-0xeb33efff] > reserved > [ 0.00] BIOS-e820: [mem 0xeb33f000-0xeb39efff] ACPI > data > [ 0.00] BIOS-e820: [mem 0xeb39f000-0xeb556fff] ACPI > NVS > [ 0.00] BIOS-e820: [mem 0xeb557000-0xed17cfff] > reserved > [ 0.00] BIOS-e820: [mem 0xed17d000-0xed1fefff] type > 20 > [ 0.00] BIOS-e820: [mem 0xed1ff000-0xedff] > usable > [ 0.00] BIOS-e820: [mem 0xee00-0xf7ff] > reserved > [ 0.00] BIOS-e820: [mem 0xfd00-0xfdff] > reserved > [ 0.00] BIOS-e820: [mem 0xfeb8-0xfec01fff] > reserved > [ 0.00] BIOS-e820: [mem 0xfec1-0xfec10fff] > reserved > [ 0.00] BIOS-e820: [mem 0xfed0-0xfed00fff] > reserved > [ 0.00] BIOS-e820: [mem 0xfed4-0xfed44fff] > reserved > [ 0.00] BIOS-e820: [mem 0xfed8-0xfed8] > reserved > [ 0.00] BIOS-e820: [mem 0xfedc4000-0xfedc9fff] > reserved > [ 0.00] BIOS-e820: [mem 0xfedcc000-0xfedcefff] > reserved > [ 0.00] BIOS-e820: [mem 0xfedd5000-0xfedd5fff] > reserved > [ 0.00] BIOS-e820: [mem 0xff00-0x] > reserved > [ 0.00] BIOS-e820: [mem 0x0001-0x0003ee2f] > usable > [ 0.00] BIOS-e820: [mem 0x0003ee30-0x00040fff] > reserved > [ 0.00] NX (Execute Disable) protection: active > [ 0.00] efi: EFI v2.70 by American Megatrends > [ 0.00] efi: ACPI=0xeb54 ACPI 2.0=0xeb540014 > TPMFinalLog=0xeb50c000 SMBIOS=0xed02 SMBIOS 3.0=0xed01f000 > MEMATTR=0xe6fa3018 ESRT=0xe87cb918 MOKvar=0xe6fa > [ 0.00] SMBIOS 3.3.0 present. > [ 0.00] DMI: Micro-Star International Co., Ltd. 
Alpha 15 B5EEK/MS- > 158L, BIOS E158LAMS.107 11/10/2021 > [ 0.00] tsc: Fast TSC calibration using PIT > [ 0.00] tsc: Detected 3194.034 MHz processor > [ 0.000125] e820: update [mem 0x-0x0fff] usable ==> > reserved > [ 0.000126] e820: remove [mem 0x000a-0x000f] usable > [ 0.000131] last_pfn = 0x3ee300 max_arch_pfn = 0x4 > [ 0.000363] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT > [ 0.000577] e820: update [mem 0xf000-0x] usable ==> > reserved > [ 0.000582] last_pfn = 0xee000 max_arch_pfn = 0x4 > [ 0.003213] esrt: Reserving ESRT space from 0xe87cb918 to > 0xe87cb950. > [ 0.003217] e820: update [mem 0xe87cb000-0xe87cbfff] usable ==> > reserved > [ 0.003225] e820: update [mem 0xe6fa-0xe6fa2fff] usable ==> > reserved > [ 0.003235] Using GB pages for