Re: [Bug 215958] New: thunderbolt3 egpu cannot disconnect cleanly
On 2022-05-09 14:03, Deucher, Alexander wrote:
> [Public]
>
> > -----Original Message-----
> > From: Bjorn Helgaas
> > Sent: Monday, May 9, 2022 12:23 PM
> > To: Linux PCI
> > Cc: r087...@yahoo.it; Deucher, Alexander; Koenig, Christian; Pan, Xinhui; amd-gfx mailing list; dri-devel
> > Subject: Re: [Bug 215958] New: thunderbolt3 egpu cannot disconnect cleanly
> >
> > On Sun, May 8, 2022 at 3:29 PM wrote:
> > > https://bugzilla.kernel.org/show_bug.cgi?id=215958
> > >
> > > Bug ID: 215958
> > > Summary: thunderbolt3 egpu cannot disconnect cleanly
> > > Product: Drivers
> > > Version: 2.5
> > > Kernel Version: 5.17.0-1003-oem #3-Ubuntu SMP PREEMPT
> > > Hardware: All
> > > OS: Linux
> > > Tree: Mainline
> > > Status: NEW
> > > Severity: normal
> > > Priority: P1
> > > Component: PCI
> > > Assignee: drivers_...@kernel-bugs.osdl.org
> > > Reporter: r087...@yahoo.it
> > > Regression: No
> >
> > I assume this is not a regression, right? If it is a regression, what
> > previous kernel worked correctly?
> >
> > > I have an external egpu (Radeon 6600 RX) connected through thunderbolt3
> > > to my Thinkpad X1 carbon 6th Gen.. When I disconnect the thunderbolt3
> > > cable I get the following error in dmesg:
> > >
> > > [21874.194994] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> > > [21874.195006] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> > > [21874.195123] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> > > [21874.195129] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> > > [21874.195271] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> > > [21874.195276] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> > > [21874.195406] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> > > [21874.195411] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> > > [21874.195544] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:51 param:0x message:GetPptLimit?
> > > [21874.195550] amdgpu :0c:00.0: amdgpu: [smu_v11_0_get_current_power_limit] get PPT limit failed!
> > > [21874.195582] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> > > [21874.195587] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> > > [21874.227454] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> > > [21874.227463] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> > > [21874.227532] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> > > [21874.227536] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> > > [21874.227618] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> > > [21874.227621] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> > > [21874.227700] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> > > [21874.227703] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> > > [21874.227784] amdgpu :0c:00.0: amdgpu: [smu_v11_0_get_current_power_limit] get PPT limit failed!
> > > [21874.227804] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> > > [21874.514661] snd_hda_codec_hdmi hdaudioC1D0: Unable to sync register 0x2f0d00. -5
> > > [21874.568360] amdgpu :0c:00.0: amdgpu: Failed to switch to AC mode!
> > > [21874.599292] amdgpu :0c:00.0: amdgpu: Failed to switch to AC mode!
> > > [21874.718562] amdgpu :0c:00.0: amdgpu: amdgpu: finishing device.
> > > [21878.722376] amdgpu: cp queue pipe 4 queue 0 preemption failed
> > > [21878.722422] amdgpu :0c:00.0: amdgpu: Failed to disable gfxoff!
> > > [21879.134918] amdgpu :0c:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
> > > [21879.135144] [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
> > > [21879.338158] amdgpu :0c:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
> > > [21879.338402] [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
> > > [21879.543318] [drm:gfx_v10_0_cp_gfx_enable.isra.0 [amdgpu]] *ERROR* failed to halt cp gfx
> > > [21879.544216] __smu_cmn_reg_print_error: 5 callbacks suppressed
> > > [21879.544220] amdgpu :0c:00.0:
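The numeric codes in the log above (-5 and -110) follow the kernel's negative-errno convention. The following is a minimal Python sketch for pulling them out of a log line and naming them, assuming the generic Linux errno numbering (5 is EIO, 110 is ETIMEDOUT). The table, the function names, and the regular expression are illustrative only and are not part of any amdgpu tooling.

```python
import re

# Assumption: generic Linux errno numbering. Only the two codes that
# appear in this log are listed; extend the table as needed.
LINUX_ERRNO = {
    5: ("EIO", "I/O error"),
    110: ("ETIMEDOUT", "Connection timed out"),
}

# A negative integer that is not glued to a preceding word character or dot,
# so "(-110)" and " -5" match but "5.17.0-1003-oem" does not.
_CODE_RE = re.compile(r"(?<![\w.])-\d+\b")


def codes_in_line(line: str) -> list[int]:
    """Return the negative error codes found in one dmesg line."""
    return [int(m) for m in _CODE_RE.findall(line)]


def decode_kernel_error(code: int) -> str:
    """Map a negative kernel return code to its errno name and description."""
    name, text = LINUX_ERRNO.get(abs(code), ("unknown", "not in table"))
    return f"{code}: {name} ({text})"


if __name__ == "__main__":
    sample = "[drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)"
    for code in codes_in_line(sample):
        print(decode_kernel_error(code))
```

Read this way, the ring test failures are timeouts and the HDA codec register sync is an I/O error, which fits a device that has stopped responding.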
Re: [Bug 215958] New: thunderbolt3 egpu cannot disconnect cleanly
On Sun, May 8, 2022 at 3:29 PM wrote:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=215958
>
> Bug ID: 215958
> Summary: thunderbolt3 egpu cannot disconnect cleanly
> Product: Drivers
> Version: 2.5
> Kernel Version: 5.17.0-1003-oem #3-Ubuntu SMP PREEMPT
> Hardware: All
> OS: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: PCI
> Assignee: drivers_...@kernel-bugs.osdl.org
> Reporter: r087...@yahoo.it
> Regression: No

I assume this is not a regression, right? If it is a regression, what
previous kernel worked correctly?

> I have an external egpu (Radeon 6600 RX) connected through thunderbolt3 to my
> Thinkpad X1 carbon 6th Gen.. When I disconnect the thunderbolt3 cable I get
> the following error in dmesg:
>
> [21874.194994] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> [21874.195006] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> [21874.195123] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> [21874.195129] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> [21874.195271] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> [21874.195276] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> [21874.195406] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> [21874.195411] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> [21874.195544] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:51 param:0x message:GetPptLimit?
> [21874.195550] amdgpu :0c:00.0: amdgpu: [smu_v11_0_get_current_power_limit] get PPT limit failed!
> [21874.195582] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> [21874.195587] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> [21874.227454] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> [21874.227463] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> [21874.227532] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> [21874.227536] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> [21874.227618] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> [21874.227621] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> [21874.227700] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> [21874.227703] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> [21874.227784] amdgpu :0c:00.0: amdgpu: [smu_v11_0_get_current_power_limit] get PPT limit failed!
> [21874.227804] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> [21874.514661] snd_hda_codec_hdmi hdaudioC1D0: Unable to sync register 0x2f0d00. -5
> [21874.568360] amdgpu :0c:00.0: amdgpu: Failed to switch to AC mode!
> [21874.599292] amdgpu :0c:00.0: amdgpu: Failed to switch to AC mode!
> [21874.718562] amdgpu :0c:00.0: amdgpu: amdgpu: finishing device.
> [21878.722376] amdgpu: cp queue pipe 4 queue 0 preemption failed
> [21878.722422] amdgpu :0c:00.0: amdgpu: Failed to disable gfxoff!
> [21879.134918] amdgpu :0c:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
> [21879.135144] [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
> [21879.338158] amdgpu :0c:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
> [21879.338402] [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
> [21879.543318] [drm:gfx_v10_0_cp_gfx_enable.isra.0 [amdgpu]] *ERROR* failed to halt cp gfx
> [21879.544216] __smu_cmn_reg_print_error: 5 callbacks suppressed
> [21879.544220] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:7 param:0x message:DisableAllSmuFeatures?
> [21879.544226] amdgpu :0c:00.0: amdgpu: Failed to disable smu features.
> [21879.544230] amdgpu :0c:00.0: amdgpu: Fail to disable dpm features!
> [21879.544238] [drm] free PSP TMR buffer

The above looks like what amdgpu would see when the GPU is no longer
accessible (writes are dropped and reads return 0x). It's possible
amdgpu could notice this and shut down more gracefully, but I don't
think it's the main problem here and it probably wouldn't force you to
reboot.

> [21880.455935] i915 :00:02.0: vgaarb: changed
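To illustrate the point about the GPU no longer being accessible: on PCI, a read from a device that has gone away normally completes with all bits set, so a vendor ID of 0xFFFF is a common hint that the device is gone. Below is a rough userspace sketch in Python, assuming the standard sysfs layout under /sys/bus/pci/devices. The helper names and the device address passed in are hypothetical, and this is not how amdgpu itself detects removal.

```python
from pathlib import Path

# Assumption: standard Linux sysfs location for PCI devices.
SYSFS_PCI = Path("/sys/bus/pci/devices")


def vendor_id_from_config(config: bytes) -> int:
    """Return the 16-bit vendor ID stored little-endian at offset 0 of config space."""
    if len(config) < 2:
        raise ValueError("need at least 2 bytes of config space")
    return int.from_bytes(config[:2], "little")


def looks_disconnected(config: bytes) -> bool:
    """True if the vendor ID reads as all ones, the usual sign of an absent device."""
    return vendor_id_from_config(config) == 0xFFFF


def device_gone(bdf: str) -> bool:
    """Check a device by its sysfs name; a missing or unreadable node counts as gone."""
    try:
        config = (SYSFS_PCI / bdf / "config").read_bytes()
        return looks_disconnected(config)
    except (OSError, ValueError):
        return True


if __name__ == "__main__":
    import sys

    # The address is a placeholder; pass the real one from lspci -D.
    bdf = sys.argv[1] if len(sys.argv) > 1 else "hypothetical-device"
    print(f"{bdf}: {'gone' if device_gone(bdf) else 'present'}")
```

A driver that performed an equivalent check before each firmware message could stop retrying once the device is known to be absent, which is the "shut down more gracefully" case mentioned above.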
RE: [Bug 215958] New: thunderbolt3 egpu cannot disconnect cleanly
[Public]

> -----Original Message-----
> From: Bjorn Helgaas
> Sent: Monday, May 9, 2022 12:23 PM
> To: Linux PCI
> Cc: r087...@yahoo.it; Deucher, Alexander; Koenig, Christian; Pan, Xinhui; amd-gfx mailing list; dri-devel <de...@lists.freedesktop.org>
> Subject: Re: [Bug 215958] New: thunderbolt3 egpu cannot disconnect cleanly
>
> On Sun, May 8, 2022 at 3:29 PM wrote:
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=215958
> >
> > Bug ID: 215958
> > Summary: thunderbolt3 egpu cannot disconnect cleanly
> > Product: Drivers
> > Version: 2.5
> > Kernel Version: 5.17.0-1003-oem #3-Ubuntu SMP PREEMPT
> > Hardware: All
> > OS: Linux
> > Tree: Mainline
> > Status: NEW
> > Severity: normal
> > Priority: P1
> > Component: PCI
> > Assignee: drivers_...@kernel-bugs.osdl.org
> > Reporter: r087...@yahoo.it
> > Regression: No
>
> I assume this is not a regression, right? If it is a regression, what
> previous kernel worked correctly?
>
> > I have an external egpu (Radeon 6600 RX) connected through
> > thunderbolt3 to my Thinkpad X1 carbon 6th Gen.. When I disconnect the
> > thunderbolt3 cable I get the following error in dmesg:
> >
> > [21874.194994] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> > [21874.195006] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> > [21874.195123] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> > [21874.195129] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> > [21874.195271] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> > [21874.195276] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> > [21874.195406] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> > [21874.195411] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> > [21874.195544] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:51 param:0x message:GetPptLimit?
> > [21874.195550] amdgpu :0c:00.0: amdgpu: [smu_v11_0_get_current_power_limit] get PPT limit failed!
> > [21874.195582] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> > [21874.195587] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> > [21874.227454] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> > [21874.227463] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> > [21874.227532] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> > [21874.227536] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> > [21874.227618] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> > [21874.227621] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> > [21874.227700] amdgpu :0c:00.0: amdgpu: SMU: response:0x for index:18 param:0x0005 message:TransferTableSmu2Dram?
> > [21874.227703] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> > [21874.227784] amdgpu :0c:00.0: amdgpu: [smu_v11_0_get_current_power_limit] get PPT limit failed!
> > [21874.227804] amdgpu :0c:00.0: amdgpu: Failed to export SMU metrics table!
> > [21874.514661] snd_hda_codec_hdmi hdaudioC1D0: Unable to sync register 0x2f0d00. -5
> > [21874.