[Bug 104274] Unable to cleanly unload kernel module: BUG: unable to handle kernel NULL pointer dereference at 0000000000000258 (mutex_lock)
https://bugs.freedesktop.org/show_bug.cgi?id=104274

Martin Peres changed:

           What    |Removed                     |Added

         Resolution|---                         |MOVED
             Status|NEW                         |RESOLVED

--- Comment #17 from Martin Peres ---
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/274.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 104274] Unable to cleanly unload kernel module: BUG: unable to handle kernel NULL pointer dereference at 0000000000000258 (mutex_lock)
https://bugs.freedesktop.org/show_bug.cgi?id=104274

--- Comment #16 from Sverd Johnsen ---
Hmm, this is new:

[41889.542562] Console: switching to colour dummy device 80x25
[41890.859216] [drm] amdgpu: finishing device.
[41891.096266] [TTM] Finalizing pool allocator
[41891.100313] [TTM] Finalizing DMA pool allocator
[41891.100326] [TTM] Zone kernel: Used memory at exit: 0 kiB
[41891.100327] [TTM] Zone dma32: Used memory at exit: 0 kiB
[41891.100328] [drm] amdgpu: ttm finalized
[41891.108164] BUG: unable to handle kernel paging request at c0a31750
[41891.108167] PGD 2dd20a067 P4D 2dd20a067 PUD 2dd20c067 PMD 4897b7067 PTE 0
[41891.108170] Oops: 0010 [#1] PREEMPT SMP PTI
[41891.108171] Modules linked in: chash gpu_sched ttm arc4 md4 md5 sha512_ssse3 sha512_generic cmac cifs ccm nls_iso8859_1 nls_cp437 vfat fat msr rpcsec_gss_krb5 nfsv4 cachefiles dns_resolver nfs lockd grace fscache auth_rpcgss sunrpc af_packet macvtap macvlan bonding nf_log_ipv6 nf_log_ipv4 nf_log_common nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_log nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nft_limit nft_ct nf_conntrack xfrm_user xfrm_algo cls_u32 nft_counter nft_meta nft_set_bitmap sch_htb nft_set_hash nft_set_rbtree raid0 intel_pmc_core x86_pkg_temp_thermal intel_powerclamp kvm_intel md_mod vhost_net tun vhost tap kvm snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi bcache snd_hda_intel snd_hda_codec snd_hwdep intel_cstate intel_uncore snd_hda_core
[41891.108194] cdc_ether intel_rapl_perf r8152 efi_pstore snd_pcm plusb mei_me usbnet input_leds mei mii led_class efivars tpm_crb binfmt_misc crypto_user efivarfs algif_skcipher af_alg mousedev joydev psmouse crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr tpm_tis tpm_tis_core tpm shpchp thermal fan i8042 rng_core acpi_pad vfio_pci irqbypass vfio_virqfd vfio_iommu_type1 vfio atkbd libps2 [last unloaded: amdgpu]
[41891.108208] CPU: 0 PID: 28737 Comm: zsh Not tainted 4.17.1-2-ph #2
[41891.108208] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD3/Z170X-UD3-CF, BIOS F23d 12/01/2017
[41891.108210] RIP: 0010:0xc0a31750
[41891.108211] RSP: 0018:912fdec03f10 EFLAGS: 00010292
[41891.108212] RAX: c0a31750 RBX: 912fdec20200 RCX: 912dbf11f390
[41891.108213] RDX: 912f0fd6b990 RSI: 912fdec03f20 RDI: 912dbf11fd90
[41891.108214] RBP: a9223480 R08: 912fdec1bc00 R09: 0100
[41891.108215] R10: 0080 R11: 2619991f4040 R12: 912fdec20238
[41891.108215] R13: 000a R14: 7fff R15: 0202
[41891.108216] FS: 7febd04cff00() GS:912fdec0() knlGS:
[41891.108217] CS: 0010 DS: ES: CR0: 80050033
[41891.108218] CR2: c0a31750 CR3: 000468368002 CR4: 003606f0
[41891.108219] DR0: DR1: DR2:
[41891.108220] Call Trace:
[41891.108220] DR3: DR6: fffe0ff0 DR7: 0400
[41891.108222]
[41891.108224] ? rcu_process_callbacks+0x1f9/0x3c0
[41891.108226] ? __do_softirq+0xd0/0x1f4
[41891.108228] ? irq_exit+0x7c/0xb0
[41891.108229] ? smp_apic_timer_interrupt+0x59/0x90
[41891.108231]
[41891.108231] ? apic_timer_interrupt+0xf/0x20
[41891.108233] ? privileged_wrt_inode_uidgid+0x12/0x30
[41891.108235] ? generic_permission+0xf4/0x190
[41891.108236] ? inode_permission+0x24/0x130
[41891.108237] ? link_path_walk+0x6c/0x530
[41891.108239] ? path_lookupat.isra.10+0x92/0x200
[41891.108241] ? unmap_page_range+0x5ed/0x890
[41891.108242] ? filename_lookup.part.18+0x9b/0x170
[41891.108244] ? __check_object_size+0xf6/0x17b
[41891.108246] ? strncpy_from_user+0x4c/0x170
[41891.108247] ? vfs_statx+0x6e/0xd0
[41891.108249] ? __audit_syscall_exit+0x22b/0x2a0
[41891.108250] ? __se_sys_newstat+0x39/0x70
[41891.108251] ? syscall_trace_enter+0x1d9/0x240
[41891.108253] ? vm_munmap+0x64/0x90
[41891.108254] ? do_syscall_64+0x43/0xf0
[41891.108255] ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[41891.108256] Code: Bad RIP value.
[41891.108259] RIP: 0xc0a31750 RSP: 912fdec03f10
[41891.108260] CR2: c0a31750
[41891.108262] ---[ end trace 51bac120fccff9bb ]---
[41891.230170] Kernel panic - not syncing: Fatal exception in interrupt
[41891.230175] Kernel Offset: 0x2700 from 0x8100 (relocation range: 0x8000-0xbfff)
[Bug 104274] Unable to cleanly unload kernel module: BUG: unable to handle kernel NULL pointer dereference at 0000000000000258 (mutex_lock)
https://bugs.freedesktop.org/show_bug.cgi?id=104274

--- Comment #15 from Sverd Johnsen ---
Seems to be much better, or completely solved (well, almost; see bug 106820), with 4.17 according to my preliminary tests. Still a problem in 4.16:

[16981.789980] [drm] amdgpu: finishing device.
[16981.901013] BUG: unable to handle kernel NULL pointer dereference at
[16981.901017] IP: (null)
[16981.901018] PGD 0 P4D 0
[16981.901020] Oops: 0010 [#1] PREEMPT SMP PTI
[16981.901022] Modules linked in: amdgpu(-) chash gpu_sched ttm fuse af_packet macvtap macvlan bonding nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_nat nft_chain_nat_ipv4 nf_nat_ipv4 nf_nat nf_log_ipv6 nf_log_ipv4 nf_log_common nft_log nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nft_ct nf_conntrack xfrm_user xfrm_algo nft_counter nft_meta nft_set_bitmap nft_set_hash nft_set_rbtree cls_u32 nf_tables_inet sch_htb raid0 intel_pmc_core x86_pkg_temp_thermal intel_powerclamp kvm_intel vhost_net tun vhost tap kvm snd_hda_codec_realtek bcache snd_hda_codec_generic intel_cstate intel_uncore snd_hda_codec_hdmi efi_pstore md_mod snd_hda_intel snd_hda_codec intel_rapl_perf efivars snd_hwdep snd_hda_core mei_me mei plusb snd_pcm input_leds usbnet mii led_class tpm_crb binfmt_misc
[16981.901045] crypto_user efivarfs algif_skcipher af_alg mousedev joydev psmouse atkbd libps2 tpm_tis crct10dif_pclmul tpm_tis_core crc32_pclmul tpm ghash_clmulni_intel pcspkr rng_core shpchp thermal fan i8042 acpi_pad vfio_pci irqbypass vfio_virqfd vfio_iommu_type1 vfio
[16981.901055] CPU: 1 PID: 23604 Comm: rmmod Not tainted 4.16.12-5-ph #2
[16981.901056] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD3/Z170X-UD3-CF, BIOS F23d 12/01/2017
[16981.901056] RIP: 0010: (null)
[16981.901057] RSP: 0018:b8ac03083d08 EFLAGS: 00010286
[16981.901058] RAX: RBX: a16af8c39f00 RCX: 0001802e
[16981.901059] RDX: 0001802f RSI: 5c02 RDI: a16cc19a8240
[16981.901060] RBP: a16c51c42290 R08: 0001 R09: a16d0cc01900
[16981.901061] R10: a16b845ef300 R11: e4b3ca133420 R12: a16b89eb4400
[16981.901061] R13: 0040 R14: c09aaf68 R15: dead0100
[16981.901062] FS: 7f3733f98b80() GS:a16d1ec8() knlGS:
[16981.901063] CS: 0010 DS: ES: CR0: 80050033
[16981.901064] CR2: CR3: 0003df3de006 CR4: 003606e0
[16981.901065] DR0: DR1: DR2:
[16981.901066] DR3: DR6: fffe0ff0 DR7: 0400
[16981.901066] Call Trace:
[16981.901087] ? destroy+0x23/0xb0 [amdgpu]
[16981.901100] ? dal_i2caux_destruct+0x6a/0xb0 [amdgpu]
[16981.901113] ? destroy+0x10/0x30 [amdgpu]
[16981.901126] ? dal_i2caux_destroy+0x1d/0x30 [amdgpu]
[16981.901137] ? destruct+0x8e/0x110 [amdgpu]
[16981.901148] ? dc_destroy+0xc/0x20 [amdgpu]
[16981.901162] ? dm_hw_fini+0x19/0x20 [amdgpu]
[16981.901173] ? amdgpu_device_ip_fini+0xef/0x30a [amdgpu]
[16981.901184] ? amdgpu_device_fini+0x68/0x177 [amdgpu]
[16981.901191] ? amdgpu_driver_unload_kms+0x3d/0x90 [amdgpu]
[16981.901193] ? drm_dev_unregister+0x3a/0xf0
[16981.901201] ? amdgpu_pci_remove+0x14/0x40 [amdgpu]
[16981.901203] ? pci_device_remove+0x36/0xb0
[16981.901205] ? device_release_driver_internal+0x155/0x220
[16981.901206] ? driver_detach+0x32/0x63
[16981.901208] ? bus_remove_driver+0x6f/0xd0
[16981.901209] ? pci_unregister_driver+0x38/0x90
[16981.901220] ? amdgpu_exit+0x11/0x1029 [amdgpu]
[16981.901222] ? SyS_delete_module+0x17f/0x240
[16981.901223] ? do_syscall_64+0x5b/0x100
[16981.901226] ? entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[16981.901226] Code: Bad RIP value.
[16981.901230] RIP: (null) RSP: b8ac03083d08
[16981.901230] CR2:
[16981.901232] ---[ end trace f3ff8fc93836e132 ]---
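For comparing oops traces like the ones pasted in this thread, a minimal Python sketch that pulls the symbol and owning module out of each `? symbol+0xoff/0xlen [module]` entry may help. The helper names `parse_frames` and `first_frame_in` are hypothetical and not part of any kernel tooling, and the regular expression only assumes the entry layout visible in the pasted logs.

```python
import re

# One call-trace entry as it appears in the pasted logs, for example:
#   ? dal_i2caux_destruct+0x6a/0xb0 [amdgpu]
# The trailing "[module]" is optional; core-kernel symbols have none.
# Requiring the module name to start with a letter keeps a following
# "[16981.901201]" timestamp from being mistaken for a module.
FRAME_RE = re.compile(
    r"\?\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[([A-Za-z_]\w*)\])?"
)


def parse_frames(dmesg_text):
    """Return (symbol, module) pairs in log order; module is None if absent."""
    return [(m.group(1), m.group(2)) for m in FRAME_RE.finditer(dmesg_text)]


def first_frame_in(dmesg_text, module):
    """Return the first listed symbol that belongs to the given module."""
    for symbol, owner in parse_frames(dmesg_text):
        if owner == module:
            return symbol
    return None
```

Running `parse_frames` over two pasted traces and comparing the symbol lists is one way to check whether two kernel versions fail along the same teardown path.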
[Bug 104274] Unable to cleanly unload kernel module: BUG: unable to handle kernel NULL pointer dereference at 0000000000000258 (mutex_lock)
https://bugs.freedesktop.org/show_bug.cgi?id=104274

--- Comment #14 from Sverd Johnsen ---
I just looked at the comments again, and based on Comment 2 this seems like expected behavior for now. So my update was kind of pointless.
[Bug 104274] Unable to cleanly unload kernel module: BUG: unable to handle kernel NULL pointer dereference at 0000000000000258 (mutex_lock)
https://bugs.freedesktop.org/show_bug.cgi?id=104274

--- Comment #13 from Sverd Johnsen ---
Still seen with 4.15.15 and dc=1. Not sure whether this always reproduces; I haven't tested this in a while.

[13342.285357] [drm] amdgpu: finishing device.
[13342.288330] amdgpu: [powerplay] failed to send message 30a ret is 254
[13342.288345] amdgpu: [powerplay] failed to send pre message 26b ret is 254
[13342.369191] BUG: unable to handle kernel NULL pointer dereference at
[13342.369193] IP: (null)
[13342.369194] PGD 8003eed8a067 P4D 8003eed8a067 PUD 3f027a067 PMD 0
[13342.369197] Oops: 0010 [#1] PREEMPT SMP PTI
[13342.369198] Modules linked in: nfnetlink_log bluetooth ecdh_generic amdgpu(-) chash ttm af_packet macvtap macvlan bonding nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nf_log_ipv6 nf_log_ipv4 nf_log_common nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nft_log nft_ct nf_conntrack xfrm_user xfrm_algo nft_counter nft_meta nft_set_bitmap nft_set_hash nft_set_rbtree nf_tables_inet cls_u32 nf_tables_ipv6 nf_tables_ipv4 sch_htb dm_cache_smq dm_cache dm_bio_prison dm_persistent_data dm_bufio libcrc32c raid0 x86_pkg_temp_thermal intel_powerclamp kvm_intel vhost_net tun vhost tap kvm md_mod snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel intel_cstate intel_uncore snd_hda_codec intel_rapl_perf efi_pstore snd_hwdep snd_hda_core mei_me plusb snd_pcm
[13342.369219] input_leds usbnet mei led_class mii efivars tpm_crb crypto_user efivarfs algif_skcipher af_alg joydev mousedev psmouse atkbd libps2 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr tpm_tis tpm_tis_core shpchp thermal fan tpm i8042 acpi_pad vfio_pci irqbypass vfio_virqfd vfio_iommu_type1 vfio
[13342.369230] CPU: 1 PID: 18251 Comm: rmmod Not tainted 4.15.15-5-ph #2
[13342.369232] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD3/Z170X-UD3-CF, BIOS F23d 12/01/2017
[13342.369233] RIP: 0010: (null)
[13342.369234] RSP: 0018:b40e025e7d00 EFLAGS: 00010282
[13342.369235] RAX: RBX: 8ee9c9c1b420 RCX: 000180200011
[13342.369236] RDX: 000180200012 RSI: 5c02 RDI: 8ee989680f60
[13342.369236] RBP: 8ee9ac57da90 R08: 0001 R09: 8ee982daae00
[13342.369237] R10: 8ee9ccc01900 R11: 00023610 R12: 0003
[13342.369238] R13: 8ee9ca032f18 R14: R15:
[13342.369239] FS: 7f738c970b80() GS:8ee9dec8() knlGS:
[13342.369240] CS: 0010 DS: ES: CR0: 80050033
[13342.369241] CR2: CR3: 00033ee74001 CR4: 003606e0
[13342.369241] DR0: DR1: DR2:
[13342.369242] DR3: DR6: fffe0ff0 DR7: 0400
[13342.369243] Call Trace:
[13342.369258] ? destroy+0x23/0xb0 [amdgpu]
[13342.369269] ? dal_i2caux_destruct+0x6a/0xb0 [amdgpu]
[13342.369278] ? destroy+0x10/0x30 [amdgpu]
[13342.369288] ? dal_i2caux_destroy+0x1d/0x30 [amdgpu]
[13342.369297] ? destruct+0x89/0x110 [amdgpu]
[13342.369306] ? dc_destroy+0xc/0x20 [amdgpu]
[13342.369318] ? dm_hw_fini+0x19/0x20 [amdgpu]
[13342.369323] ? amdgpu_fini+0x9c/0x310 [amdgpu]
[13342.369329] ? amdgpu_device_fini+0x5f/0x1c0 [amdgpu]
[13342.369334] ? amdgpu_driver_unload_kms+0x45/0x90 [amdgpu]
[13342.369336] ? drm_dev_unregister+0x3a/0xe0
[13342.369341] ? amdgpu_pci_remove+0x14/0x40 [amdgpu]
[13342.369344] ? pci_device_remove+0x36/0xb0
[13342.369346] ? device_release_driver_internal+0x155/0x220
[13342.369347] ? driver_detach+0x32/0x70
[13342.369349] ? bus_remove_driver+0x4c/0xc0
[13342.369350] ? pci_unregister_driver+0x24/0x90
[13342.369359] ? amdgpu_exit+0x11/0x3b6 [amdgpu]
[13342.369361] ? SyS_delete_module+0x19d/0x230
[13342.369363] ? do_syscall_64+0x5b/0x100
[13342.369365] ? entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[13342.369366] Code: Bad RIP value.
[13342.369368] RIP: (null) RSP: b40e025e7d00
[13342.369369] CR2:
[13342.369370] ---[ end trace 9551ca9b94f5680d ]---
[Bug 104274] Unable to cleanly unload kernel module: BUG: unable to handle kernel NULL pointer dereference at 0000000000000258 (mutex_lock)
https://bugs.freedesktop.org/show_bug.cgi?id=104274

--- Comment #12 from Jordan L ---
Thanks. Just to clarify, you aren't able to enable DC with your current Kconfig configuration, which prevents you from retrying on this ticket? If so, please just open a new bug rather than conflate the issue here.

Once you're unblocked from testing DC with your configuration, we can look at this issue again.

Cheers
[Bug 104274] Unable to cleanly unload kernel module: BUG: unable to handle kernel NULL pointer dereference at 0000000000000258 (mutex_lock)
https://bugs.freedesktop.org/show_bug.cgi?id=104274

Luke McKee changed:

           What    |Removed                     |Added

                 CC|                            |hojur...@gmail.com

--- Comment #11 from Luke McKee ---
I fixed this. I was going to open another ticket; I only mentioned it here earlier. dc=1 still isn't usable because of this: https://bugs.freedesktop.org/show_bug.cgi?id=103953#c7

I found out that the breakage was caused by kernel configuration options. I can provide a working and a broken kconfig, but my money is on it wanting the AMD IOMMU GART support even though the hardware isn't installed. This is loaded on demand by the amdgpu module. Disabling agpgart didn't do much harm, though. Would that affect performance if intel_iommu is enabled?

Feb 28 08:08:35 hojuruku kernel: AMD IOMMUv2 driver by Joerg Roedel
Feb 28 08:08:35 hojuruku kernel: AMD IOMMUv2 functionality not available on this system
Feb 28 08:08:35 hojuruku kernel: CRAT table not found
Feb 28 08:08:35 hojuruku kernel: Virtual CRAT table created for CPU
Feb 28 08:08:35 hojuruku kernel: Parsing CRAT table with 1 nodes

If users sensibly select only the hardware they have installed to reduce compile times, or build monolithic kernels, they are going to run into trouble here. Ask me for the kernel .config files if you need them to replicate the defect. I think that's what triggered the error in my last comment, though in that kernel I had ORC off and no symbols.

You might want to review / tweak your Kconfig depends clauses.
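Since the comment above offers a working and a broken kconfig, one way to narrow down the responsible option is to diff the two files. The following is a minimal Python sketch under that assumption. `parse_kconfig` and `diff_kconfig` are hypothetical helper names, and the option values in the usage note are made up for illustration; they are not Luke's actual configurations.

```python
NOT_SET_SUFFIX = " is not set"


def parse_kconfig(text):
    """Map option names to values from the text of a kernel .config file.

    Lines of the form "# CONFIG_FOO is not set" are recorded as "n".
    Other comments and blank lines are ignored.
    """
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(NOT_SET_SUFFIX):
            options[line[2:-len(NOT_SET_SUFFIX)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def diff_kconfig(working_text, broken_text):
    """Return {option: (working_value, broken_value)} for differing options.

    An option missing from one file shows up as None on that side.
    """
    working = parse_kconfig(working_text)
    broken = parse_kconfig(broken_text)
    return {
        name: (working.get(name), broken.get(name))
        for name in sorted(set(working) | set(broken))
        if working.get(name) != broken.get(name)
    }
```

For example, with a working file containing `CONFIG_AMD_IOMMU_V2=m` and a broken one containing `# CONFIG_AMD_IOMMU_V2 is not set`, `diff_kconfig` reports that option as `("m", "n")`, which gives a short list of candidates to bisect by hand.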
[Bug 104274] Unable to cleanly unload kernel module: BUG: unable to handle kernel NULL pointer dereference at 0000000000000258 (mutex_lock)
https://bugs.freedesktop.org/show_bug.cgi?id=104274

--- Comment #10 from Jordan L ---
Hi Luke, it actually looks like you're running with DC disabled. Can you also try with amdgpu.dc=1 explicitly set? Potentially there are issues in multiple IP blocks, though we fixed a driver unload issue recently with DC enabled.

Thanks
[Bug 104274] Unable to cleanly unload kernel module: BUG: unable to handle kernel NULL pointer dereference at 0000000000000258 (mutex_lock)
https://bugs.freedesktop.org/show_bug.cgi?id=104274

--- Comment #9 from Luke McKee ---
Using Polaris 11, I get the same error message, by the way, when loading the latest (today's) amd-staging-drm-next. I'll include debugging symbols and try again.

Feb 26 14:56:18 hojuruku kernel: Linux agpgart interface v0.103 Feb 26 14:56:18 hojuruku kernel: [drm] amdgpu kernel modesetting enabled. Feb 26 14:56:18 hojuruku kernel: checking generic (e000 30) vs hw (e000 1000) Feb 26 14:56:18 hojuruku kernel: fb: switching to amdgpudrmfb from EFI VGA Feb 26 14:56:18 hojuruku kernel: Console: switching to colour dummy device 80x25 Feb 26 14:56:18 hojuruku kernel: amdgpu :01:00.0: enabling device (0006 -> 0007) Feb 26 14:56:18 hojuruku kernel: [drm] initializing kernel modesetting (POLARIS11 0x1002:0x67FF 0x1462:0x8A91 0xCF). Feb 26 14:56:18 hojuruku kernel: [drm] register mmio base: 0xF7E0 Feb 26 14:56:18 hojuruku kernel: [drm] register mmio size: 262144 Feb 26 14:56:18 hojuruku kernel: [drm] add ip block number 0 Feb 26 14:56:18 hojuruku kernel: [drm] add ip block number 1 Feb 26 14:56:18 hojuruku kernel: [drm] add ip block number 2 Feb 26 14:56:18 hojuruku kernel: [drm] add ip block number 3 Feb 26 14:56:18 hojuruku kernel: [drm] add ip block number 4 Feb 26 14:56:18 hojuruku kernel: [drm] add ip block number 5 Feb 26 14:56:18 hojuruku kernel: [drm] add ip block number 6 Feb 26 14:56:18 hojuruku kernel: [drm] add ip block number 7 Feb 26 14:56:18 hojuruku kernel: [drm] add ip block number 8 Feb 26 14:56:18 hojuruku kernel: [drm] probing gen 2 caps for device 8086:c01 = 261ad03/e Feb 26 14:56:18 hojuruku kernel: [drm] probing mlw for device 8086:c01 = 261ad03 Feb 26 14:56:18 hojuruku kernel: [drm] UVD is enabled in VM mode Feb 26 14:56:18 hojuruku kernel: [drm] UVD ENC is enabled in VM mode Feb 26 14:56:18 hojuruku kernel: [drm] VCE enabled in VM mode Feb 26 14:56:18 hojuruku kernel: ATOM BIOS: 113-C98121-M01 Feb 26 14:56:18 hojuruku kernel: [drm] vm size is 64 GB, 2 levels, block size is
10-bit, fragment size is 9-bit Feb 26 14:56:18 hojuruku kernel: amdgpu :01:00.0: VRAM: 4096M 0x00F4 - 0x00F4 (4096M used) Feb 26 14:56:18 hojuruku kernel: amdgpu :01:00.0: GTT: 256M 0x - 0x0FFF Feb 26 14:56:18 hojuruku kernel: [drm] Detected VRAM RAM=4096M, BAR=256M Feb 26 14:56:18 hojuruku kernel: [drm] RAM width 128bits GDDR5 Feb 26 14:56:18 hojuruku kernel: [TTM] Zone kernel: Available graphics memory: 8175204 kiB Feb 26 14:56:18 hojuruku kernel: [TTM] Zone dma32: Available graphics memory: 2097152 kiB Feb 26 14:56:18 hojuruku kernel: [TTM] Initializing pool allocator Feb 26 14:56:18 hojuruku kernel: [TTM] Initializing DMA pool allocator Feb 26 14:56:18 hojuruku kernel: [drm] amdgpu: 4096M of VRAM memory ready Feb 26 14:56:18 hojuruku kernel: [drm] amdgpu: 4096M of GTT memory ready. Feb 26 14:56:18 hojuruku kernel: [drm] GART: num cpu pages 65536, num gpu pages 65536 Feb 26 14:56:18 hojuruku kernel: [drm] PCIE GART of 256M enabled (table at 0x00F40004). Feb 26 14:56:18 hojuruku kernel: [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). Feb 26 14:56:18 hojuruku kernel: [drm] Driver supports precise vblank timestamp query. 
Feb 26 14:56:18 hojuruku kernel: [drm] AMDGPU Display Connectors Feb 26 14:56:18 hojuruku kernel: [drm] Connector 0: Feb 26 14:56:18 hojuruku kernel: [drm] DP-1 Feb 26 14:56:18 hojuruku kernel: [drm] HPD2 Feb 26 14:56:18 hojuruku kernel: [drm] DDC: 0x4868 0x4868 0x4869 0x4869 0x486a 0x486a 0x486b 0x486b Feb 26 14:56:18 hojuruku kernel: [drm] Encoders: Feb 26 14:56:18 hojuruku kernel: [drm] DFP1: INTERNAL_UNIPHY1 Feb 26 14:56:18 hojuruku kernel: [drm] Connector 1: Feb 26 14:56:18 hojuruku kernel: [drm] HDMI-A-1 Feb 26 14:56:18 hojuruku kernel: [drm] HPD5 Feb 26 14:56:18 hojuruku kernel: [drm] DDC: 0x4874 0x4874 0x4875 0x4875 0x4876 0x4876 0x4877 0x4877 Feb 26 14:56:18 hojuruku kernel: [drm] Encoders: Feb 26 14:56:18 hojuruku kernel: [drm] DFP2: INTERNAL_UNIPHY1 Feb 26 14:56:18 hojuruku kernel: [drm] Connector 2: Feb 26 14:56:18 hojuruku kernel: [drm] DVI-D-1 Feb 26 14:56:18 hojuruku kernel: [drm] HPD3 Feb 26 14:56:18 hojuruku kernel: [drm] DDC: 0x4878 0x4878 0x4879 0x4879 0x487a 0x487a 0x487b 0x487b Feb 26 14:56:18 hojuruku kernel: [drm] Encoders: Feb 26 14:56:18 hojuruku kernel: [drm] DFP3: INTERNAL_UNIPHY Feb 26 14:56:18 hojuruku kernel: [drm] Chained IB support enabled! Feb 26 14:56:18 hojuruku kernel: [drm] Found UVD firmware Version: 1.130 Family ID: 16 Feb 26 14:56:18 hojuruku kernel: [drm] Found VCE firmware Version: 52.4 Binary ID: 3 Feb 26 14:56:18 hojuruku kernel: amdgpu: [powerplay] failed to send message 309 ret is 254 Feb 26 14:56:18 hojuruku kernel: amdgpu: [powerplay] failed to send pre message 14e ret is 254 Feb 26 14:56:18 hojuruku kernel: [drm] UVD and UVD ENC initialized successfully. Feb 26 1
[Bug 104274] Unable to cleanly unload kernel module: BUG: unable to handle kernel NULL pointer dereference at 0000000000000258 (mutex_lock)
https://bugs.freedesktop.org/show_bug.cgi?id=104274

--- Comment #8 from Luke McKee ---
I commented because the user here had the powerplay failure error that I have seen in other threads.

https://forum-en.msi.com/index.php?topic=298468.0

[46667.236028] kernel: amdgpu: [powerplay] failed to send message 309 ret is 254
[46667.236041] kernel: amdgpu: [powerplay] failed to send pre message 14e ret is 254

I thought this issue might be related to the motherboard BIOS not properly setting up MMIO BARs. I'll open a separate ticket in a week if the Polaris 11 support is still broken after your next push to Linux 4.16. I'm sure you at AMD would be aware of it.
[Bug 104274] Unable to cleanly unload kernel module: BUG: unable to handle kernel NULL pointer dereference at 0000000000000258 (mutex_lock)
https://bugs.freedesktop.org/show_bug.cgi?id=104274

--- Comment #7 from Harry Wentland ---
Luke, does your comment relate to driver unload, specifically driver unload throwing a NULL pointer dereference? If not, please open a separate ticket so as not to confuse two issues.
[Bug 104274] Unable to cleanly unload kernel module: BUG: unable to handle kernel NULL pointer dereference at 0000000000000258 (mutex_lock)
https://bugs.freedesktop.org/show_bug.cgi?id=104274

--- Comment #6 from Luke McKee ---
Harry, I tried the latest staging-next two days ago. There is a long-standing issue with powerplay and buggy AMI BIOSes that don't properly set up MMIO BAR regions. This needs to be worked around by your driver, because the vendor says in support emails that a two-year-old motherboard is too old to receive firmware updates, even though Intel ME is a massive security risk. I'm actually thinking you guys need to add linuxbios/coreboot support for your driver, if available, the way things are going: https://forum-en.msi.com/index.php?topic=298468.0

I think Polaris 11 cards on buggy BIOSes that don't properly set up PCIe MMIO BAR ranges have a hellish time with the amdgpu driver and powerplay (no fan control, cooked cards, etc.). With your new powerplay code it maybe gets even worse and just doesn't boot in this condition. It works OK for me on 4.14.20, but with exactly the same config and amd-staging-drm-next as last updated 4 days ago I get this mess.

Feb 23 01:41:59 hojuruku kernel: [drm] amdgpu kernel modesetting enabled. Feb 23 01:41:59 hojuruku kernel: checking generic (e000 30) vs hw (e000 1000) Feb 23 01:41:59 hojuruku kernel: fb: switching to amdgpudrmfb from EFI VGA Feb 23 01:41:59 hojuruku kernel: Console: switching to colour dummy device 80x25 Feb 23 01:41:59 hojuruku kernel: amdgpu :01:00.0: enabling device (0006 -> 0007) Feb 23 01:41:59 hojuruku kernel: [drm] initializing kernel modesetting (POLARIS11 0x1002:0x67FF 0x1462:0x8A91 0xCF).
Feb 23 01:41:59 hojuruku kernel: [drm] register mmio base: 0xF7E0 Feb 23 01:41:59 hojuruku kernel: [drm] register mmio size: 262144 Feb 23 01:41:59 hojuruku kernel: [drm] add ip block number 0 Feb 23 01:41:59 hojuruku kernel: [drm] add ip block number 1 Feb 23 01:41:59 hojuruku kernel: [drm] add ip block number 2 Feb 23 01:41:59 hojuruku kernel: [drm] add ip block number 3 Feb 23 01:41:59 hojuruku kernel: [drm] add ip block number 4 Feb 23 01:41:59 hojuruku kernel: [drm] add ip block number 5 Feb 23 01:41:59 hojuruku kernel: [drm] add ip block number 6 Feb 23 01:41:59 hojuruku kernel: [drm] add ip block number 7 Feb 23 01:41:59 hojuruku kernel: [drm] add ip block number 8 Feb 23 01:41:59 hojuruku kernel: [drm] probing gen 2 caps for device 8086:c01 = 261ad03/e Feb 23 01:41:59 hojuruku kernel: [drm] probing mlw for device 8086:c01 = 261ad03 Feb 23 01:41:59 hojuruku kernel: [drm] UVD is enabled in VM mode Feb 23 01:41:59 hojuruku kernel: [drm] UVD ENC is enabled in VM mode Feb 23 01:41:59 hojuruku kernel: [drm] VCE enabled in VM mode Feb 23 01:41:59 hojuruku kernel: ATOM BIOS: 113-C98121-M01 Feb 23 01:41:59 hojuruku kernel: [drm] vm size is 64 GB, 2 levels, block size is 10-bit, fragment size is 9-bit Feb 23 01:41:59 hojuruku kernel: amdgpu :01:00.0: VRAM: 4096M 0x00F4 - 0x00F4 (4096M used) Feb 23 01:41:59 hojuruku kernel: amdgpu :01:00.0: GTT: 256M 0x - 0x0FFF Feb 23 01:41:59 hojuruku kernel: [drm] Detected VRAM RAM=4096M, BAR=256M Feb 23 01:41:59 hojuruku kernel: [drm] RAM width 128bits GDDR5 Feb 23 01:41:59 hojuruku kernel: [TTM] Zone kernel: Available graphics memory: 8174838 kiB Feb 23 01:41:59 hojuruku kernel: [TTM] Zone dma32: Available graphics memory: 2097152 kiB Feb 23 01:41:59 hojuruku kernel: [TTM] Initializing pool allocator Feb 23 01:41:59 hojuruku kernel: [TTM] Initializing DMA pool allocator Feb 23 01:41:59 hojuruku kernel: [drm] amdgpu: 4096M of VRAM memory ready Feb 23 01:41:59 hojuruku kernel: [drm] amdgpu: 4096M of GTT memory ready. 
Feb 23 01:41:59 hojuruku kernel: [drm] GART: num cpu pages 65536, num gpu pages 65536 Feb 23 01:41:59 hojuruku kernel: [drm] PCIE GART of 256M enabled (table at 0x00F40004). Feb 23 01:41:59 hojuruku kernel: [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). Feb 23 01:41:59 hojuruku kernel: [drm] Driver supports precise vblank timestamp query. Feb 23 01:41:59 hojuruku kernel: [drm] AMDGPU Display Connectors Feb 23 01:41:59 hojuruku kernel: [drm] Connector 0: Feb 23 01:41:59 hojuruku kernel: [drm] DP-1 Feb 23 01:41:59 hojuruku kernel: [drm] HPD2 Feb 23 01:41:59 hojuruku kernel: [drm] DDC: 0x4868 0x4868 0x4869 0x4869 0x486a 0x486a 0x486b 0x486b Feb 23 01:41:59 hojuruku kernel: [drm] Encoders: Feb 23 01:41:59 hojuruku kernel: [drm] DFP1: INTERNAL_UNIPHY1 Feb 23 01:41:59 hojuruku kernel: [drm] Connector 1: Feb 23 01:41:59 hojuruku kernel: [drm] HDMI-A-1 Feb 23 01:41:59 hojuruku kernel: [drm] HPD5 Feb 23 01:41:59 hojuruku kernel: [drm] DDC: 0x4874 0x4874 0x4875 0x4875 0x4876 0x4876 0x4877 0x4877 Feb 23 01:41:59 hojuruku kernel: [drm] Encoders: Feb 23 01:41:59 hojuruku kernel: [drm] DFP2: INTERNAL_UNIPHY1 Feb 23 01:41:59 hojuruku kernel: [drm] Connector 2: Feb 23 01:41:59 hojuruku kernel: [drm] DVI-D-1 Feb 23 01:41:59 hojuruku kernel: [drm] HPD3 Feb 23 01:41:59 hojuruku kernel: [drm] DDC: 0x4878 0x4878 0x4879 0x4879
[Bug 104274] Unable to cleanly unload kernel module: BUG: unable to handle kernel NULL pointer dereference at 0000000000000258 (mutex_lock)
https://bugs.freedesktop.org/show_bug.cgi?id=104274

--- Comment #5 from Harry Wentland ---
Can you try with the latest amd-staging-drm-next from https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-drm-next

We just fixed a bunch of driver unload issues. It should be fixed now.
[Bug 104274] Unable to cleanly unload kernel module: BUG: unable to handle kernel NULL pointer dereference at 0000000000000258 (mutex_lock)
https://bugs.freedesktop.org/show_bug.cgi?id=104274

--- Comment #4 from mikita.lip...@amd.com ---
Encountering a deadlock in DRM while trying to force-disable CRTCs.
If no displays are connected, the system hard hangs.
If DC is disabled (with any number of displays), the system hard hangs.
Currently investigating the deadlock issue.

Thanks
[Bug 104274] Unable to cleanly unload kernel module: BUG: unable to handle kernel NULL pointer dereference at 0000000000000258 (mutex_lock)
https://bugs.freedesktop.org/show_bug.cgi?id=104274

--- Comment #3 from Sverd Johnsen ---
Thanks for pointing me to that. Seems like it should have gone to stable.

Alright. Here is what I do now:
- boot with the Intel GPU; amdgpu is blacklisted in modules so it doesn't get autoloaded
- then I load amdgpu by hand with dc on or off; this time I make sure the display that is attached via DisplayPort to amdgpu is on
- I cannot unload until I use echo 0 > /sys/devices/virtual/vtconsole/vtcon1/bind

Here is with dc=1: rmmod hangs

[ 45.665237] LoadPin: kernel-module pinning-ignored obj="/usr/lib/modules/4.15.0-1-rc/kernel/drivers/gpu/drm/ttm/ttm.ko" pid=1201 cmdline="modprobe amdgpu dc=1" [ 45.682434] LoadPin: kernel-module pinning-ignored obj="/usr/lib/modules/4.15.0-1-rc/kernel/drivers/gpu/drm/amd/lib/chash.ko" pid=1201 cmdline="modprobe amdgpu dc=1" [ 45.693068] LoadPin: kernel-module pinning-ignored obj="/usr/lib/modules/4.15.0-1-rc/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko" pid=1201 cmdline="modprobe amdgpu dc=1" [ 46.202469] [drm] amdgpu kernel modesetting enabled. [ 46.202519] amdgpu :01:00.0: enabling device (0006 -> 0007) [ 46.202669] [drm] initializing kernel modesetting (POLARIS11 0x1002:0x67EF 0x1462:0x809D 0xCF).
[ 46.202681] [drm] register mmio base: 0xEFE0 [ 46.202682] [drm] register mmio size: 262144 [ 46.202694] [drm] probing gen 2 caps for device 8086:1901 = 261ad03/e [ 46.202694] [drm] probing mlw for device 8086:1901 = 261ad03 [ 46.202700] [drm] UVD is enabled in VM mode [ 46.202700] [drm] UVD ENC is enabled in VM mode [ 46.202702] [drm] VCE enabled in VM mode [ 46.202715] ATOM BIOS: 113-C99401-S01 [ 46.202721] [drm] GPU post is not needed [ 46.202730] [drm] vm size is 64 GB, block size is 13-bit, fragment size is 9-bit [ 46.203725] LoadPin: firmware pinning-ignored obj="/usr/lib/firmware/amdgpu/polaris11_mc.bin" pid=1201 cmdline="modprobe amdgpu dc=1" [ 46.204234] amdgpu :01:00.0: VRAM: 2048M 0x00F4 - 0x00F47FFF (2048M used) [ 46.204235] amdgpu :01:00.0: GTT: 256M 0x - 0x0FFF [ 46.204241] [drm] Detected VRAM RAM=2048M, BAR=256M [ 46.204242] [drm] RAM width 128bits GDDR5 [ 46.204321] [TTM] Zone kernel: Available graphics memory: 8082548 kiB [ 46.204321] [TTM] Zone dma32: Available graphics memory: 2097152 kiB [ 46.204321] [TTM] Initializing pool allocator [ 46.204323] [TTM] Initializing DMA pool allocator [ 46.204333] [drm] amdgpu: 2048M of VRAM memory ready [ 46.204333] [drm] amdgpu: 3072M of GTT memory ready. [ 46.204366] [drm] GART: num cpu pages 65536, num gpu pages 65536 [ 46.204414] [drm] PCIE GART of 256M enabled (table at 0x00F40004). [ 46.204447] amdgpu :01:00.0: amdgpu: using MSI. [ 46.204462] [drm] amdgpu: irq initialized. [ 46.204475] amdgpu: [powerplay] amdgpu: powerplay sw initialized [ 46.204487] LoadPin: firmware pinning-ignored obj="/usr/lib/firmware/amdgpu/polaris11_pfp_2.bin" pid=1201 cmdline="modprobe amdgpu dc=1" [ 46.205094] LoadPin: firmware pinning-ignored obj="/usr/lib/firmware/amdgpu/polaris11_me_2.bin" pid=1201 cmdline="modprobe amdgpu dc=1" [ 46.217397] LoadPin: firmware pinning-ignored obj="/usr/lib/firmware/amdgpu/polaris11_ce_2.bin" pid=1201 cmdline="modprobe amdgpu dc=1" [ 46.217550] [drm] Chained IB support enabled! 
[ 46.217556] LoadPin: firmware pinning-ignored obj="/usr/lib/firmware/amdgpu/polaris11_rlc.bin" pid=1201 cmdline="modprobe amdgpu dc=1"
[ 46.217822] LoadPin: firmware pinning-ignored obj="/usr/lib/firmware/amdgpu/polaris11_mec_2.bin" pid=1201 cmdline="modprobe amdgpu dc=1"
[ 46.218718] LoadPin: firmware pinning-ignored obj="/usr/lib/firmware/amdgpu/polaris11_mec2_2.bin" pid=1201 cmdline="modprobe amdgpu dc=1"
[ 46.219694] amdgpu :01:00.0: fence driver on ring 0 use gpu addr 0x00400040, cpu addr 0x72e5becd
[ 46.219847] amdgpu :01:00.0: fence driver on ring 1 use gpu addr 0x004000c0, cpu addr 0x7d1f8b28
[ 46.219906] amdgpu :01:00.0: fence driver on ring 2 use gpu addr 0x00400140, cpu addr 0x1b657fe6
[ 46.219946] amdgpu :01:00.0: fence driver on ring 3 use gpu addr 0x004001c0, cpu addr 0x9cd8ddd2
[ 46.219993] amdgpu :01:00.0: fence driver on ring 4 use gpu addr 0x00400240, cpu addr 0x4495111b
[ 46.220039] amdgpu :01:00.0: fence driver on ring 5 use gpu addr 0x004002c0, cpu addr 0x705ffa8e
[ 46.220113] amdgpu :01:00.0: fence driver on ring 6 use gpu addr 0x00400340, cpu addr 0x9005afe9
[ 46.220160] amdgpu :01:00.0: fence driver on ring 7 use gpu addr 0x004003c0, cpu addr 0xe3bdcb19
[ 46.220203] amdgpu :01:00.0: fence driver on ring 8 use gpu addr 0x00400440, cpu addr 0xd3b3c22e
[ 46.220214] amdgpu :01:00.0: fence driver on ring 9 use gpu addr 0x004004e0, cpu addr 0x02a1dbbf
[ 46.220487
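The console unbind step from Comment #3 could be scripted roughly as below. This is only a sketch under stated assumptions: the comment names just `vtcon1`, so scanning every `vtcon*` entry and selecting those whose `name` file contains "frame buffer" is an assumption, and the helper name `unbind_fb_consoles` is made up for illustration. On a real system it needs root.

```python
import glob
import os


def unbind_fb_consoles(root="/sys/devices/virtual/vtconsole"):
    """Write 0 to the bind file of each framebuffer console under root.

    Returns the list of vtcon directories that were unbound.
    """
    unbound = []
    for vtcon in sorted(glob.glob(os.path.join(root, "vtcon*"))):
        name_path = os.path.join(vtcon, "name")
        bind_path = os.path.join(vtcon, "bind")
        try:
            with open(name_path) as f:
                name = f.read()
        except OSError:
            # Entry without a readable name file: skip it.
            continue
        # Assumption: framebuffer consoles report "frame buffer" in name.
        if "frame buffer" not in name:
            continue
        # Same effect as: echo 0 > <vtcon>/bind
        with open(bind_path, "w") as f:
            f.write("0")
        unbound.append(vtcon)
    return unbound
```

Calling `unbind_fb_consoles()` before `rmmod amdgpu` would mirror the manual `echo 0 > .../vtcon1/bind` step on a machine where `vtcon1` is the framebuffer console. It does not address the hang reported here with dc=1.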
[Bug 104274] Unable to cleanly unload kernel module: BUG: unable to handle kernel NULL pointer dereference at 0000000000000258 (mutex_lock)
https://bugs.freedesktop.org/show_bug.cgi?id=104274

Michel Dänzer changed:

           What    |Removed                     |Added
                 CC|                            |harry.wentl...@amd.com,
                   |                            |jordan.laz...@amd.com

--- Comment #2 from Michel Dänzer ---
The problem in the original report was fixed by
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a072c5f896beba806b4b867d478e1b90f94ba29b .

Comment 1 looks like a new issue in DC. Looks like this might be related to the core DRM code now only initializing fbdev compatibility when a display connection is detected, like the change above.

Sverd, until the latter is fixed, maybe you can try if amdgpu.dc=0 helps.
[Bug 104274] Unable to cleanly unload kernel module: BUG: unable to handle kernel NULL pointer dereference at 0000000000000258 (mutex_lock)
https://bugs.freedesktop.org/show_bug.cgi?id=104274

--- Comment #1 from Sverd Johnsen ---
with 4.15rc (linus tree from today)

[ 78.807441] [drm] amdgpu: finishing device.
[ 78.887439] amdgpu: [powerplay]
[ 78.887454] amdgpu: [powerplay]
[ 78.968349] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 78.968352] IP: (null)
[ 78.968352] PGD 45af1d067 P4D 45af1d067 PUD 45b065067 PMD 0
[ 78.968354] Oops: 0010 [#1] PREEMPT SMP
[ 78.968355] Modules linked in: amdgpu(-) chash ttm cls_u32 sch_htb af_packet nft_limit nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nf_log_ipv6 nf_log_ipv4 nf_log_common nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nft_log nft_ct nf_conntrack xfrm_user xfrm_algo nft_counter nft_meta nft_set_bitmap nft_set_hash nft_set_rbtree nf_tables_netdev nf_tables_inet nf_tables_ipv6 nf_tables_ipv4 dm_cache_smq dm_cache dm_bio_prison dm_persistent_data dm_bufio libcrc32c bcache x86_pkg_temp_thermal intel_powerclamp kvm_intel vhost_net raid0 raid1 tun vhost tap kvm snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel snd_hda_codec md_mod snd_hwdep irqbypass snd_hda_core intel_cstate mei_me efi_pstore intel_uncore plusb mei input_leds usbnet snd_pcm intel_rapl_perf
[ 78.968375] led_class mii efivars tpm_crb usbip_host usbip_core efivarfs algif_skcipher af_alg mousedev joydev psmouse atkbd libps2 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr tpm_tis shpchp tpm_tis_core thermal fan tpm i8042 acpi_pad vfat fat
[ 78.968383] CPU: 1 PID: 1493 Comm: rmmod Not tainted 4.15.0-1-rc #2
[ 78.968384] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD3/Z170X-UD3-CF, BIOS F23d 12/01/2017
[ 78.968384] RIP: 0010: (null)
[ 78.968385] RSP: 0018:9c6a8281bd00 EFLAGS: 00010286
[ 78.968386] RAX: 9294612676e0 RBX: 92948b2a3300 RCX: 00018020001e
[ 78.968386] RDX: 00018020001f RSI: 5c02 RDI: 929461267120
[ 78.968387] RBP: 9294658a6c90 R08: 0001 R09: 929476247000
[ 78.968387] R10: R11: 0033 R12: 0003
[ 78.968388] R13: 929453e42f18 R14: R15: c09b13e8
[ 78.968389] FS: 7f0d7f498b80() GS:92949ec8() knlGS:
[ 78.968389] CS: 0010 DS: ES: CR0: 80050033
[ 78.968390] CR2: CR3: 00046b3f8001 CR4: 003606e0
[ 78.968390] DR0: DR1: DR2:
[ 78.968391] DR3: DR6: fffe0ff0 DR7: 0400
[ 78.968391] Call Trace:
[ 78.968429] ? destroy+0x1d/0xa0 [amdgpu]
[ 78.968440] ? dal_i2caux_destruct+0x58/0x90 [amdgpu]
[ 78.968449] ? destroy+0x10/0x30 [amdgpu]
[ 78.968458] ? dal_i2caux_destroy+0x17/0x30 [amdgpu]
[ 78.968467] ? destruct+0x89/0x110 [amdgpu]
[ 78.968476] ? dc_destroy+0xc/0x20 [amdgpu]
[ 78.968488] ? dm_hw_fini+0x19/0x20 [amdgpu]
[ 78.968493] ? amdgpu_fini+0x90/0x2f0 [amdgpu]
[ 78.968499] ? amdgpu_device_fini+0x5f/0x1c0 [amdgpu]
[ 78.968505] ? amdgpu_driver_unload_kms+0x45/0x90 [amdgpu]
[ 78.968507] ? drm_dev_unregister+0x37/0xe0
[ 78.968512] ? amdgpu_pci_remove+0x14/0x40 [amdgpu]
[ 78.968514] ? pci_device_remove+0x31/0xa0
[ 78.968516] ? device_release_driver_internal+0x152/0x210
[ 78.968517] ? driver_detach+0x32/0x70
[ 78.968518] ? bus_remove_driver+0x4c/0xc0
[ 78.968519] ? pci_unregister_driver+0x25/0xa0
[ 78.968529] ? amdgpu_exit+0x11/0x3b6 [amdgpu]
[ 78.968530] ? SyS_delete_module+0x19c/0x2a0
[ 78.968532] ? do_syscall_64+0x48/0xe0
[ 78.968533] ? entry_SYSCALL64_slow_path+0x25/0x25
[ 78.968534] Code: Bad RIP value.
[ 78.968536] RIP: (null) RSP: 9c6a8281bd00
[ 78.968537] CR2:
[ 78.968538] ---[ end trace bbf73cfe467dd4c8 ]---
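Read from bottom to top, the trace above runs from SyS_delete_module through amdgpu_pci_remove and dc_destroy into the dal_i2caux teardown, and the oops reports RIP as (null), which suggests a call through a NULL function pointer. To pull the frames out of a pasted log like this one, a small parser along the following lines may help. It is a sketch only: the regular expression and the name `parse_call_trace` are assumptions made for illustration, not part of any kernel tooling.

```python
import re

# Matches entries such as "? dc_destroy+0xc/0x20 [amdgpu]".
# A leading "? " marks a frame the unwinder could not verify.
FRAME_RE = re.compile(
    r"(?P<unreliable>\? )?(?P<func>[A-Za-z_][\w.]*)"
    r"\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)


def parse_call_trace(text):
    """Return (function, module) tuples in the order they appear.

    module is None for frames that carry no [module] suffix,
    i.e. symbols that are not attributed to a loadable module.
    """
    frames = []
    for match in FRAME_RE.finditer(text):
        frames.append((match.group("func"), match.group("module")))
    return frames
```

Filtering the result on `module == "amdgpu"` leaves only the driver's own frames, which is usually the quickest way to see where in the teardown the fault occurred.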
[Bug 104274] Unable to cleanly unload kernel module: BUG: unable to handle kernel NULL pointer dereference at 0000000000000258 (mutex_lock)
https://bugs.freedesktop.org/show_bug.cgi?id=104274

            Bug ID: 104274
           Summary: Unable to cleanly unload kernel module: BUG: unable to handle kernel NULL pointer dereference at 0258 (mutex_lock)
           Product: DRI
           Version: unspecified
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: DRM/AMDgpu
          Assignee: dri-devel@lists.freedesktop.org
          Reporter: sverd.john...@googlemail.com

The use case for this working well is that once the GPU is not needed anymore, it can be moved into a VM with VFIO. Moving from VFIO back to amdgpu already seems to work alright.

loading:

[4.751628] kernel: LoadPin: kernel-module pinning-ignored obj="/usr/lib/modules/4.14.5-5-ph/kernel/drivers/gpu/drm/ttm/ttm.ko" pid=940 cmdline="modprobe amdgpu disp_priority=1"
[4.769592] kernel: LoadPin: kernel-module pinning-ignored obj="/usr/lib/modules/4.14.5-5-ph/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko" pid=940 cmdline="modprobe amdgpu disp_priority=1"
[46667.037887] kernel: [drm] amdgpu kernel modesetting enabled.
[46667.037938] kernel: amdgpu :01:00.0: enabling device (0006 -> 0007)
[46667.038049] kernel: [drm] initializing kernel modesetting (POLARIS11 0x1002:0x67EF 0x1462:0x809D 0xCF).
[46667.038079] kernel: [drm] register mmio base: 0xEFE0
[46667.038079] kernel: [drm] register mmio size: 262144
[46667.038087] kernel: [drm] probing gen 2 caps for device 8086:1901 = 261ad03/e
[46667.038087] kernel: [drm] probing mlw for device 8086:1901 = 261ad03
[46667.038093] kernel: [drm] UVD is enabled in VM mode
[46667.038094] kernel: [drm] VCE enabled in VM mode
[46667.038114] kernel: ATOM BIOS: 113-C99401-S01
[46667.038119] kernel: [drm] GPU post is not needed
[46667.038184] kernel: [drm] vm size is 64 GB, block size is 13-bit, fragment size is 4-bit
[46667.038205] kernel: LoadPin: firmware pinning-ignored obj="/usr/lib/firmware/amdgpu/polaris11_mc.bin" pid=940 cmdline="modprobe amdgpu disp_priority=1"
[46667.038643] kernel: amdgpu :01:00.0: VRAM: 2048M 0x00F4 - 0x00F47FFF (2048M used)
[46667.038644] kernel: amdgpu :01:00.0: GTT: 256M 0x - 0x0FFF
[46667.038649] kernel: [drm] Detected VRAM RAM=2048M, BAR=256M
[46667.038649] kernel: [drm] RAM width 128bits GDDR5
[46667.039152] kernel: [TTM] Zone kernel: Available graphics memory: 8082768 kiB
[46667.039152] kernel: [TTM] Zone dma32: Available graphics memory: 2097152 kiB
[46667.039153] kernel: [TTM] Initializing pool allocator
[46667.039155] kernel: [TTM] Initializing DMA pool allocator
[46667.039167] kernel: [drm] amdgpu: 2048M of VRAM memory ready
[46667.039168] kernel: [drm] amdgpu: 3072M of GTT memory ready.
[46667.039202] kernel: [drm] GART: num cpu pages 65536, num gpu pages 65536
[46667.039284] kernel: [drm] PCIE GART of 256M enabled (table at 0x00F40004).
[46667.039300] kernel: [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[46667.039301] kernel: [drm] Driver supports precise vblank timestamp query.
[46667.039338] kernel: amdgpu :01:00.0: amdgpu: using MSI.
[46667.039348] kernel: [drm] amdgpu: irq initialized.
[46667.168524] kernel: amdgpu: [powerplay] amdgpu: powerplay sw initialized
[46667.168612] kernel: [drm] AMDGPU Display Connectors
[46667.168612] kernel: [drm] Connector 0:
[46667.168613] kernel: [drm] DP-2
[46667.168613] kernel: [drm] HPD5
[46667.168614] kernel: [drm] DDC: 0x4868 0x4868 0x4869 0x4869 0x486a 0x486a 0x486b 0x486b
[46667.168614] kernel: [drm] Encoders:
[46667.168614] kernel: [drm] DFP1: INTERNAL_UNIPHY1
[46667.168614] kernel: [drm] Connector 1:
[46667.168615] kernel: [drm] HDMI-A-4
[46667.168615] kernel: [drm] HPD3
[46667.168615] kernel: [drm] DDC: 0x4874 0x4874 0x4875 0x4875 0x4876 0x4876 0x4877 0x4877
[46667.168615] kernel: [drm] Encoders:
[46667.168616] kernel: [drm] DFP2: INTERNAL_UNIPHY1
[46667.168616] kernel: [drm] Connector 2:
[46667.168616] kernel: [drm] DVI-D-1
[46667.168616] kernel: [drm] HPD4
[46667.168617] kernel: [drm] DDC: 0x4878 0x4878 0x4879 0x4879 0x487a 0x487a 0x487b 0x487b
[46667.168617] kernel: [drm] Encoders:
[46667.168617] kernel: [drm] DFP3: INTERNAL_UNIPHY
[46667.168631] kernel: LoadPin: firmware pinning-ignored obj="/usr/lib/firmware/amdgpu/polaris11_pfp.bin" pid=940 cmdline="modprobe amdgpu disp_priority=1"
[46667.168927] kernel: LoadPin: firmware pinning-ignored obj="/usr/lib/firmware/amdgpu/polaris11_me.bin" pid=940 cmdline="modprobe amdgpu disp_priority=1"
[46667.169170] kernel: LoadPin: firmware pinning-ignored obj="/usr/lib/firmware/amdgpu/polaris11_ce.bin" pid=940 cmdline="modprobe amdgpu disp_priority=1"
[46667.169297] kernel: LoadPin: firmware pinning-ignored obj="/usr/lib/firmware/amdgpu/polaris11_rlc.bin" pid=940 cmdline="modprobe amdgpu disp_priority=1"
[46667.169543] kernel: LoadPin: firmware pinning-ignored o