Re: [drm/ttm] Memory corruption problem when ttm_tt_init() fails.
On Wed, 15 Jul 2020 at 17:00, Tetsuo Handa wrote:
>
> On 2020/07/14 18:13, Gu Jinxiang wrote:
> > I've encountered [BUG: unable to handle kernel NULL pointer dereference at]
> > which has call stack like your pattern2.
> > And before this happened, I got a lot of memory allocation failure
> > warnings.
> > And my kernel is 3.10.0-327.62.1.el7.x86_64.
> >
> > Since, you mentioned it may be a bug of drm/ttm. So, I checked drm/ttm for
> > possible patch to fix this problem, but found nothing.
> > Could you please tell me is there any progress of this problem that you
> > detected.
>
> I'm not aware of any progress on https://patchwork.kernel.org/patch/5681611/ .

Just found this email, I've hopefully fixed this issue in my drm-next tree with

https://patchwork.freedesktop.org/patch/380782/

Dave.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
[drm/ttm] Memory corruption problem when ttm_tt_init() fails.
Hi,

I've encountered [BUG: unable to handle kernel NULL pointer dereference at]
which has call stack like your pattern2.
And before this happened, I got a lot of memory allocation failure warnings.
And my kernel is 3.10.0-327.62.1.el7.x86_64.

Since, you mentioned it may be a bug of drm/ttm. So, I checked drm/ttm for
possible patch to fix this problem, but found nothing.
Could you please tell me is there any progress of this problem that you
detected.

Best wishes!
Jinxiang, Gu
Re: [drm/ttm] Memory corruption problem when ttm_tt_init() fails.
On 2020/07/14 18:13, Gu Jinxiang wrote:
> I've encountered [BUG: unable to handle kernel NULL pointer dereference at]
> which has call stack like your pattern2.
> And before this happened, I got a lot of memory allocation failure warnings.
> And my kernel is 3.10.0-327.62.1.el7.x86_64.
>
> Since, you mentioned it may be a bug of drm/ttm. So, I checked drm/ttm for
> possible patch to fix this problem, but found nothing.
> Could you please tell me is there any progress of this problem that you
> detected.

I'm not aware of any progress on https://patchwork.kernel.org/patch/5681611/ .
[drm/ttm] Memory corruption problem when ttm_tt_init() fails.
I'm doing a memory allocation failure injection test using 3.19-rc5 and it
seems to me that there is a memory corruption bug in ttm or vmwgfx code.

-- Crash pattern 1 start --
[ 80.751971] [TTM] Failed allocating page table
[ 83.000393] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 83.004392] IP: [] __fput+0x39/0x1e0
[ 83.006944] PGD 7acd2067 PUD 7b0c7067 PMD 0
[ 83.009240] Oops: [#1] SMP
[ 83.010940] Modules linked in: stap_fault_injection(OE) ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_mangle ip6table_raw ip6table_filter ip6_tables iptable_mangle iptable_raw iptable_filter ip_tables coretemp crct10dif_pclmul crc32_pclmul crc32c_intel dm_mirror ghash_clmulni_intel dm_region_hash aesni_intel dm_log glue_helper dm_mod lrw gf128mul ablk_helper cryptd ppdev vmw_balloon microcode serio_raw pcspkr parport_pc shpchp parport vmw_vmci i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc uinput sd_mod ata_generic pata_acpi mptspi scsi_transport_spi mptscsih ata_piix e1000 mptbase libata floppy
[ 83.038033] CPU: 2 PID: 8795 Comm: sh Tainted: GW OE 3.19.0-rc5+ #28
[ 83.039666] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013
[ 83.042110] task: 88007a22 ti: 880052048000 task.ti: 880052048000
[ 83.043865] RIP: 0010:[] [] __fput+0x39/0x1e0
[ 83.045665] RSP: 0018:88005204bea8 EFLAGS: 00010297
[ 83.046895] RAX: RBX: 88007aff3500 RCX: 0a0a
[ 83.048595] RDX: 0002801d RSI: 000a RDI: 88007aff3500
[ 83.050254] RBP: 88005204bee8 R08: 88007cbfd000 R09: 000180080006
[ 83.051848] R10: R11: ea0001f2fe00 R12: 81e6c040
[ 83.053515] R13: R14: R15:
[ 83.055156] FS: () GS:88007fc8() knlGS:
[ 83.057000] CS: 0010 DS: ES: CR0: 80050033
[ 83.058328] CR2: CR3: 7b0bc000 CR4: 000407e0
[ 83.060004] Stack:
[ 83.060482] 88007af0de48 88007af0dc00 88007af0de48
[ 83.062285] 81e6c040 88007a220610 88007a22
[ 83.064115] 88005204bef8 811b679e 88005204bf28 81088f6f
[ 83.065956] Call Trace:
[ 83.066544] [] fput+0xe/0x10
[ 83.067738] [] task_work_run+0xaf/0xf0
[ 83.068971] [] do_notify_resume+0x7a/0x90
[ 83.070307] [] int_signal+0x12/0x17
[ 83.071464] Code: 55 41 54 53 48 89 fb 48 83 ec 18 4c 8b 7f 18 4c 8b 77 10 4c 8b 6f 20 e8 06 c7 4e 00 8b 53 44 4c 8b 53 20 89 d0 83 e0 02 83 f8 01 <41> 0f b7 02 45 19 e4 41 83 e4 08 41 83 c4 08 44 89 e1 66 25 00
[ 83.077450] RIP [] __fput+0x39/0x1e0
[ 83.078729] RSP
[ 83.079522] CR2:

crash> bt -l
PID: 8795 TASK: 88007a22 CPU: 2 COMMAND: "sh"
 #0 [88005204ba70] machine_kexec at 8104ef62
    /usr/src/linux/arch/x86/kernel/machine_kexec_64.c: 320
 #1 [88005204bac0] crash_kexec at 810ed983
    /usr/src/linux/kernel/kexec.c: 1482
 #2 [88005204bb90] oops_end at 810176e8
    /usr/src/linux/arch/x86/kernel/dumpstack.c: 231
 #3 [88005204bbc0] no_context at 8169af1f
    /usr/src/linux/arch/x86/mm/fault.c: 724
 #4 [88005204bc20] __bad_area_nosemaphore at 8169aff6
    /usr/src/linux/arch/x86/mm/fault.c: 804
 #5 [88005204bc70] bad_area at 8169b31f
    /usr/src/linux/arch/x86/mm/fault.c: 833
 #6 [88005204bca0] __do_page_fault at 81059b37
    /usr/src/linux/arch/x86/mm/fault.c: 1213
 #7 [88005204bdc0] do_page_fault at 81059c11
    /usr/src/linux/arch/x86/mm/fault.c: 1295
 #8 [88005204bdf0] page_fault at 816a8a28
    /usr/src/linux/arch/x86/kernel/entry_64.S: 1283
    [exception RIP: __fput+57]
    RIP: 811b65a9 RSP: 88005204bea8 RFLAGS: 00010297
    RAX: RBX: 88007aff3500 RCX: 0a0a
    RDX: 0002801d RSI: 000a RDI: 88007aff3500
    RBP: 88005204bee8 R8: 88007cbfd000 R9: 000180080006
    R10: R11: ea0001f2fe00 R12: 81e6c040
    R13: R14: R15:
    ORIG_RAX: CS: 0010 SS: 0018
 #9 [88005204bef0] fput at 811b679e
    /usr/src/linux/fs/file_table.c: 245
#10 [88005204bf00] task_work_run at 81088f6f
    /usr/src/linux/kernel/task_work.c: 125
#11 [88005204bf30] do_notify_resume at 81013c5a
    /usr/src/linux/include/linux/tracehook.h: 190
#12 [88005204bf50] int_signal at 816a6d87
    /usr/src/linux/arch/x86/kernel