Re: [Intel-gfx] kernel BUG at drivers/gpu/drm/i915/i915_gem
On Fri, 16 Dec 2011 17:38:39 +0100, tino.keitel+x...@tikei.de wrote:
> Hum, could this be related to the RC6 vs. IOMMU issue? I have
> CONFIG_INTEL_IOMMU enabled in 3.2, and disabled in 3.1, and boot with
> i915.i915_enable_rc6=1.

Very likely -- RC6 and semaphores are not compatible with IOMMU. You
should be able to just disable VTd in your BIOS and use the kernel with
IOMMU enabled.

--
keith.pack...@intel.com

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
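The combination discussed above (RC6 enabled on the command line while the kernel is built with CONFIG_INTEL_IOMMU) can be checked mechanically. The following is only a minimal sketch, not a tool from this thread: the function names are hypothetical, the inputs are assumed to be the text of /proc/cmdline and of the kernel config file (wherever the distribution stores it), and treating any positive value of i915.i915_enable_rc6 as "enabled" is an assumption. It also cannot see whether VT-d is switched off in the BIOS.

```python
def rc6_enabled(cmdline: str) -> bool:
    """Return True if i915.i915_enable_rc6 is set to a positive integer.

    Assumption: a positive value means "RC6 requested". If the option
    appears more than once, the last occurrence is used.
    """
    enabled = False
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "i915.i915_enable_rc6" and sep:
            try:
                enabled = int(value) > 0
            except ValueError:
                enabled = False
    return enabled


def iommu_built_in(config_text: str) -> bool:
    """Return True if the kernel config text contains CONFIG_INTEL_IOMMU=y."""
    return any(line.strip() == "CONFIG_INTEL_IOMMU=y"
               for line in config_text.splitlines())


def risky_combination(cmdline: str, config_text: str) -> bool:
    """Flag RC6 requested together with an IOMMU-capable kernel.

    intel_iommu=off on the command line is treated as "IOMMU not in use".
    A BIOS-level VT-d switch is invisible to this check.
    """
    if "intel_iommu=off" in cmdline.split():
        return False
    return rc6_enabled(cmdline) and iommu_built_in(config_text)
```

A True result only means the setup matches the one reported in this thread; it does not prove that the hang has this cause.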
Re: [Intel-gfx] kernel BUG at drivers/gpu/drm/i915/i915_gem
On Thu, Dec 15, 2011 at 19:54:15 +0100, tino.keitel+x...@tikei.de wrote:
> On Wed, Dec 14, 2011 at 20:57:51 +0100, tino.keitel+x...@tikei.de wrote:
>
> [...]
>
> > it looks I stumbled over the same:
> >
> > [88399.844150] kernel BUG at drivers/gpu/drm/i915/i915_gem.c:1952!
> > 2011-12-14_19:28:56.93083 <0>[88399.844182] invalid opcode: [#1]
> > SMP
> >
> > While doing this I was running a 32 bit photo software (Bibble 5 pro)
> > in fullscreen on an otherwise 64 bit system.
> >
> > The full log including the trace is attached.
> >
> > I'll try with the patch applied.
>
> I just got the same hang again with the patch applied.

Hum, could this be related to the RC6 vs. IOMMU issue? I have
CONFIG_INTEL_IOMMU enabled in 3.2, and disabled in 3.1, and boot with
i915.i915_enable_rc6=1.

Regards,
Tino
Re: [Intel-gfx] kernel BUG at drivers/gpu/drm/i915/i915_gem
On Wed, Dec 14, 2011 at 20:57:51 +0100, tino.keitel+x...@tikei.de wrote:

[...]

> it looks I stumbled over the same:
>
> [88399.844150] kernel BUG at drivers/gpu/drm/i915/i915_gem.c:1952!
> 2011-12-14_19:28:56.93083 <0>[88399.844182] invalid opcode: [#1]
> SMP
>
> While doing this I was running a 32 bit photo software (Bibble 5 pro)
> in fullscreen on an otherwise 64 bit system.
>
> The full log including the trace is attached.
>
> I'll try with the patch applied.

I just got the same hang again with the patch applied.

Regards,
Tino
Re: [Intel-gfx] kernel BUG at drivers/gpu/drm/i915/i915_gem
On Wed, Dec 14, 2011 at 02:47:33 +0100, Daniel Vetter wrote:
> On Mon, Dec 12, 2011 at 10:16, Rocko Requin wrote:
> >> If you can wire up netconsole you should be able to gather the full
> >> backtrace and that would be really useful. Otherwise can you please
> >> confirm by reverting that commit from your current tree that it is
> >> indeed the culprit? Otherwise please bisect the issue.
> >
> > I built 3.2-rc5 with the patch from commit
> > eb1711bb94991e93669c5a1b5f84f11be2d51ea1 reversed, and have been using it
> > now for a day and a half without any i915_gem issues. So at this stage it
> > does seem likely it is the culprit, based on the fact that I had at least 2
> > and probably 3 i915_gem crashes in around 12 hours with the commit applied.
> > When I get some free time I'll reapply the patch and see if I can reproduce
> > the crash and get a netconsole dump.
>
> Backtraces from another reporter seriously look like we're hitting
> some ugly use-after free. Can you please test whether the patch
> "drm/i915: Only clear the GPU domains upon a successful finish" by
> Chris Wilson fixes anything for you? You can grab it from
> http://cgit.freedesktop.org/~danvet/drm/patch/?id=389a55581e30607af0fcde6cdb4e54f189cf46cf

Hi,

it looks I stumbled over the same:

[88399.844150] kernel BUG at drivers/gpu/drm/i915/i915_gem.c:1952!
2011-12-14_19:28:56.93083 <0>[88399.844182] invalid opcode: [#1] SMP

While doing this I was running a 32 bit photo software (Bibble 5 pro)
in fullscreen on an otherwise 64 bit system.

The full log including the trace is attached.

I'll try with the patch applied.

Regards,
Tino

[88399.844150] kernel BUG at drivers/gpu/drm/i915/i915_gem.c:1952!
2011-12-14_19:28:56.93083 <0>[88399.844182] invalid opcode: [#1] SMP
[88399.844210] CPU 3
[88399.844222] Modules linked in: bluetooth cpufreq_stats fuse ipv6 loop snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_pcm dvb_usb_vp7045 dvb_usb dvb_core rc_core snd_timer xhci_hcd snd_page_alloc evdev
[88399.844368]
[88399.844380] Pid: 8959, comm: Xorg Not tainted 3.2.0-rc5-1-g3aae701 #24 /DH67BL
[88399.844439] RIP: 0010:[] [] i915_wait_request+0x516/0x530
[88399.844491] RSP: 0018:88020b3cbbe8 EFLAGS: 00010246
[88399.844519] RAX: 88021661e800 RBX: 880216692038 RCX: 5250
[88399.844555] RDX: 8802166923f8 RSI: RDI: 880216692038
[88399.844591] RBP: 880216692000 R08: 0010 R09: 0002
[88399.844627] R10: 88021661e800 R11: 005a R12:
[88399.844664] R13: R14: R15: 88021661e800
[88399.844700] FS: 7fbef0355880() GS:88021fb8() knlGS:
[88399.844741] CS: 0010 DS: ES: CR0: 8005003b
[88399.844770] CR2: 7f2c5ed0200c CR3: 000216748000 CR4: 000406e0
[88399.844806] DR0: DR1: DR2:
[88399.844842] DR3: DR6: 0ff0 DR7: 0400
[88399.844879] Process Xorg (pid: 8959, threadinfo 88020b3ca000, task 8801f0eee110)
2011-12-14_19:28:56.93093 <0>[88399.844919] Stack:
[88399.844932] 88021661e800 813ac08c 0042 0042
[88399.844977] 8802166922f8 8138056e 8802166923f8 0042
[88399.845023] 88020b3cbd10 81385c01 880216692038
2011-12-14_19:28:56.93094 <0>[88399.845068] Call Trace:
[88399.845087] [] ? blt_ring_flush+0xdc/0x110
[88399.845120] [] ? i915_gem_flush_ring+0x4e/0x210
[88399.845154] [] ? i915_gem_execbuffer_relocate_entry+0x171/0x300
[88399.845193] [] ? i915_gem_do_execbuffer.isra.8+0xb3c/0x13d0
[88399.845233] [] ? i915_gem_object_set_to_gtt_domain+0xd8/0x1d0
[88399.845272] [] ? i915_gem_execbuffer2+0x9e/0x260
[88399.845307] [] ? drm_ioctl+0x3ec/0x4a0
[88399.845336] [] ? i915_gem_execbuffer+0x410/0x410
[88399.845371] [] ? do_vfs_ioctl+0x96/0x550
[88399.845402] [] ? vfs_read+0x14d/0x170
[88399.845430] [] ? sys_ioctl+0x49/0x80
[88399.845460] [] ? system_call_fastpath+0x16/0x1b
2011-12-14_19:28:56.93100 <0>[88399.845491] Code: ff ff 0f 1f 00 45 31 e4 e9 de fc ff ff 0f 1f 84 00 00 00 00 00 41 bc f0 ff ff ff e9 cb fc ff ff 41 bc f4 ff ff ff e9 8a fb ff ff <0f> 0b 41 bc 00 fe ff ff eb 91 45 85 e4 0f 84 5e fb ff ff e9 a8
2011-12-14_19:28:56.93101 kern.alert: [88399.845744] RIP [] i915_wait_request+0x516/0x530
[88399.845781] RSP
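When comparing several reports like the one above, it can help to reduce each oops to its BUG location and the function names in the call trace. The following is a rough sketch under stated assumptions, not something used in this thread: `summarize_oops` and both regular expressions are hypothetical, and the patterns are modelled only on the log format shown here (a "kernel BUG at file:line!" line, then "Call Trace:" followed by "symbol+0xoff/0xlen" entries up to the "Code:" line).

```python
import re

# "kernel BUG at <file>:<line>!" as printed in the log above.
BUG_RE = re.compile(r"kernel BUG at (\S+):(\d+)!")

# A call-trace entry: a closing bracket, an optional "?", then symbol+0xoff/0xlen.
FRAME_RE = re.compile(
    r"\]\s+(?:\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def summarize_oops(log_text):
    """Return ((file, line) or None, [function names in the call trace])."""
    bug = BUG_RE.search(log_text)
    location = (bug.group(1), int(bug.group(2))) if bug else None

    # Only look between "Call Trace:" and the "Code:" line, so that the
    # trailing RIP line is not mistaken for a stack frame.
    _, found, after = log_text.partition("Call Trace:")
    trace = after.partition("Code:")[0] if found else ""
    return location, FRAME_RE.findall(trace)
```

Two reports with the same location and a similar list of frames are likely instances of the same problem; that is a heuristic, not a guarantee.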
Re: [Intel-gfx] kernel BUG at drivers/gpu/drm/i915/i915_gem
On Mon, Dec 12, 2011 at 10:16, Rocko Requin wrote:
>> If you can wire up netconsole you should be able to gather the full
>> backtrace and that would be really useful. Otherwise can you please
>> confirm by reverting that commit from your current tree that it is
>> indeed the culprit? Otherwise please bisect the issue.
>
> I built 3.2-rc5 with the patch from commit
> eb1711bb94991e93669c5a1b5f84f11be2d51ea1 reversed, and have been using it
> now for a day and a half without any i915_gem issues. So at this stage it
> does seem likely it is the culprit, based on the fact that I had at least 2
> and probably 3 i915_gem crashes in around 12 hours with the commit applied.
> When I get some free time I'll reapply the patch and see if I can reproduce
> the crash and get a netconsole dump.

Backtraces from another reporter seriously look like we're hitting
some ugly use-after free. Can you please test whether the patch
"drm/i915: Only clear the GPU domains upon a successful finish" by
Chris Wilson fixes anything for you? You can grab it from
http://cgit.freedesktop.org/~danvet/drm/patch/?id=389a55581e30607af0fcde6cdb4e54f189cf46cf

Thanks, Daniel
--
Daniel Vetter
daniel.vet...@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch
Re: [Intel-gfx] kernel BUG at drivers/gpu/drm/i915/i915_gem
On Sat, Dec 10, 2011 at 03:09, Rocko Requin wrote:
>> Please report bugs always on the public
>> mailing list so that all interested people can join in
>
> Apologies, I used to log them to the kernel bug tracker, which never came
> back up after kernel.org was rebuilt, and I didn't want to bother the public
> list.

You've trimmed the cc: list and removed intel-gfx and linux-kernel. Don't
do that, people won't see your reply otherwise. Re-added to the cc: list.

>> Unfortunately you've cut out the backtrace from the
>> kernel oops, so I can't tell anything more.
>
> It wasn't me! The kernel locked up before it could complete writing the
> backtrace to the log, so it simply isn't there (and unfortunately the PC
> didn't switch to a tty to display it). There is the beginning of a backtrace
> from the same bug the day before, but it doesn't get much past the invalid
> opcode line

If you can wire up netconsole you should be able to gather the full
backtrace and that would be really useful. Otherwise can you please
confirm by reverting that commit from your current tree that it is
indeed the culprit? Otherwise please bisect the issue.

-Daniel
--
Daniel Vetter
daniel.vet...@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch
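For the netconsole suggestion above: netconsole sends kernel log lines as UDP datagrams to a second machine, so the receiving side only needs something that listens on the target port and writes what arrives to disk. The following is a minimal sketch of such a receiver, assuming the default netconsole target port 6666; the function names and the log file name are made up for illustration, and the sender-side `netconsole=` parameter has to be set up separately as described in the kernel's netconsole documentation.

```python
import socket


def open_listener(port=6666, host="0.0.0.0"):
    """Bind a UDP socket on the port the crashing machine logs to.

    6666 is netconsole's default target port; pass another value if the
    sender was configured differently.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def capture(sock, log_path, max_packets=None):
    """Append each received datagram to log_path, flushing after every one.

    Runs until max_packets datagrams have been written, or forever when
    max_packets is None. Returns the number of datagrams written.
    """
    count = 0
    with open(log_path, "ab") as log:
        while max_packets is None or count < max_packets:
            data, _sender = sock.recvfrom(65535)
            log.write(data)
            log.flush()  # keep the file current while the sender is dying
            count += 1
    return count
```

On the receiving machine one would call `capture(open_listener(), "netconsole.log")` before reproducing the hang. Because the log ends up on a different host, the backtrace is not lost when the crashing machine locks up before it can write its own log.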
Re: [Intel-gfx] kernel BUG at drivers/gpu/drm/i915/i915_gem
Hi Rocko,

Thanks for the bug report. Please report bugs always on the public
mailing list so that all interested people can join in (also I'm
travelling atm, so I can't do much).

Afaics we're hitting BUG_ON(seqno == 0) in wait_request. Unfortunately
you've cut out the backtrace from the kernel oops, so I can't tell
anything more. Please add the full dmesg and the usual details about
your machine.

Yours, Daniel

On Fri, Dec 9, 2011 at 00:11, Rocko Requin wrote:
> Hi Daniel,
>
> Since I applied commit eb1711bb94991e93669c5a1b5f84f11be2d51ea1 (drm/i915:
> fix infinite recursion on unbind due to ilk vt-d w/a) to my kernel yesterday
> afternoon, I've had the kernel lock up completely three times on me, so I'm
> assuming it might have been the cause. (I was previously using a kernel
> compiled from git on Dec 4th without problems.)
>
> The last lockup showed this in the syslog:
>
> Dec 9 06:23:15 sierra kernel: [44486.289428] kernel BUG at drivers/gpu/drm/i915/i915_gem.c:1952!
> Dec 9 06:23:15 sierra kernel: [44486.289460] invalid opcode: [#1] SMP
> ...
> Dec 9 06:23:15 sierra kernel: [44486.289844] RIP: 0010:[] [] i915_wait_request+0x563/0x580 [i915]
>
> I have attached a longer version of the log (but note that it was truncated
> by the crash).
>
> I am booting the kernel with i915.i915_enable_rc6=1 but I didn't have any
> lockup issues until yesterday.
>
> Thanks,
> rocko

--
Daniel Vetter
daniel.vet...@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch