Re: [Intel-gfx] kernel BUG at drivers/gpu/drm/i915/i915_gem

2011-12-16 Thread Keith Packard
On Fri, 16 Dec 2011 17:38:39 +0100, tino.keitel+x...@tikei.de wrote:

> Hum, could this be related to the RC6 vs. IOMMU issue? I have
> CONFIG_INTEL_IOMMU enabled in 3.2, and disabled in 3.1, and boot with
> i915.i915_enable_rc6=1.

Very likely -- RC6 and semaphores are not compatible with IOMMU. You
should be able to just disable VT-d in your BIOS and use the kernel with
IOMMU enabled.
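
To check whether a machine is in the combination described above, the boot command line and the kernel config can be inspected together. What follows is a minimal Python sketch, not an existing tool: the helper name `rc6_iommu_risk` is made up for illustration. It only looks at the `i915.i915_enable_rc6` parameter and the `CONFIG_INTEL_IOMMU` option mentioned in this thread, plus the `intel_iommu=off` boot parameter. It cannot see whether VT-d is switched off in the BIOS.

```python
def rc6_iommu_risk(cmdline: str, config_text: str) -> bool:
    """Return True if RC6 is force-enabled while the Intel IOMMU may be active.

    cmdline     -- contents of /proc/cmdline
    config_text -- contents of the kernel config file for the running kernel
    """
    params = cmdline.split()

    # RC6 explicitly requested on the kernel command line.
    rc6_on = "i915.i915_enable_rc6=1" in params

    # IOMMU support compiled in ("# CONFIG_INTEL_IOMMU is not set" does not match).
    iommu_built = any(
        line.strip() == "CONFIG_INTEL_IOMMU=y"
        for line in config_text.splitlines()
    )

    # IOMMU explicitly disabled at boot.
    iommu_off = "intel_iommu=off" in params

    return rc6_on and iommu_built and not iommu_off
```

Passing in the contents of /proc/cmdline and the config of the running kernel gives a rough yes/no answer. A True result only means the combination is possible, not that it caused the crash.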

-- 
keith.pack...@intel.com


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] kernel BUG at drivers/gpu/drm/i915/i915_gem

2011-12-16 Thread tino . keitel+xorg
On Thu, Dec 15, 2011 at 19:54:15 +0100, tino.keitel+x...@tikei.de wrote:
> On Wed, Dec 14, 2011 at 20:57:51 +0100, tino.keitel+x...@tikei.de wrote:
> 
> [...]
> 
> > it looks like I stumbled over the same:
> > 
> > [88399.844150] kernel BUG at drivers/gpu/drm/i915/i915_gem.c:1952!
> > 2011-12-14_19:28:56.93083 <0>[88399.844182] invalid opcode:  [#1] SMP
> > 
> > While doing this I was running 32-bit photo software (Bibble 5 pro)
> > in fullscreen on an otherwise 64-bit system.
> > 
> > The full log including the trace is attached.
> > 
> > I'll try with the patch applied.
> 
> I just got the same hang again with the patch applied.

Hum, could this be related to the RC6 vs. IOMMU issue? I have
CONFIG_INTEL_IOMMU enabled in 3.2, and disabled in 3.1, and boot with
i915.i915_enable_rc6=1.

Regards,
Tino


Re: [Intel-gfx] kernel BUG at drivers/gpu/drm/i915/i915_gem

2011-12-15 Thread tino . keitel+xorg
On Wed, Dec 14, 2011 at 20:57:51 +0100, tino.keitel+x...@tikei.de wrote:

[...]

> it looks like I stumbled over the same:
> 
> [88399.844150] kernel BUG at drivers/gpu/drm/i915/i915_gem.c:1952!
> 2011-12-14_19:28:56.93083 <0>[88399.844182] invalid opcode:  [#1] SMP
> 
> While doing this I was running 32-bit photo software (Bibble 5 pro)
> in fullscreen on an otherwise 64-bit system.
> 
> The full log including the trace is attached.
> 
> I'll try with the patch applied.

I just got the same hang again with the patch applied.

Regards,
Tino


Re: [Intel-gfx] kernel BUG at drivers/gpu/drm/i915/i915_gem

2011-12-14 Thread tino . keitel+xorg
On Wed, Dec 14, 2011 at 02:47:33 +0100, Daniel Vetter wrote:
> On Mon, Dec 12, 2011 at 10:16, Rocko Requin wrote:
> >> If you can wire up netconsole you should be able to gather the full
> >> backtrace and that would be really useful. Otherwise can you please
> >> confirm by reverting that commit from your current tree that it is
> >> indeed the culprit? Otherwise please bisect the issue.
> >
> > I built 3.2-rc5 with the patch from commit
> > eb1711bb94991e93669c5a1b5f84f11be2d51ea1 reversed, and have been using it
> > now for a day and a half without any i915_gem issues. So at this stage it
> > does seem likely it is the culprit, based on the fact that I had at least 2
> > and probably 3 i915_gem crashes in around 12 hours with the commit applied.
> > When I get some free time I'll reapply the patch and see if I can reproduce
> > the crash and get a netconsole dump.
> 
> Backtraces from another reporter seriously look like we're hitting
> some ugly use-after-free. Can you please test whether the patch
> "drm/i915: Only clear the GPU domains upon a successful finish" by
> Chris Wilson fixes anything for you? You can grab it from
> http://cgit.freedesktop.org/~danvet/drm/patch/?id=389a55581e30607af0fcde6cdb4e54f189cf46cf

Hi,

it looks like I stumbled over the same:

[88399.844150] kernel BUG at drivers/gpu/drm/i915/i915_gem.c:1952!
2011-12-14_19:28:56.93083 <0>[88399.844182] invalid opcode:  [#1] SMP

While doing this I was running 32-bit photo software (Bibble 5 pro)
in fullscreen on an otherwise 64-bit system.

The full log including the trace is attached.

I'll try with the patch applied.

Regards,
Tino
[88399.844150] kernel BUG at drivers/gpu/drm/i915/i915_gem.c:1952!
2011-12-14_19:28:56.93083 <0>[88399.844182] invalid opcode:  [#1] SMP 
[88399.844210] CPU 3 
[88399.844222] Modules linked in: bluetooth cpufreq_stats fuse ipv6 loop snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_pcm dvb_usb_vp7045 dvb_usb dvb_core rc_core snd_timer xhci_hcd snd_page_alloc evdev
[88399.844368] 
[88399.844380] Pid: 8959, comm: Xorg Not tainted 3.2.0-rc5-1-g3aae701 #24  /DH67BL
[88399.844439] RIP: 0010:[]  [] i915_wait_request+0x516/0x530
[88399.844491] RSP: 0018:88020b3cbbe8  EFLAGS: 00010246
[88399.844519] RAX: 88021661e800 RBX: 880216692038 RCX: 5250
[88399.844555] RDX: 8802166923f8 RSI:  RDI: 880216692038
[88399.844591] RBP: 880216692000 R08: 0010 R09: 0002
[88399.844627] R10: 88021661e800 R11: 005a R12: 
[88399.844664] R13:  R14:  R15: 88021661e800
[88399.844700] FS:  7fbef0355880() GS:88021fb8() knlGS:
[88399.844741] CS:  0010 DS:  ES:  CR0: 8005003b
[88399.844770] CR2: 7f2c5ed0200c CR3: 000216748000 CR4: 000406e0
[88399.844806] DR0:  DR1:  DR2: 
[88399.844842] DR3:  DR6: 0ff0 DR7: 0400
[88399.844879] Process Xorg (pid: 8959, threadinfo 88020b3ca000, task 8801f0eee110)
2011-12-14_19:28:56.93093 <0>[88399.844919] Stack:
[88399.844932]  88021661e800 813ac08c 0042 0042
[88399.844977]  8802166922f8 8138056e 8802166923f8 0042
[88399.845023]  88020b3cbd10 81385c01  880216692038
2011-12-14_19:28:56.93094 <0>[88399.845068] Call Trace:
[88399.845087]  [] ? blt_ring_flush+0xdc/0x110
[88399.845120]  [] ? i915_gem_flush_ring+0x4e/0x210
[88399.845154]  [] ? i915_gem_execbuffer_relocate_entry+0x171/0x300
[88399.845193]  [] ? i915_gem_do_execbuffer.isra.8+0xb3c/0x13d0
[88399.845233]  [] ? i915_gem_object_set_to_gtt_domain+0xd8/0x1d0
[88399.845272]  [] ? i915_gem_execbuffer2+0x9e/0x260
[88399.845307]  [] ? drm_ioctl+0x3ec/0x4a0
[88399.845336]  [] ? i915_gem_execbuffer+0x410/0x410
[88399.845371]  [] ? do_vfs_ioctl+0x96/0x550
[88399.845402]  [] ? vfs_read+0x14d/0x170
[88399.845430]  [] ? sys_ioctl+0x49/0x80
[88399.845460]  [] ? system_call_fastpath+0x16/0x1b
2011-12-14_19:28:56.93100 <0>[88399.845491] Code: ff ff 0f 1f 00 45 31 e4 e9 de fc ff ff 0f 1f 84 00 00 00 00 00 41 bc f0 ff ff ff e9 cb fc ff ff 41 bc f4 ff ff ff e9 8a fb ff ff <0f> 0b 41 bc 00 fe ff ff eb 91 45 85 e4 0f 84 5e fb ff ff e9 a8 
2011-12-14_19:28:56.93101 kern.alert: [88399.845744] RIP  [] i915_wait_request+0x516/0x530
[88399.845781]  RSP 
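
A note on reading the log above: the byte marked `<0f>` in the Code: line is the start of the instruction at the faulting RIP. On x86, `0f 0b` is `ud2`, the instruction BUG() traps through, which is why "invalid opcode" appears right after "kernel BUG at". The following Python sketch pulls the marked bytes out of such a line. The helper names `faulting_bytes` and `is_bug_trap` are illustrative only and do not come from any existing tool.

```python
import re
from typing import List

# x86 encoding of ud2, the instruction BUG() uses to trap.
UD2 = ["0f", "0b"]


def faulting_bytes(code_line: str, count: int = 2) -> List[str]:
    """Return `count` opcode bytes starting at the one the oops marks with <..>."""
    # Ignore timestamps and anything else in front of the "Code:" label.
    _, _, dump = code_line.partition("Code:")
    tokens = dump.split()
    for i, tok in enumerate(tokens):
        m = re.fullmatch(r"<([0-9a-f]{2})>", tok)
        if m:
            return [m.group(1)] + tokens[i + 1 : i + count]
    return []  # no marked byte found


def is_bug_trap(code_line: str) -> bool:
    """True if the faulting instruction is ud2, i.e. a BUG()/BUG_ON() hit."""
    return faulting_bytes(code_line) == UD2
```

For the Code: line in this log, `is_bug_trap` returns True, consistent with an assertion failure rather than a stray jump into garbage.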


Re: [Intel-gfx] kernel BUG at drivers/gpu/drm/i915/i915_gem

2011-12-13 Thread Daniel Vetter
On Mon, Dec 12, 2011 at 10:16, Rocko Requin wrote:
>> If you can wire up netconsole you should be able to gather the full
>> backtrace and that would be really useful. Otherwise can you please
>> confirm by reverting that commit from your current tree that it is
>> indeed the culprit? Otherwise please bisect the issue.
>
> I built 3.2-rc5 with the patch from commit
> eb1711bb94991e93669c5a1b5f84f11be2d51ea1 reversed, and have been using it
> now for a day and a half without any i915_gem issues. So at this stage it
> does seem likely it is the culprit, based on the fact that I had at least 2
> and probably 3 i915_gem crashes in around 12 hours with the commit applied.
> When I get some free time I'll reapply the patch and see if I can reproduce
> the crash and get a netconsole dump.

Backtraces from another reporter seriously look like we're hitting
some ugly use-after-free. Can you please test whether the patch
"drm/i915: Only clear the GPU domains upon a successful finish" by
Chris Wilson fixes anything for you? You can grab it from
http://cgit.freedesktop.org/~danvet/drm/patch/?id=389a55581e30607af0fcde6cdb4e54f189cf46cf

Thanks, Daniel



-- 
Daniel Vetter
daniel.vet...@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [Intel-gfx] kernel BUG at drivers/gpu/drm/i915/i915_gem

2011-12-09 Thread Daniel Vetter
On Sat, Dec 10, 2011 at 03:09, Rocko Requin wrote:
>> Please always report bugs on the public
>> mailing list so that all interested people can join in
>
> Apologies, I used to log them to the kernel bug tracker, which never came
> back up after kernel.org was rebuilt, and I didn't want to bother the public
> list.

You've trimmed the cc: list and removed intel-gfx and linux-kernel.
Don't do that, people won't see your reply otherwise. Re-added to the
cc: list.

>> Unfortunately you've cut out the backtrace from the
>> kernel oops, so I can't tell anything more.
>
> It wasn't me! The kernel locked up before it could complete writing the
> backtrace to the log, so it simply isn't there (and unfortunately the PC
> didn't switch to a tty to display it). There is the beginning of a backtrace
> from the same bug the day before, but it doesn't get much past the invalid
> opcode line

If you can wire up netconsole you should be able to gather the full
backtrace and that would be really useful. Otherwise can you please
confirm by reverting that commit from your current tree that it is
indeed the culprit? Otherwise please bisect the issue.
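
For reference, netconsole is configured with a single parameter of the form `netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]`. A minimal Python sketch that assembles that string is below. The function name `netconsole_param` and the addresses in the usage note are placeholders, not values from this thread. The port defaults are netconsole's documented ones.

```python
def netconsole_param(src_ip: str, dev: str, tgt_ip: str,
                     tgt_mac: str = "",
                     src_port: int = 6665, tgt_port: int = 6666) -> str:
    """Build a netconsole= parameter string.

    src_ip / dev     -- address and interface of the crashing machine
    tgt_ip / tgt_mac -- address (and optionally MAC) of the machine
                        that collects the log over UDP
    """
    return (f"netconsole={src_port}@{src_ip}/{dev},"
            f"{tgt_port}@{tgt_ip}/{tgt_mac}")
```

For example, `netconsole_param("192.168.1.10", "eth0", "192.168.1.20")` gives `netconsole=6665@192.168.1.10/eth0,6666@192.168.1.20/`, which can go on the kernel command line or be passed to the netconsole module. The receiving machine needs something listening on the target UDP port.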
-Daniel
-- 
Daniel Vetter
daniel.vet...@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [Intel-gfx] kernel BUG at drivers/gpu/drm/i915/i915_gem

2011-12-09 Thread Daniel Vetter
Hi Rocko,

Thanks for the bug report. Please always report bugs on the public
mailing list so that all interested people can join in (also I'm
travelling atm, so I can't do much).
AFAICS we're hitting

BUG_ON(seqno == 0)

in i915_wait_request. Unfortunately you've cut out the backtrace from the
kernel oops, so I can't tell anything more.

Please add the full dmesg and the usual details about your machine.

Yours, Daniel

On Fri, Dec 9, 2011 at 00:11, Rocko Requin wrote:
> Hi Daniel,
>
> Since I applied commit eb1711bb94991e93669c5a1b5f84f11be2d51ea1 (drm/i915:
> fix infinite recursion on unbind due to ilk vt-d w/a) to my kernel yesterday
> afternoon, I've had the kernel lock up completely three times on me, so I'm
> assuming it might have been the cause. (I was previously using a kernel
> compiled from git on Dec 4th without problems.)
>
> The last lockup showed this in the syslog:
>
> Dec  9 06:23:15 sierra kernel: [44486.289428] kernel BUG at
> drivers/gpu/drm/i915/i915_gem.c:1952!
> Dec  9 06:23:15 sierra kernel: [44486.289460] invalid opcode:  [#1] SMP
> ...
> Dec  9 06:23:15 sierra kernel: [44486.289844] RIP:
> 0010:[]  []
> i915_wait_request+0x563/0x580 [i915]
>
> I have attached a longer version of the log (but note that it was truncated
> by the crash).
>
> I am booting the kernel with i915.i915_enable_rc6=1 but I didn't have any
> lockup issues until yesterday.
>
> Thanks,
> rocko



-- 
Daniel Vetter
daniel.vet...@ffwll.ch - +41 (0) 79 365 57 48 - http://blog.ffwll.ch