On Tue, Sep 03, 2013 at 04:34:56PM +0200, Maarten Lankhorst wrote:
> Op 03-09-13 16:20, Pasi Kärkkäinen schreef:
> > On Wed, Aug 28, 2013 at 09:44:15AM +0200, Maarten Lankhorst wrote:
> >> Op 28-08-13 08:29, Pasi Kärkkäinen schreef:
> >>> On Fri, Aug 23, 2013 at 11:20:42PM +0300, Pasi Kärkkäinen wrote:
> >>>> On Thu, Aug 22, 2013 at 09:12:40AM +0200, Maarten Lankhorst wrote:
> >>>>> Op 22-08-13 02:10, Ilia Mirkin schreef:
> >>>>>> The code expects non-VRAM mem nodes to have a pages list. If that's not
> >>>>>> set, it will do a null deref down the line. Warn on that condition and
> >>>>>> return an error.
> >>>>>>
> >>>>>> See https://bugs.freedesktop.org/show_bug.cgi?id=64774
> >>>>>>
> >>>>>> Reported-by: Pasi Kärkkäinen <pa...@iki.fi>
> >>>>>> Tested-by: Pasi Kärkkäinen <pa...@iki.fi>
> >>>>>> Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
> >>>>>> Cc: <sta...@vger.kernel.org> # 3.8+
> >>>>>> ---
> >>>>>>
> >>>>>> I don't exactly understand what's going on, but this is just a
> >>>>>> straightforward way to avoid a null deref that you see happens in the
> >>>>>> bug. I haven't figured out the root cause of this, but it's getting
> >>>>>> well into the "I have no idea how TTM works" space. However this seems
> >>>>>> like a bit of defensive programming -- nouveau_vm_map_sg will pass
> >>>>>> node->pages as a list down, which will be dereferenced by
> >>>>>> nvc0_vm_map_sg. Perhaps the other arguments should make that
> >>>>>> dereferencing not happen, but it definitely was happening here, as you
> >>>>>> can see in the bug.
> >>>>>>
> >>>>>> Ben/Maarten, I'll let you judge whether this check is appropriate,
> >>>>>> since like I hope I was able to convey above, I'm just not really sure 
> >>>>>> :)
> >>>>> Not it really isn't appropriate..
> >>>>>
> >>>>> You'd have to call call nouveau_vm_map_sg_table instead, the only place 
> >>>>> that doesn't handle that correctly
> >>>>> is where it's not expected to be called.
> >>>>>
> >>>>> Here, have a completely untested patch to fix things...
> >>>>>
> >>>> Maarten: I've been testing this stuff with Linux 3.10.x, so I had to 
> >>>> modify the patch a bit to make it apply there.. 
> >>>> I've attached the modified copy that applies to 3.10.9, hopefully I did 
> >>>> the backport correctly.
> >>>>
> >>>> With Linux 3.10.9 and the patch applied the kernel doesn't crash 
> >>>> anymore, and I get this error in dmesg:
> >>>>
> >>>> [   76.105643] nouveau W[     DRM] Trying to create a fb in vram with 
> >>>> valid_domains=00000004
> >>>>
> >>>> Does that help? 
> >>>>
> >>> Any comments? 
> >>>
> >>> Maarten's patch works for me, I get that warning instead of a kernel 
> >>> crash,
> >>> but it's also a bigger change that doesn't apply to older kernels as-is. 
> >>>
> >>> Ilia's original patch in this thread can be applied to older kernels 
> >>> as-is,
> >>> and it also prevents the kernel crash from happening.
> >>>
> >>> Should we get both applied, so Ilia's patch can be CC'd to stable ? 
> >>>
> >> You haven't understood the root cause then. You're trying to move an 
> >> IMPORTED dma-buf into VRAM.
> >> Doing so would seem to work, but at that point it's no longer a dma-buf so 
> >> changes done by the exporter
> >> would not show up in nouveau because they no longer refer to the same 
> >> piece of memory.
> >>
> > Yes, that's true, I don't understand the root cause. 
> > That's mostly because I'm not familiar with the nouveau code/internals. 
> >
> >
> >> Failing is the only right option here.
> >>
> > Sorry but I'm not sure I understand that correctly.. what exactly do you 
> > suggest? What should fail?
> >
> >
> > Because I'm not familiar with the code (yet) understanding and finding the 
> > root cause
> > will take time for me.. that's why I was suggesting to meanwhile apply 
> > Ilia's very simple patch
> > from this thread which actually prevents the hard kernel crash from 
> > happening, because it seems 
> > like an unharmful fix to have to protect against this kind of bugs (the 
> > root cause) ?
> > Or is that unacceptable? 
> >
> > (To recap: The kernel crash happens very often when trying to use the 
> > nouveau adapter in Optimus mode, with all kernels at least 3.8+, and it's 
> > very annoying that your laptop crashes when trying to enable a nouveau 
> > output. So Ilia's patch doesn't make the driver working properly, but at 
> > least it prevents the hard kernel crash from happening. The crash bug is in 
> > many kernel versions, including the long-term v3.10 tree. I've had the 
> > crash happening with 3.8.x, 3.9.x and 3.10.x)
>
> The fix probably requires commit fdfb8332651db7a280851dfccfc4f0cff4bcd052 to 
> apply cleanly "drm/nouveau: fix some error-path leaks in fbcon handling code"
> 

So what you mean is to use fdfb8332651db7a280851dfccfc4f0cff4bcd052 and your 
patch from this thread? 
and skip Ilia's patch?

-- Pasi

_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

Reply via email to