Op 03-09-13 16:20, Pasi Kärkkäinen schreef:
> On Wed, Aug 28, 2013 at 09:44:15AM +0200, Maarten Lankhorst wrote:
>> Op 28-08-13 08:29, Pasi Kärkkäinen schreef:
>>> On Fri, Aug 23, 2013 at 11:20:42PM +0300, Pasi Kärkkäinen wrote:
>>>> On Thu, Aug 22, 2013 at 09:12:40AM +0200, Maarten Lankhorst wrote:
>>>>> Op 22-08-13 02:10, Ilia Mirkin schreef:
>>>>>> The code expects non-VRAM mem nodes to have a pages list. If that's not
>>>>>> set, it will do a null deref down the line. Warn on that condition and
>>>>>> return an error.
>>>>>>
>>>>>> See https://bugs.freedesktop.org/show_bug.cgi?id=64774
>>>>>>
>>>>>> Reported-by: Pasi Kärkkäinen <pa...@iki.fi>
>>>>>> Tested-by: Pasi Kärkkäinen <pa...@iki.fi>
>>>>>> Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
>>>>>> Cc: <sta...@vger.kernel.org> # 3.8+
>>>>>> ---
>>>>>>
>>>>>> I don't exactly understand what's going on, but this is just a
>>>>>> straightforward way to avoid a null deref that you see happens in the
>>>>>> bug. I haven't figured out the root cause of this, but it's getting
>>>>>> well into the "I have no idea how TTM works" space. However this seems
>>>>>> like a bit of defensive programming -- nouveau_vm_map_sg will pass
>>>>>> node->pages as a list down, which will be dereferenced by
>>>>>> nvc0_vm_map_sg. Perhaps the other arguments should make that
>>>>>> dereferencing not happen, but it definitely was happening here, as you
>>>>>> can see in the bug.
>>>>>>
>>>>>> Ben/Maarten, I'll let you judge whether this check is appropriate,
>>>>>> since like I hope I was able to convey above, I'm just not really sure :)
>>>>> Not it really isn't appropriate..
>>>>>
>>>>> You'd have to call call nouveau_vm_map_sg_table instead, the only place 
>>>>> that doesn't handle that correctly
>>>>> is where it's not expected to be called.
>>>>>
>>>>> Here, have a completely untested patch to fix things...
>>>>>
>>>> Maarten: I've been testing this stuff with Linux 3.10.x, so I had to 
>>>> modify the patch a bit to make it apply there.. 
>>>> I've attached the modified copy that applies to 3.10.9, hopefully I did 
>>>> the backport correctly.
>>>>
>>>> With Linux 3.10.9 and the patch applied the kernel doesn't crash anymore, 
>>>> and I get this error in dmesg:
>>>>
>>>> [   76.105643] nouveau W[     DRM] Trying to create a fb in vram with 
>>>> valid_domains=00000004
>>>>
>>>> Does that help? 
>>>>
>>> Any comments? 
>>>
>>> Maarten's patch works for me, I get that warning instead of a kernel crash,
>>> but it's also a bigger change that doesn't apply to older kernels as-is. 
>>>
>>> Ilia's original patch in this thread can be applied to older kernels as-is,
>>> and it also prevents the kernel crash from happening.
>>>
>>> Should we get both applied, so Ilia's patch can be CC'd to stable ? 
>>>
>> You haven't understood the root cause then. You're trying to move an 
>> IMPORTED dma-buf into VRAM.
>> Doing so would seem to work, but at that point it's no longer a dma-buf so 
>> changes done by the exporter
>> would not show up in nouveau because they no longer refer to the same piece 
>> of memory.
>>
> Yes, that's true, I don't understand the root cause. 
> That's mostly because I'm not familiar with the nouveau code/internals. 
>
>
>> Failing is the only right option here.
>>
> Sorry but I'm not sure I understand that correctly.. what exactly do you 
> suggest? What should fail?
>
>
> Because I'm not familiar with the code (yet) understanding and finding the 
> root cause
> will take time for me.. that's why I was suggesting to meanwhile apply Ilia's 
> very simple patch
> from this thread which actually prevents the hard kernel crash from 
> happening, because it seems 
> like an unharmful fix to have to protect against this kind of bugs (the root 
> cause) ?
> Or is that unacceptable? 
>
> (To recap: The kernel crash happens very often when trying to use the nouveau 
> adapter in Optimus mode, with all kernels at least 3.8+, and it's very 
> annoying that your laptop crashes when trying to enable a nouveau output. So 
> Ilia's patch doesn't make the driver working properly, but at least it 
> prevents the hard kernel crash from happening. The crash bug is in many 
> kernel versions, including the long-term v3.10 tree. I've had the crash 
> happening with 3.8.x, 3.9.x and 3.10.x)
The fix probably requires commit fdfb8332651db7a280851dfccfc4f0cff4bcd052 to 
apply cleanly "drm/nouveau: fix some error-path leaks in fbcon handling code"

> All comments welcome.
>
> Thanks,
>
> -- Pasi
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

Reply via email to