Am 04.08.2014 um 10:55 schrieb Maarten Lankhorst:
> op 04-08-14 10:36, Christian K?nig schreef:
>> Hi Maarten,
>>
>> Sorry for the delay. I've got way to much todo recently.
>>
>> Am 01.08.2014 um 19:46 schrieb Maarten Lankhorst:
>>> On 01-08-14 18:35, Christian K?nig wrote:
>>>> Am 31.07.2014 um 17:33 schrieb Maarten Lankhorst:
>>>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst at canonical.com>
>>>>> ---
>>>>> V1 had a nasty bug breaking gpu lockup recovery. The fix is not
>>>>> allowing radeon_fence_driver_check_lockup to take exclusive_lock,
>>>>> and kill it during lockup recovery instead.
>>>> That looks like the delayed work starts running as soon as we submit a 
>>>> fence, and not when it's needed for waiting.
>>>>
>>>> Since it's a backup for failing IRQs I would rather put it into 
>>>> radeon_irq_kms.c and start/stop it when the IRQs are started/stoped.
>>> The delayed work is not just for failing irq's, it's also the handler 
>>> that's used to detect lockups, which is why I trigger after processing 
>>> fences, and reset the timer after processing.
>> The idea was turning the delayed work on and off when we turn the irq on and 
>> off as well, processing of the delayed work handler can still happen in 
>> radeon_fence.c
>>
>>> Specifically what happened was this scenario:
>>>
>>> - lock up occurs
>>> - write lock taken by gpu_reset
>>> - delayed work runs, tries to acquire read lock, blocks
>>> - gpu_reset tries to cancel delayed work synchronously
>>> - has to wait for delayed work to finish -> deadlock
>> Why do you want to disable the work item from the lockup handler in the 
>> first place?
>>
>> Just take the exclusive lock in the work item, when it concurrently runs 
>> with the lockup handler it will just block for the lockup handler to 
>> complete.
> With the delayed work radeon_fence_wait no longer handles unreliable 
> interrupts itself, so it has to run from the lockup handler.
> But an alternative solution could be adding a radeon_fence_wait_timeout, 
> ignore the timeout and check if fence is signaled on timeout.
> This would probably be a cleaner solution.

Yeah, agree. Manually specifying a timeout in the fence wait on lockup 
handling sounds like the best alternative to me.

Christian.

>
> ~Maarten
>

Reply via email to