On 6/30/23 22:32, Marek Olšák wrote:
> On Fri, Jun 30, 2023 at 11:11 AM Michel Dänzer <michel.daen...@mailbox.org> wrote:
>> On 6/30/23 16:59, Alex Deucher wrote:
>>> On Fri, Jun 30, 2023 at 10:49 AM Sebastian Wick <sebastian.w...@redhat.com> wrote:
>>>> On Tue, Jun 27, 2023 at 3:23 PM André Almeida <andrealm...@igalia.com> wrote:
>>>>>
>>>>> +Robustness
>>>>> +----------
>>>>> +
>>>>> +The only way to try to keep an application working after a reset is if it
>>>>> +complies with the robustness aspects of the graphical API that it is using.
>>>>> +
>>>>> +Graphical APIs provide ways to applications to deal with device resets. However,
>>>>> +there is no guarantee that the app will use such features correctly, and the
>>>>> +UMD can implement policies to close the app if it is a repeating offender,
>>>>> +likely in a broken loop. This is done to ensure that it does not keep blocking
>>>>> +the user interface from being correctly displayed. This should be done even if
>>>>> +the app is correct but happens to trigger some bug in the hardware/driver.
>>>>
>>>> I still don't think it's good to let the kernel arbitrarily kill
>>>> processes that it thinks are not well-behaved based on some heuristics
>>>> and policy.
>>>>
>>>> Can't this be outsourced to user space? Expose the information about
>>>> processes causing a device reset and let e.g. systemd deal with coming up
>>>> with a policy and with killing stuff.
>>>
>>> I don't think it's the kernel doing the killing, it would be the UMD.
>>> E.g., if the app is guilty and doesn't support robustness the UMD can
>>> just call exit().
>>
>> It would be safer to just ignore API calls[0], similarly to what is done 
>> until the application destroys the context with robustness. Calling exit() 
>> likely results in losing any unsaved work, whereas at least some 
>> applications might otherwise allow saving the work by other means.
> 
> That's a terrible idea. Ignoring API calls would be identical to a freeze. 
> You might as well disable GPU recovery because the result would be the same.

Disabling GPU recovery would affect everything using the GPU, whereas ignoring
API calls affects only non-robust applications.


> - non-robust contexts: call exit(1) immediately, which is the best way to 
> recover

That's not the UMD's call to make.


>>     [0] Possibly accompanied by a one-time message to stderr along the lines 
>> of "GPU reset detected but robustness not enabled in context, ignoring 
>> OpenGL API calls".
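
To make the footnote concrete, here is a rough sketch of the "ignore API calls" behaviour in C. It is only an illustration under stated assumptions: struct sketch_context, sketch_should_ignore_call() and sketch_draw() are hypothetical names invented for this mail, not Mesa's or any UMD's actual code, and a real driver would track the reset state through its kernel interface rather than a plain flag.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-context state; not taken from any real driver. */
struct sketch_context {
    bool robust;         /* app asked for robustness at context creation */
    bool reset_seen;     /* a GPU reset has affected this context */
    bool warned;         /* one-time stderr message already printed */
    unsigned draws_done; /* stand-in for work actually submitted */
};

/* Returns true if the current API call should be dropped. */
static bool sketch_should_ignore_call(struct sketch_context *ctx)
{
    if (!ctx->reset_seen)
        return false;

    /* Robust apps learn about the reset through the API itself;
     * non-robust apps only get a single hint on stderr. */
    if (!ctx->robust && !ctx->warned) {
        fprintf(stderr, "GPU reset detected but robustness not enabled "
                        "in context, ignoring OpenGL API calls\n");
        ctx->warned = true;
    }
    return true;
}

/* Hypothetical entry point standing in for any draw call. */
static void sketch_draw(struct sketch_context *ctx)
{
    if (sketch_should_ignore_call(ctx))
        return; /* become a no-op instead of calling exit() */
    ctx->draws_done++;
}
```

The point of the sketch is that the process stays alive: rendering stops, but the application keeps running and may still let the user save work by other means.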


-- 
Earthling Michel Dänzer            |                  https://redhat.com
Libre software enthusiast          |         Mesa and Xwayland developer
