On 6/30/23 22:32, Marek Olšák wrote:
> On Fri, Jun 30, 2023 at 11:11 AM Michel Dänzer <michel.daen...@mailbox.org> wrote:
>> On 6/30/23 16:59, Alex Deucher wrote:
>>> On Fri, Jun 30, 2023 at 10:49 AM Sebastian Wick
>>> <sebastian.w...@redhat.com> wrote:
>>>> On Tue, Jun 27, 2023 at 3:23 PM André Almeida <andrealm...@igalia.com> wrote:
>>>>>
>>>>> +Robustness
>>>>> +----------
>>>>> +
>>>>> +The only way to try to keep an application working after a reset is if it
>>>>> +complies with the robustness aspects of the graphical API that it is using.
>>>>> +
>>>>> +Graphical APIs provide ways to applications to deal with device resets. However,
>>>>> +there is no guarantee that the app will use such features correctly, and the
>>>>> +UMD can implement policies to close the app if it is a repeating offender,
>>>>> +likely in a broken loop. This is done to ensure that it does not keep blocking
>>>>> +the user interface from being correctly displayed. This should be done even if
>>>>> +the app is correct but happens to trigger some bug in the hardware/driver.
>>>>
>>>> I still don't think it's good to let the kernel arbitrarily kill
>>>> processes that it thinks are not well-behaved based on some heuristics
>>>> and policy.
>>>>
>>>> Can't this be outsourced to user space? Expose the information about
>>>> processes causing a device and let e.g. systemd deal with coming up
>>>> with a policy and with killing stuff.
>>>
>>> I don't think it's the kernel doing the killing, it would be the UMD.
>>> E.g., if the app is guilty and doesn't support robustness the UMD can
>>> just call exit().
>>
>> It would be safer to just ignore API calls[0], similarly to what is done
>> until the application destroys the context with robustness. Calling exit()
>> likely results in losing any unsaved work, whereas at least some
>> applications might otherwise allow saving the work by other means.
>
> That's a terrible idea. Ignoring API calls would be identical to a freeze.
> You might as well disable GPU recovery because the result would be the same.

No GPU recovery would affect everything using the GPU, whereas this affects
only non-robust applications.

> - non-robust contexts: call exit(1) immediately, which is the best way to
> recover

That's not the UMD's call to make.

>> [0] Possibly accompanied by a one-time message to stderr along the lines
>> of "GPU reset detected but robustness not enabled in context, ignoring
>> OpenGL API calls".

-- 
Earthling Michel Dänzer            |                  https://redhat.com
Libre software enthusiast          |         Mesa and Xwayland developer
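
For illustration only, the "ignore API calls" behaviour from [0] could look roughly like the sketch below. Everything in it is hypothetical: the struct, the field names and the sketch_* functions are made up for this mail and are not Mesa's actual code. It assumes the UMD already knows that a reset hit the context, and it only shows the early-return check plus the one-time stderr message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-context state; NOT an actual Mesa structure. */
struct sketch_context {
    bool robust;              /* context was created with robustness enabled */
    bool reset_detected;      /* a GPU reset affecting this context was seen */
    bool warned;              /* one-time message already printed */
    unsigned draws_submitted; /* stand-in for real work reaching the GPU */
};

/*
 * Returns true if the API call should be silently dropped.
 * After a reset, calls are dropped for every context; a non-robust
 * context additionally gets a single message on stderr, since the
 * application has no other way to learn what happened.
 */
bool sketch_should_ignore_call(struct sketch_context *ctx)
{
    assert(ctx != NULL);

    if (!ctx->reset_detected)
        return false;

    if (!ctx->robust && !ctx->warned) {
        fprintf(stderr, "GPU reset detected but robustness not enabled "
                        "in context, ignoring OpenGL API calls\n");
        ctx->warned = true;
    }
    return true;
}

/* Hypothetical API entry point: becomes a no-op after a reset. */
void sketch_draw(struct sketch_context *ctx)
{
    if (sketch_should_ignore_call(ctx))
        return;

    ctx->draws_submitted++;
}
```

With something along these lines the process keeps running after the reset, so an application that can save work without going through the GPU still gets the chance to do so, and nothing else using the GPU is affected.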