Hi all,

I would like to share an issue a client of mine ran into with JavaFX 17u on Linux/X11 and discuss a change that seems to resolve it.

On a Linux notebook with an Nvidia GPU (proprietary driver), a JavaFX application that essentially just runs a continuous animation in a single window would eventually crash with a SIGSEGV originating deep inside the driver. Before the crash, glXMakeCurrent returned a BadAlloc error. The timing was non-deterministic: sometimes the crash came after about 5 minutes, sometimes after roughly a day. A second machine with very similar hardware (a different graphics card revision), the same driver version, and a slightly different OS configuration did not exhibit the crash.

When the crash occurred, it was always after ES2Context.makeCurrent() was called with the real drawable (i.e. called from ES2SwapChain.createGraphics()).

I have not been able to pinpoint the exact root cause. What I can say empirically is that the crash correlates with the pattern that JavaFX's current code produces on every frame: for each rendered window, glXMakeCurrent is called twice - once in createGraphics() to bind the real drawable, and once in present() via context.makeCurrent(null) to bind the dummy drawable. So the GL context alternates between a real on-screen window and an off-screen pixmap at the frame rate.
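To make the per-frame pattern concrete, here is a minimal sketch. It is a toy model, not the Prism source: the class AlternationModel, the string drawables, and the nativeCalls list are made up for illustration, and the list entries merely stand in for real glXMakeCurrent calls.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the per-frame bind pattern; NOT the actual ES2Context code. */
final class AlternationModel {
    /** Each entry stands in for one glXMakeCurrent call, in call order. */
    final List<String> nativeCalls = new ArrayList<>();

    /** Java-side record of which drawable is believed to be current. */
    private String currentDrawable;

    /** null means "bind the dummy drawable", as described above. */
    void makeCurrent(String drawable) {
        String target = (drawable == null) ? "dummy" : drawable;
        if (!target.equals(currentDrawable)) { // skipped when already current
            nativeCalls.add(target);
            currentDrawable = target;
        }
    }

    /** One frame for one window. */
    void renderFrame(String window) {
        makeCurrent(window); // createGraphics(): bind the real drawable
        // ... render and swap buffers ...
        makeCurrent(null);   // present(): switch to the dummy drawable
    }

    public static void main(String[] args) {
        AlternationModel model = new AlternationModel();
        for (int frame = 0; frame < 3; frame++) {
            model.renderFrame("window");
        }
        // prints [window, dummy, window, dummy, window, dummy]
        System.out.println(model.nativeCalls);
    }
}
```

In this model every frame costs two context switches per window, which is the alternation in question.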

Reducing this alternation (see below) makes the crash disappear on the affected machine: long-running tests have now run for more than 2 days without a crash, which was not the case before the change. I'd like to discuss whether the alternation itself is actually needed.

While researching this, I came across Michael Zucchi's PR #1981 (https://github.com/openjdk/jfx/pull/1981), which addresses an uncontrolled framerate issue. That patch removes the context.makeCurrent(null) call from ES2SwapChain.present() and calls glFinish after the last presentable instead. Applying the patch to my 17u fork seems to resolve the crash on the affected machine, which is a strong hint that the alternation pattern is involved. However, I also observed new visual artifacts, which I described in a comment on the PR. So just removing the call is not sufficient.

My change takes a different approach. Instead of calling makeCurrent(null) (which issues glXMakeCurrent to the dummy drawable), I only invalidate the Java-side currentDrawable tracking in ES2Context. The next createGraphics() call for any window then sees currentDrawable == null and reliably performs glContext.makeCurrent(drawable) and glContext.bindFBO(0) before rendering.
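The effect of invalidating only the Java-side tracking can be sketched with a toy model. This is not the code from the branch: the class InvalidateTrackingModel, the invalidateAfterPresent flag, and the nativeBinds list are hypothetical, and each list entry stands in for one glContext.makeCurrent(drawable) plus glContext.bindFBO(0).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of an identity-based "current drawable" cache; NOT Prism code. */
final class InvalidateTrackingModel {
    /** Each entry stands in for makeCurrent(drawable) + bindFBO(0). */
    final List<Object> nativeBinds = new ArrayList<>();

    private Object currentDrawable;               // Java-side tracking only
    private final boolean invalidateAfterPresent; // hypothetical switch

    InvalidateTrackingModel(boolean invalidateAfterPresent) {
        this.invalidateAfterPresent = invalidateAfterPresent;
    }

    void makeCurrent(Object drawable) {
        if (drawable != currentDrawable) { // identity comparison
            nativeBinds.add(drawable);
            currentDrawable = drawable;
        }
    }

    void renderFrame(Object window) {
        makeCurrent(window); // createGraphics(): bind the real drawable
        // ... render and swap buffers ...
        if (invalidateAfterPresent) {
            currentDrawable = null; // forget the binding, no native call
        }
    }

    /** Counts native binds for the given number of frames on one window. */
    static int bindsForFrames(boolean invalidate, int frames) {
        InvalidateTrackingModel model = new InvalidateTrackingModel(invalidate);
        Object window = new Object();
        for (int i = 0; i < frames; i++) {
            model.renderFrame(window);
        }
        return model.nativeBinds.size();
    }

    public static void main(String[] args) {
        System.out.println(bindsForFrames(true, 3));  // prints 3
        System.out.println(bindsForFrames(false, 3)); // prints 1
    }
}
```

With invalidation, the model rebinds the real drawable once per frame and never touches a dummy drawable. Without it, and without the dummy switch, the cache would skip the rebind on every frame after the first.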

The change can be viewed here: https://github.com/openjdk/jfx/compare/master...tsx84:jfx:GLX_MakeCurrent_Bug

The original motivation for context.makeCurrent(null) at the end of present() appears to be defensive: after swapping buffers, unbind the GL context from the visible drawable so that stray GL operations do not accidentally target a live window.

If this is the only reason, is the dummy round-trip still necessary? The next render cycle always starts with an explicit makeCurrent(drawable) anyway. Invalidating the Java-side tracking seems to be enough to ensure that the next makeCurrent is not skipped by the identity-based cache in ES2Context.

Does anyone know whether, and if so why, the current behavior is actually needed, and whether my patch would be reasonable and safe?

Thanks,
Thorsten
