Hi all,
I would like to share an issue a client of mine ran into with JavaFX 17u
on Linux/X11 and discuss a change that seems to resolve it.
On a Linux notebook with an Nvidia GPU (proprietary driver), a JavaFX
application, essentially just running a continuous animation in a single
window, would eventually crash with a SIGSEGV originating deep inside
the driver. Prior to the crash, glXMakeCurrent returned a BadAlloc
error. Timing was non-deterministic - sometimes after ~5 minutes,
sometimes after roughly a day. A second machine with very similar
hardware (different graphics card revision), the same driver version,
but a slightly different OS configuration did not exhibit the crash.
When the crash occurred, it was always after ES2Context.makeCurrent()
was called with the real drawable (i.e. called from
ES2SwapChain.createGraphics()).
I have not been able to pinpoint the exact root cause. What I can say
empirically is that the crash correlates with the pattern that JavaFX's
current code produces on every frame: for each rendered window,
glXMakeCurrent is called twice - once in createGraphics() to bind the
real drawable, and once in present() via context.makeCurrent(null) to
bind the dummy drawable. So the GL context alternates between a real
on-screen window and an off-screen pixmap at the frame rate.
Reducing this alternation (see below) makes the crash disappear on the
affected machine: long-running tests have now run for more than 2 days
without a crash, which was not the case before. I'd like to discuss
whether the alternation itself is actually needed.
While researching this, I came across Michael Zucchi's PR #1981
(https://github.com/openjdk/jfx/pull/1981), which addresses an
uncontrolled framerate issue. That patch removes the
context.makeCurrent(null) call from ES2SwapChain.present() and instead
calls glFinish after the last presentable. Applying the patch to my 17u
fork seems to resolve the crash on the affected machine, which is a
strong hint that the alternation pattern is involved. However, I also
observed new visual artifacts, which I described in a comment on the PR.
So simply removing the call is not sufficient.
My change takes a different approach. Instead of calling
makeCurrent(null) (which issues glXMakeCurrent to the dummy drawable), I
only invalidate the Java-side currentDrawable tracking in ES2Context.
The next createGraphics() call for any window then sees
currentDrawable == null and reliably performs
glContext.makeCurrent(drawable) and glContext.bindFBO(0) before rendering.
The change can be viewed here:
https://github.com/openjdk/jfx/compare/master...tsx84:jfx:GLX_MakeCurrent_Bug
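To illustrate the difference, here is a simplified model of the caching
logic. It is not the actual Prism code: the class MakeCurrentModel, the
method invalidateCurrentDrawable() and the counter are made up for
illustration, and the identity check only mirrors my reading of
ES2Context. It counts how many simulated glXMakeCurrent calls a single
window produces with the current behavior and with invalidation only.

```java
// Simplified model, NOT the actual Prism source. All names are hypothetical.
class MakeCurrentModel {
    // Stands in for the off-screen dummy drawable.
    static final Object DUMMY = new Object();

    // Java-side tracking of the drawable the context is bound to.
    private Object currentDrawable;

    // Number of simulated native glXMakeCurrent calls.
    int nativeCalls;

    // Assumed behavior: null means "bind the dummy drawable"; the native
    // call is skipped when the target is identical to the tracked drawable.
    void makeCurrent(Object drawable) {
        Object target = (drawable == null) ? DUMMY : drawable;
        if (target != currentDrawable) {
            nativeCalls++;              // glXMakeCurrent + bindFBO(0) would happen here
            currentDrawable = target;
        }
    }

    // Proposed alternative: forget the tracked drawable, no native call.
    void invalidateCurrentDrawable() {
        currentDrawable = null;
    }

    // Renders the given number of frames for one window and returns the
    // number of simulated native calls.
    static int simulate(boolean invalidateOnly, int frames) {
        MakeCurrentModel ctx = new MakeCurrentModel();
        Object window = new Object();
        for (int i = 0; i < frames; i++) {
            ctx.makeCurrent(window);                // createGraphics()
            if (invalidateOnly) {
                ctx.invalidateCurrentDrawable();    // patched present()
            } else {
                ctx.makeCurrent(null);              // current present()
            }
        }
        return ctx.nativeCalls;
    }

    public static void main(String[] args) {
        System.out.println("current behavior:  " + simulate(false, 1000));
        System.out.println("invalidation only: " + simulate(true, 1000));
    }
}
```

In this model the current behavior issues two native calls per frame
(real drawable, then dummy), while invalidation issues one per frame,
always to the real drawable, and the cache never skips it.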
The original motivation for context.makeCurrent(null) at the end of
present() appears to be defensive: after swapping buffers, unbind the GL
context from the visible drawable so that stray GL operations do not
accidentally target a live window.
If this is the only reason, is the dummy round-trip still necessary? The
next render cycle always starts with an explicit makeCurrent(drawable)
anyway. Invalidating the Java-side tracking seems to be enough to ensure
that the next makeCurrent is not skipped by the identity-based cache in
ES2Context.
Does anyone know whether, and why, the current behavior is actually
needed, and whether my patch would be reasonable and safe?
Thanks,
Thorsten