Hey Johan, thanks for the detailed answer.

Perhaps a bit more context as well... from our experience it seems about
10-12% of the setups we hit fall back to Software because of various issues,
predominantly (and 'sadly') virtualisation like Citrix, so we have to be
sure that in SW mode things run optimally. It isn't necessarily a bad thing to
optimise for the lowest common denominator if you want a fast application
everywhere.
Given this learning I'm now looking at how to 'box' what a minimum
performance looks like, ensure that with every change we don't drift
away from it, and communicate costs clearly to the UX team.
That way when our developers step on land mines we detect the missing limbs
the same day.
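
To make the 'box' idea concrete, here is a minimal sketch of what such a CI
guard could look like. It is plain Java and knows nothing about JavaFX: the
class name PaintBudget, the budget value and the percentile are all
hypothetical, and how the paint times get fed in (pulse logging, a listener,
something else) is deliberately left open.

```java
import java.util.Arrays;

/** Hypothetical CI guard: checks recorded paint times against a budget. */
final class PaintBudget {
    private final long budgetMillis;   // assumed threshold, to be chosen per product
    private final double percentile;   // e.g. 0.95 means "95% of paints must fit"
    private long[] samples = new long[64];
    private int count;

    PaintBudget(long budgetMillis, double percentile) {
        if (percentile <= 0.0 || percentile > 1.0) {
            throw new IllegalArgumentException("percentile must be in (0, 1]");
        }
        this.budgetMillis = budgetMillis;
        this.percentile = percentile;
    }

    /** Record the duration of one paint phase in milliseconds. */
    void record(long paintMillis) {
        if (count == samples.length) {
            samples = Arrays.copyOf(samples, count * 2);
        }
        samples[count++] = paintMillis;
    }

    /** Nearest-rank percentile of everything recorded so far (0 if empty). */
    long percentileMillis() {
        if (count == 0) {
            return 0;
        }
        long[] sorted = Arrays.copyOf(samples, count);
        Arrays.sort(sorted);
        int index = (int) Math.ceil(percentile * count) - 1;
        return sorted[Math.max(0, Math.min(index, count - 1))];
    }

    boolean withinBudget() {
        return percentileMillis() <= budgetMillis;
    }

    /** Intended to be called at the end of a CI rendering test. */
    void assertWithinBudget() {
        if (!withinBudget()) {
            throw new AssertionError("paint time " + percentileMillis()
                    + " ms exceeds budget of " + budgetMillis + " ms");
        }
    }
}
```

A percentile rather than a maximum is used so that a single noisy frame on a
shared build agent does not fail the build, while a sustained regression does.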

That's the direction I'm currently thinking about and of course I can't be
the only one (hence the post).

I don't like it when people say that "JavaFX is slow" because if everything
is done right, JavaFX can be extremely fast.
But I also realise that it is very easy to make things very slow in JavaFX,
and in case "performance" is slow, it is important to pinpoint the problem
as easily as possible. And that is often not trivial, so the easier it
becomes to detect problems, the more developers will like JavaFX.

>>> Agree that JavaFX can be fast and given we have to support thousands of
setups that we have no control over I'm quite content with how JavaFX is
performing in the wild.
It definitely has a few performance traps that aren't so easy to notice from
the high level API: css, effects, etc.
And while blocking the JavaFX thread happens most frequently, it is also
simple to explain to any reasonable developer and easy to troubleshoot as
it is something you influence directly, whereas the css pass, layout or paint
are much more indirect and tricky.
An effect on top of an animation, for example, was very tricky to find. In
this case the framework should perhaps even refuse to do it in software
mode... :)

To a few other points:

The options in PrismSettings.java (e.g. prism.trace,
prism.printrendergraph, prism.showdirty) and in QuantumToolkit
(quantum.debug) can be helpful as well.

>>> Are you aware of any write up on the internet about understanding the
output? I mean I had to read the code and use breakpoints for the prism
logger for 'slow background path for null' to understand but it would be
nice if someone has already written about these topics? There is also no
book on JavaFX perf right?
On the other side, regarding analysing issues, I think there is a
wealth of tooling / logging that is useful, but it took me a while to work
my way into the performance investigation as the information is scattered
and/or incomplete (at least with my google-fu).
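
Since the flags are scattered, here is a small sketch that collects the ones
named in this thread into one launch command for a perf run. The flag names
come from this discussion, plus prism.order=sw to force the software pipeline;
the class name PerfLaunch, the "java" binary and the jar path are placeholders.
These are system properties, so the assumption is that they are passed on the
command line of the JVM under test.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that builds the command line for a perf run. */
final class PerfLaunch {

    /** Builds the JVM command with the diagnostic flags from this thread. */
    static List<String> command(String javaBin, String jarPath, boolean forceSoftware) {
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        if (forceSoftware) {
            cmd.add("-Dprism.order=sw");        // force the SW rendering pipeline
        }
        cmd.add("-Dprism.verbose=true");        // print pipeline selection details
        cmd.add("-Djavafx.pulseLogger=true");   // per-pulse phase timings
        cmd.add("-jar");
        cmd.add(jarPath);
        return cmd;
    }

    /** Merges stderr into stdout so a CI harness reads a single stream. */
    static ProcessBuilder builder(List<String> cmd) {
        return new ProcessBuilder(cmd).redirectErrorStream(true);
    }
}
```

The more verbose options (prism.trace, prism.printrendergraph, prism.showdirty,
quantum.debug) and prism.disableEffects are left out of the default list on
purpose, as they are more useful for a focused investigation than for every
CI run.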

It would be great to have tools that auto-detect this. Detecting slow
render phases is already done, but linking to the root cause is of course
much harder.
I don't think that interrupting the paint phase is a good thing. If that
takes 200-300 ms, it is very likely it will take 200-300ms in the next
cycle.

>>> Auto-detection is hard, which is why as a developer I wouldn't mind
explicitly configuring my CI perf builds to break if condition A happens,
so I can review what changed recently.
>>> At 200-300 ms the application essentially became unusable in our case
as every scheduled pulse was either waiting or painting. I will create
something to also detect this going forward to avoid a repeat (the question
is just what).
>>> Of course for now I will redirect stdErr and parse it on the fly and/or go
in via reflection, but it would be nice if prism allowed setting short
cuts or at least spat out additional ERRORs or WARNs.
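
As a rough sketch of the "redirect stdErr and parse it on the fly" approach:
an OutputStream that passes everything through and counts lines containing a
marker. The class name LogTripwire is made up, and whether a given prism
message actually ends up on System.err depends on the logging setup, so that
part is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical pass-through stream that counts lines containing a marker. */
final class LogTripwire extends OutputStream {
    private final OutputStream delegate;
    private final String needle;
    private final AtomicInteger hits = new AtomicInteger();
    private final ByteArrayOutputStream line = new ByteArrayOutputStream();

    LogTripwire(OutputStream delegate, String needle) {
        this.delegate = delegate;
        this.needle = needle;
    }

    @Override
    public synchronized void write(int b) throws IOException {
        delegate.write(b);                       // keep the original output intact
        if (b == '\n') {
            String text = new String(line.toByteArray(), StandardCharsets.UTF_8);
            if (text.contains(needle)) {
                hits.incrementAndGet();          // "condition A" was observed
            }
            line.reset();
        } else {
            line.write(b);
        }
    }

    @Override
    public void flush() throws IOException {
        delegate.flush();
    }

    /** Number of complete lines seen so far that contained the marker. */
    int hits() {
        return hits.get();
    }
}
```

It could be installed early in a perf test with something like
System.setErr(new PrintStream(new LogTripwire(System.err, "slow background path"), true))
and the test would then fail if hits() is greater than zero at the end.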

It might be an option though to dynamically throttle the pulse frequency
(which can now only be changed at startup with javafx.animation.pulse) in
case slow rendering is detected.
>>> Does this matter much, as the pulses just get skipped anyway? In
this case the fact that the animation is still trying to run when the pulse
is almost stalled doesn't make sense, and circuit-breaking it would
perhaps be 'intelligent' framework behaviour.
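
To illustrate what I mean by circuit-breaking, a minimal sketch in plain Java.
The class name SlowPulseBreaker and all thresholds are hypothetical; the
caller would feed in the measured pulse duration and decide what "open" means,
for example pausing non-essential animations.

```java
/** Hypothetical circuit breaker driven by measured pulse durations. */
final class SlowPulseBreaker {
    private final long slowMillis;   // a pulse at or above this counts as slow
    private final int tripAfter;     // consecutive slow pulses before opening
    private final int closeAfter;    // consecutive fast pulses before closing
    private int consecutiveSlow;
    private int consecutiveFast;
    private boolean open;

    SlowPulseBreaker(long slowMillis, int tripAfter, int closeAfter) {
        this.slowMillis = slowMillis;
        this.tripAfter = tripAfter;
        this.closeAfter = closeAfter;
    }

    /** Feed one pulse duration; returns true while the breaker is open. */
    boolean onPulse(long pulseMillis) {
        if (pulseMillis >= slowMillis) {
            consecutiveFast = 0;
            consecutiveSlow = Math.min(consecutiveSlow + 1, tripAfter);
            if (consecutiveSlow >= tripAfter) {
                open = true;
            }
        } else {
            consecutiveSlow = 0;
            consecutiveFast = Math.min(consecutiveFast + 1, closeAfter);
            if (consecutiveFast >= closeAfter) {
                open = false;
            }
        }
        return open;
    }
}
```

The separate closeAfter threshold is there because pausing the animation will
itself make pulses fast again; without some hysteresis the breaker would flip
back and forth on every other pulse.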

When things are slow in the paint phase, the information about how many
Nodes are visited and rendered, and how many are cached is very useful. I
have a local debug version where I keep track of how many times a Node is
rendered from cache versus how often the cache is invalid and needs to be
recomputed (which is extremely expensive). If this was somehow automated,
it could improve performance. It is similar to what hotspot C1/C2 is doing
for code: it is expensive to compile all code at runtime, but when it turns
out methods are often used, they are compiled. Similarly, if the rendering
pipeline detects that a node would remain cache-valid for most of the time,
it might automatically set that node.setCache(true). But that's not a
trivial thing to implement.
>>> I still can't 100% work out what exactly one should cache from what
I've googled. I mean, I only need to cache it if it will be redrawn right,
but generally I'm not redrawing it unless I change it. This seems a bit
counterintuitive? (okay maybe a window move or something but that is seldom)
>>> Hotspot is also great fun! ;) especially if you ever completely fill up
the code cache - which I don't recommend!
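
For what it's worth, the bookkeeping behind the "cache it if it stays valid
most of the time" heuristic described above could be sketched like this. It is
plain Java with made-up names (CacheAdvisor, the thresholds); it does not hook
into the real rendering pipeline and only shows the decision rule.

```java
/** Hypothetical per-node counter for the "mostly cache-valid" heuristic. */
final class CacheAdvisor {
    private long validRenders;     // node was rendered while its content was unchanged
    private long invalidRenders;   // node was rendered after its content changed

    /** Record one render of the node. */
    void rendered(boolean contentUnchanged) {
        if (contentUnchanged) {
            validRenders++;
        } else {
            invalidRenders++;
        }
    }

    /**
     * Suggest caching once enough renders were seen and the share of
     * renders with unchanged content reaches minValidRatio.
     */
    boolean suggestCache(long minSamples, double minValidRatio) {
        long total = validRenders + invalidRenders;
        if (total == 0 || total < minSamples) {
            return false;
        }
        return (double) validRenders / total >= minValidRatio;
    }
}
```

A node that is re-rendered often while its own content rarely changes would
score high here; a node whose content changes on nearly every render would
not, since recomputing the cache each time is the expensive case.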

On that note, is Gluon documenting (maybe on an internal wiki, or
somewhere publicly) the strange cases that you hit? I guess you've seen most
of them by now!
My favourite recently was a * { } css selector that sent our memory
throughput crazy as CSSPseudoState was allocated / collected at insane
rates!

Thanks,
Matt


On 1 May 2018 at 20:48, Johan Vos <johan....@gluonhq.com> wrote:

> Hi Matthew,
>
> I agree this is a very important question.
>
> I don't like it when people say that "JavaFX is slow" because if
> everything is done right, JavaFX can be extremely fast.
> But I also realise that it is very easy to make things very slow in
> JavaFX, and in case "performance" is slow, it is important to pinpoint the
> problem as easily as possible. And that is often not trivial, so the easier
> it becomes to detect problems, the more developers will like JavaFX.
>
> I typically start with setting prism.verbose and javafx.pulseLogger to
> true, as you first need to know in which phase the problem is (e.g. layout
> or render phase). If you see the problem is in the render phase, you have
> to look at completely different things.
> Note that in most cases I've seen with bad performance, the problem was
> not in the rendering, but simply due to too much happening on the JavaFX
> application thread, preventing it from starting the rendering.
> Those things are typically easily detected by generating thread dumps or
> using simple profilers to find out what the JavaFX thread is typically
> doing. If you see bad performance but in 90% of the thread dump there is no
> trace of rendering, you know where to look.
>
> When things are slow in the paint phase, the information about how many
> Nodes are visited and rendered, and how many are cached is very useful. I
> have a local debug version where I keep track of how many times a Node is
> rendered from cache versus how often the cache is invalid and needs to be
> recomputed (which is extremely expensive). If this was somehow automated,
> it could improve performance. It is similar to what hotspot C1/C2 is doing
> for code: it is expensive to compile all code at runtime, but when it turns
> out methods are often used, they are compiled. Similarly, if the rendering
> pipeline detects that a node would remain cache-valid for most of the time,
> it might automatically set that node.setCache(true). But that's not a
> trivial thing to implement.
>
> The options in PrismSettings.java (e.g. prism.trace,
> prism.printrendergraph, prism.showdirty) and in QuantumToolkit
> (quantum.debug) can be helpful as well.
>
> When there are no clear indications (e.g. not too many nodes, no invalid
> caches), I go for profiling, working bottom-up. I have the JavaFX source
> code always at hand when doing this, in order to see what exactly is
> happening.
> There are some patterns, e.g. on Android I know that lots of time spent in
> System.arraycopy is an indication of lots of slow CSS processing (Bitset
> operations, if you follow the profiling information bottom-up).
>
> It would be great to have tools that auto-detect this. Detecting slow
> render phases is already done, but linking to the root cause is of course
> much harder.
>
> I don't think that interrupting the paint phase is a good thing. If that
> takes 200-300 ms, it is very likely it will take 200-300ms in the next
> cycle.
>
> It might be an option though to dynamically throttle the pulse frequency
> (which can now only be changed at startup with javafx.animation.pulse) in
> case slow rendering is detected.
>
> - Johan
>
>
> On Tue, May 1, 2018 at 8:17 PM Matthew Elliot <matthew.james.elliot@gmail.com> wrote:
>
>> Hi all,
>>
>> The last few days I was troubleshooting a new performance issue that showed
>> up in our PROD application where customers had fallen back to the SW
>> rendering pipeline. It severely affected the application where CPU
>> frequency was under 3 GHz with hover lags of a few seconds in the worst
>> cases. With thousands of potential HW/SW combinations in the wild it took
>> quite a while to even identify it really was an issue in our application
>> and not the usual noise of some silly set up. All this got me thinking...
>>
>> ... what was visible was long paint passes, and long waiting on previous
>> render but narrowing this down to exactly what was going on took a lot of
>> manual inspection of the rendering pipeline code / debugging and somewhat
>> by chance I stumbled over the -Dprism.disableEffects flag which after much
>> more pain helped me narrow down the issue.
>>
>> The root cause turned out to be an -fx-effect (blend, inner shadow) on an
>> animated node that was set from the code by an unknowing developer.
>>
>> While there are tools like Mission Control for visualising the pulse and
>> phases, it can be difficult to identify, for example, what is going wrong
>> inside of the painting phase, and it is difficult to ensure that nothing
>> bad happens when many developers can make changes to the code and reviews
>> will never catch everything... I'm therefore thinking about ways to run
>> rendering tests in continuous integration that would fail fast if the SW
>> rendering pipeline would get overloaded.
>> I had a look at PulseListener where I could see pulse times but I'd like to
>> go into more detail and would actually like the information tracked in the
>> internals of PulseLogger (PulseData) without doing any nasty tricks.
>>
>> I thought maybe somebody has already thought about this problem before and
>> maybe there is even some tooling around this beyond the logging? I could
>> even imagine using the same technique to monitor the rendering pipeline in
>> real time and alerting us (maybe even the user) if things are going a bit
>> sideways.
>>
>> Maybe more generically, how do you even start to debug delays in the paint
>> phase? Timed breakpoints and IDE assisted debug logging aside. :)
>>
>> Matt.
>>
>> PS: It might even be nice to tell the Painter to give up after N ms (prism
>> setting?). Sometimes it is better to break than to be unusable because of
>> paint phases taking 200-300 ms and the JavaFX Application Thread getting
>> almost completely starved.
>>
>
