Re: Optimised, high-performance, multi-threaded rendering pipeline

Tobias Bley Sun, 27 Nov 2016 11:59:20 -0800

Where can we read more about your HPR renderer?




> On 25.11.2016 at 16:45, Felix Bembrick <[email protected]> wrote:
> 
> Short answer? Maybe.
> 
> But exactly one more word than any from Oracle ;-)
> 
>> On 26 Nov. 2016, at 00:07, Tobias Bley <[email protected]> wrote:
>> 
>> A very short answer ;) ….
>> 
>> Do you have any URL?
>> 
>> 
>> 
>> 
>> 
>>> On 25.11.2016 at 12:19, Felix Bembrick <[email protected]> wrote:
>>> 
>>> Yes.
>>> 
>>>> On 25 Nov. 2016, at 21:45, Tobias Bley <[email protected]> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> @Felix: Is there a GitHub project, demo video or trial for testing HPR
>>>> with JavaFX?
>>>> 
>>>> Best regards,
>>>> Tobi
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On 11.11.2016 at 12:08, Felix Bembrick <[email protected]> wrote:
>>>>> 
>>>>> Thanks Laurent,
>>>>> 
>>>>> That's another thing we discovered: using Java itself in the most 
>>>>> performant way can help a lot.
>>>>> 
>>>>> It can be tricky, but profiling can often highlight patterns of object
>>>>> instantiation that raise red flags and lead you directly to regions of
>>>>> the code that can be refactored to be significantly more efficient.
>>>>> 
>>>>> Also, the often overlooked GC log analysis can lead to similar 
>>>>> discoveries and remedies.
>>>>> 
>>>>> Blessings,
>>>>> 
>>>>> Felix
>>>>> 
>>>>>> On 11 Nov. 2016, at 21:55, Laurent Bourgès <[email protected]> wrote:
>>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> To optimize Pisces, which became the Marlin rasterizer, I carefully
>>>>>> avoided array allocation altogether (using byte/int/float array pools)
>>>>>> and also reduced array copies and clean-up, i.e. only the dirty parts
>>>>>> are cleared.
>>>>>> 
>>>>>> This approach is generic and could be applied in other critical places 
>>>>>> of the rendering pipelines.
>>>>>> 
>>>>>> FYI, here are my FOSDEM 2016 slides on the Marlin renderer:
>>>>>> https://bourgesl.github.io/fosdem-2016/slides/fosdem-2016-Marlin.pdf
>>>>>> 
>>>>>> Of course I would be happy to share my experience and work with a tiger 
>>>>>> team on optimizing JavaFX graphics.
>>>>>> 
>>>>>> However, I would like to get some sort of sponsorship for my potential
>>>>>> contributions...
>>>>>> 
>>>>>> Cheers,
>>>>>> Laurent
>>>>>> 
>>>>>> On 11 Nov. 2016 11:29, "Tobi" <[email protected]> wrote:
>>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> thanks Felix, Laurent and Chris for sharing your stuff with the 
>>>>>>> community!
>>>>>>> 
>>>>>>> I am happy to see a discussion starting about boosting JavaFX
>>>>>>> rendering performance. I can confirm that the performance of the
>>>>>>> JavaFX scene graph is not where it should be. So multithreading would
>>>>>>> be an excellent but difficult approach.
>>>>>>> 
>>>>>>> Felix, concerning your research of other toolkits: Do they all use 
>>>>>>> multithreading or are there any toolkits which use single threading but 
>>>>>>> are faster than JavaFX?
>>>>>>> 
>>>>>>> So maybe there are other points than multithreading where we can boost 
>>>>>>> the performance?
>>>>>>> 
>>>>>>> 2) Your HPR sounds great. Have you already tried the DemoFX (part 3)
>>>>>>> benchmark with your HPR?
>>>>>>> 
>>>>>>> 
>>>>>>> Best regards,
>>>>>>> Tobi
>>>>>>> 
>>>>>>> 
>>>>>>>> On 10.11.2016 at 19:11, Felix Bembrick <[email protected]> wrote:
>>>>>>>> 
>>>>>>>> (Thanks to Kevin for lifting my "awaiting moderation" impasse).
>>>>>>>> 
>>>>>>>> So, with all the recent discussions regarding the great contribution by
>>>>>>>> Laurent Bourgès of MarlinFX, it was suggested that a separate thread be
>>>>>>>> started to discuss parallelisation of the JavaFX rendering pipeline in
>>>>>>>> general.
>>>>>>>> 
>>>>>>>> As has been correctly pointed out, converting or modifying the existing
>>>>>>>> rendering pipeline into a fully multi-threaded and performant beast is
>>>>>>>> indeed quite a complex task.
>>>>>>>> 
>>>>>>>> But that's exactly what my colleagues and I have been working on for
>>>>>>>> about 2 years.
>>>>>>>> 
>>>>>>>> The result is what we call the Hyper Rendering Pipeline (HPR).
>>>>>>>> 
>>>>>>>> Work on HPR started when we developed FXMark and were (bitterly)
>>>>>>>> disappointed with the performance of the JavaFX scene graph.  Many
>>>>>>>> JavaFX developers have blogged about the need to dramatically minimise
>>>>>>>> the number of nodes (especially on embedded devices) in order to
>>>>>>>> achieve even "acceptable" performance.  Often it is the case that most
>>>>>>>> (if not all) rendering is eventually done in a single Canvas node.
>>>>>>>> 
>>>>>>>> Now, as we all already know, the JavaFX Canvas does perform very well,
>>>>>>>> and the recent awesome work (DemoFX) by Chris Newland, just for
>>>>>>>> example, shows what can be done with this one node.
>>>>>>>> 
>>>>>>>> But the majority of the animation plumbing in JavaFX is related to the
>>>>>>>> scene graph itself and is designed to make use of multiple nodes and
>>>>>>>> node types.  At the moment, the performance of this scene graph is the
>>>>>>>> Achilles' heel of JavaFX (or at least one of them).
>>>>>>>> 
>>>>>>>> Enter HPR.
>>>>>>>> 
>>>>>>>> I personally have worked with a number of hardware-accelerated toolkits
>>>>>>>> over the years and am astounded by just how sluggish the rendering
>>>>>>>> pipeline for JavaFX is. When I am animating just a couple of hundred
>>>>>>>> nodes using JavaFX and transitions, I am lucky to get more than about
>>>>>>>> 30 FPS, but on the same (very powerful) machine, I can use other
>>>>>>>> toolkits to render thousands of "objects" and achieve frame rates well
>>>>>>>> over 1000 FPS.
>>>>>>>> 
>>>>>>>> So, we refactored the entire scene graph rendering pipeline with the
>>>>>>>> following goals and principles:
>>>>>>>> 
>>>>>>>> 1. It is written using JavaFX 9 and Java 9 (but could theoretically be
>>>>>>>> back-ported to JavaFX 8 though I see no reason to).
>>>>>>>> 
>>>>>>>> 2. We analysed how other toolkits had optimised their own rendering
>>>>>>>> pipelines (especially Qt, which has made some significant advances in
>>>>>>>> this area in recent years).  We also analysed recent examples of
>>>>>>>> multi-threaded rendering using the new Vulkan API.
>>>>>>>> 
>>>>>>>> 3. We carefully analysed and determined which parts of the pipeline
>>>>>>>> should best utilise the CPU and which parts should best utilise the GPU.
>>>>>>>> 
>>>>>>>> 4. For those parts most suited to the CPU, we use the advanced
>>>>>>>> concurrency features of Java 8/9 to maximise parallelisation and
>>>>>>>> throughput by utilising multiple cores & threads in as efficient a
>>>>>>>> manner as possible.
>>>>>>>> 
>>>>>>>> 5. We devoted a large amount of time to optimising the "communication"
>>>>>>>> between the CPU and GPU to be far less "chatty" and this alone led to
>>>>>>>> some huge performance gains.
>>>>>>>> 
>>>>>>>> 6. We also looked at the structure of the scene graph itself and, after
>>>>>>>> studying products such as OpenSceneGraph, we refactored the JavaFX
>>>>>>>> scene graph in such a way that it lends itself to optimised rendering
>>>>>>>> much more easily.
>>>>>>>> 
>>>>>>>> 7. This is clearly not a "small" patch.  In fact to refer to it as a
>>>>>>>> "patch" is probably rather inappropriate.
>>>>>>>> 
>>>>>>>> The end result is that we now have a fully-functional prototype of HPR
>>>>>>>> and, already, we are seeing very significant performance improvements.
>>>>>>>> 
>>>>>>>> At the minimum, scene graph rendering performance has improved by 500%
>>>>>>>> and, with judicious and sometimes "tricky" use of caching, we have seen
>>>>>>>> improvements in performance of 10x or more.
>>>>>>>> 
>>>>>>>> And... we are only just *starting* with the performance optimisation 
>>>>>>>> phase.
>>>>>>>> 
>>>>>>>> The potential for HPR is massive as it opens up the possibility for the
>>>>>>>> JavaFX scene graph and the animation/transition infrastructure to be
>>>>>>>> used for a whole new class of applications including games, advanced
>>>>>>>> visualisations etc., without having to rely on imperative programming
>>>>>>>> of a single Canvas node.
>>>>>>>> 
>>>>>>>> I believe that HPR, along with tremendous recent developments like JPro
>>>>>>>> and the outstanding work by Gluon on mobiles and embedded devices, could
>>>>>>>> position JavaFX to be the best graphics toolkit of any kind in any
>>>>>>>> language and be the ONLY *truly* cross-platform graphics technology
>>>>>>>> available.
>>>>>>>> 
>>>>>>>> WORA for graphics and UIs is finally within reach!
>>>>>>>> 
>>>>>>>> Blessings,
>>>>>>>> 
>>>>>>>> Felix
>>>>>>> 
>>>> 
>>
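
For readers following the thread: the array-pool idea Laurent describes above (reuse arrays instead of allocating them, and clear only the dirty part on return) can be sketched roughly as below. This is a minimal illustration only, not Marlin's actual code; the class name IntArrayPool and its methods are hypothetical, and the sketch assumes single-threaded use and a caller that tracks its own dirty range.

```java
import java.util.ArrayDeque;
import java.util.Arrays;

/**
 * Hypothetical fixed-size int[] pool (illustrative only, not Marlin's API).
 * Not thread-safe: assumes one pool per rendering thread.
 */
final class IntArrayPool {
    private final ArrayDeque<int[]> free = new ArrayDeque<>();
    private final int length;
    private int allocations; // counts real "new int[]" calls

    IntArrayPool(int length) {
        this.length = length;
    }

    /** Returns a zero-filled array, reusing a pooled one when available. */
    int[] acquire() {
        int[] a = free.pollFirst();
        if (a == null) {
            allocations++;
            a = new int[length];
        }
        return a;
    }

    /**
     * Returns an array to the pool. Only the range [dirtyFrom, dirtyTo) that
     * the caller reports as written is cleared, instead of the whole array.
     */
    void release(int[] a, int dirtyFrom, int dirtyTo) {
        if (a.length != length) {
            throw new IllegalArgumentException("array does not belong to this pool");
        }
        Arrays.fill(a, dirtyFrom, dirtyTo, 0);
        free.addFirst(a);
    }

    int allocations() {
        return allocations;
    }
}
```

The pool stays correct only if callers report their dirty range honestly; a caller that writes outside the reported range would hand stale data to the next user of the array.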
