[ 
https://issues.apache.org/jira/browse/PIVOT-778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071651#comment-13071651
 ] 

Chris Bartlett commented on PIVOT-778:
--------------------------------------

Piotr - This looks interesting, but it would really help if you could supply an 
example that can be used as a simple benchmark.  Something that paints with the 
current method and then with the optimised method.
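
A minimal sketch of such a benchmark, for illustration only. It is a standalone program, not Pivot code: plain BufferedImage allocation stands in for gc.createCompatibleImage, paintFrame is a hypothetical stand-in for the component's paint method, and the copy to the onscreen buffer is left out. The class and method names are assumptions made for this example.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

public class BufferBenchmark {
    static final int WIDTH = 1280;
    static final int HEIGHT = 1024;

    /** Offscreen buffer kept between frames (the proposed approach). */
    private static BufferedImage cached;

    /** Current approach: allocate a fresh buffer for every frame. */
    static BufferedImage freshBuffer(int width, int height) {
        return new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
    }

    /** Proposed approach: reuse the buffer unless it is too small. */
    static BufferedImage cachedBuffer(int width, int height) {
        if (cached == null
                || cached.getWidth() < width
                || cached.getHeight() < height) {
            cached = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        }
        return cached;
    }

    /** Hypothetical stand-in for the real paint: a deliberately small amount of drawing. */
    static void paintFrame(BufferedImage buffer, int frame) {
        Graphics2D g = buffer.createGraphics();
        try {
            g.setClip(0, 0, buffer.getWidth(), buffer.getHeight());
            g.setColor(Color.WHITE);
            g.fillRect(frame % 100, frame % 100, 32, 32);
        } finally {
            g.dispose();
        }
    }

    /** Paints the given number of frames and returns the elapsed time in nanoseconds. */
    static long timeNanos(boolean useCache, int frames) {
        long start = System.nanoTime();
        for (int i = 0; i < frames; i++) {
            BufferedImage buffer = useCache
                    ? cachedBuffer(WIDTH, HEIGHT)
                    : freshBuffer(WIDTH, HEIGHT);
            paintFrame(buffer, i);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int frames = 600;
        // Warm up both paths so JIT compilation does not skew the comparison.
        timeNanos(false, 60);
        timeNanos(true, 60);
        System.out.println("fresh  (ms): " + timeNanos(false, frames) / 1_000_000);
        System.out.println("cached (ms): " + timeNanos(true, frames) / 1_000_000);
    }
}
```

Running it with -verbose:gc should also show how often the young generation is collected in each mode. The numbers will depend on the JVM, heap settings and hardware, so treat them as a rough comparison rather than a measurement of DisplayHost itself.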

> Optimise DisplayHost.paintBuffered and DisplayHost.paintVolatileBuffered
> ------------------------------------------------------------------------
>
>                 Key: PIVOT-778
>                 URL: https://issues.apache.org/jira/browse/PIVOT-778
>             Project: Pivot
>          Issue Type: Improvement
>          Components: wtk
>    Affects Versions: 2.0
>            Reporter: Piotr Kołaczkowski
>            Priority: Minor
>              Labels: DisplayHost, caching, gc, paint, performance, repaint
>             Fix For: 2.0.1
>
>
> We are writing a sort of game that calls the Component.repaint method 
> continually, at 60 FPS. We noticed excessive CPU usage, although the actual 
> amount of painting done by our component (in an overridden Panel.paint) is 
> ridiculously small. The profiler pointed us to the paintVolatileBuffered 
> method in DisplayHost. What you are doing there is:
> 1. obtain a new, fresh BufferedImage the size of the current clip region; 
> for a full-screen game that can be about 1280x1024. This is 1.3 Mpix 
> x 4 bytes/pixel = 5.2 MB of raw data, allocated from a probably cold memory 
> region (not in the L2 cache)
> 2. then you call the actual paint on that buffered image (this touches at 
> least 5.2 MB again)
> 3. then you copy that to the onscreen buffer (which means copying 5.2 MB a 
> third time)
> 4. if the GC kicks in between steps 1 and 3, it has to move the BufferedImage 
> in memory to compact the young generation (= touching 5.2 MB a fourth time)
> The whole process means allocating 5.2 MB from cold memory for each frame and 
> touching about 20 MB per frame.
> At 60 FPS this adds up to an allocation rate of ~300 MB/s and a memory 
> throughput of ~1.2 GB/s. It also makes the GC go crazy.
> We have found that caching the buffer between subsequent paint calls 
> improves performance a lot:
> <code>
>         /** Stores the prepared offscreen buffer */
>         private BufferedImage bufferedImage;
>         /**
>          * Attempts to paint the display using an offscreen buffer.
>          *
>          * @param graphics
>          * The source graphics context.
>          *
>          * @return
>          * <tt>true</tt> if the display was painted using the offscreen
>          * buffer; <tt>false</tt>, otherwise.
>          */
>         private boolean paintBuffered(Graphics2D graphics) {
>             boolean painted = false;
>             // Paint the display into an offscreen buffer
>             GraphicsConfiguration gc = graphics.getDeviceConfiguration();
>             java.awt.Rectangle clipBounds = graphics.getClipBounds();
>             if (bufferedImage == null ||
>                     bufferedImage.getWidth() < clipBounds.width ||
>                     bufferedImage.getHeight() < clipBounds.height)
>                 bufferedImage = gc.createCompatibleImage(clipBounds.width, clipBounds.height,
>                     Transparency.OPAQUE);
>             if (bufferedImage != null) {
>                 Graphics2D bufferedImageGraphics = (Graphics2D)bufferedImage.getGraphics();
>                 bufferedImageGraphics.setClip(0, 0, clipBounds.width,
> ...
> </code> 
> Advantages:
> 1. it avoids the costly allocation of a large object from a memory region 
> that may not be cached
> 2. after a few repaints the GC promotes this object to the tenured generation, 
> so the young generation collector is much more efficient (longer times 
> between runs)
> 3. the image probably stays in the L2 or L3 cache most of the time, which 
> saves memory bandwidth and speeds up painting
> Disadvantages:
> 1. it holds on to memory that is probably not needed all the time, e.g. when 
> the app doesn't need to repaint anything large; however, this cost is almost 
> completely outweighed by the excessive GC overhead of continuously recreating 
> the offscreen buffered image
> Anyway, we observed about a 2-4x performance increase from this simple change. 
> Now, when running at 60 FPS, it uses only about 25% of the CPU for painting, 
> and the rest can be used by the application logic (AI, etc.). Previously 
> 60 FPS was probably the most we could achieve on a 2.2 GHz Core 2 Duo. Of 
> course, this change won't affect "business applications" that don't do 
> animations etc.
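
The memory figures quoted in the description can be reproduced with a few lines of Java. This is only a back-of-the-envelope sketch: the class and method names are made up for this example, and the factor of four "touches" per frame (allocate, paint, copy, GC move) is taken from the four steps listed in the description.

```java
public class FrameMemoryEstimate {
    /** Size in bytes of one offscreen buffer. */
    static long bytesPerFrame(int width, int height, int bytesPerPixel) {
        return (long) width * height * bytesPerPixel;
    }

    /** Bytes per second for a given per-frame byte count and frame rate. */
    static long bytesPerSecond(long bytesPerFrame, int fps) {
        return bytesPerFrame * fps;
    }

    public static void main(String[] args) {
        int fps = 60;
        int touchesPerFrame = 4; // allocate, paint, copy, GC move (from the description)

        long buffer = bytesPerFrame(1280, 1024, 4);   // 5,242,880 bytes, about 5.2 MB
        long touched = buffer * touchesPerFrame;       // about 21 MB per frame

        System.out.println("buffer bytes per frame : " + buffer);
        System.out.println("touched bytes per frame: " + touched);
        System.out.println("allocated bytes per sec: " + bytesPerSecond(buffer, fps));
        System.out.println("touched bytes per sec  : " + bytesPerSecond(touched, fps));
    }
}
```

With decimal megabytes this gives roughly 315 MB/s allocated and about 1.26 GB/s touched, in line with the ~300 MB/s and 1.2 GB figures above.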

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

