[jira] [Updated] (PIVOT-778) Optimise DisplayHost.paintBuffered and DisplayHost.paintVolatileBuffered

2011-10-24 Thread Sandro Martini (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIVOT-778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandro Martini updated PIVOT-778:
-

Affects Version/s: 2.0.1
Fix Version/s: (was: 2.0.1)

The latest version of Noel's fix (already committed) probably solves this 
completely, but a critical fix like this needs more testing, so it has been 
moved to 2.1 (and needs to be reverted from the committed sources).

Noel, can you attach a patch with the related changes here, so that anyone who 
wants to patch Pivot 2.0.1 can do so?

Thank you very much,
Sandro


> Optimise DisplayHost.paintBuffered and DisplayHost.paintVolatileBuffered
> 
>
> Key: PIVOT-778
> URL: https://issues.apache.org/jira/browse/PIVOT-778
> Project: Pivot
> Issue Type: Improvement
> Components: wtk
> Affects Versions: 2.0, 2.0.1
> Reporter: Piotr Kołaczkowski
> Assignee: Noel Grandin
> Labels: DisplayHost, caching, gc, paint, performance, repaint
> Fix For: 2.1
>
>
> We are writing a sort of game that continually calls the Component.repaint 
> method, at 60 FPS. We noticed excessive CPU usage, although the actual amount 
> of painting done by our component (in an overridden Panel.paint) is 
> ridiculously small. The profiler pointed us to the paintVolatileBuffered 
> method in DisplayHost. What that method does is:
> 1. obtain a new, fresh BufferedImage of size equal to the actual clip region; 
> for a full-screen game that can be about 1280x1024. This is 1.3 Mpix 
> x 4 bytes/pixel = 5.2 MB of raw data, allocated from a probably cold memory 
> region (not in the L2 cache)
> 2. call the actual paint on that buffered image (touching at 
> least 5.2 MB again)
> 3. copy the result to the onscreen buffer (copying 5.2 MB a 
> third time)
> 4. if the GC kicks in between steps 1 and 3, it has to move the BufferedImage 
> in memory to compact the young generation (touching 5.2 MB a fourth time)
> The whole process means allocating 5.2 MB from cold memory for each frame and 
> touching about 20 MB per frame.
> At 60 FPS this adds up to an allocation rate of ~300 MB/s and a memory 
> throughput of about 1.2 GB/s. It also makes the GC go crazy.
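
The figures quoted above can be checked with a few lines of Java. This is only a sketch of the arithmetic: the class and method names are made up for illustration, and the 4 bytes/pixel and "four touches per frame" values are taken from the description rather than measured.

```java
/** Hypothetical helper (not part of Pivot) that reproduces the arithmetic in the description. */
final class FrameBufferMath {
    /** Assumption from the description: 4 bytes per pixel. */
    static final long BYTES_PER_PIXEL = 4;

    /** Raw size in bytes of one width x height offscreen buffer. */
    static long bufferBytes(int width, int height) {
        return (long) width * height * BYTES_PER_PIXEL;
    }

    /** Bytes per second when the given number of bytes is handled once per frame. */
    static long bytesPerSecond(long bytesPerFrame, int fps) {
        return bytesPerFrame * fps;
    }

    public static void main(String[] args) {
        long perBuffer = bufferBytes(1280, 1024);
        // Allocation: one fresh buffer per frame.
        long allocated = bytesPerSecond(perBuffer, 60);
        // Throughput: the buffer is touched about four times per frame
        // (allocate, paint, copy, GC move), as listed in the description.
        long touched = bytesPerSecond(perBuffer * 4, 60);

        System.out.println("bytes per buffer:     " + perBuffer);
        System.out.println("allocated per second: " + allocated);
        System.out.println("touched per second:   " + touched);
    }
}
```

The exact values come out slightly above the rounded ones in the description (about 5.24 MB per buffer rather than 5.2 MB), which does not change the conclusion.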
> We have found that caching the buffer between the subsequent paint calls 
> improves performance a lot:
> 
> /** Stores the prepared offscreen buffer */
> private BufferedImage bufferedImage;
>
> /**
>  * Attempts to paint the display using an offscreen buffer.
>  *
>  * @param graphics
>  * The source graphics context.
>  *
>  * @return
>  * true if the display was painted using the offscreen
>  * buffer; false, otherwise.
>  */
> private boolean paintBuffered(Graphics2D graphics) {
>     boolean painted = false;
>     // Paint the display into an offscreen buffer
>     GraphicsConfiguration gc = graphics.getDeviceConfiguration();
>     java.awt.Rectangle clipBounds = graphics.getClipBounds();
>     if (bufferedImage == null ||
>         bufferedImage.getWidth() < clipBounds.width ||
>         bufferedImage.getHeight() < clipBounds.height) {
>         bufferedImage = gc.createCompatibleImage(clipBounds.width,
>             clipBounds.height,
>             Transparency.OPAQUE);
>     }
>     if (bufferedImage != null) {
>         Graphics2D bufferedImageGraphics =
>             (Graphics2D)bufferedImage.getGraphics();
>         bufferedImageGraphics.setClip(0, 0, clipBounds.width,
> ...
>  
> Advantages:
> 1. it avoids the costly allocation of a large object from a possibly uncached 
> memory region
> 2. after a few repaints the GC moves this object to the tenured generation, 
> so the young-generation collector is much more efficient (longer times 
> between runs)
> 3. the image probably stays in the L2 or L3 cache most of the time, which 
> saves memory bandwidth and speeds up painting
> Disadvantages:
> 1. it uses some memory that is probably not required all the time, e.g. when 
> the app doesn't need to repaint anything large; however, this cost is almost 
> completely outweighed by the GC overhead otherwise caused by continuously 
> recreating the offscreen buffered image
> Anyway, we observed about a 2-4x performance increase from this simple change: 
> now, when running at 60 FPS, it uses only about 25% of the CPU for painting, 
> and the rest can be used by the application logic (AI, etc.). Previously, 60 
> FPS was probably the most we could achieve on a 2.2 GHz Core 2 Duo. Of course, 
> this change won't affect "business applications" that don't do animations etc.
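
For readers who want to try the caching idea outside Pivot, here is a minimal, self-contained sketch. It is not the committed fix and not DisplayHost code: the CachedOffscreenBuffer class and its methods are hypothetical, and it allocates with the plain BufferedImage constructor instead of GraphicsConfiguration.createCompatibleImage so that it can run without a display.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

/** Hypothetical helper (not part of Pivot): reuses one offscreen buffer across paint calls. */
final class CachedOffscreenBuffer {
    private BufferedImage buffer;
    private int allocations;

    /**
     * Returns a buffer of at least width x height pixels, allocating a new
     * one only when the cached buffer is missing or too small.
     */
    BufferedImage acquire(int width, int height) {
        if (buffer == null
                || buffer.getWidth() < width
                || buffer.getHeight() < height) {
            buffer = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
            allocations++;
        }
        return buffer;
    }

    /** Number of times a new buffer had to be allocated. */
    int allocationCount() {
        return allocations;
    }

    public static void main(String[] args) {
        CachedOffscreenBuffer cache = new CachedOffscreenBuffer();
        // Simulate 60 full-screen repaints that all request the same clip size.
        for (int frame = 0; frame < 60; frame++) {
            BufferedImage image = cache.acquire(1280, 1024);
            Graphics2D g = image.createGraphics();
            try {
                // The cached image may be larger than requested, so clip to the requested area.
                g.setClip(0, 0, 1280, 1024);
                g.fillRect(0, 0, 16, 16);
            } finally {
                g.dispose();
            }
        }
        System.out.println("allocations: " + cache.allocationCount());
    }
}
```

With a constant clip size the loop allocates once and then reuses the same image, so it prints "allocations: 1". Because the cached image can be larger than the current clip, a real implementation would also have to copy only the requested region back to the screen.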

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:

[jira] [Updated] (PIVOT-778) Optimise DisplayHost.paintBuffered and DisplayHost.paintVolatileBuffered

2011-10-17 Thread Sandro Martini (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIVOT-778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandro Martini updated PIVOT-778:
-

Affects Version/s: (was: 2.0.1)
Fix Version/s: (was: 2.0.2)
   2.0.1

Reassigned to 2.0.1 to see whether we can get some enhancement into 2.0.1, and 
maybe put a long-term solution into 2.1 afterwards (if needed).


> Optimise DisplayHost.paintBuffered and DisplayHost.paintVolatileBuffered
> 
>
> Key: PIVOT-778
> URL: https://issues.apache.org/jira/browse/PIVOT-778
> Project: Pivot
> Issue Type: Improvement
> Components: wtk
> Affects Versions: 2.0
> Reporter: Piotr Kołaczkowski
> Assignee: Noel Grandin
> Labels: DisplayHost, caching, gc, paint, performance, repaint
> Fix For: 2.0.1, 2.1
>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (PIVOT-778) Optimise DisplayHost.paintBuffered and DisplayHost.paintVolatileBuffered

2011-08-08 Thread Sandro Martini (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIVOT-778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandro Martini updated PIVOT-778:
-

Fix Version/s: (was: 2.0.1)
   2.1
   2.0.2
 Priority: Major  (was: Minor)
Affects Version/s: 2.0.1

> Optimise DisplayHost.paintBuffered and DisplayHost.paintVolatileBuffered
> 
>
> Key: PIVOT-778
> URL: https://issues.apache.org/jira/browse/PIVOT-778
> Project: Pivot
> Issue Type: Improvement
> Components: wtk
> Affects Versions: 2.0, 2.0.1
> Reporter: Piotr Kołaczkowski
> Labels: DisplayHost, caching, gc, paint, performance, repaint
> Fix For: 2.0.2, 2.1
>





[jira] [Updated] (PIVOT-778) Optimise DisplayHost.paintBuffered and DisplayHost.paintVolatileBuffered

2011-07-27 Thread Sandro Martini (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIVOT-778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandro Martini updated PIVOT-778:
-

Comment: was deleted

(was: Comments ?)

> Optimise DisplayHost.paintBuffered and DisplayHost.paintVolatileBuffered
> 
>
> Key: PIVOT-778
> URL: https://issues.apache.org/jira/browse/PIVOT-778
> Project: Pivot
> Issue Type: Improvement
> Components: wtk
> Affects Versions: 2.0
> Reporter: Piotr Kołaczkowski
> Priority: Minor
> Labels: DisplayHost, caching, gc, paint, performance, repaint
> Fix For: 2.0.1
>





[jira] [Updated] (PIVOT-778) Optimise DisplayHost.paintBuffered and DisplayHost.paintVolatileBuffered

2011-07-27 Thread Sandro Martini (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIVOT-778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandro Martini updated PIVOT-778:
-

Fix Version/s: 2.0.1

Comments ?

> Optimise DisplayHost.paintBuffered and DisplayHost.paintVolatileBuffered
> 
>
> Key: PIVOT-778
> URL: https://issues.apache.org/jira/browse/PIVOT-778
> Project: Pivot
> Issue Type: Improvement
> Components: wtk
> Affects Versions: 2.0
> Reporter: Piotr Kołaczkowski
> Priority: Minor
> Labels: DisplayHost, caching, gc, paint, performance, repaint
> Fix For: 2.0.1
>
