[jira] [Comment Edited] (PDFBOX-2126) Optimize clipping
[ https://issues.apache.org/jira/browse/PDFBOX-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14049857#comment-14049857 ] Petr Slaby edited comment on PDFBOX-2126 at 7/2/14 12:19 PM: - ... or save and restore lastClip as shown in the attached ClipPath.2.patch. The patch also resets the lastClip before calling processSubStream in annotation processing. This is necessary e.g. for the attached example_014.pdf containing an AcroForm. was (Author: pslabycz): ... or save and restore lastClip as shown in the attached ClipPath.2.patch. The patch also resets the lastClip before in processSubStream when painting annotation. This is necessary e.g. for the attached example_014.pdf containing an AcroForm. Optimize clipping - Key: PDFBOX-2126 URL: https://issues.apache.org/jira/browse/PDFBOX-2126 Project: PDFBox Issue Type: Improvement Components: Rendering Affects Versions: 2.0.0 Reporter: Petr Slaby Attachments: ClipPath.1.patch, ClipPath.2.patch, ClipPath.patch, PDFBOX-1772.pdf, PDFBOX-1772.pdf-1-bad.png, example_010.pdf, example_014.pdf, pdfbox-1772.pdf-1-good.png, screenshot.png As already stated in a TODO comment in PageDrawer, the call of Graphics2D#setClip() is time and memory consuming. The attached patch optimizes clipping by calling Graphics2D#setClip() only if the clipping path has changed. The effect depends on the document, e.g. the attached one renders in 10.5s without the optimization and in 5.5 seconds in the optimized version. The clipping has to be re-applied whenever the transform in Graphics2D changes. This is not explicitly checked for, the implementation rather depends on the cached value being reset manually. Currently this is only needed at one place when processing annotations (AcroForms). Also, the implementation relies upon the clipping path object stored in PDGraphicsState to never change so that a comparison using == can be used. This works fine, but needs a bit of awareness in future changes. To make the design more clean, the clipping path could be made private to PDGraphcisState and thus really immutable from outside. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (PDFBOX-2126) Optimize clipping
[ https://issues.apache.org/jira/browse/PDFBOX-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14050627#comment-14050627 ] John Hewson edited comment on PDFBOX-2126 at 7/2/14 7:51 PM: - Petr, I've applied your ClipPath.2.patch patch in [r1607463|http://svn.apache.org/r1607463] and the results are good. was (Author: jahewson): I've applied your ClipPath.2.patch patch in [r1607463|http://svn.apache.org/r1607463] and the results are good. Optimize clipping - Key: PDFBOX-2126 URL: https://issues.apache.org/jira/browse/PDFBOX-2126 Project: PDFBox Issue Type: Improvement Components: Rendering Affects Versions: 2.0.0 Reporter: Petr Slaby Attachments: ClipPath.1.patch, ClipPath.2.patch, ClipPath.patch, PDFBOX-1772.pdf, PDFBOX-1772.pdf-1-bad.png, example_010.pdf, example_014.pdf, pdfbox-1772.pdf-1-good.png, screenshot.png As already stated in a TODO comment in PageDrawer, the call of Graphics2D#setClip() is time and memory consuming. The attached patch optimizes clipping by calling Graphics2D#setClip() only if the clipping path has changed. The effect depends on the document, e.g. the attached one renders in 10.5s without the optimization and in 5.5 seconds in the optimized version. The clipping has to be re-applied whenever the transform in Graphics2D changes. This is not explicitly checked for, the implementation rather depends on the cached value being reset manually. Currently this is only needed at one place when processing annotations (AcroForms). Also, the implementation relies upon the clipping path object stored in PDGraphicsState to never change so that a comparison using == can be used. This works fine, but needs a bit of awareness in future changes. To make the design more clean, the clipping path could be made private to PDGraphcisState and thus really immutable from outside. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (PDFBOX-2126) Optimize clipping
[ https://issues.apache.org/jira/browse/PDFBOX-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14049141#comment-14049141 ] Tilman Hausherr edited comment on PDFBOX-2126 at 7/1/14 6:17 PM: - I'm getting really weird results: my tests are messing up a lot of files, e.g. ARCHIVERGB.ai and the HOTRODCMYK.ai that are in the pdfbox\src\test\resources\input\rendering directory. But rendering them from the command line with PDFReader produces a correct result. Another file, bugzilla886049.pdf is suddenly correct for the first time ever. But only with PDDToImage, not with PDFReader. was (Author: tilman): I'm getting really weird results: my tests are messing up a lot of files (although one, bugzilla886049.pdf is suddenly correct for the first time ever), but rendering them from the command line with PDFReader produces a correct result. Optimize clipping - Key: PDFBOX-2126 URL: https://issues.apache.org/jira/browse/PDFBOX-2126 Project: PDFBox Issue Type: Improvement Components: Rendering Affects Versions: 2.0.0 Reporter: Petr Slaby Attachments: ClipPath.1.patch, ClipPath.patch, PDFBOX-1772.pdf, PDFBOX-1772.pdf-1-bad.png, example_010.pdf, pdfbox-1772.pdf-1-good.png, screenshot.png As already stated in a TODO comment in PageDrawer, the call of Graphics2D#setClip() is time and memory consuming. The attached patch optimizes clipping by calling Graphics2D#setClip() only if the clipping path has changed. The effect depends on the document, e.g. the attached one renders in 10.5s without the optimization and in 5.5 seconds in the optimized version. The clipping has to be re-applied whenever the transform in Graphics2D changes. This is not explicitly checked for, the implementation rather depends on the cached value being reset manually. Currently this is only needed at one place when processing annotations (AcroForms). Also, the implementation relies upon the clipping path object stored in PDGraphicsState to never change so that a comparison using == can be used. This works fine, but needs a bit of awareness in future changes. To make the design more clean, the clipping path could be made private to PDGraphcisState and thus really immutable from outside. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (PDFBOX-2126) Optimize clipping
[ https://issues.apache.org/jira/browse/PDFBOX-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14048206#comment-14048206 ] John Hewson edited comment on PDFBOX-2126 at 6/30/14 9:47 PM: -- Ok, I took a look at Petr's optimisations and did some profiling of setClip. The main performance issue was that processTextPosition is called for each glyph and in PageDrawer this calls setClip, however the clipping path can't actually change in between BT and ET operators, so we were calling setClip needlessly 15,000 times or so. I added a new beginText() method to PDFStreamEngine and override that in PageDrawer to set up the graphics for the text - that took the time from 7sec on my machine to about 4.8sec. While making these changes I removed the creation of TextPosition instances from PDFStreamEngine and moved it into a new PDFTextStreamEngine class, because we're interested in glyphs, not in text runs - this fits very well with the refactoring I'm doing in PDFBOX-2149 already. Next I applied Petr's optimisations for not cloning the clipping path in PDGraphicsState and then checking if the clip is needed in PageDrawer. I called this method setClip(). It saves another 0.5sec or so on my machine. I made these changes in [r1606936|http://svn.apache.org/r1606936]. was (Author: jahewson): Ok, I took a look at Petr's optimisations and did some profiling of setClip. The main performance issue was that processTextPosition is called for each glyph and in PageDrawer this calls setClip, however the clipping path can't actually change in between BT and ET operators, so we were calling setClip 15,000 times or so. I added a new beginText() method to PDFStreamEngine and override that in PageDrawer to set up the graphics for the text - that took the time from 7sec on my machine to about 4.8sec. While making these changes I removed the creation of TextPosition instances from PDFStreamEngine and moved it into a new PDFTextStreamEngine class, because we're interested in glyphs, not in text runs - this fits very well with the refactoring I'm doing in PDFBOX-2149 already. Next I applied Petr's optimisations for not cloning the clipping path in PDGraphicsState and then checking if the clip is needed in PageDrawer. I called this method setClip(). It saves another 0.5sec or so on my machine. I made these changes in [r1606936|http://svn.apache.org/r1606936]. Optimize clipping - Key: PDFBOX-2126 URL: https://issues.apache.org/jira/browse/PDFBOX-2126 Project: PDFBox Issue Type: Improvement Components: Rendering Affects Versions: 2.0.0 Reporter: Petr Slaby Attachments: ClipPath.1.patch, ClipPath.patch, example_010.pdf As already stated in a TODO comment in PageDrawer, the call of Graphics2D#setClip() is time and memory consuming. The attached patch optimizes clipping by calling Graphics2D#setClip() only if the clipping path has changed. The effect depends on the document, e.g. the attached one renders in 10.5s without the optimization and in 5.5 seconds in the optimized version. The clipping has to be re-applied whenever the transform in Graphics2D changes. This is not explicitly checked for, the implementation rather depends on the cached value being reset manually. Currently this is only needed at one place when processing annotations (AcroForms). Also, the implementation relies upon the clipping path object stored in PDGraphicsState to never change so that a comparison using == can be used. This works fine, but needs a bit of awareness in future changes. To make the design more clean, the clipping path could be made private to PDGraphcisState and thus really immutable from outside. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (PDFBOX-2126) Optimize clipping
[ https://issues.apache.org/jira/browse/PDFBOX-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14048256#comment-14048256 ] Tilman Hausherr edited comment on PDFBOX-2126 at 6/30/14 10:29 PM: --- Some differences: - The file from PDFBOX-1608 is rendered different, veggie good on the left is grey instead of green - The file from PDFBOX-1689 is rendered without the ELVIA title on the top left - The file from PDFBOX-1689 is rendered on page 1 without EINLADUNG There are many more, but maybe it is all the same reason (color?) Nevertheless {code} the clipping path can't actually change in between BT and ET operators, so we were calling setClip needlessly 15,000 times or so. {code} is a great observation. was (Author: tilman): Some difference: - The file from PDFBOX-1608 is redered different, veggie good on the left is grey instead of green - The file from PDFBOX-1689 is rendered without the ELVIA title on the top left - The file from PDFBOX-1689 is rendered on page 1 without EINLADUNG There are many more, but maybe it is all the same reason (color?) Nevertheless {code} the clipping path can't actually change in between BT and ET operators, so we were calling setClip needlessly 15,000 times or so. {code} is a great observation. Optimize clipping - Key: PDFBOX-2126 URL: https://issues.apache.org/jira/browse/PDFBOX-2126 Project: PDFBox Issue Type: Improvement Components: Rendering Affects Versions: 2.0.0 Reporter: Petr Slaby Attachments: ClipPath.1.patch, ClipPath.patch, example_010.pdf As already stated in a TODO comment in PageDrawer, the call of Graphics2D#setClip() is time and memory consuming. The attached patch optimizes clipping by calling Graphics2D#setClip() only if the clipping path has changed. The effect depends on the document, e.g. the attached one renders in 10.5s without the optimization and in 5.5 seconds in the optimized version. The clipping has to be re-applied whenever the transform in Graphics2D changes. This is not explicitly checked for, the implementation rather depends on the cached value being reset manually. Currently this is only needed at one place when processing annotations (AcroForms). Also, the implementation relies upon the clipping path object stored in PDGraphicsState to never change so that a comparison using == can be used. This works fine, but needs a bit of awareness in future changes. To make the design more clean, the clipping path could be made private to PDGraphcisState and thus really immutable from outside. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (PDFBOX-2126) Optimize clipping
[ https://issues.apache.org/jira/browse/PDFBOX-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047342#comment-14047342 ] John Hewson edited comment on PDFBOX-2126 at 6/30/14 3:35 AM: -- {quote} Because of the clippingPath.clone() in PDGraphicsState.clone() which is now necessary {quote} {{clippingPath}} was never null though, so it was always being cloned anyway. was (Author: jahewson): {quote} Because of the clippingPath.clone() in PDGraphicsState.clone() which is now necessary {quote} clippingPath was never null though, so it was always being cloned anyway. Optimize clipping - Key: PDFBOX-2126 URL: https://issues.apache.org/jira/browse/PDFBOX-2126 Project: PDFBox Issue Type: Improvement Components: Rendering Affects Versions: 2.0.0 Reporter: Petr Slaby Attachments: ClipPath.1.patch, ClipPath.patch, example_010.pdf As already stated in a TODO comment in PageDrawer, the call of Graphics2D#setClip() is time and memory consuming. The attached patch optimizes clipping by calling Graphics2D#setClip() only if the clipping path has changed. The effect depends on the document, e.g. the attached one renders in 10.5s without the optimization and in 5.5 seconds in the optimized version. The clipping has to be re-applied whenever the transform in Graphics2D changes. This is not explicitly checked for, the implementation rather depends on the cached value being reset manually. Currently this is only needed at one place when processing annotations (AcroForms). Also, the implementation relies upon the clipping path object stored in PDGraphicsState to never change so that a comparison using == can be used. This works fine, but needs a bit of awareness in future changes. To make the design more clean, the clipping path could be made private to PDGraphcisState and thus really immutable from outside. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (PDFBOX-2126) Optimize clipping
[ https://issues.apache.org/jira/browse/PDFBOX-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047342#comment-14047342 ] John Hewson edited comment on PDFBOX-2126 at 6/30/14 3:35 AM: -- [~pslabycz]: {quote} Because of the clippingPath.clone() in PDGraphicsState.clone() which is now necessary {quote} {{clippingPath}} was never null though, so it was always being cloned anyway. was (Author: jahewson): [~pslabycz] {quote} Because of the clippingPath.clone() in PDGraphicsState.clone() which is now necessary {quote} {{clippingPath}} was never null though, so it was always being cloned anyway. Optimize clipping - Key: PDFBOX-2126 URL: https://issues.apache.org/jira/browse/PDFBOX-2126 Project: PDFBox Issue Type: Improvement Components: Rendering Affects Versions: 2.0.0 Reporter: Petr Slaby Attachments: ClipPath.1.patch, ClipPath.patch, example_010.pdf As already stated in a TODO comment in PageDrawer, the call of Graphics2D#setClip() is time and memory consuming. The attached patch optimizes clipping by calling Graphics2D#setClip() only if the clipping path has changed. The effect depends on the document, e.g. the attached one renders in 10.5s without the optimization and in 5.5 seconds in the optimized version. The clipping has to be re-applied whenever the transform in Graphics2D changes. This is not explicitly checked for, the implementation rather depends on the cached value being reset manually. Currently this is only needed at one place when processing annotations (AcroForms). Also, the implementation relies upon the clipping path object stored in PDGraphicsState to never change so that a comparison using == can be used. This works fine, but needs a bit of awareness in future changes. To make the design more clean, the clipping path could be made private to PDGraphcisState and thus really immutable from outside. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (PDFBOX-2126) Optimize clipping
[ https://issues.apache.org/jira/browse/PDFBOX-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047342#comment-14047342 ] John Hewson edited comment on PDFBOX-2126 at 6/30/14 3:35 AM: -- [~pslabycz] {quote} Because of the clippingPath.clone() in PDGraphicsState.clone() which is now necessary {quote} {{clippingPath}} was never null though, so it was always being cloned anyway. was (Author: jahewson): {quote} Because of the clippingPath.clone() in PDGraphicsState.clone() which is now necessary {quote} {{clippingPath}} was never null though, so it was always being cloned anyway. Optimize clipping - Key: PDFBOX-2126 URL: https://issues.apache.org/jira/browse/PDFBOX-2126 Project: PDFBox Issue Type: Improvement Components: Rendering Affects Versions: 2.0.0 Reporter: Petr Slaby Attachments: ClipPath.1.patch, ClipPath.patch, example_010.pdf As already stated in a TODO comment in PageDrawer, the call of Graphics2D#setClip() is time and memory consuming. The attached patch optimizes clipping by calling Graphics2D#setClip() only if the clipping path has changed. The effect depends on the document, e.g. the attached one renders in 10.5s without the optimization and in 5.5 seconds in the optimized version. The clipping has to be re-applied whenever the transform in Graphics2D changes. This is not explicitly checked for, the implementation rather depends on the cached value being reset manually. Currently this is only needed at one place when processing annotations (AcroForms). Also, the implementation relies upon the clipping path object stored in PDGraphicsState to never change so that a comparison using == can be used. This works fine, but needs a bit of awareness in future changes. To make the design more clean, the clipping path could be made private to PDGraphcisState and thus really immutable from outside. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (PDFBOX-2126) Optimize clipping
[ https://issues.apache.org/jira/browse/PDFBOX-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14047342#comment-14047342 ] John Hewson edited comment on PDFBOX-2126 at 6/30/14 3:36 AM: -- [~pslabycz]: {quote} Because of the clippingPath.clone() in PDGraphicsState.clone() which is now necessary {quote} {{clippingPath}} was never null though (excluding the preflight bug which Andreas found), so it was always being cloned anyway. was (Author: jahewson): [~pslabycz]: {quote} Because of the clippingPath.clone() in PDGraphicsState.clone() which is now necessary {quote} {{clippingPath}} was never null though, so it was always being cloned anyway. Optimize clipping - Key: PDFBOX-2126 URL: https://issues.apache.org/jira/browse/PDFBOX-2126 Project: PDFBox Issue Type: Improvement Components: Rendering Affects Versions: 2.0.0 Reporter: Petr Slaby Attachments: ClipPath.1.patch, ClipPath.patch, example_010.pdf As already stated in a TODO comment in PageDrawer, the call of Graphics2D#setClip() is time and memory consuming. The attached patch optimizes clipping by calling Graphics2D#setClip() only if the clipping path has changed. The effect depends on the document, e.g. the attached one renders in 10.5s without the optimization and in 5.5 seconds in the optimized version. The clipping has to be re-applied whenever the transform in Graphics2D changes. This is not explicitly checked for, the implementation rather depends on the cached value being reset manually. Currently this is only needed at one place when processing annotations (AcroForms). Also, the implementation relies upon the clipping path object stored in PDGraphicsState to never change so that a comparison using == can be used. This works fine, but needs a bit of awareness in future changes. To make the design more clean, the clipping path could be made private to PDGraphcisState and thus really immutable from outside. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (PDFBOX-2126) Optimize clipping
[ https://issues.apache.org/jira/browse/PDFBOX-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046836#comment-14046836 ] Tilman Hausherr edited comment on PDFBOX-2126 at 6/28/14 12:13 PM: --- +1 (I was also thinking about testing the patch this weekend). Sorry, I too just made a refactoring in that area as part of PDFBOX-1875. ( http://svn.apache.org/r1606326 ) was (Author: tilman): +1 (I was also thinking about testing the patch this weekend). Sorry, I also just made a refactoring in that area as part of PDFBOX-1875. ( http://svn.apache.org/r1606326 ) Optimize clipping - Key: PDFBOX-2126 URL: https://issues.apache.org/jira/browse/PDFBOX-2126 Project: PDFBox Issue Type: Improvement Components: Rendering Affects Versions: 2.0.0 Reporter: Petr Slaby Attachments: ClipPath.1.patch, ClipPath.patch, example_010.pdf As already stated in a TODO comment in PageDrawer, the call of Graphics2D#setClip() is time and memory consuming. The attached patch optimizes clipping by calling Graphics2D#setClip() only if the clipping path has changed. The effect depends on the document, e.g. the attached one renders in 10.5s without the optimization and in 5.5 seconds in the optimized version. The clipping has to be re-applied whenever the transform in Graphics2D changes. This is not explicitly checked for, the implementation rather depends on the cached value being reset manually. Currently this is only needed at one place when processing annotations (AcroForms). Also, the implementation relies upon the clipping path object stored in PDGraphicsState to never change so that a comparison using == can be used. This works fine, but needs a bit of awareness in future changes. To make the design more clean, the clipping path could be made private to PDGraphcisState and thus really immutable from outside. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (PDFBOX-2126) Optimize clipping
[ https://issues.apache.org/jira/browse/PDFBOX-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046702#comment-14046702 ] John Hewson edited comment on PDFBOX-2126 at 6/28/14 2:42 AM: -- I like what you're doing with this commit but when I looked closer I thought that the handling of clipping paths needed some more significant improvements. I've done some refactoring in [s1606283|http://svn.apache.org/s1606283] to PDGraphicsState, see what you think. I haven't applied any optimisations yet, you're welcome to update your patch again. Feedback appreciated. was (Author: jahewson): I like what you're doing with this commit but when I looked at applying at I thought that the handling of clipping paths needed some more significant improvements. I've done some refactoring in [s1606283|http://svn.apache.org/s1606283] to PDGraphicsState, see what you think. I haven't applied any optimisations yet, you're welcome to update your patch again. Feedback appreciated. Optimize clipping - Key: PDFBOX-2126 URL: https://issues.apache.org/jira/browse/PDFBOX-2126 Project: PDFBox Issue Type: Improvement Components: Rendering Affects Versions: 2.0.0 Reporter: Petr Slaby Attachments: ClipPath.1.patch, ClipPath.patch, example_010.pdf As already stated in a TODO comment in PageDrawer, the call of Graphics2D#setClip() is time and memory consuming. The attached patch optimizes clipping by calling Graphics2D#setClip() only if the clipping path has changed. The effect depends on the document, e.g. the attached one renders in 10.5s without the optimization and in 5.5 seconds in the optimized version. The clipping has to be re-applied whenever the transform in Graphics2D changes. This is not explicitly checked for, the implementation rather depends on the cached value being reset manually. Currently this is only needed at one place when processing annotations (AcroForms). Also, the implementation relies upon the clipping path object stored in PDGraphicsState to never change so that a comparison using == can be used. This works fine, but needs a bit of awareness in future changes. To make the design more clean, the clipping path could be made private to PDGraphcisState and thus really immutable from outside. -- This message was sent by Atlassian JIRA (v6.2#6252)