[ https://issues.apache.org/jira/browse/PDFBOX-5852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868068#comment-17868068 ]
Tilman Hausherr commented on PDFBOX-5852:
-----------------------------------------

The time is spent in this shading
{{Root/Pages/Kids/[0]/Resources/Pattern/P0/Resources/Shading/Sh0}}
and in this softmask shading
{{Root/Pages/Kids/[0]/Resources/Pattern/P0/Resources/ExtGState/GS0/SMask/G/Resources/Shading/Sh0}}

Both shadings are also function-based and have a stitching function that is made of 7 different type 0 functions (shading) / type 2 functions (softmask shading). The type 0 functions deliver the same result for any input. The type 2 functions all have an exponent of 1.

The two slow shadings have a bounding box of 1226 x 1226.

I tried some caching in the functions but it made no difference, maybe because Java optimizes it itself. A look with VisualVM shows that some amount of time is lost in map put and get, which is related to the size of the shading.

> Hi CPU and memory usage when converting a PDF with type 4 shading
> -----------------------------------------------------------------
>
> Key: PDFBOX-5852
> URL: https://issues.apache.org/jira/browse/PDFBOX-5852
> Project: PDFBox
> Issue Type: Wish
> Components: Rendering
> Affects Versions: 2.0.28
> Reporter: Larry Lynn
> Priority: Major
> Attachments: minimal.pdf
>
>
> We've observed excessive CPU and memory consumption when converting a PDF to
> images when the PDF contains type 4 shading. This is especially noticeable
> when the conversion is done with a high DPI. Can this be improved?
>
> Conversation from the PDFBox users mailing list follows
> Initial email:
> {code:java}
> Hi CPU and memory usage when converting a PDF with type 4 shading
> Hello PDFBox users and maintainers,
> We have a PDF that causes performance problems when we use PDFBox to
> convert it to an image with renderImageWithDPI(). We're calling
> renderImageWithDPI()
> with 650 DPI. I realize this is a very high value - we're using it for
> high fidelity original images that will later be downsampled.
> On my work laptop which has fairly strong hardware, the conversion takes
> 25 minutes and consumes 20GB of memory. CPU and memory usage is reduced if
> we use a lower DPI.
> The PDF is 1 page long. It contains type 4 shading / Gouraud free form
> triangle meshes. We've been aware of some performance issues with type 4
> shading for a little while now, but the PDFs that contained the type 4
> shading belonged to our customers and we were not authorized to share
> them. We finally found a problem input document that is non-sensitive and
> that we are authorized to share. I've attached a copy of the problem PDF
> to this email.
> I searched the archives for the users and the developers mailing list and I
> didn't find anything specifically about this issue.
> I searched through the PDFBox jira tickets and I found a couple of tickets
> that looked similar: PDFBOX-2901 & PDFBOX-4491. PDFBOX-2901 seems to most
> closely describe what we're seeing, but that was closed in PDFBox 2.0.0,
> and our issue still reproduces with PDFBox 2.0.28.
> Should I refer this issue over to the developers mailing list or create a
> PDFBox Jira ticket for this?
> Thanks and Regards,
> Larry Lynn
> {code}
> Response:
> {code:java}
> Hi,
> Yes shading can be very slow, especially at high dpi. The attachment
> didn't get through, please upload to a sharehoster or create a ticket.
> If you need to register then add a meaningful text, e.g. the subject of
> this post so we know you're not a spammer. Also retry with 2.0.31 and
> 3.0.2 just to be sure. However I'm pessimistic that this can be fixed.
> Tilman
> {code}
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org
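
For readers unfamiliar with the function types mentioned in the comment, here is a minimal sketch of how a stitching (type 3) function dispatches to type 2 subfunctions, and why an exponent of 1 makes each segment a plain linear interpolation. This is not PDFBox code: the class {{StitchingSketch}} and the methods {{exponential}}, {{interpolate}} and {{stitch}} are hypothetical names, the functions are reduced to a single output component, and the two-segment setup in {{main}} is made up for illustration (the file in the ticket has 7 segments).

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical illustration only; none of these names come from PDFBox.
public class StitchingSketch {

    /** Type 2 (exponential interpolation), one output component: C0 + x^N * (C1 - C0). */
    public static DoubleUnaryOperator exponential(double c0, double c1, double n) {
        return x -> c0 + Math.pow(x, n) * (c1 - c0);
    }

    /** Linear map of x from [xMin, xMax] to [yMin, yMax]; assumes xMin != xMax. */
    public static double interpolate(double x, double xMin, double xMax,
                                     double yMin, double yMax) {
        return yMin + (x - xMin) * (yMax - yMin) / (xMax - xMin);
    }

    /**
     * Type 3 (stitching): bounds split the domain [d0, d1] into subdomains,
     * encode holds two values per subfunction, and the input is re-mapped
     * into the encode range of the selected subfunction before evaluation.
     */
    public static double stitch(double x, double d0, double d1, double[] bounds,
                                double[] encode, DoubleUnaryOperator[] functions) {
        x = Math.max(d0, Math.min(d1, x));           // clip to the domain
        int k = 0;
        while (k < bounds.length && x >= bounds[k]) {
            k++;                                     // find the subdomain index
        }
        double lo = (k == 0) ? d0 : bounds[k - 1];
        double hi = (k == bounds.length) ? d1 : bounds[k];
        double e = interpolate(x, lo, hi, encode[2 * k], encode[2 * k + 1]);
        return functions[k].applyAsDouble(e);
    }

    public static void main(String[] args) {
        // Two segments with exponent 1: each one is a straight line.
        DoubleUnaryOperator[] fs = {
            exponential(0.0, 0.5, 1.0),
            exponential(0.5, 1.0, 1.0)
        };
        double[] bounds = {0.5};
        double[] encode = {0, 1, 0, 1};
        System.out.println(stitch(0.25, 0.0, 1.0, bounds, encode, fs));
        System.out.println(stitch(0.75, 0.0, 1.0, bounds, encode, fs));
    }
}
```

With N = 1 the {{Math.pow}} call reduces to the identity, so every evaluation is a few multiplications and additions per sample. That would be consistent with the observation that caching inside the functions made no measurable difference, although the sketch says nothing about where PDFBox itself spends its time.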