[ https://issues.apache.org/jira/browse/PDFBOX-4869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17129489#comment-17129489 ]
Alfred commented on PDFBOX-4869:
--------------------------------

The null checks are no longer needed because of "new BufferedInputStream".
The new operator cannot return null; it either throws an exception or it works.

> Reading standard 14 fonts is slow
> ---------------------------------
>
>                 Key: PDFBOX-4869
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4869
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Parsing, Text extraction
>    Affects Versions: 3.0.0 PDFBox
>            Reporter: Alfred
>            Priority: Major
>         Attachments: PDFBOX-4869.patch
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> I am testing text extraction from PDF and profiling the execution.
> I found that the second biggest time consumer is the static code in
> Standard14Fonts that loads fonts from the PDFBox jar.
> The culprit seems to be the direct use of the stream returned by
> getResourceAsStream.
> That would be a ZipInputStream when using PDFBox as a jar.
> Using a buffered stream around it reduces the load time a lot.
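
The buffering described in the issue can be sketched roughly as follows. This is a minimal illustration, not the contents of PDFBOX-4869.patch: the class BufferedResourceExample and its methods buffered, open, and readAll are hypothetical names made up for this sketch. It wraps the stream returned by getResourceAsStream in a BufferedInputStream so that reads hit an in-memory buffer instead of the underlying jar stream. The explicit check on the raw stream is a defensive choice of this sketch only, since getResourceAsStream returns null for a missing resource even though "new BufferedInputStream" itself never returns null.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper class, for illustration only (not part of PDFBox).
final class BufferedResourceExample {

    private BufferedResourceExample() {
    }

    // Wraps a raw stream in a BufferedInputStream.
    // The result of "new" is never null; only the raw stream can be.
    static InputStream buffered(InputStream raw, String name) throws FileNotFoundException {
        if (raw == null) {
            // Assumption of this sketch: fail early with a clear message.
            throw new FileNotFoundException("resource not found: " + name);
        }
        return new BufferedInputStream(raw);
    }

    // Opens a classpath resource relative to the given class, buffered.
    static InputStream open(Class<?> anchor, String name) throws FileNotFoundException {
        return buffered(anchor.getResourceAsStream(name), name);
    }

    // Reads the whole stream into a byte array in 8 KiB chunks.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }
}
```

A caller would typically use it in a try-with-resources statement, for example "try (InputStream in = BufferedResourceExample.open(SomeClass.class, "some-resource")) { ... }", where SomeClass and the resource name are placeholders.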