[ https://issues.apache.org/jira/browse/PDFBOX-4895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17165188#comment-17165188 ]
Alfred commented on PDFBOX-4895:
--------------------------------

I see you added a check for non-numeric characters:

    String numberString = number.startsWith("+") || number.startsWith("-") ? number.substring(1) : number;
    if (!numberString.matches("[0-9]*"))
    {
        throw new IOException("Not a number: " + number);
    }

I think that is already verified at that point. COSNumber.get is called from BaseParser and PDFStreamParser, and both check the characters they feed into COSNumber.

The only problem I see is that Foxit PDF is able to render a much better image out of PDFBOX-3703-966635-p12.pdf.

Here's the PDFBox result:

!PDFBOX-3703-966635-p12.pdf-1.png!

and here's the Foxit result:

!Untitled.png!

> Faster COSNumber
> ----------------
>
>                 Key: PDFBOX-4895
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4895
>             Project: PDFBox
>          Issue Type: Improvement
>    Affects Versions: 2.0.20, 3.0.0 PDFBox
>            Reporter: Alfred
>            Assignee: Tilman Hausherr
>            Priority: Trivial
>              Labels: Optimization
>             Fix For: 2.0.21, 3.0.0 PDFBox
>
>         Attachments: PDFBOX-3703-966635-p12.pdf-1.png, PDFBOX-4895-b.patch,
> PDFBOX-4895.patch, Untitled.png
>
>
> A small improvement can be made to COSNumber when checking whether a number is a float.
> The current version calls indexOf twice, to check for '.' or 'e'.
> We can do that in one scan.
>
> Each indexOf call scans through the entire string when the character is not present.
> We only have to scan until we find one of the chars, and can stop as soon as it is found.
>
> While profiling the code I found that the method gets called a lot, so the
> improvement makes a bit of a difference.
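
For reference, the one-scan idea from the issue description could look roughly like the sketch below. It is only an illustration, not the code from the attached patches: the class and method names (NumberScanSketch, isFloatTwoScans, isFloatOneScan) are made up for the example, and it assumes, as the description says, that a number is treated as a float when it contains '.' or 'e'.

```java
// Hypothetical illustration; not taken from PDFBox or from the attached patches.
final class NumberScanSketch
{
    private NumberScanSketch()
    {
    }

    // Approach described as the current one: up to two full passes,
    // one indexOf call per character being looked for.
    public static boolean isFloatTwoScans(String number)
    {
        return number.indexOf('.') >= 0 || number.indexOf('e') >= 0;
    }

    // Proposed approach: a single pass that stops at the first '.' or 'e'.
    public static boolean isFloatOneScan(String number)
    {
        for (int i = 0; i < number.length(); i++)
        {
            char c = number.charAt(i);
            if (c == '.' || c == 'e')
            {
                return true;
            }
        }
        return false;
    }
}
```

Both methods should return the same answer for any input. The difference shows up mostly for integers, where the two-scan version walks the whole string twice and the one-scan version walks it once.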