[ 
https://issues.apache.org/jira/browse/PDFBOX-4895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17165188#comment-17165188
 ] 

Alfred commented on PDFBOX-4895:
--------------------------------

I see you added a check for non-numeric characters:

    String numberString = number.startsWith("+") || number.startsWith("-")
            ? number.substring(1) : number;
    if (!numberString.matches("[0-9]*"))
    {
        throw new IOException("Not a number: " + number);
    }

I think that is already verified at that point.
COSNumber.get is called from BaseParser and PDFStreamParser, both of which
check the characters they feed into COSNumber.


The only problem I see is that Foxit PDF is able to render a much better image
out of PDFBOX-3703-966635-p12.pdf.

Here's the PDFBox result:

!PDFBOX-3703-966635-p12.pdf-1.png!

And here's the Foxit result:

!Untitled.png!


> Faster COSNumber
> ----------------
>
>                 Key: PDFBOX-4895
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4895
>             Project: PDFBox
>          Issue Type: Improvement
>    Affects Versions: 2.0.20, 3.0.0 PDFBox
>            Reporter: Alfred
>            Assignee: Tilman Hausherr
>            Priority: Trivial
>              Labels: Optimization
>             Fix For: 2.0.21, 3.0.0 PDFBox
>
>         Attachments: PDFBOX-3703-966635-p12.pdf-1.png, PDFBOX-4895-b.patch, 
> PDFBOX-4895.patch, Untitled.png
>
>
> A small improvement can be made to COSNumber when checking whether a number
> is a float.
> The current version calls indexOf twice, to check for '.' or 'e'.
> Each call scans the string on its own, all the way to the end when the
> character is absent.
> We can do that in one scan: we only have to scan until we find one of the
> chars, and stop as soon as it is found.
>
> I found while profiling the code that the method gets called a lot, so the
> improvement makes a bit of a difference.
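
For illustration, the single-scan idea from the description could look
roughly like the sketch below. This is not the code from PDFBOX-4895.patch
or from COSNumber itself; the class and method names are hypothetical and
the two-scan baseline is only assumed from the wording of the description.

```java
// Hypothetical sketch, not the actual PDFBox code or patch.
public final class FloatCheckSketch {
    private FloatCheckSketch() {
    }

    // Assumed baseline: two separate indexOf calls, each scanning the
    // string on its own.
    public static boolean looksLikeFloatTwoScans(String number) {
        return number.indexOf('.') >= 0 || number.indexOf('e') >= 0;
    }

    // Single pass: look at each char once and stop at the first '.' or 'e'.
    public static boolean looksLikeFloatOneScan(String number) {
        for (int i = 0; i < number.length(); i++) {
            char c = number.charAt(i);
            if (c == '.' || c == 'e') {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] samples = {"42", "-7", "3.14", "1e5", ""};
        for (String s : samples) {
            // Both variants should print the same answer for every sample.
            System.out.println("\"" + s + "\": "
                    + looksLikeFloatTwoScans(s) + " / "
                    + looksLikeFloatOneScan(s));
        }
    }
}
```

For integer-like input, which has neither character, the two-scan version
walks the whole string twice while the single pass walks it once.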



