[ 
https://issues.apache.org/jira/browse/PDFBOX-4883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17146964#comment-17146964
 ] 

Andreas Lehmkühler commented on PDFBOX-4883:
--------------------------------------------

[~Faltiska] your last patch looks good, and I've committed it with some slight 
modifications. The key aspect is to save the string representation in the 
constructor that takes the string as a parameter, so that the original 
representation is preserved. Thanks for your input and your persistence :)
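
To illustrate the idea of keeping the original string, here is a minimal sketch. It is not the committed COSFloat code: the class name SketchFloat and the method toPdfString are hypothetical, and the fallback formatting is deliberately simplistic. It only shows that a value parsed from "0.10" can be written back as "0.10" instead of being reformatted from the float.

```java
/** Hypothetical stand-in for illustration only; this is not the PDFBox COSFloat class. */
final class SketchFloat {
    private final float value;
    // Text as it appeared in the source; null when the object was built from a float.
    private final String valueAsString;

    SketchFloat(float value) {
        this.value = value;
        this.valueAsString = null;
    }

    SketchFloat(String text) {
        // Float.parseFloat throws NumberFormatException on malformed input.
        this.value = Float.parseFloat(text);
        // Remember the original text so it can be written back unchanged.
        this.valueAsString = text;
    }

    float floatValue() {
        return value;
    }

    String toPdfString() {
        // Prefer the preserved text. The fallback is an assumption of this sketch:
        // Float.toString can produce exponent notation for very large or small
        // values, which a real PDF writer would have to handle differently.
        return valueAsString != null ? valueAsString : Float.toString(value);
    }
}
```

With this sketch, new SketchFloat("0.10").toPdfString() returns the text it was given, while new SketchFloat(0.5f).toPdfString() falls back to formatting the float.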

> COSFloat is extremely slow
> --------------------------
>
>                 Key: PDFBOX-4883
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4883
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 2.0.20, 3.0.0 PDFBox
>            Reporter: Alfred
>            Assignee: Andreas Lehmkühler
>            Priority: Major
>              Labels: display, optimization, parsing, textextraction
>         Attachments: After.png, Before.png, PDFBOX-4883.patch, 
> extreme-values-out.pdf
>
>
> I am testing text extraction from PDF and profiling the execution.
> I found that the biggest time consumer is the COSFloat class.
>  
> All other improvements I suggested so far are small compared to this.
> But this is also the most complex one.
>  
> I have attached the profiler output for the same text extraction, with and 
> without the COSFloat changes.
> Extracting the same text took 4 times longer with the original COSFloat, 
> because of its use of BigDecimal.
> I will try to write extra tests for all cases I see in the original COSFloat 
> code, if they are not already tested.
> Then I will submit a new COSFloat version for review.
>  
> I think this affects parsing and displaying PDFs too, not just text 
> extraction.



