[
https://issues.apache.org/jira/browse/PDFBOX-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133221#comment-17133221
]
Andreas Lehmkühler edited comment on PDFBOX-4877 at 6/11/20, 12:25 PM:
-----------------------------------------------------------------------
{quote}Oh, and another question: why do we do this?
if (!Float.isFinite(c[0]) || !Float.isFinite(c[1]) || !Float.isFinite(c[2]) ...
{quote}
It is just a dream that all PDFs are well-formed. We do a lot of checks
and magic repairs to avoid issues while parsing and, of course, to avoid
questions like "Why can't PDFBox parse this PDF when Adobe (or any other
popular PDF tool) can?" See PDFBOX-4778 for further information.
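To illustrate the kind of check being asked about, here is a minimal sketch of a finiteness guard over matrix values. The class name FiniteCheckSketch and the method allFinite are hypothetical and are not taken from the PDFBox sources; only Float.isFinite is the real JDK call from the quoted line.

```java
public class FiniteCheckSketch {

    /**
     * Hypothetical helper: returns true only if every entry is a finite
     * float, i.e. neither NaN nor positive/negative infinity.
     */
    public static boolean allFinite(float[] c) {
        for (float v : c) {
            if (!Float.isFinite(v)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A plain translation matrix in row-major order: all values finite.
        float[] ok = {1f, 0f, 0f, 0f, 1f, 0f, 10f, 20f, 1f};

        // Values a malformed file could lead to: an overflow to infinity
        // (Float.MAX_VALUE * 2f) and a NaN.
        float[] broken = {Float.MAX_VALUE * 2f, 0f, 0f, 0f, Float.NaN, 0f, 0f, 0f, 1f};

        System.out.println(allFinite(ok));     // prints "true"
        System.out.println(allFinite(broken)); // prints "false"
    }
}
```

A guard like this lets the caller decide what to do with a bad result (repair, skip, or report) instead of letting NaN or infinite values propagate into later calculations.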
> Matrix class performance improvements
> -------------------------------------
>
> Key: PDFBOX-4877
> URL: https://issues.apache.org/jira/browse/PDFBOX-4877
> Project: PDFBox
> Issue Type: Improvement
> Components: Parsing, Text extraction
> Affects Versions: 2.0.20, 3.0.0 PDFBox
> Reporter: Alfred
> Assignee: Andreas Lehmkühler
> Priority: Major
> Labels: Optimization
> Attachments: PDFBOX-4877.patch
>
> Original Estimate: 1m
> Remaining Estimate: 1m
>
> I am testing text extraction from PDF and profiling the execution.
> I found that the third-largest time consumer is matrix multiplication.
> The Matrix class spends large amounts of time copying results to new
> instances.
> Also, the if statements slow down execution, since conditional branches
> can hurt performance on modern CPUs.
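The copying cost described above can be illustrated with a small sketch that writes the product into a caller-supplied array instead of allocating a new instance per multiplication. This is not the attached PDFBOX-4877.patch, whose contents are not shown here; the class MatrixMultiplySketch, the method multiplyInto, and the flat row-major 3x3 layout are assumptions made only for this example.

```java
import java.util.Arrays;

public class MatrixMultiplySketch {

    /**
     * Hypothetical helper: multiplies two 3x3 matrices stored as flat
     * row-major arrays of length 9 and writes the product into result,
     * so no new array is allocated per call.
     * result must be a different array from a and b, because entries of
     * a and b are still read after entries of result have been written.
     */
    public static void multiplyInto(float[] a, float[] b, float[] result) {
        if (result == a || result == b) {
            throw new IllegalArgumentException("result must not alias an input");
        }
        for (int row = 0; row < 3; row++) {
            for (int col = 0; col < 3; col++) {
                float sum = 0f;
                for (int k = 0; k < 3; k++) {
                    sum += a[row * 3 + k] * b[k * 3 + col];
                }
                result[row * 3 + col] = sum;
            }
        }
    }

    public static void main(String[] args) {
        // Translation by (10, 20) and scaling by (2, 3), row-major.
        float[] translate = {1f, 0f, 0f, 0f, 1f, 0f, 10f, 20f, 1f};
        float[] scale = {2f, 0f, 0f, 0f, 3f, 0f, 0f, 0f, 1f};

        // One scratch buffer can be reused across many multiplications.
        float[] scratch = new float[9];
        multiplyInto(translate, scale, scratch);

        // prints "[2.0, 0.0, 0.0, 0.0, 3.0, 0.0, 20.0, 60.0, 1.0]"
        System.out.println(Arrays.toString(scratch));
    }
}
```

Whether this actually helps in a given workload would need to be confirmed by profiling, since the JVM can often optimize short-lived allocations.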