[
https://issues.apache.org/jira/browse/PDFBOX-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14624833#comment-14624833
]
Andreas Lehmkühler commented on PDFBOX-2860:
--------------------------------------------
The non-sequential parser of PDFBox needs random access to the PDF, so an
input stream is first copied to a file (in 1.8.9) before it is parsed. I guess
that's one of the reasons, if not the reason, for the performance difference.
BTW: in 2.0.0 the user can decide whether the stream is copied to memory or to
a file (scratchfile = true).
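
To illustrate the copying step described above, here is a minimal sketch that
uses only the JDK. It is not PDFBox code: the class name StreamBuffering and
the helpers copyToTempFile and copyToMemory are hypothetical names made up for
this example. It only shows why a plain InputStream has to be buffered
somewhere (a file or memory) before a parser can seek in it, for instance to
read the trailer at the end of a PDF.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration, not part of PDFBox.
public class StreamBuffering {

    // Copy the whole stream into a temporary file so it can be opened
    // with RandomAccessFile and read at arbitrary offsets.
    static Path copyToTempFile(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("buffered-", ".pdf");
        Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp;
    }

    // Copy the whole stream into a byte array; random access is then
    // plain array indexing, at the cost of heap space.
    static byte[] copyToMemory(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in bytes; a real PDF would come from a FileInputStream.
        byte[] data = "%PDF-1.4 ... startxref 1234 %%EOF"
                .getBytes(StandardCharsets.US_ASCII);

        // File-backed variant: seek to the tail of the copied data.
        Path tmp = copyToTempFile(new ByteArrayInputStream(data));
        try (RandomAccessFile raf = new RandomAccessFile(tmp.toFile(), "r")) {
            byte[] tail = new byte[5];
            raf.seek(raf.length() - tail.length);
            raf.readFully(tail);
            System.out.println(new String(tail, StandardCharsets.US_ASCII));
        } finally {
            Files.delete(tmp);
        }

        // Memory-backed variant: read the same tail by offset.
        byte[] mem = copyToMemory(new ByteArrayInputStream(data));
        System.out.println(
                new String(mem, mem.length - 5, 5, StandardCharsets.US_ASCII));
    }
}
```

In both variants the full stream is read once before any parsing can start,
and the file-backed variant also pays for writing and re-reading a temp file.
That extra pass is the kind of overhead the comment above suspects.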
> NonSeq parser slower than Seq parser
> ------------------------------------
>
> Key: PDFBOX-2860
> URL: https://issues.apache.org/jira/browse/PDFBOX-2860
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 2.0.0
> Reporter: simon steiner
>
> PDF from PDFBOX-797
> for (int i = 0; i < 1000; i++) {
>     PDDocument.load(new FileInputStream("4218.pdf")).close();
> }
> Nonseq:
> real 0m23.691s
> Seq:
> real 0m9.705s