[ https://issues.apache.org/jira/browse/PDFBOX-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469037#comment-16469037 ]
Tilman Hausherr commented on PDFBOX-4215:
-----------------------------------------
No, that's not how it works. The pages can be anywhere in the file, and the
elements don't have to be in any particular order. PDF isn't like HTML / XML,
where everything is in sequence.
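
To illustrate why a forward-only stream is a poor fit: a classic PDF file ends
with a "startxref" keyword followed by the byte offset of the cross-reference
data, so a reader normally has to look at the end of the file before it can
locate any object, pages included. The following is only a minimal sketch of
that first step, not how PDFBox itself parses. The class and method names are
made up for the example, and it ignores incremental updates, cross-reference
streams and damaged files.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of PDFBox.
public class StartXrefFinder {

    /**
     * Scans the last bytes of a PDF file for the final "startxref" keyword
     * and returns the byte offset written after it, or -1 if none is found.
     */
    public static long findStartXref(byte[] tail) {
        // ISO-8859-1 maps every byte to exactly one char, so indexes stay aligned.
        String text = new String(tail, StandardCharsets.ISO_8859_1);
        String keyword = "startxref";
        int idx = text.lastIndexOf(keyword);
        if (idx < 0) {
            return -1;
        }
        int i = idx + keyword.length();
        // Skip the whitespace / end-of-line bytes between keyword and number.
        while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
            i++;
        }
        int start = i;
        // Collect the ASCII digits of the offset.
        while (i < text.length() && text.charAt(i) >= '0' && text.charAt(i) <= '9') {
            i++;
        }
        if (start == i) {
            return -1; // keyword present but no number after it
        }
        return Long.parseLong(text.substring(start, i));
    }

    public static void main(String[] args) {
        // Made-up file tail; in a real file the offset points at the xref data.
        byte[] tail = "trailer\n<< /Size 6 /Root 1 0 R >>\nstartxref\n1234\n%%EOF\n"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(findStartXref(tail)); // prints 1234
    }
}
```

With only a sequential stream, this offset arrives last, after all the objects
it is needed to find.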
> Get pages from a HTTP stream of a large pdf file
> ------------------------------------------------
>
> Key: PDFBOX-4215
> URL: https://issues.apache.org/jira/browse/PDFBOX-4215
> Project: PDFBox
> Issue Type: Wish
> Components: Parsing
> Affects Versions: 2.0.9
> Reporter: Alexandre
> Priority: Minor
>
> Hi Apache contributors,
> Suppose I have a very large PDF file and I want to split it into smaller
> files (e.g. one file per page). I cannot load the entire file into memory,
> and I cannot use the computer's hard disk as described in the documentation
> for large files... :D. But I still have the file as a stream, which I can
> read line by line.
> I have read that it is not feasible to get the pages of a PDF in order
> (because of the PDF specification), but is it feasible to load arbitrary
> pages in PDFBox by reading line by line and looking for page breaks?
> Hagd, A.