[
https://issues.apache.org/jira/browse/PDFBOX-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17574627#comment-17574627
]
Michael Klink commented on PDFBOX-5483:
---------------------------------------
Indeed, I didn't necessarily mean keeping the original method signature but at
least keeping it simple.
E.g. one can introduce an enumeration {{PdfCaching}} with values {{inMemory}},
{{inFile}}, and {{inMemoryMappedFile}}. Then one could change
{code:java}
public static PDDocument loadPDF(InputStream input) throws IOException
{code}
to
{code:java}
public static PDDocument loadPDF(InputStream input, PdfCaching pdfCaching)
throws IOException
{code}
IMO it is more friendly and less frustrating to have to write
{code:java}
PDDocument pdDocument = Loader.loadPDF(inputStream, PdfCaching.inMemory);
{code}
than
{code:java}
PDDocument pdDocument =
Loader.loadPDF(RandomAccessReadBuffer.createBufferFromStream(inputStream));
{code}
in particular as IDEs often support enumeration value proposals there.
To keep things in one place, the actual code for creating the
{{RandomAccessRead}} for an {{InputStream}} may be a method of the enumeration.
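A minimal sketch of that idea follows, using only JDK classes so that it compiles on its own. The interface {{RandomAccessSource}} is a hypothetical stand-in for {{RandomAccessRead}}, and the method name {{open}} and the helper methods are assumptions made for illustration, not PDFBox API. In PDFBox itself each constant would presumably delegate to the matching {{RandomAccessRead}} implementation, e.g. {{RandomAccessReadBuffer.createBufferFromStream(input)}} for {{inMemory}}.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

/** Hypothetical stand-in for RandomAccessRead (not the PDFBox interface). */
interface RandomAccessSource extends Closeable {
    long length() throws IOException;

    /** Returns the unsigned byte at the given position, or -1 outside the data. */
    int readAt(long position) throws IOException;
}

/** Each constant knows how to turn an InputStream into a random access source. */
enum PdfCaching {
    inMemory {
        @Override
        RandomAccessSource open(InputStream input) throws IOException {
            // Copy the whole stream onto the heap.
            return fromBuffer(ByteBuffer.wrap(input.readAllBytes()));
        }
    },
    inFile {
        @Override
        RandomAccessSource open(InputStream input) throws IOException {
            // Copy the stream to a temporary file and read from it on demand.
            FileChannel channel =
                    FileChannel.open(copyToTempFile(input), StandardOpenOption.READ);
            return new RandomAccessSource() {
                @Override
                public long length() throws IOException {
                    return channel.size();
                }

                @Override
                public int readAt(long position) throws IOException {
                    if (position < 0) {
                        return -1;
                    }
                    ByteBuffer one = ByteBuffer.allocate(1);
                    return channel.read(one, position) < 1 ? -1 : one.get(0) & 0xFF;
                }

                @Override
                public void close() throws IOException {
                    channel.close();
                }
            };
        }
    },
    inMemoryMappedFile {
        @Override
        RandomAccessSource open(InputStream input) throws IOException {
            // Copy the stream to a temporary file and map it into memory.
            // The mapping stays valid after the channel has been closed.
            try (FileChannel channel =
                    FileChannel.open(copyToTempFile(input), StandardOpenOption.READ)) {
                return fromBuffer(
                        channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size()));
            }
        }
    };

    abstract RandomAccessSource open(InputStream input) throws IOException;

    /** Hypothetical helper: spools the stream into a temporary file. */
    static Path copyToTempFile(InputStream input) throws IOException {
        Path file = Files.createTempFile("pdf-cache", ".tmp");
        file.toFile().deleteOnExit();
        Files.copy(input, file, StandardCopyOption.REPLACE_EXISTING);
        return file;
    }

    /** Hypothetical helper: wraps a heap or mapped buffer. */
    static RandomAccessSource fromBuffer(ByteBuffer buffer) {
        return new RandomAccessSource() {
            @Override
            public long length() {
                return buffer.limit();
            }

            @Override
            public int readAt(long position) {
                return position < 0 || position >= buffer.limit()
                        ? -1
                        : buffer.get((int) position) & 0xFF;
            }

            @Override
            public void close() {
                // Nothing to release for a buffer-backed source.
            }
        };
    }
}
```

With such an enumeration, a two-argument {{loadPDF}} would only need to call {{pdfCaching.open(input)}} and pass the result on to the existing {{RandomAccessRead}} based variant, so the caching decision stays explicit at the call site without the caller having to know the wrapper classes.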
> Replace methods using an InputStream from Loader.loadPDF
> --------------------------------------------------------
>
> Key: PDFBOX-5483
> URL: https://issues.apache.org/jira/browse/PDFBOX-5483
> Project: PDFBox
> Issue Type: Improvement
> Components: Parsing
> Affects Versions: 3.0.0 PDFBox
> Reporter: Andreas Lehmkühler
> Assignee: Andreas Lehmkühler
> Priority: Major
> Fix For: 3.0.0 PDFBox
>
>
> As discussed on dev@pdfbox
> {quote}
> We have to remove the loadPDF variants using InputStream and replace them
> with RandomAccessRead.
> When it comes to InputStreams, users have to decide how to proceed:
> * copy the InputStream to memory by using RandomAccessReadBuffer
> * copy the InputStream to a file and use RandomAccessReadBufferedFile or
> RandomAccessReadMemoryMappedFile
> This would make it more transparent what happens under the hood when using
> the different kinds of loadPDF methods:
> * a byte array as source is already in memory and the obvious choice is to
> use RandomAccessReadBuffer as a wrapper
> * a file as source targets a local file and the most obvious choice is to use
> RandomAccessReadBufferedFile as a wrapper. We should document that
> RandomAccessReadMemoryMappedFile is offered as an alternative in this case
> * RandomAccessRead as source is the most obvious one and the user decides how
> to create it. Additionally, it is possible to implement a custom caching
> and/or loading mechanism
> {quote}
> see PDFBOX-5462 and [High memory usage with pdfbox
> 3|https://lists.apache.org/thread/6mmgp23v8b2yztj4hghkgkd14s1gzs8g] as well
--
This message was sent by Atlassian Jira
(v8.20.10#820010)