This is an automated email from the ASF dual-hosted git repository. lehmi pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/pdfbox-docs.git
The following commit(s) were added to refs/heads/asf-site by this push:
     new bee43e1a Site checkin for project Apache PDFBox Website
bee43e1a is described below

commit bee43e1a0f5517d2a71130a84b24246ebb9b98f8
Author: Andreas Lehmkühler <andr...@lehmi.de>
AuthorDate: Thu May 26 21:47:37 2022 +0200

    Site checkin for project Apache PDFBox Website
---
 content/1.8/architecture.html | 16 +++++++--------
 content/1.8/commandline.html  | 16 +++++++--------
 content/1.8/dependencies.html | 16 +++++++--------
 content/1.8/faq.html          | 16 +++++++--------
 content/3.0/migration.html    | 45 ++++++++++++++++++++++++++++++++++++-------
 5 files changed, 70 insertions(+), 39 deletions(-)

diff --git a/content/1.8/architecture.html b/content/1.8/architecture.html
index 6dcdda8d..00ffd184 100644
--- a/content/1.8/architecture.html
+++ b/content/1.8/architecture.html
@@ -104,22 +104,22 @@
  <a href="/1.8/cookbook/encryption.html" >
  Encrypting a File
  </a>
- </li><li>
- <a href="/1.8/cookbook/pdfavalidation.html" >
- PDF/A Validation
- </a>
  </li><li>
  <a href="/1.8/cookbook/fill-form-field.html" >
  Fill a Form Field
  </a>
- </li><li>
- <a href="/1.8/cookbook/rendering.html" >
- Document Rendering
- </a>
  </li><li>
  <a href="/1.8/cookbook/pdfacreation.html" >
  PDF/A Creation
  </a>
+ </li><li>
+ <a href="/1.8/cookbook/pdfavalidation.html" >
+ PDF/A Validation
+ </a>
+ </li><li>
+ <a href="/1.8/cookbook/rendering.html" >
+ Document Rendering
+ </a>
  </li><li>
  <a href="/1.8/cookbook/textextraction.html" >
  Text Extraction
diff --git a/content/1.8/commandline.html b/content/1.8/commandline.html
index c939586f..ce7bc53d 100644
--- a/content/1.8/commandline.html
+++ b/content/1.8/commandline.html
@@ -104,22 +104,22 @@
  <a href="/1.8/cookbook/encryption.html" >
  Encrypting a File
  </a>
- </li><li>
- <a href="/1.8/cookbook/pdfavalidation.html" >
- PDF/A Validation
- </a>
  </li><li>
  <a href="/1.8/cookbook/fill-form-field.html" >
  Fill a Form Field
  </a>
- </li><li>
- <a href="/1.8/cookbook/rendering.html" >
- Document Rendering
- </a>
  </li><li>
  <a href="/1.8/cookbook/pdfacreation.html" >
  PDF/A Creation
  </a>
+ </li><li>
+ <a href="/1.8/cookbook/pdfavalidation.html" >
+ PDF/A Validation
+ </a>
+ </li><li>
+ <a href="/1.8/cookbook/rendering.html" >
+ Document Rendering
+ </a>
  </li><li>
  <a href="/1.8/cookbook/textextraction.html" >
  Text Extraction
diff --git a/content/1.8/dependencies.html b/content/1.8/dependencies.html
index 02e370db..aabd9bd3 100644
--- a/content/1.8/dependencies.html
+++ b/content/1.8/dependencies.html
@@ -104,22 +104,22 @@
  <a href="/1.8/cookbook/encryption.html" >
  Encrypting a File
  </a>
- </li><li>
- <a href="/1.8/cookbook/pdfavalidation.html" >
- PDF/A Validation
- </a>
  </li><li>
  <a href="/1.8/cookbook/fill-form-field.html" >
  Fill a Form Field
  </a>
- </li><li>
- <a href="/1.8/cookbook/rendering.html" >
- Document Rendering
- </a>
  </li><li>
  <a href="/1.8/cookbook/pdfacreation.html" >
  PDF/A Creation
  </a>
+ </li><li>
+ <a href="/1.8/cookbook/pdfavalidation.html" >
+ PDF/A Validation
+ </a>
+ </li><li>
+ <a href="/1.8/cookbook/rendering.html" >
+ Document Rendering
+ </a>
  </li><li>
  <a href="/1.8/cookbook/textextraction.html" >
  Text Extraction
diff --git a/content/1.8/faq.html b/content/1.8/faq.html
index e9ee968c..467665ee 100644
--- a/content/1.8/faq.html
+++ b/content/1.8/faq.html
@@ -104,22 +104,22 @@
  <a href="/1.8/cookbook/encryption.html" >
  Encrypting a File
  </a>
- </li><li>
- <a href="/1.8/cookbook/pdfavalidation.html" >
- PDF/A Validation
- </a>
  </li><li>
  <a href="/1.8/cookbook/fill-form-field.html" >
  Fill a Form Field
  </a>
- </li><li>
- <a href="/1.8/cookbook/rendering.html" >
- Document Rendering
- </a>
  </li><li>
  <a href="/1.8/cookbook/pdfacreation.html" >
  PDF/A Creation
  </a>
+ </li><li>
+ <a href="/1.8/cookbook/pdfavalidation.html" >
+ PDF/A Validation
+ </a>
+ </li><li>
+ <a href="/1.8/cookbook/rendering.html" >
+ Document Rendering
+ </a>
  </li><li>
  <a href="/1.8/cookbook/textextraction.html" >
  Text Extraction
diff --git a/content/3.0/migration.html b/content/3.0/migration.html
index 7bb23709..ac45b818 100644
--- a/content/3.0/migration.html
+++ b/content/3.0/migration.html
@@ -99,17 +99,17 @@ a missing topic, open an issue or help us with a contribution to improve the gui
 <p>This guide describes the updates in Apache PDFBox 3.0 release. Use the information provided to upgrade your PDFBox 2.x applications to PDFBox 3.0. It provides information about the new, deprecated and unsupported features in this release.</p>
 <h2 id="java-versions" tabindex="-1">Java Versions</h2>
-<p>PDFBox 3.0 requires at least Java 8. Testing has been done up to Java 11.</p>
+<p>PDFBox 3.0 requires at least Java 8. Testing has been done up to Java 19.</p>
 <h2 id="dependency-updates" tabindex="-1">Dependency Updates</h2>
 <p>All libraries on which PDFBox depends are updated to their latest stable versions:</p>
 <ul>
-<li>Bouncy Castle 1.69</li>
+<li>Bouncy Castle 1.70</li>
 <li>Apache Commons Logging 1.2</li>
-<li>picocli 4.6.1</li>
+<li>picocli 4.6.3</li>
 </ul>
 <p>For test support the libraries are updated to</p>
 <ul>
-<li>JUnit 5.8</li>
+<li>JUnit 5.8.2</li>
 <li>JAI Image Core 1.4.0</li>
 <li>JAI JPEG2000 1.4.0</li>
 <li>JBIG ImageIO Plugin 3.0.4</li>
@@ -128,13 +128,38 @@ as they are treated to be of <strong>internal use only</strong>.</p>
 <artifactId>io</artifactId>
 </dependency>
 </code></pre>
+<p>The whole code was overhauled including the following changes:</p>
+<ul>
+<li>switch to java.nio</li>
+<li>add support for memory mapped files for reading</li>
+<li>use the origin source when creating a new reader to process parts of it</li>
+<li>read operations no longer use scratch files</li>
+</ul>
 <h3 id="use-loader-to-get-a-pdf-document" tabindex="-1">Use <strong>Loader</strong> to get a PDF document</h3>
-<p>For loading a PDF <code>PDDocument.load</code> has been replaced with the <code>Loader</code> methods. The same is true for loading a FDF document.</p>
-<p>When saving a PDF this will now be done in compressed mode per default. To override that use <code>PDDocument.save</code> with <code>CompressParameters.NO_COMPRESSION</code>.</p>
+<p>The new class <em><strong>org.apache.pdfbox.Loader</strong></em> is used for loading a PDF. It offers several methods to load a pdf using different kind of sources. All load methods have been removed from <em><strong>org.apache.pdfbox.pdmodel.PDDocument</strong></em>. The same is true for loading a FDF document.</p>
+<p>Sample usage:</p>
+<pre><code> try (PDDocument document = Loader.loadPDF(new File("yourfile.pdf")))
+ {
+     for (PDPage page : document.getPages())
+     {
+         ....
+     }
+ }
+</code></pre>
+<h3 id="changes-when-saving-pdf" tabindex="-1">Changes when saving PDF</h3>
+<h4 id="compressed-saving-by-default" tabindex="-1">Compressed saving by default</h4>
+<p>When saving a PDF this will now be done in compressed mode by default. To override that use <code>PDDocument.save</code> with <code>CompressParameters.NO_COMPRESSION</code>.</p>
+<h4 id="don't-use-the-source-as-output" tabindex="-1">Don't use the source as output</h4>
+<p>The input file must not be used as output for saving operations. It will corrupt the file and throw an exception as parts of the file are read the first time when saving it.</p>
+<h3 id="reduced-memory-usage" tabindex="-1">Reduced memory usage</h3>
+<h4 id="incremental-parsing" tabindex="-1">Incremental Parsing</h4>
 <p>PDFBox now loads a PDF Document incrementally reducing the initial memory footprint. This will also reduce the memory needed to consume a PDF if only certain parts of the PDF are accessed. Note that, due to the nature of PDF, uses such as iterating over all pages, accessing annotations, signing a PDF etc. might still load all parts of the PDF overtime leading to a similar memory consumption as with PDFBox 2.0.</p>
-<p>The input file must not be used as output for saving operations. It will corrupt the file and throw an exception as parts of the file are read the first time when saving it.</p>
+<h4 id="improved-io-operations" tabindex="-1">Improved IO operations</h4>
+<p>The introduction of the new io classes has a positive impact on the memory usage. Especially the re-usage of the source for reading parts of it instead of using intermediate streams reduces the memory footprint significantly.</p>
+<h4 id="further-optimizations" tabindex="-1">Further optimizations</h4>
+<p>There were a lot of changes and optimizations which have a more or less huge impact on the memory consumption.</p>
 <h3 id="static-instances-for-standard-14-fonts-removed" tabindex="-1">Static instances for Standard 14 fonts removed</h3>
 <p>The static instances of <code>PDType1Font</code> for the standard 14 fonts were removed as the underlying <code>COSDictionary</code> isn't supposed to be immutable which led to several issues.</p>
 <p>A new constructor for <code>PDType1Font</code> was introduced to create a standard 14 font. The new Enum <code>Standard14Fonts.FontName</code> is the one and only parameter and defines the
@@ -190,6 +215,12 @@ of Adobe Reader. If you'd like to bypass this use <code>PDDocumentCatalog.getAcr
  <li><a href="#use-loader-to-get-a-pdf-document">Use Loader to get a PDF document</a>
  </li>
 
+ <li><a href="#changes-when-saving-pdf">Changes when saving PDF</a>
+ </li>
+
+ <li><a href="#reduced-memory-usage">Reduced memory usage</a>
+ </li>
+
  <li><a href="#static-instances-for-standard-14-fonts-removed">Static instances for Standard 14 fonts removed</a>
  </li>
  </ol>
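
Note on the "Don't use the source as output" section added in this commit: one possible way to follow that rule while still ending up with the result under the original file name is to write to a temporary file next to the source and replace the source afterwards. The following is only a sketch using java.nio.file; the class SafeOverwrite, the Transform interface and the method rewriteInPlace are hypothetical names for illustration and are not part of PDFBox. In a real application the callback would wrap the Loader.loadPDF and PDDocument.save calls mentioned in the guide, and the sketch assumes the document has been closed again before the callback returns.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Locale;

// Hypothetical helper, not part of PDFBox: never writes to the file being read.
final class SafeOverwrite {

    // Stand-in for "load from source, save to target" (assumed to close the source itself).
    @FunctionalInterface
    interface Transform {
        void apply(Path source, Path target) throws IOException;
    }

    // Runs the transform against a temporary sibling file, then replaces the source.
    static void rewriteInPlace(Path source, Transform transform) throws IOException {
        Path dir = source.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(dir, "rewrite-", ".tmp");
        try {
            transform.apply(source, temp);
            // Only reached if the transform succeeded, so a failure leaves the source untouched.
            Files.move(temp, source, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // No-op after a successful move; cleans up the temp file after a failure.
            Files.deleteIfExists(temp);
        }
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempFile("sample-", ".pdf");
        Files.write(source, "original".getBytes(StandardCharsets.UTF_8));

        // Placeholder transform: upper-cases the content instead of saving a PDF.
        rewriteInPlace(source, (in, out) -> {
            String text = new String(Files.readAllBytes(in), StandardCharsets.UTF_8);
            Files.write(out, text.toUpperCase(Locale.ROOT).getBytes(StandardCharsets.UTF_8));
        });

        System.out.println(new String(Files.readAllBytes(source), StandardCharsets.UTF_8));
        Files.delete(source);
    }
}
```

Creating the temporary file in the same directory as the source keeps the final move on one file system, which is usually cheaper than a copy across volumes.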