This is an automated email from the ASF dual-hosted git repository.
lehmi pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/pdfbox-docs.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 83c70f79 Site checkin for project Apache PDFBox Website
83c70f79 is described below
commit 83c70f794f1ea1b836c229c1966f7cb0c6537230
Author: Andreas Lehmkühler <[email protected]>
AuthorDate: Sun Dec 4 14:02:34 2022 +0100
Site checkin for project Apache PDFBox Website
---
content/1.8/architecture.html | 20 ++++++++++----------
content/1.8/commandline.html | 20 ++++++++++----------
content/1.8/dependencies.html | 20 ++++++++++----------
content/1.8/faq.html | 20 ++++++++++----------
content/3.0/migration.html | 33 +++++++++++++++++++++++++++++++--
5 files changed, 71 insertions(+), 42 deletions(-)
diff --git a/content/1.8/architecture.html b/content/1.8/architecture.html
index 8c4b9a29..6f77df64 100644
--- a/content/1.8/architecture.html
+++ b/content/1.8/architecture.html
@@ -97,10 +97,6 @@
<i></i>
<label>Cookbook</label>
<ul><li>
- <a href="/1.8/cookbook/pdfacreation.html" >
- PDF/A Creation
- </a>
- </li><li>
<a href="/1.8/cookbook/documentcreation.html" >
Document Creation
</a>
@@ -109,17 +105,17 @@
Encrypting a File
</a>
</li><li>
- <a href="/1.8/cookbook/pdfavalidation.html" >
- PDF/A Validation
- </a>
- </li><li>
- <a href="/1.8/cookbook/textextraction.html" >
- Text Extraction
+ <a href="/1.8/cookbook/pdfacreation.html" >
+ PDF/A Creation
</a>
</li><li>
<a href="/1.8/cookbook/fill-form-field.html" >
Fill a Form Field
</a>
+ </li><li>
+ <a href="/1.8/cookbook/pdfavalidation.html" >
+ PDF/A Validation
+ </a>
</li><li>
<a href="/1.8/cookbook/rendering.html" >
Document Rendering
@@ -132,6 +128,10 @@
<a href="/1.8/cookbook/workingwithfonts.html" >
Working with Fonts
</a>
+ </li><li>
+ <a href="/1.8/cookbook/textextraction.html" >
+ Text Extraction
+ </a>
</li><li>
<a href="/1.8/cookbook/workingwithmetadata.html" >
Working with Metadata
diff --git a/content/1.8/commandline.html b/content/1.8/commandline.html
index 348a3453..28db26de 100644
--- a/content/1.8/commandline.html
+++ b/content/1.8/commandline.html
@@ -97,10 +97,6 @@
<i></i>
<label>Cookbook</label>
<ul><li>
- <a href="/1.8/cookbook/pdfacreation.html" >
- PDF/A Creation
- </a>
- </li><li>
<a href="/1.8/cookbook/documentcreation.html" >
Document Creation
</a>
@@ -109,17 +105,17 @@
Encrypting a File
</a>
</li><li>
- <a href="/1.8/cookbook/pdfavalidation.html" >
- PDF/A Validation
- </a>
- </li><li>
- <a href="/1.8/cookbook/textextraction.html" >
- Text Extraction
+ <a href="/1.8/cookbook/pdfacreation.html" >
+ PDF/A Creation
</a>
</li><li>
<a href="/1.8/cookbook/fill-form-field.html" >
Fill a Form Field
</a>
+ </li><li>
+ <a href="/1.8/cookbook/pdfavalidation.html" >
+ PDF/A Validation
+ </a>
</li><li>
<a href="/1.8/cookbook/rendering.html" >
Document Rendering
@@ -132,6 +128,10 @@
<a href="/1.8/cookbook/workingwithfonts.html" >
Working with Fonts
</a>
+ </li><li>
+ <a href="/1.8/cookbook/textextraction.html" >
+ Text Extraction
+ </a>
</li><li>
<a href="/1.8/cookbook/workingwithmetadata.html" >
Working with Metadata
diff --git a/content/1.8/dependencies.html b/content/1.8/dependencies.html
index 6fe56fdc..302f3248 100644
--- a/content/1.8/dependencies.html
+++ b/content/1.8/dependencies.html
@@ -97,10 +97,6 @@
<i></i>
<label>Cookbook</label>
<ul><li>
- <a href="/1.8/cookbook/pdfacreation.html" >
- PDF/A Creation
- </a>
- </li><li>
<a href="/1.8/cookbook/documentcreation.html" >
Document Creation
</a>
@@ -109,17 +105,17 @@
Encrypting a File
</a>
</li><li>
- <a href="/1.8/cookbook/pdfavalidation.html" >
- PDF/A Validation
- </a>
- </li><li>
- <a href="/1.8/cookbook/textextraction.html" >
- Text Extraction
+ <a href="/1.8/cookbook/pdfacreation.html" >
+ PDF/A Creation
</a>
</li><li>
<a href="/1.8/cookbook/fill-form-field.html" >
Fill a Form Field
</a>
+ </li><li>
+ <a href="/1.8/cookbook/pdfavalidation.html" >
+ PDF/A Validation
+ </a>
</li><li>
<a href="/1.8/cookbook/rendering.html" >
Document Rendering
@@ -132,6 +128,10 @@
<a href="/1.8/cookbook/workingwithfonts.html" >
Working with Fonts
</a>
+ </li><li>
+ <a href="/1.8/cookbook/textextraction.html" >
+ Text Extraction
+ </a>
</li><li>
<a href="/1.8/cookbook/workingwithmetadata.html" >
Working with Metadata
diff --git a/content/1.8/faq.html b/content/1.8/faq.html
index 20bb35ad..3cc91030 100644
--- a/content/1.8/faq.html
+++ b/content/1.8/faq.html
@@ -97,10 +97,6 @@
<i></i>
<label>Cookbook</label>
<ul><li>
- <a href="/1.8/cookbook/pdfacreation.html" >
- PDF/A Creation
- </a>
- </li><li>
<a href="/1.8/cookbook/documentcreation.html" >
Document Creation
</a>
@@ -109,17 +105,17 @@
Encrypting a File
</a>
</li><li>
- <a href="/1.8/cookbook/pdfavalidation.html" >
- PDF/A Validation
- </a>
- </li><li>
- <a href="/1.8/cookbook/textextraction.html" >
- Text Extraction
+ <a href="/1.8/cookbook/pdfacreation.html" >
+ PDF/A Creation
</a>
</li><li>
<a href="/1.8/cookbook/fill-form-field.html" >
Fill a Form Field
</a>
+ </li><li>
+ <a href="/1.8/cookbook/pdfavalidation.html" >
+ PDF/A Validation
+ </a>
</li><li>
<a href="/1.8/cookbook/rendering.html" >
Document Rendering
@@ -132,6 +128,10 @@
<a href="/1.8/cookbook/workingwithfonts.html" >
Working with Fonts
</a>
+ </li><li>
+ <a href="/1.8/cookbook/textextraction.html" >
+ Text Extraction
+ </a>
</li><li>
<a href="/1.8/cookbook/workingwithmetadata.html" >
Working with Metadata
diff --git a/content/3.0/migration.html b/content/3.0/migration.html
index 82490f8c..54aa9b7a 100644
--- a/content/3.0/migration.html
+++ b/content/3.0/migration.html
@@ -140,11 +140,31 @@ as they are treated to be of <strong>internal use
only</strong>.</p>
<li>add support for memory mapped files for reading</li>
<li>use the origin source when creating a new reader to process parts of
it</li>
<li>read operations no longer use scratch files</li>
+<li>provide an interface to implement a custom class to read a PDF</li>
+<li>provide an interface to implement a custom cache holding streams when
creating/writing a PDF</li>
</ul>
+<p>PDFBox offers the following implementations of the interface
"org.apache.pdfbox.io.RandomAccessRead" to be used as a source to read
a PDF:</p>
+<ul>
+<li><em><strong>org.apache.pdfbox.io.RandomAccessReadBuffer</strong></em></li>
+</ul>
+<p>RandomAccessReadBuffer stores all the data in memory. It is backed by the
given byte array or ByteBuffer. Using the constructor with an InputStream
copies the data to the buffer. Internally the data is stored as a sequence of
ByteBuffer chunks with a default chunk size of 4KB.</p>
+<ul>
+<li><em><strong>org.apache.pdfbox.io.RandomAccessReadBufferedFile</strong></em></li>
+</ul>
+<p>RandomAccessReadBufferedFile is backed by the given file. It has an
in-memory cache using pages with a size of 4KB. The cache follows the FIFO
principle: once the maximum of 1000 pages is reached, the page that was added first is
replaced with new data.</p>
+<ul>
+<li><em><strong>org.apache.pdfbox.io.RandomAccessReadMemoryMappedFile</strong></em></li>
+</ul>
+<p>RandomAccessReadMemoryMappedFile uses the memory mapping feature of Java.
The whole file is mapped to memory and the maximum allowed file size is
<em><strong>Integer.MAX_VALUE</strong></em>.</p>
+<p class="alert alert-warning">There is a <a
href="https://bugs.openjdk.java.net/browse/JDK-4715154">known issue</a> with
files remaining locked after closing the memory-mapped file on Windows. PDFBox implements
its own unmapper as a workaround.</p>
+<ul>
+<li><em><strong>Implementing your own reader</strong></em></li>
+</ul>
+<p>If you need to implement your own reader, it has to implement the
interface <code>org.apache.pdfbox.io.RandomAccessRead</code>. The implementation should be
thread-safe to avoid issues in multithreaded environments.</p>
<h3 id="use-loader-to-get-a-pdf-document" tabindex="-1">Use
<strong>Loader</strong> to get a PDF document</h3>
<p>The new class <em><strong>org.apache.pdfbox.Loader</strong></em> is used
for loading a PDF. It offers several methods to load a PDF using different kinds
of sources. All load methods have been removed from
<em><strong>org.apache.pdfbox.pdmodel.PDDocument</strong></em>. The same is
true for loading an FDF document.</p>
-<p>Sample usage:</p>
-<pre><code> try (PDDocument document = Loader.loadPDF(new
File("yourfile.pdf")))
+<p>The most flexible way is to use an instance of RandomAccessRead, as in the
following sample:</p>
+<pre><code> try (PDDocument document = Loader.loadPDF(new
RandomAccessReadBufferedFile("yourfile.pdf")))
{
for (PDPage page : document.getPages())
{
@@ -152,6 +172,15 @@ as they are treated to be of <strong>internal use
only</strong>.</p>
}
}
</code></pre>
+<p><em><strong>org.apache.pdfbox.Loader</strong></em> provides two other kinds
of load methods for your convenience.</p>
+<ul>
+<li><em><strong>using a byte array as input</strong></em></li>
+</ul>
+<p>If a byte array is provided as the source, PDFBox uses
<code>org.apache.pdfbox.io.RandomAccessReadBuffer</code> to hold the data. The
byte buffer is backed by the given byte array.</p>
+<ul>
+<li><em><strong>using a file as input</strong></em></li>
+</ul>
+<p>If a file is provided as the source, PDFBox uses
<code>org.apache.pdfbox.io.RandomAccessReadBufferedFile</code> to wrap the
source data using the in-memory cache as described above.</p>
<h3 id="changes-when-saving-pdf" tabindex="-1">Changes when saving PDF</h3>
<h4 id="compressed-saving-by-default" tabindex="-1">Compressed saving by
default</h4>
<p>PDFs are now saved in compressed mode by default. To
override that, use <code>PDDocument.save</code> with
<code>CompressParameters.NO_COMPRESSION</code>.</p>
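
The migration.html hunk above describes the cache of RandomAccessReadBufferedFile as FIFO with 4KB pages and a maximum of 1000 pages. The behaviour can be pictured with a small JDK-only sketch. This is an illustration under stated assumptions, not PDFBox's actual implementation: the class FifoPageCache and all of its members are hypothetical names, and only the page size, the page limit and the FIFO rule are taken from the text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical FIFO page cache, for illustration only.
 * It is NOT the class PDFBox uses internally.
 */
class FifoPageCache {
    // Values taken from the migration text: 4KB pages, at most 1000 pages.
    static final int PAGE_SIZE = 4096;
    static final int MAX_PAGES = 1000;

    private final int maxPages;
    private final LinkedHashMap<Long, byte[]> pages;

    FifoPageCache(int maxPages) {
        this.maxPages = maxPages;
        // accessOrder=false keeps insertion order, so the eldest entry is
        // always the page that was added first (FIFO, not LRU).
        this.pages = new LinkedHashMap<Long, byte[]>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
                // Drop the first added page once the limit is exceeded.
                return size() > FifoPageCache.this.maxPages;
            }
        };
    }

    /** Index of the page that contains the given file offset. */
    static long pageIndex(long offset) {
        return offset / PAGE_SIZE;
    }

    boolean contains(long pageIndex) {
        return pages.containsKey(pageIndex);
    }

    /** Returns the cached page or null; reading does not change eviction order. */
    byte[] get(long pageIndex) {
        return pages.get(pageIndex);
    }

    void put(long pageIndex, byte[] data) {
        pages.put(pageIndex, data);
    }

    int size() {
        return pages.size();
    }
}
```

With a limit of two pages, adding pages 0, 1 and 2 in that order evicts page 0 even if it was read most recently, which is what distinguishes FIFO from an LRU policy. A cache sized like the one described would be created with new FifoPageCache(FifoPageCache.MAX_PAGES).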