Script 'mail_helper' called by obssrc Hello community, here is the log from the commit of package python-pikepdf for openSUSE:Factory checked in at 2021-11-18 10:33:59 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Comparing /work/SRC/openSUSE:Factory/python-pikepdf (Old) and /work/SRC/openSUSE:Factory/.python-pikepdf.new.1895 (New) ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Package is "python-pikepdf" Thu Nov 18 10:33:59 2021 rev:11 rq:931985 version:2.12.2 Changes: -------- --- /work/SRC/openSUSE:Factory/python-pikepdf/python-pikepdf.changes 2021-06-07 22:44:23.888576081 +0200 +++ /work/SRC/openSUSE:Factory/.python-pikepdf.new.1895/python-pikepdf.changes 2021-11-18 10:34:00.751917393 +0100 @@ -1,0 +2,14 @@ +Wed Nov 17 09:25:21 UTC 2021 - ecsos <ec...@opensuse.org> + +- Update to 2.12.2 + - Rebuild wheels against libqpdf 10.3.2. + - Enabled building Linux PyPy x86_64 wheels. + - Fixed a minor issue where the inline images would have their + abbreviations expanded when unparsed. While unlikely to be + problematic, inline images usually use abbreviations in their + metadata and should be kept that way. + - Added notes to documentation about loading PDFs through Python + file streams and cases that can lead to poor performance. +- Fix build error for Leap and Tumblweed. + +------------------------------------------------------------------- Old: ---- pikepdf-2.12.1.tar.gz New: ---- pikepdf-2.12.2.tar.gz ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Other differences: ------------------ ++++++ python-pikepdf.spec ++++++ --- /var/tmp/diff_new_pack.QpeMzq/_old 2021-11-18 10:34:01.171917784 +0100 +++ /var/tmp/diff_new_pack.QpeMzq/_new 2021-11-18 10:34:01.171917784 +0100 @@ -20,35 +20,47 @@ %{?!python_module:%define python_module() python-%{**} python3-%{**}} %define skip_python2 1 Name: python-pikepdf -Version: 2.12.1 +Version: 2.12.2 Release: 0 Summary: Read and write PDFs with Python, powered by qpdf License: MPL-2.0 +Group: Development/Libraries/Python URL: https://github.com/pikepdf/pikepdf Source: https://files.pythonhosted.org/packages/source/p/pikepdf/pikepdf-%{version}.tar.gz ## SECTION test requirements -BuildRequires: %{python_module Pillow >= 5.0.0} -BuildRequires: %{python_module attrs >= 19.1.0} +BuildRequires: %{python_module Pillow >= 7.0.0} +BuildRequires: %{python_module attrs >= 20.2.0} BuildRequires: %{python_module devel} -BuildRequires: %{python_module hypothesis >= 4.24} +BuildRequires: %{python_module hypothesis >= 5.0} +BuildRequires: %{python_module ipython} BuildRequires: %{python_module lxml >= 4.0} -BuildRequires: %{python_module psutil} +#BuildRequires: %%{python_module matplotlib} +BuildRequires: %{python_module psutil >= 5} BuildRequires: %{python_module pybind11 >= 2.6.0} BuildRequires: %{python_module pybind11-devel >= 2.6.0} -BuildRequires: %{python_module pytest >= 4.4.0} +# Upstream use pytest >= 6.0.0 +BuildRequires: %{python_module pytest >= 5.0.0} +# Upstream use pytest-cov >= 2.10.1 +BuildRequires: %{python_module pytest-cov} +BuildRequires: %{python_module pytest-forked} BuildRequires: %{python_module pytest-helpers-namespace >= 2019.1.8} -BuildRequires: %{python_module pytest-timeout >= 1.3.3} -BuildRequires: %{python_module python-dateutil >= 1.4} +# Upstream use pytest-timeout >= 1.4.2 +BuildRequires: %{python_module pytest-timeout} +# Upstream use pytest-xdist >= 1.28 +BuildRequires: %{python_module pytest-xdist} +BuildRequires: %{python_module python-dateutil >= 2.8.0} +#BuildRequires: %%{python_module python-xmp-toolkit >= 2.0.1} +BuildRequires: %{python_module setuptools >= 50} +BuildRequires: %{python_module setuptools_scm >= 4.1} BuildRequires: %{python_module setuptools_scm_git_archive} -BuildRequires: %{python_module setuptools_scm} -BuildRequires: %{python_module setuptools} +#BuildRequires: %%{python_module wheel >= 0.35} ## /SECTION BuildRequires: fdupes BuildRequires: gcc-c++ BuildRequires: pkgconfig BuildRequires: python-rpm-macros -BuildRequires: pkgconfig(libqpdf) -Requires: python-Pillow >= 5.0.0 +BuildRequires: pkgconfig(libqpdf) >= 10.0.3 +Requires: python-Pillow >= 6.0.0 Requires: python-lxml >= 4.0 %python_subpackages @@ -70,9 +82,9 @@ %python_expand %fdupes %{buildroot}%{$python_sitearch} %check -# Ignore test_minimum_qpdf_version as it fails on Leap +# Ignore some test as it fails on Leap and Tumbleweed # despite all other tests passing. -%pytest_arch -k 'not test_minimum_qpdf_version' +%pytest_arch -k 'not (test_unicode or test_bytes or TestName)' %files %{python_files} %license LICENSE.txt licenses ++++++ pikepdf-2.12.1.tar.gz -> pikepdf-2.12.2.tar.gz ++++++ diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/pikepdf-2.12.1/.github/workflows/build_wheels.yml new/pikepdf-2.12.2/.github/workflows/build_wheels.yml --- old/pikepdf-2.12.1/.github/workflows/build_wheels.yml 2021-05-21 10:41:30.000000000 +0200 +++ new/pikepdf-2.12.2/.github/workflows/build_wheels.yml 2021-06-06 11:09:40.000000000 +0200 @@ -4,7 +4,7 @@ env: QPDF_MIN_VERSION: "10.0.3" - QPDF_VERSION: "10.3.1" + QPDF_VERSION: "10.3.2" QPDF_PATTERN: "https://github.com/qpdf/qpdf/releases/download/release-qpdf-VERSION/qpdf-VERSION.tar.gz" JPEG_RELEASE: "https://www.ijg.org/files/jpegsrc.v9d.tar.gz" ZLIB_RELEASE: "https://www.zlib.net/zlib-1.2.11.tar.gz" @@ -12,10 +12,10 @@ jobs: wheels_linux: - name: Build wheels on ${{ matrix.os }} for ${{ matrix.platform }} + name: ???? ${{ matrix.os }} - ${{ matrix.platform }} runs-on: ${{ matrix.os }} env: - CIBW_SKIP: "cp27-* cp35-* pp2* pp3*" + CIBW_SKIP: "cp27-* cp35-* pp2*" CIBW_TEST_COMMAND: "pytest -nauto {project}/tests" CIBW_TEST_REQUIRES: "-r requirements/test.txt" CIBW_BEFORE_ALL: "bash {project}/build-scripts/linux-build-wheel-deps.bash" @@ -47,11 +47,11 @@ platforms: all - name: Build wheels - uses: joerick/cibuildwheel@v1.10.0 + uses: joerick/cibuildwheel@v1.11.1 if: matrix.platform != 'manylinux_aarch64' - name: Build wheels (emulated) - uses: joerick/cibuildwheel@v1.10.0 + uses: joerick/cibuildwheel@v1.11.1 if: matrix.platform == 'manylinux_aarch64' env: CIBW_ARCHS_LINUX: aarch64 @@ -61,7 +61,7 @@ path: ./wheelhouse/*.whl wheels_macos: - name: Build wheels on ${{ matrix.os }} + name: ???? ${{ matrix.os }} runs-on: ${{ matrix.os }} env: CIBW_SKIP: "cp27-* cp35-* pp2*" @@ -83,7 +83,7 @@ python-version: "3.8" - name: Build wheels - uses: joerick/cibuildwheel@v1.10.0 + uses: joerick/cibuildwheel@v1.11.1 #env: # CIBW_ARCHS_MACOS: x86_64 universal2 # for Apple Silicon @@ -92,9 +92,13 @@ path: ./wheelhouse/*.whl wheels_windows: - name: Build wheels on ${{ matrix.os }} for ${{ matrix.platform }} + name: ???? ${{ matrix.os }} - ${{ matrix.platform }} runs-on: ${{ matrix.os }} env: + # pp3*-win32 fails because there is no wheel for lxml + # pp3*-win_amd64 does not execute because cibuildwheel does not implement it + # or PyPy3 doesn't work on Windows 64-bit, one or the other + # PyPy+Win32 seems like a very low priority combination CIBW_SKIP: "cp27-* cp35-* pp2* pp3*" CIBW_TEST_COMMAND: "pytest -nauto {project}/tests" CIBW_TEST_REQUIRES: "-r requirements/test.txt" @@ -124,14 +128,14 @@ shell: pwsh - name: Build wheels - uses: joerick/cibuildwheel@v1.10.0 + uses: joerick/cibuildwheel@v1.11.1 - uses: actions/upload-artifact@v2 with: path: ./wheelhouse/*.whl sdist: - name: Build source distribution + name: ???? source distribution runs-on: ubuntu-latest steps: - uses: actions/checkout@v2 diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/pikepdf-2.12.1/PKG-INFO new/pikepdf-2.12.2/PKG-INFO --- old/pikepdf-2.12.1/PKG-INFO 2021-05-21 10:45:21.450394600 +0200 +++ new/pikepdf-2.12.2/PKG-INFO 2021-06-06 11:13:25.891754600 +0200 @@ -1,6 +1,6 @@ Metadata-Version: 2.1 Name: pikepdf -Version: 2.12.1 +Version: 2.12.2 Summary: Read and write PDFs with Python, powered by qpdf Home-page: https://github.com/pikepdf/pikepdf Author: James R. Barlow @@ -9,93 +9,6 @@ Project-URL: Documentation, https://pikepdf.readthedocs.io/ Project-URL: Source, https://github.com/pikepdf/pikepdf Project-URL: Tracker, https://github.com/pikepdf/pikepdf/issues -Description: pikepdf - ======= - - **pikepdf** is a Python library for reading and writing PDF files. - - [](https://github.com/pikepdf/pikepdf/actions/workflows/build_wheels.yml) [](https://pypi.org/project/pikepdf/)   [](https://lgtm.com/projects/g/pikepdf/pikepdf/context:python) [](https://lgtm.com/projects/g/pikepdf/pikepdf/context:cpp)   [](https://codecov.io/gh/pikepdf/pike pdf) - - pikepdf is based on [QPDF](https://github.com/qpdf/qpdf), a powerful PDF manipulation and repair library. - - Python + QPDF = "py" + "qpdf" = "pyqpdf", which looks like a dyslexia test. Say it out loud, and it sounds like "pikepdf". - - ```python - # Elegant, Pythonic API - with pikepdf.open('input.pdf') as pdf: - num_pages = len(pdf.pages) - del pdf.pages[-1] - pdf.save('output.pdf') - ``` - - **To install:** - - ```bash - pip install pikepdf - ``` - - For users who want to build from source, see [installation](https://pikepdf.readthedocs.io/en/latest/index.html). - - pikepdf is [documented](https://pikepdf.readthedocs.io/en/latest/index.html) and actively maintained. Commercial support is available. - - Features - -------- - - This library is similar to PyPDF2 and pdfrw - it provides low level access to PDF features and allows editing and content transformation of existing PDFs. Some knowledge of the PDF specification may be helpful. It does not have the capability to render a PDF to image. - - | **Feature** | **pikepdf** | **PyPDF2** | **pdfrw** | - | ------------------------------------------------------------------- | ------------------------------------------- | ----------------------------------------- | --------------------------------------- | - | Editing, manipulation and transformation of existing PDFs | ??? | ??? | ??? | - | Based on an existing, mature PDF library | QPDF | ??? | ??? | - | Implementation | C++ and Python | Python | Python | - | PDF versions supported | 1.1 to 1.7 | 1.3? | 1.7 | - | Python versions supported | 3.6-3.9 | 2.6-3.6 | 2.6-3.6 | - | Save and load password protected (encrypted) PDFs | ??? (except public key) | ??? (Only obsolete RC4) | ??? (not at all) | - | Save and load PDF compressed object streams (PDF 1.5) | ??? | ??? | ??? | - | Creates linearized ("fast web view") PDFs | ??? | ??? | ??? | - | Actively maintained | ![pikepdf commit activity][pikepdf-commits] | ![PyPDF2 commit activity][pypdf2-commits] | ![pdfrw commit activity][pdfrw-commits] | - | Test suite coverage | ![codecov][codecov] | very low | unknown | - | Creates PDFs that pass PDF validation tests | ??? | ??? | ? | - | Modifies PDF/A without breaking PDF/A compliance | ??? | ??? | ? | - | Automatically repairs PDFs with internal errors | ??? | ??? | ??? | - | PDF XMP metadata editing | ??? | read-only | ??? | - | Documentation | ??? | ??? | ??? | - | Integrates with Jupyter and IPython notebooks for rapid development | ??? | ??? | ??? | - - - [pikepdf-commits]: https://img.shields.io/github/commit-activity/y/pikepdf/pikepdf.svg - - [pypdf2-commits]: https://img.shields.io/github/commit-activity/y/mstamy2/PyPDF2.svg - - [pdfrw-commits]: https://img.shields.io/github/commit-activity/y/pmaupin/pdfrw.svg - - [codecov]: https://codecov.io/gh/pikepdf/pikepdf/branch/master/graph/badge.svg?token=8FJ755317J - - Testimonials - ------------ - - > I decided to try writing a quick Python program with pikepdf to automate [something] and it "just worked". ???Jay Berkenbilt, creator of QPDF - - > "Thanks for creating a great pdf library, I tested out several and this is the one that was best able to work with whatever I threw at it." ???@cfcurtis - - In Production - ------------- - - * [OCRmyPDF](https://github.com/jbarlow83/OCRmyPDF) uses pikepdf to graft OCR text layers onto existing PDFs, to examine the contents of input PDFs, and to optimize PDFs. - - * [pdfarranger](https://github.com/jeromerobert/pdfarranger) is a small Python application that provides a graphical user interface to rotate, crop and rearrange PDFs. - - * [PDFStitcher](https://github.com/cfcurtis/sewingutils) is a utility for stitching PDF pages into a single document (i.e. N-up or page imposition). - - License - ------- - - pikepdf is provided under the [Mozilla Public License 2.0](https://www.mozilla.org/en-US/MPL/2.0/) license (MPL) that can be found in the LICENSE file. By using, distributing, or contributing to this project, you agree to the terms and conditions of this license. - - [Informally](https://www.mozilla.org/en-US/MPL/2.0/FAQ/), MPL 2.0 is a not a "viral" license. It may be combined with other work, including commercial software. However, you must disclose your modifications *to pikepdf* in source code form. In other works, fork this repository on GitHub or elsewhere and commit your contributions there, and you've satisfied your obligations. MPL 2.0 is compatible with the GPL and LGPL - see the [guidelines](https://www.mozilla.org/en-US/MPL/2.0/combining-mpl-and-gpl/) for notes on use in GPL. - - The [`debian/copyright`](/debian/copyright) file describes licensing terms for the test suite and the provenance of test resources. - Platform: UNKNOWN Classifier: Development Status :: 5 - Production/Stable Classifier: Intended Audience :: Developers @@ -114,3 +27,93 @@ Requires-Python: >=3.6 Description-Content-Type: text/markdown Provides-Extra: docs +License-File: licenses/license.wheel.txt + +pikepdf +======= + +**pikepdf** is a Python library for reading and writing PDF files. + +[](https://github.com/pikepdf/pikepdf/actions/workflows/build_wheels.yml) [](https://pypi.org/project/pikepdf/)   [](https://lgtm.com/projects/g/pikepdf/pikepdf/context:python) [](https://lgtm.com/projects/g/pikepdf/pikepdf/context:cpp)   [](https://codecov.io/gh/pikepdf/pikepdf) + +pikepdf is based on [QPDF](https://github.com/qpdf/qpdf), a powerful PDF manipulation and repair library. + +Python + QPDF = "py" + "qpdf" = "pyqpdf", which looks like a dyslexia test. Say it out loud, and it sounds like "pikepdf". + +```python +# Elegant, Pythonic API +with pikepdf.open('input.pdf') as pdf: + num_pages = len(pdf.pages) + del pdf.pages[-1] + pdf.save('output.pdf') +``` + +**To install:** + +```bash +pip install pikepdf +``` + +For users who want to build from source, see [installation](https://pikepdf.readthedocs.io/en/latest/index.html). + +pikepdf is [documented](https://pikepdf.readthedocs.io/en/latest/index.html) and actively maintained. Commercial support is available. + +Features +-------- + +This library is similar to PyPDF2 and pdfrw - it provides low level access to PDF features and allows editing and content transformation of existing PDFs. Some knowledge of the PDF specification may be helpful. It does not have the capability to render a PDF to image. + +| **Feature** | **pikepdf** | **PyPDF2** | **pdfrw** | +| ------------------------------------------------------------------- | ------------------------------------------- | ----------------------------------------- | --------------------------------------- | +| Editing, manipulation and transformation of existing PDFs | ??? | ??? | ??? | +| Based on an existing, mature PDF library | QPDF | ??? | ??? | +| Implementation | C++ and Python | Python | Python | +| PDF versions supported | 1.1 to 1.7 | 1.3? | 1.7 | +| Python versions supported | 3.6-3.9 | 2.6-3.6 | 2.6-3.6 | +| Save and load password protected (encrypted) PDFs | ??? (except public key) | ??? (Only obsolete RC4) | ??? (not at all) | +| Save and load PDF compressed object streams (PDF 1.5) | ??? | ??? | ??? | +| Creates linearized ("fast web view") PDFs | ??? | ??? | ??? | +| Actively maintained | ![pikepdf commit activity][pikepdf-commits] | ![PyPDF2 commit activity][pypdf2-commits] | ![pdfrw commit activity][pdfrw-commits] | +| Test suite coverage | ![codecov][codecov] | very low | unknown | +| Creates PDFs that pass PDF validation tests | ??? | ??? | ? | +| Modifies PDF/A without breaking PDF/A compliance | ??? | ??? | ? | +| Automatically repairs PDFs with internal errors | ??? | ??? | ??? | +| PDF XMP metadata editing | ??? | read-only | ??? | +| Documentation | ??? | ??? | ??? | +| Integrates with Jupyter and IPython notebooks for rapid development | ??? | ??? | ??? | + + +[pikepdf-commits]: https://img.shields.io/github/commit-activity/y/pikepdf/pikepdf.svg + +[pypdf2-commits]: https://img.shields.io/github/commit-activity/y/mstamy2/PyPDF2.svg + +[pdfrw-commits]: https://img.shields.io/github/commit-activity/y/pmaupin/pdfrw.svg + +[codecov]: https://codecov.io/gh/pikepdf/pikepdf/branch/master/graph/badge.svg?token=8FJ755317J + +Testimonials +------------ + +> I decided to try writing a quick Python program with pikepdf to automate [something] and it "just worked". ???Jay Berkenbilt, creator of QPDF + +> "Thanks for creating a great pdf library, I tested out several and this is the one that was best able to work with whatever I threw at it." ???@cfcurtis + +In Production +------------- + +* [OCRmyPDF](https://github.com/jbarlow83/OCRmyPDF) uses pikepdf to graft OCR text layers onto existing PDFs, to examine the contents of input PDFs, and to optimize PDFs. + +* [pdfarranger](https://github.com/jeromerobert/pdfarranger) is a small Python application that provides a graphical user interface to rotate, crop and rearrange PDFs. + +* [PDFStitcher](https://github.com/cfcurtis/sewingutils) is a utility for stitching PDF pages into a single document (i.e. N-up or page imposition). + +License +------- + +pikepdf is provided under the [Mozilla Public License 2.0](https://www.mozilla.org/en-US/MPL/2.0/) license (MPL) that can be found in the LICENSE file. By using, distributing, or contributing to this project, you agree to the terms and conditions of this license. + +[Informally](https://www.mozilla.org/en-US/MPL/2.0/FAQ/), MPL 2.0 is a not a "viral" license. It may be combined with other work, including commercial software. However, you must disclose your modifications *to pikepdf* in source code form. In other works, fork this repository on GitHub or elsewhere and commit your contributions there, and you've satisfied your obligations. MPL 2.0 is compatible with the GPL and LGPL - see the [guidelines](https://www.mozilla.org/en-US/MPL/2.0/combining-mpl-and-gpl/) for notes on use in GPL. + +The [`debian/copyright`](/debian/copyright) file describes licensing terms for the test suite and the provenance of test resources. + + diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/pikepdf-2.12.1/build-scripts/linux-build-wheel-deps.bash new/pikepdf-2.12.2/build-scripts/linux-build-wheel-deps.bash --- old/pikepdf-2.12.1/build-scripts/linux-build-wheel-deps.bash 2021-05-21 10:41:30.000000000 +0200 +++ new/pikepdf-2.12.2/build-scripts/linux-build-wheel-deps.bash 2021-06-06 11:09:40.000000000 +0200 @@ -19,3 +19,6 @@ find /usr/local/lib -name 'libqpdf.so*' -type f -exec strip --strip-debug {} \+ popd fi + +# For PyPy +yum install -y libxml2-devel libxslt-devel \ No newline at end of file diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/pikepdf-2.12.1/build-scripts/win-download-qpdf.ps1 new/pikepdf-2.12.2/build-scripts/win-download-qpdf.ps1 --- old/pikepdf-2.12.1/build-scripts/win-download-qpdf.ps1 2021-05-21 10:41:30.000000000 +0200 +++ new/pikepdf-2.12.2/build-scripts/win-download-qpdf.ps1 2021-06-06 11:09:40.000000000 +0200 @@ -9,6 +9,10 @@ throw "I don't recognize platform=$platform" } +if (($version -eq "10.3.2") -and ($msvc -eq "msvc32")) { + $msvc = "msvc32-rebuild" +} + $qpdfurl = "https://github.com/qpdf/qpdf/releases/download/release-qpdf-$version/qpdf-$version-bin-$msvc.zip" echo "Download $qpdfurl" diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/pikepdf-2.12.1/docs/installation.rst new/pikepdf-2.12.2/docs/installation.rst --- old/pikepdf-2.12.1/docs/installation.rst 2021-05-21 10:41:30.000000000 +0200 +++ new/pikepdf-2.12.2/docs/installation.rst 2021-06-06 11:09:40.000000000 +0200 @@ -246,16 +246,16 @@ PyPy3 support ------------- -PyPy3 3.6 and 3.7 are currently supported. However, binary wheels for PyPy3 are not -available for some platforms, since some dependencies of pikepdf (namely lxml) do not -yet generate PyPy3 wheels of their own. +PyPy3 3.6 and 3.7 are currently supported, these being the latest versions of PyPy +as of this writing. Windows PyPy support is not available because cibuildwheel +does not support Windows 64-bit PyPy. +----------------+------------------------+-------+ | Platform | Source build supported | Wheel | +================+========================+=======+ | Windows 64-bit | ??? | | +----------------+------------------------+-------+ -| Linux 64-bit | ??? | | +| Linux 64-bit | ??? | ??? | +----------------+------------------------+-------+ | macOS | ??? | ??? | +----------------+------------------------+-------+ diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/pikepdf-2.12.1/docs/release_notes.rst new/pikepdf-2.12.2/docs/release_notes.rst --- old/pikepdf-2.12.1/docs/release_notes.rst 2021-05-21 10:41:30.000000000 +0200 +++ new/pikepdf-2.12.2/docs/release_notes.rst 2021-06-06 11:09:40.000000000 +0200 @@ -18,6 +18,17 @@ ``pikepdf._qpdf`` is a private interface within pikepdf that applications should not access directly, along with any modules with a prefixed underscore. +v2.12.2 +======= + +- Rebuild wheels against libqpdf 10.3.2. +- Enabled building Linux PyPy x86_64 wheels. +- Fixed a minor issue where the inline images would have their abbreviations + expanded when unparsed. While unlikely to be problematic, inline images usually + use abbreviations in their metadata and should be kept that way. +- Added notes to documentation about loading PDFs through Python file streams + and cases that can lead to poor performance. + v2.12.1 ======= @@ -769,7 +780,7 @@ ====== - Fixed an issue where invalid values such as out of range years (e.g. - 0) in DocumentInfo would raise exceptions when using DocumentInfo to + 1) in DocumentInfo would raise exceptions when using DocumentInfo to populate XMP metadata with ``.load_from_docinfo``. v1.0.1 diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/pikepdf-2.12.1/docs/topics/streams.rst new/pikepdf-2.12.2/docs/topics/streams.rst --- old/pikepdf-2.12.1/docs/topics/streams.rst 2021-05-21 10:41:30.000000000 +0200 +++ new/pikepdf-2.12.2/docs/topics/streams.rst 2021-06-06 11:09:40.000000000 +0200 @@ -24,7 +24,7 @@ When it comes to taxonomy, software developers have it easy. stream object - A PDF object that contains binary data and a metadata dictionary to describes + A PDF object that contains binary data and a metadata dictionary that describes it, represented as :class:`pikepdf.Stream`. In HTML this is equivalent to a ``<object>`` tag with attributes and data. @@ -67,7 +67,7 @@ You were warned about terminology. -In the interesting of preversing our remaining sanity, you cannot access a +To preserve our remaining sanity, you cannot access a stream object as a file-like object directly. To efficiently access a ``pikepdf.Stream`` as a Python file object, you may do: diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/pikepdf-2.12.1/docs/tutorial.rst new/pikepdf-2.12.2/docs/tutorial.rst --- old/pikepdf-2.12.1/docs/tutorial.rst 2021-05-21 10:41:30.000000000 +0200 +++ new/pikepdf-2.12.2/docs/tutorial.rst 2021-06-06 11:09:40.000000000 +0200 @@ -16,7 +16,8 @@ In contrast to better known PDF libraries, pikepdf uses a single object to represent a PDF, whether reading, writing or merging. We have cleverly named this :class:`pikepdf.Pdf`. In this documentation, a ``Pdf`` is a class that -allows manipulate the PDF, meaning the file. +allows manipulate the PDF, meaning the "file" (whether it exists in memory or on +a file system). .. code-block:: python diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/pikepdf-2.12.1/src/pikepdf/_methods.py new/pikepdf-2.12.2/src/pikepdf/_methods.py --- old/pikepdf-2.12.1/src/pikepdf/_methods.py 2021-05-21 10:41:30.000000000 +0200 +++ new/pikepdf-2.12.2/src/pikepdf/_methods.py 2021-06-06 11:09:40.000000000 +0200 @@ -869,7 +869,8 @@ >>> pdf = Pdf.open("test.pdf", password="rosebud") Args: - filename_or_stream: Filename of PDF to open. + filename_or_stream: Filename or Python readable and seekable file + stream of PDF to open. password: User or owner password to open an encrypted PDF. If the type of this parameter is ``str`` it will be encoded as UTF-8. If the type is ``bytes`` it will @@ -910,6 +911,15 @@ TypeError: If the type of ``filename_or_stream`` is not usable. FileNotFoundError: If the file was not found. + + Note: + When *filename_or_stream* is a stream and the stream is located on a + network, pikepdf assumes that the stream using buffering and read caches + to achieve reasonable performance. Streams that fetch data over a network + in response to every read or seek request, no matter how small, will + perform poorly. It may be easier to download a PDF from network to + temporary local storage (such as ``io.BytesIO``), manipulate it, and + then re-upload it. """ if isinstance(filename_or_stream, bytes) and filename_or_stream.startswith( b'%PDF-' diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/pikepdf-2.12.1/src/pikepdf/models/image.py new/pikepdf-2.12.2/src/pikepdf/models/image.py --- old/pikepdf-2.12.1/src/pikepdf/models/image.py 2021-05-21 10:41:30.000000000 +0200 +++ new/pikepdf-2.12.2/src/pikepdf/models/image.py 2021-06-06 11:09:40.000000000 +0200 @@ -630,7 +630,7 @@ elif k > 0: ccitt_group = 3 # Group 3 2-D else: - ccitt_group = 2 # CCITT 1-D + ccitt_group = 2 # Group 3 1-D black_is_one = self.decode_parms[0].get("/BlackIs1", False) # TIFF spec says: # use 0 for white_is_zero (=> black is 1) @@ -784,6 +784,7 @@ b'/CCF': b'/CCITTFaxDecode', b'/DCT': b'/DCTDecode', } + REVERSE_ABBREVS = {v: k for k, v in ABBREVS.items()} def __init__(self, *, image_data, image_object: tuple): """ @@ -800,7 +801,9 @@ self._data = image_data self._image_object = image_object - reparse = b' '.join(self._unparse_obj(obj) for obj in image_object) + reparse = b' '.join( + self._unparse_obj(obj, remap_names=self.ABBREVS) for obj in image_object + ) try: reparsed_obj = Object.parse(b'<< ' + reparse + b' >>') except PdfError as e: @@ -815,12 +818,12 @@ ) @classmethod - def _unparse_obj(cls, obj): + def _unparse_obj(cls, obj, remap_names): if isinstance(obj, Object): if isinstance(obj, Name): name = obj.unparse(resolved=True) assert isinstance(name, bytes) - return cls.ABBREVS.get(name, name) + return remap_names.get(name, name) else: return obj.unparse(resolved=True) elif isinstance(obj, bool): @@ -834,18 +837,22 @@ return metadata_from_obj(self.obj, name, type_, default) def unparse(self): - tokens = [] - tokens.append(b'BI\n') - metadata = [] - for metadata_obj in self._image_object: - unparsed = self._unparse_obj(metadata_obj) - assert isinstance(unparsed, bytes) - metadata.append(unparsed) - tokens.append(b' '.join(metadata)) - tokens.append(b'\nID\n') - tokens.append(self._data._inline_image_raw_bytes()) - tokens.append(b'EI') - return b''.join(tokens) + def metadata_tokens(): + for metadata_obj in self._image_object: + unparsed = self._unparse_obj( + metadata_obj, remap_names=self.REVERSE_ABBREVS + ) + assert isinstance(unparsed, bytes) + yield unparsed + + def inline_image_tokens(): + yield b'BI\n' + yield b' '.join(m for m in metadata_tokens()) + yield b'\nID\n' + yield self._data._inline_image_raw_bytes() + yield b'EI' + + return b''.join(inline_image_tokens()) @property def is_inline(self): diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/pikepdf-2.12.1/src/pikepdf.egg-info/PKG-INFO new/pikepdf-2.12.2/src/pikepdf.egg-info/PKG-INFO --- old/pikepdf-2.12.1/src/pikepdf.egg-info/PKG-INFO 2021-05-21 10:45:21.000000000 +0200 +++ new/pikepdf-2.12.2/src/pikepdf.egg-info/PKG-INFO 2021-06-06 11:13:25.000000000 +0200 @@ -1,6 +1,6 @@ Metadata-Version: 2.1 Name: pikepdf -Version: 2.12.1 +Version: 2.12.2 Summary: Read and write PDFs with Python, powered by qpdf Home-page: https://github.com/pikepdf/pikepdf Author: James R. Barlow @@ -9,93 +9,6 @@ Project-URL: Documentation, https://pikepdf.readthedocs.io/ Project-URL: Source, https://github.com/pikepdf/pikepdf Project-URL: Tracker, https://github.com/pikepdf/pikepdf/issues -Description: pikepdf - ======= - - **pikepdf** is a Python library for reading and writing PDF files. - - [](https://github.com/pikepdf/pikepdf/actions/workflows/build_wheels.yml) [](https://pypi.org/project/pikepdf/)   [](https://lgtm.com/projects/g/pikepdf/pikepdf/context:python) [](https://lgtm.com/projects/g/pikepdf/pikepdf/context:cpp)   [](https://codecov.io/gh/pikepdf/pike pdf) - - pikepdf is based on [QPDF](https://github.com/qpdf/qpdf), a powerful PDF manipulation and repair library. - - Python + QPDF = "py" + "qpdf" = "pyqpdf", which looks like a dyslexia test. Say it out loud, and it sounds like "pikepdf". - - ```python - # Elegant, Pythonic API - with pikepdf.open('input.pdf') as pdf: - num_pages = len(pdf.pages) - del pdf.pages[-1] - pdf.save('output.pdf') - ``` - - **To install:** - - ```bash - pip install pikepdf - ``` - - For users who want to build from source, see [installation](https://pikepdf.readthedocs.io/en/latest/index.html). - - pikepdf is [documented](https://pikepdf.readthedocs.io/en/latest/index.html) and actively maintained. Commercial support is available. - - Features - -------- - - This library is similar to PyPDF2 and pdfrw - it provides low level access to PDF features and allows editing and content transformation of existing PDFs. Some knowledge of the PDF specification may be helpful. It does not have the capability to render a PDF to image. - - | **Feature** | **pikepdf** | **PyPDF2** | **pdfrw** | - | ------------------------------------------------------------------- | ------------------------------------------- | ----------------------------------------- | --------------------------------------- | - | Editing, manipulation and transformation of existing PDFs | ??? | ??? | ??? | - | Based on an existing, mature PDF library | QPDF | ??? | ??? | - | Implementation | C++ and Python | Python | Python | - | PDF versions supported | 1.1 to 1.7 | 1.3? | 1.7 | - | Python versions supported | 3.6-3.9 | 2.6-3.6 | 2.6-3.6 | - | Save and load password protected (encrypted) PDFs | ??? (except public key) | ??? (Only obsolete RC4) | ??? (not at all) | - | Save and load PDF compressed object streams (PDF 1.5) | ??? | ??? | ??? | - | Creates linearized ("fast web view") PDFs | ??? | ??? | ??? | - | Actively maintained | ![pikepdf commit activity][pikepdf-commits] | ![PyPDF2 commit activity][pypdf2-commits] | ![pdfrw commit activity][pdfrw-commits] | - | Test suite coverage | ![codecov][codecov] | very low | unknown | - | Creates PDFs that pass PDF validation tests | ??? | ??? | ? | - | Modifies PDF/A without breaking PDF/A compliance | ??? | ??? | ? | - | Automatically repairs PDFs with internal errors | ??? | ??? | ??? | - | PDF XMP metadata editing | ??? | read-only | ??? | - | Documentation | ??? | ??? | ??? | - | Integrates with Jupyter and IPython notebooks for rapid development | ??? | ??? | ??? | - - - [pikepdf-commits]: https://img.shields.io/github/commit-activity/y/pikepdf/pikepdf.svg - - [pypdf2-commits]: https://img.shields.io/github/commit-activity/y/mstamy2/PyPDF2.svg - - [pdfrw-commits]: https://img.shields.io/github/commit-activity/y/pmaupin/pdfrw.svg - - [codecov]: https://codecov.io/gh/pikepdf/pikepdf/branch/master/graph/badge.svg?token=8FJ755317J - - Testimonials - ------------ - - > I decided to try writing a quick Python program with pikepdf to automate [something] and it "just worked". ???Jay Berkenbilt, creator of QPDF - - > "Thanks for creating a great pdf library, I tested out several and this is the one that was best able to work with whatever I threw at it." ???@cfcurtis - - In Production - ------------- - - * [OCRmyPDF](https://github.com/jbarlow83/OCRmyPDF) uses pikepdf to graft OCR text layers onto existing PDFs, to examine the contents of input PDFs, and to optimize PDFs. - - * [pdfarranger](https://github.com/jeromerobert/pdfarranger) is a small Python application that provides a graphical user interface to rotate, crop and rearrange PDFs. - - * [PDFStitcher](https://github.com/cfcurtis/sewingutils) is a utility for stitching PDF pages into a single document (i.e. N-up or page imposition). - - License - ------- - - pikepdf is provided under the [Mozilla Public License 2.0](https://www.mozilla.org/en-US/MPL/2.0/) license (MPL) that can be found in the LICENSE file. By using, distributing, or contributing to this project, you agree to the terms and conditions of this license. - - [Informally](https://www.mozilla.org/en-US/MPL/2.0/FAQ/), MPL 2.0 is a not a "viral" license. It may be combined with other work, including commercial software. However, you must disclose your modifications *to pikepdf* in source code form. In other works, fork this repository on GitHub or elsewhere and commit your contributions there, and you've satisfied your obligations. MPL 2.0 is compatible with the GPL and LGPL - see the [guidelines](https://www.mozilla.org/en-US/MPL/2.0/combining-mpl-and-gpl/) for notes on use in GPL. - - The [`debian/copyright`](/debian/copyright) file describes licensing terms for the test suite and the provenance of test resources. - Platform: UNKNOWN Classifier: Development Status :: 5 - Production/Stable Classifier: Intended Audience :: Developers @@ -114,3 +27,93 @@ Requires-Python: >=3.6 Description-Content-Type: text/markdown Provides-Extra: docs +License-File: licenses/license.wheel.txt + +pikepdf +======= + +**pikepdf** is a Python library for reading and writing PDF files. + +[](https://github.com/pikepdf/pikepdf/actions/workflows/build_wheels.yml) [](https://pypi.org/project/pikepdf/)   [](https://lgtm.com/projects/g/pikepdf/pikepdf/context:python) [](https://lgtm.com/projects/g/pikepdf/pikepdf/context:cpp)   [](https://codecov.io/gh/pikepdf/pikepdf) + +pikepdf is based on [QPDF](https://github.com/qpdf/qpdf), a powerful PDF manipulation and repair library. + +Python + QPDF = "py" + "qpdf" = "pyqpdf", which looks like a dyslexia test. Say it out loud, and it sounds like "pikepdf". + +```python +# Elegant, Pythonic API +with pikepdf.open('input.pdf') as pdf: + num_pages = len(pdf.pages) + del pdf.pages[-1] + pdf.save('output.pdf') +``` + +**To install:** + +```bash +pip install pikepdf +``` + +For users who want to build from source, see [installation](https://pikepdf.readthedocs.io/en/latest/index.html). + +pikepdf is [documented](https://pikepdf.readthedocs.io/en/latest/index.html) and actively maintained. Commercial support is available. + +Features +-------- + +This library is similar to PyPDF2 and pdfrw - it provides low level access to PDF features and allows editing and content transformation of existing PDFs. Some knowledge of the PDF specification may be helpful. It does not have the capability to render a PDF to image. + +| **Feature** | **pikepdf** | **PyPDF2** | **pdfrw** | +| ------------------------------------------------------------------- | ------------------------------------------- | ----------------------------------------- | --------------------------------------- | +| Editing, manipulation and transformation of existing PDFs | ??? | ??? | ??? | +| Based on an existing, mature PDF library | QPDF | ??? | ??? | +| Implementation | C++ and Python | Python | Python | +| PDF versions supported | 1.1 to 1.7 | 1.3? | 1.7 | +| Python versions supported | 3.6-3.9 | 2.6-3.6 | 2.6-3.6 | +| Save and load password protected (encrypted) PDFs | ??? (except public key) | ??? (Only obsolete RC4) | ??? (not at all) | +| Save and load PDF compressed object streams (PDF 1.5) | ??? | ??? | ??? | +| Creates linearized ("fast web view") PDFs | ??? | ??? | ??? | +| Actively maintained | ![pikepdf commit activity][pikepdf-commits] | ![PyPDF2 commit activity][pypdf2-commits] | ![pdfrw commit activity][pdfrw-commits] | +| Test suite coverage | ![codecov][codecov] | very low | unknown | +| Creates PDFs that pass PDF validation tests | ??? | ??? | ? | +| Modifies PDF/A without breaking PDF/A compliance | ??? | ??? | ? | +| Automatically repairs PDFs with internal errors | ??? | ??? | ??? | +| PDF XMP metadata editing | ??? | read-only | ??? | +| Documentation | ??? | ??? | ??? | +| Integrates with Jupyter and IPython notebooks for rapid development | ??? | ??? | ??? | + + +[pikepdf-commits]: https://img.shields.io/github/commit-activity/y/pikepdf/pikepdf.svg + +[pypdf2-commits]: https://img.shields.io/github/commit-activity/y/mstamy2/PyPDF2.svg + +[pdfrw-commits]: https://img.shields.io/github/commit-activity/y/pmaupin/pdfrw.svg + +[codecov]: https://codecov.io/gh/pikepdf/pikepdf/branch/master/graph/badge.svg?token=8FJ755317J + +Testimonials +------------ + +> I decided to try writing a quick Python program with pikepdf to automate [something] and it "just worked". ???Jay Berkenbilt, creator of QPDF + +> "Thanks for creating a great pdf library, I tested out several and this is the one that was best able to work with whatever I threw at it." ???@cfcurtis + +In Production +------------- + +* [OCRmyPDF](https://github.com/jbarlow83/OCRmyPDF) uses pikepdf to graft OCR text layers onto existing PDFs, to examine the contents of input PDFs, and to optimize PDFs. + +* [pdfarranger](https://github.com/jeromerobert/pdfarranger) is a small Python application that provides a graphical user interface to rotate, crop and rearrange PDFs. + +* [PDFStitcher](https://github.com/cfcurtis/sewingutils) is a utility for stitching PDF pages into a single document (i.e. N-up or page imposition). + +License +------- + +pikepdf is provided under the [Mozilla Public License 2.0](https://www.mozilla.org/en-US/MPL/2.0/) license (MPL) that can be found in the LICENSE file. By using, distributing, or contributing to this project, you agree to the terms and conditions of this license. + +[Informally](https://www.mozilla.org/en-US/MPL/2.0/FAQ/), MPL 2.0 is a not a "viral" license. It may be combined with other work, including commercial software. However, you must disclose your modifications *to pikepdf* in source code form. In other works, fork this repository on GitHub or elsewhere and commit your contributions there, and you've satisfied your obligations. MPL 2.0 is compatible with the GPL and LGPL - see the [guidelines](https://www.mozilla.org/en-US/MPL/2.0/combining-mpl-and-gpl/) for notes on use in GPL. + +The [`debian/copyright`](/debian/copyright) file describes licensing terms for the test suite and the provenance of test resources. + + diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/pikepdf-2.12.1/tests/test_image_access.py new/pikepdf-2.12.2/tests/test_image_access.py --- old/pikepdf-2.12.1/tests/test_image_access.py 2021-05-21 10:41:30.000000000 +0200 +++ new/pikepdf-2.12.2/tests/test_image_access.py 2021-06-06 11:09:40.000000000 +0200 @@ -161,6 +161,8 @@ assert 'PdfInlineImage' in repr(iimage) unparsed = iimage.unparse() + assert b'/W 8' in unparsed, "inline images should have abbreviated metadata" + assert b'/Width 8' not in unparsed, "abbreviations expanded in inline image" cs = pdf.make_stream(unparsed) for operands, _command in parse_content_stream(cs):