Bug#1009680: marked as done (Ghostscript 9.56 removes hidden (e.g. OCR) text layers when refrying with NEWPDF=true)

Debian Bug Tracking System Wed, 20 Apr 2022 14:09:17 -0700

Your message dated Wed, 20 Apr 2022 21:05:28 +0000
with message-id <e1nhhvw-0008ey...@fasolo.debian.org>
and subject line Bug#1009680: fixed in ghostscript 9.56.1~dfsg-1
has caused the Debian Bug report #1009680,
regarding Ghostscript 9.56 removes hidden (e.g. OCR) text layers when refrying 
with NEWPDF=true
to be marked as done.


This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact ow...@bugs.debian.org
immediately.)


-- 
1009680: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1009680
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems

--- Begin Message ---
Source: ghostscript, ocrmypdf
Control: found -1 ghostscript/9.56.0~dfsg-1
Control: found -1 ocrmypdf/13.4.0+dfsg-1
Severity: serious
Tags: sid bookworm
User: debian...@lists.debian.org
Usertags: breaks needs-update

Dear maintainer(s),
With a recent upload of ghostscript the autopkgtest of ocrmypdf fails in testing when that autopkgtest is run with the binary packages of ghostscript from unstable. It passes when run with only packages from testing. In tabular form:
                       pass            fail
ghostscript            from testing    9.56.0~dfsg-1
ocrmypdf               from testing    13.4.0+dfsg-1
all others             from testing    from testing

I copied some of the output at the bottom of this report.
Currently this regression is blocking the migration of ghostscript to testing [1]. Due to the nature of this issue, I filed this bug report against both packages. Can you please investigate the situation and reassign the bug to the right package?
More information about this bug and the reason for filing it can be found on
https://wiki.debian.org/ContinuousIntegration/RegressionEmailInformation

Paul

[1] https://qa.debian.org/excuses.php?package=ghostscript

https://ci.debian.net/data/autopkgtest/testing/amd64/o/ocrmypdf/20818050/log.gz
=================================== FAILURES ===================================
________________________________ test_force_ocr ________________________________
resources = PosixPath('/tmp/autopkgtest-lxc.zdbcipww/downtmp/build.V8r/src/tests/resources')
outpdf = PosixPath('/tmp/pytest-of-debci/pytest-0/test_force_ocr0/out.pdf')

    def test_force_ocr(resources, outpdf):
        out = check_ocrmypdf(
            resources / 'graph_ocred.pdf',
            outpdf,
            '-f',
            '--plugin',
            'tests/plugins/tesseract_cache.py',
        )
        pdfinfo = PdfInfo(out)
      assert pdfinfo[0].has_text
E       assert False
E + where False = <PageInfo pageno=0 7.573333333333333333333333333"x6.16" rotation=0 dpi=400.000000x400.000000 has_text=False>.has_text
tests/test_main.py:83: AssertionError
----------------------------- Captured stderr call -----------------------------
Scanning contents:   0%|          | 0/1 [00:00<?, ?page/s]
Scanning contents: 100%|██████████| 1/1 [00:00<00:00, 62.30page/s]

OCR:   0%|          | 0.0/1.0 [00:00<?, ?page/s]
OCR:  50%|█████     | 0.5/1.0 [00:02<00:02,  5.47s/page]
OCR: 100%|██████████| 1.0/1.0 [00:02<00:00,  2.75s/page]

PDF/A conversion:   0%|          | 0/1 [00:00<?, ?page/s]

Recompressing JPEGs: 0image [00:00, ?image/s]

Deflating JPEGs:   0%|          | 0/1 [00:00<?, ?image/s]
Deflating JPEGs: 100%|██████████| 1/1 [00:00<00:00, 74.34image/s]

JBIG2: 0item [00:00, ?item/s]
------------------------------ Captured log call -------------------------------
INFO     ocrmypdf._pipeline:_pipeline.py:275 page already has text! - rasterizing text and running OCR anyway
INFO     ocrmypdf._sync:_sync.py:301 Postprocessing...
WARNING  ocrmypdf._pipeline:_pipeline.py:776 Some input metadata could not be copied because it is not permitted in PDF/A. You may wish to examine the output PDF's XMP metadata.
INFO     ocrmypdf.optimize:optimize.py:665 Optimize ratio: 1.52 savings: 34.1%
INFO     ocrmypdf._sync:_sync.py:399 Output file is a PDF/A-2B (as expected)
WARNING ocrmypdf._validation:_validation.py:381 The output file size is 2.45× larger than the input file.
Possible reasons for this include:
The argument --force-ocr was issued, causing transcoding.
The optional dependency 'jbig2' was not found, so some image optimizations could not be attempted.
PDF/A conversion was enabled. (Try `--output-type pdf`.)
Plugins were used.
--------------------------- Captured stderr teardown ---------------------------
PDF/A conversion: 100%|██████████| 1/1 [00:01<00:00,  1.20s/page]
________________________________ test_skip_ocr _________________________________
resources = PosixPath('/tmp/autopkgtest-lxc.zdbcipww/downtmp/build.V8r/src/tests/resources')
outpdf = PosixPath('/tmp/pytest-of-debci/pytest-0/test_skip_ocr0/out.pdf')

    def test_skip_ocr(resources, outpdf):
        out = check_ocrmypdf(
            resources / 'graph_ocred.pdf',
            outpdf,
            '-s',
            '--plugin',
            'tests/plugins/tesseract_cache.py',
        )
        pdfinfo = PdfInfo(out)
      assert pdfinfo[0].has_text
E       assert False
E + where False = <PageInfo pageno=0 7.573333333333333333333333333"x6.16" rotation=0 dpi=150.000000x150.000000 has_text=False>.has_text
tests/test_main.py:95: AssertionError
----------------------------- Captured stderr call -----------------------------
Scanning contents:   0%|          | 0/1 [00:00<?, ?page/s]
Scanning contents: 100%|██████████| 1/1 [00:00<00:00, 70.71page/s]

OCR:   0%|          | 0.0/1.0 [00:00<?, ?page/s]
OCR: 100%|██████████| 1.0/1.0 [00:00<00:00, 47.12page/s]

PDF/A conversion:   0%|          | 0/1 [00:00<?, ?page/s]

Recompressing JPEGs: 0image [00:00, ?image/s]

Deflating JPEGs:   0%|          | 0/1 [00:00<?, ?image/s]
Deflating JPEGs: 100%|██████████| 1/1 [00:00<00:00, 235.24image/s]

JBIG2: 0item [00:00, ?item/s]
------------------------------ Captured log call -------------------------------
INFO     ocrmypdf._pipeline:_pipeline.py:287 skipping all processing on this page
INFO     ocrmypdf._sync:_sync.py:301 Postprocessing...
WARNING  ocrmypdf._pipeline:_pipeline.py:776 Some input metadata could not be copied because it is not permitted in PDF/A. You may wish to examine the output PDF's XMP metadata.
INFO     ocrmypdf.optimize:optimize.py:665 Optimize ratio: 1.14 savings: 12.6%
INFO     ocrmypdf._sync:_sync.py:399 Output file is a PDF/A-2B (as expected)
--------------------------- Captured stderr teardown ---------------------------
PDF/A conversion: 100%|██████████| 1/1 [00:00<00:00,  4.16page/s]
________________________________ test_redo_ocr _________________________________
resources = PosixPath('/tmp/autopkgtest-lxc.zdbcipww/downtmp/build.V8r/src/tests/resources')
outpdf = PosixPath('/tmp/pytest-of-debci/pytest-0/test_redo_ocr0/out.pdf')

    def test_redo_ocr(resources, outpdf):
        in_ = resources / 'graph_ocred.pdf'
        before = PdfInfo(in_, detailed_analysis=True)
        out = outpdf
        out = check_ocrmypdf(in_, out, '--redo-ocr')
        after = PdfInfo(out, detailed_analysis=True)
      assert before[0].has_text and after[0].has_text
E       assert (True and False)
E + where True = <PageInfo pageno=0 7.573333333333333333333333333"x6.16" rotation=0 dpi=150.000000x150.000000 has_text=True>.has_text
E + and False = <PageInfo pageno=0 7.573333333333333333333333333"x6.16" rotation=0 dpi=150.000000x150.000000 has_text=False>.has_text
tests/test_main.py:104: AssertionError
----------------------------- Captured stderr call -----------------------------
Scanning contents:   0%|          | 0/1 [00:00<?, ?page/s]
Scanning contents: 100%|██████████| 1/1 [00:00<00:00, 20.63page/s]

OCR:   0%|          | 0.0/1.0 [00:00<?, ?page/s]
OCR:  50%|█████     | 0.5/1.0 [00:04<00:04,  8.64s/page]
OCR: 100%|██████████| 1.0/1.0 [00:04<00:00,  4.35s/page]

PDF/A conversion:   0%|          | 0/1 [00:00<?, ?page/s]

Recompressing JPEGs: 0image [00:00, ?image/s]

Deflating JPEGs:   0%|          | 0/1 [00:00<?, ?image/s]
Deflating JPEGs: 100%|██████████| 1/1 [00:00<00:00, 254.88image/s]

JBIG2: 0item [00:00, ?item/s]
------------------------------ Captured log call -------------------------------
INFO     ocrmypdf._pipeline:_pipeline.py:284 redoing OCR
INFO     ocrmypdf._sync:_sync.py:301 Postprocessing...
ERROR ocrmypdf._exec.ghostscript:ghostscript.py:277 GPL Ghostscript 9.56.0 (2022-03-29)
Copyright (C) 2022 Artifex Software, Inc.  All rights reserved.
This software is supplied under the GNU AGPLv3 and comes with NO WARRANTY:
see the file COPYING for details.
Processing pages 1 through 1.
Page 1
The following warnings were encountered at least once while processing this file:
        number uses illegal exponent form
ERROR    ocrmypdf._exec.ghostscript:ghostscript.py:277 This file had errors that were repaired or ignored.
ERROR    ocrmypdf._exec.ghostscript:ghostscript.py:277 The file was produced by:
ERROR    ocrmypdf._exec.ghostscript:ghostscript.py:277 >>>> GPL Ghostscript 9.15 <<<<
ERROR    ocrmypdf._exec.ghostscript:ghostscript.py:277 Please notify the author of the software that produced this
ERROR    ocrmypdf._exec.ghostscript:ghostscript.py:277 file that it does not conform to Adobe's published PDF
ERROR    ocrmypdf._exec.ghostscript:ghostscript.py:277 specification.
WARNING  ocrmypdf._pipeline:_pipeline.py:776 Some input metadata could not be copied because it is not permitted in PDF/A. You may wish to examine the output PDF's XMP metadata.
INFO     ocrmypdf.optimize:optimize.py:665 Optimize ratio: 1.14 savings: 12.6%
INFO     ocrmypdf._sync:_sync.py:399 Output file is a PDF/A-2B (as expected)
--------------------------- Captured stderr teardown ---------------------------
PDF/A conversion: 100%|██████████| 1/1 [00:00<00:00,  3.91page/s]
=========================== short test summary info ============================
FAILED tests/test_main.py::test_force_ocr - assert False
FAILED tests/test_main.py::test_skip_ocr - assert False
FAILED tests/test_main.py::test_redo_ocr - assert (True and False)
======= 3 failed, 274 passed, 37 skipped, 4 xfailed in 397.41s (0:06:37) =======
autopkgtest [08:17:33]: test test-suite

--- End Message ---

--- Begin Message ---

Source: ghostscript
Source-Version: 9.56.1~dfsg-1
Done: Jonas Smedegaard <d...@jones.dk>

We believe that the bug you reported is fixed in the latest version of
ghostscript, which is due to be installed in the Debian FTP archive.

A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to 1009...@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Jonas Smedegaard <d...@jones.dk> (supplier of updated ghostscript package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmas...@ftp-master.debian.org)


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Format: 1.8
Date: Wed, 20 Apr 2022 22:47:35 +0200
Source: ghostscript
Architecture: source
Version: 9.56.1~dfsg-1
Distribution: unstable
Urgency: medium
Maintainer: Jonas Smedegaard <d...@jones.dk>
Changed-By: Jonas Smedegaard <d...@jones.dk>
Closes: 1009680
Changes:
 ghostscript (9.56.1~dfsg-1) unstable; urgency=medium
 .
   [ upstream ]
   * new release
     + fix text rendering mode 3 and pdfwrite;
       closes: bug#1009680, thanks to Paul Gevers and others
 .
   [ Jonas Smedegaard ]
   * fix watch file
   * update symbols: 1 private symbol added
Checksums-Sha1:
 ed0a4c61548fe84d3f6cfaecc4406141d7c9da20 2609 ghostscript_9.56.1~dfsg-1.dsc
 5567367f9ab85cb5039fb5c4c3bd93cbbc32c2fb 26707316 ghostscript_9.56.1~dfsg.orig.tar.xz
 8ec7a53f70a4f727ac5f4234f5788f3c63e64140 113264 ghostscript_9.56.1~dfsg-1.debian.tar.xz
 1c2d964ec506a91c1c70107addb6a641f298fd4b 12278 ghostscript_9.56.1~dfsg-1_amd64.buildinfo
Checksums-Sha256:
 5005656f92994e3deb6cf4cdad8fc6d6b11759d0b05e86820e66d78a4b5ff041 2609 ghostscript_9.56.1~dfsg-1.dsc
 ea0debe2333e5ea8059719b51055e75a294ffb45767217fa277b36d786585df5 26707316 ghostscript_9.56.1~dfsg.orig.tar.xz
 9e743cec3da70827d95086e240b1494bc3f4509fa2894bc0fa0b2090bfa68c9e 113264 ghostscript_9.56.1~dfsg-1.debian.tar.xz
 a09c01fc62c23ea7d456615900a5f82424c5301b92e173ccec4e822b3c485e97 12278 ghostscript_9.56.1~dfsg-1_amd64.buildinfo
Files:
 5f7485ac990f9dca270d80c30e4084b5 2609 text optional ghostscript_9.56.1~dfsg-1.dsc
 cdecea0cf73fc09aba3158826ea023a7 26707316 text optional ghostscript_9.56.1~dfsg.orig.tar.xz
 ec39ebd0f550f171526d9c86083387e2 113264 text optional ghostscript_9.56.1~dfsg-1.debian.tar.xz
 01651e808cf7be2684b0fa508943f0ed 12278 text optional ghostscript_9.56.1~dfsg-1_amd64.buildinfo

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCgAdFiEEn+Ppw2aRpp/1PMaELHwxRsGgASEFAmJgcigACgkQLHwxRsGg
ASFf5RAAkoyw+ZXQemIOXsdRl3nsqMR/eLxymRfBQQWeHkAaUIDFgOa1TPL8gCNg
XV0ZOC+19ev3sURQVKxv+8ZmviW2cRcsXq6u2vgFpoAHDHTSZ8+F3o6V+zIex6nM
JIomGm0douueFQqD+j1E8wxcqazcrseostt0p2duWN60QNYlBKP1OiEkL/5RK/9I
S03ok63KzxeF7oihmnugCcnbiFxa+nk4lHgyRvfJGzqGmYslizsaJxq2lq3pqHex
5GRgkClkZV4diDyPu+rsipA2IBcVdH3BM8nbcZPFjx3lpUsdg4sXTXVF+iGLqFVH
+hkO4cWvxF/tZ+KVNfepbsvCYoei74FAvbKxWlFbMehO++Bolns95ZazQhMFZFpE
RZCee44RH7asx77ObdTJC7jZV/cuk8y1exnF2IIjhung2GJUjcC8fGeSgYHA4FsO
2mCMr1ynC+QFs+834LZ60idamGlsKrnrhQyatr4ZghPfjMZolIQ791RadeBbHw7r
xs4nlQIAE2Nk/RrCcyvm60vMmFfwHo3P3LOP/sdoZ2T6MGAN3d9K4lAqlUHO9ZUb
OOwALT3j3CXHvMlp92zgKwsDxHQuDHEcrS1RCeUSZsT5ZSR/C3rNYieKgwCKtCUK
YwyUNL3Uiab5Xap3NOuFxngwmONKn9F8ns3T0ArKiPcs3d6qjgU=
=9qLU
-----END PGP SIGNATURE-----

--- End Message ---
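
The changelog above attributes the fix to "text rendering mode 3". In PDF content streams that mode is selected with the `3 Tr` operator and means the text is neither filled nor stroked, i.e. invisible, which is how OCR text layers are commonly stored. As a rough illustration only, here is a minimal sketch that lists the strings shown while mode 3 is active. It assumes an already-decompressed content stream, and the helper name `invisible_text` is hypothetical; it is not part of ghostscript or ocrmypdf.

```python
import re

# Matches either "<int> Tr" (set text rendering mode) or "(<string>) Tj" (show text).
# Simplifications (assumptions of this sketch): string literals containing nested or
# escaped parentheses, the TJ/'/" operators, and q/Q graphics-state save/restore
# are all ignored.
_TOKEN = re.compile(rb"(\d+)\s+Tr\b|\(([^()\\]*)\)\s*Tj\b")


def invisible_text(content: bytes) -> list[str]:
    """Return the strings drawn with Tj while text rendering mode 3 is set."""
    mode = 0  # the PDF default rendering mode is 0 (fill)
    found = []
    for match in _TOKEN.finditer(content):
        if match.group(1) is not None:
            mode = int(match.group(1))  # a "<n> Tr" operator changes the mode
        elif mode == 3:
            found.append(match.group(2).decode("latin-1"))
    return found


stream = b"BT /F1 12 Tf 3 Tr (hidden ocr) Tj 0 Tr (visible) Tj ET"
print(invisible_text(stream))
```

A real check would need a proper PDF parser; this only shows what kind of content the failing `has_text` assertions depend on surviving a pdfwrite round trip.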
