Processed: Re: Bug#908937: ghostscript breaks ocrmypdf autopkgtest

2018-09-16 Thread Debian Bug Tracking System
Processing control commands:

> reassign 908937 ocrmypdf
Bug #908937 [src:ghostscript, src:ocrmypdf] ghostscript breaks ocrmypdf 
autopkgtest
Bug reassigned from package 'src:ghostscript, src:ocrmypdf' to 'ocrmypdf'.
No longer marked as found in versions ghostscript/9.25~dfsg-2 and 
ocrmypdf/6.2.3-1.
Ignoring request to alter fixed versions of bug #908937 to the same values 
previously set

-- 
908937: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=908937
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#908937: ghostscript breaks ocrmypdf autopkgtest

2018-09-16 Thread Jonas Smedegaard
Control: reassign 908937 ocrmypdf

Quoting Paul Gevers (2018-09-16 11:25:47)
> ginggs already noted this:
> 
> this patch fixes 1 of the 3 failing tests
> https://github.com/jbarlow83/OCRmyPDF/commit/517b385fe5cb2195023100a807e6f18dc7e6faea#diff-b61a6d542f9036550ba9c401c80f00ef

At http://git.ghostscript.com/?p=ghostpdl.git;a=commit;h=e997c68 linked 
from above the change is described as a deliberate change in 
Ghostscript, so reassigning this to ocrmypdf.


 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/

 [x] quote me freely  [ ] ask before reusing  [ ] keep private




Bug#908937: ghostscript breaks ocrmypdf autopkgtest

2018-09-16 Thread Paul Gevers
Dear Sean,

On 16-09-18 20:30, Sean Whitton wrote:
> Paul: does preventing regressions in testing take precedence?

Normally yes, at least temporarily (we are not blocking yet), but
ghostscript was uploaded with urgency high. That means that regressions
are ignored, and without an RC bug ghostscript will migrate to testing
tomorrow (if my counting is correct).

> If so,
> this bug should be reassigned to ghostscript and raised to RC severity.

I don't follow what you mean by this sentence.

> But ISTM that ocrmypdf is less important, so ghostscript should be
> allowed to migrate and ocrmypdf should be kicked out.

ocrmypdf can stay in testing as long as it doesn't have an RC bug itself.

So just to make it clear: if this change makes ocrmypdf totally
unusable and you still want ghostscript to migrate to testing to fix
multiple CVEs, then assigning this bug to ocrmypdf and raising it to RC
level will start the autoremoval process. If you think it is worth
searching for a solution in ghostscript to avoid it breaking ocrmypdf,
then reassign this bug to ghostscript and raise the severity to RC level
to avoid migration.
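
The options above can be summarised as a small decision function. This
is only an illustrative sketch: the function name, its parameters and
the returned strings are made up for this example and are not part of
any Debian tooling.

```python
def outcome(assigned_to: str, release_critical: bool) -> str:
    """Summarise the options described above (illustrative only)."""
    if not release_critical:
        # Urgency-high upload: the autopkgtest regression alone does not
        # block migration, and ocrmypdf has no RC bug of its own.
        return "ghostscript migrates; ocrmypdf stays in testing"
    if assigned_to == "ocrmypdf":
        # RC bug on ocrmypdf: ghostscript is unaffected, ocrmypdf is not.
        return "ghostscript migrates; ocrmypdf enters autoremoval"
    if assigned_to == "ghostscript":
        # RC bug on ghostscript: the migration is held back.
        return "ghostscript is blocked from migrating"
    raise ValueError(f"unexpected package: {assigned_to}")


for package, rc in [("ocrmypdf", False), ("ocrmypdf", True), ("ghostscript", True)]:
    print(package, rc, "->", outcome(package, rc))
```

The sketch leaves out timing (the migration delay) and anything else
the release tooling considers; it only mirrors the cases discussed in
this mail.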

Paul





Bug#908937: ghostscript breaks ocrmypdf autopkgtest

2018-09-16 Thread Sean Whitton
Hello,

On Sun 16 Sep 2018 at 11:20AM +0200, Paul Gevers wrote:

> With a recent upload of ghostscript [9.25] the autopkgtest of ocrmypdf
> [6.2.3] fails in testing when that autopkgtest is run with the binary
> packages of ghostscript from unstable. It passes when run with only
> packages from testing. I copied some of the output at the bottom of
> this report.

James: how easily could we backport the GhostScript change to
OCRmyPDF 6.x series?

If not easily, and you still think that pikepdf's API will stabilise in
time for Debian's upcoming stable release, maybe OCRmyPDF should just be
kept out of testing for the time being.

Paul: does preventing regressions in testing take precedence?  If so,
this bug should be reassigned to ghostscript and raised to RC severity.
But ISTM that ocrmypdf is less important, so ghostscript should be
allowed to migrate and ocrmypdf should be kicked out.

-- 
Sean Whitton




Bug#908937: ghostscript breaks ocrmypdf autopkgtest

2018-09-16 Thread Paul Gevers
ginggs already noted this:

this patch fixes 1 of the 3 failing tests
https://github.com/jbarlow83/OCRmyPDF/commit/517b385fe5cb2195023100a807e6f18dc7e6faea#diff-b61a6d542f9036550ba9c401c80f00ef

Paul






Bug#908937: ghostscript breaks ocrmypdf autopkgtest

2018-09-16 Thread Paul Gevers
Source: ghostscript, ocrmypdf
Control: found -1 ghostscript/9.25~dfsg-2
Control: found -1 ocrmypdf/6.2.3-1
X-Debbugs-CC: debian...@lists.debian.org
User: debian...@lists.debian.org
Usertags: breaks needs-update

Dear maintainers,

With a recent upload of ghostscript the autopkgtest of ocrmypdf fails in
testing when that autopkgtest is run with the binary packages of
ghostscript from unstable. It passes when run with only packages from
testing. I copied some of the output at the bottom of this report.

As ghostscript was uploaded with urgency high, this regression is NOT
delaying the migration of ghostscript to testing [1]. If this
regression requires blocking ghostscript from migrating to testing,
fast action is required (raising the severity of this bug should be
enough, although I haven't tested RC blockage when bugs are assigned to
multiple packages). Due to the nature of this issue, I filed this bug
report against both packages. Can you please investigate the situation
and reassign the bug to the right package? As needed, please change the
bug's severity.

More information about this bug and the reason for filing it can be found on
https://wiki.debian.org/ContinuousIntegration/RegressionEmailInformation

Paul

[1] https://qa.debian.org/excuses.php?package=ghostscript

https://ci.debian.net/data/autopkgtest/testing/amd64/o/ocrmypdf/1000885/log.gz

=== FAILURES ===
___ test_compression_changed[congress.jpg-lossless] ___

spoof_tesseract_noop = {'ADTTMP': '/tmp/autopkgtest-lxc.bj5c8t8i/downtmp/autopkgtest_tmp', 'ADT_ARTIFACTS': '/tmp/autopkgtest-lxc.bj5c8t8i/do...j5c8t8i/downtmp/test-suite-artifacts', 'AUTOPKGTEST_TMP': '/tmp/autopkgtest-lxc.bj5c8t8i/downtmp/autopkgtest_tmp', ...}
ocrmypdf_exec = ['/usr/bin/python3', '-m', 'ocrmypdf']
resources = PosixPath('/tmp/autopkgtest-lxc.bj5c8t8i/downtmp/build.24A/src/tests/resources')
image = 'congress.jpg', compression = 'lossless'
outpdf = '/tmp/pytest-of-debci/pytest-0/test_compression_changed_congr0/out.pdf'

    @pytest.mark.parametrize('image,compression', [
        ('baiona.png', 'jpeg'),
        ('baiona_gray.png', 'lossless'),
        ('congress.jpg', 'lossless')
    ])
    def test_compression_changed(spoof_tesseract_noop, ocrmypdf_exec,
                                 resources, image, compression, outpdf):
        from PIL import Image

        input_file = str(resources / image)
        output_file = str(outpdf)

        im = Image.open(input_file)

        # Runs: ocrmypdf - output.pdf < testfile
        with open(input_file, 'rb') as input_stream:
            p_args = ocrmypdf_exec + [
                '--image-dpi', '150', '--output-type', 'pdfa',
                '--pdfa-image-compression', compression,
                '-', output_file]
            p = Popen(
                p_args, close_fds=True, stdout=PIPE, stderr=PIPE,
                stdin=input_stream, env=spoof_tesseract_noop)
            out, err = p.communicate()

            assert p.returncode == ExitCode.ok, err

        pdfinfo = PdfInfo(output_file)

        pdfimage = pdfinfo[0].images[0]

        if compression == "jpeg":
            assert pdfimage.enc == Encoding.jpeg
        else:
            if ghostscript.jpeg_passthrough_available():
                # Ghostscript 9.23 adds JPEG passthrough, which allows a JPEG to be
                # copied without transcoding - so report
                if image.endswith('jpg'):
                    assert pdfimage.enc == Encoding.jpeg
            else:
>               assert pdfimage.enc not in (Encoding.jpeg, Encoding.jpeg2000)
E               AssertionError: assert  not in (, )
E                +  where  = .enc

tests/test_main.py:917: AssertionError
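
For readers following the log: the failing assertion appears to sit in
the branch taken when ghostscript.jpeg_passthrough_available() returns
false, since a 'congress.jpg' input could not otherwise reach it. The
sketch below restates that branch structure as a standalone function.
It is an assumption-laden illustration: the function name is
hypothetical, plain strings stand in for the Encoding enum, and the
nesting is inferred from the flattened log rather than taken from the
OCRmyPDF source.

```python
def expected_encodings(image: str, compression: str, passthrough: bool):
    """Return (required, forbidden) encodings for the test's assertions.

    'required' is the encoding the output image must have, or None if
    the test does not demand a specific one; 'forbidden' lists encodings
    the output image must not have.
    """
    if compression == "jpeg":
        return "jpeg", ()
    if passthrough:
        # With passthrough, a JPEG input is expected to stay a JPEG;
        # other inputs are not checked in this branch.
        if image.endswith("jpg"):
            return "jpeg", ()
        return None, ()
    # Without passthrough, lossless output must not be JPEG-encoded.
    return None, ("jpeg", "jpeg2000")


# The failing case from the log, assuming passthrough was reported as unavailable:
print(expected_encodings("congress.jpg", "lossless", passthrough=False))
```

Under that assumption the test forbids a JPEG-encoded image for this
case, so the assertion would fail if the output image were still stored
as JPEG.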
___ test_preserve_metadata[pdfa] ___

spoof_tesseract_noop = {'ADTTMP': '/tmp/autopkgtest-lxc.bj5c8t8i/downtmp/autopkgtest_tmp', 'ADT_ARTIFACTS': '/tmp/autopkgtest-lxc.bj5c8t8i/do...j5c8t8i/downtmp/test-suite-artifacts', 'AUTOPKGTEST_TMP': '/tmp/autopkgtest-lxc.bj5c8t8i/downtmp/autopkgtest_tmp', ...}
output_type = 'pdfa'
resources = PosixPath('/tmp/autopkgtest-lxc.bj5c8t8i/downtmp/build.24A/src/tests/resources')
outpdf = '/tmp/pytest-of-debci/pytest-0/test_preserve_metadata_pdfa_0/out.pdf'

    @pytest.mark.parametrize("output_type", [
        'pdfa', 'pdf'
    ])
    def test_preserve_metadata(spoof_tesseract_noop, output_type,
                               resources, outpdf):
        pdf_before = pypdf.PdfFileReader(str(resources / 'graph.pdf'))

        output = check_ocrmypdf(
            resources / 'graph.pdf', outpdf,
            '--output-type', output_type,
            env=spoof_tesseract_noop)

        pdf_after = pypdf.PdfFileReader(str(output))

        for key in ('/Title', '/Author'):
>           assert pdf_before.documentInfo[key] == pdf_after.documentInfo[key]

tests/test_metadata.py:52:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _