Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-01 Thread Vincent Lefevre
On 2021-10-01 16:52:13 +0200, Jonas Smedegaard wrote:
> Quoting Vincent Lefevre (2021-10-01 15:53:38)
> > It seems that the issue partly comes from pdflatex: On an old file for 
> > which ps2pdf was correct with ghostscript 9.53.3~dfsg-4, it is now 
> > incorrect, still with ghostscript 9.53.3~dfsg-4. But if I regenerate 
> > the intermediate PDF file on an old Debian machine and transfer it to 
> > my current machine, ps2pdf is correct with ghostscript 9.53.3~dfsg-4 
> > and with ghostscript 9.53.3~dfsg-7 (stable), and also with ghostscript 
> > 9.54.0~dfsg-5.
> 
> Are you sure you mean 9.53.3~dfsg-7, not 9.53.3~dfsg-7+deb11u1?

Yes, I used "apt .../stable", and it was 9.53.3~dfsg-7 that was
fetched, not the security update.

zira:~> apt-show-versions -a ghostscript
ghostscript:amd64 9.54.0~dfsg-5 install ok installed
ghostscript:amd64 9.53.3~dfsg-7 stable  ftp.debian.org
ghostscript:amd64 9.53.3~dfsg-7+deb11u1 stable-security security.debian.org
No stable-updates version
ghostscript:amd64 9.54.0~dfsg-5 testing ftp.debian.org
ghostscript:amd64 9.54.0~dfsg-5 unstable ftp.debian.org
ghostscript:amd64 9.55.0~~rc1~dfsg-1 experimental ftp.debian.org
ghostscript:amd64/testing 9.54.0~dfsg-5 uptodate

Perhaps I should have used "/stable-security".

> Some upstream changes were backported for -7 and other changes were 
> introduced by -7+deb11u1: 
> https://tracker.debian.org/media/packages/g/ghostscript/changelog-9.53.3dfsg-7deb11u1
> 
> Possibly related to the recent changes to Ghostscript's SAFER: 
> https://www.ghostscript.com/doc/9.55.0/Use.htm#Safer
> 
> Perhaps recent pdflatex was adapted to handle the change to SAFER, and 
> in doing so became dependent on recent Ghostscript (and perhaps that was 
> then not reflected in packaging of pdflatex)?

Anyway, I doubt that this is related to the font / mapping issue.

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)



Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-01 Thread Jonas Smedegaard
Quoting Vincent Lefevre (2021-10-01 15:53:38)
> On 2021-10-01 14:31:57 +0200, Vincent Lefevre wrote:
> > On 2021-10-01 14:26:02 +0200, Vincent Lefevre wrote:
> > > On 2021-10-01 14:17:53 +0200, Vincent Lefevre wrote:
> > > > In my archives, I can see that the issue occurred with 
> > > > ghostscript 9.26a~dfsg-0+deb9u1, but 9.27~dfsg-2+deb10u4 isn't 
> > > > affected on my second testcase.
> > > 
> > > The following PDF file (on which I got the issue with ghostscript 
> > > 9.26a~dfsg-0+deb9u1) may be a useful testcase:
> > > 
> > >   https://hal.archives-ouvertes.fr/hal-02001080v1/document
> > 
> > On this testcase, the issue is actually reproducible with 
> > ghostscript 9.27~dfsg-2+deb10u4!
> 
> It seems that the issue partly comes from pdflatex: On an old file for 
> which ps2pdf was correct with ghostscript 9.53.3~dfsg-4, it is now 
> incorrect, still with ghostscript 9.53.3~dfsg-4. But if I regenerate 
> the intermediate PDF file on an old Debian machine and transfer it to 
> my current machine, ps2pdf is correct with ghostscript 9.53.3~dfsg-4 
> and with ghostscript 9.53.3~dfsg-7 (stable), and also with ghostscript 
> 9.54.0~dfsg-5.

Are you sure you mean 9.53.3~dfsg-7, not 9.53.3~dfsg-7+deb11u1?

Some upstream changes were backported for -7 and other changes were 
introduced by -7+deb11u1: 
https://tracker.debian.org/media/packages/g/ghostscript/changelog-9.53.3dfsg-7deb11u1

Possibly related to the recent changes to Ghostscript's SAFER: 
https://www.ghostscript.com/doc/9.55.0/Use.htm#Safer

Perhaps recent pdflatex was adapted to handle the change to SAFER, and 
in doing so became dependent on recent Ghostscript (and perhaps that was 
then not reflected in packaging of pdflatex)?


 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/

 [x] quote me freely  [ ] ask before reusing  [ ] keep private



Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-01 Thread Vincent Lefevre
On 2021-10-01 14:31:57 +0200, Vincent Lefevre wrote:
> On 2021-10-01 14:26:02 +0200, Vincent Lefevre wrote:
> > On 2021-10-01 14:17:53 +0200, Vincent Lefevre wrote:
> > > In my archives, I can see that the issue occurred with
> > > ghostscript 9.26a~dfsg-0+deb9u1, but 9.27~dfsg-2+deb10u4
> > > isn't affected on my second testcase.
> > 
> > The following PDF file (on which I got the issue with
> > ghostscript 9.26a~dfsg-0+deb9u1) may be a useful testcase:
> > 
> >   https://hal.archives-ouvertes.fr/hal-02001080v1/document
> 
> On this testcase, the issue is actually reproducible with
> ghostscript 9.27~dfsg-2+deb10u4!

It seems that the issue partly comes from pdflatex: On an old
file for which ps2pdf was correct with ghostscript 9.53.3~dfsg-4,
it is now incorrect, still with ghostscript 9.53.3~dfsg-4. But if
I regenerate the intermediate PDF file on an old Debian machine
and transfer it to my current machine, ps2pdf is correct with
ghostscript 9.53.3~dfsg-4 and with ghostscript 9.53.3~dfsg-7
(stable), and also with ghostscript 9.54.0~dfsg-5.




Bug#990210: fixed in cups-pdf 3.0.1-12

2021-10-01 Thread Till Kamppeter

Sorry, I had overlooked the link in the very first post.

Also thanks for the patch which shows how cups-filters (most probably 
pstops) massages the file.


The file actually has 993 pages:

$ gs -q -dBATCH -dNOPAUSE -sDEVICE=bbox all.ps 2>&1 | grep %%BoundingBox: | wc -l

993

or simply display it with

gs all.ps

(and press Enter 993 times).

evince also shows only the 422 pages which your PostScript viewer shows 
to you.


The file has strange internal page numbering:

$ grep -i '%%Page: ' all.ps | wc -l
993
$ grep -i '%%Page: ' all.ps

It wraps "showpage" (the PostScript operator that displays/prints a page 
once it has been completely rendered) in a short procedure named "p":


$ grep showpage all.ps | wc -l

1
$ grep showpage all.ps
/p{pop showpage pagesave restore /pagesave save def}def

So a single "p" displays/prints the page.

So let us search for those "p"s:

$ grep ' p$' all.ps | wc -l

993

So Ghostscript (or the print process) outputting 993 pages seems correct 
to me, and I do not understand why evince and also your PostScript 
viewer only output 422 pages. Perhaps they consider duplicate page 
numbers as duplicate pages and skip them.


First numbers in "%%Page:" lines:

$ grep -i '%%Page: ' all.ps | cut -d ' ' -f 2 | sort | uniq | wc -l

422


Second numbers in "%%Page:" lines:

$ grep -i '%%Page: ' all.ps | cut -d ' ' -f 3 | sort | uniq | wc -l
422

The changes made by cups-filters (the pstops filter) mainly affect only 
the DSC comments: the second number in the "%%Page:" lines now runs from 
1 to 993 instead of being the same as the first number and restarting 
from 1 again and again. This seems to make the viewers accept all pages.
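
The "%%Page:" counts above can also be obtained in one pass. The following is only a minimal sketch, assuming every page comment has the form "%%Page: <first number> <second number>" on a line of its own; the function name and the sample text are made up for illustration and are not taken from all.ps.

```python
import re

# Matches DSC page comments of the assumed form "%%Page: <first> <second>".
PAGE_RE = re.compile(r"^%%Page:[ \t]+(\S+)[ \t]+(\S+)", re.IGNORECASE | re.MULTILINE)


def summarize_dsc_pages(ps_text):
    """Return (total, distinct first numbers, distinct second numbers)."""
    pages = PAGE_RE.findall(ps_text)
    firsts = {first for first, _ in pages}
    seconds = {second for _, second in pages}
    return len(pages), len(firsts), len(seconds)


# Hypothetical miniature of the situation: numbering restarts from 1.
sample = "\n".join(
    f"%%Page: {n} {n}\n... page content ...\n p" for n in (1, 2, 3, 1, 2)
)
print(summarize_dsc_pages(sample))
```

For a real file, the text could be read with open("all.ps", encoding="latin-1").read(). Going by the grep results above, all.ps would be expected to give 993 page comments but only 422 distinct values for each number.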


I hope this gives some insight.

On 01/10/2021 13:11, Andre Heider wrote:
> Hi Till,
>
> On 01/10/2021 12:32, Till Kamppeter wrote:
> > Andre, could you attach your PostScript file, once the original and
> > also the one you get after pre-processing when using "GSCall echo %s
> > %s %s; cp %s /tmp"? Thanks.
>
> attached a patch for the original .ps file, see the first post for a link.
>
> But maybe that patch already hints at the problem?
>
> Cheers,
> Andre




Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-01 Thread Vincent Lefevre
On 2021-10-01 14:26:02 +0200, Vincent Lefevre wrote:
> On 2021-10-01 14:17:53 +0200, Vincent Lefevre wrote:
> > In my archives, I can see that the issue occurred with
> > ghostscript 9.26a~dfsg-0+deb9u1, but 9.27~dfsg-2+deb10u4
> > isn't affected on my second testcase.
> 
> The following PDF file (on which I got the issue with
> ghostscript 9.26a~dfsg-0+deb9u1) may be a useful testcase:
> 
>   https://hal.archives-ouvertes.fr/hal-02001080v1/document

On this testcase, the issue is actually reproducible with
ghostscript 9.27~dfsg-2+deb10u4!




Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-01 Thread Vincent Lefevre
On 2021-10-01 14:17:53 +0200, Vincent Lefevre wrote:
> In my archives, I can see that the issue occurred with
> ghostscript 9.26a~dfsg-0+deb9u1, but 9.27~dfsg-2+deb10u4
> isn't affected on my second testcase.

The following PDF file (on which I got the issue with
ghostscript 9.26a~dfsg-0+deb9u1) may be a useful testcase:

  https://hal.archives-ouvertes.fr/hal-02001080v1/document




Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-01 Thread Vincent Lefevre
In my archives, I can see that the issue occurred with
ghostscript 9.26a~dfsg-0+deb9u1, but 9.27~dfsg-2+deb10u4
isn't affected on my second testcase.




Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-01 Thread Vincent Lefevre
On 2021-10-01 12:05:28 +0200, Vincent Lefevre wrote:
> Well, with 9.53.3~dfsg-8, I can reproduce the bug on another PDF file,
> where it is the U+2019 RIGHT SINGLE QUOTATION MARK character (used as
> an apostrophe) that is incorrectly replaced by Š. I'll have to make
> another simple testcase.

The LaTeX source generating the testcase:

\documentclass[12pt]{article}
\usepackage{lmodern}
\begin{document}
\thispagestyle{empty}
Don't ff.
\end{document}

I've attached this testcase "chartest2.pdf", and the incorrect file
"chartest2-gs.pdf" obtained with

  ps2pdf chartest2.pdf chartest2-gs.pdf
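
To report which characters differ between the intended text and the text extracted from a converted file (for instance with pdftotext), something like the following may help. It is only a minimal sketch: the two strings are hypothetical stand-ins, the second one imitating the reported replacement of U+2019 by Š, and the function names are made up for illustration.

```python
import unicodedata


def describe(ch):
    """Format a character as 'U+XXXX NAME'."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}"


def diff_chars(expected, actual):
    """List (position, expected, actual) for characters that differ.

    Only meaningful when both strings have the same length, i.e. when
    characters were substituted rather than dropped or inserted.
    """
    return [
        (i, describe(e), describe(a))
        for i, (e, a) in enumerate(zip(expected, actual))
        if e != a
    ]


expected = "Don\u2019t ff."  # hypothetical: the text as typeset
actual = "Don\u0160t ff."    # hypothetical: U+2019 replaced by Š
for pos, want, got in diff_chars(expected, actual):
    print(f"position {pos}: expected {want}, got {got}")
```

With real data, both strings would come from running the same extraction tool on chartest2.pdf and chartest2-gs.pdf.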



chartest2.pdf
Description: Adobe PDF document


chartest2-gs.pdf
Description: Adobe PDF document


Bug#990210: fixed in cups-pdf 3.0.1-12

2021-10-01 Thread Andre Heider

Hi Till,

On 01/10/2021 12:32, Till Kamppeter wrote:
> Andre, could you attach your PostScript file, once the original and also
> the one you get after pre-processing when using "GSCall echo %s %s %s;
> cp %s /tmp"? Thanks.


attached a patch for the original .ps file, see the first post for a link.

But maybe that patch already hints at the problem?

Cheers,
Andre

all.patch.gz
Description: application/gzip


Bug#990210: fixed in cups-pdf 3.0.1-12

2021-10-01 Thread Till Kamppeter
Andre, could you attach your PostScript file, once the original and also 
the one you get after pre-processing when using "GSCall echo %s %s %s;
cp %s /tmp"? Thanks.


--

On 28/09/2021 14:20, Andre Heider wrote:

Indeed, still only getting an empty pdf on that file too.

That's a different problem from the empty pdf files I was seeing before.
That issue prevented cups-pdf from producing non-empty files at all, no
matter what the input was. Even the cups test page failed.

Now it seems specific to this file. My pdf viewer atril claims the
original ps has 422 pages. If I fix it up with `ps2ps all.ps all2.ps`
the new file reports 993 pages, atril even shows another first page...

And printing that with `lp -dPDF all2.ps` works just fine.

The error handling from cups-pdf sure is funky here.

Cheers,
Andre

--

On 29/09/2021 09:11, Andre Heider wrote:

Did some more digging since I reused this bug because I thought it's the
same issue...

If you set the GSCall config to the default value but append
"1>/tmp/gs.out 2>&1" you can see an error in that file:

--- 8< ---
Error: /nocurrentpoint in --currentpoint--
Operand stack:
   (--)   80
Execution stack:
   %interp_exit   .runexec2   --nostringval--   --nostringval--
--nostringval--   2   %stopped_push   --nostringval--   --nostringval--
  --nostringval--   false   1   %stopped_push   1990   1   3
%oparray_pop   1989   1   3   %oparray_pop   1977   1   3   %oparray_pop
  1833   1   3   %oparray_pop   --nostringval--   %errorexec_pop
.runexec2   --nostringval--   --nostringval--   --nostringval--   2
%stopped_push   --nostringval--   --nostringval--
Dictionary stack:
   --dict:736/1123(ro)(G)--   --dict:0/20(G)--   --dict:89/200(L)--
--dict:63/140(L)--
Current allocation mode is local
Current file position is 10444
GPL Ghostscript 9.54.0: Unrecoverable error, exit code 1
--- 8< ---

The reason it fails with lp but succeeds with the manual gs cmdline is
that cups preprocesses the input file. If you use "GSCall echo %s %s %s;
cp %s /tmp" it'll copy the actual file used for cups-pdf to /tmp, for
which you then get the same error if you manually use the gs cmdline on it.

I don't know enough about the printing stack/postscript to tell if
that's fixable, but it all sounds like a corrupt ps file to me.



Processed: Re: Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-01 Thread Debian Bug Tracking System
Processing control commands:

> found -1 9.53.3~dfsg-8
Bug #995392 [ghostscript] ghostscript: ps2pdf trashes some characters
Marked as found in versions ghostscript/9.53.3~dfsg-8.

-- 
995392: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=995392
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-01 Thread Vincent Lefevre
Control: found -1 9.53.3~dfsg-8

On 2021-09-30 22:00:47 +, JustAnotherArchivist wrote:
> Apologies, I somehow missed the part about pdftotext and the glyph's normal
> appearance in your original message. I can reproduce that with both files
> produced by 9.54.0~dfsg-5 but *not* the one produced by 9.53.3~dfsg-8
> (attached for reference), using the same pdftotext version (poppler-utils
> 20.09.0-3.1) for all files.

Well, with 9.53.3~dfsg-8, I can reproduce the bug on another PDF file,
where it is the U+2019 RIGHT SINGLE QUOTATION MARK character (used as
an apostrophe) that is incorrectly replaced by Š. I'll have to make
another simple testcase.
