Bug#995392: ghostscript: ps2pdf trashes some characters

2021-11-04 Thread Jonas Smedegaard
Quoting Vincent Lefevre (2021-11-04 16:49:34)
> On 2021-11-03 18:06:50 +0100, Jonas Smedegaard wrote:
> > Quoting Vincent Lefevre (2021-11-03 14:29:26)
> > > This Debian bug actually covers several similar Ghostscript bugs.
> > 
> > Please track each bug separately.  Otherwise it is not possible to 
> > reliably track which bug affects which packaging releases.
> 
> Done:
>   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=998458
>   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=998461
> 
> I'm going to do tests against various ghostscript versions to see
> which ones are affected (I've currently only tested the experimental
> version ghostscript/9.55.0~~rc1~dfsg-1 for these bug reports).

Thanks!

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/

 [x] quote me freely  [ ] ask before reusing  [ ] keep private

signature.asc
Description: signature


Bug#995392: ghostscript: ps2pdf trashes some characters

2021-11-04 Thread Vincent Lefevre
On 2021-11-03 18:06:50 +0100, Jonas Smedegaard wrote:
> Quoting Vincent Lefevre (2021-11-03 14:29:26)
> > This Debian bug actually covers several similar Ghostscript bugs.
> 
> Please track each bug separately.  Otherwise it is not possible to 
> reliably track which bug affects which packaging releases.

Done:
  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=998458
  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=998461

I'm going to do tests against various ghostscript versions to see
which ones are affected (I've currently only tested the experimental
version ghostscript/9.55.0~~rc1~dfsg-1 for these bug reports).

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)



Bug#995392: ghostscript: ps2pdf trashes some characters

2021-11-03 Thread Jonas Smedegaard
Quoting Vincent Lefevre (2021-11-03 14:29:26)
> This Debian bug actually covers several similar Ghostscript bugs.

Please track each bug separately.  Otherwise it is not possible to 
reliably track which bug affects which packaging releases.


 - Jonas



Bug#995392: ghostscript: ps2pdf trashes some characters

2021-11-03 Thread Vincent Lefevre
Control: forwarded -1 https://bugs.ghostscript.com/show_bug.cgi?id=704681

On 2021-11-03 14:29:26 +0100, Vincent Lefevre wrote:
> Control: retitle -1 ghostscript: pdfwrite incorrectly deals with embedded 
> ToUnicode CMap
> Control: found -1 9.27~dfsg-2+deb10u4
> 
> This Debian bug actually covers several similar Ghostscript bugs.
> I consider the most general bug given by the testcase below,
> which is still not fixed upstream.

So I should move it to the new upstream bug URL.




Bug#995392: ghostscript: ps2pdf trashes some characters

2021-11-03 Thread Vincent Lefevre
Control: retitle -1 ghostscript: pdfwrite incorrectly deals with embedded ToUnicode CMap
Control: found -1 9.27~dfsg-2+deb10u4

This Debian bug actually covers several similar Ghostscript bugs.
I consider the most general bug given by the testcase below,
which is still not fixed upstream.

On 2021-11-03 05:04:36 +0100, Vincent Lefevre wrote:
> \documentclass{article}
> \usepackage[T1]{fontenc}
> \usepackage{lmodern}
> \pdfglyphtounicode{Scaron}{0160}
> \pdfgentounicode=1
> \begin{document}
> \thispagestyle{empty}
> 'ê
> \end{document}

This testcase shows that this bug is not new and could have always
been present in Ghostscript. The fact is that I had never used
\pdfglyphtounicode{Scaron}{0160} before (I don't need it), and
I noticed this bug only due to TeX Live 2021, which now uses this
mapping (among others), unless all mappings are disabled by the
user with an explicit \pdfgentounicode=0.

This remaining bug (the other ones being recently fixed upstream)
might be specific to this mapping. A partial cause may be that
the ' character is transformed to a /quoteright, which leads to
a /Differences that confuses Ghostscript when generating the
ToUnicode CMap.
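
To make the symptom concrete, here is a minimal sketch of how a text
extractor reads the `bfchar` entries of a ToUnicode CMap. The two CMap
fragments are hypothetical (they are not taken from the attached PDF files),
the character codes are chosen for illustration only, and `parse_bfchar` is
a name invented for this example. The second fragment shows what an
extractor would report if the entry for the apostrophe's character code
pointed at U+0160 instead of U+2019.

```python
import re

# A bfchar section looks like: "N beginbfchar <src> <dst> ... endbfchar".
BFCHAR_BLOCK = re.compile(r"beginbfchar(.*?)endbfchar", re.DOTALL)
BFCHAR_PAIR = re.compile(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>")


def parse_bfchar(cmap_text: str) -> dict[int, str]:
    """Map each source character code to its UTF-16BE decoded destination."""
    mapping: dict[int, str] = {}
    for block in BFCHAR_BLOCK.findall(cmap_text):
        for src, dst in BFCHAR_PAIR.findall(block):
            mapping[int(src, 16)] = bytes.fromhex(dst).decode("utf-16-be")
    return mapping


# Hypothetical fragments, written by hand for this illustration.
GOOD_CMAP = """
2 beginbfchar
<27> <2019>
<EA> <00EA>
endbfchar
"""
# Same fragment with the apostrophe entry pointing at U+0160 (Scaron).
BAD_CMAP = GOOD_CMAP.replace("<27> <2019>", "<27> <0160>")

for label, cmap in (("good", GOOD_CMAP), ("bad", BAD_CMAP)):
    decoded = parse_bfchar(cmap)
    print(label, {hex(code): text for code, text in decoded.items()})
```

The glyphs drawn on the page are unaffected by such a mapping, which matches
the observation that the output looks fine but copies or searches wrongly.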




Bug#995392: ghostscript: ps2pdf trashes some characters

2021-11-02 Thread Vincent Lefevre
On 2021-11-03 03:29:43 +0100, Vincent Lefevre wrote:
> On 2021-11-02 16:25:27 +0100, Vincent Lefevre wrote:
> > With commit 8f62213019bc682eeb0ed9467d8841f3770cfda6 upstream,
> > I can no longer reproduce any issue, even when
> > /usr/share/texlive/texmf-dist/tex/generic/pdftex/glyphtounicode.tex
> > from TeX Live 2020 is included and \pdfgentounicode=1 is used.
> 
> Hmm... I didn't check carefully. On one of my files, there is
> actually one place where the quoteright (used for the apostrophe)
> is replaced by "Š" (checked with pdftotext, xpdf and atril). The
> cause may be that the paragraph in question is in a smaller font.

I have an explanation: it seems that in this smaller font,
no ligatures (ff, ffi, fl...) are used.

In a recent fix, Ghostscript no longer generates a ToUnicode CMap
when there are \pdfglyphtounicode with more than 2 bytes (such as
those used for the ligatures). So this fix made the bug disappear
when ligatures are used. But the bug was still there, and visible
when ligatures are not used.
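
The "more than 2 bytes" distinction can be illustrated with a small sketch.
It assumes that ligature entries are written as space-separated hexadecimal
code points, for example `\pdfglyphtounicode{ff}{0066 0066}`; only the
Scaron line below appears in this thread, the two ligature lines are
assumed. The helper names are invented for the example, and the code only
measures destination lengths; it does not reproduce Ghostscript's logic.

```python
import re

# Matches \pdfglyphtounicode{<glyph name>}{<hex code points>}
ENTRY = re.compile(r"\\pdfglyphtounicode\{([^}]+)\}\{([^}]+)\}")


def utf16_length(destination: str) -> int:
    """Number of UTF-16BE bytes needed for a space-separated code point list."""
    return sum(
        len(chr(int(code_point, 16)).encode("utf-16-be"))
        for code_point in destination.split()
    )


def destination_lengths(tex_source: str) -> dict[str, int]:
    """Map each glyph name to the byte length of its Unicode destination."""
    return {name: utf16_length(dest) for name, dest in ENTRY.findall(tex_source)}


SAMPLE = r"""
\pdfglyphtounicode{Scaron}{0160}
\pdfglyphtounicode{ff}{0066 0066}
\pdfglyphtounicode{ffi}{0066 0066 0069}
"""

for glyph, size in destination_lengths(SAMPLE).items():
    kind = "single 2-byte value" if size <= 2 else "longer than 2 bytes"
    print(f"{glyph}: {size} bytes ({kind})")
```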

> So the issue is still visible in practice.
> 
> I'll try to produce a simple testcase.

Here it is:

\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{lmodern}
\pdfglyphtounicode{Scaron}{0160}
\pdfgentounicode=1
\begin{document}
\thispagestyle{empty}
'ê
\end{document}

(Tested on the PDF generated by pdflatex from TeX Live 2020.)

My new upstream bug report:

  https://bugs.ghostscript.com/show_bug.cgi?id=704681




Bug#995392: ghostscript: ps2pdf trashes some characters

2021-11-02 Thread Vincent Lefevre
Control: tags -1 - fixed-upstream

On 2021-11-02 16:25:27 +0100, Vincent Lefevre wrote:
> With commit 8f62213019bc682eeb0ed9467d8841f3770cfda6 upstream,
> I can no longer reproduce any issue, even when
> /usr/share/texlive/texmf-dist/tex/generic/pdftex/glyphtounicode.tex
> from TeX Live 2020 is included and \pdfgentounicode=1 is used.

Hmm... I didn't check carefully. On one of my files, there is
actually one place where the quoteright (used for the apostrophe)
is replaced by "Š" (checked with pdftotext, xpdf and atril). The
cause may be that the paragraph in question is in a smaller font.
So the issue is still visible in practice.

I'll try to produce a simple testcase.




Bug#995392: ghostscript: ps2pdf trashes some characters

2021-11-02 Thread Vincent Lefevre
On 2021-11-02 16:25:27 +0100, Vincent Lefevre wrote:
> With commit 8f62213019bc682eeb0ed9467d8841f3770cfda6 upstream,
> I can no longer reproduce any issue, even when
> /usr/share/texlive/texmf-dist/tex/generic/pdftex/glyphtounicode.tex
> from TeX Live 2020 is included and \pdfgentounicode=1 is used.
[...]

To be clear, both commit 8f62213019bc682eeb0ed9467d8841f3770cfda6
and the older commit b4e8434defb8e05ea05bb130b92217290efd2fba
should be needed, and together they should be sufficient, to solve
all the issues.

Commit b4e8434defb8e05ea05bb130b92217290efd2fba fixed

  https://bugs.ghostscript.com/show_bug.cgi?id=704478

triggered by the first simple testcase in this bug (the full story
is that by reducing my testcase, I inadvertently found this other
related issue).




Bug#995392: ghostscript: ps2pdf trashes some characters

2021-11-02 Thread Vincent Lefevre
Control: found -1 9.55.0~~rc1~dfsg-1
Control: tags -1 fixed-upstream

With commit 8f62213019bc682eeb0ed9467d8841f3770cfda6 upstream,
I can no longer reproduce any issue, even when
/usr/share/texlive/texmf-dist/tex/generic/pdftex/glyphtounicode.tex
from TeX Live 2020 is included and \pdfgentounicode=1 is used.
However, note that:
  * Yesterday, I could still find an issue, though I could have made
    a mistake in my test (e.g. looking at an old file): I've just
    tested the same .tex file (with the same included files), and
    the issue no longer appears.
  * The upstream bug is still open as an enhancement. Apparently,
    Ghostscript no longer generates a (buggy) ToUnicode CMap, and
    this could yield issues, but in practice on my files, everything
    seems fine with xpdf, atril and pdftotext (not sure what happens,
    but if they are using heuristics, they are working fine).

So I'm tagging this as fixed-upstream (the upstream bug should be
sufficient for the enhancement). Note that 9.55.0~~rc1~dfsg-1 from
experimental does not have the commit mentioned above, and I've
checked that this version is still incorrect.

I haven't tested with PDF generated by TeX Live 2021 yet, but I don't
expect any issue.




Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-28 Thread Vincent Lefevre
Control: forwarded -1 https://bugs.ghostscript.com/show_bug.cgi?id=704674
Control: tags -1 - fixed-upstream
Control: retitle -1 ghostscript: pdfwrite no longer preserves the ToUnicode CMap of PDF files

This is actually the real upstream bug (it appears that the first
testcase I gave was affected by a similar bug).

See the testcases in the attached archive from

  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=995392#86

I've attached the main testcase.



chartest5a-tl2021.pdf
Description: Adobe PDF document


Processed: Re: Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-17 Thread Debian Bug Tracking System
Processing commands for cont...@bugs.debian.org:

> reassign 995678 ghostscript
Bug #995678 [texlive-base] texlive-base: pdflatex in TeX Live 2021 potentially 
generates invalid PDF, making ps2pdf silently trash some characters
Bug reassigned from package 'texlive-base' to 'ghostscript'.
No longer marked as found in versions texlive-base/2021.20210921-1.
Ignoring request to alter fixed versions of bug #995678 to the same values 
previously set
> retitle 995678 deals incorrectly with embedded cmaps
Bug #995678 [ghostscript] texlive-base: pdflatex in TeX Live 2021 potentially 
generates invalid PDF, making ps2pdf silently trash some characters
Changed Bug title to 'deals incorrectly with embedded cmaps' from 
'texlive-base: pdflatex in TeX Live 2021 potentially generates invalid PDF, 
making ps2pdf silently trash some characters'.
> thanks
Stopping processing here.

Please contact me if you need assistance.
-- 
995678: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=995678
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Processed: Re: Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-14 Thread Debian Bug Tracking System
Processing control commands:

> severity -1 normal
Bug #995392 [ghostscript] ghostscript: ps2pdf trashes some characters
Severity set to 'normal' from 'grave'

-- 
995392: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=995392
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-14 Thread Jonas Smedegaard
Control: severity -1 normal

Quoting Vincent Lefevre (2021-09-30 16:53:01)
> Package: ghostscript
> Version: 9.54.0~dfsg-5
> Severity: grave
> Justification: causes non-serious data loss
> 
> The ps2pdf command trashes some characters, making text non-searchable and
> partly unreadable via pdftotext (even though the glyph appears to
> be OK). There was no such issue in the recent past.

I agree with Ken Sharp¹ that this issue is not grave for the ghostscript 
package as a whole - regardless of how important that feature of 
ghostscript is for your usecase.

 - Jonas

¹ https://bugs.ghostscript.com/show_bug.cgi?id=704478#c8



Processed: Re: Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-11 Thread Debian Bug Tracking System
Processing commands for cont...@bugs.debian.org:

> tags 995392 - moreinfo
Bug #995392 [ghostscript] ghostscript: ps2pdf trashes some characters
Removed tag(s) moreinfo.
> thanks
Stopping processing here.

Please contact me if you need assistance.
-- 
995392: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=995392
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Processed: Re: Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-11 Thread Debian Bug Tracking System
Processing control commands:

> severity 995392 grave
Bug #995392 [ghostscript] ghostscript: ps2pdf trashes some characters
Severity set to 'grave' from 'normal'
> severity 995678 normal
Bug #995678 [texlive-base] texlive-base: pdflatex in TeX Live 2021 potentially 
generates invalid PDF, making ps2pdf silently trash some characters
Severity set to 'normal' from 'grave'
> tags 995678 - moreinfo
Bug #995678 [texlive-base] texlive-base: pdflatex in TeX Live 2021 potentially 
generates invalid PDF, making ps2pdf silently trash some characters
Ignoring request to alter tags of bug #995678 to the same tags previously set

-- 
995392: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=995392
995678: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=995678
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Processed: Re: Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-10 Thread Debian Bug Tracking System
Processing commands for cont...@bugs.debian.org:

> severity 995392 normal
Bug #995392 [ghostscript] ghostscript: ps2pdf trashes some characters
Severity set to 'normal' from 'grave'
> tags 995392 + moreinfo
Bug #995392 [ghostscript] ghostscript: ps2pdf trashes some characters
Added tag(s) moreinfo.
> thanks
Stopping processing here.

Please contact me if you need assistance.
-- 
995392: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=995392
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-10 Thread Norbert Preining
severity 995392 normal
tags 995392 + moreinfo
thanks

Hi all,

First of all, it seems this message didn't make it to either the list or
my computer; I only found it by chance while checking transitions.

Then, this is by far not a grave bug in TL. pdflatex is **not**
affected, since it generates PDF files without using ghostscript.

What might have some problems (I have neither reproduced nor tested this
so far) is the dvi -> ps -> pdf route.

>   * chartest3.tex:  Test: « don't ».
>   * chartest4[ab].tex:  Test: « don't finite float ».
>   * chartest5[ab].tex:  Test: « don't finite float offer affine ».
> where the 4b and 5b versions contain \pdfglyphtounicode commands for
> the ligatures (from glyphtounicode.tex), though the tests below show
> that they do not have any influence here.

Vincent, thanks for the tests, but without an explanation, makefiles, or
some hint of **what** you actually ran, this is neither reproducible nor
testable.

What I want to see is:
* input file
* commands run
* log files of each program run
* what is the problematic output
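
A report with those four items can be produced by a small wrapper. This is
only a sketch under stated assumptions: the command sequence (pdflatex, then
ps2pdf on the resulting PDF, then pdftotext on both files) follows what is
described elsewhere in this thread, while the `-interaction=nonstopmode`
flag, the file stem, the log file name and the function names are choices
made for this example.

```python
import shlex
import subprocess


def build_commands(stem: str) -> list[list[str]]:
    """Commands for one test: compile, convert with ps2pdf, extract text."""
    return [
        ["pdflatex", "-interaction=nonstopmode", f"{stem}.tex"],
        ["ps2pdf", f"{stem}.pdf", f"{stem}-gs.pdf"],
        ["pdftotext", f"{stem}.pdf", f"{stem}.txt"],
        ["pdftotext", f"{stem}-gs.pdf", f"{stem}-gs.txt"],
    ]


def run_and_log(stem: str, log_path: str) -> None:
    """Run each command and record the command line, output and exit status."""
    with open(log_path, "w", encoding="utf-8") as log:
        for command in build_commands(stem):
            log.write("$ " + shlex.join(command) + "\n")
            result = subprocess.run(
                command, capture_output=True, text=True, errors="replace"
            )
            log.write(result.stdout + result.stderr)
            log.write(f"[exit status {result.returncode}]\n")


# Show the commands without running them; call run_and_log("chartest",
# "chartest.log") on a machine where the tools are installed.
for command in build_commands("chartest"):
    print(shlex.join(command))
```

Comparing `chartest.txt` with `chartest-gs.txt` afterwards shows whether the
ps2pdf step changed the extractable text.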

Thanks

> \documentclass[12pt]{article}
> \usepackage[utf8]{inputenc}
> \usepackage[T1]{fontenc}
> \usepackage{lmodern}
> \begin{document}
> \thispagestyle{empty}
> Test: « don't ».
> \end{document}

This document works fine for me, including copy and paste of its
content, with
* current sid
* latex/dvips/ps2pdf
* pdflatex

Text can be copied and pasted from the generated PDF file.

I really don't see what is going on, thanks for any explanations.

Best

Norbert

--
PREINING Norbert  https://www.preining.info
Fujitsu Research  +  IFMGA Guide  +  TU Wien  +  TeX Live  + Debian Dev
GPG: 0x860CDC13   fp: F7D8 A928 26E3 16A1 9FA0 ACF0 6CAC A448 860C DC13



Processed: Re: Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-03 Thread Debian Bug Tracking System
Processing control commands:

> clone -1 -2
Bug #995392 [ghostscript] ghostscript: ps2pdf trashes some characters
Bug 995392 cloned as bug 995678
> reassign -2 texlive-base 2021.20210921-1
Bug #995678 [ghostscript] ghostscript: ps2pdf trashes some characters
Bug reassigned from package 'ghostscript' to 'texlive-base'.
No longer marked as found in versions ghostscript/9.54.0~dfsg-5 and 
ghostscript/9.53.3~dfsg-8.
Ignoring request to alter fixed versions of bug #995678 to the same values 
previously set
Bug #995678 [texlive-base] ghostscript: ps2pdf trashes some characters
Marked as found in versions texlive-base/2021.20210921-1.
> retitle -2 texlive-base: pdflatex in TeX Live 2021 potentially generates 
> invalid PDF, making ps2pdf silently trash some characters
Bug #995678 [texlive-base] ghostscript: ps2pdf trashes some characters
Changed Bug title to 'texlive-base: pdflatex in TeX Live 2021 potentially 
generates invalid PDF, making ps2pdf silently trash some characters' from 
'ghostscript: ps2pdf trashes some characters'.

-- 
995392: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=995392
995678: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=995678
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-01 Thread Vincent Lefevre
On 2021-10-01 16:52:13 +0200, Jonas Smedegaard wrote:
> Quoting Vincent Lefevre (2021-10-01 15:53:38)
> > It seems that the issue partly comes from pdflatex: for an old file 
> > on which ps2pdf gave a correct result with ghostscript 9.53.3~dfsg-4, 
> > the result is now incorrect even with that same version. But if I regenerate 
> > the intermediate PDF file on an old Debian machine and transfer it to 
> > my current machine, ps2pdf is correct with ghostscript 9.53.3~dfsg-4 
> > and with ghostscript 9.53.3~dfsg-7 (stable), and also with ghostscript 
> > 9.54.0~dfsg-5.
> 
> Are you sure you mean 9.53.3~dfsg-7, not 9.53.3~dfsg-7+deb11u1?

Yes, I used "apt .../stable", and it was 9.53.3~dfsg-7 that was
fetched, not the security update.

zira:~> apt-show-versions -a ghostscript
ghostscript:amd64 9.54.0~dfsg-5 install ok installed
ghostscript:amd64 9.53.3~dfsg-7 stable  ftp.debian.org
ghostscript:amd64 9.53.3~dfsg-7+deb11u1 stable-security security.debian.org
No stable-updates version
ghostscript:amd64 9.54.0~dfsg-5 testing ftp.debian.org
ghostscript:amd64 9.54.0~dfsg-5 unstable ftp.debian.org
ghostscript:amd64 9.55.0~~rc1~dfsg-1 experimental ftp.debian.org
ghostscript:amd64/testing 9.54.0~dfsg-5 uptodate

Perhaps I should have used "/stable-security".
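
Since several closely related version strings appear in this thread, here is
a minimal sketch of how they order. It is an independent reimplementation
for illustration, assuming the comparison rules of the Debian Policy Manual
(in particular that `~` sorts before everything, including the end of the
string); it is not dpkg's code, and all function names are invented for this
example.

```python
from functools import cmp_to_key


def _is_digit(ch: str) -> bool:
    return "0" <= ch <= "9"


def _order(ch: str) -> int:
    """Sort weight of one character; '' stands for the end of the string."""
    if ch == "" or _is_digit(ch):
        return 0
    if ch == "~":
        return -1
    if ch.isalpha():
        return ord(ch)
    return ord(ch) + 256


def _compare_part(a: str, b: str) -> int:
    """Compare two upstream versions or two Debian revisions."""
    i = j = 0
    while i < len(a) or j < len(b):
        # Non-digit run: compare character by character with special weights.
        while (i < len(a) and not _is_digit(a[i])) or (
            j < len(b) and not _is_digit(b[j])
        ):
            ac = _order(a[i] if i < len(a) else "")
            bc = _order(b[j] if j < len(b) else "")
            if ac != bc:
                return -1 if ac < bc else 1
            i += 1
            j += 1
        # Digit run: compare numerically (an empty run counts as 0).
        start_i, start_j = i, j
        while i < len(a) and _is_digit(a[i]):
            i += 1
        while j < len(b) and _is_digit(b[j]):
            j += 1
        na, nb = int(a[start_i:i] or "0"), int(b[start_j:j] or "0")
        if na != nb:
            return -1 if na < nb else 1
    return 0


def _split(version: str) -> tuple[int, str, str]:
    """Split into (epoch, upstream version, Debian revision)."""
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def compare_versions(a: str, b: str) -> int:
    """Return -1, 0 or 1 as a sorts before, equal to, or after b."""
    ea, ua, ra = _split(a)
    eb, ub, rb = _split(b)
    if ea != eb:
        return -1 if ea < eb else 1
    return _compare_part(ua, ub) or _compare_part(ra, rb)


versions = ["9.54.0~dfsg-5", "9.53.3~dfsg-7+deb11u1", "9.55.0~~rc1~dfsg-1",
            "9.53.3~dfsg-7"]
print(sorted(versions, key=cmp_to_key(compare_versions)))
```

On a Debian system, `dpkg --compare-versions` remains the authoritative way
to check such an ordering.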

> Some upstream changes were backported for -7 and other changes were 
> introduced by -7+deb11u1: 
> https://tracker.debian.org/media/packages/g/ghostscript/changelog-9.53.3dfsg-7deb11u1
> 
> Possibly related to the recent changes to Ghostscript's SAFER: 
> https://www.ghostscript.com/doc/9.55.0/Use.htm#Safer
> 
> Perhaps recent pdflatex was adapted to handle the change to SAFER, and 
> in doing so became dependent on recent Ghostscript (and perhaps that was 
> then not reflected in packaging of pdflatex)?

Anyway, I doubt that this is related to the font / mapping issue.




Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-01 Thread Jonas Smedegaard
Quoting Vincent Lefevre (2021-10-01 15:53:38)
> On 2021-10-01 14:31:57 +0200, Vincent Lefevre wrote:
> > On 2021-10-01 14:26:02 +0200, Vincent Lefevre wrote:
> > > On 2021-10-01 14:17:53 +0200, Vincent Lefevre wrote:
> > > > In my archives, I can see that the issue occurred with 
> > > > ghostscript 9.26a~dfsg-0+deb9u1, but 9.27~dfsg-2+deb10u4 isn't 
> > > > affected on my second testcase.
> > > 
> > > The following PDF file (on which I got the issue with ghostscript 
> > > 9.26a~dfsg-0+deb9u1) may be a useful testcase:
> > > 
> > >   https://hal.archives-ouvertes.fr/hal-02001080v1/document
> > 
> > On this testcase, the issue is actually reproducible with 
> > ghostscript 9.27~dfsg-2+deb10u4!
> 
> It seems that the issue partly comes from pdflatex: for an old file 
> on which ps2pdf gave a correct result with ghostscript 9.53.3~dfsg-4, 
> the result is now incorrect even with that same version. But if I regenerate 
> the intermediate PDF file on an old Debian machine and transfer it to 
> my current machine, ps2pdf is correct with ghostscript 9.53.3~dfsg-4 
> and with ghostscript 9.53.3~dfsg-7 (stable), and also with ghostscript 
> 9.54.0~dfsg-5.

Are you sure you mean 9.53.3~dfsg-7, not 9.53.3~dfsg-7+deb11u1?

Some upstream changes were backported for -7 and other changes were 
introduced by -7+deb11u1: 
https://tracker.debian.org/media/packages/g/ghostscript/changelog-9.53.3dfsg-7deb11u1

Possibly related to the recent changes to Ghostscript's SAFER: 
https://www.ghostscript.com/doc/9.55.0/Use.htm#Safer

Perhaps recent pdflatex was adapted to handle the change to SAFER, and 
in doing so became dependent on recent Ghostscript (and perhaps that was 
then not reflected in packaging of pdflatex)?


 - Jonas



Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-01 Thread Vincent Lefevre
On 2021-10-01 14:31:57 +0200, Vincent Lefevre wrote:
> On 2021-10-01 14:26:02 +0200, Vincent Lefevre wrote:
> > On 2021-10-01 14:17:53 +0200, Vincent Lefevre wrote:
> > > In my archives, I can see that the issue occurred with
> > > ghostscript 9.26a~dfsg-0+deb9u1, but 9.27~dfsg-2+deb10u4
> > > isn't affected on my second testcase.
> > 
> > The following PDF file (on which I got the issue with
> > ghostscript 9.26a~dfsg-0+deb9u1) may be a useful testcase:
> > 
> >   https://hal.archives-ouvertes.fr/hal-02001080v1/document
> 
> On this testcase, the issue is actually reproducible with
> ghostscript 9.27~dfsg-2+deb10u4!

It seems that the issue partly comes from pdflatex: for an old file
on which ps2pdf gave a correct result with ghostscript 9.53.3~dfsg-4,
the result is now incorrect even with that same version. But if
I regenerate the intermediate PDF file on an old Debian machine
and transfer it to my current machine, ps2pdf is correct with
ghostscript 9.53.3~dfsg-4 and with ghostscript 9.53.3~dfsg-7
(stable), and also with ghostscript 9.54.0~dfsg-5.




Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-01 Thread Vincent Lefevre
On 2021-10-01 14:26:02 +0200, Vincent Lefevre wrote:
> On 2021-10-01 14:17:53 +0200, Vincent Lefevre wrote:
> > In my archives, I can see that the issue occurred with
> > ghostscript 9.26a~dfsg-0+deb9u1, but 9.27~dfsg-2+deb10u4
> > isn't affected on my second testcase.
> 
> The following PDF file (on which I got the issue with
> ghostscript 9.26a~dfsg-0+deb9u1) may be a useful testcase:
> 
>   https://hal.archives-ouvertes.fr/hal-02001080v1/document

On this testcase, the issue is actually reproducible with
ghostscript 9.27~dfsg-2+deb10u4!




Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-01 Thread Vincent Lefevre
On 2021-10-01 14:17:53 +0200, Vincent Lefevre wrote:
> In my archives, I can see that the issue occurred with
> ghostscript 9.26a~dfsg-0+deb9u1, but 9.27~dfsg-2+deb10u4
> isn't affected on my second testcase.

The following PDF file (on which I got the issue with
ghostscript 9.26a~dfsg-0+deb9u1) may be a useful testcase:

  https://hal.archives-ouvertes.fr/hal-02001080v1/document




Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-01 Thread Vincent Lefevre
In my archives, I can see that the issue occurred with
ghostscript 9.26a~dfsg-0+deb9u1, but 9.27~dfsg-2+deb10u4
isn't affected on my second testcase.




Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-01 Thread Vincent Lefevre
On 2021-10-01 12:05:28 +0200, Vincent Lefevre wrote:
> Well, with 9.53.3~dfsg-8, I can reproduce the bug on another PDF file,
> where it is the U+2019 RIGHT SINGLE QUOTATION MARK character (used as
> an apostrophe) that is incorrectly replaced by Š. I'll have to make
> another simple testcase.

The LaTeX source generating the testcase:

\documentclass[12pt]{article}
\usepackage{lmodern}
\begin{document}
\thispagestyle{empty}
Don't ff.
\end{document}

I've attached this testcase "chartest2.pdf", and the incorrect file
"chartest2-gs.pdf" obtained with

  ps2pdf chartest2.pdf chartest2-gs.pdf



chartest2.pdf
Description: Adobe PDF document


chartest2-gs.pdf
Description: Adobe PDF document


Processed: Re: Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-01 Thread Debian Bug Tracking System
Processing control commands:

> found -1 9.53.3~dfsg-8
Bug #995392 [ghostscript] ghostscript: ps2pdf trashes some characters
Marked as found in versions ghostscript/9.53.3~dfsg-8.

-- 
995392: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=995392
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#995392: ghostscript: ps2pdf trashes some characters

2021-10-01 Thread Vincent Lefevre
Control: found -1 9.53.3~dfsg-8

On 2021-09-30 22:00:47 +, JustAnotherArchivist wrote:
> Apologies, I somehow missed the part about pdftotext and the glyph's normal
> appearance in your original message. I can reproduce that with both files
> produced by 9.54.0~dfsg-5 but *not* the one produced by 9.53.3~dfsg-8
> (attached for reference), using the same pdftotext version (poppler-utils
> 20.09.0-3.1) for all files.

Well, with 9.53.3~dfsg-8, I can reproduce the bug on another PDF file,
where it is the U+2019 RIGHT SINGLE QUOTATION MARK character (used as
an apostrophe) that is incorrectly replaced by Š. I'll have to make
another simple testcase.




Processed: Re: Bug#995392: ghostscript: ps2pdf trashes some characters

2021-09-30 Thread Debian Bug Tracking System
Processing control commands:

> tags -1 upstream
Bug #995392 [ghostscript] ghostscript: ps2pdf trashes some characters
Added tag(s) upstream.
> forwarded -1 https://bugs.ghostscript.com/show_bug.cgi?id=704478
Bug #995392 [ghostscript] ghostscript: ps2pdf trashes some characters
Set Bug forwarded-to-address to 
'https://bugs.ghostscript.com/show_bug.cgi?id=704478'.

-- 
995392: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=995392
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#995392: ghostscript: ps2pdf trashes some characters

2021-09-30 Thread Vincent Lefevre
Control: tags -1 upstream
Control: forwarded -1 https://bugs.ghostscript.com/show_bug.cgi?id=704478

On 2021-09-30 18:49:02 +0200, Jonas Smedegaard wrote:
> Quoting Vincent Lefevre (2021-09-30 18:28:51)
> > On 2021-09-30 17:18:46 +0200, Jonas Smedegaard wrote:
> > > This seems an upstream bug, and it would be helpful if you report it 
> > > upstream as well.  Their bugtracker is at https://bugs.ghostscript.com/
> > 
> > OK. I'll do it tonight (I could also try to find the cause).

I've identified the commit that introduced the issue (though I'm not
sure whether the bug could be already present on other kinds of text)
and reported the bug upstream with the details (see above URL).

> Also, you could test against the newer pre-release in experimental.

Not tried, but the issue is still present in master.




Bug#995392: ghostscript: ps2pdf trashes some characters

2021-09-30 Thread JustAnotherArchivist

Control: notfound -1 9.53.3~dfsg-8

Apologies, I somehow missed the part about pdftotext and the glyph's 
normal appearance in your original message. I can reproduce that with 
both files produced by 9.54.0~dfsg-5 but *not* the one produced by 
9.53.3~dfsg-8 (attached for reference), using the same pdftotext version 
(poppler-utils 20.09.0-3.1) for all files.




chartest-gs-jaa-9.53.3.pdf
Description: Adobe PDF document


Bug#995392: ghostscript: ps2pdf trashes some characters

2021-09-30 Thread JustAnotherArchivist

Hi Vincent,

For what it's worth, I do not see the corruption you're describing with 
`gv chartest-gs.pdf` nor when converting it myself from your input file 
using versions 9.53.3~dfsg-8 or 9.54.0~dfsg-5.


I noticed that your file used a different internal conversion command 
compared to when I try it with ps2pdf 9.54.0~dfsg-5:


Yours: %%Invocation: path/gs -dPrinted=false -P- -dSAFER 
-dCompatibilityLevel=1.5 -q -P- -dNOPAUSE -dBATCH -sDEVICE=pdfwrite 
-sstdout=? -sOutputFile=? -P- -dSAFER -dCompatibilityLevel=1.5 ?
Mine: %%Invocation: path/gs -P- -dSAFER -dCompatibilityLevel=1.4 -q -P- 
-dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sstdout=? -sOutputFile=? -P- 
-dSAFER -dCompatibilityLevel=1.4 ?


Invoking it like your command manually did not make a difference for me 
though, with the output file being identical except for the expected 
differences in the version string, timestamps, and UUIDs.
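
The difference between the two `%%Invocation` lines quoted above can be
extracted mechanically. The following is a small sketch; `option_diff` is a
name invented for this example, and the two strings are the invocation lines
from this message with the line wrapping removed.

```python
def option_diff(first: str, second: str) -> tuple[list[str], list[str]]:
    """Return the options that occur only in the first or only in the second."""
    first_options, second_options = set(first.split()), set(second.split())
    return (
        sorted(first_options - second_options),
        sorted(second_options - first_options),
    )


YOURS = (
    "path/gs -dPrinted=false -P- -dSAFER -dCompatibilityLevel=1.5 -q -P- "
    "-dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sstdout=? -sOutputFile=? -P- "
    "-dSAFER -dCompatibilityLevel=1.5 ?"
)
MINE = (
    "path/gs -P- -dSAFER -dCompatibilityLevel=1.4 -q -P- "
    "-dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sstdout=? -sOutputFile=? -P- "
    "-dSAFER -dCompatibilityLevel=1.4 ?"
)

only_yours, only_mine = option_diff(YOURS, MINE)
print("only in yours:", only_yours)
print("only in mine: ", only_mine)
```

The comparison ignores option order and repetition, which is enough here
because both lines repeat the same options in the same places.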


I have attached my `ps2pdf chartest.pdf chartest-gs-jaa.pdf` output file 
(created with 9.54.0~dfsg-5).


Cheers,
JustAnotherArchivist



chartest-gs-jaa.pdf
Description: Adobe PDF document


Bug#995392: ghostscript: ps2pdf trashes some characters

2021-09-30 Thread Jonas Smedegaard
Quoting Vincent Lefevre (2021-09-30 18:28:51)
> On 2021-09-30 17:18:46 +0200, Jonas Smedegaard wrote:
> > This seems an upstream bug, and it would be helpful if you report it 
> > upstream as well.  Their bugtracker is at https://bugs.ghostscript.com/
> 
> OK. I'll do it tonight (I could also try to find the cause).

Great. Thanks!

Also, you could test against the newer pre-release in experimental.


 - Jonas



Bug#995392: ghostscript: ps2pdf trashes some characters

2021-09-30 Thread Vincent Lefevre
On 2021-09-30 17:18:46 +0200, Jonas Smedegaard wrote:
> This seems an upstream bug, and it would be helpful if you report it 
> upstream as well.  Their bugtracker is at https://bugs.ghostscript.com/

OK. I'll do it tonight (I could also try to find the cause).




Bug#995392: ghostscript: ps2pdf trashes some characters

2021-09-30 Thread Jonas Smedegaard
Hi Vincent,

Quoting Vincent Lefevre (2021-09-30 16:53:01)
> The ps2pdf command trashes some characters, making text non-searchable and 
> partly unreadable via pdftotext (even though the glyph appears to be 
> OK). There was no such issue in the recent past.

Thanks for reporting this!

This seems an upstream bug, and it would be helpful if you report it 
upstream as well.  Their bugtracker is at https://bugs.ghostscript.com/


Kind regards,

 - Jonas



Bug#995392: ghostscript: ps2pdf trashes some characters

2021-09-30 Thread Vincent Lefevre
Package: ghostscript
Version: 9.54.0~dfsg-5
Severity: grave
Justification: causes non-serious data loss

The ps2pdf command trashes some characters, making text non-searchable and
partly unreadable via pdftotext (even though the glyph appears to
be OK). There was no such issue in the recent past.

LaTeX source to generate the PDF testcase:

\documentclass[12pt]{article}
\usepackage[T1]{fontenc}
\begin{document}
\thispagestyle{empty}
Test: float.
\end{document}

to be compiled with pdflatex.

I've attached 2 files:
  * chartest.pdf (testcase generated by pdflatex).
  * chartest-gs.pdf, which is the buggy result obtained with
    "ps2pdf chartest.pdf chartest-gs.pdf".

chartest.pdf contains the text "Test: float." as expected.
But chartest-gs.pdf contains the text "Test: ŕoat.", which
is incorrect: "fl" has been replaced by "ŕ".
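
When the extracted text is longer than this one line, the damaged spans can
be located automatically. The following is a minimal sketch that compares
the two strings quoted above; the function names are invented for this
example, and the strings would normally come from running pdftotext on each
PDF file.

```python
import difflib
import unicodedata


def changed_spans(expected: str, actual: str) -> list[tuple[str, str]]:
    """Return (expected_piece, actual_piece) pairs where the strings differ."""
    matcher = difflib.SequenceMatcher(None, expected, actual, autojunk=False)
    return [
        (expected[i1:i2], actual[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def describe(text: str) -> str:
    """List the code point and Unicode name of every character in text."""
    return ", ".join(
        f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}" for ch in text
    )


EXPECTED = "Test: float."        # text extracted from chartest.pdf
ACTUAL = "Test: \u0155oat."      # text extracted from chartest-gs.pdf

for before, after in changed_spans(EXPECTED, ACTUAL):
    print(f"{before!r} [{describe(before)}] became {after!r} [{describe(after)}]")
```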

Removing "\usepackage[T1]{fontenc}" or the period after "float" makes
this issue disappear.

-- System Information:
Debian Release: bookworm/sid
  APT prefers unstable-debug
  APT policy: (500, 'unstable-debug'), (500, 'stable-updates'), (500, 
'stable-security'), (500, 'unstable'), (500, 'testing'), (500, 'stable'), (1, 
'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 5.14.0-1-amd64 (SMP w/12 CPU threads)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE, 
TAINT_UNSIGNED_MODULE
Locale: LANG=POSIX, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages ghostscript depends on:
ii  libc6   2.32-4
ii  libgs9  9.54.0~dfsg-5

ghostscript recommends no packages.

Versions of packages ghostscript suggests:
ii  ghostscript-x  9.54.0~dfsg-5

-- no debconf information



chartest.pdf
Description: Adobe PDF document


chartest-gs.pdf
Description: Adobe PDF document