Package: ghostscript
Version: 10.04.0~dfsg-1
Severity: important
X-Debbugs-Cc: stu...@debian.org

Dear Maintainer,

The txtwrite device in the most recent upload of ghostscript is broken;
this causes src:plastex to FTBFS as it uses the device in its test
suite.

The following is a simple reproducer based on a unit test from plastex.
(The necessary test3.pdf is also attached).

/-----
$ cat test3.tex
\documentclass{article}
\begin{document}
\thispagestyle{empty}
a.b
\end{document}

$ lualatex test3
[...]
Output written on test3.pdf (1 page, 2719 bytes).

### In bookworm

$ gs --version
10.00.0

$ gs -q -sDEVICE=txtwrite -o %stdout% test3.pdf
a.b

### In sid

$ gs --version
10.04.0

$ gs -q -sDEVICE=txtwrite -o %stdout% test3.pdf
X#
-----/

The tool pdftotext from poppler-utils also correctly extracts the text
from the test file.
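
The comparison above can be scripted as a rough regression check. This is
only a sketch: the helper names normalize and check_extract are made up
for illustration, and stripping all whitespace before comparing is my own
assumption (it hides layout padding and the trailing form feed that
pdftotext emits).

```shell
# Hypothetical helper: strip all whitespace (spaces, newlines, form feeds)
# so that layout differences between extractors do not matter.
normalize() {
    tr -d '[:space:]'
}

# Hypothetical helper: run an extraction command and compare its
# whitespace-stripped output with the expected string.
# Usage: check_extract EXPECTED COMMAND [ARGS...]
check_extract() {
    expected=$1
    shift
    actual=$("$@" | normalize)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1 gave '$actual', expected '$expected'"
        return 1
    fi
}
```

With test3.pdf in the current directory it would be called as:

    check_extract a.b gs -q -sDEVICE=txtwrite -o %stdout% test3.pdf
    check_extract a.b pdftotext test3.pdf -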

This problem does not extend to all PDFs; in fact it seems to be
confined to PDFs generated by lualatex; output from pdflatex is OK.
Unfortunately, for modern fonts and UTF-8, users are encouraged to use
lualatex these days, and the plastex test suite does so. As seen below,
lualatex picks different fonts and encodes them differently - that seems
to be what ghostscript is getting wrong.

$ pdffonts test3-lualatex.pdf
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
VFSMBO+LMRoman10-Regular             CID Type 0C       Identity-H       yes yes yes      4  0

$ pdffonts test3-pdftex.pdf
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
ZKXRNQ+CMR10                         Type 1            Builtin          yes yes yes      4  0
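
To check whether some other PDF is likely to be affected, one could filter
the pdffonts output for fonts using the Identity-H encoding. This is a
sketch only: identity_h_fonts is a made-up helper name, and treating
Identity-H as the trigger is a guess based on the two listings above, not
a confirmed diagnosis.

```shell
# Hypothetical helper: read `pdffonts` output on stdin, skip the two
# header lines, and print the name of every font whose row contains the
# Identity-H encoding. All fields are scanned because the "type" column
# may itself contain spaces (e.g. "CID Type 0C").
identity_h_fonts() {
    awk 'NR > 2 {
        for (i = 2; i <= NF; i++)
            if ($i == "Identity-H") { print $1; break }
    }'
}
```

Usage would be along the lines of:

    pdffonts test3-lualatex.pdf | identity_h_fonts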

regards
Stuart

Attachment: test3.pdf
Description: Adobe PDF document
