> > Heirloom troff and groff both render \- as en dash,
> > not minus sign, in PDF output.

> If you use groff's native pdf driver (-Tpdf) I believe
> minus is rendered, can be searched for and copy/pasted.
> The postscript driver also outputs a "minus" so I suspect
> it is the ghostscript conversion to pdf which is changing it.

Here on my system, ghostscript keeps the minus when converting
to PDF.  The input file

  .sp 3c
  minus: \-
  .br
  en-dash: \(en

when processed by groff (using the default -Tps) and converted
to PDF with ghostscript, yields the following page content
in the PDF:

  10 0 0 10 0 0 cm BT
  /R7 10 Tf
  1 0 0 1 72 744.851 Tm
  (minus: <AD>)Tj
  12 TL
  (en-dash: \211)'
  ET

where <AD> is a single byte (0xAD = 173), matching groff's
"text.enc", which says minus is to be encoded at position 173,
and \211 is octal for 137.  The font "R7" is a Times-Roman
subset with the encoding

  /BaseEncoding /WinAnsiEncoding
  /Differences [ 137 /endash 173 /minus ]
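
To make the mapping concrete, here is a small Python sketch of how a
reader of the PDF could turn the two string bytes back into characters.
It is only an illustration, not what ghostscript or Acroread actually do:
the function name is made up, cp1252 is used as a stand-in for
/WinAnsiEncoding, and the glyph-name-to-Unicode table holds just the two
names from the /Differences array.

```python
# Glyph-name overrides copied from the /Differences array above.
DIFFERENCES = {137: "endash", 173: "minus"}

# Unicode values for the two glyph names involved.
GLYPH_TO_UNICODE = {"endash": "\u2013", "minus": "\u2212"}


def decode_pdf_string(raw: bytes) -> str:
    """Decode string bytes using the base encoding plus the overrides."""
    out = []
    for code in raw:
        name = DIFFERENCES.get(code)
        if name is not None:
            out.append(GLYPH_TO_UNICODE[name])
        else:
            # Assumption: cp1252 approximates /WinAnsiEncoding here.
            out.append(bytes([code]).decode("cp1252", errors="replace"))
    return "".join(out)


# <AD> is byte 173; \211 is an octal escape for byte 137.
print(decode_pdf_string(b"minus: \xad"))
print(decode_pdf_string(b"en-dash: \211"))
```

Without the /Differences entries, codes 137 and 173 would fall through
to the base encoding and come out as something other than a dash or a
minus.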

Acroread (version 9) clearly renders the minus and the en-dash
differently.

When copied and pasted in a UTF-8 locale, it delivers them
as <e28892> and <e28093>, i.e., 'MINUS SIGN' (U+2212) and
'EN DASH' (U+2013).
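
The byte sequences can be checked independently of any PDF viewer.  A
minimal Python sketch (the helper name utf8_hex is made up for this
example):

```python
import unicodedata


def utf8_hex(ch: str) -> str:
    """Return the UTF-8 encoding of a character as a hex string."""
    return ch.encode("utf-8").hex()


for ch in ("\u2212", "\u2013"):
    # Prints the Unicode character name followed by its UTF-8 bytes.
    print(unicodedata.name(ch), utf8_hex(ch))
```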

In an ISO 8859-1 locale, both (like the hyphen) are pasted
as <2d>, i.e., "hyphen-minus".
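
Neither U+2212 nor U+2013 exists in ISO 8859-1, so some substitute has
to be pasted there.  The following Python sketch mimics that behaviour;
the function paste_bytes and its fallback rule are hypothetical and are
not taken from Acroread, whose actual substitution logic I have not
inspected.

```python
def paste_bytes(ch: str, encoding: str, fallback: str = "-") -> bytes:
    """Encode ch for the clipboard, substituting fallback if unencodable."""
    try:
        return ch.encode(encoding)
    except UnicodeEncodeError:
        # Hypothetical fallback: use hyphen-minus when the locale's
        # character set has no slot for the character.
        return fallback.encode(encoding)


for ch in ("\u2212", "\u2013"):
    # First column: ISO 8859-1 result; second column: UTF-8 result.
    print(paste_bytes(ch, "latin-1").hex(), paste_bytes(ch, "utf-8").hex())
```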


If you want cut-and-pasteable ASCII command lines in PDF files,
I think the easiest way is to set up a hacked "code" font with
renamed glyphs.  Alternatively, you can try adding a
GlyphNames2Unicode dictionary.
