On Monday 28 May 2018 15:16:53 Pali Rohár wrote:
> On Monday 28 May 2018 02:48:09 Ingo Schwarze wrote:
> > Hi Pali,
> >
> > Pali Rohar wrote on Sun, May 27, 2018 at 11:52:44PM +0200:
> >
> > > Now I looked deeply at man -Tps output and basically \- sequence is
> > > written as character 0xAD (\255 in octal) into output postscript file.
> > > Therefore it is SOFT HYPHEN (U+00AD),
> >
> > No, that is not a "soft hyphen".  Glyph numbers in fonts used for
> > PostScript output have nothing to do with Unicode code points.
> > Look at the file font/devps/TR for examples:
> >
> >   PS name      TR#   Unicode
> >   -------      ---   -------
> >   asciicircum  0x00  U+005E
> >   asciitilde   0x01  U+007E
> >   Scaron       0x02  U+0053 U+030C
> >   Zcaron       0x03  U+005A U+030C
> >   scaron       0x04  U+0073 U+030C
> >   zcaron       0x05  U+007A U+030C
> >   Ydieresis    0x06  U+0059 U+0308
> >   trademark    0x07  U+2122
> >   quotesingle  0x08  U+0027
> >   Euro         0x09  U+20AC
> >   hyphen       0x2d  U+2010
> >   circumflex   0x5e  U+02C6
> >   quoteleft    0x60  U+2018
> >   tilde        0x7e  U+02DC
> >   bullet       0x83  U+2022
> >   florin       0x84  U+0192
> >   minus        0xad  U+2212
> >
> > and so on and so forth, it's completely different all over the place.
>
> I'm saying that I generated a PostScript file via man -Tps and then
> looked into the generated file.
>
> In that PostScript file, at the place where the command line switch
> --something should be, there was F2(\255... or F2<ad... IIRC \255 is
> the glyph code in octal and <ad> is the same code in hex. 0255 and 0xAD
> are both decimal 173, so both refer to the same glyph.
>
> Now I see that the PostScript file also contains an encoding vector
> def /ENC0 [ ... ] and at position 173 is the name /minus. According to
> Adobe, the glyph name /minus represents Unicode code point U+2212.
>
> So you are right, it is not a soft hyphen; I forgot to look at the
> encoding vector in the resulting PostScript file.
>
> That also answers my question why the ps2pdf converter generates a PDF
> file in which U+2212 code points are used for switches. ps2pdf did it
> correctly by looking into the attached encoding vector /ENC0.
>
> So problem is for sure in grodvi which generates that PS file with
                            ~~~~~~
                            I mean grops
> attached encoding vector. Unicode's hyphen-minus has code point U+002D
> and according to Adobe's glyphlist.txt, U+002D is assigned to glyph name
> /hyphen.
>
> So man -Tps (or grodvi) can be fixed. It just needs to generate a
> correct encoding vector and use the proper glyph name /hyphen for \-
> when generating from a manpage.

Here is a simple fix that yields hyphen-minus (U+002D) for command line
switches in PostScript output from man -Tps:

  man -Tps groff | sed 's:/minus:/hyphen:g' > groff.ps

But in some places it damages the formatting because of different font
metrics. Therefore this replacement should be done properly in the groff
PostScript generator.

> > > so it is incorrect for command line switch.
> >
> > It is not incorrect.  The TR font does not contain a glyph for
> > hyphen-minus, so plain minus is used as a fallback.
>
> In the font/devps/TR file there is this line in the "charset" section:
>
> \-	564,286	0	173	minus
>
> Should not this be number 45 instead of 173?
>

-- 
Pali Rohár
pali.ro...@gmail.com
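
P.S. To make the lookup chain described above concrete, here is a minimal Python sketch of how a reader of the PostScript file gets from the string token to a Unicode code point. It is only an illustration under assumptions: the function names glyph_code and to_unicode are hypothetical, ENC0_EXCERPT holds just the one position (173) mentioned in this thread rather than the full vector, and GLYPH_TO_UNICODE contains only the two glyph names discussed here, not the whole Adobe glyph list. It is not how grops or ps2pdf are implemented.

```python
# Hypothetical one-entry excerpt of the /ENC0 encoding vector from the
# generated PostScript file; only position 173 is known from the thread.
ENC0_EXCERPT = {173: "minus"}

# Two glyph names and their Unicode code points, as stated in the thread
# (a real glyph list has thousands of entries).
GLYPH_TO_UNICODE = {"minus": 0x2212, "hyphen": 0x002D}


def glyph_code(token: str) -> int:
    """Decode a one-character PostScript string token to its glyph code.

    Handles only the two forms seen in the thread: an octal escape such
    as (\\255) and a hex string such as <ad>.
    """
    if token.startswith("<"):
        return int(token.strip("<>"), 16)
    # (\ddd) form: a backslash followed by octal digits
    return int(token.strip("()").lstrip("\\"), 8)


def to_unicode(token: str) -> tuple[str, str]:
    """Map a string token to (glyph name, Unicode character)."""
    name = ENC0_EXCERPT[glyph_code(token)]
    return name, chr(GLYPH_TO_UNICODE[name])


if __name__ == "__main__":
    for token in (r"(\255)", "<ad>"):
        name, char = to_unicode(token)
        print(f"{token}: code {glyph_code(token)} -> /{name} "
              f"-> U+{ord(char):04X}")
```

Both tokens decode to 173, which the encoding vector names /minus, so a converter that trusts the vector ends up at U+2212. Changing the name at that position to /hyphen, as the sed one-liner does, would make the same lookup end at U+002D instead.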