On Sat, Jul 22, 2023 at 06:46:28PM +0200, Sven Joachim wrote:
> This version of groff maps an unescaped "-" to HYPHEN rather than
> HYPHEN-MINUS. Due to that, copying text from manpages or following
> references in the "SEE ALSO" section is rather unreliable, because many
> manpages contain a plain "-" where they should have used "\-" instead.
>
> Neither the upstream NEWS file nor the Debian changelog make any mention
> of this change, or how to revert it locally. Running "man" under
> LC_ALL=C works around it, at the cost of worse typography.

It is in fact mentioned in the upstream NEWS file:

  o The an (man) and doc (mdoc) macro packages no longer remap the -, ',
    and ` input characters to Basic Latin code points on UTF-8 devices,
    but treat them as groff normally does (and AT&T troff before it did)
    for typesetting devices, where they become the hyphen, apostrophe or
    right single quotation mark, and left single quotation mark,
    respectively. This change is expected to expose glyph usage errors
    in man pages. See the "PROBLEMS" file for a recipe that will conceal
    these errors. A better long-term approach is for man pages to adopt
    correct input practices; the man pages groff_man_style(7),
    groff_char(7), and man-pages(7) (subsection "Generating optimal
    glyphs"; from the Linux man-pages project) contain such
    instructions. Doing so also improves man page typography when
    formatting for PDF.

    If you maintain a generator of man(7) or mdoc(7) documents (such as
    a tool that converts other formats to them), and need assistance,
    please contact the gr...@gnu.org mailing list and describe your
    situation.

And the PROBLEMS file says:

  * When viewing man pages, some characters on my UTF-8 terminal emulator
    look funny or copy-and-paste wrong. Why?

  Some Unicode Basic Latin ("ASCII") input characters are mapped to
  non-Basic Latin code points in output for consistency with other
  output devices, like PDF. See groff_man_style(7) and groff_char(7)
  for correct input conventions and background. If you use the correct
  groff special character escape sequences to input them, you will get
  correct output no matter what device the input is formatted for.

  However, many man pages are written in ignorance of the correct
  special characters to obtain the desired glyphs. You can conceal
  these errors by adding the following to your site-local man(7)
  configuration. The file is called "man.local"; its installation
  directory depends on how groff was configured when it was built.

  --- start ---
  .if '\*[.T]'utf8' \{\
  .  char ' \[aq]
  .  char - \-
  .  char ^ \[ha]
  .  char ` \[ga]
  .  char ~ \[ti]
  .\}
  --- end ---

  You may also wish to do the same for "mdoc.local".

  In man pages (only), groff maps the minus sign special character '\-'
  to the Basic Latin hyphen-minus (U+002D) because man pages require
  this glyph and there is no historically established *roff input
  character, ordinary or special, for obtaining it when a hyphen and
  minus sign are both separately available. To obtain a true minus
  sign, use the special character escape sequences '\(mi' or '\[mi]'.

I admit I overlooked this; I was aware of the change, but it somehow
fell off my list of things to make a positive decision about when
packaging 1.23.0.

I'm rather inclined to revert this by adding the rest of the recipe
above to debian/mandoc.local (while I agree with the idealized
typographical point being made, I have approximately negative appetite
for the Sisyphean task of fixing an entire distribution's manual pages
in practice), but I'll let this suggestion sit for a few days in case
anyone wants to make a reasoned argument against it in the meantime.

-- 
Colin Watson (he/him)                              [cjwat...@debian.org]
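
For illustration only, and not part of the packaging change proposed
above: a minimal Python sketch of why copy-and-paste breaks when an
unescaped "-" is rendered as HYPHEN (U+2010). The three code points come
from the NEWS entry quoted above; the helper name asciify() and the
sample command line are hypothetical.

```python
import unicodedata

# Glyphs that, per the quoted NEWS entry, an unescaped -, ' and ` turn
# into on UTF-8 devices, mapped back to their Basic Latin counterparts.
TYPESET_TO_ASCII = {
    "\u2010": "-",  # HYPHEN -> HYPHEN-MINUS
    "\u2019": "'",  # RIGHT SINGLE QUOTATION MARK -> APOSTROPHE
    "\u2018": "`",  # LEFT SINGLE QUOTATION MARK -> GRAVE ACCENT
}


def asciify(text: str) -> str:
    """Hypothetical helper: undo the typographic mapping in pasted text."""
    return text.translate(str.maketrans(TYPESET_TO_ASCII))


# A command as it might be copied from a rendered page whose source used
# a plain "-" instead of "\-" (made-up example).
pasted = "ls \u2010\u2010color=auto"
typed = "ls --color=auto"

if __name__ == "__main__":
    for ch in pasted[3:5] + typed[3:5]:
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
    print(pasted == typed)           # the shell sees two different strings
    print(asciify(pasted) == typed)  # equal once mapped back
```

The two strings look alike in a terminal but compare unequal, which is
why a pasted option is not recognized as one. The man.local recipe
avoids the problem at formatting time instead, so no such cleanup is
needed on the reader's side.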