Thanks for this; I'll get working on integrating it.

On Wed, Jul 05, 2023 at 05:30:53PM -0500, G. Branden Robinson wrote:
> o The an (man) and doc (mdoc) macro packages no longer remap the -, ',
> and ` input characters to Basic Latin code points on UTF-8 devices,
> but treat them as groff normally does (and AT&T troff before it did)
> for typesetting devices, where they become the hyphen, apostrophe or
> right single quotation mark, and left single quotation mark,
> respectively. This change is expected to expose glyph usage errors in
> man pages. See the "PROBLEMS" file for a recipe that will conceal
> these errors. A better long-term approach is for man pages to adopt
> correct input practices; the man pages groff_man_style(7),
> groff_char(7), and man-pages(7) (subsection "Generating optimal
> glyphs"; from the Linux man-pages project) contain such instructions.
> Doing so also improves man page typography when formatting for PDF.
>
> If you maintain a generator of man(7) or mdoc(7) documents (such as a
> tool that converts other formats to them), and need assistance, please
> contact the gr...@gnu.org mailing list and describe your situation.

Do you have any opinions on what I should do with this, in
debian/mandoc.local? In the past, this has been one of those lose-lose
situations where I agree with the typographical concerns but have ended
up yielding to the weight of practical considerations in the
distribution.

.\" Debian: Strictly, "-" is a hyphen while "\-" is a minus sign, and the
.\" former may not always be rendered in the form expected for things like
.\" command-line options.  Uncomment this if you want to make sure that
.\" manual pages you're writing are clear of this problem.
.\" .if '\*[.T]'utf8' \
.\" .  char - \[hy]
.
.\" Debian: "\-" is more commonly used for option dashes than for minus
.\" signs in manual pages, so map it to plain "-" for HTML/XHTML output
.\" rather than letting it be rendered as "&minus;".
.ie '\*[.T]'html' \
.  char \- \N'45'
.el \{\
.  if '\*[.T]'xhtml' \
.    char \- \N'45'
.\}

(It has of course been a while. Maybe we should try again at Debian's
scale.)

> o The "utf8" output device now maps the input characters '^' (caret,
> circumflex accent, or "hat") and '~' (tilde) to U+02C6 (modifier
> letter circumflex accent) and U+02DC (small tilde), respectively, for
> consistency with groff's other output devices. This change is
> expected to expose glyph usage errors in man pages. See the
> "PROBLEMS" file for a recipe that will conceal these errors. A better
> long-term approach is for man pages to adopt correct input practices;
> the man pages groff_man_style(7), groff_char(7), and man-pages(7)
> (subsection "Generating optimal glyphs"; from the Linux man-pages
> project) contain such instructions. Doing so also improves man page
> typography when formatting for PDF.

I'm surprised by the tilde change, and I suspect many other people will
be too. You're quite right that it was already that way for PDF, but I
expect there'll be a lot of references to configuration files in
people's home directories that will be tripped up by this. Perhaps we
should conceal these new errors in Debian for now?
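
For concreteness, concealing them would presumably mean a fragment along
these lines in man.local and mdoc.local. This is an untested sketch of
the kind of recipe the "PROBLEMS" file describes, not a copy of it: the
exact set of characters remapped, and the choice to restrict it to the
utf8 device, are assumptions.

```roff
.\" Hypothetical concealment (untested sketch, not the upstream recipe).
.\" On the utf8 device only, map the affected input characters back to
.\" their Basic Latin glyphs; typesetter output is left alone.
.if '\*[.T]'utf8' \{\
.  char ' \[aq]
.  char - \-
.  char ^ \[ha]
.  char ` \[ga]
.  char ~ \[ti]
.\}
```

With something like this in place, "~/.bashrc" in a page would come out
with U+007E again on UTF-8 terminals, at the cost of hiding the
underlying glyph usage errors from page authors.
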
> o The "sgr" device control command, which dynamically configured support
> for ISO 6429/ECMA-48 SGR escape sequences (and restored traditional
> overstriking behavior if disabled), has been removed. It took effect
> only at page boundaries. grotty's "-c" command-line option and the
> GROFF_NO_SGR environment variable remain supported.

As you're aware:

.\" Debian: Disable the use of SGR (ANSI colour) escape sequences by
.\" grotty.
.if '\V[GROFF_SGR]'' \
.  output x X tty: sgr 0

I added this with the note "because most pagers either fail to cope with
it or need special options to do so". However, that was in 2002 ... so
I think it's about time to retire this Debian-specific customization.
(I expect some greybeard complaints along the lines of
https://bugs.debian.org/312935, but at least the environment variable
exists.)
> o The semantics of the environment variable SOURCE_DATE_EPOCH to groff,
> support for which was added in 1.22.4, were not established at that
> time with respect to time zone selection, prompting divergent
> interpretations; Debian and distributions derived from it have for
> several years patched groff to implicitly use UTC as the time zone
> when interpreting the current time (or SOURCE_DATE_EPOCH) as a local
> time. While a convenient and defensible choice for reproducible build
> efforts, it runs against the grain of user expectations. Systems
> programmers like time zone-invariant, monotonically increasing clocks;
> the broader user base usually prefers a clock that follows an
> applicable civil calendar. groff programs now reckon
> SOURCE_DATE_EPOCH with respect to the local time