[groff] 07/48: [docs]: Further update input encoding discussion.

G. Branden Robinson Thu, 15 Aug 2024 10:17:12 -0700

gbranden pushed a commit to branch master
in repository groff.

commit 26980424dc66b32cbb4eede0e7154bba58918c48
Author: G. Branden Robinson <[email protected]>
AuthorDate: Tue Aug 13 00:21:07 2024 -0500


    [docs]: Further update input encoding discussion.
---
 doc/groff.texi.in    | 31 +++++++++++++++++++++----------
 man/groff_tmac.5.man | 31 ++++++++++++++++++++++++++-----
 2 files changed, 47 insertions(+), 15 deletions(-)

diff --git a/doc/groff.texi.in b/doc/groff.texi.in
index f0d5fad50..da10f7723 100644
--- a/doc/groff.texi.in
+++ b/doc/groff.texi.in
@@ -464,7 +464,7 @@ Documentation License''.
 @title groff
 @subtitle The GNU implementation of @code{troff}
 @subtitle version @VERSION@
-@subtitle July 2024
+@subtitle August 2024
 @author Trent@tie{}A.@: Fisher
 @author Werner Lemberg
 @author G.@tie{}Branden Robinson
@@ -5615,13 +5615,24 @@ package can load it with the @code{mso} (``macro source'') request.
 @c (e.g., what character encodings _they_ support for output and their
 @c responsibility for converting to them) as well.
 
+@c BEGIN Keep roughly parallel with groff_tmac(5) section "Input
+@c Encodings".
 @node Input Encodings, Input Conventions, Macro Packages, Text
 @subsection Input Encodings
 
 The @command{groff} command's @option{-k} option calls the
 @command{preconv} preprocessor to perform input character encoding
 conversions.  Input to the GNU @code{troff} formatter itself, on the
-other hand, must be in one of two encodings it can recognize.
+other hand, must be in a single-byte encoding compatible with @w{ISO
+646:1991 IRV} (US-@acronym{ASCII}).
+
+Certain macro files are responsible for translating input character
+codes above 127 decimal to appropriate GNU @code{troff} escape
+sequences, and setting up hyphenation codes for letters their encodings
+define; typically, they also invoke @code{hcode} requests to case-fold
+such letters where necessary so that they match hyphenation patterns.
+As a rule, a localization file (recall @pxref{Manipulating Hyphenation})
+loads one of these files; a document need not do so directly.
 
 @table @code
 @item latin1
@@ -5654,12 +5665,11 @@ To use @w{KOI8-R}, an encoding for the Russian language, either place
 supply @samp{-m koi8-r} as a command-line argument to @code{groff}.  The
 localization file @file{ru.tmac} takes care of this automatically; see
 @ref{Manipulating Hyphenation}.@footnote{KOI8-R code points in the range
-@code{0x80}--@code{0x9F} are not valid input on systems using ISO
-character codings natively; see @ref{Identifiers}.  This should be no
-impediment to practical documents, as these KOI8-R code points do not
-encode letters, but box-drawing symbols and characters that are better
-obtained via special character escape sequences; see
-@cite{groff_char@r{(7)}}.}
+@code{0x80}--@code{0x9F} are not valid input to GNU @command{troff}; see
+@ref{Identifiers}.  This should be no impediment to practical documents,
+as these KOI8-R code points do not encode letters, but box-drawing
+symbols and characters that are better obtained via special character
+escape sequences; see @cite{groff_char@r{(7)}}.}
 
 @item latin2
 @cindex encoding, input, @w{ISO Latin-2} (@w{8859-2})
@@ -5693,6 +5703,8 @@ coverage for French.  To use this encoding, invoke @w{@samp{.mso
 latin9.tmac}} at the beginning of your document or supply
 @samp{-m latin9} as a command-line argument to @code{groff}.
 @end table
+@c END Keep roughly parallel with groff_tmac(5) section "Input
+@c Encodings".
 
 Some characters from an input encoding may not be available with a
 particular output driver, or their glyphs may not have representation in
@@ -8906,8 +8918,7 @@ patterns.  Its arguments are pairs of character codes---integers from 0
 to@tie{}255.  The request maps character code@tie{}@var{a} to
 code@tie{}@var{b}, code@tie{}@var{c} to code@tie{}@var{d}, and so on.
 Character codes that would otherwise be invalid in GNU @code{troff} can
-be used.  By default, every code maps to itself except those for letters
-`A' to `Z', which map to those for `a' to `z'.
+be used.
 
 @cindex localization
 @pindex troffrc
diff --git a/man/groff_tmac.5.man b/man/groff_tmac.5.man
index 60297686f..822aa088a 100644
--- a/man/groff_tmac.5.man
+++ b/man/groff_tmac.5.man
@@ -9,7 +9,7 @@ typesetting system
 .\" Legal Terms
 .\" ====================================================================
 .\"
-.\" Copyright (C) 2000-2023 Free Software Foundation, Inc.
+.\" Copyright (C) 2000-2024 Free Software Foundation, Inc.
 .\"
 .\" This file is part of groff (GNU roff), which is a free software
 .\" project.
@@ -357,6 +357,9 @@ does the same for the new orthography
 .I en
 English.
 .
+Sets the input encoding to Latin-1 by loading
+.IR latin1.tmac .
+.
 .
 .TP
 .I es
@@ -399,6 +402,9 @@ localizes
 and
 .IR ms .
 .
+Sets the input encoding to Latin-1 by loading
+.IR latin1.tmac .
+.
 .
 .TP
 .I ja
@@ -450,8 +456,23 @@ Chinese.
 .SS "Input encodings"
 .\" ====================================================================
 .
-A document that requires one of the following encodings can load a
-corresponding macro file.
+Certain macro files are responsible for translating input character
+codes above 127 decimal to appropriate GNU
+.I troff \" GNU
+escape sequences,
+and setting up hyphenation codes for
+letters their encodings define;
+typically,
+they also invoke
+.B hcode
+requests to case-fold such letters where necessary so that they
+match hyphenation patterns.
+.
+As a rule,
+a localization file
+(documented in the previous section)
+loads one of these files;
+a document need not do so directly.
 .
 .
 .TP 8n \" "latin1" + 2n
@@ -479,8 +500,8 @@ respectively).
 .I koi8\-r
 supports the KOI8-R encoding.
 .
-KOI8-R code points in the range 0x80\[en]0x9F are not valid input on
-systems using ISO character codings natively;
+KOI8-R code points in the range 0x80\[en]0x9F are not valid input to GNU
+.IR troff ; \" GNU
 see section \[lq]Identifiers\[rq] in
 .MR groff @MAN7EXT@ .
 .
