Follow-up Comment #21, bug #66392 (group groff):
Hi Peter & Dave,
At 2025-02-01T12:56:28-0500, Peter Schaffter wrote:
> Why is \n[.hla] not global regardless of ev? It seems an eminently
> reasonable expectation that a document's hyphenation language will
> apply throughout the whole document. I can only think of edge cases
> where one might want to switch hla's, e.g. a document in French with a
> formatted blockquote in Italian.
That is the sort of scenario I was thinking of. But what motivated the
change is the fact that the hyphenation mode itself is not global, but a
property of the environment. I _think_ this is true all the way back to
Ossanna troff but it's tedious to verify that fact, as the value of the
hyphenation mode is not introspectable except via a GNU extension.
(You'd have to infer it by formatting text and seeing if the placement
of the hyphenation breaks changed from one environment to another, given
comparable inputs.)
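With the GNU extension, the check is short. The following is a minimal sketch, assuming GNU troff and its read-only `\n[.hy]` register; the environment name "probe" is made up for illustration. It reports to the standard error stream the hyphenation mode seen in environment 0, in a freshly created environment, and after popping back.

```roff
.\" Sketch only: "probe" is a hypothetical environment name.
.\" Set a hyphenation mode in environment 0 and report it.
.hy 4
.tm environment 0: hyphenation mode \n[.hy]
.\" Create and switch to a new environment, then report again.
.ev probe
.tm environment probe: hyphenation mode \n[.hy]
.\" Pop the environment stack and report once more.
.ev
.tm environment 0 again: hyphenation mode \n[.hy]
```

Something like `groff -z` on this input should show the three messages without producing any formatted output.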
> Having to explicitly instantiate .hla for every .ev that doesn't call
> .evc 0 makes no sense.
I disagree--here's why. I think a lot of people assume that when they
create a new environment, it's a copy of environment 0 already. But it
isn't. It's a copy of the _formatter_'s default environment, meaning,
in practice, it has the attributes that correspond to the way its C++
object was constructed when the formatter started up. This is an
implementation detail in capital letters.
At a more practical level, that "formatter's default" environment is not
affected by anything that happens in the "troffrc" and "troffrc-end"
files.
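A minimal sketch of how one might observe this, assuming GNU troff; the environment name "scratch" and the 4i line length are arbitrary choices for illustration. It changes the line length in environment 0, then reports the line length seen by a newly created environment.

```roff
.\" Sketch only: "scratch" is a hypothetical environment name.
.\" Alter an environmental parameter in environment 0.
.ll 4i
.tm environment 0: line length \n[.l]u
.\" Create a new environment; no `evc` is performed.
.ev scratch
.tm environment scratch: line length \n[.l]u
.\" Return to the previous environment.
.ev
```

If the two reported values differ, the new environment did not inherit the change made in environment 0. The same experiment works with the `.ll` request moved into a site-local "troffrc".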
To be fair, most of what the _stock_ startup files (and each of the
several files they macro-source) do alters only global state.[1]
"troffrc" itself sets a register, defines (and then removes) some
strings, and sets up blank-line and leading-space traps (for diagnostic
purposes, which "troffrc-end" later removes).
What about the macro files "troffrc" loads?
"composite.tmac" sets up a handful of composite character mappings.
These are global.
"fallbacks.tmac" creates user-defined characters. Also global.
An output-driver-specific macro file is loaded. These generally do
things like define more characters. Some assign hyphenation codes
(global, but one should feel a twitch here[2]). Some define color names
(global). "pdf.tmac" defines boatloads of macros (global).
A localization macro file is loaded. By default, it's the one for
English, but we do encourage sites to alter this if they wish.
The localization macro file itself loads an encoding macro file, which
sets up input character translations (`trin` requests) and more
hyphenation codes (twitch).
The localization macro file then goes on to configure the inter-sentence
spacing amount (environmental), set up a default hyphenation mode
(environmental), and select that mode (environmental). It sets the
hyphenation language (formerly global, now environmental), and loads
hyphenation pattern files (global, but a separate dictionary for each
hyphenation language code is maintained--so until/unless we support
maintenance of multiple sets of hyphenation patterns for a given
language code,[3] I figure this looks as good as environmental to the
user).
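Here is a sketch of the French-with-Italian-blockquote scenario, assuming a GNU troff new enough that the hyphenation language is environmental; the environment name "quote" is hypothetical. Note that `hla` only selects a language code; this sketch loads no hyphenation patterns, which a real document would get from the corresponding localization macro files.

```roff
.\" Sketch only: "quote" is a hypothetical environment name, and
.\" no hyphenation pattern files are loaded here.
.hla fr
.tm environment 0: hyphenation language \n[.hla]
.\" Enter a separate environment for the block quotation.
.ev quote
.evc 0
.hla it
.tm environment quote: hyphenation language \n[.hla]
.\" Pop back to the main text and see which code is in force.
.ev
.tm environment 0 again: hyphenation language \n[.hla]
```

With an environmental hyphenation language, the last message should report the code selected for environment 0; when the language was global, it would report whatever `hla` was called with most recently.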
Finally, for convenience, and depending on the output device,
"pdfpic.tmac" or "pspic.tmac" might get loaded. These do only global
stuff, mainly defining namesake (albeit fully capitalized) macros.
The rug may not be pulled yet, but the dog is tugging at a corner of it.
Here's the rug pull.
Because we advocate site-local customization of "troffrc" and
"troffrc-end", there's simply no way for us to know of or prevent the user
from putting all kinds of environment-altering stuff in them. They
might choose an adjustment mode. They might override the line length.
Change the page offset. Alter the type size. Here's the output of the
`pev` request from bleeding-edge GNU troff.
Current Environment:
previous type size: 10p (10000s)
type size: 10p (10000s)
previous requested type size: 10000s
requested type size: 10000s
valid type size list for selected font: 1000s-10000000s
previous default family: 'T'
default family: 'T'
previous font selection: 1 ('TR')
font selection: 1 ('TR')
space size: 12/12 of font space width
sentence space size: 12/12 of font space width
previous line length: 468000u
line length: 468000u
previous title line length: 468000u
title line length: 468000u
previous line interrupted/continued: no
filling: on
alignment/adjustment: both
previous vertical spacing: 12000u
vertical spacing: 12000u
previous post-vertical spacing: 0u
post-vertical spacing: 0u
previous line spacing: 1
line spacing: 1
previous indentation: 0u
indentation: 0u
temporary indentation: 0u
temporary indentation pending: no
total indentation: 0u
previous text length: 0u
target text length: 0u
input line start: 0u
computing tab stops from: input line start
forcing adjustment: no
hyphenation language code: en
hyphenation mode: 4 (on, not allowed within last two characters)
hyphenation mode default: 4
count of consecutive hyphenated lines: 0
consecutive hyphenated line count limit: -1 (unlimited)
hyphenation space: 0u
hyphenation margin: 0u
Environment 0:
current
And it's stuff they _won't get_ automatically when creating a new
environment. Our documentation should probably urge the user more
strongly to, as a rule, `evc 0` when creating an environment.
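The idiom might look like the following sketch, assuming GNU troff (not in compatibility mode); the macro names `myquote` and `myquote-end`, the environment name "quote", and the 3n adjustments are all invented for illustration.

```roff
.\" Sketch only: macro and environment names are hypothetical.
.de myquote
.  br
.  \" Switch to a dedicated environment...
.  ev quote
.  \" ...and start from environment 0's settings, not the
.  \" formatter's defaults.
.  evc 0
.  \" Now apply only the differences wanted for the quotation.
.  in +3n
.  ll -3n
..
.de myquote-end
.  br
.  \" Pop back to the environment that was current before.
.  ev
..
```

Because `evc 0` runs on every call, the quotation picks up whatever environment 0 looks like at that moment, including anything a site-local "troffrc" or the document itself has changed.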
All of that said, we _could_ change `ev` to, when creating a new
environment, copy from environment `0` automatically. (I'm not sure how
we would represent a desire to copy the formatter's default environment
though. I hope not with yet another new request.) But that seemed like
a more disruptive and less backward-compatible change.
I think that if people have been creating environments and _not_ using
`evc 0` on them immediately afterward, they've been relying on luck.
At 2025-02-01T14:13:16-0500, Dave wrote:
> Follow-up Comment #19, bug #66392 (group groff):
>
> [comment #18:]
>> Why is \n[.hla] not global regardless of ev?
>
> By my reading of bug #66387, the salient sentence is, "Pretty weird to
> pop the environment stack and have the hyphenation mode, but not the
> hyphenation _language_, change."
>
> But perhaps this is something that warrants wider discussion.
That could be; I don't mind. But the status quo ante did not look to me
like a situation anyone would expect or desire. Hmm, I do see that I
missed an opportunity to post one of my "trivia challenges" about it to
the list. ;-)
Regards,
Branden
[1] The stock "troffrc" performs one character translation involving the
non-breaking space. Character translations are presently global but
I have a notion to make those environmental as well, to avoid a
problem seen in the real world where a set of translations
temporarily set up happens to be in force when a page break happens,
corrupting header and/or footer text. I don't have this work
scheduled. Like the item in the next footnote, it will demand major
surgery to reorganize data structures.
[2] It hadn't occurred to me before now that we might need to house the
hyphenation code assignments in the environment instead. Doing so
will require some significant refactoring, as presently a
character's hyphenation code is stored in its `charinfo` object, the
dictionary of which is global. I have no appetite to add this to my
groff 1.24 plate.
[3] And I don't know why we would; if we ever need to distinguish
"en_GB" from "en_US", for example (which really do hyphenate a few
words differently, I gather), those strings will _be_ the
hyphenation language codes, and they obviously differ. Similarly,
if a user wants to set up their own multiple distinct hyphenation
configurations for a given language, they'd pick distinct
identifiers (language codes) for them.
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?66392>
_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
