Hi Oliver,

At 2023-04-26T09:19:41+0200, Oliver Corff wrote:
> thank you very much for the sharing your insight regarding groff
> internals.

I wish they were deeper!  There is still plenty I have to learn.

> I tried your demonstration, replacing the text file with my own file
> (utf8-encoded Cyrillic), and I did not succeed to reproduce your
> results.
>
> I copied all Russian-related macros (ru.tmac, hyphen.ru and
> koi8-ru.tmac) into my ../current/tmac directory (production system is
> still 1.22.4), and running groff results in unusable output.

No, I wouldn't expect this to work.

> The headline "Abstract" gets translated into Russian, but is displayed
> in non-utf8 format. All utf8-text is ok. If I omit the -k option then
> utf8-encoded text is unusable as well, but this is no surprise.

As noted in my previous mail, if you want hyphenation to work with
Russian, neither UTF-8 input (processed by preconv(1)) nor Unicode code
points from the Cyrillic code block in their groff special character
escape form, like \[u0400], can be used.

> Do I miss something from post-1.23.0 that enables the described magic?

Yes.  I refactored localization handling extensively to enable the
current approach.  As noted earlier in my compliment on your demo
document, I wanted to make it easy to change localizations an arbitrary
number of times within a document.

I worked on this stuff a while back.  In about January 2021 I made an
attempt, some of which I had to revert, and re-landed the work in its
current form around July of that year.  More work specifically on
hyphenation followed in early 2022.

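For illustration only, here is a minimal Python sketch of the two input
forms discussed above.  It is not part of groff; the function names
to_groff_escapes and to_koi8r are hypothetical.  The first function
produces roughly the \[uXXXX] special character form that preconv(1)
emits for non-ASCII input.  The second transcodes the same text to
KOI8-R, on the assumption (suggested by the koi8-ru.tmac file name, not
confirmed here) that KOI8-R encoded input is the alternative to try.

```python
# Sketch only: shows two representations of the same Cyrillic text.
# Neither function is part of groff; both names are hypothetical.


def to_groff_escapes(text: str) -> str:
    # Keep ASCII as is; write every other character as a groff
    # special character escape built from its Unicode code point.
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        else:
            out.append("\\[u%04X]" % ord(ch))
    return "".join(out)


def to_koi8r(text: str) -> bytes:
    # Assumption: KOI8-R is the 8-bit encoding wanted for the input.
    # Raises UnicodeEncodeError for characters KOI8-R cannot represent.
    return text.encode("koi8_r")


if __name__ == "__main__":
    word = "Аннотация"
    print(to_groff_escapes(word))
    print(to_koi8r(word))
```

The escape form is what the quoted paragraph says does not work with
Russian hyphenation; the byte string is only a guess at what might.
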
Some relevant commit IDs, not including the much more recent Spanish
and Russian localization work (which slotted right in as I had hoped),
are:

  a86d9251ed05cec18f6279a9e613449ae7aa7315
  a60784b82a5c53caff5443fc036b8d13f4084a32
  7eb25c45b5ec67f1037abcc670793b734584987c
  7c31d53f83888d88262075875b6ba5463dcfa5c5
  2a36cf12b865be4c1df1c27139b1c58798cafb60
  920fff1cf59d38bacd9b1b99b3d1ce3ce4e1e13f

I don't recall having to change anything in the formatter to enable
this work, so in principle you could replace an entire tmac directory
from a groff 1.22.4 installation with one from 1.23.0 (RC), but I can't
claim that as a supported configuration.

It's probably better just to build and install groff 1.23.0.rc4, and
_then_ add in the Russian localization files.  If you're comfortable
setting up chroots or virtual machines, you might prefer to evaluate
things that way.

Regards,
Branden