So, if I understand the situation correctly, groff gets its
hyphenation information from TeX.  TeX isn't accommodating any English
words with non-ASCII characters because of its hyphenation algorithm's
limitations, and Werner is reluctant to have groff accommodate them
because of the maintenance complexity of modifying or augmenting the
TeX rules.  Is this a fair summation?

Can TeX's list of patterns be expanded to include letters with
diacritics without breaking TeX's English hyphenation algorithm?  That
is, if Latin-9 characters are included, will the algorithm simply
ignore them, or fail?

On 4/12/17, Werner LEMBERG <[email protected]> wrote:
> The very issue is rather that *users* are not accommodated to select an
> input and/or font encoding while typesetting US English texts.

Probably true in general.  However, those English users who write
about résumés or Blue Öyster Cult -- and who care enough to get
details correct -- will either learn how to produce Latin-1 characters
(which groff accepts), or learn the escape sequences in groff (and I
presume TeX has an equivalent mechanism) that allow these characters
to be represented with ASCII input.

The user can, of course, use .hw to break such an occasional word
correctly in predominantly ASCII English text.  However, it's far from
intuitive that such accommodation is the user's responsibility, when
all other hyphenation Just Works without the user having to think
about it.  It would be nice if these sorts of words worked out of the
box.

Side note: groff does, I observe, correctly break "öyster" (which is
technically not even a real English word) but not "résumé" (which is
not only a real word, but needs the accents to distinguish it from the
unrelated word "resume").  I assume this is because no hyphenation
point of öyster is adjacent to the non-ASCII letter.
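
To make that assumption concrete, here is a toy Python sketch of
Liang-style pattern matching, the general technique behind TeX's
hyphenation.  It is not groff's or TeX's actual code: the function
names and the two patterns ("s1t" and "e1s") are invented for
illustration, and the real English pattern file is far larger.  The
2/3 minimum fragment lengths follow TeX's usual defaults for US
English.

```python
import re
import unicodedata


def parse_pattern(pat):
    """Split a pattern such as 'e1s' into letters 'es' and gap weights [0, 1, 0]."""
    letters = re.sub(r"\d", "", pat)
    weights = [0] * (len(letters) + 1)
    pos = 0
    for ch in pat:
        if ch.isdigit():
            weights[pos] = int(ch)   # weight of the gap before letters[pos]
        else:
            pos += 1
    return letters, weights


def hyphenate(word, patterns, left_min=2, right_min=3):
    """Return word with '-' at every gap whose highest matching weight is odd."""
    word = unicodedata.normalize("NFC", word)   # one code point per accented letter
    table = dict(parse_pattern(p) for p in patterns)
    padded = "." + word.lower() + "."           # dots mark the word boundaries
    scores = [0] * (len(padded) + 1)            # scores[i]: gap before padded[i]
    for start in range(len(padded)):
        for end in range(start + 1, len(padded) + 1):
            weights = table.get(padded[start:end])
            if weights:
                for k, w in enumerate(weights):
                    scores[start + k] = max(scores[start + k], w)
    pieces, last = [], 0
    # Only consider breaks that leave left_min letters before and right_min after.
    for i in range(left_min, len(word) - right_min + 1):
        if scores[i + 1] % 2 == 1:
            pieces.append(word[last:i])
            last = i
    pieces.append(word[last:])
    return "-".join(pieces)


TOY_PATTERNS = ["s1t", "e1s"]   # hypothetical patterns, ASCII letters only

for w in ["oyster", "\u00f6yster", "resume", "r\u00e9sum\u00e9"]:
    print(w, "->", hyphenate(w, TOY_PATTERNS))

# Adding a pattern that mentions the accented letter restores the break.
print(hyphenate("r\u00e9sum\u00e9", TOY_PATTERNS + ["\u00e91s"]))
```

In this sketch a letter that appears in no pattern is simply never
matched, so nothing fails; break points next to it just go
unrecognized.  That is consistent with "öyster" breaking while
"résumé" does not, but whether the real implementations behave the
same way for letters outside their tables is exactly my question
above.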

_______________________________________________
bug-groff mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/bug-groff
