Hello Paul,
Many thanks for your extensive and mostly positive replies.
To reduce the length of this mail, I have removed most of the points
where you replied positively, or where Pete interpreted the WG consensus
to my satisfaction.
On 2025-10-29 09:33, Paul Hoffman wrote:
First off: thanks for the careful review with proposals for better wording!
Notes below.
On Oct 28, 2025, at 01:35, Martin J. Dürst <[email protected]> wrote:
I'm not listing minor grammatical mistakes, of which I have found quite a few.
These can be dealt with by the RPC.
Feel free to send those to me off-list so that I can make the RPC's job easier.
Done.
(Most of the list doesn't know this, but Martin has helpfully made major and
minor suggestions on quite a few of my drafts, all to their betterment, for
more than 25 years.)
It's mostly because I have difficulties reading a document without also
noticing minor and often close to irrelevant details. I seriously envy
people who can do that.
Content, major: Section 3: "There are many Unicode characters that obviously cannot be
displayed (such as control characters), and many whose ability to be displayed is debatable.":
It's unclear what "many whose ability to be displayed is debatable." means. I'd guess it
refers to scripts and characters standardized recently, for which font support is still thin. If
that's what is meant, please say so; if something else is meant, please make clear what that is.
There is a wide variety of things that can be debatable. Are combining characters like U+0315
(COMBINING COMMA ABOVE RIGHT) displayable? What about non-spacing marks like U+0650 (ARABIC KASRA)?
I am sure people would take each side of the debate ("I can see the symbol printed in the
Unicode Standard" vs. "I can't see that code point on my laptop even though it has quite
a complete font set" and so on).
On any decent browser, these should display without problems. When it
comes to editors, shells, and the like, the field is much wider, so
there are no absolute guarantees. But these have been in Unicode since
Unicode 1.0 or so, so I would expect these to show.
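As a small illustration of why the two code points mentioned above invite debate, the following minimal sketch looks up their properties with Python's standard unicodedata module. The helper name describe is hypothetical and only used for this example. The sketch shows how the characters are classified; it says nothing about whether a given font or terminal can render them.

```python
import unicodedata

def describe(ch):
    """Return (code point, name, general category, combining class, bidi class)."""
    return (
        "U+%04X" % ord(ch),
        unicodedata.name(ch),
        unicodedata.category(ch),       # "Mn" means nonspacing mark
        unicodedata.combining(ch),      # non-zero for combining marks
        unicodedata.bidirectional(ch),  # "NSM" means nonspacing mark
    )

# The two characters discussed in the thread.
for ch in ("\u0315", "\u0650"):
    print(describe(ch))
```

Both characters are nonspacing marks, so they have no width of their own and are normally drawn attached to a preceding base character. That is one reason why "can it be displayed?" has no single answer for them.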
Content, major (same paragraph): "If an RFC includes such characters in normative or
descriptive text, the RFC needs to also clearly describe the character.": There may be cases,
in particular for the correct display of examples including bidirectional text in plain text, where
we want to use bidi control characters but we do not want to "describe" them (because
they are not needed in HTML or PostScript).
Why would we not want to describe them? We are quite sure that some people
reading the RFC will have them displayed R-to-L, and others L-to-R.
Bidi support is fairly widespread these days. I just tried to type "אבג"
(the first three letters of the Hebrew 'Alphabeth'; the x-like letter
(aleph) should be on the right) in various shells and editors. None of
them showed the letters with the wrong directionality. One (PuTTY)
didn't show the letters at all (missing font?), but it moved the "XYZ:
command not found" error message to the right of the window, so it
clearly knew that the characters were RTL.
But I'm not talking about RTL characters such as Hebrew and Arabic. I'm
talking about BIDI control characters, which are invisible (except that
they may affect how the graphic characters close to them are ordered). If
we need to insert such characters, we shouldn't necessarily talk about
these characters, but about how we expect them to reorder the rest of
the text (so that readers can check whether they see the text in the
order the author expected them to see it).
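To make the point about invisible characters concrete, here is a minimal Python sketch that replaces bidi control characters in a string with visible tags, so a reader can see where they occur. The constant BIDI_CONTROLS and the function reveal_bidi_controls are hypothetical names chosen for this example, and the sample string is made up. The sketch is not a proposal for how RFCs should present such characters.

```python
import unicodedata

# Bidi control characters: ALM, LRM, RLM, LRE, RLE, PDF, LRO, RLO,
# LRI, RLI, FSI, PDI. All of them are format characters (category "Cf").
BIDI_CONTROLS = (
    "\u061c\u200e\u200f"
    "\u202a\u202b\u202c\u202d\u202e"
    "\u2066\u2067\u2068\u2069"
)

def reveal_bidi_controls(text):
    """Replace each bidi control character with a visible <U+XXXX NAME> tag."""
    out = []
    for ch in text:
        if ch in BIDI_CONTROLS:
            out.append("<U+%04X %s>" % (ord(ch), unicodedata.name(ch)))
        else:
            out.append(ch)
    return "".join(out)

# Made-up sample: three Hebrew letters wrapped in an RTL isolate.
sample = "abc \u2067\u05d0\u05d1\u05d2\u2069 def"
print(reveal_bidi_controls(sample))
```

A tool like this only shows where the controls are. It does not show the resulting visual order, which is what the text of an RFC would still have to explain.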
Editorial, medium: Please remove "Authors of RFCs whose names include non-ASCII
characters will likely have preferences for how their names are displayed based on their
lived experiences." People, including authors, just have names.
I fully disagree that authors don't have preferences.
I'm not saying at all that authors don't have preferences. Of course
they have. But that applies as well to authors that have names that can
be written in all ASCII. Some want their middle name included, others
not. Some want to be William, others Bill, even though their birth
certificates probably both said William, and so on. What I'm trying to
say is that mentioning preferences here makes people with
non-ASCII/non-Latin names special when they aren't (at least not in this
respect).
In fact, at various times in the past, you have had different preferences about
the spelling of your surname in IETF documents. :-)
Not exactly. I have had the same preferences, but in old times,
technology was limited, so I had to use some fallback.
In particular, some authors with Han / Kanji names have asked that their names
be spelled with Latin characters, others have asked for their names to only be
spelled with Han / Kanji, and yet others want both (often with the Latin of
their family name in all caps). These are preferences that I think should be
acknowledged and honored when sensible, even if it bugs some other people.
In general, I agree. Only using Latin should of course be possible. Only
using Han/Kanji (or any other non-Latin script) I think is a big
disservice to the reader, and I'm glad that our current document, as far
as I understand it, disallows this. As for putting the family name in
all caps, I think that's a style issue that should be left to the RPC.
Content, major: "Company names and geographic names generally do not need ASCII
interpretations, but they can be included at the discretion of the author and the
RPC.": This would mean that I could give my affiliation as 青山学院大学 and my address as
相模原、日本 or so, but it surely can't be what we want.
If that's what the author of an RFC and their stream manager wants, then it is
indeed what we want. The RPC can disagree, but that disagreement is on a
case-by-case basis, not colored by this document.
Sorry, but first, I don't understand why we are making a difference
between names (where Latin equivalents are required) and company
names, ..., where Latin equivalents are voluntary. Second, I think it
would be a big disservice to the readers if affiliations and locations
were unreadable for most of them. As an example, the current policy
would allow using just 华为 or 東芝, without making clear that the author
is affiliated with Huawei or Toshiba.
Content, major: RFCs currently use last (family) name plus initial(s) in many
places, and we should change this (as a matter of policy if necessary). The
reason is that there are many people whose family name isn't very
informative. This is very frequent for Koreans, Chinese, and Danes. It can
also happen in other cultures.
I fully agree, but that's a topic for the Style Guide, not this document. If
you start a thread about this on rfc-interest@, I would certainly participate.
I'm not at all convinced that the RPC will be ready to change this, as
it goes back to the start of the series. If the RPC doesn't think this
needs changing, the only way to change it is to make it an issue of
policy, which means that this WG is responsible. And the quickest way to
do that is to put it into the current document, which already contains
policy about names.
Regards, Martin.
--
rswg mailing list -- [email protected]