[Rswg] Last call comments on draft-rswg-rfc7997bis

Martin J. Dürst Tue, 28 Oct 2025 01:35:27 -0700

Dear Chairs and WG members,

These are my last call comments on rfc7997bis. I read the document last evening. I also read John Klensin's comments from Oct. 25 from top to bottom, but I'm writing this mail separately with my own comments. I may contribute to the discussion of John's comments at a later stage if time permits.

I have to admit that I have not always had time to follow the WG discussion in detail, although I tried to skim most emails.

My overall impression is that the direction of the draft, in trying to be short, is okay. However, there are several issues of various severity that make the current draft unsuitable for forwarding to the IESG at this point in time. Some of these issues are fundamental and therefore severe, but they can all be fixed rather easily if there's agreement on what to do.

I'm not listing minor grammatical mistakes, of which I have found quite a few. These can be dealt with by the RPC.

Content, major: The draft needs to say that RFCs are written (mainly/mostly) in English. I know this was discussed, but I haven't seen the main argument, namely that we define policy and that this is policy. And if this isn't policy, then nothing in this draft is.

Editorial, major: The abstract should be written so that it can be read even in 10 or 20 years, which means it should not contain (and in particular should not start with) historic references. As a start, the first paragraph of the abstract should move to the introduction, and the first two sentences of the introduction should in turn move to the abstract. After that, a bit of cleanup will be needed.

Content, major: Section 2 is entitled "Basic Requirements for Text in RFCs". But the way it's written, it contains requirements for "readers and browsers", people, maybe fonts, and searches. The text should be rewritten to actually talk about text in RFCs. As an example, instead of "RFCs should be displayed correctly across a wide range of readers and browsers.", write "RFCs should only contain text that can be displayed correctly across a wide range of readers and browsers.". Similarly for the rest of the section.

Content, major: Section 3: "There are many Unicode characters that obviously cannot be displayed (such as control characters), and many whose ability to be displayed is debatable.": It's unclear what "many whose ability to be displayed is debatable." means. I'd guess it refers to scripts and characters standardized recently, for which font support is still thin. If that's what is meant, please say so; if something else is meant, please make clear what that is.

Content, major: Section 3 points to BCP137 for various notations. These are all numeric. There are many places where numeric notation is appropriate. But RFC7997 also recommends the use of Unicode character names. I see no reason to change this, as support for this is also available in RFC2XML. In some cases (see also below), character names make an RFC more readable because they reduce additional lookups. (I have nothing against mentioning that in some cases, Unicode character names contain errors, and in these cases, an official alias should be used.)

Content, major (same paragraph): "If an RFC includes such characters in normative or descriptive text, the RFC needs to also clearly describe the character.": There may be cases, in particular for the correct display of examples including bidirectional text in plain text, where we want to use bidi control characters but we do not want to "describe" them (because they are not needed in HTML or PostScript).

Content, major: 3.1 Names: This section confuses ASCII and Latin script. If you look at recent RFCs such as RFC 9694 (sorry, that was just the example that was easiest for me to find), the name is there in Latin script (M.J. Dürst at the top, Martin J. Dürst at the end), without an "ASCII interpretation". And there would be no point to force me to add an "ASCII interpretation" next time I write an RFC. So please change "These authors can give their names using only ASCII characters, or as Unicode characters and an ASCII interpretation of their name." to "Authors can give their names using only Latin script characters, or using non-Latin script and an equivalent in Latin script." Please note that this includes e.g. somebody (fictional) with a name of 加藤竜太郎 with a Latin (not ASCII) equivalent of Ryūtarō Katō (if the person prefers this to the simpler Ryutaro Kato). Please also note that I'm using "equivalent", not "interpretation". There's no interpretation involved.
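
To make the ASCII versus Latin script distinction concrete, here is a minimal Python sketch. It is only an illustration, not part of the draft or of any RFC tooling. The helper is_latin_script is hypothetical, and it approximates "Latin script" by checking that each letter's Unicode character name starts with "LATIN", because Python's standard library does not expose the Unicode Script property directly.

```python
import unicodedata

def is_latin_script(name: str) -> bool:
    """Heuristic (assumption): a name counts as Latin script if every
    letter in it has a Unicode character name starting with 'LATIN'."""
    letters = [ch for ch in name if ch.isalpha()]
    return bool(letters) and all(
        unicodedata.name(ch, "").startswith("LATIN") for ch in letters
    )

# The names used in the comment above.
for name in ["Martin J. Dürst", "Ryūtarō Katō", "Ryutaro Kato", "加藤竜太郎"]:
    print(name, "| ASCII only:", name.isascii(),
          "| Latin script:", is_latin_script(name))
```

Under this heuristic, the first two names are Latin script without being ASCII, the third is both, and the fourth is neither.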

Editorial, medium: Please remove "Authors of RFCs whose names include non-ASCII characters will likely have preferences for how their names are displayed based on their lived experiences." People, including authors, just have names.

Content, major: "Company names and geographic names generally do not need ASCII interpretations, but they can be included at the discretion of the author and the RPC.": This would mean that I could give my affiliation as 青山学院大学 and my address as 相模原、日本 or so, but it surely can't be what we want.

Content, major: RFCs currently use last (family) name plus initial(s) in many places, and we should change this (as a matter of policy if necessary). The reason is that there are many people whose family name isn't very informative. This is very frequent for Koreans, Chinese, and Danes. It can also happen in other cultures.

Editorial, minor: 3.2 Examples: "giving the Unicode equivalent of the non-ASCII characters": This is confusing because these characters will be in UTF-8 and therefore will use Unicode. What we want to say is to use Unicode code points or Unicode character names.

Editorial, major: When talking about color, the text says "If so, those examples need to also include the "U+NNNN" syntax.". This excludes the possibility to use Unicode character names. But as has been discussed in previous mail, in the example at hand, it would be much more helpful for the reader to replace 'For example, "A color display should be able to differentiate 🔴 (U+1F534), 🟢 (U+1F7E2), and 🔵 (U+1F535)."' with 'For example, "A color display should be able to differentiate 🔴 (LARGE RED CIRCLE), 🟢 (LARGE GREEN CIRCLE), and 🔵 (LARGE BLUE CIRCLE)."', because it saves somebody with a black-and-white display some lookups.
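
Both annotation styles can be derived mechanically from the character itself. The following Python sketch is illustrative only; the helper annotate is hypothetical, and the names come from the Unicode Character Database bundled with the Python interpreter, so it assumes a Python version recent enough (3.8 or later) to know U+1F7E2.

```python
import unicodedata

def annotate(ch: str, use_name: bool) -> str:
    """Return the character followed by either its Unicode character
    name or its code point in U+NNNN notation."""
    label = unicodedata.name(ch) if use_name else f"U+{ord(ch):04X}"
    return f"{ch} ({label})"

for ch in "🔴🟢🔵":
    print(annotate(ch, use_name=False), "or", annotate(ch, use_name=True))
```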

Content, major: 5. Security: "Valid Unicode that matches the expected text must be verified in order to preserve expected behavior and protocol information.": It's totally unclear what this means, and who should deal with it. Maybe this should read "Authors and the RPC should cross-check that the used characters match their code point numbers or Unicode character names." If something else is intended, please make clearer what it is.
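
If the cross-check reading is the intended one, part of it could be automated. The sketch below is one possible interpretation, not an existing RPC tool: the function find_mismatches and the regular expression are hypothetical, and they only handle the pattern of a single character followed by a space and "(U+NNNN)".

```python
import re

# Assumed pattern: one non-space character, a space, then "(U+NNNN)".
ANNOTATION = re.compile(r"(\S) \(U\+([0-9A-Fa-f]{4,6})\)")

def find_mismatches(text: str) -> list:
    """Return (character, claimed code point, actual code point) for
    every annotation whose code point does not match the character."""
    mismatches = []
    for match in ANNOTATION.finditer(text):
        ch, claimed = match.group(1), int(match.group(2), 16)
        if ord(ch) != claimed:
            mismatches.append((ch, f"U+{claimed:04X}", f"U+{ord(ch):04X}"))
    return mismatches

good = "🔴 (U+1F534), 🟢 (U+1F7E2), and 🔵 (U+1F535)"
bad = "🟢 (U+1F534)"  # green circle annotated with the red circle's code point
print(find_mismatches(good))
print(find_mismatches(bad))
```

A similar check against Unicode character names would compare the annotation with the name reported for the character.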

Editorial, minor: The reference label "[UnicodeCurrent]" should be changed to "[UnicodeLatest]", because that will help people who are familiar with Unicode terminology. In the reference section, the year should be removed because that's how the Unicode Consortium advises to cite the latest version, see e.g. "Version References" at https://www.unicode.org/versions/Unicode17.0.0/. If the RFC Editor doesn't allow removing the year, then at least 2025 should be used (currently 2023).

Content, minor: "in Normalization Form C (NFC) as defined in [UnicodeNorm]": I recently learned this by accident, but Unicode Standard Annex #15 no longer actually defines normalization. Paragraph 3 of the Introduction says "For the formal specification of the Unicode Normalization Algorithm, see Section 3.11, Normalization Forms in [Unicode].". So please change this at least to "in Normalization Form C (NFC) as defined in Section 3.11, Normalization Forms, in [UnicodeLatest] and [UnicodeNorm]".
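
For readers less familiar with NFC, a short Python sketch shows what the requirement means in practice. It is an illustration using the standard unicodedata module; the helper is_nfc is hypothetical and simply compares a string with its NFC-normalized form.

```python
import unicodedata

def is_nfc(text: str) -> bool:
    """True if the text is already in Normalization Form C."""
    return unicodedata.normalize("NFC", text) == text

decomposed = "Du\u0308rst"   # 'u' followed by U+0308 COMBINING DIAERESIS
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), is_nfc(decomposed))   # six code points, not NFC
print(len(composed), is_nfc(composed))       # five code points, NFC
print(composed == "D\u00fcrst")              # uses U+00FC, the precomposed letter
```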


Editorial, minor: For [UnicodeNorm] (if it's kept), change
'The Unicode Consortium, "Unicode Standard Annex", 2023' to
'The Unicode Consortium, "Unicode Standard Annex #15, Unicode Normalization Forms", 2025'.



Regards,    Martin.

