I'm for this proposal.

I would like the NCC to clarify a bit more which code points would be
allowed, as IDNA is about domain names, which differ from free-form text.

-Cynthia

On Wed, 22 Oct 2025, 08:00 Edward Shryane, <[email protected]> wrote:

> Dear colleagues,
>
> As I presented at RIPE 89 and RIPE 90, I'd like to propose to allow UTF-8
> encoded characters in "descr:" and "remarks:" attributes.
>
> Is there support for adding UTF-8 in the RIPE database? Please let me know
> your feedback.
>
> Regards
> Ed Shryane
> RIPE NCC
>
>
> Problem Definition
> ------------------
>
> It is currently only possible to store Latin-1 encoded data in the RIPE
> database. This is an issue for the majority of the RIPE region whose native
> language is not supported by Latin-1. We should allow regional operators to
> add notices to their RIPE database objects in their native language, using
> UTF-8 encoded data, so long as this does not affect interoperability.
>
> Solution Definition
> -------------------
>
> In order to allow operators across the RIPE region to add notices in their
> own local language, we will allow UTF-8 characters in the “descr:” and
> “remarks:” attributes only. This change reduces the risk of impact to
> operators, users and the RIPE NCC, and does not affect existing RIPE policy.
>
> We can extend support for UTF-8 in additional existing or new attributes
> in the future, once we have more operational experience with it, but for
> now, only “descr:” and “remarks:” will be supported.
>
> Background
> ----------
>
> Some work has already been done towards internationalization of the RIPE
> database. For example, in April 2015, Piotr Strzyzewski suggested to the
> DB-WG to support UTF-8 in free-text attributes.
>
> "Proposal to allow UTF8 (April 2015)"
>
> https://mailman.ripe.net/archives/list/[email protected]/thread/QEYKOWZBCVA6HNH5MPVX5CJO2XMCIRPH/
>
> In May 2022, I published a RIPE Labs article on the impact analysis of
> supporting UTF-8 in the RIPE database.
>
> https://labs.ripe.net/author/ed_shryane/impact-analysis-for-utf-8-in-the-ripe-database/
>
> At RIPE 89 and RIPE 90 I proposed to support UTF-8 in the RIPE database
> and asked for feedback.
>
> https://ripe89.ripe.net/wp-content/uploads/presentations/105-RIPE89-DB-WG-UTF-8.pdf
>
> https://ripe90.ripe.net/wp-content/uploads/presentations/120-RIPE90-DB-WG-Operational-Update.pdf
>
> Impact Analysis
> ---------------
>
> Backwards Compatibility
> UTF-8 is backwards compatible with ASCII, in the same way as Latin-1. Any
> RPSL objects solely using ASCII will be compatible with UTF-8 encoding.
> Approximately 99% of all objects in the RIPE database only contain ASCII
> characters.
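A minimal sketch of the compatibility claim above, assuming Python's standard codecs. The function name and the sample "descr:" values are hypothetical and only serve as illustration.

```python
def encodings_agree(text: str) -> bool:
    """Return True if text has the same byte form in Latin-1 and UTF-8.

    This holds exactly when the text is pure ASCII.
    """
    try:
        latin1 = text.encode("latin-1")
    except UnicodeEncodeError:
        # Not representable in Latin-1 at all.
        return False
    return latin1 == text.encode("utf-8")


# Pure ASCII: byte-for-byte identical in both encodings.
print(encodings_agree("descr:          Example Network"))
# Non-ASCII Latin-1 character: one byte in Latin-1, two bytes in UTF-8.
print("Café".encode("latin-1"), "Café".encode("utf-8"))
```

So an ASCII-only object is unaffected, while an object with Latin-1 characters outside ASCII has a different byte representation once it is served as UTF-8.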
>
> Personal Data
> Users must not add personal data in “remarks:” or “descr:” attributes, as
> these attributes are not included in the daily limit accounting, are not
> validated as they contain free text, and are not filtered by default. This
> is already the case in the RIPE database and the introduction of UTF-8
> encoding does not change this. Personal data with UTF-8 encoding is out of
> scope.
>
> Interoperability
> If interoperability is a concern (i.e. a notice must be readable by a
> wider community) then it is recommended that only ASCII values are used.
>
> Valid Codepoints
> UTF-8 input will be validated against the IDNA 2008 standard to decide
> whether a Unicode code point is allowed (i.e. only protocol-valid code
> points are allowed).
> This standard is used in the implementation of Internationalised Domain
> Names (IDNs). This allows for consistency (code points will be mapped to a
> specific set of characters) and improved security (using an inclusion model
> to only allow certain characters).
>
> Guidelines for the Implementation of Internationalized Domain Names
> https://www.icann.org/resources/pages/idn-guidelines-2011-09-02-en
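To make the inclusion model concrete, here is a rough sketch of a per-code-point check. It is only an approximation of the IDNA 2008 derivation rules (RFC 5892): it covers the LDH, Unstable and LetterDigits rules and ignores the exception tables, contextual rules, ignorable properties and blocks, and unassigned code points. The function names are hypothetical, and this is not how the RIPE NCC implements validation.

```python
import unicodedata

# General categories that RFC 5892 groups as "LetterDigits".
LETTER_DIGIT_CATEGORIES = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}
# ASCII lowercase letters, digits and hyphen ("LDH").
LDH = set("-0123456789abcdefghijklmnopqrstuvwxyz")


def is_unstable(ch: str) -> bool:
    """True if ch changes under NFKC(casefold(NFKC(ch)))."""
    nfkc = unicodedata.normalize("NFKC", ch)
    return unicodedata.normalize("NFKC", nfkc.casefold()) != ch


def roughly_pvalid(ch: str) -> bool:
    """Approximate PVALID check; an assumption-laden simplification."""
    if ch in LDH:
        return True
    if is_unstable(ch):
        return False
    return unicodedata.category(ch) in LETTER_DIGIT_CATEGORIES


def disallowed_chars(text: str) -> list:
    """List the characters in text that the rough check would reject."""
    return [ch for ch in text if not roughly_pvalid(ch)]


print(disallowed_chars("привет"))
print(disallowed_chars("Hello, world"))
```

Under this approximation, lowercase Cyrillic text passes, but an uppercase letter, a comma and a space are all rejected. Free text needs those characters, so how the rule applies to "descr:" and "remarks:" values needs spelling out.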
>
> Transliteration
> Transliteration to Latin-1 is only done when necessary to match the
> default response encoding. Otherwise transliteration is not done (i.e.
> UTF-8 characters will be returned as-is).
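A sketch of what transliteration to Latin-1 with a "?" fallback could look like. The actual transliteration rules are not given in the proposal, so stripping combining marks after NFKD decomposition is an assumed stand-in, and to_latin1 is a hypothetical name.

```python
import unicodedata


def to_latin1(text: str) -> bytes:
    """Encode text as Latin-1, transliterating where possible, else '?'."""
    out = []
    for ch in text:
        try:
            ch.encode("latin-1")
            out.append(ch)  # already representable in Latin-1
            continue
        except UnicodeEncodeError:
            pass
        # Assumed transliteration: decompose and drop combining marks.
        base = "".join(
            c for c in unicodedata.normalize("NFKD", ch)
            if not unicodedata.combining(c)
        )
        try:
            base.encode("latin-1")
            out.append(base if base else "?")
        except UnicodeEncodeError:
            out.append("?")  # no Latin-1 equivalent found
    return "".join(out).encode("latin-1")


print(to_latin1("Łódź"))
print(to_latin1("Привет"))
```

With this approach, characters that have a Latin base letter keep that letter, while scripts such as Cyrillic collapse entirely to "?". That is why a UTF-8 response is the better choice for clients that can handle it.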
>
> Impact on RIPE Database Services
> --------------------------------
>
> Whois (Port 43) Query
> * The “descr:” and “remarks:” attributes are returned by default on port
> 43 query responses.
> * Port 43 will continue to use Latin-1 by default. With that default, any
> UTF-8 characters outside the ASCII character set will be transliterated to
> Latin-1 or, failing that, substituted with a “?” character.
> * The client can specify the “-Z utf-8” flag to change the response
> encoding to UTF-8, in which case no transliteration will be done.
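The port 43 behaviour above could be exercised from a client roughly as follows. This is a sketch under assumptions: the host name whois.ripe.net, the placement of the flag before the search term, and the helper names build_query and whois_query are not taken from the proposal.

```python
import socket


def build_query(search_term: str, charset: str = None) -> bytes:
    """Build a port 43 query line, optionally requesting a charset via -Z."""
    flags = f"-Z {charset} " if charset else ""
    return (flags + search_term + "\r\n").encode("ascii")


def whois_query(search_term: str, charset: str = None,
                host: str = "whois.ripe.net", port: int = 43) -> str:
    """Send one query and decode the reply in the requested charset.

    Without a charset, the reply is decoded as Latin-1 (the default
    response encoding described above).
    """
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_query(search_term, charset))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode(charset or "latin-1", errors="replace")


print(build_query("AS3333", "utf-8"))
```

Calling whois_query("AS3333", charset="utf-8") would request and decode a UTF-8 response. Omitting the charset keeps the Latin-1 default, in which non-Latin-1 characters arrive already substituted.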
>
> NRTMv3 (Port 4444)
> * The encoding used by NRTMv3 will continue to be Latin-1. As for port 43,
> any non-Latin-1 characters will be substituted with a “?” character.
>
> NRTMv4
> * No impact. RPSL objects will continue to be returned in UTF-8 encoding
> in snapshot and delta files.
>
> Whois REST API
> * No impact. The Whois REST API already supports UTF-8.
>
> RDAP
> * No impact. The RDAP protocol already supports UTF-8.
>
> Web Application
> * UTF-8 encoding is already supported on the query page.
> * The create and update page validation will be changed to allow UTF-8
> characters in “descr:” and “remarks:” attributes.
>
> Mailupdates
> * No impact. UTF-8 encoding is supported.
>
> Syncupdates
> * No impact. UTF-8 encoding is supported.
>
> Daily Database Dump and Split Files
> * The encoding of the database dump and split files remains Latin-1. The
> “descr:” and “remarks:” attributes are included unfiltered. Any non-Latin-1
> UTF-8 characters will be substituted with a “?” character.
> * We will provide a separate UTF-8 encoded database dump and split files,
> which will include “descr:” and “remarks:” attributes without substitutions.
>
> New LIR Application
> * No impact.
>
> Registry Team
> * No comments or concerns, as changes are limited to the “descr:” and
> “remarks:” attributes.
>
>
> -----
> To unsubscribe from this mailing list or change your subscription options,
> please visit: https://mailman.ripe.net/mailman3/lists/db-wg.ripe.net/
> As we have migrated to Mailman 3, you will need to create an account with
> the email matching your subscription before you can change your settings.
> More details at: https://www.ripe.net/membership/mail/mailman-3-migration/