On 1/19/2018 5:37 AM, Philippe Verdy wrote:
Maybe IDN could accept a new combining diacritic (a sort of right-side acute accent). After all, the Kazakh intent is not to define a new separate character but to modify a base letter so that it forms a single letter in their alphabet. Hence a proposal for COMBINING APOSTROPHE (whose spacing, non-combining version is 02BC), so that SPACE + COMBINING APOSTROPHE would render exactly like 02BC.
In the case of TLD IDNs, what is at issue is precisely the fact that it "renders exactly like" 02BC (which in turn renders exactly like 2019).
You can see the issue when you look at Andre's Twitter tags: you can create two strings that look the same, but in which the part recognized as a hashtag is different. That is deemed an unacceptable security risk for TLD IDNs.
If you encoded such a combining character, it would also not be eligible for TLD IDNs.
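To make the look-alike problem concrete, here is a minimal Python sketch. It uses only the standard unicodedata and re modules. The pattern "#\w+" is a simplified stand-in for a hashtag rule, not Twitter's actual one, and the sample word and the helper names (describe, hashtag_part, sample) are hypothetical, chosen only for illustration.

```python
import re
import unicodedata

# The three visually similar candidates discussed in this thread.
CANDIDATES = {
    "U+0027": "\u0027",  # ASCII apostrophe
    "U+2019": "\u2019",  # right single quotation mark
    "U+02BC": "\u02BC",  # modifier letter apostrophe
}

# Assumption: a simplified hashtag rule ('#' followed by word characters).
# This is NOT Twitter's real rule, only an approximation for illustration.
HASHTAG = re.compile(r"#\w+")


def describe(ch):
    """Return the Unicode name and general category of one character."""
    return unicodedata.name(ch), unicodedata.category(ch)


def hashtag_part(text):
    """Return the span the simplified rule treats as the hashtag, or None."""
    match = HASHTAG.search(text)
    return match.group(0) if match else None


def sample(ch):
    """Build a hypothetical tag with the given apostrophe-like character."""
    return "#a" + ch + "b"


for label, ch in CANDIDATES.items():
    name, category = describe(ch)
    # Only the character with a letter category (Lm) stays inside the tag.
    print(label, name, category, repr(hashtag_part(sample(ch))))

# Normalization does not unify the look-alikes either.
assert unicodedata.normalize("NFKC", "\u02BC") != unicodedata.normalize("NFKC", "\u2019")
```

Under these assumptions, only the string using 02BC is kept whole as a hashtag, because 02BC has a letter category while 0027 and 2019 are punctuation. The three samples nevertheless look alike in many fonts, which is the confusability at issue.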
A./
2018-01-18 19:51 GMT+01:00 Asmus Freytag via Unicode:

Top level IDN domain names can not contain 02BC, nor 0027 or 2019.
(RFC 6912 gives the rationale and RZ-LGR the implementation, see MSR-3 <https://www.icann.org/public-comments/msr-3-2018-01-17-en>)
A./

On 1/18/2018 3:00 AM, Andre Schappo via Unicode wrote:

On 18 Jan 2018, at 08:21, Andre Schappo via Unicode wrote:

On 16 Jan 2018, at 08:00, Richard Wordingham via Unicode wrote:

On Mon, 15 Jan 2018 20:16:21 -0800 James Kass via Unicode wrote:

It will probably be the ASCII apostrophe. The stated intent favors the apostrophe over diacritics or special characters to ensure that the language can be input to computers with standard keyboards.

Typing U+0027 into a word processor takes planning. Of the three, it should obviously be the modifier letter U+02BC, but I think what gets stored will be U+0027 or the single quotation mark U+2019. However, we shouldn't overlook the diacritic mark U+0315 COMBINING COMMA ABOVE RIGHT.
Richard.

I have just tested twitter hashtags and as one would expect, U+02BC does not break hashtags. See twitter.com/andreschappo/status/953903964722024448 <http://twitter.com/andreschappo/status/953903964722024448>

...and, just in case twitter.com/andreschappo/status/953944089896083456 <http://twitter.com/andreschappo/status/953944089896083456>

André Schappo

