On Sun, Nov 2, 2025 at 6:10 PM Martin J. Dürst <[email protected]>
wrote:

> Hello Rob, others,
>
> On 2025-11-03 08:28, Rob Sayre wrote:
> >
> >
> > On 11/2/25 5:03 AM, Pete Resnick wrote:
> >> On 31 Oct 2025, at 7:57, Martin J. Dürst wrote:
> >>
> >>> On 2025-10-29 09:33, Paul Hoffman wrote:
> >>>>
> >>>> On Oct 28, 2025, at 01:35, Martin J. Dürst <[email protected]>
> >>>> wrote:
> >>>>
> >>>>> Content, major: Section 3: "There are many Unicode characters that
> >>>>> obviously cannot be displayed (such as control characters), and
> >>>>> many whose ability to be displayed is debatable.": It's unclear
> >>>>> what "many whose ability to be displayed is debatable." means. I'd
> >>>>> guess it refers to scripts and characters standardized recently,
> >>>>> for which font support is still thin. If that's what is meant,
> >>>>> please say so; if something else is meant, please make clear what
> >>>>> that is.
> >>>>
> >>>> There is a wide variety of things that can be debatable. Are
> >>>> combining characters like U+0315 (COMBINING COMMA ABOVE RIGHT)
> >>>> displayable? What about non-spacing marks like U+0650 (ARABIC
> >>>> KASRA)? I am sure people would take each side of the debate ("I can
> >>>> see the symbol printed in the Unicode Standard" vs. "I can't see
> >>>> that code point on my laptop even though it has quite a complete
> >>>> font set" and so on).
> >>>
> >>> On any decent browser, these should display without problems. When it
> >>> comes to editors, shells, and the like, the field is much wider, so
> >>> there are no absolute guarantees. But these are in Unicode since
> >>> Unicode 1.0 or so, so I would expect these to show.
> >>
> >> I will leave it to you and Paul to replace "debatable" with something
> >> clearer.
> >>
> >
> >
> > Hi,
> >
> > There is an entire RFC about this, which Paul co-wrote.
> >
> > https://www.rfc-editor.org/rfc/rfc9839.html
>
> Last time I checked, none of the characters excluded in any of the sets
> defined in RFC 9839 had any chance whatsoever to turn up in names of
> people or companies or places.
>
>
> > What you may be missing is that social networks have character counts,
> > and they sure do go after these issues.
> >
> > These systems do in fact count a "family" as one character, not
> > multiples with ZWNJs. Once you understand that, it gets a little cleaner.
>
> I know. At a Unicode Conference many years back, I learned (directly
> from the person who initiated that change) that Twitter had switched
> from counting bytes to counting code points, which was the first step in
> that direction.
>
> But we are currently not looking at writing policy about length
> restrictions, so I think this is irrelevant. [It's also irrelevant
> because of the low (=zero?) likelihood of somebody having a family
> emoji, or any emoji for that matter, in their name.]
>

You need these joining controls in Arabic and Persian (ZWNJ is not even the
correct name there, but let's carry on).

https://www.w3.org/TR/2025/DNOTE-alreq-20251002/

Here, we can go for

4.3.4.1 Disjoining Enforcement

or

4.3.4.2 Joining Enforcement

or

4.3.4.3 Joining-Disjoining Enforcement

I am pretty sure you know this stuff, but most others probably don't.

We could use this last name:

علی‌رضا‎ (Ali‌Reza)
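
For anyone whose mail client hides the invisible character, here is a
minimal Python sketch that lists the code points of that name. The escapes
are my own transcription and assume the name is spelled with U+06CC and a
single U+200C between the two halves; the helper strip_format_chars is
hypothetical, only there to show what a filter that drops format characters
would do to the name.

```python
import unicodedata

# Assumed spelling of the name above, written with escapes so that the
# invisible ZERO WIDTH NON-JOINER (U+200C) is visible in the source.
name = "\u0639\u0644\u06CC\u200C\u0631\u0636\u0627"

# Print each code point with its general category and Unicode name.
for ch in name:
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} {unicodedata.name(ch)}")


def strip_format_chars(text: str) -> str:
    """Hypothetical filter: drop every character in general category Cf."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


# The filtered string is one code point shorter, so it is a different
# string and no longer carries the non-joining instruction.
stripped = strip_format_chars(name)
print(len(name), len(stripped))
```

The point is only that the ZWNJ is part of the spelling: a filter that
removes it changes the name.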

thanks,
Rob
-- 
rswg mailing list -- [email protected]
To unsubscribe send an email to [email protected]
