On 2/3/2025 9:36 AM, Sławomir Osipiuk via Unicode wrote:
On Monday, 03 February 2025, 12:19:06 (-05:00), Peter Constable via Unicode wrote:

    As stated previously, Unicode makes no guarantee of supporting
    source separation / round-trip compatibility with HP264x.


I'm honestly surprised by this. I always thought (because it was repeated so many times - must remember repetition does not equal truth) that round-trip compatibility with old character sets was a founding cornerstone of Unicode and so contrastive use (aka source separation) in an old charset would be persuasive evidence for inclusion.

You guys are talking past each other a bit.

Unicode decided early on to guarantee round-trip to important, widely used character sets of the time. The key interest was to be able to deploy software that worked internally in Unicode but could interface with existing systems without incurring data loss in round trip.

This level of guarantee does not exist for just any character set. It didn't even exist for all character sets then in existence.

However, if conflating two characters causes a particular problem, Unicode has accepted case-by-case requests not to unify them, or even to disunify them. In those cases, rather than applying a guarantee, the UTC weighs costs against benefits: the cost of having to encode additional characters (in perpetuity) vs. the benefit for the intended users.
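
To make the round-trip point concrete, here is a minimal sketch of what goes wrong when two source codes are unified into one Unicode character. The byte values, the mapping tables, and the helper names (decode, encode, round_trips) are all invented for illustration; they are not the real HP264x tables or any actual converter.

```python
# Hypothetical legacy charset tables: byte value -> Unicode character.
# These values are made up for illustration, NOT taken from HP264x.
SEPARATED = {0x41: "A", 0xA0: "\u2500", 0xA1: "\u2501"}  # each byte has its own character
UNIFIED = {0x41: "A", 0xA0: "\u2500", 0xA1: "\u2500"}    # two bytes share one character


def decode(data, table):
    """Convert legacy bytes to a Unicode string using the given table."""
    return "".join(table[b] for b in data)


def encode(text, table):
    """Convert a Unicode string back to legacy bytes.

    The reverse map can hold only one byte per character, so when two
    bytes share a character, the one listed later in the table wins.
    """
    reverse = {ch: b for b, ch in table.items()}
    return bytes(reverse[ch] for ch in text)


def round_trips(data, table):
    """True if legacy -> Unicode -> legacy returns the original bytes."""
    return encode(decode(data, table), table) == data


sample = bytes([0x41, 0xA0, 0xA1])
print(round_trips(sample, SEPARATED))
print(round_trips(sample, UNIFIED))
```

With the separated table the bytes come back unchanged; with the unified table the distinction between 0xA0 and 0xA1 is gone after one pass through Unicode. Whether that loss matters in practice is exactly the cost/benefit question.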

If this is a problem with a single character, I don't really buy the cost-savings argument, especially in a case where, after adding some extensions, a whole set could be matched. If a group of characters is involved, the cost goes up.

On the other hand, I would also like to understand the benefit for the supposed user group. Is it mainly that of avoiding a single-pixel infidelity in display only, or are these characters that would need to round-trip because they might be in data that is entered on a simulated device, processed on a Unicode system, and then output again?

I think it's stupid for both sides to fight over a single pixel. Yes, it smells like a bad unification even though the character is arcane (but so are others where minute details matter even though 'nobody' is likely to use that character much). Having a stupidly incomplete mapping can be frustrating, but is being unfaithful going to impact users in any noticeable way?

A./

