On 10/24/2025 10:54 PM, [email protected] wrote:
On 25 October 2025 00:38, Asmus Freytag via Unicode
<[email protected]> wrote:
On 10/24/2025 2:58 PM, [email protected] via Unicode wrote:
and not subject to font variation.
That's overstating things.
A./
How is that overstating things?
Because exact glyph details are not normative. Especially for
compatibility characters. Here, the intent is usually to facilitate a
unique and unambiguous mapping between some kind of legacy character and
a Unicode character.
I think that the analysis for the curved connectors that unified two
distinct elements because their rendering was close was a mistake,
because the distinction occurred in the same set and unifying the
characters killed fidelity in round trip conversion for many members of
the "large set" while "saving" only one character code. In my personal
view, that's precisely the wrong way to do unification.
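To make the round-trip point concrete, here is a minimal Python sketch.
The legacy code values (0x10, 0x11) and the second target code point are
hypothetical placeholders, not taken from any actual source set or from
the standard; only the collision itself matters.

```python
# Hypothetical legacy codes; the real source-set values are not given here.
# Unified: two distinct legacy characters share one Unicode code point.
UNIFIED = {0x10: 0x1CE2B, 0x11: 0x1CE2B}
# Separate: the second legacy character gets its own target
# (a private-use placeholder, chosen only for illustration).
SEPARATE = {0x10: 0x1CE2B, 0x11: 0xF0000}


def round_trips(mapping):
    """Return True if every legacy code survives legacy -> Unicode -> legacy."""
    reverse = {}
    for legacy, uni in mapping.items():
        # On a collision the first legacy code wins; the other is lost.
        reverse.setdefault(uni, legacy)
    return all(reverse[uni] == legacy for legacy, uni in mapping.items())


print(round_trips(UNIFIED))   # unified mapping loses one legacy code
print(round_trips(SEPARATE))  # separate mapping keeps both
```

With the unified table, the reverse mapping can return only one of the
two legacy codes, so the other cannot be recovered from plain text.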
When a legacy computing platform defines blocks in terms of fractions,
it does so to ensure specific alignment with those fractions, making
it part of the fundamental character identity. On the other hand, when
a legacy computing platform defines strokes in terms of stem weight
and there is known variation across platforms, it is inappropriate to
define those characters using exact fractions when those fractions
mismatch some of the platforms.
So far, you have only argued that a font (or bitmap) used to emulate a
specific legacy platform should faithfully adhere to any specifications
that apply to that platform.
There is nothing wrong with the same *Unicode* character being rendered
slightly differently when used to emulate *different* platforms. Unless
it is the very same platform that exhibits different shapes (and in the
same display "mode" or "shift"). In that case, the principle of source
set separation becomes applicable (which is the principle that should
have been applied to the curved connector case. If it makes you happy,
you can cite my opinion on that).
However, I didn't spot where that would have been the case for the line
segments. From my quick perusal of the proposals and the critique here
it seems that this is a matter of the different displays having
different weights and therefore, the preferred font / bitmap cannot be
the same in each context. However, there's no implied need to be able
to emulate a screen where different parts of the emulator have to support
different legacy systems. Usually, a single window (or nested window)
would display a single emulator.
Again, the identity of the Unicode character is given by encoding the
intended mappings. If Unicode decides to map the same character to
similar characters on different platforms, that is not a problem, as
long as implementers know that the intent is to use a platform-specific
rendering (and not assume that there is only one possible rendering per
character).
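As a rough illustration of that intent, an emulator could choose its
font per emulated platform while the text itself stays the same. This is
only a sketch: the font file names are made up for the example, and the
platform labels are simply the ones used in this thread.

```python
# Hypothetical font names; no such files are implied to exist.
PLATFORM_FONTS = {
    "C64": "c64-emulation.ttf",
    "PET/VIC-20": "pet-vic20-emulation.ttf",
}


def pick_font(platform):
    """Return the font to use for one emulator window.

    The Unicode text is not changed; only the rendering is
    platform-specific.
    """
    try:
        return PLATFORM_FONTS[platform]
    except KeyError:
        raise ValueError(f"no emulation font configured for {platform!r}")


# One window, one platform, one font.
print(pick_font("C64"))
print(pick_font("PET/VIC-20"))
```

The same character then renders with the stem weight appropriate to
whichever platform the window emulates.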
If you feel that the guidance available to implementers in the text of
the standard or in an annotation of the nameslist is not sufficient, then
the remedy would be to ask for the explanation to be updated. We are
unfortunately locked in as far as character names are concerned, but we
can add a note (best in the text of the standard) that explains that
emulators for some systems will need an adjusted design so a sequence or
other arrangement of these characters looks correct.
A./
PS: I see that you confirm below that the two cases are of a different
nature.
On 25 October 2025 00:44, Asmus Freytag via Unicode
<[email protected]> wrote:
On 10/24/2025 2:54 PM, Nitai Sasson via Unicode wrote:
If you use a font that makes those Unicode characters look like
they did on their original platform, there is no issue. But a
given font can only emulate one platform at a time. You're not
going to get a C64 and PET/VIC-20 frankenstein of a document.
Take your pick: do you want it to look like C64, or do you want
it to look like PET/VIC-20? Choose your font accordingly.
Round tripping plain text to a mix of devices is not a goal, just
as round tripping plain text Han characters to a mix of regional
variants is not a goal.
You (Piotr) need to demonstrate that for a single display, on a
single device or emulator for a single device, you cannot get the
correct appearance by systematically using a device appropriate font.
If a device supports "shifted" modes, then a device appropriate
font may change based on the shift status.
Only when that accommodation fails to produce the correct
appearance is there a case for further disunification.
The diagonal connector issue satisfies this requirement, but as
far as I have been able to understand, the block characters do not.
A./
In the case of PETSCII and Apple II characters, the source characters
have a character identity that is incompatible with the Unicode
characters they are mapped to. Therefore, there is a character identity
conflict between the legacy platform characters and their Unicode
mappings.
In the case of HP 264x characters, by contrast, two source characters
with mutually incompatible character identities are mapped to the same
Unicode character. Therefore, there is a character identity conflict
between the two source characters.
The required evidence to support a request for disunification therefore
always consists of a document (screenshot) (usually other than a
character set table) that shows that the two characters are distinct in
their source environment and that that distinction matters (for example,
that it can't be determined mechanically by context).
From the original document (section 1, page 1), it looks like there are
two characters that are distinct in the source but have been mapped to
the same Unicode character, U+1CE2B. I can certainly sympathize with the
view that unifying these based on their close visual similarity was what
we used to call a case of "arms-length" unification.
As I have explained in "Odp: Re: Unicode fundamental character identity"
<https://corp.unicode.org/pipermail/unicode/2025-January/011312.html>,
this is what it looks like on a screenshot:
https://i.imgur.com/obGQ4Ie.png . The two different characters and
their different types of connections are demonstrated. Furthermore,
since all character tiles are visually independent, and both
characters may be used as isolated character cells, no contextual
mechanism can possibly apply.