Re: Unicode fundamental character identity

James Kass via Unicode Fri, 31 Jan 2025 15:43:21 -0800



On 2025-01-31 11:15 PM, [email protected] via Unicode wrote:

On 31 January 2025 at 23:45, James Kass <[email protected]> wrote:
    (Hi Piotr, I sent this to the list about 45 minutes ago but it has not
    come through yet so I'm sending it along to you directly.  Hope this
    helps.  -James)

    -------- Forwarded Message --------
    Subject: Re: Odp: RE: Re: Re: Unicode fundamental character identity
    Date: Fri, 31 Jan 2025 22:01:54 +0000
    From: James Kass <[email protected]>
    To: [email protected]





    On 2025-01-31 9:28 PM, [email protected] via Unicode wrote:

        The proposal L2/25-037 already shows a difference in plain text of
        the HP 264x characters, where 0x12 (2) connects below vertical or
        perpendicular diagonal, whereas 0x18 (8) connects below diagonal of
        same direction. Those are different types of connections which is a
        plain text distinction of box drawings.

    A "smart" font dedicated to these characters would provide appropriate
    glyphs based on context.  This would result in a plain-text display
    identical to the original display.
That doesn't make sense because on a fundamental level, in a legacy computing semigraphical environment, each character tile is drawn independently, and only affects the area of the screen dedicated to that character. Having a context dependent system would overcomplicate the renderer beyond the scope of the original system. Furthermore, on the HP 264x system, the two characters can exist in isolation (as shown in obGQ4Ie.png (1440×720) (imgur.com) <https://i.imgur.com/obGQ4Ie.png>), and the user can in fact type the two characters differently, with the 2 and 8 keys as shown in page 31 of 204 in 02645-90005_2641A_2645A_2645S_N_Display_Station_Reference_Manual_Nov1978.pdf (bitsavers.org) <http://www.bitsavers.org/pdf/hp/terminal/264x/2645A/02645-90005_2641A_2645A_2645S_N_Display_Station_Reference_Manual_Nov1978.pdf>.

Sorry for the confusion. I'm referring to a Unicode "smart" font working on a modern system displaying Unicode plain-text. This is all automatic and handled by the rendering system. If a dedicated font is used to display the text, contextual glyph substitution would make the display indistinguishable from the original display on the legacy system. Also, on a modern system any "dumb" font supporting the characters would still produce a *legible* display, even though it might not be as pretty. And legibility in plain-text is one of the factors driving encoding decisions. (This might be why font selection was mentioned as a solution in the document referenced earlier.)

        Data loss in round-tripping is implicitly evident from the
        information provided in the proposal: if an HP 264x Large Character
        set mode document has the characters 0x12 0x18, it converts to
        Unicode as U+1CE2B U+1CE2B, which converted back to HP 264x Large
        Character set mode is 0x12 0x12, which loses the distinction between
        the two characters and will appear slightly differently than the
        original document on HP 264x platform.
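
The round trip described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table and function names are hypothetical, the tables cover just the two bytes under discussion, and the choice of 0x12 as the reverse target simply follows the example given in the text. Only the pairing of 0x12 and 0x18 with U+1CE2B comes from the discussion.

```python
# Hypothetical forward table: both legacy bytes map to the same code point,
# as described in the discussion of the proposal.
TO_UNICODE = {
    0x12: "\U0001CE2B",
    0x18: "\U0001CE2B",
}


def build_reverse(table):
    """Build a reverse table, keeping the lowest byte for each code point.

    Assumption: the converter picks 0x12 over 0x18 when mapping back.
    """
    reverse = {}
    for byte in sorted(table):
        reverse.setdefault(table[byte], byte)
    return reverse


FROM_UNICODE = build_reverse(TO_UNICODE)


def to_unicode(data: bytes) -> str:
    """Convert legacy bytes to Unicode text using the forward table."""
    return "".join(TO_UNICODE[b] for b in data)


def from_unicode(text: str) -> bytes:
    """Convert Unicode text back to legacy bytes using the reverse table."""
    return bytes(FROM_UNICODE[ch] for ch in text)


original = bytes([0x12, 0x18])
round_tripped = from_unicode(to_unicode(original))

# The two source bytes differ, but the round trip cannot tell them apart.
print(original.hex(" "), "->", round_tripped.hex(" "))
```

Because the forward table is many-to-one, no reverse table can recover both bytes; whichever byte is chosen, the other is lost.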

    Yes, this is implicit in the proposal.  Any future proposal should make
    it explicit while referring to the earlier proposal for background.
    Please keep in mind that the committee members must wade through many
    different proposals covering all aspects of character encoding.  Keep it
    short, straightforward, and simple as possible to ease their burden.

The character has already been proposed. What would any future proposal have to do with that?

If my understanding is correct, the character has already been proposed and rejected. It's not uncommon for a subsequent proposal to be submitted which addresses concerns raised during the rejection of an earlier proposal. (If my understanding is not correct, someone will probably set me straight.)
