Pd: Odp: Re: Unicode fundamental character identity

[email protected] via Unicode Fri, 31 Jan 2025 16:06:50 -0800
On 1 February 2025 at 00:48, James Kass via Unicode <[email protected]> wrote:
> On 2025-01-31 11:15 PM, [email protected] via Unicode wrote:
>> On 31 January 2025 at 23:45, James Kass <[email protected]> wrote:
>>> On 2025-01-31 9:28 PM, [email protected] via Unicode wrote:
>>>> The proposal L2/25-037 already shows a difference in plain text
>>>> of the HP 264x characters, where 0x12 (2) connects below vertical
>>>> or perpendicular diagonal, whereas 0x18 (8) connects below
>>>> diagonal of same direction. Those are different types of
>>>> connections, which is a plain-text distinction of box drawings.
>>>
>>> A "smart" font dedicated to these characters would provide
>>> appropriate glyphs based on context. This would result in a
>>> plain-text display identical to the original display.
>>
>> That doesn't make sense because on a fundamental level, in a legacy
>> computing semigraphical environment, each character tile is drawn
>> independently, and only affects the area of the screen dedicated to
>> that character. Having a context-dependent system would
>> overcomplicate the renderer beyond the scope of the original system.
>> Furthermore, on the HP 264x system, the two characters can exist in
>> isolation (as shown in obGQ4Ie.png
>> <https://i.imgur.com/obGQ4Ie.png>), and the user can in fact type
>> the two characters differently, with the 2 and 8 keys, as shown on
>> page 31 of 204 in
>> 02645-90005_2641A_2645A_2645S_N_Display_Station_Reference_Manual_Nov1978.pdf
>> <http://www.bitsavers.org/pdf/hp/terminal/264x/2645A/02645-90005_2641A_2645A_2645S_N_Display_Station_Reference_Manual_Nov1978.pdf>.
>
> Sorry for the confusion. I'm referring to a Unicode "smart" font
> working on a modern system displaying Unicode plain-text. This is all
> automatic and handled by the rendering system. If a dedicated font is
> used to display the text, contextual glyph substitution would make the
> display indistinguishable from the original display on the legacy
> system. Also, on a modern system any "dumb" font supporting the
> characters would still produce a *legible* display, even though it
> might not be as pretty. And legibility in plain-text is one of the
> factors driving encoding decisions. (This might be why font selection
> was mentioned as a solution in the document referenced earlier.)

That still cannot possibly work on isolated instances of the
characters. In fact, if you have two different Large Character set
strings that differ only in the use of the 0x12 or 0x18 character, the
HP 264x will display them distinctly, but under the Unicode 16.0 mapping
they result in the exact same string, and no amount of contextual glyph
substitution will work. And as I said, complex features such as
contextual glyph substitution are fundamentally out of scope for
characters that originated in semigraphical text, no matter how modern
the system displaying them is.

>>>> Data loss in round-tripping is implicitly evident from the
>>>> information provided in the proposal: if an HP 264x Large
>>>> Character set mode document has the characters 0x12 0x18, it
>>>> converts to Unicode as U+1CE2B U+1CE2B, which converted back to
>>>> HP 264x Large Character set mode is 0x12 0x12, which loses the
>>>> distinction between the two characters and will appear slightly
>>>> different from the original document on the HP 264x platform.
>>>
>>> Yes, this is implicit in the proposal. Any future proposal should
>>> make it explicit while referring to the earlier proposal for
>>> background. Please keep in mind that the committee members must
>>> wade through many different proposals covering all aspects of
>>> character encoding. Keep it short, straightforward, and as simple
>>> as possible to ease their burden.
>>
>> The character has already been proposed. What would any future
>> proposal have to do with that?
>
> If my understanding is correct, the character has already been
> proposed and rejected. It's not uncommon for a subsequent proposal to
> be submitted which addresses concerns raised during the rejection of
> an earlier proposal. (If my understanding is not correct, someone
> will probably set me straight.)

It's not in the Archive of Notices of Non-Approval on www.unicode.org,
so it's not rejected.
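
For readers who want to see the round-trip loss discussed above in concrete terms, here is a minimal Python sketch. It assumes only the two mappings mentioned in this thread (0x12 and 0x18 both to U+1CE2B); the table and function names are hypothetical and are not taken from the proposal or from any real converter. The reverse table resolves the collision to 0x12, which is the outcome described in the thread.

```python
# Hypothetical two-entry excerpt of the mapping discussed in the thread.
# Only these two code points come from the messages; the names are made up.
HP264X_TO_UNICODE = {0x12: "\U0001CE2B", 0x18: "\U0001CE2B"}

# Reverse table: when several HP bytes share one Unicode character, the
# first one listed wins (0x12 here), as in the round trip described above.
UNICODE_TO_HP264X = {}
for byte, char in HP264X_TO_UNICODE.items():
    UNICODE_TO_HP264X.setdefault(char, byte)


def to_unicode(data: bytes) -> str:
    """Convert HP 264x Large Character set bytes to Unicode text."""
    return "".join(HP264X_TO_UNICODE[b] for b in data)


def to_hp264x(text: str) -> bytes:
    """Convert Unicode text back to HP 264x Large Character set bytes."""
    return bytes(UNICODE_TO_HP264X[c] for c in text)


original = bytes([0x12, 0x18])
round_tripped = to_hp264x(to_unicode(original))
print(original.hex(" "), "->", round_tripped.hex(" "))  # 12 18 -> 12 12
```

Because the forward mapping is not one-to-one, no reverse table can recover both bytes: whichever byte it picks, the other one is lost, and the two source strings become indistinguishable once converted.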