Re: U+2018 is not RIGHT HIGH 6

Asmus Freytag Thu, 03 May 2012 01:17:22 -0700

Sometimes you are not free to choose what you would like.


One thing that's off the table is a new character code.

The reason for that categorical statement is that there is too much data and software out there that uses the existing character codes. Throwing a new character into the mix will just create confusion. Text that should be identical would acquire two alternate representations depending on whether the new or the old character is used. That's not good, especially not for a situation that, while not ideal, has been tolerated by tens of millions of users for decades - which means it's not one of life's most urgent crises.

Sometimes, even when you are creating a "new" character encoding, you are not actually free in your design of it. That happened to Unicode. For characters that were in use (especially widespread use) at the time Unicode was created, it was practically impossible to re-analyze them based on some "ideal" precepts.

Where this was attempted, reality caught up with Unicode rather swiftly. You can see traces of this in the naming of the quotation marks. Unicode's principle had been the "semantic" encoding of characters, so the distinction was made based on the presumed function or positioning (opening or closing, left or right adjacent to the text).
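
To make the naming concrete, here is a small Python sketch that lists the character names of the curly quotation marks using the standard unicodedata module. The QUOTES list and the quote_names helper are names made up for this illustration. Note how the first pairs are named by side (LEFT, RIGHT), while the later ones are named by shape (LOW-9, HIGH-REVERSED-9).

```python
import unicodedata

# Code points of the common "curly" quotation marks (illustrative selection).
QUOTES = [0x2018, 0x2019, 0x201A, 0x201B, 0x201C, 0x201D, 0x201E, 0x201F]

def quote_names():
    """Map 'U+XXXX' labels to the formal Unicode character names."""
    return {f"U+{cp:04X}": unicodedata.name(chr(cp)) for cp in QUOTES}

if __name__ == "__main__":
    for code, name in quote_names().items():
        print(code, name)
```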

At the same time, the actual set of these characters was based on various other, then-existing character sets and collections. Although at the time many standard character sets were still limited to straight quotes, some did have the curly ones, including sets used for typesetting systems.

The fact that the use of quotation marks is so radically language-dependent was not understood from the beginning; otherwise they would have been named not by function but by something else. The use of a Times-like roman font for the representative glyphs further obscured the glyph issue that you are trying to bring to our attention here.

Taking all of this together meant that Unicode took on board the characters as they were then defined in legacy sets and half-heartedly named them by function, ignoring that the function wasn't constant across languages.

In any case, trying to approach this from the semantic position has issues. In Swedish, for example, you use the same quotation mark symbol for both opening and closing. It would be more than bizarre to use two different characters for that purpose.
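
A minimal sketch of why function-based names break down across languages, assuming the usual double-quote conventions for English, German and Swedish. The CONVENTIONS table and both helper functions are hypothetical names for this illustration, not anything defined by Unicode.

```python
import unicodedata

# (opening, closing) double quotation marks per language.
# Illustrative table of common conventions, not an exhaustive or official list.
CONVENTIONS = {
    "English": ("\u201C", "\u201D"),
    "German": ("\u201E", "\u201C"),
    "Swedish": ("\u201D", "\u201D"),  # same character on both sides
}

def quote(text, language):
    """Wrap text in the quotation marks conventionally used for the language."""
    opening, closing = CONVENTIONS[language]
    return f"{opening}{text}{closing}"

def closing_name(language):
    """Return the formal Unicode name of the closing mark for the language."""
    return unicodedata.name(CONVENTIONS[language][1])

if __name__ == "__main__":
    for language in CONVENTIONS:
        print(language, quote("text", language), "closes with", closing_name(language))
```

In this table the German closing mark is the character named LEFT DOUBLE QUOTATION MARK, and Swedish uses RIGHT DOUBLE QUOTATION MARK to open as well as to close.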

So that defines the characters a bit more by appearance than semantics. On the other hand, you are pointing out that some uses allow a wider range of glyphic variation of the existing characters than other usages.

This is something that should be documented, but in terms of helping font designers provide the correct glyphs for each context. The time to create a special character code for German quotation marks has passed. The moment for that would have been the late 70s.

A./
