RE: Characters that should be displayed?

Shawn Steele Sun, 29 Jun 2014 12:25:27 -0700

Corrected typo, sorry. (someone thing/someone think)

-----Original Message-----
From: Unicode [mailto:unicode-boun...@unicode.org] On Behalf Of Shawn Steele
Sent: Sunday, June 29, 2014 11:59 AM
To: Koji Ishii; Unicode Mailing List
Subject: RE: Characters that should be displayed?


If the concern is security, I cannot imagine why CSS would even want something 
like BELL to be legal at all.  

I'm not sure that replacement glyphs would help much.  I mean would someone 
think that �Shawn was something spoofing Shawn, or just assume their 
browser/computer had a rendering glitch?  I think most people would just ignore 
the unexpected character and assume something was quirky with the web page.

-Shawn

-----Original Message-----
From: Unicode [mailto:unicode-boun...@unicode.org] On Behalf Of Koji Ishii
Sent: Sunday, June 29, 2014 11:44 AM
To: Unicode Mailing List
Subject: Characters that should be displayed?

Hello Unicoders,

I’m a co-editor of CSS Text Level 3[1], and I would appreciate your support in 
defining rendering behavior in CSS.

The spec currently has the following text[2]:

> Control characters (Unicode class Cc) other than tab (U+0009), line feed 
> (U+000A), and carriage return (U+000D) are ignored for the purpose of 
> rendering. (As required by [UNICODE], unsupported Default_ignorable 
> characters must also be ignored for rendering.)

and there is feedback saying that CSS should display visible glyphs for these 
control characters. Since no major browser displays them today, this would be 
a breaking change, and the CSS WG needs to discuss the feedback. Before doing 
so, the WG would like to understand what Unicode recommends.
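
For concreteness, the current CSS rule quoted above can be sketched roughly as 
follows. This is only an illustration of the quoted sentence, not how any 
browser implements it; the function names are made up for this example, and 
the Default_Ignorable part of the rule is left out because Python's 
unicodedata module does not expose that property.

```python
import unicodedata

# Controls that the quoted CSS text exempts from being ignored.
KEPT_CONTROLS = {"\t", "\n", "\r"}  # U+0009, U+000A, U+000D


def is_ignored_control(ch):
    """True if ch is in Unicode class Cc and is not tab, LF, or CR."""
    return unicodedata.category(ch) == "Cc" and ch not in KEPT_CONTROLS


def strip_ignored_controls(text):
    """Drop the characters that the quoted rule says are ignored for rendering."""
    return "".join(ch for ch in text if not is_ignored_control(ch))
```

With this sketch, a BELL (U+0007) in front of a name simply disappears, while 
tabs and line breaks are kept.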

I found the following text in Unicode 6.3, p. 185, “5.21 Ignoring Characters in 
Processing”[3]:

> Surrogate code points, private-use characters, and control characters are not 
> given the Default_Ignorable_Code_Point property. To avoid security problems, 
> such characters or code points, when not interpreted and not displayable by 
> normal rendering, should be displayed in fallback rendering with a fallback 
> glyph

By looking at this, my questions are as follows:

1. Should control characters that browsers do not interpret be displayed in 
fallback rendering?
2. Should private-use characters (U+E000-U+F8FF, U+F0000-U+FFFFD, 
U+100000-U+10FFFD) without glyphs be displayed in fallback rendering?

The answer to both questions is probably yes, as far as I understand the text 
quoted above, but things get harder the more I think about it:

3. When the above text says “surrogate code points”, does that mean everything 
outside the BMP? It reads that way to me, but I’m surprised that characters 
inside and outside the BMP would differ in this way, so I’m doubting my English 
skills.
4. Should every code point that is not given the Default_Ignorable_Code_Point 
property, and that has neither an interpretation nor a glyph, be displayed in 
fallback rendering? I could not find such a statement in the Unicode spec, but 
there are some people who believe so.
5. Is there anything else Unicode recommends displaying in fallback rendering, 
or not displaying? This is probably an RTFM question, but a pointer to where to 
read would be appreciated.
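
To make questions 1 and 2 concrete, here is a minimal sketch of one possible 
reading of the quoted Unicode passage: a code point whose general category is 
control (Cc), surrogate (Cs), or private use (Co) gets a fallback glyph when it 
is neither interpreted nor displayable. The names below are hypothetical, and 
font_has_glyph stands in for whatever font lookup a renderer would really use. 
It does not settle questions 4 and 5.

```python
import unicodedata

# General categories named in the quoted passage:
# Cc = control, Cs = surrogate, Co = private use.
FALLBACK_CATEGORIES = {"Cc", "Cs", "Co"}

# Controls that the CSS text treats as interpreted (tab, LF, CR).
INTERPRETED_CONTROLS = {"\t", "\n", "\r"}


def wants_fallback_glyph(ch, font_has_glyph=lambda c: False):
    """Assumed reading: show a fallback glyph for uninterpreted Cc/Cs/Co
    code points that the (hypothetical) font lookup cannot display."""
    if ch in INTERPRETED_CONTROLS:
        return False
    if unicodedata.category(ch) not in FALLBACK_CATEGORIES:
        return False
    return not font_has_glyph(ch)


def categories(text):
    """List (code point, general category) pairs, for inspection."""
    return [("U+%04X" % ord(ch), unicodedata.category(ch)) for ch in text]
```

In this sketch, category Cs covers only the surrogate code points 
U+D800-U+DFFF. An ordinary character outside the BMP has its own category and 
is not affected, while the private-use planes report Co just like 
U+E000-U+F8FF.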

Thank you for your support in advance.

[1] http://dev.w3.org/csswg/css-text/
[2] http://dev.w3.org/csswg/css-text/#white-space-processing
[3] http://www.unicode.org/versions/Unicode6.3.0/ch05.pdf

/koji


_______________________________________________
Unicode mailing list
Unicode@unicode.org
http://unicode.org/mailman/listinfo/unicode
