Mark E. Shoulson:

> But Unicode isn't about encoding what would be neat to encode.  It's about 
> encoding _text_, (including things that have been encoded before).

It’s sometimes non-obvious, though, what one should consider as text and what 
one should not, e.g. mathematical formulae.

I would assume that the following usually indicate good candidates for 
encoding:

— A glyph composed with (La)TeX commands and frequently asked for in forums.
— 1em high inline pictures included in HTML on websites. CSS not so much.
— Letter-sized glyphs seen in print and manuscripts amongst common characters, 
e.g. the logographic heart in “I ♥ NY”.
— Character( sequence)s used for their glyphic appearance in short messages 
(SMS, IM, IRC, …), tweets, status updates and the like.

Concerning the last category: does Unicode need to encode a character 
BUTTERFLY, because character sequences like ‘Ƹ̵̡Ӝ̵̨̄Ʒ’ (a Roman/IPA–Cyrillic 
mix) are quite common and popular in certain social groups? 
Of course, inline (rotated) ASCII art already existed before the advent of 
Unicode, most notably in the form of emoticons or smileys, but also for 
instance flowers ‘-<-@’ → ⚘ U+2698, hearts ‘<3’ → ♡/♥ U+2661/U+2665 or 
scissors ‘8<’/‘>8’ → ✂ U+2702 etc.
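
To make the script mix and the ASCII-art correspondences concrete, here is a 
minimal Python sketch using the standard unicodedata module. It rests on two 
assumptions of mine: only the three base letters of the butterfly sequence 
are listed (its combining marks are left out), and the helper names describe 
and scripts are made up for illustration.

```python
import unicodedata

# Base letters of the butterfly sequence quoted above. The combining marks
# of the original sequence are deliberately omitted (assumption).
BUTTERFLY_BASES = "\u01B8\u04DC\u01B7"

# ASCII art mentioned above and the code points of the symbols it imitates.
ASCII_ART = {
    "-<-@": [0x2698],
    "<3": [0x2661, 0x2665],
    "8<": [0x2702],
}


def describe(ch):
    """Return 'U+XXXX NAME' for a single character (hypothetical helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def scripts(text):
    """Rough script guess: first word of each non-combining character's name."""
    return [
        unicodedata.name(ch).split()[0]
        for ch in text
        if not unicodedata.combining(ch)
    ]


if __name__ == "__main__":
    for ch in BUTTERFLY_BASES:
        print(describe(ch))
    for art, code_points in ASCII_ART.items():
        print(art, "->", ", ".join(describe(chr(cp)) for cp in code_points))
```

Taking the first word of the character name is only a crude stand-in for the 
Script property, which the standard library does not expose directly.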

Especially in comics, curse words are sometimes not written out explicitly 
but replaced by a sequence of more or less arbitrary symbols. Should one 
encode this with existing symbol characters for their glyphic value, e.g. 
“$#*!”? Or would it be better to have a logical character DISGUISED CURSE 
that smart fonts might render as an asterisk or bullet point or a random 
sequence thereof, e.g. “f***”, or as non-phonographic symbols selected in an 
(almost) random manner, e.g. “f♨⚔⚡”?
