Re: [Python-Dev] bytes / unicode

Terry Reedy Tue, 22 Jun 2010 13:23:50 -0700

On 6/22/2010 1:22 AM, Glyph Lefkowitz wrote:

The thing that I have heard in passing from a couple of folks with
experience in this area is that some older software in Asia would
present characters differently if they were originally encoded in a
"Japanese" encoding versus a "Chinese" encoding, even though they were
really "the same" characters.

As I tried to say in another post, that to me is similar to wanting to present English text in different fonts depending on whether it was spoken by an American or a Brit, or by a modern person versus a Renaissance person.

I do know that Han Unification is a giant political mess
(<http://en.wikipedia.org/wiki/Han_unification> makes for some


Thanks, I will take a look.

interesting reading), but my understanding is that it has handled enough
of the cases by now that one can write software to display Asian
languages and it will basically work with a modern version of Unicode.
(And of course, there's always the private use area, as Stephen Turnbull
pointed out.)

Regardless, this is another example where keeping around a string isn't
really enough. If you need to display a Japanese character in a distinct
way because you are operating in the Japanese *script*, you need a tag
surrounding your data that is a hint to its presentation. The fact that
these presentation hints were sometimes determined by their encoding is
an unfortunate historical accident.

Yes. The Asian languages I know anything about seem to natively have almost none of the symbols English has, many borrowed from math, that have been pressed into service for text markup.
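
To make the "tag surrounding your data" idea concrete, here is a minimal sketch, not anything proposed in this thread. The TaggedText class, its lang field, and the decode_tagged helper are hypothetical names invented for illustration; only the shift_jis and big5 codecs come from the standard library. It shows that the same two Han characters decode to identical str values from either legacy encoding, so any presentation hint has to travel next to the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedText:
    """Hypothetical wrapper: decoded text plus a presentation hint."""
    text: str
    lang: str  # assumed to be a language tag such as "ja" or "zh-Hant"


def decode_tagged(raw: bytes, encoding: str, lang: str) -> TaggedText:
    # Decoding discards the source encoding, so the hint is stored explicitly.
    return TaggedText(raw.decode(encoding), lang)


# U+65E5 U+672C, written with escapes so the source stays ASCII.
sample = "\u65e5\u672c"

ja = decode_tagged(sample.encode("shift_jis"), "shift_jis", "ja")
zh = decode_tagged(sample.encode("big5"), "big5", "zh-Hant")

# The byte sequences differ, but the decoded code points are identical.
bytes_differ = sample.encode("shift_jis") != sample.encode("big5")
same_text = ja.text == zh.text
# Only the explicit tag keeps the two values distinguishable.
same_value = ja == zh
```

A renderer could then choose a font from the lang field rather than guessing from the original byte encoding, which is no longer recoverable after decoding.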



--
Terry Jan Reedy

