Re: Ascii to Unicode.

Ethan Furman Thu, 29 Jul 2010 11:46:06 -0700

Joe Goldthwaite wrote:

Hi Ulrich,


Ascii.csv isn't really a latin-1 encoded file.  It's an ascii file with a
few characters above the 128 range . . .

It took me a while to get this point too (if you already have "gotten it", I apologize, but the above comment leads me to believe you haven't).

*Every* file is an encoded file... even your UTF-8 file is encoded using the UTF-8 format. Someone correct me if I'm wrong, but I believe lower-ascii (0-127) matches up to the first 128 Unicode code points, so while those first 128 code-points translate easily to ascii, ascii is still an encoding, and if you have characters higher than 127, you don't really have an ascii file -- you have (for example) a cp1252 file (which also, not coincidentally, shares the first 128 characters/code points with ascii).
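
Here is a rough sketch of what I mean, written in Python 3 syntax. The sample bytes and the variable names are made up for illustration, and cp1252 is only a guess at what such a file might really be encoded in; latin-1 or something else is just as possible.

```python
# Hypothetical line from a "mostly ascii" CSV file: one byte (0xE9) is above 127.
raw = b"caf\xe9,10\n"

# Bytes in the 0-127 range decode to the same text under ascii, cp1252 and UTF-8.
plain = b"cafe,10\n"
same_everywhere = (
    plain.decode("ascii") == plain.decode("cp1252") == plain.decode("utf-8")
)

# The byte above 127 is not valid ascii, so a strict ascii decode fails.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Assuming the file is really cp1252, decoding gives proper Unicode text...
text = raw.decode("cp1252")

# ...which can then be written back out as UTF-8 (itself just another encoding).
utf8_bytes = text.encode("utf-8")
```

If the cp1252 guess is wrong the decode may still succeed but give you the wrong characters, so it is worth checking a few of the non-ascii values by eye.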


Hopefully I'm not adding to the confusion.  ;)

~Ethan~
--
http://mail.python.org/mailman/listinfo/python-list
