Thank you, Malcolm. Duly noted. My understanding of everything network-related is pretty fuzzy, and I guess I was failing to make a real distinction between Unicode and UTF-8 or UTF-16, though I should know better.
Much obliged,
Jonathan

On Feb 15, 12:45 am, Malcolm Tredinnick <[EMAIL PROTECTED]> wrote:
> On Thu, 2008-02-14 at 16:57 -0800, [EMAIL PROTECTED] wrote:
> > I am writing an app where site users upload a plaintext file full of
> > unstructured *unicode* data,
>
> By the way -- I know you've already had your a-ha moment for this --
> this statement isn't correct and may confuse you in the future. There's
> no such thing as uploading Unicode data. Network transfers are bytes on
> the wire. Which means the 21-bit Unicode quantities must be encoded
> somehow; there's no "natural" way to send such large values over a
> network. So there is an encoding involved, be it UTF-8 or UTF-16 or
> ISO-8859-1 or whatever. But don't get caught thinking that the data
> coming in from the client is "unicode data". It's not. It may be
> convertible to unicode, once you know the encoding of the original, but
> the difference is important.
>
> Regards,
> Malcolm
>
> --
> Experience is something you don't get until just after you need it.
> http://www.pointy-stick.com/blog/

You received this message because you are subscribed to the Google Groups "Django users" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/django-users?hl=en
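For anyone reading this thread later, the bytes-versus-Unicode distinction Malcolm describes can be sketched in a few lines of Python 3. This is only an illustration and not code from the thread: the helper name `decode_upload` and the sample string are assumptions, and the encodings are the three named in his message.

```python
# Minimal sketch (Python 3): what arrives over the network is bytes,
# and it only becomes text once a known encoding is applied.

def decode_upload(raw: bytes, encoding: str = "utf-8") -> str:
    """Hypothetical helper: turn uploaded bytes into text, given the encoding."""
    return raw.decode(encoding)


# Sample text (an assumption): "naive cafe" with two accented letters,
# written with escapes so it is exactly 10 code points.
text = "na\u00efve caf\u00e9"

# The same text becomes a different byte sequence under each encoding.
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")
latin1_bytes = text.encode("iso-8859-1")
print(len(utf8_bytes), len(utf16_bytes), len(latin1_bytes))  # 12 20 10

# Decoding with the correct encoding recovers the original text.
print(decode_upload(utf8_bytes, "utf-8") == text)  # True

# Decoding with the wrong encoding can fail outright...
try:
    decode_upload(latin1_bytes, "utf-8")
except UnicodeDecodeError:
    print("these bytes are not valid UTF-8")

# ...or succeed silently and produce garbled text.
garbled = decode_upload(utf8_bytes, "iso-8859-1")
print(garbled == text)  # False
```

The last case is the one to watch for: ISO-8859-1 maps every byte to some character, so a wrong guess raises no error and the damage only shows up later.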

