Thank you, Malcolm,

Duly noted.  My understanding of everything network-related is pretty
fuzzy, and I guess I was failing to really make a distinction between
Unicode and UTF-8 or UTF-16, though I should know better.

Much obliged,
Jonathan

On Feb 15, 12:45 am, Malcolm Tredinnick <[EMAIL PROTECTED]>
wrote:
> On Thu, 2008-02-14 at 16:57 -0800, [EMAIL PROTECTED] wrote:
> > I am writing an app where site users upload a plaintext file full of
> > unstructured *unicode* data,
>
> By the way -- I know you've already had your a-ha moment for this --
> this statement isn't correct and may confuse you in the future. There's
> no such thing as uploading Unicode data. Network transfers are bytes on
> the wire. Which means the 21-bit Unicode quantities must be encoded
> somehow; there's no "natural" way to send such large values over a
> network. So there is an encoding involved, be it UTF-8 or UTF-16 or
> ISO-8859-1 or whatever. But don't get caught thinking that the data
> coming in from the client is "unicode data". It's not. It may be
> convertible to unicode, once you know the encoding of the original, but
> the difference is important.
>
> Regards,
> Malcolm
>
> --
> Experience is something you don't get until just after you need it.
> http://www.pointy-stick.com/blog/
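
To make the point in the quoted message concrete, here is a minimal sketch, assuming Python 3, where str holds Unicode text and bytes holds encoded data. It is not taken from the thread. The sample string and the decode_upload helper are hypothetical names chosen for illustration. It shows that the same text becomes different byte sequences under UTF-8, UTF-16 and ISO-8859-1, and that the bytes can only be turned back into text reliably when the encoding is known.

```python
# Assumes Python 3: str is Unicode text, bytes is encoded data.
text = "caf\u00e9"  # "cafe" with an acute accent on the e; 4 code points

# The same text encoded three ways gives three different byte strings.
utf8_bytes = text.encode("utf-8")        # e-acute takes 2 bytes
utf16_bytes = text.encode("utf-16-le")   # every character here takes 2 bytes
latin1_bytes = text.encode("iso-8859-1") # every character here takes 1 byte


def decode_upload(raw: bytes, encoding: str) -> str:
    """Hypothetical helper: convert uploaded bytes to text.

    The caller must supply the encoding; the bytes alone do not say.
    """
    return raw.decode(encoding)


# Decoding with the right encoding recovers the original text.
recovered = decode_upload(utf8_bytes, "utf-8")

# Decoding with the wrong encoding either gives the wrong text ...
garbled = decode_upload(utf8_bytes, "iso-8859-1")

# ... or fails outright.
try:
    decode_upload(latin1_bytes, "utf-8")
    wrong_decode_failed = False
except UnicodeDecodeError:
    wrong_decode_failed = True
```

In a file-upload handler, the uploaded content would play the role of the byte strings above, and the encoding has to come from somewhere else: a declared charset, a convention agreed with the users, or a guess.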