Re: Localflavors: request for developers and triagers
On Fri, 2007-03-30 at 21:03 -0700, Simon G. wrote:
> Hi Malcolm,
>
> Should ALL the strings in these be unicode objects, even if they don't
> have any extended characters? I'm looking at the already checked in fr
> localflavor, and these are all ASCII with the respective file set to
> UTF-8 with that -*- thing (what is that called?), there's also a
> Brazilian one that I've marked as ready to go, but only the strings
> with non-ASCII chars are unicoded.

Only strings containing non-ASCII characters are really required to be unicode strings, as far as the current technical reasons go. The idea is that if you see a str object, it's a bytestring or ASCII. Not a "maybe it's UTF-8 and we just don't know because we threw away the important information" case.

Regards,
Malcolm

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups "Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---
Re: Localflavors: request for developers and triagers
On 3/30/07, Simon G. <[EMAIL PROTECTED]> wrote:
...
> UTF-8 with that -*- thing (what is that called?), there's also a

http://docs.python.org/ref/encodings.html

Encoding declaration. (Incidentally, the process of parsing the file given the declaration is described pretty clearly there, too. Good to know. :))
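For anyone who wants to check which encoding Python picks up from such a declaration, here is a minimal sketch. It assumes a modern Python 3 interpreter, where `tokenize.detect_encoding` is available and a file without a declaration defaults to UTF-8 (the Python 2 releases discussed in this thread defaulted to ASCII). The sample source bytes and the helper name `declared_encoding` are made up for illustration.

```python
import io
import tokenize

# Hypothetical file contents; only the first two lines are inspected for a declaration.
utf8_source = b"# -*- coding: utf-8 -*-\nNAME = 'example'\n"
latin1_source = b"# -*- coding: latin-1 -*-\nNAME = 'example'\n"


def declared_encoding(source_bytes):
    """Return the normalized encoding name Python detects for the given source bytes."""
    encoding, _lines_read = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


utf8_name = declared_encoding(utf8_source)
latin1_name = declared_encoding(latin1_source)
print(utf8_name, latin1_name)
```

Note that the returned names are normalized, so a `latin-1` declaration is reported under its canonical ISO name.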
Re: Localflavors: request for developers and triagers
Hi Malcolm,

Should ALL the strings in these be unicode objects, even if they don't have any extended characters? I'm looking at the already checked in fr localflavor, and these are all ASCII with the respective file set to UTF-8 with that -*- thing (what is that called?), there's also a Brazilian one that I've marked as ready to go, but only the strings with non-ASCII chars are unicoded.

--Simon
Re: Localflavors: request for developers and triagers
Hi Ville,

On Fri, 2007-03-30 at 01:09 -0700, Ville Säävuori wrote:
> I'm the author of the Finnish localflavor, #3847.
>
> > So could triagers please mark tickets with such strings as needing an
> > improved patch and could original developers of these files please
> > include any strings containing non-ASCII characters in as u"..."
> > strings, not traditional strings.
>
> I updated my patch to follow this practice. I'm not quite sure that
> the new patch is okay, but please add a comment to it if there's
> still something to change :)

The new patch looks fine. I'll commit it right after dinner.

Best wishes,
Malcolm
Re: Localflavors: request for developers and triagers
I'm the author of the Finnish localflavor, #3847.

> So could triagers please mark tickets with such strings as needing an
> improved patch and could original developers of these files please
> include any strings containing non-ASCII characters in as u"..."
> strings, not traditional strings.

I updated my patch to follow this practice. I'm not quite sure that the new patch is okay, but please add a comment to it if there's still something to change :)

- VS
Localflavors: request for developers and triagers
With all the newfound interest in localflavor/ contributions, I've started to see a pattern that could cause us trouble:

Many locales have native names that use characters outside the ASCII range. So people put "# -*- coding: utf-8 -*-" at the top of their file and then include the proper names. Okay, sounds reasonable.

But it has a problem. When Python parses a file containing the encoding marker, it produces a str object containing the resulting string encoded as UTF-8. However, this object carries no information that it is UTF-8 encoded. This makes it hard to know what such strings are when you see them in a context far removed from their original import. You also can't call encode(), or decode() with the default codec, on the string if it contains bytes greater than 127, which are always present in non-ASCII UTF-8 strings, because Python falls back to the ASCII codec and raises an error.

To make our life easier (read "slightly more bullet-resistant"), these character strings should be unicode objects, not str objects. That way, we have the information we need to convert to other character encodings as needed.

So could triagers please mark tickets with such strings as needing an improved patch and could original developers of these files please include any strings containing non-ASCII characters in as u"..." strings, not traditional strings.

Nobody ever said Python unicode handling was perfect -- well, one guy did once, but he was drunk at the time and we sent him out on a one-way trip to the ice floes to go fishing -- and sometimes we have to just work around the sharp bits. The above semi-problem wouldn't be an issue except that we do need to be able to encode the strings into different encodings for input and database storage. The alternative would be to *always* use UTF-8 internally and that isn't an insane idea -- I know a few C libraries that work that way and it may turn out to be slightly faster in Python -- but it doesn't seem to be the traditional Python way, as far as I can tell. We best play as the other boys do to encourage other developers.
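To make the difference concrete, here is a rough sketch of the problem described above. It is written for Python 3 so that it runs on current interpreters, which means `bytes` stands in for the old str type and plain `str` stands in for unicode; that mapping is an assumption of this example, and the region name is only sample data.

```python
# Sample data: a Finnish region name containing three non-ASCII characters.
region = "Päijät-Häme"

# Roughly what a plain "..." literal held under Python 2 in a UTF-8 source file:
# the raw UTF-8 bytes, with nothing attached to say which encoding they use.
utf8_bytes = region.encode("utf-8")

# Guessing the wrong codec later does not fail; it silently produces garbage.
misread = utf8_bytes.decode("latin-1")

# The ASCII codec (the Python 2 default) rejects any byte above 127.
try:
    utf8_bytes.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# A text (unicode) string can be encoded to whatever the output or database needs.
as_latin1 = region.encode("latin-1")
as_utf8 = region.encode("utf-8")

print(len(region), len(as_latin1), len(as_utf8), ascii_ok)
```

The text string is 11 characters long, yet its UTF-8 form takes 14 bytes and its Latin-1 form 11, which is why code that receives only the bytes has to be told, or has to guess, how to interpret them.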
Regards,
Malcolm