Re: Localflavors: request for developers and triagers

2007-03-30 Thread Malcolm Tredinnick

On Fri, 2007-03-30 at 21:03 -0700, Simon G. wrote:
> Hi Malcolm,
> 
> Should ALL the strings in these be unicode objects, even if they don't
> have any extended characters? I'm looking at the already checked in fr
> localflavor, and these are all ASCII with the respective file set to
> UTF-8 with that -*- thing (what is that called?), there's also a
> Brazilian one that I've marked as ready to go, but only the strings
> with non-ASCII chars are unicoded.

Only strings containing non-ASCII characters are really required to be
unicode strings, as far as the current technical reasons go. The idea is
that if you see a str object, it's a bytestring or ASCII. Not a "maybe
it's UTF-8 and we just don't know because we threw away the important
information" case.

Regards,
Malcolm


--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~--~~~~--~~--~--~---



Re: Localflavors: request for developers and triagers

2007-03-30 Thread Jeremy Dunck

On 3/30/07, Simon G. <[EMAIL PROTECTED]> wrote:
...
> UTF-8 with that -*- thing (what is that called?), there's also a

http://docs.python.org/ref/encodings.html

Encoding declaration.
(Incidentally, the process of parsing the file given the declaration
is given pretty clearly there, too.  Good to know.  :))
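
As a rough illustration of the encoding declaration: the reference linked above describes it as a comment on the first or second line of the file that matches the regular expression coding[=:]\s*([-\w.]+). The sketch below applies that pattern by hand. The helper name declared_encoding is a hypothetical name for this example, and the check is simplified compared with what the interpreter actually does.

```python
import re

# Pattern quoted from the Python reference section on encoding declarations.
CODING_RE = re.compile(r"coding[=:]\s*([-\w.]+)")


def declared_encoding(source_lines):
    """Return the declared encoding name, or None if there is none.

    Hypothetical helper for illustration only: it looks at comment
    lines among the first two lines of the file and nothing else.
    """
    for line in source_lines[:2]:
        if line.lstrip().startswith("#"):
            match = CODING_RE.search(line)
            if match:
                return match.group(1)
    return None


emacs_style = declared_encoding(["# -*- coding: utf-8 -*-", "x = 1"])
too_late = declared_encoding(["x = 1", "y = 2", "# coding: latin-1"])
```

A declaration on the third line or later is treated as an ordinary comment, which is why the second call finds nothing.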




Re: Localflavors: request for developers and triagers

2007-03-30 Thread Simon G.

Hi Malcolm,

Should ALL the strings in these be unicode objects, even if they don't
have any extended characters? I'm looking at the already checked in fr
localflavor, and these are all ASCII with the respective file set to
UTF-8 with that -*- thing (what is that called?), there's also a
Brazilian one that I've marked as ready to go, but only the strings
with non-ASCII chars are unicoded.

--Simon





Re: Localflavors: request for developers and triagers

2007-03-30 Thread Malcolm Tredinnick

Hi Ville,

On Fri, 2007-03-30 at 01:09 -0700, Ville Säävuori wrote:
> I'm the author of the Finnish localflavor, #3847.
> 
> > So could triagers please mark tickets with such strings as needing an
> > improved patch and could original developers of these files please
> > include any strings containing non-ASCII characters in as u"..."
> > strings, not traditional strings.
> 
> I updated my patch to follow this practice. I'm not quite sure that
> the new patch is okay, but please add a comment to it if there's
> still something to change :)

The new patch looks fine. I'll commit it right after dinner.

Best wishes,
Malcolm





Re: Localflavors: request for developers and triagers

2007-03-30 Thread Ville Säävuori

I'm the author of the Finnish localflavor, #3847.

> So could triagers please mark tickets with such strings as needing an
> improved patch and could original developers of these files please
> include any strings containing non-ASCII characters in as u"..."
> strings, not traditional strings.

I updated my patch to follow this practice. I'm not quite sure that
the new patch is okay, but please add a comment to it if there's
still something to change :)

- VS





Localflavors: request for developers and triagers

2007-03-29 Thread Malcolm Tredinnick

With all the newfound interest in localflavor/ contributions, I've
started to see a pattern that could cause us trouble:

Many locales have the native names using characters outside of the ASCII
range. So people put "# -*- coding: utf-8 -*-" at the top of their file
and then include the proper names. Okay, sounds reasonable. But it has a
problem.

When Python parses a file containing the encoding marker, each plain
string literal becomes a str object holding the text encoded as UTF-8.
However, this object carries no information that it is UTF-8 encoded.
This makes it hard to know what such a string is when you see it in a
context far removed from its original import. You also can't rely on
encode() or decode() with the default codec: the implicit conversion
assumes ASCII and fails on bytes greater than 127, which are always
present in non-ASCII UTF-8 strings.
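
A minimal sketch of the problem, written for Python 3, where bytes plays the role of the Python 2 str object discussed here. The sample name and the variable names are illustrative assumptions, not taken from any patch.

```python
# Python 3 analogue: bytes stands in for the Python 2 str object.
name = "S\u00e4\u00e4vuori"  # "Säävuori", written with escapes to stay ASCII-safe

# Roughly what a plain "..." literal holds in a Python 2 file declared as UTF-8.
raw = name.encode("utf-8")

# The bytes carry no record of their encoding, so a wrong guess
# succeeds silently and produces garbled text.
as_latin1 = raw.decode("latin-1")

# An ASCII-only conversion (Python 2's implicit default) fails outright
# as soon as a byte greater than 127 appears.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Only an explicit decode with the correct codec recovers the text.
recovered = raw.decode("utf-8")
```

Nothing about raw itself distinguishes the correct decode from the garbled one. That knowledge has to come from somewhere else, which is the information the message above says gets thrown away.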

To make our lives easier (read "slightly more bullet-resistant"), these
character strings should be unicode objects, not str objects. That way,
we have the information we need to convert to other character encodings
as needed.
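
To illustrate the point about converting as needed, here is a short sketch in Python 3, where str corresponds to the Python 2 unicode object and the u"..." prefix is still accepted. The region name is a hypothetical sample value, and the two target encodings are chosen only for the example.

```python
# A text (unicode) string knows its characters, not any byte layout.
region = u"\u00cele-de-France"  # "Île-de-France"

# The same text can be encoded for whatever the destination expects.
utf8_bytes = region.encode("utf-8")
latin1_bytes = region.encode("iso-8859-1")

# The byte sequences differ, but each round-trips to the same text
# as long as the matching codec is named when decoding.
same_text = (
    utf8_bytes.decode("utf-8") == region
    and latin1_bytes.decode("iso-8859-1") == region
)
```

Starting from the text form means the choice of encoding is made once, explicitly, at the boundary where bytes are actually needed.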

So could triagers please mark tickets with such strings as needing an
improved patch and could original developers of these files please
include any strings containing non-ASCII characters in as u"..."
strings, not traditional strings.

Nobody ever said Python unicode handling was perfect -- well, one guy
did once, but he was drunk at the time and we sent him out on a one-way
trip to the ice floes to go fishing -- and sometimes we have to just
work around the sharp bits. The above semi-problem wouldn't be an issue
except that we do need to be able to encode the strings into different
encodings for input and database storage. The alternative would be to
*always* use UTF-8 internally and that isn't an insane idea -- I know a
few C libraries that work that way and it may turn out to be slightly
faster in Python -- but it doesn't seem to be the traditional Python
way, as far as I can tell. We best play as the other boys do to
encourage other developers.

Regards,
Malcolm

