Re: String encoding question

Masklinn Sat, 19 Sep 2009 08:37:07 -0700

On 19 Sep 2009, at 17:19 , Joshua Russo wrote:
>> ... in fact using utf-8 string literals can cause problems in other places
>> with code that assumes another encoding (e.g. ascii) for byte strings.
>>
>
> Could you expand on this? I know that the Unicode string object has
> different methods than standard String, but are there other scenarios where
> a unicode literal could cause problems?

What Karen is saying is not about unicode literals; it's about
bytestring literals ('foo', not u'foo'). When libraries encounter
bytestrings, they *might* want to decode them (depending on their needs).


Now, to decode a bytestring you need to provide a codec (an encoding),
so libraries usually either use a default of their own (which may or
may not be overridable, at the call site or through a conf file) or
fall back to the system's default encoding.
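
To make the "you need a codec" point concrete, here is a minimal sketch. It is written for a current Python 3, where a bytestring literal is spelled b'...' (in the Python 2 of this thread it would be a plain '...'). The sample bytes are just an illustration: the same bytes decode to different text depending on which codec is picked.

```python
# The UTF-8 encoding of "caf" followed by e-acute (U+00E9).
raw = b"caf\xc3\xa9"

# Decoding with the codec the bytes were written in gives the intended text.
as_utf8 = raw.decode("utf-8")

# Decoding with a different codec silently gives different (wrong) text.
as_latin1 = raw.decode("latin-1")

print(repr(as_utf8))
print(repr(as_latin1))
```

Nothing in the bytes themselves says which codec is the right one, which is why a library has to pick one somehow.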

In the second case, if your bytestring's encoding is different from
the system's default, the library will blow up (note: there are
other similar cases, this is just an example). With a unicode
string, the library wouldn't have needed to decode anything, as the
string is already decoded.

That's the kind of problem a non-ascii bytestring can cause.
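
A rough sketch of that failure, again in Python 3 syntax. The function `library_label` and its `encoding` parameter are made up for illustration and stand in for "some library that decodes bytestrings with a default codec"; they are not taken from Django or any real library.

```python
def library_label(value, encoding="ascii"):
    """Hypothetical library function (an assumption for this example).

    It decodes bytestrings with a default codec before working on them
    and leaves already-decoded text alone.
    """
    if isinstance(value, bytes):
        value = value.decode(encoding)
    return value.upper()


utf8_bytes = "caf\u00e9".encode("utf-8")  # a non-ascii bytestring
text = "caf\u00e9"                        # the same value, already decoded

# Bytestring in a different encoding than the library's default: it blows up.
try:
    library_label(utf8_bytes)
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# Unicode string: nothing to decode, so nothing to get wrong.
result_text = library_label(text)

# If the library lets you override the codec, the bytestring works too.
result_override = library_label(utf8_bytes, encoding="utf-8")
```

The unicode string works regardless of the library's default, while the bytestring only works when the caller happens to know, and is able to set, the right codec.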

