On Mon, 2007-02-19 at 13:01 +0000, omat * gezgin.com wrote:
> I am trying to match a utf-8 character with a filter. Within the
> python prompt, "u'ç'.encode('utf-8')" returns "\xc3\xa7" correctly but
> when I use this inside a filter like:
>
> (name__startswith = u'ç'.encode('utf-8'))
>
> I get a syntax error:
> Non-ASCII character '\xc3' in file .../views.py on line 24, but no
> encoding declared...

When you type something like 'ç' or u'ç', Python reads the bytes of your source file, and it matters whether it knows what encoding they are in. The solution you saw in previous posts was to declare the encoding of the Python source file, which is the right thing to do.

Just to let you know what happens here: there's something called a Unicode object. Python can create a Unicode object by _decoding_ a string. To do that, assuming your source is encoded in UTF-8 and declared as such, you write:

    unicode_obj = 'ç'.decode('utf-8')

This method returns a Unicode object, which can later be _encoded_ back to a string:

    utf_8_encoded_string = unicode_obj.encode('utf-8')

Back to your example. You typed u'ç'.encode('utf-8'), which told Python: take this Unicode object and encode it in UTF-8. But hey, what sits between the quotes in the file is not a Unicode object yet, it's just a couple of bytes! How are these bytes encoded? Dunno, the source doesn't say. In such a case we assume ASCII. But hey again, '\xc3' is not an ASCII character! I'm going to complain! At the interactive prompt the same expression worked because there Python takes the encoding from your terminal.

I hope it helps in the future, so you know what decoding and encoding mean.

Cheers,
Maciej

-- 
Maciej Bliziński
http://automatthias.wordpress.com
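
P.S. In case it helps, here is a minimal sketch of the decode/encode round trip. To keep it independent of the file's encoding, the two UTF-8 bytes of 'ç' are written as escapes, so this snippet needs no coding declaration. A file that contains a literal 'ç' would need a line such as "# -*- coding: utf-8 -*-" as its first or second line. The names raw_bytes and ascii_fails are mine, purely for illustration, and the b'' prefix assumes Python 2.6 or later.

```python
# The two bytes that UTF-8 uses for c-cedilla, written as escapes so that
# the source file itself stays pure ASCII (illustrative name: raw_bytes).
raw_bytes = b'\xc3\xa7'

# Decoding: bytes -> Unicode object. We must say which encoding the bytes use.
unicode_obj = raw_bytes.decode('utf-8')

# Encoding: Unicode object -> bytes again.
utf_8_encoded_string = unicode_obj.encode('utf-8')

# Treating the same bytes as ASCII fails, because 0xc3 is outside the
# ASCII range (illustrative name: ascii_fails).
ascii_fails = False
try:
    raw_bytes.decode('ascii')
except UnicodeDecodeError:
    ascii_fails = True
```

After the round trip utf_8_encoded_string holds the same two bytes as raw_bytes, and unicode_obj is the single character U+00E7. The ASCII attempt is a run-time cousin of the complaint you got from views.py: in both cases Python meets the byte 0xc3 and has only ASCII to interpret it with.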