I'm seeing some strange behavior, and I think it has to do with
how Django handles Unicode strings:

When I write a test.py file:

import re, unicodedata

reCombining = re.compile(u'[\u0300-\u036f\u1dc0-\u1dff\u20d0-\u20ff\ufe20-\ufe2f]', re.U)

def remove_diacritics(s):
    return reCombining.sub('', unicodedata.normalize('NFD', unicode(s)))


and then open the python shell, I get:

Python 2.5.2 (r252:60911, Aug 10 2008, 00:43:40)
[GCC 4.0.1 (Apple Inc. build 5484)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from test import *
>>> remove_diacritics(u'café')
u'cafe'


as intended.

When I do the same thing in the Django shell:
$ python manage.py shell
Python 2.5.2 (r252:60911, Aug 10 2008, 00:43:40)
[GCC 4.0.1 (Apple Inc. build 5484)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> from test import *
>>> remove_diacritics(u'café')
u'cafA\xa9'


Which isn't quite what I expected.

My questions are:

1. How do I properly remove accents from strings in Django?
2. What is Django (this is using trunk) doing to strings differently
than Python?

Even typing u'é' on its own gives a different result in each of the two shells.
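
For what it's worth, here is a sketch that reproduces the second result
in plain Python, without Django. It assumes (this is only a guess on my
part, I have not confirmed it) that the Django shell is handing me the
UTF-8 bytes of my input decoded as Latin-1. I dropped the unicode() call
and used \u escapes so the snippet does not depend on the terminal or
source encoding; the names "combining", "good" and "bad" are made up for
illustration.

```python
import re
import unicodedata

# Same combining-mark ranges as in test.py above.
combining = re.compile(
    u'[\u0300-\u036f\u1dc0-\u1dff\u20d0-\u20ff\ufe20-\ufe2f]', re.U)

def remove_diacritics(s):
    # Decompose to NFD, then strip the combining marks.
    return combining.sub(u'', unicodedata.normalize('NFD', s))

# u'café' written with an escape, so no terminal decoding is involved.
good = u'caf\u00e9'

# Assumption: the UTF-8 bytes of the input get misread as Latin-1,
# turning the single 'é' into the two characters 'Ã' and '©'.
bad = good.encode('utf-8').decode('latin-1')

print(repr(remove_diacritics(good)))   # the plain 'cafe'
print(repr(remove_diacritics(bad)))    # 'cafA\xa9', as in the Django shell
```

If that guess is right, the function itself is fine: 'Ã' decomposes to
'A' plus a combining tilde, which gets stripped, and '©' has no
decomposition, so it passes through untouched.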

Cheers,

Dave
