On Sun, Sep 13, 2009 at 8:25 PM, W.P. McNeill <bill...@gmail.com> wrote:
> Is this expected behavior?  I can see some discussion on the web that
> references unicode support for slugification, but I can't tell if that
> unicode support works for any arbitrary unicode characters, or Django
> has hand-crafted slugification for certain non-ASCII characters (e.g.
> common European characters).

When in doubt, look at the source. The 'slugify' template filter is
implemented as, well, a template filter, and so lives with all the
other built-in filters in django.template.defaultfilters:

http://code.djangoproject.com/browser/django/trunk/django/template/defaultfilters.py#L222

It's easy to see from the code what's going on. A Unicode string comes
in to the filter, is normalized (using form NFKD) and is encoded as
ASCII, ignoring non-convertible characters. Then any character which
is not alphanumeric, an underscore, whitespace or a hyphen is
stripped, as is leading and trailing whitespace, and the result is
lowercased. Finally, each run of whitespace or hyphens is replaced
with a single hyphen.
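
The steps above can be sketched in a few lines of Python 3. This is an
approximation for illustration only, not a copy of the Django source;
the name slugify_sketch is made up here, and the sketch also lowercases
the result and collapses runs of whitespace or hyphens, which the real
filter does as well.

```python
import re
import unicodedata


def slugify_sketch(value):
    # Hypothetical helper that mirrors the described steps;
    # it is not the Django implementation itself.
    # 1. Decompose with NFKD, then drop everything that is not ASCII.
    value = unicodedata.normalize("NFKD", value)
    value = value.encode("ascii", "ignore").decode("ascii")
    # 2. Strip anything that is not a word character, whitespace or a
    #    hyphen, trim the ends, and lowercase what is left.
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    # 3. Turn each run of whitespace or hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", value)


print(slugify_sketch("Ma\u00f1ana, por favor!"))  # manana-por-favor
```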

The result is something which will be usable in a URL, regardless of
the exotic characters which went into it. However, it can discard
information in two places.

First, the Unicode normalization and ASCII conversion can lose
characters: NFKD decomposes characters, and then the ASCII encode
discards anything that can't be converted. So, for example, if the
character 'ñ' is in the string, the NFKD normalization decomposes it
into 'n' and a combining diacritic, and then the ASCII conversion with
the 'ignore' flag discards the diacritic. For a URL, this is typically
what you want, because it means 'ñ' becomes simply 'n'. Nothing here
is hand-crafted for particular languages, though: a character with no
ASCII-compatible decomposition (Cyrillic or CJK text, say) is dropped
entirely.
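
To see the decomposition for yourself, something like the following
should work in a Python 3 shell. It only uses the standard unicodedata
module; the variable names are just for illustration.

```python
import unicodedata

# U+00F1 is the precomposed 'n with tilde'.
decomposed = unicodedata.normalize("NFKD", "\u00f1")

# After NFKD it is two code points: 'n' plus U+0303 COMBINING TILDE.
code_points = [hex(ord(ch)) for ch in decomposed]

# Encoding with errors="ignore" silently drops the combining mark.
ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")

# Characters with no ASCII-compatible decomposition vanish completely
# (here a Cyrillic letter and the German sharp s).
dropped = (
    unicodedata.normalize("NFKD", "\u044f\u00df")
    .encode("ascii", "ignore")
    .decode("ascii")
)

print(code_points, repr(ascii_only), repr(dropped))
```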

The other place where you can lose characters is in discarding
non-alphanumeric characters, but again for a URL this is typically
what you want.
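
One consequence worth keeping in mind is that different inputs can end
up with the same slug. A small sketch of just the stripping step
(strip_punctuation is a made-up name, and the pattern is an assumption
based on the description above rather than a quote of the source):

```python
import re


def strip_punctuation(text):
    # Hypothetical helper: keep word characters, whitespace and
    # hyphens, drop everything else, then trim the ends.
    return re.sub(r"[^\w\s-]", "", text).strip()


titles = ["C++", "C#", "C"]
stripped = [strip_punctuation(title) for title in titles]
print(stripped)  # ['C', 'C', 'C']
```

If slugs have to be unique, that needs to be enforced separately, for
example at the model level.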


-- 
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."
