Re: URL Patterns for URL Encoding Symbols

2009-07-22 Thread emil0r

> I wish I could specify all the unicode letters (for all other
> languages apart from Turkish) as something like \w.

Best thing I've come up with is to go over the unicode list and
identify which languages you want to support. You then use unichr to
construct a regex such as: '[%s-%s]' % (unichr(start of unicode
block), unichr(end of unicode block)).
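
A minimal sketch of the range approach described above, written for Python 3, where `chr` replaces the Python 2 `unichr`. The two blocks chosen here (the letters of Latin-1 Supplement and Latin Extended-A, which between them cover the Turkish letters mentioned in this thread) and the names `BLOCKS` and `char_class` are assumptions for illustration, not something from the original posts.

```python
import re

# Hypothetical selection of Unicode ranges, as inclusive (start, end) code points.
BLOCKS = [
    (0x00C0, 0x00FF),  # Latin-1 Supplement letters (this span also contains the × and ÷ signs)
    (0x0100, 0x017F),  # Latin Extended-A (contains ı, İ, ğ, Ğ, ş, Ş)
]

def char_class(blocks):
    """Build a regex character class from the ASCII set used in the thread plus the given ranges."""
    ranges = "".join("%s-%s" % (chr(start), chr(end)) for start, end in blocks)
    # The leading "-" is a literal hyphen; the trailing space allows spaces in the term.
    return "[-_.0-9A-Za-z" + ranges + " ]"

term = re.compile("^" + char_class(BLOCKS) + "+$")

print(bool(term.match("çağrı İş")))  # Turkish letters fall inside the chosen ranges
print(bool(term.match("a/b")))       # "/" is not in the class
```

Listing ranges by hand keeps the accepted alphabet explicit, at the cost of having to add a range for every script you want to support.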

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en
-~--~~~~--~~--~--~---



Re: URL Patterns for URL Encoding Symbols

2009-06-30 Thread Ahmet Emre Aladağ


>     (r'^(?P[-0-9A-Za-z]+)/(?P[-_.0-9A-Za-zıİğĞüÜşŞöÖçÇ ]+)/$', 'search_in_all_packages')
> then
>     (u'^(?P[-0-9A-Za-z]+)/(?P[-_.0-9A-Za-zıİğĞüÜşŞöÖçÇ ]+)/$', 'search_in_all_packages').encode("utf-8")
>
> but none of them worked.

[Typo: misplaced .encode("utf-8") in mail.]

I managed to get it to work:
   (u'^(?P[-0-9A-Za-z]+)/(?P[-_.0-9A-Za-zıİğĞüÜşŞöÖçÇ ]+)/$', 'search_in_all_packages'),

The key point seems to be using (u'^ ... instead of (r'^...

Maybe I had a problem with caching and the fact that I didn't use u'^
for the other rules.

I wish I could specify all the unicode letters (for all other
languages apart from Turkish) as something like \w.
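
A small sketch of how `\w` itself can cover those letters, assuming Python 3. There, `\w` in a `str` pattern is Unicode-aware by default, so the `re.UNICODE` flag is redundant but harmless; on the Python 2 of this thread the flag was what switched that behaviour on. The names `word` and `ascii_only` are made up for the example, and whether a given Django version passes such a flag when compiling URL patterns is not something this sketch assumes.

```python
import re

# \w with Unicode matching: letters from any script, digits and underscore.
word = re.compile(r'^[-.\w ]+$', re.UNICODE)

# The same class restricted to ASCII, for comparison.
ascii_only = re.compile(r'^[-.\w ]+$', re.ASCII)

print(bool(word.match("çağrı İş")))     # accepted: every character is a word character or a space
print(bool(ascii_only.match("çağrı")))  # rejected: ç, ğ and ı are outside [a-zA-Z0-9_]
print(bool(word.match("a/b")))          # rejected: "/" is not in the class
```

Note that `\w` also admits digits and the underscore, so it is broader than "letters only".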



Re: URL Patterns for URL Encoding Symbols

2009-06-30 Thread Ahmet Emre Aladağ

> > You need to replace the `\w` with something that will match the characters
> > you want. If you want everything that `\w` matches plus spaces, you should
> > use `[\w ]+` (note the space) instead of `\w+`.

What about other unicode characters? Such as special characters in
other languages? I'm developing a search engine which searches for
files inside Linux packages. Whenever somebody enters a Turkish
character as a search term, urls.py can't handle that. Somebody [1]
has found a workaround by accepting everything except for "/" in the
url. But I'm wondering if it could be dangerous for me as I'm doing a
database search.

My url patterns used to be like:

(r'^(?P[-0-9A-Za-z]+)/(?P[-_.0-9A-Za-z]+)/$',
'search_in_all_packages'),

Then I tried appending special characters, and some encoding stuff:
(r'^(?P[-0-9A-Za-z]+)/(?P[-_.0-9A-Za-zıİğĞüÜşŞöÖçÇ ]+)/$', 'search_in_all_packages')
then
(u'^(?P[-0-9A-Za-z]+)/(?P[-_.0-9A-Za-zıİğĞüÜşŞöÖçÇ ]+)/$', 'search_in_all_packages').encode("utf-8")

but none of them worked.

[1] http://blog.tkbe.org/archive/django-international-characters-in-urls



Re: URL Patterns for URL Encoding Symbols

2009-06-11 Thread Andy Dietler

Awesome, thanks. I love the Django community.

On Jun 11, 7:03 pm, Thomas Sutton wrote:
> Hi Andy,
>
> 2009/6/12 Andy Dietler:
>
> > Right now I've got a URL pattern that works for letters and numbers,
> > but when a character like %20 gets thrown in it fails.
>
> > The pattern is this:
>
> > (r'^(?P\w+)/$', 'detail'),
>
> > Which works when I have:
>
> > domain.com/Friends/
> > domain.com/24/
>
> > but not for
>
> > domain.com/The%20Office/
>
> > How do I get it to accept the %20?
>
> The `\w` in your regular expression means (to quote the Python `re` module
> documentation):
>
> > When the LOCALE and UNICODE flags are not specified, [`\w`] matches any
> > alphanumeric character and the underscore; this is equivalent to the set
> > [a-zA-Z0-9_].  With LOCALE, it will match the set [0-9_] plus whatever
> > characters are defined as alphanumeric for the current locale. If UNICODE is
> > set, this will match the characters [0-9_] plus whatever is classified as
> > alphanumeric in the Unicode character properties database.
>
> You need to replace the `\w` with something that will match the characters you
> want. If you want everything that `\w` matches plus spaces, you should use
> `[\w ]+` (note the space) instead of `\w+`.
>
> Cheers,
>
> Thomas Sutton
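
The suggestion quoted above can be sketched as follows, assuming Python 3 and assuming the pattern is matched against the already percent-decoded path, so that `%20` arrives as a literal space. The group name `name` is hypothetical, since the group names in the quoted patterns were lost in the archive.

```python
import re
from urllib.parse import unquote

# Percent-decoding turns %20 into a space before the pattern sees it.
path = unquote("The%20Office/")

strict = re.compile(r'^(?P<name>\w+)/$')      # the original style of pattern
relaxed = re.compile(r'^(?P<name>[\w ]+)/$')  # the same pattern with a space allowed

print(strict.match(path))                  # None: the space is not matched by \w
print(relaxed.match(path).group("name"))   # the decoded name, space included
```

Adding only the space keeps the pattern narrow; other punctuation would still need to be listed explicitly.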



URL Patterns for URL Encoding Symbols

2009-06-11 Thread Andy Dietler

Right now I've got a URL pattern that works for letters and numbers,
but when a character like %20 gets thrown in it fails.

The pattern is this:

(r'^(?P\w+)/$', 'detail'),

Which works when I have:

domain.com/Friends/
domain.com/24/

but not for

domain.com/The%20Office/

How do I get it to accept the %20?

Thanks in advance.
- Andy