Hey Gabor,

On Wed, 2006-08-09 at 01:03 +0200, gabor wrote:
> today i experimented a little with the django source code,
> and here are the results.
> 
> if you apply a very small patch (65lines, attached), you can write a view
> completely in unicode.
> means:
> - GET/POST contains unicode data
> - request.META contains unicode data
> - you can put unicode text into the HttpResponse (this was already possible
> without the patch)
> 
> of course, this patch is a demonstration only. the charset is hardcoded
> to UTF-8 (should be settings.DEFAULT_CHARSET), and it only handles the
> WSGI way (the mod_python one is not handled). also templating and ORM
> are not touched. (not to mention the ugliness of the code)
> 
> but still, i was quite surprised that with such small changes so much
> can be done.

The low-hanging fruit are definitely the place to start for this sort of
thing.
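
To make the request side concrete, here is a minimal sketch of decoding a query string into unicode. It uses Python 3 spelling so it can be run as-is. `decode_query` and the module-level `DEFAULT_CHARSET` are names invented for illustration (the latter stands in for settings.DEFAULT_CHARSET); neither comes from the patch or from Django.

```python
from urllib.parse import parse_qsl

# Stand-in for settings.DEFAULT_CHARSET; hardcoded only for this sketch.
DEFAULT_CHARSET = "utf-8"


def decode_query(query_string, charset=DEFAULT_CHARSET):
    """Return a list of (name, value) pairs decoded as text.

    Percent-escapes are decoded using `charset`; byte sequences that
    are invalid in that charset are replaced rather than raising.
    """
    return parse_qsl(
        query_string,
        keep_blank_values=True,
        encoding=charset,
        errors="replace",
    )


# Hypothetical query string containing a euro sign and an accented letter.
pairs = decode_query("price=%E2%82%AC50%2C00&q=caf%C3%A9")
```

Doing the decoding once at this boundary is what lets view code treat GET/POST values as unicode without caring about the wire encoding.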

> 
> i think unicodizing django can be done in 4 easily separated steps/parts:
> 
> 1. request/response
> 2. templating-system
> 3. database-system
> 4. "overall unicode-conversion". this is mostly about replacing
> bytestrings with u"bla" in the code, and switching __str__ to __unicode__
> 
> my biggest problem currently is, that i do not know how to continue...
> should i just write more and more patches to increase the
> "unicode-coverage" to more parts of django? or maybe a more coordinated
> approach would be better?

Ultimately, getting you an svn branch to work in will probably be
easiest. Maintaining a bunch of separate patches against a rapidly
changing tree can be fairly time-consuming. I'm not sure what the
procedure is for that. Adrian?

Keeping the changes as reasonably independent as possible is a great
idea as far as you can take it. It will make review and testing a lot
easier, as well as keeping you saner because you will only have to be
looking at one layer at a time.

A couple of comments on the patch itself. I realise it's only a proof of
concept at the moment, so take these as more things to think about when
you want to tidy it up:

(1) A docstring like """needed to workaround the cgi.parse_sql
unicode-problem""" is not very future-proof. *What* parse_sql unicode
problem? How will we know if/when it goes away? Either a quick
description of the problem or a URL if it's tricky and explained
elsewhere will help people who need to read this code in six months'
time.

(2) You can't necessarily assume the environment is always in ASCII (or
maybe you can; see below). For example, my current locale is set to
en_AU.UTF-8 and I can do

        export foo="€50,00"
        
If I'm not careful when parsing os.environ['foo'] this comes out as
rubbish (I need to do unicode(os.environ['foo'], 'utf-8') or similar).

Probably some playing around with the locale module to work out the
right behaviour and getting a few people to test things (e.g. Windows
vs. Linux vs. Macs, etc) will be necessary. It's also important not to
go too overboard here, but since arbitrary environment variables can be
set through Apache, we need to be able to work with that to be
"correct". Hmm ... what are the restrictions on what webservers can put
in their config files? Maybe ASCII-only is reasonable. *shrug*

More investigation is probably needed here.
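
As a starting point for that investigation, here is a rough sketch of decoding a raw environment value using the locale's preferred encoding. It is written in Python 3 spelling so it runs as-is; the Python 2 equivalent of the decode step is the unicode(value, encoding) call mentioned above. `decode_env_value` is a made-up helper name, and falling back to replacement characters instead of raising is an assumption about what behaviour we would want.

```python
import locale


def decode_env_value(raw, encoding=None):
    """Decode a raw environment value (bytes) to text.

    Uses the locale's preferred encoding when none is given, and
    replaces undecodable bytes instead of raising an exception.
    """
    if encoding is None:
        encoding = locale.getpreferredencoding(False)
    return raw.decode(encoding, errors="replace")


# The bytes a UTF-8 locale would store for: export foo="€50,00"
raw = "€50,00".encode("utf-8")

good = decode_env_value(raw, "utf-8")  # decoded with the right encoding
bad = decode_env_value(raw, "ascii")   # wrong guess: the euro sign is lost
```

The second call shows the "rubbish" case: the digits survive, but the euro sign turns into replacement characters because the bytes were interpreted with the wrong encoding.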

(3) I know there are some software projects apparently using unicodize
as a word, but ... *shudder*. Using "code" as an analogy, "unicodify"
would be nicer (nobody uses "codize", I would hope).

(4) As you go through this process, keep a list somewhere of what people
need to do to port existing applications across to using this
functionality. Ideally, the answer would be "not much" and we can cast
from the default encoding to unicode internally where necessary. But I'm
sure there will be some changes required, so keeping a list of things to
watch out for as you go will help people test this for you.
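
The "cast internally" idea could look something like the sketch below (Python 3 spelling, so `str` is the unicode type and `bytes` is the old bytestring). `to_text` is a hypothetical helper, not an existing Django function, and `DEFAULT_CHARSET` again stands in for settings.DEFAULT_CHARSET.

```python
# Stand-in for settings.DEFAULT_CHARSET; hardcoded only for this sketch.
DEFAULT_CHARSET = "utf-8"


def to_text(value, charset=DEFAULT_CHARSET):
    """Coerce `value` to text, decoding bytestrings with `charset`."""
    if isinstance(value, str):
        return value                  # already text, pass through unchanged
    if isinstance(value, bytes):
        return value.decode(charset)  # legacy bytestring from old app code
    return str(value)                 # numbers and other objects


# A bytestring such as an unported application might still hand us.
legacy = "naïve".encode("utf-8")
converted = to_text(legacy)
```

If internal code called something like this wherever application data enters, existing applications that still pass bytestrings in the default encoding would keep working, which keeps the porting list short.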

Good to see somebody working on this. :-)

Regards,
Malcolm


