Re: django unicode-conversion, beginning

Aidas Bendoraitis Wed, 09 Aug 2006 04:03:22 -0700

Shouldn't the UTF-8 encoding also be declared in every source file, as
described here: http://www.python.org/dev/peps/pep-0263/ ?


That is, using

#!/usr/bin/python
# -*- coding: UTF-8 -*-

at the beginning of each Python source file.

This works pretty well, at least when you need to create new instances
of models containing multilingual characters from a Python script file.
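
A minimal sketch of such a file, assuming the editor really saves it as
UTF-8. The variable names are only illustrative, and the model creation
itself is left out so the snippet runs without Django:

```python
#!/usr/bin/python
# -*- coding: UTF-8 -*-
# PEP 263: the coding declaration has to appear on line 1 or 2 of the file.

# A non-ASCII literal; the declaration tells the parser how to decode it.
name = u"Šiauliai"

# Encode explicitly wherever bytes are required (files, sockets, ...).
encoded = name.encode("utf-8")
```

Without the declaration, older Python 2 interpreters reject or misread
non-ASCII bytes in the source; with it, the unicode literal holds the
intended characters.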


Regards,
Aidas Bendoraitis [aka Archatas]


On 8/9/06, Malcolm Tredinnick <[EMAIL PROTECTED]> wrote:
>
> Hey Gabor,
>
> On Wed, 2006-08-09 at 01:03 +0200, gabor wrote:
> > today i experimented a little with the django source code,
> > and here are the results.
> >
> > if you apply a very small patch (65 lines, attached), you can write a view
> > completely in unicode.
> > means:
> > - GET/POST contains unicode data
> > - request.META contains unicode data
> > - you can put unicode text into the HttpResponse (this was already possible
> > without the patch)
> >
> > of course, this patch is a demonstration only. the charset is hardcoded
> > to UTF-8 (should be settings.DEFAULT_CHARSET), and it only handles the
> > WSGI way (the mod_python one is not handled). also templating and ORM
> > are not touched. (not to mention the ugliness of the code)
> >
> > but still, i was quite surprised that with such small changes so much
> > can be done.
>
> The low-hanging fruit are definitely the place to start for this sort of
> thing.
>
> >
> > i think unicodizing django can be done in 4 easily separated steps/parts:
> >
> > 1. request/response
> > 2. templating-system
> > 3. database-system
> > 4. "overall unicode-conversion". this is mostly about replacing
> > bytestrings with u"bla" in the code, and switching __str__ to __unicode__
> >
> > my biggest problem currently is that i do not know how to continue...
> > should i just write more and more patches to increase the
> > "unicode-coverage" to more parts of django? or maybe a more coordinated
> > approach would be better?
>
> Ultimately, getting you a svn branch to work in will probably be
> easiest. Maintaining a bunch of separate patches against a rapidly
> changing tree can be fairly time consuming. I'm not sure what the
> procedure is for that. Adrian?
>
> Keeping the changes as reasonably independent as possible is a great
> idea as far as you can take it. It will make review and testing a lot
> easier, as well as keeping you saner because you will only have to be
> looking at one layer at a time.
>
> A couple of comments on the patch itself. I realise it's only a proof of
> concept at the moment, so take these as things to think about when you
> want to tidy it up:
>
> (1) A docstring like """needed to workaround the cgi.parse_sql
> unicode-problem""" is not very future-proof. *What* parse_sql unicode
> problem? How will we know if/when it goes away? Either a quick
> description of the problem or a URL if it's tricky and explained
> elsewhere will help people who need to read this code in six months'
> time.
>
> (2) You can't necessarily assume the environment is always in ASCII (or
> maybe you can; see below). For example, my current locale is set to
> en_AU.UTF-8 and I can do
>
>         export foo="€50,00"
>
> If I'm not careful when parsing os.environ['foo'] this comes out as
> rubbish (I need to do unicode(os.environ['foo'], 'utf-8') or similar).
>
> Probably some playing around with the locale module to work out the
> right behaviour and getting a few people to test things (e.g. Windows
> vs. Linux vs. Macs, etc) will be necessary. It's also important not to
> go too overboard here, but since arbitrary environment variables can be
> set through Apache, we need to be able to work with that to be
> "correct". Hmm ... what are the restrictions on what webservers can put
> in their config files? Maybe ASCII-only is reasonable. *shrug*
>
> Maybe more investigation needed here.
>
> (3) I know there are some software projects apparently using unicodize
> as a word, but ... *shudder*. Using "code" as an analogy, "unicodify"
> would be nicer (nobody uses "codize", I would hope).
>
> (4) As you go through this process, keep a list somewhere of what people
> need to do to port existing applications across to using this
> functionality. Ideally, the answer would be "not much" and we can cast
> from the default encoding to unicode internally where necessary. But I'm
> sure there will be some changes required, so keeping a list of things to
> watch out for as you go will help people test this for you.
>
> Good to see somebody working on this. :-)
>
> Regards,
> Malcolm
>
>
>
> >
>
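
On point (2) about environment variables: a rough sketch of the decoding
step, under a few assumptions. The thread's Python 2 spelling is
unicode(os.environ['foo'], 'utf-8'); the version below uses bytes.decode
so that it also runs on Python 3. The helper name decode_env_value is
hypothetical, and UTF-8 is assumed here rather than looked up from the
locale:

```python
def decode_env_value(raw, encoding="utf-8"):
    """Return text for an environment value that may arrive as raw bytes."""
    if isinstance(raw, bytes):
        # errors="replace" avoids an exception on badly encoded input;
        # each undecodable byte becomes U+FFFD instead.
        return raw.decode(encoding, errors="replace")
    return raw  # already text, nothing to do


# Simulate the bytes a UTF-8 locale would hand over for: export foo="€50,00"
raw_value = u"€50,00".encode("utf-8")
price = decode_env_value(raw_value)
```

Which encoding to pass in is exactly the open question: it could come
from the locale module or from a setting, and that still needs testing
across platforms.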
