On Sun, Feb 12, 2006 at 02:03:13AM +0100, Björn Lindqvist wrote:
> > > I think that Kid's way to deal with encodings is slightly
> > > non-optimal.. It could be improved by having Kid default to utf8
> > > instead of ascii. It is likely that that brings other problems, but it
> > > is still better than guessing ascii which in a web context is a
> > > totally brain damaged guess. I'm sure there are alot of web apps out
> > > there waiting to be broken because the programmer didn't realise that
> > > his code only works with ascii characters.
> > >
> > > My other idea is that Kid would refuse to run unless an encoding is
> > > explicitly specified somewhere. "In the face of ambiguity, refuse the
> > > temptation to guess." I hope you can please fix this problem somehow.
> > > Please CC me replies as I don't subscribe.
> >
> > I believe Kid tries to use the encoding that *you* have set. There
> > may be a case in there somewhere that has a bad default, but I was
> > not able to find it during a few quick greps. See the documentation
> > for 'sys.getdefaultencoding()'.
>
> Yes! That's what I'm saying. Python's default encoding is ascii and
> that is what is causing the exception:
>
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position
> 3: ordinal not in range(128)
>
> Python IMHO made a big mistake by having ascii as the default
> encoding, but it is probably to late to change that now. Kid should
> not replicate that mistake by assuming that "sys.getdefaultencoding()"
> is the explicitly requested encoding. Maybe you can investigate how
> Cheetah chooses encoding? I may be totally wrong, but I don't think it
> relies on sys.getdefaultencoding() at all.
>
> --
> mvh Björn

I guess it really depends on how you look at it. I think it is
pretty sane. If Python treats byte strings as ASCII by default, then
why should Kid be any different? In this case I think it is up to the
user to tell Kid how the strings are encoded if they are not ASCII.
Two ways to do this:
1. Decode the string into a unicode object.
2. Set the magical 'assume_encoding' property on your template
   instance.
Quick example:
In [6]: t = kid.Template(source='<e>$x</e>')
In [7]: t.x = unichr(0xe4).encode('utf-8')
In [8]: t.serialize()
[snip]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
In [9]: t.assume_encoding = 'utf-8'
In [10]: t.x = unichr(0xe4).encode('utf-8')
In [11]: t.serialize()
Out[11]: '<?xml version="1.0" encoding="utf-8"?>\n<e>\xc3\xa4</e>'
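For option 1, the underlying idea can be shown without Kid at all. This is only a rough sketch, and it assumes Python 3 syntax (where bytes and text are separate types) rather than the Python 2 used in the session above; the helper name decode_or_none is made up for illustration. It shows why the same two bytes fail under an ASCII assumption but decode cleanly once the encoding is stated explicitly:

```python
# Sketch only: Python 3 syntax, not the Python 2 of the session above.
# decode_or_none is a hypothetical helper, not part of Kid or Python.

# U+00E4 encoded as UTF-8 gives the two bytes 0xc3 0xa4.
raw = "\u00e4".encode("utf-8")


def decode_or_none(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# Assuming ASCII fails, because byte 0xc3 is outside range(128).
ascii_result = decode_or_none(raw, "ascii")

# Naming the real encoding succeeds and yields one character, U+00E4.
utf8_result = decode_or_none(raw, "utf-8")
```

Decoding at the point where the bytes enter the program means the template only ever sees text, so no default encoding has to be guessed later.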
David
--
GPG keyID #6272EDAF on http://pgp.mit.edu
Key fingerprint = 8BAA 7E11 8856 E148 6833 655A 92E2 3E00 6272 EDAF
