Okay, I get it now ... reading/writing files with the codecs module and the 'utf-8' option fixes it. Thanks!
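For anyone finding this thread later, here is a minimal sketch of what that fix might look like. The file name and sample text are made up for illustration, and this is only one way to do it with codecs.open, not necessarily the exact code used here.

```python
# -*- coding: utf-8 -*-
import codecs
import os
import tempfile

# Hypothetical sample text containing U+201C and U+201D, the kind of
# character that triggered the UnicodeEncodeError in the original report.
text = u"\u201cquoted\u201d text"

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# Write: codecs.open encodes the unicode text to UTF-8 bytes on the way out.
with codecs.open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Read: codecs.open decodes the UTF-8 bytes back into unicode text.
with codecs.open(path, "r", encoding="utf-8") as f:
    round_tripped = f.read()
```

The point is that the encoding is stated explicitly at the file boundary, so nothing falls back to the default 'ascii' codec.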
From: Christian Witts
Sent: Thursday, June 04, 2009 7:05 AM
To: Dinesh B Vadhia
Cc: tutor@python.org
Subject: Re: [Tutor] unicode, utf-8 problem again

Dinesh B Vadhia wrote:
> Hi! I'm processing a large number of xml files that are all declared
> as utf-8 encoded in the header ie.
>
> <?xml version="1.0" encoding="UTF-8"?>
>
> My Python environment has been set for 'utf-8' through site.py.
> Additionally, the top of each program/module has the declaration:
>
> # -*- coding: utf-8 -*-
>
> But, I still get this error:
>
> Traceback (most recent call last):
> ...
> UnicodeEncodeError: 'ascii' codec can't encode character u'\u201c' in
> position 76: ordinal not in range(128)
>
> What am I missing?
>
> Dinesh
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Tutor maillist - Tutor@python.org
> http://mail.python.org/mailman/listinfo/tutor
>

Hi,

Take a read through http://evanjones.ca/python-utf8.html which will give you insight as to how you should be reading and processing your files. As for the encoding line "# -*- coding: utf-8 -*-", that is actually to declare the character encoding of your script and not of potential data it will be working with.

--
Kind Regards,
Christian Witts
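To illustrate the distinction, here is a small sketch (an illustration only, not code from the original programs) showing that the coding declaration has no effect on how data is encoded. The character is the one from the traceback; the variable names are hypothetical.

```python
# -*- coding: utf-8 -*-
# The declaration above only tells Python how this source file is encoded.
# It does not change how text data is encoded when it is converted to bytes.

s = u"\u201c"  # LEFT DOUBLE QUOTATION MARK, the character from the traceback

# Encoding with the 'ascii' codec fails because U+201C is outside range(128).
try:
    s.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Encoding explicitly as UTF-8 succeeds.
utf8_bytes = s.encode("utf-8")
```

So the data has to be encoded or decoded explicitly as UTF-8 wherever it enters or leaves the program.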