> To: tutor@python.org
> From: __pete...@web.de
> Date: Tue, 14 Aug 2012 16:03:46 +0200
> Subject: Re: [Tutor] output not in ANSI, conversing char set to
> locale.getpreferredencoding()
>
> leon zaat wrote:
>
> > I get the error:
> > UnicodeDecodeError: 'ascii' codecs can't decode byte 0xc3 in position 7:
> > ordinal not in range(128) for the openbareruimtenaam=u'' +
> > (openbareruimtenaam1.encode(chartype)) line.
>
> The error message means that database.select() returns a byte string.
>
>     bytestring.encode(encoding)
>
> implicitly attempts
>
>     bytestring.decode("ascii").encode(encoding)
>
> and will fail for non-ascii bytestrings no matter what encoding you pass to
> the encode() method.
>
> > I know that the default system codecs is ascii and chartype=b'cp1252'
> > But how can i get the by pass the ascii encoding?
>
> You have to find out the database encoding -- then you can change the
> failing line to
>
>     database_encoding = ... # you need to find out yourself, but many use the
>                             # UTF-8 -- IMO the only sensible choice these days
>     file_encoding = "cp1252"
>
>     openbareruimtenaam = openbareruimtenaam1.decode(
>         database_encoding).encode(file_encoding)
>
> As you now have a bytestring again you can forget about codecs.open() which
> won't work anyway as the csv module doesn't support unicode properly in
> Python 2.x (The csv documentation has the details).
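The implicit step described in the quoted reply can be reproduced explicitly. This is only a minimal sketch: it assumes the database returns UTF-8 bytes, which the thread does not confirm (the 0xc3 byte at position 7 is consistent with "ë" in UTF-8, but that is an inference). The variable names are made up for illustration.

```python
# -*- coding: utf-8 -*-
# The street name from the thread as UTF-8 bytes.
# ASSUMPTION: UTF-8 is the database encoding; the thread never confirms this.
raw = b"P.J. No\xc3\xabl Bakerstraat"

# Python 2 runs this decode implicitly when .encode() is called on a byte
# string. Doing it by hand raises the same UnicodeDecodeError.
try:
    raw.decode("ascii")
    failure = None
except UnicodeDecodeError as exc:
    failure = exc  # failure.start is the index of the offending byte

# The explicit route: decode with the real source encoding, then encode
# to the encoding wanted in the output file.
text = raw.decode("utf-8")
cp1252_bytes = text.encode("cp1252")
```

Here `failure.start` is 7, matching "position 7" in the traceback, and `cp1252_bytes` holds the single byte 0xeb where the UTF-8 input had the two bytes 0xc3 0xab.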
I tried it with:

    openbareruimtenaam = openbareruimtenaam1.decode("UTF-8").encode("cp1252")

but it still complains about the ascii error.

Prior message:

import csv
import codecs
import locale

# Global variable
bagObjecten = []
chartype=locale.getpreferredencoding()

#------------------------------------------------------------------------------
# BAGExtractPlus shows the main screen of the BAG Extract+ tool
#------------------------------------------------------------------------------
class BAGExtractPlus(wx.Frame):

    #------------------------------------------------------------------------------
    # writing the records
    #------------------------------------------------------------------------------
    def schrijfExportRecord(self, verblijfhoofd,identificatie):
        sql1="";
        sql1="Select openbareruimtenaam, woonplaatsnaam from nummeraanduiding where identificatie = '" + identificatie + "'"
        num= database.select(sql1);
        for row in num:
            openbareruimtenaam1=row[0]
            openbareruimtenaam=u'' + (openbareruimtenaam1.encode(chartype))
            woonplaatsnaam1=(row[0]);
            woonplaatsnaam=u'' + (woonplaatsnaam1.encode(chartype))
            newrow=[openbareruimtenaam, woonplaatsnaam];
            verblijfhoofd.writerow(newrow);

    #--------------------------------------------------------------------------------------
    # Export the required data
    #--------------------------------------------------------------------------------------
    def ExportBestanden(self, event):
        ofile=codecs.open(r'D:\bestanden\BAG\adrescoordinaten.csv', 'wb', chartype)
        verblijfhoofd = csv.writer(ofile, delimiter=',', quotechar='"', quoting=csv.QUOTE_NONNUMERIC)
        counterVBO=2;
        identificatie='0014010011066771';
        while 1 < counterVBO:
            hulpIdentificatie= identificatie;
            sql="Select identificatie, hoofdadres, verblijfsobjectgeometrie from verblijfsobject where ";
            sql= sql + "identificatie > '" + hulpIdentificatie ;
            vbo= database.select(sql);
            if not vbo:
                break;
            else:
                for row in vbo:
                    identificatie=row[0];
                    verblijfobjectgeometrie=row[2];
                    self.schrijfExportRecord(verblijfhoofd, identificatie)

I highlighted in red the lines I think are important.

When I try to convert openbareruimtenaam from the data below:

    "P.J. Noël Bakerstraat";"Groningen"

I get the error:

    UnicodeDecodeError: 'ascii' codecs can't decode byte 0xc3 in position 7: ordinal not in range(128)

for the line

    openbareruimtenaam=u'' + (openbareruimtenaam1.encode(chartype))

I know that the default system codec is ascii and chartype=b'cp1252'.
But how can I bypass the ascii encoding?
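One way to apply the decode-then-encode advice to every field of a row is sketched below. The helper `to_file_bytes` and its parameter names are hypothetical and not part of the original script, and the UTF-8 default is an assumption about the database that would need to be checked.

```python
def to_file_bytes(value, database_encoding="utf-8", file_encoding="cp1252"):
    """Return value as a byte string in file_encoding.

    Hypothetical helper: byte strings are first decoded with
    database_encoding (ASSUMED to be UTF-8); text is encoded directly.
    """
    if isinstance(value, bytes):  # in Python 2, bytes is an alias for str
        value = value.decode(database_encoding)
    return value.encode(file_encoding)


# A row shaped like the sample data in the question, as UTF-8 bytes.
row = (b"P.J. No\xc3\xabl Bakerstraat", b"Groningen")
newrow = [to_file_bytes(field) for field in row]
```

Since each result is already a byte string, it would go to writerow() as is. Prefixing it with u'' + makes Python 2 run the implicit ASCII decode again, which fails on any non-ASCII byte.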