[EMAIL PROTECTED] wrote:
> Hi !
>
> I want to get the WMI infos from Windows machines.
> I use Py from HU (iso-8859-2) charset.
>
> Then I wrote some utility for it, because I want to write it to an XML file.
>
> def ToHU(s,NoneStr='-'):
>     if s==None: s=NoneStr
>     if not (type(s) in [type(''),type(u'')]):
>         s=str(s)
>     if type(s)<>type(u''):
>         s=unicode(s)
>     s=s.replace(chr(0),' ');
>     s=s.encode('iso-8859-2')
>     return s
>
> This fn is working, but I have been got an error with this value:
> 'Kommunik\xe1ci\xf3s port (COM1)'
>
> This routine demonstrates the problem
>
> s='Kommunik\xe1ci\xf3s port (COM1)'
> print s
> print type(s)
> print type(u'aaa')
> s=unicode(s) # error !
>
> This is makes me mad.
> How to I convert every objects to string, and convert (encode) them to
> iso-8859-2 (if needed) ?
>

s is a 'byte string' - a series of characters encoded in bytes. (As is
every string on some level.) In order to convert that to a unicode
object, Python needs to know what encoding is used. In other words, it
needs to know what character each byte represents.

See this:

    >>> t = s.decode('iso-8859-1')
    >>> t
    u'Kommunik\xe1ci\xf3s port (COM1)'
    >>> print t
    Kommunikációs port (COM1)
    >>> print type(s)
    <type 'str'>
    >>> print type(t)
    <type 'unicode'>

The decode call converts s into a unicode string - where Python knows
what every character is.

If you call unicode with no encoding specified, Python falls back to the
system default - which is *probably* 'ascii'. Your string contains bytes
which have *no meaning* in the ascii codec - so it reports an error.

Does this help?

Once you 'get unicode', Python support for it is pretty easy. It's a
slightly complicated subject though.

Basically you need to *know* what encoding is being used, and whenever
you convert between unicode and byte-strings you need to specify it.

What can complicate matters is that there are lots of times an
*implicit* conversion can take place. Adding strings to unicode objects,
printing strings, or writing them to a file are the usual times implicit
conversion can happen. If you haven't specified an encoding, then Python
has to use the system default or the file object default (sys.stdout
often has a different default encoding than the one returned by
sys.getdefaultencoding()).

It is these implicit conversions that often cause the
'UnicodeDecodeError's and 'UnicodeEncodeError's.

HTH

Best Regards,

Fuzzy
http://www.voidspace.org.uk/python

> Please help me !
>
> Thanx for help:
> ft
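
To make the "always name the encoding" advice concrete, here is a minimal sketch of how the ToHU helper from the question might be reworked. It is not code from either poster: the name to_hu and the source_encoding parameter are hypothetical, and it assumes the incoming byte strings really are iso-8859-2, as the question states. (For the two accented bytes in the example, \xe1 and \xf3, iso-8859-1 and iso-8859-2 happen to agree, which is why either codec decodes this particular string correctly.) The sketch is written so that it should run unchanged on Python 2.6+ and Python 3.3+.

```python
# Hypothetical rework of ToHU: every str <-> unicode conversion names its
# codec, so nothing depends on the interpreter's default encoding.

try:
    text_type = unicode   # Python 2: the unicode type
except NameError:
    text_type = str       # Python 3: str is already the unicode type


def to_hu(value, none_str=u'-', source_encoding='iso-8859-2'):
    """Return value as an iso-8859-2 encoded byte string.

    source_encoding is an assumption about how incoming byte strings
    are encoded; it has to match the real data.
    """
    if value is None:
        value = none_str
    if isinstance(value, bytes):
        # Byte string (plain str on Python 2): decode it explicitly.
        text = value.decode(source_encoding)
    elif isinstance(value, text_type):
        text = value
    else:
        # Numbers and other objects: use their text representation.
        text = text_type(value)
    # Replace NUL characters, as the original helper did.
    text = text.replace(u'\x00', u' ')
    return text.encode('iso-8859-2')


raw = b'Kommunik\xe1ci\xf3s port (COM1)'

# Decoding with the ascii codec fails, which is the error from the question.
try:
    raw.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# With the encoding named explicitly, the round trip is lossless.
encoded = to_hu(raw)
```

With this version, to_hu(raw) hands back the same bytes it was given, and None or an integer is converted without touching the default codec. If the WMI data turns out to arrive in some other encoding (a Windows code page, say), only the source_encoding argument would need to change.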