Oleg Oltar wrote:

> I am trying to decode a string I took from file:
>
> file = open ("./Downloads/lamp-post.csv", 'r')
> data = file.readlines()
> data[0]
>
> '\xff\xfeK\x00e\x00y\x00w\x00o\x00r\x00d\x00\t\x00C\x00o\x00m\x00p\x00e\x00t\x00i\x00t\x00i\x00o\x00n\x00\t\x00G\x00l\x00o\x00b\x00a\x00l\x00
> How do I convert this to something human readable?

If you stare at it long enough you'll see the usual ASCII characters
interspersed with zero bytes, which Python shows as "\x00". The leading
"\xff\xfe" is the byte order mark of little-endian UTF-16, so this is a
UTF-16 file. Open it with

import codecs

filename = "./Downloads/lamp-post.csv"
with codecs.open(filename, "r", encoding="utf-16") as file:
    for line in file:
        print line

Note that 'line' will now contain a unicode string instead of a byte
string. If you want to write that to a file you have to encode it
manually:

line = u"äöü"
with open("tmp.txt", "w") as f:
    f.write(line.encode("utf-8"))

or use codecs.open() again:

with codecs.open("tmp.txt", "w", encoding="utf-8") as f:
    f.write(line)
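
For readers on Python 3, here is a minimal sketch of the same idea, not a drop-in replacement for the code above. It assumes Python 3, where the built-in open() accepts an encoding argument directly. The helper name decode_utf16_bytes, the temporary file sample.csv, and the shortened sample bytes (only the "Keyword<TAB>" prefix of data[0]) are made up for illustration.

```python
import codecs
import os
import tempfile


def decode_utf16_bytes(raw):
    """Decode bytes that start with a UTF-16 byte order mark (BOM)."""
    # codecs.BOM_UTF16_LE is b'\xff\xfe'; codecs.BOM_UTF16_BE is b'\xfe\xff'.
    if not raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        raise ValueError("no UTF-16 byte order mark found")
    # The "utf-16" codec reads the BOM, picks the byte order, and drops the BOM.
    return raw.decode("utf-16")


# Hypothetical sample: just the first few bytes shown in the question.
sample = b"\xff\xfeK\x00e\x00y\x00w\x00o\x00r\x00d\x00\t\x00"
text = decode_utf16_bytes(sample)
print(repr(text))

# Round trip through a temporary file (path is illustrative only):
# write the raw bytes, then let open() do the decoding while reading.
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "wb") as f:
    f.write(sample)
with open(path, "r", encoding="utf-16") as f:
    first_line = f.readline()
print(repr(first_line))
```

Both print calls should show the decoded text without the byte order mark or the zero bytes. Checking for the BOM first is optional; it only turns a silent mis-decode of non-UTF-16 data into an explicit error.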