On Wed, Jan 21, 2009 at 1:59 PM, bilgin arslan <a.bilgi...@gmail.com> wrote:
> Hello,
> I am trying to write a list of words to into a text file as two
> colons: word (tab) len(word)
> such as
>
> standart 8
>
> I have no trouble writing the words but I couldn't write integers. I
> always get strange characters, such as:
>
> GUN
> 㐊娀䄀䴀䄀一ഀ5COCUK
> 㐊䬀䄀䐀䤀一ഀ5EV
> ...
> 㜊夀䄀䴀䄀ഀ4YATSI
> 㔊娀䤀䰀䜀䤀吀ഀ�

Looks like an encoding problem to me.

>
> (the integers also seem to be incorrect)
> I use the following form inside a loop to produce this
> current = unicode(word)+" "+str(len(word))
> ofile.write(current)
>
>
> I know about struct and I tried to used it but somehow I always got a
> blank character instead of an int.
>
> import struct
> format = "i"
> data = struct.pack(format, 24)
> print data

struct packs the value into a binary byte string rather than its
decimal text. On a little-endian machine, 24 packed with the format "i"
is the four bytes 18 00 00 00 (hex). None of those bytes is a printable
character, so you see a blank instead. Your original idea should work
once you get the encoding problem fixed.

>
>
> Any ideas?
> I use macosx and eclipse. The code also uses unicode encoding

Unicode is NOT an encoding. It is a standard. You're probably thinking
of UTF-8, one of several Unicode encodings (UTF-16 and UTF-32 are
others). This page does a great job of explaining what Unicode
actually is:
http://www.joelonsoftware.com/articles/Unicode.html

Try using ofile.write(current.encode("UTF-8")) and see if that helps.
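
For illustration, here is a minimal sketch of the suggestion above,
assuming a Python version that ships the io module (2.6 or later,
including 3.x). The word list, the helper names format_line and
write_word_lengths, and the temporary file path are made up for the
example; they are not from the original code. The last line packs 24
with an explicit little-endian format to show why struct output does
not print as digits.

```python
import io
import os
import struct
import tempfile

# Hypothetical sample data; the original post's word list is not shown.
words = [u"standart", u"GUN", u"COCUK", u"EV"]


def format_line(word):
    """Return 'word<TAB>length<NEWLINE>' as text."""
    return u"%s\t%d\n" % (word, len(word))


def write_word_lengths(path, words):
    """Write one 'word<TAB>length' line per word, encoded as UTF-8."""
    # io.open encodes the text on write, so no manual .encode() is needed.
    with io.open(path, "w", encoding="utf-8") as ofile:
        for word in words:
            ofile.write(format_line(word))


# Write to a throwaway file (assumed location, for the example only).
path = os.path.join(tempfile.mkdtemp(), "words.txt")
write_word_lengths(path, words)

# Read the file back with the same encoding to check the round trip.
with io.open(path, "r", encoding="utf-8") as infile:
    lines = infile.read().splitlines()

# struct.pack yields raw bytes, not digits. "<i" forces little-endian
# byte order so the result is the same on every machine.
packed = struct.pack("<i", 24)
```

Opening the file with an explicit encoding does the same job as calling
.encode("UTF-8") on every string before writing, and it keeps the
encoding choice in one place.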
--
http://mail.python.org/mailman/listinfo/python-list