Hi, I am having some encoding problems when I first parse stuff from a
non-english website using BeautifulSoup and then write the results to
a txt file.

I have the text both as a normal (text) and as a unicode string
(utext):
print repr(text)
'Branie zak\xc2\xb3adnik\xc3\xb3w'

print repr(utext)
u'Branie zak\xb3adnik\xf3w'

print text or print utext (fileSoup.prettify() also shows 'wrong'
symbols):
Branie zak³adników


Now I am trying to save this to a file but I never get the encoding
right. Here is what I tried (+ lot's of different things with encode,
decode...):
outFile=open(filePath,"w")
outFile.write(text)
outFile.close()

outFile=codecs.open( filePath, "w", "UTF8" )
outFile.write(utext)
outFile.close()

Thanks!!





-- 
http://mail.python.org/mailman/listinfo/python-list

Reply via email to