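You might also have luck with encoding='cp1252'. This is the "standard" Windows code page for Western European text. It agrees with latin-1 for the accented letters and differs only in the 0x80-0x9F range.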
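
For what it's worth, here is a rough sketch of what I mean. The helper name read_text and the temporary demo file are made up for illustration, and it assumes the file really does store e-circumflex as the single byte 0xEA. If the editor actually saved the file as UTF-8, you would pass encoding='utf-8' instead.

```python
import io
import os
import tempfile


def read_text(path, encoding="cp1252"):
    """Read a whole file and decode it to unicode with the given encoding.

    io.open behaves the same on Python 2.7 and Python 3.
    """
    with io.open(path, "r", encoding=encoding) as f:
        return f.read()


# Demo only: write raw bytes containing 0xEA (e-circumflex in cp1252
# and in latin-1) to a throwaway file, then read it back as text.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(b"f\xeate\n")

text = read_text(demo_path)
os.remove(demo_path)

# The decoded string holds one code point, U+00EA, for the accented letter.
print(repr(text))
```

If the decode raises UnicodeDecodeError, or the accents still come out as two odd characters, that is a hint the file is not in the encoding you passed.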

On Sat, Mar 22, 2014 at 12:52 AM, Michael O'Donnell <michael.odonn...@uam.es> wrote:
> Dear Cam,
>
> Python 3 is so much better at dealing with unicode than
> Python 2.
>
> But, that said. Your file is in an encoding
> that is not latin-1 (which is basically an anglo
> encoding, no good if your text has inflections/accents).
>
> Solution:
>
> 1. Open your text file in a browser
>
> 2. If the file displays ok in the browser,
> see what encoding the browser used
> to decode the file: there is usually an "Encoding"
> option in the menu somewhere, e.g. in Chrome,
> under the View menu.
>
> Assume for this example that it is iso-8859-1
>
> 3. Change your file opening to:
>
> F = codecs.open('temp.txt', encoding='iso-8859-1')
>
> That should fix it. You can read from the file
> directly as a unicode string.
>
> Mick
>
> On 22 March 2014 03:26, Cam Farnell <ms...@bitflipper.ca> wrote:
>> Technically this is a Python question, not a Tkinter question, but it's in
>> the context of a Tkinter application so I don't feel *too* guilty about
>> posting it here.
>>
>> OK. I've got a Tkinter application (running with Python 2.7.2 on Ubuntu
>> 12.04.4 LTS) that needs to handle French accented characters. And it does
>> handle accented characters just fine. I can type an accented character into
>> an Entry and it shows up correctly. I can display it on a Text. I can
>> cPickle it to disk and read it back. For example, if I enter e-circumflex
>> (in a Tkinter Entry) and then print it using repr I get:
>>
>> u'\xea'
>>
>> If I look in the cPickled file there are 0xEA's where the e-circumflex
>> characters are. So far so good.
>>
>> The problem comes when I need to read into my Tkinter application a file
>> which has accented characters and which was prepared using a text editor
>> like, for example, gedit. The file to be read also has 0xEA's to represent
>> e-circumflex. However, when I read such a file the resulting string then
>> contains u'\cd\xaa' where the e-circumflexes belong. I don't know who is
>> doing the unwanted conversion or how to make it go away. I've tried reading
>> in binary mode, I've tried opening the file using:
>>
>> F = codecs.open('temp.txt', encoding='latin-1')
>>
>> I've tried putting:
>>
>> # -*- coding: latin-1 -*-
>>
>> as the second line of my program. I've tried reading Python/unicode
>> documentation till my eyes went blurry. All to no avail.
>>
>> There is probably some really simple solution to this, but so far I've
>> failed to find it.
>>
>> Thus, if anyone out there in Tkinter land knows the simple solution or could
>> point me to a good source of information I would greatly appreciate it.
>>
>> Thanks
>>
>> Cam Farnell
>>
>> _______________________________________________
>> Tkinter-discuss mailing list
>> Tkinter-discuss@python.org
>> https://mail.python.org/mailman/listinfo/tkinter-discuss

--
**** Listen to my FREE CD at http://www.mellowood.ca/music/cedars ****
Bob van der Poel ** Wynndel, British Columbia, CANADA **
EMAIL: b...@mellowood.ca
WWW: http://www.mellowood.ca