[Tkinter-discuss] OT: Unicode

Cam Farnell Fri, 21 Mar 2014 20:01:22 -0700

Technically this is a Python question, not a Tkinter question, but it's in the 
context of a Tkinter application so I don't feel *too* guilty about posting it 
here.


OK. I've got a Tkinter application (running with Python 2.7.2 on Ubuntu 
12.04.4 LTS) that needs to handle French accented characters. And it does 
handle accented characters just fine. I can type an accented character into an 
Entry and it shows up correctly. I can display it on a Text. I can cPickle it 
to disk and read it back. For example, if I enter e-circumflex (in a Tkinter 
Entry) and then print it using repr I get:

    u'\xea'

If I look in the cPickled file there are 0xEA's where the e-circumflex 
characters are. So far so good.
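
One way to check which bytes a file really contains, independent of how an 
editor or terminal chooses to display them, is to read it in binary mode and 
dump the bytes as hex. The following is only a minimal sketch: hex_dump and 
demo.bin are hypothetical names made up for illustration, and the demo file is 
written by the snippet itself rather than taken from the application.

```python
import binascii
import os
import tempfile

def hex_dump(path):
    """Return the file's raw bytes as a lowercase hex string."""
    with open(path, 'rb') as f:
        return binascii.hexlify(f.read()).decode('ascii')

# Hypothetical demo file in a scratch directory.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.bin')

# e-circumflex stored as a single Latin-1 byte.
with open(demo_path, 'wb') as out:
    out.write(u'\xea'.encode('latin-1'))
latin1_hex = hex_dump(demo_path)

# The same character stored as UTF-8 takes two bytes.
with open(demo_path, 'wb') as out:
    out.write(u'\xea'.encode('utf-8'))
utf8_hex = hex_dump(demo_path)

print(latin1_hex, utf8_hex)
```

A lone "ea" in the dump would indicate a Latin-1 style encoding, while "c3aa" 
would indicate UTF-8.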

The problem comes when I need to read into my Tkinter application a file which 
has accented characters and which was prepared using a text editor like, for 
example, gedit. The file to be read also has 0xEA's to represent e-circumflex. 
However, when I read such a file the resulting string then contains u'\xc3\xaa' 
where the e-circumflexes belong. I don't know who is doing the unwanted 
conversion or how to make it go away. I've tried reading in binary mode, I've 
tried opening the file using:

    F = codecs.open('temp.txt', encoding='latin-1')

I've tried putting:

    # -*- coding: latin-1 -*-

as the second line of my program. I've tried reading Python/unicode 
documentation till my eyes went blurry. All to no avail.
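
To make the codecs.open attempt concrete, here is a minimal sketch of how the 
encoding argument changes what comes back from the same bytes. It assumes, 
purely for illustration, a file in which e-circumflex is stored as the two 
UTF-8 bytes 0xC3 0xAA; whether the real file looks like that is not 
established here. read_text is a hypothetical helper, and the sample file is 
created by the snippet in a scratch directory.

```python
import codecs
import os
import tempfile

def read_text(path, encoding):
    """Read the whole file and decode it using the given encoding."""
    f = codecs.open(path, 'r', encoding=encoding)
    try:
        return f.read()
    finally:
        f.close()

# Assumed sample content: e-circumflex saved as UTF-8 (bytes 0xC3 0xAA).
sample_path = os.path.join(tempfile.mkdtemp(), 'temp.txt')
with open(sample_path, 'wb') as out:
    out.write(b'\xc3\xaa')

# Decoding as Latin-1 maps each byte to its own character.
as_latin1 = read_text(sample_path, 'latin-1')

# Decoding as UTF-8 combines the two bytes into one character.
as_utf8 = read_text(sample_path, 'utf-8')

print(repr(as_latin1), repr(as_utf8))
```

Note that the coding line at the top of a program only tells Python how to 
decode string literals in the source file itself; it has no effect on data 
read from other files at run time.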

There is probably some really simple solution to this, but so far I've failed 
to find it.

Thus, if anyone out there in Tkinter land knows the simple solution or could 
point me to a good source of information I would greatly appreciate it.

Thanks

Cam Farnell

_______________________________________________
Tkinter-discuss mailing list
Tkinter-discuss@python.org
https://mail.python.org/mailman/listinfo/tkinter-discuss
