> Recall:
> When I read data using sql I got a sequence like this:
> \x88\x89\x85
> But when I entered Hebrew words directly in the print statement (or as
> a dictionary key)
> I got this:
> \xe8\xe9\xe5
>
> Now, scanning the encoding module I discovered that cp1255 maps
> '\u05d9' to \xe9
> while cp856 maps '\u05d9' to \x89,
> so transforming \x88\x89\x85 to \xe8\xe9\xe5 is done by
Hebrew Windows apparently uses cp1255 (aka windows-1255) as the "ANSI
code page", used in all GUI APIs, and cp856 as the "OEM code page",
used in terminal windows - and, for some reason, in MS SQL.

> My question is, is there a way I can deduce cp856 and cp1255 from the
> string itself?

That's not possible. You have to know where the string comes from
to know what the encoding is.

In this specific case, if the string comes out of MS SQL, it apparently
is encoded in cp856 (but I'm sure you can specify the client encoding
somewhere in SQL Server, or in pymssql).

> I don't know how IDLE guessed cp856, but it must have done it.

I don't know why you think it did. You said you entered \xe9 directly
into the source code in IDLE, so
a) this is windows-1255, not cp856, and
b) IDLE just *used* windows-1255 (i.e. the ANSI code page), it did
   not guess it.

If you are claiming that the program

    import pymssql

    con = pymssql.connect(host='192.168.13.122',user='sa',password='',database='tempdb')
    cur = con.cursor()
    cur.execute('select firstname, lastname from [users]')
    lines = cur.fetchall()
    print repr(lines[0])

does different things depending on whether it is run in IDLE or in a
terminal window - I find that hard to believe. IDLE/Tk has nothing to
do with that. It's the *repr* that you are printing, i.e. all escaping
has been done before IDLE/Tk even sees the text. So it must have been
pymssql that returns different data in each case.

It could be that the DB-API does such things, see

http://msdn2.microsoft.com/en-us/library/aa937147(SQL.80).aspx

Apparently, they do the OEMtoANSI conversion when you run a console
application (i.e. python.exe), whereas they don't convert when running
a GUI application (pythonw.exe).

I'm not quite sure how they find out whether the program is a console
application or not; the easiest thing to do might be to turn the
autoconversion off on the server.

Regards,
Martin
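
For illustration, here is a minimal sketch of the two points above: the
re-encoding from the OEM to the ANSI code page, and why the encoding
cannot be deduced from the bytes alone. It assumes Python 3 (bytes and
str are separate types there, unlike in the Python 2 program quoted
above) and assumes the bytes are already known to be cp856. The helper
name oem_to_ansi is hypothetical, not part of pymssql or any library.

```python
# Sketch only: assumes the input really is cp856 (the OEM code page).
OEM = "cp856"    # OEM code page, as used by the terminal and apparently MS SQL
ANSI = "cp1255"  # ANSI code page (windows-1255), as used by GUI APIs


def oem_to_ansi(raw: bytes) -> bytes:
    """Hypothetical helper: re-encode cp856 bytes as cp1255 bytes."""
    return raw.decode(OEM).encode(ANSI)


from_sql = b"\x88\x89\x85"          # the sequence reported from the SQL query

text = from_sql.decode(OEM)         # Unicode text, independent of code page
print(ascii(text))                  # '\u05d8\u05d9\u05d5'
print(oem_to_ansi(from_sql))        # b'\xe8\xe9\xe5'

# The same bytes also decode under cp1255 without any error; they just
# yield different (non-Hebrew) characters. Nothing in the bytes themselves
# says which interpretation was intended.
wrong = from_sql.decode(ANSI)
print(wrong == text)                # False
```

Decoding to Unicode as early as possible, and encoding only on output,
avoids having to track which code page a given byte string is in.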