Hi Thomas,

Linux is my weakest platform, but if the filenames are stored as byte strings 
on disk (no reason to believe they are not), then how that byte string is 
interpreted is a function of the encoding of the bytes.  The encoding could be 
UTF-8 or something else, but since the Python 3.0+ APIs (such as 
Py_SetProgramName<http://docs.python.org/3/c-api/init.html?highlight=py_setprogramname#Py_SetProgramName>
 and 
PySys_SetArgv<http://docs.python.org/3/c-api/init.html?highlight=pysys_setargv#PySys_SetArgv>)
 take wide character strings, the values incoming from the program invocation 
must already be converted to wide character strings, and in fact they are in 
the bases code, via mbstowcs.

My goal is to eliminate this conversion from Windows (since we can get wide 
characters from the OS there), and make sure that the right conversion is 
happening on Mac and Linux.  As it stands, mbstowcs is using the empty string 
for LC_CTYPE, which looks suspicious to me.
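
For reference, here is a minimal sketch of the kind of conversion in question. It is not the actual bases code, and both function names are made up for illustration. Per the C standard, passing "" to setlocale selects the native locale of the environment (on POSIX systems, whatever LC_ALL, LC_CTYPE or LANG say), so the result of mbstowcs depends on how the user's session is configured. The sketch also relies on the POSIX behaviour of mbstowcs returning the required length when given a NULL destination.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: adopt the character encoding of the user's
 * environment. Returns 1 on success, 0 if that locale is unavailable. */
int use_environment_locale(void)
{
    /* "" means "take LC_CTYPE from the environment", not "no locale". */
    return setlocale(LC_CTYPE, "") != NULL;
}

/* Hypothetical helper: decode a multibyte string using whichever
 * LC_CTYPE locale is currently active. Returns a malloc'd wide string
 * that the caller must free, or NULL on failure. */
wchar_t *decode_with_current_locale(const char *bytes)
{
    size_t needed;
    wchar_t *wide;

    assert(bytes != NULL);

    /* POSIX: with a NULL destination, mbstowcs only counts characters. */
    needed = mbstowcs(NULL, bytes, 0);
    if (needed == (size_t)-1)
        return NULL;  /* bytes are not valid in the active encoding */

    wide = malloc((needed + 1) * sizeof(wchar_t));
    if (wide == NULL)
        return NULL;

    if (mbstowcs(wide, bytes, needed + 1) == (size_t)-1) {
        free(wide);
        return NULL;
    }
    return wide;
}
```

A launcher would call use_environment_locale() once at startup and then decode each argv entry. If the bytes on disk are not valid in the session's encoding, the decode fails, which is the case I am worried about.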

So, after digging some, I did notice that the string decoding functions in 
Python 3 do have the useful 'Py_FileSystemDefaultEncoding' constant, which is 
already used in Common.c.  So, I believe that perhaps this (or maybe 
PyUnicode_DecodeFSDefault<http://docs.python.org/3/c-api/unicode.html?highlight=py_filesystemdefaultencoding#PyUnicode_DecodeFSDefault>
 along with 
PyUnicode_AsWideCharString<http://docs.python.org/3/c-api/unicode.html?highlight=py_filesystemdefaultencoding#PyUnicode_AsWideCharString>)
 can be leveraged earlier in order to bypass the call to mbstowcs with the 
questionable LC_CTYPE... and I won't have to make assumptions about the 
incoming encoding.

Granted, I haven't exercised this on non-Windows platforms yet, so it's all 
hypothetical at the moment.

Steven

From: Thomas Kluyver
Sent: Monday, July 29, 2013 11:01 AM
To: primary discussion list for use and development of cx_Freeze
Subject: Re: [cx-freeze-users] Problems launching Frozen Python 3.3 application 
located in path with international characters.

On 29 July 2013 15:39, Steven Velez wrote:
This means that the bases for Python 3 on Windows will not support Windows 
9x... is that a concern?

I don't think so - Python itself apparently dropped support for Windows 9x in 
Python 2.6:
http://docs.python.org/3/whatsnew/2.6.html#port-specific-changes-windows

For Python >= 3 on other platforms, I intend to do conversions to wide 
characters assuming a UTF-8 encoding in the source unless overridden by a 
cx-specific environment variable.

I'm not sure about this. On Linux, as I understand it, filenames are 
represented as bytes, which are generally *displayed* as UTF-8 by convention. 
Decoding all filenames to wide characters and later re-encoding them seems 
destined to create problems somewhere. Is it possible to do this so that the 
wide character filenames are only used on Windows?

Thanks,
Thomas