Hi Thomas,
Linux is my weakest platform, but if the filenames are stored as byte strings
on disk (no reason to believe they are not), then how that byte string is
interpreted is a function of the encoding of the bytes. The encoding could be
UTF-8 or something else, but since the Python 3.0+ APIs (such as
Py_SetProgramName<http://docs.python.org/3/c-api/init.html?highlight=py_setprogramname#Py_SetProgramName>
and
PySys_SetArgv<http://docs.python.org/3/c-api/init.html?highlight=pysys_setargv#PySys_SetArgv>)
take wide character strings, the values coming in from the program invocation
must already be converted to wide character strings, and in fact the bases
code does this with mbstowcs.
My goal is to eliminate this conversion from Windows (since we can get wide
characters from the OS there), and make sure that the right conversion is
happening on Mac and Linux. As it stands, mbstowcs is using the empty string
for LC_CTYPE, which looks suspicious to me.
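For reference, the kind of conversion in question can be sketched as below. This is a minimal illustration, not the actual code from the bases: decode_arg is a hypothetical helper name, and the sketch assumes the POSIX behavior where mbstowcs with a NULL destination returns the required length. Passing the empty string to setlocale for LC_CTYPE asks the C library to take the locale from the environment, so the result of mbstowcs depends on how the user's LANG / LC_ALL / LC_CTYPE variables are set.

```c
#include <locale.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper (not the cx_Freeze source): decode one multibyte
 * argument into a newly allocated wide string using the current LC_CTYPE
 * locale. Returns NULL on failure; the caller frees the result. */
wchar_t *decode_arg(const char *arg)
{
    /* With a NULL destination, POSIX mbstowcs returns the number of wide
     * characters needed, or (size_t)-1 for an invalid byte sequence. */
    size_t needed = mbstowcs(NULL, arg, 0);
    if (needed == (size_t)-1)
        return NULL;

    wchar_t *wide = malloc((needed + 1) * sizeof(wchar_t));
    if (wide == NULL)
        return NULL;

    /* Second pass performs the conversion, including the terminator. */
    if (mbstowcs(wide, arg, needed + 1) == (size_t)-1) {
        free(wide);
        return NULL;
    }
    return wide;
}
```

A caller would typically run setlocale(LC_CTYPE, "") once at startup and then pass each argv entry through the helper. If the bytes are not valid in the selected locale, the helper returns NULL rather than a wide string.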
So, after some digging, I noticed that the string decoding functions in
Python 3 have the useful 'Py_FileSystemDefaultEncoding' constant, which is
already used in Common.c. So I believe that this (or maybe
PyUnicode_DecodeFSDefault<http://docs.python.org/3/c-api/unicode.html?highlight=py_filesystemdefaultencoding#PyUnicode_DecodeFSDefault>
along with
PyUnicode_AsWideCharString<http://docs.python.org/3/c-api/unicode.html?highlight=py_filesystemdefaultencoding#PyUnicode_AsWideCharString>)
can be leveraged earlier in order to bypass the call to mbstowcs with the
questionable LC_CTYPE... and I won't have to make assumptions about the
incoming encoding.
Granted, I haven't exercised this on non-Windows platforms yet, so it's all
hypothetical at the moment.
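One way to exercise a conversion on Linux or Mac might be a round-trip check along the lines below. It is only a sketch using the plain C library calls (mbstowcs and wcstombs) rather than the Python decoding functions, and name_round_trips is a hypothetical name. It reports whether a byte string comes back unchanged after being decoded to wide characters and re-encoded under the current LC_CTYPE locale, which is the situation the re-encoding concern quoted below is about.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical check: returns 1 if `name` survives the trip
 * bytes -> wide characters -> bytes unchanged under the current
 * LC_CTYPE locale, and 0 otherwise (including conversion failures). */
int name_round_trips(const char *name)
{
    /* Size the wide buffer; (size_t)-1 means the bytes are not valid
     * in the current locale's encoding. */
    size_t wide_len = mbstowcs(NULL, name, 0);
    if (wide_len == (size_t)-1)
        return 0;

    wchar_t *wide = malloc((wide_len + 1) * sizeof(wchar_t));
    if (wide == NULL)
        return 0;
    mbstowcs(wide, name, wide_len + 1);

    /* Re-encode and compare against the original bytes. */
    int same = 0;
    size_t byte_len = wcstombs(NULL, wide, 0);
    if (byte_len != (size_t)-1) {
        char *back = malloc(byte_len + 1);
        if (back != NULL) {
            wcstombs(back, wide, byte_len + 1);
            same = (strcmp(back, name) == 0);
            free(back);
        }
    }
    free(wide);
    return same;
}
```

Plain ASCII names should round-trip in any locale. The interesting cases are filenames whose bytes are not valid in the locale's encoding, where the check returns 0 instead of silently producing a different name.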
Steven
From: Thomas Kluyver [mailto:[email protected]]
Sent: Monday, July 29, 2013 11:01 AM
To: primary discussion list for use and development of cx_Freeze
Subject: Re: [cx-freeze-users] Problems launching Frozen Python 3.3 application
located in path with international characters.
On 29 July 2013 15:39, Steven Velez
<[email protected]<mailto:[email protected]>> wrote:
This means that the bases for python 3 on windows will not support Windows
9x... is that a concern?
I don't think so - Python itself apparently dropped support for Windows 9x in
Python 2.6:
http://docs.python.org/3/whatsnew/2.6.html#port-specific-changes-windows
For Python >= 3 on other platforms, I intend to do conversions to wide
characters assuming a UTF-8 encoding in the source unless overridden by a
cx-specific environment variable.
I'm not sure about this. On Linux, as I understand it, filenames are
represented as bytes, which are generally *displayed* as UTF-8 by convention.
Decoding all filenames to wide characters and later re-encoding them seems
destined to create problems somewhere. Is it possible to do this so that the
wide character filenames are only used on Windows?
Thanks,
Thomas