[issue15441] test_posixpath fails on Japanese edition of Windows

STINNER Victor Thu, 26 Jul 2012 03:53:38 -0700

STINNER Victor <victor.stin...@gmail.com> added the comment:

+    @unittest.skipIf(sys.platform == 'win32',
+        "Win32 can fail cwd() with invalid utf8 name")
     def test_nonascii_abspath(self):


You should not always skip the test on Windows: the filename is decodable in 
code pages other than cp932. It would be better to add the following code at 
the beginning of test_nonascii_abspath():

name = b'\xe7w\xf0'
if sys.platform == 'win32':
    try:
        os.fsdecode(name)
    except UnicodeDecodeError:
        self.skipTest("the filename %a is not decodable from the ANSI "
                      "code page (%s)"
                      % (name, sys.getfilesystemencoding()))

Note: Windows does not use UTF-8 for the ANSI or OEM code pages, unless you 
change it manually.
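
As a rough illustration of why the skip should be conditional, the sketch 
below checks the same bytes against two explicit codecs. The helper name 
is_decodable() is made up for this example, and the codec names 'cp1252' and 
'cp932' only stand in for whatever ANSI code page a given Windows install 
uses; this is not the code from the patch.

```python
NAME = b'\xe7w\xf0'


def is_decodable(name, encoding):
    """Return True if the bytes name decodes strictly with the given codec."""
    try:
        name.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


if __name__ == '__main__':
    # cp1252 stands in for a Western European ANSI code page,
    # cp932 for the Japanese one.
    for encoding in ('cp1252', 'cp932'):
        print(encoding, is_decodable(NAME, encoding))
```

With these two codecs the name is decodable in one case and not in the other, 
so only the second case needs skipTest().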

+        batfile = """
+chcp 932
+{exe} {scriptname}
+chcp {codepage}
+"""

chcp only changes the OEM code page, whereas Python uses the ANSI code page 
for sys.getfilesystemencoding().
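
To see the difference on a given machine, something like the following sketch 
could be run before and after chcp. It assumes ctypes and the kernel32 
functions GetACP(), GetOEMCP() and GetConsoleOutputCP(); the helper name 
get_code_pages() is made up for this example, and it returns None on 
non-Windows platforms.

```python
import sys


def get_code_pages():
    """Return the ANSI, OEM and console output code pages, or None off Windows."""
    if sys.platform != 'win32':
        return None
    import ctypes  # imported here so the module still loads elsewhere
    kernel32 = ctypes.windll.kernel32
    return {
        'ansi': kernel32.GetACP(),
        'oem': kernel32.GetOEMCP(),
        # This is the value that chcp changes for the current console.
        'console_output': kernel32.GetConsoleOutputCP(),
    }


if __name__ == '__main__':
    print(get_code_pages())
    print(sys.getfilesystemencoding())
```

The 'ansi' entry should stay the same whatever chcp is set to.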

It is possible to change the ANSI code page of the current thread 
(CP_THREAD_ACP) using SetThreadLocale(), but it doesn't help because Python 
uses the global ANSI code page (CP_ACP). I also don't think that changing the 
CP_THREAD_ACP code page changes the CP_ACP code page of child processes.

Changing the ANSI code page manually is possible in the Control Panel, but it 
requires rebooting Windows.

--

Your patch expects that "os.mkdir(b'\xe7w\xf0'); os.chdir(b'\xe7w\xf0')" works. 
I tested it manually in Python and it doesn't work, because Windows creates a 
directory called "\u8f42" (b'\xe7w'); see my previous message (msg166441). 
This is at least the case with an NTFS filesystem on Windows 7.

--

Your last patch tries to decode the bytes filename from the filesystem 
encoding, or uses repr(filename). It may be better to keep the bytes filename 
unchanged in OSError.filename, instead of using repr(). But it sounds like a 
good idea to patch all PyErr_Set*WithFilename(..., char*) functions. My patch 
for path_error() avoids the creation of a temporary bytes object.

--

The test_support.temp_cwd(b'\xe7w\xf0') test was added by changeset 
ebdc2aa730c0 and is related to issue #3426. I'm not sure that it was really 
intended to test b'\xe7w\xf0', because a previous test was using u'\xe7w\xf0':

-        # Issue 3426: check that abspath retuns unicode when the arg is unicode
-        # and str when it's str, with both ASCII and non-ASCII cwds
-        for cwd in (u'cwd', u'\xe7w\xf0'):

We may use b'\xe7w' instead of b'\xe7w\xf0' if b'\xe7w\xf0' cannot be decoded.
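
A minimal sketch of that fallback, assuming a hypothetical helper 
choose_test_name() that is not part of the patch. The encoding is passed 
explicitly here so the behaviour can be tried with any codec; in the test it 
would be the filesystem encoding.

```python
def choose_test_name(encoding, candidates=(b'\xe7w\xf0', b'\xe7w')):
    """Return the first candidate that decodes in encoding, or None."""
    for name in candidates:
        try:
            name.decode(encoding)
        except UnicodeDecodeError:
            continue
        return name
    return None


if __name__ == '__main__':
    for encoding in ('cp1252', 'cp932'):
        print(encoding, choose_test_name(encoding))
```

If no candidate is decodable, the helper returns None and the test would 
still have to be skipped.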

--

The attached patch win32_bytes_filename.patch tries to solve both issues: the 
test failure and the UnicodeDecodeError when raising the OSError.

It tries to decode the bytes filename from the FS encoding, and otherwise 
keeps it unchanged (as bytes), as Python 2 does with os.listdir(unicode).
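
The decode-or-keep logic can be sketched in Python as follows. The real patch 
works at the C level; the function name filename_for_oserror() and its 
explicit encoding parameter are assumptions made for this illustration only.

```python
def filename_for_oserror(filename, encoding):
    """Decode a bytes filename if possible, otherwise return it unchanged."""
    if isinstance(filename, bytes):
        try:
            return filename.decode(encoding)
        except UnicodeDecodeError:
            # Keep the raw bytes rather than failing or using repr().
            return filename
    return filename


if __name__ == '__main__':
    print(ascii(filename_for_oserror(b'\xe7w\xf0', 'cp1252')))
    print(ascii(filename_for_oserror(b'\xe7w\xf0', 'cp932')))
```

The point of the design is that building the exception can never itself raise 
UnicodeDecodeError, and the caller still gets the original name back.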

----------
nosy: +flox
Added file: http://bugs.python.org/file26524/win32_bytes_filename.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue15441>
_______________________________________