Eryk Sun <eryk...@gmail.com> added the comment:

_sanitize_windows_name() fails to translate the reserved control characters 
(0x01-0x1F) and backslash in names. 

What I've seen done in some cases (e.g. Unix network shares mapped to SMB) is 
to translate names using the private use area block, e.g. 0xF001 - 0xF07F. 
Windows has no problem with characters in this range in a filename. (Displaying 
these characters sensibly is another matter.) For Windows 10, this is 
especially useful since the Linux subsystem automatically translates this PUA 
block back to ASCII when accessing a Windows volume via drvfs. For example:

    C:\Temp\pua>python -q
    >>> import sys
    >>> sys.platform
    'win32'
    >>> name = ''.join(map(chr, range(0xf001, 0xf080)))
    >>> _ = open(name, 'w')
    >>> ^Z

    C:\Temp\pua>bash -c "python3 -q"
    >>> import os, sys
    >>> sys.platform
    'linux'
    >>> os.listdir()
    ['\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f
      \x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f
       !"#$%&\'()*+,-./0123456789:;<=>?
      @ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_
      `abcdefghijklmnopqrstuvwxyz{|}~\x7f']
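
A minimal sketch of that kind of translation, assuming each reserved character is simply moved to 0xF000 plus its ordinal. The names to_pua() and from_pua() are hypothetical and not part of zipfile, and the set of reserved characters is an assumption (control characters 0x01-0x1F plus the usual Windows-reserved punctuation). It is meant to be applied to one path component at a time:

```python
# Hypothetical helpers, not part of zipfile. Assumed reserved set:
# control characters 0x01-0x1F plus < > : " | ? * and backslash.
_RESERVED = set(map(chr, range(0x01, 0x20))) | set('<>:"|?*\\')

def to_pua(component):
    """Move reserved characters into the private use area (0xF000 + ordinal)."""
    return ''.join(chr(0xF000 + ord(c)) if c in _RESERVED else c
                   for c in component)

def from_pua(component):
    """Map characters in 0xF001-0xF07F back to their ASCII originals."""
    return ''.join(chr(ord(c) - 0xF000) if 0xF001 <= ord(c) <= 0xF07F else c
                   for c in component)
```

With this sketch, to_pua('a:b') gives 'a\uf03ab', and from_pua() reverses it. Characters that are not in the reserved set pass through unchanged.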

Also, while _sanitize_windows_name() handles trailing dots, for some reason it 
overlooks trailing spaces. It also doesn't handle reserved DOS device names. 
The reserved names include NUL, CON, CONIN$, CONOUT$, AUX, PRN, COM[1-9], and 
LPT[1-9], as well as any of these names followed by zero or more spaces and, 
optionally, a dot or colon and any subsequent characters. For example:

    >>> os.path._getfullpathname('con')
    '\\\\.\\con'
    >>> os.path._getfullpathname('con  ')
    '\\\\.\\con'
    >>> os.path._getfullpathname('con:')
    '\\\\.\\con'
    >>> os.path._getfullpathname('con :')
    '\\\\.\\con'
    >>> os.path._getfullpathname('con : spam')
    '\\\\.\\con'
    >>> os.path._getfullpathname('con . eggs')
    '\\\\.\\con'

It's not a reserved device name if the first character after zero or more 
spaces is not a dot or colon. For example:

    >>> os.path._getfullpathname('con spam')
    'C:\\con spam'
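
The rule above can be approximated with a regular expression. This is only a sketch: is_reserved_device_name() is a hypothetical helper, the name list is the one given in this comment, and the pattern is an assumption that models the behavior shown here rather than what Windows does internally:

```python
import re

# Hypothetical check, based only on the rule described in this comment:
# a reserved name, then zero or more spaces, then either the end of the
# string or a dot/colon followed by anything.
_DEVICE_RE = re.compile(
    r'(NUL|CON|CONIN\$|CONOUT\$|AUX|PRN|COM[1-9]|LPT[1-9]) *([.:].*)?',
    re.IGNORECASE | re.DOTALL)

def is_reserved_device_name(component):
    """Return True if a single path component looks like a DOS device name."""
    return _DEVICE_RE.fullmatch(component) is not None
```

Under these assumptions, 'con', 'con  ', 'con : spam' and 'con . eggs' are all reported as reserved, while 'con spam' is not.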

We can create filenames with reserved device names or trailing spaces and dots 
by using a \\?\ prefixed path (i.e. a non-normalized device path). However, 
most programs don't use \\?\ paths, so it's probably better to translate these 
names.
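
If translation is the route taken, trailing spaces could be stripped along with trailing dots, one path component at a time. A rough sketch, where strip_trailing() is a hypothetical name and dropping components that end up empty is an assumption:

```python
def strip_trailing(arcname, pathsep='/'):
    """Strip trailing dots and spaces from each component (hypothetical helper)."""
    parts = (part.rstrip(' .') for part in arcname.split(pathsep))
    # Assumption: components that become empty are dropped when rejoining.
    return pathsep.join(part for part in parts if part)
```

For example, strip_trailing('dir. /file.txt. . ') returns 'dir/file.txt'.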

----------
nosy: +eryksun

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue36534>
_______________________________________