Eryk Sun <eryk...@gmail.com> added the comment:

A Windows path reserves the following characters:

* null, as the string terminator
* slash and backslash, as path separators
* colon as the second character in the first component of
  a non-UNC path, since it's a drive path
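
A minimal sketch of these generic rules as plain string checks, so it runs on
any platform. The names GENERIC_RESERVED, has_reserved_char, and
has_drive_colon are hypothetical and not part of pathlib. has_drive_colon only
looks at the position of the colon; it does not check that the first character
is a valid drive letter.

```python
# Characters reserved in every Windows path component: null and the
# two path separators.
GENERIC_RESERVED = frozenset('\0/\\')

def has_reserved_char(name: str) -> bool:
    """Return True if a single path component contains a reserved character."""
    return any(ch in GENERIC_RESERVED for ch in name)

def has_drive_colon(path: str) -> bool:
    """Return True if a colon is the second character of the path.

    A UNC path starts with two separators, so its second character can
    never be a colon and no separate UNC check is needed here.
    """
    return path[1:2] == ':'
```

Note that a colon elsewhere in a name is not flagged by has_reserved_char,
since whether it is allowed depends on the device or filesystem.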

Additionally, a normalized path reserves trailing dots and spaces on names, 
since they get stripped from the final component (e.g. "C:\Temp\spam. . ." -> 
"C:\Temp\spam"). WindowsPath could automatically strip trailing dots and spaces 
from normalized paths. This would need to exclude extended paths that begin 
with the "\\?\" prefix, since those bypass normalization.
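
As a rough sketch of what such stripping might look like, assuming it is done
with plain string operations on the final component only. The function name
strip_trailing_dots_spaces is hypothetical, and this is not how WindowsPath
currently behaves.

```python
def strip_trailing_dots_spaces(path: str) -> str:
    """Drop trailing dots and spaces from the final component of a path."""
    # Extended paths skip normalization, so they are returned unchanged.
    if path.startswith('\\\\?\\'):
        return path
    # The final component starts after the last slash or backslash.
    start = max(path.rfind('\\'), path.rfind('/')) + 1
    name = path[start:]
    stripped = name.rstrip('. ')
    # Assumption: leave "." and ".." (and any name made up only of dots
    # and spaces) alone rather than guess how they should be resolved.
    if not stripped:
        return path
    return path[:start] + stripped
```

With this sketch, 'C:\\Temp\\spam. . .' becomes 'C:\\Temp\\spam', while
'\\\\?\\C:\\Temp\\spam. . .' is returned as-is.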

Otherwise the set of reserved characters is a function of device and filesystem 
namespaces, regardless of the recommendations in "Naming Files, Paths, and 
Namespaces" [1], which are meant to constrain applications to what is generally 
allowed. I would prefer for WindowsPath to remain generic enough to support all 
device and filesystem namespaces. 

For example, the VirtualBox shared-folder filesystem (a mini-redirector to the 
host system) allows colon, pipe, and control characters in file and directory 
names:

    >>> import os
    >>> control = '\a\b\t\n\v\f\r'
    >>> special = ':|'
    >>> dirname = f'//vboxsvr/work/nametest/{control}{special}'
    >>> os.makedirs(dirname, exist_ok=True)
    >>> os.listdir('//vboxsvr/work/nametest')[0]
    '\x07\x08\t\n\x0b\x0c\r:|'

Like most filesystems, it reserves the 5 wildcard characters in base filenames, 
namely '*', '?', '<' (DOS_STAR), '>' (DOS_QM), and '"' (DOS_DOT). A 
filesystem that fails to reserve these wildcard characters cannot properly 
support WINAPI FindFirstFile[Ex]. The only filesystem I can think of that 
allows wildcard characters in base names is the named-pipe filesystem. NPFS 
actually allows any character in a pipe name -- even slash and backslash since 
it only supports a single directory, the root directory "//./PIPE/".
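
A small sketch of a wildcard check for base filenames, under the assumption
that the target filesystem reserves all five characters. The names WILDCARDS
and find_wildcards are hypothetical; a filesystem such as NPFS would not need
this check.

```python
# The five wildcard characters most filesystems reserve in base filenames.
# '<', '>' and '"' are the DOS_STAR, DOS_QM and DOS_DOT wildcards.
WILDCARDS = frozenset('*?<>"')

def find_wildcards(basename: str) -> list:
    """Return the wildcard characters present in a base filename, in order."""
    return [ch for ch in basename if ch in WILDCARDS]
```

An empty result means the name contains none of the five wildcards. It does
not mean the name is valid on every filesystem.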

That said, a path may specify a stream name instead of a base filename. As is 
documented in [1], an NTFS stream name reserves colon as a delimiter, i.e. 
"filename:streamname:streamtype", and stream names can include wildcards, pipe, 
and control characters. For example:

    >>> control = '\a\b\t\n\v\f\r'
    >>> special = '*?<>"|'
    >>> dirname = 'C:\\Temp\\nametest'
    >>> filename = f'{dirname}\\spam'
    >>> streamname = f'{filename}:{control}{special}'
    >>> os.makedirs(dirname, exist_ok=True)
    >>> streamname
    'C:\\Temp\\nametest\\spam:\x07\x08\t\n\x0b\x0c\r*?<>"|'
    >>> open(streamname, 'w').close()

We can use PowerShell (pwsh) to verify the existence of the stream:

    >>> import subprocess
    >>> cmd = f'pwsh -c (gi "{filename}" -stream *)[1].Stream'
    >>> subprocess.check_output(cmd, text=True).rstrip()
    '\x07\x08\t\n\x0b\x0c\n*?<>"|'
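
The "filename:streamname:streamtype" form can be sketched as a simple split on
colon. The helper name split_stream is hypothetical, and it expects only the
final path component, so that a drive colon is not mistaken for a stream
delimiter. "$DATA" is the default data stream type on NTFS.

```python
def split_stream(name: str) -> tuple:
    """Split 'filename:streamname:streamtype' into its three parts.

    Missing parts are returned as empty strings. Pass only the final
    component of a path, not a full path with a drive.
    """
    filename, _, rest = name.partition(':')
    streamname, _, streamtype = rest.partition(':')
    return filename, streamname, streamtype
```

Because only the colon is treated as a delimiter, wildcards, pipe, and control
characters pass through in the stream name untouched.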

In terms of device namespaces, a device that is not mounted by a filesystem can 
implement practically whatever namespace it wants. But considering "//./" 
device paths are normalized Windows paths, device namespaces should reserve 
slash, since the system translates slash to backslash.

[1] https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file

----------
nosy: +eryksun

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue39515>
_______________________________________