Eryk Sun <eryk...@gmail.com> added the comment:

> I wish we could remove the MAX_PATH limit in this case.
>
> The problem is that we have to remove the limit in any case where the 
> resulting path might be used, which is what we're already trying to 
> encourage by supporting long paths.

Maybe it's better to ignore the MAX_PATH limit and let processes fail hard if 
they lack long-path support. A known and expected exception is better than 
unpredictable behavior (see the next paragraph for an example). That leaves the 
problem of a final component that's a reserved name, i.e. a DOS device name or 
a name with trailing dots or spaces. We have no choice but to return this case 
as an extended path. 
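
As a rough illustration of the "reserved final component" case, here is a minimal sketch. The helper name and the device-name list are assumptions for the example, not something taken from the current code, and the device check is a deliberate simplification of what Windows actually does.

```python
import ntpath

# Assumed list of DOS device names, for illustration only.
_DOS_DEVICES = frozenset(
    ["CON", "PRN", "AUX", "NUL", "CONIN$", "CONOUT$"]
    + [f"COM{i}" for i in range(1, 10)]
    + [f"LPT{i}" for i in range(1, 10)]
)

def final_component_is_reserved(path):
    """Hypothetical helper: True if the last component is a reserved name."""
    name = ntpath.basename(path.replace("/", "\\"))
    if name in (".", ".."):
        return False
    # Names with trailing dots or spaces get stripped by normal processing.
    if name != name.rstrip(". "):
        return True
    # Simplification: treat "nul", "nul.txt" and "nul:" the same way.
    stem = name.partition(".")[0].partition(":")[0].rstrip(" ")
    return stem.upper() in _DOS_DEVICES
```

A path for which this returns True would have to keep its extended prefix.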

The intersection of this problem with SetCurrentDirectoryW (os.chdir) troubles 
me. Without long-path support, the current-directory buffer in the process 
parameters is hard limited to MAX_PATH, and passing SetCurrentDirectoryW an 
extended path can't work around this. Fair enough. But it still accepts a 
device path as the current directory, even though the docs do not explicitly 
allow it, and the implementation assumes it's disallowed. The combination is an 
ugly bug:

    >>> os.chdir('//./C:/Temp')
    >>> os.getcwd()
    '\\\\.\\C:\\Temp'
    >>> os.path._getfullpathname('/spam/eggs')
    '\\\\spam\\eggs'

    >>> os.chdir('//?/C:/Temp')
    >>> os.getcwd()
    '\\\\?\\C:\\Temp'
    >>> os.path._getfullpathname('/spam/eggs')
    '\\\\spam\\eggs'

In order to resolve a rooted path such as "/spam/eggs", the runtime library 
needs to figure out the current drive from the current directory. It checks 
for a UNC path and otherwise assumes a DOS drive, since it takes for granted 
that device paths aren't allowed. So for a device path it grabs the first two 
characters as the drive name, which is "\\\\" (i.e. two backslashes). Then when 
joining the rooted path to this 'drive', the initial slash or backslash of the 
rooted path gets collapsed into the preceding backslash. The result is at best 
a broken path, and at worst an unrelated UNC path that exists. 
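
The steps above can be modeled in a few lines of Python. This is only a simulation of the behavior as described, not the runtime library's actual code; the function name and the way UNC and device paths are told apart are assumptions made for the example.

```python
def resolve_rooted(cwd, rooted):
    """Simulate resolving a rooted path against the current directory."""
    cwd = cwd.replace("/", "\\")
    rooted = rooted.replace("/", "\\")
    # Assumption: device paths are classified separately from UNC paths.
    is_device = cwd[:4] in ("\\\\.\\", "\\\\?\\")
    if cwd.startswith("\\\\") and not is_device:
        # UNC: the 'drive' is \\server\share.
        parts = cwd[2:].split("\\")
        drive = "\\\\" + "\\".join(parts[:2])
    else:
        # Assumed to be a DOS drive: take the first two characters.
        drive = cwd[:2]
    if drive.endswith("\\"):
        # The leading separator of the rooted path collapses into the 'drive'.
        return drive + rooted.lstrip("\\")
    return drive + rooted
```

With cwd set to 'C:\\Temp' this yields 'C:\\spam\\eggs', while either device path from the session above yields '\\\\spam\\eggs'.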

I think os.chdir should raise an exception when passed a device path. To 
justify this, we can point to the documentation of SetCurrentDirectoryW, which 
explicitly states the following:

    Each process has a single current directory made up of two parts:

        * A disk designator that is either a drive letter followed by 
          a colon, or a server name and share name 
          (\\servername\sharename)
        * A directory on the disk designator
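
A check along these lines might look like the following sketch. The helper names, the choice of ValueError, and doing the check in Python rather than in the C implementation of os.chdir are all assumptions made for illustration.

```python
import os

def is_device_path(path):
    """Hypothetical check for a leading device prefix (dot or question mark)."""
    path = os.fspath(path)
    if isinstance(path, bytes):
        path = os.fsdecode(path)
    # Forward slashes are accepted too, as in the session above.
    prefix = path[:4].replace("/", "\\")
    return prefix in ("\\\\.\\", "\\\\?\\")

def checked_chdir(path, _chdir=os.chdir):
    """Hypothetical wrapper: refuse device paths, else change directory."""
    if is_device_path(path):
        # Assumption: ValueError; the exception type is not settled here.
        raise ValueError(f"device path not allowed as current directory: {path!r}")
    _chdir(path)
```

Ordinary drive and UNC paths pass through to os.chdir unchanged.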

> Perhaps the best we can do is an additional test where we 
> GetFinalPathName, strip the prefix, reopen the file, 
> GetFinalPathName again and if they match then return it 
> without the prefix. That should handle the both long path 
> settings as transparently as we can.

I assume you're talking about realpath() here, toward the end where we're 
working with a solid path, or rather where we have at least the beginning part 
of the path as a solid path, up to the first component that's inaccessible.

For the problem of reserved names, GetFullPathNameW is all we need. This 
doesn't address the MAX_PATH issue, but that either works or it doesn't. It's a 
user-mode issue; there's nothing to resolve in the kernel. If the path is too 
long, then CreateFileW will fail at 
RtlDosPathNameToRelativeNtPathName_U_WithStatus with STATUS_NAME_TOO_LONG, 
before making a single system call.
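
For the reserved-name part, the check could be sketched roughly as follows. The function name and the injectable getfullpathname parameter are hypothetical. On Windows the default, ntpath.abspath, goes through GetFullPathNameW; on other platforms it only normalizes the string, so it won't detect reserved names there. As noted, this does nothing about MAX_PATH.

```python
import ntpath

def strip_extended_prefix(path, getfullpathname=ntpath.abspath):
    """Hypothetical helper: drop the extended prefix only if that is safe."""
    if path.startswith("\\\\?\\UNC\\"):
        plain = "\\\\" + path[8:]
    elif path.startswith("\\\\?\\"):
        plain = path[4:]
    else:
        return path
    # If normal path processing would change the name (a DOS device, or
    # trailing dots or spaces), keep the extended form.
    if getfullpathname(plain) != plain:
        return path
    return plain
```

The comparison is purely a string operation in user mode; no file is opened.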

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue37834>
_______________________________________