[issue37834] readlink on Windows cannot read app exec links

Eryk Sun Tue, 13 Aug 2019 12:46:52 -0700


Eryk Sun <[email protected]> added the comment:


> I feel like that's more work than is worth us doing for something that 
> will be relatively rarely used, will live in the stdlib, and is 
> obviously something that will become outdated as Microsoft adds new 
> reparse points.

Junctions (NT 5) and symlinks (NT 6) are stable. So if os.read_reparse_point 
only returns the unparsed bytes, maybe add os.read_junction as well. I know 
other projects overload POSIX readlink to read junctions as well. Both are 
name-surrogate reparse points, but they have different constraints and behavior.
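
As a rough illustration of what "name-surrogate" means here: the Windows SDK 
macro IsReparseTagNameSurrogate tests bit 29 of the reparse tag. The sketch 
below mirrors that test in Python. The helper name is made up for this example, 
and the tag values are the ones from the SDK headers.

```python
# Reparse tag values as defined in the Windows SDK headers.
IO_REPARSE_TAG_MOUNT_POINT = 0xA0000003  # junctions and volume mount points
IO_REPARSE_TAG_SYMLINK = 0xA000000C
IO_REPARSE_TAG_APPEXECLINK = 0x8000001B

# Bit 29 marks a tag as a name surrogate, i.e. the reparse point stands in
# for another named entity in the system.
NAME_SURROGATE_BIT = 0x20000000


def is_name_surrogate(tag):
    """Hypothetical helper mirroring the SDK's IsReparseTagNameSurrogate."""
    return bool(tag & NAME_SURROGATE_BIT)


for name, tag in [
    ("junction", IO_REPARSE_TAG_MOUNT_POINT),
    ("symlink", IO_REPARSE_TAG_SYMLINK),
    ("app exec link", IO_REPARSE_TAG_APPEXECLINK),
]:
    print(name, hex(tag), is_name_surrogate(tag))
```

Junctions and symlinks both have the bit set; the app-exec link tag does not.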

The I/O manager tries to make a junction behave something like a hard link to a 
directory, with the addition of being able to link across local volumes. This 
in turn relates to how it evaluates relative symbolic links. For example, if 
"C:/Junction" and "C:/Symlink" both target r"\\?\C:\Temp1\Temp2", and there's a 
relative symlink "C:/Temp1/Temp2/foo_link" that targets r"..\foo", then 
"C:/Junction/foo_link" references "C:/foo" but "C:/Symlink/foo_link" references 
"C:/Temp1/foo".
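
The difference can be modeled with plain path arithmetic. This is only a sketch 
of the example above, under the assumption that a junction keeps its own name 
in the parsed path while a symlink is replaced by its target before the 
relative link is resolved. resolve_relative is a hypothetical helper, and 
ntpath is used so that it runs on any platform without touching the filesystem.

```python
import ntpath


def resolve_relative(parsed_dir, relative_target):
    """Resolve a relative symlink target against the directory part of the
    path as it looks when the relative symlink is reached (hypothetical)."""
    return ntpath.normpath(ntpath.join(parsed_dir, relative_target))


# Reached via the junction: the parsed path still names the junction.
via_junction = resolve_relative(r"C:\Junction", r"..\foo")

# Reached via the symlink: the parsed path was replaced by the target.
via_symlink = resolve_relative(r"C:\Temp1\Temp2", r"..\foo")

print(via_junction)
print(via_symlink)
```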
 
Another difference is with remote filesystems. SMB special cases symlinks to 
have the server send the reparse request over the wire to be evaluated on the 
client side. (Refer to [MS-SMB2] 2.2.2.1 Symbolic Link Error Response, and the 
subsequent section about client-side handling of this error.) So an absolute 
symlink on the server that targets r"\\?\C:\Windows" actually references the 
client's "C:/Windows" directory, whereas the same junction target would 
reference the server's "C:/Windows" directory. The symlink evaluation will 
succeed only if the client's R2L (remote to local) policy allows it. Symlinks 
can also target remote devices, depending on the L2R and R2R policy settings. 
Junctions are restricted to local devices.

> In theory, we can't follow any reparse point that isn't documented as 
> being followable and provides the target name in a stable, documented
> manner. 

To follow a reparse point, we're just calling CreateFileW the normal way, 
without FILE_FLAG_OPEN_REPARSE_POINT. The Windows API also does this (usually 
via NtOpenFile, but this has a similar FILE_OPEN_REPARSE_POINT option) for 
tags it doesn't handle. That's why MoveFileExW (os.rename and os.replace) fails 
on one of these app-exec links. In some cases, it adds a third open attempt if 
the reparse point isn't handled. This is important for DeleteFileW (os.remove) 
and RemoveDirectoryW (os.rmdir) because we should be able to delete a bad 
reparse point.

> The appexec links don't do this (I just looked at the returned 
> buffer), so we really should just not follow them. They exist solely 
> so that CreateProcess internally returns a unique error code that can 
> be handled without impacting regular process start, which means we 
> *don't* want to follow them.

I know, so a regular stat() will fail. I think for an honest result, stat() 
should fail for a reparse point that can't be handled. Scripts can use 
stat(path, follow_nonlinks=False) or stat(path, follow_reparse_points=False), 
or however this eventually gets parameterized to force opening all reparse 
points.
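
Whatever the parameter ends up being called, a script can approximate the idea 
with the existing os.stat and os.lstat. A minimal sketch, assuming os.lstat 
opens the reparse point itself without following it. stat_with_fallback is a 
hypothetical name, and falling back on any OSError is a simplification; a real 
script would probably check for a specific error.

```python
import os


def stat_with_fallback(path):
    """Follow the path if possible; otherwise stat the link itself.

    Returns a (stat_result, followed) pair.
    """
    try:
        return os.stat(path), True
    except OSError:
        # If the path can't be opened at all, os.lstat raises and the
        # error propagates to the caller.
        return os.lstat(path), False
```

The returned flag lets the caller tell an honest followed result apart from a 
result for the unfollowed reparse point.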

> Now, directory junctions are far more interesting. My gut feel is that 
> we should treat them the same as symlinks (with respect to stat vs. 
> lstat) for consistency

Junctions are their own thing. They're mount points that behave like Unix 
volume mounts (in Windows, these target the root directory of a volume device 
named by its "Volume{...}" name) or Unix bind mounts (in Windows, these target 
arbitrary directories on any local volume; in Linux, this is a mount created 
with --bind or FUSE bindfs). Bind-like junctions are also similar to DOS subst 
drives (e.g. "W:" -> "C:/Windows") and UNC shares. These are all mount points 
of one sort or another.

OTOH, the base device names such as "//?/C:" and "//?/Volume{...}", without a 
specified root directory, are aliases (object symlinks) for an NT device such 
as r"\Device\HarddiskVolume2". These paths open the volume itself, not the 
mounted filesystem, so they're not like Unix mount points. They're like Unix 
'/dev/sda1' device paths, except in Unix, devices don't have their own 
namespaces, so it would be nonsense to open "/dev/sda1/".

RemoveDirectoryW for a volume mount is special cased to call 
DeleteVolumeMountPointW, which notifies the mount-point manager. It won't do 
this for a junction that targets the same volume root directory via the DOS 
drive-letter name -- or any other device alias for that matter (e.g. Windows 10 
creates "\\?\BootPartition" as an alternative name for the system "C:" drive). 
So bind-like mounts are different from volume mounts, but both are different 
from symlinks.

----------

_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue37834>
_______________________________________