On 2025-02-16 Pali Rohár wrote:
> On Sunday 16 February 2025 19:32:24 Lasse Collin wrote:
> > (There's no lstat() to detect symlinks or readlink() to read them.)
> 
> I know. Maybe for future it would be nice to have lstat() call.
> Implementation can be straightforward, open path as reparse point,
> check if it is a reparse point + retrieve reparse tag and then either
> call fstat() with custom type (if it is reparse point) or call stat()
> (if it is not reparse point).

lstat() might be nice because symlinks are a thing on Windows nowadays.
<sys/stat.h> would need new macros:

  - S_IFLNK and S_ISLNK(m) for IO_REPARSE_TAG_SYMLINK

  - S_IFSOCK and S_ISSOCK(m) for IO_REPARSE_TAG_AF_UNIX

I suppose lstat() shouldn't follow any reparse points. Maybe there could
be non-POSIX macros like S_IFRPP and S_ISRPP(m) to indicate any other
reparse point than symlink or socket. Some OSes have extra macros but I
don't know if the idea makes sense here.

stat() would only need S_IFSOCK. Most reparse points likely should be
transparent to stat() (to the extent it is possible).

If UCRT added S_IFLNK and S_IFSOCK but used different constants than
mingw-w64, that would be a mess. I'm not proposing any S_ macro
additions in this email, I'm just thinking out loud.
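
To make the idea above a bit more concrete, here is a rough sketch of how
the file type bits could be chosen. Everything with an SK_ prefix is made
up for this email. The S_IFLNK and S_IFSOCK values are the ones glibc and
*BSDs use, the S_IFRPP value is only a placeholder, and the attribute and
tag constants are copied from the Windows SDK so that the sketch compiles
without <windows.h>. It is not a proposal for actual header contents.

```c
#include <assert.h>
#include <stdint.h>

/* Windows SDK values, repeated here so this compiles anywhere. */
#define SK_ATTR_DIRECTORY      UINT32_C(0x00000010)
#define SK_ATTR_REPARSE_POINT  UINT32_C(0x00000400)
#define SK_TAG_SYMLINK         UINT32_C(0xA000000C)
#define SK_TAG_AF_UNIX         UINT32_C(0x80000023)

/* Type bits that exist today. */
#define SK_S_IFMT   0xF000u
#define SK_S_IFDIR  0x4000u
#define SK_S_IFREG  0x8000u

/* Hypothetical additions: glibc/BSD values for the first two,
   an arbitrary placeholder for the non-POSIX one. */
#define SK_S_IFLNK  0xA000u
#define SK_S_IFSOCK 0xC000u
#define SK_S_IFRPP  0xE000u

#define SK_S_ISLNK(m)  (((m) & SK_S_IFMT) == SK_S_IFLNK)
#define SK_S_ISSOCK(m) (((m) & SK_S_IFMT) == SK_S_IFSOCK)
#define SK_S_ISRPP(m)  (((m) & SK_S_IFMT) == SK_S_IFRPP)

static_assert ((SK_S_IFRPP & ~SK_S_IFMT) == 0,
               "type bits must fit inside S_IFMT");

/* lstat(): no reparse point is followed, so every tag is visible. */
static unsigned
sk_lstat_type (uint32_t attrs, uint32_t reparse_tag)
{
  if (attrs & SK_ATTR_REPARSE_POINT)
    {
      switch (reparse_tag)
        {
          case SK_TAG_SYMLINK:
            return SK_S_IFLNK;

          case SK_TAG_AF_UNIX:
            return SK_S_IFSOCK;

          default:
            return SK_S_IFRPP;
        }
    }

  return (attrs & SK_ATTR_DIRECTORY) ? SK_S_IFDIR : SK_S_IFREG;
}

/* stat(): attrs and reparse_tag describe the object that was reached
   after the system followed the path. Only sockets stay visible,
   other reparse points are treated as transparent. */
static unsigned
sk_stat_type (uint32_t attrs, uint32_t reparse_tag)
{
  if ((attrs & SK_ATTR_REPARSE_POINT) && reparse_tag == SK_TAG_AF_UNIX)
    return SK_S_IFSOCK;

  return (attrs & SK_ATTR_DIRECTORY) ? SK_S_IFDIR : SK_S_IFREG;
}
```

With this, a junction would be S_IFRPP from the lstat() side and S_IFDIR
from the stat() side, while an AF_UNIX socket would be S_IFSOCK in both.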

> > My point with the long list of attributes in get_d_type was to
> > return DT_UNKNOWN if Microsoft added a new not-regular-file
> > attribute some day, or if some application wants to handle reparse
> > points specially (apps might be ported from POSIX with some extra
> > code added on top to support Windows, so the end result can be a
> > mix of both worlds). But I might have been over-thinking (wouldn't
> > be the first time) or over-cautious.  
> 
> I highly doubt that some new attribute in the future would change a
> regular file to something totally different. That would break a lot
> of things.
> 
> The way new file types could be added is via reparse points, as this
> is the existing way and can do basically anything.
> 
> What could probably make sense for DT_UNKNOWN is to return it for
> files and dirs with the reparse point attribute whose reparse tag is
> not handled in the function. This can address the idea about
> applications which want to handle reparse points specially, and also
> handles the AF_UNIX sockets (mentioned below).
> 
> It is important to know that if you do not have the NT kernel driver
> for a particular reparse point tag installed, then it is not possible
> to open a file or dir to which a reparse point with that tag is
> attached. Hence without the installed driver, that file or dir with
> the reparse point is not a regular file or dir, but rather something
> unknown to the system.

Alright. :-) So my 0008.patch.txt in the previous email did too much. It
only should have removed the supported_attrs check.

> > About DT_ macros that cannot appear in a directory listing: I didn't
> > define DT_BLK in dirent.h because S_IFBLK seems to be a MinGW
> > invention (to make it easier to port apps). Its value doesn't match
> > glibc or *BSDs, so DT_BLK == S_IFBLK >> 12 wouldn't match glibc or
> > *BSDs.  
> 
> I think that "block device" is available in neither msvcrt/ucrt nor
> WinAPI. So that is why there is no DT_BLK / S_IFBLK macro in
> ucrt header files. I guess in mingw it is just for compile purposes
> of posix applications.

Right. The following MinGW bug says that it is or was needed to build
GCC. It has discussion about the atypical value of S_IFBLK too.

    https://sourceforge.net/p/mingw/bugs/1146/

In dirent.h, defining DT_BLK for similar compatibility reasons might
make sense if there are apps that assume that DT_BLK is defined if
_DIRENT_HAVE_D_TYPE is defined. However, people can add #ifdef DT_BLK
when porting such apps (which also forces them to notice that block
devices don't exist on Windows in this form). *If* DT_BLK is added, I
wonder if the value should be 3 instead of 6 due to mingw-w64's unusual
S_IFBLK value.
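
The arithmetic behind 3 versus 6 can be checked with a few lines. The SK_
names are made up for this email. SK_MODE_TO_DT mimics the IFTODT macro
from glibc and *BSDs, and 0x3000 is mingw-w64's S_IFBLK as implied by
the shift above.

```c
#include <assert.h>

#define SK_S_IFMT 0xF000u

/* Same idea as IFTODT in glibc and *BSDs: keep the type bits
   and shift them down by 12. */
#define SK_MODE_TO_DT(mode) (((mode) & SK_S_IFMT) >> 12)

#define SK_MINGW_W64_S_IFBLK 0x3000u  /* current mingw-w64 value */
#define SK_GLIBC_S_IFBLK     0x6000u  /* glibc and *BSD value */

static_assert (SK_MODE_TO_DT (SK_MINGW_W64_S_IFBLK) == 3,
               "DT_BLK would be 3 with mingw-w64's S_IFBLK");
static_assert (SK_MODE_TO_DT (SK_GLIBC_S_IFBLK) == 6,
               "DT_BLK is 6 where S_IFBLK is 0x6000");

/* Function form for callers that cannot use the macro directly. */
static unsigned
sk_mode_to_dt (unsigned mode)
{
  return SK_MODE_TO_DT (mode);
}
```

So DT_BLK == 6 is only consistent with S_IFBLK >> 12 if S_IFBLK itself
is 0x6000.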

libarchive has a comment about S_IFBLK and MinGW which refers to the
above bug:

    https://github.com/libarchive/libarchive/blob/65196fdd1a385f22114f245a9002ee8dc899f2c4/libarchive/test/test_entry.c#L89

The change was made in 2009, note the commit message:

    https://github.com/libarchive/libarchive/commit/56965e7a9b1d8b0d70e55d952bd16172e7738746

There's a longer generic comment about S_IFxxx values in another file:

    https://github.com/libarchive/libarchive/blob/65196fdd1a385f22114f245a9002ee8dc899f2c4/libarchive/archive_entry.h#L179

S_IFBLK could be changed to 0x6000, but it would be an ABI break. :-(
(But so is NAME_MAX change.)

> > There is no DT_SOCK either (mingw-w64 doesn't have S_IFSOCK).  
> 
> Ou, I forgot about this. Native AF_UNIX support is now available for
> WinAPI. This was added to WinAPI just recently and probably it is not
> supported in UCRT at all. So AF_UNIX files are detected as regular
> files.
> 
> But for future it would be nice to extend mingw stat and readdir code
> to detect AF_UNIX socket files and report them as DT_SOCK / S_IFSOCK.
> 
> WinAPI's AF_UNIX socket is stored as empty regular file with attached
> reparse point with tag IO_REPARSE_TAG_AF_UNIX and empty reparse point
> buffer.

Perhaps DT_SOCK could be added for IO_REPARSE_TAG_AF_UNIX already even
when S_IFSOCK and stat() support isn't there. (DT_LNK is already being
added even though there is no S_IFLNK, but maybe it's different as long
as lstat() doesn't exist.)

Are you certain that IO_REPARSE_TAG_AF_UNIX is the right tag for Win32
AF_UNIX? The following lists it as a WSL thing but it might be due to
the document being older than the Win32 AF_UNIX feature. The second link
says that WSL and Win32 are interoperable to some extent with AF_UNIX.
So quite likely it is the right tag, but it's better to be sure. :-)

    https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/c8e77b37-3909-4fe6-a4ea-2b9d423b1ee4

    https://devblogs.microsoft.com/commandline/windowswsl-interop-with-af_unix/

Summary of d_type questions:

  - Can DT_SOCK be added now?

  - Should DT_BLK be added? If yes, should the value be
    (S_IFBLK >> 12) == 3 (not 6) based on mingw-w64's <sys/stat.h>?
    (Or should changing of S_IFBLK to 0x6000 be considered?)

  - If adding DT_SOCK, is this OK:

    static unsigned char
    get_d_type (DWORD attrs, DWORD reparse_tag)
    {
      if (attrs & FILE_ATTRIBUTE_REPARSE_POINT)
        {
          switch (reparse_tag)
            {
              case IO_REPARSE_TAG_SYMLINK:
                return DT_LNK;

              case IO_REPARSE_TAG_AF_UNIX:
                return DT_SOCK;

              default:
                return DT_UNKNOWN;
            }
        }

      return (attrs & FILE_ATTRIBUTE_DIRECTORY) ? DT_DIR : DT_REG;
    }
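
For trying out the function above without mingw-w64 headers, something
like the following could be used. It is only a sketch. The SK_ names are
made up, sk_dword stands in for DWORD, the attribute and tag values are
copied from the Windows SDK, and the DT_ values are the ones glibc and
*BSDs use (I'm assuming the same values here).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t sk_dword;  /* stands in for DWORD */

#define SK_FILE_ATTRIBUTE_DIRECTORY      UINT32_C(0x00000010)
#define SK_FILE_ATTRIBUTE_REPARSE_POINT  UINT32_C(0x00000400)
#define SK_IO_REPARSE_TAG_MOUNT_POINT    UINT32_C(0xA0000003)  /* junction */
#define SK_IO_REPARSE_TAG_SYMLINK        UINT32_C(0xA000000C)
#define SK_IO_REPARSE_TAG_AF_UNIX        UINT32_C(0x80000023)

enum { SK_DT_UNKNOWN = 0, SK_DT_DIR = 4, SK_DT_REG = 8,
       SK_DT_LNK = 10, SK_DT_SOCK = 12 };

/* Same logic as get_d_type above, with the stand-in names. */
static unsigned char
sk_get_d_type (sk_dword attrs, sk_dword reparse_tag)
{
  if (attrs & SK_FILE_ATTRIBUTE_REPARSE_POINT)
    {
      switch (reparse_tag)
        {
          case SK_IO_REPARSE_TAG_SYMLINK:
            return SK_DT_LNK;

          case SK_IO_REPARSE_TAG_AF_UNIX:
            return SK_DT_SOCK;

          default:
            return SK_DT_UNKNOWN;
        }
    }

  return (attrs & SK_FILE_ATTRIBUTE_DIRECTORY) ? SK_DT_DIR : SK_DT_REG;
}

struct sk_case { sk_dword attrs; sk_dword tag; unsigned char expected; };

static const struct sk_case sk_cases[] = {
  /* Plain file and plain directory. The tag is ignored. */
  { 0, 0, SK_DT_REG },
  { SK_FILE_ATTRIBUTE_DIRECTORY, 0, SK_DT_DIR },
  /* File symlink and directory symlink are both DT_LNK. */
  { SK_FILE_ATTRIBUTE_REPARSE_POINT, SK_IO_REPARSE_TAG_SYMLINK, SK_DT_LNK },
  { SK_FILE_ATTRIBUTE_REPARSE_POINT | SK_FILE_ATTRIBUTE_DIRECTORY,
    SK_IO_REPARSE_TAG_SYMLINK, SK_DT_LNK },
  /* AF_UNIX socket. */
  { SK_FILE_ATTRIBUTE_REPARSE_POINT, SK_IO_REPARSE_TAG_AF_UNIX, SK_DT_SOCK },
  /* Junction: a tag that the function doesn't handle. */
  { SK_FILE_ATTRIBUTE_REPARSE_POINT | SK_FILE_ATTRIBUTE_DIRECTORY,
    SK_IO_REPARSE_TAG_MOUNT_POINT, SK_DT_UNKNOWN },
};

/* Asserts every row and returns the number of rows checked. */
static int
sk_check_cases (void)
{
  size_t i;
  size_t n = sizeof (sk_cases) / sizeof (sk_cases[0]);

  for (i = 0; i < n; i++)
    assert (sk_get_d_type (sk_cases[i].attrs, sk_cases[i].tag)
            == sk_cases[i].expected);

  return (int) n;
}
```

Note that with this logic a junction is reported as DT_UNKNOWN instead
of DT_DIR, which matches the suggestion about unhandled reparse tags.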

What to do with NAME_MAX?

  - The old d_name[260] is already wrong in the sense that the size of
    d_name should be at most NAME_MAX + 1, and currently NAME_MAX is 255.

  - NAME_MAX isn't visible with standard feature test macros like
    _POSIX_C_SOURCE, so NAME_MAX is broken in this sense too.

  - I don't know if MSVC defines NAME_MAX in any situation. If
    it does, then a different value in mingw-w64 might be a tiny
    compatibility issue.

I think NAME_MAX should be increased at least if modern MSVC doesn't
define it. Making NAME_MAX visible with _POSIX_C_SOURCE etc. should be
simple, one just needs to be careful that all relevant macros are listed
correctly.
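
The d_name size mismatch mentioned in the list above can be shown as a
compile-time check. This is only an illustration. The SK_ names are made
up, and SK_NEW_NAME_MAX is a placeholder, not a value I'm proposing.

```c
#include <assert.h>
#include <stddef.h>

/* True if the d_name array of the given struct type holds at most
   name_max bytes plus the terminating '\0'. */
#define SK_DNAME_FITS(type, name_max) \
  (sizeof (((type *) 0)->d_name) <= (size_t) (name_max) + 1)

/* The old situation: d_name[260] with NAME_MAX == 255. */
#define SK_OLD_NAME_MAX 255
struct sk_old_dirent { char d_name[260]; };

static_assert (!SK_DNAME_FITS (struct sk_old_dirent, SK_OLD_NAME_MAX),
               "260 > 255 + 1, so the old pair is inconsistent");

/* One way to keep the two in sync: size the array from the limit.
   The value is a placeholder only. */
#define SK_NEW_NAME_MAX 1023
struct sk_new_dirent { char d_name[SK_NEW_NAME_MAX + 1]; };

static_assert (SK_DNAME_FITS (struct sk_new_dirent, SK_NEW_NAME_MAX),
               "array sized as NAME_MAX + 1 always fits");
```

Sizing the array from the macro means the two cannot drift apart again,
whatever value ends up being chosen.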

> > One can access "C:\Documents and Settings\SomeUserName" just fine if
> > one has permission to access C:\Users\SomeUserName. It's just the
> > root of the junction that doesn't allow its contents listed. So it
> > is a permission issue as the error message says.
> 
> Ok, so it is just a normal EACCES scenario.

It's just that this example makes it look like junctions aren't
transparent in all common situations, such as with FindFirstFileW.

-- 
Lasse Collin


_______________________________________________
Mingw-w64-public mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public