On Thursday 16 January 2025 18:32:08 Lasse Collin wrote:
> On 2025-01-14 Pali Rohár wrote:
> > Well, warnings are warnings. They are always being added by new
> > compiler versions, so I would not be afraid of adding also in new
> > mingw-w64 version. And security "warning" for me sounds like a good
> > idea.
>
> OK, I agree. :-)
>
> Remember that I'm not a MinGW-w64 developer. Don't put too much weight
> on my opinions.
>
> > > I have attached a draft patch (header bits are missing) and a demo
> > > program. It has the above features so it's possible to think if the
> > > extra code is worth it.
> >
> > I have already WIP something better, to handle all other parts, like
> > not leaking wide global variables, properly initialize of narrow
> > variables and fixing also direct usage of __getmainargs function by
> > applications. All these parts are not handled in your draft. I need
> > more time to finish it and do more tests.
>
> Great! :-)
>
> > > With GB18030, U+FFFD consumes four bytes. So with that charset, the
> > > maximum possible byte count is larger.
> >
> > It is possible to change local process ucrt encoding to GB18030?
>
> I suspect that it's not. UTF-8 might be the only locale code page that
> isn't single or double byte. So reserving space for UTF-8 should be
> enough. Even if some locale code page supports longer encodings
> someday, the name has to be very long to hit the limit and result in
> ENAMETOOLONG.
>
> With UTF-8, even the current 255 bytes should almost never be a
> problem. Increasing NAME_MAX ensures that apps can list names in
> an unusual case too (but they still cannot list unpaired surrogates). On
> the other hand, the larger NAME_MAX may cause new problems if an app
> assumes that a filename always fits in MAX_PATH (260) bytes. The dirent
> API is from POSIX, so one would hope that apps ported from POSIX handle
> it well. I don't know if that is too optimistic.
>
> > > > Maybe we would need type versioning? Like it was with time_t or
> > > > fpos_t which based on the compile time macro expands either to
> > > > old (32-bit) or new (64-bit type).
> > >
> > > If a third party header has
> > >
> > >     #include <dirent.h>
> > >     int foo(DIR *d);
> > >
> > > it's not possible to know which version of the symbols were used
> > > when the library was compiled. To do versioning with only header
> > > macros, all participants have to co-operate. Ideally one doesn't
> > > use this kind of data types in API/ABI at all.
> >
> > Yes, that is truth. But same thing is already being done for time_t
> > types in both visual studio and mingw-w64 header files. There are some
> > defaults and via #define you change the behavior.
> >
> > Also same was used for a long time by UNIX LFN which changed in this
> > way fpos_t and off_t types (plus redefined open, lseek and other
> > functions).
>
> Those are good points, thanks. I still fear it could be messy.
>
> I see that sizeof(DIR) depends on _USE_32BIT_TIME_T because DIR
> contains _finddata_t or _wfinddata_t. Luckily no one is supposed to
> access that structure directly.

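To make the size dependency concrete, here is a rough sketch with stand-in
structs. All names below are hypothetical, and the field layout is only
assumed to resemble the CRT find-data structs with 32-bit and 64-bit time
stamps; it is not copied from the mingw-w64 headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for find data with 32-bit time stamps. */
struct finddata32_sketch {
    unsigned attrib;
    int32_t  time_create, time_access, time_write;
    uint32_t size;
    char     name[260];
};

/* Hypothetical stand-in for find data with 64-bit time stamps. */
struct finddata64_sketch {
    unsigned attrib;
    int64_t  time_create, time_access, time_write;
    uint32_t size;
    char     name[260];
};

/* Two views of a DIR-like struct: the find data comes first and the
   private state follows it, so every later field moves as well. */
struct dir32_sketch { struct finddata32_sketch dta; intptr_t handle; int stat; };
struct dir64_sketch { struct finddata64_sketch dta; intptr_t handle; int stat; };

/* Returns nonzero when the two views disagree in total size or in the
   offset of the handle field. */
int dir_layouts_differ(void)
{
    return sizeof(struct dir32_sketch) != sizeof(struct dir64_sketch)
        || offsetof(struct dir32_sketch, handle)
           != offsetof(struct dir64_sketch, handle);
}
```

If the runtime is built with one view and some other code assumes the other
view, they disagree about where the fields after the find data live.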
That is really bad. It means that if the mingw-w64 runtime is compiled
with _USE_32BIT_TIME_T, then the opendir() provided by mingw-w64 is
ABI-incompatible with an application that uses 64-bit time_t, since that
changes the definition of struct DIR. So struct DIR could already be
broken for libraries that export functions like "int foo(DIR *d);".

I think that we need a new ABI for opendir/readdir without these
problems. At the same time it can fix the problems related to encoding.

> I will send a few dirent patches.

Feel free to CC me.

> I played around with flags that
> re-enable best-fit mapping or disallow filenames over 255 bytes (if
> someone needs those for compatibility reasons). Those won't be in the
> first version.
>
>   - Having more than one flag might make the API too fancy in the same
>     sense as I commented about command line handling features. The
>     main problem isn't the few lines of extra code in dirent.c, it's
>     that few would use the extra features and that the risk of
>     incorrect use increases.
>
>   - A global variable works for _dowildcard but it's problematic for
>     dirent because a library might want to set it too. A DLL and
>     application would have their own flags which would work, but if a
>     library is built statically then the same variable could be defined
>     twice and cause a linker error. Or if the flag variable is only
>     defined in the static library, it would affect the unsuspecting
>     application too.
>
>   - <dirent.h> could have _opendir_lossy(const char *). Then one could
>     have something like:
>
>         #ifdef _DIRENT_LOSSY
>         #   define opendir _opendir_lossy
>         #endif
>
>     Apps could then #define _DIRENT_LOSSY and the code would be *source
>     compatible* with both old and new MinGW-w64. If an application
>     itself does "#define opendir _opendir_lossy" then the code would
>     only compile with new MinGW-w64.
>
>   - It's easy to add d_lossy flag to struct dirent to mark which names
>     weren't properly converted. But again, it could be too fancy.
>
> --
> Lasse Collin

_______________________________________________
Mingw-w64-public mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
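For reference, the header-side opt-in from the quoted _DIRENT_LOSSY idea
can be tried out in isolation. This is only a sketch: every name in it is a
hypothetical stand-in (nothing here is existing MinGW-w64 API), and the two
functions merely report which entry point was reached.

```c
/* All names here are hypothetical stand-ins, not MinGW-w64 API. */
enum open_kind { OPEN_STRICT, OPEN_LOSSY };

/* Stand-in for the default opendir(), which would reject names that
   cannot be converted exactly. */
enum open_kind sketch_opendir(const char *path)
{
    (void)path;
    return OPEN_STRICT;
}

/* Stand-in for the proposed lossy variant, which would accept them. */
enum open_kind sketch_opendir_lossy(const char *path)
{
    (void)path;
    return OPEN_LOSSY;
}

/* The application opts in before the header part below is seen. */
#define SKETCH_DIRENT_LOSSY

/* What the header would contain. */
#ifdef SKETCH_DIRENT_LOSSY
# define sketch_opendir sketch_opendir_lossy
#endif

/* Application code keeps calling the plain name. */
enum open_kind open_dir_kind(const char *path)
{
    return sketch_opendir(path);
}
```

An older header without the #ifdef block would ignore the opt-in macro, so
the same application source still compiles and gets the default behavior.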
