[PATCH] canonicalize-lgpl: Canonicalize casing too for MinGW.

2021-12-09 Thread Jan (janneke) Nieuwenhuizen
* lib/canonicalize-lgpl.c (filesystem_name)[__MINGW32__]: New static
function.
(realpath_stk)[__MINGW32__]: Use it to return correct canonicalized
casing.
* tests/test-canonicalize-lgpl.c (main)[__MINGW32__]: Test it.
---
 lib/canonicalize-lgpl.c        | 37 ++
 tests/test-canonicalize-lgpl.c | 12 +++
 2 files changed, 49 insertions(+)

diff --git a/lib/canonicalize-lgpl.c b/lib/canonicalize-lgpl.c
index 92e9639720..baabcbdc25 100644
--- a/lib/canonicalize-lgpl.c
+++ b/lib/canonicalize-lgpl.c
@@ -41,6 +41,11 @@
 #include 
 #include 
 
+#if __MINGW32__
+#include <dirent.h>
+#include <libgen.h>
+#endif
+
 #ifdef _LIBC
 # include 
 # define GCC_LINT 1
@@ -180,6 +185,33 @@ get_path_max (void)
   return path_max < 0 ? 1024 : path_max <= IDX_MAX ? path_max : IDX_MAX;
 }
 
+#if __MINGW32__
+/* Return the basename of NAME as found on the filesystem, which may
+   or may not canonicalize the casing, or NULL if not found.  */
+static char *
+filesystem_name (char const *name)
+{
+  char base_buf[PATH_MAX];
+  strcpy (base_buf, name);
+  char *base = basename (base_buf);
+
+  int select_base (struct dirent const* entry)
+  {
+    return strcasecmp (entry->d_name, base) == 0;
+  }
+
+  char dir_buf[PATH_MAX];
+  strcpy (dir_buf, name);
+  char *dir = dirname (dir_buf);
+
+  struct dirent **name_list;
+  int i = scandir (dir, &name_list, select_base, NULL);
+  if (i == 1)
+    return name_list[0]->d_name;
+  return NULL;
+}
+#endif
+
 /* Act like __realpath (see below), with an additional argument
rname_buf that can be used as temporary storage.
 
@@ -322,6 +354,11 @@ realpath_stk (const char *name, char *resolved,
 {
   buf = link_buffer.data;
   idx_t bufsize = link_buffer.length;
+#if __MINGW32__
+  char *fname = filesystem_name (rname);
+  if (fname)
+    strcpy (rname + strlen (rname) - strlen (fname), fname);
+#endif
   n = __readlink (rname, buf, bufsize - 1);
   if (n < bufsize - 1)
 break;
diff --git a/tests/test-canonicalize-lgpl.c b/tests/test-canonicalize-lgpl.c
index c0a5a55150..cf41a2a628 100644
--- a/tests/test-canonicalize-lgpl.c
+++ b/tests/test-canonicalize-lgpl.c
@@ -279,6 +279,18 @@ main (void)
 free (result2);
   }
 
+#if __MINGW32__
+  /* Check that \\ are changed into / and casing is canonicalized. */
+  {
+    int fd = creat (BASE "/MinGW", 0600);
+    ASSERT (0 <= fd);
+    ASSERT (close (fd) == 0);
+
+    char *result = canonicalize_file_name (BASE "\\mingw");
+    ASSERT (strcmp (result, BASE "/MinGW") == 0);
+    free (result);
+  }
+#endif
 
   /* Cleanup.  */
   ASSERT (remove (BASE "/droot") == 0);



Re: [PATCH] canonicalize-lgpl: Canonicalize casing too for MinGW.

2021-12-09 Thread Bruno Haible
Hi Jan,

> * lib/canonicalize-lgpl.c (filesystem_name)[__MINGW32__]: New static
> function.
> (realpath_stk)[__MINGW32__]: Use it to return correct canonicalized
> casing.

I don't think this is desirable, because

1) The 'realpath' function that canonicalize-lgpl.c implements is
   specified to return "an absolute pathname that resolves to the same
   directory entry, whose resolution does not involve '.', '..', or symbolic
   links." [1] There is no guarantee in the spec that it prefers lowercase,
   uppercase, or the case of the existing directory entry.

2) If we wanted to make this function consistent on all platforms, we would
   also need to handle
 - Linux with mounted VFAT file systems,
 - macOS with case-insensitive HFS+,
 - different locales on Windows (e.g. to recognize that 'ä' and 'Ä' are
   equivalent in Windows installations with Western locales).
   And, on macOS with HFS+, also the Unicode canonicalization (NFC vs. NFD).

3) By doing this, the function would be slowed down significantly. The
   scandir() call that you added reads all directory entries of a certain
   directory.
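
The locale remark in point 2 can be illustrated with a small sketch. It
assumes the names are UTF-8 encoded and the program runs in the default "C"
locale; names_equal_ascii_nocase is a hypothetical helper, not part of
gnulib. strcasecmp() folds only single-byte letters, so it accepts
"README" vs "readme" but does not treat 'ä' and 'Ä' as equal.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>

/* Hypothetical helper: return nonzero if A and B are equal according to
   strcasecmp().  In the "C" locale this folds ASCII letters only.  */
int
names_equal_ascii_nocase (char const *a, char const *b)
{
  assert (a != NULL && b != NULL);
  return strcasecmp (a, b) == 0;
}

/* UTF-8 encodings of 'ä' (U+00E4) and 'Ä' (U+00C4), used for
   illustration.  The second bytes differ and are not ASCII letters,
   so strcasecmp() does not fold them onto each other.  */
char const lower_a_umlaut[] = "\xc3\xa4";
char const upper_a_umlaut[] = "\xc3\x84";
```

A comparison that matches what a particular file system considers equal
would need that file system's own folding rules, which is the portability
problem described above.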

What exactly do you want to do? If you want to look at the file name of
an existing directory entry, let your program use scandir().
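
As a rough sketch of what such a program-side lookup might look like,
assuming a POSIX-style <dirent.h> and ASCII-only case folding via
strcasecmp(): the helper name lookup_entry_name is hypothetical and not part
of gnulib. It uses opendir()/readdir() rather than scandir(), and returns a
malloc'ed copy of the on-disk name so that the caller owns the result.

```c
#include <dirent.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper: return a malloc'ed copy of the name of the single
   entry in directory DIR that equals BASE ignoring ASCII case.  Return
   NULL if DIR cannot be read, if no entry matches, if more than one entry
   matches, or if memory is exhausted.  */
char *
lookup_entry_name (char const *dir, char const *base)
{
  DIR *dirp = opendir (dir);
  if (dirp == NULL)
    return NULL;

  char *found = NULL;
  int matches = 0;
  struct dirent *entry;
  while ((entry = readdir (dirp)) != NULL)
    if (strcasecmp (entry->d_name, base) == 0)
      {
        matches++;
        if (matches == 1)
          {
            /* Copy the name: ENTRY is only valid until the next
               readdir() or closedir() call on DIRP.  */
            size_t size = strlen (entry->d_name) + 1;
            found = malloc (size);
            if (found == NULL)
              break;
            memcpy (found, entry->d_name, size);
          }
      }
  closedir (dirp);

  /* Reject ambiguous results, which are possible on case-sensitive
     file systems; after a failed malloc FOUND is already NULL.  */
  if (matches != 1)
    {
      free (found);
      return NULL;
    }
  return found;
}
```

The caller frees the returned string. Like the scandir() approach, this
still reads every entry of the directory, so the cost is paid only by
programs that actually need the on-disk casing.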

Additionally,

> +  int select_base (struct dirent const* entry)
> +  {
> +    return strcasecmp (entry->d_name, base) == 0;
> +  }

I would refrain from adding code that requires GCC and does not work with MSVC.
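
One way to avoid the GCC nested function, sketched here under the assumption
that scandir() itself is available on the platform: the filter callback
takes no user-data argument, so the name to compare against is passed
through a file-scope variable. The names select_base_name and
count_case_insensitive_matches are hypothetical, and the approach is neither
reentrant nor thread-safe.

```c
#include <dirent.h>
#include <stdlib.h>
#include <strings.h>

/* Name that select_base compares against.  scandir()'s filter has no
   context parameter, so the value is passed through this variable.  */
static char const *select_base_name;

/* Filter for scandir(): accept entries equal to select_base_name,
   ignoring ASCII case.  */
static int
select_base (struct dirent const *entry)
{
  return strcasecmp (entry->d_name, select_base_name) == 0;
}

/* Hypothetical helper: return the number of entries in DIR whose names
   equal BASE ignoring ASCII case, or -1 if DIR cannot be scanned.  */
int
count_case_insensitive_matches (char const *dir, char const *base)
{
  struct dirent **name_list;
  select_base_name = base;
  int n = scandir (dir, &name_list, select_base, NULL);
  if (n < 0)
    return -1;

  /* scandir() allocates each entry and the array; release both.  */
  for (int i = 0; i < n; i++)
    free (name_list[i]);
  free (name_list);
  return n;
}
```

This compiles as plain ISO C with a POSIX <dirent.h>, and it also frees the
list that scandir() allocates.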

Bruno

[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/realpath.html