On 11/15/21 10:37, Sudhip Nashi wrote:
> Turns out lseek is broken (or at least works differently) on macOS as well
> (https://lists.gnu.org/archive/html/bug-gnulib/2018-09/msg00054.html). Funny
> coincidence! I’ll take a better look later this week if I can and try to see
> what the exact problem is.

Thanks, I think I see the problem now.

macOS will likely be fixed eventually to remove this lseek+SEEK_DATA incompatibility, since FreeBSD, Solaris, etc. all do things the Linux way and I think that is what will appear in the next POSIX. In the meantime I attempted to work around the portability issue by installing the attached patch into Gnulib and by syncing coreutils to the latest Gnulib.

I don't use macOS, so I have not tested this. Please give it a try, either by building bleeding-edge coreutils from Savannah, or by building from the tarball that is temporarily available here:

https://www.cs.ucla.edu/~eggert/coreutils-9.0.26-0f4d9.tar.gz
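
For readers who want the idea without reading the patch, here is a minimal standalone sketch of the workaround. It is not the Gnulib code itself: the function name seek_data_portable is hypothetical, and the ENOSYS fallback for platforms without SEEK_DATA/SEEK_HOLE is an assumption made for illustration. The function first asks SEEK_HOLE where the next hole starts; if that is not OFFSET itself, OFFSET already addresses data, so it positions with SEEK_SET instead of trusting SEEK_DATA.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE 1   /* glibc exposes SEEK_DATA/SEEK_HOLE only with this */
#endif
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of Gnulib): behave like
   lseek (fd, offset, SEEK_DATA) does on Linux, FreeBSD and Solaris,
   even where SEEK_DATA skips ahead when OFFSET already addresses data.  */
off_t
seek_data_portable (int fd, off_t offset)
{
#if defined SEEK_DATA && defined SEEK_HOLE
  /* Find the first hole at or after OFFSET.  The end of file counts
     as a hole, so this fails only on real errors or if OFFSET is at
     or past end of file.  */
  off_t next_hole = lseek (fd, offset, SEEK_HOLE);
  if (next_hole < 0)
    return next_hole;

  /* A hole starting somewhere other than OFFSET means OFFSET is
     inside data, so just move there.  */
  if (next_hole != offset)
    return lseek (fd, offset, SEEK_SET);

  /* OFFSET is inside a hole; SEEK_DATA finds the next data region.  */
  return lseek (fd, offset, SEEK_DATA);
#else
  /* Assumption for this sketch: report lack of support via ENOSYS.  */
  (void) fd;
  (void) offset;
  errno = ENOSYS;
  return -1;
#endif
}
```

On a platform whose lseek already follows the Linux convention, the extra SEEK_HOLE call costs one more system call but should not change the result.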

Thanks.
From 4db8db34112b86ddf8bac48f16b5acff732b5fa9 Mon Sep 17 00:00:00 2001
From: Paul Eggert <egg...@cs.ucla.edu>
Date: Mon, 15 Nov 2021 15:08:25 -0800
Subject: [PATCH] lseek: port around macOS SEEK_DATA glitch

Problem reported by Sudhip Nashi (Bug#51857).
* doc/posix-functions/lseek.texi (lseek): Mention macOS SEEK_DATA
issue.
* lib/lseek.c (rpl_lseek): Work around macOS portability glitch.
* m4/lseek.m4 (gl_FUNC_LSEEK): Replace lseek on Darwin.
* modules/lseek (Depends-on): Depend on msvc-nothrow
and fstat only if needed.
---
 ChangeLog                      | 11 +++++++++++
 doc/posix-functions/lseek.texi |  4 ++++
 lib/lseek.c                    | 16 ++++++++++++++++
 m4/lseek.m4                    | 10 ++++++++--
 modules/lseek                  |  4 ++--
 5 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index f47071a72..71a226570 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,14 @@
+2021-11-15  Paul Eggert  <egg...@cs.ucla.edu>
+
+	lseek: port around macOS SEEK_DATA glitch
+	Problem reported by Sudhip Nashi (Bug#51857).
+	* doc/posix-functions/lseek.texi (lseek): Mention macOS SEEK_DATA
+	issue.
+	* lib/lseek.c (rpl_lseek): Work around macOS portability glitch.
+	* m4/lseek.m4 (gl_FUNC_LSEEK): Replace lseek on Darwin.
+	* modules/lseek (Depends-on): Depend on msvc-nothrow
+	and fstat only if needed.
+
 2021-11-11  Fabrice Fontaine  <fontaine.fabr...@gmail.com>  (tiny change)
 
 	sigsegv: fix builds on microblazeel, or1k
diff --git a/doc/posix-functions/lseek.texi b/doc/posix-functions/lseek.texi
index 4a9d55dcf..2f8e2b587 100644
--- a/doc/posix-functions/lseek.texi
+++ b/doc/posix-functions/lseek.texi
@@ -9,6 +9,10 @@ Gnulib module: lseek
 Portability problems fixed by Gnulib:
 @itemize
 @item
+On some platforms, @code{lseek (fd, offset, SEEK_DATA)} returns a value
+greater than @code{offset} even when @code{offset} addresses data:
+macOS 12
+@item
 This function is declared in a different header file (namely, @code{<io.h>})
 on some platforms:
 MSVC 14.
diff --git a/lib/lseek.c b/lib/lseek.c
index 0042546a8..7dcd6c9da 100644
--- a/lib/lseek.c
+++ b/lib/lseek.c
@@ -52,6 +52,22 @@ rpl_lseek (int fd, off_t offset, int whence)
       errno = ESPIPE;
       return -1;
     }
+#elif defined __APPLE__ && defined __MACH__ && defined SEEK_DATA
+  if (whence == SEEK_DATA)
+    {
+      /* If OFFSET points to data, macOS lseek+SEEK_DATA returns the
+         start S of the first data region that begins *after* OFFSET,
+         where the region from OFFSET to S consists of possibly-empty
+         data followed by a possibly-empty hole.  To work around this
+         portability glitch, check whether OFFSET is within data by
+         using lseek+SEEK_HOLE, and if so return to OFFSET by using
+         lseek+SEEK_SET.  */
+      off_t next_hole = lseek (fd, offset, SEEK_HOLE);
+      if (next_hole < 0)
+        return next_hole;
+      if (next_hole != offset)
+        whence = SEEK_SET;
+    }
 #else
   /* BeOS lseek mistakenly succeeds on pipes...  */
   struct stat statbuf;
diff --git a/m4/lseek.m4 b/m4/lseek.m4
index 0af63780a..faab09b73 100644
--- a/m4/lseek.m4
+++ b/m4/lseek.m4
@@ -1,4 +1,4 @@
-# lseek.m4 serial 11
+# lseek.m4 serial 12
 dnl Copyright (C) 2007, 2009-2021 Free Software Foundation, Inc.
 dnl This file is free software; the Free Software Foundation
 dnl gives unlimited permission to copy and/or distribute it,
@@ -59,7 +59,7 @@ AC_DEFUN([gl_FUNC_LSEEK],
          ;;
      esac
     ])
-  if test $gl_cv_func_lseek_pipe = no; then
+  if test "$gl_cv_func_lseek_pipe" = no; then
     REPLACE_LSEEK=1
     AC_DEFINE([LSEEK_PIPE_BROKEN], [1],
       [Define to 1 if lseek does not detect pipes.])
@@ -69,4 +69,10 @@ AC_DEFUN([gl_FUNC_LSEEK],
   if test $WINDOWS_64_BIT_OFF_T = 1; then
     REPLACE_LSEEK=1
   fi
+
+  dnl macOS SEEK_DATA is incompatible with other platforms.
+  case $host_os in
+    darwin*)
+      REPLACE_LSEEK=1;;
+  esac
 ])
diff --git a/modules/lseek b/modules/lseek
index ced443123..f60809319 100644
--- a/modules/lseek
+++ b/modules/lseek
@@ -9,8 +9,8 @@ Depends-on:
 unistd
 sys_types
 largefile
-msvc-nothrow    [test $REPLACE_LSEEK = 1]
-fstat           [test $REPLACE_LSEEK = 1]
+msvc-nothrow    [test $WINDOWS_64_BIT_OFF_T = 1]
+fstat           [test "$gl_cv_func_lseek_pipe" = no]
 
 configure.ac:
 gl_FUNC_LSEEK
-- 
2.33.1
