Re: [PATCH v2 1/2] libstdc++: use copy_file_range, improve sendfile in filesystem::copy_file

2023-03-20 Thread Jonathan Wakely via Gcc-patches
On Mon, 20 Mar 2023 at 15:16, Jonathan Wakely wrote:

>
>
> On Wed, 15 Mar 2023 at 19:30, Jannik Glückert via Libstdc++ <
> libstd...@gcc.gnu.org> wrote:
>
>> This iteration improves error handling for copy_file_range,
>> particularly around undocumented error codes in earlier kernel
>> versions.
>> Additionally this fixes the userspace copy fallback to handle
>> zero-length files such as in
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108178.
>>
>> Lastly, the case "src gets resized during the copy loop" is now
>> considered and will return true once the loop hits EOF (this is the
>> only situation, aside from a zero-length src, where sendfile and
>> copy_file_range return 0).
>>
>
> I've applied this patch (with some whitespace fixes) and started testing
> it. I'm seeing some regressions:
>
> FAIL: experimental/filesystem/operations/copy.cc execution test
> FAIL: experimental/filesystem/operations/copy_file.cc execution test
>
> The failures in the log look like:
>
> terminate called after throwing an instance of
> 'std::experimental::filesystem::v1::__cxx11::filesystem_error'
>  what():  filesystem error: cannot copy: Input/output error
> [filesystem-test.copy.2900321341.ySWT77]
> [filesystem-test.copy.2900321342.vjeAar]
> FAIL: experimental/filesystem/operations/copy.cc execution test
>

Not just the Filesystem TS versions of those tests, but the std::filesystem
ones too:

FAIL: 27_io/filesystem/operations/copy.cc execution test
FAIL: 27_io/filesystem/operations/copy_file.cc execution test


Re: [PATCH v2 1/2] libstdc++: use copy_file_range, improve sendfile in filesystem::copy_file

2023-03-20 Thread Jonathan Wakely via Gcc-patches
On Wed, 15 Mar 2023 at 19:30, Jannik Glückert via Libstdc++ <
libstd...@gcc.gnu.org> wrote:

> This iteration improves error handling for copy_file_range,
> particularly around undocumented error codes in earlier kernel
> versions.
> Additionally this fixes the userspace copy fallback to handle
> zero-length files such as in
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108178.
>
> Lastly, the case "src gets resized during the copy loop" is now
> considered and will return true once the loop hits EOF (this is the
> only situation, aside from a zero-length src, where sendfile and
> copy_file_range return 0).
>

I've applied this patch (with some whitespace fixes) and started testing
it. I'm seeing some regressions:

FAIL: experimental/filesystem/operations/copy.cc execution test
FAIL: experimental/filesystem/operations/copy_file.cc execution test

The failures in the log look like:

terminate called after throwing an instance of
'std::experimental::filesystem::v1::__cxx11::filesystem_error'
 what():  filesystem error: cannot copy: Input/output error
[filesystem-test.copy.2900321341.ySWT77]
[filesystem-test.copy.2900321342.vjeAar]
FAIL: experimental/filesystem/operations/copy.cc execution test


[PATCH v2 1/2] libstdc++: use copy_file_range, improve sendfile in filesystem::copy_file

2023-03-15 Thread Jannik Glückert via Gcc-patches
This iteration improves error handling for copy_file_range,
particularly around undocumented error codes in earlier kernel
versions.
Additionally this fixes the userspace copy fallback to handle
zero-length files such as in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108178.

Lastly, the case "src gets resized during the copy loop" is now
considered and will return true once the loop hits EOF (this is the
only situation, aside from a zero-length src, where sendfile and
copy_file_range return 0).

Best
Jannik
From b55eb8dccaa44f07d8acbe6294326a46c920b04f Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jannik=20Gl=C3=BCckert?= 
Date: Mon, 6 Mar 2023 20:52:08 +0100
Subject: [PATCH 1/2] libstdc++: also use sendfile for big files
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

we were previously only using sendfile for files smaller than 2GB, as
sendfile needs to be called repeatedly for files bigger than that.

some quick numbers, copying a 16GB file, average of 10 repetitions:
old:
real: 13.4s
user: 0.14s
sys : 7.43s
new:
real: 8.90s
user: 0.00s
sys : 3.68s

Additionally, this fixes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108178

libstdc++-v3/ChangeLog:

* acinclude.m4 (_GLIBCXX_HAVE_LSEEK): define
* config.h.in: Regenerate.
* configure: Regenerate.
* src/filesystem/ops-common.h: enable sendfile for files
  >2GB in std::filesystem::copy_file, skip zero-length files

Signed-off-by: Jannik Glückert 
---
 libstdc++-v3/acinclude.m4|  51 +
 libstdc++-v3/config.h.in |   3 +
 libstdc++-v3/configure   | 127 ---
 libstdc++-v3/src/filesystem/ops-common.h |  86 ---
 4 files changed, 175 insertions(+), 92 deletions(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 5136c0571e8..85a09a5a869 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -4583,6 +4583,7 @@ dnl  _GLIBCXX_USE_FCHMOD
 dnl  _GLIBCXX_USE_FCHMODAT
 dnl  _GLIBCXX_USE_SENDFILE
 dnl  HAVE_LINK
+dnl  HAVE_LSEEK
 dnl  HAVE_READLINK
 dnl  HAVE_SYMLINK
 dnl
@@ -4718,25 +4719,6 @@ dnl
   if test $glibcxx_cv_fchmodat = yes; then
 AC_DEFINE(_GLIBCXX_USE_FCHMODAT, 1, [Define if fchmodat is available in .])
   fi
-dnl
-  AC_CACHE_CHECK([for sendfile that can copy files],
-glibcxx_cv_sendfile, [dnl
-case "${target_os}" in
-  gnu* | linux* | solaris* | uclinux*)
-	GCC_TRY_COMPILE_OR_LINK(
-	  [#include ],
-	  [sendfile(1, 2, (off_t*)0, sizeof 1);],
-	  [glibcxx_cv_sendfile=yes],
-	  [glibcxx_cv_sendfile=no])
-	;;
-  *)
-	glibcxx_cv_sendfile=no
-	;;
-esac
-  ])
-  if test $glibcxx_cv_sendfile = yes; then
-AC_DEFINE(_GLIBCXX_USE_SENDFILE, 1, [Define if sendfile is available in .])
-  fi
 dnl
   AC_CACHE_CHECK([for link],
 glibcxx_cv_link, [dnl
@@ -4749,6 +4731,18 @@ dnl
   if test $glibcxx_cv_link = yes; then
 AC_DEFINE(HAVE_LINK, 1, [Define if link is available in .])
   fi
+dnl
+  AC_CACHE_CHECK([for lseek],
+glibcxx_cv_lseek, [dnl
+GCC_TRY_COMPILE_OR_LINK(
+  [#include ],
+  [lseek(1, 0, SEEK_SET);],
+  [glibcxx_cv_lseek=yes],
+  [glibcxx_cv_lseek=no])
+  ])
+  if test $glibcxx_cv_lseek = yes; then
+AC_DEFINE(HAVE_LSEEK, 1, [Define if lseek is available in .])
+  fi
 dnl
   AC_CACHE_CHECK([for readlink],
 glibcxx_cv_readlink, [dnl
@@ -4785,6 +4779,25 @@ dnl
   if test $glibcxx_cv_truncate = yes; then
 AC_DEFINE(HAVE_TRUNCATE, 1, [Define if truncate is available in .])
   fi
+dnl
+  AC_CACHE_CHECK([for sendfile that can copy files],
+glibcxx_cv_sendfile, [dnl
+case "${target_os}" in
+  gnu* | linux* | solaris* | uclinux*)
+	GCC_TRY_COMPILE_OR_LINK(
+	  [#include ],
+	  [sendfile(1, 2, (off_t*)0, sizeof 1);],
+	  [glibcxx_cv_sendfile=yes],
+	  [glibcxx_cv_sendfile=no])
+	;;
+  *)
+	glibcxx_cv_sendfile=no
+	;;
+esac
+  ])
+  if test $glibcxx_cv_sendfile = yes && test $glibcxx_cv_lseek = yes; then
+AC_DEFINE(_GLIBCXX_USE_SENDFILE, 1, [Define if sendfile is available in .])
+  fi
 dnl
   AC_CACHE_CHECK([for fdopendir],
 glibcxx_cv_fdopendir, [dnl
diff --git a/libstdc++-v3/src/filesystem/ops-common.h b/libstdc++-v3/src/filesystem/ops-common.h
index abbfca43e5c..9e1b1d41dc5 100644
--- a/libstdc++-v3/src/filesystem/ops-common.h
+++ b/libstdc++-v3/src/filesystem/ops-common.h
@@ -51,6 +51,7 @@
 # include 
 # ifdef _GLIBCXX_USE_SENDFILE
 #  include  // sendfile
+#  include  // lseek
 # endif
 #endif
 
@@ -358,6 +359,32 @@ _GLIBCXX_BEGIN_NAMESPACE_FILESYSTEM
   }
 
 #ifdef NEED_DO_COPY_FILE
+#if defined _GLIBCXX_USE_SENDFILE && ! defined _GLIBCXX_FILESYSTEM_IS_WINDOWS
+  bool
+  copy_file_sendfile(int fd_in, int fd_out, size_t length) noexcept
+  {
+// a zero-length file is either empty, or not copyable by this syscall
+// return early to avoid the syscall cost
+if (length == 0)
+  {
+errno