bug#6906: [PATCH] cp: copy entirely-sparse files oodles faster

2018-10-10 Thread Paul Eggert

Assaf Gordon wrote:

> Can this be closed as out-dated?

Yes, that's fine. Closing.





bug#6906: [PATCH] cp: copy entirely-sparse files oodles faster

2018-10-10 Thread Assaf Gordon

(triaging old bugs)

Hello,

On 17/04/11 10:28 AM, Paul Eggert wrote:
> On 04/17/11 01:55, Jim Meyering wrote:
>> Now that we have FIEMAP support, (by the looks of things
>> we will soon have SEEK_HOLE support in cp and in the linux kernel)
>> do you think adding support for this special case is worthwhile?
>> I could go either way.
>>
>> If so, would you care to rebase it for 8.13?
>
> Yes, I expect it's worthwhile, as the FIEMAP stuff isn't universal.
> I'll add it to my list of things to do.  It's not high priority,
> to be sure.


In the 8 years since the original thread, cp(1) has learned to copy
sparse files very fast (though I suspect it still does so with FIEMAP
rather than SEEK_DATA/SEEK_HOLE).


https://bugs.gnu.org/6906

Can this be closed as out-dated?

regards,
 - assaf








bug#6906: [PATCH] cp: copy entirely-sparse files oodles faster

2011-04-17 Thread Paul Eggert
On 04/17/11 01:55, Jim Meyering wrote:
> Now that we have FIEMAP support, (by the looks of things
> we will soon have SEEK_HOLE support in cp and in the linux kernel)
> do you think adding support for this special case is worthwhile?
> I could go either way.
> 
> If so, would you care to rebase it for 8.13?

Yes, I expect it's worthwhile, as the FIEMAP stuff isn't universal.
I'll add it to my list of things to do.  It's not high priority,
to be sure.





bug#6906: [PATCH] cp: copy entirely-sparse files oodles faster

2011-04-17 Thread Jim Meyering
Paul Eggert wrote:
> (By "oodles faster" I mean "as much faster as you like".
> The benchmark below shows a 2800x speedup.)
>
> In response to an idea by Kit Westneat for GNU tar reported in
> ,
> Eric Blake wrote:
>
>> Meanwhile, if you are indeed correct that there are easy ways to detect
>> completely sparse files, even when the ioctl or SEEK_HOLE directives are
>> not present, then the coreutils cp(1) hole iteration routine should
>> probably be taught that corner case to recognize an entirely sparse file
>> as a single hole.
>
> Here's a patch to coreutils to implement this idea.  It's based on a patch
>  that
> I just now installed into GNU tar.  I think of it as a quick first cut
> at full fiemap / SEEK_HOLE implementation, but unlike the full
> implementation this optimization does not depend on any special ioctls
> or lseek extensions, so it should work on any POSIX or POSIX-like host.
>
> On a simple benchmark this sped up GNU cp by a factor of 2800
> (measuring by real-time seconds) on my host:
>
>    $ truncate -s 10GB bigfile
>    $ time old/cp bigfile bigfile-slow
>
>    real    2m3.231s
>    user    0m1.497s
>    sys     0m5.738s
>    $ time new/cp bigfile bigfile-fast
>
>    real    0m0.044s
>    user    0m0.000s
>    sys     0m0.002s
>    $ ls -ls bigfile*
>    0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:11 bigfile
>    0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:14 bigfile-fast
>    0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:14 bigfile-slow
>
>>From 2e535b590d675e6d96f954c1f840d678fb133f6a Mon Sep 17 00:00:00 2001
> From: Paul Eggert 
> Date: Tue, 24 Aug 2010 22:20:55 -0700
> Subject: [PATCH] cp: copy entirely-sparse files oodles faster
>
> * src/copy.c (copy_reg): Bypass reads if the file is entirely
> sparse.  Idea suggested by Kit Westneat via Bernd Shubert in
> 
> for the Lustre file system.  Implementation stolen from my patch
> 
> to GNU tar.  On my machine this sped up a cp benchmark, which
> copied a 10 GB entirely-sparse file on an NFS file system, by a
> factor of 2800 in real seconds.

Hi Paul,

Somehow I didn't see this patch from you until now, while looking
through the hundreds of outstanding (but mostly resolved) bugs at
http://debbugs.gnu.org/coreutils.  Sorry about that.

Now that we have FIEMAP support, (by the looks of things
we will soon have SEEK_HOLE support in cp and in the linux kernel)
do you think adding support for this special case is worthwhile?
I could go either way.

If so, would you care to rebase it for 8.13?
coreutils-8.12 will probably be coming soon to adjust FIEMAP
support not to collide with the combination of XFS, 2.6.39
release-candidate kernels and so-called "unwritten extents".





bug#6906: [PATCH] cp: copy entirely-sparse files oodles faster

2010-08-24 Thread Paul Eggert
(By "oodles faster" I mean "as much faster as you like".
The benchmark below shows a 2800x speedup.)

In response to an idea by Kit Westneat for GNU tar reported in
,
Eric Blake wrote:

> Meanwhile, if you are indeed correct that there are easy ways to detect
> completely sparse files, even when the ioctl or SEEK_HOLE directives are
> not present, then the coreutils cp(1) hole iteration routine should
> probably be taught that corner case to recognize an entirely sparse file
> as a single hole.

Here's a patch to coreutils to implement this idea.  It's based on a patch
 that
I just now installed into GNU tar.  I think of it as a quick first cut
at full fiemap / SEEK_HOLE implementation, but unlike the full
implementation this optimization does not depend on any special ioctls
or lseek extensions, so it should work on any POSIX or POSIX-like host.
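
The test this optimization relies on can be sketched in isolation. This is only an illustration, not the coreutils code: the helper name `is_entirely_sparse` is hypothetical, and it reads `st_blocks` directly where the patch goes through the `ST_NBLOCKS` macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical helper, not part of coreutils.
   Return true if the file described by SB has a nonzero size but
   no allocated blocks, i.e. it looks like one big hole.  */
bool is_entirely_sparse(const struct stat *sb)
{
    assert(sb != NULL);
    return sb->st_size != 0 && sb->st_blocks == 0;
}
```

A caller would pass the result of `fstat` on the already-open source file. The check trusts the block count that the filesystem reports, so a filesystem that reported zero blocks for a file that does hold data would defeat it.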

On a simple benchmark this sped up GNU cp by a factor of 2800
(measuring by real-time seconds) on my host:

   $ truncate -s 10GB bigfile
   $ time old/cp bigfile bigfile-slow

   real    2m3.231s
   user    0m1.497s
   sys     0m5.738s
   $ time new/cp bigfile bigfile-fast

   real    0m0.044s
   user    0m0.000s
   sys     0m0.002s
   $ ls -ls bigfile*
   0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:11 bigfile
   0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:14 bigfile-fast
   0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:14 bigfile-slow

>From 2e535b590d675e6d96f954c1f840d678fb133f6a Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Tue, 24 Aug 2010 22:20:55 -0700
Subject: [PATCH] cp: copy entirely-sparse files oodles faster

* src/copy.c (copy_reg): Bypass reads if the file is entirely
sparse.  Idea suggested by Kit Westneat via Bernd Shubert in

for the Lustre file system.  Implementation stolen from my patch

to GNU tar.  On my machine this sped up a cp benchmark, which
copied a 10 GB entirely-sparse file on an NFS file system, by a
factor of 2800 in real seconds.
---
 src/copy.c |   18 +++---
 1 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/src/copy.c b/src/copy.c
index 6d11ed8..1e79523 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -669,10 +669,21 @@ copy_reg (char const *src_name, char const *dst_name,
 #endif
         }
 
-      /* If not making a sparse file, try to use a more-efficient
-         buffer size.  */
-      if (! make_holes)
+      if (make_holes)
         {
+          /* For speed, bypass reads if the file is entirely sparse.  */
+
+          if (src_open_sb.st_size != 0 && ST_NBLOCKS (src_open_sb) == 0)
+            {
+              n_read_total = src_open_sb.st_size;
+              goto set_dest_size;
+            }
+        }
+      else
+        {
+          /* Not making a sparse file, so try to use a more-efficient
+             buffer size.  */
+
           /* Compute the least common multiple of the input and output
              buffer sizes, adjusting for outlandish values.  */
           size_t blcm_max = MIN (SIZE_MAX, SSIZE_MAX) - buf_alignment_slop;
@@ -788,6 +799,7 @@ copy_reg (char const *src_name, char const *dst_name,
 
       if (last_write_made_hole)
         {
+        set_dest_size:
           if (ftruncate (dest_desc, n_read_total) < 0)
             {
               error (0, errno, _("truncating %s"), quote (dst_name));
-- 
1.7.2
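
Outside the context of copy_reg, the effect of the patch can be approximated by a small standalone routine. This is a sketch under stated assumptions rather than the coreutils implementation: the function name `copy_if_entirely_sparse`, its return convention, and the 0600 destination mode are all hypothetical choices made for illustration. It skips reading the source and sizes the destination with `ftruncate`, which is the step the patch reaches through `goto set_dest_size`.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of coreutils.
   If SRC is a nonempty regular file with no allocated blocks, create
   DST as a file of the same size without reading SRC, and return 1.
   Return 0 if SRC does not qualify (the caller should fall back to an
   ordinary copy), or -1 on error.  */
int copy_if_entirely_sparse(const char *src, const char *dst)
{
    int result = -1;
    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;

    struct stat sb;
    if (fstat(in, &sb) == 0) {
        if (!S_ISREG(sb.st_mode) || sb.st_size == 0 || sb.st_blocks != 0) {
            result = 0;   /* has data, is empty, or is not a regular file */
        } else {
            int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
            if (out >= 0) {
                /* Extending an empty file with ftruncate writes no data;
                   the new range reads back as zero bytes.  */
                if (ftruncate(out, sb.st_size) == 0)
                    result = 1;
                if (close(out) != 0)
                    result = -1;
            }
        }
    }

    close(in);
    return result;
}
```

A caller would try this first and fall back to a normal read/write loop when it returns 0. Whether the destination ends up with no blocks allocated depends on the destination filesystem.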