Re: [Gluster-users] Confusing lstat() performance

2017-09-20 Thread Ben Turner
- Original Message -
> From: "Niklas Hambüchen" 
> To: "Sam McLeod" 
> Cc: gluster-users@gluster.org
> Sent: Sunday, September 17, 2017 7:34:15 PM
> Subject: Re: [Gluster-users] Confusing lstat() performance
> 
> I found the reason now, at least for this set of lstat()s I was looking at.
> 
> bup first does all getdents(), obtaining all file names in the
> directory, and then stat()s them.

+1 I was thinking of a case just like this.

-b

> 
> Apparently this destroys some of gluster's caching, making stat()s ~100x
> slower.
> 
> What caching could this be, and how could I convince gluster to serve
> these stat()s as fast as if a getdents() had been done just before them?
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
> 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Confusing lstat() performance

2017-09-20 Thread Niklas Hambüchen
Note, I have now put out a patch for bup that makes it 50x faster on my
1M files on gluster, exploiting the fact that stat() is fast right after a
getdents() on the containing directory:

  https://groups.google.com/forum/#!topic/bup-list/6CM7Jr7b_Ek
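
For illustration, a rough Python sketch of the idea behind the patch (this is
not the actual patch; see the link above for that): lstat() entries while the
directory is still being read, instead of collecting all names first. The
function name is made up for the example, and a single flat directory is
assumed:

  import os

  def index_dir(path):
      # os.scandir() reads directory entries in getdents() batches and
      # yields them lazily, so each lstat() happens shortly after the
      # getdents() that returned the entry.
      with os.scandir(path) as entries:
          for entry in entries:
              yield entry.name, os.lstat(entry.path)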

I'd still be interested in answers to this question though:

> What caching could this be, and how could I convince gluster to serve
> these stat()s as fast as if a getdents() had been done just before them?
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Confusing lstat() performance

2017-09-19 Thread Ben Turner
- Original Message -
> From: "Niklas Hambüchen" 
> To: "Ben Turner" 
> Cc: "Gluster Users" 
> Sent: Monday, September 18, 2017 11:27:33 AM
> Subject: Re: [Gluster-users] Confusing lstat() performance
> 
> On 18/09/17 17:23, Ben Turner wrote:
> > Do you want tuned or untuned?  If tuned I'd like to try one of my tunings
> > for metadata, but I will use yours if you want.
> 
> (Re-CC'd list)
> 
> I would be interested in both, if possible: To confirm that it's not
> only my machines that exhibit this behaviour given my settings, and to
> see what can be achieved with your tuned settings.

I was just going to get the details from you and post the results back to the
list; I just didn't want to spam people while we worked out the details.

Here is untuned + quota enabled:

strace -f -w -c rsync -a --dry-run ./ /tmp/
strace: Process 21797 attached
strace: Process 21827 attached
% time seconds  usecs/call callserrors syscall
-- --- --- - - 
 60.347.5255674849  1552 3 select
 28.633.571200 821  4351   getdents
 10.851.352934   72310 lstat
  0.050.006722   9   733   write
  0.050.005798   7   817   read
  0.040.0050012501 2   openat
  0.010.0017221722 1   execve
  0.010.001627  2663   munmap
  0.000.000552   961   mmap
  0.000.000228 114 2   clone
  0.000.000224  1219   close
  0.000.000123   718   fcntl
  0.000.90  10 9 2 open
  0.000.70   710   mprotect
  0.000.66   710 5 wait4
  0.000.63   512   rt_sigaction
  0.000.60  15 4   mremap
  0.000.37   6 6   brk
  0.000.36   5 7   fstat
  0.000.33  11 3   socketpair
  0.000.31  10 3 1 stat
  0.000.29  15 2   dup2
  0.000.11  11 1   kill
  0.000.10   5 2 2 rt_sigreturn
  0.000.08   8 1   getcwd
  0.000.07   7 1 1 access
  0.000.07   7 1   chdir
  0.000.06   6 1   rt_sigprocmask
  0.000.05   5 1   umask
  0.000.05   5 1   geteuid
  0.000.05   5 1   arch_prctl
-- --- --- - - 
100.00   12.472277207698100014 total

And BUP:

strace: Process 103678 detached
strace: Process 103679 detached
% time seconds  usecs/call callserrors syscall
-- --- --- - - 
 80.56 1627.233677   26201 62106   read
  9.02  182.162398   6  30924134   lseek
  7.46  150.672844 225670684 2 lstat
  1.21   24.461651 300 81647 10513 open
  0.83   16.696162 190 87869  1019 close
  0.254.993062 318 15702   llistxattr
  0.193.874752 165 23548   msync
  0.183.630711  23159461   write
  0.061.284745   623   getcwd
  0.040.880300 112  7850   rename
  0.040.800025   6134108   fstat
  0.040.725158  15 47288   munmap
  0.030.549348  14 37922   select
  0.020.361083   8 47593   mmap
  0.020.349493  22 15699  7850 unlink
  0.010.275029   6 47116   fcntl
  0.010.253821  32  7895  7879 ioctl
  0.010.243284  31  7851  7851 getxattr
  0.010.215476  27  7851  7851 lgetxattr
  0.010.136050  15  8832  8707 stat
  0.000.091960   6 15701   dup
  0.000.0249558318 3   execve
  0.000.004210   8   524   brk
  0.000.001756   8   210   mprotect
  0.000.001450 725 2   setsid
  0.000.001162   6   205   rt_sigaction
  0.000.000670 335 2   clone
  0.000.000200  1712   getdents
  0.000.70  12 6   openat
  0.000.54   8 7 5 access
  0.000.00

Re: [Gluster-users] Confusing lstat() performance

2017-09-19 Thread Niklas Hambüchen
On 18/09/17 17:23, Ben Turner wrote:
> Do you want tuned or untuned?  If tuned I'd like to try one of my tunings for 
> metadata, but I will use yours if you want.

(Re-CC'd list)

I would be interested in both, if possible: To confirm that it's not
only my machines that exhibit this behaviour given my settings, and to
see what can be achieved with your tuned settings.

Thank you!
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Confusing lstat() performance

2017-09-19 Thread Niklas Hambüchen
On 18/09/17 16:51, Ben Turner wrote:
> I wouldn't mind, but I don't have your dataset.

Oh sorry, I thought I had posted that here but in fact I did so in a
different issue regarding getdents() performance (bug 1478411).

My benchmarking data set is trivial: 100k empty files.

In a directory on your gluster mount:

  seq 1 100000 | xargs touch

I found this simple benchmark data set (and even 10k empty files) to
exhibit the same lstat() problems as my "real" workload directory with
1M files.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Confusing lstat() performance

2017-09-19 Thread Ben Turner
- Original Message -
> From: "Niklas Hambüchen" 
> To: "Ben Turner" 
> Cc: gluster-users@gluster.org
> Sent: Sunday, September 17, 2017 9:49:10 PM
> Subject: Re: [Gluster-users] Confusing lstat() performance
> 
> Hi Ben,
> 
> do you know if the smallfile benchmark also does interleaved getdents()
> and lstat, which is what I found to be the key difference that
> creates the performance gap (further down this thread)?

I am not sure; you can have a look at it:

https://github.com/bengland2/smallfile


> 
> Also, wouldn't `--threads 8` change the performance numbers by factor 8
> versus the plain `ls` and `rsync` that I did?

Maybe not 8x, but it will definitely improve things.  I just recycled what was
in my history buffer; I only wanted to illustrate that even though you see the
same stat calls in the strace, application behavior can have a big impact on
performance.

> 
> Would you mind running those commands directly/plainly on your cluster
> to confirm or refute my numbers?

I wouldn't mind, but I don't have your dataset.  That's why I wanted to bring
in a perf test tool so that we could compare things apples to apples.  What
about running on the data that smallfile creates and comparing that?

-b


> 
> Thanks!
> 
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Confusing lstat() performance

2017-09-19 Thread Niklas Hambüchen
Hi Ben,

do you know if the smallfile benchmark also does interleaved getdents()
and lstat, which is what I found to be the key difference that
creates the performance gap (further down this thread)?

Also, wouldn't `--threads 8` change the performance numbers by factor 8
versus the plain `ls` and `rsync` that I did?

Would you mind running those commands directly/plainly on your cluster
to confirm or refute my numbers?

Thanks!
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Confusing lstat() performance

2017-09-19 Thread Ben Turner
I attached my strace output for you to look at:

Smallfile stat:
files/sec = 2270.307299
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 84.48  272.324412        3351     81274      1141 stat
 10.20   32.880871       81997       401           read
  2.78    8.957013       97359        92           write
  2.44    7.854060      436337        18           select
  0.03    0.085606          67      1277       984 open
  0.03    0.082586       13764         6           getdents
  0.02    0.048819        4882        10           unlink
  0.01    0.046833       46833         1           mkdir
  0.01    0.027127       27127         1           rmdir
  0.00    0.012896       12896         1           fsync
  0.00    0.003129           6       531           fstat
  0.00    0.002766          66        42         2 wait4

Smallfile ls -l:
files/sec = 22852.145303
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 60.26   18.558147        3849      4822           getdents
 26.96    8.302826      461268        18           select
  4.92    1.515195        1868       811           openat
  4.83    1.486988          18     81294      1161 stat
  2.23    0.686146        7458        92           write
  0.26    0.080318        8032        10           unlink
  0.17    0.050832          40      1277       984 open
  0.10    0.030263       30263         1           rmdir
  0.08    0.023776       23776         1           fsync
  0.05    0.016408          41       401           read
  0.05    0.016061       16061         1           mkdir
  0.04    0.011154          10      1108           close
  0.01    0.003229           6       531           fstat
  0.01    0.002840          68        42         2 wait4

Look at the difference between my two stat calls:

 84.48  272.324412        3351     81274      1141 stat   <--- stat

  4.83    1.486988          18     81294      1161 stat   <--- ls -l

Maybe your two applications are behaving differently, like smallfile stat and
smallfile ls -l are?

-b

- Original Message -
> From: "Ben Turner" 
> To: "Niklas Hambüchen" 
> Cc: gluster-users@gluster.org
> Sent: Sunday, September 17, 2017 8:54:01 PM
> Subject: Re: [Gluster-users] Confusing lstat() performance
> 
> I did a quick test on one of my lab clusters with no tuning except for quota
> being enabled:
> 
> [root@dell-per730-03 ~]# gluster v info
>  
> Volume Name: vmstore
> Type: Replicate
> Volume ID: 0d2e4c49-334b-47c9-8e72-86a4c040a7bd
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: 192.168.50.1:/rhgs/brick1/vmstore
> Brick2: 192.168.50.2:/rhgs/brick1/vmstore
> Brick3: 192.168.50.3:/rhgs/ssd/vmstore (arbiter)
> Options Reconfigured:
> features.quota-deem-statfs: on
> nfs.disable: on
> features.inode-quota: on
> features.quota: on
> 
> And I ran the smallfile benchmark, creating 80k 64KB files.  After that I
> cleared caches everywhere and ran a smallfile stat test:
> 
> [root@dell-per730-06-priv ~]# python /smallfile/smallfile_cli.py --files
> 1 --file-size 64 --threads 8 --top /gluster-mount/s-file/ --operation
> stat
>  version : 3.1
>hosts in test : None
>top test directory(s) : ['/gluster-mount/s-file']
>operation : stat
> files/thread : 1
>  threads : 8
>record size (KB, 0 = maximum) : 0
>   file size (KB) : 64
>   file size distribution : fixed
>files per dir : 100
> dirs per dir : 10
>   threads share directories? : N
>  filename prefix :
>  filename suffix :
>  hash file number into dir.? : N
>  fsync after modify? : N
>   pause between files (microsec) : 0
> finish all requests? : Y
>   stonewall? : Y
>  measure response times? : N
> verify read? : Y
> verbose? : False
>   log to stderr? : False
>ext.attr.size : 0
>   ext.attr.count : 0
> host = dell-per730-06-priv.css.lab.eng.rdu2.redhat.com,thr = 00,elapsed =
> 33.513184,files = 9400,records = 0,status = ok
> host = dell-per730-06-priv.css.lab.eng.rdu2.redhat.com,thr = 01,elapsed =
> 33.322282,files = 9700,records = 0,status = ok
> host = d

Re: [Gluster-users] Confusing lstat() performance

2017-09-19 Thread Ben Turner
dhat.com,thr = 01,elapsed = 
2.854434,files = 1,records = 0,status = ok
host = dell-per730-06-priv.css.lab.eng.rdu2.redhat.com,thr = 02,elapsed = 
2.864292,files = 9000,records = 0,status = ok
host = dell-per730-06-priv.css.lab.eng.rdu2.redhat.com,thr = 03,elapsed = 
2.878101,files = 9600,records = 0,status = ok
host = dell-per730-06-priv.css.lab.eng.rdu2.redhat.com,thr = 04,elapsed = 
2.883071,files = 9500,records = 0,status = ok
host = dell-per730-06-priv.css.lab.eng.rdu2.redhat.com,thr = 05,elapsed = 
2.885429,files = 9600,records = 0,status = ok
host = dell-per730-06-priv.css.lab.eng.rdu2.redhat.com,thr = 06,elapsed = 
2.865360,files = 9500,records = 0,status = ok
host = dell-per730-06-priv.css.lab.eng.rdu2.redhat.com,thr = 07,elapsed = 
2.866940,files = 9400,records = 0,status = ok
total threads = 8
total files = 76100
total IOPS = 0
 95.12% of requested files processed, minimum is  90.00
elapsed time = 2.885
files/sec = 26373.893173

So here I get ~26k files/sec.  Based on ~25k files/sec, your 1M files work out
to about 40 seconds, much more like what you are seeing.  Are you sure that
your bup app isn't doing something like ls -l, whereas rsync is doing something
like stat?

Maybe you could try the smallfile benchmark and see what it gives you.  Also,
make sure you clear caches on both clients and servers (echo 3 >
/proc/sys/vm/drop_caches) before running your tests, to eliminate server- and
client-side caching.  If you are not dropping caches, this could also explain
the differences you are seeing.

-b

- Original Message -
> From: "Niklas Hambüchen" 
> To: gluster-users@gluster.org
> Sent: Thursday, September 14, 2017 8:42:44 AM
> Subject: [Gluster-users] Confusing lstat() performance
> 
> Hi,
> 
> I have a gluster 3.10 volume with a dir with ~1 million small files in
> it, say mounted at /mnt/dir with FUSE, and I'm observing something weird:
> 
> When I list and stat them all using rsync, then the lstat() calls that
> rsync does are incredibly fast (23 microseconds per call on average,
> definitely faster than a network roundtrip between my 3-machine bricks
> connected via Ethernet).
> But when I try to back them up with the `bup` tool
> (https://github.com/bup/bup), which (at least according to strace) does
> the same syscalls as rsync to stat all files, it takes 1700 microseconds
> per lstat() call, and the total time to go over all files is 50x higher.
> 
> These are my strace statistics:
> 
> rsync strace:
> 
>   $ strace -f -w -c rsync -a --dry-run /mnt/dir /tmp/nonexisting
> 
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  52.35   69.682773       26617      2618         1 select
>  39.60   52.712292        1056     49907           getdents
>   8.00   10.655949          11    998102           lstat
>   0.02    0.022890          12      1900           read
>   0.01    0.017219          21       829           write
>   0.01    0.012093          14       868           munmap
>   0.01    0.006656          11       606           mmap
>   0.00    0.004168        1389         3           readlink
>   0.00    0.001796        1796         1           chdir
>   0.00    0.001019         510         2           clone
>   0.00    0.000841          44        19        13 open
> 
> Took ~50 seconds real time to complete.
> 
> bup strace (I interrupted it after a while):
> 
>   strace: Process 10749 attached
>   strace: Process 10750 attached
>   Indexing: 25600 (566 paths/s)
>   ^C
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  89.55 1140.837325        1700    671016           lstat
>   3.92   49.875376         934     53387           getdents
>   3.58   45.655985       52238       874           read
>   2.14   27.293944         789     34588      5799 open
>   0.57    7.266342         141     51384           llistxattr
>   0.09    1.090689          42     25692     25692 getxattr
>   0.06    0.780977          26     29825      1019 close
>   0.05    0.601806          23     25739     25722 ioctl
>   0.03    0.373894        1851       202           select
>   0.00    0.055953          14      3879           brk
> 
>   real20m52.150s
>   user0m2.608s
>   sys 0m11.456s
> 
> Note I passed `-c -w` to strace to measure the wall time spent in the
> syscalls, not system CPU time.
> 
> Using
> 
>   time strace -f -c -w ls -lU /mnt/dir > /dev/null
> 
> shows that ls is as fast as rsync (also in 50 seconds).
> 
> (Aside: I've filed the large number of getdents() as
> https://bugzilla.redhat.com/show_bug.cgi?id=1478411 but that

Re: [Gluster-users] Confusing lstat() performance

2017-09-19 Thread Niklas Hambüchen
I found the reason now, at least for this set of lstat()s I was looking at.

bup first does all getdents(), obtaining all file names in the
directory, and then stat()s them.

Apparently this destroys some of gluster's caching, making stat()s ~100x
slower.
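
As a minimal Python sketch of the two syscall orders being compared (my own
function names, single flat directory assumed), the first is the order bup
uses and is the slow one here; the second keeps each lstat() close to the
getdents() that returned the name, which is the case in which they are fast:

  import os

  def getdents_all_then_stat(d):
      # bup's order: read the whole directory first, then stat() every name
      names = os.listdir(d)
      return [os.lstat(os.path.join(d, n)) for n in names]

  def getdents_interleaved_with_stat(d):
      # interleaved order: stat() entries while the directory is still being
      # read (os.scandir() issues getdents() in batches and yields lazily)
      with os.scandir(d) as it:
          return [os.lstat(e.path) for e in it]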

What caching could this be, and how could I convince gluster to serve
these stat()s as fast as if a getdents() had been done just before them?
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Confusing lstat() performance

2017-09-19 Thread Niklas Hambüchen
On 17/09/17 18:03, Niklas Hambüchen wrote:
> So far the only difference between `ls` and `bup index` I could observe
> is that `bup index` chdir()s into the directory to index, while ls doesn't.
> 
> But when I `cd` into the dir and run `ls` without a directory argument,
> each of its stat()s is still much faster than bup index's.

Hmm, bup uses the fchdir() syscall to go into the target directory.
Might that make a difference for gluster?
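
For reference, a small Python sketch (function names made up) of the two ways
of entering the directory that are being compared here, bup's fchdir() versus
the plain chdir() a shell `cd` boils down to; whether gluster's FUSE client
treats them differently is exactly the open question:

  import os

  def enter_with_fchdir(d):
      # bup-style: open the directory and fchdir() into it
      fd = os.open(d, os.O_RDONLY)
      os.fchdir(fd)
      os.close(fd)

  def enter_with_chdir(d):
      # what a shell `cd` does: a plain chdir()
      os.chdir(d)

  # either way, the files can then be stat()ed by relative name:
  #   for n in os.listdir("."): os.lstat(n)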
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Confusing lstat() performance

2017-09-19 Thread Niklas Hambüchen
On 15/09/17 03:46, Niklas Hambüchen wrote:
>> Out of interest have you tried testing performance
>> with performance.stat-prefetch enabled?

I have now tested with `performance.stat-prefetch: on` but am not
observing a difference.

So far the only difference between `ls` and `bup index` I could observe
is that `bup index` chdir()s into the directory to index, while ls doesn't.

But when I `cd` into the dir and run `ls` without a directory argument,
each of its stat()s is still much faster than bup index's.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Confusing lstat() performance

2017-09-18 Thread Niklas Hambüchen
On 15/09/17 02:45, Sam McLeod wrote:
> Out of interest have you tried testing performance
> with performance.stat-prefetch enabled?

Not yet, because I'm still struggling to understand the performance
behaviour of the current, more basic setup (with it off), but it's
definitely on my list and I'll report the outcome.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Confusing lstat() performance

2017-09-14 Thread Sam McLeod
Hi Niklas,

Out of interest have you tried testing performance with 
performance.stat-prefetch enabled?

--
Sam McLeod 
@s_mcleod
https://smcleod.net

> On 14 Sep 2017, at 10:42 pm, Niklas Hambüchen  wrote:
> 
> Hi,
> 
> I have a gluster 3.10 volume with a dir with ~1 million small files in
> it, say mounted at /mnt/dir with FUSE, and I'm observing something weird:
> 
> When I list and stat them all using rsync, then the lstat() calls that
> rsync does are incredibly fast (23 microseconds per call on average,
> definitely faster than a network roundtrip between my 3-machine bricks
> connected via Ethernet).
> But when I try to back them up with the `bup` tool
> (https://github.com/bup/bup), which (at least according to strace) does
> the same syscalls as rsync to stat all files, it takes 1700 microseconds
> per lstat() call, and the total time to go over all files is 50x higher.
> 
> These are my strace statistics:
> 
> rsync strace:
> 
>  $ strace -f -w -c rsync -a --dry-run /mnt/dir /tmp/nonexisting
> 
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  52.35   69.682773       26617      2618         1 select
>  39.60   52.712292        1056     49907           getdents
>   8.00   10.655949          11    998102           lstat
>   0.02    0.022890          12      1900           read
>   0.01    0.017219          21       829           write
>   0.01    0.012093          14       868           munmap
>   0.01    0.006656          11       606           mmap
>   0.00    0.004168        1389         3           readlink
>   0.00    0.001796        1796         1           chdir
>   0.00    0.001019         510         2           clone
>   0.00    0.000841          44        19        13 open
> 
> Took ~50 seconds real time to complete.
> 
> bup strace (I interrupted it after a while):
> 
>  strace: Process 10749 attached
>  strace: Process 10750 attached
>  Indexing: 25600 (566 paths/s)
>  ^C
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  89.55 1140.837325        1700    671016           lstat
>   3.92   49.875376         934     53387           getdents
>   3.58   45.655985       52238       874           read
>   2.14   27.293944         789     34588      5799 open
>   0.57    7.266342         141     51384           llistxattr
>   0.09    1.090689          42     25692     25692 getxattr
>   0.06    0.780977          26     29825      1019 close
>   0.05    0.601806          23     25739     25722 ioctl
>   0.03    0.373894        1851       202           select
>   0.00    0.055953          14      3879           brk
> 
>  real20m52.150s
>  user0m2.608s
>  sys 0m11.456s
> 
> Note I passed `-c -w` to strace to measure the wall time spent in the
> syscalls, not system CPU time.
> 
> Using
> 
>  time strace -f -c -w ls -lU /mnt/dir > /dev/null
> 
> shows that ls is as fast as rsync (also in 50 seconds).
> 
> (Aside: I've filed the large number of getdents() as
> https://bugzilla.redhat.com/show_bug.cgi?id=1478411 but that's not the
> core question of this email.)
> 
> My volume info:
> 
># gluster vol info
> 
>Volume Name: myvol
>Type: Replicate
>Volume ID: ...
>Status: Started
>Snapshot Count: 0
>Number of Bricks: 1 x 3 = 3
>Transport-type: tcp
>Bricks:
>Brick1: 10.0.0.1:/data/glusterfs/brick
>Brick2: 10.0.0.2:/data/glusterfs/brick
>Brick3: 10.0.0.3:/data/glusterfs/brick
>Options Reconfigured:
>changelog.capture-del-path: on
>changelog.changelog: on
>storage.build-pgfid: on
>nfs.disable: on
>transport.address-family: inet
>client.ssl: on
>server.ssl: on
>storage.linux-aio: on
>performance.io-thread-count: 64
>performance.readdir-ahead: on
>server.event-threads: 32
>client.event-threads: 32
>server.outstanding-rpc-limit: 64
>cluster.lookup-unhashed: auto
>performance.flush-behind: on
>performance.strict-write-ordering: off
>performance.high-prio-threads: 64
>performance.normal-prio-threads: 64
>performance.low-prio-threads: 64
>performance.write-behind-window-size: 10MB
>cluster.ensure-durability: on
>performance.lazy-open: yes
>cluster.use-compound-fops: off
>performance.open-behind: on
>features.cache-invalidation: off
>performance.quick-read: off
>performance.read-ahead: off
>performance.stat-prefetch: off
>changelog.rollover-time: 1
>cluster.self-heal-daemon: enable
> 
> Questions:
> 
> What could explain why bup's lstat()s are slow and rsync's lstat()s are
> fast?
> 
> Also, how come rsync's lstat()s can be faster than a network roundtrip
> at all, given that I have all caching disabled (e.g. stat-prefetch: off,
> cache-invalidation: off)?
> 
> Is there some caching going on? Might bup issue its syscalls in an order
> that might destroy this caching, while rsync and ls use a favourable order?

[Gluster-users] Confusing lstat() performance

2017-09-14 Thread Niklas Hambüchen
Hi,

I have a gluster 3.10 volume with a dir with ~1 million small files in
it, say mounted at /mnt/dir with FUSE, and I'm observing something weird:

When I list and stat them all using rsync, then the lstat() calls that
rsync does are incredibly fast (23 microseconds per call on average,
definitely faster than a network roundtrip between my 3-machine bricks
connected via Ethernet).
But when I try to back them up with the `bup` tool
(https://github.com/bup/bup), which (at least according to strace) does
the same syscalls as rsync to stat all files, it takes 1700 microseconds
per lstat() call, and the total time to go over all files is 50x higher.

These are my strace statistics:

rsync strace:

  $ strace -f -w -c rsync -a --dry-run /mnt/dir /tmp/nonexisting

  % time     seconds  usecs/call     calls    errors syscall
  ------ ----------- ----------- --------- --------- ----------------
   52.35   69.682773       26617      2618         1 select
   39.60   52.712292        1056     49907           getdents
    8.00   10.655949          11    998102           lstat
    0.02    0.022890          12      1900           read
    0.01    0.017219          21       829           write
    0.01    0.012093          14       868           munmap
    0.01    0.006656          11       606           mmap
    0.00    0.004168        1389         3           readlink
    0.00    0.001796        1796         1           chdir
    0.00    0.001019         510         2           clone
    0.00    0.000841          44        19        13 open

Took ~50 seconds real time to complete.

bup strace (I interrupted it after a while):

  strace: Process 10749 attached
  strace: Process 10750 attached
  Indexing: 25600 (566 paths/s)
  ^C
  % time     seconds  usecs/call     calls    errors syscall
  ------ ----------- ----------- --------- --------- ----------------
   89.55 1140.837325        1700    671016           lstat
    3.92   49.875376         934     53387           getdents
    3.58   45.655985       52238       874           read
    2.14   27.293944         789     34588      5799 open
    0.57    7.266342         141     51384           llistxattr
    0.09    1.090689          42     25692     25692 getxattr
    0.06    0.780977          26     29825      1019 close
    0.05    0.601806          23     25739     25722 ioctl
    0.03    0.373894        1851       202           select
    0.00    0.055953          14      3879           brk

  real20m52.150s
  user0m2.608s
  sys 0m11.456s

Note I passed `-c -w` to strace to measure the wall time spent in the
syscalls, not system CPU time.

Using

  time strace -f -c -w ls -lU /mnt/dir > /dev/null

shows that ls is as fast as rsync (also in 50 seconds).

(Aside: I've filed the large number of getdents() as
https://bugzilla.redhat.com/show_bug.cgi?id=1478411 but that's not the
core question of this email.)

My volume info:

# gluster vol info

Volume Name: myvol
Type: Replicate
Volume ID: ...
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.0.0.1:/data/glusterfs/brick
Brick2: 10.0.0.2:/data/glusterfs/brick
Brick3: 10.0.0.3:/data/glusterfs/brick
Options Reconfigured:
changelog.capture-del-path: on
changelog.changelog: on
storage.build-pgfid: on
nfs.disable: on
transport.address-family: inet
client.ssl: on
server.ssl: on
storage.linux-aio: on
performance.io-thread-count: 64
performance.readdir-ahead: on
server.event-threads: 32
client.event-threads: 32
server.outstanding-rpc-limit: 64
cluster.lookup-unhashed: auto
performance.flush-behind: on
performance.strict-write-ordering: off
performance.high-prio-threads: 64
performance.normal-prio-threads: 64
performance.low-prio-threads: 64
performance.write-behind-window-size: 10MB
cluster.ensure-durability: on
performance.lazy-open: yes
cluster.use-compound-fops: off
performance.open-behind: on
features.cache-invalidation: off
performance.quick-read: off
performance.read-ahead: off
performance.stat-prefetch: off
changelog.rollover-time: 1
cluster.self-heal-daemon: enable

Questions:

What could explain why bup's lstat()s are slow and rsync's lstat()s are
fast?

Also, how come rsync's lstat()s can be faster than a network roundtrip
at all, given that I have all caching disabled (e.g. stat-prefetch: off,
cache-invalidation: off)?

Is there some caching going on? Might bup issue its syscalls in an order
that might destroy this caching, while rsync and ls use a favourable order?

Thanks!
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users