ffs snapshots patch

2011-04-16 Thread Manuel Bouyer
Hello,
attached is a work in progress on ffs snapshots (as it's a work in
progress, some debug and instrumentation code is still present in the
patch; no need to comment on that part :).
The starting point of this work is that, while working on quota, I noticed
that taking a snapshot on a 500Gb filesystem takes several minutes, and is
O(n) in the number of persistent snapshots.
Here are some timings on an otherwise idle 500Gb filesystem (it's some brand
of SATA2 3.5" drive attached to an AHCI controller, so it's a reasonable test
bed for today):
java# /usr/bin/time fssconfig fss0 /home /home/snaps/snap0
  260.53 real 0.00 user 1.15 sys
/home: suspended 77.873 sec, redo 1184 of 2556
java# /usr/bin/time fssconfig fss1 /home /home/snaps/snap1
  377.87 real 0.00 user 2.53 sys
/home: suspended 206.078 sec, redo 1184 of 2556
java# /usr/bin/time fssconfig fss2 /home /home/snaps/snap2
  508.23 real 0.00 user 4.28 sys
/home: suspended 338.534 sec, redo 1184 of 2556
java# /usr/bin/time fssconfig fss3 /home /home/snaps/snap3
  621.40 real 0.00 user 5.50 sys
/home: suspended 431.154 sec, redo 1183 of 2556

Suspending a filesystem for more than 7 minutes to take a snapshot makes
persistent snapshots quite useless to me. I wonder how it would behave
on a multi-terabyte filesystem.

I looked at where the time is spent and found 2 major issues:
1 cgaccount() works in 2 passes: first it copies the cylinder groups (cgs)
  before suspending the filesystem; then it is called again to copy only
  the cgs that were modified between the copy and the filesystem suspend.
  The problem is that to copy a cg we need to allocate blocks for the snapshot
  file, which may be in a cg we just copied. This is the cause of the high
  number of cg copies (almost half of them) done with the filesystem suspended.

2 while the filesystem is suspended, we want to expunge the snapshot files
  from the snapshot view (make them appear as 0-length files).
  With ~500GB sparse files this is a lot of work.

I fixed 1) by preallocating the needed blocks in snapshot_setup().
Fixing 2) is trickier. To avoid the heavy writes to the snapshot file
with the fs suspended, the snapshot appears with its real length and
blocks at the time of creation, but is marked invalid (only the
inode block needs to be copied, and this can be done before suspending
the fs). Now BLK_SNAP should never be seen as a block number, and we skip
ffs_copyonwrite() if the write is to a snapshot inode.

With these changes the times are much more reasonable:
/usr/bin/time fssconfig fss0 /home /home/snaps/snap0
  299.68 real 0.00 user 1.10 sys
/home: suspended 0.310 sec, redo 0 of 2556
/usr/bin/time fssconfig fss1 /home /home/snaps/snap1
  188.10 real 0.00 user 0.86 sys
/home: suspended 0.270 sec, redo 0 of 2556
/usr/bin/time fssconfig fss2 /home /home/snaps/snap2
  169.78 real 0.00 user 0.95 sys
/home: suspended 0.450 sec, redo 0 of 2556
/usr/bin/time fssconfig fss3 /home /home/snaps/snap3
  172.39 real 0.00 user 0.99 sys
/home: suspended 0.300 sec, redo 0 of 2556

This seems to work; one issue with this patch is that the block
count for the snapshot inode, and the block summary information (the
second probably being a consequence of the first), appear wrong when
running fsck against a snapshot.  I believe this is fixable, but
I've not yet found where the information mismatch is coming from.

comments ?

PS: I'm away from computers for one week, so don't expect replies to
your comments before next Sunday.

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--
Index: ffs/ffs_snapshot.c
===
RCS file: /cvsroot/src/sys/ufs/ffs/ffs_snapshot.c,v
retrieving revision 1.111
diff -u -p -u -r1.111 ffs_snapshot.c
--- ffs/ffs_snapshot.c  6 Mar 2011 17:08:38 -   1.111
+++ ffs/ffs_snapshot.c  16 Apr 2011 19:07:31 -
@@ -109,6 +109,8 @@ static int snapacct(struct vnode *, void
 daddr_t, int);
 static int mapacct(struct vnode *, void *, int, int, struct fs *,
 daddr_t, int);
+static int snapcount(struct vnode *, void *, int, int, struct fs *,
+daddr_t, int);
 #endif /* !defined(FFS_NO_SNAPSHOT) */
 
 static int ffs_copyonwrite(void *, struct buf *, bool);
@@ -190,7 +192,7 @@ ffs_snapshot(struct mount *mp, struct vn
struct timespec ts;
struct timeval starttime;
 #ifdef DEBUG
-   struct timeval endtime;
+   struct timeval endtime, parttime;
 #endif
struct vnode *devvp = ip->i_devvp;
 
@@ -250,6 +252,8 @@ ffs_snapshot(struct mount *mp, struct vn
/*
 * All allocations are done, so we can now suspend the filesystem.
 */
+   printf("%s: before suspend size %qd %qd\n",
+   mp->mnt_stat.f_mntonname, (long long int)ip->i_size, (long long int)DIP(ip, size));
error = vfs_suspend(vp->v_mount, 0);
vn_lock(vp, LK_EXCLUSIV

Re: ffs snapshots patch

2011-04-18 Thread Juergen Hannken-Illjes
On Sat, Apr 16, 2011 at 09:29:26PM +0200, Manuel Bouyer wrote:
> Hello,
> attached is a work in progress on ffs snapshot (as it's work in progress,
> some debug and instrumentation code is still present in the
> patch, no need to comment on this part :).
> The start of this work is that when working on quota, I noticed that
> taking a snapshot on a 500Gb filesystem needs several minutes, and is
> O(n) with the number of persistent snapshots.
> Here's some timings on an otherwise idle 500Gb filesystem (it's some brand of
> SATA2 3.5" drive attached to a AHCI controller, so it's a reasonable test
> bed for today):
> java# /usr/bin/time fssconfig fss0 /home /home/snaps/snap0
>   260.53 real 0.00 user 1.15 sys
> /home: suspended 77.873 sec, redo 1184 of 2556
> java# /usr/bin/time fssconfig fss1 /home /home/snaps/snap1
>   377.87 real 0.00 user 2.53 sys
> /home: suspended 206.078 sec, redo 1184 of 2556
> java# /usr/bin/time fssconfig fss2 /home /home/snaps/snap2
>   508.23 real 0.00 user 4.28 sys
> /home: suspended 338.534 sec, redo 1184 of 2556
> java# /usr/bin/time fssconfig fss3 /home /home/snaps/snap3
>   621.40 real 0.00 user 5.50 sys
> /home: suspended 431.154 sec, redo 1183 of 2556
> 
> suspending a filesystem for more than 7 minutes to take a snapshot makes
> persistent snapshots quite useless to me. I wonder how it would behave
> on a multi-terabyte filesystem.
> 
> I looked at where the time is spend and found 2 major issues:
> 1 cgaccount() works in 2 pass: first it copies cg before suspending the
>   filesystem; then it is called again to copy only the cg that have been
>   modified between copy and filesystem suspend.
>   The problem is that to copy a cg we need to allocate blocks for the snapshot
>   file, which may be in a cg we just copied. This is the cause of the high
>   number of cg copies (almost half of them) with the filesystem suspended.
> 
> 2 while the filesystem is suspended, we want to expunge the snapshot files
>   from the snapshot view (make them appear as a 0-length file).
>   With ~500GB sparse files this is a lot of work.
> 
> I fixed 1) by preallocating needed blocks snapshot_setup(). 

Good catch.  Committed.

> Fixing 2) is trickier. To avoid the heavy writes to the snapshot file
> with the fs suspended, the snapshot appears with its real length and
> blocks at the time of creation, but is marked invalid (only the
> inode block needs to be copied, and this can be done before suspending
> the fs). Now BLK_SNAP should never be seen as a block number, and we skip
> ffs_copyonwrite() if the write is to a snapshot inode.

I strongly object here.  There are good reasons to expunge old snapshots.

Even if it were done right, without deadlocks and locking-against-self,
the resulting snapshot loses at least two properties:

- A snapshot is considered stable.  Whenever you read a block you get
  the same contents.  Allowing old snapshots to exist but not running
  copy-on-write means these blocks will change their contents.

- A snapshot will fsck clean.  It is impossible to change fsck_ffs
  to check a snapshot, as these old snapshots' indirect blocks will now
  contain garbage.

You cannot copy blocks before suspension without rewriting them once
the file system is suspended.

The check in ffs_copyonwrite() will only work as long as the old
snapshot exists.  As soon as it gets removed we will run COW
on the blocks used by the old snapshot.

> With these changes the times are much more reasonable:
> /usr/bin/time fssconfig fss0 /home /home/snaps/snap0
>   299.68 real 0.00 user 1.10 sys
> /home: suspended 0.310 sec, redo 0 of 2556
> /usr/bin/time fssconfig fss1 /home /home/snaps/snap1
>   188.10 real 0.00 user 0.86 sys
> /home: suspended 0.270 sec, redo 0 of 2556
> /usr/bin/time fssconfig fss2 /home /home/snaps/snap2
>   169.78 real 0.00 user 0.95 sys
> /home: suspended 0.450 sec, redo 0 of 2556
> /usr/bin/time fssconfig fss3 /home /home/snaps/snap3
>   172.39 real 0.00 user 0.99 sys
> /home: suspended 0.300 sec, redo 0 of 2556
> 
> This seems to work; one issue with this patch is that the block
> count for the snapshot inode, and block summary informations (the
> second being probably a consequence of the first) appear wrong when
> running fsck against a snapshot.  I believe this is fixable, but
> I've not yet found from where the information mismatch is coming from.
> 
> comments ?
> 
> PS: I'm away from computers for one week, so don't expect replies to
> your comments before next sunday.
> 
> -- 
> Manuel Bouyer 
>  NetBSD: 26 ans d'experience feront toujours la difference
> --

-- 
Juergen Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)


Re: ffs snapshots patch

2011-04-23 Thread Juergen Hannken-Illjes
On Sat, Apr 16, 2011 at 09:29:26PM +0200, Manuel Bouyer wrote:
> Hello,
> attached is a work in progress on ffs snapshot (as it's work in progress,
> some debug and instrumentation code is still present in the
> patch, no need to comment on this part :).
> The start of this work is that when working on quota, I noticed that
> taking a snapshot on a 500Gb filesystem needs several minutes, and is
> O(n) with the number of persistent snapshots.
> Here's some timings on an otherwise idle 500Gb filesystem (it's some brand of
> SATA2 3.5" drive attached to a AHCI controller, so it's a reasonable test
> bed for today):
> java# /usr/bin/time fssconfig fss0 /home /home/snaps/snap0
>   260.53 real 0.00 user 1.15 sys
> /home: suspended 77.873 sec, redo 1184 of 2556
> java# /usr/bin/time fssconfig fss1 /home /home/snaps/snap1
>   377.87 real 0.00 user 2.53 sys
> /home: suspended 206.078 sec, redo 1184 of 2556
> java# /usr/bin/time fssconfig fss2 /home /home/snaps/snap2
>   508.23 real 0.00 user 4.28 sys
> /home: suspended 338.534 sec, redo 1184 of 2556
> java# /usr/bin/time fssconfig fss3 /home /home/snaps/snap3
>   621.40 real 0.00 user 5.50 sys
> /home: suspended 431.154 sec, redo 1183 of 2556
> 
> suspending a filesystem for more than 7 minutes to take a snapshot makes
> persistent snapshots quite useless to me. I wonder how it would behave
> on a multi-terabyte filesystem.
[snip]

These times depend on the file system's block size.  With contiguous indirect
blocks (ffs_balloc.c rev 1.54) I did timings on a 1.4 TByte UFS1 non-logging
file system created on 3 concatenated WD5003ABYX drives.  For every block size
I created four persistent snapshots (unmounting the file system after
every creation) and got these times (seconds):

Layout              create   suspended

91441948 x 16384   385.713      22.785
91441948 x 16384   414.170      59.580
91441948 x 16384   474.164      91.385
91441948 x 16384   652.556     111.314

45720974 x 32768    43.478       0.420
45720974 x 32768    40.790       5.642
45720974 x 32768    49.700      12.748
45720974 x 32768    55.599      18.612

22860487 x 65536     7.005       0.600
22860487 x 65536    10.558       2.436
22860487 x 65536    14.365       4.122
22860487 x 65536    18.615       5.739

For me, snapshots are created reasonably fast with a block size of 32k or 64k.

-- 
Juergen Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)


Re: ffs snapshots patch

2011-04-27 Thread Manuel Bouyer
On Mon, Apr 18, 2011 at 09:36:25AM +0200, Juergen Hannken-Illjes wrote:
> [...]
> > Fixing 2) is trickier. To avoid the heavy writes to the snapshot file
> > with the fs suspended, the snapshot appears with its real length and
> > blocks at the time of creation, but is marked invalid (only the
> > inode block needs to be copied, and this can be done before suspending
> > the fs). Now BLK_SNAP should never be seen as a block number, and we skip
> > ffs_copyonwrite() if the write is to a snapshot inode.
> 
> I strongly object here.  There are good reasons to expunge old snapshots.
> 
> Even if it were done right, without deadlocks and locking-against-self,
> the resulting snapshot loses at least two properties:
> 
> - A snapshot is considered stable.  Whenever you read a block you get
>   the same contents.  Allowing old snapshots to exist but not running
>   copy-on-write means these blocks will change their contents.
> 
> - A snapshot will fsck clean.  It is impossible to change fsck_ffs
>   to check a snapshot as these old snapshots indirect blocks now will
>   contain garbage.

Maybe we should relax these constraints then.

> 
> You cannot copy blocks before suspension without rewriting them once
> the file system is suspended.
> 
> The check in ffs_copyonwrite() will only work as long as the old
> snapshot exists.  As soon as it gets removed we will run COW
> on the blocks used by the old snapshot.

Is it a problem?

On Sat, Apr 23, 2011 at 10:40:05AM +0200, Juergen Hannken-Illjes wrote:
> [...]
> 
> These times depend on the file systems block size.  With contiguous indirect
> blocks (ffs_balloc.c rev 1.54) I did timings on a 1.4 TByte UFS1 non-logging
> file system created on 3 concatenated WD5003ABYX.  For every block size
> I created four persistent snapshots (with unmounting the file systems after
> every creation) and get these times (seconds):
> 
> Layout              create   suspended
> 
> 91441948 x 16384   385.713      22.785
> 91441948 x 16384   414.170      59.580
> 91441948 x 16384   474.164      91.385
> 91441948 x 16384   652.556     111.314
> 
> 45720974 x 32768    43.478       0.420
> 45720974 x 32768    40.790       5.642
> 45720974 x 32768    49.700      12.748
> 45720974 x 32768    55.599      18.612
> 
> 22860487 x 65536     7.005       0.600
> 22860487 x 65536    10.558       2.436
> 22860487 x 65536    14.365       4.122
> 22860487 x 65536    18.615       5.739
> 
> For me snapshots create reasonable fast with a block size of 32k or 64k.

On my test system (16k/2k UFS2, logging, quotas) I get:
/usr/bin/time fssconfig fss0 /home /home/snaps/snap0
  141.69 real 0.00 user 1.22 sys
/home: suspended 14.716 sec, redo 0 of 2556
/usr/bin/time fssconfig fss1 /home /home/snaps/snap1
  213.87 real 0.00 user 1.98 sys
/home: suspended 64.027 sec, redo 0 of 2556
/usr/bin/time fssconfig fss2 /home /home/snaps/snap2
  290.82 real 0.00 user 3.06 sys
/home: suspended 120.641 sec, redo 0 of 2556
/usr/bin/time fssconfig fss3 /home /home/snaps/snap3
  342.11 real 0.00 user 3.92 sys
/home: suspended 170.733 sec, redo 0 of 2556

Even a 14s hang is still a long time for an NFS server (workstations will be
frozen by this time). Even if we can make it shorter with some filesystem
tuning, it still doesn't scale with the size of the filesystem and
the number of snapshots (having 12 persistent snapshots on a filesystem is
not an unreasonable number).
Other OSes can do it with almost no freeze, so it should be possible
(the snapshot may not be fsck-able, but I'm not sure that's the most
important property of FS snapshots).

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--


Re: ffs snapshots patch

2011-04-28 Thread Juergen Hannken-Illjes
On Wed, Apr 27, 2011 at 10:43:59AM +0200, Manuel Bouyer wrote:
> On Mon, Apr 18, 2011 at 09:36:25AM +0200, Juergen Hannken-Illjes wrote:
> > [...]
> > > Fixing 2) is trickier. To avoid the heavy writes to the snapshot file
> > > with the fs suspended, the snapshot appears with its real length and
> > > blocks at the time of creation, but is marked invalid (only the
> > > inode block needs to be copied, and this can be done before suspending
> > > the fs). Now BLK_SNAP should never be seen as a block number, and we skip
> > > ffs_copyonwrite() if the write is to a snapshot inode.
> > 
> > I strongly object here.  There are good reasons to expunge old snapshots.
> > 
> > Even if it were done right, without deadlocks and locking-against-self,
> > the resulting snapshot loses at least two properties:
> > 
> > - A snapshot is considered stable.  Whenever you read a block you get
> >   the same contents.  Allowing old snapshots to exist but not running
> >   copy-on-write means these blocks will change their contents.
> > 
> > - A snapshot will fsck clean.  It is impossible to change fsck_ffs
> >   to check a snapshot as these old snapshots indirect blocks now will
> >   contain garbage.
> 
> Maybe we should relax these constraints then

No.  We use snapshots (with -X) for fsck and dump.  This makes no sense
if we cannot fsck a snapshot any more.

> > You cannot copy blocks before suspension without rewriting them once
> > the file system is suspended.
> > 
> > The check in ffs_copyonwrite() will only work as long as the old
> > snapshot exists.  As soon as it gets removed we will run COW
> > on the blocks used by the old snapshot.
> 
> is it a problem ?
> 
> On Sat, Apr 23, 2011 at 10:40:05AM +0200, Juergen Hannken-Illjes wrote:
> > [...]
> > 
> > These times depend on the file systems block size.  With contiguous indirect
> > blocks (ffs_balloc.c rev 1.54) I did timings on a 1.4 TByte UFS1 non-logging
> > file system created on 3 concatenated WD5003ABYX.  For every block size
> > I created four persistent snapshots (with unmounting the file systems after
> > every creation) and get these times (seconds):
> > 
> > Layout              create   suspended
> > 
> > 91441948 x 16384   385.713      22.785
> > 91441948 x 16384   414.170      59.580
> > 91441948 x 16384   474.164      91.385
> > 91441948 x 16384   652.556     111.314
> > 
> > 45720974 x 32768    43.478       0.420
> > 45720974 x 32768    40.790       5.642
> > 45720974 x 32768    49.700      12.748
> > 45720974 x 32768    55.599      18.612
> > 
> > 22860487 x 65536     7.005       0.600
> > 22860487 x 65536    10.558       2.436
> > 22860487 x 65536    14.365       4.122
> > 22860487 x 65536    18.615       5.739
> > 
> > For me snapshots create reasonable fast with a block size of 32k or 64k.
> 
> On my test system (16k/2k UFS2, logging, quotas) I get:
> /usr/bin/time fssconfig fss0 /home /home/snaps/snap0
>   141.69 real 0.00 user 1.22 sys
> /home: suspended 14.716 sec, redo 0 of 2556
> /usr/bin/time fssconfig fss1 /home /home/snaps/snap1
>   213.87 real 0.00 user 1.98 sys
> /home: suspended 64.027 sec, redo 0 of 2556
> /usr/bin/time fssconfig fss2 /home /home/snaps/snap2
>   290.82 real 0.00 user 3.06 sys
> /home: suspended 120.641 sec, redo 0 of 2556
> /usr/bin/time fssconfig fss3 /home /home/snaps/snap3
>   342.11 real 0.00 user 3.92 sys
> /home: suspended 170.733 sec, redo 0 of 2556
> 
> Even a 14s hang is still a long time for a NFS server (workstations will be
> frozen by this time). Even if we can make it shorter with some filesystem
> tuning, it still doesn't scale with the size of the filesystem and
> the number of snapshot (having 12 persistent snapshots on a filesystem is
> not a unreasonable number).
> Other OSes can do it with almost no freeze, so it should be possible
> (the snapshot may not be fsck-able, but I'm not sure it's the most
> important property of FS snapshots).

The only other OS with ffs+snapshots is FreeBSD, which should behave similarly.
Other file systems like ZFS, NILFS, etc. will be faster and scale better, as
they were designed with instant snapshots in mind.

-- 
Juergen Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)


Re: ffs snapshots patch

2011-04-28 Thread Manuel Bouyer
On Thu, Apr 28, 2011 at 11:48:55AM +0200, Juergen Hannken-Illjes wrote:
> On Wed, Apr 27, 2011 at 10:43:59AM +0200, Manuel Bouyer wrote:
> > On Mon, Apr 18, 2011 at 09:36:25AM +0200, Juergen Hannken-Illjes wrote:
> > > [...]
> > > > Fixing 2) is trickier. To avoid the heavy writes to the snapshot file
> > > > with the fs suspended, the snapshot appears with its real length and
> > > > blocks at the time of creation, but is marked invalid (only the
> > > > inode block needs to be copied, and this can be done before suspending
> > > > the fs). Now BLK_SNAP should never be seen as a block number, and we 
> > > > skip
> > > > ffs_copyonwrite() if the write is to a snapshot inode.
> > > 
> > > I strongly object here.  There are good reasons to expunge old snapshots.
> > > 
> > > Even if it were done right, without deadlocks and locking-against-self,
> > > the resulting snapshot loses at least two properties:
> > > 
> > > - A snapshot is considered stable.  Whenever you read a block you get
> > >   the same contents.  Allowing old snapshots to exist but not running
> > >   copy-on-write means these blocks will change their contents.
> > > 
> > > - A snapshot will fsck clean.  It is impossible to change fsck_ffs
> > >   to check a snapshot as these old snapshots indirect blocks now will
> > >   contain garbage.
> > 
> > Maybe we should relax these constraints then
> 
> No.  We use snapshots (with -X) for fsck and dump.  This makes no sense
> if we cannot fsck a snapshot any more.

AFAIK dump will ignore snapshot files (or at least it should), so it's not a
problem if the snapshot's blocks change while we're working on a snapshot.
Also AFAIK, the above issue will only cause fsck to report missing blocks
in group maps and summary information. That's not a big deal either.

In their current form, snapshots are not usable even for this, because
it's not acceptable to suspend a file server for several tens of
seconds (if not minutes) to start a dump or fsck.

> > /home: suspended 170.733 sec, redo 0 of 2556
> > 
> > Even a 14s hang is still a long time for a NFS server (workstations will be
> > frozen by this time). Even if we can make it shorter with some filesystem
> > tuning, it still doesn't scale with the size of the filesystem and
> > the number of snapshot (having 12 persistent snapshots on a filesystem is
> > not a unreasonable number).
> > Other OSes can do it with almost no freeze, so it should be possible
> > (the snapshot may not be fsck-able, but I'm not sure it's the most
> > important property of FS snapshots).
> 
> The only other OS with ffs+snapshots is FreeBSD, which should behave similarly.
> Other file systems like ZFS, NilFS etc. will be faster and scale better as
> they are designed with instant snapshots in mind.

What about ext3fs?

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--


Re: ffs snapshots patch

2011-04-29 Thread Manuel Bouyer
With your last changes, things are much better now:
/usr/bin/time fssconfig fss0 /home /home/snaps/snap0
  149.85 real 0.00 user 1.16 sys
/home: suspended 0.040 sec, redo 0 of 2556
/usr/bin/time fssconfig fss1 /home /home/snaps/snap1
  227.49 real 0.00 user 1.90 sys
/home: suspended 0.040 sec, redo 0 of 2556
/usr/bin/time fssconfig fss2 /home /home/snaps/snap2
  263.58 real 0.00 user 2.97 sys
/home: suspended 0.040 sec, redo 0 of 2556
/usr/bin/time fssconfig fss3 /home /home/snaps/snap3
  353.23 real 0.00 user 3.88 sys
/home: suspended 0.040 sec, redo 0 of 2556

Taking a snapshot will still probably require a lot of time on
large filesystems with a dozen snapshots, but at least the server
won't hang for a long time.
Thanks!

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--


Re: ffs snapshots patch

2011-04-29 Thread Juergen Hannken-Illjes
On Fri, Apr 29, 2011 at 01:48:39PM +0200, Manuel Bouyer wrote:
> With your last changes, things are much better now:
> /usr/bin/time fssconfig fss0 /home /home/snaps/snap0
>   149.85 real 0.00 user 1.16 sys
> /home: suspended 0.040 sec, redo 0 of 2556
> /usr/bin/time fssconfig fss1 /home /home/snaps/snap1
>   227.49 real 0.00 user 1.90 sys
> /home: suspended 0.040 sec, redo 0 of 2556
> /usr/bin/time fssconfig fss2 /home /home/snaps/snap2
>   263.58 real 0.00 user 2.97 sys
> /home: suspended 0.040 sec, redo 0 of 2556
> /usr/bin/time fssconfig fss3 /home /home/snaps/snap3
>   353.23 real 0.00 user 3.88 sys
> /home: suspended 0.040 sec, redo 0 of 2556
> 
> Taking a snapshot will still probably require a lot of time on
> large filesystems with a dozen snapshots, but at least the server
> won't hang for a long time.
> thanks !

Not really.  Any thread ending up in ffs_copyonwrite() or ffs_snapblkfree()
will block.  If this server runs NFS, it is possible that all NFS
server threads block.

-- 
Juergen Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)


Re: ffs snapshots patch

2011-04-29 Thread Ignatios Souvatzis
On Fri, Apr 29, 2011 at 01:56:01PM +0200, Juergen Hannken-Illjes wrote:
> On Fri, Apr 29, 2011 at 01:48:39PM +0200, Manuel Bouyer wrote:
> > With your last changes, things are much better now:
> > /usr/bin/time fssconfig fss0 /home /home/snaps/snap0
> >   149.85 real 0.00 user 1.16 sys
> > /home: suspended 0.040 sec, redo 0 of 2556
> > /usr/bin/time fssconfig fss1 /home /home/snaps/snap1
> >   227.49 real 0.00 user 1.90 sys
> > /home: suspended 0.040 sec, redo 0 of 2556
> > /usr/bin/time fssconfig fss2 /home /home/snaps/snap2
> >   263.58 real 0.00 user 2.97 sys
> > /home: suspended 0.040 sec, redo 0 of 2556
> > /usr/bin/time fssconfig fss3 /home /home/snaps/snap3
> >   353.23 real 0.00 user 3.88 sys
> > /home: suspended 0.040 sec, redo 0 of 2556
> > 
> > Taking a snapshot will still probably require a lot of time on
> > large filesystems with a dozen snapshots, but at least the server
> > won't hang for a long time.
> > thanks !
> 
> Not really.  Any thread ending up in ffs_copyonwrite() or ffs_snapblkfree()
> will block.  If this server runs NFS it could be possible that all NFS
> server threads block.

Oh - I might have seen this on Monday - 5.99.47 on sparc64. All I saw
was [tstile], and the quickest way out after a couple of minutes was to
hard reboot the machine and let wapbl / fsck sort it out - and to move
back to the pre-snapshot rsync script.

Sorry, no core dump.

Regards,
-is


Re: ffs snapshots patch

2011-04-29 Thread Manuel Bouyer
On Fri, Apr 29, 2011 at 01:56:01PM +0200, Juergen Hannken-Illjes wrote:
> On Fri, Apr 29, 2011 at 01:48:39PM +0200, Manuel Bouyer wrote:
> > With your last changes, things are much better now:
> > /usr/bin/time fssconfig fss0 /home /home/snaps/snap0
> >   149.85 real 0.00 user 1.16 sys
> > /home: suspended 0.040 sec, redo 0 of 2556
> > /usr/bin/time fssconfig fss1 /home /home/snaps/snap1
> >   227.49 real 0.00 user 1.90 sys
> > /home: suspended 0.040 sec, redo 0 of 2556
> > /usr/bin/time fssconfig fss2 /home /home/snaps/snap2
> >   263.58 real 0.00 user 2.97 sys
> > /home: suspended 0.040 sec, redo 0 of 2556
> > /usr/bin/time fssconfig fss3 /home /home/snaps/snap3
> >   353.23 real 0.00 user 3.88 sys
> > /home: suspended 0.040 sec, redo 0 of 2556
> > 
> > Taking a snapshot will still probably require a lot of time on
> > large filesystems with a dozen snapshots, but at least the server
> > won't hang for a long time.
> > thanks !
> 
> Not really.  Any thread ending up in ffs_copyonwrite() or ffs_snapblkfree()
> will block.  If this server runs NFS it could be possible that all NFS
> server threads block.

Hum, this is bad then, because an NFS server with home directories sees
mostly writes ...

-- 
Manuel Bouyer 
 NetBSD: 26 ans d'experience feront toujours la difference
--