On Wed, Apr 27, 2011 at 10:43:59AM +0200, Manuel Bouyer wrote:
> On Mon, Apr 18, 2011 at 09:36:25AM +0200, Juergen Hannken-Illjes wrote:
> > [...]
> > > Fixing 2) is trickier. To avoid the heavy writes to the snapshot file
> > > with the fs suspended, the snapshot appears with its real length and
> > > blocks at the time of creation, but is marked invalid (only the
> > > inode block needs to be copied, and this can be done before suspending
> > > the fs). Now BLK_SNAP should never be seen as a block number, and we skip
> > > ffs_copyonwrite() if the write is to a snapshot inode.
> >
> > I strongly object here. There are good reasons to expunge old snapshots.
> >
> > Even if it were done right, without deadlocks and locking-against-self,
> > the resulting snapshot loses at least two properties:
> >
> > - A snapshot is considered stable. Whenever you read a block you get
> >   the same contents. Allowing old snapshots to exist but not running
> >   copy-on-write means these blocks will change their contents.
> >
> > - A snapshot will fsck clean. It is impossible to change fsck_ffs
> >   to check a snapshot as these old snapshots' indirect blocks now will
> >   contain garbage.
>
> Maybe we should relax these constraints then

No. We use snapshots (with -X) for fsck and dump. This makes no sense
if we cannot fsck a snapshot any more.

> > You cannot copy blocks before suspension without rewriting them once
> > the file system is suspended.
> >
> > The check in ffs_copyonwrite() will only work as long as the old
> > snapshot exists. As soon as it gets removed we will run COW
> > on the blocks used by the old snapshot.
>
> Is it a problem?
>
> On Sat, Apr 23, 2011 at 10:40:05AM +0200, Juergen Hannken-Illjes wrote:
> > [...]
> >
> > These times depend on the file system's block size. With contiguous indirect
> > blocks (ffs_balloc.c rev 1.54) I did timings on a 1.4 TByte UFS1 non-logging
> > file system created on 3 concatenated WD5003ABYX. For every block size
> > I created four persistent snapshots (unmounting the file system after
> > every creation) and got these times (seconds):
> >
> >   Layout               create   suspended
> >
> >   91441948 x 16384    385.713      22.785
> >   91441948 x 16384    414.170      59.580
> >   91441948 x 16384    474.164      91.385
> >   91441948 x 16384    652.556     111.314
> >
> >   45720974 x 32768     43.478       0.420
> >   45720974 x 32768     40.790       5.642
> >   45720974 x 32768     49.700      12.748
> >   45720974 x 32768     55.599      18.612
> >
> >   22860487 x 65536      7.005       0.600
> >   22860487 x 65536     10.558       2.436
> >   22860487 x 65536     14.365       4.122
> >   22860487 x 65536     18.615       5.739
> >
> > For me snapshots create reasonably fast with a block size of 32k or 64k.
> >
> On my test system (16k/2k UFS2, logging, quotas) I get:
> /usr/bin/time fssconfig fss0 /home /home/snaps/snap0
>       141.69 real         0.00 user         1.22 sys
> /home: suspended 14.716 sec, redo 0 of 2556
> /usr/bin/time fssconfig fss1 /home /home/snaps/snap1
>       213.87 real         0.00 user         1.98 sys
> /home: suspended 64.027 sec, redo 0 of 2556
> /usr/bin/time fssconfig fss2 /home /home/snaps/snap2
>       290.82 real         0.00 user         3.06 sys
> /home: suspended 120.641 sec, redo 0 of 2556
> /usr/bin/time fssconfig fss3 /home /home/snaps/snap3
>       342.11 real         0.00 user         3.92 sys
> /home: suspended 170.733 sec, redo 0 of 2556
>
> Even a 14s hang is still a long time for an NFS server (workstations will be
> frozen for this time). Even if we can make it shorter with some filesystem
> tuning, it still doesn't scale with the size of the filesystem and
> the number of snapshots (having 12 persistent snapshots on a filesystem is
> not an unreasonable number).
> Other OSes can do it with almost no freeze, so it should be possible
> (the snapshot may not be fsck-able, but I'm not sure it's the most
> important property of FS snapshots).

The only other OS with ffs+snapshots is FreeBSD, which should behave
similarly. Other file systems like ZFS, NilFS etc. will be faster and
scale better as they are designed with instant snapshots in mind.

-- 
Juergen Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)
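
To illustrate the stability argument made above, here is a minimal toy model in Python. It is only a sketch under loud assumptions: it is not FFS code, and the class ToyFS, its methods and the cow flag are hypothetical names made up for this illustration. It shows that a block overwritten without copy-on-write no longer reads back from the snapshot with the contents it had at creation time.

```python
class ToyFS:
    """Toy block store with copy-on-write snapshots (hypothetical, not FFS)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []  # one dict per snapshot: block number -> saved data

    def snapshot(self):
        # A new snapshot starts empty; blocks are saved lazily on first write.
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, blkno, data, cow=True):
        if cow:
            for saved in self.snapshots:
                # Preserve the old contents once, before the first overwrite.
                saved.setdefault(blkno, self.blocks[blkno])
        self.blocks[blkno] = data

    def read_snapshot(self, snap, blkno):
        saved = self.snapshots[snap]
        # Blocks that were never copied are read through from the live data.
        return saved.get(blkno, self.blocks[blkno])


fs = ToyFS(["a0", "b0", "c0"])
snap = fs.snapshot()

fs.write(0, "a1")             # normal write: old contents are copied first
fs.write(1, "b1", cow=False)  # write that bypasses copy-on-write

stable = fs.read_snapshot(snap, 0)    # contents as of snapshot creation
unstable = fs.read_snapshot(snap, 1)  # follows the live block instead
print(stable, unstable)
```

In this model block 0 keeps its snapshot-time contents, while block 1 changes underneath the snapshot, which is the loss of stability objected to above.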