I've done some work on such things. The difficulty in design is
figuring out how often to do the send. You will want to balance your
send time interval with the write rate such that the send data is
likely to be in the ARC. There is no magic formula, but empirically
you can discover a reasonable interval.
Currently I replicate snapshots daily; the idea that I might be better
off doing snapshots and replication hourly, or even more frequently,
never occurred to me. I'll have to try it. Surprisingly, replicating
the entire data set (currently 13 TB) actually performs better than an
incremental send, from a raw throughput point of view.
P.S. if you have atime enabled, which is the default, handling
billions of files will be quite a challenge.
Indeed, that was one of the very first things I tweaked and disabled.
I don't know how bad it would have been with atime enabled, but I
wasn't about to find out.
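For reference, disabling atime is a one-line property change (the pool
name here is a placeholder):

```shell
# Stop updating access times on every read; this avoids a metadata
# write per file access, which matters with very large file counts.
zfs set atime=off tank

# Verify the setting took effect.
zfs get atime tank
```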
Thanks
On 20-Nov-09, at 11:48 AM, Richard Elling wrote:
On Nov 20, 2009, at 11:27 AM, Adam Serediuk wrote:
I have several X4540 Thor systems with one large zpool that
replicate data to a backup host via zfs send/recv. The process
works quite well when there is little to no usage on the source
systems. However, when the source systems are under load, replication
slows to a near crawl. Without load, replication usually streams along
at near 1 Gbps, but it drops to anywhere between 0 and 5000 Kbps under
load.
This makes it difficult to keep snapshot replication working
effectively. It seems that the zfs send operation is low priority,
only proceeding after other I/O operations have completed.
Is there a way that I can increase the send priority to speed up
replication?
No, unless you compile the code yourself.
Both the source and destination systems are configured with one large
zpool comprised of 8 raidz sets. While under load the source system
does ~500 - 950 IOPS (from zpool iostat) with no apparent hot spots.
It seems to me that the system should be able to perform much faster.
Unfortunately the data on these systems is in the form of hundreds of
millions (maybe even a billion by now) of very small files; could this
be a factor even with the block-level replication occurring?
The process is currently:
zfs send -> mbuffer -> LAN -> mbuffer -> zfs recv
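For readers unfamiliar with that pipeline, it might look roughly like
the following (host, pool, port, and buffer sizes are illustrative,
not taken from this thread):

```shell
# Receiver (start first): listen on TCP port 9090 with a 1 GB buffer,
# feeding the incoming stream into zfs recv.
mbuffer -I 9090 -s 128k -m 1G | zfs recv -F backuppool/data

# Sender: buffer the send stream locally, then push it over the LAN
# to the receiver's mbuffer.
zfs send tank/data@snap | mbuffer -s 128k -m 1G -O backuphost:9090
```

The buffers on both ends smooth out the bursty producer/consumer rates
of zfs send and zfs recv, which is the usual reason mbuffer is placed
in the middle.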
I've done some work on such things. The difficulty in design is
figuring out how often to do the send. You will want to balance your
send time interval with the write rate such that the send data is
likely to be in the ARC. There is no magic formula, but empirically
you can discover a reasonable interval.
There is a lurking RFE here somewhere: it would be nice to
automatically snapshot when some threshold of writes has occurred.
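Absent such an RFE, one could approximate it in userland by sampling
the pool's write bandwidth and snapshotting once enough data has been
written. A rough sketch (the pool name and threshold are illustrative,
and it assumes a zpool iostat that supports -p for parsable byte
counts, as in newer OpenZFS):

```shell
#!/bin/sh
# Hypothetical threshold-triggered snapshot loop: take a snapshot
# after roughly THRESHOLD bytes have been written to the pool.
POOL=tank
THRESHOLD=$((10 * 1024 * 1024 * 1024))   # 10 GB, arbitrary choice
ACCUM=0

while true; do
    # `zpool iostat -p POOL 60 2` prints a cumulative sample, then a
    # 60-second interval sample; the last line's 7th field is write
    # bandwidth in bytes/s.
    WBPS=$(zpool iostat -p "$POOL" 60 2 | awk 'END { print $7 }')
    ACCUM=$((ACCUM + WBPS * 60))
    if [ "$ACCUM" -ge "$THRESHOLD" ]; then
        zfs snapshot "$POOL@auto-$(date +%Y%m%d%H%M%S)"
        ACCUM=0
    fi
done
```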
P.S. if you have atime enabled, which is the default, handling
billions of files will be quite a challenge.
-- richard
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss