Hi,
Interesting. If you're not using incremental recursion (the default in
rsync >= 3.0.0), I can see that the du would help by forcing the
destination I/O to overlap the file-list building in time. But with
incremental recursion, the du shouldn't be necessary because rsync
actually overlaps
No, not if the file cache isn't large enough for the number of files.
E.g. if you have 20 million files and only 256MB RAM, it's likely a bad
idea.
Splitting down to the subsub (2-levels down) directory level allows a single
subsub rsync to fit for me. Warming the cache is beneficial here, I
Hi,
In my situation I'm using rsync to back up a server with (currently) about
570,000 files.
These are all small files, and maybe 0.1% of them change or are newly added in
any 15-minute period.
I've split the main tree up so rsync can run on sub sub directories of the main
tree.
It does
Hi,
In order to expeditiously move these new files offsite, we use a modified
version of pyinotify to log all added/altered files across the entire
filesystem(s) and then every five minutes feed the list to rsync with the
--files-from option. This works very effectively and quickly.
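A minimal sketch of the "feed the list to rsync" step, assuming the changed
paths have already been collected by some watcher. The helper names
(build_files_from, rsync_command, sync_changed), the -a flag and the example
paths are illustrative assumptions, not the poster's actual script; only
--files-from is taken from the description above.

```python
import subprocess
import tempfile
from pathlib import Path


def build_files_from(changed_paths, root):
    """Return sorted, de-duplicated paths relative to root.

    Paths outside root are dropped, since --files-from entries are
    interpreted relative to the source directory given to rsync.
    """
    root = Path(root)
    rel = set()
    for p in changed_paths:
        try:
            rel.add(Path(p).relative_to(root).as_posix())
        except ValueError:
            continue  # not under root, skip it
    return sorted(rel)


def rsync_command(list_file, root, dest):
    """Build the rsync argument list (the -a flag is an assumption)."""
    return ["rsync", "-a", f"--files-from={list_file}", f"{root}/", dest]


def sync_changed(changed_paths, root, dest):
    """Write the list to a temporary file and run rsync on it once."""
    rel = build_files_from(changed_paths, root)
    if not rel:
        return  # nothing changed in this interval
    with tempfile.NamedTemporaryFile("w", suffix=".lst") as f:
        f.write("\n".join(rel) + "\n")
        f.flush()
        subprocess.run(rsync_command(f.name, root, dest), check=True)
```

Something like sync_changed(paths, "/srv/data", "backup:/srv/data/") would then
be called every five minutes with whatever the watcher logged since the last
run.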
Hi,
I know certain subtrees I want to back up are written once
and never deleted.
So to reduce the time it takes rsync to run, I was thinking
of putting the following .rsync-filter in each of these subtrees:
P /**
I can see this stops the files on the receiver side from being
deleted.
Does
a subtree entirely? I was thinking I could add this (whatever it is)
dynamically after the subtree had been written.
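For the "add it dynamically" part, a small sketch of dropping the filter file
into a finished subtree. The function name protect_subtree is hypothetical;
the rule text is the one quoted above. Note that rsync only reads per-directory
.rsync-filter files when asked to (for example with -F), and this sketch does
not settle whether the rule makes rsync skip the subtree.

```python
from pathlib import Path

PROTECT_RULE = "P /**\n"  # the protect rule quoted in the message


def protect_subtree(subtree):
    """Write a .rsync-filter containing the protect rule into subtree.

    Returns the path of the filter file. An existing .rsync-filter in
    that directory is overwritten.
    """
    filter_file = Path(subtree) / ".rsync-filter"
    filter_file.write_text(PROTECT_RULE)
    return filter_file
```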
Thanks,
Mike
- Original Message -
From: Mike Connell
To: rsync@lists.samba.org
Sent: Monday, October 05, 2009 11:08 PM
Subject: can a .rsync-filter improve
Hi,
Here is an update. I haven't deployed a new version of rsync into
production.
Instead I split my current rsync up into 10 independent sub directories of
the main directory. I run them serially one after the other.
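A rough sketch of that kind of split, assuming one rsync run per directory a
fixed number of levels below the main directory. The names subdir_jobs and
run_serially, the -a flag and the depth parameter are illustrative
assumptions. Files that sit above the chosen level are not covered, and for
depth > 1 the intermediate destination directories must already exist.

```python
import subprocess
from pathlib import Path


def subdir_jobs(src_root, dest_root, depth=1):
    """Yield (source, destination) pairs, one per directory that lies
    exactly `depth` levels below src_root, in sorted order."""
    src_root = Path(src_root)
    pattern = "/".join(["*"] * depth)
    for d in sorted(src_root.glob(pattern)):
        if d.is_dir():
            rel = d.relative_to(src_root).as_posix()
            yield f"{d}/", f"{dest_root.rstrip('/')}/{rel}/"


def run_serially(src_root, dest_root, depth=1):
    """Run one rsync per sub directory, one after the other."""
    for src, dst in subdir_jobs(src_root, dest_root, depth):
        subprocess.run(["rsync", "-a", src, dst], check=True)
```

With depth=2 the same helper produces one job per sub sub directory.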
I'm up to 404,000 files and the total sync time doesn't seem to be falling
Hi,
I've got identical servers. One is the primary; the other is a backup
receiving rsyncs from the primary. I'm backing up a file system to
disk and the files are small and there are lots of directories.
The overall problem seems to be the total number of files.
When I had ~375,000 files, the total