Re: rsync rewrites all blocks of large files although it uses delta transfer
On Thu 14 Feb 2019, Delian Krustev via rsync wrote:

> On Wednesday, February 13, 2019 6:25:59 PM EET Remi Gauvin wrote:
> > If the --inplace delta is as large as the filesize, then the
> > structure/location of the data has changed enough that the whole file
> > would have to be written out in any case.
>
> This is not the case.
> If you see my original post you would have noticed that the delta transfer
> finds only about 20 MB of differences within the almost 2G datafile.

I think you're missing the point of Remi's message.

Say the original file is:

ABCDEFGHIJ

The new file is:

XABCDEFGHI

Then the delta is just 10%, but the entire file needs to be rewritten
as the structure is changed.

Paul

--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
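Paul's ABCDEFGHIJ example can be checked with a few lines of Python. This is only an illustrative sketch, not how rsync is implemented: `matched_bytes` and `blocks_to_rewrite` are hypothetical helpers, the block size is one byte to mirror the letters, and `difflib` stands in for rsync's rolling-checksum matching.

```python
from difflib import SequenceMatcher


def matched_bytes(old: bytes, new: bytes) -> int:
    """Bytes covered by difflib's matching blocks (rough stand-in for rsync's matched data)."""
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    return sum(block.size for block in matcher.get_matching_blocks())


def blocks_to_rewrite(old: bytes, new: bytes, block_size: int = 1) -> int:
    """Count blocks of `new` that differ from the block at the same offset in `old`."""
    count = 0
    for offset in range(0, len(new), block_size):
        if new[offset:offset + block_size] != old[offset:offset + block_size]:
            count += 1
    return count


old = b"ABCDEFGHIJ"
shifted = b"XABCDEFGHI"  # one byte inserted at the front, everything else moves
patched = b"ABCDEXGHIJ"  # one byte changed in place, nothing moves

for name, new in (("shifted", shifted), ("patched", patched)):
    print(name, "matched:", matched_bytes(old, new),
          "blocks to rewrite in place:", blocks_to_rewrite(old, new))
```

Both variants share 9 of 10 bytes with the original, so the delta is equally small, but only the in-place patch leaves the other nine positions untouched. In the shifted file every position holds different data than before.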
Re: rsync rewrites all blocks of large files although it uses delta transfer
On Wednesday, February 13, 2019 6:25:59 PM EET Remi Gauvin wrote:
> If the --inplace delta is as large as the filesize, then the
> structure/location of the data has changed enough that the whole file
> would have to be written out in any case.

This is not the case.
If you see my original post you would have noticed that the delta transfer
finds only about 20 MB of differences within the almost 2G datafile.

The problem with --inplace without --backup-dir is that delta transfers can no
longer work efficiently.

Cheers
--
Delian
Re: rsync rewrites all blocks of large files although it uses delta transfer
On Wednesday, February 13, 2019 6:20:13 PM EET Remi Gauvin via rsync wrote:
> Have you run the nilfs-clean before checking this free space comparison?
> Maybe there is just large amplification created by rsync's many small
> writes when using --inplace.

nilfs-clean is suspended for the duration of the backup. It would have idled
if the fullness threshold of the FS (90% by default) had not been reached.

The problem is probably that these mysqldump files have changed data near the
beginning of the files. Thus any later blocks have to be overwritten.

In order to avoid this "rsync" would have to allocate and deallocate space in
the middle of the file:

http://man7.org/linux/man-pages/man2/fallocate.2.html

and unfortunately the respective syscalls are not portable, quite new and
filesystem specific.

Would have been nice to have these for all OSes and filesystems though.
And better yet not aligned on FS block size. E.g.:

- give me 5 new blocks in the middle of file F starting at POS
- do not use the entire last block of these 5 but rather only X bytes of it.

or

- replace block 5 with "this" partial block data
- truncate blocks 6 to 20

I can find a usage for them in many application workflows - from text editors
through databases to backup software ..

Cheers
--
Delian
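The point that a change near the beginning forces all later blocks to be overwritten can be put into numbers with a small model. This is a sketch under simplifying assumptions: `blocks_rewritten` is a hypothetical helper, the 4096-byte block size is assumed, and the model treats every block from the first shifted byte to the end of the file as rewritten, which is what happens when an insertion is not block-aligned.

```python
import math


def blocks_rewritten(file_size: int, change_offset: int, block_size: int = 4096):
    """Return (blocks rewritten, total blocks) when data shifts from change_offset onward."""
    total_blocks = math.ceil(file_size / block_size)
    first_shifted_block = change_offset // block_size
    return total_blocks - first_shifted_block, total_blocks


file_size = 100 * 4096  # a hypothetical 100-block file

for label, offset in (("near the start", 4096), ("in the middle", 50 * 4096),
                      ("near the end", 99 * 4096)):
    rewritten, total = blocks_rewritten(file_size, offset)
    print(f"insertion {label}: {rewritten} of {total} blocks rewritten")
```

In this model, the earlier a dump file gains or loses a row, the closer the in-place update gets to rewriting the whole file, no matter how little data the delta transfer actually sends.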
Re: rsync rewrites all blocks of large files although it uses delta transfer
On 2019-02-13 10:47 a.m., Delian Krustev via rsync wrote:

> Free space at the beginning and end of the backup:
> Filesystem      1M-blocks  Used Available Use% Mounted on
> /dev/mapper/bkp    102392 76872     20400  80% /mnt/bkp
> /dev/mapper/bkp    102392 78768     18504  81% /mnt/bkp
>
> As can be seen "rsync" has sent about 20M and received 300K of data. However
> the filesystem has allocated almost 2G, which is the total size of the files
> being backed up.
>
> The filesystem mounted on "/mnt/bkp" is of type "nilfs2", which is a log
> structured filesystem. I'm using its snapshotting feature to keep backups for
> past dates.

Have you run the nilfs-clean before checking this free space comparison?
Maybe there is just large amplification created by rsync's many small
writes when using --inplace.
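The size of the amplification in question follows directly from the two `df` lines quoted above. A minimal sketch of the arithmetic: the figures are copied from the quoted output, and the 20 MiB value is the rounded "about 20M" from the message, not an exact byte count.

```python
# Values from the quoted df output, in 1 MiB blocks.
used_before_mib = 76_872
used_after_mib = 78_768
avail_before_mib = 20_400
avail_after_mib = 18_504

sent_mib = 20  # "about 20M" sent by rsync, rounded as in the message

growth_mib = used_after_mib - used_before_mib
avail_drop_mib = avail_before_mib - avail_after_mib
amplification = growth_mib / sent_mib

print(f"used space grew by {growth_mib} MiB ({growth_mib / 1024:.2f} GiB)")
print(f"available space dropped by {avail_drop_mib} MiB")
print(f"roughly {amplification:.0f}x the data rsync sent")
```

The used and available columns move by the same amount, so the growth is real allocation on the log-structured filesystem rather than a reporting quirk of one column.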
Re: rsync rewrites all blocks of large files although it uses delta transfer
On 2019-02-13 5:26 p.m., Delian Krustev via rsync wrote:

> The copy is needed for the comparison of the blocks as "--inplace" overwrites
> the destination file. I've tried without "--backup" but then the delta
> transfers too much data - close to the size of the backed-up files.

It's cool that --backup can be used as source data that way, a feature I was
unaware of.. but I think you found the cause of your problem right here as
well. If the --inplace delta is as large as the filesize, then the
structure/location of the data has changed enough that the whole file
would have to be written out in any case.
Re: rsync rewrites all blocks of large files although it uses delta transfer
It can't do what you want. The closest thing would be --compare-dest.

On 2/13/19 5:26 PM, Delian Krustev wrote:
> On Wednesday, February 13, 2019 11:29:44 AM EET Kevin Korb via rsync wrote:
>> With --backup in order to end up with 2 files it has to write out a
>> whole new file. Sure, it only sent the differences (normally that means
>> over the network but there is no network here) but the writing end was
>> told to duplicate the file being updated before updating it.
>
> The copy is needed for the comparison of the blocks as "--inplace" overwrites
> the destination file. I've tried without "--backup" but then the delta
> transfers too much data - close to the size of the backed-up files.
>
> The copy is in a temp file system which is discarded after the backup (by
> "rm -rf"). This temp filesystem is not log structured or copy-on-write so
> having a copy there is not a big problem. Although I don't want a backup of
> all files which are modified but rather a TMPDIR.
>
> The ideal workflow would be to compare SRC and DST and write changed blocks
> to the TMPDIR, then read them from TMPDIR and apply them to DST.
>
> Cheers
> --
> Delian

--
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,
    Kevin Korb                  Phone:    (407) 252-6853
    Systems Administrator       Internet:
    FutureQuest, Inc.           ke...@futurequest.net  (work)
    Orlando, Florida            k...@sanitarium.net (personal)
    Web page:                   https://sanitarium.net/
    PGP public key available on web site.
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,
Re: rsync rewrites all blocks of large files although it uses delta transfer
On Wednesday, February 13, 2019 11:29:44 AM EET Kevin Korb via rsync wrote:
> With --backup in order to end up with 2 files it has to write out a
> whole new file. Sure, it only sent the differences (normally that means
> over the network but there is no network here) but the writing end was
> told to duplicate the file being updated before updating it.

The copy is needed for the comparison of the blocks as "--inplace" overwrites
the destination file. I've tried without "--backup" but then the delta
transfers too much data - close to the size of the backed-up files.

The copy is in a temp file system which is discarded after the backup (by
"rm -rf"). This temp filesystem is not log structured or copy-on-write so
having a copy there is not a big problem. Although I don't want a backup of
all files which are modified but rather a TMPDIR.

The ideal workflow would be to compare SRC and DST and write changed blocks
to the TMPDIR, then read them from TMPDIR and apply them to DST.

Cheers
--
Delian
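A much simplified version of that "compare, then write only what changed" idea can be sketched in Python. This is not something rsync provides. `update_changed_blocks` is a hypothetical helper, the 4096-byte block size is assumed, and it only compares blocks at identical offsets, so it saves writes only when data is modified in place and not when it shifts.

```python
import os
import tempfile


def update_changed_blocks(src_path: str, dst_path: str, block_size: int = 4096) -> int:
    """Overwrite only the blocks of dst that differ from src at the same offset.

    Returns the number of blocks written. dst must already exist.
    """
    written = 0
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        offset = 0
        while True:
            new_block = src.read(block_size)
            if not new_block:
                break
            dst.seek(offset)
            old_block = dst.read(len(new_block))
            if old_block != new_block:
                dst.seek(offset)
                dst.write(new_block)
                written += 1
            offset += len(new_block)
        dst.truncate(offset)  # drop any tail if src became shorter
    return written


# Small demonstration with throwaway files: one of four blocks is changed.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "src.sql")
    dst_path = os.path.join(tmp, "dst.sql")
    old = b"A" * 4096 * 4
    new = old[:4096] + b"B" * 4096 + old[8192:]
    with open(src_path, "wb") as f:
        f.write(new)
    with open(dst_path, "wb") as f:
        f.write(old)
    written = update_changed_blocks(src_path, dst_path)
    with open(dst_path, "rb") as f:
        result = f.read()

print("blocks written:", written, "contents match:", result == new)
```

Because unchanged blocks are only read and never written, a log-structured or copy-on-write destination would allocate new space only for the blocks that really differ. The cost is reading both files in full.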
Re: rsync rewrites all blocks of large files although it uses delta transfer
With --backup in order to end up with 2 files it has to write out a
whole new file. Sure, it only sent the differences (normally that means
over the network but there is no network here) but the writing end was
told to duplicate the file being updated before updating it.

On 2/13/19 10:47 AM, Delian Krustev via rsync wrote:
> Hi All,
>
> For a backup purpose I'm trying to transfer only the changed blocks of
> large files. Thus I've run "rsync" with the appropriate options:
>
>     RSYNC_BKPDIR=`mktemp -d`
>     rsync \
>         --archive \
>         --no-whole-file \
>         --inplace \
>         --backup \
>         --backup-dir="$RSYNC_BKPDIR" \
>         --verbose \
>         --stats \
>         /var/backups/mysql-dbs/. \
>         /mnt/bkp/var/backups/mysql-dbs/.
>
> The problem is that although "rsync" shows that delta transfer is used (when
> run with -vv) and only a small amount of data is transferred, the target files
> look to be overwritten in full.
>
> Here is the output of "rsync" and some more debugging info:
>
> sending incremental file list
> ./
> horde.data.sql
> horde.schema.sql
> LARGEDB.data.sql
> LARGEDB.schema.sql
> mysql.data.sql
> mysql.schema.sql
> phpmyadmin.data.sql
> phpmyadmin.schema.sql
>
> Number of files: 9 (reg: 8, dir: 1)
> Number of created files: 0
> Number of deleted files: 0
> Number of regular files transferred: 8
> Total file size: 1,944,522,704 bytes
> Total transferred file size: 1,944,522,704 bytes
> Literal data: 21,421,681 bytes
> Matched data: 1,923,101,023 bytes
> File list size: 0
> File list generation time: 0.001 seconds
> File list transfer time: 0.000 seconds
> Total bytes sent: 21,612,218
> Total bytes received: 323,302
>
> sent 21,612,218 bytes  received 323,302 bytes  259,591.95 bytes/sec
> total size is 1,944,522,704  speedup is 88.65
>
> # du -m 1.9G /tmp/tmp.8gBzjNQOQZ
> 1.9G /tmp/tmp.8gBzjNQOQZ
>
> # tree -a /tmp/tmp.8gBzjNQOQZ
> /tmp/tmp.8gBzjNQOQZ
> ├── horde.data.sql
> ├── horde.schema.sql
> ├── LARGEDB.data.sql
> ├── LARGEDB.schema.sql
> ├── mysql.data.sql
> ├── mysql.schema.sql
> ├── phpmyadmin.data.sql
> └── phpmyadmin.schema.sql
>
> 0 directories, 8 files
>
> Free space at the beginning and end of the backup:
> Filesystem      1M-blocks  Used Available Use% Mounted on
> /dev/mapper/bkp    102392 76872     20400  80% /mnt/bkp
> /dev/mapper/bkp    102392 78768     18504  81% /mnt/bkp
>
> As can be seen "rsync" has sent about 20M and received 300K of data. However
> the filesystem has allocated almost 2G, which is the total size of the files
> being backed up.
>
> The filesystem mounted on "/mnt/bkp" is of type "nilfs2", which is a log
> structured filesystem. I'm using its snapshotting feature to keep backups for
> past dates.
>
> Is there anything that can be done in order for "rsync" to overwrite only the
> changed blocks?
>
> P.S. I guess that it will be the same for copy-on-write filesystems, e.g.
> BTRFS or ZFS.
>
> Cheers
> --
> Delian

--
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,
    Kevin Korb                  Phone:    (407) 252-6853
    Systems Administrator       Internet:
    FutureQuest, Inc.           ke...@futurequest.net  (work)
    Orlando, Florida            k...@sanitarium.net (personal)
    Web page:                   https://sanitarium.net/
    PGP public key available on web site.
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,
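The `--stats` figures in the quoted post are internally consistent, which a few lines of Python can confirm. This is only a sketch of the arithmetic: the numbers are copied from the output above, and the speedup formula (total size divided by bytes sent plus bytes received) is stated here as the usual reading of that field rather than taken from the thread.

```python
# Figures copied from the quoted rsync --stats output, in bytes.
total_size = 1_944_522_704
literal = 21_421_681
matched = 1_923_101_023
sent = 21_612_218
received = 323_302

literal_share = literal / total_size
speedup = total_size / (sent + received)

print("literal + matched equals total size:", literal + matched == total_size)
print(f"literal data is {literal_share:.2%} of the total")
print(f"speedup: {speedup:.2f}")
```

So the delta algorithm did find almost all of the data already present on the receiving side. The statistics describe what was transferred, not how many bytes the receiver then wrote to disk.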