Re: [BackupPC-users] Rsync vs. tar, full vs. incremental
Hi there,

On Fri, 2 Sep 2011 Pavel Hofman wrote:

> I guess the main problem is tar cannot resume after a network glitch,

Can you simply tweak the TCP settings in /proc/sys/net/ so that the connection can cope with a ~one-minute break and tar doesn't notice?

--
73, Ged.

___
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki: http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/
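To put a number on "cope with a ~one-minute break", here is a minimal Python sketch of how long a Linux TCP connection keeps retransmitting before it gives up. It assumes the setting in question is net.ipv4.tcp_retries2 (the message above only says "TCP settings in /proc/sys/net/") and that the retransmission timeout starts at 0.2 s and doubles up to a 120 s cap. The function names are made up for illustration, and the result is only a rough lower bound, because the real timeout also depends on the measured round-trip time.

```python
"""Estimate how long a Linux TCP connection keeps retransmitting before giving up."""
from pathlib import Path

RTO_MIN = 0.2    # seconds; assumed initial retransmission timeout
RTO_MAX = 120.0  # seconds; assumed cap on the backed-off timeout


def estimated_tcp_timeout(retries, rto_min=RTO_MIN, rto_max=RTO_MAX):
    """Sum the exponentially backed-off waits for `retries` retransmissions."""
    return sum(min(rto_min * 2 ** i, rto_max) for i in range(retries + 1))


def read_tcp_retries2(path="/proc/sys/net/ipv4/tcp_retries2", default=15):
    """Read the current setting; fall back to `default` if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return default


if __name__ == "__main__":
    retries = read_tcp_retries2()
    seconds = estimated_tcp_timeout(retries)
    print(f"tcp_retries2={retries}: gives up after roughly {seconds:.1f} s")
```

With the usual default of 15 retries this model gives roughly 924.6 s, so if the transfer still dies after a one-minute outage, the cause may lie elsewhere (for example keepalive timeouts or connection tracking on a router). That is an assumption worth checking, not a diagnosis.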
Re: [BackupPC-users] Rsync vs. tar, full vs. incremental
On 31.5.2011 21:57, Holger Parplies wrote:

> Hi,
>
> Pavel Hofman wrote on 2011-05-31 15:24:56 +0200 [[BackupPC-users] Rsync vs. tar, full vs. incremental]:
>> Incremental backup of a Linux machine using tar (i.e. only files newer than...) is several times faster than using rsync.
>
> that could be because it is missing files that rsync catches. Or perhaps I should rather say: yes, tar is probably more efficient, but it is less exact than rsync, because it only has one single timestamp to go by, whereas rsync has a full file list with attributes for all files. One very real consequence is that tar *cannot* detect deleted files in incremental backups while rsync will.
>
> My understanding is that the concept of incremental backups, way back in the times when we did backups to tape, was introduced simply to make daily backups feasible at all. Something along the lines of "it's not great, but it's the best we can do, and it's good enough to be worthwhile". Nowadays, incremental backups still have their benefits, but we really need to shake the habit of making compromises for no better reason than that we haven't yet realized that there is an alternative.
>
> If you determine that incremental tar backups are good enough for you (e.g. because the cases it doesn't catch don't happen in your backup set), or that your server load forces you to make a compromise, then that's fine. But if it's only "tar is faster than rsync and faster is better", then you should ask yourself why you are doing backups at all (no backups is an even faster option).
>
>> On the other hand, a full backup using tar transfers a huge amount of data over the network, way more than the efficient rsync.
>
> There are also other factors to consider like CPU usage. Where exactly is your bottleneck?
>
>> Is there a way to use rsync for full backup and tar for the incremental runs?
>
> No. Actually, *the other way around*, it would make sense: full backups with tar (probably faster than rsync over a fast local network - depending on your backup set) and incremental backups with rsync (almost as exact as a full backup).
>
>> I do not even know whether the two transfer modes produce mutually compatible data in the pool.
>
> No. There is (or was?) a slight difference in the attribute files, leading to retransmission of all files on the first rsync run after a tar run (because RsyncP thinks the file type has changed from something to plain file). The rest is, of course, compatible. It would be a shame if pooling didn't work between tar and rsync backups, wouldn't it? :)

Hi Holger,

Sorry for taking a few months to reply :) I have been fighting the issue and still do not see any solution.

I guess the main problem is that tar cannot resume after a network glitch, while rsync takes too much time and RAM on our servers with a few million files each (maildirs, development trees etc.)

Perhaps if the network transport were not so sensitive to network interruptions, tar would be just fine. Our cable internet is VERY fast (100Mbps down with no FUP), but there are short interruptions at night (mostly up to a minute). This often breaks full tar backups before they are able to finish, rendering them useless. Our backups easily take tens of hours.

Do you have any experience with tuning the network layer, or any other suggestion? Theoretically, a VPN could help (in fact, OpenVPN is already active); it would just require running tar over netcat, with no additional layer of SSH. Otherwise the SSH over SSH overhead would make the process useless again.

Thanks a lot for suggestions.

Pavel.
[BackupPC-users] Rsync vs. tar, full vs. incremental
Hi,

Incremental backup of a Linux machine using tar (i.e. only files newer than...) is several times faster than using rsync. On the other hand, a full backup using tar transfers a huge amount of data over the network, way more than the efficient rsync.

Is there a way to use rsync for full backup and tar for the incremental runs? I do not even know whether the two transfer modes produce mutually compatible data in the pool.

Thanks a lot for any hints.

Pavel.
Re: [BackupPC-users] Rsync vs. tar, full vs. incremental
Hi,

Pavel Hofman wrote on 2011-05-31 15:24:56 +0200 [[BackupPC-users] Rsync vs. tar, full vs. incremental]:
> Incremental backup of a Linux machine using tar (i.e. only files newer than...) is several times faster than using rsync.

that could be because it is missing files that rsync catches. Or perhaps I should rather say: yes, tar is probably more efficient, but it is less exact than rsync, because it only has one single timestamp to go by, whereas rsync has a full file list with attributes for all files. One very real consequence is that tar *cannot* detect deleted files in incremental backups while rsync will.

My understanding is that the concept of incremental backups, way back in the times when we did backups to tape, was introduced simply to make daily backups feasible at all. Something along the lines of "it's not great, but it's the best we can do, and it's good enough to be worthwhile". Nowadays, incremental backups still have their benefits, but we really need to shake the habit of making compromises for no better reason than that we haven't yet realized that there is an alternative.

If you determine that incremental tar backups are good enough for you (e.g. because the cases it doesn't catch don't happen in your backup set), or that your server load forces you to make a compromise, then that's fine. But if it's only "tar is faster than rsync and faster is better", then you should ask yourself why you are doing backups at all (no backups is an even faster option).

> On the other hand, a full backup using tar transfers a huge amount of data over the network, way more than the efficient rsync.

There are also other factors to consider like CPU usage. Where exactly is your bottleneck?

> Is there a way to use rsync for full backup and tar for the incremental runs?

No. Actually, *the other way around*, it would make sense: full backups with tar (probably faster than rsync over a fast local network - depending on your backup set) and incremental backups with rsync (almost as exact as a full backup).

> I do not even know whether the two transfer modes produce mutually compatible data in the pool.

No. There is (or was?) a slight difference in the attribute files, leading to retransmission of all files on the first rsync run after a tar run (because RsyncP thinks the file type has changed from something to plain file). The rest is, of course, compatible. It would be a shame if pooling didn't work between tar and rsync backups, wouldn't it? :)

Regards,
Holger
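The difference between "one single timestamp" and "a full file list" described in this reply can be illustrated with a toy Python model. This is a simplified sketch, not how tar, rsync, or BackupPC are implemented: the file names, the mtime values, and both function names are invented for illustration, and the listing comparison looks only at mtime, whereas rsync also has other attributes to go by.

```python
"""Toy model: timestamp-based selection (tar-like) vs. file-list comparison (rsync-like)."""


def changed_since(current, last_backup):
    """tar-like: pick files whose mtime is newer than one reference timestamp."""
    return {path for path, mtime in current.items() if mtime > last_backup}


def compare_listings(previous, current):
    """rsync-like: compare the current listing with the previous backup's listing."""
    changed = {path for path, mtime in current.items() if previous.get(path) != mtime}
    deleted = set(previous) - set(current)
    return changed, deleted


# Hypothetical state recorded by the previous backup: path -> mtime.
previous = {"mail/1": 100.0, "mail/2": 110.0, "src/a.c": 120.0}
last_backup = 200.0

# Hypothetical state now: mail/2 was deleted, src/a.c was edited, and
# src/old.c appeared with an old mtime (e.g. moved in from elsewhere).
current = {"mail/1": 100.0, "src/a.c": 250.0, "src/old.c": 90.0}

tar_like = changed_since(current, last_backup)
rsync_changed, rsync_deleted = compare_listings(previous, current)

print("tar-like picks up:  ", sorted(tar_like))
print("rsync-like picks up:", sorted(rsync_changed))
print("rsync-like deleted: ", sorted(rsync_deleted))
```

In this model the timestamp check finds only src/a.c, while the listing comparison also finds src/old.c and notices that mail/2 is gone. Those are the kinds of cases an incremental tar run can miss.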