Re: [BackupPC-users] Rsynv vs. tar, full vs. incremental

2011-09-03 Thread G.W. Haywood
Hi there,

On Fri, 2 Sep 2011 Pavel Hofman wrote:

> I guess the main problem is tar cannot resume after a network glitch,

Can you simply tweak the TCP settings in /proc/sys/net/ so that the
connection can cope with a ~one-minute break and tar doesn't notice?
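
As a rough sketch of what such tuning controls: on Linux, a connection with
unacknowledged data is abandoned once net.ipv4.tcp_retries2 retransmissions
have timed out. The snippet below estimates that window with the simple
doubling model used in the kernel documentation. It assumes the stock minimum
and maximum retransmission timeouts of 200 ms and 120 s; real connections
derive the timeout from the measured round-trip time, so the numbers are
approximations only. The function names are made up for illustration.

```python
def hypothetical_tcp_timeout(retries2, rto_min=0.2, rto_max=120.0):
    """Estimate seconds before Linux gives up on an unacknowledged segment.

    Assumes the retransmission timeout starts at rto_min, doubles after
    every failed attempt and is capped at rto_max (assumed Linux defaults).
    """
    total = 0.0
    rto = rto_min
    # The initial transmission plus retries2 retransmissions each wait one RTO.
    for _ in range(retries2 + 1):
        total += min(rto, rto_max)
        rto *= 2
    return total


def min_retries_for_outage(outage_seconds, **kwargs):
    """Smallest tcp_retries2 whose estimated timeout exceeds the outage."""
    retries = 0
    while hypothetical_tcp_timeout(retries, **kwargs) <= outage_seconds:
        retries += 1
    return retries


if __name__ == "__main__":
    for n in (5, 8, 15):
        print(n, round(hypothetical_tcp_timeout(n), 1))
    print("needed for a 60 s break:", min_retries_for_outage(60))
```

Under this model the default value of 15 already corresponds to roughly
924.6 seconds, so a one-minute break should not by itself exhaust the TCP
retransmissions; something else on the path (for example SSH keepalives or a
NAT device) may be what drops the connection.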

--

73,
Ged.

___
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List:https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/


Re: [BackupPC-users] Rsynv vs. tar, full vs. incremental

2011-08-26 Thread Pavel Hofman
On 31.5.2011 21:57, Holger Parplies wrote:
> Hi,
>
> Pavel Hofman wrote on 2011-05-31 15:24:56 +0200 [[BackupPC-users] Rsynv vs.
> tar, full vs. incremental]:
> > Incremental backup of a linux machine using tar (i.e. only files newer
> > than...) is several times faster than using rsync.
>
> that could be because it is missing files that rsync catches. Or perhaps I
> should rather say: yes, tar is probably more efficient, but it is less exact
> than rsync, because it only has one single timestamp to go by, whereas rsync
> has a full file list with attributes for all files. One very real consequence
> is that tar *cannot* detect deleted files in incremental backups while rsync
> will.
>
> My understanding is that the concept of incremental backups, way back in times
> where we did backups to tapes, was introduced simply to make daily backups
> feasible at all. Something along the lines of "it's not great, but it's the
> best we can do, and it's good enough to be worthwhile".
>
> Nowadays, incremental backups still have their benefits, but we really need
> to shake the habit of making compromises for no better reason than that we
> haven't yet realized that there is an alternative.
>
> If you determine that incremental tar backups are good enough for you (e.g.
> because the cases it doesn't catch don't happen in your backup set), or that
> your server load forces you to make a compromise, then that's fine. But if
> it's only "tar is faster than rsync and faster is better", then you should
> ask yourself why you are doing backups at all (no backups is an even faster
> option).
>
> > On the other hand, a full backup using tar transfers a huge amount of data
> > over the network, way more than the efficient rsync.
>
> There are also other factors to consider like CPU usage. Where exactly is your
> bottleneck?
>
> > Is there a way to use rsync for full backup and tar for the incremental
> > runs?
>
> No. Actually, *the other way around*, it would make sense: full backups with
> tar (probably faster than rsync over a fast local network - depending on your
> backup set) and incremental backups with rsync (almost as exact as a full
> backup).
>
> > I do not even know whether the two transfer modes produce
> > mutually compatible data in the pool.
>
> No. There is (or was?) a slight difference in the attribute files, leading to
> retransmission of all files on the first rsync run after a tar run (because
> RsyncP thinks the file type has changed from something to plain file).
> The rest is, of course, compatible. It would be a shame if pooling wouldn't
> work between tar and rsync backups, wouldn't it? :)

Hi Holger,

Sorry for the few months' delay in my reply :) I have been fighting the
issue and still do not see any solution.

I guess the main problem is tar cannot resume after a network glitch,
while rsync takes too much time and RAM on our servers with a few
million files each (maildirs, development trees, etc.).

Perhaps if the network transport were not so sensitive to network
interruptions, tar would be just fine. Our cable internet is VERY fast
(100 Mbps down with no FUP), but there are short interruptions at night
(mostly up to a minute). These often break full tar backups before they
are able to finish, rendering them useless. Our backups easily take tens
of hours.

Do you have any experience with tuning the network layer, or any other
suggestion? Theoretically, a VPN could help (in fact there is OpenVPN
active); it would just require running tar over netcat, with no additional
layer of SSH. Otherwise the SSH over SSH overhead would make the process
useless again.

Thanks a lot for suggestions.

Pavel.



[BackupPC-users] Rsynv vs. tar, full vs. incremental

2011-05-31 Thread Pavel Hofman
Hi,

Incremental backup of a linux machine using tar (i.e. only files newer
than...) is several times faster than using rsync. On the other hand, a
full backup using tar transfers a huge amount of data over the network,
way more than the efficient rsync.

Is there a way to use rsync for full backup and tar for the incremental
runs? I do not even know whether the two transfer modes produce
mutually compatible data in the pool.

Thanks a lot for any hints.

Pavel.



Re: [BackupPC-users] Rsynv vs. tar, full vs. incremental

2011-05-31 Thread Holger Parplies
Hi,

Pavel Hofman wrote on 2011-05-31 15:24:56 +0200 [[BackupPC-users] Rsynv vs. 
tar, full vs. incremental]:
> Incremental backup of a linux machine using tar (i.e. only files newer
> than...) is several times faster than using rsync.

that could be because it is missing files that rsync catches. Or perhaps I
should rather say: yes, tar is probably more efficient, but it is less exact
than rsync, because it only has one single timestamp to go by, whereas rsync
has a full file list with attributes for all files. One very real consequence
is that tar *cannot* detect deleted files in incremental backups while rsync
will.
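
The difference can be illustrated with a small sketch. It is not how tar,
rsync, or BackupPC are implemented; both functions and the sample data are
hypothetical and model a file tree as a mapping from path to modification
time. The first function selects files the way a "newer than" incremental
does, using only one reference timestamp; the second compares against the
file list of the previous backup.

```python
def newer_than_incremental(current, last_backup_time):
    """Timestamp-only selection: paths whose mtime is after the last backup."""
    return sorted(p for p, mtime in current.items() if mtime > last_backup_time)


def file_list_incremental(previous, current):
    """File-list comparison: returns (changed_or_new, deleted) paths."""
    changed = sorted(
        p for p, mtime in current.items()
        if p not in previous or previous[p] != mtime
    )
    deleted = sorted(p for p in previous if p not in current)
    return changed, deleted


# Hypothetical state at the last backup (taken at time 100) and now.
previous = {"a.txt": 10, "b.txt": 20, "old.log": 30}
current = {
    "a.txt": 10,        # unchanged
    "b.txt": 150,       # modified after the last backup
    "unpacked.c": 5,    # new file whose old mtime was preserved on extraction
}                       # old.log has been deleted

print(newer_than_incremental(current, 100))
print(file_list_incremental(previous, current))
```

In this example the timestamp-only pass reports just b.txt, while the
comparison also picks up unpacked.c and notices that old.log is gone.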

My understanding is that the concept of incremental backups, way back in times
where we did backups to tapes, was introduced simply to make daily backups
feasible at all. Something along the lines of "it's not great, but it's the
best we can do, and it's good enough to be worthwhile".

Nowadays, incremental backups still have their benefits, but we really need
to shake the habit of making compromises for no better reason than that we
haven't yet realized that there is an alternative.

If you determine that incremental tar backups are good enough for you (e.g.
because the cases it doesn't catch don't happen in your backup set), or that
your server load forces you to make a compromise, then that's fine. But if
it's only "tar is faster than rsync and faster is better", then you should
ask yourself why you are doing backups at all (no backups is an even faster
option).

> On the other hand, a full backup using tar transfers a huge amount of data
> over the network, way more than the efficient rsync.

There are also other factors to consider like CPU usage. Where exactly is your
bottleneck?

> Is there a way to use rsync for full backup and tar for the incremental
> runs?

No. Actually, *the other way around*, it would make sense: full backups with
tar (probably faster than rsync over a fast local network - depending on your
backup set) and incremental backups with rsync (almost as exact as a full
backup).

> I do not even know whether the two transfer modes produce
> mutually compatible data in the pool.

No. There is (or was?) a slight difference in the attribute files, leading to
retransmission of all files on the first rsync run after a tar run (because
RsyncP thinks the file type has changed from something to plain file).
The rest is, of course, compatible. It would be a shame if pooling wouldn't
work between tar and rsync backups, wouldn't it? :)

Regards,
Holger
