Re: cpdup work heads-up

2008-04-11 Thread Vincent Stemen
On Fri, Apr 11, 2008 at 03:32:42PM -0700, Matthew Dillon wrote:
> 
> :I have used cpdup a few times. Today I read some more in the manpage.
> :I see it can synchronize mirrors remotely. This sounds great.
> :
> :Has anyone done any comparisons or benchmarks between it and rsync? I am 
> :especially curious if I should start using cpdup instead :)
> 
> Well, I doubt it would be faster than rsync.  rsync uses a more
> efficient algorithm that batches better over connections.  cpdup's
> remote algorithm is kinda ad-hoc and uses synchronous transactions
> (-p allows multiple synchronous transactions to be run in parallel).
> cpdup isn't meant to do rsync's job.
> 
> On the other hand, cpdup is a bit more user-friendly, and cpdup can
> do third party copies (both source and target are remote-host specs).
> ...

Hi.  That's interesting.  I didn't know about cpdup.  

However, since you are on the subject, I thought I would let you guys
know about rbu and offer to put it up for download if you have a need
for it.  

I am the author of bu (http://hightek.org/bu).  I have rewritten bu
purely in Perl, using rsync under the covers, so it has remote backup
capability.  The original bu was designed for FS-to-FS backups, so the
destination had to be a mounted file system (local, NFS, etc.).

I currently call it rbu for "Rsync based BU".  My plan is for it to
eventually be renamed to "bu" and completely replace bu.

Rbu is much simpler to use for backups than using rsync directly.
I have not released it yet because I have not had the time to implement
include/exclude lists or prepare and document an official release.
It does have online help with -h, though.  In fact, I have not even
announced its existence on the bu mailing list yet.

rbu, like bu, is designed to always properly duplicate the original file
system underneath the destination directory no matter how the source is
specified.  See the bu docs for details.

If anybody wants me to make it available or has any questions, don't
hesitate to let me know.

It would not be packaged with READMEs or include install scripts yet,
but it works standalone.  We have been running it on DragonFly for the
last several months and it has been stable.

It is very simple to use.  An example from the online usage:

    rbu .
        Back up the current working directory to the default backup
        directory specified in $bu_dest (alexandria:/backups/quark).

It can work alongside bu and shares the same configuration file.  If
you use it with bu, there are one or two caveats.  Just ask and I will
provide more info.



Re: cpdup work heads-up

2008-04-11 Thread Matthew Dillon

:I have used cpdup a few times. Today I read some more in the manpage.
:I see it can synchronize mirrors remotely. This sounds great.
:
:Has anyone done any comparisons or benchmarks between it and rsync? I am 
:especially curious if I should start using cpdup instead :)

Well, I doubt it would be faster than rsync.  rsync uses a more
efficient algorithm that batches better over connections.  cpdup's
remote algorithm is kinda ad-hoc and uses synchronous transactions
(-p allows multiple synchronous transactions to be run in parallel).
cpdup isn't meant to do rsync's job.

On the other hand, cpdup is a bit more user-friendly, and cpdup can
do third party copies (both source and target are remote-host specs).
It also doesn't try to play naming tricks on the target like cp
(and rsync) do, e.g.

    cp -r dir1 dir2     (dir2 doesn't exist -> dir2 created)
    cp -r dir1 dir2     (dir2 exists, copies to dir2/dir1)

Which really screws up a lot of people (and has for over 20 years).

    cpdup dir1 dir2     Make dir2 an exact copy of dir1 (regardless
                        of whether dir2 exists or not).

And cpdup by default, without any options specified, will attempt
to make as exact a copy as possible whereas with rsync you have to
tell it to with -a.
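
To make the naming difference concrete, here is a minimal Python sketch
of the two destination rules described above.  It only models where the
copy ends up; it copies nothing and is not taken from cp or cpdup.  The
helper names cp_r_target and cpdup_target are made up for illustration.

```python
import os
import tempfile

def cp_r_target(src, dst):
    """Path that `cp -r src dst` ends up writing to."""
    if os.path.isdir(dst):
        # dst already exists: the copy lands underneath it, named after src
        return os.path.join(dst, os.path.basename(os.path.normpath(src)))
    # dst does not exist: dst itself becomes the copy
    return dst

def cpdup_target(src, dst):
    """Path that `cpdup src dst` ends up writing to: always dst itself."""
    return dst

with tempfile.TemporaryDirectory() as tmp:
    dir1 = os.path.join(tmp, "dir1")
    dir2 = os.path.join(tmp, "dir2")
    os.mkdir(dir1)

    cp_first = os.path.relpath(cp_r_target(dir1, dir2), tmp)    # dir2 missing
    os.mkdir(dir2)
    cp_second = os.path.relpath(cp_r_target(dir1, dir2), tmp)   # dir2 present
    cpdup_any = os.path.relpath(cpdup_target(dir1, dir2), tmp)  # either case

print(cp_first, cp_second, cpdup_any)
```

The point of the sketch is that the cp rule depends on the state of the
target, so running the same command twice gives two different layouts,
while the cpdup rule does not.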

In any case, cpdup has been around for over 10 years, lots of
people love it, and I still use it, so I'm going to continue to
maintain it.

-Matt
Matthew Dillon 
<[EMAIL PROTECTED]>


Re: cpdup work heads-up

2008-04-11 Thread Jeremy C. Reed
I have used cpdup a few times. Today I read some more in the manpage.
I see it can synchronize mirrors remotely. This sounds great.

Has anyone done any comparisons or benchmarks between it and rsync? I am 
especially curious if I should start using cpdup instead :)


cpdup work heads-up

2008-04-11 Thread Matthew Dillon
Ok, I think I've bashed cpdup's new features into shape on HEAD.
Beware that the updated cpdup must be running on both sides of the
link to use the new feature and there is no endian conversion.
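
As a rough illustration of why the missing endian conversion matters,
the Python sketch below shows what happens when a 32-bit integer is
written in one byte order and read back in the other.  The 32-bit field
is an assumption made for the example; it is not a description of
cpdup's actual wire format.

```python
import struct

value = 1

# Bytes a little-endian host would put on the wire for a raw 32-bit int
wire_le = struct.pack("<I", value)

# A big-endian host reading those bytes as-is sees a different number
misread = struct.unpack(">I", wire_le)[0]

# With an agreed network byte order ("!") both sides see the same value
wire_net = struct.pack("!I", value)
portable = struct.unpack("!I", wire_net)[0]

print(value, misread, portable)
```

Without a conversion step like the last two lines, the two ends of the
link need to share a byte order to agree on what was sent.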

The new feature is '-pN', e.g. -p16, which parallelizes operations
when the source and/or destination is a remote host specification.
This should significantly speed up operations over slow links, and
even over fast links.

I originally didn't intend to implement this sort of feature but
at the moment my offsite backup box has 100ms of latency due to a
routing snafu well beyond my control.  Needless to say, 10 files
checked per second is a bit too slow.
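
The arithmetic behind that figure can be sketched as follows, assuming
each file check costs one synchronous 100ms round trip and that -pN
keeps N such transactions in flight at once.  This is an upper bound
under those assumptions only; it ignores bandwidth, disk, and CPU
limits, and the function name is made up for illustration.

```python
def files_per_second(round_trip_ms, in_flight=1):
    """Upper bound on file checks per second.

    Assumes every check is one synchronous round trip and that
    `in_flight` checks overlap perfectly.
    """
    return 1000 * in_flight / round_trip_ms

serial = files_per_second(100)                  # one transaction at a time
parallel = files_per_second(100, in_flight=16)  # as with -p16

print(serial, parallel)
```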

One final item not yet solved is that cpdup's individual write()s
to its pipe to ssh are apparently causing ssh (which turns off Nagle)
to send out one TCP packet per request.  I haven't found a way to tell
ssh to leave Nagle on for the batch link, since interactive response is
not really needed when using the -pN option.  This is resulting in
fairly expensive and unnecessary packet overhead.  If anyone has any
ideas on how to fix ssh I'm all ears.  I'd rather not gang the writes
in cpdup; it would be kinda messy to do that.
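
For anyone unsure what ganging the writes would mean, here is a minimal
Python sketch of the idea: small requests are collected in a buffer and
handed to the pipe in one write() instead of one write() each.  It is
not cpdup code.  CountingPipe, GangedWriter, the 1400-byte limit, and
the "stat fileNNN" request format are all invented for the example.

```python
class CountingPipe:
    """Stand-in for the pipe to ssh; records every write() call."""
    def __init__(self):
        self.writes = []

    def write(self, data):
        self.writes.append(bytes(data))
        return len(data)

class GangedWriter:
    """Collects small writes and passes them on in larger chunks."""
    def __init__(self, sink, limit=1400):
        self.sink = sink
        self.limit = limit          # flush once this many bytes are pending
        self.pending = bytearray()

    def write(self, data):
        self.pending += data
        if len(self.pending) >= self.limit:
            self.flush()

    def flush(self):
        if self.pending:
            self.sink.write(bytes(self.pending))
            self.pending.clear()

# Hypothetical small requests, one per file
requests = [b"stat file%03d\n" % i for i in range(50)]

direct = CountingPipe()
for req in requests:
    direct.write(req)               # one write() per request

ganged_pipe = CountingPipe()
ganged = GangedWriter(ganged_pipe)
for req in requests:
    ganged.write(req)
ganged.flush()                      # push out whatever is still pending

print(len(direct.writes), len(ganged_pipe.writes))
```

The same bytes reach the pipe either way; only the number of write()
calls changes.  The cost is the extra buffering and flush logic, which
is the messiness mentioned above.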

Once this has been tested well enough I will MFC it to 1.12.

-Matt