Re: Confused as to why rsync thinks time, owner and group of many files differ

2022-02-03 Thread Andy Smith via rsync
Hi Kevin,

On Thu, Feb 03, 2022 at 05:38:41PM -0500, Kevin Korb via rsync wrote:
> Are you using the same source and target each time?

Yes.

> I ask because the only discrepancy I see is the link count which
> shows that there are 11 more instances of that inode on the source
> than the target.  Maybe instances in other snapshots are being
> updated/re-linked?

I haven't yet let rsync run all the way through the whole source
filesystem so it probably hasn't yet sent over some of the hardlinks
that it knows about for this file.

There's only ever one rsync going at once, because this is a one-off
thing I am doing by hand.

> The only other thing to mention is that when you abort rsync (with -P or
> --inplace) incomplete files are left.  Rsync doesn't fix the owner+group
> until it is done with a directory and it doesn't fix the timestamp until it
> is done with a file.  This would be why you shouldn't mix those options with
> --update since the truncated file will be newer than the source file.

Okay, but:

- it's thousands of files that are reported as having differing
  t/o/g, not just whichever one was being worked on when I hit
  ctrl-c. I'm only hitting ctrl-c because rsync sees thousands of
  changes that I can't explain.

- they don't have differing t/o/g when you look at them.

- their contents are identical anyway as confirmed by sha256sum and
  also as confirmed by the fact that rsync isn't sending the file
  contents over.

- if I use "-I --checksum" to skip mtime checking and force
  checksum, rsync doesn't try to sync these files (it does still for
  the ones it thinks o/g are different). This partial workaround
  isn't very useful anyway as --checksum takes forever. Point is, it
  definitely thinks there are changes of mtime, uid and/or gid.
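
A minimal sketch for looking at t/o/g more closely than "ls -l" does, assuming GNU stat is available on both hosts. "ls -l" shows names rather than numeric ids and hides the sub-second part of the mtime, so this prints the numeric uid and gid (which is what --numeric-ids transfers) and the mtime with nanoseconds, in UTC so that output from the two sides can be compared directly. The function name show_tog is made up for illustration.

```shell
# Print "uid:gid mtime name" for each file given.
# Assumes GNU coreutils stat; the -c format letters differ on BSD stat.
show_tog() {
    # %u/%g = numeric owner/group, %y = mtime including nanoseconds, %n = name
    TZ=UTC stat -c '%u:%g %y %n' -- "$@"
}

# Usage (run on the sender and on the receiver, then diff the two outputs):
#   show_tog /data/backup/rsnapshot/daily.0/cacti/*
```

If the two outputs really are identical for a file that rsync itemizes as changed, that at least rules out a difference that is simply invisible in "ls -l".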

So I am still really confused.

If I remove the --inplace I think the spurious t/o/g detection will
still happen, and also that rsync will create a temp file to rename
over each file, so blowing up the hardlinks that it has already sent
across.

This would be a mere curiosity if it did this once and then was happy
that it had set the mtime/uid/gid, but it doesn't; it does it every
time, which is making things really slow.

I am trying to build a newer rsync for use on the sender to see if
that makes any difference but am also running into bizarre problems
there, which is perhaps for another thread. Illegal instruction
somewhere inside libcrypto. The same libcrypto that the packaged
rsync is linked against. Goes away if I use --cc=none, but happens
for md4 or md5. Really not my night!

I am tempted to blow away the btrfs filesystem and just do xfs to
xfs, to rule out weird issues there. It would be a shame though as
I was hoping to use btrfs's compression here.

Cheers,
Andy

-- 
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html


Confused as to why rsync thinks time, owner and group of many files differ

2022-02-03 Thread Andy Smith via rsync
Hi,

I am at the moment using rsync to move quite a big set of backups
from one machine to another. The source filesystem is xfs; the
target filesystem is btrfs.

For various reasons I have been stopping the rsync part way through
and re-starting. I have noticed that a large number of files are
transferred over and over and I can't work out why.

Example:

sudo rsync -iPva \
--inplace \
--numeric-ids \
--delete \
/data/backup/rsnapshot/daily.0/cacti/ \
root@koff:/data/backup/rsnapshot/daily.0/cacti/

...
http://rsync.samba.org/
Capabilities:
64-bit files, 64-bit inums, 64-bit timestamps, 64-bit long ints,
socketpairs, hardlinks, symlinks, IPv6, batchfiles, inplace,
append, ACLs, xattrs, iconv, symtimes, prealloc

Destination:

$ rsync --version
rsync  version 3.2.3  protocol version 31
Copyright (C) 1996-2020 by Andrew Tridgell, Wayne Davison, and others.
Web site: https://rsync.samba.org/
Capabilities:
64-bit files, 64-bit inums, 64-bit timestamps, 64-bit long ints,
socketpairs, hardlinks, hardlink-specials, symlinks, IPv6, atimes,
batchfiles, inplace, append, ACLs, xattrs, optional protect-args, iconv,
symtimes, prealloc, stop-at, no crtimes
Optimizations:
SIMD, asm, openssl-crypto
Checksum list:
xxh128 xxh3 xxh64 (xxhash) md5 md4 none
Compress list:
zstd lz4 zlibx zlib none

What am I missing?

Thanks,
Andy



Re: How to manage root<-->root rsync keeping permissions?

2021-08-04 Thread Andy Smith via rsync
Hello,

On Tue, Aug 03, 2021 at 03:05:27PM +0100, Chris Green via rsync wrote:
> Remember, as I said, this is all Debianland with no real root login,
> while I could add one I'd prefer not to.  

Your system already has a root user and if you added an SSH public
key to its authorized_keys file (and allowed root login by public
key only in sshd_config) then SSH login would work. The only form of
login you would have added is "by this specific ssh key". The
account could still remain password locked as it is now.

It is difficult for me to see why such a setup would be inherently
more secure than one where a regular user account can do absolutely
anything (i.e. run rsync as root without password prompt),
especially given that a regular user account is likely to run a lot
of other software some of which may have bugs.

But we all choose our security stance.

> I've set it up so chris can run rsync with root permissions.
> However I'm not quite sure how to get it to work as one needs to say
> "sudo rsync" to get the root privilege.  How do you do that?

The first link I sent you had an example of that:
--rsync-path="sudo rsync"
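
For illustration, here is roughly how that option would combine with the command quoted earlier in this thread. This is a sketch only: it assumes sudo on the remote side will not prompt for a password, and the function name push_etc_as_root is made up.

```shell
# Push /etc to the remote side, asking the remote end to start its rsync
# through sudo so that it can set root ownership on what it receives.
# The default destination is the one quoted earlier in the thread.
push_etc_as_root() {
    rsync -a --rsync-path="sudo rsync" /etc/ "${1:-chris@remote:backups/etc/}"
}

# Usage:
#   push_etc_as_root
#   push_etc_as_root chris@remote:backups/etc/
```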

Cheers,
Andy



Re: How to manage root<-->root rsync keeping permissions?

2021-08-03 Thread Andy Smith via rsync
Hi Chris,

On Tue, Aug 03, 2021 at 11:48:31AM +0100, Chris Green via rsync wrote:
> If I used the --super option (in a command like the one above) and
> chris can run rsync as root on the remote end (via options in the
> sudoers file) will this do what I want?  I guess I can go away and try
> it! :-)

You don't need --super if the remote side actually is running as
root (either because you logged in as "root" or you logged in as
"chris" but told it to execute "sudo rsync").

If you're going to use sudo then you'll want to set it NOPASSWD so
it doesn't ask for a sudo password. Possibly restricting that only
to uses of rsync or a specific script, otherwise it is giving
"chris" blanket sudo access without a password.
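
A sketch of what such a restricted sudoers entry might look like. The file name and the /usr/bin/rsync path are assumptions; check where rsync lives on the remote host and edit the file with "visudo -f" so that syntax errors are caught.

```
# /etc/sudoers.d/rsync-chris  (hypothetical file name)
# Let chris run rsync, and only rsync, as root without a password prompt.
chris ALL=(root) NOPASSWD: /usr/bin/rsync
```

Note that an rsync running as root can still read and write any file, so this narrows what chris can run directly rather than what can be done through it.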

Cheers,
Andy



Re: How to manage root<-->root rsync keeping permissions?

2021-08-03 Thread Andy Smith via rsync
Hi Chris,

On Tue, Aug 03, 2021 at 09:48:37AM +0100, Chris Green via rsync wrote:
> But how do you handle the other end to restore the root ownership etc.?
> The script has to do something like:-
> 
> rsync -a /etc/ chris@remote:backups/etc/
> 
> So at the remote end it only has chris' privileges.

A couple of options:

https://strugglers.net/~andy/blog/2021/04/10/rsync-and-sudo-without-x-forwarding/

Since you want to automate it I'd go with letting root log in by ssh
key only, and force the key to work only with a specific script.

Here is an example forced command that only allows rsync:

https://www.guyrutenberg.com/2014/01/14/restricting-ssh-access-to-rsync/

This is still vulnerable to doing anything that rsync can do. You
can secure it further by making a script that only does the specific
things you need rsync to do, e.g. the exact parameters and paths,
and force that script instead.
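
A minimal sketch of the "force a script" idea, under stated assumptions. The script path, the key line and the allowed command string are all hypothetical. In particular, the flag string that rsync sends to the server side varies with rsync version and options, so the real value would have to be captured by logging $SSH_ORIGINAL_COMMAND during one trusted run.

```shell
#!/bin/sh
# Hypothetical /usr/local/bin/backup-etc-only, forced from root's
# authorized_keys with a line along these lines (key material is a placeholder):
#   restrict,command="/usr/local/bin/backup-etc-only" ssh-ed25519 AAAA... backup-key

# Assumption: the one exact server command we are willing to run.
ALLOWED='rsync --server -logDtpre.iLsfxC . /backups/etc/'

# Succeed only if the command the client asked for is exactly the allowed one.
is_allowed() {
    [ "$1" = "$ALLOWED" ]
}

# The real script would finish with something like:
#   is_allowed "$SSH_ORIGINAL_COMMAND" || { echo "rejected" >&2; exit 1; }
#   exec $ALLOWED    # unquoted on purpose so the string splits into arguments
```

Comparing against one fixed string is deliberately crude: any change of options on the client then needs a matching change here, which is the point.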

Cheers,
Andy



Re: feature request: exclude from path

2020-08-02 Thread Andy Smith via rsync
Hi Matt,

On Sat, Aug 01, 2020 at 10:10:49PM -0400, Matt Stevens via rsync wrote:
> I lack development skills. Would there be a way for rsync to be passed an
> option to exclude a specific path during a sync operation? All of my attempts
> to use exclude have failed, as it does not respect paths, only filetypes.

The existing --exclude and filter file options work for me for this.
Maybe show us what you're doing (command line), what you expect to
happen and what actually happens?

You absolutely can exclude paths. The exclude and filter options are
very expressive. I've been doing it for years.
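
As a hedged illustration of a path exclude (the function name and the example paths are made up): a pattern with a leading slash is anchored at the top of the transfer, not at the filesystem root, and a trailing slash makes it match directories only.

```shell
# Copy SRC to DEST, leaving out whatever matches PATTERN.
#   sync_excluding SRC DEST PATTERN
sync_excluding() {
    rsync -a --exclude="$3" "$1" "$2"
}

# Usage: skip only the top-level Downloads directory of the source tree.
# A directory called Downloads deeper in the tree is still copied.
#   sync_excluding /home/matt/ /mnt/backup/matt/ '/Downloads/'
```

A common stumbling block is writing the full filesystem path in the pattern; with the source given as /home/matt/ the pattern is '/Downloads/', not '/home/matt/Downloads/'.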

Cheers,
Andy



High memory usage - any way around it other than splitting jobs?

2020-06-25 Thread Andy Smith via rsync
Hi,

I have a virtual machine with 2G of memory. On this VM there is a
directory tree with 33.3 million files in it. When attempting to
rsync (rsync -PSHav --delete /source /dest) this tree from one
directory to another on the same host, rsync uses all the memory and
is killed by oom-killer.

This host is Debian oldstable so has

$ rsync --version
rsync  version 3.1.2  protocol version 31

The normal operation of this VM does not require more than 2G of
memory, but I doubled it to 4G anyway. Unfortunately rsync still
uses all the memory and is killed.

Most advice I can find on decreasing rsync memory usage says to
split the job up into batches. By issuing one rsync for each
directory within /source I was able to make this work.
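
The per-directory split can be sketched roughly as below. This is an illustration under assumptions, not the exact commands that were run: the function name is made up, it copies the contents of SRC into DEST (unlike the original command, which would create DEST/source), and it ignores plain files and dot-directories directly under SRC. Since each rsync only sees its own sub-tree, -H can only preserve hardlinks within a sub-directory, not between them, and --delete will not remove a top-level directory from DEST that has vanished from SRC.

```shell
# Run one rsync per immediate sub-directory of SRC.
#   sync_per_subdir SRC DEST
sync_per_subdir() {
    src=${1%/}
    dest=${2%/}
    mkdir -p "$dest" || return 1
    for dir in "$src"/*/; do
        [ -d "$dir" ] || continue          # no sub-directories matched
        name=$(basename "$dir")
        # Same options as the single big job, applied to one sub-tree.
        rsync -PSHav --delete "$dir" "$dest/$name/" || return 1
    done
}

# Usage:
#   sync_per_subdir /source /dest
```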

The interesting thing, though, is that the split of file numbers
between sub-directories is very uneven, with the majority of them
(31.5 million of the 33.3 million) being in just one of the
sub-directory trees. I am kind of surprised that rsync has such a
problem going just that little bit further with the last 2 million.
Is there any scope for improvement with the incremental recursion
code?

If I upgraded the version of rsync could I expect this to work any
better?

I could also give the host a massive swap file. It currently has
just 1G of swap, which all gets used in the failure case. I could
add more but I fear that the job will go so slow it will not
complete in a reasonable time.

I don't know if the -H option is causing extra memory usage here;
unfortunately it is necessary as there are hardlinks in there.

Some years-old advice says to disable incremental recursion with
--no-i-r. As incremental recursion was added to reduce memory usage
this seems counter-intuitive to me, but this advice is all over the
Internet…

These are all things I will investigate before settling for the
"split into multiple jobs" approach; just wondered if anyone has any
shortcuts for me.

Thanks,
Andy
