Re: [BackupPC-users] RsyncIncrArgsExtra

2019-08-03 Thread Alexander Kobel
Hi,

On 03.08.19 18:59, Ted Toal wrote:
> Ged,
> 
>> BackupPC shines, I think, in less well-constrained situations.
>>
>> Given the boundaries I wonder if you wouldn't do better with something
>> simple like a script which runs 'find' to find the files to be backed
>> up, plain vanilla rsync to do the actual transfers, and de-duplication
>> provided (if necessary) by one of several filesystems which offer it.
> 
> We looked at a lot of different solutions, and BackupPC seemed best.  I
> really like it.  I’m not sure that any script we set up could do a better
> job of finding the files to back up than rsync via BackupPC with the
> file-size-constraint option specified.  If I understand it correctly,
> incrementals DO NOT read the entire file contents and compute a checksum, but
> work strictly off the file modification date, so finding the files requires
> only reading the directories and not reading the files themselves, right?

Correct.

FWIW, `find` with a modification-date test (-newer) uses readdir
(getdents64 under the hood) to list directory entries, then calls lstat
for each entry. `rsync` does exactly the same, so for unchanged files
both should behave identically. (In other words, I don't think an
additional mirroring script based on find buys you anything over
BackupPC's use of rsync.)
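
To illustrate how little work that is, here's a minimal Perl sketch of
such a scan (the path and the one-hour / 1 MiB cutoffs are made up for
the example): readdir() plus one lstat() per entry, with the file
contents never being read -- the same work rsync's incremental file-list
pass already does:

  #!/usr/bin/perl
  # Sketch only: list small files modified within the last hour.
  # Path and thresholds are illustrative, not taken from this thread.
  use strict;
  use warnings;
  use File::Find;

  my $since = time() - 3600;                # "changed within the last hour"
  find(sub {
      my @st = lstat($_) or return;         # one lstat() per directory entry
      return unless -f _;                   # plain files only
      return unless $st[7] < 1024 * 1024;   # size < 1 MiB
      return unless $st[9] > $since;        # mtime newer than the cutoff
      print "$File::Find::name\n";
  }, '/path/to/lab/data');                  # hypothetical share path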

What *might* be a problem: I remember the painful experience of listing
directories with more than a couple of files via NFS. [1] explains a
possible reason: readdir is not exactly the most efficient way to get
such lists, in particular when the latency to fetch the next chunk of
the directory listing is significant. But that probably won't matter if
you have to call lstat per file anyway.

  [1]:
http://be-n.com/spw/you-can-list-a-million-files-in-a-directory-but-not-with-ls.html


Alex




Re: [BackupPC-users] RsyncIncrArgsExtra

2019-08-03 Thread Ted Toal
Ged,

> BackupPC shines, I think, in less well-constrained situations.
> 
> Given the boundaries I wonder if you wouldn't do better with something
> simple like a script which runs 'find' to find the files to be backed
> up, plain vanilla rsync to do the actual transfers, and de-duplication
> provided (if necessary) by one of several filesystems which offer it.

We looked at a lot of different solutions, and BackupPC seemed best.  I really
like it.  I’m not sure that any script we set up could do a better job of
finding the files to back up than rsync via BackupPC with the
file-size-constraint option specified.  If I understand it correctly,
incrementals DO NOT read the entire file contents and compute a checksum, but
work strictly off the file modification date, so finding the files requires only
reading the directories and not reading the files themselves, right?

Ted





Re: [BackupPC-users] RsyncIncrArgsExtra

2019-08-03 Thread G.W. Haywood via BackupPC-users

Hello again,

On Sat, 3 Aug 2019, Ted Toal wrote:


I am NOT sure whether bandwidth limitation is what I want. ...  only
backing up our lab's small portion of the data ...  only backing up
files less than 1 MB ...


BackupPC shines, I think, in less well-constrained situations.

Given the boundaries I wonder if you wouldn't do better with something
simple like a script which runs 'find' to find the files to be backed
up, plain vanilla rsync to do the actual transfers, and de-duplication
provided (if necessary) by one of several filesystems which offer it.

--

73,
Ged.




Re: [BackupPC-users] RsyncIncrArgsExtra

2019-08-03 Thread Alexander Kobel
Hi Ted,

On 02.08.19 20:09, Ted Toal wrote:
> Hi Alex,
> 
> Ok, thanks for that suggestion, I’d thought of it, but wasn’t sure if rsync 
> would complain if the arg appeared twice, but apparently it doesn’t.
> 
> I am NOT sure whether bandwidth limitation is what I want.  I am actually 
> trying to throttle down not only the network bandwidth used but also the I/O 
> load.  This is a shared file system with hundreds of users accessing it.  I’m 
> only backing up our lab’s small portion of the data, and I’m only backing up 
> files less than 1 MB in size.  The full backups are done separately by 
> someone else in a different manner.  For my <1 MB files, I am doing a full 
> backup once a year and an incremental backup once an hour.

> I want to have essentially 0 impact on the network bandwidth and on the I/O 
> load between the server that talks to BackupPC and the network storage device.

I'm not 100% sure, but this sounds way more complicated than throttling
the bandwidth between the BackupPC server and the host.

IIUC, your situation is:

  BPC (1)  ---(a)---  host (2)  ---(b)---  NAS (3)

BPC (1) is the BackupPC server; host (2) is the system you want to back
up, i.e., the client from BackupPC's perspective; and NAS (3) is the
server providing the shared file system.

You want to limit I/O on 3 as well as bandwidth on link b, with
privileged access to only 1, no access to 3, and probably no chance of
changing the way 2 communicates with 3, correct? (E.g., to set up a
dedicated NFS connection where the server side (3) is I/O-limited.)


Here's my gut feeling (disclaimer: unconfirmed, highly dependent on
your exact setup, and I'm not an expert on NFS):

In that situation, ionice on 2 won't help; the rsync instance running on
host 2 is purely CPU- and network-bound and has negligible local I/O,
which is all that ionice controls. Limiting CPU (via nice) or network
bandwidth (via trickle, for example) on 2 won't help either: just listing
files over NFS is usually a bottleneck already, because every individual
request has to traverse link b.
If you somehow manage to limit the bandwidth across b, the actual
*content* transfer will become horribly slow. (And I expect even that to
be difficult, since the NFS share is probably pre-mounted via a mechanism
you can't control.)
The only reasonable idea, AFAICS, would be to rate-limit the *number* of
files accessed. But I do not see how this could be done, short of
modifying the rsync sender on host 2.

IMHO, the one and only *proper* way to set up such a backup solution
would be to ask the friendly staff managing NAS 3 (hopefully experts on
how their setup works, if it serves 100+ users) either to grant you
access to their backups (which they surely have), or to give you
read-only direct access to NAS 3 with proper limits.
What you're trying to do sounds like their job, and even if you have
reasons to think you might do better, or have specific requirements they
won't be able to fulfill, you're not in the best position to implement it.


Just my two pennies from someone who enjoys not having to deal with NFS
a lot...

Alex


> Since I’m just starting, I’m doing the first full backups, and they are 
> taking forever.  I have a bandwidth limit of 1 MB/s, very low.  I need to 
> explore how high I can go without impacting others' access, and how high I 
> need to go to finish the full backups and incremental backups in a timely 
> fashion.  I’m thinking a higher bandwidth limit for the full backups would 
> get them done quicker with still little impact.  For the incrementals, I 
> haven’t done one yet so I don’t know how long it will take, but I may 
> discover I have to increase that bandwidth also, and/or decrease the 
> frequency of the incrementals.
> 
> Based on that, do you think I should be using ionice too?  And by the way, I 
> do not have root access to the server.
> 
> Ted





Re: [BackupPC-users] RsyncIncrArgsExtra

2019-08-02 Thread Ted Toal
Hi Alex,

Ok, thanks for that suggestion, I’d thought of it, but wasn’t sure if rsync 
would complain if the arg appeared twice, but apparently it doesn’t.

I am NOT sure whether bandwidth limitation is what I want.  I am actually 
trying to throttle down not only the network bandwidth used but also the I/O 
load.  This is a shared file system with hundreds of users accessing it.  I’m 
only backing up our lab’s small portion of the data, and I’m only backing up 
files less than 1 MB in size.  The full backups are done separately by someone 
else in a different manner.  For my <1 MB files, I am doing a full backup once 
a year and an incremental backup once an hour.  I want to have essentially 0 
impact on the network bandwidth and on the I/O load between the server that 
talks to BackupPC and the network storage device.  Since I’m just starting, I’m 
doing the first full backups, and they are taking forever.  I have a bandwidth 
limit of 1 MB/s, very low.  I need to explore how high I can go without 
impacting other’s access, and how high I need to go to finish the full backups 
and incremental backups in a timely fashion.  I’m thinking a higher bandwidth 
limit for the full backups would get them done quicker with still little 
impact.  For the incrementals, I haven’t done one yet so I don’t know how long 
it will take, but I may discover I have to increase that bandwidth also, and/or 
decrease the frequency of the incrementals.

Based on that, do you think I should be using ionice too?  And by the way, I do 
not have root access to the server.

Ted





Re: [BackupPC-users] RsyncIncrArgsExtra

2019-08-02 Thread Alexander Kobel
Hi again,

On 02.08.19 11:50, Alexander Kobel wrote:
> Hi Ted,
> 
> On 01.08.19 18:31, Ted Toal wrote:
>> There is a BackupPC config parameter named RsyncFullArgsExtra, but none 
>> named RsyncIncrArgsExtra (to provide extra rsync args for an incremental 
>> backup).  I’d like to see such a parameter.  My immediate use is that I’d 
>> like to restrict rsync bandwidth to different amounts depending on whether 
>> it is a full or incremental backup.
> 
> [...]
> 
> Apart from that, are you sure that a bandwidth limit actually is what
> you're after? The (network) *bandwidth* used for incrementals and fulls
> does not differ a lot; it's the *I/O load* on the client that makes the
> real difference:

that being said: if you want to use ionice or similar tools to adjust
the I/O load on the client, I suggest that you set RsyncClientPath to a
simple wrapper script that calls rsync via ionice. In this script, just
check whether --checksum is in the argument list; if it is, you're
running a full backup, otherwise an incremental.
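
Untested, but such a wrapper might look roughly like the following
sketch (the rsync path and the ionice classes are placeholders for
whatever fits your setup); it would live on the client and be configured
as RsyncClientPath:

  #!/usr/bin/perl
  # Hypothetical RsyncClientPath wrapper: run fulls in the idle I/O class,
  # incrementals at the lowest best-effort priority.
  use strict;
  use warnings;

  # BackupPC only passes --checksum for full backups.
  my $is_full = grep { $_ eq '--checksum' } @ARGV;

  my @ionice = $is_full ? qw(ionice -c3)        # idle class for fulls
                        : qw(ionice -c2 -n7);   # low best-effort for incrementals

  exec @ionice, '/usr/bin/rsync', @ARGV
      or die "exec rsync failed: $!\n";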


Again: before solving the problem, make sure that it actually exists. ;-)
I wouldn't be surprised if you end up with the *same* ionice arguments
for fulls and incrementals...


Cheers,
Alex





Re: [BackupPC-users] RsyncIncrArgsExtra

2019-08-02 Thread Alexander Kobel
Hi Ted,

On 01.08.19 18:31, Ted Toal wrote:
> There is a BackupPC config parameter named RsyncFullArgsExtra, but none named 
> RsyncIncrArgsExtra (to provide extra rsync args for an incremental backup).  
> I’d like to see such a parameter.  My immediate use is that I’d like to 
> restrict rsync bandwidth to different amounts depending on whether it is a 
> full or incremental backup.

assuming that you want to use --bwlimit, can't you just add

  --bwlimit=

in RsyncArgs, and an additional

  --bwlimit=

in RsyncFullArgsExtra? According to my tests, the second overrides the
first for full backups. A slightly inelegant workaround, but an effective one.

Note that the arguments are appended in the order

  RsyncArgs RsyncFullArgsExtra RsyncArgsExtra

(see lib/BackupPC/Xfer/Rsync.pm, lines 307-334 or so), so you have to
add the "default" (incremental) limit to RsyncArgs, not RsyncArgsExtra.


Apart from that, are you sure that a bandwidth limit actually is what
you're after? The (network) *bandwidth* used for incrementals and fulls
does not differ a lot; it's the *I/O load* on the client that makes the
real difference:

IIUC, no matter what backup type, rsync needs to compare all file paths
and some metadata. For incrementals, by default it compares
  path, size, modification time;
for fulls (with --checksum), it skips the latter two and compares
  path, checksum
instead.

For the computation of the checksums, the client will read each file in
its entirety. That makes for a lot of *I/O bandwidth* on the client. But
regarding the *network bandwidth*: the checksum is an MD5 hash, i.e.,
128 bits = 16 bytes long. Without digging into the source, size and
modtime are probably integers of 4 or 8 bytes each. So my guess is that
the bandwidth difference is *at most* 8 bytes per file, but more likely 0...


HTH,
Alex





[BackupPC-users] RsyncIncrArgsExtra

2019-08-01 Thread Ted Toal
There is a BackupPC config parameter named RsyncFullArgsExtra, but none named 
RsyncIncrArgsExtra (to provide extra rsync args for an incremental backup).  
I’d like to see such a parameter.  My immediate use is that I’d like to 
restrict rsync bandwidth to different amounts depending on whether it is a full 
or incremental backup.


