Is it possible to set max file size on target?

2011-07-19 Thread Harrison Mak
Hi,

I'm trying to prevent files larger than a certain size from being backed up
to my server.

rsync has a --max-size option that limits the size of the files it will
transfer. However, the client can change it.

I was wondering if this option can be set on the server side, so that I can
be sure my server won't accept files that do not meet the size requirement.

So far I've checked both the rsync and rsyncd man pages and the list archives.

Any help would be greatly appreciated!

Thanks.
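
One way to audit the server side after the fact, sketched here under
assumptions: scan the backup directory for files above the limit, for
example from a job run after each transfer. The function name
list_oversized and the 1024-byte limit are illustrative only. This reports
files that were already written; it does not stop rsync from accepting them.

```shell
#!/bin/sh
# Hypothetical helper: print every regular file under DIR that is larger
# than LIMIT_BYTES. Both names are illustrative, not part of rsync.
list_oversized() {
    dir=$1
    limit_bytes=$2
    # "-size +Nc" matches files strictly larger than N bytes.
    find "$dir" -type f -size +"${limit_bytes}"c -print
}

# Small self-contained demonstration in a scratch directory.
demo=$(mktemp -d)
printf 'tiny' > "$demo/ok.txt"                                  # 4 bytes
dd if=/dev/zero of="$demo/big.bin" bs=1024 count=2 2>/dev/null  # 2048 bytes
oversized=$(list_oversized "$demo" 1024)
echo "$oversized"
rm -rf "$demo"
```

Only big.bin is reported, since it is the only file above the 1024-byte
limit used in the demonstration.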
-- 
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html

Re: max file size

2009-11-13 Thread Heinz-Josef Claes
On Fri, 13 Nov 2009 01:38:48 -0500
Matt McCutchen m...@mattmccutchen.net wrote:

 On Mon, 2009-11-09 at 18:20 +0100, Heinz-Josef Claes wrote:
  On Monday, 9 November 2009 17:48:35, Matt McCutchen wrote:
   On Mon, 2009-11-09 at 11:43 +0100, Heinz-Josef Claes wrote:
    Does anybody know what the maximum file size (terabytes?) is when
    using rsync with the options --checksum and/or --inplace?
   
    What file sizes have been tested in practice? Does anyone have
    experience using rsync (with --checksum and/or --inplace) on big
    files of several or dozens of terabytes?
   
   I don't believe rsync has a fixed maximum size other than what can fit
   in 64 bits, but I can't speak to any reliability issues that might come
   up with extremely large files.
   
  I've read about a fix for checksum buffer overruns with files larger
  than a few hundred terabytes, but that was just something vague . . .
 
 Indeed, I forgot about that.  The delta-transfer algorithm doesn't work
 for files longer than 2^31 blocks.  With the default maximum block size
 of 2^17 bytes, the limit is 2^48 bytes or 256 TB.  You could stretch the
 limit by forcing a larger block size with --block-size.  See:
 
 https://bugzilla.samba.org/show_bug.cgi?id=5459#c2

Thanks for that information!

Have you (or anybody) ever done a test with big file sizes?

 
   For what purpose are you considering --checksum?  In the case where the
   file's size hasn't changed (probably true for large image files), it
   will add an extra full read of the file on both sides before the
   transfer begins, which would be very expensive for multi-terabyte files.
  
  I want to check if the following is possible:
  
  1. Transport a big block of data (several terabytes) physically from
  location A to location B (very long distance) via tapes (or disks).
  (Locations A and B use different storage technologies.)
  
  By the time the tapes arrive at location B, the block of data has
  changed at location A (a program / OS is running and storing data in it).
  
  2. Shut down the application / OS at location A, rsync the delta
  between locations A and B online, then restart the system at location B.
  
  (Perhaps step 2 has to be done multiple times.)
 
 Since the source and destination versions are practically certain to
 differ, --checksum would serve no purpose.  See the man page description
 of --checksum.
 

I don't understand what you mean. Between steps 1 and 2, only a few percent of
the data will change, so the idea is to transfer only the differences.
Transferring the whole file online takes too long.
How can this be done without checksums (either --checksum or --inplace)?

I'll probably be able to run a test with a file of a few terabytes in the
next few weeks, but that's not guaranteed.

Regards, HJC


Re: max file size

2009-11-13 Thread Matt McCutchen
On Fri, 2009-11-13 at 12:36 +0100, Heinz-Josef Claes wrote:
 On Fri, 13 Nov 2009 01:38:48 -0500
 Matt McCutchen m...@mattmccutchen.net wrote:
  On Mon, 2009-11-09 at 18:20 +0100, Heinz-Josef Claes wrote:
   I want to check if the following is possible:
   
   1. Transport a big block of data (several terabytes) physically from
   location A to location B (very long distance) via tapes (or disks).
   (Locations A and B use different storage technologies.)
   
   By the time the tapes arrive at location B, the block of data has
   changed at location A (a program / OS is running and storing data in it).
   
   2. Shut down the application / OS at location A, rsync the delta
   between locations A and B online, then restart the system at location B.
   
   (Perhaps step 2 has to be done multiple times.)
  
  Since the source and destination versions are practically certain to
  differ, --checksum would serve no purpose.  See the man page description
  of --checksum.
 
 I don't understand what you mean. Between steps 1 and 2, only a few percent
 of the data will change, so the idea is to transfer only the differences.
 Transferring the whole file online takes too long.
 How can this be done without checksums (either --checksum or --inplace)?

Did you read the description of --checksum as I suggested?  It is an
alternative quick check for deciding whether a file needs to be
transferred, which is not what you want.  You're talking about the
delta-transfer algorithm, which is on by default for remote runs and is
controlled by a separate option, --(no-)whole-file.
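
A minimal sketch of that distinction, assuming a small local test file in
place of the multi-terabyte image. The paths and sizes are illustrative.
--ignore-times is used only so the demonstration does not depend on
timestamps; --checksum is left out on purpose.

```shell
#!/bin/sh
# Sketch: update a changed copy of a large file in place using the
# delta-transfer algorithm. Paths and sizes here are illustrative.
work=$(mktemp -d)
src=$work/a/image.bin
dst=$work/b/image.bin
mkdir -p "$work/a" "$work/b"

# "Location B" holds the old copy; "location A" has changed slightly since.
dd if=/dev/zero of="$dst" bs=1024 count=256 2>/dev/null
cp "$dst" "$src"
printf 'changed' | dd of="$src" bs=1 seek=4096 conv=notrunc 2>/dev/null

if command -v rsync >/dev/null 2>&1; then
    # --no-whole-file forces delta-transfer even for a local copy (it is
    # already the default for remote runs); --inplace updates the existing
    # destination file rather than building a temporary copy.
    # --ignore-times only defeats the size+mtime quick check in this demo.
    rsync --inplace --no-whole-file --ignore-times --stats "$src" "$dst"
    cmp -s "$src" "$dst" && result=identical || result=different
else
    result=skipped   # rsync is not installed on this machine
fi
echo "result: $result"
rm -rf "$work"
```

In the --stats output, the matched-data figure shows how much of the file
was reused from the destination rather than sent.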

-- 
Matt



Re: max file size

2009-11-12 Thread Matt McCutchen
On Mon, 2009-11-09 at 18:20 +0100, Heinz-Josef Claes wrote:
 On Monday, 9 November 2009 17:48:35, Matt McCutchen wrote:
  On Mon, 2009-11-09 at 11:43 +0100, Heinz-Josef Claes wrote:
   Does anybody know what the maximum file size (terabytes?) is when
   using rsync with the options --checksum and/or --inplace?
  
   What file sizes have been tested in practice? Does anyone have
   experience using rsync (with --checksum and/or --inplace) on big
   files of several or dozens of terabytes?
  
  I don't believe rsync has a fixed maximum size other than what can fit
  in 64 bits, but I can't speak to any reliability issues that might come
  up with extremely large files.
  
 I've read about a fix for checksum buffer overruns with files larger
 than a few hundred terabytes, but that was just something vague . . .

Indeed, I forgot about that.  The delta-transfer algorithm doesn't work
for files longer than 2^31 blocks.  With the default maximum block size
of 2^17 bytes, the limit is 2^48 bytes or 256 TB.  You could stretch the
limit by forcing a larger block size with --block-size.  See:

https://bugzilla.samba.org/show_bug.cgi?id=5459#c2
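
The arithmetic behind that limit can be checked directly. This is only a
worked calculation, assuming 64-bit shell arithmetic; the variable names
are illustrative.

```shell
#!/bin/sh
# Arithmetic behind the limit quoted above (64-bit shell arithmetic assumed).
max_blocks=$((1 << 31))            # 2^31 blocks
max_block_size=$((1 << 17))        # 2^17 bytes = 128 KiB per block
limit_bytes=$((max_blocks * max_block_size))
limit_tib=$((limit_bytes >> 40))   # 1 TiB = 2^40 bytes
echo "limit: $limit_bytes bytes = $limit_tib TiB"
```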

  For what purpose are you considering --checksum?  In the case where the
  file's size hasn't changed (probably true for large image files), it
  will add an extra full read of the file on both sides before the
  transfer begins, which would be very expensive for multi-terabyte files.
 
 I want to check if the following is possible:
 
 1. Transport a big block of data (several terabytes) physically from
 location A to location B (very long distance) via tapes (or disks).
 (Locations A and B use different storage technologies.)
 
 By the time the tapes arrive at location B, the block of data has
 changed at location A (a program / OS is running and storing data in it).
 
 2. Shut down the application / OS at location A, rsync the delta
 between locations A and B online, then restart the system at location B.
 
 (Perhaps step 2 has to be done multiple times.)

Since the source and destination versions are practically certain to
differ, --checksum would serve no purpose.  See the man page description
of --checksum.

-- 
Matt



max file size

2009-11-09 Thread Heinz-Josef Claes
Hello,

Does anybody know what the maximum file size (terabytes?) is when using rsync
with the options --checksum and/or --inplace?

What file sizes have been tested in practice? Does anyone have experience
using rsync (with --checksum and/or --inplace) on big files of several or
dozens of terabytes?

Thanks a lot, Heinz-Josef Claes


Re: max file size

2009-11-09 Thread Matt McCutchen
On Mon, 2009-11-09 at 11:43 +0100, Heinz-Josef Claes wrote:
 Does anybody know what the maximum file size (terabytes?) is when using
 rsync with the options --checksum and/or --inplace?
 
 What file sizes have been tested in practice? Does anyone have experience
 using rsync (with --checksum and/or --inplace) on big files of several or
 dozens of terabytes?

I don't believe rsync has a fixed maximum size other than what can fit
in 64 bits, but I can't speak to any reliability issues that might come
up with extremely large files.

For what purpose are you considering --checksum?  In the case where the
file's size hasn't changed (probably true for large image files), it
will add an extra full read of the file on both sides before the
transfer begins, which would be very expensive for multi-terabyte files.

-- 
Matt



Re: max file size

2009-11-09 Thread Heinz-Josef Claes
On Monday, 9 November 2009 17:48:35, Matt McCutchen wrote:
 On Mon, 2009-11-09 at 11:43 +0100, Heinz-Josef Claes wrote:
  Does anybody know what the maximum file size (terabytes?) is when
  using rsync with the options --checksum and/or --inplace?
 
  What file sizes have been tested in practice? Does anyone have
  experience using rsync (with --checksum and/or --inplace) on big
  files of several or dozens of terabytes?
 
 I don't believe rsync has a fixed maximum size other than what can fit
 in 64 bits, but I can't speak to any reliability issues that might come
 up with extremely large files.
 
I've read about a fix for checksum buffer overruns with files larger than a
few hundred terabytes, but that was just something vague . . .

 For what purpose are you considering --checksum?  In the case where the
 file's size hasn't changed (probably true for large image files), it
 will add an extra full read of the file on both sides before the
 transfer begins, which would be very expensive for multi-terabyte files.

I want to check if the following is possible:

1. Transport a big block of data (several terabytes) physically from location
A to location B (very long distance) via tapes (or disks).
(Locations A and B use different storage technologies.)

By the time the tapes arrive at location B, the block of data has changed at
location A (a program / OS is running and storing data in it).

2. Shut down the application / OS at location A, rsync the delta between
locations A and B online, then restart the system at location B.

(Perhaps step 2 has to be done multiple times.)

--
There are lots of other aspects to this scenario, but that's another story.

Regards, HJC


Re: purge-empty-dirs and max-file-size confusion

2009-04-27 Thread Wayne Davison
On Fri, Apr 24, 2009 at 02:19:42PM -0400, Ian! D. Allen wrote:
 There is no mention of the concept of a "transfer rule" in the rsync
 man page.  I offer some proposed man page wording changes, below.

Thanks.  I have committed some manpage changes that clarify this
unexpected behavior.  At some point rsync may allow actual filtering of
files by their (non-name) attributes, which would avoid this situation.

..wayne..


Re: purge-empty-dirs and max-file-size confusion

2009-04-25 Thread Ian! D. Allen
On Fri, Apr 24, 2009 at 02:19:41PM -0400, Ian! D. Allen wrote:
 On Fri, Apr 24, 2009 at 07:51:35AM -0700, Wayne Davison wrote:
  This is because --min-size is a transfer rule, not an exclude rule.
 
 There is no mention of the concept of a "transfer rule" in the rsync
 man page.

There is another oblique reference to a "transfer rule" in --compare-dest,
for which I offer this man page clarification:

--compare-dest=DIR
  This transfer rule instructs rsync to use DIR on the destination
  machine as an additional hierarchy to compare destination files
  against doing transfers (if the files are missing in the destination
  directory).  If a file is found in DIR that is identical to the
  sender's file, the file will NOT be transferred to the destination
  directory. This is useful for creating a sparse backup of just files
  that have changed from an earlier backup, though all the directories
  in the file-list will still be created (most of them likely empty).
  Unlike a filter/exclude rule, this option does not affect the
  file-list, so --prune-empty-dirs will not work with this option.

-m, --prune-empty-dirs
  This option tells the receiving rsync to get rid of empty directories
  from the file-list, including nested directories that have no
  non-directory children. This is useful for avoiding the creation of
  a bunch of useless directories when the sending rsync is recursively
  scanning a hierarchy of files using include/exclude/filter rules. It
  does not prevent the creation of empty directories that result
  from the use of transfer rules such as --max-size, --min-size,
  or --compare-dest, since transfer rules do not affect the file-list.

-- 
| Ian! D. Allen  -  idal...@idallen.ca  -  Ottawa, Ontario, Canada
| Home Page: http://idallen.com/   Contact Improv: http://contactimprov.ca/
| College professor (Open Source / Linux) via: http://teaching.idallen.com/
| Defend digital freedom:  http://eff.org/  and have fun:  http://fools.ca/


Re: purge-empty-dirs and max-file-size confusion

2009-04-24 Thread Paul Slootman
On Thu 23 Apr 2009, Ian! D. Allen wrote:
 
 In the man page it says in one place "tells the receiving rsync to get
 rid of empty directories from the file-list" and in another place it says
 "prune empty directory chains from file-list".  The latter sounds like it
 operates on the source list, not on the receiving list, and if rsync were

Actually, to me the two sound like much the same thing.
I don't think the intention is to actually delete empty directories at
the receiving end; only to prevent them being created. So once they're
created, perhaps by an earlier invocation without --prune-empty-dirs,
you'll have to remove them by hand.
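
Removing them by hand can be scripted. The following is a sketch, assuming
a find that supports the -mindepth, -empty and -delete extensions (GNU and
BSD find do); the scratch directory layout is made up for the demonstration.

```shell
#!/bin/sh
# Sketch: remove empty directories left behind on the destination.
# "dest" is a scratch directory built only for this demonstration.
dest=$(mktemp -d)
mkdir -p "$dest/dir1" "$dest/dir2/nested" "$dest/dir4"
printf 'data' > "$dest/dir4/BIGFILE"

# -delete processes depth-first, so dir2/nested goes before dir2 itself.
# -mindepth 1 keeps the top-level destination even if it ends up empty.
find "$dest" -mindepth 1 -type d -empty -delete

remaining=$(cd "$dest" && find . -mindepth 1 | sort)
echo "$remaining"
rm -rf "$dest"
```

Only dir4 and its file survive; dir1 and the dir2/nested chain are removed.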


Paul


Re: purge-empty-dirs and max-file-size confusion

2009-04-24 Thread Ian! D. Allen
On Fri, Apr 24, 2009 at 10:23:06AM +0200, Paul Slootman wrote:
 I don't think the intention is to actually delete empty directories at
 the receiving end; only to prevent them being created.

I have not yet found out how to prevent empty directories from being
created when using --max-size or --min-size.  As I showed in
my original post to this list, --prune-empty-dirs does not do it.
Either the man page is wrong/misleading/incomplete, I am misunderstanding
it badly, or rsync is broken.  I am fully prepared to believe that I
am misunderstanding something and will happily work on a better man page
wording when the truth is revealed to me.  I've supplied a short script
below that you can use to see the problem yourself.

 So once they're created, perhaps by an earlier invocation without
 --prune-empty-dirs, you'll have to remove them by hand.

As my script below shows, the destination directory does not even exist.
There is no previously-created content in it at all, and yet rsync
creates empty directories even though I say --prune-empty-dirs.  Why?
How do I make --prune-empty-dirs do what the man page says it does?

#!/bin/sh -u

# Start with fresh, empty directories for source and destination.
tmp1=/tmp/one$$
tmp2=/tmp/two$$
rm -rf "$tmp1" "$tmp2"

echo '*** create the source directory with six subdirectories'
for i in 1 2 3 4 5 6 ; do
    mkdir -p "$tmp1/dir$i"
done

echo '*** create three small files in dir1 dir2 dir3'
for i in 1 2 3 ; do
    dd bs=1M count=1 if=/dev/zero of="$tmp1/dir$i/smallfile"
done

echo '*** create three big files in dir4 dir5 dir6'
for i in 4 5 6 ; do
    dd bs=1M count=11 if=/dev/zero of="$tmp1/dir$i/BIGFILE"
done

echo '*** rsync should copy only the big files and prune all empty directories'
rsync -ai --min-size 10M --prune-empty-dirs "$tmp1" "$tmp2"

echo '*** find should show no empty directories, but there are three - why?'
find "$tmp2" -empty

echo '*** replace --min-size with an --exclude and it works fine:'
rm -r "$tmp2"
rsync -ai --exclude smallfile --prune-empty-dirs "$tmp1" "$tmp2"
find "$tmp2" -empty   # shows no output - this is correct and expected

echo "*** Why doesn't --prune-empty-dirs work with --min-size and --max-size?"

rm -r "$tmp1" "$tmp2"

-- 
| Ian! D. Allen  -  idal...@idallen.ca  -  Ottawa, Ontario, Canada
| Home Page: http://idallen.com/   Contact Improv: http://contactimprov.ca/
| College professor (Open Source / Linux) via: http://teaching.idallen.com/
| Defend digital freedom:  http://eff.org/  and have fun:  http://fools.ca/


Re: purge-empty-dirs and max-file-size confusion

2009-04-24 Thread Wayne Davison
On Wed, Apr 22, 2009 at 02:20:37AM -0400, Ian! D. Allen wrote:
 I want to use --min-size to copy just large files (and their necessary
 parent directories), but everything I've tried copies *all* the source
 directories, and creates them empty on the destination even if they
 don't have any big files in them.  I only want the minimal directory
 hierarchies that contain the big files.

This is because --min-size is a transfer rule, not an exclude rule.  An
exclude rule would affect deletions, and --min-size just affects what is
transferred out of the full set of files that are present.  Thus, the
dirs with smaller files are not actually empty, they just don't have any
files that match the transfer rule.

There is currently no way to include/exclude files based on size in
rsync.
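
A possible workaround, sketched here as an assumption rather than anything
built into rsync: let find select files by size and pass the list to rsync
with --files-from. The sizes are scaled down from MiB to KiB to keep the
demonstration small, and file names containing newlines are not handled.

```shell
#!/bin/sh
# Sketch: select files by size with find, then hand the list to rsync.
# Sizes are scaled down (KiB instead of MiB) to keep the demo small.
src=$(mktemp -d)
dst=$(mktemp -d)
min_bytes=$((10 * 1024))

for i in 1 2 3; do
    mkdir -p "$src/dir$i"
    dd if=/dev/zero of="$src/dir$i/smallfile" bs=1024 count=1 2>/dev/null
done
for i in 4 5 6; do
    mkdir -p "$src/dir$i"
    dd if=/dev/zero of="$src/dir$i/BIGFILE" bs=1024 count=11 2>/dev/null
done

# "-size +Nc" means strictly more than N bytes, so subtract one to get
# "at least min_bytes". Paths are printed relative to $src.
selected=$(cd "$src" && find . -type f -size +"$((min_bytes - 1))"c | sort)
echo "$selected"

if command -v rsync >/dev/null 2>&1; then
    # --files-from implies --relative, so only the parent directories of
    # the listed files are created on the destination.
    printf '%s\n' "$selected" | rsync -a --files-from=- "$src/" "$dst/"
    empty_dirs=$(find "$dst" -mindepth 1 -type d -empty)
else
    empty_dirs=   # rsync not installed; nothing was copied
fi
echo "empty directories on destination: ${empty_dirs:-none}"
rm -rf "$src" "$dst"
```

Because the small-file directories never enter the file list, they are not
created on the destination and nothing needs pruning.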

..wayne..


Re: purge-empty-dirs and max-file-size confusion

2009-04-24 Thread Ian! D. Allen
On Fri, Apr 24, 2009 at 07:51:35AM -0700, Wayne Davison wrote:
 This is because --min-size is a transfer rule, not an exclude rule.

There is no mention of the concept of a "transfer rule" in the rsync
man page.  I offer some proposed man page wording changes, below.

The man page says "This option tells the receiving rsync to get rid of
empty directories from the file-list" - there is no mention that there
must be two *kinds* of empty directories in the file list: (1) empty
directories created by filter/exclude rules and (2) empty directories
created by transfer rules.  Or perhaps (2) doesn't really exist, but the
sending rsync simply never gets around to sending the files that it says
should be in those directories and so the receiving rsync does all that
directory creation work but the promised files never arrive to fill them.

 There is currently no way to include/exclude files based on size in rsync.

That is most awkward, given that --min-size sure sounds like it behaves
this way.  It is an annoyingly fine distinction to say that "exclude"
and "avoid transferring" are two different kinds of operations when it
comes to rsync pruning empty directories.  This needs to be made much
clearer in the man page.  I offer these slightly reworded paragraphs:

-m, --prune-empty-dirs
  This option tells the receiving rsync to get rid of empty
  directories from the file-list, including nested directories that
  have no non-directory children. This is useful for avoiding the
  creation of a bunch of useless directories when the sending
  rsync is recursively scanning a hierarchy of files using
  include/exclude/filter rules. It does not prevent the creation of
  empty directories that result from the use of transfer rules such
  as --max-size or --min-size, since transfer rules do not affect
  the file-list.

--max-size=SIZE
  This transfer rule tells rsync to avoid transferring any file that
  is larger than the specified SIZE.  Unlike a filter/exclude rule, it
  does not affect the file-list, so --prune-empty-dirs will not work
  with this option.

--min-size=SIZE
  This transfer rule tells rsync to avoid transferring any file
  that is smaller than the specified SIZE, which can help in not
  transferring small, junk files.   Unlike a filter/exclude rule, it
  does not affect the file-list, so --prune-empty-dirs will not work
  with this option.

Thanks for keeping rsync alive and kicking!

-- 
| Ian! D. Allen  -  idal...@idallen.ca  -  Ottawa, Ontario, Canada
| Home Page: http://idallen.com/   Contact Improv: http://contactimprov.ca/
| College professor (Open Source / Linux) via: http://teaching.idallen.com/
| Defend digital freedom:  http://eff.org/  and have fun:  http://fools.ca/


Re: purge-empty-dirs and max-file-size confusion

2009-04-23 Thread Ian! D. Allen
   $ rsync -ai --min-size 10M --prune-empty-dirs /home/idallen/test /tmp/foo
 Have you tried --no-dirs?

Why should I need it?  I've explicitly told the receiving side "don't
create empty directories" and that should be sufficient.  I shouldn't
need any other options.  (In any case, I just tried --no-dirs and it
didn't change the result.  I still get piles of empty directories.)

Perhaps the man page lies, and --prune-empty-dirs does not operate on
the receiving side at all?

In the man page it says in one place "tells the receiving rsync to get
rid of empty directories from the file-list" and in another place it says
"prune empty directory chains from file-list".  The latter sounds like it
operates on the source list, not on the receiving list, and if rsync were
operating on the source list it would explain the current misbehaviour.

Has nobody ever wondered about this before?  I suppose I shall have to
Read The Source to find out what is wrong.  Please someone enlighten me
about what I'm missing, before I start digging around in there...

-- 
| Ian! D. Allen  -  idal...@idallen.ca  -  Ottawa, Ontario, Canada
| Home Page: http://idallen.com/   Contact Improv: http://contactimprov.ca/
| College professor (Open Source / Linux) via: http://teaching.idallen.com/
| Defend digital freedom:  http://eff.org/  and have fun:  http://fools.ca/


purge-empty-dirs and max-file-size confusion

2009-04-22 Thread Ian! D. Allen
I want to use --min-size to copy just large files (and their necessary
parent directories), but everything I've tried copies *all* the source
directories, and creates them empty on the destination even if they
don't have any big files in them.  I only want the minimal directory
hierarchies that contain the big files.  This doesn't work:

$ rm -rf /tmp/foo
$ rsync -ai --min-size 10M --prune-empty-dirs /home/idallen/test /tmp/foo
cd+ test/
cd+ test/dir1/
cd+ test/dir2/
cd+ test/dir3/
cd+ test/dir4/
f+ test/dir4/BIGFILE
cd+ test/dir5/
f+ test/dir5/BIGFILE
cd+ test/dir6/
f+ test/dir6/BIGFILE

Wrong.  I don't want all those dir1, dir2, dir3 empty directories.
I don't want *any* empty directories, at any level.
What am I missing?

-- 
| Ian! D. Allen  -  idal...@idallen.ca  -  Ottawa, Ontario, Canada
| Home Page: http://idallen.com/   Contact Improv: http://contactimprov.ca/
| College professor (Open Source / Linux) via: http://teaching.idallen.com/
| Defend digital freedom:  http://eff.org/  and have fun:  http://fools.ca/