Re: Using --fuzzy

2014-11-17 Thread Matthias Schniedermeyer
On 16.11.2014 18:38, Karl O. Pinc wrote:
> On 11/16/2014 03:53:12 PM, Joe wrote:
> > I have a lot of files (and directories) (up to a few hundred at a time)
> > that I get from various sources. Some time after I get them (after they
> > are already backed up), I often have to move them around and normalize
> > their names.
> >
> > When I do this, rsync sees them as unrelated to the copies of these
> > files which are already on the backup destination.
>
> I don't know if it suits your use case but
> you could consider using hardlinks.

It should be noted that using hardlinks has one major caveat:
order.

rsync only avoids copying the file when the new hardlink appears in the
hierarchy AFTER the original file.

(This is true for incremental-recursion mode (the default for >=3.0). It
might work differently for <3.0 or in --no-inc-recursive mode, but I haven't
tried.)

Otherwise rsync will copy the new file and later hard link the
old file to the new file, not the other way around.

So I personally use a directory '.z' in the root of a hierarchy where
each file has an additional hardlink, so I can move files around in the
hierarchy however I want.
That way rsync sees the '.z' directory first and acts accordingly.


Such a directory can be created after the fact.
Make a directory that is LAST in sort order. Assuming plain ASCII
filenames:
mkdir zzz
Then link all files into that directory and rsync (don't forget to add
-H).
Then rename it to be first in sort order (on both sides!):
mv zzz .z

And after you have changed your procedures to create
the additional hardlink, you are free to move files around without rsync
having to copy them each time they are moved.

After deleting files you can use:
find .z -type f -links 1 -delete
to find and delete files that don't have an additional hardlink.
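
The steps above could be scripted roughly as follows. This is only a sketch: the function names link_all and prune_orphans are made up here, and naming each link after the file's inode number is an assumption (it avoids clashes between files that share a basename); any unique name would do. ROOT is expected without a trailing slash.

```shell
#!/bin/sh
# Sketch only: link_all and prune_orphans are hypothetical helper names.

# Give every regular file under ROOT an extra hard link in ROOT/LINKDIR.
# Links are named after the file's inode number (an assumption of this
# sketch), so equal basenames cannot collide and reruns are harmless.
# Filenames containing newlines are not handled.
link_all() {
    root=$1
    linkdir=$2
    mkdir -p "$root/$linkdir"
    find "$root" -path "$root/$linkdir" -prune -o -type f -print |
    while IFS= read -r f; do
        inode=$(ls -i "$f" | awk '{print $1}')
        [ -e "$root/$linkdir/$inode" ] || ln "$f" "$root/$linkdir/$inode"
    done
}

# Drop links whose file no longer exists anywhere else in the hierarchy.
prune_orphans() {
    find "$1/$2" -type f -links 1 -delete
}

# Order of operations from the text (SRC and DEST are placeholders):
#   link_all SRC zzz
#   rsync -aH SRC/ DEST/
#   mv SRC/zzz SRC/.z        # and the same rename on the DEST side
#   ... later, after deleting files under SRC:
#   prune_orphans SRC .z
```

Once the directory has been renamed, new files still need their extra link, so running link_all SRC .z before each backup would be one way to keep '.z' complete.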




-- 

Matthias
-- 
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html


Re: Using --fuzzy

2014-11-17 Thread Joe
I'm going to have to digest this for a while. It makes sense, but I have
to work on it a bit before I understand it enough to actually apply it.

This would make a good howto article.

Thanks to both of you.


Using --fuzzy

2014-11-16 Thread Joe
I have a lot of files (and directories) (up to a few hundred at a time)
that I get from various sources. Some time after I get them (after they
are already backed up), I often have to move them around and normalize
their names.

When I do this, rsync sees them as unrelated to the copies of these
files which are already on the backup destination. When I can't use the
--delete option for various reasons, this causes multiple copies of the
files to be saved in the backup destination.

I see that there is a --fuzzy option which, specified twice, may address
this issue.

Is there a tutorial or howto that addresses this so I can better
understand it before I start experimenting? I don't want to have to
reinvent strategies which, doubtless, already exist.



Re: Using --fuzzy

2014-11-16 Thread Karl O. Pinc
On 11/16/2014 03:53:12 PM, Joe wrote:
> I have a lot of files (and directories) (up to a few hundred at a time)
> that I get from various sources. Some time after I get them (after they
> are already backed up), I often have to move them around and normalize
> their names.
>
> When I do this, rsync sees them as unrelated to the copies of these
> files which are already on the backup destination.

I don't know if it suits your use case but
you could consider using hardlinks.

If, instead of moving the files, you hardlinked them
then rsync with -H would see the files as being the same.

(Hardlinking can only be done within a filesystem.)

Then you'd have to delete the original filenames and
rsync again.

This is only practicable if it's easy to delete
the old filenames, say, if all the new files
arrive in a single directory that can later
be deleted.
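
A rough sketch of that sequence, in case it helps. The helper name relink and all paths are invented for illustration; only ln, rm and rsync's -a and -H options are real.

```shell
#!/bin/sh
# Sketch only: relink is a hypothetical helper name.
# Give OLD a second name NEW on the same filesystem instead of moving it.
relink() {
    old=$1
    new=$2
    mkdir -p "$(dirname "$new")"
    ln "$old" "$new"
}

# Sequence described above (SRC, DEST and the file names are placeholders):
#   relink SRC/incoming/raw_name.dat SRC/sorted/clean-name.dat
#   rsync -aH SRC/ DEST/      # -H lets rsync recreate the link on DEST
#   rm -r SRC/incoming        # delete the original names
#   rsync -aH SRC/ DEST/      # rsync again; removing the old names from DEST
#                             # additionally needs --delete or a manual cleanup
```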



Karl k...@meme.com
Free Software:  You don't pay back, you pay forward.
 -- Robert A. Heinlein


Re: Using --fuzzy

2014-11-16 Thread Joe
Great idea which I will keep in mind for other cases!

In this case, however, the backups are on separate partitions on
external USB drives (I have a notebook), so hard links won't work.

Joe




Re: Using --fuzzy

2014-11-16 Thread Karl O. Pinc
The backups can be on separate partitions.  What must be on one partition is
the file and its hard link.


Karl k...@meme.com
Free Software: You don't pay back, you pay forward.
-- Robert A. Heinlein


Problem using --fuzzy with --*-dest flags

2014-03-06 Thread Graham 5915
Hi,

I'm trying to get the --fuzzy option to work with --compare-dest with rsync
3.1.0.

I'm testing with two files. C_VOL-b001-i3818.spi and
C_VOL-b001-i3816.spi are copies of one another, but I've modified the
latter a bit with a hex editor as a test. The modified date is still the
same.

This works for me (fuzzy is utilized against the 3816 test file):

rsync --fuzzy --fuzzy -vv C_VOL-b001-i3818.spi rsync://user@localhost/SHARENAME/dest
Contents of SHARENAME/dest: C_VOL-b001-i3816.spi

This does not work:

rsync --fuzzy --fuzzy -vv --copy-dest=../alt C_VOL-b001-i3818.spi
rsync://user@localhost/SHARENAME/dest
Contents of SHARENAME/dest: empty
Contents of SHARENAME/alt: C_VOL-b001-i3816.spi

I do get complaints that:
file has vanished[...][..]alt/C_VOL-b001-i3816.spi (in SHARENAME)
...so it does appear to be seeing the contents of the alt folder. Fuzzy
isn't being run against it, however.

I'd appreciate any insight on this.

PS: I posted another thread having to do with issues with the --debug=FUZZY
flag, but I think that's a separate issue so I'm posting this separately
for clarity.

Thanks
Graham

Re: rsync 2.6.5 segfault using --fuzzy --link-dest

2005-06-12 Thread Wayne Davison
On Sat, Jun 11, 2005 at 02:05:48PM -0400, Erik Jan Tromp wrote:
> #0  0x08060566 in flist_find ()
> #1  0x0804c6cd in recv_generator ()

OK, the crash turned out to be caused by an empty file-list not getting
its high value set correctly.  If such an empty list gets passed to
flist_find(), it would crash.  This is not something that normally
happens, but in the case where an empty destination directory is matched
up with a link-dest directory that has a file that is present but not
up-to-date, rsync triggers the bug.  Attached is a patch to fix this.

Thanks for your help!

..wayne..
--- flist.c	27 May 2005 18:15:18 -0000	1.297
+++ flist.c	12 Jun 2005 06:04:10 -0000
@@ -1471,8 +1471,12 @@ static void clean_flist(struct file_list
 {
 	int i, prev_i = 0;
 
-	if (!flist || flist->count == 0)
+	if (!flist)
 		return;
+	if (flist->count == 0) {
+		flist->high = -1;
+		return;
+	}
 
 	sorting_flist = flist;
 	qsort(flist->files, flist->count,

Re: rsync 2.6.5 segfault using --fuzzy --link-dest

2005-06-12 Thread Erik Jan Tromp
On Sat, 11 Jun 2005 23:22:39 -0700
Wayne Davison [EMAIL PROTECTED] wrote:

> OK, the crash turned out to be caused by an empty file-list not getting
> its high value set correctly.  If such an empty list gets passed to
> flist_find(), it would crash.  This is not something that normally
> happens, but in the case where an empty destination directory is matched
> up with a link-dest directory that has a file that is present but not
> up-to-date, rsync triggers the bug.  Attached is a patch to fix this.

I seem to have a habit of doing stuff that doesn't 'normally happen'. :)

> Thanks for your help!

Rebuilt with this & the max verbosity patches, deployed, tested. Works like an
absolute charm.

Thanks for the quick turnaround on these fixes.

Erik

-- 
I really want a license to do just two things: make the code available
to others, and make sure that improvements stay that way. That's really
it. Nothing more, nothing less. Everything else is fluff.
 -- Linus Torvalds



Re: rsync 2.6.5 segfault using --fuzzy --link-dest

2005-06-11 Thread Wayne Davison
On Fri, Jun 10, 2005 at 05:14:57AM -0400, Erik Jan Tromp wrote:
> if I remove every possible option except --fuzzy & --link-dest,
> segfault every time.

I haven't seen that in my testing.  One easy thing to do is to make sure
that core dumping is enabled and look at a backtrace:

ulimit -c unlimited
/path/to/non-stripped/rsync ...
gdb /path/to/non-stripped/rsync /path/to/core
bt

The backtrace (bt) command will tell us where the program is crashing,
and should help me to find the bug.  Note that a non-stripped rsync can
be found in the build dir, and that the core file will probably be in
the destination directory of your pull transfer.

..wayne..


Re: rsync 2.6.5 segfault using --fuzzy --link-dest

2005-06-11 Thread Erik Jan Tromp
On Sat, 11 Jun 2005 09:03:07 -0700
Wayne Davison [EMAIL PROTECTED] wrote:

> On Fri, Jun 10, 2005 at 05:14:57AM -0400, Erik Jan Tromp wrote:
> > if I remove every possible option except --fuzzy & --link-dest,
> > segfault every time.
>
> I haven't seen that in my testing.  One easy thing to do is to make sure
> that core dumping is enabled and look at a backtrace:
>
> ulimit -c unlimited
> /path/to/non-stripped/rsync ...
> gdb /path/to/non-stripped/rsync /path/to/core
> bt

Concise directions.. perfect.

> The backtrace (bt) command will tell us where the program is crashing,
> and should help me to find the bug.  Note that a non-stripped rsync can
> be found in the build dir, and that the core file will probably be in
> the destination directory of your pull transfer.

Source tree was wiped after I packaged, so did a quick rebuild & tossed the
non-stripped bin in ~/bin/. No dev tools on oxygen (backup server), so into
~/bin/ for gdb as well.

Following is a complete dump.

-- 8< --

[EMAIL PROTECTED] ~]# ulimit -c unlimited
[EMAIL PROTECTED] ~]# ~/bin/rsync --archive --delete-during --fuzzy --hard-links --numeric-ids --quiet --sparse --temp-dir /backup/helium/ --link-dest /backup/hydrogen/saturday/ --password-file /backup/helium/.password rsync://[EMAIL PROTECTED]/backup/ /backup/helium/saturday/
Segmentation fault (core dumped)
[EMAIL PROTECTED] ~]# ~/bin/gdb ~/bin/rsync /backup/helium/saturday/core
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i486-slackware-linux"...Using host libthread_db library "/lib/libthread_db.so.1".

Core was generated by `/root/bin/rsync --archive --delete-during --fuzzy --hard-links --numeric-ids --'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/libpopt.so.0...done.
Loaded symbols for /usr/lib/libpopt.so.0
Reading symbols from /lib/libresolv.so.2...done.
Loaded symbols for /lib/libresolv.so.2
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
#0  0x08060566 in flist_find ()
(gdb) bt
#0  0x08060566 in flist_find ()
#1  0x0804c6cd in recv_generator ()
#2  0x0804db9a in generate_files ()
#3  0x08056390 in do_recv ()
#4  0x08056942 in client_run ()
#5  0x0806ab2b in start_socket_client ()
#6  0x08056e9f in start_client ()
#7  0x080573fd in main ()
(gdb) quit

-- 8< --

Erik

-- 
I really want a license to do just two things: make the code available
to others, and make sure that improvements stay that way. That's really
it. Nothing more, nothing less. Everything else is fluff.
 -- Linus Torvalds



rsync 2.6.5 segfault using --fuzzy --link-dest

2005-06-10 Thread Erik Jan Tromp
I've been reworking my backup script & decided to give some of the newer
options a try. It would appear I've found a combination that doesn't play nice.

$ rsync --archive --delete-during --fuzzy --hard-links --numeric-ids
 --quiet --sparse --temp-dir /backup/helium/
 --link-dest /backup/hydrogen/tuesday/
 --password-file /backup/helium/.password
 rsync://[EMAIL PROTECTED]/backup/ /backup/helium/tuesday/   
Segmentation fault

If I remove either of the --fuzzy or --link-dest options from the above
command, no more segfault. Conversely, if I remove every possible option except
--fuzzy & --link-dest, segfault every time.

You'll undoubtedly want more info to dig in further. Being that I'm unfamiliar
with gdb & strace, example usage would be helpful.

Erik

-- 
I really want a license to do just two things: make the code available
to others, and make sure that improvements stay that way. That's really
it. Nothing more, nothing less. Everything else is fluff.
 -- Linus Torvalds
