Re: Using --fuzzy
On 16.11.2014 18:38, Karl O. Pinc wrote:
> On 11/16/2014 03:53:12 PM, Joe wrote:
> > I have a lot of files (and directories) (up to a few hundred at a time) that I get from various sources. Some time after I get them (after they are already backed up), I often have to move them around and normalize their names. When I do this, rsync sees them as unrelated to the copies of these files which are already on the backup destination.
>
> I don't know if it suits your use case but you could consider using hardlinks.

It should be noted that using hardlinks has one major caveat: order.

It only saves a copy (that is, avoids a retransfer) when the new hardlink appears in the hierarchy AFTER the original file. (This is true for incremental-recursion mode, the default for rsync >= 3.0. It might work differently for < 3.0 or with --no-inc-recursive, but I haven't tried.) Otherwise rsync will copy the new file and later hardlink the old file to the new file, and not the other way around.

So I personally use a directory '.z' in the root of a hierarchy where each file has an additional hardlink, so I can move files around in the hierarchy however I want. That way rsync sees the '.z' directory first and acts accordingly.

Such a directory can be created after the fact. Make a directory that is LAST in sort order. Assuming plain ASCII filenames:

    mkdir zzz

Then link all files into that directory and rsync (don't forget to add -H). Then rename it to be first in sort order (on both sides!):

    mv zzz .z

And after you have made the necessary changes to your procedures to make the additional hardlink, you are free to move files around without rsync having to copy them each time they are moved.

After deleting files you can use:

    find .z -type f -links 1 -delete

to find and delete files that don't have an additional hardlink.

--
Matthias

--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
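The link-directory procedure described in the message above can be sketched as a small shell script. This is only an illustration under stated assumptions: the helper name link_farm, the scratch directory layout, and the choice to name each extra link after the file's inode number are invented here and do not come from the thread. The rsync step is left as a comment with a placeholder destination, and file names containing newlines are not handled.

```shell
#!/bin/sh
set -eu

# Hypothetical helper (not from the thread): give every regular file under
# root an extra hard link in root/farm. Links are named by inode number so
# that two files with the same basename cannot collide.
link_farm() {
    root=$1
    farm=$2
    mkdir -p "$root/$farm"
    # Skip the link directory itself and list all other regular files.
    find "$root" -path "$root/$farm" -prune -o -type f -print |
    while IFS= read -r f; do
        inode=$(ls -i "$f" | awk '{print $1}')
        # Only link files that have no link in the farm yet.
        [ -e "$root/$farm/$inode" ] || ln "$f" "$root/$farm/$inode"
    done
}

work=$(mktemp -d)
mkdir -p "$work/src/incoming" "$work/src/sorted"
echo one > "$work/src/incoming/a.txt"
echo two > "$work/src/incoming/b.txt"

# 1. Build the link directory under a name that sorts LAST, then sync with -H
#    (for example: rsync -aH "$work/src/" DEST/ ; DEST is a placeholder).
link_farm "$work/src" zzz

# 2. Rename it so that it sorts FIRST (to be done on both sides).
mv "$work/src/zzz" "$work/src/.z"

# 3. Files can now be moved and renamed freely; the inode stays the same.
mv "$work/src/incoming/b.txt" "$work/src/sorted/B.txt"

# 4. After deleting a file, drop the link that is left without a partner.
rm "$work/src/incoming/a.txt"
find "$work/src/.z" -type f -links 1 -delete

# Show how many links remain in .z
ls -A "$work/src/.z" | wc -l
```

Naming the links by inode is just one way to avoid name clashes inside the link directory; any scheme that yields unique names would do.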
Re: Using --fuzzy
I'm going to have to digest this for a while. It makes sense, but I have to work on it a bit before I understand it enough to actually apply it.

This would make a good howto article.

Thanks to both of you.

On 11/17/2014 04:56 AM, Matthias Schniedermeyer wrote:
> It should be noted that using hardlinks has 1 major caveat: Order
> [...]
Using --fuzzy
I have a lot of files (and directories) (up to a few hundred at a time) that I get from various sources. Some time after I get them (after they are already backed up), I often have to move them around and normalize their names. When I do this, rsync sees them as unrelated to the copies of these files which are already on the backup destination.

When I can't use the --delete option for various reasons, this causes multiple copies of the files to be saved in the backup destination.

I see that there is a --fuzzy option which, specified twice, may address this issue. Is there a tutorial or howto that addresses this so I can better understand it before I start experimenting? I don't want to have to reinvent strategies which, doubtless, already exist.
Re: Using --fuzzy
On 11/16/2014 03:53:12 PM, Joe wrote:
> I have a lot of files (and directories) (up to a few hundred at a time) that I get from various sources. Some time after I get them (after they are already backed up), I often have to move them around and normalize their names. When I do this, rsync sees them as unrelated to the copies of these files which are already on the backup destination.

I don't know if it suits your use case but you could consider using hardlinks.

If, instead of moving the files, you hardlinked them, then rsync with -H would see the files as being the same. (Hardlinking can only be done within a filesystem.) Then you'd have to delete the original filenames and rsync again.

This is only practicable if it's easy to delete the old filenames, say, if all the new files arrive in a single directory that can later be deleted.

Karl k...@meme.com
Free Software: You don't pay back, you pay forward. -- Robert A. Heinlein
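The suggestion above (link instead of move, sync with -H, then remove the old names and sync again) might look like the following. It is a sketch, not a tested backup procedure: the directory and file names are invented, both sides live in one scratch directory, and using --delete in the last step is an assumption about how the old names get removed from the destination. The script exits early if rsync is not installed.

```shell
#!/bin/sh
set -eu

# Skip the demo when rsync is not available.
command -v rsync >/dev/null 2>&1 || { echo "rsync not found" >&2; exit 0; }

# Hypothetical helper: print the inode number of a file.
inode_of() { ls -i "$1" | awk '{print $1}'; }

work=$(mktemp -d)
mkdir -p "$work/src/incoming" "$work/src/sorted" "$work/dst"
echo payload > "$work/src/incoming/raw_file.dat"

# First backup: the file arrives under its original name.
rsync -aH "$work/src/" "$work/dst/"

# Instead of moving the file, give it its new name as a hard link.
ln "$work/src/incoming/raw_file.dat" "$work/src/sorted/clean-name.dat"

# Second backup: with -H the new name is expected to be recreated on the
# destination as a link to the copy that is already there.
rsync -aH "$work/src/" "$work/dst/"
old_inode=$(inode_of "$work/dst/incoming/raw_file.dat")
new_inode=$(inode_of "$work/dst/sorted/clean-name.dat")
echo "destination inodes: old=$old_inode new=$new_inode"

# Finally remove the old names and sync again (assumes --delete is usable).
rm -r "$work/src/incoming"
rsync -aH --delete "$work/src/" "$work/dst/"
```

The new name sorts after the old one here ("incoming" before "sorted"), which matters for the ordering caveat raised elsewhere in this thread.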
Re: Using --fuzzy
Great idea which I will keep in mind for other cases!

In this case, however, the backups are on separate partitions on external USB drives (I have a notebook), so hard links won't work.

Joe

On 11/16/2014 07:38 PM, Karl O. Pinc wrote:
> If, instead of moving the files, you hardlinked them then rsync with -H would see the files as being the same. (Hardlinking can only be done within a filesystem.)
> [...]
Re: Using --fuzzy
The backups can be on separate partitions. What must be on one partition is the file and its hard link.

On November 16, 2014 6:58:26 PM CST, Joe jose...@main.nc.us wrote:
> In this case, however, the backups are on separate partitions on external USB drives (I have a notebook), so hard links won't work.
> [...]

Karl k...@meme.com
Free Software: You don't pay back, you pay forward. -- Robert A. Heinlein
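Whether a file and the place for its hard link sit on the same filesystem can be checked before calling ln. The helper below is a hypothetical sketch: same_fs and mount_point are not existing tools, the check compares the mount point that df -P reports for each path, and it assumes mount points without spaces.

```shell
#!/bin/sh
set -eu

# Print the mount point of the filesystem holding path $1.
# df -P prints one header line, then one line for the path; the mount point
# is the last field (this breaks for mount points that contain spaces).
mount_point() { df -P "$1" | awk 'NR == 2 { print $NF }'; }

# Succeed when both paths report the same mount point.
same_fs() { [ "$(mount_point "$1")" = "$(mount_point "$2")" ]; }

work=$(mktemp -d)
mkdir "$work/new"
echo data > "$work/old.txt"

if same_fs "$work/old.txt" "$work/new"; then
    ln "$work/old.txt" "$work/new/renamed.txt"
else
    echo "different filesystems: a hard link is not possible" >&2
fi
```

The check is conservative: two bind mounts of one filesystem report different mount points and are treated as different.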
Problem using --fuzzy with --*-dest flags
Hi,

I'm trying to get the --fuzzy option to work with --compare-dest with rsync 3.1.0. I'm testing with two files. C_VOL-b001-i3818.spi and C_VOL-b001-i3816.spi are copies of one another, but I've modified the latter a bit with a hex editor as a test. The modified date is still the same.

This works for me (fuzzy is utilized against the 3816 test file):

    rsync --fuzzy --fuzzy -vv C_VOL-b001-i3818.spi rsync://user@localhost/SHARENAME/dest

Contents of SHARENAME/dest: C_VOL-b001-i3816.spi

This does not work:

    rsync --fuzzy --fuzzy -vv --copy-dest=../alt C_VOL-b001-i3818.spi rsync://user@localhost/SHARENAME/dest

Contents of SHARENAME/dest: empty
Contents of SHARENAME/alt: C_VOL-b001-i3816.spi

I do get complaints that:

    file has vanished[...][..]alt/C_VOL-b001-i3816.spi (in SHARENAME)

...so it does appear to be seeing the contents of the alt folder. Fuzzy isn't being run against it, however.

I'd appreciate any insight on this.

PS: I posted another thread having to do with issues with the --debug=FUZZY flag, but I think that's a separate issue so I'm posting this separately for clarity.

Thanks
Graham
Re: rsync 2.6.5 segfault using --fuzzy --link-dest
On Sat, Jun 11, 2005 at 02:05:48PM -0400, Erik Jan Tromp wrote:
> #0 0x08060566 in flist_find ()
> #1 0x0804c6cd in recv_generator ()

OK, the crash turned out to be caused by an empty file-list not getting its "high" value set correctly. If such an empty list gets passed to flist_find(), it would crash. This is not something that normally happens, but in the case where an empty destination directory is matched up with a link-dest directory that has a file that is present but not up-to-date, rsync triggers the bug.

Attached is a patch to fix this. Thanks for your help!

..wayne..

--- flist.c	27 May 2005 18:15:18 -0000	1.297
+++ flist.c	12 Jun 2005 06:04:10 -0000
@@ -1471,8 +1471,12 @@ static void clean_flist(struct file_list
 {
 	int i, prev_i = 0;
 
-	if (!flist || flist->count == 0)
+	if (!flist)
 		return;
+	if (flist->count == 0) {
+		flist->high = -1;
+		return;
+	}
 
 	sorting_flist = flist;
 	qsort(flist->files, flist->count,
Re: rsync 2.6.5 segfault using --fuzzy --link-dest
On Sat, 11 Jun 2005 23:22:39 -0700 Wayne Davison [EMAIL PROTECTED] wrote:
> OK, the crash turned out to be caused by an empty file-list not getting its "high" value set correctly. If such an empty list gets passed to flist_find(), it would crash. This is not something that normally happens, but in the case where an empty destination directory is matched up with a link-dest directory that has a file that is present but not up-to-date, rsync triggers the bug.
>
> Attached is a patch to fix this.

I seem to have a habit of doing stuff that doesn't 'normally happen'. :)

> Thanks for your help!

Rebuilt with this & the max verbosity patches, deployed, tested. Works like an absolute charm. Thanks for the quick turnaround on these fixes.

Erik
--
I really want a license to do just two things: make the code available to others, and make sure that improvements stay that way. That's really it. Nothing more, nothing less. Everything else is fluff.
    -- Linus Torvalds
Re: rsync 2.6.5 segfault using --fuzzy --link-dest
On Fri, Jun 10, 2005 at 05:14:57AM -0400, Erik Jan Tromp wrote:
> if I remove every possible option except --fuzzy --link-dest, segfault every time.

I haven't seen that in my testing. One easy thing to do is to make sure that core dumping is enabled and look at a backtrace:

    ulimit -c unlimited
    /path/to/non-stripped/rsync ...
    gdb /path/to/non-stripped/rsync /path/to/core
    bt

The backtrace (bt) command will tell us where the program is crashing, and should help me to find the bug. Note that a non-stripped rsync can be found in the build dir, and that the core file will probably be in the destination directory of your pull transfer.

..wayne..
Re: rsync 2.6.5 segfault using --fuzzy --link-dest
On Sat, 11 Jun 2005 09:03:07 -0700 Wayne Davison [EMAIL PROTECTED] wrote:
> On Fri, Jun 10, 2005 at 05:14:57AM -0400, Erik Jan Tromp wrote:
> > if I remove every possible option except --fuzzy --link-dest, segfault every time.
>
> I haven't seen that in my testing. One easy thing to do is to make sure that core dumping is enabled and look at a backtrace:
>
>     ulimit -c unlimited
>     /path/to/non-stripped/rsync ...
>     gdb /path/to/non-stripped/rsync /path/to/core
>     bt

Concise directions.. perfect.

> The backtrace (bt) command will tell us where the program is crashing, and should help me to find the bug. Note that a non-stripped rsync can be found in the build dir, and that the core file will probably be in the destination directory of your pull transfer.

Source tree was wiped after I packaged, so did a quick rebuild & tossed the non-stripped bin in ~/bin/. No dev tools on oxygen (backup server), so into ~/bin/ for gdb as well. Following is a complete dump.

-- 8< --
[EMAIL PROTECTED] ~]# ulimit -c unlimited
[EMAIL PROTECTED] ~]# ~/bin/rsync --archive --delete-during --fuzzy --hard-links --numeric-ids --quiet --sparse --temp-dir /backup/helium/ --link-dest /backup/hydrogen/saturday/ --password-file /backup/helium/.password rsync://[EMAIL PROTECTED]/backup/ /backup/helium/saturday/
Segmentation fault (core dumped)
[EMAIL PROTECTED] ~]# ~/bin/gdb ~/bin/rsync /backup/helium/saturday/core
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i486-slackware-linux"...Using host libthread_db library "/lib/libthread_db.so.1".

Core was generated by `/root/bin/rsync --archive --delete-during --fuzzy --hard-links --numeric-ids --'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/libpopt.so.0...done.
Loaded symbols for /usr/lib/libpopt.so.0
Reading symbols from /lib/libresolv.so.2...done.
Loaded symbols for /lib/libresolv.so.2
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
#0 0x08060566 in flist_find ()
(gdb) bt
#0 0x08060566 in flist_find ()
#1 0x0804c6cd in recv_generator ()
#2 0x0804db9a in generate_files ()
#3 0x08056390 in do_recv ()
#4 0x08056942 in client_run ()
#5 0x0806ab2b in start_socket_client ()
#6 0x08056e9f in start_client ()
#7 0x080573fd in main ()
(gdb) quit
-- 8< --

Erik
--
I really want a license to do just two things: make the code available to others, and make sure that improvements stay that way. That's really it. Nothing more, nothing less. Everything else is fluff.
    -- Linus Torvalds
rsync 2.6.5 segfault using --fuzzy --link-dest
I've been reworking my backup script & decided to give some of the newer options a try. It would appear I've found a combination that doesn't play nice.

    $ rsync --archive --delete-during --fuzzy --hard-links --numeric-ids --quiet --sparse --temp-dir /backup/helium/ --link-dest /backup/hydrogen/tuesday/ --password-file /backup/helium/.password rsync://[EMAIL PROTECTED]/backup/ /backup/helium/tuesday/
    Segmentation fault

If I remove either of the --fuzzy or --link-dest options from the above command, no more segfault. Conversely, if I remove every possible option except --fuzzy --link-dest, segfault every time.

You'll undoubtedly want more info to dig in further. Being that I'm unfamiliar with gdb & strace, example usage would be helpful.

Erik
--
I really want a license to do just two things: make the code available to others, and make sure that improvements stay that way. That's really it. Nothing more, nothing less. Everything else is fluff.
    -- Linus Torvalds