Re: rsync feature suggestion
Dave Dykstra wrote
> What Hadmut wants is the oft-requested and discussed "files-from" option
> that I once offered to write but haven't been able to get to. Andy Schor
> in http://lists.samba.org/pipermail/rsync/2001-November/005272.html posted
> a patch for something similar but it only worked when the sender was on the
> local machine and not when it was remote (among other issues). I don't
> believe you've posted your patch, Justin; does your "files-from" directly
> contain the list of files to send and skip the recursive traversal? If so,
> I don't see the point of having rsync have the extra regex options you
> mention because those could all be done by external greps that pre-process
> the file list.

Mine is also victim to the "sender on the local machine" problem, although
that could be easily rectified. By default, my files-from doesn't do any
recursive processing, but you can control this on a per-file basis within
the list of filenames. Here's an example:

/a/file
/another/file
/some/directory
/some/other/directory 1

would recurse on /some/other/directory but nothing else. I also made the
list of files base-64 encoded to avoid some obvious problems, and it works
with all filenames (ascii, UTF-N).

-justinb

--
To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.tuxedo.org/~esr/faqs/smart-questions.html
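The list format described above could be handled roughly as follows. This is only a sketch under stated assumptions: the patch itself was not posted, so the exact encoding is a guess. It assumes each entry is the base-64 encoding of the UTF-8 path, optionally followed by a space and the flag 1 to request recursion. The function names encode_entry and parse_entry are hypothetical.

```python
import base64


def encode_entry(path, recurse=False):
    """Encode one list entry (assumed format): base64(path), then ' 1' to recurse."""
    token = base64.b64encode(path.encode("utf-8")).decode("ascii")
    return token + (" 1" if recurse else "")


def parse_entry(line):
    """Decode one list entry back into a (path, recurse) pair."""
    fields = line.split()
    # The base-64 alphabet contains no whitespace, so splitting is safe
    # even when the original path contains spaces or newlines.
    path = base64.b64decode(fields[0]).decode("utf-8")
    recurse = len(fields) > 1 and fields[1] == "1"
    return path, recurse
```

Encoding each name this way keeps the list line-oriented even for filenames that themselves contain spaces or newlines, which is presumably one of the "obvious problems" mentioned.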
Re: rsync feature suggestion
What Hadmut wants is the oft-requested and discussed "files-from" option
that I once offered to write but haven't been able to get to. Andy Schor
in http://lists.samba.org/pipermail/rsync/2001-November/005272.html posted
a patch for something similar but it only worked when the sender was on the
local machine and not when it was remote (among other issues). I don't
believe you've posted your patch, Justin; does your "files-from" directly
contain the list of files to send and skip the recursive traversal? If so,
I don't see the point of having rsync have the extra regex options you
mention because those could all be done by external greps that pre-process
the file list.

- Dave

On Fri, Jan 03, 2003 at 11:51:05AM -0600, Justin Banks wrote:
> Max Bowsher wrote
> > Hadmut Danisch wrote:
> > > I'd like to suggest a new feature to rsync.
> > >
> > > I am mirroring a debian archive, but unfortunately,
> > > debian mixes all files of several distributions in a
> > > subtree /pool. There is no way to select only the files
> > > of a certain distribution through a simple exclude/include
> > > expression.
> > >
> > > There is a tool called debmirror, which first downloads
> > > the distribution index files, extracts all the filenames/paths
> > > of the files needed and then calls rsync for every single file.
> > > That's certainly not useful, especially since rsync shows the
> > > server's motd for every single file.
> >
> > I was about to suggest:
> > $ rsync --include-from=list-file --exclude=\*
> > but of course that will exclude the parent directories of files you want,
> > causing them to be ignored.
> >
> > This might work:
> > $ rsync --include-from=list-file --include=\*\*/ --exclude=\*
> >
> > although it will mirror the entire directory structure (but not unspecified
> > files).
> >
> > Probably, rsync should be taught that: "If I explicitly include a file, look
> > for it explicitly, even if I've excluded a parent directory."
>
> Not too long ago, I modified/mangled rsync to do
>
> rsync --files-from /some/file --include-regexes /some/regular/expressions \
>       --exclude-regexes /some/regular/expressions
>
> such that all the files in /some/file would be sent iff they matched the
> posix regexes in --include-regexes and didn't match the ones in
> --exclude-regexes (if present).
>
> I don't have a wide variety of platforms to test it on, but it worked okay
> on linux, solaris, and irix.
>
> -justinb
Re: rsync feature suggestion
Hi,

I just sent an answer to Edward's similar suggestion to the list.

regards
Hadmut
Re: rsync feature suggestion
On Fri, Jan 03, 2003 at 11:28:37AM -0600, Edward King wrote:
> Might it be possible to take the file list that you want to feed to
> rsync and turn it into an rsync.conf file?

It might be possible, but it may be ambiguous and is definitely not
efficient, since --include defines a pattern, not a file name/path.

As far as I know, rsync has to check every single file against all
include/exclude patterns. With one pattern per wanted file, that's a
complexity of O(n^2). I'm talking about directories with 30,000 to
1,000,000 files. This could end up in 10^12 file name/pattern
comparisons, and that's certainly not what you want to have.

If you read a list of plain filenames, you do not need to perform
pattern matching, but can use a simple associative/hash array and
check extremely fast whether a given filename is to be copied or not.

That's a very important difference from a list of patterns.

regards
Hadmut
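The difference between the two approaches can be sketched in a few lines. This is an illustration only, not rsync's actual matching code: Python's fnmatchcase stands in for rsync's pattern matcher, and the function names are made up for the example. The first function tests every candidate against every pattern (up to n * m comparisons, so about 10^12 for a million files and a million patterns); the second builds a hash set once and does one lookup per candidate.

```python
from fnmatch import fnmatchcase


def select_by_patterns(candidates, patterns):
    """Keep candidates matching any pattern: up to len(candidates) * len(patterns) tests."""
    return [
        path
        for path in candidates
        if any(fnmatchcase(path, pattern) for pattern in patterns)
    ]


def select_by_set(candidates, wanted):
    """Keep candidates named in the wanted list: one hash lookup per candidate."""
    wanted_set = set(wanted)  # built once, O(len(wanted))
    return [path for path in candidates if path in wanted_set]
```

For plain file names without wildcard characters both functions select the same files; only the amount of work differs.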
Re: rsync feature suggestion
Max Bowsher wrote
> Hadmut Danisch wrote:
> > I'd like to suggest a new feature to rsync.
> >
> > I am mirroring a debian archive, but unfortunately,
> > debian mixes all files of several distributions in a
> > subtree /pool. There is no way to select only the files
> > of a certain distribution through a simple exclude/include
> > expression.
> >
> > There is a tool called debmirror, which first downloads
> > the distribution index files, extracts all the filenames/paths
> > of the files needed and then calls rsync for every single file.
> > That's certainly not useful, especially since rsync shows the
> > server's motd for every single file.
>
> I was about to suggest:
> $ rsync --include-from=list-file --exclude=\*
> but of course that will exclude the parent directories of files you want,
> causing them to be ignored.
>
> This might work:
> $ rsync --include-from=list-file --include=\*\*/ --exclude=\*
>
> although it will mirror the entire directory structure (but not unspecified
> files).
>
> Probably, rsync should be taught that: "If I explicitly include a file, look
> for it explicitly, even if I've excluded a parent directory."

Not too long ago, I modified/mangled rsync to do

rsync --files-from /some/file --include-regexes /some/regular/expressions \
      --exclude-regexes /some/regular/expressions

such that all the files in /some/file would be sent iff they matched the
posix regexes in --include-regexes and didn't match the ones in
--exclude-regexes (if present).

I don't have a wide variety of platforms to test it on, but it worked okay
on linux, solaris, and irix.

-justinb
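The same selection can be done outside rsync by pre-processing the file list, as suggested elsewhere in this thread. The following is a minimal sketch, not the patch described above: it assumes a file is kept when it matches at least one include expression and no exclude expression, and it uses Python's re module, whose syntax is close to but not identical with POSIX regexes. The name filter_file_list is hypothetical.

```python
import re


def filter_file_list(paths, include_regexes, exclude_regexes=()):
    """Keep paths that match some include regex and no exclude regex."""
    includes = [re.compile(expr) for expr in include_regexes]
    excludes = [re.compile(expr) for expr in exclude_regexes]
    return [
        path
        for path in paths
        if any(rx.search(path) for rx in includes)
        and not any(rx.search(path) for rx in excludes)
    ]
```

The filtered list could then be handed to whatever mechanism feeds rsync its file names.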
Re: rsync feature suggestion
Hadmut Danisch wrote:
> I'd like to suggest a new feature to rsync.
>
> I am mirroring a debian archive, but unfortunately,
> debian mixes all files of several distributions in a
> subtree /pool. There is no way to select only the files
> of a certain distribution through a simple exclude/include
> expression.
>
> There is a tool called debmirror, which first downloads
> the distribution index files, extracts all the filenames/paths
> of the files needed and then calls rsync for every single file.
> That's certainly not useful, especially since rsync shows the
> server's motd for every single file.

I was about to suggest:
$ rsync --include-from=list-file --exclude=\*
but of course that will exclude the parent directories of files you want,
causing them to be ignored.

This might work:
$ rsync --include-from=list-file --include=\*\*/ --exclude=\*

although it will mirror the entire directory structure (but not unspecified
files).

Probably, rsync should be taught that: "If I explicitly include a file, look
for it explicitly, even if I've excluded a parent directory."

Max.
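One way around the parent-directory problem, short of changing rsync, is to generate the include list so that it names every ancestor directory of each wanted file. The sketch below is an assumption about how such a list could be built, not something proposed in the thread; the function name include_rules is made up. It relies on two documented rsync pattern rules: a leading slash anchors a pattern at the root of the transfer, and a trailing slash makes it match directories only.

```python
def include_rules(paths):
    """Return anchored include patterns for each file and all of its parent directories."""
    rules = []
    seen = set()
    for path in paths:
        parts = path.strip("/").split("/")
        # One directory rule per ancestor: /pool/, /pool/main/, ...
        for depth in range(1, len(parts)):
            directory = "/" + "/".join(parts[:depth]) + "/"
            if directory not in seen:
                seen.add(directory)
                rules.append(directory)
        # Then the rule for the file itself.
        file_rule = "/" + "/".join(parts)
        if file_rule not in seen:
            seen.add(file_rule)
            rules.append(file_rule)
    return rules
```

Written one rule per line to list-file and followed by --exclude=\*, this descends only into directories that contain a wanted file. File names containing wildcard characters such as * or [ would need escaping, which this sketch does not attempt, and every file is still checked against the whole rule list.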
Re: rsync feature suggestion
Might it be possible to take the file list that you want to feed to rsync
and turn it into an rsync.conf file?

A simple shell script could create the config file and call rsync (with
--config= to specify the temporary config file).

Something like this (syntax most likely is wrong, haven't tested it):

#!/bin/sh
# Start from the existing config, then append one include per listed file.
cat /etc/rsync.conf > rsync_command
# Read one file name per line so that names containing spaces survive.
while IFS= read -r EACH_FILE; do
    # Double quotes, so that ${EACH_FILE} is actually expanded.
    echo " --include=\"${EACH_FILE}\"" >> rsync_command
done < file_list.txt
rsync --config=rsync_command

- Ed King

Hadmut Danisch wrote:
> Hi,
>
> I'd like to suggest a new feature to rsync.
>
> Problem:
>
> Currently, rsync generates a recursive list of files existing at the
> source directory, modifies this list by includes and excludes, and
> then copies these files. That's pretty good in most, but not all cases.
>
> I am mirroring a debian archive, but unfortunately, debian mixes all
> files of several distributions in a subtree /pool. There is no way to
> select only the files of a certain distribution through a simple
> exclude/include expression.
>
> There is a tool called debmirror, which first downloads the
> distribution index files, extracts all the filenames/paths of the
> files needed and then calls rsync for every single file. That's
> certainly not useful, especially since rsync shows the server's motd
> for every single file.
>
> Therefore, I'd like to suggest a new option:
>
> Allow rsync to not build the list of files existing at the source
> directory by recursively walking through the source directory, but by
> reading a file or stdin to get a list of files to be copied.
>
> This would make it possible to mirror the distribution index files in
> a first step, then build the list of files needed, and then download
> these files in a second step.
>
> An alternative method would be to keep the recursive method, but to
> open a pipe to an external program. Before downloading a file, the
> path is printed to the pipe and an answer is read from the pipe.
> Thus, an external filter program can decide for each single file
> whether to copy it or not.
>
> regards
> Hadmut
>
> (Please respond directly, I'm not on your mailing list)
rsync feature suggestion
Hi,

I'd like to suggest a new feature to rsync.

Problem:

Currently, rsync generates a recursive list of files existing at the
source directory, modifies this list by includes and excludes, and
then copies these files. That's pretty good in most, but not all cases.

I am mirroring a debian archive, but unfortunately, debian mixes all
files of several distributions in a subtree /pool. There is no way to
select only the files of a certain distribution through a simple
exclude/include expression.

There is a tool called debmirror, which first downloads the
distribution index files, extracts all the filenames/paths of the
files needed and then calls rsync for every single file. That's
certainly not useful, especially since rsync shows the server's motd
for every single file.

Therefore, I'd like to suggest a new option:

Allow rsync to not build the list of files existing at the source
directory by recursively walking through the source directory, but by
reading a file or stdin to get a list of files to be copied.

This would make it possible to mirror the distribution index files in
a first step, then build the list of files needed, and then download
these files in a second step.

An alternative method would be to keep the recursive method, but to
open a pipe to an external program. Before downloading a file, the
path is printed to the pipe and an answer is read from the pipe.
Thus, an external filter program can decide for each single file
whether to copy it or not.

regards
Hadmut

(Please respond directly, I'm not on your mailing list)
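The external-filter alternative could look roughly like this from the filter program's side. It is a sketch of a protocol that does not exist in rsync: the one-path-per-line request format, the y/n answers, and the name run_filter are all assumptions made for illustration.

```python
def run_filter(wanted, requests, replies):
    """Answer 'y' or 'n' for each path read from requests, one line per path."""
    wanted_set = set(wanted)
    for line in requests:
        path = line.rstrip("\n")
        replies.write("y\n" if path in wanted_set else "n\n")
        # Flush after every answer so the caller is never left waiting on a buffer.
        replies.flush()
```

A real filter would call run_filter(wanted, sys.stdin, sys.stdout), with the wanted list built from the distribution index files mirrored in the first step.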