Re: Excluding most and including some problems continue.
>>> On Friday, October 01, 2010 at 7:54 AM, in message
> + /das
> + /em
> + /enf
> + /internal
> + /itb
> + /medtox
> + /pml
> + /psb
> + /reg
> + /whs
> - /*
> + /*/htdocs
> - /*/*
> + /*/htdocs/docs
> - /*/htdocs/*

Thanks Wayne, this worked well and seems simpler and lazier than my original version.

--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Re: Excluding most and including some problems continue.
On Fri, 2010-10-01 at 16:28 -0400, Benjamin R. Haskell wrote:
> I think I sent a variant of the attached Perl script last time someone
> was asking something similar. What Wayne suggested is better right now
> (that is: while your patterns are very simple -- just a few root-level
> directories, each of which should include the /htdocs/docs/ subdir).
> But if you start adding more, it could get more annoying to have to
> manually fiddle with the rules.
>
> The attached script takes as input lines of the form:
> /rooted/path/to/include
>
> and produces what should work as a filter file.

A similar script is distributed with rsync: support/files-to-excludes.

--
Matt
Re: Excluding most and including some problems continue.
On Fri, 1 Oct 2010, Wayne Davison wrote:
> On Thu, Sep 30, 2010 at 8:27 AM, Ian Skinner wrote:
>> Unfortunately there are some subdirectories in some of these selected
>> */htdocs/docs* directories that are unintentionally being excluded by
>> these rules. I.E. */export/home/enf/htdocs/docs/county/internal/*.
>
> It is usually best to anchor your matching terms, unless you want a term
> to float and match anywhere. In an .rsync-filter file, terms that start
> with a slash are anchored in that file's directory. You can also use a
> wildcard for all the subdir exclusions. For example:
>
> + /das
> + /em
> + /enf
> + /internal
> + /itb
> + /medtox
> + /pml
> + /psb
> + /reg
> + /whs
> - /*
> + /*/htdocs
> - /*/*
> + /*/htdocs/docs
> - /*/htdocs/*
>
> Another alternative is to sprinkle .rsync-filter files throughout your
> hierarchy with localized rules for that part of the hierarchy, but the
> above should do what you want.

I think I sent a variant of the attached Perl script last time someone was asking something similar. What Wayne suggested is better right now (that is: while your patterns are very simple -- just a few root-level directories, each of which should include the /htdocs/docs/ subdir). But if you start adding more, it could get more annoying to have to manually fiddle with the rules.

The attached script takes as input lines of the form:

/rooted/path/to/include

and produces what should work as a filter file. It supports '{one,other}' brace-style expansions (but I think that may be OS-dependent -- I think it works if your system's 'glob()' function supports them).

So, for example, I produced a working filter file for your situation from:

==> input.rsync.rules <==
/{das,em,enf,internal,itb,medtox,pml,psb,reg,whs}/htdocs/docs

$ perl ./rsync-filter-generate.pl input.rsync.rules
[produces rule file] [1]

$ perl ./rsync-filter-generate.pl input.rsync.rules | rsync --include-from=- /path/to/root/
[shows what would be transferred]

$ perl ./rsync-filter-generate.pl input.rsync.rules | rsync --include-from=- /path/to/root/ /path/to/dest/
[does it]

The general strategy:

$ echo /abc/def/ghi | perl ./rsync-filter-generate.pl
+ /abc            -- first include each path component
+ /abc/def        -- one-at-a-time, for each thing to include
+ /abc/def/ghi
- /abc/def/*      -- then exclude everything else at each
- /abc/*          -- level of the hierarchy
- /*

--
Best,
Ben

[1] output for your case:

+ /das
+ /das/htdocs
+ /das/htdocs/docs
+ /das/htdocs/other
+ /em
+ /em/htdocs
+ /em/htdocs/docs
+ /enf
+ /enf/htdocs
+ /enf/htdocs/docs
+ /internal
+ /internal/htdocs
+ /internal/htdocs/docs
+ /itb
+ /itb/htdocs
+ /itb/htdocs/docs
+ /medtox
+ /medtox/htdocs
+ /medtox/htdocs/docs
+ /pml
+ /pml/htdocs
+ /pml/htdocs/docs
+ /psb
+ /psb/htdocs
+ /psb/htdocs/docs
+ /reg
+ /reg/htdocs
+ /reg/htdocs/docs
+ /whs
+ /whs/htdocs
+ /whs/htdocs/docs
- /whs/htdocs/*
- /whs/*
- /reg/htdocs/*
- /reg/*
- /psb/htdocs/*
- /psb/*
- /pml/htdocs/*
- /pml/*
- /medtox/htdocs/*
- /medtox/*
- /itb/htdocs/*
- /itb/*
- /internal/htdocs/*
- /internal/*
- /enf/htdocs/*
- /enf/*
- /em/htdocs/*
- /em/*
- /das/htdocs/*
- /das/*
- /*

[attachment: rsync-filter-generate.pl]

#!/usr/bin/perl
use strict;
use warnings;

# read in all of the paths to include
my @all;
while (<>) {
    chomp;
    push @all, glob;
}

my (%inc, %exc);
for (@all) {
    my @parts = split m{/};
    for (1..$#parts) {
        # include every path component up to the end
        # e.g.
        #   /abc
        #   /abc/def
        #   /abc/def/ghi
        $inc{join "/", @parts[0..$_]}++;
        # exclude every other path component
        # e.g.
        #   /abc/def/*
        #   /abc/*
        #   /*
        $exc{join "/", @parts[0..$_-1]}++;
    }
}

# include things from shortest-to-longest path
# exclude things from longest to shortest
print "+ $_\n" for sort keys %inc;
print "- $_/*\n" for reverse sort keys %exc;
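For anyone who wants to see the general strategy without Perl, here is a minimal shell sketch of the same idea. It is only an illustration: gen_rules is a made-up helper name, it handles a single rooted path per call, and it does no brace expansion or de-duplication the way the script does.

```shell
#!/bin/sh
# Sketch of the include-then-exclude strategy for ONE rooted path.
# gen_rules is a hypothetical helper, not something shipped with rsync.
gen_rules() {
    path=${1#/}          # drop the leading slash: abc/def/ghi
    prefix=""            # path built up so far
    excludes=""          # exclude rules, deepest level first
    old_ifs=$IFS
    IFS=/
    set -f               # no filename expansion while splitting
    set -- $path         # split on slashes into $1, $2, ...
    set +f
    IFS=$old_ifs
    for part in "$@"; do
        # exclude everything else at this level; prepending keeps the
        # deepest level at the top of the exclude list
        excludes="- $prefix/*
$excludes"
        prefix="$prefix/$part"
        # include each path component, shortest path first
        printf '+ %s\n' "$prefix"
    done
    printf '%s' "$excludes"
}

gen_rules /abc/def/ghi
```

Run on /abc/def/ghi this should print the same six rules as the example in the message: the three "+" lines first, then the "-" lines from the deepest level up to "- /*".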
Re: Excluding most and including some problems continue.
On Thu, Sep 30, 2010 at 8:27 AM, Ian Skinner wrote:
> Unfortunately there are some subdirectories in some of these selected
> */htdocs/docs* directories that are unintentionally being excluded by these
> rules. I.E. */export/home/enf/htdocs/docs/county/internal/*.

It is usually best to anchor your matching terms, unless you want a term to float and match anywhere. In an .rsync-filter file, terms that start with a slash are anchored in that file's directory. You can also use a wildcard for all the subdir exclusions. For example:

+ /das
+ /em
+ /enf
+ /internal
+ /itb
+ /medtox
+ /pml
+ /psb
+ /reg
+ /whs
- /*
+ /*/htdocs
- /*/*
+ /*/htdocs/docs
- /*/htdocs/*

Another alternative is to sprinkle .rsync-filter files throughout your hierarchy with localized rules for that part of the hierarchy, but the above should do what you want.

..wayne..
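If the list of top-level directories changes often, the anchored rule set above could be generated rather than typed. The following is only a sketch: gen_anchored_rules is a hypothetical helper name, and the fixed htdocs/docs layout is taken from this thread.

```shell
#!/bin/sh
# Sketch: print the anchored rule set for a list of top-level directories.
# gen_anchored_rules is a hypothetical helper name.
gen_anchored_rules() {
    # 1. keep only the named top-level directories ...
    for d in "$@"; do
        printf '+ /%s\n' "$d"
    done
    # 2. ... then, level by level, keep htdocs and htdocs/docs and drop
    #    their siblings. rsync acts on the first rule that matches, so
    #    each "+" line has to come before the "-" line for the same level.
    cat <<'EOF'
- /*
+ /*/htdocs
- /*/*
+ /*/htdocs/docs
- /*/htdocs/*
EOF
}

gen_anchored_rules das em enf internal itb medtox pml psb reg whs
```

Redirecting the output to /export/home/.rsync-filter would put it where the -F option in the original command looks for it.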
Re: Excluding most and including some problems continue.
In <4ca4be08.2858.00a...@cdpr.ca.gov>, on 09/30/10 at 04:42 PM, "Ian Skinner" said:

Hi,

>> + das/htdocs/docs/*
>> + em/htdocs/docs/*
>> etc.
>> + */
>> - *

>Thanks for the suggestion, but that did not seem to produce the desired
>results. I did not look into why in detail, but a dry run produced files
>from directories I wanted to exclude and apparently not all the files I
>wanted to include.

Did you add --prune-empty-dirs to the command line? This filter setup along with --prune-empty-dirs will copy only the files in the named directories, which is my understanding of what you want.

>After a day of trial and error and internet searching I now have this
>that is really close.

Looks overly complex to me. Taking your example layout and using this filter set

+ das/htdocs/docs/*
+ em/htdocs/docs/*
+ enf/htdocs/docs/*
+ internal/htdocs/docs/*
+ itb/htdocs/docs/*
+ medtox/htdocs/docs/*
+ pml/htdocs/docs/*
+ psb/htdocs/docs/*
+ reg/htdocs/docs/*
+ whs/htdocs/docs/*
+ */
- *

and this command line

  rsync --dry-run --prune-empty-dirs --itemize-changes -a -F export\ to\

I get

  .d..t.. ./
  cd+ home/
  cd+ home/das/
  cd+ home/das/htdocs/
  cd+ home/das/htdocs/docs/
  >f+ home/das/htdocs/docs/SHLNotes.txt
  cd+ home/em/
  cd+ home/em/htdocs/
  cd+ home/em/htdocs/docs/
  >f+ home/em/htdocs/docs/SHLNotes.txt

Which I think is what you want. Every subdirectory contains a file.

Good luck,

Steven

--
"Steven Levine"  eCS/Warp/DIY etc.
www.scoug.com www.ecomstation.com
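A dry run like the one above is easy to repeat against a throwaway tree before touching real data. The sketch below assumes rsync and mktemp are installed; try_filter, the scratch directory layout and the file names are all invented for illustration, and the rules are read with --filter="merge FILE" instead of -F so that no .rsync-filter file has to live inside the source tree.

```shell
#!/bin/sh
# Sketch: try the three-part filter set on a scratch tree with a dry run.
# try_filter and every path below are made up for this example.
try_filter() {
    command -v rsync >/dev/null 2>&1 || {
        echo "rsync not installed, nothing to try" >&2
        return 0
    }
    work=$(mktemp -d) || return 1
    mkdir -p "$work/src/das/htdocs/docs" "$work/src/das/htdocs/other" \
             "$work/src/tmp" "$work/dst"
    echo wanted   > "$work/src/das/htdocs/docs/notes.txt"
    echo unwanted > "$work/src/das/htdocs/other/junk.txt"
    echo unwanted > "$work/src/tmp/junk.txt"
    # include the files directly inside das/htdocs/docs, let rsync descend
    # into every directory, and exclude everything else
    cat > "$work/rules" <<'EOF'
+ das/htdocs/docs/*
+ */
- *
EOF
    # --dry-run: report only; --prune-empty-dirs: drop directories that
    # would end up empty once the excluded files are gone
    rsync --dry-run --prune-empty-dirs --itemize-changes -a \
        --filter="merge $work/rules" "$work/src/" "$work/dst/"
    rm -rf "$work"
}

try_filter
```

The itemized list should mention das/htdocs/docs/notes.txt and neither of the junk.txt files; if it does not, the rules or their order need another look.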
Re: Excluding most and including some problems continue.
>>> Steven Levine Thursday, September 30, 2010 3:22 PM >>>
> It's close, but you need to augment it a bit. Try
> + das/htdocs/docs/*
> + em/htdocs/docs/*
> etc.
> + */
> - *

Thanks for the suggestion, but that did not seem to produce the desired results. I did not look into why in detail, but a dry run produced files from directories I wanted to exclude and apparently not all the files I wanted to include.

After a day of trial and error and internet searching I now have this, which is really close. It copies all the directories I identified earlier that were being falsely excluded. There are still two or three individual files that are not copying for some reason. I am looking into those now.

+ das
+ em
+ enf
+ internal
+ itb
+ medtox
+ pml
+ psb
+ reg
+ whs
+ htdocs
+ docs
- /*
- /das/*
- /em/*
- /enf/*
- /internal/*
- /itb/*
- /medtox/*
- /pml/*
- /psb/*
- /reg/*
- /whs/*
- htdocs/*
Re: Excluding most and including some problems continue.
In <4ca4958c.2858.00a...@cdpr.ca.gov>, on 09/30/10 at 01:50 PM, "Ian Skinner" said:

Hi,

>>or possibly
>>
>> + das/**htdocs/docs*
>> + em/**/htdocs/docs*
>> etc.

>I'm not sure what the difference between the first example and the second
>example is supposed to be?

That's my bad eyes. This should have been

+ das/**htdocs/docs*
+ em/**htdocs/docs*

but it's not going to do what you really want.

>I don't see how that would address my needs, but I'm not sure what the
>double ** symbols represent?

I recommend you read the man page. ** and *** can be very useful.

>But there is no extra directories between
>the "das" and the "htdocs" directories in my use case.

OK. That's why I said I was not sure what you were asking.

>I want to mirror the following directories from the above example and
>exclude everything else.
>/export/home/em/htdocs/docs/*
>/export/home/enf/htdocs/docs/*
>/export/home/das/htdocs/docs/*
>(And seven more similar directories)

OK. This is easier.

>I just tried this filter file somewhat based on your previous suggestion
>but it excluded everything.

It's close, but you need to augment it a bit. Try

+ das/htdocs/docs/*
+ em/htdocs/docs/*
etc.
+ */
- *

and add --prune-empty-dirs to the command line.

Also, if you really only want the contents of specific directories and not the content of any of the subdirectories, you can often avoid the recursive scan and use the --relative option and just list the source directories on the command line.

Steven

--
"Steven Levine"  eCS/Warp/DIY etc.
www.scoug.com www.ecomstation.com
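The --relative suggestion might look roughly like the sketch below. It only builds and prints a command line, it does not run rsync. The destination is a placeholder, relative_cmd is a made-up name, and the "/./" in each source path is the marker rsync (2.6.7 and later) uses with --relative to decide where the preserved part of the path starts, so each directory would arrive as das/htdocs/docs and so on.

```shell
#!/bin/sh
# Sketch: name each wanted directory on the command line instead of
# filtering a scan of all of /export/home. The command is only printed.
# relative_cmd and the destination path are placeholders.
relative_cmd() {
    base=/export/home
    dest=/path/to/dest/
    set --                      # start with an empty argument list
    for d in das em enf internal itb medtox pml psb reg whs; do
        # everything after "/./" is kept as the relative path
        set -- "$@" "$base/./$d/htdocs/docs/"
    done
    echo rsync -a --relative "$@" "$dest"
}

relative_cmd
```

Note that -a still recurses inside each listed directory; the saving is that nothing outside the ten named paths is scanned at all.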
Re: Excluding most and including some problems continue.
>>> Steven Levine Thursday, September 30, 2010 11:14 AM >>>
>I'm not sure I entirely understand what you want, but what about
>
> + das/**/htdocs/docs*
> + em/**/htdocs/docs*
> etc.
>
>or possibly
>
> + das/**htdocs/docs*
> + em/**/htdocs/docs*
> etc.

I'm not sure what the difference between the first example and the second example is supposed to be? I don't see how that would address my needs, but I'm not sure what the double ** symbols represent? But there are no extra directories between the "das" and the "htdocs" directories in my use case.

Maybe this representation would be clearer.

/export
  /home
    /excluded_A
    /em
      /exclude_em-1
      /htdocs
        /exclude_em_htdocs-1
        /exclude_em_htdocs-2
        /docs
      /exclude_em-2
    /exclude_C
    /enf
      /exclude_enf-1
      /htdocs
        /exclude_enf_htdocs-1
        /exclude_enf_htdocs-2
        /docs
      /exclude_enf-2
    /exclude_E
    /das
      /exclude_das-1
      /htdocs
        /exclude_das_htdocs-1
        /exclude_das_htdocs-2
        /docs
      /exclude_das-2

I want to mirror the following directories from the above example and exclude everything else.

/export/home/em/htdocs/docs/*
/export/home/enf/htdocs/docs/*
/export/home/das/htdocs/docs/*
(And seven more similar directories)

I just tried this filter file somewhat based on your previous suggestion but it excluded everything.

+ das/htdocs/docs/*
+ em/htdocs/docs/*
+ enf/htdocs/docs/*
+ internal/htdocs/docs/*
+ itb/htdocs/docs/*
+ medtox/htdocs/docs/*
+ pml/htdocs/docs/*
+ psb/htdocs/docs/*
+ reg/htdocs/docs/*
+ whs/htdocs/docs/*
- /*

TIA
Ian
Re: Excluding most and including some problems continue.
In <4ca449ea.2858.00a...@cdpr.ca.gov>, on 09/30/10 at 08:27 AM, "Ian Skinner" said:

Hi,

>Here is my rsync command as it currently stands.

>/usr/local/bin/rsync -vvv --stats -Pzrtpl --delete
>--password-file=/export/home/webuser/.appprod
>--log-file=/export/home/webuser/logs/rsync-log -F /export/home/
>webu...@appprod::dprweb_extranet/ > rsync-test

>This is doing pretty close to what I want it to do. Which is to mirror
>only the */htdocs/docs* in each of the ten directories (das,em,enf,etc.)
>in the base path of */export/home* and exclude the rest.

I'm not sure I entirely understand what you want, but what about

+ das/**/htdocs/docs*
+ em/**/htdocs/docs*
etc.

or possibly

+ das/**htdocs/docs*
+ em/**/htdocs/docs*
etc.

I'm not sure if the additional slash is required without setting up a testcase.

If you really want just the files matching */htdocs/docs/*, the above needs to change slightly.

Steven

--
"Steven Levine"  eCS/Warp/DIY etc.
www.scoug.com www.ecomstation.com
Excluding most and including some problems continue.
Here is my rsync command as it currently stands.

/usr/local/bin/rsync -vvv --stats -Pzrtpl --delete \
    --password-file=/export/home/webuser/.appprod \
    --log-file=/export/home/webuser/logs/rsync-log \
    -F /export/home/ webu...@appprod::dprweb_extranet/ > rsync-test

Here is the current .rsync-filter file.

+ das
+ em
+ enf
+ internal
+ itb
+ medtox
+ pml
+ psb
+ reg
+ whs
+ htdocs
+ docs
- /*
- das/*
- em/*
- enf/*
- internal/*
- itb/*
- medtox/*
- pml/*
- psb/*
- reg/*
- whs/*
- htdocs/*

This is doing pretty close to what I want it to do, which is to mirror only the */htdocs/docs* in each of the ten directories (das, em, enf, etc.) in the base path of */export/home* and exclude the rest.

Unfortunately there are some subdirectories in some of these selected */htdocs/docs* directories that are unintentionally being excluded by these rules. I.E. */export/home/enf/htdocs/docs/county/internal/*.

[sender] hiding file enf/htdocs/docs/county/internal/gis0402.pdf because of pattern internal/* [per-dir .rsync-filter]
[sender] hiding file enf/htdocs/docs/county/internal/gis1201.pdf because of pattern internal/* [per-dir .rsync-filter]
[sender] hiding directory enf/htdocs/docs/county/internal/gis1201 because of pattern internal/* [per-dir .rsync-filter]
[sender] hiding directory enf/htdocs/docs/county/internal/gis0402 because of pattern internal/* [per-dir .rsync-filter]

Is there an easy way to remedy this in the base .rsync-filter file and/or the rsync command? Some way to say only exclude the base */export/home/internal/* directory, not any lower "internal" directories?

Or is the only way to create sub .rsync-filter files in other directories? My concern with the latter option is that users are in control of these directories and can add and modify them at will. If I find all the special cases today, this is no guarantee that there won't be more special cases tomorrow.