In the message dated: Fri, 27 Jan 2006 21:56:36 EST,
The pithy ruminations from [EMAIL PROTECTED] on 
<Re: [BackupPC-users] New user, few questions> were:
=> 
=> Hi, Mark,
=> 
=> I went back and read your Jan. 16th posting. Sounds like a very ambitious
=> piece of code... Some issues:
=> 
=> --  Because the --include option of rsync (or the include option of an
=> rsync module) is not exclusive, like $Conf{BackupFilesOnly} is, we'd have
=> to work with --exclude (as you said): list everything we didn't want for
=> each set of a share, make each set a separate module, and write them
=> to rsyncd.conf. This doesn't need to happen from BackupPC (although it
=> would be nice). It can run perhaps as a cron job on the client, or with
=> DumpPreUserCmd.

I don't think that's too hard... If you've got an array listing directory tree 
roots (i.e., most often mount points, but individual elements could go lower 
than that), then as you build each include list, everything that's not in that 
include list becomes the exclude list for the same module. After each module is 
built, pop the included directory trees from the array, leaving a smaller set 
to be parsed for constructing the next module.

Here's some really sketchy pseudo code:

------------------------------------------------------------------------------

# Repeatedly carve one module's worth of directories off @all_directories
# until none remain.
my $i = 0;
while (@all_directories) {
        # Everything not chosen for this module becomes its exclude list.
        my @excluded_from_module = @all_directories;

        my @included_in_module =
            function_to_build_sets( \@all_directories, \@all_directories_byGB,
                $SIZELIMIT, $CPUscaling_factor, $network_scaling_factor );

        foreach my $dir (@included_in_module) {
                @all_directories      = pop_by_value( \@all_directories,      $dir );
                @excluded_from_module = pop_by_value( \@excluded_from_module, $dir );
        }

        $module[$i] = format4rsyncdconfig( \@included_in_module,
                                           \@excluded_from_module );

        open( my $fh, '>>', $rsyncd_conf )
            or die "can't append to $rsyncd_conf: $!";
        print $fh $module[$i];
        close($fh);

        $i++;
}
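
For a first cut, the function_to_build_sets step could ignore the CPU and network scaling factors and just do a greedy first-fit pass over the size list. Here's a minimal runnable sketch of building one set (the directory names and GB figures are made up for illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Greedy first-fit sketch of function_to_build_sets: take directories in
# order until adding the next one would push the set past $size_limit.
# A directory larger than the limit still gets a set of its own.
sub build_one_set {
    my ( $dirs, $size_of, $size_limit ) = @_;
    my @set;
    my $total = 0;
    for my $dir (@$dirs) {
        my $gb = $size_of->{$dir} || 0;
        next if @set && $total + $gb > $size_limit;    # doesn't fit, skip
        push @set, $dir;
        $total += $gb;
    }
    return @set;
}

# Hypothetical sizes, in GB, for the top-level trees of a share.
my %size_of = ( '/home/alice' => 40, '/home/bob' => 5, '/home/carol' => 8 );
my @set = build_one_set( [ sort keys %size_of ], \%size_of, 50 );
print "@set\n";    # prints "/home/alice /home/bob" -- carol would overflow
```

Each pass would then remove what it placed (the pop_by_value step above) and repeat until the directory list is empty.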


=> 
=> -- Maybe a simplified one would do as a proof of concept. I'm thinking it

Absolutely.

=> would be easier to only balance number of files, not taking into account
=> sizes, memory availability on server and client, etc. And it would run in
=> two passes, one to list and chunk directories in sets, and a second one to
=> consolidate subtrees.

I don't think you need 2 passes as separate invocations, but conceptually there 
are 2 phases to the operation of building the backup sets.
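
Phase one -- listing and counting -- might look like this; a sketch that tallies the file count under each top-level directory of a share, so phase two can chunk by file count alone (the throwaway temp tree is only there so the sketch runs self-contained):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Phase 1 sketch: count the files under each top-level directory of a
# share, so the chunking phase can balance sets by file count alone.
sub files_per_subdir {
    my ($share) = @_;
    my %count;
    opendir( my $dh, $share ) or die "can't read $share: $!";
    for my $entry ( grep { !/^\.\.?$/ } readdir $dh ) {
        my $dir = "$share/$entry";
        next unless -d $dir;
        $count{$dir} = 0;
        find( sub { $count{$dir}++ if -f }, $dir );
    }
    closedir $dh;
    return %count;
}

# Demo against a throwaway tree standing in for a real mount point:
my $share = tempdir( CLEANUP => 1 );
mkdir "$share/$_" for qw(alice bob);
for my $i ( 1 .. 3 ) {
    open my $fh, '>', "$share/alice/file$i" or die $!;
    close $fh;
}
open my $fh, '>', "$share/bob/file1" or die $!;
close $fh;

my %count = files_per_subdir($share);
printf "%s: %d files\n", $_, $count{$_} for sort keys %count;
```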

=> 
=> It seems doable, but certainly not trivial.

Not trivial, but if you keep the goal in sight--coming up with reasonably sized 
chunks, rather than trying for complete optimization as in the knapsack 
problem--then it's quite doable.

=> 
=> Assuming we have that rsyncd.conf nicely constructed, we'd then have to
=> find out the list of modules that applies to the share to be backed up.
=> Something like
=> 
=>     rsync <host>::
=> 
=> would return, say,
=> 
=>     shareA_set1
=>     shareA_set2
=>     shareB_set1
=>     shareB_set2
=> 
=> and then it would have to be filtered for the relevant modules (assuming
=> they are named sensibly). This could be done at DumpPreUserCmd too. So

Yes.

I wouldn't even do anything like 
        shareA_set1
        shareA_set2
I'd simply do
        set1
        set2
and so on, and then pass back to the BackupPC server the count of rsync 
modules to back up.
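
Picking the relevant modules out of the daemon's listing is then just a grep. A sketch working on canned `rsync <host>::` output, so it runs without a live daemon (the listing text is made up):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# What `rsync <host>::` might print: module name, whitespace, comment.
my $listing = <<'END';
set1            first chunk of the share
set2            second chunk of the share
scratch         not one of ours
END

# The module name is the first whitespace-separated field of each line;
# keep only the ones following the setN naming convention.
my @modules = grep { /^set\d+$/ }
              map  { ( split ' ' )[0] }
              split /\n/, $listing;

print scalar(@modules), " modules to back up: @modules\n";
# prints "2 modules to back up: set1 set2"
```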

=> let's say we are backing up share B: the relevant modules are shareB_set1
=> and shareB_set2.
=> 
=> Next, we need to call rsync with that filtered list of modules. Could it be
=> something like this?:
=> 
=>     $Conf{XferMethod} = 'rsyncd';
=> 
=>     $Conf{DumpPreUserCmd} = <magical command that constructs rsync.conf,
=> and sets a variable>;
=> 
=>     $Conf{RsyncShareName} = ['shareB_set1', 'shareB_set2'];
=> 
=> In particular, I don't know how BackupPC connects to an rsync daemon on the
=> client (I assume it doesn't use the $Conf{RsyncClientCmd} because the
=> comments there say "This setting only matters if $Conf{XferMethod} =
=> 'rsync'"). And can the DumpPre... set a config variable, i.e.,
=> $Conf{RsyncShareName};?
=> 
=> Or would something like this work, which should be executed when the config
=> file is read? I'm assuming BackupPC rereads/require's the per-pc config
=> file before every dump:
=> 
=>     $Conf{XferMethod} = 'rsyncd';
=> 
=>     $Conf{DumpPreUserCmd} = <regular command not related to rsyncd>;
=> 
=>     sub get_modules {
=>         my($host, $share) = @_;
=> 
=>         [...]  # Magic happens here, and @modules is defined.
=>                # This sub could conceivably also construct the rsync.conf
=> on the client.
=> 
=>         return @modules;
=>     }
=> 
=>     $Conf{RsyncShareName} = [ get_modules($host, $share) ];

That's very good. I think that's a good way to put in this extension.
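
For the per-pc config file, that sub might shell out to the client's rsync daemon when the file is read. A sketch only -- the host name is hypothetical, it assumes the setN naming convention above, and it assumes the daemon (and whatever builds its rsyncd.conf) is already in place by the time the config is read:

```perl
# In the per-pc config file (reread before each dump):
sub get_modules {
    my ($host) = @_;
    # Ask the client's rsync daemon for its module list; the module name
    # is the first whitespace-separated field of each output line.
    my @modules = grep { /^set\d+$/ }
                  map  { ( split ' ' )[0] }
                  `rsync ${host}::`;
    return sort @modules;
}

$Conf{XferMethod}     = 'rsyncd';
$Conf{RsyncShareName} = [ get_modules('client.example.com') ];
```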

=> 
=> And one more question: after all this, is each module backed up
=> independently or are all used at once? The concern here, after all, is

Each module would be backed up serially, just as BackupPC/rsyncd does now when 
module names are hard-coded into the config files.

=> memory. If each module was done independently (whether in one or multiple
=> connections) then we'd be limiting memory usage, but if not, we are back
=> where we were...

Right.

Mark

=> 
=> Too many questions, sorry; enough for today.
=> Bernardo
=> 
=> [EMAIL PROTECTED] wrote on 01/25/2006 10:51 AM
=> To: [EMAIL PROTECTED]
=> cc: [EMAIL PROTECTED]
=> Subject: Re: [BackupPC-users] New user, few questions
=> Please respond to backuppc
=> 
=> In the message dated: Tue, 24 Jan 2006 19:16:26 EST,
=> The pithy ruminations from [EMAIL PROTECTED] on
=> <Re: [BackupPC-users] New user, few questions> were:
=> =>
=> => Dan,
=>
=> =>
=> => > Alternately, just break your backup set into two (or more) pieces.
=> =>
=> => Ah, I haven't tried that yet... There are a couple of reasons why:
=> =>
=> =>   -- even though I'm backing up only one machine, I have one "client" per
=> => backed up dir ("share"?, I'm using rsync), because I want to invoke the
=> => creation of a single LVM snapshot at a time with $Conf{DumpPreUserCmd} (I
=> => don't want more than one snapshot simultaneously on the client, at least
=> => for now, for performance and space reasons).
=> =>
=> =>   -- subdirectories below each backed up dir can change, so I can't use a
=> => static list, lest I be forced to maintain it (I'm lazy, I know).
=> =>
=> => But now you got me thinking... there might be ways to do it. For example, I
=> => know that in the home partition there is one user that takes as much space
=> => as everybody else together. So that would be a good candidate to break that
=> => backup set into two nearly-equal-size pieces.
=> =>
=> 
=> 
=> This sounds a great deal like what I proposed on Jan 16, describing a
=> suggestion to enhance backuppc to dynamically create backup sets of small
=> sizes. I saw much the same problems as you, with long backup times, and the
=> burden of maintaining different backup sets in the face of changing disk usage.
=> 
=> Check the archive for the post entitled "rsync dynamic backup set
=> partitioning enhancement proposal".
=> 
=> 
=> 
=> => This is what I have currently:
=> =>
=> =>     $Conf{RsyncShareName} = '/home_snapshot';
=> =>
=> => Would something like this work?:
=> =>
=> =>     $Conf{RsyncShareName} = '/home_snapshot';
=> =>
=> =>   $Conf{BackupFilesOnly} = {
=> =>       '/home_snapshot' => [ '/<large_user_home_dir>' ],
=> =>       '/home_snapshot' => [ some grep-like operation that lists all home
=> => directories, but excludes /<large_user_home_dir> ],
=> =>     };
=> =>
=> =>
=> => Because the config files are Perl scripts, the second list can be generated
=> => dynamically. I'd have to ssh to the client, ls /home_snapshot, filter
=> => <large_user...> out from that directory listing... looks doable... Argh!
=> 
=> Exactly. I think this should be built into backuppc...
=> 
=> => that wouldn't fly, because two hash entries with the same key means the
=> => second one would overwrite the first one. Hmm, but something along those
=> 
=> Not a problem, if you consider the "hash" to be the full path.
=> 
=> => lines... Is it possible to have arrays of arrays?:
=> 
=> Yes, perl conceptually supports arrays of arrays. I don't know if that would
=> work w/in backuppc as it is now.
=> 
=> =>
=> =>   $Conf{BackupFilesOnly} = {
=> =>       '/home_snapshot' => [
=> =>                             [ '/<large_user_home_dir>' ],
=> =>                             [ some grep-like operation that lists all home
=> => directories, but excludes /<large_user_home_dir> ]
=> =>                           ],
=> =>   };
=> =>
=> =>
=> => In which case I could just as well do this:
=> =>
=> =>   $Conf{RsyncShareName} = [
=> =>       [ '/home_snapshot/<large_user_home_dir>' ],
=> =>       [ /home_snapshot prefixed to each element of the filtered list of
=> => home dirs ]
=> =>   ];
=> =>
=> =>
=> =>
=> => I bet there is some much simpler way of doing this, and I just have the
=> => blinders on.
=> 
=> Yes. Use the "exclude" options in the client-side rsyncd.conf. For example, my
=> backuppc server connects to a number of targets (in rsync terminology, a
=> "module") on the clients. The clients then define what directories are
=> included/excluded in each target. In theory, the rsyncd.conf file could be
=> built dynamically. Here's a snippet of an rsyncd.conf on a backup client:
=> 
=> 
=> [justroot]
=>         path = /
=>         exclude = boot cdrom home opt tmp usr usr/local var win98 proc sys mnt/*/*
=> 
=> [usr]
=>         path = /usr
=>         exclude = usr/local/
=> 
=> [usr_local]
=>         path = /usr/local
=> 
=> The backuppc server simply connects to the client modules "justroot", "usr",
=> "usr_local", and so on, without maintaining the exclude lists on the server.
=> Each of these targets, corresponding to mount points, could easily be
=> generated dynamically.
=> 
=> =>
=> =>
=> => > cat5e should work fine for gigabit (it is the specified cable for
=> => > 1000baseT).  You don't need cat6.
=> =>
=> => Thanks for correcting me on cat 5e vs cat6. That's good news for me.
=>
=> =>
=> => Bernardo Rechea
=> =>
=> 
=> 
=> Mark
=> 
=> 
=> ----
=> Mark Bergman
=> [EMAIL PROTECTED]
=> Seeking a Unix/Linux sysadmin position local to Philadelphia or via
=> telecommuting
=> 
=> http://wwwkeys.pgp.net:11371/pks/lookup?op=get&search=bergman%40merctech.com
=> 
=> 
=> 
=> 
=> 
=> 
=> -------------------------------------------------------
=> This SF.net email is sponsored by: Splunk Inc. Do you grep through log files
=> for problems?  Stop!  Download the new AJAX search engine that makes
=> searching your log files as easy as surfing the web.  DOWNLOAD SPLUNK!
=> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642
=> _______________________________________________
=> BackupPC-users mailing list
=> BackupPC-users@lists.sourceforge.net
=> https://lists.sourceforge.net/lists/listinfo/backuppc-users
=> http://backuppc.sourceforge.net/
=> 




