Re: [Bacula-users] Where do we go after Bacula 2.2.0?

2007-08-20 Thread Alan Brown

My observation:

 Item  1:  Accurate restoration of renamed/deleted files
 Item  3:  Merge multiple backups (Synthetic Backup or Consolidation)

To my mind, these pretty much all use the same code inasmuch as one is 
wanting to generate a new full backup to tape (or restore to disk) based 
on what's in the database and in the volumes for any given backup date, 
while weeding out files which had been deleted before that date, but since the 
previous backups (full/differential/incremental).

In other words, solving either of #1 or #3 should pretty much 
automatically solve the other.


-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now   http://get.splunk.com/
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Where do we go after Bacula 2.2.0?

2007-08-20 Thread Kern Sibbald
On Monday 20 August 2007 11:42, Alan Brown wrote:
 My observation:
  Item  1:  Accurate restoration of renamed/deleted files
  Item  3:  Merge multiple backups (Synthetic Backup or Consolidation)

 To my mind, these pretty much all use the same code inasmuch as one is
 wanting to generate a new full backup to tape (or restore to disk) based
 on what's in the database and in the volumes for any given backup date,
 while weeding files which had been deleted before that date, but since the
 previous backups (full/differential/incremental)

 In other words, solving either of #1 or #3 should pretty much
 automatically solve the other.

I can see how one might think what you write is so, but in reality the two 
projects are quite distinct and don't really involve any common code.  Item 3 
(merge of multiple backups) is simply a restore bootstrap file as input to a 
migration (or copy) job, which is a rather small to moderate addition to the 
current code.  The process doesn't involve the FD at all.

Item 1 is a very complex problem that has serious performance implications, 
depending on how it is implemented, particularly for the FD, and is a major 
addition to the current code. Probably the best solution that scales is to 
push the work out to the client (FD).  However, doing so risks overrunning the 
capacity of the FD.  The project involves sending a full and accurate state 
of the Client as known in the Bacula catalog to the client, which would then 
reference this information (potentially very large) when backing up files.  
This project has certain aspects in common with Item 7 Implement Base jobs, 
which also must have a full and accurate state of the catalog at the disposal 
of the Client.
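
As a rough illustration of the client-side comparison described above, here is a minimal sketch, not Bacula's implementation. It assumes the server sends the FD a mapping from path to the attributes recorded in the catalog; `CatalogEntry` and `compare_with_catalog` are hypothetical names invented for this example.

```python
from typing import Dict, NamedTuple, Set, Tuple


class CatalogEntry(NamedTuple):
    """Hypothetical record of one file as last seen by the catalog."""
    mtime: int
    size: int


def compare_with_catalog(
    catalog: Dict[str, CatalogEntry],
    current: Dict[str, CatalogEntry],
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (new, changed, deleted) paths for one client.

    `catalog` is the state sent from the server; `current` is what the
    client finds on disk. Both are held in memory here, which is the
    scaling concern raised in the message.
    """
    new = set(current) - set(catalog)
    deleted = set(catalog) - set(current)
    changed = {
        path for path in set(current) & set(catalog)
        if current[path] != catalog[path]
    }
    return new, changed, deleted


# Illustrative data only: one file renamed, one modified, one untouched.
catalog = {
    "/etc/hosts": CatalogEntry(100, 200),
    "/home/a/old.txt": CatalogEntry(100, 10),
    "/home/a/report.txt": CatalogEntry(100, 50),
}
current = {
    "/etc/hosts": CatalogEntry(100, 200),
    "/home/a/report.txt": CatalogEntry(150, 60),
    "/home/a/renamed.txt": CatalogEntry(100, 10),
}
new, changed, deleted = compare_with_catalog(catalog, current)
```

In this sketch a rename shows up as one deleted path plus one new path, which is the information a restore would need to avoid bringing back the old name.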

Regards,

Kern



Re: [Bacula-users] Where do we go after Bacula 2.2.0?

2007-08-20 Thread Rich
On 2007.08.20. 15:37, Kern Sibbald wrote:
 On Monday 20 August 2007 11:42, Alan Brown wrote:
 My observation:
 Item  1:  Accurate restoration of renamed/deleted files
 Item  3:  Merge multiple backups (Synthetic Backup or Consolidation)
...
 Item 1 is a very complex problem that has serious performance implications 
 depending on how it is implemented particularly for the FD, and is a major 
 addition to the current code. Probably the best solution that scales is to 
 push the work out to the client (FD).  However, doing so risks to overrun the 
 capacities of the FD.  The project involves sending a full and accurate state 
 of the Client as known in the Bacula catalog to the client, which would then 
 reference this information (potentially very large) when backing up files.  

I suppose checking only modified directories (and working on from there) 
is considered not safe enough, right?

In most cases this would probably result in less work, but some 
implementations that do not update directory mtime might break backups 
badly...
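
A minimal sketch of the directory-mtime idea, under the assumption that the filesystem updates a directory's mtime whenever an entry is added, removed, or renamed in it (typical POSIX behaviour, and exactly the assumption questioned above). The function name `dirs_modified_since` is hypothetical.

```python
import os
import tempfile
from typing import List


def dirs_modified_since(root: str, since: float) -> List[str]:
    """Return directories under `root` whose mtime is newer than `since`.

    Every directory is still visited: adding or removing an entry updates
    only the mtime of the directory that directly contains it, not of its
    ancestors, so an unchanged parent cannot be used to skip a subtree.
    """
    modified = []
    for dirpath, _dirnames, _filenames in os.walk(root):
        if os.stat(dirpath).st_mtime > since:
            modified.append(dirpath)
    return sorted(modified)


# Demo with explicit timestamps: only `sub` is newer than the cutoff.
with tempfile.TemporaryDirectory() as root:
    sub = os.path.join(root, "sub")
    os.mkdir(sub)
    os.utime(root, (1000, 1000))
    os.utime(sub, (3000, 3000))
    found = dirs_modified_since(root, 2000)
    expected = [sub]
    found_none = dirs_modified_since(root, 5000)
```

Only the directories returned would need their entries compared against the catalog. The walk itself still touches every directory, so the saving is in per-file comparisons, not in traversal.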

 This project has certain aspects in common with Item 7 Implement Base jobs, 
 which also must have a full and accurate state of the catalog at the disposal 
 of the Client.
 
 Regards,
 
 Kern
-- 
  Rich



Re: [Bacula-users] Where do we go after Bacula 2.2.0?

2007-08-20 Thread Alan Brown
On Mon, 20 Aug 2007, Kern Sibbald wrote:

 To my mind, these pretty much all use the same code inasmuch as one is
 wanting to generate a new full backup to tape (or restore to disk) based
 on what's in the database and in the volumes for any given backup date,
 while weeding files which had been deleted before that date, but since the
 previous backups (full/differential/incremental)

 In other words, solving either of #1 or #3 should pretty much
 automatically solve the other.

 I can see how one might think what you write is so, but in reality the two
 projects are quite distinct and don't really involve any common code.  Item 3
 (merge of multiple backups) is simply a restore bootstrap file as input to a
 migration (or copy) job, which is a rather small to moderate addition to the
 current code.  The process doesn't involve the FD at all.

No it doesn't, but a synthetic full backup will also need to take account 
of which files have been deleted when creating the new backup set on 
Bacula volumes.

 Item 1 is a very complex problem that has serious performance implications
 depending on how it is implemented particularly for the FD, and is a major
 addition to the current code. Probably the best solution that scales is to
 push the work out to the client (FD).  However, doing so risks to overrun the
 capacities of the FD.  The project involves sending a full and accurate state
 of the Client as known in the Bacula catalog to the client, which would then
 reference this information (potentially very large) when backing up files.

You will need this information to get accurate synthetic full backups 
anyway, else that backup is likely to contain significant numbers of files 
which no longer exist on the filesystem at the timestamp the synthetic 
backup is made.

 This project has certain aspects in common with Item 7 Implement Base jobs,
 which also must have a full and accurate state of the catalog at the disposal
 of the Client.

Wholly agreed.

AB




Re: [Bacula-users] Where do we go after Bacula 2.2.0?

2007-08-20 Thread Kern Sibbald
On Monday 20 August 2007 17:00, Alan Brown wrote:
 On Mon, 20 Aug 2007, Kern Sibbald wrote:
  To my mind, these pretty much all use the same code inasmuch as one is
  wanting to generate a new full backup to tape (or restore to disk) based
  on what's in the database and in the volumes for any given backup date,
  while weeding files which had been deleted before that date, but since
  the previous backups (full/differential/incremental)
 
  In other words, solving either of #1 or #3 should pretty much
  automatically solve the other.
 
  I can see how one might think what you write is so, but in reality the
  two projects are quite distinct and don't really involve any common code.
   Item 3 (merge of multiple backups) is simply a restore bootstrap file as
  input to a migration (or copy) job, which is a rather small to moderate
  addition to the current code.  The process doesn't involve the FD at all.

 No it doesn't, but a synthetic full backup will also need to take account
 of which files have been deleted when creating the new backup set on
 Bacula volumes.

  Item 1 is a very complex problem that has serious performance
  implications depending on how it is implemented particularly for the FD,
  and is a major addition to the current code. Probably the best solution
  that scales is to push the work out to the client (FD).  However, doing
  so risks to overrun the capacities of the FD.  The project involves
  sending a full and accurate state of the Client as known in the Bacula
  catalog to the client, which would then reference this information
  (potentially very large) when backing up files.

 You will need this information to get accurate synthetic full backups
 anyway, else that backup is likely to contain significant numbers of files
 which no longer exist on the filesystem at the timestamp the synthetic
 backup is made.

Yes, I agree with what you say.

However the synthetic backup is not dependent on having information about 
deleted files.  The synthetic backup will simply take what is in the catalog 
and run with it.  At the current time, no information about deleted files 
exists in the catalog.  Once project #1 is implemented, it will exist.  So 
once the deleted-file information is in the catalog, it will be handled 
automatically by the low-level code that creates the bootstrap files.
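
To make the distinction concrete, here is a small sketch of a catalog-driven selection, assuming jobs are numbered in chronological order and assuming a hypothetical `deleted` marker of the kind project #1 would add. `FileRecord` and `select_for_synthetic_full` are illustrative names, not Bacula code.

```python
from typing import Dict, List, NamedTuple


class FileRecord(NamedTuple):
    """Hypothetical catalog row: one file version saved by one job."""
    job_id: int
    path: str
    deleted: bool = False  # assumed marker that would come from Item 1


def select_for_synthetic_full(records: List[FileRecord]) -> Dict[str, int]:
    """Map each path to the JobId holding its most recent version.

    Jobs are assumed to be numbered in chronological order. A path whose
    latest record is a deletion marker is left out of the result.
    """
    latest: Dict[str, FileRecord] = {}
    for rec in sorted(records, key=lambda r: r.job_id):
        latest[rec.path] = rec
    return {p: r.job_id for p, r in latest.items() if not r.deleted}


full_and_incrementals = [
    FileRecord(1, "/data/a"),
    FileRecord(1, "/data/b"),
    FileRecord(2, "/data/a"),   # changed in incremental 2
    FileRecord(3, "/data/c"),   # added in incremental 3
]
# Today's situation: the catalog has no deletion information.
without_markers = select_for_synthetic_full(full_and_incrementals)
# After Item 1 (assumed): the catalog records that /data/b was deleted.
with_markers = select_for_synthetic_full(
    full_and_incrementals + [FileRecord(3, "/data/b", deleted=True)]
)
```

Without markers, `/data/b` is carried into the consolidated job even if it no longer exists on the client. With markers, the same selection code drops it with no further change.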


  This project has certain aspects in common with Item 7 Implement Base
  jobs, which also must have a full and accurate state of the catalog at
  the disposal of the Client.

 Wholly agreed.

 AB



Re: [Bacula-users] Where do we go after Bacula 2.2.0?

2007-08-20 Thread Alan Brown
On Mon, 20 Aug 2007, Kern Sibbald wrote:

 However the synthetic backup is not dependent on having information about
 deleted files.  The synthetic backup will simply take what is in the catalog
 and run with it.

I was working on the basis of an accurate full backup. Without knowing 
which files to NOT save, such a synthetic backup would be more of a 
liability than an asset.




Re: [Bacula-users] Where do we go after Bacula 2.2.0?

2007-08-20 Thread Kern Sibbald
On Monday 20 August 2007 19:26, Alan Brown wrote:
 On Mon, 20 Aug 2007, Kern Sibbald wrote:
  However the synthetic backup is not dependent on having information about
  deleted files.  The synthetic backup will simply take what is in the
  catalog and run with it.

 I was working on the basis of an accurate full backup. Without knowing
 which files to NOT save, such a synthetic backup would be more of a
 liability than an asset.

I'm guessing that you have a different concept of what a synthetic backup is. 

I am considering it very much like a fancy Bacula migration job, with two 
differences:  

1. it doesn't delete (i.e. it is more like a copy).  It simply creates a 
consolidated copy into a new single job rather than multiple jobs (and 
possibly multiple media types and volumes).

2. instead of using the current migration selection criteria, for a Full, it 
selects what a current restore would (a bit different for a Diff).  Thus it 
has nothing to do with a normal backup or the FD.



Re: [Bacula-users] Where do we go after Bacula 2.2.0?

2007-08-20 Thread Steen
On Monday 20 August 2007 18:27:05 Kern Sibbald wrote:
 On Monday 20 August 2007 17:00, Alan Brown wrote:
  On Mon, 20 Aug 2007, Kern Sibbald wrote:
 
   Item 1 is a very complex problem that has serious performance
   implications depending on how it is implemented particularly for the
   FD, and is a major addition to the current code. Probably the best
   solution that scales is to push the work out to the client (FD). 
   However, doing so risks to overrun the capacities of the FD.  

Seems to me quite risky for older, overloaded clients.

   The project involves sending a full and accurate state of the Client as
   known in the Bacula catalog to the client, which would then reference
   this information (potentially very large) when backing up files.

How can this work with millions of files?
Why not send the directory and file structure information from the FD at the 
same time as it sends the data, then mark deleted files in the catalog 
along the way?
It just seems to me that the backup servers are the ones that usually have more 
horsepower to perform that comparison operation, or the ones that can more 
easily be scaled up with memory if necessary.
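
A sketch of the server-side alternative suggested here, assuming the FD reports the name of every file it sees during the job, including unchanged ones. `find_deleted_on_server` and `fd_file_stream` are hypothetical names used only for illustration.

```python
from typing import Iterable, Iterator, Set


def find_deleted_on_server(catalog_paths: Set[str],
                           paths_from_fd: Iterable[str]) -> Set[str]:
    """Return catalog paths that the client did not report in this job.

    The client's list is consumed as a stream, so only the catalog side
    has to be held in memory, and that memory is on the backup server.
    """
    unseen = set(catalog_paths)
    for path in paths_from_fd:
        unseen.discard(path)
    return unseen


def fd_file_stream() -> Iterator[str]:
    """Stand-in for the file names arriving from the FD during a backup."""
    yield "/srv/www/index.html"
    yield "/srv/www/new.css"


known = {"/srv/www/index.html", "/srv/www/old.css"}
to_mark_deleted = find_deleted_on_server(known, fd_file_stream())
```

The trade-off in this sketch is network traffic: the client stays cheap, but it has to send a name for every file on every job rather than only for the files it backs up.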




-- 
Regards

Steen



Re: [Bacula-users] Where do we go after Bacula 2.2.0?

2007-08-20 Thread Brian Debelius
Disney World?? :)





[Bacula-users] Where do we go after Bacula 2.2.0?

2007-08-18 Thread Kern Sibbald
Hello,

Now that Bacula version 2.2.0 has been released, I thought I would give you a 
brief review of the direction that I see Bacula taking over the next year.

1. Of course, there will probably be a few maintenance releases that fix bugs
and add minor new features.  I believe that Eric already has several
projects implemented ...

2. One major change is that, as I have previously noted, I will be decreasing
the time I spend on the project from 100% to 40-50%.  The rest of the time
(50-60%) I will be devoting to the new Bacula services endeavor, which
should be operational by the beginning of next year.  At some point (a
year or two from now), I will probably return full time to the project.
As a consequence, development for the project will probably temporarily
slow down unless the contribution rate increases.  However, in the long
run, the Bacula services endeavor, in my opinion, is the best and fastest
way to accelerate Bacula development.

3. Normally after a major release, we do a vote on the Projects so that the
developers will have your input as to what is important and what is not.
This does not guarantee that the developers will develop all the high 
priority projects and not the low ones, but the user assigned priority is
certainly the largest factor in deciding what to work on.

For this particular release, unfortunately, the #1 project on the list was
taken by a developer who recently left the project, which means it was not
implemented.  As a consequence, in my opinion, it is not absolutely
necessary to hold a new vote as there are enough high priority projects
to work on.  That said, if Arno would like to do a vote on the project
list, that is perfectly fine with me, and perhaps some of your priorities
have changed.

In any case, I have reviewed the old project list, removed the items that
were completed in 2.2.0, combined several projects that were similar, and
eliminated (put into a hold area) projects that are either developer 
optimizations, not well enough explained for me to implement, projects
that I don't know how to implement, or projects that require proprietary 
code, so cannot be implemented in Bacula (at the current moment).
This cut the number of projects in the voting list down from 44 to 25. 
They are numbered 1-25.  There are 10 projects in the hold list h1-h10.
For all the projects that I placed on hold, I made notes, so if one of
your projects was placed on hold, you will know why, and if it was placed
on hold because I didn't understand what you want or need additional 
information, please feel free to supply it. 

In addition, I stopped keeping track of Feature Requests some time ago 
(about 3 months ago) so any Feature Requests submitted after that point
are not included in the current list.

To sum it up, I've reproduced the list below, and if you feel it is important 
to vote again on the items, please discuss it with Arno, work out the details 
and let me know.

Best regards,

Kern


Projects:
Bacula Projects Roadmap
Status updated 18 August 2007
After removing items completed in version 2.2.0 and renumbering

Items Completed:

Summary:
Item  1:  Accurate restoration of renamed/deleted files
Item  2:  Allow FD to initiate a backup
Item  3:  Merge multiple backups (Synthetic Backup or Consolidation)
Item  4:  Implement Catalog directive for Pool resource in Director
Item  5:  Add an item to the restore option where you can select a Pool
Item  6:  Deletion of disk Volumes when pruned
Item  7:  Implement Base jobs
Item  8:  Implement Copy pools
Item  9:  Scheduling syntax that permits more flexibility and options
Item 10:  Message mailing based on backup types
Item 11:  Cause daemons to use a specific IP address to source communications
Item 12:  Add Plug-ins to the FileSet Include statements.
Item 13:  Restore only file attributes (permissions, ACL, owner, group...)
Item 14:  Add an override in Schedule for Pools based on backup types
Item 15:  Implement more Python events and functions
Item 16:  Allow inclusion/exclusion of files in a fileset by creation/mod times
Item 17:  Automatic promotion of backup levels based on backup size
Item 18:  Better control over Job execution
Item 19:  Automatic disabling of devices
Item 20:  An option to operate on all pools with update vol parameters
Item 21:  Include timestamp of job launch in stat clients output
Item 22:  Implement Storage daemon compression
Item 23:  Improve Bacula's tape and drive usage and cleaning management
Item 24:  Multiple threads in file daemon for the same job
Item 25:  Archival (removal) of User Files to Tape


Item  1:  Accurate restoration of renamed/deleted files
  Date:   28 November 2005
  Origin: Martin Simmons (martin at lispworks dot com)
  Status: Robert