On Wednesday 12 September 2007 05:06, Nick Pope wrote:
> On Sep 11, 2007, at 1:26 PM, David Boyes wrote:
> >> Couldn't the migrate capability be altered ever so slightly to allow
> >> the "migration" of a job without purging the old job from the
> >> catalog?  This would allow bitwise identical backup(s) to be created
> >> without having to create a forking/muxing SD/FD.
> >>
> >> This, of course, does not create the identical backups at the same
> >> instant in time, but it would solve the off-site backup problem with
> >> much less effort.
> >
> > As I said, that wouldn't be sufficient to satisfy our auditors.
> > YMMV. In
> > our case (which IMHO isn't unusual in the enterprise space), if we
> > can't
> > stand up and say under penalty of perjury that the bits on volume A
> > are
> > the same exact bits on volume B, then it's not good enough and we
> > stand
> > a good chance of seeing jail time.
>
> In the case of a migrate-without-purge, the bacula blocks would,
> presumably, be copied block for block.  So your backed up data would
> be identical.  The metadata Bacula uses to encapsulate your data
> would be recreated for the second job, so that would be different.
> So maybe the migrate-without-purge feature won't satisfy your
> auditors, but that doesn't make the simpler feature pointless.  You
> seem to be implying it has to be one or the other (sorry if I'm
> misreading you here).  I think there's a use for BOTH the simpler
> feature (especially if it comes quicker) and the full-blown muxing SD.

Well, the data is not copied block for block; it is copied record for record 
(i.e. bit for bit for the data).  Copying block for block might be a future 
performance enhancement, but it could only be done if the block size is the 
same on the input and output devices, which is not always the case.

The file metadata is copied record for record (i.e. bit for bit).  Thus all 
the data (meta and file) is copied bit for bit.  The layout of the SD 
blocking data on the output Volume may be slightly different -- e.g. the 
Volume label contains the new Volume name rather than the old one 
(obviously), and, as I said, the data *may* be blocked differently because 
it is unpacked from the blocks and repacked into new blocks.  Normally the 
blocks will be identical if the media are of the same type, but when 
spanning Volumes, if the tape sizes differ, the blocking on the migrated 
Volume after the change of Volumes will be different.  Even then, the file 
metadata and file data will be bit for bit identical.
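The unpack-and-repack behaviour can be illustrated with a small sketch.  To 
be clear, this is *not* Bacula's actual on-media format -- the 
length-prefixed record layout below is a made-up stand-in -- but it shows 
why the record payloads survive bit for bit even when the output device uses 
a different block size:

```python
# Illustrative sketch only (not Bacula's real record format): records are
# length-prefixed and packed into fixed-size, zero-padded blocks.  Migration
# unpacks the records from the input blocks and repacks them into blocks
# sized for the output device.
import struct

HEADER = struct.Struct(">I")  # hypothetical 4-byte record-length prefix

def pack_records(records, block_size):
    """Pack length-prefixed records into fixed-size blocks, zero-padded."""
    stream = b"".join(HEADER.pack(len(r)) + r for r in records)
    return [stream[i:i + block_size].ljust(block_size, b"\0")
            for i in range(0, len(stream), block_size)]

def unpack_records(blocks):
    """Recover the original records from a list of blocks."""
    stream = b"".join(blocks)
    records, pos = [], 0
    while pos + HEADER.size <= len(stream):
        (length,) = HEADER.unpack_from(stream, pos)
        if length == 0:          # reached the zero padding at the end
            break
        pos += HEADER.size
        records.append(stream[pos:pos + length])
        pos += length
    return records

records = [b"file metadata", b"file data chunk 1", b"file data chunk 2"]
input_volume = pack_records(records, block_size=64)    # source device
output_volume = pack_records(unpack_records(input_volume), block_size=32)

# The block layout differs between the two "Volumes", but the record
# payloads (meta and file data) come back bit for bit identical.
assert unpack_records(output_volume) == records
```

The point of the sketch is the last two lines: the blocking changed (one 
64-byte block became two 32-byte blocks), yet the records themselves are 
unchanged.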

I don't know whether that will satisfy David's auditors or not.  It seems to 
me that they accepted clone jobs as being valid "copies".  Well, migrated 
jobs should be 100% identical, whereas clone jobs may differ due to small 
filesystem changes between the two job runs.

>
> > Also, there is still a period of time where only one copy of the
> > backed-up data exists; all the easy solutions to this problem don't
> > address that requirement.
>
> This is the major drawback of the simpler solution (again, doesn't
> invalidate its usefulness in other scenarios)

It may invalidate it for some, but it will be *extremely* useful for a lot of 
people and will in fact give us an archiving capability.

>
> > If we could get away with that, we'd just
> > duplicate the tapes outside of Bacula and be done with it.
>
> If I do that, I can't track the copied volumes with the Bacula
> catalog.  One might foresee Bacula at some point enforcing a minimum
> number of duplicate copies, etc.


>
> > The related
> > problem is how Bacula handles multiple locations where the file can be
> > found, and how Bacula prioritizes restores. I have some interesting
> > ideas on that when/if Kern gets time to think about the design for
> > this
> > stuff.
>
> That certainly seems like the main challenge to the copy job.

Yes, I don't really know the best way to handle this at the moment, but 
implementing the simple change to Migration to do a Copy is certainly the 
first step.
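To make the idea concrete, a Copy job could presumably be configured much 
like an existing Migration job.  The sketch below is hypothetical: "Type = 
Copy" is the proposed new job type under discussion and does not exist yet; 
the other directives follow the current Migration syntax, so the final 
names may well differ:

```
# Hypothetical sketch -- "Type = Copy" is the proposed job type; the rest
# follows the existing Migration job syntax.
Job {
  Name = "CopyDiskToTape"
  Type = Copy                    # proposed: migrate without purging the source
  Pool = DiskPool                # read side: the disk-based backup pool
  Selection Type = Job           # existing Migration selection mechanism
  Selection Pattern = ".*"
  Messages = Standard
}

Pool {
  Name = DiskPool
  Pool Type = Backup
  Next Pool = OffsiteTapePool    # write side: where the copies would land
}
```

As with Migration, the destination would come from the Next Pool directive 
of the source Pool, so the copy could run entirely between storage daemons 
without involving the clients.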

>
> > There are some easier ways to deal with some of the symptoms of the
> > problem. I think that if we start solving symptoms rather than the
> > problem, we're going to waste a lot of effort, particularly testing
> > time, on a partial solution that doesn't get Bacula to enterprise
> > grade
> > solution space. This is major surgery to Bacula; it's going to take a
> > lot of testing resources to get this one right. I'd really rather see
> > that testing done to get to the final solution.
>
> I'm not sure I agree that the migrate-without-purge is treating a
> symptom.  I think it addresses a major shortcoming (fresh offsite
> backups) rather effectively.  While it may not solve all enterprise- 
> grade offsite scenarios, it does address many basic offsite backup
> scenarios.  I don't really agree that the migrate-without-purge is an
> interim solution.  I think people will use it even when Bacula gets
> the full-blown muxing SD.  Not everyone is running Bacula in a large
> enterprise.

Yes, I agree 100%, and I even think it will be used in large corporations.  
If I had a full understanding of the copy-pool problem, I might go straight 
for it, but over the last 7+ years Bacula has always evolved baby step by 
baby step, which is one reason it is very stable.  In doing so, we have so 
far never worked ourselves into a hole because we didn't understand the full 
problem.  It has allowed us (the users and programmers) to gain the 
experience needed either to stop with the simple solution because nothing 
more was needed, or to go on to bigger and better things from a good base.

It would be different if we had 5-10 full-time programmers (a possibility in 
the next couple of years).

>
> >> This would allow me to backup to disk at night as usual.  Then once
> >> the backups are done and the clients are freed up, the copy/migrate
> >> job could run and copy the jobs to tape or an offsite pool.  The
> >> migrate job would not involve the clients, so it wouldn't have to run
> >> in the middle of the night.
> >
> > Assuming the connection between the SD and the offsite server doesn't
> > run over the same network...8-)
>
> Fair point :)  In my case, I just need a full offsite tape set to take
> offsite, and I don't want to wait 6 months for my full backups to migrate
> to tape (making my offsite backup 6 months out of date).  I do see your
> point: the simpler solution won't work for large enterprises.  Fair 'nuff.

If some large enterprise(s) want to fund accelerating such a full solution, 
it should be possible sometime around the middle of next year.

Regards,

Kern

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
