On 27/05/2020 15:13, Gary R. Schmidt wrote:
> On 27/05/2020 23:17, Alan Brown wrote:
>>
>> Bacula DOES NOT LIKE and does not handle network interruptions _at all_
>> if backups are in progress. This _will_ cause backups to abort - and
>> these aborted backups are _not_ resumable.
>>
>> Similarly, if there's any kind of disruption between the director and
>> database, the only fix is to restart the director.
>>
>> Opinion: I know bugs aren't sexy to work on but these need fixing, not
>> being brushed off. This is the difference between LAN-quality and actual
>> Enterprise grade software.
>>
> I do not consider these to be bugs - they aren't simple errors where
> someone made a mistake or used the wrong sized variable - they require
> a large amount of re-design and reimplementation of Bacula's
> communication modules, and the scheduler, and no doubt other bits to
> go away.


Nonetheless they need to be done. There are a lot of assumptions made
about networks that simply do not hold true, or hold only at SOHO/SMB scale.


>
> Bacula started life twenty years ago, and the environment has changed
> since then, and, while Bacula has kept up with some things, disk as
> a target rather than tape, frex, something like re-startable jobs is,
> as I have said, not just an extension or addition to what is there,
> but a big change to a large part of Bacula.


Restarting is there for stopped jobs already. The question is how much
work is needed to extend that to aborted or errored jobs.


> And, from the commercial stand-point, that the changes could be made
> without interrupting the existing income stream.

There's a "cost of not implementing". I'm facing pressure to replace
Bacula and this is pointed to as one of the reasons - bear in mind we're
a paying customer who would go away if this isn't sorted.


>
> Then there's the projected time-line before it could be released?

You can't project that if it's not even on your TODO list and right now
it keeps being swept into the "WON'T DO" basket.


> I don't want to think about that, Bacula is fragile as it is, ripping
> it apart and stitching it back together would be a massive task!


This is exactly why I _do_ want to think about it. This is _where_ it's
fragile and what most fundamentally needs fixing.

Enterprise software needs to be robust. Bacula is not - in extremely
critical areas.


"If carpenters built buildings the way programmers write programs, the
first woodpecker that came along would destroy civilization."


> And Bacula does not have that capability, not in the OSS space nor in
> the Enterprise space.
>
> All the above said, I think that re-startable jobs would be a great
> enhancement for Bacula, but how often and for how long does it try by
> default before giving up?  :->
>

Restartable, or reconnecting? (And why not just set defaults, then let
the users decide on the number of attempts and the timeouts?)
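
To make the "defaults plus user-tunable attempts/timeouts" idea concrete,
here is a rough sketch in Python. It is not Bacula code and does not reflect
any existing Bacula directive: RetryPolicy, call_with_retry and all of the
default values are hypothetical names and numbers picked for illustration.

```python
import time
from dataclasses import dataclass


@dataclass
class RetryPolicy:
    # Hypothetical defaults; a site would override these in its own config.
    max_attempts: int = 5        # total tries, including the first one
    initial_delay: float = 1.0   # seconds to wait before the first retry
    backoff: float = 2.0         # multiplier applied to the delay after each failure
    max_delay: float = 60.0      # upper bound on the wait between attempts


def call_with_retry(operation, policy=None, sleep=time.sleep):
    """Run operation(), retrying on ConnectionError according to the policy."""
    policy = policy or RetryPolicy()
    delay = policy.initial_delay
    for attempt in range(1, policy.max_attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == policy.max_attempts:
                raise  # out of attempts: only now does the caller see a failure
            sleep(delay)
            delay = min(delay * policy.backoff, policy.max_delay)
```

The point of the sketch is only that "how often and for how long" becomes a
pair of numbers with sane defaults, not a design decision baked into the code.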


The single most fragile part of Bacula:  If the database connection
glitches for _any_ reason the only solution is to restart the entire
program - and you lose _everything_ that was underway at the time.

As I said, that includes using a high-availability database (PostgreSQL,
etc). As soon as heads are switched there's a necessary glitch in the
connection.
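
What I would expect instead looks roughly like the following Python sketch:
drop the dead handle, reconnect, and carry on. This is an illustration only,
not how the director is written; ReconnectingDB and its parameters are
made-up names, and it assumes a DB-API style connection factory. Blindly
re-running a statement is only safe for idempotent queries outside an open
transaction, so a real implementation would need more care than this.

```python
class ReconnectingDB:
    """Re-open a dropped database connection instead of failing the whole process."""

    def __init__(self, connect, transient_errors=(ConnectionError,), max_reconnects=3):
        self._connect = connect              # zero-argument factory returning a DB-API connection
        self._transient = transient_errors   # exception types treated as a connection glitch
        self._max_reconnects = max_reconnects
        self._conn = None

    def query(self, sql, params=()):
        """Run a row-returning statement, reconnecting if the connection has dropped."""
        for attempt in range(self._max_reconnects + 1):
            try:
                if self._conn is None:
                    self._conn = self._connect()
                cur = self._conn.cursor()
                cur.execute(sql, params)
                return cur.fetchall()
            except self._transient:
                self._conn = None            # discard the dead handle; the next pass reconnects
                if attempt == self._max_reconnects:
                    raise
```

With a real driver the transient_errors tuple would name that driver's own
"connection lost" exception rather than the built-in ConnectionError.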


Database connections are _supposed_ to be stateless. Bacula breaks that
and as such it's a fundamental bug, whether by design or not.






_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
