On 23/04/2019 21:50, Heitor Faria wrote:
Hello Radoslaw,

I have thought a lot about this topic, and to keep it short I will summarize my conclusions:

1. HA means eliminating single points of failure, reliable crossover, and failure detection. I don't see how a setup with two replicated, always-on Directors (perhaps with the same Director Name), replicated job and client configurations, replicated backup data and metadata, mechanisms to activate and deactivate the secondary Director, and optional redundant storage cannot be considered a High Availability solution. I will test this in a lab.

It is not HA because the jobs that were running on the failed server cannot be continued.

On a HA system, failure doesn't necessarily mean all prior state is lost.

On the VAXClusters[1] I used to wrangle back in the 1980s (where everything was automatically checkpointed), when a machine went down the load balancer just switched you across to one of the other machines and things continued on from the last checkpoint. In Bacula terms, the file that was being backed up at the time of failure may have to be redone, not the entire job.

Cluster failover of Bacula jobs requires a restart of all incomplete/failed jobs, and all prior state has to be discarded. So if you are 99% of the way through a multi-terabyte backup, that backup has to be run again in full.

Which means it's DR (disaster recovery): we start with effectively a clean slate and some context from some time in the past.
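
The difference between resuming from a checkpoint and restarting from scratch can be shown with a toy sketch. This is purely illustrative and has nothing to do with Bacula's actual internals: run_backup, files_redone and the in-memory checkpoint list are made-up names, and the list merely stands in for state that would have to survive the failure (e.g. on shared storage).

```python
def run_backup(files, checkpoint, fail_at=None):
    """Back up `files` in order, skipping anything already in `checkpoint`.

    `checkpoint` is a list of completed file names (hypothetical stand-in
    for state that survives a node failure).
    `fail_at` simulates the node dying while that file is being copied.
    Returns the number of files copied during this run.
    """
    done = set(checkpoint)
    copied = 0
    for name in files:
        if name in done:
            continue  # already safely backed up before the failure
        if name == fail_at:
            raise RuntimeError(f"node died while copying {name}")
        checkpoint.append(name)  # record the file only once it is complete
        copied += 1
    return copied


def files_redone(files, fail_at, keep_state):
    """Fail once at `fail_at`, then rerun; return files copied by the rerun."""
    checkpoint = []
    try:
        run_backup(files, checkpoint, fail_at=fail_at)
    except RuntimeError:
        pass  # failover happens here
    if not keep_state:
        checkpoint = []  # restart semantics: all prior state is discarded
    return run_backup(files, checkpoint)


files = [f"file{i}" for i in range(100)]
resumed = files_redone(files, "file99", keep_state=True)     # checkpointed failover
restarted = files_redone(files, "file99", keep_state=False)  # clean-slate restart
print(resumed, restarted)
```

With a failure on the last of 100 files, the checkpointed rerun copies only the one interrupted file, while the clean-slate rerun copies all 100 again.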

        Cheers,
                Gary    B-)

1 - That was proper clusters, that was, not the half-arsed crap that lusers call clustering these days.


_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
