Hello Dimitri,

In your case (an active/passive cluster with shared storage replicated
by DRBD), I think you need only one Bacula FD config, the same on
both nodes.

In your cluster, the File Daemon will run on only one node at a time
(A or B), not on both. This is the usual cluster approach: the service
is reachable through one (or more) common virtual IP address(es), the
same for both nodes. In your case the virtual IP address is 1.2.3.1.

You should store the Bacula FD config on a disk that is replicated by
DRBD. This way the FD configuration on the remote host is
automatically kept up to date as well.

Now let's look at the use cases, with the starting state: A - active
node, B - passive node:

In case of a manual failover, the cluster software should:
1) umount your disks on node A and switch DRBD disks to passive mode,
2) mount your disks on node B and switch DRBD disks to active mode,
3) stop your FD on node A and start FD on node B,
4) bring your virtual IP address down on node A and up on node B.

The cluster software should have handlers/triggers that perform steps
1), 2), 3) and 4).

In case of an automatic failover (for example: server crash or power off):
1) mount your disks on node B and switch DRBD disks to active mode,
2) start FD on node B,
3) bring your virtual IP address up on node B.

In case of a failback (the crashed node A is fixed and healthy again):
1) umount your disks on node B and switch DRBD disks to passive mode,
2) mount your disks on node A and switch DRBD disks to active mode,
3) stop your FD on node B and start FD on node A,
4) bring your virtual IP address down on node B and up on node A.

The failback steps are the manual failover steps with the roles of
nodes A and B swapped.
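
To make the ordering concrete, here is a minimal sketch of the three
scenarios above as one function. It is only an illustration, not
cluster code: the function name failover_plan and the step strings are
made up for this example, and in a real cluster the cluster software
performs these actions. The only difference between the scenarios is
whether the old active node is still reachable.

```python
from typing import List, Optional


def failover_plan(target: str, source: Optional[str] = None) -> List[str]:
    """Return the ordered steps for moving the FD service to `target`.

    `source` is the node that is currently active. Pass None when it has
    crashed or lost power, so nothing can be done on it (automatic failover).
    All names and step strings here are hypothetical, for illustration only.
    """
    steps = []
    if source is not None:
        steps.append(f"{source}: umount disks, switch DRBD to passive")
    # DRBD must be active (primary) on the target before mounting there.
    steps.append(f"{target}: switch DRBD to active, mount disks")
    if source is not None:
        steps.append(f"{source}: stop FD")
    steps.append(f"{target}: start FD")
    if source is not None:
        steps.append(f"{source}: bring virtual IP down")
    steps.append(f"{target}: bring virtual IP up")
    return steps


if __name__ == "__main__":
    for title, plan in [
        ("manual failover A -> B", failover_plan("B", source="A")),
        ("automatic failover to B", failover_plan("B")),
        ("failback B -> A", failover_plan("A", source="B")),
    ]:
        print(title)
        for number, step in enumerate(plan, start=1):
            print(f"  {number}) {step}")
```

Calling failover_plan("B") without a source gives only the three
steps on node B, which matches the automatic failover list.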

In the case of one FD in the cluster environment, you need only the
following configs (1.2.3.1 - virtual IP address):

# bacula-dir.conf (Director)
Client {
  name = cluster-fd
  address = 1.2.3.1
  ...
}
Job {
  name = nodea-etc
  client = cluster-fd
  fileset = etc
}

# bacula-fd.conf (File Daemon, the same file on both nodes)
FileDaemon {
  name = cluster-fd
...
}
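
After a failover you may want a quick check that an FD is answering on
the virtual IP address. The sketch below only tests that a TCP
connection can be opened; it does not speak the Bacula protocol and
does not check the FD name. It assumes the FD listens on the default
Bacula FD port 9102 (adjust it if you changed FDPort), and the function
name fd_reachable is made up for this example.

```python
import socket


def fd_reachable(address: str, port: int = 9102, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to address:port can be opened."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable: treat all of them as "not reachable".
        return False


if __name__ == "__main__":
    # 1.2.3.1 is the virtual IP address used in this thread.
    print("FD reachable on virtual IP:", fd_reachable("1.2.3.1"))
```

For a more thorough check from the Director side, "status
client=cluster-fd" in bconsole should show whether the Director can
really talk to the FD.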

You have to be careful when you switch active/passive after a failover,
because there can be milliseconds during which your cluster is split
(this is the split-brain state in DRBD nomenclature). That is the
moment when users are able to write to both nodes at the same time.
DRBD offers some scenarios for recovering from this situation.

I hope this helps. Please let us know on the mailing list about your
experiences when you finish preparing your cluster.

Good luck.

Best regards.
Marcin Haba (gani)

On 26 October 2016 at 21:46, Dimitri Maziuk <dmaz...@bmrb.wisc.edu> wrote:
> On 10/24/2016 04:15 PM, Josh Fisher wrote:
>
> ... snipped ...
>
> Yes, this is more or less what I've been doing up until now. The good
> news is, it seems I don't have to anymore. Here's what I have working now:
>
> corosync/pacemaker cluster with node A @ 1.2.3.4, node B @ 1.2.3.5, and
> cluster ip @ 1.2.3.1, shared storage mounted at /raid on the active node.
>
> node A bacula-fd.conf:
> FileDaemon {
>   name = nodea-fd
> ...
> }
>
> node B bacula-fd.conf:
> FileDaemon {
>   name = nodeb-fd
> ...
> }
>
> bacula-dir config:
>
> Client {
>   name = nodea-fd
>   address = 1.2.3.4
>   ...
> }
> Client {
>   name = nodeb-fd
>   address = 1.2.3.5
>   ...
> }
> Client {
>   name = cluster-fd
>   address = 1.2.3.1
>   ...
> }
> Job {
>   name = nodea-etc
>   client = nodea-fd
>   fileset = etc
> }
> Job {
>   name = nodeb-etc
>   client = nodeb-fd
>   fileset = etc
> }
> Job {
>   name = cluster-raid
>   client = cluster-fd
>   fileset = raid
> }
>
> -- and it's happily spooling the 21GB /raid right now.
>
> What seems to be happening is bacula is connecting to the cluster
> address (checked with lsof -i), completely ignoring FD name "cluster-fd"
> and is backing up "fileset = raid" from "nodea-fd".
>
> Which is great, if not checking FD name is a bug, *please* don't fix it. :)
>
> So all you need is start FDs at boot listening on * and the director
> will automagically get the shared filesystem off of the node that
> happens to have it mounted.
>
> (Of course the backup will fail if the cluster fails over or the
> connection is otherwise disrupted.)
> --
> Dimitri Maziuk
> Programmer/sysadmin
> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
>
> _______________________________________________
> Bacula-users mailing list
> Bacula-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-users
>



-- 
"Greater love hath no man than this, that a man lay down his life for
his friends." Jesus Christ

"Większej miłości nikt nie ma nad tę, jak gdy kto życie swoje kładzie
za przyjaciół swoich." Jezus Chrystus

