On 27/08/2018 09:33, Roland Kammerer wrote:
On Mon, Aug 27, 2018 at 09:12:12AM +0200, Roberto Resoli wrote:
On 24/08/2018 10:54, Roland Kammerer wrote:
This feature is documented here:
https://docs.linbit.com/docs/users-guide-9.0/#s-proxmox-ls-HA

Hello,
I read and tried the described procedure, but my findings were not positive.
In two cases the entire drbd storage froze and all VMs were stopped.

Could you describe these scenarios in more detail? How many nodes, where
was the controller VM started, what did you do exactly, autoboot of
other VMs? Sorry, as always, "does not work" is not good enough to debug
things.

Yes, of course. The scenario is:

=== general ===

3 pve nodes, each with a 2 TB disk dedicated to drbd. I recently migrated from drbdmanage; all 3 nodes are of COMBINED type and each resource is replicated on every node.

=== networking ===

The drbd storage uses a dedicated network mesh, with direct connections between the nodes (no switch). On each node there is a bridge used as the drbd interface, STP is disabled, and broadcast storms are blocked with ebtables rules. Here's the relevant part of the /etc/network/interfaces file for node pve3:

auto vmbr2
iface vmbr2 inet static
        address  10.1.1.3
        netmask  255.255.255.0
        bridge_ports eth2 eth3
        bridge_stp off
        bridge_ageing 30
        bridge_fd 5
# Only with stp on
      # pve1 and pve2 are preferred
        #bridge_bridgeprio 32768
# Only with stp off
        pre-up ifconfig eth2 mtu 9000 && ifconfig eth3 mtu 9000
        up   ebtables -I FORWARD -i eth2 -j DROP
        up   ebtables -I FORWARD -i eth3 -j DROP
        up   ebtables -I FORWARD -o tap200i1 -j ACCEPT
        down ebtables -D FORWARD -i eth2 -j DROP
        down ebtables -D FORWARD -i eth3 -j DROP
        down ebtables -D FORWARD -o tap200i1 -j ACCEPT

The "tap200i1" interface corresponds to the controller VM; the resulting ebtables rules are:

root@pve3:~# ebtables -L
Bridge table: filter

Bridge chain: INPUT, entries: 0, policy: ACCEPT

Bridge chain: FORWARD, entries: 3, policy: ACCEPT
-o tap200i1 -j ACCEPT
-i eth3 -j DROP
-i eth2 -j DROP

Bridge chain: OUTPUT, entries: 0, policy: ACCEPT

=== drbd resources ===

Resource definitions are as follows:

# linstor rd l -p
+------------------------------+
| ResourceName  | Port | State |
|------------------------------|
| vm-100-disk-1 | 7009 | ok    |
| vm-101-disk-1 | 7001 | ok    |
| vm-101-disk-2 | 7002 | ok    |
| vm-102-disk-1 | 7003 | ok    |
| vm-103-disk-1 | 7000 | ok    |
| vm-103-disk-2 | 7004 | ok    |
| vm-104-disk-1 | 7008 | ok    |
| vm-104-disk-2 | 7005 | ok    |
| vm-105-disk-1 | 7006 | ok    |
| vm-106-disk-1 | 7013 | ok    |
| vm-120-disk-1 | 7010 | ok    |
| vm-120-disk-2 | 7015 | ok    |
| vm-121-disk-1 | 7007 | ok    |
| vm-122-disk-1 | 7011 | ok    |
| vm-123-disk-1 | 7012 | ok    |
| vm-200-disk-1 | 7014 | ok    |
| vm-999-disk-1 | 7016 | ok    |
| vm-999-disk-2 | 7017 | ok    |
+------------------------------+

=== drbd nodes before controller virtualization ===

+--------------------------------------------------+
| Node  | NodeType   | IPs               | State   |
|--------------------------------------------------|
| pve1  | COMBINED   | 10.1.1.1(PLAIN)   | Online  |
| pve2  | COMBINED   | 10.1.1.2(PLAIN)   | Online  |
| pve3  | COMBINED   | 10.1.1.3(PLAIN)   | Online  |
+--------------------------------------------------+


I initially installed the controller on 10.1.1.1, and this basic setup was working nicely.


I installed a minimal Debian Stretch VM with id 200, connected through distinct interfaces to both the drbd storage segment and the regular network segment. I installed the LINSTOR controller on it and gave it the hostname "drbdc". I stopped and disabled the controller on pve1 and migrated the service to drbdc. Then I added drbdc to the drbd cluster:

=== drbd nodes after controller virtualization ===

+--------------------------------------------------+
| Node  | NodeType   | IPs               | State   |
|--------------------------------------------------|
| drbdc | CONTROLLER | 10.1.1.200(PLAIN) | OFFLINE |
| pve1  | COMBINED   | 10.1.1.1(PLAIN)   | Online  |
| pve2  | COMBINED   | 10.1.1.2(PLAIN)   | Online  |
| pve3  | COMBINED   | 10.1.1.3(PLAIN)   | Online  |
+--------------------------------------------------+
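
For reference, this is roughly how I moved the controller and registered the new node (from memory; I am assuming the default database location /var/lib/linstor, and the exact flags may differ):

# on pve1: stop and disable the old controller
systemctl stop linstor-controller
systemctl disable linstor-controller
# copy the controller database to the new VM (default location assumed)
scp -r /var/lib/linstor root@10.1.1.200:/var/lib/
# on drbdc: enable and start the controller
systemctl enable linstor-controller
systemctl start linstor-controller
# register the VM as a pure controller node
linstor node create drbdc 10.1.1.200 --node-type Controller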

After that I enabled and started the drbd service on all three nodes.
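
That is, on each of pve1, pve2 and pve3:

systemctl enable drbd.service
systemctl start drbd.service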

Then I changed the drbd Proxmox storage configuration as follows:

drbd: drbdthin
        content rootdir,images
        redundancy 3
        controller 10.1.1.200
        controllervm 200

Then I enabled HA for VM 200.
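
If I remember correctly that was just something like:

ha-manager add vm:200     # make the controller VM an HA-managed resource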

===

At this point everything worked quite well, but when I tried to shut down the node on which the controller VM resided, the resources on that node did not come up again, not even the controller VM's one. In one case HA did its job and moved the VM to another node, but since in my setup quorum is often lost temporarily even when only one node is restarted, the controller came back only after quorum was re-established.

I still have to investigate the conditions under which the drbd storage became unavailable to pve, causing all VMs to stop. Hopefully I will have a chance to give you some more details after examining the logs.

At the moment I can report only a bunch of these messages in syslog:

Aug 25 22:49:04 pve3 pvestatd[2598]: malformed JSON string, neither tag, array, object, number, string or atom, at character offset 0 (before "(end of string)") at /usr/share/perl5/PVE/Storage/Custom/LINSTORPlugin.pm line 321.

They may have been generated when I switched to the "controllervm" configuration.

The main problem, if I understand it correctly, is that even if the Proxmox plugin no longer manages the controller VM, the controller VM's storage still depends on the controller itself.

What do you mean by that? The DRBD resource (== storage for the
controller VM) is brought up by the drbd.service and can then be
auto-promoted.

I noted that the resource definition files inside /var/lib/linstor.d are recreated each time the node is started, so I guess the drbd service cannot bring them up.
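
To check whether drbd.service can pick those files up at all, I still have to look at something like this (the paths are my assumption of where things live):

ls -l /var/lib/linstor.d/                      # resource files generated by the satellite
grep -r include /etc/drbd.conf /etc/drbd.d/    # is /var/lib/linstor.d referenced anywhere?
systemctl cat drbd.service                     # what does drbd.service actually bring up at boot?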

The plugin code ignores that VM. The Proxmox HA service
should do its job and start the VM on one node. So again, where exactly
does that chain break in your setup?

See above.

If it is stopped, or cannot be contacted, no corresponding resource will be
created, resulting in a deadlock.

That is true. But how would that happen? The Proxmox HA feature should
make sure that the VM is always running (that is its job, isn't it). The
storage should be accessible, because up'ed by drbd.service.

Is the drbd service dependent on the definition files? If so, I guess that when the controller is unavailable, the definitions of the linstor-managed resources are not provided to the satellite, which then cannot create them. It's only a guess; I don't know the linstor/drbd internals.

Did Proxmox try to start another "autoboot" VM before the HA service
kicked in to start the controller VM? That would at least explain it.

Maybe; I have another two VMs with the autoboot feature enabled, but I had set them to start after the controller VM, with a delay of a few minutes.
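
For the record, the start order and delay are set per VM with the "startup" option, along these lines (the VM id and the delay below are only illustrative, not my exact values):

qm set 200 --startup order=1           # controller VM first
qm set 101 --startup order=2,up=300    # one of the other autoboot VMs, 5 minutes delay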

In
that case we have to document that "autoboot" has to be disabled if one
goes that route.

Ok, will retry and hopefully give you some more feedback.

Bye,
rob

Thanks, rck