Hello,

I ran some tests with Proxmox and LINSTOR on two nodes, and I have problems with my controller failover tests. In preparation, I stop the controller service on my first node (vpx3-1) and copy the contents of the database folder to the second node (vpx3-2).
In my test, I abruptly shut down the controller node.
The VM running on the active node remains operational even though the storage becomes inaccessible, which is fine. The storage becomes operational again after I promote my second node (vpx3-2) to controller and modify storage.cfg, which is also fine. However, if I stop the VM on this node, I cannot start it again (the same happens if I reboot the node):

TASK ERROR: command 'drbdsetup wait-connect-resource vm-900-disk-1' failed: got timeout

Likewise, creating a new VM fails:
SUCCESS:
Description:
New resource definition 'vm-101-disk-1' created.
Details:
Resource definition 'vm-101-disk-1' UUID is: 746a418f-62e5-4669-b3fe-f49dbb0dee82
SUCCESS:
Description:
Resource definition 'vm-101-disk-1' modified.
Details:
Resource definition 'vm-101-disk-1' UUID is: 746a418f-62e5-4669-b3fe-f49dbb0dee82
SUCCESS:
New volume definition with number '0' of resource definition 'vm-101-disk-1' created.
ERROR:
Description:
Not enough available nodes
Details:
Not enough nodes fulfilling the following auto-place criteria:
* the current access context has enough privileges to use the node and the storage pool
* the node is online
Auto-placing resource: vm-101-disk-1
Show reports:
linstor error-reports show 5BD197AC-00000-000000
TASK ERROR: unable to create VM 101 - error with cfs lock 'storage-drbdstorage': Could not place vm-101-disk-1: exit code 10

However, everything seems OK in the resource listings:
╭───────────────────────────────────────────────────╮
┊ ResourceName  ┊ Node   ┊ Port ┊ Usage  ┊    State ┊
╞┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄╡
┊ vm-900-disk-1 ┊ vpx3-1 ┊ 7000 ┊        ┊  Unknown ┊
┊ vm-900-disk-1 ┊ vpx3-2 ┊ 7000 ┊ Unused ┊ UpToDate ┊
╰───────────────────────────────────────────────────╯
╭──────────────────────────────╮
┊ ResourceName  ┊ Port ┊ State ┊
╞┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄╡
┊ vm-900-disk-1 ┊ 7000 ┊ ok    ┊
╰──────────────────────────────╯
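
To check a listing like this mechanically rather than by eye, here is a minimal Python sketch that parses a table in the format of the first listing above and reports every replica whose state is not UpToDate. The function names are my own and are not part of LINSTOR, and the sketch assumes the table keeps the column layout shown here.

```python
# Sketch: flag resources whose state is not UpToDate.
# Assumes the '┊'-separated table layout pasted above; the helper
# names are hypothetical and not part of any LINSTOR tooling.

def parse_resource_table(text):
    """Return one dict per data row of the pasted resource table."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("┊"):
            continue  # skip border lines and blank lines
        cells = [cell.strip() for cell in line.strip("┊").split("┊")]
        if header is None:
            header = cells  # first '┊' line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def not_up_to_date(rows):
    """Return (resource, node, state) for every row that is not UpToDate."""
    return [
        (row["ResourceName"], row["Node"], row["State"])
        for row in rows
        if row["State"] != "UpToDate"
    ]


# Rows copied from the first listing above.
SAMPLE = """
┊ ResourceName  ┊ Node   ┊ Port ┊ Usage  ┊    State ┊
┊ vm-900-disk-1 ┊ vpx3-1 ┊ 7000 ┊        ┊  Unknown ┊
┊ vm-900-disk-1 ┊ vpx3-2 ┊ 7000 ┊ Unused ┊ UpToDate ┊
"""

if __name__ == "__main__":
    for resource, node, state in not_up_to_date(parse_resource_table(SAMPLE)):
        print(f"{resource} on {node}: {state}")
```

With the rows above, it reports vm-900-disk-1 on vpx3-1 as Unknown, which is the node I shut down.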

The only warnings I see are these, about the SSL network communication service:

Oct 26 11:05:37 vpx3-2 Controller[7805]: 11:05:37.055 [Main] INFO LINSTOR/Controller - Current security level is MAC
Oct 26 11:05:37 vpx3-2 Controller[7805]: 11:05:37.080 [Main] INFO LINSTOR/Controller - Core objects load from database is in progress
Oct 26 11:05:37 vpx3-2 Controller[7805]: 11:05:37.282 [Main] INFO LINSTOR/Controller - Core objects load from database completed
Oct 26 11:05:37 vpx3-2 Controller[7805]: 11:05:37.291 [Main] INFO LINSTOR/Controller - Initializing network communications services
Oct 26 11:05:37 vpx3-2 Controller[7805]: 11:05:37.305 [Main] WARN LINSTOR/Controller - The SSL network communication service 'DebugSslConnector' could not be started because the keyStore file (ssl/keystore.jks) is missing
Oct 26 11:05:37 vpx3-2 Controller[7805]: 11:05:37.343 [Main] INFO LINSTOR/Controller - Created network communication service 'PlainConnector', bound to ::0:3376
Oct 26 11:05:37 vpx3-2 Controller[7805]: 11:05:37.345 [Main] WARN LINSTOR/Controller - The SSL network communication service 'SslConnector' could not be started because the keyStore file (ssl/keystore.jks) is missing
Oct 26 11:05:37 vpx3-2 Controller[7805]: 11:05:37.347 [Main] INFO LINSTOR/Controller - Reconnecting to previously known nodes
Oct 26 11:05:37 vpx3-2 Controller[7805]: 11:05:37.388 [Main] INFO LINSTOR/Controller - Reconnect requests sent
Oct 26 11:05:37 vpx3-2 Controller[7805]: 11:05:37.390 [Main] INFO LINSTOR/Controller - Controller initialized

Proxmox, LINSTOR, and DRBD are all up to date.

What am I missing?

Thank you for your help,

---
Greb

_______________________________________________
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user
