On Mon, 2017-07-24 at 20:52 +0200, Lentes, Bernd wrote:
> Hi,
>
> Just to be sure: I have a VirtualDomain resource (called
> prim_vm_servers_alive) running on one node (ha-idg-2). For reasons I
> don't remember, I have a location constraint:
>
>   location cli-prefer-prim_vm_servers_alive prim_vm_servers_alive \
>     role=Started inf: ha-idg-2
>
> Now I am trying to put this node into standby, because I need to
> reboot it. As I understand it, the resource can't migrate to node
> ha-idg-1 because of this constraint. Right?

Right, the "inf:" makes it mandatory.

BTW, the "cli-" at the beginning indicates that the constraint was
created by a command-line tool such as pcs, crm shell or crm_resource.
Such tools implement "ban"/"move"-type commands by adding constraints
like this one, and then offer a separate manual command to remove them
again (e.g. "pcs resource clear").
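A minimal sketch of that cleanup with each of the tools mentioned
above (exact spellings vary a bit between tool versions, so treat
these as illustrative rather than definitive):

    # pcs
    pcs resource clear prim_vm_servers_alive

    # crm shell
    crm resource unmigrate prim_vm_servers_alive

    # plain Pacemaker CLI
    crm_resource --clear --resource prim_vm_servers_alive

Any one of these should remove the "cli-prefer-..." constraint, after
which the cluster is free to place the resource on either node again.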
Right, the "inf:" makes it mandatory. BTW, the "cli-" at the beginning indicates that this was created by a command-line tool such as pcs, crm shell or crm_resource. Such tools implement "ban"/"move" type commands by adding such constraints, and then offer a separate manual command to remove such constraints (e.g. "pcs resource clear"). > > That's what the log says: > Jul 21 18:03:50 ha-idg-2 VirtualDomain(prim_vm_servers_alive)[28565]: ERROR: > Server_Monitoring: live migration to qemu+ssh://ha-idg-1/system failed: 1 > Jul 21 18:03:50 ha-idg-2 lrmd[8573]: notice: operation_finished: > prim_vm_servers_alive_migrate_to_0:28565:stderr [ error: Requested operation > is not valid: domain 'Server_Monitoring' is already active ] > Jul 21 18:03:50 ha-idg-2 crmd[8576]: notice: process_lrm_event: Operation > prim_vm_servers_alive_migrate_to_0: unknown error (node=ha-idg-2, call=114, > rc=1, cib-update=572, confirmed=true) > Jul 21 18:03:50 ha-idg-2 crmd[8576]: notice: process_lrm_event: > ha-idg-2-prim_vm_servers_alive_migrate_to_0:114 [ error: Requested operation > is not valid: domain 'Server_Monitoring' is already active\n ] > Jul 21 18:03:50 ha-idg-2 crmd[8576]: warning: status_from_rc: Action 64 > (prim_vm_servers_alive_migrate_to_0) on ha-idg-2 failed (target: 0 vs. rc: > 1): Error > Jul 21 18:03:50 ha-idg-2 crmd[8576]: notice: abort_transition_graph: > Transition aborted by prim_vm_servers_alive_migrate_to_0 'modify' on > ha-idg-2: Event failed > (magic=0:1;64:417:0:656ecd4a-f8e8-46c9-b4e6-194616237988, cib=0.879.5, sou > rce=match_graph_event:350, 0) > Jul 21 18:03:50 ha-idg-2 crmd[8576]: warning: status_from_rc: Action 64 > (prim_vm_servers_alive_migrate_to_0) on ha-idg-2 failed (target: 0 vs. rc: > 1): Error > Jul 21 18:03:53 ha-idg-2 VirtualDomain(prim_vm_mausdb)[28564]: ERROR: > mausdb_vm: live migration to qemu+ssh://ha-idg-1/system failed: 1 > > That is the way i understand "Requested operation is not valid". It's not > possible because of the constraint. > I just wanted to be sure. And because the resource can't be migrated but the > host is going to standby the resource is stopped. Right ? > > Strange is that a second resource also running on node ha-idg-2 called > prim_vm_mausdb also didn't migrate to the other node. And that's something i > don't understand completely. > The resource didn't have any location constraint. > Both VirtualDomains have a vnc server configured (that i can monitor the boot > procedure if i have starting problems). The vnc port for prim_vm_mausdb is > 5900 in the configuration file. > The port is set to auto for prim_vm_servers_alive because i forgot to > configure it fix. So it must be s.th like 5900+ because both resources were > running concurrently on the same node. 
> But there is no VM running there, and I don't have a standalone VNC
> server configured. Why is the port occupied?

Can't help there.

> Btw: are the network sockets live-migrated too during a live
> migration of a VirtualDomain resource? It should be like that.
>
> Thanks.
>
> Bernd

My memory is hazy, but I think TCP connections are migrated as long as
the migration completes within the TCP timeout. I could be
mis-remembering.
-- 
Ken Gaillot <kgail...@redhat.com>
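The TCP question is easy to probe by hand if you want certainty: hold
a long-lived TCP session open into the guest while live-migrating it,
and watch whether the session survives. A rough sketch, assuming the
guest is reachable over SSH under the hypothetical name "guest-host":

    # leave a long-lived TCP session running into the guest
    ssh guest-host 'while true; do date; sleep 1; done' &

    # live-migrate the domain between the nodes by hand
    virsh migrate --live Server_Monitoring qemu+ssh://ha-idg-1/system

    # if the loop keeps printing timestamps afterwards, the
    # established TCP connection survived the migration

That matches the recollection above: the guest's memory, including its
network state, moves with it, so established connections survive as
long as the brief switchover pause stays under the peers' TCP timeouts.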