On Wed, 2017-09-20 at 12:46 +0300, Vladislav Bogdanov wrote:
> Hi,
> 
> as 1.1.17 received a lot of care in pcmk_remote, I decided to try it
> again in a rather big setup (smaller than the previous one, so I'm not
> hit by the IPC disconnects here).
> 
> From the first runs there are still some severe issues when cluster
> nodes are fenced.
> 
> The following results were obtained by killing the DC node
> (md12k-3-srv), which was hosting remote resources for the nodes
> es7700-3-srv, es12kxe-3-srv and es12kxe-7-srv.
> After the fence, the new DC (md12k-1-srv) moved those resources the
> following way:
> =====
> Sep 20 08:53:28 md12k-1-srv pengine[2525]:  notice: Move    es12kxe-3-srv (Started md12k-3-srv -> md12k-4-srv)
> Sep 20 08:53:28 md12k-1-srv pengine[2525]:  notice: Move    es12kxe-7-srv (Started md12k-3-srv -> md12k-1-srv)
> Sep 20 08:53:28 md12k-1-srv pengine[2525]:  notice: Move    es7700-3-srv (Started md12k-3-srv -> md12k-2-srv)
> =====
> 
> * The first issue I found (and I expect it to be the reason for some of
> the other issues) is that pacemaker_remote does not drop the old crmd's
> connection after the new crmd connects.
> As the IPC proxy connections are kept in a hash table, there is a 50%
> chance that remoted tries to reach the old crmd, e.g. to proxy checks
> of node attributes when resources are reprobed.
> That leads to timeouts of those resources' probes, with the consequent
> reaction from the cluster.
> A solution here could be to drop the old IPC proxy connection as soon
> as the new one is established.
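To illustrate the 50% figure above: with both the old and the new crmd
connection sitting in the providers hash table, whichever entry the
iterator happens to yield first wins. A minimal sketch of that failure
mode (hypothetical names, assuming GLib; not the actual
pacemaker_remoted source):

#include <glib.h>

/* Hypothetical stand-in for the table of IPC proxy providers. */
static GHashTable *ipc_providers;

static gpointer
get_any_provider(void)
{
    GHashTableIter iter;
    gpointer key = NULL, value = NULL;

    g_hash_table_iter_init(&iter, ipc_providers);

    /* GHashTable iteration order is unspecified, so with both the old
     * and the new crmd connection present, either one can come back
     * first -- hence the roughly 50% chance of proxying a request over
     * the stale connection. */
    if (g_hash_table_iter_next(&iter, &key, &value)) {
        return value;
    }
    return NULL;
}

int main(void)
{
    ipc_providers = g_hash_table_new(g_str_hash, g_str_equal);
    g_hash_table_insert(ipc_providers, "old-crmd", "stale connection");
    g_hash_table_insert(ipc_providers, "new-crmd", "live connection");

    /* May print either connection, depending on hash order: */
    g_print("picked: %s\n", (char *) get_any_provider());
    return 0;
}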
We can't drop connections from the pacemaker_remoted side, because it
doesn't know anything about the cluster state (e.g. whether the cluster
connection resource is live-migrating). However, we can simply always
use the most recently connected provider, which I think solves the
issue. See commit e9a7e3bb, one of a few recent bug fixes in the master
branch for pacemaker_remoted. It will most likely not make it into 2.0
(which I'm trying to keep focused on deprecated syntax removals), but
the next release after that.

> * I _suspect_ that the issue above could lead to the following lines
> in a cluster node's logs.
> I didn't check, but I suspect that when remoted decides to disconnect
> an old connection after the fenced node comes back up and TCP
> connections are reset, it disconnects the new one too.

pacemaker_remoted doesn't really initiate disconnections except when it
exits. All connections and disconnections should be initiated on the
cluster side.

> At least this issue happens at the same time the fenced node rejoins
> the cluster.
> These logs are for the case where no resources operate on node
> attributes (I removed those resources from the CIB and set a
> stickiness for all the others).
> =====
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:   error: Timed out (10000 ms) while waiting for remote data
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:   error: Unable to receive expected reply, disconnecting.
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:   error: Remote lrmd server disconnected while waiting for reply with id 9823.
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:   error: Unexpected disconnect on remote-node es12kxe-7-srv
> =====

The messages above may be confusing, but the disconnect is originating
on the cluster side, after a timeout waiting for a reply from
pacemaker_remoted. I don't know of any reason that a cluster node
rejoining would cause pacemaker_remoted to stop working, aside from when
the cluster node is the same host running pacemaker_remoted (which I've
done accidentally during testing).
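Concretely, the "most recently connected provider" approach can be
sketched like this (hypothetical names, assuming GLib; see commit
e9a7e3bb for the real change):

#include <glib.h>

/* Proxy providers, newest connection first. */
static GList *ipc_providers = NULL;

static void
provider_connected(gpointer conn)
{
    /* Prepending keeps the most recently connected crmd at the head. */
    ipc_providers = g_list_prepend(ipc_providers, conn);
}

static void
provider_disconnected(gpointer conn)
{
    ipc_providers = g_list_remove(ipc_providers, conn);
}

static gpointer
get_current_provider(void)
{
    /* Always the newest connection; a stale crmd that never went away
     * is simply never chosen. */
    return ipc_providers ? ipc_providers->data : NULL;
}

int main(void)
{
    static char old_conn[] = "old crmd", new_conn[] = "new crmd";

    provider_connected(old_conn);
    provider_connected(new_conn);  /* e.g. after fencing and failover */
    g_print("using: %s\n", (char *) get_current_provider()); /* new crmd */

    provider_disconnected(new_conn);
    g_print("using: %s\n", (char *) get_current_provider()); /* old crmd */
    return 0;
}

The point is that pacemaker_remoted never has to guess which connection
is current: the head of the list is by construction the newest, so it
never proxies anything to a stale crmd even if that connection is still
open.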
> =====
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:   error: Result of monitor operation for es12kxe-7-srv on md12k-1-srv: Error
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:   error: Couldn't perform lrmd_rsc_info operation (timeout=0): -107: Success (0)
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:   error: LRMD disconnected
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:   error: Could not add resource ost0033-es03a to LRM es12kxe-7-srv
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:   error: Invalid resource definition for ost0033-es03a
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: bad input <create_request_adv origin="te_rsc_command" t="crmd" version="3.0.13" subt="request" reference="lrm_invoke-tengine-1505897731-3872" crm_task="lrm_invoke" crm_sys_to="lrmd" crm_sys_from="tengine" crm_host_to="md12k-1-srv" acl_target="hacluster" crm_user="hacluster">
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: bad input   <crm_xml>
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: bad input     <rsc_op id="221" operation="monitor" operation_key="ost0033-es03a_monitor_0" on_node="es12kxe-7-srv" on_node_uuid="es12kxe-7-srv" router_node="md12k-1-srv" transition-key="221:23:7:7df0b5ef-5bfa-4d21-82e3-4dc964d6cffd">
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: bad input       <primitive id="ost0033-es03a" class="ocf" provider="ddn" type="lustre-server"/>
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: bad input       <attributes CRM_meta_on_node="es12kxe-7-srv" CRM_meta_on_node_uuid="es12kxe-7-srv" CRM_meta_op_target_rc="7" CRM_meta_record_pending="false" CRM_meta_timeout="300000" crm_feature_set="3.0.13" device="/dev/ddn/es03a_ost0033" directory="/lustre/es03a/ost0033" lustre_resource_type="ost" manage_directory="1"/>
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: bad input     </rsc_op>
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: bad input   </crm_xml>
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: bad input </create_request_adv>
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: Resource ost0033-es03a no longer exists in the lrmd
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: Action 221 (ost0033-es03a_monitor_0) on es12kxe-7-srv failed (target: 7 vs. rc: 6): Error
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:  notice: Transition aborted by lrm_rsc_op.ost0033-es03a_last_failure_0: Event failed
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: Action 221 (ost0033-es03a_monitor_0) on es12kxe-7-srv failed (target: 7 vs. rc: 6): Error
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:   error: Result of probe operation for ost0033-es03a on es12kxe-7-srv: Error
> Sep 20 08:55:41 md12k-1-srv crmd[11375]:  notice: Transition aborted by operation es12kxe-3-srv_monitor_30000 'create' on md12k-4-srv: Old event
> ...
> Sep 20 08:56:41 md12k-1-srv attrd[2524]:  notice: Node md12k-3-srv state is now member
> Sep 20 08:56:41 md12k-1-srv cib[2511]:  notice: Node md12k-3-srv state is now member
> Sep 20 08:56:41 md12k-1-srv pacemakerd[2398]:  notice: Node md12k-3-srv state is now member
> Sep 20 08:56:41 md12k-1-srv crmd[11375]:  notice: Node md12k-3-srv state is now member
> Sep 20 08:56:41 md12k-1-srv stonith-ng[2522]:  notice: Node md12k-3-srv state is now member
> Sep 20 08:56:41 md12k-1-srv crmd[11375]: warning: No reason to expect node 2 to be down
> Sep 20 08:56:41 md12k-1-srv crmd[11375]:  notice: Stonith/shutdown of md12k-3-srv not matched
> =====
> I cannot check whether that is true until the first issue is fixed.
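For what it's worth, two of the more cryptic numbers in the logs above
can be decoded mechanically. A small standalone check (assuming Linux
errno numbering; the PCMK_OCF_* values below are the standard OCF return
codes plus Pacemaker's extensions, restated here rather than pulled in
from the Pacemaker headers):

#include <errno.h>
#include <stdio.h>
#include <string.h>

/* OCF return codes seen in these logs, plus Pacemaker's extension. */
enum {
    PCMK_OCF_OK              = 0,
    PCMK_OCF_UNKNOWN_ERROR   = 1,
    PCMK_OCF_NOT_CONFIGURED  = 6,
    PCMK_OCF_NOT_RUNNING     = 7,   /* the probes' expected "target" */
    PCMK_OCF_CONNECTION_DIED = 189  /* Pacemaker-specific */
};

int main(void)
{
    /* "-107: Success (0)": Pacemaker passes negative errno values
     * around, and on Linux errno 107 is ENOTCONN, i.e. the TCP
     * connection was already gone; the trailing "Success (0)" is just
     * a zero errno formatted alongside it. */
    printf("errno %d: %s\n", ENOTCONN, strerror(ENOTCONN));

    /* "target: 7 vs. rc: 6": the probe was expected to find the
     * resource not running (7) but instead reported "not
     * configured" (6). */
    printf("expected rc %d (not running), got rc %d (not configured)\n",
           PCMK_OCF_NOT_RUNNING, PCMK_OCF_NOT_CONFIGURED);
    return 0;
}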
> 
> * There are repeated probe results with rc 6 (PCMK_OCF_NOT_CONFIGURED)
> and 189 (PCMK_OCF_CONNECTION_DIED) if probes are run on remote nodes
> after their previous *home* node rejoins. Also there are some with
> rc 1 (PCMK_OCF_UNKNOWN_ERROR):
> =====
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: Action 221 (ost0033-es03a_monitor_0) on es12kxe-7-srv failed (target: 7 vs. rc: 6): Error
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: Action 221 (ost0033-es03a_monitor_0) on es12kxe-7-srv failed (target: 7 vs. rc: 6): Error
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: Action 194 (ost0033-es02a_monitor_0) on es12kxe-3-srv failed (target: 7 vs. rc: 6): Error
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: Action 194 (ost0033-es02a_monitor_0) on es12kxe-3-srv failed (target: 7 vs. rc: 6): Error
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: Action 195 (ost0033-es03a_monitor_0) on es12kxe-3-srv failed (target: 7 vs. rc: 189): Error
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: Action 195 (ost0033-es03a_monitor_0) on es12kxe-3-srv failed (target: 7 vs. rc: 189): Error
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: Action 196 (ost0033-es04a_monitor_0) on es12kxe-3-srv failed (target: 7 vs. rc: 189): Error
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: Action 196 (ost0033-es04a_monitor_0) on es12kxe-3-srv failed (target: 7 vs. rc: 189): Error
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: Action 246 (ost0033-es04a_monitor_0) on es7700-3-srv failed (target: 7 vs. rc: 189): Error
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: Action 246 (ost0033-es04a_monitor_0) on es7700-3-srv failed (target: 7 vs. rc: 189): Error
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: Action 247 (ost0034-es01a_monitor_0) on es7700-3-srv failed (target: 7 vs. rc: 189): Error
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: Action 247 (ost0034-es01a_monitor_0) on es7700-3-srv failed (target: 7 vs. rc: 189): Error
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: Action 245 (ost0033-es03a_monitor_0) on es7700-3-srv failed (target: 7 vs. rc: 1): Error
> Sep 20 08:55:41 md12k-1-srv crmd[11375]: warning: Action 245 (ost0033-es03a_monitor_0) on es7700-3-srv failed (target: 7 vs. rc: 1): Error
> =====
> The rc=6 ones look absolutely incorrect.
> The rc=1 one comes from a probe op that finished with rc=7 (NOT_RUNNING).
> The rc=189 ones seem to be expected.
> 
> Hope this info has some value.
> Best regards,
> Vladislav
-- 
Ken Gaillot <kgail...@redhat.com>

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org