On Mon, Jan 22, 2018 at 10:09 PM, brainheadz <brainhe...@gmail.com> wrote:
> Hello Andrei,
>
> yes, this fixes the issue. But is there a way to automate this process
> without manual intervention?
>

Normally, adding and removing this constraint is a manual process by design.
Do you mean this constraint appears again without you being aware of it?
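
For reference, a constraint whose name starts with cli- is normally created
and removed with commands along these lines (a rough sketch only; the
resource and node names below are taken from your configuration, and older
crmsh versions use "unmove" instead of "clear"):

# crm resource move default_gw_clone fw-managed-01
(adds a location constraint named cli-prefer-...)
# crm resource clear default_gw_clone
(removes that constraint again)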

> Node1 fails.
>
> Node2 takes over vip_bad and the IPsrcaddr.
>
> Node1 is back online.
>
> vip_bad and the IPsrcaddr are moved back to Node1.
>
> Node2 sets the correct default_gw and its own source address again
> (configured via vip_bad_2 and vip_bad_2_location).
> ^- this happens if I execute the cleanup manually
>
> # crm resource cleanup default_gw_clone
> Cleaning up default_gw:0 on fw-managed-01, removing fail-count-default_gw
> Cleaning up default_gw:0 on fw-managed-02, removing fail-count-default_gw
> Waiting for 2 replies from the CRMd.. OK
>
> # crm status
> Last updated: Mon Jan 22 19:43:22 2018          Last change: Mon Jan 22 19:43:17 2018 by hacluster via crmd on fw-managed-01
> Stack: corosync
> Current DC: fw-managed-01 (version 1.1.14-70404b0) - partition with quorum
> 2 nodes and 6 resources configured
>
> Online: [ fw-managed-01 fw-managed-02 ]
>
> Full list of resources:
>
>  vip_managed    (ocf::heartbeat:IPaddr2):       Started fw-managed-01
>  vip_bad        (ocf::heartbeat:IPaddr2):       Started fw-managed-01
>  Clone Set: default_gw_clone [default_gw]
>      Started: [ fw-managed-01 fw-managed-02 ]
>  src_address    (ocf::heartbeat:IPsrcaddr):     Started fw-managed-01
>  vip_bad_2      (ocf::heartbeat:IPaddr2):       Started fw-managed-02
>
> Failed Actions:
> * src_address_monitor_0 on fw-managed-02 'unknown error' (1): call=18,
>   status=complete, exitreason='[/usr/lib/heartbeat/findif -C] failed',
>   last-rc-change='Fri Jan 19 17:10:43 2018', queued=0ms, exec=75ms
>
> root@fw-managed-02:~# ip r
> default via 100.200.123.161 dev bad
> 100.200.123.160/29 dev bad proto kernel scope link src 100.200.123.165
> 172.18.0.0/16 dev tun0 proto kernel scope link src 172.18.0.1
> 172.30.40.0/24 dev managed proto kernel scope link src 172.30.40.252
> root@fw-managed-02:~# ping 8.8.8.8
> PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
> 64 bytes from 8.8.8.8: icmp_seq=1 ttl=60 time=3.57 ms
> ^C
>
> On Mon, Jan 22, 2018 at 7:29 PM, Andrei Borzenkov <arvidj...@gmail.com>
> wrote:
>>
>> 22.01.2018 20:54, brainheadz wrote:
>> > Hello,
>> >
>> > I've got 2 public IPs and 2 hosts.
>> >
>> > Each IP is assigned to one host. The interfaces are not configured by
>> > the system; I am using Pacemaker to do this.
>> >
>> > fw-managed-01: 100.200.123.166/29
>> > fw-managed-02: 100.200.123.165/29
>> >
>> > gateway: 100.200.123.161
>> >
>> > I am trying to get some form of active/passive cluster. fw-managed-01
>> > is the active node. If it fails, fw-managed-02 has to take over the VIP
>> > and change its IPsrcaddr. This works so far. But when fw-managed-01
>> > comes back online, the default gateway is not set again on node
>> > fw-managed-02.
>> >
>> > I'm quite new to this topic. The cluster would work that way, but the
>> > passive node can never reach the internet because of the missing
>> > default gateway.
>> >
>> > Can anyone explain what I am missing or doing wrong here?
>> >
>> > Output
>> >
>> > # crm configure show
>> > node 1: fw-managed-01
>> > node 2: fw-managed-02
>> > primitive default_gw Route \
>> >         op monitor interval=10s \
>> >         params destination=default device=bad gateway=100.200.123.161
>> > primitive src_address IPsrcaddr \
>> >         op monitor interval=10s \
>> >         params ipaddress=100.200.123.166
>> > primitive vip_bad IPaddr2 \
>> >         op monitor interval=10s \
>> >         params nic=bad ip=100.200.123.166 cidr_netmask=29
>> > primitive vip_bad_2 IPaddr2 \
>> >         op monitor interval=10s \
>> >         params nic=bad ip=100.200.123.165 cidr_netmask=29
>> > primitive vip_managed IPaddr2 \
>> >         op monitor interval=10s \
>> >         params ip=172.30.40.254 cidr_netmask=24
>> > clone default_gw_clone default_gw \
>> >         meta clone-max=2 target-role=Started
>> > location cli-prefer-default_gw default_gw_clone role=Started inf: fw-managed-01
>>
>> As far as I can tell, this restricts the clone to one node only. As its
>> name starts with cli-, it was created using something like "crm resource
>> move" or similar. Try
>>
>> crm resource clear default_gw_clone
>>
>> > location src_address_location src_address inf: fw-managed-01
>> > location vip_bad_2_location vip_bad_2 inf: fw-managed-02
>> > location vip_bad_location vip_bad inf: fw-managed-01
>> > order vip_before_default_gw inf: vip_bad:start src_address:start symmetrical=true
>> > location vip_managed_location vip_managed inf: fw-managed-01
>> > property cib-bootstrap-options: \
>> >         have-watchdog=false \
>> >         dc-version=1.1.14-70404b0 \
>> >         cluster-infrastructure=corosync \
>> >         cluster-name=debian \
>> >         stonith-enabled=false \
>> >         no-quorum-policy=ignore \
>> >         last-lrm-refresh=1516362207 \
>> >         start-failure-is-fatal=false
>> >
>> > # crm status
>> > Last updated: Mon Jan 22 18:47:12 2018          Last change: Fri Jan 19 17:04:12 2018 by root via cibadmin on fw-managed-01
>> > Stack: corosync
>> > Current DC: fw-managed-01 (version 1.1.14-70404b0) - partition with quorum
>> > 2 nodes and 6 resources configured
>> >
>> > Online: [ fw-managed-01 fw-managed-02 ]
>> >
>> > Full list of resources:
>> >
>> >  vip_managed    (ocf::heartbeat:IPaddr2):       Started fw-managed-01
>> >  vip_bad        (ocf::heartbeat:IPaddr2):       Started fw-managed-01
>> >  Clone Set: default_gw_clone [default_gw]
>> >      default_gw (ocf::heartbeat:Route):  FAILED fw-managed-02 (unmanaged)
>> >      Started: [ fw-managed-01 ]
>> >  src_address    (ocf::heartbeat:IPsrcaddr):     Started fw-managed-01
>> >  vip_bad_2      (ocf::heartbeat:IPaddr2):       Started fw-managed-02
>> >
>> > Failed Actions:
>> > * default_gw_stop_0 on fw-managed-02 'not installed' (5): call=26,
>> >   status=complete, exitreason='Gateway address 100.200.123.161 is unreachable.',
>> >   last-rc-change='Fri Jan 19 17:10:43 2018', queued=0ms, exec=31ms
>> > * src_address_monitor_0 on fw-managed-02 'unknown error' (1): call=18,
>> >   status=complete, exitreason='[/usr/lib/heartbeat/findif -C] failed',
>> >   last-rc-change='Fri Jan 19 17:10:43 2018', queued=0ms, exec=75ms
>> >
>> >
>> > best regards,
>> > Tobias
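
One more quick check: after clearing the constraint, you can verify that no
cli-* constraint is left and that the clone is started on both nodes again,
using the same commands already shown in this thread (the grep filter is
only illustrative):

# crm configure show | grep cli-
# crm status
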
_______________________________________________
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org