On 08.04.2016 17:51, Jan Friesse wrote:
On 04/08/16 13:01, Jan Friesse wrote:
  >> pacemaker 1.1.12-11.12
  >> openais 1.1.4-5.24.5
  >> corosync 1.4.7-0.23.5
  >>
  >> It's a two-node active/passive cluster. We just upgraded from SLES 11
  >> SP3 to SLES 11 SP4 (nothing else), but when we try to start the
  >> cluster service we get the following error:
  >>
  >> "Totem is unable to form a cluster because of an operating system or
  >> network fault."
  >>
  >> The firewall is stopped and disabled on both nodes. Both nodes can
  >> ping/ssh/vnc each other.
  >
  > Hard to help. First of all, I would recommend asking SUSE support,
  > because I don't have access to the source code of the corosync
  > 1.4.7-0.23.5 package, so I don't know what patches were added.
  >
  >
Yup, ticket opened with SUSE Support.

  >>
  >>
  >>
  >> /var/log/messages:
  >> Apr  6 17:51:49 prd1 corosync[8672]:  [MAIN  ] Corosync Cluster Engine ('1.4.7'): started and ready to provide service.
  >> Apr  6 17:51:49 prd1 corosync[8672]:  [MAIN  ] Corosync built-in features: nss
  >> Apr  6 17:51:49 prd1 corosync[8672]:  [MAIN  ] Successfully configured openais services to load
  >> Apr  6 17:51:49 prd1 corosync[8672]:  [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
  >> Apr  6 17:51:49 prd1 corosync[8672]:  [TOTEM ] Initializing transport (UDP/IP Unicast).
  >> Apr  6 17:51:49 prd1 corosync[8672]:  [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
  >> Apr  6 17:51:49 prd1 corosync[8672]:  [TOTEM ] The network interface is down.
  >
  > ^^^ This is the important line. It means corosync was unable to find an
  > interface for bindnetaddr 192.168.150.0. Make sure an interface with
  > this network address exists.
  >
  >
This machine has two IPv4 addresses assigned on interface bond0:

bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
      link/ether 74:e6:e2:73:e5:61 brd ff:ff:ff:ff:ff:ff
      inet 10.150.20.91/24 brd 10.150.20.55 scope global bond0
      inet 192.168.150.12/22 brd 192.168.151.255 scope global bond0:cluster
      inet6 fe80::76e6:e2ff:fe73:e561/64 scope link
         valid_lft forever preferred_lft forever
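
(Side note, possibly unrelated: with a /22 prefix, the network address of
192.168.150.12 is 192.168.148.0, not 192.168.150.0:

      150 = 10010110
  AND 252 = 11111100
            --------
            10010100 = 148  ->  network 192.168.148.0/22

which is consistent with the brd 192.168.151.255 shown above. If
bindnetaddr is literally 192.168.150.0, a cheap experiment is to try
192.168.148.0 there instead, or the node's own IP as suggested further
down.)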

Is this ifconfig output? I'm just wondering how you were able to set two
IPv4 addresses (in this format I would expect another interface like
bond0:1, or nothing at all)?


That is how the Linux network stack has worked for the last 10 or 15
years. The bond0:1 form is just legacy emulation for ifconfig addicts.

ip addr add 10.150.20.91/24 dev bond0
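
Both addresses are then visible with, e.g.:

# ip -4 addr show dev bond0

(unlike plain ifconfig, ip lists every IPv4 address on the device.)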

Hmm.

RHEL 6:

# tunctl -p
Set 'tap0' persistent and owned by uid 0

# ip addr add 192.168.7.1/24 dev tap0
# ip addr add 192.168.8.1/24 dev tap0
# ifconfig tap0
tap0      Link encap:Ethernet  HWaddr 22:95:B1:85:67:3F
          inet addr:192.168.7.1  Bcast:0.0.0.0  Mask:255.255.255.0
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

RHEL 7:
# ip tuntap add dev tap0 mode tap
# ip addr add 192.168.7.1/24 dev tap0
# ip addr add 192.168.8.1/24 dev tap0
# ifconfig tap0
tap0: flags=4098<BROADCAST,MULTICAST>  mtu 1500
        inet 192.168.7.1  netmask 255.255.255.0  broadcast 0.0.0.0
        ether 36:02:5c:ff:29:ea  txqueuelen 500  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

So where do you see 192.168.8.1 in ifconfig output?
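
(For completeness: net-tools ifconfig lists only one IPv4 address per
interface name; a secondary address added with ip addr appears in
ifconfig only when it carries a legacy label, e.g.:

# ip addr add 192.168.8.1/24 dev tap0 label tap0:1
# ifconfig tap0:1

Note that the bond0 listing from the cluster node above does carry such a
label, bond0:cluster, and its format is that of ip addr, not ifconfig.)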


Anyway, I tried creating a bonding interface and setting a second IPv4
address (via ip addr), and corosync (flatiron, i.e. 1.4.8 plus 4 patches
completely unrelated to your problem) was able to detect it without any
problem.

I can recommend trying the following:
- Set bindnetaddr to the IP address of the given node (so bindnetaddr
differs on each node; a minimal sketch follows below)
- Try upstream corosync 1.4.8/flatiron
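
For the first option, the totem section of corosync.conf would look
roughly like this (a minimal sketch only; the port and the peer address
192.168.150.13 are illustrative and must match your real setup):

totem {
        version: 2
        # the log shows UDP/IP Unicast, so transport is udpu
        transport: udpu
        interface {
                ringnumber: 0
                # this node's own IP; the second node uses its own address
                bindnetaddr: 192.168.150.12
                mcastport: 5405
                member {
                        memberaddr: 192.168.150.12
                }
                member {
                        # assumed peer address, adjust to your second node
                        memberaddr: 192.168.150.13
                }
        }
}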

Regards,
   Honza


And I can ping 192.168.150.12 from this machine and from other machines on
the network.
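
Worth noting: ping and ssh only exercise ICMP and TCP, while totem runs
over UDP. If omping happens to be installed on both nodes, it can confirm
that unicast UDP actually gets through (the second hostname is assumed;
run the same command on each node):

# omping prd1 prd2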



--
Regards,

Muhammad Sharfuddin

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org

