Re: [ClusterLabs] Could not initialize corosync configuration API error 2
Hi,

On 31/03/2023 11:36, S Sathish S wrote:

Hi Team,

Please find the corosync version.

[root@node2 ~]# rpm -qa corosync
corosync-2.4.4-2.el7.x86_64

RHEL 7 never got 2.4.4 - there was 2.4.3 in RHEL 7.7 and 2.4.5 in RHEL 7.8/7.9. Is this a self-compiled version? If so, please consider updating to the distro-provided package - the RHEL 7 package IS actively maintained.

The firewall is disabled. Please find the debug and trace logs below.

Mar 31 10:07:30 [17684] node2 corosync notice [MAIN ] Corosync Cluster Engine ('UNKNOWN'): started and ready to provide service.
Mar 31 10:07:30 [17684] node2 corosync info [MAIN ] Corosync built-in features: pie relro bindnow
Mar 31 10:07:30 [17684] node2 corosync warning [MAIN ] Could not set SCHED_RR at priority 99: Operation not permitted (1)

This is weird - is corosync running as root?

Mar 31 10:07:30 [17684] node2 corosync debug [QB] shm size:8388621; real_size:8392704; rb->word_size:2098176
Mar 31 10:07:30 [17684] node2 corosync debug [MAIN ] Corosync TTY detached
Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] waiting_trans_ack changed to 1
Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] Token Timeout (5550 ms)
...
Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] entering GATHER state from 11(merge during join).

This is important. It usually means that a forgotten node somewhere is trying to connect to the existing cluster, or that the config files differ between nodes. The solution is:
1. Check that corosync.conf is identical on all nodes.
2. Update to the distro package (2.4.5), which contains the block_unlisted_ips option (enabled by default), and/or generate a new crypto key, distribute it only to the nodes within the cluster (so node1 .. node9), and turn on crypto.

Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] entering GATHER state from 11(merge during join).
Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] entering GATHER state from ...

Please find the corosync conf file.
[root@node2 ~]# cat /etc/corosync/corosync.conf
totem {
    version: 2
    cluster_name: OCC
    secauth: off

it's really a good idea to turn on crypto

    transport: udpu
}
nodelist {
    node {
        ring0_addr: node1
        nodeid: 1
    }
    node {
        ring0_addr: node2
        nodeid: 2
    }
    node {
        ring0_addr: node3
        nodeid: 3
    }
    node {
        ring0_addr: node4
        nodeid: 4
    }
    node {
        ring0_addr: node5
        nodeid: 5
    }
    node {
        ring0_addr: node6
        nodeid: 6
    }
    node {
        ring0_addr: node7
        nodeid: 7
    }
    node {
        ring0_addr: node8
        nodeid: 8
    }
    node {
        ring0_addr: node9
        nodeid: 9
    }
}
quorum {
    provider: corosync_votequorum
}
logging {
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: no
    timestamp:on
}

Regards,
Honza

Thanks and Regards,
S Sathish S

___
Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users
ClusterLabs home: https://www.clusterlabs.org/
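The first step suggested in the reply above, checking that corosync.conf is equal on all nodes, could be done with a small script along these lines. This is only a minimal Python sketch and not something from the thread: it assumes each node's /etc/corosync/corosync.conf has already been copied into one local directory as <node>.conf (a hypothetical layout), and the function names are made up for illustration. It compares the files byte for byte, so it will also flag differences that are whitespace only.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_by_digest(configs):
    """Group node names by the SHA-256 digest of their config contents.

    `configs` maps a node name to the raw bytes of that node's corosync.conf.
    """
    groups = defaultdict(list)
    for node, content in configs.items():
        groups[hashlib.sha256(content).hexdigest()].append(node)
    return dict(groups)


def odd_nodes(configs):
    """Return the nodes whose config differs from the most common version.

    If two versions are equally common, the one seen first is treated as
    the reference, so the result should be read as a hint, not a verdict.
    """
    groups = group_by_digest(configs)
    if len(groups) <= 1:
        return []
    majority = max(groups.values(), key=len)
    return sorted(
        node
        for members in groups.values()
        if members is not majority
        for node in members
    )


def load_configs(directory):
    """Read local copies named <node>.conf (hypothetical layout, see above)."""
    return {
        path.stem: path.read_bytes()
        for path in sorted(Path(directory).glob("*.conf"))
    }
```

With the copies collected in, say, ./conf-copies (again a made-up path), `odd_nodes(load_configs("./conf-copies"))` returns the nodes whose file does not match the majority; an empty list means all copies are identical.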
Re: [ClusterLabs] Could not initialize corosync configuration API error 2
Hi Team,

Please find the corosync version.

[root@node2 ~]# rpm -qa corosync
corosync-2.4.4-2.el7.x86_64

The firewall is disabled. Please find the debug and trace logs below.

Mar 31 10:07:30 [17684] node2 corosync notice [MAIN ] Corosync Cluster Engine ('UNKNOWN'): started and ready to provide service.
Mar 31 10:07:30 [17684] node2 corosync info [MAIN ] Corosync built-in features: pie relro bindnow
Mar 31 10:07:30 [17684] node2 corosync warning [MAIN ] Could not set SCHED_RR at priority 99: Operation not permitted (1)
Mar 31 10:07:30 [17684] node2 corosync debug [QB] shm size:8388621; real_size:8392704; rb->word_size:2098176
Mar 31 10:07:30 [17684] node2 corosync debug [MAIN ] Corosync TTY detached
Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] waiting_trans_ack changed to 1
Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] Token Timeout (5550 ms) retransmit timeout (1321 ms)
Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] token hold (1046 ms) retransmits before loss (4 retrans)
Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] join (50 ms) send_join (0 ms) consensus (6660 ms) merge (200 ms)
Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] downcheck (1000 ms) fail to recv const (2500 msgs)
Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] seqno unchanged const (30 rotations) Maximum network MTU 1401
Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] window size per rotation (50 messages) maximum messages per rotation (17 messages)
Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] missed count const (5 messages)
Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] send threads (0 threads)
Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] RRP token expired timeout (1321 ms)
Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] RRP token problem counter (2000 ms)
Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] RRP threshold (10 problem count)
Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] RRP multicast threshold (100 problem count)
Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] RRP automatic recovery check timeout (1000 ms)
Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] RRP mode set to none.
Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] heartbeat_failures_allowed (0)
Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] max_network_delay (50 ms)
Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
Mar 31 10:07:30 [17684] node2 corosync notice [TOTEM ] Initializing transport (UDP/IP Unicast).
Mar 31 10:07:30 [17684] node2 corosync notice [TOTEM ] Initializing transmit/receive security (NSS) crypto: none hash: none
Mar 31 10:07:30 [17684] node2 corosync trace [QB] grown poll array to 2 for FD 8
Mar 31 10:07:30 [17684] node2 corosync notice [TOTEM ] The network interface [10.33.59.175] is now up.
Mar 31 10:07:30 [17684] node2 corosync debug [TOTEM ] Created or loaded sequence id 540.10.33.59.175 for this ring.
Mar 31 10:07:30 [17684] node2 corosync notice [SERV ] Service engine loaded: corosync configuration map access [0]
Mar 31 10:07:30 [17684] node2 corosync debug [MAIN ] Initializing IPC on cmap [0]
Mar 31 10:07:30 [17684] node2 corosync debug [MAIN ] No configured qb.ipc_type. Using native ipc
Mar 31 10:07:30 [17684] node2 corosync info [QB] server name: cmap
Mar 31 10:07:30 [17684] node2 corosync trace [QB] grown poll array to 3 for FD 9
Mar 31 10:07:30 [17684] node2 corosync notice [SERV ] Service engine loaded: corosync configuration service [1]
Mar 31 10:07:30 [17684] node2 corosync debug [MAIN ] Initializing IPC on cfg [1]
Mar 31 10:07:30 [17684] node2 corosync debug [MAIN ] No configured qb.ipc_type. Using native ipc
Mar 31 10:07:30 [17684] node2 corosync info [QB] server name: cfg
Mar 31 10:07:30 [17684] node2 corosync trace [QB] grown poll array to 4 for FD 10
Mar 31 10:07:30 [17684] node2 corosync notice [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Mar 31 10:07:30 [17684] node2 corosync debug [MAIN ] Initializing IPC on cpg [2]
Mar 31 10:07:30 [17684] node2 corosync debug [MAIN ] No configured qb.ipc_type. Using native ipc
Mar 31 10:07:30 [17684] node2 corosync info [QB] server name: cpg
Mar 31 10:07:30 [17684] node2 corosync trace [QB] grown poll array to 5 for FD 11
Mar 31 10:07:30 [17684] node2 corosync notice [SERV ] Service engine loaded: corosync profile loading service [4]
Mar 31 10:07:30 [17684] node2 corosync debug [MAIN ] NOT Initializing IPC on pload [4]
Mar 31 10:07:30 [17684] node2 corosync notice [QUORUM] Using quorum provider corosync_votequorum
Mar 31 10:07:30 [17684] node2 corosync trace [VOTEQ ] ENTERING votequorum_init()
Mar 31 10:07:30 [17684] node2 corosync trace [VOTEQ ] ENTERING votequorum_exec_init_fn()
Mar 31
Re: [ClusterLabs] Could not initialize corosync configuration API error 2
Hi,

more information is needed to find the real reason, so:
- double check corosync.conf (IP addresses)
- check the firewall (mainly the local one)
- what is the version of corosync?
- try to set debug: on (or trace)
- paste the config file
- paste the full log, from when corosync was started

Also keep in mind that version 2.x is no longer supported by upstream, so if that is what you run, you have to contact your distribution provider's support.

Regards,
Honza

On 30/03/2023 12:08, S Sathish S via Users wrote:

Hi Team,

We are unable to start the corosync service on a node that is already part of an existing cluster and had been running fine for a long time. The corosync server is now unable to join and reports "Could not initialize corosync configuration API error 2". Please find the logs below.

[root@node1 ~]# systemctl status corosync
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/usr/lib/systemd/system/corosync.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Thu 2023-03-30 10:49:58 WAT; 7min ago
     Docs: man:corosync
           man:corosync.conf
           man:corosync_overview
  Process: 9922 ExecStop=/usr/share/corosync/corosync stop (code=exited, status=0/SUCCESS)
  Process: 9937 ExecStart=/usr/share/corosync/corosync start (code=exited, status=1/FAILURE)

Mar 30 10:48:57 node1 systemd[1]: Starting Corosync Cluster Engine...
Mar 30 10:49:58 node1 corosync[9937]: Starting Corosync Cluster Engine (corosync): [FAILED]
Mar 30 10:49:58 node1 systemd[1]: corosync.service: control process exited, code=exited status=1
Mar 30 10:49:58 node1 systemd[1]: Failed to start Corosync Cluster Engine.
Mar 30 10:49:58 node1 systemd[1]: Unit corosync.service entered failed state.
Mar 30 10:49:58 node1 systemd[1]: corosync.service failed.

Please find the corosync error logs:

Mar 30 10:49:52 [9947] node1 corosync debug [MAIN ] Denied connection, corosync is not ready
Mar 30 10:49:52 [9947] node1 corosync warning [QB] Denied connection, is not ready (9948-10497-23)
Mar 30 10:49:52 [9947] node1 corosync debug [MAIN ] cs_ipcs_connection_destroyed()
Mar 30 10:49:52 [9947] node1 corosync debug [MAIN ] Denied connection, corosync is not ready
Mar 30 10:49:57 [9947] node1 corosync debug [MAIN ] cs_ipcs_connection_destroyed()
Mar 30 10:49:58 [9947] node1 corosync notice [MAIN ] Node was shut down by a signal
Mar 30 10:49:58 [9947] node1 corosync notice [SERV ] Unloading all Corosync service engines.
Mar 30 10:49:58 [9947] node1 corosync info [QB] withdrawing server sockets
Mar 30 10:49:58 [9947] node1 corosync debug [QB] qb_ipcs_unref() - destroying
Mar 30 10:49:58 [9947] node1 corosync notice [SERV ] Service engine unloaded: corosync vote quorum service v1.0
Mar 30 10:49:58 [9947] node1 corosync info [QB] withdrawing server sockets
Mar 30 10:49:58 [9947] node1 corosync debug [QB] qb_ipcs_unref() - destroying
Mar 30 10:49:58 [9947] node1 corosync notice [SERV ] Service engine unloaded: corosync configuration map access
Mar 30 10:49:58 [9947] node1 corosync info [QB] withdrawing server sockets
Mar 30 10:49:58 [9947] node1 corosync debug [QB] qb_ipcs_unref() - destroying
Mar 30 10:49:58 [9947] node1 corosync notice [SERV ] Service engine unloaded: corosync configuration service
Mar 30 10:49:58 [9947] node1 corosync info [QB] withdrawing server sockets
Mar 30 10:49:58 [9947] node1 corosync debug [QB] qb_ipcs_unref() - destroying
Mar 30 10:49:58 [9947] node1 corosync notice [SERV ] Service engine unloaded: corosync cluster closed process group service v1.01
Mar 30 10:49:58 [9947] node1 corosync info [QB] withdrawing server sockets
Mar 30 10:49:58 [9947] node1 corosync debug [QB] qb_ipcs_unref() - destroying
Mar 30 10:49:58 [9947] node1 corosync notice [SERV ] Service engine unloaded: corosync cluster quorum service v0.1
Mar 30 10:49:58 [9947] node1 corosync notice [SERV ] Service engine unloaded: corosync profile loading service
Mar 30 10:49:58 [9947] node1 corosync debug [TOTEM ] sending join/leave message
Mar 30 10:49:58 [9947] node1 corosync notice [MAIN ] Corosync Cluster Engine exiting normally

Starting the corosync service manually also produces the error below.

[root@node1 ~]# bash -x /usr/share/corosync/corosync start
+ desc='Corosync Cluster Engine'
+ prog=corosync
+ PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/sbin
+ '[' -f /etc/sysconfig/corosync ']'
+ . /etc/sysconfig/corosync
++ COROSYNC_INIT_TIMEOUT=60
++ COROSYNC_OPTIONS=
+ case '/etc/sysconfig' in
+ '[' -f /etc/init.d/functions ']'
+ . /etc/init.d/functions
++ TEXTDOMAIN=initscripts
++ umask 022
++ PATH=/sbin:/usr/sbin:/bin:/usr/bin
++ export PATH
++ '[' 28864 -ne 1 -a -z '' ']'
++ '[' -d /run/systemd/system ']'
++ case "$0" in
++ '[' -z '' ']'
++ COLUMNS=80
++ '[' -z '' ']'
++ '[' -c /dev/stderr -a -r /dev/stderr ']'
+++ /sbin/consoletype
++ CONSOLETYPE=pty
++ '[' -z '' ']'
++ '[' -z '' ']'
++ '[' -f
[ClusterLabs] Could not initialize corosync configuration API error 2
Hi Team,

We are unable to start the corosync service on a node that is already part of an existing cluster and had been running fine for a long time. The corosync server is now unable to join and reports "Could not initialize corosync configuration API error 2". Please find the logs below.

[root@node1 ~]# systemctl status corosync
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/usr/lib/systemd/system/corosync.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Thu 2023-03-30 10:49:58 WAT; 7min ago
     Docs: man:corosync
           man:corosync.conf
           man:corosync_overview
  Process: 9922 ExecStop=/usr/share/corosync/corosync stop (code=exited, status=0/SUCCESS)
  Process: 9937 ExecStart=/usr/share/corosync/corosync start (code=exited, status=1/FAILURE)

Mar 30 10:48:57 node1 systemd[1]: Starting Corosync Cluster Engine...
Mar 30 10:49:58 node1 corosync[9937]: Starting Corosync Cluster Engine (corosync): [FAILED]
Mar 30 10:49:58 node1 systemd[1]: corosync.service: control process exited, code=exited status=1
Mar 30 10:49:58 node1 systemd[1]: Failed to start Corosync Cluster Engine.
Mar 30 10:49:58 node1 systemd[1]: Unit corosync.service entered failed state.
Mar 30 10:49:58 node1 systemd[1]: corosync.service failed.

Please find the corosync error logs:

Mar 30 10:49:52 [9947] node1 corosync debug [MAIN ] Denied connection, corosync is not ready
Mar 30 10:49:52 [9947] node1 corosync warning [QB] Denied connection, is not ready (9948-10497-23)
Mar 30 10:49:52 [9947] node1 corosync debug [MAIN ] cs_ipcs_connection_destroyed()
Mar 30 10:49:52 [9947] node1 corosync debug [MAIN ] Denied connection, corosync is not ready
Mar 30 10:49:57 [9947] node1 corosync debug [MAIN ] cs_ipcs_connection_destroyed()
Mar 30 10:49:58 [9947] node1 corosync notice [MAIN ] Node was shut down by a signal
Mar 30 10:49:58 [9947] node1 corosync notice [SERV ] Unloading all Corosync service engines.
Mar 30 10:49:58 [9947] node1 corosync info [QB] withdrawing server sockets
Mar 30 10:49:58 [9947] node1 corosync debug [QB] qb_ipcs_unref() - destroying
Mar 30 10:49:58 [9947] node1 corosync notice [SERV ] Service engine unloaded: corosync vote quorum service v1.0
Mar 30 10:49:58 [9947] node1 corosync info [QB] withdrawing server sockets
Mar 30 10:49:58 [9947] node1 corosync debug [QB] qb_ipcs_unref() - destroying
Mar 30 10:49:58 [9947] node1 corosync notice [SERV ] Service engine unloaded: corosync configuration map access
Mar 30 10:49:58 [9947] node1 corosync info [QB] withdrawing server sockets
Mar 30 10:49:58 [9947] node1 corosync debug [QB] qb_ipcs_unref() - destroying
Mar 30 10:49:58 [9947] node1 corosync notice [SERV ] Service engine unloaded: corosync configuration service
Mar 30 10:49:58 [9947] node1 corosync info [QB] withdrawing server sockets
Mar 30 10:49:58 [9947] node1 corosync debug [QB] qb_ipcs_unref() - destroying
Mar 30 10:49:58 [9947] node1 corosync notice [SERV ] Service engine unloaded: corosync cluster closed process group service v1.01
Mar 30 10:49:58 [9947] node1 corosync info [QB] withdrawing server sockets
Mar 30 10:49:58 [9947] node1 corosync debug [QB] qb_ipcs_unref() - destroying
Mar 30 10:49:58 [9947] node1 corosync notice [SERV ] Service engine unloaded: corosync cluster quorum service v0.1
Mar 30 10:49:58 [9947] node1 corosync notice [SERV ] Service engine unloaded: corosync profile loading service
Mar 30 10:49:58 [9947] node1 corosync debug [TOTEM ] sending join/leave message
Mar 30 10:49:58 [9947] node1 corosync notice [MAIN ] Corosync Cluster Engine exiting normally

Starting the corosync service manually also produces the error below.

[root@node1 ~]# bash -x /usr/share/corosync/corosync start
+ desc='Corosync Cluster Engine'
+ prog=corosync
+ PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/sbin
+ '[' -f /etc/sysconfig/corosync ']'
+ . /etc/sysconfig/corosync
++ COROSYNC_INIT_TIMEOUT=60
++ COROSYNC_OPTIONS=
+ case '/etc/sysconfig' in
+ '[' -f /etc/init.d/functions ']'
+ . /etc/init.d/functions
++ TEXTDOMAIN=initscripts
++ umask 022
++ PATH=/sbin:/usr/sbin:/bin:/usr/bin
++ export PATH
++ '[' 28864 -ne 1 -a -z '' ']'
++ '[' -d /run/systemd/system ']'
++ case "$0" in
++ '[' -z '' ']'
++ COLUMNS=80
++ '[' -z '' ']'
++ '[' -c /dev/stderr -a -r /dev/stderr ']'
+++ /sbin/consoletype
++ CONSOLETYPE=pty
++ '[' -z '' ']'
++ '[' -z '' ']'
++ '[' -f /etc/sysconfig/i18n -o -f /etc/locale.conf ']'
++ . /etc/profile.d/lang.sh
++ unset LANGSH_SOURCED
++ '[' -z '' ']'
++ '[' -f /etc/sysconfig/init ']'
++ . /etc/sysconfig/init
+++ BOOTUP=color
+++ RES_COL=60
+++ MOVE_TO_COL='echo -en \033[60G'
+++ SETCOLOR_SUCCESS='echo -en \033[0;32m'
+++ SETCOLOR_FAILURE='echo -en \033[0;31m'
+++ SETCOLOR_WARNING='echo -en \033[0;33m'
+++ SETCOLOR_NORMAL='echo -en \033[0;39m'
++ '[' pty = serial ']'
++