Sorry, I was using the wrong hostnames for those networks. Using the debug log, I found that it was not finding "this node" in the conf file.

Gabriele

Sonicle S.r.l. : http://www.sonicle.com
Music: http://www.gabrielebulfon.com
Quantum Mechanics : http://www.cdbaby.com/cd/gabrielebulfon

From: Gabriele Bulfon
To: Cluster Labs - All topics related to open-source clustering welcomed
Date: 26 July 2020 11:23:53 CEST
Subject: Re: [ClusterLabs] pacemaker startup problem

Thanks, I ran it manually, which is why I got those errors; when run from the service script, PCMK_ipc_type is correctly set to socket.
But now I see these:

Jul 26 11:08:16 [4039] pacemakerd: info: crm_log_init: Changed active directory to /sonicle/var/cluster/lib/pacemaker/cores
Jul 26 11:08:16 [4039] pacemakerd: info: mcp_read_config: cmap connection setup failed: CS_ERR_LIBRARY. Retrying in 1s
Jul 26 11:08:17 [4039] pacemakerd: info: mcp_read_config: cmap connection setup failed: CS_ERR_LIBRARY. Retrying in 2s
Jul 26 11:08:19 [4039] pacemakerd: info: mcp_read_config: cmap connection setup failed: CS_ERR_LIBRARY. Retrying in 3s
Jul 26 11:08:22 [4039] pacemakerd: info: mcp_read_config: cmap connection setup failed: CS_ERR_LIBRARY. Retrying in 4s
Jul 26 11:08:26 [4039] pacemakerd: info: mcp_read_config: cmap connection setup failed: CS_ERR_LIBRARY. Retrying in 5s
Jul 26 11:08:31 [4039] pacemakerd: warning: mcp_read_config: Could not connect to Cluster Configuration Database API, error 2
Jul 26 11:08:31 [4039] pacemakerd: notice: main: Could not obtain corosync config data, exiting
Jul 26 11:08:31 [4039] pacemakerd: info: crm_xml_cleanup: Cleaning up memory from libxml2

So I think I need to start corosync first (right?), but it dies with this:

Jul 26 11:07:06 [4027] xstorage1 corosync notice [MAIN ] Corosync Cluster Engine ('2.4.1'): started and ready to provide service.
Jul 26 11:07:06 [4027] xstorage1 corosync info [MAIN ] Corosync built-in features: bindnow
Jul 26 11:07:06 [4027] xstorage1 corosync notice [TOTEM ] Initializing transport (UDP/IP Multicast).
Jul 26 11:07:06 [4027] xstorage1 corosync notice [TOTEM ] Initializing transmit/receive security (NSS) crypto: none hash: none
Jul 26 11:07:06 [4027] xstorage1 corosync notice [TOTEM ] The network interface [10.100.100.1] is now up.
Jul 26 11:07:06 [4027] xstorage1 corosync notice [SERV ] Service engine loaded: corosync configuration map access [0]
Jul 26 11:07:06 [4027] xstorage1 corosync notice [YKD ] Service engine loaded: corosync configuration service [1]
Jul 26 11:07:06 [4027] xstorage1 corosync notice [YKD ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Jul 26 11:07:06 [4027] xstorage1 corosync notice [YKD ] Service engine loaded: corosync profile loading service [4]
Jul 26 11:07:06 [4027] xstorage1 corosync notice [QUORUM] Using quorum provider corosync_votequorum
Jul 26 11:07:06 [4027] xstorage1 corosync crit [QUORUM] Quorum provider: corosync_votequorum failed to initialize.
Jul 26 11:07:06 [4027] xstorage1 corosync error [SERV ] Service engine 'corosync_quorum' failed to load for reason 'configuration error: nodelist or quorum.expected_votes must be configured!'
Jul 26 11:07:06 [4027] xstorage1 corosync error [MAIN ] Corosync Cluster Engine exiting with status 20 at /data/sources/sonicle/xstream-storage-gate/components/cluster/corosync/corosync-2.4.1/exec/service.c:356.

My corosync conf has nodelist configured!
Here it is:

service {
    ver: 1
    name: pacemaker
    use_mgmtd: no
    use_logd: no
}

totem {
    version: 2
    crypto_cipher: none
    crypto_hash: none
    interface {
        ringnumber: 0
        bindnetaddr: 10.100.100.0
        mcastaddr: 239.255.1.1
        mcastport: 5405
        ttl: 1
    }
}

nodelist {
    node {
        ring0_addr: xstorage1
        nodeid: 1
    }
    node {
        ring0_addr: xstorage2
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}

logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    logfile: /sonicle/var/log/cluster/corosync.log
    to_syslog: no
    debug: off
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}

Sonicle S.r.l. : http://www.sonicle.com
Music: http://www.gabrielebulfon.com
Quantum Mechanics : http://www.cdbaby.com/cd/gabrielebulfon

----------------------------------------------------------------------------------

From: Ken Gaillot
To: Cluster Labs - All topics related to open-source clustering welcomed
Date: 25 July 2020 0:46:52 CEST
Subject: Re: [ClusterLabs] pacemaker startup problem

On Fri, 2020-07-24 at 18:34 +0200, Gabriele Bulfon wrote:

Hello,

after a long time I'm back to running heartbeat/pacemaker/corosync on our XStreamOS/illumos distro.
I rebuilt the original components I did in 2016 on our latest release (probably a bit outdated, but I want to start from where I left off).
Looks like pacemaker is having trouble starting up, showing these logs:

Set r/w permissions for uid=401, gid=401 on /var/log/pacemaker.log
Set r/w permissions for uid=401, gid=401 on /var/log/pacemaker.log
Jul 24 18:21:32 [971] crmd: info: crm_log_init: Changed active directory to /sonicle/var/cluster/lib/pacemaker/cores
Jul 24 18:21:32 [971] crmd: info: main: CRM Git Version: 1.1.15 (e174ec8)
Jul 24 18:21:32 [971] crmd: info: do_log: Input I_STARTUP received in state S_STARTING from crmd_init
Jul 24 18:21:32 [969] lrmd: info: crm_log_init: Changed active directory to /sonicle/var/cluster/lib/pacemaker/cores
Jul 24 18:21:32 [968] stonith-ng: info: crm_log_init: Changed active directory to /sonicle/var/cluster/lib/pacemaker/cores
Jul 24 18:21:32 [968] stonith-ng: info: get_cluster_type: Verifying cluster type: 'heartbeat'
Jul 24 18:21:32 [968] stonith-ng: info: get_cluster_type: Assuming an active 'heartbeat' cluster
Jul 24 18:21:32 [968] stonith-ng: notice: crm_cluster_connect: Connecting to cluster infrastructure: heartbeat
Jul 24 18:21:32 [969] lrmd: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Operation not supported (-48)

This is repeated for all the subdaemons. The error is coming from qb_ipcs_run(), and it looks like the issue is an invalid PCMK_ipc_type for illumos. If you set it to "socket" it should work.
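
As a rough illustration of the suggestion above, the following Python sketch builds an environment with PCMK_ipc_type set to "socket" before launching the daemon by hand. The helper names pacemaker_env and start_pacemakerd are made up for this example, and it assumes pacemakerd is on the PATH, which may not hold on every install; a service script would normally do the equivalent.

```python
import os
import subprocess


def pacemaker_env(base_env=None):
    """Return a copy of the environment with PCMK_ipc_type set to 'socket'.

    base_env defaults to the current process environment. The input
    mapping is copied, never modified in place.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["PCMK_ipc_type"] = "socket"
    return env


def start_pacemakerd(binary="pacemakerd"):
    """Launch the daemon with the adjusted environment.

    `binary` is an assumption: adjust it to wherever pacemakerd lives
    on the system in question.
    """
    return subprocess.Popen([binary], env=pacemaker_env())
```

Calling pacemaker_env() alone is enough to inspect the resulting environment without starting anything.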
Jul 24 18:21:32 [969] lrmd: error: main: Failed to create IPC server: shutting down and inhibiting respawn
Jul 24 18:21:32 [969] lrmd: info: crm_xml_cleanup: Cleaning up memory from libxml2
Jul 24 18:21:32 [971] crmd: info: get_cluster_type: Verifying cluster type: 'heartbeat'
Jul 24 18:21:32 [971] crmd: info: get_cluster_type: Assuming an active 'heartbeat' cluster
Jul 24 18:21:32 [971] crmd: info: start_subsystem: Starting sub-system "pengine"
Jul 24 18:21:32 [968] stonith-ng: info: crm_get_peer: Created entry 25bc5492-a49e-40d7-ae60-fd8f975a294a/80886f0 for node xstorage1/0 (1 total)
Jul 24 18:21:32 [968] stonith-ng: info: crm_get_peer: Node 0 has uuid d426a730-5229-6758-853a-99d4d491514a
Jul 24 18:21:32 [968] stonith-ng: info: register_heartbeat_conn: Hostname: xstorage1
Jul 24 18:21:32 [968] stonith-ng: info: register_heartbeat_conn: UUID: d426a730-5229-6758-853a-99d4d491514a
Jul 24 18:21:32 [970] attrd: notice: crm_cluster_connect: Connecting to cluster infrastructure: heartbeat
Jul 24 18:21:32 [970] attrd: error: mainloop_add_ipc_server: Could not start attrd IPC server: Operation not supported (-48)
Jul 24 18:21:32 [970] attrd: error: attrd_ipc_server_init: Failed to create attrd servers: exiting and inhibiting respawn.
Jul 24 18:21:32 [970] attrd: warning: attrd_ipc_server_init: Verify pacemaker and pacemaker_remote are not both enabled.
Jul 24 18:21:32 [972] pengine: info: crm_log_init: Changed active directory to /sonicle/var/cluster/lib/pacemaker/cores
Jul 24 18:21:32 [972] pengine: error: mainloop_add_ipc_server: Could not start pengine IPC server: Operation not supported (-48)
Jul 24 18:21:32 [972] pengine: error: main: Failed to create IPC server: shutting down and inhibiting respawn
Jul 24 18:21:32 [972] pengine: info: crm_xml_cleanup: Cleaning up memory from libxml2
Jul 24 18:21:33 [971] crmd: info: do_cib_control: Could not connect to the CIB service: Transport endpoint is not connected
Jul 24 18:21:33 [971] crmd: warning: do_cib_control: Couldn't complete CIB registration 1 times... pause and retry
Jul 24 18:21:33 [971] crmd: error: crmd_child_exit: Child process pengine exited (pid=972, rc=100)
Jul 24 18:21:35 [971] crmd: info: crm_timer_popped: Wait Timer (I_NULL) just popped (2000ms)
Jul 24 18:21:36 [971] crmd: info: do_cib_control: Could not connect to the CIB service: Transport endpoint is not connected
Jul 24 18:21:36 [971] crmd: warning: do_cib_control: Couldn't complete CIB registration 2 times... pause and retry
Jul 24 18:21:38 [971] crmd: info: crm_timer_popped: Wait Timer (I_NULL) just popped (2000ms)
Jul 24 18:21:39 [971] crmd: info: do_cib_control: Could not connect to the CIB service: Transport endpoint is not connected
Jul 24 18:21:39 [971] crmd: warning: do_cib_control: Couldn't complete CIB registration 3 times... pause and retry
Jul 24 18:21:41 [971] crmd: info: crm_timer_popped: Wait Timer (I_NULL) just popped (2000ms)
Jul 24 18:21:42 [971] crmd: info: do_cib_control: Could not connect to the CIB service: Transport endpoint is not connected
Jul 24 18:21:42 [971] crmd: warning: do_cib_control: Couldn't complete CIB registration 4 times... pause and retry
Jul 24 18:21:42 [968] stonith-ng: error: setup_cib: Could not connect to the CIB service: Transport endpoint is not connected (-134)
Jul 24 18:21:42 [968] stonith-ng: error: mainloop_add_ipc_server: Could not start stonith-ng IPC server: Operation not supported (-48)
Jul 24 18:21:42 [968] stonith-ng: error: stonith_ipc_server_init: Failed to create stonith-ng servers: exiting and inhibiting respawn.
Jul 24 18:21:42 [968] stonith-ng: warning: stonith_ipc_server_init: Verify pacemaker and pacemaker_remote are not both enabled.

Any idea what's happening?

Gabriele

Sonicle S.r.l. : http://www.sonicle.com
Music: http://www.gabrielebulfon.com
Quantum Mechanics : http://www.cdbaby.com/cd/gabrielebulfon

_______________________________________________
Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users
ClusterLabs home: https://www.clusterlabs.org/

--
Ken Gaillot
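
For anyone hitting the same "nodelist or quorum.expected_votes must be configured" error with a nodelist that looks fine, the resolution reported at the top of this thread was that the hostnames in the conf did not match "this node". A minimal Python sketch of a sanity check along those lines follows. The function names are hypothetical, the parsing is a plain regex over the conf text rather than a real corosync.conf parser, and it only compares names as strings; corosync's own matching of the local node is not reproduced here.

```python
import re
import socket


def nodelist_addrs(conf_text):
    """Return every ring0_addr value found in the given conf text."""
    return re.findall(r"ring0_addr:\s*(\S+)", conf_text)


def local_node_listed(conf_text, hostname=None):
    """Check whether `hostname` appears as a ring0_addr entry.

    hostname defaults to this machine's name. The comparison is a
    case-insensitive string match only, with no DNS lookup.
    """
    if hostname is None:
        hostname = socket.gethostname()
    wanted = hostname.lower()
    return any(addr.lower() == wanted for addr in nodelist_addrs(conf_text))


if __name__ == "__main__":
    # Node names taken from the conf posted in this thread.
    sample = """
    nodelist {
        node {
            ring0_addr: xstorage1
            nodeid: 1
        }
        node {
            ring0_addr: xstorage2
            nodeid: 2
        }
    }
    """
    print(nodelist_addrs(sample))
    print(local_node_listed(sample, "xstorage1"))
```

If the check returns False for the local hostname, the names in the nodelist are worth a second look before digging into the quorum settings.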