Re: [ClusterLabs] pacemaker startup problem

2020-07-26 Thread Gabriele Bulfon
Thanks. I had run it manually, which is why I got those errors; when run from
the service script it correctly sets PCMK_ipc_type to socket.
 
But now I see this:
Jul 26 11:08:16 [4039] pacemakerd: info: crm_log_init: Changed active directory 
to /sonicle/var/cluster/lib/pacemaker/cores
Jul 26 11:08:16 [4039] pacemakerd: info: mcp_read_config: cmap connection setup 
failed: CS_ERR_LIBRARY. Retrying in 1s
Jul 26 11:08:17 [4039] pacemakerd: info: mcp_read_config: cmap connection setup 
failed: CS_ERR_LIBRARY. Retrying in 2s
Jul 26 11:08:19 [4039] pacemakerd: info: mcp_read_config: cmap connection setup 
failed: CS_ERR_LIBRARY. Retrying in 3s
Jul 26 11:08:22 [4039] pacemakerd: info: mcp_read_config: cmap connection setup 
failed: CS_ERR_LIBRARY. Retrying in 4s
Jul 26 11:08:26 [4039] pacemakerd: info: mcp_read_config: cmap connection setup 
failed: CS_ERR_LIBRARY. Retrying in 5s
Jul 26 11:08:31 [4039] pacemakerd: warning: mcp_read_config: Could not connect 
to Cluster Configuration Database API, error 2
Jul 26 11:08:31 [4039] pacemakerd: notice: main: Could not obtain corosync 
config data, exiting
Jul 26 11:08:31 [4039] pacemakerd: info: crm_xml_cleanup: Cleaning up memory 
from libxml2
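
The CS_ERR_LIBRARY retries come from pacemakerd trying to set up a corosync
cmap connection before corosync is reachable. A minimal standalone probe of
the same API can confirm that (a sketch, assuming the corosync 2.x
development headers; compile roughly with cc probe.c -lcmap):

#include <stdio.h>
#include <corosync/cmap.h>

int main(void)
{
    cmap_handle_t handle;
    /* The same connection setup that mcp_read_config retries above;
     * anything other than CS_OK means corosync is not reachable. */
    cs_error_t rc = cmap_initialize(&handle);

    if (rc != CS_OK) {
        fprintf(stderr, "cmap_initialize failed: %d\n", (int) rc);
        return 1;
    }
    printf("corosync cmap is reachable\n");
    cmap_finalize(handle);
    return 0;
}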
 
So I think I need to start corosync first (right?), but it dies with this:
 
Jul 26 11:07:06 [4027] xstorage1 corosync notice [MAIN ] Corosync Cluster 
Engine ('2.4.1'): started and ready to provide service.
Jul 26 11:07:06 [4027] xstorage1 corosync info [MAIN ] Corosync built-in 
features: bindnow
Jul 26 11:07:06 [4027] xstorage1 corosync notice [TOTEM ] Initializing 
transport (UDP/IP Multicast).
Jul 26 11:07:06 [4027] xstorage1 corosync notice [TOTEM ] Initializing 
transmit/receive security (NSS) crypto: none hash: none
Jul 26 11:07:06 [4027] xstorage1 corosync notice [TOTEM ] The network interface 
[10.100.100.1] is now up.
Jul 26 11:07:06 [4027] xstorage1 corosync notice [SERV ] Service engine loaded: 
corosync configuration map access [0]
Jul 26 11:07:06 [4027] xstorage1 corosync notice [YKD ] Service engine loaded: 
corosync configuration service [1]
Jul 26 11:07:06 [4027] xstorage1 corosync notice [YKD ] Service engine loaded: 
corosync cluster closed process group service v1.01 [2]
Jul 26 11:07:06 [4027] xstorage1 corosync notice [YKD ] Service engine loaded: 
corosync profile loading service [4]
Jul 26 11:07:06 [4027] xstorage1 corosync notice [QUORUM] Using quorum provider 
corosync_votequorum
Jul 26 11:07:06 [4027] xstorage1 corosync crit [QUORUM] Quorum provider: 
corosync_votequorum failed to initialize.
Jul 26 11:07:06 [4027] xstorage1 corosync error [SERV ] Service engine 
'corosync_quorum' failed to load for reason 'configuration error: nodelist or 
quorum.expected_votes must be configured!'
Jul 26 11:07:06 [4027] xstorage1 corosync error [MAIN ] Corosync Cluster Engine 
exiting with status 20 at 
/data/sources/sonicle/xstream-storage-gate/components/cluster/corosync/corosync-2.4.1/exec/service.c:356.
My corosync conf has nodelist configured! Here it is:
 
service {
    ver: 1
    name: pacemaker
    use_mgmtd: no
    use_logd: no
}
totem {
    version: 2
    crypto_cipher: none
    crypto_hash: none
    interface {
        ringnumber: 0
        bindnetaddr: 10.100.100.0
        mcastaddr: 239.255.1.1
        mcastport: 5405
        ttl: 1
    }
}
nodelist {
    node {
        ring0_addr: xstorage1
        nodeid: 1
    }
    node {
        ring0_addr: xstorage2
        nodeid: 2
    }
}
quorum {
    provider: corosync_votequorum
    two_node: 1
}
logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    logfile: /sonicle/var/log/cluster/corosync.log
    to_syslog: no
    debug: off
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}
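
For what it's worth, votequorum accepts quorum.expected_votes as an
alternative to a nodelist, so a quorum section like the following sketch
(assumed, not from the thread) would also satisfy the loader; the follow-up
below shows the real cause was that the nodelist names did not match this
node:

quorum {
    provider: corosync_votequorum
    expected_votes: 2
    two_node: 1
}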
 
 
 
 
Sonicle S.r.l.: http://www.sonicle.com
Music: http://www.gabrielebulfon.com
Quantum Mechanics: http://www.cdbaby.com/cd/gabrielebulfon
--
From: Ken Gaillot
To: Cluster Labs - All topics related to open-source clustering welcomed
Date: 25 July 2020 0:46:52 CEST
Subject: Re: [ClusterLabs] pacemaker startup problem
On Fri, 2020-07-24 at 18:34 +0200, Gabriele Bulfon wrote:
Hello,
after a long time I'm back to run heartbeat/pacemaker/corosync on our
XStreamOS/illumos distro.
I rebuilt the original components I did in 2016 on our latest release
(probably a bit outdated, but I want to start from where I left).
Looks like pacemaker is having trouble starting up, showing these logs:
Set r/w permissions for uid=401, gid=401 on /var/log/pacemaker.log
Set r/w permissions for uid=401, gid=401 on /var/log/pacemaker.log
Jul 24 18:21:32 [971] crmd: info: crm_log_init: Changed active
directory to /sonicle/var/cluster/lib/pacemaker/cores
Jul 24 18:21:32 [971] crmd: info: main: CRM Git Version: 1.1.15
(e174ec8)
Jul 24 18:21:32 [971] crmd: info: do_log: Input I_STARTUP received in
state S_STARTING from crmd_init

Re: [ClusterLabs] pacemaker startup problem

2020-07-26 Thread Gabriele Bulfon
Sorry, I was using the wrong hostnames for that network; with the debug log I
found it was not finding "this node" in the conf file.
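
For reference, the extra detail that exposed the name mismatch likely comes
from corosync's logging section; a minimal sketch of turning it on (assumed,
not shown in the thread):

logging {
    debug: on
}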
 
Gabriele
 
 
Sonicle S.r.l.: http://www.sonicle.com
Music: http://www.gabrielebulfon.com
Quantum Mechanics: http://www.cdbaby.com/cd/gabrielebulfon

Re: [ClusterLabs] pacemaker startup problem

2020-07-26 Thread Gabriele Bulfon
Sorry, the problem is actually not gone yet.
Corosync and pacemaker now run happily, but those IPC errors come out of
heartbeat and crmd as soon as I start them.
The pacemakerd process has PCMK_ipc_type=socket; what's wrong with heartbeat
or crmd?
 
Here's the env of the process:
 
sonicle@xstorage1:/sonicle/etc/cluster/ha.d# penv 4222
4222: /usr/sbin/pacemakerd
envp[0]: PCMK_respawned=true
envp[1]: PCMK_watchdog=false
envp[2]: HA_LOGFACILITY=none
envp[3]: HA_logfacility=none
envp[4]: PCMK_logfacility=none
envp[5]: HA_logfile=/sonicle/var/log/cluster/corosync.log
envp[6]: PCMK_logfile=/sonicle/var/log/cluster/corosync.log
envp[7]: HA_debug=0
envp[8]: PCMK_debug=0
envp[9]: HA_quorum_type=corosync
envp[10]: PCMK_quorum_type=corosync
envp[11]: HA_cluster_type=corosync
envp[12]: PCMK_cluster_type=corosync
envp[13]: HA_use_logd=off
envp[14]: PCMK_use_logd=off
envp[15]: HA_mcp=true
envp[16]: PCMK_mcp=true
envp[17]: HA_LOGD=no
envp[18]: LC_ALL=C
envp[19]: PCMK_service=pacemakerd
envp[20]: PCMK_ipc_type=socket
envp[21]: SMF_ZONENAME=global
envp[22]: PWD=/
envp[23]: SMF_FMRI=svc:/sonicle/xstream/cluster/pacemaker:default
envp[24]: _=/usr/sbin/pacemakerd
envp[25]: TZ=Europe/Rome
envp[26]: LANG=en_US.UTF-8
envp[27]: SMF_METHOD=start
envp[28]: SHLVL=2
envp[29]: PATH=/usr/sbin:/usr/bin
envp[30]: SMF_RESTARTER=svc:/system/svc/restarter:default
envp[31]: A__z="*SHLVL
 
 
Here are crmd complaints:
 
Jul 26 11:39:07 xstorage1 crmd[4315]: [ID 702911 daemon.notice] notice: Node 
xstorage1 state is now member
Jul 26 11:39:07 xstorage1 crmd[4315]: [ID 702911 daemon.error] error: Could not 
start crmd IPC server: Operation not supported (-48)
Jul 26 11:39:07 xstorage1 crmd[4315]: [ID 702911 daemon.error] error: Failed to 
create IPC server: shutting down and inhibiting respawn
Jul 26 11:39:07 xstorage1 crmd[4315]: [ID 702911 daemon.notice] notice: The 
local CRM is operational
Jul 26 11:39:07 xstorage1 crmd[4315]: [ID 702911 daemon.error] error: Input 
I_ERROR received in state S_STARTING from do_started
Jul 26 11:39:07 xstorage1 crmd[4315]: [ID 702911 daemon.notice] notice: State 
transition S_STARTING -> S_RECOVERY
Jul 26 11:39:07 xstorage1 crmd[4315]: [ID 702911 daemon.warning] warning: 
Fast-tracking shutdown in response to errors
Jul 26 11:39:07 xstorage1 crmd[4315]: [ID 702911 daemon.warning] warning: Input 
I_PENDING received in state S_RECOVERY from do_started
Jul 26 11:39:07 xstorage1 crmd[4315]: [ID 702911 daemon.error] error: Input 
I_TERMINATE received in state S_RECOVERY from do_recover
Jul 26 11:39:07 xstorage1 crmd[4315]: [ID 702911 daemon.notice] notice: 
Disconnected from the LRM
Jul 26 11:39:07 xstorage1 crmd[4315]: [ID 702911 daemon.error] error: Child 
process pengine exited (pid=4316, rc=100)
Jul 26 11:39:07 xstorage1 crmd[4315]: [ID 702911 daemon.error] error: Could not 
recover from internal error
Jul 26 11:39:07 xstorage1 heartbeat: [ID 996084 daemon.warning] [4275]: WARN: 
Managed /usr/libexec/pacemaker/crmd process 4315 exited with return code 201.
 
 
Sonicle S.r.l.: http://www.sonicle.com
Music: http://www.gabrielebulfon.com
Quantum Mechanics: http://www.cdbaby.com/cd/gabrielebulfon
--
From: Ken Gaillot
To: Cluster Labs - All topics related to open-source clustering welcomed
Date: 25 July 2020 0:46:52 CEST
Subject: Re: [ClusterLabs] pacemaker startup problem
On Fri, 2020-07-24 at 18:34 +0200, Gabriele Bulfon wrote:
Hello,
after a long time I'm back to run heartbeat/pacemaker/corosync on our
XStreamOS/illumos distro.
I rebuilt the original components I did in 2016 on our latest release
(probably a bit outdated, but I want to start from where I left).
Looks like pacemaker is having trouble starting up, showing these logs:
Set r/w permissions for uid=401, gid=401 on /var/log/pacemaker.log
Set r/w permissions for uid=401, gid=401 on /var/log/pacemaker.log
Jul 24 18:21:32 [971] crmd: info: crm_log_init: Changed active
directory to /sonicle/var/cluster/lib/pacemaker/cores
Jul 24 18:21:32 [971] crmd: info: main: CRM Git Version: 1.1.15
(e174ec8)
Jul 24 18:21:32 [971] crmd: info: do_log: Input I_STARTUP received in
state S_STARTING from crmd_init
Jul 24 18:21:32 [969] lrmd: info: crm_log_init: Changed active
directory to /sonicle/var/cluster/lib/pacemaker/cores
Jul 24 18:21:32 [968] stonith-ng: info: crm_log_init: Changed active
directory to /sonicle/var/cluster/lib/pacemaker/cores
Jul 24 18:21:32 [968] stonith-ng: info: get_cluster_type: Verifying
cluster type: 'heartbeat'
Jul 24 18:21:32 [968] stonith-ng: info: get_cluster_type: Assuming an
active 'heartbeat' cluster
Jul 24 18:21:32 [968] stonith-ng: notice: crm_cluster_connect:
Connecting to cluster infrastructure: heartbeat
Jul 24 18:21:32 [969] lrmd: error: mainloop_add_ipc_server: Could not
start lrmd IPC server: Operation not supported (-48)
This is repeated for all the subdaemons ... the error is coming from
qb_ipcs_run(), which looks like the issue is an in
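
Since qb_ipcs_run() is the suspect, a standalone server driving the same
libqb path can show whether IPC server creation fails outside Pacemaker too.
A sketch, assuming the libqb development headers; the "test_ipc" service name
and the no-op handlers are hypothetical (compile roughly with
cc ipctest.c -lqb):

#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <qb/qbipcs.h>
#include <qb/qbloop.h>

static qb_loop_t *loop;

/* Thin wrappers so libqb can drive its file descriptors from a qb_loop. */
static int32_t
my_dispatch_add(enum qb_loop_priority p, int32_t fd, int32_t evts,
                void *data, qb_ipcs_dispatch_fn_t fn)
{
    return qb_loop_poll_add(loop, p, fd, evts, data, fn);
}

static int32_t
my_dispatch_mod(enum qb_loop_priority p, int32_t fd, int32_t evts,
                void *data, qb_ipcs_dispatch_fn_t fn)
{
    return qb_loop_poll_mod(loop, p, fd, evts, data, fn);
}

static int32_t
my_dispatch_del(int32_t fd)
{
    return qb_loop_poll_del(loop, fd);
}

/* Called only after libqb has authenticated the peer; a credentials
 * failure (-ENOTSUP) never gets this far. */
static int32_t
my_accept(qb_ipcs_connection_t *c, uid_t uid, gid_t gid)
{
    return 0;
}

static int32_t
my_msg_process(qb_ipcs_connection_t *c, void *data, size_t size)
{
    return 0;
}

int
main(void)
{
    struct qb_ipcs_service_handlers sh;
    struct qb_ipcs_poll_handlers ph;
    qb_ipcs_service_t *srv;
    int32_t rc;

    memset(&sh, 0, sizeof(sh));
    sh.connection_accept = my_accept;
    sh.msg_process = my_msg_process;

    memset(&ph, 0, sizeof(ph));
    ph.dispatch_add = my_dispatch_add;
    ph.dispatch_mod = my_dispatch_mod;
    ph.dispatch_del = my_dispatch_del;

    loop = qb_loop_create();
    srv = qb_ipcs_create("test_ipc", 0, QB_IPC_SOCKET, &sh);
    if (srv == NULL) {
        perror("qb_ipcs_create");
        return 1;
    }
    qb_ipcs_poll_handlers_set(srv, &ph);

    /* crmd's "Operation not supported (-48)" was reported from this call. */
    rc = qb_ipcs_run(srv);
    if (rc != 0) {
        fprintf(stderr, "qb_ipcs_run failed: %d\n", rc);
        return 1;
    }
    return qb_loop_run(loop);
}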

Re: [ClusterLabs] pacemaker startup problem

2020-07-26 Thread Reid Wahl
Hmm. If it's reading PCMK_ipc_type and matching the server type to
QB_IPC_SOCKET, then the only other place I see it could be coming from is
qb_ipc_auth_creds.

qb_ipcs_run -> qb_ipcs_us_publish -> qb_ipcs_us_connection_acceptor ->
qb_ipcs_uc_recv_and_auth -> process_auth -> qb_ipc_auth_creds ->

static int32_t
qb_ipc_auth_creds(struct ipc_auth_data *data)
{
...
#ifdef HAVE_GETPEERUCRED
	/*
	 * Solaris and some BSD systems
...
#elif defined(HAVE_GETPEEREID)
	/*
	 * Usually MacOSX systems
...
#elif defined(SO_PASSCRED)
	/*
	 * Usually Linux systems
...
#else /* no credentials */
	data->ugp.pid = 0;
	data->ugp.uid = 0;
	data->ugp.gid = 0;
	res = -ENOTSUP;
#endif /* no credentials */

	return res;
}

I'll leave it to Ken to say whether that's likely and what it implies if so.

On Sun, Jul 26, 2020 at 2:53 AM Gabriele Bulfon  wrote:


Re: [ClusterLabs] pacemaker startup problem

2020-07-26 Thread Reid Wahl
Illumos might have getpeerucred, which can also set errno to ENOTSUP.
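
A quick check of getpeerucred(3C) over a local socketpair would tell whether
the platform call itself fails; a sketch, not from the thread (illumos keeps
the ucred functions in libc, so compile roughly with cc peercred.c):

#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <ucred.h>

int main(void)
{
    int sv[2];
    ucred_t *uc = NULL;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
        perror("socketpair");
        return 1;
    }
    /* The same call libqb's HAVE_GETPEERUCRED branch makes (per the
     * excerpt above); ENOTSUP here would put the problem below libqb. */
    if (getpeerucred(sv[0], &uc) != 0) {
        fprintf(stderr, "getpeerucred: %s\n", strerror(errno));
        return 1;
    }
    printf("peer uid=%d gid=%d pid=%ld\n",
           (int) ucred_geteuid(uc), (int) ucred_getegid(uc),
           (long) ucred_getpid(uc));
    ucred_free(uc);
    return 0;
}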

On Sun, Jul 26, 2020 at 3:25 AM Reid Wahl  wrote:
