Re: [Pacemaker] solaris problem

2013-03-25 Thread Andrei Belov
Andreas,

thank you for sharing this link and your start script!

My goal is to make it possible to build these tools more conveniently, through 
NetBSD's pkgsrc system.
Perhaps using something like --localstatedir=${VARBASE}/cluster for libqb, 
corosync and pacemaker,
and setting the appropriate permissions on /var/cluster, will solve the problem.
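
To make the effect of that option concrete, here is a minimal sketch. It assumes the unpatched libqb code quoted later in this thread, which builds a path as LOCALSTATEDIR "/run/<name>". The helper names configure_args and run_path are made up for illustration, and defaulting VARBASE to /var is an assumption.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of pkgsrc or libqb.

# Default VARBASE to /var (assumption; pkgsrc normally provides it).
: "${VARBASE:=/var}"

# Print the configure option proposed above.
configure_args() {
    printf -- '--localstatedir=%s/cluster\n' "$VARBASE"
}

# Print the path that LOCALSTATEDIR "/run/%s" would expand to.
run_path() {
    local localstatedir=$1 name=$2
    printf '%s/run/%s\n' "$localstatedir" "$name"
}

for pkg in libqb corosync pacemaker; do
    echo "$pkg: ./configure $(configure_args)"
done
run_path "$VARBASE/cluster" cib_ro
```

If this reading is right, the directory that must be writable by the cluster user would be the run subdirectory below the chosen localstatedir, not /var/run itself.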

Thanks again!



Re: [Pacemaker] solaris problem

2013-03-25 Thread LGL Extern
Andrei

There is no need to make this change.

I described in 
http://grueni.github.com/libqb/ 
how I compiled libqb and the other programs.

LOCALSTATEDIR should be defined with ./configure.
Please look at "Compile Corosync" in my description.

I guess your start scripts should be changed.

We use this as the start script called by the SMF instance:
##
#!/usr/bin/bash
# Start/stop HACluster service
#
. /lib/svc/share/smf_include.sh

## Tracing with debug version
# PCMK_trace_files=1
# PCMK_trace_functions=1
# PCMK_trace_formats=1
# PCMK_trace_tags=1

export PCMK_ipc_type=socket
CLUSTER_USER=hacluster
COROSYNC=corosync
PACEMAKERD=pacemakerd
PACEMAKER_PROCESSES=pacemaker
APPPATH=/opt/ha/sbin/
SLEEPINTERVALL=10
SLEEPCOUNT=5
SLEPT=0

# Kill every process whose command line matches $1.
killapp() {
    pid=`pgrep -f "$1"`
    if [ "x$pid" != "x" ]; then
        # $pid stays unquoted on purpose: pgrep may return several PIDs
        kill -9 $pid
    fi
    return 0
}

start0() {
    stop0
    su ${CLUSTER_USER} -c ${APPPATH}${COROSYNC}
    sleep $sleep0
    su ${CLUSTER_USER} -c ${APPPATH}${PACEMAKERD} &
    return 0
}

stop0() {
    # first try, graceful shutdown
    pid=`pgrep -U ${CLUSTER_USER} -f ${PACEMAKERD}`
    if [ "x$pid" != "x" ]; then
        ${APPPATH}${PACEMAKERD} --shutdown
        sleep $SLEEPINTERVALL
    fi
    # second try, kill the rest
    killapp ${APPPATH}${COROSYNC}
    killapp ${PACEMAKER_PROCESSES}
    return 0
}

let sleep0=$SLEEPINTERVALL/2
case "$1" in
'start')
    start0
    ;;
'restart')
    stop0
    start0
    ;;
'stop')
    stop0
    ;;
*)
    echo "Usage: $0 { start | stop | restart }"
    exit 1
    ;;
esac
exit 0
###
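
The script defines SLEEPCOUNT and SLEPT but does not use them, and stop0 always sleeps for the full interval after requesting a shutdown. One possible refinement, offered only as a sketch, is to poll until the process is gone. wait_until_gone is a hypothetical helper; it takes the check as a command so that it can be tried without a running cluster.

```bash
#!/usr/bin/env bash
# Sketch: poll until a check command reports "not running" or retries run out.
# wait_until_gone is a hypothetical helper, not part of Pacemaker or SMF.

# usage: wait_until_gone <max_tries> <interval_seconds> <check command...>
# The check command must succeed (exit 0) while the process still exists.
wait_until_gone() {
    local tries=$1 interval=$2
    shift 2
    local i=0
    while "$@" >/dev/null 2>&1; do
        i=$((i + 1))
        if [ "$i" -ge "$tries" ]; then
            return 1    # still running after all retries
        fi
        sleep "$interval"
    done
    return 0            # process is gone
}

# Possible use inside stop0 (assumes the variables from the script above):
#   ${APPPATH}${PACEMAKERD} --shutdown
#   wait_until_gone "$SLEEPCOUNT" "$SLEEPINTERVALL" \
#       pgrep -U "$CLUSTER_USER" -f "$PACEMAKERD" || killapp "$PACEMAKER_PROCESSES"
```

With this shape the forced kill only runs when the graceful shutdown did not finish within SLEEPCOUNT checks.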

Andreas


-----Original Message-----
From: Andrei Belov [mailto:defana...@gmail.com] 
Sent: Monday, 25 March 2013 15:08
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] solaris problem


Ok, I fixed this issue with the following patch against libqb 0.14.4:

--- lib/unix.c.orig 2013-03-25 12:30:50.445762231 +
+++ lib/unix.c  2013-03-25 12:49:59.322276376 +
@@ -83,7 +83,7 @@
 #if defined(QB_LINUX) || defined(QB_CYGWIN)
         snprintf(path, PATH_MAX, "/dev/shm/%s", file);
 #else
-        snprintf(path, PATH_MAX, LOCALSTATEDIR "/run/%s", file);
+        snprintf(path, PATH_MAX, "%s/%s", SOCKETDIR, file);
         is_absolute = path;
 #endif
     }
@@ -91,7 +91,7 @@
     if (fd < 0 && !is_absolute) {
         qb_util_perror(LOG_ERR, "couldn't open file %s", path);
 
-        snprintf(path, PATH_MAX, LOCALSTATEDIR "/run/%s", file);
+        snprintf(path, PATH_MAX, "%s/%s", SOCKETDIR, file);
         fd = open_mmap_file(path, file_flags);
         if (fd < 0) {
             res = -errno;


libqb was configured with --with-socket-dir=/var/run/qb, /var/run/qb owned by 
hacluster:haclient - this configuration works fine with both corosync 2.3.0 and 
pacemaker 1.1.8.
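
Preparing that directory could look like the following sketch. The mode 0750 and the helper name prepare_socket_dir are assumptions; the path and the hacluster:haclient owner are the ones mentioned above. The ownership change is skipped unless the script runs as root.

```bash
#!/usr/bin/env bash
# Sketch: create the libqb socket directory and hand it to the cluster user.
# prepare_socket_dir is a hypothetical helper; mode 0750 is an assumption.

# usage: prepare_socket_dir <dir> [owner:group]
prepare_socket_dir() {
    local dir=$1 owner=${2:-}
    mkdir -p "$dir" || return 1
    chmod 0750 "$dir" || return 1
    # chown needs root; skip it when not root or when no owner was given.
    if [ -n "$owner" ] && [ "$(id -u)" -eq 0 ]; then
        chown "$owner" "$dir" || return 1
    fi
    return 0
}

# Intended use (as root), matching --with-socket-dir=/var/run/qb:
#   prepare_socket_dir /var/run/qb hacluster:haclient
```

If /var/run is a memory-backed file system that is emptied at boot, as is commonly the case on Solaris, the directory would have to be recreated on every start, so a call like this probably belongs in the start script.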

Though I'm not sure that libqb is the right place to change; maybe it would be 
better to enhance mainloop_add_ipc_server() in pacemaker's 
lib/common/mainloop.c?


Cheers.


On Mar 25, 2013, at 16:01 , Andrei Belov  wrote:

> 
> I've rebuilt libqb using a separate SOCKETDIR (/var/run/qb), and set 
> hacluster:haclient ownership on this dir.
> 
> After that pacemakerd started successfully with all its children:
> 
> [root@ha1 /var/run/qb]# pacemakerd -fV
> Could not establish pacemakerd connection: Connection refused (146)
>     info: crm_ipc_connect:  Could not establish pacemakerd connection: Connection refused (146)
>     info: get_cluster_type: Detected an active 'corosync' cluster
>     info: read_config:  Reading configure for stack: corosync
>   notice: crm_add_logfile:  Additional logging available in /var/log/cluster/corosync.log
>   notice: main: Starting Pacemaker 1.1.8 (Build: 1f8858c):  ncurses libqb-logging libqb-ipc upstart systemd  corosync-native
>     info: main: Maximum core file size is: 18446744073709551613
>     info: qb_ipcs_us_publish:   server name: pacemakerd
>   notice: update_node_processes:48de70 Node 182452614 now known as ha1, was: 
>     info: start_child:  Forked child 60719 for process cib
>     info: start_child:  Forked child 60720 for process stonith-ng
>     info: start_child:  Forked child 60721 for process lrmd
>     info: start_child:  Forked child 60722 for process attrd
>     info: start_child:  Forked child 60723 for process pengine
>     info: start_child:  Forked child 60724 for process crmd
>     info: main: Starting mainloop
> 
> [root@ha1 /var/run/qb]# ls -l
> total 0
> srwxrwxrwx 1 

Re: [Pacemaker] solaris problem

2013-03-25 Thread Andrei Belov
Andreas,

I just tried "PCMK_ipc_type=socket pacemakerd -fV"; a bunch of additional 
"event_send" errors appeared:

Mar 25 11:15:55 [33641] ha1 corosync error   [MAIN  ] event_send retuned -32, expected 256!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 217!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 219!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 256!
Mar 25 11:15:55 [53980] pengine: error: qb_ipcs_us_publish: Could not bind AF_UNIX (/var/run/pengine): Permission denied (13)
Mar 25 11:15:55 [53980] pengine: error: mainloop_add_ipc_server: Could not start pengine IPC server: Unknown error (-13)
Mar 25 11:15:55 [53980] pengine: error: main: Couldn't start IPC server
Mar 25 11:15:55 [53975] pacemakerd: error: pcmk_child_exit: Child process pengine exited (pid=53980, rc=1)
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 256!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [53979] attrd: error: qb_ipcs_us_publish: Could not bind AF_UNIX (/var/run/attrd): Permission denied (13)
Mar 25 11:15:55 [53979] attrd: error: mainloop_add_ipc_server: Could not start attrd IPC server: Unknown error (-13)
Mar 25 11:15:55 [53979] attrd: error: main: Could not start IPC server
Mar 25 11:15:55 [53979] attrd: error: main: Aborting startup
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [53975] pacemakerd: error: pcmk_child_exit: Child process attrd exited (pid=53979, rc=100)
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 256!
Mar 25 11:15:55 [53976] cib: error: qb_ipcs_us_publish: Could not bind AF_UNIX (/var/run/cib_ro): Permission denied (13)
Mar 25 11:15:55 [53976] cib: error: mainloop_add_ipc_server: Could not start cib_ro IPC server: Unknown error (-13)
Mar 25 11:15:55 [53976] cib: error: qb_ipcs_us_publish: Could not bind AF_UNIX (/var/run/cib_rw): Permission denied (13)
Mar 25 11:15:55 [53976] cib: error: mainloop_add_ipc_server: Could not start cib_rw IPC server: Unknown error (-13)
Mar 25 11:15:55 [53976] cib: error: qb_ipcs_us_publish: Could not bind AF_UNIX (/var/run/cib_shm): Permission denied (13)
Mar 25 11:15:55 [53976] cib: error: mainloop_add_ipc_server: Could not start cib_shm IPC server: Unknown error (-13)
Mar 25 11:15:55 [53976] cib: error: cib_init: Couldnt start all IPC channels, exiting.
Mar 25 11:15:55 [53975] pacemakerd: error: pcmk_child_exit: Child process cib exited (pid=53976, rc=255)
Mar 25 11:15:55 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 223!
Mar 25 11:16:04 [53977] stonith-ng: error: setup_cib: Could not connect to the CIB service: -134 fd7fc421a0b0
Mar 25 11:16:04 [33641] ha1 corosync error   [SERV  ] event_send retuned -32, expected 217!
Mar 25 11:16:04 [53975] pacemakerd: notice: pcmk_shutdown_worker: Attempting to inhibit respawning after fatal error
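
Since the decisive lines are buried among the event_send messages, a small filter can pull out just the socket paths that failed to bind. This is a sketch; failed_bind_paths is a hypothetical helper, and it keys on the "bind AF_UNIX (...): Permission denied" text seen in the log above.

```bash
#!/usr/bin/env bash
# Sketch: list the AF_UNIX paths that could not be bound, one per line.
# failed_bind_paths is a hypothetical helper; it reads log text on stdin.
failed_bind_paths() {
    sed -n 's/.*bind AF_UNIX (\([^)]*\)): Permission denied.*/\1/p' | sort -u
}

# Example: failed_bind_paths < /var/log/cluster/corosync.log
```

Running ls -ld on the directory part of each reported path then shows whether the cluster user is allowed to create sockets there.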


# fgrep 32 /usr/include/sys/errno.h 
#define EPIPE   32  /* Broken pipe  */
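
A plain fgrep for 32 also matches any line that merely contains those digits, such as 132. A slightly stricter lookup, sketched here, compares the number field exactly and accepts the negative form used in the logs. errno_name is a hypothetical helper, and the default header path is the one used above.

```bash
#!/usr/bin/env bash
# Sketch: map an errno number (optionally negative, as in the logs) to its name.
# errno_name is a hypothetical helper.
# Note: this relies on awk's -v option; on Solaris, nawk or /usr/xpg4/bin/awk
# may be needed instead of /usr/bin/awk (assumption, check your system).

# usage: errno_name <number> [header]
errno_name() {
    local n=${1#-}                                  # strip a leading minus sign
    local header=${2:-/usr/include/sys/errno.h}
    awk -v n="$n" '$1 == "#define" && $3 == n { print $2; exit }' "$header"
}

# Example: errno_name -32
```

The same call can be used for the other codes in this thread, for example errno_name -13 or errno_name -48, without guessing their meaning.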




Re: [Pacemaker] solaris problem

2013-03-25 Thread LGL Extern
With Solaris/OpenIndiana you should use this setting:
export PCMK_ipc_type=socket 
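
The variable has to reach the pacemakerd process and the daemons it forks, so it must be exported or placed on the command line. A small sketch of the difference follows; show_ipc_type is a hypothetical stand-in for the daemon and only prints what a child process would see.

```bash
#!/usr/bin/env bash
# Sketch: show_ipc_type stands in for a child process such as pacemakerd.
show_ipc_type() {
    sh -c 'echo "${PCMK_ipc_type:-unset}"'
}

unset PCMK_ipc_type
PCMK_ipc_type=socket          # plain assignment: not inherited by children
show_ipc_type                 # prints "unset"

export PCMK_ipc_type=socket   # exported: inherited by children
show_ipc_type                 # prints "socket"
```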

Andreas


[Pacemaker] solaris problem

2013-03-25 Thread Andrei Belov
Hi folks,

I'm trying to build a test HA cluster on Solaris 5.11 using libqb 0.14.4, 
corosync 2.3.0 and pacemaker 1.1.8,
and I'm facing a strange problem while starting pacemaker.

The log shows the following errors:

Mar 25 09:21:26 [33720] lrmd: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error (-48)
Mar 25 09:21:26 [33720] lrmd: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
Mar 25 09:21:26 [33720] lrmd: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error (-48)
Mar 25 09:21:26 [33720] lrmd: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
Mar 25 09:21:26 [33720] lrmd: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error (-48)
Mar 25 09:21:26 [33720] lrmd: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
Mar 25 09:21:26 [33720] lrmd: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error (-48)
Mar 25 09:21:26 [33720] lrmd: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
Mar 25 09:21:26 [33720] lrmd: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error (-48)
Mar 25 09:21:26 [33720] lrmd: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
Mar 25 09:21:26 [33720] lrmd: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error (-48)
Mar 25 09:21:26 [33720] lrmd: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
Mar 25 09:21:26 [33720] lrmd: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error (-48)
Mar 25 09:21:26 [33720] lrmd: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
Mar 25 09:21:26 [33720] lrmd: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error (-48)
Mar 25 09:21:26 [33720] lrmd: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
Mar 25 09:21:26 [33720] lrmd: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error (-48)
Mar 25 09:21:26 [33720] lrmd: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
Mar 25 09:21:26 [33720] lrmd: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error (-48)
Mar 25 09:21:26 [33720] lrmd: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
Mar 25 09:21:26 [33720] lrmd: error: main: Failed to allocate lrmd server.  shutting down
Mar 25 09:21:26 [33722] pengine: error: mainloop_add_ipc_server: Could not start pengine IPC server: Unknown error (-48)
Mar 25 09:21:26 [33722] pengine: error: main: Couldn't start IPC server
Mar 25 09:21:26 [33717] pacemakerd: error: pcmk_child_exit: Child process lrmd exited (pid=33720, rc=255)
Mar 25 09:21:26 [33721] attrd: error: qb_ipcs_us_publish: Could not bind AF_UNIX (/var/run/attrd): Permission denied (13)
Mar 25 09:21:26 [33721] attrd: error: mainloop_add_ipc_server: Could not start attrd IPC server: Unknown error (-13)
Mar 25 09:21:26 [33721] attrd: error: main: Could not start IPC server
Mar 25 09:21:26 [33721] attrd: error: main: Aborting startup
Mar 25 09:21:26 [33717] pacemakerd: error: pcmk_child_exit: Child process pengine exited (pid=33722, rc=1)
Mar 25 09:21:26 [33717] pacemakerd: error: pcmk_child_exit: Child process attrd exited (pid=33721, rc=100)
Mar 25 09:21:26 [33718] cib: error: qb_ipcs_us_publish: Could not bind AF_UNIX (/var/run/cib_ro): Permission denied (13)
Mar 25 09:21:26 [33718] cib: error: mainloop_add_ipc_server: Could not start cib_ro IPC server: Unknown error (-13)
Mar 25 09:21:26 [33718] cib: error: qb_ipcs_us_publish: Could not bind AF_UNIX (/var/run/cib_rw): Permission denied (13)
Mar 25 09:21:26 [33718] cib: error: mainloop_add_ipc_server: Could not start cib_rw IPC server: Unknown erro