Hi Reid Wahl,
More log information is below. The cause seems to be that communication with
D-Bus timed out. Any suggestions?
1672712 Jul 24 21:20:17 [3945305] B0610011 lrmd: info:
pcmk_dbus_timeout_dispatch:Timeout 0x147bbd0 expired
1672713 Jul 24 21:20:17 [3945305]
RPM Version Information:
corosync-2.3.4-7.el7_2.1.x86_64
pacemaker-1.1.12-22.el7.x86_64
Coredump file backtrace:
```
warning: .dynamic section for "/lib64/libk5crypto.so.3" is not at the expected
address (wrong library or version mismatch?)
Missing separate debuginfo for
Try: yum
```
corosync.log has been printing the following messages for several days. What is
wrong with the corosync cluster? The CPU load is currently not high.
Cluster version information:
[root@paas-controller-172-167-40-24:~]$ rpm -q corosync
corosync-2.4.0-9.el7_4.2.x86_64
[root@paas-controller-172-167-40-24:~]$
Version information
[root@paas-controller-172-167-40-24:~]$ rpm -q corosync
corosync-2.4.0-9.el7_4.2.x86_64
[root@paas-controller-172-167-40-24:~]$ rpm -q pacemaker
pacemaker-1.1.16-12.el7_4.2.x86_64
The crmd process exited with error code 201. The pacemakerd process tried to
fork 100
I have two Pacemaker resources, which we will call A and B. For environmental
reasons, their start and monitor operations always return failure
(OCF_ERR_GENERIC). Their configurations are shown below (the cluster property
start-failure-is-fatal is set to false):
primitive A A \
op
Great! These two parameters (batch-limit and node-action-limit) solved my problem.
Thank you very much!
By the way, is there any way to find out how many actions are running in
parallel on a node and in the cluster?
At 2018-05-10 20:56:27, "lkxjtu" <lkx...@163.com> wrote:
On Tue, 2018-05-08 at
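For reference, batch-limit and node-action-limit are cluster-wide properties. A minimal sketch of setting and reading them back with crmsh and crm_attribute follows; the values 50 and 10 are illustrative assumptions, not recommendations taken from this thread.

```
# Illustrative values only (assumed for this example); tune for the cluster at hand.
crm configure property batch-limit=50
crm configure property node-action-limit=10

# Read back the values currently stored in the cluster configuration.
crm_attribute --type crm_config --name batch-limit --query
crm_attribute --type crm_config --name node-action-limit --query
```

batch-limit caps the number of actions the cluster executes in parallel across all nodes, while node-action-limit caps the number of actions run in parallel on any single node.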
I have a three-node cluster with about 50 resources. When I reboot all three
nodes at the same time and watch the resources with "crm status", I find that
Pacemaker starts 3-5 resources at a time, from top to bottom, rather than
starting them all at once. Is there a parameter that controls this?
It seems to
> Lkxjtu,
> On 14/04/18 00:16 +0800, lkxjtu wrote:
>> My cluster version:
>> Corosync 2.4.0
>> Pacemaker 1.1.16
>> There are many resource anomalies. Some resources are only monitored
>> and not recovered. Some resources are not monitored or recovered
My cluster version:
Corosync 2.4.0
Pacemaker 1.1.16
There are many resource anomalies. Some resources are only monitored and not
recovered. Some resources are not monitored or recovered. Only one resource of
vnm is scheduled normally, but this resource cannot be started because other
resources
Both of these logs are printed when the system is abnormal, and I am confused
about what they mean. Does anyone know? Thank you very much.
corosync version 2.4.0
pacemaker version 1.1.16
1)
Feb 01 10:57:58 [18927] paas-controller-192-167-0-2 crmd: warning:
find_xml_node:Could
"Ken Gaillot" <kgail...@redhat.com> wrote:
>On Sat, 2017-11-04 at 22:46 +0800, lkxjtu wrote:
>>
>>
>> >Another possibility would be to have the start return immediately,
>> and
>> >make the monitor artificially return success for the first 10
>>
redhat.com> wrote:
>On Sat, 2017-10-28 at 01:11 +0800, lkxjtu wrote:
>>
>> Thank you for your response! This means that there shouldn't be a long
>> "sleep" in an OCF script.
>> If my service takes 10 minutes from service starting to healthcheck
>>
hether some of
> the actions in the second transition would be needed regardless of
> whether the pending actions succeeded or failed, but in practice, that
> would be difficult to implement (and possibly take more time to
> calculate than is desirable in a recovery situation).
> On F
I have two clone resources in my corosync/pacemaker cluster: fm_mgt and
logserver. Both use OCF resource agents. fm_mgt takes 1 minute to start the
service (the OCF start function runs for 1 minute). They are configured as below:
# crm configure show
node 168002177: 192.168.2.177
node 168002178:
Does anyone know the answer to this question?
Best regards
Sent from NetEase Mail for mobile
On 2017-10-25 at 23:10, lkxjtu wrote:
My problem is about Pacemaker. For example, a Pacemaker cluster has two
resources, both with OCF resource agents. One resource is starting
(calling the OCF start function), which takes, for example, 1
My problem is about Pacemaker. For example, a Pacemaker cluster has two
resources, both with OCF resource agents. One resource is starting
(calling the OCF start function), which takes, for example, 1 minute. During
this 1 minute, if the other resource's monitor fails, Pacemaker will not