Hi,
I set upper-case hostnames (GUEST03/GUEST04) and run Pacemaker 1.1.9 +
Corosync 2.3.0.
[root@GUEST04 ~]# crm_mon -1
Last updated: Wed Apr 10 15:12:48 2013
Last change: Wed Apr 10 14:02:36 2013 via crmd on GUEST04
Stack: corosync
Current DC: GUEST04 (3232242817) - partition with quorum
Version:
Hi,
My previous patch had a spelling error, so I revised it just a bit.
Thanks,
Junko
2012/5/30 Junko IKEDA tsukishima...@gmail.com:
Hi,
I am trying to set up an NFSv4 server using the nfsserver RA,
and am adding some handling for rpc.idmapd.
http://linux.die.net/man/8/rpc.idmapd
Please see the attached
Hi,
Thank you for your quick response!
This one seems to be missing. Or is it covered now by the monitor
test?
nfsserver_start() can now return $OCF_SUCCESS if it detects that the nfs
server is already started.
This ocf_log debug call can now be reached without its message argument,
so ocf_log complains:
Not enough arguments [1] to ocf_log.
I added a check statement for this.
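A minimal sketch of the kind of check described (the helper and the stubbed ocf_log below are illustrative, not the actual patch in the attachment):

```shell
#!/bin/sh
# Stand-in for the real ocf_log from ocf-shellfuncs, for illustration only:
# it complains exactly as quoted above when called with too few arguments.
ocf_log() {
    [ $# -ge 2 ] || { echo "Not enough arguments [$#] to ocf_log." >&2; return 1; }
    echo "$1: $2"
}

# The guard: only call ocf_log when a message is actually present.
log_debug_if_set() {
    msg="$1"
    [ -n "$msg" ] && ocf_log debug "$msg"
    return 0
}

log_debug_if_set "nfs server already started"   # logged
log_debug_if_set ""                             # silently skipped
```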
Please see the attached.
Regards,
Junko IKEDA
NTT DATA INTELLILINK CORPORATION
nfsserver-validate-all.patch
Description: Binary data
nfsserver-check-start.patch
Description: Binary data
Hi,
Is my case hard to understand?
By multipath I mean the Fibre Channel paths; there are two cables for redundancy.
Thanks,
Junko
2012/5/9 Junko IKEDA tsukishima...@gmail.com:
Hi,
In my case, the umount succeeds when the Fibre Channel is disconnected,
so it seemed that the handling of the status file
OCF_CHECK_LEVEL), it's enough to try to unmount
the file system, isn't it?
https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/Filesystem#L774
Regards,
Junko IKEDA
NTT DATA INTELLILINK CORPORATION
Filesystem.patch
Description: Binary data
Hi,
In my case, the umount succeeds when the Fibre Channel is disconnected,
so it seemed that the handling of the status file caused a longer failover,
as Dejan said.
If the umount fails, it will run into a timeout and might trigger a stonith
action, and this case also makes sense (though I couldn't observe it).
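Roughly, the stop behavior under discussion can be sketched as follows (illustrative, not the real Filesystem RA; the umount command is injectable so the sketch runs unprivileged):

```shell
#!/bin/sh
# Skip the status-file handling and simply attempt the umount. If the device
# is gone the umount can still succeed quickly; a hanging umount is left to
# the stop timeout (and possibly stonith) to escalate.
fs_stop() {
    mountpoint="$1"
    umount_cmd="${2:-umount}"   # injectable for unprivileged testing
    if $umount_cmd "$mountpoint"; then
        return 0    # stopped cleanly
    fi
    return 1        # let the cluster manager escalate (timeout/stonith)
}
```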
Hi,
Thank you for pointing that out!
Regards,
Junko IKEDA
2012/1/17 Dejan Muhamedagic de...@suse.de:
On Mon, Jan 16, 2012 at 03:10:14PM +0100, Dejan Muhamedagic wrote:
On Sat, Jan 14, 2012 at 12:32:20PM +0100, Lars Ellenberg wrote:
On Mon, Jan 09, 2012 at 05:50:14PM +0100, Dejan Muhamedagic
, right?
named_monitor()
output=`$OCF_RESKEY_host $OCF_RESKEY_monitor_request $OCF_RESKEY_monitor_ip`
if [ $? -ne 0 ] || ! echo $output | grep -q '.* has address
'$OCF_RESKEY_monitor_response
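For reference, the fragment above can be filled out into a self-contained sketch; the lookup command is stubbed so it runs without a DNS server, and all OCF_RESKEY_* values are illustrative stand-ins, not the RA's real defaults:

```shell
#!/bin/sh
OCF_SUCCESS=0
OCF_ERR_GENERIC=1

# fake_host mimics `host <name> <server>` output for a successful lookup.
fake_host() {
    echo "$1 has address 192.0.2.1"
}

OCF_RESKEY_host="fake_host"                     # lookup command (normally host)
OCF_RESKEY_monitor_request="www.example.com"    # name to resolve
OCF_RESKEY_monitor_ip="127.0.0.1"               # server to query
OCF_RESKEY_monitor_response="192.0.2.1"         # expected address

named_monitor() {
    output=`$OCF_RESKEY_host $OCF_RESKEY_monitor_request $OCF_RESKEY_monitor_ip`
    if [ $? -ne 0 ] || \
       ! echo "$output" | grep -q '.* has address '"$OCF_RESKEY_monitor_response"; then
        return $OCF_ERR_GENERIC
    fi
    return $OCF_SUCCESS
}
```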
Would you please give me some advice?
Regards,
Junko IKEDA
NTT DATA INTELLILINK CORPORATION
named_ipv6
Hi Raoul,
Thank you for your comments!
this method should leave the slave be if the master did not change
since the last sync. consider:
crm node standby node02; crm node online node02
the slave should pick up where it left using mysql's own way of saving
the last replication information
Hi,
sorry, again.
My previous patch was wrong.
I attached the new one.
Thanks,
Junko
2011/11/11 Junko IKEDA tsukishima...@gmail.com:
Hi,
The current mysql RA sets the hostname (= uname -n) for its replication network,
but I have the following restriction.
# uname -n
node01
# cat /etc
Hi Raoul,
Sure, thanks!
Regards,
Junko
2011/11/14 Raoul Bhatia [IPAX] r.bha...@ipax.at:
hello junko-san!
i propose the following documentation update to clarify the parameter's
usage.
<parameter name="replication_hostname_suffix" unique="0" required="0">
<longdesc lang="en">
A hostname suffix that
Hi Marek, Florian,
Thank you for your comments!
Did you set evict_outdated_slaves?
No,
If set to false (the default), then the slave will be allowed to stay in
the cluster, but its master preference will be pushed down so it's not
promoted, and this seems to be Ikeda-san's preferred
noisy.
I think there is no problem if we change these log levels from info to debug.
Please see attached.
Regards,
Junko IKEDA
NTT DATA INTELLILINK CORPORATION
mysql-log.patch
Description: Binary data
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux
]; then
# Sanitize a below-zero preference to just zero
master_pref=0
fi
$CRM_MASTER -v $master_pref
fi
I'm less familiar with the replication behavior,
please advise me how to do it.
Regards,
Junko IKEDA
NTT DATA INTELLILINK CORPORATION
mysql
?
Regards,
Junko IKEDA
NTT DATA INTELLILINK CORPORATION
mysql-replication_hostname_suffix.patch
Description: Binary data
Hi Dejan,
Many thanks!
Can I get it from http://hg.linux-ha.org/glue/ ?
Regards,
Junko
2011/9/22 Dejan Muhamedagic de...@suse.de:
Hi Junko-san,
On Wed, Aug 17, 2011 at 10:22:40AM +0900, Junko IKEDA wrote:
Hi Dejan,
Thank you for your reply!
I attached the revised patch.
Just applied
Hi Dejan,
Thank you for your reply!
I attached the revised patch.
http://www.gossamer-threads.com/lists/linuxha/pacemaker/74350
I don't see the connection between the two.
I am trying to use a /tmp/ipmitool command for some tests,
and added its path for root,
so $PATH for root is:
# echo
?
Best Regards,
Junko IKEDA
NTT DATA INTELLILINK CORPORATION
ipmi.patch
Description: Binary data
Hi,
The latest resource-agents has a man page for sfex_init,
and I added it to the .spec.
Please see the attached patch.
Best Regards,
Junko IKEDA
NTT DATA INTELLILINK CORPORATION
sfex_init.patch
Description: Binary data
'
gmake[1]: *** [all-recursive] Error 1
gmake[1]: Leaving directory `/root/Desktop/work/20110622/resource-agents'
make: *** [all] Error 2
Best Regards,
Junko IKEDA
NTT DATA INTELLILINK CORPORATION
ethmonitor.patch
Description: Binary data
Hi Erkan,
The Pacemaker logos were created by the NTT group.
I asked for my boss's permission,
and I think I can send them to you directly soon :)
Did you post a similar mail to the Japanese mailing list before this?
Sorry for the inconvenience.
Thanks,
Junko IKEDA
NTT DATA INTELLILINK
-agents'
make: *** [all] Error 2
Best Regards,
Junko IKEDA
NTT DATA INTELLILINK CORPORATION
ethmonitor.patch
Description: Binary data
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
Hi,
May I suggest that you go with the devel version, because
crm_cli.txt was converted to crm.8.txt. There are not many
textual changes, just some obsolete parts removed.
OK, I got crm.8.txt from devel.
Each directory structure for Pacemaker 1.0,1.1 and devel is just a bit
different.
Does
Hi,
I tried to compile the latest agents package from the Mercurial repository,
but the new exportfs RA complained about something like this:
# hg clone http://hg.linux-ha.org/agents/
# cd agents
# ./autogen.sh
# ./configure --localstatedir=/var --disable-fatal-warnings
# make
/bin/sh:
Hi Dejan,
This is an old issue,
http://www.gossamer-threads.com/lists/linuxha/users/56449
but I remember this from the release plan announcement of Heartbeat 3.0.2.
I have tested your following patch,
and it seemed there was no problem.
Please apply this to the new release.
On Thu,
Hi,
I have done some tests for this patch,
and I got the desired results.
I think this patch won't affect the current usage.
Serge,
Thank you for your review!
Thanks,
Junko
--- Forwarded message ---
From: Serge Dubrouski serge...@gmail.com
To: Junko IKEDA ike
Hi,
If some failure happens during an online backup of PostgreSQL,
pgsql cannot handle the failover,
because backup_label, a file created by the Postgres backup process,
remains on the shared disk.
pgsql cannot start the DB if this file remains.
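A hypothetical sketch of the cleanup being described (the helper name, the move-aside behavior, and the PGDATA path are assumptions; the real change is in the attachment):

```shell
#!/bin/sh
# Before starting PostgreSQL after a failover, deal with a stale backup_label
# left behind by an interrupted online backup, since it blocks startup.
PGDATA="${PGDATA:-/var/lib/pgsql/data}"   # stand-in path

remove_stale_backup_label() {
    if [ -f "$PGDATA/backup_label" ]; then
        # Move it aside rather than deleting it outright, so it can be
        # inspected later.
        mv "$PGDATA/backup_label" "$PGDATA/backup_label.old"
    fi
}
```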
Please see the attached.
Thanks,
Junko
; then
- return $OCF_SUCCESS
- fi
+MSG=`$PING $PINGARGS 2>&1`
+if [ $? = 0 ]; then
+return $OCF_SUCCESS
+fi
done
-
+
+ocf_log err $MSG
return $OCF_ERR_GENERIC
}
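Reassembled as a self-contained sketch, the loop that the diff above produces could look like this (PING, PINGARGS, the retry count, and the stubbed ocf_log are illustrative stand-ins):

```shell
#!/bin/sh
# Retry the ping, capture stdout+stderr of each attempt, and log the last
# message only if every try failed.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
ocf_log() { echo "$1: $2" >&2; }   # stub for illustration

ping_check() {
    PING="$1"; PINGARGS="$2"
    for try in 1 2 3; do
        MSG=`$PING $PINGARGS 2>&1`
        if [ $? = 0 ]; then
            return $OCF_SUCCESS
        fi
    done
    ocf_log err "$MSG"
    return $OCF_ERR_GENERIC
}
```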
Thanks,
Junko
On Mon, 09 Nov 2009 18:13:29 +0900, Junko IKEDA
ike...@intellilink.co.jp
Hi,
I wonder why the IPaddr RA needs to run route del before it deletes the
target interface.
Does the old version of IPaddr contain route add?
If route del fails, the RA will still be able to return $OCF_SUCCESS,
but I feel a little strange when I see an error message from the route
command like this.
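The behavior being questioned can be sketched like this (the commands are passed in so the sketch runs unprivileged; a real RA would call route/ifconfig directly):

```shell
#!/bin/sh
# Run route del first, but treat its failure as non-fatal, so taking the
# interface down (and returning $OCF_SUCCESS) can still proceed.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1

stop_ip() {
    route_del_cmd="$1"; if_down_cmd="$2"
    if ! $route_del_cmd; then
        # Matches the behavior discussed: a failing route del is only noise.
        echo "WARNING: route del failed, continuing" >&2
    fi
    $if_down_cmd || return $OCF_ERR_GENERIC
    return $OCF_SUCCESS
}
```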
Hi,
By the way, this is a really trivial thing, but
I have some requests about the logging messages of IPaddr.
Please see the modified attachment.
Thanks,
Junko
On Mon, 09 Nov 2009 18:13:29 +0900, Junko IKEDA ike...@intellilink.co.jp
wrote:
Hi,
I wonder why IPaddr RA needs to run route del
Hi,
Heartbeat 2.1.3 (snmp_subagent) had some bugs around LHAIFStatus.
I think you should use Heartbeat 2.1.4,
or you can try to change the value of the -r option for hbagent.
ex.)
respawn root /usr/lib64/heartbeat/hbagent -r 1 -d
Thanks,
Junko
On Fri, 07 Aug 2009 15:12:42 +0900, Jiann-Ming Su
Hi,
On Mon, 2009-07-06 at 16:14 +0200, Michael Schwartzkopff wrote:
Am Montag, 6. Juli 2009 16:06:55 schrieb Michael Schwartzkopff:
Hi,
has anybody already gained some experience using the heartbeat snmp subagent in
an openais environment? any problems?
Thanks,
Ok. Found the answer
On Tue, 07 Jul 2009 17:23:41 +0900, Michael Schwartzkopff mi...@multinet.de
wrote:
Am Dienstag, 7. Juli 2009 10:10:09 schrieb Junko IKEDA:
Hi,
On Mon, 2009-07-06 at 16:14 +0200, Michael Schwartzkopff wrote:
Am Montag, 6. Juli 2009 16:06:55 schrieb Michael Schwartzkopff:
Hi
Hi,
On Fri, 03 Jul 2009 10:59:21 +0900, Junko IKEDA ike...@intellilink.co.jp
wrote:
Hi Dejan,
Your patch could stop the error message from LVM RA.
Many thanks!
But I run Heartbeat 2.1.4, so I worry about whether 2.1.4 still has the problem
with stonithd that you pointed out.
By the way
the latest code, of course. :)
Thanks,
Junko
On Fri, 03 Jul 2009 16:28:20 +0900, Dejan Muhamedagic deja...@fastmail.fm
wrote:
Hi again Junko-san,
On Fri, Jul 03, 2009 at 04:15:40PM +0900, Junko IKEDA wrote:
Hi,
On Fri, 03 Jul 2009 10:59:21 +0900, Junko IKEDA ike...@intellilink.co.jp
wrote
,
On Fri, Jul 03, 2009 at 04:37:44PM +0900, Junko IKEDA wrote:
Hi again, :)
Thank you for your quick reply.
Our customer might hesitate to apply the new patch to their running system
at once.
Of course.
(I don't know their upgrade plan unfortunately)
So I want to know whether Heartbeat can
Hi,
I'm not sure...
But I found the following message in config.log:
/usr/bin/ld: crt1.o: No such file: No such file or directory
I have RHEL5.3, so
# rpm -qf /usr/bin/ld
binutils-2.17.50.0.6-9.el5
I think RHEL4-3 should have binutils-2.15.92.0.2-1.
Thanks,
Junko
-Original
Sorry,
I'm really wrong...
Thanks,
Junko
-Original Message-
From: linux-ha-boun...@lists.linux-ha.org
[mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Junko IKEDA
Sent: Thursday, May 28, 2009 6:41 PM
To: 'General Linux-HA mailing list'
Subject: Re: [Linux-HA] Error
Sorry for posting so many times.
# rpm -qf /usr/lib64/crt1.o
glibc-devel-2.5-34
Do I need glibc-devel-2.3.4-2.19?
-Original Message-
From: linux-ha-boun...@lists.linux-ha.org
[mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Junko IKEDA
Sent: Thursday, May 28, 2009 6:44 PM
To: 'General
-Original Message-
From: linux-ha-boun...@lists.linux-ha.org
[mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Andrew Beekhof
Sent: Tuesday, April 28, 2009 5:42 PM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] crm CLI
On Tue, Apr 28, 2009 at 09:39, Cristina
Hi,
I think I can use hb_delnode when I want to remove one node from the
cluster.
Should I run hb_delnode on the DC?
Does it make any difference whether it is run on the DC or not?
Thanks,
Junko
-Original Message-
From: linux-ha-boun...@lists.linux-ha.org
Development List
Subject: Re: [Linux-ha-dev] xm dump-core from xen0
Hi,
On Mon, Mar 16, 2009 at 07:22:02PM +0900, Junko IKEDA wrote:
Hi,
I run the new xen0 on domU now,
and need an additional feature for a dump destination.
I have RHEL5.2 x86_64 and xen 3.1.
This would dump
I found the following stonithd behavior.
It might be an expected one, but I'm just wondering.
My operation is here;
(1) start Heartbeat 2.1.4 on two nodes (dom-d1, dom-d2).
(2) start the resource on the active node (dom-d2), and dom-d2 is also the DC
in this case.
(3) modify the RA and cause
Hi,
My operation is here;
# ssh x3650g
# export dom0=x3650g
# export hostlist="dom-d1:/etc/xen/dom-d1 dom-d2:/etc/xen/dom-d2"
# /usr/lib64/stonith/plugins/external/xen0 on dom-d1
# echo $?
0
dom-d1 was created well.
# /usr/lib64/stonith/plugins/external/xen0 reset dom-d1
# echo $?
# 1
Sorry for all of my mistakes...
I had a wrong /etc/hosts.
It works well now.
By the way, could I configure this plugin with two Dom0s and two DomUs?
ex.) domU-1 on Dom0-1, and domU-2 on Dom0-2
Thanks,
Junko
Hi,
My operation is here;
# ssh x3650g
# export dom0=x3650g
# export
I run the attached cib.xml.
It seems that this configuration works well (but I need more tests).
If there are any strange elements, please let me know.
Thanks,
Junko
Sorry for all of my mistakes...
I had a wrong /etc/hosts.
It works well now.
By the way, Could I config this plugin on
be a big deal. I can add one more config parameter like
run_dump; then, if it's set, the script will call xm dump-core before
destroying xenU.
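The run_dump idea could be sketched as follows (the parameter name comes from the thread; the dump path and the injectable XM variable are illustrative, not the plugin's real code):

```shell
#!/bin/sh
# XM is injectable so the sketch can run on a machine without Xen installed.
XM="${XM:-xm}"

reset_domu() {
    domu="$1"
    if [ "${run_dump:-0}" != 0 ]; then
        # Capture a core of the domain before killing it.
        "$XM" dump-core "$domu" "/var/lib/xen/dump/${domu}.core"
    fi
    "$XM" destroy "$domu"
    "$XM" create "/etc/xen/$domu"
}
```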
On Tue, Mar 3, 2009 at 10:38 PM, Junko IKEDA ike...@intellilink.co.jp
wrote:
Hi Serge,
I'm trying to manage xen domain-U with xen0 plugin
4, 2009 at 6:45 PM, Junko IKEDA ike...@intellilink.co.jp
wrote:
Hi,
Attached is a patch that adds that functionality.
Many thanks!
I'll give it a try.
By the way, xen0 plugin should run on domain-0, right?
Is it possible to run it on domain-U?
Thanks,
Junko
On Tue, Mar 3
Hi Serge,
I'm trying to manage xen domain-U with xen0 plugin.
There are two xm commands, xm destroy and xm create, in xen0.
What do you think about adding xm dump-core to it?
If possible, I want to get the dump of domain-U when some fence events
happen.
Best Regards,
Junko Ikeda
we always face the IPC problem...
We put the big cib.xml into /var/lib/heartbeat/crm at its first
boot,
and modify some TCP/UDP parameters as makeshift measures.
Thanks,
Junko
On Tue, Feb 3, 2009 at 08:12, Junko IKEDA ike...@intellilink.co.jp
wrote:
Hi,
We have 16 nodes
the timeout...
Best Regards,
Junko Ikeda
NTT DATA INTELLILINK CORPORATION
Hi,
We have 16 nodes, and the size of cib.xml is now about 150kbyte.
Heartbeat is 2.1.4.
When I call cibadmin command, the following message comes.
# cibadmin -U -x cib.xml
No messages received in 30 seconds.. aborting
Is the size of cib.xml too big?
Quite possibly. I
Hi,
We have 16 nodes, and the size of cib.xml is now about 150kbyte.
Heartbeat is 2.1.4.
When I call cibadmin command, the following message comes.
# cibadmin -U -x cib.xml
No messages received in 30 seconds.. aborting
Is the size of cib.xml too big?
Quite
Hi,
I'm using the crm_mon output to parse the cluster status for other
applications.
The problem is, when heartbeat is either not running, or there is some
connection
issue in the cluster, or some random issue I can't make out, crm_mon will
never
return (OK, as soon as the issue is repaired it may be
http://developerbugs.linux-foundation.org//show_bug.cgi?id=2004
Best Regards,
Junko Ikeda
NTT DATA INTELLILINK CORPORATION
The latest Pacemaker 1.0 solves the problem which I posted in the
following entry.
http://developerbugs.linux-foundation.org/show_bug.cgi?id=1990
A split brain in a 4-node setup can be recovered successfully!
It seems that these patches are responsible for this behavior.
...) and join
the cluster membership.
hac02 and hac06 received instance=17 again and can notice the DC election,
but they freeze... the newest id doesn't come.
The other nodes then regard hac02 and hac06 as OFFLINE nodes.
This situation is very rare, so is this some timing bug?
Best Regards,
Junko
DC nodes during a split brain,
so it seems that the DC election conflicts and some nodes cannot share the
status during the recovery stage.
It might be a bug in ccm.
Can I get any obvious evidence from the log as to why the node was shut down?
Best Regards,
Junko Ikeda
NTT DATA INTELLILINK CORPORATION
/10/30_14:38:32 debug: handle_request: Raising I_JOIN_OFFER:
join-4
crmd[6912]: 2008/10/30_14:38:32 debug: handle_request: Raising I_JOIN_OFFER:
join-5
crmd[6912]: 2008/10/30_14:38:32 debug: handle_request: Raising
I_JOIN_RESULT: join-14
Best Regards,
Junko Ikeda
NTT DATA INTELLILINK CORPORATION
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Alex Strachan
Sent: Monday, October 27, 2008 10:31 AM
To: 'General Linux-HA mailing list'
Subject: RE: [Linux-HA] Updated from 2.1.3 to 2.99.x w/ Pacemaker 1.x and
CIBno
longer conforms to DTD
See
Hi,
See also this page, please.
http://www.linux-ha.org/sfex/
Thanks,
Junko
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Xinwei Hu
Sent: Thursday, October 16, 2008 6:55 PM
To: High-Availability Linux Development List
Subject: Re: [Linux-ha-dev]
net/ipv4/udp.c (2.6.18-92.el5)
495 int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
496                 size_t len)
497 {
511         if (len > 0xFFFF)
512                 return -EMSGSIZE;
In line 511, the limit for a UDP packet is 65535.
Hmm. Perhaps this (the maximum packet size) has been checked by
somebody before, then forgotten and it never got into discussion
about the message compression. When I started working on the
compression, the MAXMSG was already temporarily set to 2MB.
Also, I can distinctly recall that Lars
It means that heartbeat can't deliver a message if the uncompressed
size is bigger than 2MB.
It also means that heartbeat can't deliver a message if, after
compressing the message, the size is still bigger than 256kB.
I see,
the first control gate is 2MB, the second is 256kB.
If MAXMSG(256kbyte)
it, because the max size for sendto() is 64kbyte.
A 256kbyte message should be split into pieces before sending as
packets.
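The size gates described in this thread can be sketched as follows (the limits are taken from the discussion; the helper functions are illustrative, not heartbeat's code):

```shell
#!/bin/sh
MAX_UNCOMPRESSED=$((2 * 1024 * 1024))  # 2MB: reject before compression
MAX_COMPRESSED=$((256 * 1024))         # 256kB: reject after compression
UDP_MAX=$((64 * 1024 - 1))             # 65535: kernel limit per sendto()

# message_fits <uncompressed_bytes> <compressed_bytes>
message_fits() {
    [ "$1" -le "$MAX_UNCOMPRESSED" ] || return 1
    [ "$2" -le "$MAX_COMPRESSED" ] || return 1
    return 0
}

# packets_needed <compressed_bytes>: how many <=64kB datagrams are required
packets_needed() {
    echo $(( ($1 + UDP_MAX - 1) / UDP_MAX ))
}
```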
by the way, I set bcast in ha.cf as media.
I assume you're working on a patch for this?
That means, heartbeat doesn't care for 256kB message before putting it
with /sbin/ldconfig for
RedHat?
Best Regards,
Junko Ikeda
NTT DATA INTELLILINK CORPORATION
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
If there's no objection I would like to push this patch into
the lha-2.1 repository; is there any problem with that?
sure
It seems that the latest pacemaker also exhibits the same behavior,
so I think both need to be fixed as well.
I thought it was fine?
sorry, that might have been
If you don't want non_clone_group1 to be restarted when this happens,
make the ordering constraint advisory-only by adding score=0
to the constraint.
I tried this configuration, but non_clone_group1 was restarted
when the clone1 resource's fail-count was cleared.
you're right -
Of course.
More resources == more actions to perform == more CIB updates to
perform == more work for the CIB
This is reasonable, but one group containing 15 resources driving the CPU
to 100%
is excessive, even if it's a fleeting behavior.
So the machine should waste CPU cycles so that 100% of
threading
Best Regards,
Junko Ikeda
NTT DATA INTELLILINK CORPORATION
opreport.txt
libcrmcommon.txt
power?
Or are there any reasonable causes for the cib process to occupy lots of
cpu
instantaneously?
By the way, cib_notify_client() is also called a lot.
cpuinfo:
Intel(R) Xeon(R) CPU 5160 @ 3.00GHz
Core 2, Hyper-Threading
Best Regards,
Junko Ikeda
NTT
@ 3.00GHz
Core 2, Hyper-Threading
Best Regards,
Junko Ikeda
NTT DATA INTELLILINK CORPORATION
opreport.txt
libcrmcommon.txt
which codebase is this?
I tried this with Heartbeat 2.1.3 first.
Heartbeat-Dev + Pacemaker+Dev also showed the same behavior.
Thanks,
Junko
On Jul 24, 2008, at 1:58 PM, Junko IKEDA wrote:
Hi,
I have 4 nodes, (3 active + 1 standby),
and each active node has 15 resources which
Hi,
It seems that we will get Heartbeat 2.1.4 soon,
so I am trying STABLE 2.1 (http://hg.linux-ha.org/lha-2.1/) as a release
candidate.
These are trivial bugs, but users can find them easily.
It would be convenient if they were fixed before the release.
1) When I quit crm_mon -i1 using Ctrl + C,
Hi,
We are now trying to show a good performance report to the potential
customer.
Our customer's requests are as follows:
* There are more than 100 resources on one node.
* 100 resources are included in one group, so they would start/stop
sequentially.
* Fail over for all of 100
the
lrmd died but whatever mechanism the IPC code is using doesn't seem
able to.
On Fri, Jun 27, 2008 at 12:51, Junko IKEDA [EMAIL PROTECTED]
wrote:
It might be worth seeing if you can repeat the result with a resource
based on a simple daemon process ( while(1) { sleep(1
Hi,
It seems that you are facing the same problem which I did before.
I think you shouldn't use 2.1.3.
Please refer to this list:
http://www.gossamer-threads.com/lists/linuxha/users/47008
http://developerbugs.linux-foundation.org/show_bug.cgi?id=1859
The latest package includes the above fix.
released 0.6.5).
On Fri, Jun 20, 2008 at 09:51, Junko IKEDA [EMAIL PROTECTED]
wrote:
Hi,
I run this combination;
Pacemaker:0df5ae633188
Heartbeat:c94051dc16a5
There are three Filesystem and one IPaddr on one node.
If IPaddr is forced into the other node with crm_resource
Ah ok, sorry just wanted to make sure the intended functionality was
clear.
I had a look at the report and analysis.txt highlights the problem quite
well:
pengine[20727]: 2008/06/23_11:02:40 ERROR: unpack_rsc_op: Hard error:
prmApPostgreSQLDB_fail_6 failed with rc=2.
Ah ok, sorry just wanted to make sure the intended functionality was
clear.
I had a look at the report and analysis.txt highlights the problem
quite
well:
pengine[20727]: 2008/06/23_11:02:40 ERROR: unpack_rsc_op: Hard error:
prmApPostgreSQLDB_fail_6 failed with rc=2.
When the lrmd process dies, lrmd restarts.
But the monitor stops after the restart.
In this state, lrmd cannot detect failures of the resource afterwards.
Actually, there may be little possibility that lrmd dies,
but I think that it is necessary when I think about the worst
When the lrmd process dies, lrmd restarts.
But the monitor stops after the restart.
In this state, lrmd cannot detect failures of the resource afterwards.
Actually, there may be little possibility that lrmd dies,
but I think that it is necessary when I think about the
I can only think of two things - system load (or CPU power, both of
which would affect the timing) and whether you both have stonith
enabled.
Intel(R) Xeon(R) CPU5160 @ 3.00GHz (x2)
- lrmd restart. That's it.
Intel(R) Pentium(R) 4 CPU 3.20GHz (x2)
- lrmd restart, and somehow,
Intel(R) Xeon(R) CPU5160 @ 3.00GHz (x2)
- lrmd restart. That's it.
actually, i was wrong... if you're using crm on then the node should
(based on how the code works) always commit suicide.
can you create a bug and attach a hb_report for this case please?
Is suicide the
Hi,
iirc, 2.1.3 did not have the -s option on ptest. According to file
revisions, (
http://hg.clusterlabs.org/pacemaker/stable-0.6/log/e63da7d9940d/contrib/showscores.sh
) this should be the version to try (before the change to use ptest -s):
http://hg.clusterlabs.org/pacemaker/dev/file/tip/contrib/showscores.sh
So some values can not be displayed if I use it with Heartbeat 2.1.3.
Best Regards,
Junko Ikeda
NTT DATA INTELLILINK CORPORATION
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http
Then (because of the probe) we find out it _is_ running after all and
we end up in the situation contained in pe-input-6.bz2
We only guarantee that the probe for rscX completes before we start the
rscX.
Start failures which are set as on_fail=block also induce the unmanaged
status in the same way
. Don't know if anybody's using it.
It might not be SCSI reservations to be exact,
but it would control the ownership of the shared disk.
Try SFEX (Shared Disk File EXclusiveness Control Program) from here;
http://linux-ha.org/sfex
Best Regards,
Junko Ikeda
NTT DATA INTELLILINK CORPORATION
Hi,
I'm using latest 2.1.3 from CentOS. If somebody's interested, hb_report is
available at http://nik.lbox.cz/downloads/vb.tar.gz
thanks a lot in advance
BR
nik
Is this the same problem as this?
Clone instance might be shuffled unexpectedly.
Btw. You do realize that setting ordered=false for the master resource
also means that the group's actions won't be ordered either, don't you?
You mean, there's a possibility that the slave resource will start/stop
before
the master's actions complete if I don't set ordered=true, right?
No.
of it next time.
Thanks,
Junko
2008/4/18 Junko IKEDA [EMAIL PROTECTED]:
Fixed by:
http://hg.clusterlabs.org/pacemaker/stable-0.6/rev/4817a7094683
It works well with group-master/slave, too.
Many thanks!
Please merge it into Heartbeat 2.1.4.
Thanks,
Junko
any ideas as to why the current code doesn't work for you?
I failed to build the rpm on openSUSE 10.1 too...
It might be a potential problem in Heartbeat 2.1.3.
See the attached configure-213.log.
It's clear that the summary says the CIM provider and TSA plugin would not be
built,
Build CRM
.
Is there something wrong with cib.xml ?
This is similar case to what Yamauchi-san posted.
Best Regards,
Junko Ikeda
NTT DATA INTELLILINK CORPORATION
Hi,
I keep failing to build lha-2.1 on RHEL5.1 for now.
It seems that --enable-cim-provider=no and --enable-tsa-plugin=no are
ineffective for ConfigureMe.
We don't need the CIM providers or the TSA plugin, so I had a try at making a
patch for it.
Please check the attached.
Sorry for the annoyance.
The
Hi again,
Another request;
Would it be possible to include the following patch in release 2.1.4?
http://hg.linux-ha.org/dev/rev/6307bb091d02
It will help with the problems posted in Bugzilla 1814,
on all platforms, not only ppc.
Hi,
So, that said, I've pushed my proposed code to
http://hg.linux-ha.org/lha-2.1/. It, for reasons outlined above, likely
doesn't build yet (because the in-tree packaging is broken), but I
wanted to share the scope of changes with you.
There are some fixes about failcount in
Hi,
I finally have the primary server back up (fspbro213.rchland.ibm.com), but
I can't get the secondary server back up. I get these messages in the
/var/log/ha-log file of the secondary server (fspbro214.rchland.ibm.com):
heartbeat[12894]: 2008/03/10_11:57:37 ERROR: should_drop_message:
Hi,
I've seen these messages appearing when I connect the hb_gui to the mgmtd:
mgmtd[6819]: 2008/02/29_08:03:51 ERROR: crm_abort: ha_set_tm_time:
Triggered
assert at iso8601.c:887 : rhs->tm_mday > 0 || lhs->days == rhs->tm_mday
I also found the same error right now!
Not only from mgmtd, crmd
);
if (source)
g_source_destroy (source);
return source != NULL;
}
Best Regards,
Junko Ikeda
NTT DATA INTELLILINK CORPORATION
# crm_standby -U prec370e -v on
(process:10181): GLib-CRITICAL **: g_source_remove: assertion `tag > 0' failed
(process:10181): GLib-CRITICAL **: g_source_remove