Re: [Pacemaker] Aisexec use 50% cpu

2009-09-29 Thread Dejan Muhamedagic
Hi,

On Mon, Sep 28, 2009 at 02:30:23PM +0200, Marcos Riosalido wrote:
 Hi,
 I have installed Pacemaker 1.0.4.1 on Debian Lenny, following the wiki
 instructions (http://clusterlabs.org/wiki/Debian_Lenny_HowTo). When
 openais-legacy starts, aisexec consumes 50% of the CPU.

Yes, openais takes some resources.

 I tested with resources configured and with no configuration at all,
 and the results are the same.

It has nothing to do with pacemaker. You may try to email the
openais mailing list, though I think that the issue has already
been raised.

Thanks,

Dejan

 Of course I read the FAQ and the HOWTOs and searched Google, but either
 there is no information or I am looking in the wrong place.
 
 Regards,
 Marcos Riosalido.
 
 ___
 Pacemaker mailing list
 Pacemaker@oss.clusterlabs.org
 http://oss.clusterlabs.org/mailman/listinfo/pacemaker



Re: [Pacemaker] Aisexec use 50% cpu

2009-09-29 Thread Michael Schwartzkopff
Hi,

As far as I know, this was caused by pacemaker asking openais to check itself
every 100 ms. beekhof changed this to 1 s. After that, the high CPU load went
away on my test systems.

I think this patch is not included in madkiss's 1.0.4 binaries, but only
in the 1.0.5 packages in the ha-corosync repository.
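
To put the interval change in perspective, here is a small back-of-the-envelope sketch. It is not Pacemaker or openais code; the function name `wakeups_per_minute` is made up for illustration, and it only counts how many periodic checks fit into a minute at each interval. It says nothing about how expensive each check is.

```python
def wakeups_per_minute(interval_ms: int) -> int:
    """Number of periodic checks that fit into one minute."""
    return 60_000 // interval_ms


# Intervals taken from the message above: 100 ms before, 1 s after.
old = wakeups_per_minute(100)
new = wakeups_per_minute(1000)

print(f"100 ms interval: {old} checks per minute")
print(f"1 s interval:    {new} checks per minute")
print(f"reduction:       {old // new}x fewer checks")
```

Under these assumptions the change cuts the number of checks by a factor of ten, which is at least consistent with the load drop reported above.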

-- 
Dr. Michael Schwartzkopff
MultiNET Services GmbH
Address: Bretonischer Ring 7; 85630 Grasbrunn; Germany
Tel: +49 - 89 - 45 69 11 0
Fax: +49 - 89 - 45 69 11 21
mob: +49 - 174 - 343 28 75

mail: mi...@multinet.de
web: www.multinet.de

Registered office: 85630 Grasbrunn
Court of registration: Amtsgericht München HRB 114375
Managing directors: Günter Jurgeneit, Hubert Martens

---

PGP Fingerprint: F919 3919 FF12 ED5A 2801 DEA6 AA77 57A4 EDD8 979B
Skype: misch42



Re: [Pacemaker] Aisexec use 50% cpu

2009-09-29 Thread Dejan Muhamedagic
Hi,

On Tue, Sep 29, 2009 at 12:09:36PM +0200, Michael Schwartzkopff wrote:
 Hi,
 
 As far as I know, this was caused by pacemaker asking openais to check itself
 every 100 ms. beekhof changed this to 1 s. After that, the high CPU load went
 away on my test systems.

Good to know.

 I think this patch is not included in madkiss's 1.0.4 binaries, but only
 in the 1.0.5 packages in the ha-corosync repository.

Thanks,

Dejan



Re: [Pacemaker] Aisexec use 50% cpu

2009-09-29 Thread Marcos Riosalido
Hi,

Thanks,

Marcos



Re: [Pacemaker] A problem to fail in a stop of Pacemaker.

2009-09-29 Thread Remi Broemeling




Hello Hideo,

It appears that this is a similar problem to the one that I reported,
yes. It appears to not be a bug in Corosync, but rather one in
Pacemaker. This bug has been filed in Red Hat Bugzilla, see it at:

https://bugzilla.redhat.com/show_bug.cgi?id=525589

Perhaps you could add any additional details that you have found
(affected packages, etc.) to the bug; it may help the developers fix it.

Thanks.


renayama19661...@ybb.ne.jp wrote:

  Hi,

I started a Dummy resource on one node with the following combination:
 * corosync 1.1.0
 * Pacemaker-1-0-05c8b63cbca7
 * Reusable-Cluster-Components-6ef02517ee57
 * Cluster-Resource-Agents-88a9cfd9e8b5

The Dummy resource started on the node.

I then tried to stop the node (service corosync stop), but it did not stop.

--log--
(snip)

Sep 29 13:52:01 rh53-1 crmd: [11193]: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated
Sep 29 13:52:01 rh53-1 crmd: [11193]: info: crm_shutdown: Requesting shutdown
Sep 29 13:52:01 rh53-1 crmd: [11193]: info: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_SHUTDOWN cause=C_SHUTDOWN origin=crm_shutdown ]
Sep 29 13:52:01 rh53-1 crmd: [11193]: info: do_state_transition: All 1 cluster nodes are eligible to run resources.
Sep 29 13:52:01 rh53-1 crmd: [11193]: info: do_shutdown_req: Sending shutdown request to DC: rh53-1
Sep 29 13:52:30 rh53-1 corosync[11183]:   [pcmk  ] notice: pcmk_shutdown: Still waiting for crmd (pid=11193) to terminate...
Sep 29 13:53:30 rh53-1 last message repeated 2 times
Sep 29 13:55:00 rh53-1 last message repeated 3 times
Sep 29 13:56:30 rh53-1 last message repeated 3 times
Sep 29 13:58:01 rh53-1 last message repeated 3 times
Sep 29 13:59:31 rh53-1 last message repeated 3 times
Sep 29 14:00:31 rh53-1 last message repeated 2 times
Sep 29 14:00:46 rh53-1 cib: [11189]: info: cib_stats: Processed 94 operations (11489.00us average, 0% utilization) in the last 10min
Sep 29 14:01:01 rh53-1 corosync[11183]:   [pcmk  ] notice: pcmk_shutdown: Still waiting for crmd (pid=11193) to terminate...

(snip)
--log--
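
For anyone adding details to the bug report, it can help to state how long the shutdown has been hanging. The following is only a minimal sketch, assuming syslog-style lines like the ones in the log excerpt above; the helper names `stamp` and `shutdown_wait` are hypothetical, the year 2009 is assumed because syslog timestamps carry none, and month names are parsed with the default English locale.

```python
from datetime import datetime

# Three lines taken from the log excerpt, with wrapped lines joined.
LOG = """\
Sep 29 13:52:01 rh53-1 crmd: [11193]: info: crm_shutdown: Requesting shutdown
Sep 29 13:52:30 rh53-1 corosync[11183]:   [pcmk  ] notice: pcmk_shutdown: Still waiting for crmd (pid=11193) to terminate...
Sep 29 14:01:01 rh53-1 corosync[11183]:   [pcmk  ] notice: pcmk_shutdown: Still waiting for crmd (pid=11193) to terminate...
"""


def stamp(line, year=2009):
    """Parse the 15-character syslog timestamp; the year is an assumption."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def shutdown_wait(log_text):
    """Time between the shutdown request and the last 'Still waiting' notice."""
    requested = None
    last_wait = None
    for line in log_text.splitlines():
        if "crm_shutdown: Requesting shutdown" in line:
            requested = stamp(line)
        elif "pcmk_shutdown: Still waiting for" in line:
            last_wait = stamp(line)
    if requested is None or last_wait is None:
        return None  # the log does not show a pending shutdown
    return last_wait - requested


print(shutdown_wait(LOG))
```

For the excerpt above this reports nine minutes between the shutdown request and the last notice shown.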


Is the cause possibly the same as in the following email?
 * http://www.gossamer-threads.com/lists/linuxha/pacemaker/58127

The same problem also occurred with the following combination:
 * corosync 1.0.1
 * Pacemaker-1-0-595cca870aff
 * Reusable-Cluster-Components-6ef02517ee57
 * Cluster-Resource-Agents-88a9cfd9e8b5

I have attached the hb_report file.

Best Regards,
Hideo Yamauchi.
  


-- 


Remi Broemeling
Sr System Administrator

Nexopia.com Inc.
 direct: 780 444 1250 ext 435
email: r...@nexopia.com
fax: 780 487 0376 




You are only young once, but you can stay immature
indefinitely.
www.siglets.com






[Pacemaker] Preventing both nodes from running non-clone resources

2009-09-29 Thread Errol Neal
Hi. I'm a new openais user. I followed:

http://www.clusterlabs.org/mediawiki/images/9/9d/Clusters_from_Scratch_-_Apache_on_Fedora11.pdf

I was very successful in getting a cluster set up. I'm having a few issues
that I'm trying to sort out, and I was wondering if I could get some help.

Here is my configuration:

node axigen1
node axigen2
primitive axigenFS ocf:heartbeat:Filesystem \
params device=/dev/vgAxigenMailStore01/lvAxigenMailStore01 directory=/var/opt/axigen fstype=ocfs2 \
meta target-role=Started
primitive axigenFilters lsb:axigenfilters \
op monitor interval=120s timeout=30s \
meta migration-threshold=2
primitive axigenIP ocf:heartbeat:IPaddr2 \
params ip=*.*.*.* cidr_netmask=23 nic=eth0 \
op monitor interval=21s timeout=5s \
meta migration-threshold=2
primitive axigenServer lsb:axigen \
op monitor interval=120s timeout=30s \
meta migration-threshold=2
primitive dlm ocf:pacemaker:controld \
op monitor interval=120s
primitive o2cb ocf:ocfs2:o2cb \
op monitor interval=120s
group axigenGroup axigenServer axigenFilters
clone axigenFS-clone axigenFS \
meta interleave=true
group axigenGroup axigenFilters axigenServer
clone axigenFS-clone axigenFS
clone dlm-clone dlm \
meta interleave=true
clone o2cb-clone o2cb \
meta interleave=true
colocation axigenFS-with-o2cb inf: axigenFS-clone o2cb-clone
colocation axigenGroup-with-axigenFS inf: axigenGroup axigenFS-clone
colocation axigenGroup-with-axigenIP inf: axigenGroup axigenIP
colocation o2cb-with-dlm inf: o2cb-clone dlm-clone
order start-axigenGroup-after-axigenFS inf: axigenFS-clone axigenGroup
order start-axigenGroup-after-axigenIP inf: axigenIP axigenGroup
order start-o2cb-after-dlm inf: dlm-clone o2cb-clone
property $id=cib-bootstrap-options \
dc-version=1.0.5-462f1569a43740667daf7b0f6b521742e9eb8fa7 \
cluster-infrastructure=openais \
expected-quorum-votes=2 \
stonith-enabled=false \
cluster-delay=60s \
last-lrm-refresh=1254246905 \
no-quorum-policy=ignore
rsc_defaults $id=rsc-options \
resource-stickiness=100


I'm running a mail system on a two-node cluster backed by an OCFS2
filesystem on LVM. I recompiled lvm2 v 2.0.49 to support openais as the clvmd type.

I just had an issue where both nodes in the cluster started the IP resource and
the axigenGroup. This resulted in some corruption. Is there something in my
configuration that would allow this to happen?

I don't have stonith yet. I'm trying to get things stable before I risk death 
matches between the nodes. 
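
As a hedged aside (this is not an answer given in the thread): with stonith-enabled=false and no-quorum-policy=ignore, a two-node cluster has nothing that stops either node from continuing on its own if the nodes lose contact with each other, so each side may start the non-clone resources. Whether that is what happened here is not confirmed. The sketch below simply flags those two settings in a crm-style configuration text; the helper name `risky_properties` is made up for illustration, and the parsing is deliberately naive.

```python
# Property section copied from the configuration above.
CONFIG = """\
property $id=cib-bootstrap-options \\
dc-version=1.0.5-462f1569a43740667daf7b0f6b521742e9eb8fa7 \\
cluster-infrastructure=openais \\
expected-quorum-votes=2 \\
stonith-enabled=false \\
cluster-delay=60s \\
last-lrm-refresh=1254246905 \\
no-quorum-policy=ignore
"""


def risky_properties(config_text):
    """Return the settings that leave a two-node cluster open to split-brain."""
    props = {}
    # Drop line-continuation backslashes, then read every key=value token.
    for token in config_text.replace("\\", " ").split():
        if "=" in token:
            key, _, value = token.partition("=")
            props[key] = value

    findings = []
    if props.get("stonith-enabled") == "false":
        findings.append("stonith-enabled=false")
    if props.get("no-quorum-policy") == "ignore":
        findings.append("no-quorum-policy=ignore")
    return findings


for finding in risky_properties(CONFIG):
    print("check:", finding)
```

Both settings are flagged for the configuration shown, which only means they are worth reviewing, not that they are the confirmed cause.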

Any insight would be helpful 

Thanks,






Re: [Pacemaker] A problem to fail in a stop of Pacemaker.

2009-09-29 Thread renayama19661014
Hi Remi,

 It appears that this is a similar problem to the one that I reported, 
 yes.  It appears to not be a bug in Corosync, but rather one in 
 Pacemaker.  This bug has been filed in Red Hat Bugzilla, see it at:
 
 https://bugzilla.redhat.com/show_bug.cgi?id=525589
 
 Perhaps you could add any additional details that you have found 
 (affected packages, etc.) to the bug; it may help the developers fix it.

All right.
Thank you.

Best Regards,
Hideo Yamauchi.

