Re: [Pacemaker] booth is the state of "started" on pacemaker before booth write ticket info in cib.

2013-03-18 Thread Yuichi SEINO
Hi Xia and Jiaju,

Because the RA may read an unintended file, I think it is better to
check for the existence of the lockfile in the RA. I explained the details
in my previous mail (quoted below).

What do you think about this?
If you agree, could you fix the RA?

Sincerely,
Yuichi

2013/2/25 Yuichi SEINO :
> Hi Jiaju,
>
> 2013/2/22 Jiaju Zhang :
>> On Wed, 2013-02-20 at 16:26 +0900, Yuichi SEINO wrote:
>>> Hi Jiaju,
>>>
>>> I am testing this patch.
>>> When the lockfile has been removed, the stop action of the RA does not
>>> seem to behave as intended.
>>
>> I'm just curious how the lockfile was removed. Basically the existence
>> of the lockfile shows that one boothd is started, and prevents it from
>> being wrongly started again. So the lockfile should not be removed
>> intentionally by the admin.
>
> I removed it by running "mv" on the pid file.
>
>  Another case leads to the same situation. If "boothd -l other.pid" is
> already running on the node, the lockfile exists in a different place.
> So, $lockfile does not exist when the RA runs start and stop.
>
>  I think it is better to take the existence of the lockfile or $pidnum
> into account, because /proc/cmdline may happen to satisfy this "if"
> condition. For example, the "anything" RA includes a check for whether
> the pid is empty.
>
> anything_status() {
>     if test -f "$pidfile"
>     then
>         if pid=`getpid $pidfile` && [ "$pid" ] && kill -s 0 $pid
>         then
>             return $OCF_SUCCESS
>         else
>             # pidfile w/o process means the process died
>             return $OCF_ERR_GENERIC
>         fi
>     else
>         return $OCF_NOT_RUNNING
>     fi
> }
>
>>
>> Thanks,
>> Jiaju
>>
>
> Sincerely,
> Yuichi
>
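
The check proposed in the quoted mail could look roughly like the sketch below. It is not the actual booth RA code: the function name booth_lockfile_status is hypothetical, and it assumes the lockfile holds the daemon's pid on its first line. The point is only that a missing lockfile or an empty pid is rejected before any process lookup, so that a path such as /proc/$pidnum/cmdline cannot presumably collapse to /proc/cmdline.

```shell
#!/bin/sh
# Sketch only; the function name and lockfile format are assumptions.
# Standard OCF return codes, defaulted here so the snippet runs standalone.
: "${OCF_SUCCESS:=0}" "${OCF_ERR_GENERIC:=1}" "${OCF_NOT_RUNNING:=7}"

booth_lockfile_status() {
    lockfile="$1"

    # No lockfile: report "not running" instead of guessing.
    [ -f "$lockfile" ] || return "$OCF_NOT_RUNNING"

    # Assumed format: the pid is on the first line of the lockfile.
    pid=$(head -n 1 "$lockfile" 2>/dev/null | tr -d '[:space:]')

    # Reject an empty or non-numeric pid before using it anywhere.
    case "$pid" in
        ''|*[!0-9]*) return "$OCF_ERR_GENERIC" ;;
    esac

    # Signal 0 only tests whether the process exists.
    if kill -s 0 "$pid" 2>/dev/null; then
        return "$OCF_SUCCESS"
    fi

    # Lockfile without a live process means the process died.
    return "$OCF_ERR_GENERIC"
}
```

This mirrors the structure of anything_status() above: file test first, then a non-empty pid, then the liveness check.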

--
Yuichi SEINO
METROSYSTEMS CORPORATION
E-mail:seino.clust...@gmail.com

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


pacemaker@oss.clusterlabs.org

2013-03-18 Thread Andrew Beekhof
Are you sure you just didn't wait long enough for the cib to become active?

-bash-4.2$ id
uid=12(games) gid=100(users) groups=100(users),168(haclient)

-bash-4.2$ /usr/sbin/cibadmin -Ql

  

  


-bash-4.2$ crm_mon -Af1
Last updated: Mon Mar 18 23:12:41 2013
Last change: Mon Mar 18 23:11:30 2013 via crmd on pcmk-1
Current DC: pcmk-2 (102) - partition WITHOUT quorum
4 Nodes configured, 4 expected votes
20 Resources configured.



On Mon, Mar 18, 2013 at 11:40 PM, matonb  wrote:
> Andrew Beekhof  writes:
>
>>
> Ok, managed to remote in.
>
> On the node running the upstream version from CentOS:
> -bash-4.1$ id
> uid=26(postgres) gid=26(postgres) groups=26(postgres),499(haclient)
> -bash-4.1$ crm_mon -Af1
> Could not establish cib_ro connection: Permission denied (13)
>
> Connection to cluster failed: Transport endpoint is not connected
> -bash-4.1$
>
> On the other node running my home brew RPM built from github:
> -bash-4.1$ id
> uid=26(postgres) gid=26(postgres) groups=26(postgres),499(haclient)
> -bash-4.1$ crm_mon -Af1
> Could not establish cib_ro connection: No such file or directory (2)
>
> Different errors (expected) but errors on both nonetheless...
>
> Hope this helps a bit!
>
>



[Pacemaker] When VIP failed, pacemaker stops tomcat resource.

2013-03-18 Thread 정진환
Hello there!

I have two nodes with pacemaker. The other resources behave as I expect, but
when the VIP failed (I shut down the network device with the ifdown command),
the tomcat resource also stopped.
Is this normal behavior? The tomcat resource is configured
with ocf::heartbeat:tomcat.

Please give me some help!


Re: [Pacemaker] [Problem][crmsh]The designation of the 'ordered' attribute becomes the error.

2013-03-18 Thread renayama19661014
Hi Dejan,

> > changeset:   789:916d1b15edc3
> > user:        Dejan Muhamedagic
> > date:        Thu Aug 16 17:01:24 2012 +0200
> > summary:     Medium: cibconfig: drop attributes set to default on cib import

I confirmed that, with the modification you pointed me to, the constraint is
set correctly without falling back to raw xml.

* When I set sequential=true in the cib.xml file:
(snip)








(snip)

[root@rh64-heartbeat1 ~]# crm 
crm(live)# configure
crm(live)configure# show
(snip)
group testGroup01 Dummy01 Dummy02
order test-order : _rsc_set_ Dummy01 Dummy02
(snip)

Many Thanks!
Hideo Yamauchi.


--- On Mon, 2013/3/11, renayama19661...@ybb.ne.jp  
wrote:

> Hi Dejan,
> 
> Thank you for comment.
> 
> > sequential=true is the default. In that case it's not possible to
> > have an unequivocal representation for the same construct and, in
> > this particular case, the conversion XML->CLI->XML yields a
> > different XML. There's a later commit which helps here, I think
> > that it should be possible to backport it to 1.0:
> > 
> > changeset:   789:916d1b15edc3
> > user:        Dejan Muhamedagic 
> > date:        Thu Aug 16 17:01:24 2012 +0200
> > summary:     Medium: cibconfig: drop attributes set to default on cib import
> 
> I will apply the backport you pointed out and check the behavior.
> I will get back to you if I run into a problem.
> 
> > > Is there a proper way to specify an attribute of "resource_set" with 
> > > crm shell?
> > > Or is "resource_set" perhaps not usable with the crm shell of Pacemaker 1.0.13?
> > 
> > Should work. It's just that using it with two resources, well,
> > it's sort of unusual use case.
> 
> All right!
> 
> Many Thanks!
> Hideo Yamauchi.
> 
> --- On Fri, 2013/3/8, Dejan Muhamedagic  wrote:
> 
> > Hi Hideo-san,
> > 
> > On Thu, Mar 07, 2013 at 10:18:09AM +0900, renayama19661...@ybb.ne.jp wrote:
> > > Hi Dejan,
> > > 
> > > The problem was resolved with your patch.
> > > 
> > > However, I have a question.
> > > I want to use "resource_set" which Mr. Andrew proposed, but I do not 
> > > understand how to use it with crm shell.
> > > 
> > > I loaded the following two cib.xml files and checked them with crm shell.
> > > 
> > > Case 1) sequential="false". 
> > > (snip)
> > >     
> > >         
> > >                  > >id="test-order-resource_set">
> > >                         
> > >                         
> > >                 
> > >         
> > >     
> > > (snip)
> > >  * When I check it with crm shell ...
> > > (snip)
> > >     group master-group vip-master vip-rep
> > >     order test-order : _rsc_set_ ( vip-master vip-rep )
> > > (snip)
> > 
> > Yes. All size two resource sets get the _rsc_set_ keyword,
> > otherwise it's not possible to distinguish them from "normal"
> > constraints. Resource sets are supposed to help cases when it is
> > necessary to express relation between three or more resources.
> > Perhaps this case should be an exception.
> > 
> > > Case 2) sequential="true"
> > > (snip)
> > >     
> > >       
> > >         
> > >           
> > >           
> > >         
> > >       
> > >     
> > > (snip)
> > >  * When I check it with crm shell ...
> > > (snip)
> > >    group master-group vip-master vip-rep
> > >    xml  \
> > >          \
> > >                  \
> > >                  \
> > >          \
> > > 
> > > (snip)
> > > 
> > > Does "sequential=true" have to be expressed in xml?
> > 
> > sequential=true is the default. In that case it's not possible to
> > have an unequivocal representation for the same construct and, in
> > this particular case, the conversion XML->CLI->XML yields a
> > different XML. There's a later commit which helps here, I think
> > that it should be possible to backport it to 1.0:
> > 
> > changeset:   789:916d1b15edc3
> > user:        Dejan Muhamedagic 
> > date:        Thu Aug 16 17:01:24 2012 +0200
> > summary:     Medium: cibconfig: drop attributes set to default on cib import
> > 
> > > Is there a proper way to specify an attribute of "resource_set" with 
> > > crm shell?
> > > Or is "resource_set" perhaps not usable with the crm shell of Pacemaker 1.0.13?
> > 
> > Should work. It's just that using it with two resources, well,
> > it's sort of unusual use case.
> > 
> > Cheers,
> > 
> > Dejan
> > 
> > > Best Regards,
> > > Hideo Yamauchi.
> > > 
> > > --- On Thu, 2013/3/7, renayama19661...@ybb.ne.jp 
> > >  wrote:
> > > 
> > > > Hi Dejan,
> > > > Hi Andrew,
> > > > 
> > > > Thank you for comment.
> > > > I will check the behavior of the patch and report back.
> > > > 
> > > > Best Regards,
> > > > Hideo Yamauchi.
> > > > 
> > > > --- On Wed, 2013/3/6, Dejan Muhamedagic  wrote:
> > > > 
> > > > > Hi Hideo-san,
> > > > > 
> > > > > On Wed, Mar 06, 2013 at 10:37:44AM +0900, renayama19661...@ybb.ne.jp 
> > > > > wrote:
> > > > > > Hi Dejan,
> > > > > > Hi Andrew,
> > > > > > 
> > > > > > As for the crm shell, the check of the meta attribute was r

Re: [Pacemaker] Wrong system send arp reply when using IPaddr

2013-03-18 Thread David Coulson


On 3/18/13 5:24 PM, Andrew Beekhof wrote:

> So:
>
> 1. the IP moved from 01 to 02
> 2. 01 was then rebooted
> 3. a long time passes
> 4. 01 starts arping for the IP
>
> Is that what you're saying?
> Is the problem transient or does it persist?

Went like this - IP movements are all by Pacemaker/IPaddr resource

1. 01 was rebooted and IP moved to 02
2. A week goes by
3. 01 sends one arp reply for the IP on 02
4. 02 is rebooted and a different IP moves to 01
5. a day goes by
6. 02 arps for the IP on 01.

We've only had two occurrences of it. Once from each box, and for two 
different IPs.
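
When a stray ARP reply is seen, one quick thing to rule out is whether the address really is absent from the replying box at that moment. The sketch below is only an illustration of that check: the helper name has_local_addr and the address 192.0.2.10 are placeholders, and it assumes the one-line-per-address output of `ip -o -4 addr show`, where the fourth field is the address with its prefix length.

```shell
#!/bin/sh
# Sketch only; has_local_addr is a hypothetical helper.
# Reads `ip -o -4 addr show` style output on stdin and succeeds if the
# given IPv4 address is configured on any interface.
has_local_addr() {
    awk -v ip="$1" '
        { split($4, a, "/"); if (a[1] == ip) found = 1 }
        END { exit !found }
    '
}

# Usage (192.0.2.10 is a placeholder address):
#   ip -o -4 addr show | has_local_addr 192.0.2.10 && echo "configured here"
```

This only confirms the local address state; it does not explain why a box would answer ARP for an address it does not hold.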


David



Re: [Pacemaker] Exchanging data between resource agent instances

2013-03-18 Thread Lars Ellenberg
On Mon, Mar 18, 2013 at 08:49:41PM +0100, Riccardo Bicelli wrote:
> Hello,
> does anyone know if it is possible to exchange data between two instances
> of a resource agent?
> 
> I have a Master/Slave resource agent that, when slave, has to create a
> dummy device of the same size as a given block device (DRBD) running on Master.

Why?
What do you want to achieve?

> Since the  block device is not accessible when the resource is slave, I was
> wondering if master could read size of device and report it to the slave.

does cat /proc/partitions help?
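
A minimal sketch of that idea, assuming the device is listed in /proc/partitions on the slave node: the third column there is the size in 1 KiB blocks and the fourth is the device name. The function name dev_size_bytes and the device name drbd0 are illustrative, not taken from the resource agent in question.

```shell
#!/bin/sh
# Sketch only; dev_size_bytes is a hypothetical helper.
# Prints the size in bytes of a device listed in a /proc/partitions
# style table, and fails if the device is not listed.
dev_size_bytes() {
    dev="$1"
    table="${2:-/proc/partitions}"
    awk -v d="$dev" '
        $4 == d { printf "%.0f\n", $3 * 1024; found = 1 }
        END { exit !found }
    ' "$table"
}

# Usage (drbd0 is a placeholder device name):
#   size=$(dev_size_bytes drbd0) && echo "drbd0 is $size bytes"
```

The optional second argument exists only so the parser can be tried against a saved copy of the table.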

> I don't like the idea of putting that size in the cib.

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com



Re: [Pacemaker] Wrong system send arp reply when using IPaddr

2013-03-18 Thread Andrew Beekhof
On Mon, Mar 18, 2013 at 1:00 PM, David Coulson  wrote:
> Last movement of the ip was six days before issue when we patched and 
> rebooted one of the nodes.
>
> Resource is a simple ipaddr (not ipaddr2) with an ip and net mask. Like I 
> said, really simple config so confused :)

So:

1. the IP moved from 01 to 02
2. 01 was then rebooted
3. a long time passes
4. 01 starts arping for the IP

Is that what you're saying?
Is the problem transient or does it persist?

>
> Sent from my iPhone
>
> On Mar 17, 2013, at 9:01 PM, Andrew Beekhof  wrote:
>
>> On Mon, Mar 18, 2013 at 3:17 AM, David Coulson  
>> wrote:
>>> First off, I'm going to preface this with the realization that what I am
>>> explaining makes no sense, doesn't follow normal logic and I'm not a
>>> complete idiot. I've beaten my head against a wall with this issue for two
>>> days, and have made no progress, yet we've had a couple of production system
>>> outages because of it.
>>>
>>> Environment is a pair of IBM x-series systems in a DMZ connected to an
>>> ASA5500. Each IBM box has two interfaces in a mode=4 bond connected to two
>>> switches, which connected to the pri/sec firewall and are interconnected -
>>> Poor man's redundancy, I suppose. Both boxes run RHEL6.3 and Pacemaker
>>> 1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14. ASA has an ARP table
>>> timeout of 4 hours.
>>>
>>> There are about a dozen IPAddr resources in a group which are configured
>>> with meta ordered="false" collocated="false" - Each is independent from a
>>> service perspective, but the group makes it easy to manage them. Each box
>>> runs LVS with mangle rules, then assigns fwm values for routing within LVS -
>>> For whatever reason, this still requires the IP to be on the box receiving
>>> the packet through LVS, even if the mangle rule is triggered.
>>>
>>> We've had a couple of instances for two IPs in this configuration where
>>> Pacemaker (and syslog) indicate the IP is assigned to box 01, yet the
>>> firewall receives an ARP reply from box 02. Didn't believe it at first until
>>> we grabbed packets from a SPAN on the switches. Correct IP address in reply,
>>> MAC of one of the bonded interfaces on box 02, yet the IP isn't on it.
>>>
>>> We've experienced both 01 arping for an IP on 02, and 02 arping for an IP on
>>> 01. Last night when we had the issue, an IP was on 02, 01 arped for it and I
>>> tcpdumped on 01 and saw SYN packets coming in for the IP on 01 - Makes
>>> sense, but doesn't explain why the box answered the arp in the first place.
>>
>> Had pacemaker just failed over the IP?
>> Did you set any ARP related options for the resource?
>>
>>>
>>> I realize this likely isn't a Pacemaker issue, but I was hoping someone else
>>> might have experienced a similar issue, or can at least point me in the
>>> right direction. We have a far more complex Pacemaker/LVS environment on our
>>> inside network (which isn't link-local to the ASA - goes through an inside
>>> router) which works flawlessly, so I'm open to the fact that something is
>>> totally screwed up in our DMZ.
>>>
>>> Sorry that was long. :)
>>>


[Pacemaker] Exchanging data between resource agent instances

2013-03-18 Thread Riccardo Bicelli
Hello,
does anyone know if it is possible to exchange data between two instances
of a resource agent?

I have a Master/Slave resource agent that, when slave, has to create a
dummy device of the same size as a given block device (DRBD) running on Master.

Since the  block device is not accessible when the resource is slave, I was
wondering if master could read size of device and report it to the slave.

I don't like the idea of putting that size in the cib.

Riccardo.


pacemaker@oss.clusterlabs.org

2013-03-18 Thread matonb
Andrew Beekhof  writes:

> 
Ok, managed to remote in.

On the node running the upstream version from CentOS:
-bash-4.1$ id
uid=26(postgres) gid=26(postgres) groups=26(postgres),499(haclient) 
-bash-4.1$ crm_mon -Af1
Could not establish cib_ro connection: Permission denied (13)

Connection to cluster failed: Transport endpoint is not connected
-bash-4.1$ 

On the other node running my home brew RPM built from github:
-bash-4.1$ id
uid=26(postgres) gid=26(postgres) groups=26(postgres),499(haclient) 
-bash-4.1$ crm_mon -Af1
Could not establish cib_ro connection: No such file or directory (2)

Different errors (expected) but errors on both nonetheless...

Hope this helps a bit!
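
For anyone repeating this test, group membership can be checked in a script instead of reading the `id` output by eye. This is only a sketch: user_in_group is a hypothetical helper, and passing the check rules out just one possible cause of the cib_ro connection errors above.

```shell
#!/bin/sh
# Sketch only; user_in_group is a hypothetical helper.
# Succeeds if the named user belongs to the given group.
user_in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx -- "$2"
}

# Usage:
#   user_in_group postgres haclient || echo "postgres is not in haclient"
```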




pacemaker@oss.clusterlabs.org

2013-03-18 Thread matonb
> Unprivileged but in the haclient group?

I'm out of the office at the moment, I'll verify that again tonight and get back
to you.
(I was testing on a different pair of nodes to my usual test cluster...).

