Re: [Pacemaker] Problems shutting down server/services

2010-06-17 Thread Erich Weiler
this is a problem ... without libvirtd running all management attempts via "virsh" will fail (including the VirtualDomain RA) ... why is libvirtd stopping before corosync? Ah, because I forgot that I have libvirtd starting and stopping via init scripts. ;) That is the default shutdown sequen

[Pacemaker] Problems shutting down server/services

2010-06-16 Thread Erich Weiler
Hi All, I'm having a weird issue that I was hoping someone could shed some light on... I've got Pacemaker 1.0.8 and Corosync 1.2.2 on a two node cluster. I'm managing some VMs via VirtualDomain, with DRBD under that. It mostly works, but I'm having a real hard time shutting down services,

[Pacemaker] VirtualDomain/DRBD live migration with pacemaker...

2010-06-14 Thread Erich Weiler
Hi All, We have this interesting problem I was hoping someone could shed some light on. Basically, we have 2 servers acting as a pacemaker cluster for DRBD and VirtualDomain (KVM) resources under CentOS 5.5. As it is set up, if one node dies, the other node promotes the DRBD devices to "Mas

Re: [Pacemaker] Pacemaker failover problem

2010-03-09 Thread Erich Weiler
-with-IP inf: LDAP-IP LDAP-clone order LDAP-after-IP inf: LDAP-IP LDAP-clone property $id="cib-bootstrap-options" \ dc-version="1.0.7-d3fa20fc76c7947d6de66db7e52526dc6bd7d782" \ cluster-infrastructure="openais" \ expected-quorum-votes="

Re: [Pacemaker] Pacemaker failover problem

2010-03-08 Thread Erich Weiler
: LDAP-IP LDAP-clone which is expected at that point. It seems that when I tested this with older versions of pacemaker, this didn't happen. Should 'order' statements be avoided entirely when dealing with anonymous clones? Is that behavior expected? Thanks, erich Erich

Re: [Pacemaker] Pacemaker failover problem

2010-03-08 Thread Erich Weiler
e. Please remove these and see if that helps. location LDAP-IP-placement-2 LDAP-IP 50: genome-ldap2 location LDAP-placement-1 LDAP-clone 100: genome-ldap1 location LDAP-placement-2 LDAP-clone 100: genome-ldap2 colocation LDAP-with-IP inf: LDAP-IP LDAP-clone hj On Mon, Mar 8, 2010 at 1:34 PM, Erich

[Pacemaker] Pacemaker failover problem

2010-03-08 Thread Erich Weiler
Hi All, I have a (hopefully) simple problem that I need to fix, but I feel like I'm missing a key concept here that is causing problems. I have 2 nodes, genome-ldap1 and genome-ldap2. Using latest corosync, pacemaker and openais from the epel and clusterlabs repos, CentOS 5.4. Both nodes a

Re: [Pacemaker] Installation problems

2010-03-07 Thread Erich Weiler
Mar 07 08:20:04 corosync [pcmk ] ERROR: pcmk_startup: Cluster user hacluster does not exist Heh, I answered my own question... I was overwriting the passwd file and nuking the hacluster user, which was causing problems... I got it sorted. Thanks for reading, in any case... ;)

[Pacemaker] Installation problems

2010-03-07 Thread Erich Weiler
Hi Y'all, I'm having some issues getting things running on a stock CentOS 5.4 install, and I was hoping someone could point me in the right direction... Through the epel and clusterlabs repos that are referenced in the wiki, I installed: corosync-1.2.0-1.el5 openais-1.1.0-1.el5 pacemaker-1.

Re: [Pacemaker] ha-clustering repo?

2010-02-06 Thread Erich Weiler
Repo. is : Repository is also announced in the Install doco : Many thanks!

[Pacemaker] ha-clustering repo?

2010-02-06 Thread Erich Weiler
Hi All, In the recent past, I've been able to get pacemaker rpms from the ha-clustering repo at opensuse.org (for RHEL_5), but just recently, they seem to have disappeared. The other rpms are still there (heartbeat, openais, etc). Did I miss an announcement? Are they to be found somewhere

Re: [Pacemaker] Auto-restart service on IP shift?

2010-02-05 Thread Erich Weiler
e... ;) Dejan Muhamedagic wrote: Hi, On Wed, Feb 03, 2010 at 07:35:31AM -0800, Erich Weiler wrote: I think I read about this somewhere, but I can't find it now... I have a cloned service on two nodes that is always running on both (named, one is a replicated slave). The IP sticks to one

Re: [Pacemaker] Auto-restart service on IP shift?

2010-02-04 Thread Erich Weiler
yes, it does. Thanks for the pointer. I've been not aware that it's possible to have a kind of "standby-ip" where the upper services can bind to without actually getting connection requests. So you're saying you can have 2 nodes have the IP address, but only one answers on it? Does the other n

Re: [Pacemaker] Auto-restart service on IP shift?

2010-02-04 Thread Erich Weiler
t when the IP *comes* to testvm2, not when the IP leaves it. ;) Erich Weiler wrote: If you had an ordering constraint that said named clone should start after the ip, that would achieve what you want. Might result in extra restarts of the clone though. I thought that was the way I had it config

Re: [Pacemaker] Auto-restart service on IP shift?

2010-02-04 Thread Erich Weiler
If you had an ordering constraint that said named clone should start after the ip, that would achieve what you want. Might result in extra restarts of the clone though. I thought that was the way I had it configured, but it doesn't seem to work. When I shut down testvm1, the IP goes to testvm2
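For reference, the ordering constraint suggested in this message could look roughly like the fragment below. This is a sketch only, written in the same crm shell syntax that appears in the LDAP failover thread further up this listing. The resource names ClusterIP and named-clone and the constraint ID are hypothetical, since the poster's actual configuration is not shown here.

```
# Hypothetical names: ClusterIP is the floating IP, named-clone the cloned named service.
# Start the IP first, then the clone; a mandatory (inf) order can restart the clone
# whenever the IP moves, which is the "extra restarts" caveat mentioned above.
order named-after-ip inf: ClusterIP named-clone
```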

[Pacemaker] Auto-restart service on IP shift?

2010-02-03 Thread Erich Weiler
I think I read about this somewhere, but I can't find it now... I have a cloned service on two nodes that is always running on both (named, one is a replicated slave). The IP sticks to one, then floats to the other if the first one goes down. Works great. My problem is that named only binds

Re: [Pacemaker] Master/Slave Confusion

2010-02-02 Thread Erich Weiler
Thanks for the tip, I think I'm closer... I set a preference and then tried to start the LDAP service, and the crm monitor shows testvm3 (my preferred master) trying to start LDAP as a master repeatedly but failing. It does this like 2 times per second. I think I'm very, very close to nailin

Re: [Pacemaker] Master/Slave Confusion

2010-02-02 Thread Erich Weiler
The script should set a preference for being promoted using crm_master. Have a look at LinBit's drbd script for a good example. Thanks for the tip, I think I'm closer... I set a preference and then tried to start the LDAP service, and the crm monitor shows testvm3 (my preferred master) trying

Re: [Pacemaker] Master/Slave Confusion

2010-02-01 Thread Erich Weiler
t doesn't mean much. I think the IP is following the preference I set, and not 'following the master' since there is no master? How does one promote a slave to master automatically? Thanks again for any insight! -erich Erich Weiler wrote: if you make the LDAP daemon listen on

Re: [Pacemaker] Master/Slave Confusion

2010-02-01 Thread Erich Weiler
if you make the LDAP daemon listen on all available interfaces, it will accept connections on the on-demand activated floating-ip. Well, I'm trying to get this to work and running into a wall... I've got 3 servers, I want LDAP to run on testvm2 and testvm3. I've configured LDAP on those 2 s

Re: [Pacemaker] Master/Slave Confusion

2010-02-01 Thread Erich Weiler
Thanks! This will be helpful... Rafał Kupka wrote: On Sun, Jan 31, 2010 at 06:39:28PM -0800, Erich Weiler wrote: Hi, However, it seems that when LDAP starts, the IP needs to be live on each node for the LDAP server to bind on that IP. Is that how the master/slave setup works in pacemaker

[Pacemaker] Resource cannot run anywhere

2010-01-31 Thread Erich Weiler
So, I have 3 Floating IP addresses, between 3 nodes, with each IP address having a preference for each node. But the third one, ClusterIP3, is not running and the "crm_verify -VL" command says that ClusterIP3 cannot run anywhere. But, I can't see why. Here's my CRM config: crm(live)configu

[Pacemaker] Master/Slave Confusion

2010-01-31 Thread Erich Weiler
Hi All, Forgive a probably elementary question, but I'm new to Pacemaker and am not clear on exactly how Master/Slave relationships work. Here's my confusion: My initial thought was that with a master/slave service, the service is started on both nodes (assuming I have 2 nodes). But, I w