Re: [ClusterLabs] Antw: Re: Antw: Delayed first monitoring

2015-08-13 Thread Digimer
://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? ___ Users mailing list: Users

Re: [ClusterLabs] implementation of fence and stonith agents for pacemaker

2015-08-13 Thread Digimer
On 13/08/15 07:54 AM, Kostiantyn Ponomarenko wrote: Digimer, Thank you. I will try this out. One more question. What about directories for those agents, what rules are here? Thank you, Kostya I'm not entirely sure I understand the question, sorry. What do you mean by directories

Re: [ClusterLabs] upgrade from 1.1.9 to 1.1.12 fails to start

2015-08-18 Thread Digimer
to be done between 1.1.9 and 1.1.12? Michelle Streeter You need to upgrade all of the cluster components please. Ideally, upgrade the whole OS...

Re: [ClusterLabs] [Slightly OT] OCFS2 over LVM

2015-08-23 Thread Digimer
On 23/08/15 04:40 PM, Jorge Fábregas wrote: On 08/23/2015 02:16 PM, Digimer wrote: One, this is on-topic, so don't worry. :) Thanks. Two, I've never used ocfs2 (allergic to all things Oracle), but clvmd makes LVM cluster-aware, as you know. So I have no idea why they'd say that. I

Re: [ClusterLabs] [Slightly OT] OCFS2 over LVM

2015-08-24 Thread Digimer
anyway, so I've already incurred the complexity costs so hey, why not.

Re: [ClusterLabs] Cluster.conf

2015-08-24 Thread Digimer
straight to setting up pacemaker and not worry about cman/corosync directly. digimer On 24/08/15 01:52 PM, Streeter, Michelle N wrote: If I have a cluster.conf file in /etc/cluster, my cluster will not start. Pacemaker 1.1.11, Corosync 1.4.7, cman 3.0.12, But if I do not have a cluster.conf file

Re: [ClusterLabs] upgrade from 1.1.9 to 1.1.12 fails to start

2015-08-17 Thread Digimer
the Cluster.conf file and the cib.xml and all the back up versions and tried again and got the same error. I googled this error and really got nothing. Any ideas? As a test, can you create a fresh, new cluster?

Re: [ClusterLabs] Node lost early in HA startup -- no STONITH

2015-08-02 Thread Digimer
the duration of the events.

Re: [ClusterLabs] implementation of fence and stonith agents for pacemaker

2015-08-11 Thread Digimer
the need for agents to output the XML metadata. For now, you should be able to see the format needed by looking at the metadata output of existing FAs.

Re: [ClusterLabs] [ClusterLabs Developers] Resource Agent language discussion

2015-08-07 Thread Digimer
to add, uselessly; perl! 3

Re: [ClusterLabs] nfsServer Filesystem Failover average 76s

2015-08-14 Thread Digimer
. *Lots* changed in HA from 6.4 - 6.6. digimer On 14/08/15 01:17 PM, Streeter, Michelle N wrote: I am getting an average failover for nfs of 76s. I have set all the start and stop settings to 10s but no change. The Web page is instant but not nfs. I am running two node cluster on rhel6

Re: [ClusterLabs] 2 Nodes Pacemaker for Nginx can only failover 1 time

2015-08-08 Thread Digimer

Re: [ClusterLabs] fence-virtd reach remote server serial/VM channel/TCP

2015-08-05 Thread Digimer
sides by other means. Alternatively, you might implement such relying directly as fence_virtd module (backend), possibly reusing some code from the client side (fence_virt).

Re: [ClusterLabs] Pacemaker failover failure

2015-07-14 Thread Digimer

Re: [ClusterLabs] Antw: Re: [Slightly OT] OCFS2 over LVM

2015-08-25 Thread Digimer
On 25/08/15 04:45 AM, Ulrich Windl wrote: Digimer li...@alteeve.ca wrote on 24.08.2015 at 18:20 in message 55db4453.10...@alteeve.ca: [...] Using a pair of nodes with a traditional file system exported by NFS and made accessible by a floating (virtual) IP address gives you redundancy

Re: [ClusterLabs] how to fence in a two node cluster.If haretbeat network is down between the two nodes, which node will fence the other node?

2015-11-04 Thread Digimer
ems. Pull all power to a node. This will cause IPMI to fail, so the fence call will fail. This is why I use IPMI *and* a pair of switched PDUs (IPMI on one switch, PDUs on another).

Re: [ClusterLabs] large cluster - failure recovery

2015-11-04 Thread Digimer
6.3 from local not > applied to 1.15046.3: current "num_updates" is greater than required > [...] > > > ps. Sorry if should posted on corosync newsgroup, just the CIB > synchronization fails, so this group seemed to me the right place. All of the HA mailing lists are

Re: [ClusterLabs] Fencing Two-Node Cluster on Two ESXi Hosts

2015-11-05 Thread Digimer
"two_node: 1" in corosync.conf (assuming you're using > corosync 2). That will allow one node to keep quorum if the other is > shut down.

Re: [ClusterLabs] Cluster node loss detection.

2015-10-16 Thread Digimer
t setting, they can be configured in cluster.conf as shown above. > Cman uses the following default values: > > vsftype="none" > token="1" > token_retransmits_before_loss_const="

Re: [ClusterLabs] Fencing questions.

2015-10-19 Thread Digimer
es I see after fencing takes place. > > Thanks in advance > Arjun

Re: [ClusterLabs] Cluster node loss detection.

2015-10-16 Thread Digimer
etty sure that DLM was just being informed by clustering, but I > needed to ask. > > Again, thanks. > > > Regards. > Mark K Vallevand mark.vallev...@unisys.com > <mailto:mark.vallev...@unisys.com> > Never try and teach a pig to sing: it's a waste of time, and it

Re: [ClusterLabs] Cluster node loss detection.

2015-10-16 Thread Digimer
mail and its > attachments from all computers. > > > -Original Message- > From: Digimer [mailto:li...@alteeve.ca] > Sent: Friday, October 16, 2015 11:51 AM > To: Cluster Labs - All topics related to open-source clustering welcomed > Subject: Re: [ClusterLabs] Cluster n

Re: [ClusterLabs] (no subject)

2015-10-08 Thread Digimer
On 08/10/15 09:03 PM, TaMen说我挑食 wrote: > Corosync+Pacemaker error during failover You need to ask a question if you want us to be able to help you.

Re: [ClusterLabs] multiple drives looks like balancing but why and causing troubles

2015-08-26 Thread Digimer

Re: [ClusterLabs] VG activation on Active/Passive

2015-08-29 Thread Digimer

Re: [ClusterLabs] VG activation on Active/Passive

2015-08-29 Thread Digimer
On 29/08/15 02:51 PM, Jorge Fábregas wrote: On 08/29/2015 02:37 PM, Digimer wrote: No need for clustered LVM, only the active node should see the PV. When the passive takes over, after connecting to the PV, it should do a pvscan - vgscan - lvscan before mounting the FS on the LV. Keep you

Re: [ClusterLabs] SBD & Failed Peer

2015-09-07 Thread Digimer
> Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf > Bugs: http://bugs.clusterlabs.org > -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without acces

[ClusterLabs] Problem with fence_virsh in RHEL 6 - selinux denial

2015-09-08 Thread Digimer
el6.x86_64 cman-3.0.12.1-73.el6.1.x86_64 corosync-1.4.7-2.el6.x86_64 [root@node1 ~]# cat /etc/redhat-release Red Hat Enterprise Linux Server release 6.7 (Santiago) I'll post a follow-up if I can sort out how to fix it. My selinux-fu is weak...

Re: [ClusterLabs] [ClusterLabs Developers] Problem with fence_virsh in RHEL 6 - selinux denial

2015-09-08 Thread Digimer
74): avc: denied { open } for pid=23611 comm="ssh" name="id_rsa" dev=vda2 ino=1966200 scontext=unconfined_u:system_r:fenced_t:s0 tcontext=unconfined_u:object_r:ssh_home_t:s0 tclass=file type=SYSCALL msg=audit(1441767229.710:9374): arch=c03e syscall=2 success=yes e

Re: [ClusterLabs] Coming in 1.1.14: Fencing topology based on node attribute

2015-09-08 Thread Digimer
ve to update the fencing > configuration once rather than for every node in the rack. > > The syntax accepts either '=' or ':' as the separator for the name/value > pair, so target="rack:1" would work in the XML as well. Holy crap that is awesome! :D

[ClusterLabs] Clustered LVM with iptables issue

2015-09-10 Thread Digimer
-j ACCEPT -A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT -A INPUT -j REJECT --reject-with icmp-host-prohibited -A FORWARD -j REJECT --reject-with icmp-host-prohibited COMMIT # Completed on Thu Sep 10 22:12:38 2015 Any help is appreciated!

Re: [ClusterLabs] Clustered LVM with iptables issue

2015-09-10 Thread Digimer
On 10/09/15 06:31 PM, Noel Kuntze wrote: > > Hello Digimer, > > Pro tip: look at the 'multiport' module. You can substantially reduce the > number of rules with it. > Right now, I'm scratching my eyes out. > You can use `ss` or `netstat` to find out where clvmd wants to ph

Re: [ClusterLabs] Clustered LVM with iptables issue

2015-09-10 Thread Digimer
On 10/09/15 06:54 PM, Noel Kuntze wrote: > > Hello Digimer, > > I initially assumed you were familiar with ss or netstat and simply > forgot about them. > Seems I was wrong. > > Check the output of this: `ss -tpn` and `ss -upn`. > Those commands give you the current o

Re: [ClusterLabs] Clustered LVM with iptables issue

2015-09-10 Thread Digimer
[3001]: [TOTEM ] ring 0 active with no faults Adding; iptables -I INPUT -p sctp -j ACCEPT Got it working. Obviously, that needs to be tightened up. digimer On 10/09/15 07:01 PM, Digimer wrote: > On 10/09/15 06:54 PM, Noel Kuntze wrote: >> >> Hello Digimer, >> >> I

Re: [ClusterLabs] Adding 'virsh migrate migrate-setspeed' support to the vm RA

2015-09-14 Thread Digimer
On 14/09/15 07:19 AM, Dejan Muhamedagic wrote: > Hi Digimer, > > On Fri, Sep 04, 2015 at 03:36:09AM -0400, Digimer wrote: >> Hi all, >> >> I hit an issue a little while ago where live-migrating a VM (on the >> same management network normally used for coros

Re: [ClusterLabs] EL6, cman, rrp, unicast and iptables

2015-09-14 Thread Digimer
it that something else triggered the fault detection. It happened during a long live migration (actually, several servers back to back), so I *assumed* that was the cause. Given it was a cut-over weekend though, I made a mental note and went back to work. Bad choice... I should have snagged

Re: [ClusterLabs] EL6, cman, rrp, unicast and iptables

2015-09-14 Thread Digimer
On 14/09/15 04:20 AM, Jan Friesse wrote: > Digimer wrote: >> Hi all, >> >>Starting a new thread from the "Clustered LVM with iptables issue" >> thread... >> >>I've decided to review how I do networking entirely in my cluster. I >>

Re: [ClusterLabs] EL6, cman, rrp, unicast and iptables

2015-09-15 Thread Digimer
On 15/09/15 12:10 PM, Noel Kuntze wrote: > > Hello Digimer, > >> So what's the final verdict on this? I followed your back and forth, and >> it sounds like corosync uses 0, so nothing else is to be done? > > Missing prioritization itself cannot be the cause of the p

Re: [ClusterLabs] [ClusterLabs Developers] Problem with fence_virsh in RHEL 6 - selinux denial

2015-09-09 Thread Digimer
I've created an rhbz: https://bugzilla.redhat.com/show_bug.cgi?id=1261711 digimer On 08/09/15 11:04 PM, Digimer wrote: > ere is my cluster.conf, in case it matters: > > > [root@node1 ~]# cat /etc/cluster/

Re: [ClusterLabs] EL6, cman, rrp, unicast and iptables

2015-09-15 Thread Digimer
On 15/09/15 03:20 AM, Jan Friesse wrote: > Digimer wrote: >> On 14/09/15 04:20 AM, Jan Friesse wrote: >>> Digimer wrote: >>>> Hi all, >>>> >>>> Starting a new thread from the "Clustered LVM with iptables issue" >>

Re: [ClusterLabs] Major problem with iSCSITarget resource on top of DRBD M/S resource.

2015-09-27 Thread Digimer
On 27/09/15 11:02 AM, Alex Crow wrote: > > > On 27/09/15 15:54, Digimer wrote: >> On 27/09/15 10:40 AM, Alex Crow wrote: >>> Hi List, >>> >>> I'm trying to set up a failover iSCSI storage system for oVirt using a >>> self-hosted engine. I've se

Re: [ClusterLabs] Major problem with iSCSITarget resource on top of DRBD M/S resource.

2015-09-27 Thread Digimer
e-target away from glenrock > after 100 failures (max=100) > Sep 27 15:35:59 glenrock pengine[3365]: notice: process_pe_message: > Calculated Transition 54: /var/lib/pacemaker/pengine/pe-input-537.bz2 > Sep 27 15:35:59 glenrock crmd[3366]: notice: run_graph: Transition 54 > (C

[ClusterLabs] will rgmanager/ccs support the vm RA's new migrate-setspeed?

2015-10-05 Thread Digimer
Re: https://github.com/ClusterLabs/resource-agents/pull/629 I'd love to support this in the current gen Anvil!. Would it be hard to add support to ccs for this?

Re: [ClusterLabs] [Linux-HA] Cluster for HA VM's serving our local network

2015-09-23 Thread Digimer
efault libvirtd bridge is a NAT'ed bridge, so your VMs would get IPs in the 192.168.122.0/24 subnet, and the libvirtd bridge would route them to the outside world. Using the bridge type in the tutorial though, your VMs would appear to be directly on your network and would get (or you wo

Re: [ClusterLabs] [Linux-HA] Cluster for HA VM's serving our local network

2015-09-23 Thread Digimer
On 23/09/15 10:23 AM, J. Echter wrote: > Hi Digimer, > > On 23.09.2015 at 15:38, Digimer wrote: >> Hi Juergen, >> >>First; This list is deprecated and you should use the Cluster Labs - >> Users list (which I've cc'ed here). > > i already got th

[ClusterLabs] Odd clvmd error - clvmd: Unable to create DLM lockspace for CLVM: Address already in use

2015-09-24 Thread Digimer
:07:40 node1 corosync[4770]: [TOTEM ] Retransmit List: 252 Sep 24 23:07:40 node1 corosync[4770]: [TOTEM ] Retransmit List: 254 Sep 24 23:07:40 node1 corosync[4770]: [TOTEM ] Retransmit List: 254 Certainly *looks* like a network problem, but I can't see what's wrong... Any ideas? Thanks

Re: [ClusterLabs] Quorum With Two Nodes And an "observer" Questions

2015-09-19 Thread Digimer
h such idea too? Got anything to share about this? > > > Thank you Feasible, sure. Needed? No. Quorum is nice to have, but if you use a fence delay on a node and tell corosync to use 'wait_for_all', then you're fine. All the clusters I've built in the last 5~6 years have been 2-node

Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary node to Slave (always Stopped)

2015-09-21 Thread Digimer
all team on how to make this work. SSH > into the drac works fine, and IPMI over IP is enabled. If anyone has > ideas on this, they would be greatly appreciated. You're initiating the connection, so no firewall edits should be needed (anything returning should be ESTABLISHED/RELATED).

Re: [ClusterLabs] Pacemaker/pcs & DRBD not demoting secondary node to Slave (always Stopped)

2015-09-21 Thread Digimer
ere another method/configuration for > fencing DRBD? > > Thank you for your advice, > > Jason > > On 9/20/15, 9:40 PM, "Digimer" <li...@alteeve.ca> wrote: > >> On 20/09/15 09:18 PM, Jason Gress wrote: >>> I had seemed to cause a split brain attempting

Re: [ClusterLabs] Odd clvmd error - clvmd: Unable to create DLM lockspace for CLVM: Address already in use

2015-09-25 Thread Digimer
On 25/09/15 03:44 AM, Christine Caulfield wrote: > On 25/09/15 00:09, Digimer wrote: >> I had a RHEL 6.7, cman + rgmanager cluster that I've built many times >> before. Oddly, I just hit this error: >> >> >> [root@node2 ~]# /etc/init.d/clvmd start >>

Re: [ClusterLabs] design of a two-node cluster

2015-12-08 Thread Digimer
On 08/12/15 03:13 AM, Lentes, Bernd wrote: > Digimer wrote: > >>>>> Should I install all vm's in one partition or every vm in a separate >>>>> partition? The advantage of one vm per partition is that I don't >>>>> need a cluster fs, righ

Re: [ClusterLabs] Antw: Re: Antw: Re: design of a two-node cluster

2015-12-08 Thread Digimer
>>>> >>>>>> "Lentes, Bernd" <bernd.len...@helmholtz-muenchen.de> wrote >>> on >>>>>> 08.12.2015 at >>> 09:13 in message <00a901d13190$5c6db3c0$15491b40$@helmholtz- >>> muenchen.de>: >>>> Digimer

Re: [ClusterLabs] Antw: Re: design of a two-node cluster

2015-12-08 Thread Digimer
On 08/12/15 02:44 AM, Ulrich Windl wrote: >>>> Digimer <li...@alteeve.ca> wrote on 07.12.2015 at 22:40 in message > <5665fcdc.1030...@alteeve.ca>: > [...] >> Node 1 looks up how to fence node 2, sees no delay and fences >> immediately. Node 2

Re: [ClusterLabs] design of a two-node cluster

2015-12-07 Thread Digimer
On 07/12/15 03:27 PM, Lentes, Bernd wrote: > Digimer wrote: >> >> On 07/12/15 12:35 PM, Lentes, Bernd wrote: >>> Hi, >>> >>> i've been asking all around here a while ago. Unfortunately I couldn't >>> continue to work on my cluster, so I'm still th

Re: [ClusterLabs] design of a two-node cluster

2015-12-07 Thread Digimer
with their respective tools. > > Thanks in advance. I don't recommend snapshots, as I mentioned. I recommend focusing on your backup application and creating DR VMs if you want to minimize the time to recovery after a total VM loss. > B

Re: [ClusterLabs] cluster node config - DNS or IP addresses?

2015-12-22 Thread Digimer
On 22/12/15 05:31 PM, Ilia Sokolinski wrote: > Hi, > > What are the best practices with respect to using DNS names vs IP addresses > in pacemaker/corosync configuration? > > Thanks > > Ilia Sokolinski I use `uname -n` and have the hostname resolve via /etc/ho

Re: [ClusterLabs] mail server (postfix)

2016-06-04 Thread Digimer
On 04/06/16 01:27 PM, Dmitri Maziuk wrote: > On 2016-06-04 01:10, Digimer wrote: > >> We're running postfix/dovecot/postgres for our mail on an HA cluster, >> but we put it all in a set of VMs and made the VMs HA on DRBD. > > Hmm. I deliver to ~/Maildir and /home

Re: [ClusterLabs] how to "switch on" cLVM ?

2016-06-07 Thread Digimer
are always seen on all nodes right away. What you do on the LVs is up to you. If you boot a VM on node 1 using an LV as backing storage, nothing in LVM stops you from accessing that LV on another node and destroying your data. For that, you need pacemaker or something else smart enough and cl

Re: [ClusterLabs] how to "switch on" cLVM ?

2016-06-06 Thread Digimer
ster aware... If you mount it on two nodes, you will almost certainly corrupt the FS quickly. If you want to mount an LV on two+ nodes at once, you need a cluster-aware file system, like GFS2. > Thanks in advance. > > > Bernd >

Re: [ClusterLabs] mail server (postfix)

2016-06-04 Thread Digimer
for our mail on an HA cluster, but we put it all in a set of VMs and made the VMs HA on DRBD. We go this route because one setup can be adapted to just about any application. This also allows migrations without interruptions. Didn't answer your question, but maybe an alternate approach to consider

Re: [ClusterLabs] newbie questions

2016-05-31 Thread Digimer
ceed to shoot node 1 and then recover any resources that had been on node 1. digimer > However, there are more things for me to read and more experiments > for me to try so I'm good for now. > > Thanks to everyone for the prompt help. > > j. > > On Tue, May 31,

Re: [ClusterLabs] design question to DRBD

2016-06-22 Thread Digimer
+rgmanager" for "pacemaker", adjust the actual commands and the rest of the guide works fine.

Re: [ClusterLabs] Recovering after split-brain

2016-06-21 Thread Digimer
. Whether you're using a shared IP, shared storage or something else, it's all the same to pacemaker in the end. > On Tue, Jun 21, 2016 at 8:27 PM, Dmitri Maziuk <dmitri.maz...@gmail.com > <mailto:dmitri.maz...@gmail.com>> wrote: > > On 2016-06-20 17:19, Digimer wro

Re: [ClusterLabs] Node is silently unfenced if transition is very long

2016-06-21 Thread Digimer
ct (resource cleanup is a node unfence)... >> Honestly, this potentially leads to a data corruption... >> >> Also (probably not related) there was one more resource stop failure (in >> that case - timeout) prior to failed stop mentioned above. And that stop >> timeout di

Re: [ClusterLabs] Recovering after split-brain

2016-06-21 Thread Digimer
On 21/06/16 10:57 AM, Dmitri Maziuk wrote: > On 2016-06-20 17:19, Digimer wrote: > >> Nikhil indicated that they could switch where traffic went up-stream >> without issue, if I understood properly. > > They have some interesting setup, but that notwithstanding: if sp

Re: [ClusterLabs] Recovering after split-brain

2016-06-21 Thread Digimer
sources mode. If I wanted to run it so that when it breaks I get to > keep the pieces, I could. You technically can in pacemaker, too, but it's dumb in any HA environment. As soon as you make assumptions, you open up the chance of being wrong.

Re: [ClusterLabs] restarting pacemakerd

2016-06-19 Thread Digimer
;. I'd be shocked if there wasn't a version of this in pacemaker already, given that it has far more flexibility than rgmanager.

Re: [ClusterLabs] Resource ocf:heartbeat:asterisk fails to start

2016-06-17 Thread Digimer
isable stonith, so ya, not a great resource.

Re: [ClusterLabs] Recovering after split-brain

2016-06-20 Thread Digimer
andby. > > Does Pacemaker make it easy to do this kind of thing through some means? > Are there any issues that I am completely unaware due to letting > split-brain occur? > > -Thanks > Nikhil

Re: [ClusterLabs] DLM standalone without crm ?

2016-06-24 Thread Digimer
AM, Lentes, Bernd wrote: > Hi, > > is it possible to have a DLM running without CRM ? Just for playing around a > bit and get used to some stuff. > > > Bernd >

Re: [ClusterLabs] restarting pacemakerd

2016-06-18 Thread Digimer
sense. What you want to do is alert an admin that a restart was needed, so that he or she can investigate the cause. Pacemaker 1.1.15 allows for this alerting now.

[ClusterLabs] DLM hanging when corosync is OK causes cluster to hang

2016-01-11 Thread Digimer
this?

Re: [ClusterLabs] Pacemaker 1.1.14 released

2016-01-14 Thread Digimer
he release includes many bugfixes and minor enhancements. For > a more detailed list of changes, see the change log: > > https://github.com/ClusterLabs/pacemaker/blob/1.1/ChangeLog > > Feedback is invited and welcome. >

Re: [ClusterLabs] Antw: Re: DLM fencing

2016-02-10 Thread Digimer
On 10/02/16 02:40 AM, Ulrich Windl wrote: >>>> Digimer <li...@alteeve.ca> wrote on 08.02.2016 at 20:03 in message > <56b8e68a.1060...@alteeve.ca>: >> On 08/02/16 01:56 PM, Ferenc Wágner wrote: >>> Ken Gaillot <kgail...@redhat.com> write

[ClusterLabs] OT - Someone mailed me something, was it someone here?

2016-02-06 Thread Digimer
wondering; Was it someone from here maybe? It's definitely something meant for me so I was wondering if someone I've helped found my address... A mystery! If it was, thanks to whoever sent it. If it wasn't anyone here, then sorry for the line noise. :)

Re: [ClusterLabs] DLM fencing

2016-02-08 Thread Digimer
node with clvmd/gfs2 is no different than normal cluster fencing. To be clear, DLM does NOT fence, it simply waits for the cluster to fence. So you can use IPMI, switched PDUs or whatever else is available in your environment. > On Mon, Feb 8, 2016 at 2:03 PM, Digimer <li...@alteeve.ca >

Re: [ClusterLabs] Antw: Re: Antw: Re: DLM fencing

2016-02-10 Thread Digimer
On 11/02/16 02:37 AM, Ulrich Windl wrote: >>>> Digimer <li...@alteeve.ca> wrote on 10.02.2016 at 17:32 in message > <56bb6637.6090...@alteeve.ca>: >> On 10/02/16 02:40 AM, Ulrich Windl wrote: > > [...] >>>> If fencing fails or is not con

Re: [ClusterLabs] Antw: Re: DLM fencing

2016-02-11 Thread Digimer
On 11/02/16 04:42 AM, Vladislav Bogdanov wrote: > 10.02.2016 19:32, Digimer wrote: > [snip] >> >> To be clear; DLM does NOT have its own fencing. It relies on the >> cluster's fencing. >> > > Actually, dlm4 can use fence-agents directly (device keywo

Re: [ClusterLabs] The cluster stack in Debian

2016-01-29 Thread Digimer

[ClusterLabs] Moving Anvil! and Striker development to this list

2016-01-25 Thread Digimer
us. :) If anyone has any comments or concerns about us moving our project discussion to this list, please let me know and I'll do what I can to make sure we address those concerns. Cheers! digimer 1. https://alteeve.ca/w/AN!Cluster_Tutorial_2 2. https://github.com/digimer/striker 3. https:

Re: [ClusterLabs] booth release v1.0

2016-03-18 Thread Digimer
opensuse.org/repositories/network:/ha-clustering:/Stable/ > > If you don't know what booth is and what is it good for, please > check the README at the bottom of the git repository home page: > > https://github.com/ClusterLabs/booth > > Cheers, > > Dejan Hey hey, con

Re: [ClusterLabs] GFS and cLVM fencing requirements with DLM

2016-03-15 Thread Digimer
nith >proxy) and leaving fencing fully to the resource manager (Pacemaker) Pacemaker's fencing will inform DLM when the node has been terminated. On EL6-based clusters, this is done via 'fence_pcmk' config in cman's cluster.conf (which simply asks pacemaker to do the fence and report back when succ

Re: [ClusterLabs] reproducible split brain

2016-03-18 Thread Digimer
On 16/03/16 04:04 PM, Christopher Harvey wrote: > On Wed, Mar 16, 2016, at 04:00 PM, Digimer wrote: >> On 16/03/16 03:59 PM, Christopher Harvey wrote: >>> I am able to create a split brain situation in corosync 1.1.13 using >>> iptables in a 3 node cluster. >>>

Re: [ClusterLabs] Antw: Re: reproducible split brain

2016-03-19 Thread Digimer

Re: [ClusterLabs] [DRBD-user] DRBD fencing issue on failover causes resource failure

2016-03-19 Thread Digimer
h (the former being ideal for production, the latter being easier to set up but more fragile, so only good for testing).

Re: [ClusterLabs] IPMI working but evacuations don't work

2016-03-31 Thread Digimer
o, please share the log files from the surviving node starting just before you crashed the node until a few minutes after.

Re: [ClusterLabs] Totem is unable to form a cluster because of an operating system or network fault

2016-04-12 Thread Digimer
> > 5402/udp closed unknown > > 5403/udp closed unknown > > 5404/udp closed unknown > > *5405/udp open|filtered unknown* > > MAC Address: 12:34:56:78:9A:BC (Unknown) > > > > Service detection performed. Please report any i

Re: [ClusterLabs] reproducible split brain

2016-03-19 Thread Digimer
mr 3 from the rest of the cluster and everything fails over normally, > so only a unidirectional failure causes problems. > > I don't have stonith enabled right now, and looking over the > pacemaker.log file closely to see if 4 and 5 would normally have fenced > 3, but I didn't see any

Re: [ClusterLabs] Antw: Re: reproducible split brain

2016-03-19 Thread Digimer
On 19/03/16 10:10 AM, Dennis Jacobfeuerborn wrote: > On 18.03.2016 00:50, Digimer wrote: >> On 17/03/16 07:30 PM, Christopher Harvey wrote: >>> On Thu, Mar 17, 2016, at 06:24 PM, Ken Gaillot wrote: >>>> On 03/17/2016 05:10 PM, Christopher Harvey wrote: >>&

Re: [ClusterLabs] Antw: Re: reproducible split brain

2016-03-19 Thread Digimer
The resource manager, pacemaker or rgmanager, cares about resources, so it is what cares about making smart decisions. As Ken pointed out, without fencing, it can never tell the difference between no access and dead peer. This is (again) why fencing is critical.

Re: [ClusterLabs] DRBD fencing issue on failover causes resource failure

2016-03-20 Thread Digimer
aster (score:INFINITY) (with-rsc-role:Master) > (id:colocation-drbd_fs-drbd_master-INFINITY) > > Resources Defaults: > resource-stickiness: 100 > failure-timeout: 60 > Operations Defaults: > No defaults set > > Cluster Properties: > cluster-infrastructure: c

Re: [ClusterLabs] [Announce] libqb 1.0 release

2016-04-01 Thread Digimer
.redhat.com Congratulations! An auspicious day to release, if any. ;)

Re: [ClusterLabs] DLM hanging when corosync is OK causes cluster to hang

2016-04-03 Thread Digimer
On 19/01/16 08:04 PM, Jan Pokorný wrote: > On 11/01/16 11:59 -0500, Digimer wrote: >> We hit a strange problem where a RAID controller on a node failed, >> causing DLM (gfs2/clvmd) to hang, but the node was never fenced. I >> assume this was because coros

Re: [ClusterLabs] Resource failure-timeout does not reset when resource fails to connect to both nodes

2016-03-28 Thread Digimer
l:service_failover):Stopped > > Failcounts for dmz1 > ha-d1.dev.com: 4 > ha-d2.dev.com: 4 > > Is there any way to automatically recover from this scenario, other than > setting an obnoxiously high migration-threshold? > > -- > > *Sam Gardner * > >

Re: [ClusterLabs] HA meetup at OpenStack Summit

2016-04-13 Thread Digimer
On 13/04/16 10:16 AM, Ken Gaillot wrote: > On 04/12/2016 06:39 PM, Digimer wrote: >> On 12/04/16 07:09 PM, Ken Gaillot wrote: >>> Hi everybody, >>> >>> The upcoming OpenStack Summit is April 25-29 in Austin, Texas (US). Some >>> regular ClusterLa

Re: [ClusterLabs] Simple Clarification's regarding pacemaker

2016-04-26 Thread Digimer
f mail (and when I say a lot I mean a LOT). > > Bye, With stonith enabled, a failed fence will leave the cluster hung, by design. The logic is that, as bad as a hung cluster is, it is better than risking a split-brain (which can lead to data loss / corruption, confused switches, etc). digim

Re: [ClusterLabs] Using pacemaker for manual failover only?

2016-05-23 Thread Digimer
tic failover permanently while still > allowing manual failover (with "pcs resource move" or with something else)? Setting aside the use-case for this... Ditch the HA stack, it's an avoidable complexity. Instead, just write a small shell script that drops the IP, stops nfs, unmounts th

Re: [ClusterLabs] install software in centos5

2016-05-11 Thread Digimer
ntOS 5, use cman + rgmanager. If you want to use pacemaker, use CentOS 7 (or at the very least, the latest 6 with the cman plugin).

Re: [ClusterLabs] availibility

2016-05-08 Thread Digimer
ago, so it is available. If I am assuming that you're asking about the Pacemaker project, yes it is also up to date and very actively supported and developed. The projects under the Clusterlabs umbrella are all available here: https://github.com/ClusterLabs

Re: [ClusterLabs] dropping ssh connection on failover

2016-04-15 Thread Digimer
nd forth but that is somewhat more complicated.
