Re: [ClusterLabs] RFC: allowing soft recovery attempts before ignore/block/etc.

2016-10-05 Thread Ken Gaillot
On 10/04/2016 05:34 PM, Andrew Beekhof wrote:
> 
> 
> On Wed, Oct 5, 2016 at 7:03 AM, Ken Gaillot wrote:
> 
> On 10/02/2016 10:02 PM, Andrew Beekhof wrote:
> >> Take a look at all of nagios' options for deciding when a failure
> >> becomes "real".
> >
> > I used to take a very hard line on this: if you don't want the cluster
> > to do anything about an error, don't tell us about it.
> > However I'm slowly changing my position... the reality is that many
> > people do want a heads up in advance and we have been forcing that
> > policy (when does an error become real) into the agents where one size
> > must fit all.
> >
> > So I'm now generally in favour of having the PE handle this "somehow".
> 
> Nagios is a useful comparison:
> 
> check_interval - like pacemaker's monitor interval
> 
> retry_interval - if a check returns failure, switch to this interval
> (i.e. check more frequently when trying to decide whether it's a "hard"
> failure)
> 
> max_check_attempts - if a check fails this many times in a row, it's a
> hard failure. Before this is reached, it's considered a soft failure.
> Nagios will call event handlers (comparable to pacemaker's alert agents)
> for both soft and hard failures (distinguishing the two). A service is
> also considered to have a "hard failure" if its host goes down.
> 
> high_flap_threshold/low_flap_threshold - a service is considered to be
> flapping when its percent of state changes (ok <-> not ok) in the last
> 21 checks (= max. 20 state changes) reaches high_flap_threshold, and
> stable again once the percentage drops to low_flap_threshold. To put it
> another way, a service that passes every monitor is 0% flapping, and a
> service that fails every other monitor is 100% flapping. With these,
> even if a service never reaches max_check_attempts failures in a row, an
> alert can be sent if it's repeatedly failing and recovering.
> 
> 
> makes sense.
> 
> since we're overhauling this functionality anyway, do you think we need
> to add an equivalent of retry_interval too?

It only makes sense if we switch to "in-a-row" failure counting, in
which case we'd need to add flap detection as well ... probably a bigger
project than desired right now :)

> >> If you clear failures after a success, you can't detect/recover a
> >> resource that is flapping.
> >
> > Ah, but you can if the thing you're clearing only applies to other
> > failures of the same action.
> > A completed start doesn't clear a previously failed monitor.
> 
> Nope -- a monitor can alternately succeed and fail repeatedly, and that
> indicates a problem, but wouldn't trip an "N-failures-in-a-row" system.
> 
> >> It only makes sense to escalate from ignore -> restart -> hard, so maybe
> >> something like:
> >>
> >>   op monitor ignore-fail=3 soft-fail=2 on-hard-fail=ban
> >>
> > I would favour something more concrete than 'soft' and 'hard' here.
> > Do they have a sufficiently obvious meaning outside of us developers?
> >
> > Perhaps (with or without a "failures-" prefix) :
> >
> >ignore-count
> >recover-count
> >escalation-policy
> 
> I think the "soft" vs "hard" terminology is somewhat familiar to
> sysadmins -- there's at least nagios, email (SPF failures and bounces),
> and ECC RAM. But throwing "ignore" into the mix does confuse things.
> 
> How about ... max-fail-ignore=3 max-fail-restart=2 fail-escalation=ban
> 
> 
> I could live with that :-)

OK, that will be the tentative plan, subject to further discussion of
course. There's a lot on the plate right now, so there's plenty of time
to refine it :)
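In the notation used earlier in the thread, the tentative plan would look
something like this (none of these options exist yet; names, defaults and
syntax are all subject to change):

  op monitor interval=30s max-fail-ignore=3 max-fail-restart=2 fail-escalation=ban

i.e. ignore the first 3 monitor failures, recover (restart) for the next 2,
then escalate by banning the resource from the node.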

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] stonithd/fenced filling up logs

2016-10-05 Thread Israel Brewster
On Oct 5, 2016, at 9:38 AM, Ken Gaillot  wrote:
> 
> On 10/05/2016 11:56 AM, Israel Brewster wrote
>> 
 I never did any specific configuring of CMAN. Perhaps that's the
 problem? I missed some configuration steps on setup? I just
 followed the
 directions
 here:
 http://jensd.be/156/linux/building-a-high-available-failover-cluster-with-pacemaker-corosync-pcs,
 which disabled stonith in pacemaker via the
 "pcs property set stonith-enabled=false" command. Is there
 separate CMAN
 configs I need to do to get everything copacetic? If so, can you
 point
 me to some sort of guide/tutorial for that?
> 
> If you ran "pcs cluster setup", it configured CMAN for you. Normally you
> don't need to modify those values, but you can see them in
> /etc/cluster/cluster.conf.

Good to know. So I'm probably OK on that front.

>> 
>> So in any case, I guess the next step here is to figure out how to do
>> fencing properly, using controllable power strips or the like. Back to
>> the drawing board!
> 
> It sounds like you're on the right track for fencing, but it may not be
> your best next step. Currently, your nodes are trying to fence each
> other endlessly, so if you get fencing working, one of them will
> succeed, and you just have a new problem. :-)
> 
> Check the logs for the earliest occurrence (after starting the cluster)
> of the "Requesting Pacemaker fence" message. Look back from that time in
> /var/log/messages, /var/log/cluster/*, and /var/log/pacemaker.log (not
> necessarily all will be present on your system) to try to figure out why
> it wants to fence.
> 
> One thing I noticed is that you're running CentOS 6.8, but your
> pacemaker version is 1.1.11. CentOS 6.8 shipped with 1.1.14, so maybe
> you partially upgraded your system from an earlier OS version? I'd try
> applying all updates (especially cman, libqb, corosync, and pacemaker).

I think what you're seeing is pacemaker on my primary DB server, which is 
still at CentOS 6.7. The other servers I've managed to update, but I haven't 
figured out a *good* HA solution for my DB server (PostgreSQL 9.4 running 
streaming replication with named replication slots). That is, I can fail over 
*relatively* easily (touch a file on the secondary, move the IP, and hope all 
the persistent DB connections reconnect without issue), but getting the demoted 
primary back up and running is more of a chore (the pg_rewind feature of 
PostgreSQL 9.5 looks to help with this, but I'm not up to 9.5 yet). As such, I 
haven't updated the primary DB server as much as some of the others.

Proper integration of the DB with pacemaker is something I need to look into 
again, but I took a stab at it when I was first setting up the application 
cluster, and didn't have much luck.

 Now if there is a version of fencing that simply
 e-mails/texts/whatever me and says "Ummm... something is wrong with
 that machine over there, you need to do something about it, because I
 can't guarantee operation otherwise", I could go for that. 
> 
> As digimer mentioned elsewhere, one variation is to use "fabric"
> fencing, i.e. cutting off all external access (disk and/or network) to
> the node. That leaves it up but unable to cause any trouble, so you can
> investigate.
> 
> If the disk is all local, or accessed over the network, then asking an
> intelligent switch to cut off network access is sufficient. If the disk
> is shared (e.g. iSCSI), then you need to cut it off, too.

All disks are local, which would simplify this option, especially considering 
that I don't have any remote power control options available at the moment. I 
mentioned getting switched PDU's to my boss, and he'll look into it, but thinks 
it might not fit into his budget. If I could simply down the proper ports on 
the Cisco switches the machines are connected to, that could be a viable 
alternative without needing any additional hardware.

Thanks!

---
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
---

> 
>>> No, that is not fencing.
>>> 
>>> -- 
>>> Digimer
>>> Papers and Projects: https://alteeve.ca/w/
>>> What if the cure for cancer is trapped in the mind of a person without
>>> access to education?
> 
> ___
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org

Re: [ClusterLabs] stonithd/fenced filling up logs

2016-10-05 Thread Ken Gaillot
On 10/05/2016 11:56 AM, Israel Brewster wrote:
>> On Oct 4, 2016, at 4:06 PM, Digimer wrote:
>>
>> On 04/10/16 07:50 PM, Israel Brewster wrote:
>>> On Oct 4, 2016, at 3:38 PM, Digimer wrote:

 On 04/10/16 07:09 PM, Israel Brewster wrote:
> On Oct 4, 2016, at 3:03 PM, Digimer wrote:
>>
>> On 04/10/16 06:50 PM, Israel Brewster wrote:
>>> On Oct 4, 2016, at 2:26 PM, Ken Gaillot wrote:

 On 10/04/2016 11:31 AM, Israel Brewster wrote:
> I sent this a week ago, but never got a response, so I'm sending it
> again in the hopes that it just slipped through the cracks. It
> seems to
> me that this should just be a simple mis-configuration on my part
> causing the issue, but I suppose it could be a bug as well.
>
> I have two two-node clusters set up using corosync/pacemaker on
> CentOS
> 6.8. One cluster is simply sharing an IP, while the other one has
> numerous services and IP's set up between the two machines in the
> cluster. Both appear to be working fine. However, I was poking
> around
> today, and I noticed that on the single IP cluster, corosync,
> stonithd,
> and fenced were using "significant" amounts of processing power
> - 25%
> for corosync on the current primary node, with fenced and
> stonithd often
> showing 1-2% (not horrible, but more than any other process).
> In looking
> at my logs, I see that they are dumping messages like the
> following to
> the messages log every second or two:
>
> Sep 27 08:51:50 fai-dbs1 stonith-ng[4851]:  warning:
> get_xpath_object:
> No match for //@st_delegate in /st-reply
> Sep 27 08:51:50 fai-dbs1 stonith-ng[4851]:   notice:
> remote_op_done:
> Operation reboot of fai-dbs1 by fai-dbs2 for
> stonith_admin.cman.15835@fai-dbs2.c5161517: No such device
> Sep 27 08:51:50 fai-dbs1 crmd[4855]:   notice:
> tengine_stonith_notify:
> Peer fai-dbs1 was not terminated (reboot) by fai-dbs2 for
> fai-dbs2: No
> such device (ref=c5161517-c0cc-42e5-ac11-1d55f7749b05) by client
> stonith_admin.cman.15835
> Sep 27 08:51:50 fai-dbs1 fence_pcmk[15393]: Requesting
> Pacemaker fence
> fai-dbs2 (reset)

 The above shows that CMAN is asking pacemaker to fence a node. Even
 though fencing is disabled in pacemaker itself, CMAN is
 configured to
 use pacemaker for fencing (fence_pcmk).
>>>
>>> I never did any specific configuring of CMAN. Perhaps that's the
>>> problem? I missed some configuration steps on setup? I just
>>> followed the
>>> directions
>>> here:
>>> http://jensd.be/156/linux/building-a-high-available-failover-cluster-with-pacemaker-corosync-pcs,
>>> which disabled stonith in pacemaker via the
>>> "pcs property set stonith-enabled=false" command. Is there
>>> separate CMAN
>>> configs I need to do to get everything copacetic? If so, can you
>>> point
>>> me to some sort of guide/tutorial for that?

If you ran "pcs cluster setup", it configured CMAN for you. Normally you
don't need to modify those values, but you can see them in
/etc/cluster/cluster.conf.
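For reference, the CMAN side of a pcs-generated setup normally just delegates
fencing to Pacemaker through fence_pcmk. A quick way to confirm that on a node
(a sketch only; the exact layout of a generated cluster.conf can vary):

  grep -A3 '<fence' /etc/cluster/cluster.conf
  # typically shows a per-node redirect method such as
  #   <method name="pcmk-redirect">
  #     <device name="pcmk" port="fai-dbs1"/>
  #   </method>
  # plus a <fencedevice agent="fence_pcmk" name="pcmk"/> entry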

>> Disabling stonith is not possible in cman, and very ill advised in
>> pacemaker. This is a mistake a lot of "tutorials" make when the author
>> doesn't understand the role of fencing.
>>
>> In your case, pcs set up cman to use the fence_pcmk "passthrough" fence
>> agent, as it should. So when something went wrong, corosync
>> detected it,
>> informed cman which then requested pacemaker to fence the peer. With
>> pacemaker not having stonith configured and enabled, it could do
>> nothing. So pacemaker returned that the fence failed and cman went
>> into
>> an infinite loop trying again and again to fence (as it should have).
>>
>> You must configure stonith (exactly how depends on your hardware),
>> then
>> enable stonith in pacemaker.
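The general shape of that with pcs is sketched below. The agent (fence_ipmilan
here) and its parameters depend entirely on your hardware, and every value is a
placeholder; use "pcs stonith describe <agent>" to see the real parameter list
for your version:

  pcs stonith create fence-dbs1 fence_ipmilan ipaddr=10.0.0.1 \
      login=admin passwd=secret pcmk_host_list=fai-dbs1
  # ...and a second device for the other node, then:
  pcs property set stonith-enabled=true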
>>
>
> Gotcha. There is nothing special about the hardware, it's just two
> physical boxes connected to the network. So I guess I've got a
> choice of either a) live with the logging/load situation (since the
> system does work perfectly as-is other than the excessive logging),
> or b) spend some time researching stonith to figure out what it
> does and how to configure it. Thanks for the pointers.

 The system is not working perfectly. Consider it like this: you're
 flying, and your landing gear is busted. You think everything is fine
 because you're not trying to land yet.
>>>

Re: [ClusterLabs] stonithd/fenced filling up logs

2016-10-05 Thread Dimitri Maziuk
On 10/05/2016 12:19 PM, Digimer wrote:

> Explain why this is a bad idea, because I don't see anything wrong with it.

My point exactly.
-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] stonithd/fenced filling up logs

2016-10-05 Thread Digimer
On 05/10/16 01:14 PM, Dimitri Maziuk wrote:
> On 10/05/2016 11:56 AM, Israel Brewster wrote:
> 
>> As you say, though, this is something I'll simply need to get over if I want 
>> real HA
> 
> The sad truth is making simple stupid stuff that Just Works(tm) is not
> cool. Making stuff that will run a cluster of 1001 randomly mixed
> active, somewhat-active, mostly-passive, etc. nodes, power-off anything
> it doesn't like, when that fails: fence it with the Lights-Out
> Management System Du Jour, when that fails: turn the power off at the
> networked PDUs... and bring you warmed-up slippers in the morning, now
> that's cool.

If you have "1001 randomly mixed ..." services, you might want to break
up your software into smaller clusters. Also, iLO, DRAC, iRMC, RSA...
They're all basically IPMI plus some vendor features. Not sure why you'd
refer to them as "System Du Jour"...

> And when you ask: if there's only two nodes and one can't talk to the
> other, how does it know that it's the other node and not itself that
> needs to be fenced? The "cool" developers answer: well, we just add a
> delay so they don't try to fence each other at the same time.
> 
> D'oh.
> 

Explain why this is a bad idea, because I don't see anything wrong with it.

> I think your problem is centos 6. Either switch to 7 or ditch pacemaker
> and go heartbeat in haresources mode + mon and a little perl scripting.
> I'm running both, the haresources version. I get about 1 instance of the
> scary split brain per 2 cluster/years and almost all of them are caused
> by me doing something stupid.

That is an insane recommendation. Heartbeat has been deprecated for many
years. There is no plan to restart development, either. Meanwhile,
CentOS/RHEL 6 is perfectly fine and stable and will be supported until
at least 2020.

https://alteeve.ca/w/History_of_HA_Clustering

"scary split brain per 2 cluster/years"

Split-brains are about the worst thing that can happen in HA. At the
very best, you lose your services. At worst, you corrupt your data. Why
risk that at all when fencing solves the problem perfectly fine?

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] stonithd/fenced filling up logs

2016-10-05 Thread Dimitri Maziuk
On 10/05/2016 12:14 PM, Dimitri Maziuk wrote:
...
> I'm running both, the haresources version. 

-- for years now and --

> I get about 1 instance of the
> scary split brain per 2 cluster/years and almost all of them are caused
> by me doing something stupid.


-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] stonithd/fenced filling up logs

2016-10-05 Thread Dimitri Maziuk
On 10/05/2016 11:56 AM, Israel Brewster wrote:

> As you say, though, this is something I'll simply need to get over if I want 
> real HA

The sad truth is making simple stupid stuff that Just Works(tm) is not
cool. Making stuff that will run a cluster of 1001 randomly mixed
active, somewhat-active, mostly-passive, etc. nodes, power-off anything
it doesn't like, when that fails: fence it with the Lights-Out
Management System Du Jour, when that fails: turn the power off at the
networked PDUs... and bring you warmed-up slippers in the morning, now
that's cool.

And when you ask: if there's only two nodes and one can't talk to the
other, how does it know that it's the other node and not itself that
needs to be fenced? The "cool" developers answer: well, we just add a
delay so they don't try to fence each other at the same time.

D'oh.


I think your problem is centos 6. Either switch to 7 or ditch pacemaker
and go heartbeat in haresources mode + mon and a little perl scripting.
I'm running both, the haresources version. I get about 1 instance of the
scary split brain per 2 cluster/years and almost all of them are caused
by me doing something stupid.
-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] stonithd/fenced filling up logs

2016-10-05 Thread Digimer
On 05/10/16 12:56 PM, Israel Brewster wrote:

>>> Yeah, I don't want that. If one of the nodes enters an unknown state,
>>> I want the system to notify me so I can decide the proper course of
>>> action - I don't want it to simply shut down the other machine or
>>> something.
>>
>> You do, actually. If a node isn't readily disposable, you need to
>> rethink your HA strategy. The service you're protecting is what matters,
>> not the machine hosting it at any particular time.
> 
> True. My hesitation, however, stems not from losing the machine without
> warning (the ability to do so without consequence being one of the major
> selling points of HA), but rather from losing the diagnostic
> opportunities presented *while* the machine is misbehaving. I'm
> borderline obsessive about knowing what went wrong and why; if the
> machine is shut down before I have a chance to see what state it is in,
> my chances of being able to figure out what happened greatly diminish.
> 
> As you say, though, this is something I'll simply need to get over if I
> want real HA (see below).

If that is the case, you can explore using "fabric fencing". Power
fencing is safer (from a human error perspective) and more popular
because it often returns the node to a working state and restores
redundancy faster.

However, from a pure safety perspective, all the cluster software cares
about is that the target can't provide cluster services. So in fabric
fencing, what happens is that the target isn't powered off, but instead
isolated from the world by severing its network links. I actually wrote
a POC using snmp + managed ethernet switches to provide "cheap" fencing
to people who didn't have IPMI or switched PDUs. Basically, when the
fence fired, it would log into the switches and turn down all the ports
used by the target.

The main concern here is that someone, or something, might restore
access without first clearing the node's state (usually via a reboot).
So I still don't recommend it, but if your concern about being able to
analyze the cause of the hang is strong enough, and if you take
appropriate care not to let the node back in without first rebooting it,
it is an option.
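One packaged option for this approach, if your fence-agents build includes it,
is fence_ifmib, which uses SNMP to administratively shut down switch ports. A
hedged sketch only -- the address, community string and port name are
placeholders, and parameter names can differ between versions (check
"pcs stonith describe fence_ifmib"):

  pcs stonith create fence-sw-node1 fence_ifmib ipaddr=10.0.0.254 \
      community=private port=Gi0/1 pcmk_host_list=node1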

>> Further, the whole role of pacemaker is to know what to do when things
>> go wrong (which you validate with plenty of creative failure testing
>> pre-production). A good HA system is one you won't touch for a long
>> time, possibly over a year. You don't want to be relying on rusty memory
>> for what to do while people are breathing down your neck because the
>> service is down.
> 
> True, although that argument would hold more weight if I worked for a
> company where everyone wasn't quite so nice :-) We've had outages before
> (one of the reasons I started looking at HA), and everyone was like
> "Well, we can't do our jobs without it, so please let us know when it's
> back up. Have a good day!"

I'm here to help with technical issues. Politics I leave to others. ;)

>> Trust the HA stack to do the right job, and validate that via testing.
> 
> Yeah, my testing is somewhat lacking. Probably contributes to my lack of
> trust.

My rule of thumb is that, *at a minimum*, you should allocate 2 days for
testing for each day you spend implementing. HA is worthless without
full and careful testing.

 This is also why I said that your hardware matters.
 Do your nodes have IPMI? (or iRMC, iLO, DRAC, RSA, etc)?
>>>
>>> I *might* have IPMI. I know my newer servers do. I'll have to check
>>> on that.
>>
>> You can tell from the CLI. I've got a section on how to locate and
>> configure IPMI from the command line here:
>>
>> https://alteeve.ca/w/AN!Cluster_Tutorial_2#What_is_IPMI
>>
>> It should port to most any distro/version.
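A quick sketch of that check, assuming the dmidecode and ipmitool packages are
installed (channel and device names are the usual defaults):

  dmidecode --type 38      # prints an "IPMI Device Information" block if a BMC is present
  modprobe ipmi_si         # load the IPMI drivers
  modprobe ipmi_devintf
  ipmitool lan print 1     # shows the BMC's network settings, if any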
> 
> Looks like I'm out-of-luck on the IPMI front. Neither my application
> servers nor my database servers have IPMI ports. I'll have to talk to my
> boss about getting controllable power strips or the like (unless there
> are better options than just cutting the power)

APC brand switched PDUs, like the AP7900 (or your country's version of it),
are excellent fence devices. Of course, if your servers have dual PSUs,
get dual PDUs. Alternatively, I know Raritan brand units work (I wrote an
agent for them).
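If you do go the switched-PDU route, the usual pairing is the fence_apc_snmp
(or fence_apc) agent. Again only a hedged sketch with placeholder values; check
"pcs stonith describe fence_apc_snmp" for the exact parameters your version
expects:

  pcs stonith create fence-pdu fence_apc_snmp ipaddr=10.0.0.200 \
      pcmk_host_map="node1:1;node2:2" power_wait=5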

>>> So where fencing comes in would be for the situations where one
>>> machine *thinks* the other is unavailable, perhaps due to a network
>>> issue, but in fact the other machine is still up and running, I
>>> guess? That would make sense, but the thought of software simply
>>> taking over and shutting down one of my machines, without even
>>> consulting me first, doesn't sit well with me at all. Even a restart
>>> would be annoying - I typically like to see if I can figure out what
>>> is going on before restarting, since restarting often eliminates the
>>> symptoms that help diagnose problems.
>>
>> That is a classic example, but not the only one. Perhaps the target is
>> hung, but might recover later? You just don't know, and not knowing i

[ClusterLabs] Live migration problem

2016-10-05 Thread Digimer
Hi all,

  I just spent a fair bit of time debugging a weird error, and now that
I've solved it, I wanted to share it on the list so that it is archived.
With luck, it will save someone else some heartache. No replies are
expected. :)

Environment:
* Anvil m2 (RHEL 6.8, cman+rgmanager+kvm+drbd+clvmd, fully updated)
* Guest VM OS - Win2012 R2 64-bit

  When I tried to live-migrate the server, rgmanager failed with:

[root@an-a07n02 ~]# clusvcadm -M Windows-Server-2012-R2 -m
an-a07n02.alteeve.ca
Trying to migrate service:Windows-Server-2012-R2 to
an-a07n02.alteeve.ca...Failed; service running on original owner

/var/log/messages showed:

Oct  4 19:15:05 an-a07n01 rgmanager[4213]: Migrating
vm:Windows-Server-2012-R2 to an-a07n02.alteeve.ca
Oct  4 19:15:41 an-a07n01 rgmanager[7588]: [vm] Migrate
Windows-Server-2012-R2 to an-a07n02.alteeve.ca failed:
Oct  4 19:15:41 an-a07n01 rgmanager[7610]: [vm] error: Unable to read
from monitor: Connection reset by peer
Oct  4 19:15:41 an-a07n01 rgmanager[4213]: migrate on vm
"Windows-Server-2012-R2" returned 150 (unspecified)
Oct  4 19:15:41 an-a07n01 rgmanager[4213]: Migration of
vm:Windows-Server-2012-R2 to an-a07n02.alteeve.ca failed; return code 150


I disabled the VM in rgmanager, manually booted it using virsh, and tried
to live migrate it directly. Note that I booted the server on node 2
fine and was trying to migrate from 2 -> 1. Note also that the
'--unsafe' flag is required because nodes using 4 KiB sector disks can't use
'cache="none"' in KVM/qemu (we set 'writethrough' instead, which is still safe).

[root@an-a07n02 ~]# virsh migrate --live Windows-Server-2012-R2
qemu+ssh://an-a07n01.alteeve.ca/system --unsafe
error: Unable to read from monitor: Connection reset by peer

In the qemu log file:


2016-10-05 16:11:19.948+: starting up
LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin QEMU_AUDIO_DRV=spice
/usr/libexec/qemu-kvm -name Windows-Server-2012-R2 -S -M rhel6.6.0 -cpu
SandyBridge,+erms,+smep,+fsgsbase,+pdpe1gb,+rdrand,+f16c,+osxsave,+dca,+pcid,+pdcm,+xtpr,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme
-enable-kvm -m 16384 -realtime mlock=off -smp
4,sockets=4,cores=1,threads=1 -uuid be69b994-0f70-ccf3-2934-43eb4a4b795b
-nodefconfig -nodefaults -chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/Windows-Server-2012-R2.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc
base=localtime,driftfix=slew -no-reboot -no-shutdown -device
ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x4.0x7 -device
ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x4
-device
ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x4.0x1
-device
ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x4.0x2
-device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive
file=/shared/files/Windows_2012_R2_64-bit_eval.iso,if=none,media=cdrom,id=drive-ide0-0-0,readonly=on,format=raw
-device
ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=2
-drive
file=/shared/files/virtio-win-0.1.102.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw
-device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0
-drive
file=/dev/an-a07n01_vg0/Windows-Server-2012-R2_0,if=none,id=drive-virtio-disk0,format=raw,cache=writethrough,aio=native
-device
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
-drive
file=/dev/an-a07n01_vg0/Windows-Server-2012-R2_1,if=none,id=drive-virtio-disk1,format=raw,cache=writethrough,aio=native
-device
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk1,id=virtio-disk1
-netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=27 -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:80:2d:e0,bus=pci.0,addr=0x3
-chardev pty,id=charserial0 -device
isa-serial,chardev=charserial0,id=serial0 -chardev
spicevmc,id=charchannel0,name=vdagent -device
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0
-device usb-tablet,id=input0 -spice
port=5900,addr=127.0.0.1,disable-ticketing,seamless-migration=on -vga
qxl -global qxl-vga.ram_size=67108864 -global qxl-vga.vram_size=67108864
-incoming tcp:[::]:49152 -device
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -msg timestamp=on
char device redirected to /dev/pts/0
Features 0x2250 unsupported. Allowed features: 0x71000454
qemu: warning: error while loading state for instance 0x0 of device
':00:06.0/virtio-blk'
load of migration failed
2016-10-05 16:11:31.503+: shutting down


The key here was "qemu: warning: error while loading state for instance
0x0 of device ':00:06.0/virtio-blk'".

There was precious little matching this on Google. I could see no
problems with the XML definition or the backing LVs (two on this VM; the
LVs are passed up raw to the guest).

Inside the guest OS, I could see no problems. I could, as mentioned
above, boot the server on both nodes, but I could not live migrate.

I got

Re: [ClusterLabs] stonithd/fenced filling up logs

2016-10-05 Thread Israel Brewster




Re: [ClusterLabs] [Problem] When a group resource does not stop in a trouble node, the movement of the group resource is started in other nodes.

2016-10-05 Thread renayama19661014
Hi All, 


Since Pacemaker 1.1.14, there may be a problem with the stop order of group 
resources. 
The problem occurs in clusters configured without STONITH. 

I can reproduce it with the following procedure. 

Step 1) Copy the Dummy resource agent to create Dummy1 and Dummy2 resources.
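A minimal sketch of this step, assuming the stock ocf:pacemaker:Dummy location:

  cp /usr/lib/ocf/resource.d/pacemaker/Dummy /usr/lib/ocf/resource.d/pacemaker/Dummy1
  cp /usr/lib/ocf/resource.d/pacemaker/Dummy /usr/lib/ocf/resource.d/pacemaker/Dummy2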

Step 2) Set up the cluster. 

[root@rh72-01 ~]# crm_mon -1 -Af
Stack: corosync
Current DC: rh72-02 (version 1.1.15-e174ec8) - partition with quorum
Last updated: Wed Oct  5 16:24:21 2016
Last change: Wed Oct  5 16:24:15 2016 by root via cibadmin on rh72-01

2 nodes and 2 resources configured

Online: [ rh72-01 rh72-02 ]

 Resource Group: grpDummy
     prmDummy1  (ocf::pacemaker:Dummy1):        Started rh72-01
     prmDummy2  (ocf::pacemaker:Dummy2):        Started rh72-01

Node Attributes:
* Node rh72-01:
* Node rh72-02:

Migration Summary:
* Node rh72-01:
* Node rh72-02:

Step 3) Inject a pseudo-failure into the stop action of Dummy2.
(snip)
dummy_stop() {
    return $OCF_ERR_GENERIC    # injected: always fail the stop action
    dummy_monitor
    if [ $? -eq $OCF_SUCCESS ]; then
        rm ${OCF_RESKEY_state}
    fi
    rm -f "${VERIFY_SERIALIZED_FILE}"
    return $OCF_SUCCESS
}
(snip) 

Step 4) Put node rh72-01 into standby. The stop of the Dummy2 resource fails, and 
the resource does not move. 

[root@rh72-01 ~]# crm_standby -N rh72-01 -v on
[root@rh72-01 ~]# crm_mon -1 -Af
Stack: corosync
Current DC: rh72-02 (version 1.1.15-e174ec8) - partition with quorum
Last updated: Wed Oct  5 16:27:49 2016
Last change: Wed Oct  5 16:27:47 2016 by root via crm_attribute on rh72-01

2 nodes and 2 resources configured

Node rh72-01: standby
Online: [ rh72-02 ]

 Resource Group: grpDummy
     prmDummy1  (ocf::pacemaker:Dummy1):        Started rh72-01
     prmDummy2  (ocf::pacemaker:Dummy2):        FAILED rh72-01 (blocked)

Node Attributes:
* Node rh72-01:
* Node rh72-02:

Migration Summary:
* Node rh72-01:
   prmDummy2: migration-threshold=1 fail-count=100 last-failure='Wed Oct  5 16:29:29 2016'
* Node rh72-02:

Failed Actions:
* prmDummy2_stop_0 on rh72-01 'unknown error' (1): call=15, status=complete,
exitreason='none', last-rc-change='Wed Oct  5 16:27:47 2016', queued=1ms, exec=34ms

Step 5) Clean up the Dummy2 resource. 

[root@rh72-01 ~]# crm_resource -C -r prmDummy2 -H rh72-01 -f
Cleaning up prmDummy2 on rh72-01, removing fail-count-prmDummy2
Waiting for 1 replies from the CRMd. OK

[root@rh72-01 ~]# crm_mon -1 -Af
Stack: corosync
Current DC: rh72-02 (version 1.1.15-e174ec8) - partition with quorum
Last updated: Wed Oct  5 16:30:55 2016
Last change: Wed Oct  5 16:30:53 2016 by hacluster via crmd on rh72-01

2 nodes and 2 resources configured

Node rh72-01: standby
Online: [ rh72-02 ]

 Resource Group: grpDummy
     prmDummy1  (ocf::pacemaker:Dummy1):        Started rh72-02
     prmDummy2  (ocf::pacemaker:Dummy2):        FAILED rh72-01 (blocked)

Node Attributes:
* Node rh72-01:
* Node rh72-02:

Migration Summary:
* Node rh72-01:
   prmDummy2: migration-threshold=1 fail-count=100 last-failure='Wed Oct  5 16:32:35 2016'
* Node rh72-02:

Failed Actions:
* prmDummy2_stop_0 on rh72-01 'unknown error' (1): call=23, status=complete,
exitreason='none', last-rc-change='Wed Oct  5 16:30:54 2016', queued=0ms, exec=35ms


The stop failure occurs again and the Dummy2 resource does not move, but the Dummy1 
resource moves to node rh72-02.

If not all resources of the group have stopped, the group should not move. 

The problem does not occur in Pacemaker 1.1.13. The probe_complete event was 
removed in Pacemaker 1.1.14.

I think the problem was introduced near the following commits:
 * https://github.com/ClusterLabs/pacemaker/commit/c1438ae489d791cc689625332b8ced21bfd4d143#diff-8e7ae81c93497126538c2a82fe183692
 * https://github.com/ClusterLabs/pacemaker/commit/8f76b782133857b40a583e947d743d45c7d05dc8#diff-8e7ae81c93497126538c2a82fe183692

I registered this problem with Bugzilla.
 * http://bugs.clusterlabs.org/show_bug.cgi?id=5301

Best Regards,
Hideo Yamauchi.

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] [Problem] When a group resource does not stop in a trouble node, the movement of the group resource is started in other nodes.

2016-10-05 Thread renayama19661014
Hi All,

Sorry...

The formatting of my email was garbled.
I will resend it shortly.

Best Regards,
Hideo Yamauchi.



- Original Message -
> From: "renayama19661...@ybb.ne.jp" 
> To: ClusterLabs-ML 
> Cc: 
> Date: 2016/10/5, Wed 23:25
> Subject: [ClusterLabs] [Problem] When a group resource does not stop in a 
> trouble node, the movement of the group resource is started in other nodes.
> 

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Antw: Re: Antw: Re: When the DC crmd is frozen, cluster decisions are delayed infinitely

2016-10-05 Thread renayama19661014
Hi All,

>> If a user uses sbd, can the cluster evade a problem of SIGSTOP of crmd?
> 
> As pointed out earlier, maybe crmd should feed a watchdog. Then stopping crmd 
> will reboot the node (unless the watchdog fails).


Thank you for the comment.

We will also examine a watchdog for crmd.
I will comment again once our examination has progressed.


Best Regards,
Hideo Yamauchi.



- Original Message -
> From: Ulrich Windl 
> To: users@clusterlabs.org; renayama19661...@ybb.ne.jp
> Cc: 
> Date: 2016/10/5, Wed 23:08
> Subject: Antw: Re: [ClusterLabs] Antw: Re: When the DC crmd is frozen, 
> cluster decisions are delayed infinitely
> 
   wrote on 21.09.2016 at 11:52 
> in message
> <876439.61305...@web200311.mail.ssk.yahoo.co.jp>:
>>  Hi All,
>> 
>>  Was the final conclusion given about this problem?
>> 
>>  If a user uses sbd, can the cluster evade a problem of SIGSTOP of crmd?
> 
> As pointed out earlier, maybe crmd should feed a watchdog. Then stopping crmd 
> will reboot the node (unless the watchdog fails).
> 
>> 
>>  We are interested in this problem, too.
>> 
>>  Best Regards,
>> 
>>  Hideo Yamauchi.
>> 
>> 
>>  ___
>>  Users mailing list: Users@clusterlabs.org 
>>  http://clusterlabs.org/mailman/listinfo/users 
>> 
>>  Project Home: http://www.clusterlabs.org 
>>  Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
>>  Bugs: http://bugs.clusterlabs.org 
> 

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Antw: Re: Antw: Re: When the DC crmd is frozen, cluster decisions are delayed infinitely

2016-10-05 Thread Ulrich Windl
>>>  wrote on 21.09.2016 at 11:52 in message
<876439.61305...@web200311.mail.ssk.yahoo.co.jp>:
> Hi All,
> 
> Was the final conclusion given about this problem?
> 
> If a user uses sbd, can the cluster evade a problem of SIGSTOP of crmd?

As pointed out earlier, maybe crmd should feed a watchdog. Then stopping crmd 
will reboot the node (unless the watchdog fails).
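As a practical aside, a minimal sketch of wiring a watchdog in via sbd on a
systemd-based node (assumes the sbd package is installed; softdog is only a
fallback when no hardware watchdog driver exists):

  modprobe softdog                     # or the proper hardware watchdog module
  # in /etc/sysconfig/sbd (path may differ per distro):
  #   SBD_WATCHDOG_DEV=/dev/watchdog
  systemctl enable sbd                 # started and stopped together with the cluster stack

Whether this actually covers a SIGSTOPped crmd depends on which daemon feeds
the watchdog, which is exactly the open question here.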

> 
> We are interested in this problem, too.
> 
> Best Regards,
> 
> Hideo Yamauchi.
> 
> 
> ___
> Users mailing list: Users@clusterlabs.org 
> http://clusterlabs.org/mailman/listinfo/users 
> 
> Project Home: http://www.clusterlabs.org 
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
> Bugs: http://bugs.clusterlabs.org 





___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Antw: Re: OCF_ERR_CONFIGURED (was: Virtual ip resource restarted on node with down network device)

2016-10-05 Thread Ulrich Windl
>>> Ken Gaillot wrote on 20.09.2016 at 16:43 in message
<51303130-9cc5-85d6-2f26-ff21b711a...@redhat.com>:
> On 09/20/2016 07:38 AM, Lars Ellenberg wrote:
>> From the point of view of the resource agent,
>> you configured it to use a non-existing network.
>> Which it considers to be a configuration error,
>> which is treated by pacemaker as
>> "don't try to restart anywhere
>> but let someone else configure it properly, first".
>> 
>> I think the OCF_ERR_CONFIGURED is good, though, otherwise 
>> configuration errors might go unnoticed for quite some time.
>> A network interface is not supposed to "vanish".
>> 
>> You may disagree with that choice,
> 
> This is a point we should settle in the upcoming changes to the OCF
> standard.
> 
> The OCF 1.0 standard
> (https://github.com/ClusterLabs/OCF-spec/blob/master/ra/resource-agent-api.md)
> merely says it means "Program is not configured". That is open to
> interpretation.
> 
> Pacemaker
> (http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#s-ocf-return-codes)
> has a more narrow view: "The resource's configuration is invalid. E.g.
> required parameters are missing."

I agree that OCF_ERR_CONFIGURED signals a configuration error that retrying 
without operator intervention cannot fix. However it may be node specific.

> 
> The reason Pacemaker considers it a fatal error is that it expects it to
> be returned only for an error in the resource agent's configuration *in
> the cluster*. If the cluster config is bad, it doesn't matter which node
> we try it on. For example, if an agent takes a parameter "frobble" with
> valid values from 1 to 10, and the user supplies "frobble=-1", that
> would be a configuration error.
> 
> I think in OCF 2.0 we should distinguish "supplied RA parameters are
> bad" from "service's configuration on this host is bad". Currently,
> Pacemaker expects the latter error to generate OCF_ERR_GENERIC,
> OCF_ERR_ARGS, OCF_ERR_PERM, or OCF_ERR_INSTALLED, which allows it to try
> the resource on another node.

IMHO OCF_ERR_INSTALLED is similar to the above, but means some software is missing 
or incompatible.


OCF_ERR_ARGS vs. OCF_ERR_CONFIGURED: with OCF_ERR_CONFIGURED the configuration may 
be valid by syntax but bad regarding the environment, whereas OCF_ERR_ARGS means it 
is invalid in all cases. OCF_ERR_GENERIC is "catch the rest", I guess.
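To make the distinction concrete, here is a hedged sketch of a validate-style
check in a hypothetical resource agent, following the Pacemaker interpretation
quoted above (the parameter names are made up for illustration, and the usual
ocf-shellfuncs are assumed to have been sourced so the OCF_* codes are defined):

  my_validate() {
      # Cluster-wide problem: a required parameter is missing entirely,
      # so no node can run the resource -> OCF_ERR_CONFIGURED
      [ -n "$OCF_RESKEY_ip" ] || return $OCF_ERR_CONFIGURED

      # Node-local problem: required software is missing on this host
      command -v ip >/dev/null 2>&1 || return $OCF_ERR_INSTALLED

      # Node-local problem: the configured NIC does not exist on this host,
      # so another node may still be able to run the resource -> OCF_ERR_ARGS
      ip link show "$OCF_RESKEY_nic" >/dev/null 2>&1 || return $OCF_ERR_ARGS

      return $OCF_SUCCESS
  }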

> 
> ___
> Users mailing list: Users@clusterlabs.org 
> http://clusterlabs.org/mailman/listinfo/users 
> 
> Project Home: http://www.clusterlabs.org 
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
> Bugs: http://bugs.clusterlabs.org 





___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Antw: Re: clustering fiber channel pools on multiple wwns

2016-10-05 Thread Ulrich Windl
>>> Andrei Borzenkov wrote on 18.09.2016 at 08:50 in
message <911e7925-511e-0555-2ee6-2da0ee7b1...@gmail.com>:
> 06.09.2016 16:45, Gabriele Bulfon wrote:
>> Hi, on illumos, I have a way to cluster one zfs pool on two nodes, by
>> moving ip,pool and its shares at once on the other node. This works
>> for iscsi too: the ip of the target has been migrated together with
>> the pool, so the iscsi resource is still there running on the same ip
>> (just a different node). Now I was thinking to do the same with fiber
>> channel: two nodes, each with its own qlogic fc connected to a fc
>> switch, with vmware clients with their fc cards connected on the same
>> switch. I can't see how I can do this with fc, because with iscsi I
>> can migrate the hosting IP, but with fc I can't migrate the hosting
>> wwn! What I need, is to tell vmware that the target volume may be
>> running on two different wwns, so a failing wwn should trigger retry
>> on the other wwn: the pool and shared volumes will be moving from one
>> wwn to the other. Am I dreaming?? Gabriele 
> 
> Use NPIV to define virtual HBAs that can then be migrated between
> servers. This also makes overall configuration more secure as you can
> zone only those servers that need access to these virtual HBAs, so they
> can see only own resources.

Having done that, I know that cleaning up the NPIV port can be tricky if
multipathing is in effect. If you remove the NPIV port too early, other things may
hang. Fortunately the days of kernel panics and oopses should be gone if you
add and remove multiple NPIV ports quickly ;-)

Actually, we were using NPIV as an additional safeguard against mounting the
same disk on multiple nodes. Now we use OCFS2, which does not have a problem
with that ;-)
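For anyone doing this on Linux, the NPIV virtual port itself is usually created
and torn down through sysfs. A hedged sketch (host number and WWPN:WWNN values
are placeholders; both the HBA driver and the switch must support NPIV):

  # create a virtual port on physical HBA host5
  echo '2101001b32a90001:2001001b32a90001' > /sys/class/fc_host/host5/vport_create
  # ...and remove it again when the resource is migrated away
  echo '2101001b32a90001:2001001b32a90001' > /sys/class/fc_host/host5/vport_delete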

Ulrich

> 
> See documentation for virtual ports in QLogic CLI (scli -vp). Your
> switch needs NPIV support too.
> 
> ___
> Users mailing list: Users@clusterlabs.org 
> http://clusterlabs.org/mailman/listinfo/users 
> 
> Project Home: http://www.clusterlabs.org 
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
> Bugs: http://bugs.clusterlabs.org 




___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] IPaddr2, interval between unsolicited ARP packets

2016-10-05 Thread 飯田 雄介
Hi, Hamaguchi-san

send_arp exists in two versions depending on the environment.
https://github.com/ClusterLabs/resource-agents/blob/master/tools/send_arp.libnet.c
https://github.com/ClusterLabs/resource-agents/blob/master/tools/send_arp.linux.c

The one included in your environment appears to be send_arp.linux.c.
send_arp.linux.c was made to accept the same options as 
send_arp.libnet.c.
However, -i and -p are not used in send_arp.linux.c.

send_arp.linux.c sends an ARP packet every second because it is driven by alarm(1).
https://github.com/ClusterLabs/resource-agents/blob/master/tools/send_arp.linux.c#L384
This interval cannot be changed because it is hard-coded.

Regards, Yusuke
> -Original Message-
> From: Shinjiro Hamaguchi [mailto:hamagu...@agile.ne.jp]
> Sent: Wednesday, October 05, 2016 1:57 PM
> To: users@clusterlabs.org
> Subject: Re: [ClusterLabs] IPaddr2, interval between unsolicited ARP packets
> 
> Matsushima-san
> 
> 
> Thank you very much for your reply.
> And sorry for late reply.
> 
> 
> >Do you get same result by executing the command manually with different
> parameters like this?
> I tried following command but same result (1sec interval)
> 
>  [command used to send unsolicited arp]
> /usr/libexec/heartbeat/send_arp -i 1500 -r 8 eth0 192.168.12.215 auto not_used
> not_used
> 
>  [result of tcpdump]
> 04:31:50.475928 ARP, Request who-has 192.168.12.215 (Broadcast) tell
> 192.168.12.215, length 28
> 04:31:51.476053 ARP, Request who-has 192.168.12.215 (Broadcast) tell
> 192.168.12.215, length 28
> 04:31:52.476146 ARP, Request who-has 192.168.12.215 (Broadcast) tell
> 192.168.12.215, length 28
> 04:31:53.476246 ARP, Request who-has 192.168.12.215 (Broadcast) tell
> 192.168.12.215, length 28
> 04:31:54.476287 ARP, Request who-has 192.168.12.215 (Broadcast) tell
> 192.168.12.215, length 28
> 04:31:55.476406 ARP, Request who-has 192.168.12.215 (Broadcast) tell
> 192.168.12.215, length 28
> 04:31:56.476448 ARP, Request who-has 192.168.12.215 (Broadcast) tell
> 192.168.12.215, length 28
> 04:31:57.476572 ARP, Request who-has 192.168.12.215 (Broadcast) tell
> 192.168.12.215, length 28
> 
> >Please also make sure the PID file has been created properly.
> When I checked manually with the send_arp command, I didn't use the
> "-p" option.
> 
> Even when I failed over the IPaddr2 resource (rather than executing send_arp
> manually), I couldn't see a pid file generated in /var/run/resource-agents/.
> I used the following command to check whether a pid file was generated.
> 
> watch -n0.1 "ls -la /var/run/resource-agents/"
> 
> 
> Thank you in advance.
> 
> 
> On Wed, Oct 5, 2016 at 12:20 PM, Digimer  wrote:
> 
> 
> 
> 
> 
>    Forwarded Message 
>   Subject: Re: [ClusterLabs] IPaddr2, interval between unsolicited ARP
> packets
>   Date: Tue, 4 Oct 2016 11:18:37 +0900
>   From: Takehiro Matsushima 
>   Reply-To: Cluster Labs - All topics related to open-source clustering
>   welcomed 
>   To: Cluster Labs - All topics related to open-source clustering welcomed
>   
> 
>   Hello Hamaguchi-san,
> 
>   Do you get same result by executing the command manually with
>   different parameters like this?
>   #/usr/libexec/heartbeat/send_arp -i 1500 -r 8 -p
>   /var/run/resource-agents/send_arp-192.168.12.215 eth0
> 192.168.12.215
>   auto not_used not_used
> 
>   Please also make sure the PID file has been created properly.
> 
>   Thank you,
> 
>   Takehiro MATSUSHIMA
> 
>   2016-10-03 14:45 GMT+09:00 Shinjiro Hamaguchi
> :
>   > Hello everyone!!
>   >
>   >
>   > I'm using IPaddr2 for VIP.
>   >
>   > In the IPaddr2 document, it say interval between unsolicited ARP
> packets is
>   > default 200msec and can change it using option "-i", but when i check
> with
>   > tcpdump, it looks like sending arp every 1000msec fixed.
>   >
>   > Does someone have any idea ?
>   >
>   > Thank you in advance.
>   >
>   >
>   > [environment]
>   > kvm, centOS 6.8
>   > pacemaker-1.1.14-8.el6_8.1.x86_64
>   > cman-3.0.12.1-78.el6.x86_64
>   > resource-agents-3.9.5-34.el6_8.2.x86_64
>   >
>   >
>   > [command used to send unsolicited arp]
>   > NOTE: i got this command from /var/log/cluster/corosync.log
>   > #/usr/libexec/heartbeat/send_arp -i 200 -r 5 -p
>   > /var/run/resource-agents/send_arp-192.168.12.215 eth0
> 192.168.12.215 auto
>   > not_used not_used
>   >
>   > [result of tcpdump]
>   >
>   > #tcpdump arp
>   >
>   > tcpdump: verbose output suppressed, use -v or -vv for full protocol
> decode
>   >
>   > listening on eth0, link-type EN10MB (Ethernet), capture size 65535
> bytes
>   >
>   > 05:28:17.267296 ARP, Request who-has 192.168.12.215 (Broadcast) tell
>   > 192.168.12.215, length 28
>   >
>   > 05:28:18.267519 ARP, Request who-has 192.168.12.215 (B

Re: [ClusterLabs] Corosync ring shown faulty between healthy nodes & networks (rrp_mode: passive)

2016-10-05 Thread Jan Friesse

Martin,


Hello all,

I am trying to understand the following 2 Corosync heartbeat-ring failure scenarios
I have been testing, and I hope somebody can explain why the results make sense.


Consider the following cluster:

 * 3x Nodes: A, B and C
 * 2x NICs for each Node
 * Corosync 2.3.5 configured with "rrp_mode: passive" and
   udpu transport with ring id 0 and 1 on each node.
 * On each node "corosync-cfgtool -s" shows:
 [...] ring 0 active with no faults
 [...] ring 1 active with no faults


Consider the following scenarios:

 1. On node A only block all communication on the first NIC  configured with
ring id 0
 2. On node A only block all communication on all   NICs configured with
ring id 0 and 1


The result of the above scenarios is as follows:

 1. Nodes A, B and C (!) display the following ring status:
 [...] Marking ringid 0 interface  FAULTY
 [...] ring 1 active with no faults
 2. Node A is shown as OFFLINE - B and C display the following ring status:
 [...] ring 0 active with no faults
 [...] ring 1 active with no faults


Questions:
 1. Is this the expected outcome ?


Yes


 2. In experiment 1. B and C can still communicate with each other over both
NICs, so why are
B and C not displaying a "no faults" status for ring id 0 and 1 just 
like
in experiment 2.


Because this is how RRP works. RRP marks the whole ring as failed, so every 
node sees that ring as failed.
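On the practical side, if the FAULTY flag persists after the underlying NIC
problem has been fixed, it can be cleared cluster-wide with the standard tool
(a quick sketch):

  corosync-cfgtool -r   # re-enable redundant ring operation after a fault
  corosync-cfgtool -s   # verify both rings report "no faults" again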



when node A is completely unreachable ?


Because it's a different scenario. In scenario 1 there is a 3-node 
membership where one node has a failed ring, so the whole ring is marked 
failed. In scenario 2 there is a 2-node membership where both rings 
work as expected. Node A is completely unreachable and is not in the 
membership.


Regards,
  Honza




Regards,
Martin Schlegel

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org




___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org