aned files.
--
Eric Robinson
-Original Message-
From: Ken Gaillot [mailto:kgail...@redhat.com]
Sent: Monday, July 25, 2016 7:52 AM
To: users@clusterlabs.org
Cc: Eric Robinson <eric.robin...@psmnv.com>
Subject: Re: [ClusterLabs] After Startup, Pacemaker Gasps and Dies
On 07/23/201
--
Eric Robinson
Chief Information Officer
Physician Select Management, LLC
775.885.2211 x 112
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users
Project Home: http://www.clusterlabs.org
Getting started: http
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Indeed. My mistake.
--
Eric Robinson
-Original Message-
From: Ulrich Windl [mailto:ulrich.wi...@rz.uni-regensburg.de]
Sent: Friday, January 20, 2017 4:25 AM
To: users@clusterlabs.org
Subject: [ClusterLabs] Antw: Re: Antw: Colocations and Orders Syntax Changed?
>>> Eric
Thanks for the input. I usually just do a 'crm config show >
myfile.xml.date_time' and then read it back in if I need to.
--
Eric Robinson
> -Original Message-
> From: Ulrich Windl [mailto:ulrich.wi...@rz.uni-regensburg.de]
> Sent: Thursday, January 19, 2017 12:04 AM
o they can each be started and stopped without hurting anything.
--
Eric Robinson
and refuses to join the cluster, notifying operators. Later,
operators manually resolve the split brain.
There is no perfect solution, of course, but it seems to me that this simple
approach provides a level of availability beyond what you would normally get
with a 2-node cluster. What am I missin
I'm mostly interested in preventing false-positive cluster failovers that might
occur during manual network maintenance (for example, testing switch and link
outages).
>> Thanks for the clarification. So what's the easiest way to ensure that the
>> cluster waits a
>> desired timeout before
Basically, when we turn off a switch, I want to keep the cluster from failing
over before Linux bonding has had a chance to recover.
I'm mostly interested in preventing false-positive cluster failovers that might
occur during manual network maintenance (for example, testing switch and link
but I have not seen this approach anywhere. Maybe there's a good reason
for that because it simply won't work? The arbitration solutions I have seen
all rely on a third machine that plays a complex role in arbitration.
Thoughts?
--
Eric Robinson
failovers. In other words, if there is a link or
switch failure, I want to make sure that the cluster allows plenty of time for
link communication to recover before deciding that a node has actually died.
--
Eric Robinson
control the delay with arp_ip_target?
--
Eric Robinson
Does anyone know how many arp_intervals must pass without a reply before the
bonding driver downs the primary NIC? Just one?
--
Eric Robinson
.
--
Eric Robinson
tering welcomed
Subject: Re: [ClusterLabs] Can Bonding Cause a Broadcast Storm?
What bonding mode are you using? Some modes require additional configuration
from the switch to avoid flooding. Also, is spanning tree enabled on the
switches?
On Tue, Nov 15, 2016 at 1:26 PM Eric Robinson
<e
> AFAIK, if _all_ ARP targets fail to respond _once_, the link will be
> considered down
It would be great if someone could confirm that.
> after "Down Delay". I guess you want to use multiple (and the correct ones)
> ARP IP targets...
Yes, I use multiple targets, and arp_all_targets=any.
bd0 and p_vip_clust19 are getting the Master designation.
--
Eric Robinson
e that much. I just want to make sure people in the list are not
getting alerts that my mails are fraudulent.
--
Eric Robinson
> -Original Message-
> From: Digimer [mailto:li...@alteeve.ca]
> Sent: Sunday, April 16, 2017 11:17 AM
> To: Cluster Labs - All topics related to open-source clustering welcomed
> <users@clusterlabs.org>; Eric Robinson <eric.robin...@psmnv.com>
> Subject: Re
> In a shared-nothing cluster, "split brain" means whichever MAC address
> is in ARP cache of the border router is the one that gets the traffic.
> How does the existing code figure this one out?
I'm guessing the surviving node broadcasts a gratuitous arp reply.
ike my question was well-timed, as it served as a catalyst for you to
write the article. Thanks much, I am working through it now and will doubtless
have some questions and comments. Before I say anything more, I want to do some
testing in my lab to make sure I have my thoughts collected.
-
Somebody want to look at this log and tell me why the cluster failed over? All
we did was add a new resource. We've done it many times before without any
problems.
--
Apr 03 08:50:30 [22762] ha14acib: info: cib_process_request:
Forwarding cib_apply_diff operation for
> I've received your emails without any alteration or flagging as "fraud".
> So I don't think we're doing anything to your emails.
Good to know.
--
Eric Robinson
>> You guys got a thing against Office 365?
> doesn't everybody?
Fair enough.
--
Eric Robinson
> On a serious note, I too received your e-mails without any red flags attached.
Thanks for the confirmation. I guess I'm the only one seeing those warnings.
Maybe Office 365 has a problem with ClusterLabs. ;-)
--
Eric Robinson
1) iotop did not show any significant io, just maybe 30k/second of drbd traffic.
2) okay. I've never done that before. I'll give it a shot.
3) I'm not sure what I'm looking at there.
--
Eric Robinson
> -Original Message-
> From: Ulrich Windl [mailto:ulrich.wi...@rz.uni-regensb
ges!
--
Eric Robinson
of 128MB. Creating an ext4 filesystem on it and trimming only
took 1.5 minutes (across multiple tests).
Somebody knowledgeable may be able to explain how DISC-MAX affects the trim
speed, and why the DISC-MAX value is different when creating the array with
mdadm versus lvm.
--
Eric Robinson
bug?
--
Eric Robinson
Yeah, UpToDate was not of concern to me. The part that threw me off was
"done:100.00." It did eventually finish, though, and that was shown in the
dmesg output. However, 'drbdadm status' said "done:100.00" the whole time,
from start to finish, which seems weird.
ation at ClusterLabs misleading?
--
Eric Robinson
. If there is
anything I can do to assist with getting the documentation cleaned up, I'd be
more than glad to help.
--
Eric Robinson
-Original Message-
From: Ken Gaillot [mailto:kgail...@redhat.com]
Sent: Tuesday, August 22, 2017 2:08 PM
To: Cluster Labs - All topics related to open-source
> > Out of curiosity, what did I say that indicates that we're not using
> > fencing?
> >
>
> Same place you said you were new to HA and needed to learn corosync and
> pacemaker to use OpenBSD.
>
I must have misspoken. I said I stopped using OpenBSD back around the year 2000
and switched to
> > I must have misspoken.
>
> No, I had invisible tags all over my last two messages.
Haha, okay. Thought I was going nuts for a moment.
--Eric
> > Out of curiosity, do the openSUSE Leap repos and packages work with
> SLES?
>
> I know that there are some base system differences that could cause
> problems, things like Leap using systemd/journald for logging while SLES is
> still logging via syslog-ng (IIRC)... so it's possible that you
> Jokes (?) aside; Red Hat and SUSE both have paid teams that make sure the
> HA software works well. So if you're new to HA, I strongly recommend
> sticking with one of those two, and SUSE is what you mentioned. If you really
> want to go to BSD or something else, I would recommend learning HA on
> > Also, use fencing. Seriously, just do it.
>
> Yeah. Fencing is the only bit that's missing from this picture.
>
Out of curiosity, what did I say that indicates that we're not using fencing?
--Eric
> > I can understand how SUSE can charge for support, but not for the
> software itself. Corosync, Pacemaker, and DRBD are all open source.
>
> So why don't you download the open source and compile it yourself?
>
I've done that before and I could if necessary. Rather go with the easiest
option
itself. Corosync, Pacemaker, and DRBD are all open source.
--
Eric Robinson
"High
Availability Extension," which I must pay $700/year for? No freaking way!
This is Linux we're talking about, right? There's got to be an easy way to
install the cluster without paying for a subscription... right?
Someone talk me off the ledge here.
--
Eri
The license would be GPL, I suppose, whatever enthusiasts and community
contributors usually do. And yes, it would be fun to know I contributed
something to the repo.
--
Eric Robinson
> -Original Message-
> From: Kristoffer Grönlund [mailto:kgronl...@suse.com]
> Sen
Forgot to mention that it's called AZaddr and is intended to be dependent on
IPaddr2 (or vice versa) and live in /usr/lib/ocf/resource.d/heartbeat.
--
Eric Robinson
From: Eric Robinson [mailto:eric.robin...@psmnv.com]
Sent: Friday, September 15, 2017 3:56 PM
To: Cluster Labs - All topics
Greetings, all --
If anyone's interested, I wrote a resource agent that works with Microsoft
Azure. I'm no expert at shell scripting, so I'm certain it needs a great deal
of improvement, but I've done some testing and it works with a 2-node cluster
in my Azure environment. Offhand, I don't
ing the DRBD layer and writing directly to the
drives, so we must conclude that DRBD has a data corruption bug under high
write load. However, we would be more than happy to be proved wrong.
--
Eric Robinson
> I don't know the tool, but isn't the expectation a bit high that the tool
> will trim
> the correct blocks through drbd->LVM/mdadm->device? Why not use the tool
> on the affected devices directly?
>
I did, and the corruption did not occur. It only happened when writing through
the DRBD
timestamp: on
logger_subsys {
subsys: AMF
debug: off
}
}
I used tcpdump and I see a lot of traffic between them on port 2224, but
nothing else.
Is there an issue because the bindnetaddr is 172.28.0.0 but the members have a
/23 mask?
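
One way to sanity-check this question is to compute the network address that a /23 mask yields for each member. The sketch below is only an illustration: the member addresses are hypothetical (the post does not list them), and `bindnetaddr_for` is a helper name made up for this example. As far as I understand the corosync.conf documentation, bindnetaddr may be given as the network address of the interface, so the value to compare against is the one printed here.

```python
import ipaddress


def bindnetaddr_for(member_ip, prefix_len):
    """Return the network address of the subnet that contains member_ip."""
    iface = ipaddress.ip_interface(f"{member_ip}/{prefix_len}")
    return str(iface.network.network_address)


# Hypothetical member addresses; the original post does not give them.
members = ["172.28.0.64", "172.28.1.65"]

for ip in members:
    # With a /23 mask, both halves of 172.28.0.0-172.28.1.255 share one network.
    print(ip, "->", bindnetaddr_for(ip, 23))
```

If every member maps to 172.28.0.0, the mask by itself is probably not the problem and the cause lies elsewhere.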
--
Eric Robinson
ent on it. That might work?
--
Eric Robinson
around
this limitation?
--
Eric Robinson
/Azure parameters, and if those are configured, it would do the
appropriate API requests.
On Thu, 2017-08-24 at 23:27 +, Eric Robinson wrote:
> Leon -- I will pay you one trillion samolians for that resource agent!
> Any way we can get our hands on a copy?
>
>
>
> --
> Eric
Oh, okay. I thought you meant some different ones.
--
Eric Robinson
Chief Information Officer
Physician Select Management, LLC
775.885.2211 x 112
-Original Message-
From: Kristoffer Grönlund [mailto:kgronl...@suse.com]
Sent: Friday, August 25, 2017 9:56 AM
To: Eric Robinson <eric.ro
there.
--
Eric Robinson
> -Original Message-
> From: Oyvind Albrigtsen [mailto:oalbr...@redhat.com]
> Sent: Friday, August 25, 2017 12:17 AM
> To: Cluster Labs - All topics related to open-source clustering welcomed
> <users@clusterlabs.org>
> Subject: Re: [Cluster
o corosync.conf seem to be working.
--
Eric Robinson
-Original Message-
From: Jan Friesse [mailto:jfrie...@redhat.com]
Sent: Tuesday, August 22, 2017 11:52 PM
To: Cluster Labs - All topics related to open-source clustering welcomed
<users@clusterlabs.org>; kgail...@redhat.com
I figured out the cause. CMAN got installed by yum, and so none of my changes
to corosync.conf had any effect, including the udpu directive. Now I just have
to figure out how to enable unicast in cman.
--
Eric Robinson
From: Eric Robinson [mailto:eric.robin...@psmnv.com]
Sent: Wednesday
I got it.
From: Eric Robinson [mailto:eric.robin...@psmnv.com]
Sent: Wednesday, August 23, 2017 6:51 PM
To: Cluster Labs - All topics related to open-source clustering welcomed
<users@clusterlabs.org>
Subject: Re: [ClusterLabs] Is there a Trick to Making Corosync Work on
Microsoft Azu
I'm sure someone has seen this before. What does it mean?
ha11a:~ # drbdmanage init 198.51.100.65
You are going to initialize a new drbdmanage cluster.
CAUTION! Note that:
* Any previous drbdmanage cluster information may be removed
* Any remaining resources managed by a previous drbdmanage
I installed corosync 2.4.3 and pacemaker 1.1.17 from the openSUSE Leap 42.3
repos, but I can't find pcs or pcsd. Anybody know where to download them from?
--Eric
To: Cluster Labs - All topics related to open-source clustering welcomed
<users@clusterlabs.org>; Eric Robinson <eric.robin...@psmnv.com>
Subject: Re: [ClusterLabs] Where to Find pcs and pcsd for OpenSUSE LEAP 4.23
Hi,
On 11/07/2017 05:35 AM, Eric Robinson wrote:
I installed co
> Which aspects of its constraints handling do you like, and why? I'm curious,
> since I wasn't aware that it was significantly different from crmsh in this
> respect.
>
Well, to be fair, in the past I have always configured my clusters by using
'crm configure edit' and building the config in
> > I sent this to the drbd list too, but it’s possible that someone here
> > may know.
> >
> >
> >
> > This is a WEIRD one.
> >
> >
> >
> > Why would one drbd volume be trimmable and the other one not?
> >
>
> iirc drbd stores some of the config in the meta-data as well - like e.g. some
>
I sent this to the drbd list too, but it's possible that someone here may know.
This is a WEIRD one.
Why would one drbd volume be trimmable and the other one not?
Here you can see me issuing the trim command against two different filesystems.
It works on one but fails on the other.
ha11a:~ #
General question. I tried to set up a cman + corosync + pacemaker cluster using
two corosync rings. When I start the cluster, everything works fine, except
when I do a 'corosync-cfgtool -s' it only shows one ring. I tried manually
editing the /etc/cluster/cluster.conf file adding two sections,
Thanks for the suggestion everyone. I'll give that a try.
> -Original Message-
> From: Jan Friesse [mailto:jfrie...@redhat.com]
> Sent: Monday, February 12, 2018 8:49 AM
> To: Cluster Labs - All topics related to open-source clustering welcomed
> <users@clusterlabs.o
> > Thanks for the suggestion everyone. I'll give that a try.
>
> Sorry, I'm late on this, but I wrote a quick start doc describing this (amongst
> other things) some time ago. See the following chapter:
>
> https://clusterlabs.github.io/PAF/Quick_Start-CentOS-6.html#cluster-
> creation
>
I
I have what seems to be a healthy cluster, but I can't get resources to move.
Here's what's installed...
[root@001db01a cluster]# yum list installed|egrep "pacem|coro"
corosync.x86_64 2.4.3-2.el7_5.1 @updates
corosynclib.x86_64 2.4.3-2.el7_5.1
Move?
>
> On Wed, 2018-08-01 at 03:49 +, Eric Robinson wrote:
> > I have what seems to be a healthy cluster, but I can’t get resources
> > to move.
> >
> > Here’s what’s installed…
> >
> > [root@001db01a cluster]# yum list installed|egrep "pacem|co
> > The message likely came from the resource agent calling crm_attribute
> > to set a node attribute. That message usually means the cluster isn't
> > running on that node, so it's highly suspect. The cib might have
> > crashed, which should be in the log as well. I'd look into that first.
>
>
> Hi!
>
> I'm not familiar with Red Hat, but is this normal?
>
> > > corosync: active/disabled
> > > pacemaker: active/disabled
>
> Regards,
> Ulrich
That's the default after a new install. I had not enabled them to start
automatically yet.
>
I don't understand why a problem with a resource causes other resources above
it in the dependency stack (or on the same level with it) to fail over.
My dependency stack is:
drbd -> filesystem -> floating_ip -> Azure virtual IP
|
->
The corosync log shows different times for lrmd messages than for cib or crmd
messages. Note the 4 hour difference. What?
Aug 20 13:08:27 [107884] 001store01acib: info: cib_perform_op:
+
> Hi!
>
> I could guess that the processes run with different timezone settings (for
> whatever reason).
>
> Regards,
> Ulrich
That would be my guess, too, but I cannot imagine how they ended up in that
condition.
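
To illustrate the guess above, here is a small sketch of how one instant ends up with two different wall-clock stamps when two processes format it under different timezone settings. The offsets (UTC and UTC-4) are assumptions chosen to reproduce a 4 hour gap; nothing in the thread confirms which zones the daemons actually used, and `render` is a made-up helper.

```python
from datetime import datetime, timedelta, timezone


def render(epoch_seconds, tz):
    """Format an absolute instant as wall-clock time in the given timezone."""
    return datetime.fromtimestamp(epoch_seconds, tz).strftime("%H:%M:%S")


# One absolute instant, expressed as seconds since the epoch.
instant = datetime(2018, 8, 20, 17, 8, 27, tzinfo=timezone.utc).timestamp()

utc = timezone.utc                      # assumed setting of one process
minus4 = timezone(timedelta(hours=-4))  # assumed setting of the other

print(render(instant, utc))     # 17:08:27
print(render(instant, minus4))  # 13:08:27
```

Comparing the TZ environment variable of each running daemon would be one way to confirm or rule this out.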
>
> >>> Eric Robinson wrote on 21.0
> -Original Message-
> From: Users On Behalf Of Jan Pokorný
> Sent: Tuesday, August 21, 2018 2:45 AM
> To: users@clusterlabs.org
> Subject: Re: [ClusterLabs] Different Times in the Corosync Log?
>
> On 21/08/18 08:43 +, Eric Robinson wrote:
> >> I coul
configuration).
>
> If you figure this out, I'd love to hear what it was. Gremlins ...
You'll be the second to know after me!
>
> On Tue, 2018-08-21 at 11:45 +0200, Jan Pokorný wrote:
> > On 21/08/18 08:43 +, Eric Robinson wrote:
> > > > I could guess that the proc
I have a few corosync+pacemaker clusters in Azure. Occasionally, cluster nodes
fail over, possibly because of intermittent connectivity loss, but more likely
because one or more nodes experience high load and are not able to respond in a
timely fashion. I want to make the clusters a little more
> -Original Message-
> From: Jan Friesse
> Sent: Sunday, January 20, 2019 11:57 PM
> To: Cluster Labs - All topics related to open-source clustering welcomed
> ; Eric Robinson
> Subject: Re: [ClusterLabs] Increasing Token Timeout Safe By Itself?
>
> Eric Robins
> -Original Message-
> From: Users On Behalf Of Andrei
> Borzenkov
> Sent: Wednesday, February 20, 2019 8:51 PM
> To: users@clusterlabs.org
> Subject: Re: [ClusterLabs] Antw: Re: Why Do All The Services Go Down When
> Just One Fails?
>
> 20.02.2019
> -Original Message-
> From: Users On Behalf Of Ulrich Windl
> Sent: Tuesday, February 19, 2019 11:35 PM
> To: users@clusterlabs.org
> Subject: [ClusterLabs] Antw: Re: Why Do All The Services Go Down When
> Just One Fails?
>
> >>>
> -Original Message-
> From: Users On Behalf Of Ken Gaillot
> Sent: Friday, February 22, 2019 5:06 PM
> To: Cluster Labs - All topics related to open-source clustering welcomed
>
> Subject: Re: [ClusterLabs] Simulate Failure Behavior
>
> On Sat, 2019-02-23 at 00
I want to mess around with different on-fail options and see how the cluster
responds. I'm looking through the documentation, but I don't see a way to
simulate resource failure and observe behavior without actually failing over
the node. Isn't there a way to have the cluster MODEL failure and
cluster-name: 001db01ab
dc-version: 1.1.18-11.el7_5.3-2b07d5c5a9
have-watchdog: false
last-lrm-refresh: 1550347798
maintenance-mode: false
no-quorum-policy: ignore
stonith-enabled: false
--Eric
From: Users On Behalf Of Eric Robinson
Sent: Saturday, February 16, 2019 12:34 PM
To: Cluster Labs - All
Here are the relevant corosync logs.
It appears that the stop action for resource p_mysql_002 failed, and that
caused a cascading series of service changes. However, I don't understand why,
since no other resources are dependent on p_mysql_002.
[root@001db01a cluster]# cat
?
> -Original Message-
> From: Users On Behalf Of Andrei
> Borzenkov
> Sent: Saturday, February 16, 2019 1:34 PM
> To: users@clusterlabs.org
> Subject: Re: [ClusterLabs] Why Do All The Services Go Down When Just One
> Fails?
>
> 17.02.2019 0:03, Eric Robinson wrote
> On Sat, Feb 16, 2019 at 09:33:42PM +0000, Eric Robinson wrote:
> > I just noticed that. I also noticed that the lsb init script has a
> > hard-coded stop timeout of 30 seconds. So if the init script waits
> > longer than the cluster resource timeout of 15s, that would ca
These are the resources on our cluster.
[root@001db01a ~]# pcs status
Cluster name: 001db01ab
Stack: corosync
Current DC: 001db01a (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with
quorum
Last updated: Sat Feb 16 15:24:55 2019
Last change: Sat Feb 16 15:10:21 2019 by root via cibadmin on
> -Original Message-
> From: Users On Behalf Of Valentin Vidic
> Sent: Saturday, February 16, 2019 1:28 PM
> To: users@clusterlabs.org
> Subject: Re: [ClusterLabs] Why Do All The Services Go Down When Just One
> Fails?
>
> On Sat, Feb 16, 2019 at 09:03:43PM +0
I'm looking through the docs but I don't see how to set the on-fail value for a
resource.
> -Original Message-
> From: Users On Behalf Of Eric Robinson
> Sent: Saturday, February 16, 2019 1:47 PM
> To: Cluster Labs - All topics related to open-source clustering welcomed
>
> On Tue, 2019-02-19 at 17:40 +, Eric Robinson wrote:
> > > -Original Message-
> > > From: Users On Behalf Of Andrei
> > > Borzenkov
> > > Sent: Sunday, February 17, 2019 11:56 AM
> > > To: users@clusterlabs.org
> > > Sub
> -Original Message-
> From: Users On Behalf Of Andrei
> Borzenkov
> Sent: Sunday, February 17, 2019 11:56 AM
> To: users@clusterlabs.org
> Subject: Re: [ClusterLabs] Why Do All The Services Go Down When Just One
> Fails?
>
> 17.02.2019 0:44, Eric
Roger --
Thank you, sir. That does help.
-Original Message-
From: Roger Zhou
Sent: Wednesday, October 30, 2019 2:56 AM
To: Cluster Labs - All topics related to open-source clustering welcomed
; Eric Robinson
Subject: Re: [ClusterLabs] Stupid DRBD/LVM Global Filter Question
On 10/30
If I have an LV as a backing device for a DRBD disk, can someone explain why I
need an LVM filter? It seems to me that we would want the LV to be always
active under both the primary and secondary DRBD devices, and there should be
no need or desire to have the LV activated or deactivated by
ei Borzenkov
> wrote:
> >05.02.2020 20:55, Eric Robinson wrote:
> >> The two servers 001db01a and 001db01b were up and responsive. Neither
> >had been rebooted and neither were under heavy load. There's no
> >indication in the logs of loss of network connectivity. Any
The two servers 001db01a and 001db01b were up and responsive. Neither had been
rebooted and neither was under heavy load. There's no indication in the logs
of loss of network connectivity. Any ideas on why both nodes seem to think the
other one is at fault?
(Yes, it's a 2-node cluster without
> -Original Message-
> From: Users On Behalf Of Andrei
> Borzenkov
> Sent: Wednesday, February 5, 2020 12:14 PM
> To: users@clusterlabs.org
> Subject: Re: [ClusterLabs] Why Do Nodes Leave the Cluster?
>
> 05.02.2020 20:55, Eric Robinson wrote:
> > The two
topics related to open-source clustering welcomed
; Andrei Borzenkov
Subject: Re: [ClusterLabs] Why Do Nodes Leave the Cluster?
Hi Erik,
what has led you to think that there was no network loss ?
Best Regards,
Strahil Nikolov
On Wednesday, February 5, 2020, 22:59:56 GMT+2, Eric Robinson
On Thursday, February 6, 2020, 01:44:55 GMT+2, Eric Robinson
mailto:eric.robin...@psmnv.com>> wrote:
Hi Strahil –
I can’t prove there was no network loss, but:
1. There were no dmesg indications of ethernet link loss.
2. Other than corosync, there are no oth
Hi Nikolov --
> Defaults are 1s token, 1.2s consensus which is too small.
> In Suse, token is 10s, while consensus is 1.2 * token -> 12s.
> With these settings, cluster will not react for 22s.
>
> I think it's a good start for your cluster.
> Don't forget to put the cluster in
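
The arithmetic in the quoted advice can be written out as a short sketch. It assumes only what the message states: consensus is 1.2 times token, and the cluster does not react until token plus consensus has elapsed. The function name `reaction_delay_ms` is made up for this illustration and is not part of corosync.

```python
def reaction_delay_ms(token_ms, consensus_factor=1.2):
    """Return (consensus_ms, total_ms) for a given token timeout.

    Assumes consensus = consensus_factor * token and that the cluster
    waits token + consensus before reacting, as the quoted message says.
    """
    consensus_ms = round(token_ms * consensus_factor)
    return consensus_ms, token_ms + consensus_ms


# Values from the quoted message: 1 s default token, 10 s token on SUSE.
print(reaction_delay_ms(1000))   # (1200, 2200)
print(reaction_delay_ms(10000))  # (12000, 22000)
```

With the 10 s token this reproduces the 12 s consensus and 22 s total mentioned in the message.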
> >
> > I've done that with all my other clusters, but these two servers are
> > in Azure, so the network is out of our control.
>
> Is a normal cluster supported to use corosync over Internet? I'm not sure
> (because of the delays and possible packet losses).
>
>
As with most things, the main
If I want to know the current DRBD runtime settings such as timeout, ping-int,
or connect-int, how do I check that? I'm assuming they may not be the same as
what shows in the config file.
--Eric
e clusters? Should I use a
larger consensus anyway?
--Eric
> -Original Message-
> From: Strahil Nikolov
> Sent: Thursday, February 6, 2020 1:07 PM
> To: Eric Robinson ; Cluster Labs - All topics
> related to open-source clustering welcomed ;
> Andrei Borzenkov
> Su
1. What command can I execute on the qdevice node which tells me which
client nodes are connected and alive?
2. In the output of the pcs qdevice status command, what is the meaning of...
Vote: ACK (ACK)
3. In the output of the pcs quorum status command,