On Wed, Oct 9, 2019 at 10:59 AM Kadlecsik József
wrote:
>
> Hello,
>
> The nodes in our cluster have got backend and frontend interfaces: the
> former ones are for the storage and cluster (corosync) traffic and the
> latter ones are for the public services of KVM guests only.
>
> One of the nodes
On Thu, Oct 10, 2019 at 11:16 AM Ulrich Windl
wrote:
>
> Hi!
>
> In recent SLES there is "cluster MD", like in
> cluster-md-kmp-default-4.12.14-197.18.1.x86_64
> (/lib/modules/4.12.14-197.18-default/kernel/drivers/md/md-cluster.ko).
> However I could not find any manual page for it.
>
> Where
10.10.2019 18:22, Lentes, Bernd wrote:
> HI,
>
> I have a two-node cluster running on SLES 12 SP4.
> I did some testing on it.
> I put one into standby (ha-idg-2), the other (ha-idg-1) got fenced a few
> minutes later because I made a mistake.
> ha-idg-2 was DC. ha-idg-1 made a fresh boot and I
28.02.2020 01:55, Ken Gaillot wrote:
> On Thu, 2020-02-27 at 22:39 +0300, Andrei Borzenkov wrote:
>> 27.02.2020 20:54, Ken Gaillot wrote:
>>> On Thu, 2020-02-27 at 18:43 +0100, Jehan-Guillaume de Rorthais
>>> wrote:
>>>>>> Speaking about s
27.02.2020 20:54, Ken Gaillot wrote:
> On Thu, 2020-02-27 at 18:43 +0100, Jehan-Guillaume de Rorthais wrote:
Speaking about shutdown, what is the status of clean shutdown of the
cluster handled by Pacemaker? Currently, I advise stopping resources
gracefully (e.g. using pcs
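The graceful-stop advice in the excerpt above might look roughly like this with pcs (a hedged sketch, not the author's exact commands; the resource name and `--wait` timeout are placeholders):

```shell
# Disable managed resources first so they stop in an orderly transition
pcs resource disable my_group --wait=120
# Then take pacemaker/corosync down on every node
pcs cluster stop --all
```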
05.02.2020 20:55, Eric Robinson wrote:
> The two servers 001db01a and 001db01b were up and responsive. Neither had
> been rebooted and neither were under heavy load. There's no indication in the
> logs of loss of network connectivity. Any ideas on why both nodes seem to
> think the other one is
05.02.2020 18:16, Олег Самойлов wrote:
> Hi all.
>
> I am reading the documentation about new (for me) pacemaker, which came with
> RedHat 8.
>
> And I see two different chapters, which both tried to solve exactly the same
> problem.
>
> One is CONFIGURING DISASTER RECOVERY CLUSTERS (pcs dr):
08.01.2020 17:30, Achim Leitner wrote:
> Hi,
>
> some progress on this issue:
>
> Am 20.12.19 um 13:37 schrieb Achim Leitner:
>> After pacemaker restart, we have Transition 0 with the DRBD actions,
>> followed 4s later with Transition 1 including all VM actions with
>> correct ordering. 32s
04.01.2020 01:42, Valentin Vidić wrote:
> On Thu, Jan 02, 2020 at 09:52:09PM +0100, Jan Pokorný wrote:
>> What you've used appears to be akin to what this chunk of manpage
>> suggests (amongst others):
>> https://git.netfilter.org/iptables/tree/extensions/libxt_cluster.man
>>
>> which is (yet
14.01.2020 17:47, Jan Pokorný wrote:
> On 11/01/20 19:47 +0300, Andrei Borzenkov wrote:
>> 04.01.2020 01:42, Valentin Vidić wrote:
>>> On Thu, Jan 02, 2020 at 09:52:09PM +0100, Jan Pokorný wrote:
>>>> What you've used appears to be akin to what this chunk of manpage
08.04.2020 10:12, Jan Friesse wrote:
> Sherrard,
>
>> I could not determine which of these sub-threads to include this in,
>> so I am going to (reluctantly) top-post it.
>>
>> I switched the transport to udp, and in limited testing I seem to not
>> be hitting the race condition. Of course I have
07.04.2020 00:21, Sherrard Burton wrote:
>>
>> It looks like some timing issue or race condition. After reboot node
>> manages to contact qnetd first, before connection to other node is
>> established. Qnetd behaves as documented - it sees two equal size
>> partitions and favors the partition that
06.04.2020 20:57, Sherrard Burton wrote:
>
>
> On 4/6/20 1:20 PM, Sherrard Burton wrote:
>>
>>
>> On 4/6/20 12:35 PM, Andrei Borzenkov wrote:
>>> 06.04.2020 17:05, Sherrard Burton wrote:
>>>>
>>>> from the quorum node:
>> .
16.04.2020 18:58, Stefan Sabolowitsch wrote:
> Hi there,
> I have expanded a 2-node cluster with an additional node, "elastic-03".
> However, fence_scsi does not start on the new node.
>
> pcs-status:
> [root@logger cluster]# pcs status
> Cluster name: cluster_elastic
> Stack: corosync
>
31.03.2020 05:56, Ken Gaillot wrote:
> On Sat, 2020-02-22 at 03:50 +0200, Strahil Nikolov wrote:
>> Hello community,
>>
>> Recently I have started playing with fence_mpath and I have noticed
>> that when the node is fenced, the node is kicked out of the
>> cluster (corosync & pacemaker are shut
31.03.2020 09:27, steven prothero wrote:
> Hello,
>
> I am new to Pacemaker (and new to Redis) and appreciate the info shared
> here.
>
> I believe with Redis sentinel a switchover is about 2 seconds.
> Reading a post about Pacemaker with Redis, the author said he was
> doing it in 3
05.05.2020 06:39, Nickle, Richard wrote:
> I have a two node cluster managing a VIP. The service is an SMTP service.
> This could be active/active, it doesn't matter which node accepts the SMTP
> connection, but I wanted to make sure that a VIP was in place so that there
> was a well-known
my base network in 'bindnetaddr'
> doesn't account for networks with CIDR mask bits greater than 24? (which
> would have non-zero least significant bytes.)
>
> Thanks,
>
> Rick
>
> On Tue, May 5, 2020 at 12:03 PM Andrei Borzenkov
> wrote:
>
&
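The question above concerns corosync's `bindnetaddr`, which must be the network base address of the interface, and for prefixes longer than /24 that base address has a non-zero last octet. A minimal bash sketch of the computation (the IP and prefix are assumed example values; modern corosync 3.x with knet typically uses explicit nodelist addresses instead, so this applies to the udp/udpu transports):

```shell
# Compute corosync's bindnetaddr (the network base address) for any prefix.
ip="192.168.5.77"; prefix=28   # assumed example values
IFS=. read -r a b c d <<< "$ip"
addr=$(( (a << 24) | (b << 16) | (c << 8) | d ))
mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
net=$(( addr & mask ))
bindnetaddr="$(( (net >> 24) & 255 )).$(( (net >> 16) & 255 )).$(( (net >> 8) & 255 )).$(( net & 255 ))"
echo "bindnetaddr: $bindnetaddr"   # for 192.168.5.77/28 this is 192.168.5.64
```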
05.05.2020 16:44, Nickle, Richard wrote:
> Thanks Honza and Andrei (and Strahil? I might have missed a message in the
> thread...)
>
Yep, all messages from Strahil end up in the spam folder.
___
Manage your subscription:
21.03.2020 20:07, Ken Gaillot wrote:
> Hi all,
>
> I am happy to announce a feature that was discussed on this list a
> while back. It will be in Pacemaker 2.0.4 (the first release candidate
> is expected in about three weeks).
>
> A longstanding concern in two-node clusters is that in a split
07.10.2020 06:42, Digimer wrote:
> Hi all,
>
> While developing our program (and not being a production cluster), I
> find that when I push broken code to a node, causing the RA to fail to
> perform an operation, the node gets fenced. (example below).
>
> This brings up a question;
>
> If
09.10.2020 08:21, Rohit Saini wrote:
> Hi Team,
> I am using ocf:pacemaker:ping resource to check aliveness of a machine
> every X seconds. As I understand, monitor interval 'Y' will cause ping to
> happen every 'Y' seconds. So, for my case, Y should be equal to X?
> I do not see this behavior
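For reference, the check frequency of ocf:pacemaker:ping is driven by its monitor operation, so the "Y" in the question above is what controls how often the ping actually runs. A hedged pcs sketch (resource name, host and intervals are assumed values, not from the thread):

```shell
# Each monitor run pings host_list; 'dampen' delays attribute updates
# so a single transient drop doesn't immediately move resources.
pcs resource create ping-gw ocf:pacemaker:ping \
    host_list=192.168.1.1 dampen=5s \
    op monitor interval=10s timeout=60s \
    clone
```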
05.10.2020 20:55, Richard Seo wrote:
>
>> Create host route via specific device.
> I've looked over the docs, haven't found a way to do this. I've tried
> configuring corosync.conf using the specific IP addresses. Could you
> specify
> how to route to a specific network adapter from
On Tue, Aug 25, 2020 at 10:00 AM Rohit Saini
wrote:
>
> Hi All,
> I am seeing the following behavior. Can someone clarify if this is intended
> behavior? If yes, then why so? Please let me know if logs are needed for
> better clarity.
>
> 1. Without Stonith:
> Continuous corosync kill on master
18.08.2020 10:35, Ulrich Windl wrote:
>>>> Andrei Borzenkov wrote on 18.08.2020 at 09:24 in
> message <83aba38d-c9ea-1dff-e53b-14a9e0623...@gmail.com>:
>> 18.08.2020 10:10, Ulrich Windl wrote:
>>>>>> Ken Gaillot sc
17.08.2020 23:39, Jehan-Guillaume de Rorthais wrote:
> On Mon, 17 Aug 2020 10:19:45 -0500
> Ken Gaillot wrote:
>
>> On Fri, 2020-08-14 at 15:09 +0200, Gabriele Bulfon wrote:
>>> Thanks to all your suggestions, I now have the systems with stonith
>>> configured on ipmi.
>>
>> A word of caution:
18.08.2020 10:10, Ulrich Windl wrote:
Ken Gaillot wrote on 17.08.2020 at 17:19 in
> message
> <73d6ecf113098a3154a2e7db2e2a59557272024a.ca...@redhat.com>:
>> On Fri, 2020‑08‑14 at 15:09 +0200, Gabriele Bulfon wrote:
>>> Thanks to all your suggestions, I now have the systems with stonith
18.08.2020 17:02, Ken Gaillot wrote:
> On Tue, 2020-08-18 at 08:21 +0200, Klaus Wenninger wrote:
>> On 8/18/20 7:49 AM, Andrei Borzenkov wrote:
>>> 17.08.2020 23:39, Jehan-Guillaume de Rorthais wrote:
>>>> On Mon, 17 Aug 2020 10:19:45 -0500
>>>> Ken Gai
18.08.2020 22:49, Klaus Wenninger wrote:
>>> What I'm not sure about is how watchdog-only sbd would behave as a
>>> fail-back method for a regular fence device. Will the cluster wait for
>>> the sbd timeout no matter what, or only if the regular fencing fails,
>>> or ...?
>>>
>> Diskless SBD
21.08.2020 21:16, Ken Gaillot wrote:
>
> Previously at shutdown, sbd determined a clean pacemaker shutdown by
> checking whether any resources were running at shutdown. This would
> lead to sbd fencing if pacemaker shut down in maintenance mode with
> resources active.
What conditions lead to
I changed the list to users because it is a general usage question, not a
development topic.
26.08.2020 23:33, Hayden Pfeiffer wrote:
> Hello,
>
>
> I am in the process of configuring fencing in an AWS cluster of two
> hosts. I have done so and nodes are correctly fenced when
> communication is broken
17.08.2020 10:06, Klaus Wenninger wrote:
>>
>>> Alternatively, you can set up corosync-qdevice, using a separate system
>>> running qnetd server as a quorum arbitrator.
>>>
>> Any solution that is based on node suicide is prone to complete cluster
>> loss. In particular, in a two-node cluster with
22.09.2020 02:06, Philippe M Stedman wrote:
> Hi Strahil,
>
> Here is the output of those commands I appreciate the help!
>
> # crm config show
> node 1: ceha03 \
> attributes ethmonitor-ens192=1
> node 2: ceha04 \
> attributes ethmonitor-ens192=1
> (...)
> primitive
26.09.2020 12:22, Michael Ivanov wrote:
> Hallo,
>
> I have a strange problem: when I reset the node on which my resources are
> running,
> they are correctly migrated to the other node. But when I turn the failed
> node
> back, then as soon as it is up all resources are returned back to it. I
01.10.2020 20:09, Richard Seo wrote:
> Hello everyone,
> I'm trying to set up a cluster with two hosts:
> both have two ethernet adapters all within the same subnet.
> I've created resources for an adapter for each hosts.
> Here is the example:
> Stack: corosync
> Current DC: ceha06 (version
23.10.2020 21:08, Lentes, Bernd wrote:
>
> Surprisingly, if the virsh destroy is successful the RA waits until the
> domain isn't running anymore:
>
...
>
> I need something like that which waits for some time (maybe 30s) if the domain
> nevertheless stops although
> "virsh destroy" gave an
On Wed, Oct 28, 2020 at 3:18 PM Patrick Vranckx
wrote:
>
> Hi,
>
> I'm trying to set up an HA cluster for ZFS. I think fence_scsi is not working
> properly. I can reproduce the problem on two kinds of hardware: iSCSI and
> SAS storage.
>
> Here is what I did:
>
> - set up a storage server with 3 iscsi
22.10.2020 23:29, Lentes, Bernd wrote:
> Hi guys,
>
> Occasionally stopping a VirtualDomain resource via "crm resource stop" does
> not work, and in the end the node is fenced, which is ugly.
> I had a look at the RA to see what it does. After trying to stop the domain
> via "virsh shutdown
nges may be overwritten by pacemaker?
> 2. Do you have an idea where (which config file) the crm_node command retrieves its
> data?
CIB
> Thanks,
> Jiaqi Tian
>
> - Original message -
> From: Andrei Borzenkov
> Sent by: "Users"
> To: Cluster
21.10.2020 20:47, Strahil Nikolov wrote:
> Both SUSE and RedHat provide utilities to add the node without messing with
> the configs manually.
Which are crmsh and pcs respectively :)
>
> What is your distro ?
>
>
> Best Regards,
> Strahil Nikolov
>
> On Wednesday, 21 October 2020
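For illustration, the node-add utilities mentioned in this exchange might be used roughly like this (a hedged sketch; the node names are placeholders):

```shell
# pcs (RHEL and derivatives), run from an existing cluster node:
pcs cluster node add newnode.example.com
# crmsh (SUSE), run on the joining node, pointing at an existing one:
crm cluster join -c existingnode.example.com
```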
02.07.2020 18:18, stefan.schm...@farmpartner-tec.com wrote:
> Hello,
>
> I hope someone can help with this problem. We are (still) trying to get
> Stonith working to achieve a running active/active HA cluster, but sadly to
> no avail.
>
> There are 2 Centos Hosts. On each one there is a virtual Ubuntu
18.07.2020 03:36, Reid Wahl wrote:
> I'm not sure that the libvirt backend is intended to be used in this way,
> with multiple hosts using the same multicast address. From the
> fence_virt.conf man page:
>
> ~~~
> BACKENDS
>libvirt
>The libvirt plugin is the simplest plugin. It
On Wed, Jul 22, 2020 at 9:42 AM Harvey Shepherd <
harvey.sheph...@aviatnet.com> wrote:
> Hi All,
>
> I'm running Pacemaker 2.0.3 on a two-node cluster, controlling 40+
> resources which are a mixture of clones and other resources that are
> colocated with the master instance of certain clones.
On Wed, Jul 22, 2020 at 10:59 AM Хиль Эдуард wrote:
> Hi there! I have 2 nodes with Pacemaker 2.0.3, corosync 3.0.3 on ubuntu 20
> + 1 qdevice. I want to define a new resource as a systemd unit *dummy.service
> *:
>
> [Unit]
> Description=Dummy
> [Service]
> Restart=on-failure
>
On Wed, Jul 22, 2020 at 4:58 PM Ken Gaillot wrote:
> On Wed, 2020-07-22 at 10:59 +0300, Хиль Эдуард wrote:
> > Hi there! I have 2 nodes with Pacemaker 2.0.3, corosync 3.0.3 on
> > ubuntu 20 + 1 qdevice. I want to define a new resource as a systemd
> > unit dummy.service :
> >
> > [Unit]
> >
30.07.2020 08:42, Strahil Nikolov wrote:
> You got plenty of options:
> - IPMI based fencing like HP iLO, DELL iDRAC
> - SCSI-3 persistent reservations (which can be extended to fence the node
> when the reservation(s) were removed)
>
SCSI reservation prevents data corruption due to
11.08.2020 10:34, Adam Cécile wrote:
> On 8/11/20 8:48 AM, Andrei Borzenkov wrote:
>> 08.08.2020 13:10, Adam Cécile wrote:
>>> Hello,
>>>
>>>
>>> I'm experiencing issue with corosync/pacemaker running on Debian Buster.
>>> Clu
On Thu, Jul 30, 2020 at 11:29 AM Strahil Nikolov wrote:
>
> This one links to how to power fence when reservations are removed:
> https://access.redhat.com/solutions/4526731
>
All of this is RH(CS) specific
08.08.2020 13:10, Adam Cécile wrote:
> Hello,
>
>
> I'm experiencing issue with corosync/pacemaker running on Debian Buster.
> Cluster has three nodes running in VMWare virtual machine and the
> cluster fails when VEEAM backups the virtual machine (I know it's doing
> bad things, like freezing
06.07.2020 19:13, fatcha...@gmx.de wrote:
> Hi,
>
> I'm running a two node corosync httpd-cluster on a CentOS 7.
> corosync-2.4.5-4.el7.x86_64
> pcs-0.9.168-4.el7.centos.x86_64
> Today I used Let's Encrypt to install HTTPS for two domains on that system.
> After that the node with the new
14.07.2020 13:19, Rohit Saini wrote:
> Also, " Keep in mind that neither qdevice nor booth is "replacement" for
> stonith. "
>
> Why not? qdevice/booth are handling the split-brain scenario, keeping one
> master only even in case of local/geo network disjoints. Can you please
> clarify more on
14.07.2020 14:56, Grégory Sacré wrote:
> Dear all,
>
>
> I'm pretty new to Pacemaker so I must be missing something but I cannot find
> it in the documentation.
>
> I'm setting up a SAMBA File Server cluster with DRBD and Pacemaker. Here are
> the relevant pcs commands related to the mount
18.06.2020 18:24, Ken Gaillot wrote:
> Note that a failed start of a stonith device will not prevent the
> cluster from using that device for fencing. It just prevents the
> cluster from monitoring the device.
>
My understanding is that if a stonith resource cannot run anywhere, it
also won't be
24.06.2020 12:20, Ulrich Windl wrote:
>>
>> How Service Guard handles loss of shared storage?
>
> When a node is up it would log the event; if a node is down it wouldn't care;
> if a node detects a communication problem with the other node, it would fence
> itself.
>
So in case of split brain
24.06.2020 10:28, Ulrich Windl wrote:
>>
>> Usual recommendation is third site which functions as witness. This
>> works fine up to failure of this third site itself. Unavailability of
>> the witness makes normal maintenance of either of two nodes impossible.
>
> That's a problem of pacemaker:
>
Two-node is what I almost exclusively deal with. It works reasonably
well in one location where failures to perform fencing are rare and can
be mitigated by two different fencing methods. Usually SBD is reliable
enough, as failure of shared storage also implies failure of the whole
cluster.
When
29.06.2020 14:57, Ulrich Windl wrote:
Klaus Wenninger wrote on 29.06.2020 at 10:12 in
> message
>
> [...]
>> My mailer was confused by all this combinations of
>> "Antw: Re: Antw:" anddidn't compose mails into a
>> thread properly. Which is why I missed further
>> discussion where it
29.06.2020 20:20, Tony Stocker wrote:
>
>>
>>
>> The most interesting part seems to be the question how you define (and
>> detect) a failure that will cause a node switch.
>
> That is a VERY good question! How many mounts failed is the critical
> number when you have 130+? If a single one
19.06.2020 01:13, Howard wrote:
> Thanks for all the help so far. With your assistance, I'm very close to
> stable.
>
> Made the following changes to the vmfence stonith resource:
>
> Meta Attrs: failure-timeout=30m migration-threshold=10
> Operations: monitor interval=60s
e. After 30 minutes it will start trying again.
>>
>> On Thu, Jun 18, 2020 at 12:29 PM Ken Gaillot > <mailto:kgail...@redhat.com>> wrote:
>>
>> On Thu, 2020-06-18 at 21:32 +0300, Andrei Borzenkov wrote:
>> > 18.06.2020 18:24, Ken Gaillot wrote:
>
18.06.2020 20:16, Howard wrote:
> Thanks for the replies! I will look at the failure-timeout resource
> attribute and at adjusting the timeout from 20 to 30 seconds. It is funny
> that the 100 tries message is symbolic.
>
It is not symbolic, it is INFINITY. From the Pacemaker documentation:
If
17.06.2020 22:05, Howard wrote:
> Hello, recently I received some really great advice from this community
> regarding changing the token timeout value in corosync. Thank you! Since
> then the cluster has been working perfectly with no errors in the log for
> more than a week.
>
> This morning I
On Wed, Jul 29, 2020 at 9:01 AM Gabriele Bulfon wrote:
> That one was taken from a specific implementation on Solaris 11.
> The situation is a dual node server with shared storage controller: both
> nodes see the same disks concurrently.
> Here we must be sure that the two nodes are not going to
On Mon, Jul 20, 2020 at 11:45 AM Klaus Wenninger
wrote:
> On 7/20/20 10:34 AM, Andrei Borzenkov wrote:
>
>
>
>
>>
>> The cpg-configuration sounds interesting as well. Haven't used
>> it or looked into the details. Would be interested to hear about
>&
t (libvirt network was in NAT mode) or wrong (VMs using Host's bond
> in a bridged network).
> >
> > Best Regards,
> > Strahil Nikolov
> >
> > On 19 July 2020 9:45:29 GMT+03:00, Andrei Borzenkov <
> arvidj...@gmail.com> wrote:
> >> 18.07.2020 03:36
maker-based [1719] (cib_process_request)
> info: Completed cib_delete operation for section status: OK (rc=0,
> origin=node1.local/crmd/246, version=0.132.5)
> Jul 22 12:38:42 node2.local pacemaker-based [1719] (cib_perform_op)
> info: Diff: --- 0.13
30.07.2020 23:23, Lentes, Bernd wrote:
>
>
> - Am 30. Jul 2020 um 9:28 schrieb Ulrich Windl
> ulrich.wi...@rz.uni-regensburg.de:
>
"Lentes, Bernd" wrote on 29.07.2020
>> at
>> 17:26 in message
>> <1894379294.27456141.1596036406000.javamail.zim...@helmholtz-muenchen.de>:
>>> Hi,
16.08.2020 04:25, Reid Wahl wrote:
>
>
>> - considering that I have both nodes with stonith against the other node,
>> once the two nodes can communicate, how can I be sure the two nodes will
>> not try to stonith each other?
>>
>
> The simplest option is to add a delay attribute (e.g.,
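The delay attribute being described is likely something like a static per-device fencing delay; a hedged sketch using `pcmk_delay_base` (my assumption of the attribute meant, since the excerpt is truncated; the device name is a placeholder):

```shell
# Delay the device that targets the preferred survivor (node1): after a
# split, node1 fences node2 immediately while node2 must wait, so only
# one side shoots.
pcs stonith update fence-node1 pcmk_delay_base=10s
```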
30.11.2020 17:05, Ulrich Windl wrote:
>>>> Andrei Borzenkov wrote on 30.11.2020 at 14:18 in
> message
> :
>> On Mon, Nov 30, 2020 at 3:11 PM Ulrich Windl
>> wrote:
>>>
>>> Hi!
>>>
>>> In SLES15 I'm surprised what a stan
On Mon, Nov 30, 2020 at 3:11 PM Ulrich Windl
wrote:
>
> Hi!
>
> In SLES15 I'm surprised what a standby node does: My guess was that a standby
> node would stop all resources and then just "shut up", but it seems it still
> tried to place resources and calls monitor operations.
>
Standby nodes
30.11.2020 15:36, Ulrich Windl wrote:
> Hi!
>
> I configured a shared LVM activation as per instructions (I hope) in SLES15
> SP2. However I get this warning:
> LVM-activate(prm_testVG_activate)[57281]: WARNING: You are recommended to
> activate one LV at a time or use exclusive activation
On Thu, Dec 3, 2020 at 11:11 AM Ulrich Windl
wrote:
>
> >>> Strahil Nikolov wrote on 02.12.2020 at 22:42 in
> message <311137659.2419591.1606945369...@mail.yahoo.com>:
> > Constraints' values are varying from:
> > infinity which equals to score of 100
> > to:
> > - infinity which equals
ched off.
You really need to test how IPMI behaves with your specific hardware
to make sure it is not possible, or to adjust the stonith agent to handle
delays.
To reiterate:
>
> Da: Andrei Borzenkov
>
> It is possible that your IPMI/BMC/whatever implementation responds
> with success bef
On Mon, Dec 14, 2020 at 2:40 PM Gabriele Bulfon wrote:
>
> I isolated the log when everything happens (when I disable the ha interface),
> attached here.
>
And where are matching logs from the second node?
15.12.2020 17:10, Tony Stocker wrote:
> On Tue, Dec 15, 2020 at 9:02 AM Andrei Borzenkov wrote:
>>
>> On Tue, Dec 15, 2020 at 4:58 PM Tony Stocker wrote:
>>>
>>
>> You could simply query whether a specific resource (group) is active
>> on the nod
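The query being suggested can be scripted with crm_resource (a hedged sketch; the group name `my_services` is a placeholder):

```shell
# Write the identity file only when group 'my_services' runs on this node
if crm_resource --resource my_services --locate 2>/dev/null \
        | grep -q "$(uname -n)"; then
    echo "primary: writing identity descriptor"
fi
```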
17.12.2020 14:02, Ulrich Windl wrote:
>>>> Andrei Borzenkov wrote on 17.12.2020 at 09:50 in
> message
> :
>
> ...
>> According to logs from xstha1, it started to activate resources only
>> after stonith was confirmed
>>
>> Dec 16 15
17.12.2020 21:30, Ken Gaillot wrote:
>
> This reminded me that some IPMI implementations return "success" for
> commands before they've actually been completed. This is why
> fence_ipmilan has a "power_wait" parameter that defaults to 2 seconds.
>
But in this case we also do not know whether
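The power_wait default mentioned above can be raised per device when a BMC acknowledges power commands before completing them (a hedged sketch; the device name is a placeholder):

```shell
# Wait 4 seconds after each power action before checking status
pcs stonith update fence-ipmi-node1 power_wait=4
```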
18.12.2020 10:09, Ulrich Windl wrote:
>>>> Andrei Borzenkov wrote on 18.12.2020 at 08:01 in
> message :
>> 17.12.2020 21:30, Ken Gaillot wrote:
>>>
>>> This reminded me that some IPMI implementations return "success" for
>>> co
18.12.2020 12:00, Ulrich Windl wrote:
>
> Maybe a related question: Do STONITH resources have special rules, meaning
> they don't wait for successful fencing?
Pacemaker resources in the CIB do not perform fencing. They only register
fencing devices with fenced, which does the actual job. In particular
Sonicle S.r.l. : http://www.sonicle.com
> Music: http://www.gabrielebulfon.com
> eXoplanets : https://gabrielebulfon.bandcamp.com/album/exoplanets
>
> --
>
> Da: Andrei Borz
11.12.2020 18:37, Gabriele Bulfon wrote:
> I found I can do this temporarily:
>
> crm config property cib-bootstrap-options: no-quorum-policy=ignore
>
All two-node clusters I remember run with this setting forever :)
> then once node 2 is up again:
>
> crm config property cib-bootstrap-options:
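For context, two-node clusters that run with no-quorum-policy=ignore permanently usually pair it with corosync's vote-quorum two_node mode; a corosync.conf sketch (values assumed, not from this thread):

```
quorum {
    provider: corosync_votequorum
    two_node: 1
    # two_node implies wait_for_all: after a cold start, both nodes must
    # be seen once before the cluster considers itself quorate
}
```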
18.12.2020 21:54, Ken Gaillot wrote:
> On Fri, 2020-12-18 at 17:51 +, Animesh Pande wrote:
>> Hello,
>>
>> Is there a tool that would allow for commands to be run on remote
>> nodes in the cluster through the corosync messaging layer? I have a
>> cluster configured with multiple corosync
On Tue, Dec 15, 2020 at 4:58 PM Tony Stocker wrote:
>
> I'm trying to figure out the best way to do the following on our
> 2-node clusters.
>
> Whichever node is the primary (all services run on a single node) I
> want to create a file that contains an identity descriptor, e.g.
>
11.12.2020 16:13, Raphael Laguerre wrote:
> Hello,
>
> I'm trying to set up a 2-node cluster with 2 galera instances. I use the
> ocf:heartbeat:galera resource agent, however, after I create the resource,
> only one node appears to be in master role, the other one can't be promoted
> and stays
gt; Gabriele
>
>
> Sonicle S.r.l. : http://www.sonicle.com
> Music: http://www.gabrielebulfon.com
> eXoplanets : https://gabrielebulfon.bandcamp.com/album/exoplanets
>
> ------
>
> Da:
16.12.2020 17:56, Gabriele Bulfon wrote:
> Thanks, here are the logs, there are infos about how it tried to start
> resources on the nodes.
Both logs are from the same node.
> Keep in mind the node1 was already running the resources, and I simulated a
> problem by turning down the ha
16.12.2020 19:05, Gabriele Bulfon wrote:
> Looking at the two logs, looks like corosync decided that xst1 was offline,
> while xst was still online.
> I just issued an "ifconfig ha0 down" on xst1, so I expect both nodes cannot
> see the other one, while I see these same lines both on xst1 and xst2
On Thu, Dec 17, 2020 at 11:11 AM Gabriele Bulfon wrote:
>
> Yes, sorry took same bash by mistake...here are the correct logs.
>
> Yes, xstha1 has delay 10s so that I'm giving it precedence; xstha2 has delay
> 1s and will be fenced earlier.
> During the short time before xstha2 got powered
15.11.2020 20:00, Guy Przytula wrote:
> a question would be:
>
> we have maintenance to perform on a node of the cluster
>
> to avoid that the cluster starts the resource that we stopped - we want
> to disable a node temporarily - is this possible without deleting the node
>
Put node in
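The truncated answer above is almost certainly standby mode; a hedged sketch with pcs (the node name is a placeholder):

```shell
pcs node standby node1      # resources move off node1; it stays a member
# ... perform maintenance ...
pcs node unstandby node1    # allow resources back on node1
```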
On Wed, Oct 21, 2020 at 5:03 PM Jiaqi Tian1 wrote:
>
> Hi,
> I'm trying to add a new node into an active pacemaker cluster with resources
> up and running.
> After steps:
> 1. update corosync.conf files among all hosts in cluster including the new
> node
> 2. copy corosync auth file to the new
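The first two steps being described might be sketched as follows (hedged; host names and the standard /etc/corosync paths are assumptions, and `corosync-cfgtool -R` reloads the config on corosync 3, older versions may differ):

```shell
# 1. distribute the updated corosync.conf and the auth key to the new node
scp /etc/corosync/corosync.conf /etc/corosync/authkey newnode:/etc/corosync/
# 2. make the running nodes re-read the new membership
corosync-cfgtool -R
# 3. start the stack on the new node
ssh newnode 'systemctl start corosync pacemaker'
```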
23.01.2021 19:10, Sharma, Jaikumar wrote:
> Hi guys,
>
> I'm a newbie to high-availability clusters, please excuse me - learning the
> tools stack (corosync & pacemaker).
>
> In fact, our high availability solution is based on Debian 9.x (pacemaker 1.x
> and corosync 2.x) - which worked as expected.
>
On Mon, Jan 25, 2021 at 12:07 PM Jehan-Guillaume de Rorthais
wrote:
> As actions during a cluster shutdown cannot be handled in the same transition
> for each node, I usually add a step to disable all resources using the property
> "stop-all-resources" before shutting down the cluster:
>
> pcs
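The shutdown sequence described above might look like this (a hedged sketch of the idea, not the author's exact commands):

```shell
pcs property set stop-all-resources=true   # stop everything in one transition
pcs cluster stop --all                     # then shut the cluster down
# after maintenance:
pcs cluster start --all
pcs property set stop-all-resources=false  # let resources start again
```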
On Mon, Jan 18, 2021 at 12:00 PM Steffen Vinther Sørensen
wrote:
>
> Hi,
>
> I have persistent journal, but 'journalctl -b -1' was empty in this
> case, so it might not be optimally configured. And centralized logging
> is on the todo list
>
>
> btw. about the fencing, I have set '
On Mon, Jan 18, 2021 at 11:55 AM Ulrich Windl
wrote:
.
>
> So can someone explain, or direct me to some helpful docs?
>
Are you aware of https://libvirt.org/kbase/locking.html which links
further to virtlockd description?
On Mon, Feb 1, 2021 at 10:07 AM Ulrich Windl
wrote:
>
> You are saying starting libvirtd does not require the ro and tls socket units
> to be started?
>
So far I am not aware of any service that would *require* socket
activation. Socket activation is an optimization that allows you to avoid
On Mon, Feb 1, 2021 at 12:53 PM Ulrich Windl
wrote:
>
> Hi!
>
> While fighting to get the wrong configuration, I broke libvirt live-migration
> by not enabling the TLS socket.
>
> When testing to live-migrate a VM from h16 to h18, these are the essential
> events:
> Feb 01 10:30:10 h16
On Mon, Feb 1, 2021 at 1:59 PM Ulrich Windl
wrote:
>
> But the VM *wasn't* stopped on h16!
>
I am not sure what you mean here. It was not stopped during migration?
Yes, pacemaker knew it and it tried to stop it explicitly when
migration failed. It was not stopped when pacemaker tried to stop it?
27.01.2021 19:06, damiano giuliani wrote:
> Hi all, I'm pretty new to clusters; I'm struggling trying to configure a
> bunch of resources and test how they fail over. My need is to start and
> manage a group of resources as one (in order to achieve this a resource
> group has been created), and if
27.01.2021 22:03, Ken Gaillot wrote:
>
> With a group, later members depend on earlier members. If an earlier
> member can't run, then no members after it can run.
>
> However we can't make the dependency go in both directions. If an
> earlier member can't run unless a later member is active,
29.01.2021 20:37, Stuart Massey wrote:
> Can someone help me with this?
> Background:
>
> "node01" is failing, and has been placed in "maintenance" mode. It
> occasionally loses connectivity.
>
> "node02" is able to run our resources
>
> Consider the following messages from pacemaker.log on