25.02.2019 11:50, Samarth Jain wrote:
> Hi,
>
>
> We have a bunch of resources running in master slave configuration with one
> master and one slave instance running at any given time.
>
> What we observe is that, for any two given resources at a time, if say
> resource Stateful_Test_1 is in
26.02.2019 1:08, Ken Gaillot wrote:
> On Mon, 2019-02-25 at 23:00 +0300, Andrei Borzenkov wrote:
>> 25.02.2019 22:36, Andrei Borzenkov wrote:
>>>
>>>> Could you please help me understand:
>>>> 1. Why doesn't pacemaker process the failure of Stateful_T
25.02.2019 23:13, Ken Gaillot wrote:
> On Mon, 2019-02-25 at 14:20 +0530, Samarth Jain wrote:
>> Hi,
>>
>>
>> We have a bunch of resources running in master slave configuration
>> with one master and one slave instance running at any given time.
>>
>> What we observe is that, for any two given
20.02.2019 21:51, Eric Robinson wrote:
>
> The following should show OK in a fixed font like Consolas. This setup is
> supposed to be possible, and is even referenced in the ClusterLabs
> documentation.
>
>
>
>
>
> +--+
>
> | mysql001 +--+
>
>
18.02.2019 18:53, Ken Gaillot wrote:
> On Sun, 2019-02-17 at 20:33 +0300, Andrei Borzenkov wrote:
>> 17.02.2019 0:33, Andrei Borzenkov wrote:
>>> 17.02.2019 0:03, Eric Robinson wrote:
>>>> Here are the relevant corosync logs.
>>>>
>>>> It
23.02.2019 2:57, solarflow99 wrote:
> I'm trying to have my NFS share exported via pacemaker and now it doesn't
> seem to be working, it also kills off nfs-mountd. It looks like the rbd
> device could have something to do with it, the nfsroot doesn't get
> exported, but there's no indication why:
26.02.2019 18:05, Ken Gaillot wrote:
> On Tue, 2019-02-26 at 06:55 +0300, Andrei Borzenkov wrote:
>> 26.02.2019 1:08, Ken Gaillot wrote:
>>> On Mon, 2019-02-25 at 23:00 +0300, Andrei Borzenkov wrote:
>>>> 25.02.2019 22:36, Andrei Borzenkov wrote:
>>>>
12.03.2019 18:10, Adam Budziński wrote:
> Hello,
>
>
>
> I’m planning to setup a two node (active-passive) HA cluster consisting of
> pacemaker, corosync and DRBD. The two nodes will run on VMware VM’s and
> connect to a single DB server (unfortunately for various reasons not
> included in the
16.03.2019 1:16, Adam Budziński wrote:
> Hi Tomas,
>
> Ok but how then pacemaker or the fence agent knows which route to take to
> reach the vCenter?
They do not know or care at all. It is up to your underlying operating
system and its routing tables.
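To illustrate the point (a sketch only, not the fence agent's actual code; the vCenter address is a placeholder): any socket asks the kernel's routing table which source address and interface to use, and a UDP connect() performs that lookup without sending traffic:

```python
# Sketch: the OS routing table, not pacemaker or the fence agent, decides
# how a target such as the vCenter host is reached. A UDP connect() sends
# no packets; it only asks the kernel which local source address (and
# hence interface) its routing table would pick for the target.
import socket

def local_source_for(target_ip, port=443):
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.connect((target_ip, port))  # routing lookup only, no traffic
        return s.getsockname()[0]
    finally:
        s.close()

# Loopback targets always resolve via the loopback interface:
print(local_source_for("127.0.0.1"))  # → 127.0.0.1
```

The equivalent from a shell would be `ip route get <vcenter-ip>`.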
> Btw. Do I have to add the stonith
The stonith agent is not prohibited from running by (co-)location rules. My
understanding is that this node is selected by the DC in the partition.
> Thank you!
>
> Sat, 16.03.2019, 05:37, Andrei Borzenkov
> wrote:
>
>> 16.03.2019 1:16, Adam Budziński wrote:
>>> Hi Tomas,
>>
On Fri, Mar 22, 2019 at 1:08 PM Jan Pokorný wrote:
>
> Also a Friday's idea:
> Perhaps we should crank up "how to ask" manual for this list
Yet another one?
http://www.catb.org/~esr/faqs/smart-questions.html
t away from its current node.
In this particular case it may be argued that pacemaker's reaction is
unjustified. The administrator explicitly set the target state to "stop"
(otherwise pacemaker would not attempt to stop it), so it is unclear why
it tries to restart it on another node.
>> -O
17.02.2019 0:03, Eric Robinson wrote:
> Here are the relevant corosync logs.
>
> It appears that the stop action for resource p_mysql_002 failed, and that
> caused a cascading series of service changes. However, I don't understand
> why, since no other resources are dependent on p_mysql_002.
>
17.02.2019 0:33, Andrei Borzenkov wrote:
> 17.02.2019 0:03, Eric Robinson wrote:
>> Here are the relevant corosync logs.
>>
>> It appears that the stop action for resource p_mysql_002 failed, and that
>> caused a cascading series of service changes. However, I don'
17.02.2019 0:44, Eric Robinson wrote:
> Thanks for the feedback, Andrei.
>
> I only want cluster failover to occur if the filesystem or drbd resources
> fail, or if the cluster messaging layer detects a complete node failure. Is
> there a way to tell PaceMaker not to trigger a cluster failover
13.02.2019 15:50, Maciej S wrote:
> Can you describe at least one situation when it could happen?
> I see situations where data on two masters can diverge but I can't find the
> one where data gets corrupted.
If diverged data in two databases that are supposed to be exact copies of
each other is
19.02.2019 23:06, Eric Robinson wrote:
...
> Bottom line is, how do we configure the cluster in such a way that
> there are no cascading circumstances when a MySQL resource fails?
> Basically, if a MySQL resource fails, it fails. We'll deal with that
> on an ad-hoc basis. I don't want the whole
23.01.2019 17:20, Klaus Wenninger wrote:
>
> And yes dynamic-configuration of two_node should be possible -
> remember that I had to implement that communication with
> corosync into sbd for clusters that are expanded node-by-node
> using pcs.
> 'corosync-cfgtool -R' to reload the config.
>
24.01.2019 18:01, Lentes, Bernd wrote:
> - On Jan 23, 2019, at 3:20 PM, Klaus Wenninger kwenn...@redhat.com wrote:
>>> I have corosync-2.3.6-9.13.1.x86_64.
>>> Where can i configure this value ?
>>
>> speaking of two_node & wait_for_all?
>> That is configured in the quorum-section of
03.04.2019 13:04, Klaus Wenninger wrote:
> On 4/3/19 9:47 AM, Andrei Borzenkov wrote:
>> On Tue, Apr 2, 2019 at 8:49 PM Digimer wrote:
>>> It's worth noting that SBD fencing is "better than nothing", but slow.
>>> IPMI and/or PDU fencing completes a lot fas
12.04.2019 15:30, Олег Самойлов wrote:
>
>> 11 Apr 2019, at 20:00, Klaus Wenninger
>> wrote:
>>
>> On 4/11/19 5:27 PM, Олег Самойлов wrote:
>>> Hi all. I am developing HA PostgreSQL cluster for 2 or 3
>>> datacenters. In case of DataCenter failure (blackout) the fencing
>>> will not
03.06.2019 9:09, Ulrich Windl wrote:
> 118 if [ -x $xentool ]; then
> 119 $xentool info | awk
>>> '/total_memory/{printf("%d\n",$3);exit(0)}'
> 120 else
> 121 ocf_log warn "Can only set hv_memory for Xen hypervisor"
> 122 echo "0"
08.06.2019 5:12, Harvey Shepherd wrote:
> Thank you for your advice Ken. Sorry for the delayed reply - I was trying out
> a few things and trying to capture extra info. The changes that you suggested
> make sense, and I have incorporated them into my config. However, the
> original issue
29.05.2019 11:12, Ulrich Windl wrote:
Jan Pokorný wrote on 28.05.2019 at 16:31 in
> message
> <20190528143145.ga29...@redhat.com>:
>> On 27/05/19 08:28 +0200, Ulrich Windl wrote:
>>> I configured ocf:pacemaker:NodeUtilization more or less for fun, and I
>> realized that the cluster
30.04.2019 9:53, Digimer wrote:
> On 2019-04-30 12:07 a.m., Andrei Borzenkov wrote:
>> As soon as majority of nodes are stopped, the remaining nodes are out of
>> quorum and watchdog reboot kicks in.
>>
>> What is the correct procedure to ensure nodes are sto
18.05.2019 18:34, Kadlecsik József wrote:
> Hello,
>
> We have a resource agent which creates IP tunnels. In spite of the
> configuration setting
>
> primitive tunnel-eduroam ocf:local:tunnel \
> params
> op start timeout=120s interval=0 \
> op stop timeout=300s
21.05.2019 0:46, Ken Gaillot wrote:
>>
>>> From what's described here, the op-restart-digest is changing every
>>> time, which means something is going wrong in the hash comparison
>>> (since the definition is not really changing).
>>>
>>> The log that stands out to me is:
>>>
>>> trace May 18
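The digest idea being debugged above can be sketched as follows (an illustration only, not pacemaker's actual op-restart-digest code, which hashes CIB XML; parameter names are placeholders). The point is that a digest over a canonicalized definition must be stable when the definition does not change:

```python
# Sketch (assumption: pacemaker's real digest is computed from CIB XML,
# not like this): hash a canonicalized parameter list so the digest only
# changes when the resource definition really changes.
import hashlib

def restart_digest(params):
    # Sort keys so dict ordering cannot perturb the hash.
    canonical = ";".join("%s=%s" % (k, params[k]) for k in sorted(params))
    return hashlib.sha256(canonical.encode()).hexdigest()

same_a = restart_digest({"ip": "10.0.0.1", "nic": "eth0"})
same_b = restart_digest({"nic": "eth0", "ip": "10.0.0.1"})
# Identical definitions must produce identical digests:
assert same_a == same_b
```

If the digest changes on every run while the definition does not, the canonicalization step (or its input) is where to look.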
about dynamic cluster expansion; the
question is about a normal static cluster with a fixed number of nodes that
needs to be shut down.
>> 30 Apr 2019, at 7:07, Andrei Borzenkov wrote:
>>
>> As soon as majority of nodes are stopped, the remaining nodes are out of
>> q
30.04.2019 19:34, Олег Самойлов wrote:
>
>> No. I simply want reliable way to shutdown the whole cluster (for
>> maintenance).
>
> Official way is `pcs cluster stop --all`.
pcs is just one of multiple high-level tools. I am interested in
plumbing, not porcelain.
> But it’s not always worked as
30.04.2019 9:53, Digimer wrote:
> On 2019-04-30 12:07 a.m., Andrei Borzenkov wrote:
>> As soon as majority of nodes are stopped, the remaining nodes are out of
>> quorum and watchdog reboot kicks in.
>>
>> What is the correct procedure to ensure nodes are sto
30.04.2019 9:51, Jan Friesse wrote:
>
>> Now, corosync-qdevice gets SIGTERM as "signal to terminate", but it
>> installs a SIGTERM handler that does not exit and only closes some socket.
>> Maybe this should trigger termination of the main loop, but somehow it does
>> not.
>
> Yep, this is exactly
27.04.2019 1:04, Danka Ivanović wrote:
> Hi, here is a complete cluster configuration:
>
> node 1: master
> node 2: secondary
> primitive AWSVIP awsvip \
> params secondary_private_ip=10.x.x.x api_delay=5
> primitive PGSQL pgsqlms \
> params pgdata="/var/lib/postgresql/9.5/main"
>
29.04.2019 18:05, Ken Gaillot wrote:
>>
>>> Why doesn't it check OCF_RESKEY_CRM_meta_notify?
>>
>> I was just not aware of this env variable. Sadly, it is not
>> documented
>> anywhere :(
>
> It's not a Pacemaker-created value like the other notify variables --
> all user-specified
As soon as a majority of nodes are stopped, the remaining nodes are out of
quorum and the watchdog reboot kicks in.
What is the correct procedure to ensure nodes are stopped in a clean way?
Short of disabling stonith-watchdog-timeout before stopping the cluster ...
29.04.2019 14:32, Jan Friesse wrote:
> Andrei,
>
>> I set up qdevice in openSUSE Tumbleweed and while it works as expected I
>
> Is it corosync-qdevice or corosync-qnetd daemon?
>
corosync-qdevice
>> cannot stop it - it always results in timeout and service finally gets
>> killed by systemd.
On Mon, May 6, 2019 at 8:30 AM Arkadiy Kulev wrote:
>
> Andrei,
>
> I just went through the docs
> (https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-failure-migration.html)
> and it says that the option "failure-timeout" is responsible for retrying a
> failed
03.05.2019 20:18, Lentes, Bernd wrote:
> Hi,
>
> on my cluster nodes I established a systemd service which starts crm_mon,
> which writes cluster information into an html file so I can see the state
> of my cluster in a web browser.
> crm_mon is started this way:
> /usr/sbin/crm_mon -d -i 10 -h
While testing corosync-qdevice I repeatedly got the above message. The
reason seems to be the startup sequence in corosync-qdevice. Consider:
● corosync-qdevice.service - Corosync Qdevice daemon
Loaded: loaded (/etc/systemd/system/corosync-qdevice.service;
disabled; vendor preset: disabled)
30.04.2019 19:47, Олег Самойлов wrote:
>
>
>> 30 Apr 2019, at 19:38, Andrei Borzenkov
>> wrote:
>>
>> 30.04.2019 19:34, Олег Самойлов wrote:
>>>
>>>> No. I simply want reliable way to shutdown the whole cluster
>>>> (for
05.05.2019 16:14, Arkadiy Kulev wrote:
> Hello!
>
> I run pacemaker on 2 active/active hosts which balance the load of 2 public
> IP addresses.
> A few days ago we ran a very CPU/network intensive process on one of the 2
> hosts and Pacemaker failed.
>
> I've attached a screenshot of the
> On Sun, May 5, 2019 at 11:05 PM Andrei Borzenkov
> wrote:
>
>> 05.05.2019 18:43, Arkadiy Kulev wrote:
>>> Dear Andrei,
>>>
>>> I'm sorry for the screenshot, this is the only thing that I have left
>> after
>>> the crash.
>>>
>
rerequisite was successful
stop of resource.
> Sincerely,
> Ark.
>
> e...@ethaniel.com
>
>
> On Sun, May 5, 2019 at 9:46 PM Andrei Borzenkov wrote:
>
>> 05.05.2019 16:14, Arkadiy Kulev wrote:
>>> Hello!
>>>
>>> I run pacemaker on 2 active/activ
On Wed, Jul 3, 2019 at 12:59 AM Ken Gaillot wrote:
>
> On Mon, 2019-07-01 at 23:30 +, Harvey Shepherd wrote:
> > > The "transition summary" is just a resource-by-resource list, not
> > > the
> > > order things will be done. The "executing cluster transition"
> > > section
> > > is the order
On Fri, Jun 28, 2019 at 7:24 AM Harvey Shepherd
wrote:
>
> Hi All,
>
>
> I'm running Pacemaker 2.0.2 on a two node cluster. It runs one master/slave
> resource (I'll refer to it as the king resource) and about 20 other resources
> which are a mixture of:
>
>
> - resources that only run on the
On Thu, Jul 11, 2019 at 12:58 PM Lars Ellenberg
wrote:
>
> On Wed, Jul 10, 2019 at 06:15:56PM +, Michael Powell wrote:
> > Thanks to you and Andrei for your responses. In our particular
> > situation, we want to be able to operate with either node in
> > stand-alone mode, or with both nodes
On Tue, Jul 9, 2019 at 3:54 PM Michael Powell <
michael.pow...@harmonicinc.com> wrote:
> I have a two-node cluster with a problem. If I start Corosync/Pacemaker
> on one node, and then delay startup on the 2nd node (which is otherwise
> up and running), the 2nd node will be rebooted very soon
28.06.2019 9:45, Andrei Borzenkov wrote:
> On Fri, Jun 28, 2019 at 7:24 AM Harvey Shepherd
> wrote:
>>
>> Hi All,
>>
>>
>> I'm running Pacemaker 2.0.2 on a two node cluster. It runs one master/slave
>> resource (I'll refer to it as the king resour
20.04.2019 22:29, Lentes, Bernd wrote:
>
>
> - On 18 Apr 2019 at 16:21, kgaillot kgail...@redhat.com wrote:
>
>>
>> Simply stopping pacemaker and corosync by whatever mechanism your
>> distribution uses (e.g. systemctl) should be sufficient.
>
> That works. But strangely is that after a
On Tue, Jul 16, 2019 at 11:01 AM Nishant Nakate wrote:
>
>> >
>> > I will give you a quick overview of the system. There would be 3 nodes
>> > configured in a cluster. One would act as a leader and others as
>> > followers. Our system would be actively running on all the three nodes and
>> >
02.07.2019 2:30, Harvey Shepherd wrote:
>> The "transition summary" is just a resource-by-resource list, not the
>> order things will be done. The "executing cluster transition" section
>> is the order things are being done.
>
> Thanks Ken. I think that's where the problem is originating. If you
29.06.2019 8:05, Harvey Shepherd wrote:
> There is an ordering constraint - everything must be started after the king
> resource. But even if this constraint didn't exist I don't see that it should
> logically make any difference due to all the non-clone resources being
> colocated with the
I'm using the sbd watchdog and stonith-watchdog-timeout without explicit
stonith agents (shared-nothing cluster). How can I clean up a failed
fencing action?
Current DC: ha1 (version
2.0.1+20190408.1b68da8e8-1.3-2.0.1+20190408.1b68da8e8) - partition with
quorum
Last updated: Sat Aug 3 19:10:12 2019
29.07.2019 22:07, Ken Gaillot wrote:
> On Sat, 2019-07-27 at 11:04 +0300, Andrei Borzenkov wrote:
>> Is it possible to have a single definition of a resource set that is
>> later
>> referenced in order and location constraints? All syntax in
>> documentation or crmsh pre
Sent from my iPhone
12 Aug 2019, at 9:48, Ulrich Windl
wrote:
>>>> Andrei Borzenkov wrote on 09.08.2019 at 18:40 in
> message <217d10d8-022c-eaf6-28ae-a4f58b2f9...@gmail.com>:
>> 09.08.2019 16:34, Yan Gao wrote:
>>> Hi,
>>>
>
Sent from my iPhone
> 12 Aug 2019, at 8:46, Jan Friesse wrote:
>
> Олег Самойлов wrote:
>>> 9 Aug 2019, at 9:25, Jan Friesse wrote:
>>> Please do not set dpd_interval that high. dpd_interval on the qnetd side is
>>> not about how often the ping is sent. Could you please
On Mon, Aug 12, 2019 at 4:12 PM Michael Powell <
michael.pow...@harmonicinc.com> wrote:
> At 07:44:49, the ss agent discovers that the master instance has failed on
> node *mgraid…-0* as a result of a failed *ssadm* request in response to
> an *ss_monitor()* operation. It issues a *crm_master -Q
On Tue, Aug 20, 2019 at 1:03 AM Del Monaco, Andrea
wrote:
>
> Hi Users,
>
>
>
> As per title – do you know if there is some resource in pacemaker that allows
> a filesystem (md array) to be mounted and then run the quotaon command on it
Isn't quota information persistent, so it is enough to run
22.08.2019 12:49, Ulrich Windl wrote:
> Hi!
>
> It's been a while since I used crm shell, and now after having moved from
> SLES11 to SLES12 (having to use it again), I realized a few things:
>
> 1) As the ptest command is crm_simulate now, shouldn't crm shell's ptest (in
> configure) be
22.08.2019 10:07, Ulrich Windl wrote:
> Hi!
>
> When starting pacemaker (1.1.19+20181105.ccd6b5b10-3.10.1) on a node that had
> been down for a while, I noticed some unexpected messages about the node name:
>
> pacemakerd: notice: get_node_name: Could not obtain a node name for
> corosync
27.08.2019 18:24, Casey & Gina wrote:
> Hi, I'm looking for a way to show just location constraints, if they exist,
> for a cluster. I'm looking for the same data shown in the output of `pcs
> config` under the "Location Constraints:" header, but without all the rest,
> so that I can write a
31.08.2019 6:39, Chris Walker wrote:
> Hello,
> The 1.1.19-8 EL7 version of Pacemaker contains a commit ‘Feature: crmd:
> default record-pending to TRUE’ that is not in the ClusterLabs Github repo.
commit b48ceeb041cee65a9b93b9b76235e475fa1a128f
Author: Ken Gaillot
Date: Mon Oct 16 09:45:18
03.09.2019 11:09, Marco Marino wrote:
> Hi, I have a problem with fencing on a two node cluster. It seems that
> randomly the cluster cannot complete monitor operation for fence devices.
> In log I see:
> crmd[8206]: error: Result of monitor operation for fence-node2 on
> ld2.mydomain.it: Timed
04.09.2019 2:03, Tomer Azran wrote:
> Hello,
>
> When using IPaddr2 RA in order to set a cloned IP address resource:
>
> pcs resource create vip1 ocf:heartbeat:IPaddr2 ip=10.0.0.100 iflabel=vip1
> cidr_netmask=24 flush_routes=true op monitor interval=30s
> pcs resource clone vip1 clone-max=2
04.09.2019 0:27, wf...@niif.hu wrote:
> Jeevan Patnaik writes:
>
>> [16187] node1 corosync warning [MAIN ] Corosync main process was not
>> scheduled for 2889.8477 ms (threshold is 800. ms). Consider token
>> timeout increase.
>> [...]
>> 2. How to fix this? We have not much load on the
04.09.2019 1:27, Tomer Azran wrote:
> Hello,
>
> When using IPaddr2 RA in order to set a cloned IP address resource:
>
> pcs resource create vip1 ocf:heartbeat:IPaddr2 ip=10.0.0.100 iflabel=vip1
> cidr_netmask=24 flush_routes=true op monitor interval=30s
> pcs resource clone vip1 clone-max=2
On Mon, Aug 26, 2019 at 9:59 AM Ulrich Windl
wrote:
> Also see my earlier message. If adding the node name to corosync conf is
> highly recommended, I wonder why SUSE's SLES procedure does not set it...
>
If you mean ha-cluster-init/ha-cluster-join, it just invokes "crm
cluster", so you may
On Thu, Sep 12, 2019 at 3:45 PM Ulrich Windl
wrote:
>
> >>> Andrei Borzenkov wrote on 12.09.2019 at 14:21 in
> message
> :
> > On Thu, Sep 12, 2019 at 12:40 PM Ulrich Windl
> > wrote:
> >>
> >> Hi!
> >>
> >> I just d
On Thu, Sep 12, 2019 at 12:40 PM Ulrich Windl
wrote:
>
> Hi!
>
> I just discovered an unpleasant side-effect of this:
> SLES has "zypper ps" to show processes that use obsoleted binaries. Now if any
> resource binary was replaced, zypper suggests to restart pacemaker (which is
> nonsense, of
27.07.2019 11:04, Andrei Borzenkov wrote:
> Is it possible to have a single definition of a resource set that is later
> referenced in order and location constraints? All syntax in
> documentation or crmsh presumes an inline set definition in location or
> order statements.
>
> In th
There is no one-size-fits-all answer. You should enable and configure
stonith in pacemaker (it is disabled now, otherwise the described situation
would not happen). You may also consider the wait_for_all (or better,
two_node) options in corosync, which would prevent pacemaker from starting
unless both nodes are up.
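For reference, a minimal sketch of the corosync.conf quorum section this refers to (standard votequorum options; adapt to your setup):

```
quorum {
    provider: corosync_votequorum
    # two_node: 1 implies wait_for_all and lets the single surviving
    # node keep quorum in a two-node cluster
    two_node: 1
}
```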
On
07.08.2019 12:21, Oleg Ulyanov wrote:
> Hi all,
> I’m facing a problem with fence_vmware_soap on Ubuntu 16.04. Although I was
> able to resolve the missing dependencies by manually installing python
> packages, I am still not able to connect to my vcenter. Apparently it’s a
> problem with the 4.0.22 version and
On Mon, Jul 29, 2019 at 9:52 AM Jan Friesse wrote:
>
> Andrei
>
> Andrei Borzenkov wrote:
> > corosync.service sets StopWhenUnneeded=yes which normally stops it when
>
> This was the case only for a very limited time (v 3.0.1) and it's removed
> now (v 3.0.2) because i
In a two-node cluster + qnetd I consistently see the node that is being
shut down last being reset during shutdown. I.e.
- shutdown the first node - OK
- shutdown the second node - reset
As far as I understand, what happens is:
- during shutdown pacemaker.service is stopped first. In the above,
corosync.service sets StopWhenUnneeded=yes, which normally stops it when
pacemaker is shut down. Unfortunately, corosync-qdevice.service declares
Requires=corosync.service and corosync-qdevice.service itself is *not*
stopped when pacemaker.service is stopped. Which means corosync.service
remains
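One possible workaround for the ordering described above (a sketch under the assumption that the unit names match your distribution; the drop-in path is hypothetical): with PartOf=, stop and restart of corosync.service propagate to corosync-qdevice.service, so corosync is no longer held up as "needed":

```
# /etc/systemd/system/corosync-qdevice.service.d/stop-with-corosync.conf
# Hypothetical drop-in: tie qdevice's lifetime to corosync's, so that
# stopping corosync.service also stops corosync-qdevice.service.
[Unit]
PartOf=corosync.service
```

After adding the drop-in, run `systemctl daemon-reload` for it to take effect.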
Is it possible to have a single definition of a resource set that is later
referenced in order and location constraints? All syntax in
documentation or crmsh presumes an inline set definition in location or
order statements.
In this particular case there will be set of filesystems that need to be
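A sketch of what such reuse could look like in raw CIB XML, assuming the id-ref mechanism (which pacemaker uses elsewhere for reusing elements) is accepted for resource sets; all ids and resource names here are placeholders:

```xml
<constraints>
  <!-- Define the set once, inside the first constraint... -->
  <rsc_order id="order-fs">
    <resource_set id="fs-set" sequential="true">
      <resource_ref id="fs1"/>
      <resource_ref id="fs2"/>
    </resource_set>
  </rsc_order>
  <!-- ...then reference the same set from another constraint by id-ref. -->
  <rsc_colocation id="coloc-fs" score="INFINITY">
    <resource_set id-ref="fs-set"/>
  </rsc_colocation>
</constraints>
```

Whether a given pacemaker version validates this against its schema would need to be checked with crm_verify.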
On Fri, Aug 9, 2019 at 9:25 AM Jan Friesse wrote:
>
> Олег Самойлов wrote:
> > Hello all.
> >
> > I have a test bed with several virtual machines to test pacemaker. I
> > simulate random failure on one of the nodes. The cluster will be on several
> > data centres, so there is no stonith
09.08.2019 16:34, Yan Gao wrote:
> Hi,
>
> With disk-less sbd, it's fine to stop cluster service from the cluster
> nodes all at the same time.
>
> But if to stop the nodes one by one, for example with a 3-node cluster,
> after stopping the 2nd node, the only remaining node resets itself
On Tue, Jul 16, 2019 at 9:48 AM Nishant Nakate wrote:
>
>
> On Tue, Jul 16, 2019 at 11:33 AM Ulrich Windl
> wrote:
>>
>> >>> Nishant Nakate wrote on 16.07.2019 at 05:37
>> >>> in
>> message
>> :
>> > Hi All,
>> >
>> > I am new to this community and HA tools. Need some guidance on my
On Thu, Jul 25, 2019 at 3:20 AM Ondrej wrote:
>
> Is there any plan on getting this also into 1.1 branch?
> If yes, then I would be for just introducing the configuration option in
> 1.1.x with default to 'stop'.
>
+1 for backporting it, from someone who just recently hit this
(puzzling)
23.09.2019 23:23, Vitaly Zolotusky wrote:
> Hello,
> I am trying to upgrade to Fedora 30. The platform is two node cluster with
> pacemaker.
> In Fedora 28 we were using the old fence_sbd script from 2013:
>
> # This STONITH script drives the shared-storage stonith plugin.
> # Copyright (C) 2013
09.07.2019 13:08, Danka Ivanović wrote:
> Hi, I didn't manage to start the master with postgres, even though I
> increased the start timeout. I checked executable paths and start options.
> When cluster is running with manually started master and slave started over
> pacemaker, everything works ok. Today we
On Wed, Jul 10, 2019 at 12:42 PM Jehan-Guillaume de Rorthais
wrote:
>
> > > Jul 09 09:16:32 [2679] postgres1 lrmd:debug:
> > > child_kill_helper: Kill pid 12735's group Jul 09 09:16:34 [2679]
> > > postgres1 lrmd: warning: child_timeout_callback:
> > > PGSQL_monitor_15000
On Wed, Jul 10, 2019 at 12:42 PM Jehan-Guillaume de Rorthais
wrote:
>
> > P.S. crm_resource is called by resource agent (pgsqlms). And it shows
> > result of original resource probing which makes it confusing. At least
> > it explains where these logs entries come from.
>
> Not sure tu understand
30.10.2019 15:46, RAM PRASAD TWISTED ILLUSIONS wrote:
> Hi everyone,
>
> I am trying to set up a storage cluster with two nodes, both running debian
> buster. The two nodes called, duke and miles, have a LUN residing on a SAN
> box as their shared storage device between them. As you can see in
06.11.2019 18:55, Ken Gaillot wrote:
> On Wed, 2019-11-06 at 08:04 +0100, Ulrich Windl wrote:
> Ken Gaillot wrote on 05.11.2019 at
> 16:05 in
>>
>> message
>> :
>>> Coincidentally, the documentation for the pcmk_host_check default
>>> was
>>> recently updated for the upcoming 2.0.3
On Thu, Dec 5, 2019 at 1:04 AM Jan Pokorný wrote:
>
> On 04/12/19 21:19 +0100, Jan Pokorný wrote:
> > OTOH, this enforced split of state transitions is perhaps what makes
> > the transaction (comprising perhaps countless other interdependent
> > resources) serializable and thus feasible at all
16.12.2019 18:26, Stefan K wrote:
> I think I got it..
>
> It looks like that (A)
> order pcs_rsc_order_set_iscsi-server_haip iscsi-server:start
> iscsi-lun00:start iscsi-lun01:start iscsi-lun02:start ha-ip:start
> symmetrical=false
It is different from the configuration you showed originally.
>
28.10.2019 20:00, Jean-Francois Malouin wrote:
> Hi,
>
> Building a new pacemaker cluster using corosync 3.0 and pacemaker 2.0.1 on
> Debian/Buster 10
> I get this error when trying to insert an order constraint in the CIB to first
> promote drbd to primary
> then start/scan LVM. It used to work
According to it, you have a symmetric cluster (and apparently made a typo
trying to change it)
On Fri, Oct 18, 2019 at 10:29 AM Raffaele Pantaleoni
wrote:
>
> > On 17/10/2019 18:08, Ken Gaillot wrote:
> > This does sound odd, possibly a bug. Can you provide the output of "pcs
>
On Tue, Oct 15, 2019 at 11:58 AM Yan Gao wrote:
> >
> > Help for "move" still says:
> > resource# help move
> > Move a resource to another node
> >
> > Move a resource away from its current location.
> Looks like an issue in the version of crmsh.
>
> Xin, could you please take a look?
>
>
24.10.2019 16:54, Sherrard Burton wrote:
> background:
> we are upgrading a (very) old HA cluster running heartbeat DRBD and NFS,
> with no stonith, to a much more modern implementation. for the existing
> cluster, as well as the new one, the disk space requirements make
> running a full
On Fri, Oct 25, 2019 at 9:03 AM jyd <471204...@qq.com> wrote:
>
> Hi:
> I want to use pacemaker to manage a resource named A; I want A only
> started on one node,
> and only when the node is down, or A cannot be started on this node, will
> the A resource be started on other nodes.
> And config a
28.10.2019 22:44, Jean-Francois Malouin wrote:
> Hi,
>
> Is there any new magic that I'm unaware of that needs to be added to a
> pacemaker cluster using a DRBD nested setup? pacemaker 2.0.x and DRBD 8.4.10
> on
> Debian/Buster on a 2-node cluster with stonith.
> Eventually this will host a
21.10.2019 9:39, Ulrich Windl wrote:
"Dileep V Nair" wrote on 20.10.2019 at 17:54 in
> message
>
> m>:
>
>> Hi,
>>
>> I am confused about the best way to stop pacemaker on both nodes of a
>> two node cluster. The options I know of are
>> 1. Put the cluster in Maintenance Mode,
23.10.2019 13:35, Ulrich Windl wrote:
> Hi!
>
> In SLES12 SP4 I'm kind of annoyed due to repeating messages "unpack_config:
> Watchdog will be used via SBD if fencing is required".
>
> While examining another problem, I found this sequence:
> * Some unrelated resource was moved (migrated)
>
18.10.2019 12:43, Raffaele Pantaleoni wrote:
>
>> On 18/10/2019 10:21, Andrei Borzenkov wrote:
>> According to it, you have a symmetric cluster (and apparently made a typo
>> trying to change it)
>>
>> > name="symmetric-cluster" value=&quo
29.11.2019 16:37, Dennis Jacobfeuerborn wrote:
Hi,
I'm currently trying to set up a drbd 8.4 resource in a 3-node pacemaker
cluster. The idea is to have nodes storage1 and storage2 running with
the drbd clones and only use the third node storage3 for quorum.
The way I'm trying to do this:
pcs
29.11.2019 17:46, Jan Pokorný wrote:
On 27/11/19 20:13 +, matt_murd...@amat.com wrote:
I finally understand that there is an Apache resource for Pacemaker
that assigns a single virtual IP address that "floats" between two
nodes as in webservers.
What happens if a node loses both the interconnect and the shared device? I
assume the node will reboot, correct?
Now assuming (two-node cluster) the second node can still access the shared
device, it will fence (via SBD) and continue takeover, right?
If both nodes lost the shared device, both nodes will reboot and