Re: [ClusterLabs] why is node fenced ?

2020-08-10 Thread Ken Gaillot
On Sun, 2020-08-09 at 22:17 +0200, Lentes, Bernd wrote:
> 
> - On 29 Jul 2020 at 18:53, kgaillot kgail...@redhat.com wrote:
> 
> > On Wed, 2020-07-29 at 17:26 +0200, Lentes, Bernd wrote:
> > > Hi,
> > > 
> > > a few days ago one of my nodes was fenced and i don't know why,
> > > which is something i really don't like.
> > > What i did:
> > > I put one node (ha-idg-1) in standby. The resources on it (most of
> > > all virtual domains) were migrated to ha-idg-2,
> > > except one domain (vm_nextcloud). On ha-idg-2 a mountpoint was
> > > missing the xml of the domain points to.
> > > Then the cluster tries to start vm_nextcloud on ha-idg-2 which of
> > > course also failed.
> > > Then ha-idg-1 was fenced.
> > > I did a "crm history" over the respective time period, you find it
> > > here:
> > > https://hmgubox2.helmholtz-muenchen.de/index.php/s/529dfcXf5a72ifF
> > > 
> > > Here, from my point of view, the most interesting from the logs:
> > > ha-idg-1:
> > > Jul 20 16:59:33 [23763] ha-idg-1cib: info:
> > > cib_perform_op:  Diff: --- 2.16196.19 2
> > > Jul 20 16:59:33 [23763] ha-idg-1cib: info:
> > > cib_perform_op:  Diff: +++ 2.16197.0 bc9a558dfbe6d7196653ce56ad1ee758
> > > Jul 20 16:59:33 [23763] ha-idg-1cib: info:
> > > cib_perform_op:  +  /cib:  @epoch=16197, @num_updates=0
> > > Jul 20 16:59:33 [23763] ha-idg-1cib: info:
> > > cib_perform_op:  +  /cib/configuration/nodes/node[@id='1084777482']/
> > > instance_attributes[@id='nodes-1084777482']/nvpair[@id='nodes-1084777482-standby']:  @value=on
> > > ha-idg-1 set to standby
> > > 
> > > Jul 20 16:59:34 [23768] ha-idg-1   crmd:   notice:
> > > process_lrm_event:   ha-idg-1-vm_nextcloud_migrate_to_0:3169 [
> > > error: Cannot access storage file
> > > '/mnt/mcd/AG_BioInformatik/Technik/software_und_treiber/linux/ubuntu/
> > > ubuntu-18.04.4-live-server-amd64.iso': No such file or
> > > directory\nocf-exit-reason:vm_nextcloud: live migration to ha-idg-2
> > > failed: 1\n ]
> > > migration failed
> > > 
> > > Jul 20 17:04:01 [23767] ha-idg-1pengine:error:
> > > native_create_actions:   Resource vm_nextcloud is active on 2 nodes
> > > (attempting recovery)
> > > ???
> > 
> > This is standard for a failed live migration -- the cluster doesn't
> > know how far the migration actually got before failing, so it has to
> > assume the VM could be active on either node. (The log message would
> > make more sense saying "might be active" rather than "is active".)
> > 
> > > Jul 20 17:04:01 [23767] ha-idg-1pengine:   notice:
> > > LogAction:   * Recover   vm_nextcloud   ( ha-idg-2 )
> > 
> > The recovery from that situation is a full stop on both nodes, and
> > start on one of them.
> > 
> > > Jul 20 17:04:01 [23768] ha-idg-1   crmd:   notice:
> > > te_rsc_command:  Initiating stop operation vm_nextcloud_stop_0 on
> > > ha-idg-2 | action 106
> > > Jul 20 17:04:01 [23768] ha-idg-1   crmd:   notice:
> > > te_rsc_command:  Initiating stop operation vm_nextcloud_stop_0
> > > locally on ha-idg-1 | action 2
> > > 
> > > Jul 20 17:04:01 [23768] ha-idg-1   crmd: info:
> > > match_graph_event:   Action vm_nextcloud_stop_0 (106) confirmed
> > > on ha-idg-2 (rc=0)
> > > 
> > > Jul 20 17:04:06 [23768] ha-idg-1   crmd:   notice:
> > > process_lrm_event:   Result of stop operation for vm_nextcloud on
> > > ha-idg-1: 0 (ok) | call=3197 key=vm_nextcloud_stop_0 confirmed=true
> > > cib-update=5960
> > 
> > It looks like both stops succeeded.
> > 
> > > Jul 20 17:05:29 [23761] ha-idg-1 pacemakerd:   notice:
> > > crm_signal_dispatch: Caught 'Terminated' signal | 15 (invoking
> > > handler)
> > > systemctl stop pacemaker.service
> > > 
> > > 
> > > ha-idg-2:
> > > Jul 20 17:04:03 [10691] ha-idg-2   crmd:   notice:
> > > process_lrm_event:   Result of stop operation for vm_nextcloud on
> > > ha-idg-2: 0 (ok) | call=157 key=vm_nextcloud_stop_0 confirmed=true
> > > cib-update=57
> > > the log from ha-idg-2 is two seconds ahead of ha-idg-1
> > > 
> > > Jul 20 17:04:08 [10688] ha-idg-2   lrmd:   notice:
> > > log_execute: executing - rsc:vm_nextcloud action:start call_id:192
> > > Jul 20 17:04:09 [10688] ha-idg-2   lrmd:   notice:
> > > operation_finished:  vm_nextcloud_start_0:29107:stderr [ error:
> > > Failed to create domain from /mnt/share/vm_nextcloud.xml ]
> > > Jul 20 17:04:09 [10688] ha-idg-2   lrmd:   notice:
> > > operation_finished:  vm_nextcloud_start_0:29107:stderr [ error:
> > > Cannot access storage file
> > > '/mnt/mcd/AG_BioInformatik/Technik/software_und_treiber/linux/ubuntu/
> > > ubuntu-18.04.4-live-server-amd64.iso': No such file or directory ]
> > > Jul 20 17:04:09 [10688] ha-idg-2   lrmd:   notice:
> > > operation_finished:  

[ClusterLabs] How to specify which IP pcs should use?

2020-08-10 Thread Mariusz Gronczewski
Hi,

Pacemaker 2, the current setup is

* management network with host's hostname resolving to host's
  management IP
* cluster network for Pacemaker/Corosync communication
* corosync set up with node name and IP of the cluster network

pcs status shows both nodes online, added config syncs to the other
node, etc., but pcs cluster status shows one node as offline.

After a look at the firewall logs it appears all of the communication is
going just fine over the cluster network, but pcs tries to talk to pcsd
on port 2224 via the *management* network instead of using the IP set as
ring0_addr in corosync.
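
For reference, the relevant part of corosync.conf looks roughly like this
(addresses and names below are made up, just to illustrate the layout):

  nodelist {
      node {
          ring0_addr: 10.10.0.1    # cluster-network IP
          name: node1
          nodeid: 1
      }
      node {
          ring0_addr: 10.10.0.2
          name: node2
          nodeid: 2
      }
  }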

Is "just use host's hostname regardless of config" something normal ?
Is there a separate setting to pcs about which IP it should use ?

Regards

-- 
Mariusz Gronczewski, Administrator

Efigence S. A.
ul. Wołoska 9a, 02-583 Warszawa
T:   [+48] 22 380 13 13
NOC: [+48] 22 380 10 20
E: ad...@efigence.com


Re: [ClusterLabs] why is node fenced ?

2020-08-10 Thread Lentes, Bernd


- On 29 Jul 2020 at 18:53, kgaillot kgail...@redhat.com wrote:

> On Wed, 2020-07-29 at 17:26 +0200, Lentes, Bernd wrote:
>> Hi,
>> 
>> a few days ago one of my nodes was fenced and i don't know why, which
>> is something i really don't like.
>> What i did:
>> I put one node (ha-idg-1) in standby. The resources on it (most of
>> all virtual domains) were migrated to ha-idg-2,
>> except one domain (vm_nextcloud). On ha-idg-2 a mountpoint was
>> missing the xml of the domain points to.
>> Then the cluster tries to start vm_nextcloud on ha-idg-2 which of
>> course also failed.
>> Then ha-idg-1 was fenced.
>> I did a "crm history" over the respective time period, you find it
>> here:
>> https://hmgubox2.helmholtz-muenchen.de/index.php/s/529dfcXf5a72ifF
>> 
>> Here, from my point of view, the most interesting from the logs:
>> ha-idg-1:
>> Jul 20 16:59:33 [23763] ha-idg-1cib: info:
>> cib_perform_op:  Diff: --- 2.16196.19 2
>> Jul 20 16:59:33 [23763] ha-idg-1cib: info:
>> cib_perform_op:  Diff: +++ 2.16197.0 bc9a558dfbe6d7196653ce56ad1ee758
>> Jul 20 16:59:33 [23763] ha-idg-1cib: info:
>> cib_perform_op:  +  /cib:  @epoch=16197, @num_updates=0
>> Jul 20 16:59:33 [23763] ha-idg-1cib: info:
>> cib_perform_op:  +  /cib/configuration/nodes/node[@id='1084777482']/
>> instance_attributes[@id='nodes-1084777482']/nvpair[@id='nodes-1084777482-standby']:  @value=on
>> ha-idg-1 set to standby
>> 
>> Jul 20 16:59:34 [23768] ha-idg-1   crmd:   notice:
>> process_lrm_event:   ha-idg-1-vm_nextcloud_migrate_to_0:3169 [
>> error: Cannot access storage file
>> '/mnt/mcd/AG_BioInformatik/Technik/software_und_treiber/linux/ubuntu/
>> ubuntu-18.04.4-live-server-amd64.iso': No such file or
>> directory\nocf-exit-reason:vm_nextcloud: live migration to ha-idg-2
>> failed: 1\n ]
>> migration failed
>> 
>> Jul 20 17:04:01 [23767] ha-idg-1pengine:error:
>> native_create_actions:   Resource vm_nextcloud is active on 2 nodes
>> (attempting recovery)
>> ???
> 
> This is standard for a failed live migration -- the cluster doesn't
> know how far the migration actually got before failing, so it has to
> assume the VM could be active on either node. (The log message would
> make more sense saying "might be active" rather than "is active".)
> 
>> Jul 20 17:04:01 [23767] ha-idg-1pengine:   notice:
>> LogAction:   * Recover   vm_nextcloud   ( ha-idg-2 )
> 
> The recovery from that situation is a full stop on both nodes, and
> start on one of them.
> 
>> Jul 20 17:04:01 [23768] ha-idg-1   crmd:   notice:
>> te_rsc_command:  Initiating stop operation vm_nextcloud_stop_0 on ha-
>> idg-2 | action 106
>> Jul 20 17:04:01 [23768] ha-idg-1   crmd:   notice:
>> te_rsc_command:  Initiating stop operation vm_nextcloud_stop_0
>> locally on ha-idg-1 | action 2
>> 
>> Jul 20 17:04:01 [23768] ha-idg-1   crmd: info:
>> match_graph_event:   Action vm_nextcloud_stop_0 (106) confirmed
>> on ha-idg-2 (rc=0)
>> 
>> Jul 20 17:04:06 [23768] ha-idg-1   crmd:   notice:
>> process_lrm_event:   Result of stop operation for vm_nextcloud on
>> ha-idg-1: 0 (ok) | call=3197 key=vm_nextcloud_stop_0 confirmed=true
>> cib-update=5960
> 
> It looks like both stops succeeded.
> 
>> Jul 20 17:05:29 [23761] ha-idg-1 pacemakerd:   notice:
>> crm_signal_dispatch: Caught 'Terminated' signal | 15 (invoking
>> handler)
>> systemctl stop pacemaker.service
>> 
>> 
>> ha-idg-2:
>> Jul 20 17:04:03 [10691] ha-idg-2   crmd:   notice:
>> process_lrm_event:   Result of stop operation for vm_nextcloud on
>> ha-idg-2: 0 (ok) | call=157 key=vm_nextcloud_stop_0 confirmed=true
>> cib-update=57
>> the log from ha-idg-2 is two seconds ahead of ha-idg-1
>> 
>> Jul 20 17:04:08 [10688] ha-idg-2   lrmd:   notice:
>> log_execute: executing - rsc:vm_nextcloud action:start
>> call_id:192
>> Jul 20 17:04:09 [10688] ha-idg-2   lrmd:   notice:
>> operation_finished:  vm_nextcloud_start_0:29107:stderr [ error:
>> Failed to create domain from /mnt/share/vm_nextcloud.xml ]
>> Jul 20 17:04:09 [10688] ha-idg-2   lrmd:   notice:
>> operation_finished:  vm_nextcloud_start_0:29107:stderr [ error:
>> Cannot access storage file
>> '/mnt/mcd/AG_BioInformatik/Technik/software_und_treiber/linux/ubuntu/
>> ubuntu-18.04.4-live-server-amd64.iso': No such file or directory ]
>> Jul 20 17:04:09 [10688] ha-idg-2   lrmd:   notice:
>> operation_finished:  vm_nextcloud_start_0:29107:stderr [ ocf-
>> exit-reason:Failed to start virtual domain vm_nextcloud. ]
>> Jul 20 17:04:09 [10688] ha-idg-2   lrmd:   notice:
>> log_finished:finished - rsc:vm_nextcloud action:start call_id:192
>> pid:29107 exit-code:1 exec-time:581ms queue-time:0ms
>> start on ha-idg-2 failed
> 
> The start failed ...
> 
>> Jul 20 17:05:32 [10691] ha-idg-2   crmd: info:
>> do_dc_takeover:  Taking over DC status for this partition
>> ha-idg-1 stopped pacemaker
> 
> Since the 

[ClusterLabs] Coming in Pacemaker 2.0.5: on-fail=demote / no-quorum-policy=demote

2020-08-10 Thread Ken Gaillot
Hi all,

Looking ahead to the Pacemaker 2.0.5 release expected at the end of
this year, here is a new feature already in the master branch.

When configuring resource operations, Pacemaker lets you set an "on-
fail" policy to specify whether to restart the resource, fence the
node, etc., if the operation fails. With 2.0.5, a new possible value
will be "demote", which will mean "demote this resource but do not
fully restart it".

"Demote" will be a valid value only for promote actions, and for
recurring monitors with "role" set to "Master".

Once the resource is demoted, it will be eligible for promotion again,
so if the promotion scores have not changed, a promote on the same node
may be attempted. If this is not desired, the agent can change the
promotion scores either in the failed monitor or the demote.
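
For instance, a monitor that detects a recoverable problem could drop the
node's promotion score before returning failure, roughly like this
(a sketch; crm_master is the usual helper wrapped around crm_attribute):

  crm_master -Q -l reboot -D   # remove this node's promotion score so it is not immediately re-promoted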

The intended use case is an application where a successful demote 
assures a well-functioning service, and a full restart would be
unnecessarily heavyweight. A large database might be an example.

Similarly, Pacemaker offers the cluster-wide "no-quorum-policy" option
to specify what happens to resources when quorum is lost (the default
being to stop them). With 2.0.5, "demote" will be a possible value here
as well, and will mean "demote all promotable resources and stop all
other resources".

The intended use case is an application that cannot cause any harm
after being demoted, and may be useful in a demoted role even if there
is no quorum. A database that operates read-only when demoted and
doesn't depend on any non-promotable resources might be an example.

Happy clustering :)
-- 
Ken Gaillot 



Re: [ClusterLabs] why is node fenced ?

2020-08-10 Thread Lentes, Bernd

- On 29 Jul 2020 at 18:53, kgaillot kgail...@redhat.com wrote:

 
> Since the ha-idg-2 is now shutting down, ha-idg-1 becomes DC.

The other way round.

>> Jul 20 17:05:33 [10690] ha-idg-2pengine:  warning:
>> unpack_rsc_op_failure:   Processing failed migrate_to of vm_nextcloud
>> on ha-idg-1: unknown error | rc=1
>> Jul 20 17:05:33 [10690] ha-idg-2pengine:  warning:
>> unpack_rsc_op_failure:   Processing failed start of vm_nextcloud on
>> ha-idg-2: unknown error | rc
>> 
>> Jul 20 17:05:33 [10690] ha-idg-2pengine: info:
>> native_color:Resource vm_nextcloud cannot run anywhere
>> logical
>> 
>> Jul 20 17:05:33 [10690] ha-idg-2pengine:  warning:
>> custom_action:   Action vm_nextcloud_stop_0 on ha-idg-1 is unrunnable
>> (pending)
>> ???
> 
> So this appears to be the problem. From these logs I would guess the
> successful stop on ha-idg-1 did not get written to the CIB for some
> reason. I'd look at the pe input from this transition on ha-idg-2 to
> confirm that.
> 
> Without the DC knowing about the stop, it tries to schedule a new one,
> but the node is shutting down so it can't do it, which means it has to
> be fenced.
> 
>> Jul 20 17:05:35 [10690] ha-idg-2pengine:  warning:
>> custom_action:   Action vm_nextcloud_stop_0 on ha-idg-1 is unrunnable
>> (offline)
>> Jul 20 17:05:35 [10690] ha-idg-2pengine:  warning:
>> pe_fence_node:   Cluster node ha-idg-1 will be fenced: resource
>> actions are unrunnable
>> Jul 20 17:05:35 [10690] ha-idg-2pengine:  warning:
>> stage6:  Scheduling Node ha-idg-1 for STONITH
>> Jul 20 17:05:35 [10690] ha-idg-2pengine: info:
>> native_stop_constraints: vm_nextcloud_stop_0 is implicit after ha-
>> idg-1 is fenced
>> Jul 20 17:05:35 [10690] ha-idg-2pengine:   notice:
>> LogNodeActions:   * Fence (Off) ha-idg-1 'resource actions are
>> unrunnable'
>> 
>> 
>> Why does it say "Jul 20 17:05:35 [10690] ha-idg-
>> 2pengine:  warning: custom_action:   Action vm_nextcloud_stop_0
>> on ha-idg-1 is unrunnable (offline)" although
>> "Jul 20 17:04:06 [23768] ha-idg-1   crmd:   notice:
>> process_lrm_event:   Result of stop operation for vm_nextcloud on
>> ha-idg-1: 0 (ok) | call=3197 key=vm_nextcloud_stop_0 confirmed=true
>> cib-update=5960"
>> says that stop was ok ?
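
(The pe input from that transition should show what the DC thought it knew
at that point; it can be replayed with something like
"crm_simulate -S -x /var/lib/pacemaker/pengine/pe-input-NNN.bz2" on
ha-idg-2, where NNN is the input number referenced in the logs.)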

Bernd
Helmholtz Zentrum München

Helmholtz Zentrum Muenchen
Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
Ingolstaedter Landstr. 1
85764 Neuherberg
www.helmholtz-muenchen.de
Chair of the Supervisory Board: MinDir.in Prof. Dr. Veronika von Messling
Management: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Kerstin Guenther
Court of registration: Amtsgericht Muenchen HRB 6466
VAT ID: DE 129521671




[ClusterLabs] Automatic recover from split brain ?

2020-08-10 Thread Adam Cécile

Hello,


I'm experiencing an issue with corosync/pacemaker running on Debian Buster. 
The cluster has three nodes running in VMware virtual machines, and the 
cluster fails when VEEAM backs up the virtual machines (I know it does 
bad things, like freezing the VM completely for a few minutes to take a 
disk snapshot).


My biggest issue is that once the backup has been completed, the cluster 
stays in a split-brain state, and I'd like it to heal itself. Here is the 
current status:



One node is isolated:

Stack: corosync
Current DC: host2.domain.com (version 2.0.1-9e909a5bdd) - partition 
WITHOUT quorum

Last updated: Sat Aug  8 11:59:46 2020
Last change: Fri Jul 24 07:18:12 2020 by root via cibadmin on 
host1.domain.com


3 nodes configured
6 resources configured

Online: [ host2.domain.com ]
OFFLINE: [ host3.domain.com host1.domain.com ]


The two other nodes see each other:

Stack: corosync
Current DC: host3.domain.com (version 2.0.1-9e909a5bdd) - partition with 
quorum

Last updated: Sat Aug  8 12:07:56 2020
Last change: Fri Jul 24 07:18:12 2020 by root via cibadmin on 
host1.domain.com


3 nodes configured
6 resources configured

Online: [ host3.domain.com host1.domain.com ]
OFFLINE: [ host2.domain.com ]


The problem is that one of the resources is a floating IP address which 
is currently assigned to two different hosts...



Can you help me configure the cluster correctly so this cannot occur?


Thanks in advance,

Adam.



Re: [ClusterLabs] Automatic recover from split brain ?

2020-08-10 Thread Ken Gaillot
On Sun, 2020-08-09 at 21:11 +0200, Adam Cécile wrote:
> Hello,
> 
> 
> I'm experiencing issue with corosync/pacemaker running on Debian
> Buster. 
> Cluster has three nodes running in VMWare virtual machine and the 
> cluster fails when VEEAM backups the virtual machine (I know it's
> doing 
> bad things, like freezing completely the VM for a few minutes to
> make 
> disk snapshot).
> 
> My biggest issue is that once the backup has been completed, the
> cluster 
> stays in split brain state, and I'd like it to heal itself. Here 

Fencing is how the cluster prevents split-brain. When one node is lost,
the other nodes will not recover any resources from it until it's
fenced. For VMWare there's a fence_vmware_soap fence agent.
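
A VMware fencing resource might be set up roughly like this (sketch only --
hostnames and credentials are placeholders, and the exact parameter names
should be checked with "pcs stonith describe fence_vmware_soap" for your
fence-agents version):

  pcs stonith create vmfence fence_vmware_soap \
      ipaddr=vcenter.example.com login=fence-user passwd=secret \
      ssl=1 ssl_insecure=1 \
      pcmk_host_map="host1.domain.com:host1-vm;host2.domain.com:host2-vm;host3.domain.com:host3-vm"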

However that's intended for failure scenarios, not a planned outage
like a backup snapshot.

For planned outages, you can set the cluster-wide
property "maintenance-mode" to true. The cluster won't start, monitor,
or stop resources while in maintenance mode. You can use rules to
automatically put the cluster in maintenance mode at specific times.

However I believe even in maintenance mode, the node will get fenced if
it drops out of the corosync membership. Ideally you'd put the cluster
in maintenance mode, stop pacemaker and corosync on the node, do the
backup, then start pacemaker and corosync, wait for them to come up,
and take the cluster out of maintenance mode.
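
A rough sketch of that sequence, assuming pcs (crm or crm_attribute would
work just as well):

  pcs property set maintenance-mode=true
  systemctl stop pacemaker corosync        # on the node being backed up
  # ... run the VEEAM backup ...
  systemctl start corosync pacemaker
  pcs property set maintenance-mode=false  # once the node has rejoined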

Alternatively, if you want the resources to move to other nodes while
the backup is being done, you could put the node in standby rather than
set maintenance mode.
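
For example (again assuming pcs; host name taken from your status output):

  pcs node standby host2.domain.com
  # ... run the backup ...
  pcs node unstandby host2.domain.com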

> current 
> status:
> 
> 
> One node is isolated:
> 
> Stack: corosync
> Current DC: host2.domain.com (version 2.0.1-9e909a5bdd) - partition 
> WITHOUT quorum
> Last updated: Sat Aug  8 11:59:46 2020
> Last change: Fri Jul 24 07:18:12 2020 by root via cibadmin on 
> host1.domain.com
> 
> 3 nodes configured
> 6 resources configured
> 
> Online: [ host2.domain.com ]
> OFFLINE: [ host3.domain.com host1.domain.com ]
> 
> 
> Two others are seeing each others:
> 
> Stack: corosync
> Current DC: host3.domain.com (version 2.0.1-9e909a5bdd) - partition
> with 
> quorum
> Last updated: Sat Aug  8 12:07:56 2020
> Last change: Fri Jul 24 07:18:12 2020 by root via cibadmin on 
> host1.domain.com
> 
> 3 nodes configured
> 6 resources configured
> 
> Online: [ host3.domain.com host1.domain.com ]
> OFFLINE: [ host2.domain.com ]
> 
> 
> The problem is that one of the resources is a floating IP address
> which 
> is currently assigned to two different hosts...
> 
> 
> Can you help me configuring the cluster correctly so this cannot
> occurs ?
> 
> 
> Thanks in advance,
> 
> Adam.
> 
> 
-- 
Ken Gaillot 


