Re: [ClusterLabs] Updated attribute is not displayed in crm_mon

2017-08-16 Thread Ken Gaillot
>+ ringnumber_1 : 192.168.102.132 is UP > > Regards, > Kazunori INOUE > > > -Original Message- > > From: Ken Gaillot [mailto:kgail...@redhat.com] > > Sent: Tuesday, August 15, 2017 2:42 AM > > To: Cluster Labs - All topics relat

Re: [ClusterLabs] Antw: Retries before setting fail-count to INFINITY

2017-08-21 Thread Ken Gaillot
> > behavior or will this have to be coded in the resource agent? > > See above. > > > > > Thanks, > > Vaibhaw > > > > > > ___ > Users mailing list: Users@clusterlabs.org > http://lists.cl

Re: [ClusterLabs] ClusterLabs.Org Documentation Problem?

2017-08-22 Thread Ken Gaillot
ne that is updated, and it's mostly independent of the underlying layer, so you should prefer that set. I plan to reorganize that page in the coming months, so I'll try to make it clearer. -- Ken Gaillot ___ Users mailing lis

Re: [ClusterLabs] start one node only?

2017-08-24 Thread Ken Gaillot
You may want to set wait_for_all back to 1 once your cluster is back to normal. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org http://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting sta

Re: [ClusterLabs] ClusterLabs.Org Documentation Problem?

2017-08-24 Thread Ken Gaillot
ed with corosync 1.x as > a configuration layer and (more important) quorum provider. With Corosync 2.x > quorum provider is already in corosync so no need for cman. > > > > > > > -- > > Eric Robinson > > > > -Original Message- > > From: Ken Gaillo

Re: [ClusterLabs] start one node only?

2017-08-24 Thread Ken Gaillot
On Thu, 2017-08-24 at 15:53 -0500, Dimitri Maziuk wrote: > On 08/24/2017 03:40 PM, Ken Gaillot wrote: > > > You could set wait_for_all to 0 in corosync.conf, then boot. The living > > node should try to fence the other one, and proceed if fencing succeeds. > > Didn

Re: [ClusterLabs] Pacemaker in Azure

2017-08-24 Thread Ken Gaillot
w.clusterlabs.org/doc/Cluster_from_Scratch.pdf > Bugs: http://bugs.clusterlabs.org > > > > > ___ > Users mailing list: Users@clusterlabs.org > http://lists.clusterlabs.org/mailman/listinfo/users > > Projec

Re: [ClusterLabs] VirtualDomain live migration error

2017-08-31 Thread Ken Gaillot
> Anybody has experienced the same issue? > > > Thanks in advance for your help If something works from the command line but not when run by a daemon, my first suspicion is SELinux. Check the audit log for denials around that time. I'd also check the system log and Pacemaker d

Re: [ClusterLabs] Pacemaker stopped monitoring the resource

2017-08-31 Thread Ken Gaillot
( http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#ap-ocf ); then add your resource to the cluster without migration-threshold or failure-timeout, and work out any issues with frequent failures; then finally set migration-threshold and fai

Re: [ClusterLabs] Is there a way to ignore a single monitoring timeout

2017-08-31 Thread Ken Gaillot
h leads to a monitoring timeout and > resource restart etc > > Is there any way to ignore one timed out monitoring request and react only on > two (or more) failed requests in a row? > > Best regards, > Klecho Not currently, but that is p

Re: [ClusterLabs] VirtualDomain live migration error

2017-08-31 Thread Ken Gaillot
n vdicnode01 | action 6 > Aug 31 23:38:31 [1536] vdicnode01 crmd: info: > do_lrm_rsc_op: Performing > key=6:7:0:fe1a9b0a-816c-4b97-96cb-b90dbf71417a > op=vm-vdicdb01_monitor_1 > Aug 31 23:38:31 [1536] vdicnode01 crmd: info: > process_lrm

Re: [ClusterLabs] VirtualDomain live migration error

2017-09-01 Thread Ken Gaillot
> Is there any way to check ssh keys? I'd just log in once to the host as root from the cluster nodes, to make sure it works, and accept the host when asked. > > Sorry for all these questions. > > > Thanks a lot > > > > > > > On 1 Sept. 2

Re: [ClusterLabs] Pacemaker stopped monitoring the resource

2017-09-01 Thread Ken Gaillot
r rechecks the current state to see if anything needs to be done. last-lrm-refresh is just a dummy property that the cluster uses to trigger that. It's set in certain rare circumstances when a resource cleanup is done. You should see a line in your logs like "Triggering a refres

Re: [ClusterLabs] Pacemaker stopped monitoring the resource

2017-09-05 Thread Ken Gaillot
ne of the nodes is elected the "DC" at any given time. That node calculates what needs to be done about failures. It looks like the other node was DC at this time, so its logs will be more relevant. It's fine for this node not to have logs if the DC didn't ask it to do anythin

Re: [ClusterLabs] pacemaker fencing

2017-09-05 Thread Ken Gaillot
Beware that with multiple agents on one level pacemaker always does > on/off and no reboot. > But for the higher level instance you can map the on-action to reboot > and the off-action to metadata. > While for the lower prio level you would just map the on-action to > metadata (to make it

[ClusterLabs] 2017 ClusterLabs Summit -- Pacemaker 1.2.0 or 2.0 talk

2017-09-06 Thread Ken Gaillot
. The purpose of this email is to start a discussion about these changes. Nothing is set in stone. We do want to focus more on removing legacy usage rather than adding new features in the 2.0 release. Anyone who has an opinion or questions about the changes mentioned above, or suggestions fo

Re: [ClusterLabs] Corosync on a home network

2017-09-12 Thread Ken Gaillot
rlabs.org/mailman/listinfo/users > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf > Bugs: http://bugs.clusterlabs.org -- Ken Gaillot ___ Users maili

Re: [ClusterLabs] How to avoid stopping ordered resources on cleanup?

2017-09-15 Thread Ken Gaillot
1380:8:e2c19428-0707-4677-a89a- > ff1c19ebe57c op=mvno-100_monitor_9000 > sba(mvno-100)[57166]:   2017/09/13_06:43:55 INFO: mvno-100 monitor > started > sba(mvno-100)[57166]:   2017/09/13_06:43:55 INFO: Check status > Sep 13 06:43:55 [3826] bam1-omc    cib: info: > cib_perform_op:   Diff: --- 0.168.7

Re: [ClusterLabs] Hundreds resources on two node cluster

2017-09-15 Thread Ken Gaillot
t let us know if you encounter anything else. > > Best Regards. > · > Roberto Muñoz > > > > Before printing, think of the ENVIRONMENT > LEGAL NOTICE/DISCLAIMER -- Ken Gaillot ___ Us

Re: [ClusterLabs] Force stopping the resources from a resource group in parallel

2017-09-15 Thread Ken Gaillot
erlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch. > pdf > Bugs: http://bugs.clusterlabs.org -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org http://lists.clusterlabs.org/mailman/listinfo/users Project Home: h

Re: [ClusterLabs] High CPU during CIB sync

2017-09-15 Thread Ken Gaillot
but a resource agent generally shouldn't change its own configuration.) You should be able to reduce the CPU usage by setting "dampening" on the node attributes. This will make the cluster wait a bit of time before writing node attribute changes to the CIB, so the recalculatio

Re: [ClusterLabs] Cannot stop cluster due to order constraint

2017-09-15 Thread Ken Gaillot
kind=Serialize > pcs constraint order start main5 then stop backup5 kind=Serialize > pcs constraint order start main6 then stop backup6 kind=Serialize > > pcs constraint colocation add backup1 with main1 -200 > pcs constraint colocation add backup2 with main2 -200 > pcs

Re: [ClusterLabs] IP clone issue

2017-09-15 Thread Ken Gaillot
:1(ocf::heartbeat:IPaddr2):Started > node01 > > > > But if one node fails the IP resource is not migrated to > active > > node as is said in documentation. > > > > Clone Set: ClusterIP-clone [ClusterIP] (unique) > > 

[ClusterLabs] Pacemaker 1.1.18 deprecation warnings

2017-09-18 Thread Ken Gaillot
on-threshold) * undocumented and ignored -r option to lrmd * compile-time option to use undocumented "notification-agent" and "notification-recipient" cluster properties instead of current "alerts" syntax * compatibility with CIB schemas below 1.0, and schema 1.1

[ClusterLabs] Disabling stonith in Pacemaker 2.0 (was: Re: Pacemaker 1.1.18 deprecation warnings)

2017-09-18 Thread Ken Gaillot
On Mon, 2017-09-18 at 13:53 -0400, Digimer wrote: > On 2017-09-18 01:48 PM, Ken Gaillot wrote: > > As discussed at the recent ClusterLabs Summit, I plan to start the > > release cycle for Pacemaker 1.1.18 soon. > >  > > There will be the usual bug fixes and a few sm

Re: [ClusterLabs] Antw: Pacemaker 1.1.18 deprecation warnings

2017-09-19 Thread Ken Gaillot
On Tue, 2017-09-19 at 09:13 +0200, Ulrich Windl wrote: > >>> Ken Gaillot wrote on 18.09.2017 at 19:48 > in message > <1505756918.5541.4.ca...@redhat.com>: > > As discussed at the recent ClusterLabs Summit, I plan to start the > > release cycle for Pacemaker

Re: [ClusterLabs] Pacemaker 1.1.18 deprecation warnings

2017-09-19 Thread Ken Gaillot
int would be a log for non > > existent alert-agents prior to their unsuccessful first use. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org http://lists.clusterlabs.org/mailman/listinfo/users Project Home: ht

Re: [ClusterLabs] Pacemaker 1.1.18 deprecation warnings

2017-09-19 Thread Ken Gaillot
plans for that, but it will be later than 2.0. > On 18.9.2017 12:48:38 Ken Gaillot wrote: > > As discussed at the recent ClusterLabs Summit, I plan to start the > > release cycle for Pacemaker 1.1.18 soon. > >  > > There will be the usual bug fixes and a few small new feat

Re: [ClusterLabs] Pacemaker 1.1.18 deprecation warnings

2017-09-20 Thread Ken Gaillot
On Wed, 2017-09-20 at 11:48 +0200, Ferenc Wágner wrote: > Ken Gaillot writes: > > > * undocumented LRMD_MAX_CHILDREN environment variable > > (PCMK_node_action_limit is the current syntax) > > By the way, is the current syntax documented somewhere?  Looking at Unfortuna

Re: [ClusterLabs] some resources move after recovery

2017-09-20 Thread Ken Gaillot
··· > Roberto Muñoz > BME - Sistemas UNIX > C/ Tramontana, 2 Bis. Edificio 2 - 1ª Planta > 28230 Las Rozas, Madrid - España > Tel: +34-917095778 > > > Before printing, think of the ENVIRONMENT > LEGAL NOTICE/DISCLAIMER -- Ken Gaillot

[ClusterLabs] New website design and new-new logo

2017-09-20 Thread Ken Gaillot
ed our new logo -- Kristoffer Grönlund had a professional designer look at the one he created. I hope everyone likes the end result. It's simpler, cleaner and friendlier. Check it out at https://clusterlabs.org/ -- Ken Gaillot ___ Users mailing list: U

Re: [ClusterLabs] New website design and new-new logo

2017-09-21 Thread Ken Gaillot
On Thu, 2017-09-21 at 09:26 +0200, Jehan-Guillaume de Rorthais wrote: > On Wed, 20 Sep 2017 21:25:51 -0400 > Digimer wrote: > > > On 2017-09-20 07:53 PM, Ken Gaillot wrote: > > > Hi everybody, > > >  > > > We've started a major update of the Cluster

Re: [ClusterLabs] New website design and new-new logo

2017-09-21 Thread Ken Gaillot
On Wed, 2017-09-20 at 21:25 -0400, Digimer wrote: > On 2017-09-20 07:53 PM, Ken Gaillot wrote: > > Hi everybody, > >  > > We've started a major update of the ClusterLabs web design. The > main > > goal (besides making it look more modern) is to make the top-lev

Re: [ClusterLabs] New website design and new-new logo

2017-09-21 Thread Ken Gaillot
On Thu, 2017-09-21 at 11:56 +0200, Kai Dupke wrote: > On 09/21/2017 01:53 AM, Ken Gaillot wrote: > > Check it out at https://clusterlabs.org/ > > Two comments > > - I would like to see the logo used by as many > people/projects/marketingers, so I propose to link the Lo

Re: [ClusterLabs] New website design and new-new logo

2017-09-21 Thread Ken Gaillot
On Thu, 2017-09-21 at 16:46 +0200, Kai Dupke wrote: > On 09/21/2017 04:42 PM, Ken Gaillot wrote: > > Yes, the FAQ needs an overhaul as well -- all the Pacemaker- > specific > > questions should be moved to a separate Pacemaker FAQ, and the top > FAQ > > should just have

Re: [ClusterLabs] Pacemaker resource parameter reload confusion

2017-09-22 Thread Ken Gaillot
not sure what > was > implemented in the end, and I can't find anything in the changelog > either.  So, what do I miss here?  Parallel reload and stop looks > rather > suspicious, though... Nothing's been done about reload yet. It's wai

[ClusterLabs] Coming in 1.1.18: deprecating stonith-enabled

2017-09-25 Thread Ken Gaillot
il" set to "fence" (the default for stop operations), or any fence resource has been configured. If fencing is not possible, the cluster will behave as if stonith-enabled is false (even if it's not). -- Ken Gaillot ___ Users mailing l

Re: [ClusterLabs] Transition aborted when disabling resource

2017-09-27 Thread Ken Gaillot
etfh) > > > > This is just one example; it happens randomly with other resources > and times. > > How can it be avoided? > > Regards. > > · > > Roberto Muñoz -- Ken Gaillot ___ Users ma

Re: [ClusterLabs] Transition aborted when disabling resource

2017-09-30 Thread Ken Gaillot
, @exec- > > time=1190 > > Sep 25 23:50:08 [4492] vttwinformlrz1    cib: info: > > cib_process_request: Completed cib_modify operation for section > > status: OK (rc=0, origin=vttwinformlrz2/crmd/9922, > > version=0.18640.1) > > Sep 25 23:50:08 [4501] vt

Re: [ClusterLabs] strange behaviour from pacemaker_remote

2017-10-01 Thread Ken Gaillot
On Thu, 2017-09-28 at 01:39 +0200, Adam Spiers wrote: > Hi all, > > When I do a > > pkill -9 -f pacemaker_remote > > to simulate failure of a remote node, sometimes I see things like: > > 08:29:32 d52-54-00-da-4e-05 pacemaker_remoted[5806]: error: No > ipc providers available for uid 0

Re: [ClusterLabs] Pacemaker starts with error on LVM resource

2017-10-01 Thread Ken Gaillot
On Thu, 2017-09-28 at 18:05 +0300, Octavian Ciobanu wrote: > Hello all. > > I have a test configuration with 2 nodes that is configured as iSCSI > storage. > > I've created a master/slave DRBD resource and a group that has the > following resources ordered as follows:  >  - iSCSI TCP IP/port bloc

Re: [ClusterLabs] Cluster is not promoting DRBD resource to master

2017-10-01 Thread Ken Gaillot
On Fri, 2017-09-29 at 11:36 +0300, Octavian Ciobanu wrote: > Hello all. > I've encountered another strange behavior after updating to CentOS > 7.4. The DRBD resource no longer promotes one node to Master; > instead both nodes are stuck in Slave. > > The configuration is based on 2 nodes running

Re: [ClusterLabs] Restarting a failed resource on same node

2017-10-03 Thread Ken Gaillot
On Mon, 2017-10-02 at 12:32 -0700, Paolo Zarpellon wrote: > Hi, > on a basic 2-node cluster, I have a master-slave resource where > master runs on a node and slave on the other one. If I kill the slave > resource, the resource status goes to "stopped". > Similarly, if I kill the master resource

Re: [ClusterLabs] what does cluster do when 'resourceA with resourceB' happens

2017-10-03 Thread Ken Gaillot
On Tue, 2017-10-03 at 11:53 +0100, lejeczek wrote: > hi > > I'm reading "An A-Z guide to Pacemaker's Configurations  > Options" and in there it reads: > "... > So when you are creating colocation constraints, it is  > important to consider whether you should > colocate A with B, or B with A. > Anot

Re: [ClusterLabs] Restarting a failed resource on same node

2017-10-04 Thread Ken Gaillot
st resource's migration-threshold set to INFINITY > > Thank you in advance. > Regards, > Paolo > > On Tue, Oct 3, 2017 at 7:12 AM, Ken Gaillot > wrote: > > On Mon, 2017-10-02 at 12:32 -0700, Paolo Zarpellon wrote: > > > Hi, > > > on a basic 2

[ClusterLabs] Pacemaker 1.1.18-rc1 now available

2017-10-06 Thread Ken Gaillot
t and appreciated. Many thanks to all contributors of source code to this release, including Andrew Beekhof, Aravind Kumar, Artur Novik, Bin Liu, Yan Gao, Hideo Yamauchi, Igor Tsiglyar, Jan Pokorný, Ken Gaillot, Klaus Wenninger, Nye Liu, and Valentin Vidic. -- Ken Ga

Re: [ClusterLabs] crm_resource --wait

2017-10-09 Thread Ken Gaillot
On Mon, 2017-10-09 at 16:37 +1000, Leon Steffens wrote: > Hi all, > > We have a use case where we want to place a node into standby and > then wait for all the resources to move off the node (and be started > on other nodes) before continuing.   > > In order to do this we call: > $ pcs cluster st

Re: [ClusterLabs] crm_resource --wait

2017-10-09 Thread Ken Gaillot
On Tue, 2017-10-10 at 07:47 +1000, Leon Steffens wrote: > > > > > > > > > Pending actions: > > > Action 40: sv_fencer_monitor_6 on brilxvm44 > > > Action 39: sv_fencer_start_0 on brilxvm44 > > > Action 38: sv_fencer_stop_0 on brilxvm43 > > > Error performing operation: Timer expired > > > >

Re: [ClusterLabs] crm_resource --wait

2017-10-10 Thread Ken Gaillot
On Tue, 2017-10-10 at 15:19 +1000, Leon Steffens wrote: > Hi Ken, > > I managed to reproduce this on a simplified version of the cluster, > and on Pacemaker 1.1.15, 1.1.16, as well as 1.1.18-rc1 > The steps to create the cluster are: > > pcs property set stonith-enabled=false > pcs property set

Re: [ClusterLabs] corosync service not automatically started

2017-10-10 Thread Ken Gaillot
On Tue, 2017-10-10 at 12:24 +0200, Václav Mach wrote: > On 10/10/2017 11:40 AM, Valentin Vidic wrote: > > On Tue, Oct 10, 2017 at 11:26:24AM +0200, Václav Mach wrote: > > > # The primary network interface > > > allow-hotplug eth0 > > > iface eth0 inet dhcp > > > # This is an autoconfigured IPv6 int

Re: [ClusterLabs] if resourceA starts @nodeA then start resource[xy] @node[xy]

2017-10-11 Thread Ken Gaillot
re, that's simply a colocation constraint with a negative score. For details, see http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#s-resource-colocation (and/or the help for whatever higher-level to

Re: [ClusterLabs] ClusterMon mail notification - does not work

2017-10-11 Thread Ken Gaillot
t your version of crm_mon supports the mail-* arguments. It's a compile-time option, and I don't know if Ubuntu enabled it. Simply do "man crm_mon", and if it shows the mail-* options, then you have the capability. -- Ken Gaillot ___

Re: [ClusterLabs] Debugging problems with resource timeout without any actions from cluster

2017-10-12 Thread Ken Gaillot
e_lag=180 evict_outdated_slaves=false > binary="/usr/sbin/mysqld" test_user=test test_passwd=test \ > op start interval=0 timeout=60s \ > op stop interval=0 timeout=60s \ > op monitor interval=5s role=Master OCF_CHECK_LEVEL=1 \ > op monitor i

Re: [ClusterLabs] can't move/migrate ressource

2017-10-12 Thread Ken Gaillot
> Oct 11 13:55:59 [3556] zfs-serv2    cib: info: > cib_file_write_with_digest:  Reading cluster configuration file > /var/lib/pacemaker/cib/cib.kA8iQp (digest: > /var/lib/pacemaker/cib/cib.Va05np) > Oct 11 13:56:03 [3556] zfs-serv2    cib: info:

Re: [ClusterLabs] Mysql upgrade in DRBD setup

2017-10-12 Thread Ken Gaillot
ect Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch. > pdf > Bugs: http://bugs.clusterlabs.org -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org http://lists.clusterlabs.org/mailman/listinfo/

Re: [ClusterLabs] Mysql upgrade in DRBD setup

2017-10-13 Thread Ken Gaillot
time on any given server). This makes more sense if the mysql servers are running inside VMs or containers that can migrate between the physical machines. > > Thanks! > > > > -Original Message- > From: Ken Gaillot [mailto:kgail...@redhat.com]  > Sent: Thursday, O

[ClusterLabs] Pacemaker 1.1.18 Release Candidate 2

2017-10-16 Thread Ken Gaillot
testing you can do is very welcome. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org http://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc

Re: [ClusterLabs] When resource fails to start it stops an apparently unrelated resource

2017-10-16 Thread Ken Gaillot
wever if the agent returns "failed" for both resources when either one fails, you could see something like that. I'd look at the logs on the DC and see why it decided to restart the second resource. -- Ken Gaillot ___ Users mailing list:

Re: [ClusterLabs] set node in maintenance - stop corosync - node is fenced - is that correct ?

2017-10-16 Thread Ken Gaillot
'd stop pacemaker before stopping corosync, in any case. In maintenance mode, that should be fine. I don't think a running pacemaker would be able to reconnect to corosync after corosync comes back. > What are you really trying to do, > what is the reason you need it in maintenance-

Re: [ClusterLabs] When resource fails to start it stops an apparently unrelated resource

2017-10-17 Thread Ken Gaillot
it solves ... I'd recommend writing your own OCF agent tailored to your service. It's not much more complicated than an init script. > On Mon, Oct 16, 2017 at 6:57 PM, Ken Gaillot > wrote: > > On Mon, 2017-10-16 at 18:30 +0200, Gerard Garcia wrote: > > >

Re: [ClusterLabs] Debugging problems with resource timeout without any actions from cluster

2017-10-17 Thread Ken Gaillot
On Tue, 2017-10-17 at 15:30 +0600, Sergey Korobitsin wrote: > Ken Gaillot ☫ → To Cluster Labs - All topics related to open-source > clustering welcomed @ Thu, Oct 12, 2017 09:47 -0500 > > Thanks for the answer, Ken, > > > > I found several ways to achieve that: > &

Re: [ClusterLabs] Pacemaker resource parameter reload confusion

2017-10-17 Thread Ken Gaillot
On Fri, 2017-09-22 at 18:30 +0200, Ferenc Wágner wrote: > Ken Gaillot writes: > > > Hmm, stop+reload is definitely a bug. Can you attach (or email it > > to me > > privately, or file a bz with it attached) the above pe-input file > > with > > any sensitive

Re: [ClusterLabs] When resource fails to start it stops an apparently unrelated resource

2017-10-18 Thread Ken Gaillot
d from the second resource > > when it does not have any value.  > > > > I must have something wrongly configuration but I can't really see > > why there is this relationship... > > > > Gerard > > > > On Tue, Oct 17, 2017 at 3:35 PM, Ken Gaill

Re: [ClusterLabs] VirtualDomain live migration error

2017-10-18 Thread Ken Gaillot
ility to tell pacemaker to execute a resource agent as a particular user. We've already put the plumbing in for it, so that lrmd can execute alert agents as the hacluster user. All that would be needed would be a new resource meta-attribute and the IPC API to use it. It's low priority due

Re: [ClusterLabs] monitor failed actions not cleared

2017-10-18 Thread Ken Gaillot
is usually sufficient. > I know that my english and my pacemaker knowledge are not so high but > could you please give me some explanations about that behavior that I > misunderstand. Not at all, this was a very clear and well-thought-out post :) > ->  If something is wrong w

Re: [ClusterLabs] strange cluster state

2017-10-18 Thread Ken Gaillot
en it is > in  > standby state. Also all the resources should run on same node and > all  > the resources should be started in the defined order. The output > above  > does not match that. > > I'm not totally sure if the attached logs were created when this > probl

Re: [ClusterLabs] When resource fails to start it stops an apparently unrelated resource

2017-10-18 Thread Ken Gaillot
eleased in 1.1.17. The bug only affected cloned resources where one clone's name ended with the other's. FYI, CentOS 7.4 has 1.1.16, but that won't help this issue. > > On Wed, Oct 18, 2017 at 4:42 PM, Ken Gaillot > wrote: > > On Wed, 2017-10-18 at 14:25 +0200, Ge

Re: [ClusterLabs] Antw: changing on-fail action default

2017-10-19 Thread Ken Gaillot
ave to be a strictness about fencing before recovery. If the cluster can't communicate with the node, fencing is the only way to be sure it's unable to cause conflicts. But, it's fine for "fencing" to be manual, i.e. having an admin manually investigate, reboot

Re: [ClusterLabs] Pacemaker resource parameter reload confusion

2017-10-20 Thread Ken Gaillot
On Fri, 2017-10-20 at 15:52 +0200, Ferenc Wágner wrote: > Ken Gaillot writes: > > > On Fri, 2017-09-22 at 18:30 +0200, Ferenc Wágner wrote: > > > Ken Gaillot writes: > > > > > > > Hmm, stop+reload is definitely a bug. Can you attach (or email > &

Re: [ClusterLabs] crm_resource --wait

2017-10-20 Thread Ken Gaillot
ed to make it happen. Still investigating a fix. A workaround is to assign some stickiness or utilization to sv-fencer. On Wed, 2017-10-11 at 14:01 +1000, Leon Steffens wrote: > I've attached two files: > 314 = after standby step > 315 = after resource update > > On Wed, Oct 11

Re: [ClusterLabs] (Not) Coming in 1.1.18: deprecating stonith-enabled

2017-10-23 Thread Ken Gaillot
r project than the 1.1.18 (or 2.0.0) time frame. On Mon, 2017-09-25 at 18:53 -0500, Ken Gaillot wrote: > Hi all, > > I thought I'd call attention to one of the most visible deprecations > coming in 1.1.18: stonith-enabled. In order to deprecate that option, > we have to prov

[ClusterLabs] Pacemaker 1.1.18 Release Candidate 3

2017-10-23 Thread Ken Gaillot
release next week. Any testing you can do is very welcome. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org http://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http

Re: [ClusterLabs] dead cluster after centos update

2017-10-23 Thread Ken Gaillot
owing the node access to the disk. Since that fails, nothing else can proceed. > > I disabled it for now, and > > pcs resource debug-start resource-zfs --full > > works fine: the pool is imported, filesystems are mounted and > exported > -- but the resources remain stopped

Re: [ClusterLabs] MYSQL data on DRBD

2017-10-24 Thread Ken Gaillot
Best regards > Antony > tel.   +380669197533 > tel2. +380636564340 > Paypal http://paypal.me/Satskiy > satski...@gmail.com -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org http://lists.clusterlabs.org/mailman/listinfo/users Projec

Re: [ClusterLabs] Cluster metrics and collectd

2017-10-27 Thread Ken Gaillot
subset of nodes can handle it. The advantages of load-balancing are (1) continuously exercising all nodes so you're not surprised in an outage if a node has become degraded in some fashion, and (2) possibly better performance, depending on workload and capacities. Note that pacemaker has a

Re: [ClusterLabs] 'expected' node attribute missing

2017-10-27 Thread Ken Gaillot
n your cluster, you should see a "" entry in "crm_mon -X" output, and it should include "expected_up=true" or "expected_up=false". -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org http://lists.c
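A minimal sketch of checking that attribute programmatically, assuming the XML printed by "crm_mon -X" contains one node element per cluster node, each carrying an expected_up attribute. The sample XML below is a hand-written stand-in rather than real cluster output, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only: a trimmed-down stand-in for `crm_mon -X` output.
# The element and attribute layout is an assumption, not captured from a cluster.
SAMPLE = """
<crm_mon version="1.1.16">
  <nodes>
    <node name="node1" id="1" online="true" expected_up="true"/>
    <node name="node2" id="2" online="false" expected_up="false"/>
  </nodes>
</crm_mon>
"""


def expected_up_by_node(xml_text):
    """Map each node name to True/False according to its expected_up attribute."""
    root = ET.fromstring(xml_text.strip())
    return {
        node.get("name"): node.get("expected_up") == "true"
        for node in root.iter("node")
    }


if __name__ == "__main__":
    for name, up in sorted(expected_up_by_node(SAMPLE).items()):
        print(name, "expected_up =", up)
```

On a live cluster the same function could be fed the captured output of "crm_mon -X" instead of the sample string.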

Re: [ClusterLabs] Pacemaker resource start delay when there are another resource is starting

2017-10-27 Thread Ken Gaillot
> > > ___ > Users mailing list: Users@clusterlabs.org > http://lists.clusterlabs.org/mailman/listinfo/users > > Project Home: http://www.clusterlabs.org >

Re: [ClusterLabs] different start/stop order

2017-10-27 Thread Ken Gaillot
ordering constraints. You can make ordering constraints asymmetrical, so they only apply in the listed direction: http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#s-resource-ordering -- Ken Gaillot ___ Users maili

Re: [ClusterLabs] How to stop resource after N failures?

2017-10-27 Thread Ken Gaillot
and message > boards I can find, but I don't see a solution. Any ideas? Is this > setup possible in Pacemaker or do I need to pick between on- > failure=stop and migration-threshold=3? -- Ken Gaillot ___ Users mailing list: Users@cluste

Re: [ClusterLabs] different start/stop order

2017-10-31 Thread Ken Gaillot
ingle/Pace > > maker_Explained/index.html#s-resource-ordering > > Ok but i see only how can i create a start order, but how can i > create a different stop order? > > Best regards > Stefan symmetrical=false plus first-action and then-action -- Ken Gaillot
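As a rough illustration of that suggestion, the Python sketch below builds two rsc_order constraint elements with symmetrical set to false, one for start and one for stop, so that the stop order no longer has to mirror the start order. The resource names (rscA, rscB), the constraint ids, and the helper name are hypothetical; the attribute names follow the ordering-constraint section of Pacemaker Explained as I understand it, so treat this as a sketch rather than a tested configuration.

```python
import xml.etree.ElementTree as ET


def order_constraint(constraint_id, first, then, action):
    """Build an asymmetrical rsc_order element applying `action` to both resources."""
    return ET.Element("rsc_order", {
        "id": constraint_id,
        "first": first,
        "first-action": action,
        "then": then,
        "then-action": action,
        "symmetrical": "false",
    })


# Start rscA before rscB ...
start_order = order_constraint("order-start-A-B", "rscA", "rscB", "start")
# ... and also stop rscA before rscB (a symmetrical constraint would imply the reverse).
stop_order = order_constraint("order-stop-A-B", "rscA", "rscB", "stop")

if __name__ == "__main__":
    for constraint in (start_order, stop_order):
        print(ET.tostring(constraint, encoding="unicode"))
```

The printed elements are meant to be read as the shape of the constraints, not pasted into a live CIB without review.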

Re: [ClusterLabs] Colocation rule with vip and ms master

2017-10-31 Thread Ken Gaillot
at's occurring in my cluster is that the first rule > > stops the > > > Sync node from being promoted if the Master ever dies. The second > > doesn't > > > but I can't quite follow why. > > > > Getting a score of -inf means that the res

Re: [ClusterLabs] Pacemaker resource parameter reload confusion

2017-10-31 Thread Ken Gaillot
On Tue, 2017-10-31 at 09:33 +0100, Ferenc Wágner wrote: > Ken Gaillot writes: > > > On Fri, 2017-10-20 at 15:52 +0200, Ferenc Wágner wrote: > > > > > Ken Gaillot writes: > > > > > > > On Fri, 2017-09-22 at 18:30 +0200, Ferenc Wágn

Re: [ClusterLabs] 'expected' node attribute missing

2017-10-31 Thread Ken Gaillot
On Mon, 2017-10-30 at 10:48 +0600, Sergey Korobitsin wrote: > Ken Gaillot ☫ → To Cluster Labs - All topics related to open-source > clustering welcomed @ Fri, Oct 27, 2017 10:38 -0500 > > > > Hello, > > > I'm trying to use https://github.com/marcan/pacemaker-export

Re: [ClusterLabs] Pacemaker resource parameter reload confusion

2017-10-31 Thread Ken Gaillot
On Tue, 2017-10-31 at 18:44 +0100, Ferenc Wágner wrote: > Ken Gaillot writes: > > > The pe-input is indeed entirely sufficient. > > > > I forgot to check why the reload was not possible in this case. It > > turns out it is this: > > > >    trace: check

Re: [ClusterLabs] different start/stop order

2017-11-01 Thread Ken Gaillot
source-stickiness=100 \ > stonith-enabled=false \ > last-lrm-refresh=1507890181 > > is that ok? the manual failover looks good. > > best regards -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.o

Re: [ClusterLabs] Pacemaker resource start delay when there are another resource is starting

2017-11-01 Thread Ken Gaillot
(ocf::heartbeat:logserver): FAILED > 192.168.2.177 > >  Started: [ 192.168.2.178 192.168.2.179 ] > >  Clone Set: fm_mgt_replica [fm_mgt] > >  Started: [ 192.168.2.178 192.168.2.179 ] > >  Stopped: [ 192.168.2.177 ] > > I am very confused.

Re: [ClusterLabs] resource "force" start / always start - how?

2017-11-01 Thread Ken Gaillot
at "start" does your one-shot command. Then Pacemaker can start and stop it as normal. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org http://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlab

Re: [ClusterLabs] dont (re)start a ressource if there is already running

2017-11-01 Thread Ken Gaillot
f \ > cluster-infrastructure=corosync \ > cluster-name=debian \ > no-quorum-policy=ignore \ > default-resource-stickiness=100 \ > stonith-enabled=false \ > last-lrm-refresh=1509546667 > > So is it possible to che

[ClusterLabs] Pacemaker 1.1.18 Release Candidate 4

2017-11-02 Thread Ken Gaillot
idate before the final release next week. Any testing you can do is very welcome. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org http://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started:

Re: [ClusterLabs] drbd clone not becoming master

2017-11-03 Thread Ken Gaillot
>   Dennis > That's odd, it should only happen if the cluster is not running, but then the agent wouldn't have been called. The CIB is one of the core daemons of pacemaker; it manages the cluster configuration and status. If it's not running, the cluster can't

Re: [ClusterLabs] Prevent resources from beeing restarted when activating placement-strategy

2017-11-03 Thread Ken Gaillot
time??  > > thank you!  > regards  > Philipp > Good question, I didn't realize that. crm_simulate is a good tool for exploring that sort of "why", but it's rather arcane. If you have a pe-input file from the transition with the restart, I can take a look. --

Re: [ClusterLabs] Pacemaker resource start delay when there are another resource is starting

2017-11-06 Thread Ken Gaillot
the undocumented/unsupported start-delay operation attribute, that you can put on the status operation to delay the first monitor. That may give you the behavior you want. > At 2017-11-01 21:20:50, "Ken Gaillot" wrote: > >On Sat, 2017-10-28 at 01:11 +0800, lkxjtu wrote: > >

Re: [ClusterLabs] Failed actions .. constraint confusion

2017-11-07 Thread Ken Gaillot
use crm_simulate to get more information about it. crm_simulate is not very user-friendly, so if you can attach the pe-input file, I can take a look at it. (The pe-input will be listed at the end of the transition in the logs on the node that was DC at the time; you'll see a bunch of

Re: [ClusterLabs] are there equivelent restful apis for crm commands

2017-11-07 Thread Ken Gaillot
re enhancement, but it would be a big project, so I don't know what the time frame would be. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org http://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.cluster

Re: [ClusterLabs] Unable to perform resource failover.

2017-11-07 Thread Ken Gaillot
e that your operating system is not managing the httpd process (via systemd, upstart, lsb init, etc.). > How we can achieve a resources failover? migration-threshold=1 >   > Further I will use this environment for testing the migration- > threshold. > Any suggestions

Re: [ClusterLabs] Hawk vs pcs Web UI

2017-11-08 Thread Ken Gaillot
tories (some, such as Debian, provide both). If you have a strong preference, you can always build your favorite yourself (which is less of an option if you are using an enterprise distro and want everything supported). -- Ken Gaillot ___ Users mailing l

Re: [ClusterLabs] One cluster with two groups of nodes

2017-11-09 Thread Ken Gaillot
an it from the other nodes using -INFINITY location constraints. If the base resource should only fail over to the opposite group, that's trickier, but something roughly similar would be to prefer one node in each group with an equal positive score location constraint, and migration-thre

Re: [ClusterLabs] issues with pacemaker daemonization

2017-11-09 Thread Ken Gaillot
tem to make pacemaker daemonize itself more "properly", but no one's had the time to address it. -- Ken Gaillot ___ Users mailing list: Users@clusterlabs.org http://lists.clusterlabs.org/mailman/listinfo/users Project Home: http://www.cl

Re: [ClusterLabs] Pacemaker 1.1.18 Release Candidate 4

2017-11-09 Thread Ken Gaillot
On Fri, 2017-11-03 at 08:24 +0100, Kristoffer Grönlund wrote: > Ken Gaillot writes: > > > I decided to do another release candidate, because we had a large > > number of changes since rc3. The fourth release candidate for > > Pacemaker > > version 1.1.18 is no

Re: [ClusterLabs] Pacemaker responsible of DRBD and a systemd resource

2017-11-10 Thread Ken Gaillot
‘pcs config’ > > https://pastebin.com/1TUvZ4X9 > > Cheers! > -dw > > -- > Derek Wuelfrath > dwuelfr...@inverse.ca :: +1.514.447.4918 (x110) :: +1.866.353.6153 > (x110) > Inverse inc. :: Leaders behind SOGo (www.sogo.nu), PacketFence > (www.packetfence.org)
