Re: [ClusterLabs] Pacemaker 2.0.1-rc4 now available

2019-01-31 Thread Jan Pokorný
On 30/01/19 11:07 -0600, Ken Gaillot wrote: > For those on the bleeding edge, the newest versions of GCC and glib > cause some issues. GCC 9 does stricter checking of print formats that > required a few log message fixes in this release (i.e. using GCC 9 with > the -Werror option will fail with

Re: [ClusterLabs] PCMK_ipc_buffer recommendation

2019-01-22 Thread Jan Pokorný
On 21/01/19 16:38 -0600, Ken Gaillot wrote: > On Sat, 2019-01-19 at 00:46 +0200, Michael Kolomiets wrote: > I wish there were a convenient formula, part of this non-straightforward evaluation is the fact that some daemons connect to other local "mates" only intermittently, on an ad-hoc basis, if my

Re: [ClusterLabs] About the alerts

2019-01-22 Thread Jan Pokorný
On 22/01/19 12:46 +0100, Klaus Wenninger wrote: > On 01/22/2019 12:23 PM, T. Ladd Omar wrote: >> Will the pacemaker alerts re-execute the alerts scripts if the >> external-scripts execute failed? > > Nope - at least not till execution was moved to lrmd (execd) - but I > don't think that this

Re: [ClusterLabs] [pacemaker] Discretion with glib v2.59.0+ recommended

2019-01-21 Thread Jan Pokorný
On 21/01/19 09:17 +0100, Ulrich Windl wrote: > IMHO it's like in Perl: When relying on the hash keys to be returned > in any particular (or even stable) order, the idea is just broken! > Either keep the keys in an extra array for ordering, or sort them > in some way... Exactly, IT silos lacking
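The two remedies named in the quoted advice (an extra array that records the order, or sorting the keys) can be sketched as follows. This is only an illustration in Python, not code from pacemaker or glib, and the node names and function names are hypothetical. A Python `set` stands in for the hash table, because its iteration order for strings is likewise not something to rely on.

```python
# Hypothetical data: a hash-based container whose iteration order
# must not be relied upon (a set of strings may iterate differently
# from one interpreter run to the next).
node_names = {"node-c", "node-a", "node-b"}

# Remedy 1: a separately maintained list that records the wanted order.
recorded_order = ["node-c", "node-a", "node-b"]


def in_recorded_order(table, key_order):
    """Return the members of `table`, following the explicit order list."""
    return [key for key in key_order if key in table]


def in_sorted_order(table):
    """Return the members of `table` sorted, independent of hash layout."""
    return sorted(table)


print(in_recorded_order(node_names, recorded_order))
print(in_sorted_order(node_names))
```

Either way, the result no longer depends on how the hashing library happens to distribute values internally, which is the property that changed in the glib release discussed in this thread.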

Re: [ClusterLabs] [pacemaker] Discretion with glib v2.59.0+ recommended

2019-01-20 Thread Jan Pokorný
On 18/01/19 20:32 +0100, Jan Pokorný wrote: > It was discovered that this release of glib project changed slightly > some parameters of how distribution of values within hash tables > structures work, undermining pacemaker's hard (alas unfeasible) attempt > to turn this data typ

[ClusterLabs] [pacemaker] Discretion with glib v2.59.0+ recommended

2019-01-18 Thread Jan Pokorný
It was discovered that this release of glib project changed slightly some parameters of how distribution of values within hash tables structures work, undermining pacemaker's hard (alas unfeasible) attempt to turn this data type into fully predictable entity. Current impact is unknown beside

Re: [ClusterLabs] Node add doesn't add node?

2019-01-11 Thread Jan Pokorný
On 11/01/19 00:16 +, Israel Brewster wrote: > On Jan 10, 2019, at 10:57 AM, Israel Brewster <ibrews...@flyravn.com> wrote: >> >> So in my ongoing work to upgrade my cluster to CentOS 7, I got one >> box up and running on CentOS 7, with the cluster fully configured >> and functional,

Re: [ClusterLabs] About the pacemaker

2019-01-10 Thread Jan Pokorný
On 10/01/19 14:53 +0100, Jan Pokorný wrote: > On 08/01/19 10:14 -0600, Ken Gaillot wrote: >> On Tue, 2019-01-08 at 15:27 +0800, T. Ladd Omar wrote: >>> I have a question, if the Pacemaker has an event-notify interface >>> which is realized by push. Recently I want to d

Re: [ClusterLabs] About the pacemaker

2019-01-10 Thread Jan Pokorný
On 08/01/19 10:14 -0600, Ken Gaillot wrote: > On Tue, 2019-01-08 at 15:27 +0800, T. Ladd Omar wrote: >> I have a question, if the Pacemaker has an event-notify interface >> which is realized by push. Recently I want to do something >> extra using other process when the resources being started or >>

Re: [ClusterLabs] VirtualDomain & parallel shutdown

2019-01-08 Thread Jan Pokorný
On 27/11/18 14:35 +0100, Jan Pokorný wrote: > On 27/11/18 12:29 +0200, Klecho wrote: >> Big thanks for the answer, but in your ways around I don't see a solution >> for the following simple case: >> >> I have a few VMs (VirtualDomain RA) and just want to sto

[ClusterLabs] Stray started resource leakages (Was: [Problem] The crmd fails to connect with pengine.)

2019-01-08 Thread Jan Pokorný
On 02/01/19 15:43 +0100, Jan Pokorný wrote: > On 28/12/18 05:51 +0900, renayama19661...@ybb.ne.jp wrote: >> As a result, Pacemaker will stop without stopping the resource. > > This might have serious consequences in some scenarios, perhaps > unless some watchdog-based soluti

Re: [ClusterLabs] [Problem] The crmd fails to connect with pengine.

2019-01-02 Thread Jan Pokorný
On 28/12/18 05:51 +0900, renayama19661...@ybb.ne.jp wrote: > This problem occurred with our users. > > The following problem occurred in a two-node cluster that does not set > STONITH. > > The problem seems to have occurred in the following procedure. > > Step 1) Configure the cluster with 2

Re: [ClusterLabs] Corosync 3.0.0 is available at corosync.org!

2018-12-17 Thread Jan Pokorný
On 17/12/18 10:04 +, Christine Caulfield wrote: > On 17/12/2018 09:34, Ulrich Windl wrote: >> I wonder: Is there a migration script that can convert corosync.conf files? >> At least you have a few version components in the config file that will help >> such a tool to know what to do... ;-) > >

Re: [ClusterLabs] How to backup?

2018-11-28 Thread Jan Pokorný
On 26/11/18 09:10 +0100, Ulrich Windl wrote: lejeczek wrote on 23.11.2018 at 15:56 in message > <46d2baf6-a03d-9aac-fceb-7bcffb383...@yahoo.co.uk>: >> hi guys, >> >> Do we have tools or maybe outside of the cluster suite there is a way to >> backup cluster? >> >> I'm obviously talking

Re: [ClusterLabs] VirtualDomain & parallel shutdown

2018-11-27 Thread Jan Pokorný
On 27/11/18 12:29 +0200, Klecho wrote: > Big thanks for the answer, but in your ways around I don't see a solution > for the following simple case: > > I have a few VMs (VirtualDomain RA) and just want to stop a few of them, > not all. > > While the first VM is shutting down

Re: [ClusterLabs] pcs 0.10.1 released

2018-11-26 Thread Jan Pokorný
Congratulations on the release. On 26/11/18 17:26 +0100, Tomas Jelinek wrote: > Main changes compared to 0.9 branch: > > [...] > > * Python 3.6+ and Ruby 2.2+ is now required Out of curiosity, what's the driver for such a steep Python version lower bound? -- Nazdar, Jan (Poki)

[ClusterLabs] FYI: dlm.service possibly silenty overwritten with DisplayLink driver installer

2018-11-20 Thread Jan Pokorný
Accidentally, when searching for something systemd related, dlm.service caught my eye, and surprisingly, it was rather in a HW support in Linux SW enablement context. Briefly looking into the Ubuntu driver that allegedly contained that file (or recipe to create it, actually), I've realized the

Re: [ClusterLabs] Pacemaker auto restarts disabled groups

2018-11-12 Thread Jan Pokorný
On 09/11/18 13:24 +, Ian Underhill wrote: > Yep all my pcs commands run on a live cluster. The design needs > resources to respond in specific ways before moving on to other > shutdown requests. > > So it seems that these pcs commands that run on different nodes at > the same time, is the

Re: [ClusterLabs] Fwd: Re: ocf on dotnet

2018-11-06 Thread Jan Pokorný
On 06/11/18 12:06 +0100, Jan Pokorný wrote: > On 05/11/18 18:49 +0100, jiri pijacek wrote: >>> could anyone help me with OCF for dotnet >>> /usr/lib/ocf/resource.d/heartbeat/dotnet > > looks like some context is missing so now, we can only guess you ask > fo

Re: [ClusterLabs] Fwd: Re: ocf on dotnet

2018-11-06 Thread Jan Pokorný
Hello Jiří, On 05/11/18 18:49 +0100, jiri pijacek wrote: >> could anyone help me with OCF for dotnet >> /usr/lib/ocf/resource.d/heartbeat/dotnet looks like some context is missing so now, we can only guess you ask for help with writing said agent. The best you can do is to look around how

[ClusterLabs] Idea of native masking of systemd resources (Was: Configure a resource to only run a single instance at all times)

2018-10-31 Thread Jan Pokorný
On 29/10/18 20:19 +0300, Andrei Borzenkov wrote: > 29.10.2018 20:04, jm2109...@gmail.com wrote: >> I'm a new user of pacemaker clustering software and I've just configured a >> cluster with a single systemd resource. I have the following cluster and >> resource configurations below. Failover works

Re: [ClusterLabs] How to generate RPMs for Pacemaker release 2.x on Centos

2018-10-17 Thread Jan Pokorný
On 15/10/18 14:46 +, Lopez, Francisco Javier [Global IT] wrote: > I could not do that way as this box does not have access to Internet. > Will see how to deal with this. As Ken mentioned, current upstream-devised RPM packaging practices are wrapped around the assumption of working with git

[ClusterLabs] Position of pacemaker in today's HA world

2018-10-05 Thread Jan Pokorný
Hello HA enthusiasts, I've come by an interesting article on the topic of how high availability (possibly, I couldn't witness this first hand since I don't have a time machine, but some of you can perhaps comment if the picture matches own experience) historically evolved from the perspective of

Re: [ClusterLabs] About fencing stonith

2018-09-26 Thread Jan Pokorný
On 26/09/18 20:19 +0200, Valentin Vidic wrote: > On Thu, Sep 06, 2018 at 04:47:32PM -0400, Digimer wrote: >> It depends on the hardware you have available. In your case, RPi has no >> IPMI or similar feature, so you'll need something external, like a >> switched PDU. I like the APC AP7900 (or your

Re: [ClusterLabs] Q: Reusing date specs in crm shell

2018-09-12 Thread Jan Pokorný
On 11/09/18 13:52 +0200, Ulrich Windl wrote: > I have a set of resources with almost identical rules, one part > being a date spec. Currently I'm using two different date specs in > those rules. However I repeated the date spec in every rule. > Foreseeing that I might change those one day, I

Re: [ClusterLabs] Different Times in the Corosync Log?

2018-08-27 Thread Jan Pokorný
On 22/08/18 03:58 +, Eric Robinson wrote: >> -Original Message- >> From: Users On Behalf Of Jan Pokorný >> Sent: Tuesday, August 21, 2018 2:45 AM >> To: users@clusterlabs.org >> Subject: Re: [ClusterLabs] Different Times in the Corosync Log? >>

Re: [ClusterLabs] Different Times in the Corosync Log?

2018-08-21 Thread Jan Pokorný
On 21/08/18 08:43 +, Eric Robinson wrote: >> I could guess that the processes run with different timezone >> settings (for whatever reason). > > That would be my guess, too, but I cannot imagine how they ended up > in that condition. Hard to guess, the PIDs indicate the expected state of

Re: [ClusterLabs] Q: ordering for a monitoring op only?

2018-08-20 Thread Jan Pokorný
On 20/08/18 10:51 +0200, Ulrich Windl wrote: > I wonder whether it's possible to run a monitoring op only if some > specific resource is up. > Background: We have some resource that runs fine without NFS, but > the start, stop and monitor operations will just hang if NFS is > down. In effect the

Re: [ClusterLabs] Q: HA_RSCTMP in SLES11 SP4 at first start after reboot

2018-08-13 Thread Jan Pokorný
On 13/08/18 18:13 +0300, Vladislav Bogdanov wrote: > 10.08.2018 19:52, Ulrich Windl wrote: >> >> A simple question: One of my RAs uses $HA_RSCTMP in SLES11 SP4, and it >> reports the following problem: >> WARNING: Unwritable HA_RSCTMP directory /var/run/resource-agents - using >> /tmp > >

Re: [ClusterLabs] Q: HA_RSCTMP in SLES11 SP4 at first start after reboot

2018-08-13 Thread Jan Pokorný
On 13/08/18 09:27 -0500, Ryan Thomas wrote: > I've had similar problems in the past. In my case, it was because > pacemaker was running as user 'hacluster' in group 'haclient', so it didn't > have permission to access the root owned file. So to fix the problem, I > changed the ownership of the

Re: [ClusterLabs] digression: Corosync watchdog experience

2018-08-10 Thread Jan Pokorný
On 10/08/18 10:51 +0200, Ferenc Wágner wrote: > Failure story for amusement: the blades expose a BMC watchdog device to > the OS, which was picked up by Corosync. It seemed like a useful second > line of defense in case fencing (BMC IPMI power) failed for any reason; > I let it live and forgot

Re: [ClusterLabs] How to implement a fencing agent

2018-08-09 Thread Jan Pokorný
On 09/08/18 14:10 -0500, Ryan Thomas wrote: > I did some more investigation and was able to answer two of my questions: > > First, why did "pcs stonith list" not show my fence_foo agent? pcs runs the > meta-data action on the agent to get the description. Since my fence_foo > agent wasn't

Re: [ClusterLabs] How to implement a fencing agent

2018-08-09 Thread Jan Pokorný
On 09/08/18 07:59 +0200, Ulrich Windl wrote: Ryan Thomas wrote on 08.08.2018 at 23:26 in > message > : >> I’m attempting to implement a fencing agent. >> >> The ClusterLabs/fence-agent github repo has some helpful information >> including fence-agents/doc/FenceAgentAPI.md, but I haven’t

Re: [ClusterLabs] ban node or disable (all) resources upon node addition to the cluster - how?

2018-08-01 Thread Jan Pokorný
Hello, On 01/08/18 13:46 +0100, lejeczek wrote: > is it possible to tell the cluster to exclude or ban resources to > run on a node which I'd like to add to the cluster? (as one command?) > > (or any other way that would assure that no resources would be moved to that > node, in case cluster would

Re: [ClusterLabs] CIB daemon up and running

2018-07-31 Thread Jan Pokorný
Hello Rohit, On 31/07/18 05:03 +, Rohit Saini wrote: > After "pcs cluster start", how would I know if my CIB daemon has > come up and is initialized properly. > Currently I am checking output of "cibadmin -Q" periodically and > when I get the output, I consider CIB daemon has come up and >
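The periodic check described in the question (run `cibadmin -Q` until it produces output) can be sketched as a bounded polling loop. This is a minimal illustration under stated assumptions, not a recommendation from the thread: the helper name `wait_until_ready` and its `timeout`/`interval` parameters are hypothetical, and a zero exit status is assumed to mean only that the query was answered.

```python
import subprocess
import time


def wait_until_ready(command, timeout=60.0, interval=1.0):
    """Re-run `command` until it exits with status 0 or `timeout` seconds pass.

    Returns True on the first successful run, False once the deadline is hit.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = subprocess.run(
                command,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            if result.returncode == 0:
                return True
        except OSError:
            # Command missing or not executable (yet): treat as "not ready".
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Intended use on a cluster node (not executed here):
#   wait_until_ready(["cibadmin", "-Q"], timeout=120.0, interval=2.0)
```

Bounding the loop with a deadline keeps a start-up script from hanging forever if the daemon never comes up. Whether a successful query also means the CIB is "initialized properly" in the sense the question asks is a separate matter that this sketch does not settle.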

Re: [ClusterLabs] corosync/dlm fencing?

2018-07-19 Thread Jan Pokorný
On 19/07/18 17:25 +0200, Philipp Achmüller wrote: > "Users" wrote on 18.07.2018 15:46:09: >> if it's unclear, 0.17.2 as the lowest version that's fixed > > following version is currently installed with SP3: > > libqb0-1.0.1-2.15.x86_64 Then the only blind bet is that this patch to libqb

Re: [ClusterLabs] corosync/dlm fencing?

2018-07-18 Thread Jan Pokorný
Just minor clarifications (without changing the validity) below: On 17/07/18 21:28 +0200, Jan Pokorný wrote: > > On 16/07/18 11:44 +0200, Philipp Achmüller wrote: >> Unfortunately it is not obvious for me - the "grep fence" is attached >> in my original message.

Re: [ClusterLabs] Antwort: Antw: corosync/dlm fencing?

2018-07-17 Thread Jan Pokorný
On 16/07/18 11:44 +0200, Philipp Achmüller wrote: > Unfortunately it is not obvious for me - the "grep fence" is attached > in my original message. Sifting your logs a bit: > --- > Node: siteb-2 (DC): > 2018-06-28T09:02:23.282153+02:00 siteb-2 pengine[189259]: notice: Move >

Re: [ClusterLabs] Problem with pacemaker init.d script

2018-07-12 Thread Jan Pokorný
On 12/07/18 18:54 +0200, Jan Pokorný wrote: >>> On 12 Jul 2018, at 15:47, Jan Pokorný wrote: >>> On 11/07/18 18:43 +0200, Salvatore D'angelo wrote: >>>>>>> On Wed, 2018-07-11 at 18:43 +0200, Salvatore D'angelo wrote: >>>>>>>> [..

Re: [ClusterLabs] Problem with pacemaker init.d script

2018-07-12 Thread Jan Pokorný
through their native systemd unit files (a tuned collection of directives determining how to optimally run particular daemons) when provided, amongst others. > >> On 12 Jul 2018, at 15:47, Jan Pokorný wrote: >> On 11/07/18 18:43 +0200, Salvatore D'angelo wrote: >>>>>> O

Re: [ClusterLabs] Problem with pacemaker init.d script

2018-07-12 Thread Jan Pokorný
Hello Salvatore, we can cope with that without much trouble, but you seem to have a talent to present multiple related issues at once, or perhaps to start solving the problems from the too distant point :-) As mentioned, that's also fine, but let's separate them... On 11/07/18 18:43 +0200,

Re: [ClusterLabs] cronjobs only on active node

2018-07-10 Thread Jan Pokorný
> "Stefan K" wrote on 10.07.2018 at 11:49 in message > : >> is it somehow possible to have a cronjob active only on the active node? >> On 10/07/18 12:18 +0200, Ulrich Windl wrote: > I had written some script to patch a crontab in HP-UX to enable and > disable specific entries. For Linux the

Re: [ClusterLabs] Upgrade corosync problem

2018-07-09 Thread Jan Pokorný
On 06/07/18 15:25 +0200, Salvatore D'angelo wrote: > On 6 Jul 2018, at 14:40, Christine Caulfield wrote: >> Yes. you can't randomly swap in and out hand-compiled libqb versions. >> Find one that works and stick to it. It's an annoying 'feature' of newer >> linkers that we had to workaround in

Re: [ClusterLabs] Upgrade corosync problem

2018-07-03 Thread Jan Pokorný
On 02/07/18 17:19 +0200, Salvatore D'angelo wrote: > Today I tested the two suggestions you gave me. Here what I did. > In the script where I create my 5 machines cluster (I use three > nodes for pacemaker PostgreSQL cluster and two nodes for glusterfs > that we use for database backup and WAL

Re: [ClusterLabs] Upgrade corosync problem

2018-06-29 Thread Jan Pokorný
per root user may normally be bypassing any such limitations. Good luck. > On Fri 29 Jun 2018, 5:46 PM Jan Pokorný wrote: > >> On 26/06/18 11:03 +0200, Salvatore D'angelo wrote: >>> Yes, sorry you’re right I could find it by myself. >>> However, I did the foll

Re: [ClusterLabs] Upgrade corosync problem

2018-06-29 Thread Jan Pokorný
On 29/06/18 10:00 +0100, Christine Caulfield wrote: > On 27/06/18 08:35, Salvatore D'angelo wrote: >> One thing that I do not understand is that I tried to compare corosync >> 2.3.5 (the old version that worked fine) and 2.4.4 to understand >> differences but I haven’t found anything related to

Re: [ClusterLabs] Upgrade corosync problem

2018-06-29 Thread Jan Pokorný
On 26/06/18 11:03 +0200, Salvatore D'angelo wrote: > Yes, sorry you’re right I could find it by myself. > However, I did the following: > > 1. Added the line you suggested to /etc/fstab > 2. mount -o remount /dev/shm > 3. Now I correctly see /dev/shm of 512M with df -h > Filesystem Size

[ClusterLabs] [questionnaire] Do you overload pacemaker's meta-attributes to track your own data?

2018-06-28 Thread Jan Pokorný
Hello, and since it is a month since the preceding attempt to gather some feedback, welcome to yet another simple set of questions that I will be glad to have answered by as many of you as possible, as an auxiliary indicator what's generally acceptable and what's not within the userbase. This

Re: [ClusterLabs] VM failure during shutdown

2018-06-27 Thread Jan Pokorný
Hello Vaggelis, just a technical meta-note below: On 27/06/18 11:09 +0300, Vaggelis Papastavros wrote: > *My question Ken is : are the below steps (in red enough) to ensure > that the new VM will be placed on the node 1 ?* It's an unwritten convention to refrain from HTML formatted messages

Re: [ClusterLabs] Upgrade corosync problem

2018-06-27 Thread Jan Pokorný
On 26/06/18 17:56 +0200, Salvatore D'angelo wrote: > I did another test. I modified docker container in order to be able to run > strace. > Running strace corosync-quorumtool -ps I got the following: > [snipped] > connect(5, {sa_family=AF_LOCAL, sun_path=@"cfg"}, 110) = 0 > setsockopt(5,

Re: [ClusterLabs] pcs 0.9.165 released

2018-06-25 Thread Jan Pokorný
On 25/06/18 12:08 +0200, Tomas Jelinek wrote: > I am happy to announce the latest release of pcs, version 0.9.165. What a mighty patch/micro version component ;-) With several pacemaker 2.0 release candidates out, it would be perhaps welcome to share details about versioning (branches) politics

Re: [ClusterLabs] Upgrade corosync problem

2018-06-25 Thread Jan Pokorný
On 25/06/18 19:06 +0200, Salvatore D'angelo wrote: > Thanks for reply. I scratched my cluster and created it again and > then migrated as before. This time I uninstalled pacemaker, > corosync, crmsh and resource agents with make uninstall > > then I installed new packages. The problem is the

Re: [ClusterLabs] corosync-qdevice doesn't daemonize (or stay running)

2018-06-21 Thread Jan Pokorný
On 21/06/18 14:44 +0100, Christine Caulfield wrote: > On 21/06/18 14:27, Christine Caulfield wrote: >> >> I just tried this on my Debian VM and it does exactly the same as yours. >> So I think you should report it to the Debian maintainer as it doesn't >> happen on my Fedora or RHEL systems >> >

Re: [ClusterLabs] corosync-qdevice doesn't daemonize (or stay running)

2018-06-21 Thread Jan Pokorný
On 21/06/18 07:05 -0400, Jason Gauthier wrote: > On Thu, Jun 21, 2018 at 5:11 AM Christine Caulfield > wrote: >> On 19/06/18 18:47, Jason Gauthier wrote: >>> Attached! >> >> That's very odd. I can see communication with the server and corosync in >> there (so it's doing something) but no

Re: [ClusterLabs] Resource agents differences from 1.1.14 and 1.1.18

2018-06-21 Thread Jan Pokorný
Hello Salvatore, On 21/06/18 12:44 +0200, Salvatore D'angelo wrote: > I am trying to upgrade my PostgreSQL cluster managed by pacemaker > to pacemaker 1.1.18 or 2.0.0. I have some resource agents that I > patched to have them working with my cluster. > > Can someone tell me if something is

Re: [ClusterLabs] [questionnaire] Do you manage your pacemaker configuration by hand and (if so) what reusability features do you use?

2018-06-14 Thread Jan Pokorný
On 31/05/18 14:48 +0200, Jan Pokorný wrote: > I am soliciting feedback on these CIB features related questions, > please reply (preferably on-list so we have the shared collective > knowledge) if at least one of the questions is answered positively > in your case (just tick th

Re: [ClusterLabs] Ansible role to configure Pacemaker

2018-06-08 Thread Jan Pokorný
On 07/06/18 17:57 +0100, Adam Spiers wrote: > Jan Pokorný wrote: >> While I see why Ansible is compelling, I feel it's important to >> challenge this trend of trying to bend/rebrand _machine-local >> configuration management tool_ as _distributed system managemen

Re: [ClusterLabs] Ansible role to configure Pacemaker

2018-06-07 Thread Jan Pokorný
On 07/06/18 11:08 -0400, Styopa Semenukha wrote: > Thank you for your thoughts, Jan! I agree with the importance of the > topics you raised, and I'd like to comment on them in the light of > our project (and configuration management approach in general). > > On 06/06/2018 08:26

Re: [ClusterLabs] Ansible role to configure Pacemaker

2018-06-06 Thread Jan Pokorný
On 07/06/18 02:19 +0200, Jan Pokorný wrote: > While I see why Ansible is compelling, I feel it's important to > challenge this trend of trying to bend/rebrand _machine-local > configuration management tool_ as _distributed system management tool_ > (pacemaker is distributed applicati

Re: [ClusterLabs] Ansible role to configure Pacemaker

2018-06-06 Thread Jan Pokorný
On 06/06/18 15:51 -0400, Styopa Semenukha wrote: > We wrote a role to configure Pacemaker clusters, and I'd like to share > it with the community. Any questions or comments welcome. Hello and thanks for the announcement. And now, something I've meant to write down regarding configuration

Re: [ClusterLabs] [questionnaire] Do you manage your pacemaker configuration by hand and (if so) what reusability features do you use?

2018-06-05 Thread Jan Pokorný
On 31/05/18 14:48 +0200, Jan Pokorný wrote: > I am soliciting feedback on these CIB features related questions, > please reply (preferably on-list so we have the shared collective > knowledge) if at least one of the questions is answered positively > in your case (just tick th

Re: [ClusterLabs] [questionnaire] Do you manage your pacemaker configuration by hand and (if so) what reusability features do you use?

2018-05-31 Thread Jan Pokorný
On 31/05/18 11:42 -0500, Ken Gaillot wrote: > On Thu, 2018-05-31 at 14:48 +0200, Jan Pokorný wrote: >> I am soliciting feedback on these CIB features related questions, >> please reply (preferably on-list so we have the shared collective >> knowledge) if at least one of the

[ClusterLabs] [questionnaire] Do you manage your pacemaker configuration by hand and (if so) what reusability features do you use?

2018-05-31 Thread Jan Pokorný
Hello, I am soliciting feedback on these CIB features related questions, please reply (preferably on-list so we have the shared collective knowledge) if at least one of the questions is answered positively in your case (just tick the respective "[ ]" boxes as "[x]"). Any other commentary also

Re: [ClusterLabs] pcsd processes using 100% CPU

2018-05-24 Thread Jan Pokorný
On 23/05/18 12:43 -0600, Casey & Gina wrote: > I don't have gcore installed and don't know which package might > provide it. I also don't have experience with gdb but am happy to > try anything suggested to help figure out what's going on. gcore is part of gdb:

Re: [ClusterLabs] pcsd processes using 100% CPU

2018-05-22 Thread Jan Pokorný
On 18/05/18 20:04 +, Shobe, Casey wrote: > On a couple clusters that have been running for a little while > (without fencing), I'm seeing runaway server.rb processes using 100% > of a single CPU core each. > > When I look at ps, I can see that these have something to do with > pcsd: > > USER

Re: [ClusterLabs] Pacemaker resources are not scheduled

2018-04-16 Thread Jan Pokorný
Lkxjtu, On 14/04/18 00:16 +0800, lkxjtu wrote: > My cluster version: > Corosync 2.4.0 > Pacemaker 1.1.16 > > There are many resource anomalies. Some resources are only monitored > and not recovered. Some resources are not monitored or recovered. > Only one resource of vnm is scheduled normally,

Re: [ClusterLabs] Corosync 2.4.4 is available at corosync.org!

2018-04-12 Thread Jan Pokorný
On 12/04/18 14:33 +0200, Jan Friesse wrote: > I am pleased to announce the latest maintenance release of Corosync > 2.4.4 available immediately from our website at > http://build.clusterlabs.org/corosync/releases/. > > This release contains a lot of fixes, including fix for CVE-2018-1084.

Re: [ClusterLabs] Antw: Re: Possible idea for 2.0.0: renaming the Pacemaker daemons

2018-04-11 Thread Jan Pokorný
On 11/04/18 11:03 -0500, Ken Gaillot wrote: > On Wed, 2018-04-11 at 08:49 +0200, Ulrich Windl wrote: > Ken Gaillot wrote on 09.04.2018 at > 19:10 in message >>> I had planned to use the "pcmk-" prefix, but I kept thinking about >>> the goal of making things more

Re: [ClusterLabs] Possible idea for 2.0.0: renaming the Pacemaker daemons

2018-04-11 Thread Jan Pokorný
On 11/04/18 15:24 +0200, Klaus Wenninger wrote: > On 04/11/2018 01:14 AM, Andrew Beekhof wrote: >> you know... I wouldn't be opposed to running two copies (one for >> config, one for status) and having the crmd combine the two before >> sending to the PE. i've toyed with the idea in the past to

Re: [ClusterLabs] Possible idea for 2.0.0: renaming the Pacemaker daemons

2018-04-10 Thread Jan Pokorný
On 10/04/18 09:42 -0500, Ken Gaillot wrote: > On Tue, 2018-04-10 at 08:50 +0200, Jehan-Guillaume de Rorthais > wrote: >> On Tue, 10 Apr 2018 00:54:01 +0200 >> Jan Pokorný <jpoko...@redhat.com> wrote: >>> On 09/04/18 12:10 -0500, Ken Gaillot wrote: >>>>

[ClusterLabs] Pacemaker's additional services for distributed applications (Was: Possible idea for 2.0.0: renaming the Pacemaker daemons)

2018-04-10 Thread Jan Pokorný
On 06/04/18 12:24 +0200, Jan Pokorný wrote: > On 06/04/18 09:09 +0200, Kristoffer Grönlund wrote: >>>> The idea is to provide a more generalized key-value store that >>>> other applications built on top of pacemaker can use. Something >>>> li

Re: [ClusterLabs] Possible idea for 2.0.0: renaming the Pacemaker daemons

2018-04-09 Thread Jan Pokorný
On 09/04/18 12:10 -0500, Ken Gaillot wrote: > Based on the list discussion and feedback I could coax out of others, I > will change the Pacemaker daemon names, including the log tags, for > 2.0.0-rc3. > > I will add symlinks for the old names, to allow help/version/metadata > calls in user

Re: [ClusterLabs] Possible idea for 2.0.0: renaming the Pacemaker daemons

2018-04-06 Thread Jan Pokorný
On 06/04/18 09:09 +0200, Kristoffer Grönlund wrote: > Ken Gaillot writes: >> On Tue, 2018-04-03 at 08:33 +0200, Kristoffer Grönlund wrote: >>> Ken Gaillot writes: >>> > I would vote against PREFIX-configd as compared to other cluster > software,

Re: [ClusterLabs] Possible idea for 2.0.0: renaming the Pacemaker daemons

2018-04-06 Thread Jan Pokorný
On 29/03/18 09:53 -0500, Ken Gaillot wrote: > On Thu, 2018-03-29 at 10:35 +0200, Kristoffer Grönlund wrote: >> Ken Gaillot writes: >>> Here are the current names, with some example replacements: >>> >>>  pacemakerd: PREFIX-launchd, PREFIX-launcher >>> >>>  attrd:

[ClusterLabs] [Announce] clufter v0.77.1 released

2018-03-14 Thread Jan Pokorný
I am happy to announce that clufter, a tool/library for transforming and analyzing cluster configuration formats, got its version 0.77.1 tagged and released (incl. signature using my 60BCBB4F5CD7F9EF key):

Re: [ClusterLabs] Antw: Pacemaker 2.0.0-rc1 now available

2018-02-19 Thread Jan Pokorný
On 19/02/18 10:39 +0100, Ulrich Windl wrote: Ken Gaillot wrote on 16.02.2018 at 22:06 in message > <1518815166.31176.22.ca...@redhat.com>: > [...] >> * The master XML tag is deprecated (though still supported) in favor of > > XML guys! > > Everybody is using

Re: [ClusterLabs] Error when linking to libqb in shared library

2018-02-12 Thread Jan Pokorný
[let's move this to developers list] On 12/02/18 07:22 +0100, Kristoffer Grönlund wrote: > (and especially the libqb developers) > > I started hacking on a python library written in C which links to > pacemaker, and so to libqb as well, but I'm encountering a strange > problem which I don't know

Re: [ClusterLabs] fence-agents-all missing some agents

2018-02-01 Thread Jan Pokorný
On 01/02/18 01:59 -0500, Digimer wrote: > On RHEL 7 (and possibly elsewhere), 'fence-agents-all' doesn't install > the following: > > fence-agents-virsh.x86_64 4.0.11-66.el7_4.3 updates > fence-sanlock.x86_64 3.5.0-1.el7 base > fence-virtd.x86_64

Re: [ClusterLabs] Looking for a Fedora package sponsor

2018-01-30 Thread Jan Pokorný
On 29/01/18 13:19 -0500, Digimer wrote: > On 2018-01-29 11:13 AM, Andrew Price wrote: >> On 29/01/18 01:40, Digimer wrote: >>> Hi all, >>> >>>    I plan to maintain the new kronosnet package for Fedora. To do this >>> (as far as I understand), I'll need a sponsor to get into the package >>>

Re: [ClusterLabs] [IMPORTANT] Fatal, yet rare issue verging on libqb's design flaw and/or it's use in corosync around daemon-forking

2018-01-29 Thread Jan Pokorný
[developers list subscribers, kindly jump to "Current libqb PR" part] On 22/01/18 11:29 +0100, Jan Friesse wrote: >> It was discovered that corosync exposes itself for a self-crash >> under rare circumstance whereby corosync executable is run when there >> is already a daemon instance around

[ClusterLabs] [IMPORTANT] Fatal, yet rare issue verging on libqb's design flaw and/or it's use corosync around daemon-forking

2018-01-22 Thread Jan Pokorný
It was discovered that corosync exposes itself for a self-crash under rare circumstance whereby corosync executable is run when there is already a daemon instance around (does not apply to corosync serving without any backgrounding, i.e. launched with "-f" switch). Such a circumstance can be

[ClusterLabs] [ANTICIPATED FAQ] libqb v1.0.3 vs. binutils' linker (Was: [Announce] libqb 1.0.3 release)

2017-12-21 Thread Jan Pokorný
I've meant to spread the following piece of advice but forgot... On 21/12/17 17:45 +0100, Jan Pokorný wrote: > On 21/12/17 14:40 +, Christine Caulfield wrote: >> We are pleased to announce the release of libqb 1.0.3 >> >> >> Source code is available at: >> htt

Re: [ClusterLabs] [Announce] libqb 1.0.3 release

2017-12-21 Thread Jan Pokorný
On 21/12/17 14:40 +, Christine Caulfield wrote: > We are pleased to announce the release of libqb 1.0.3 > > > Source code is available at: > https://github.com/ClusterLabs/libqb/releases/download/v1.0.3/libqb-1.0.3.tar.xz > > > This is mainly a bug-fix release to 1.0.2 > > [...] Thanks

Re: [ClusterLabs] Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

2017-12-15 Thread Jan Pokorný
On 20/05/16 17:04 +0100, Adam Spiers wrote: > Klaus Wenninger wrote: >> On 05/20/2016 08:39 AM, Ulrich Windl wrote: >>> I think RAs should not rely on "stop" being called multiple times >>> for a resource to be stopped. > > Well, this would be a major architectural change.

Re: [ClusterLabs] Issue with DRBD + a systemd resource

2017-12-14 Thread Jan Pokorný
On 14/12/17 20:59 +0300, Andrei Borzenkov wrote: > 14.12.2017 19:25, Jan Pokorný wrote: >> On 14/12/17 10:49 -0500, Julien Semaan wrote: >>> Great success! >>> >>> Adding the following line to /usr/lib/systemd/system/pacemaker.service did >>

Re: [ClusterLabs] Issue with DRBD + a systemd resource

2017-12-14 Thread Jan Pokorný
On 14/12/17 17:25 +0100, Jan Pokorný wrote: > Anyway, the change is seemingly straightforward, but a few things > should be answered/investigated first: > - After=dbus.service or rather After=dbus.socket (or both)? In theory, dbus.socket would be more flexible should anyone want to

Re: [ClusterLabs] Issue with DRBD + a systemd resource

2017-12-14 Thread Jan Pokorný
On 14/12/17 17:25 +0100, Jan Pokorný wrote: > On 14/12/17 10:49 -0500, Julien Semaan wrote: >> Great success! >> >> Adding the following line to /usr/lib/systemd/system/pacemaker.service did >> it: >> After=dbus.service > > [...] > > Anyway, th

Re: [ClusterLabs] Issue with DRBD + a systemd resource

2017-12-14 Thread Jan Pokorný
On 14/12/17 10:49 -0500, Julien Semaan wrote: > Great success! > > Adding the following line to /usr/lib/systemd/system/pacemaker.service did > it: > After=dbus.service Note that this is not the proper way to override systemd unit files; the proper way is rather along these lines: - make a copy to

Re: [ClusterLabs] Pacemaker/Corosync on FreeBSD

2017-12-06 Thread Jan Pokorný
On 06/12/17 17:37 -0400, Alberto Mijares wrote: >> >>> If I configure everything by hand (no crmsh nor pcsd) should it >>> work? >> >> Definitely (and if not, we want to know). > > Jan, please check this PR > > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224100 > > Apparently, it works.

Re: [ClusterLabs] Antw: Re: questions about startup fencing

2017-12-05 Thread Jan Pokorný
On 05/12/17 10:01 +0100, Tomas Jelinek wrote: > The first attempt to fix the issue was to put nodes into standby mode with > --lifetime=reboot: > https://github.com/ClusterLabs/pcs/commit/ea6f37983191776fd46d90f22dc1432e0bfc0b91 > > This didn't work for several reasons. One of them was back then

Re: [ClusterLabs] Pacemaker/Corosync on FreeBSD

2017-12-04 Thread Jan Pokorný
Hello Alberto, On 04/12/17 16:12 -0400, Alberto Mijares wrote: > At this point, I need to know if someone is using pacemaker/corosync > on FreeBSD. Is it a problem with crmsh only? Well, it's enough to have a look at who develops these high-level tools (crm, pcs) and you'll figure out

Re: [ClusterLabs] systemd's TasksMax and pacemaker

2017-12-02 Thread Jan Pokorný
On 15/11/17 11:16 +0100, Jan Pokorný wrote: > On 14/11/17 15:07 -0600, Ken Gaillot wrote: >> It is conceivable in a large cluster that Pacemaker could exceed >> this limit > > [of 512 or 4915 tasks allowed per service process tree, possibly > overridden with systemd-syst

[ClusterLabs] Should pacemaker pursue its own and corosync's instant resurrection if either dies? (Was: Is corosync supposed to be restarted if it dies?)

2017-12-02 Thread Jan Pokorný
On 30/11/17 11:00 +0300, Andrei Borzenkov wrote: > On Thu, Nov 30, 2017 at 12:42 AM, Jan Pokorný <jpoko...@redhat.com> wrote: >> On 29/11/17 22:00 +0100, Jan Pokorný wrote: >>> On 28/11/17 22:35 +0300, Andrei Borzenkov wrote: >>>> I'm not sure what is exp

Re: [ClusterLabs] Is corosync supposed to be restarted if it fies?

2017-11-29 Thread Jan Pokorný
On 29/11/17 22:00 +0100, Jan Pokorný wrote: > On 28/11/17 22:35 +0300, Andrei Borzenkov wrote: >> 28.11.2017 13:01, Jan Pokorný wrote: >>> On 27/11/17 17:43 +0300, Andrei Borzenkov wrote: >>>> Sent from iPhone >>>> >>>>> 27 Nov.

Re: [ClusterLabs] Is corosync supposed to be restarted if it fies?

2017-11-29 Thread Jan Pokorný
On 28/11/17 22:35 +0300, Andrei Borzenkov wrote: > 28.11.2017 13:01, Jan Pokorný wrote: >> On 27/11/17 17:43 +0300, Andrei Borzenkov wrote: >>> Sent from iPhone >>> >>>> On 27 Nov 2017, at 14:36, Ferenc Wágner <wf...@niif.hu> wrote: >

Re: [ClusterLabs] Is corosync supposed to be restarted if it fies?

2017-11-28 Thread Jan Pokorný
On 27/11/17 17:43 +0300, Andrei Borzenkov wrote: > Sent from iPhone > >> On 27 Nov 2017, at 14:36, Ferenc Wágner wrote: >> >> Andrei Borzenkov writes: >> >>> 25.11.2017 10:05, Andrei Borzenkov wrote: >>> In one of guides suggested procedure

Re: [ClusterLabs] pcs create master/slave resource doesn't work

2017-11-23 Thread Jan Pokorný
On 23/11/17 23:52 +0800, Hui Xiang wrote: > I am working on HA with 3-nodes, which has below configurations: > > """ > pcs resource create ovndb_servers ocf:ovn:ovndb-servers \ > master_ip=168.254.101.2 \ > op monitor interval="10s" \ > op monitor interval="11s" role=Master > pcs resource

Re: [ClusterLabs] systemd's TasksMax and pacemaker

2017-11-15 Thread Jan Pokorný
On 14/11/17 15:07 -0600, Ken Gaillot wrote: > It is conceivable in a large cluster that Pacemaker could exceed > this limit [of 512 or 4915 tasks allowed per service process tree, possibly overridden with systemd-system.conf(5) configuration], > so we are now recommending that users set

Re: [ClusterLabs] [Announce] clufter v0.77.0 released

2017-11-10 Thread Jan Pokorný
On 10/11/17 23:25 +0100, Jan Pokorný wrote: > - bug fixes: > [...] > . all commands having sequence of pcs commands on the output, > hence getting post-processed (line-wrapped and generally > prettified) with the aim to get them human-friendly, might > previousl

[ClusterLabs] [Announce] clufter v0.77.0 released

2017-11-10 Thread Jan Pokorný
I am happy to announce that clufter, a tool/library for transforming and analyzing cluster configuration formats, got its version 0.77.0 tagged and released (incl. signature using my 60BCBB4F5CD7F9EF key):
