[ClusterLabs] [Announce] libqb 2.0.8 released

2023-07-21 Thread Christine caulfield
We are pleased to announce the release of libqb 2.0.8 https://github.com/ClusterLabs/libqb/releases/tag/v2.0.8 The main purpose of this release is to fix a potential memory overwrite caused by very long log messages, so an upgrade is recommended.

[ClusterLabs] [Announce] libqb 2.0.7 released

2023-06-07 Thread Christine caulfield
We are pleased to announce the release of libqb 2.0.7 https://github.com/ClusterLabs/libqb/releases/tag/v2.0.7 This release mainly fixes build and test issues (especially building with -j which is now supported), but there are a few obscure bugfixes in here too that are worthwhile upgrading

Re: [ClusterLabs] pacemaker-fenced /dev/shm errors

2023-03-27 Thread Christine caulfield
On 27/03/2023 07:48, d tbsky wrote: Hi: the cluster is running under RHEL 9.0 elements. today I saw log report strange errors like below: Mar 27 13:07:06.287 example.com pacemaker-fenced[2405] (qb_sys_mmap_file_open) error: couldn't allocate file

Re: [ClusterLabs] pacemaker-remoted /dev/shm errors

2023-03-06 Thread Christine caulfield
Hi, The error is coming from libqb - which is what manages the local IPC connections between local clients and the server. I'm the libqb maintainer but I've never seen that error before! Is there anything unusual about the setup on this node? Like filesystems on NFS or some other networked

Re: [ClusterLabs] corosync not starting

2022-06-29 Thread Christine caulfield
On 27/06/2022 17:10, Sridhar K wrote: Hi Team, corosync not starting , getting below error any port number which I can do telnet and check similar to that of 2224 for pcs The error message from Corosync is "no interfaces defined" - so it looks like the node(s) being

Re: [ClusterLabs] No node name in corosync-cmapctl output

2022-06-01 Thread Christine caulfield
On 01/06/2022 11:17, Jan Friesse wrote: On 31/05/2022 16:28, Andreas Hasenack wrote: Hi, On Tue, May 31, 2022 at 1:35 PM Jan Friesse wrote: Hi, On 31/05/2022 15:16, Andreas Hasenack wrote: Hi, corosync 3.1.6 pacemaker 2.1.2 crmsh 4.3.1 TL;DR I only seem to get a "name" attribute in the

Re: [ClusterLabs] Corosync Transport- Knet Vs UDPU

2022-03-28 Thread Christine caulfield
On 28/03/2022 03:30, Somanath Jeeva via Users wrote: Hi , I am upgrading from corosync 2.x/pacemaker 1.x to corosync 3.x/pacemaker 2.1.x In our use case we are using a 2 node corosync/pacemaker cluster. In corosync 2.x version I was using udpu as transport method. In the corosync 3.x , as

[ClusterLabs] [Announce] libqb 2.0.6 released

2022-03-23 Thread Christine caulfield
A quick update to 2.0.5 that fixes the tests and RPM building. *the new ipc_sock tests needs to be run as root as otherwise each sub-test will timeout - making the run-time huge. *Make sure that the libstat_wrapper.so library is included in the libqb-tests RPM (when built) If you

[ClusterLabs] [Announce] libqb 2.0.5 released

2022-03-21 Thread Christine caulfield
We are pleased to announce the release of libqb 2.0.5 The headline feature of this release is the addition of the new qb_ipcc_connect_async() API call, but there are lots of smaller fixes that should be helpful. Chrissie Caulfield (7): ipcc: Add an async connect API (#450) Tidy some scripts

[ClusterLabs] [Announce] libqb 2.0.4 released

2021-11-15 Thread Christine caulfield
We are pleased to announce the release of libqb 2.0.4 Source code is available at: https://github.com/ClusterLabs/libqb/releases/ Please use the signed .tar.gz or .tar.xz files with the version number in rather than the github-generated "Source Code" ones. The most important fix in this

Re: [ClusterLabs] Two node cluster without fencing and no split brain?

2021-07-21 Thread Christine caulfield
On 21/07/2021 09:50, Frank D. Engel, Jr. wrote: OpenVMS can do this sort of thing without a requirement for fencing (you still need a third disk as a quorum device in a 2-node cluster), but Linux (at least in its current form) cannot. From what I can tell the fencing requirements in the Linux

[ClusterLabs] [Announce] libqb 2.0.3 released

2021-03-03 Thread Christine Caulfield
We are pleased to announce the release of libqb 2.0.3. This is the latest stable release of libqb Source code is available at: https://github.com/ClusterLabs/libqb/releases/download/2.0.3/libqb-2.0.3.tar.xz Please use the signed .tar.gz or .tar.xz files with the version number in rather than

Re: [ClusterLabs] Q: efficiently collecting some cluster facts

2021-02-24 Thread Christine Caulfield
The most efficient way of getting corosync facts about nodes/quorum is to use the votequorum API. See /usr/include/corosync/votequorum.h and, in the corosync sources tarball, tests/testvotequorum1.c Chrissie On 25/02/2021 07:16, Ulrich Windl wrote: Hi! I'm thinking about some simple cluster

Re: [ClusterLabs] corosync.conf is missing, I did not delete manually. what should I do?

2021-02-16 Thread Christine Caulfield
If you ran pcs cluster destroy then, yes that will delete corosync.conf (at least it did when I just tried it) - which seems reasonable behaviour to me. If you want it back then you should either rerun pcs to create the cluster again or rescue the file from system backups I suppose. Chrissie

Re: [ClusterLabs] Corosync node gets unique Ring ID

2021-01-27 Thread Christine Caulfield
A few things really stand out from this report, I think the inconsistent ring_id is just a symptom. It worries me that corosync-quorumtool behaves differently on some nodes - some show names, some just IP addresses. That could be a cause of some inconsistency. Also the messages " Jan 26

[ClusterLabs] [Announce] [Alpha] Rust bindings for Corosync libraries

2021-01-20 Thread Christine Caulfield
I don't know how many/few people will be interested in this, but I have been working on some Rust bindings for the corosync libraries: cpg, cfg, cmap, quorum & votequorum. They are currently in Alpha stage but all features are (I think) implemented and seem to work. There's a little more work

Re: [ClusterLabs] Running shell command on remote node via corosync messaging infrastructure

2021-01-04 Thread Christine Caulfield
On 04/01/2021 13:19, Klaus Wenninger wrote: On 1/4/21 1:50 PM, Christine Caulfield wrote: On 04/01/2021 09:21, Klaus Wenninger wrote: On 1/4/21 8:36 AM, Christine Caulfield wrote: On 18/12/2020 20:41, Andrei Borzenkov wrote: 18.12.2020 21:54, Ken Gaillot пишет: On Fri, 2020-12-18

Re: [ClusterLabs] Running shell command on remote node via corosync messaging infrastructure

2021-01-04 Thread Christine Caulfield
On 04/01/2021 09:21, Klaus Wenninger wrote: On 1/4/21 8:36 AM, Christine Caulfield wrote: On 18/12/2020 20:41, Andrei Borzenkov wrote: 18.12.2020 21:54, Ken Gaillot пишет: On Fri, 2020-12-18 at 17:51 +, Animesh Pande wrote: Hello, Is there a tool that would allow for commands

Re: [ClusterLabs] Running shell command on remote node via corosync messaging infrastructure

2021-01-03 Thread Christine Caulfield
On 18/12/2020 20:41, Andrei Borzenkov wrote: 18.12.2020 21:54, Ken Gaillot пишет: On Fri, 2020-12-18 at 17:51 +, Animesh Pande wrote: Hello, Is there a tool that would allow for commands to be run on remote nodes in the cluster through the corosync messaging layer? I have a cluster

[ClusterLabs] [Announce] libqb 2.0.2 released

2020-12-03 Thread Christine Caulfield
IN32 (#424) doxygen2man: Fix a couple of covscan-detected errors (#425) cov: Quieten some covscan warnings (#427) Christine Caulfield (1): lib: Update library version for 2.0.2 release Hideo Yamauchi (1): ipcs : Decrease log level. (#426) wferi (1): doc related fi

Re: [ClusterLabs] Antw: [EXT] Re: Q: cryptic messages from "QB"

2020-11-26 Thread Christine Caulfield
On 25/11/2020 13:04, Ulrich Windl wrote: Christine Caulfield schrieb am 25.11.2020 um 10:17 in Nachricht <56738406-9222-a9f3-c57c-e30400a0b...@redhat.com>: On 25/11/2020 08:45, Ulrich Windl wrote: Hi! Setting up a cluster in SLES15 SP2, I wonder about a few log messages: 1) what do

[ClusterLabs] [Announce] libqb 2.0.1 released

2020-07-29 Thread Christine Caulfield
We are pleased to announce the release of libqb 2.0.1. This is the latest stable release of libqb Source code is available at: https://github.com/ClusterLabs/libqb/releases/download/2.0.1/libqb-2.0.1.tar.xz Please use the signed .tar.gz or .tar.xz files with the version number in rather than

Re: [ClusterLabs] clusterlabs.github.io

2020-06-29 Thread Christine Caulfield
On 29/06/2020 10:27, Jehan-Guillaume de Rorthais wrote: > On Mon, 29 Jun 2020 09:27:00 +0100 > Christine Caulfield wrote: > >> Is anyone (else) using this? > > I do: https://clusterlabs.github.io/PAF/ > >> We publish the libqb man pages to clusterlabs.github.i

[ClusterLabs] clusterlabs.github.io

2020-06-29 Thread Christine Caulfield
Is anyone (else) using this? We publish the libqb man pages to clusterlabs.github.io/libqb but I can't see any other clusterlabs projects using it (just by adding, eg, /pacemaker to the hostname). With libqb 2.0.1 having actual man pages installed with it - which seems far more useful to me - I

Re: [ClusterLabs] Linux 8.2 - high totem token requires manual setting of ping_interval and ping_timeout

2020-06-26 Thread Christine Caulfield
On 26/06/2020 07:56, Jan Friesse wrote: > Robert, > thank you for the info/report. More comments inside. > >> All, >> Hello.  Hope all is well.   I have been researching Oracle Linux 8.2 >> and ran across a situation that is not well documented.   I decided to >> provide some details to the

[ClusterLabs] libqb 2.0.0 released

2020-05-04 Thread Christine Caulfield
We are pleased to announce the release of libqb 2.0.0. This is the latest stable release of libqb Source code is available at: https://github.com/ClusterLabs/libqb/releases/download/2.0.0/libqb-2.0.0.tar.xz Please use the signed .tar.gz or .tar.xz files with the version number in rather than

[ClusterLabs] [Announce] libqb 1.0.6 released

2020-04-29 Thread Christine Caulfield
with the version number in rather than the github-generated "Source Code" ones. Chrissie Shortlog: Christine Caulfield (3): bump version for 1.0.6 Backported fixes to allow applications to compile using gcc10 (#392) Fix error in CI tests - make distcheck Jan Pokorný (9): tests: ipc: avoid pro

[ClusterLabs] [Announce] libqb 1.9.1 released

2020-03-18 Thread Christine Caulfield
We are pleased to announce the release of libqb 1.9.1 - this is a release candidate for a future 2.0 release Source code is available at: https://github.com/ClusterLabs/libqb/releases/download/1.9.0/libqb-1.9.1.tar.xz Please use the signed .tar.gz or .tar.xz files with the version number in

Re: [ClusterLabs] [Announce] libqb 1.9.0 released

2020-01-13 Thread Christine Caulfield
On 13/12/2019 15:00, Yan Gao wrote: > Hi Christine, > > Congratulations and thanks for the release! > > As previously brought from: > https://github.com/ClusterLabs/libqb/issues/338#issuecomment-503155816 > > , the master branch has this too: > >

Re: [ClusterLabs] [Announce] libqb 1.9.0 released

2020-01-06 Thread Christine Caulfield
> Does it mean the master branch is somehow not impacted by the issues, or > some other solutions are being sought there? Thanks. > > Regards, >Yan > > > > On 12/12/19 5:37 PM, christine caulfield wrote: >> We are pleased to announce the release of libqb 1.

[ClusterLabs] [Announce] libqb 1.9.0 released

2019-12-12 Thread christine caulfield
We are pleased to announce the release of libqb 1.9.0 - this is a release candidate for a future 2.0 release Source code is available at: https://github.com/ClusterLabs/libqb/releases/download/1.9.0/libqb-1.9.0.tar.xz Please use the signed .tar.gz or .tar.xz files with the version number in

Re: [ClusterLabs] corosync 3.0.1 on Debian/Buster reports some MTU errors

2019-11-21 Thread christine caulfield
On 18/11/2019 21:31, Jean-Francois Malouin wrote: Hi, Maybe not directly a pacemaker question but maybe some of you have seen this problem: A 2 node pacemaker cluster running corosync-3.0.1 with dual communication ring sometimes reports errors like this in the corosync log file: [KNET ]

Re: [ClusterLabs] Announcing ClusterLabs Summit 2020

2019-11-12 Thread christine caulfield
On 11/11/2019 13:21, Thomas Lamprecht wrote: On 11/5/19 3:07 AM, Ken Gaillot wrote: Hi all, A reminder: We are still interested in ideas for talks, and rough estimates of potential attendees. "Maybe" is perfectly fine at this stage. It will let us negotiate hotel rates and firm up the location

Re: [ClusterLabs] DLM in the cluster can tolerate more than one node failure at the same time?

2019-10-23 Thread christine caulfield
On 22/10/2019 07:15, Gang He wrote: Hi List, I remember that master node has the full copy for one DLM lock resource and the other nodes have their own lock status, then if one node is failed(or fenced), the DLM lock status can be recovered from the remained node quickly. My question is, if

[ClusterLabs] [Announce] libqb 1.0.4 release

2019-04-15 Thread Christine Caulfield
We are pleased to announce the release of libqb 1.0.4 Source code is available at: https://github.com/ClusterLabs/libqb/releases/download/v1.0.4/libqb-1.0.4.tar.xz Please use the signed .tar.gz or .tar.xz files with the version number in rather than the github-generated "Source Code" ones.

Re: [ClusterLabs] Can subsequent rings be added to established cluster?

2019-02-25 Thread Christine Caulfield
On 21/02/2019 18:33, lejeczek wrote: > hi guys > > as per the subject. > > Would there be some nice docs/howto? Or maybe it's just standard op > procedure? > With corosync 3 you can add links (similar to rings from the user POV) dynamically just by adding the necessary ringX_addr entries to

Re: [ClusterLabs] corosync SCHED_RR stuck at 100% cpu usage with kernel 4.19, priority inversion/livelock?

2019-02-18 Thread Christine Caulfield
On 15/02/2019 16:58, Edwin Török wrote: > On 15/02/2019 16:08, Christine Caulfield wrote: >> On 15/02/2019 13:06, Edwin Török wrote: >>> I tried again with 'debug: trace', lots of process pause here: >>> https://clbin.com/ZUHpd >>> >>> And here

Re: [ClusterLabs] corosync SCHED_RR stuck at 100% cpu usage with kernel 4.19, priority inversion/livelock?

2019-02-15 Thread Christine Caulfield
On 15/02/2019 13:06, Edwin Török wrote: > > > On 15/02/2019 11:12, Christine Caulfield wrote: >> On 15/02/2019 10:56, Edwin Török wrote: >>> On 15/02/2019 09:31, Christine Caulfield wrote: >>>> On 14/02/2019 17:33, Edwin Török wrote: >>>>> Hell

Re: [ClusterLabs] corosync SCHED_RR stuck at 100% cpu usage with kernel 4.19, priority inversion/livelock?

2019-02-15 Thread Christine Caulfield
On 15/02/2019 10:56, Edwin Török wrote: > On 15/02/2019 09:31, Christine Caulfield wrote: >> On 14/02/2019 17:33, Edwin Török wrote: >>> Hello, >>> >>> We were testing corosync 2.4.3/libqb 1.0.1-6/sbd 1.3.1/gfs2 on 4.19 and >>> noticed a fundamental pro

Re: [ClusterLabs] corosync SCHED_RR stuck at 100% cpu usage with kernel 4.19, priority inversion/livelock?

2019-02-15 Thread Christine Caulfield
On 14/02/2019 17:33, Edwin Török wrote: > Hello, > > We were testing corosync 2.4.3/libqb 1.0.1-6/sbd 1.3.1/gfs2 on 4.19 and > noticed a fundamental problem with realtime priorities: > - corosync runs on CPU3, and interrupts for the NIC used by corosync are > also routed to CPU3 > - corosync runs

Re: [ClusterLabs] Corosync 3.0.0 is available at corosync.org!

2018-12-17 Thread Christine Caulfield
On 17/12/2018 12:14, Jan Pokorný wrote: > On 17/12/18 10:04 +0000, Christine Caulfield wrote: >> On 17/12/2018 09:34, Ulrich Windl wrote: >>> I wonder: Is there a migration script that can converts corosync.conf files? >>> At least you have a few version co

Re: [ClusterLabs] Antw: Corosync 3.0.0 is available at corosync.org!

2018-12-17 Thread Christine Caulfield
On 17/12/2018 09:34, Ulrich Windl wrote: Jan Friesse schrieb am 14.12.2018 um 15:06 in > Nachricht > <991569e4-2430-30f1-1bbc-827be7637...@redhat.com>: > [...] >> ‑ UDP/UDPU transports are still present, but supports only single ring >> (RRP is gone in favor of Knet) and doesn't support

Re: [ClusterLabs] Antw: Re: Corosync 3 release plans?

2018-10-01 Thread Christine Caulfield
On 01/10/18 07:45, Ulrich Windl wrote: >>>> Ferenc Wágner schrieb am 27.09.2018 um 21:16 > in > Nachricht <87zhw23g5p@lant.ki.iif.hu>: >> Christine Caulfield writes: >> >>> I'm also looking into high‑res timestamps for logfiles too. >>

Re: [ClusterLabs] Corosync 3 release plans?

2018-09-28 Thread Christine Caulfield
On 27/09/18 20:16, Ferenc Wágner wrote: > Christine Caulfield writes: > >> I'm also looking into high-res timestamps for logfiles too. > > Wouldn't that be a useful option for the syslog output as well? I'm > sometimes concerned by the batching effect added by th

Re: [ClusterLabs] Corosync 3 release plans?

2018-09-27 Thread Christine Caulfield
On 27/09/18 16:01, Ken Gaillot wrote: > On Thu, 2018-09-27 at 09:58 -0500, Ken Gaillot wrote: >> On Thu, 2018-09-27 at 15:32 +0200, Ferenc Wágner wrote: >>> Christine Caulfield writes: >>> >>>> TBH I would be quite happy to leave this to logrotate but t

Re: [ClusterLabs] Corosync 3 release plans?

2018-09-27 Thread Christine Caulfield
On 27/09/18 12:52, Ferenc Wágner wrote: > Christine Caulfield writes: > >> I'm looking into new features for libqb and the option in >> https://github.com/ClusterLabs/libqb/issues/142#issuecomment-76206425 >> looks like a good option to me. > > It feels

Re: [ClusterLabs] Corosync 3 release plans?

2018-09-27 Thread Christine Caulfield
On 26/09/18 09:21, Ferenc Wágner wrote: > Jan Friesse writes: > >> wagner.fer...@kifu.gov.hu writes: >> >>> triggered by your favourite IPC mechanism (SIGHUP and SIGUSRx are common >>> choices, but logging.* cmap keys probably fit Corosync better). That >>> would enable proper log rotation. >>

Re: [ClusterLabs] Corosync 3 release plans?

2018-09-24 Thread Christine Caulfield
On 24/09/18 13:12, Ferenc Wágner wrote: > Jan Friesse writes: > >> Have you had a time to play with packaging current alpha to find out >> if there are no issues? I had no problems with Fedora, but Debian has >> a lot of patches, and I would be really grateful if we could reduce >> them a lot -

Re: [ClusterLabs] short circuiting the corosync token timeout

2018-08-13 Thread Christine Caulfield
On 13/08/18 09:00, Jan Friesse wrote: > Chris Walker napsal(a): >> Hello, >> >> Before Pacemaker can declare a node as 'offline', the Corosync layer >> must first declare that the node is no longer part of the cluster >> after waiting a full token timeout.  For example, if I manually >> STONITH a

Re: [ClusterLabs] Upgrade corosync problem

2018-07-06 Thread Christine Caulfield
ose' failed.* > > anything is logged (even in debug mode). > > I do not understand why installing libqb during the normal upgrade > process fails while if I upgrade it after the > crmsh/pacemaker/corosync/resourceagents upgrade it works fine.  > > On 3 Jul 2018, at 11:42, Chri

Re: [ClusterLabs] Found libqb issue that affects pacemaker 1.1.18

2018-07-06 Thread Christine Caulfield
On 06/07/18 10:09, Salvatore D'angelo wrote: > I closed the issue. > Libqb uses tagging and people should not download the Source code (zip) >  or Source > code (tar.gz) . > The

Re: [ClusterLabs] Upgrade corosync problem

2018-07-03 Thread Christine Caulfield
On 03/07/18 07:53, Jan Pokorný wrote: > On 02/07/18 17:19 +0200, Salvatore D'angelo wrote: >> Today I tested the two suggestions you gave me. Here what I did. >> In the script where I create my 5 machines cluster (I use three >> nodes for pacemaker PostgreSQL cluster and two nodes for glusterfs

Re: [ClusterLabs] Upgrade corosync problem

2018-07-02 Thread Christine Caulfield
On 29/06/18 17:20, Jan Pokorný wrote: > On 29/06/18 10:00 +0100, Christine Caulfield wrote: >> On 27/06/18 08:35, Salvatore D'angelo wrote: >>> One thing that I do not understand is that I tried to compare corosync >>> 2.3.5 (the old version that worked fi

Re: [ClusterLabs] Upgrade corosync problem

2018-06-29 Thread Christine Caulfield
On 27/06/18 08:35, Salvatore D'angelo wrote: > Hi, > > Thanks for reply and detailed explaination. I am not using the > —network=host option. > I have a docker image based on Ubuntu 14.04 where I only deploy this > additional software: > > *RUN apt-get update && apt-get install -y wget git

Re: [ClusterLabs] Upgrade corosync problem

2018-06-26 Thread Christine Caulfield
gle but results where > quite confusing. > It's pretty unlikely to be the crypto libraries. It's almost certainly in libqb, with a small possibility that of corosync. Which versions did you have that worked (libqb and corosync) ? Chrissie > >> On 26 Jun 2018, at 12:27, Chr

Re: [ClusterLabs] Upgrade corosync problem

2018-06-26 Thread Christine Caulfield
On 26/06/18 11:24, Salvatore D'angelo wrote: > Hi, > > I have tried with: > 0.16.0.real-1ubuntu4 > 0.16.0.real-1ubuntu5 > > which version should I try? Hmm both of those are actually quite old! maybe a newer one? Chrissie > >> On 26 Jun 2018, at 12:03, Chris

Re: [ClusterLabs] Upgrade corosync problem

2018-06-26 Thread Christine Caulfield
to the code. > Anyone can help? > Have you tried downgrading libqb to the previous version to see if it still happens? Chrissie >> On 26 Jun 2018, at 11:56, Christine Caulfield <ccaul...@redhat.com> wrote: >> >> On 26/06/18 10:35, Salva

Re: [ClusterLabs] Upgrade corosync problem

2018-06-26 Thread Christine Caulfield
rosync so it does not contains lines of previous >> executions. >> >> >> But the command: >> corosync-quorumtool -ps >> >> still give: >> Cannot initialize QUORUM service >> >> Consider that few minutes before it gave me the message: >

Re: [ClusterLabs] Upgrade corosync problem

2018-06-26 Thread Christine Caulfield
      64M   11M   54M  16% /dev/shm > > but I do not know how to do that. Any suggestion? > According to google, you just add a new line to /etc/fstab for /dev/shm tmpfs /dev/shm tmpfs defaults,size=512m 0 0 Chrissie >> On 26 Jun 2018, at 09:48, Christine Caulfield >

Re: [ClusterLabs] Upgrade corosync problem

2018-06-26 Thread Christine Caulfield
On 25/06/18 20:41, Salvatore D'angelo wrote: > Hi, > > Let me add here one important detail. I use Docker for my test with 5 > containers deployed on my Mac. > Basically the team that worked on this project installed the cluster on soft > layer bare metal. > The PostgreSQL cluster was hard to

Re: [ClusterLabs] Upgrade corosync problem

2018-06-25 Thread Christine Caulfield
rce temporarily unavailable (11) [17323] pg1 corosync error [QB] Error in connection setup (17324-17334-23): Resource temporarily unavailable (11) [17323] pg1 corosync debug [QB] qb_ipcs_disconnect(17324-17334-23) state:0 is /dev/shm full? Chrissie > > >> On 22 Jun 2018, at
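The question "is /dev/shm full?" can be answered from a script as well as with df. The following is a minimal sketch, not something taken from the thread: the helper names usage_percent and is_nearly_full and the 90% threshold are assumptions chosen for illustration. It only reports how much of the filesystem backing a path is in use.

```python
import os
import shutil


def usage_percent(path: str) -> float:
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    if usage.total == 0:
        # Some pseudo-filesystems report a size of zero; treat them as empty.
        return 0.0
    return 100.0 * usage.used / usage.total


def is_nearly_full(path: str, threshold: float = 90.0) -> bool:
    """True when usage is at or above `threshold` percent (threshold is an assumption)."""
    return usage_percent(path) >= threshold


if __name__ == "__main__":
    shm = "/dev/shm"
    if os.path.isdir(shm):
        print(f"{shm}: {usage_percent(shm):.0f}% used")
        if is_nearly_full(shm):
            print("libqb may be unable to allocate IPC buffers here")
    else:
        print(f"{shm} does not exist on this system")
```

The figure may differ slightly from the percentage df prints, because df excludes blocks reserved for root from its calculation.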

Re: [ClusterLabs] Upgrade corosync problem

2018-06-22 Thread Christine Caulfield
first. If you enable debug logging in corosync.conf: logging { to_syslog: yes debug: on } Then see what happens and post the syslog file that has all of the corosync messages in it, we'll take it from there. Chrissie >> On 22 Jun 2018, at 11:30, Christine Caulfield wrote:
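The logging stanza quoted above is flattened onto one line by the archive. As a small sketch, the helper below (render_logging_section is a hypothetical name, not part of corosync or pcs) prints the same two settings laid out as they would normally appear in corosync.conf. It covers only the to_syslog and debug keys mentioned in the message.

```python
def render_logging_section(to_syslog: bool = True, debug: bool = True) -> str:
    """Render a corosync.conf logging section with the two keys from the message."""
    lines = [
        "logging {",
        f"    to_syslog: {'yes' if to_syslog else 'no'}",
        f"    debug: {'on' if debug else 'off'}",
        "}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the stanza so it can be compared with the existing corosync.conf.
    print(render_logging_section(), end="")
```

The output is meant to be merged by hand into an existing logging section rather than appended blindly, since a real corosync.conf usually already has one.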

Re: [ClusterLabs] Upgrade corosync problem

2018-06-22 Thread Christine Caulfield
                ring0_addr: pg2 >                 ring1_addr: pg2p >                 nodeid: 2 >         } >         node { >                 ring0_addr: pg3 >                 ring1_addr: pg3p >                 nodeid: 3 >         } > } > logging { >         to

Re: [ClusterLabs] Upgrade corosync problem

2018-06-22 Thread Christine Caulfield
On 21/06/18 16:16, Salvatore D'angelo wrote: > Hi, > > I upgraded my PostgreSQL/Pacemaker cluster with these versions. > Pacemaker 1.1.14 -> 1.1.18 > Corosync 2.3.5 -> 2.4.4 > Crmsh 2.2.0 -> 3.0.1 > Resource agents 3.9.7 -> 4.1.1 > > I started on a first node  (I am trying one node at a time

Re: [ClusterLabs] corosync-qdevice doesn't daemonize (or stay running)

2018-06-21 Thread Christine Caulfield
On 21/06/18 14:27, Christine Caulfield wrote: > On 21/06/18 12:05, Jason Gauthier wrote: >> On Thu, Jun 21, 2018 at 5:11 AM Christine Caulfield >> wrote: >>> >>> On 19/06/18 18:47, Jason Gauthier wrote: >>>> On Tue, Jun 19, 2018 at 6:58 AM Christine

Re: [ClusterLabs] corosync-qdevice doesn't daemonize (or stay running)

2018-06-21 Thread Christine Caulfield
On 21/06/18 12:05, Jason Gauthier wrote: > On Thu, Jun 21, 2018 at 5:11 AM Christine Caulfield > wrote: >> >> On 19/06/18 18:47, Jason Gauthier wrote: >>> On Tue, Jun 19, 2018 at 6:58 AM Christine Caulfield >>> wrote: >>>> >>>> O

Re: [ClusterLabs] corosync-qdevice doesn't daemonize (or stay running)

2018-06-21 Thread Christine Caulfield
On 19/06/18 18:47, Jason Gauthier wrote: > On Tue, Jun 19, 2018 at 6:58 AM Christine Caulfield > wrote: >> >> On 19/06/18 11:44, Jason Gauthier wrote: >>> On Tue, Jun 19, 2018 at 3:25 AM Christine Caulfield >>> wrote: >>>> >>>

Re: [ClusterLabs] corosync-qdevice doesn't daemonize (or stay running)

2018-06-19 Thread Christine Caulfield
On 19/06/18 11:44, Jason Gauthier wrote: > On Tue, Jun 19, 2018 at 3:25 AM Christine Caulfield > wrote: >> >> On 19/06/18 02:46, Jason Gauthier wrote: >>> Greetings, >>> >>>I've just discovered corosync-qdevice and corosync-qnet. >>> (Th

Re: [ClusterLabs] corosync-qdevice doesn't daemonize (or stay running)

2018-06-19 Thread Christine Caulfield
On 19/06/18 02:46, Jason Gauthier wrote: > Greetings, > >I've just discovered corosync-qdevice and corosync-qnet. > (Thanks Ken Gaillot) . Set up was pretty quick. > > I enabled qnet off cluster. I followed the steps presented by > corosync-qdevice-net-certutil.However, when running >

Re: [ClusterLabs] corosync not able to form cluster

2018-06-08 Thread Christine Caulfield
it never gets out of the JOIN "Jun 07 16:55:37 corosync [TOTEM ] entering GATHER state from 11." process so something is wrong on that node, either a rogue routing table entry, dangling iptables rule or even a broken NIC. Chrissie > Thanks! > > On Thu, Jun 7, 2018 at 8:43 P

Re: [ClusterLabs] corosync not able to form cluster

2018-06-07 Thread Christine Caulfield
Thu, 7 Jun 2018, 8:03 pm Christine Caulfield, <ccaul...@redhat.com> wrote: > > On 07/06/18 15:24, Prasad Nagaraj wrote: > > > > No iptables or otherwise firewalls are setup on these nodes. > > > > One observation is that each nod

Re: [ClusterLabs] corosync not able to form cluster

2018-06-07 Thread Christine Caulfield
On 07/06/18 15:24, Prasad Nagaraj wrote: > > No iptables or otherwise firewalls are setup on these nodes. > > One observation is that each node sends messages on with its own ring > sequence number which is not converging.. I have seen that in a good > cluster, when nodes respond with same

Re: [ClusterLabs] corosync not able to form cluster

2018-06-07 Thread Christine Caulfield
:25:30.946531 IP 172.22.0.11.44864 > 172.22.0.13.netsupport: UDP, > length 332 > 10:25:30.970931 IP 172.22.0.4.34060 > 172.22.0.11.netsupport: UDP, > length 376 > 10:25:30.983055 IP 172.22.0.13.57332 > 172.22.0.11.netsupport: UDP, > length 332 > 10

Re: [ClusterLabs] corosync not able to form cluster

2018-06-07 Thread Christine Caulfield
On 07/06/18 09:21, Prasad Nagaraj wrote: > Hi - I am running corosync on  3 nodes of CentOS release 6.9 (Final). > Corosync version is  corosync-1.4.7. > The nodes are not seeing each other and not able to form memberships. > What I see is continuous message about " A processor joined or left the

Re: [ClusterLabs] Failure of preferred node in a 2 node cluster

2018-04-30 Thread Christine Caulfield
On 29/04/18 13:22, Andrei Borzenkov wrote: > 29.04.2018 04:19, Wei Shan пишет: >> Hi, >> >> I'm using Redhat Cluster Suite 7with watchdog timer based fence agent. I >> understand this is a really bad setup but this is what the end-user wants. >> >> ATB => auto_tie_breaker >> >> "When the

Re: [ClusterLabs] Announcing the first ClusterLabs video karaoke contest!

2018-04-03 Thread Christine Caulfield
On 03/04/18 07:14, Klaus Wenninger wrote: > On 04/02/2018 02:57 AM, Digimer wrote: >> On 2018-04-01 05:30 PM, Ken Gaillot wrote: >>> In honor of the recent 10th anniversary of the first public release of >>> Pacemaker, ClusterLabs is proud to announce its first video karaoke >>> contest! >>> >>>

Re: [ClusterLabs] [corosync] Document on configuring corosync3 with knet

2018-03-02 Thread Christine Caulfield
On 16/01/18 13:46, Christine Caulfield wrote: > Hi All, > > To get people started with the new things going on with kronosnet and > corosync3, I've written a document which explains what you can do with > the new configuration options, how to set up multiple links and mu

[ClusterLabs] [corosync] Document on configuring corosync3 with knet

2018-01-16 Thread Christine Caulfield
Hi All, To get people started with the new things going on with kronosnet and corosync3, I've written a document which explains what you can do with the new configuration options, how to set up multiple links and much, much more. It might be helpful for people who want to write configuration

[ClusterLabs] [Announce] libqb 1.0.3 release

2017-12-21 Thread Christine Caulfield
We are pleased to announce the release of libqb 1.0.3 Source code is available at: https://github.com/ClusterLabs/libqb/releases/download/v1.0.3/libqb-1.0.3.tar.xz This is mainly a bug-fix release to 1.0.2 Christine Caulfield (6): tests: Fix signal handling in check_ipc.c test: Disable

Re: [ClusterLabs] corosync race condition when node leaves immediately after joining

2017-10-12 Thread Christine Caulfield
On 12/10/17 11:54, Jan Friesse wrote: > Jonathan, > >> >> >> On 12/10/17 07:48, Jan Friesse wrote: >>> Jonathan, >>> I believe main "problem" is votequorum ability to work during sync >>> phase (votequorum is only one service with this ability, see >>> votequorum_overview.8 section VIRTUAL

Re: [ClusterLabs] Introducing the Anvil! Intelligent Availability platform

2017-07-06 Thread Christine Caulfield
On 05/07/17 14:55, Ken Gaillot wrote: > Wow! I'm looking forward to the September summit talk. > Me too! Congratulations on the release :) Chrissie > On 07/05/2017 01:52 AM, Digimer wrote: >> Hi all, >> >> I suspect by now, many of you here have heard me talk about the Anvil! >>

Re: [ClusterLabs] how to sync data using cmap between cluster

2017-05-25 Thread Christine Caulfield
On 25/05/17 15:48, Rui Feng wrote: > Hi, > > I have a test based on corosync 2.3.4, and find the data stored by > cmap( corosync-cmapctl -s test i8 1) which can't be sync to other > node. > Could somebody give some comment or solution for it, thanks! > > cmap isn't replicated across the

Re: [ClusterLabs] Antw: Re: 2-Node Cluster Pointless?

2017-04-18 Thread Christine Caulfield
On 18/04/17 15:02, Digimer wrote: > On 18/04/17 10:00 AM, Digimer wrote: >> On 18/04/17 03:47 AM, Ulrich Windl wrote: >> Digimer schrieb am 16.04.2017 um 20:17 in Nachricht >>> <12cde13f-8bad-a2f1-6834-960ff3afc...@alteeve.ca>: On 16/04/17 01:53 PM, Eric Robinson wrote:

Re: [ClusterLabs] 2-Node Cluster Pointless?

2017-04-18 Thread Christine Caulfield
> > This isn't the first time this has come up, so I decided to elaborate on > this email by writing an article on the topic. > > It's a first-draft so there are likely spelling/grammar mistakes. > However, the body is done. > > https://www.alteeve.com/w/The_2-Node_Myth > An excellent

Re: [ClusterLabs] Three node cluster becomes completely fenced if one node leaves

2017-03-29 Thread Christine Caulfield
On 24/03/17 20:44, Seth Reid wrote: > I have a three node Pacemaker/GFS2 cluster on Ubuntu 16.04. Its not in > production yet because I'm having a problem during fencing. When I > disable the network interface of any one machine, If you mean by using ifdown or similar then ... don't do that. A

Re: [ClusterLabs] corosync dead loop in segfault handler

2017-03-14 Thread Christine Caulfield
On 11/03/17 01:32, cys wrote: > At 2017-03-09 18:25:59, "Christine Caulfield" <ccaul...@redhat.com> wrote: >> Thanks. Oddly that looks like a totally different incident to the core >> file we had last time. That seemed to be in a node state transition >> wherea

Re: [ClusterLabs] corosync cannot acquire quorum

2017-03-13 Thread Christine Caulfield
On 11/03/17 02:50, cys wrote: > We have a cluster containing 3 nodes (nodeA, nodeB, nodeC). > After nodeA is taken offline (by ifdown, this may be not right?), ifdown isn't right, no. You need to do a physical cable pull or use iptables to simulate loss of traffic; ifdown does odd things to

Re: [ClusterLabs] corosync dead loop in segfault handler

2017-03-09 Thread Christine Caulfield
On 08/03/17 11:04, cys wrote: > At 2017-02-21 00:24:33, "Christine Caulfield" <ccaul...@redhat.com> wrote: >> Thanks, I can read that core now. It's something odd happening in the >> sync() code that I can't quite diagnose without the blackbox. We've only >

Re: [ClusterLabs] Q: cluster-dlm[4494]: setup_cpg_daemon: daemon cpg_join error retrying

2017-03-03 Thread Christine Caulfield
On 03/03/17 12:59, Ulrich Windl wrote: > Hello! > > After update and reboot of the 2nd of three nodes (SLES11 SP4) I see a > "cluster-dlm[4494]: setup_cpg_daemon: daemon cpg_join error retrying" message > when I expected the node to join the cluster. What can be the reasons for > this? > In fact

Re: [ClusterLabs] corosync dead loop in segfault handler

2017-02-20 Thread Christine Caulfield
n-wire incompatibilities. Has it happened before? Chrissie > At 2017-02-16 19:38:03, "Christine Caulfield" <ccaul...@redhat.com> wrote: >> On 16/02/17 09:31, cys wrote: >>> The attachment includes coredump and logs just before corosync went wrong. >>> >

Re: [ClusterLabs] corosync dead loop in segfault handler

2017-02-16 Thread Christine Caulfield
t was worth a try!) Thanks Chrissie > Unfortunately corosync was restarted yesterday, and I can't get the blackbox > dump covering the day the incident occurred. > > At 2017-02-16 16:00:05, "Christine Caulfield" <ccaul...@redhat.com> wrote: >> On 16/02/17 03:5

Re: [ClusterLabs] corosync dead loop in segfault handler

2017-02-16 Thread Christine Caulfield
On 16/02/17 03:51, cys wrote: > At 2017-02-15 23:13:08, "Christine Caulfield" <ccaul...@redhat.com> wrote: >> >> Yes, it seems that some corosync SEGVs trigger this obscure bug in >> libqb. I've chased a few possible causes and none have been fruitful.

Re: [ClusterLabs] Corosync maximum nodes

2017-01-30 Thread Christine Caulfield
On 27/01/17 09:43, Гюльнара Невежина wrote: > Hello! > I'm very sorry to disturb you with such a question, but I can't find > information on whether there is a maximum node limit in corosync. I've found a > bug report https://bugzilla.redhat.com/show_bug.cgi?id=905296#c5 with > "Corosync has hardcoded

[ClusterLabs] libqb 1.0.1 release

2016-11-24 Thread Christine Caulfield
I am very pleased to announce the 1.0.1 release of libqb. This is a bugfix release consisting mainly of small amendments. Low: ipc_shm: fix superfluous NULL check; log: Don't overwrite valid tags; Low: further avoid magic in qblog.h by using named constants; Low: log: check for appropriate space when

Re: [ClusterLabs] [corosync] Master branch

2016-10-11 Thread Christine Caulfield
On 11/10/16 12:07, Dennis Jacobfeuerborn wrote: > On 11.10.2016 12:42, Christine Caulfield wrote: >> I've just committed a big patch to the master branch of corosync - it is >> now all very experimental, and existing pull requests against master >> might need to be checked.

[ClusterLabs] [corosync] Master branch

2016-10-11 Thread Christine Caulfield
dynamic reconfiguration of interfaces. It also fixes the ifup/ifdown and 127.0.0.1 binding problems that have plagued corosync/openais from day 1 Signed-off-by: Christine Caulfield <ccaul...@redhat.com> Chrissie

Re: [ClusterLabs] Antw: Re: Establishing Timeouts

2016-10-11 Thread Christine Caulfield
On 11/10/16 08:22, Vladislav Bogdanov wrote: > 11.10.2016 09:31, Ulrich Windl wrote: > Klaus Wenninger schrieb am 10.10.2016 um > 20:04 in >> Nachricht <936e4d4b-df5c-246d-4552-5678653b3...@redhat.com>: >>> On 10/10/2016 06:58 PM, Eric Robinson wrote: Thanks for

Re: [ClusterLabs] Establishing Timeouts

2016-10-11 Thread Christine Caulfield
On 10/10/16 19:35, Eric Robinson wrote: > Basically, when we turn off a switch, I want to keep the cluster from failing > over before Linux bonding has had a chance to recover. > > I'm mostly interested in preventing false-positive cluster failovers that > might occur during manual network

Re: [ClusterLabs] Establishing Timeouts

2016-10-10 Thread Christine Caulfield
On 10/10/16 05:51, Eric Robinson wrote: > I have about a dozen corosync+pacemaker clusters and I am just now getting > around to understanding timeouts. > > Most of my corosync.conf files look something like this: > > version:2 > token: 5000 >

Re: [ClusterLabs] corosync-quorum tool, output name key on Name column if set?

2016-09-20 Thread Christine Caulfield
On 20/09/16 10:46, Thomas Lamprecht wrote: > Hi, > > when I'm using corosync-quorumtool [-l] and have my ring0_addr set to an > IP address, > which does not resolve to a hostname, I get the nodes' IP addresses for > the 'Name' column. > > As I'm using the nodelist.node.X.name key to set the name
