Re: [ClusterLabs] Totems and Tokens and Kronosnet, oh my!

2024-07-25 Thread Jan Friesse

On 24/07/2024 01:22, Mike Holloway via Users wrote:

Hi folks,

Writing up an article which is near ready to publish and I want to make sure 
the terms I seek to elaborate upon are as accurately conveyed as possible.

To that end, my current task is to understand "totem" vs "token" and how this relates to 


A quite user-friendly description of Totem and Token can be found at
https://discourse.ubuntu.com/t/corosync-and-redundant-rings/11627


the protocol for consensus in modern Corosync (Post-RHEL 6), as distinct 
(if I grok) from the transport protocol "kronosnet".


From corosync's point of view, kronosnet is "just" another transport protocol.
Corosync historically supported UDP multicast, UDP unicast and
InfiniBand for sending/receiving packets to/from the wire. Kronosnet is used
the same way - for sending and receiving packets.
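For illustration, a minimal totem-section sketch (values are placeholders, not 
recommendations) showing where the two concepts live in corosync.conf - transport 
selects how packets reach the wire, while the token is part of the Totem protocol itself:

```
totem {
    version: 2
    cluster_name: example
    # transport: knet (default in corosync 3.x), udpu (UDP unicast) or udp (multicast)
    transport: knet
    # token: how long (ms) to wait for the rotating Totem token
    # before starting a membership recalculation
    token: 3000
}
```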




Could someone help me out by shedding light on the difference between the "token" in 
circulation, whether a "totem" concept of some sort is still valid, and how this all 
relates to modern Pacemaker-Corosync HA clusters?

There is really no difference from a very high-level point of view.

Honza



Thanks,
Mik

Sent with [Proton Mail](https://proton.me/) secure email.


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





[ClusterLabs] Booth 1.2 is available at GitHub!

2024-06-06 Thread Jan Friesse

I am pleased to announce the latest maintenance release of Booth 1.2 is
available immediately from GitHub at
https://github.com/ClusterLabs/booth/releases as booth-1.2.

Booth 1.2 implements support for changes in Pacemaker 3. Older versions
of Booth are known not to work properly with Pacemaker 3 because they use
CLI arguments that were deprecated and have been removed in Pacemaker 3.
To make this support possible, very old versions of Pacemaker are no
longer supported; the minimum Pacemaker version is 2.1, as noted in the
release notes for Booth 1.1.

Another big change is the addition of GnuTLS support as an alternative to mhash
and gcrypt as the encryption library. By default the configure script tries to
autodetect an installed library (in the order GnuTLS, gcrypt, mhash), or
the --with-hmac-library parameter can be used to choose a specific
implementation.
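For illustration, a hypothetical build selecting GnuTLS explicitly might look like
this (the accepted values of --with-hmac-library are an assumption based on the
library names above):

```
# from a git checkout; skip autogen.sh when building from a release tarball
./autogen.sh
./configure --with-hmac-library=gnutls
make && sudo make install
```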

This release is also required for users running kernel >= 6.9, because 
the function which finds the local host was broken on these kernels and 
caused an endless loop.


Another important reason to upgrade is the fix for CVE-2024-3049. For 
package maintainers of older versions, the patches from PR#142 are needed 
and are compatible with at least Booth 1.1.


A smaller but hopefully helpful new feature is that Booth now
stores the booth-cfg-name attribute, which allows cluster configuration
tools to delete the CIB ticket when a ticket is removed from the Booth configuration.

Lastly, I'm very happy to announce a new maintainer for the Booth project:
Chris Lumens. Chris has superb knowledge of the Pacemaker code and has already
gained a pretty good understanding of the Booth code. That combination of Pacemaker
and Booth knowledge is a perfect match for Booth.
I will remain with the project as a patch reviewer.

Complete changelog for 1.2:

Chris Lumens (1):
  tests: Remove the unit-tests directory.

Jan Friesse (14):
  pacemaker: Remove non-atomic grant of ticket
  pacemaker: Don't add explicit error prefix in log
  pacemaker: Check snprintf return values
  pacemaker: Use long format for crm_ticket -v
  pacemaker: Remove const warning
  query_get_string_answer: Remove duplicate line
  transport: Fix _find_myself for kernel 6.9
  pacemaker: Store booth-cfg-name attribute
  attr: Fix reading of server_reply
  auth: Check result of gcrypt gcry_md_get_algo_dlen
  configure: Remove duplicate mhash.h check
  configure: Add option to select HMAC library
  Add support for GnuTLS
  build: Prepare version 1.2 release

Upgrade is highly recommended.

Thanks and congratulations to everyone who contributed to this 
milestone.


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Fast-failover on 2 nodes + qnetd: qdevice connenction disrupted.

2024-05-03 Thread Jan Friesse

Hi,
some of your findings are really interesting.

On 02/05/2024 01:56, ale...@pavlyuts.ru wrote:

Hi All,

  


I am trying to build application-specific 2-node failover cluster using
ubuntu 22, pacemaker 2.1.2 + corosync 3.1.6 and DRBD 9.2.9, knet transport.



...

  


Also, I've done wireshark capture and found great mess in TCP, it seems like
connection between qdevice and qnetd really stops for some time and packets
won't deliver.


Could you check UDP as well? I guess there are a lot of UDP packets sent by 
corosync, which probably prevents the TCP traffic from getting through.
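A quick, hedged way to look at that (the interface name is a placeholder; 5405 is
the default corosync port and 5403 the default qnetd TCP port - adjust to your setup):

```
# corosync/knet UDP traffic on the cluster interface
tcpdump -ni eth0 udp port 5405

# qdevice <-> qnetd TCP traffic, to see whether it stalls at the same time
tcpdump -ni eth0 tcp port 5403
```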




  


For my guess, it match corosync syncing activities, and I suspect that
corosync prevent any other traffic on the interface it use for rings.

  


As I switch qnetd and qdevice to use different interface it seems to work
fine.


Actually, having a dedicated interface just for the corosync/knet traffic is 
the optimal solution. qdevice+qnetd, on the other hand, should be as close to 
the "customer" network as possible.


So if you could have two interfaces (one just for corosync, the second for 
qnetd+qdevice and the publicly accessible services), that might be a solution?
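As a hedged sketch of that split (all addresses are placeholders): keep the
nodelist addresses on the dedicated corosync network and point the quorum device
at the qnetd host over the shared network:

```
nodelist {
    node {
        ring0_addr: 10.0.0.1    # dedicated corosync/knet network
        nodeid: 1
    }
    node {
        ring0_addr: 10.0.0.2
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    device {
        model: net
        net {
            host: 192.168.1.30  # qnetd server, reachable over the shared network
            algorithm: ffsplit
        }
    }
}
```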




  


So, the question is: does corosync really temporary blocks any other traffic
on the interface it uses? Or it is just a coincidence? If it blocks, is


Nope, no "blocking". But it sends quite some few UDP packets and I guess 
it can really use all available bandwidth so no TCP goes thru.


Honza


there a way to manage it?

  


Thank you for any suggest on that!

  


Sincerely,

  


Alex

  

  




___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





Re: [ClusterLabs] corosync service stopping

2024-04-29 Thread Jan Friesse

Hi,
I will reply just to "sysadmin" question:

On 26/04/2024 14:43, Alexander Eastwood via Users wrote:

Dear Reid,


...



Why does the corosync log say ’shutdown by sysadmin’ when the shutdown was 
triggered by pacemaker? Isn’t this misleading?


This basically means the shutdown was triggered by a call to the corosync cfg API. 
I can agree that "sysadmin" is misleading. The problem is that the same cfg API call is 
used by corosync-cfgtool, and corosync-cfgtool is used in the systemd service 
file - in that case it really is, most probably, a sysadmin who initiated the shutdown.
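For reference, the cfgtool path the systemd unit takes boils down to something like
the following (the -H flag is per recent corosync-cfgtool versions; check your man page):

```
# request a clean shutdown of the local corosync daemon via the cfg API -
# this is the call that produces the "shutdown by sysadmin" log message
corosync-cfgtool -H
```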


Currently the function where this log message is printed has no 
information about which process initiated the shutdown; it knows only the nodeid.


It would be possible to log some more info (probably also with 
proc_name) in the cfg API function call, but that is probably a good 
candidate for the DEBUG log level.


So do you think "shutdown by cfg request" would be less misleading?

Regards
  Honza

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Pacemaker 2.1.7-rc2 now available

2023-11-27 Thread Jan Friesse

On 24/11/2023 09:18, Klaus Wenninger wrote:

Hi all,

Source code for the 2nd release candidate for Pacemaker version 2.1.7
is available at:

https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-2.1.7-rc2

This is primarily a bug fix release. See the ChangeLog or the link
above for details.

Everyone is encouraged to download, build, and test the new release. We


I would like to ask whether the fix for 
https://bugs.clusterlabs.org/show_bug.cgi?id=5529 got in?


Without the fix, Booth does not work, so trying pcmk 2.1.7 is really not 
recommended for anybody who is using Booth until bug 5529 gets fixed.


Regards
  Honza


do many regression tests and simulations, but we can't cover all
possible use cases, so your feedback is important and appreciated.

Many thanks to all contributors of source code to this release,
including Chris Lumens
Gao Yan, Grace Chin, Hideo Yamauchi, Jan Pokorný, Ken Gaillot,
liupei, Oyvind Albrigtsen, Reid Wahl, xin liang, xuezhixin.

Klaus


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





[ClusterLabs] Corosync 3.1.8 is available at corosync.org!

2023-11-15 Thread Jan Friesse

I am pleased to announce the latest maintenance release of Corosync
3.1.8 is available immediately from the GitHub release section at 
https://github.com/corosync/corosync/releases or our website at

http://build.clusterlabs.org/corosync/releases/.

This release contains mostly smaller bugfixes and improvements to the Rust 
bindings.


Complete changelog for 3.1.8:

Christine Caulfield (10):
  bindings: Add Rust bindings
  rust: Make it work on FreeBSD
  Rust: 'fix' clippys for Rust 1.67
  knet: use knet TRACE logging level if available
  Rust: Remove obsolete bindgen flag
  parser: Allow a non-breaking space as 'whitespace'
  rust: Remove some pointless casts
  config: Fail to start if ping timers are invalid
  man: Update the corosync_overview manpage
  rust: Improve vector initialisation

Jan Friesse (3):
  rust: Remove tests from check scripts
  build: Fix rust make -j build dep for distcheck
  spec: Migrate to SPDX license

Machiry Aravind Kumar (1):
  Handling integer overflow issues

Upgrade is highly recommended.

Thanks and congratulations to everyone who contributed to this 
great milestone.


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Booth 1.1 is available at GitHub!

2023-10-18 Thread Jan Friesse
I am pleased to announce the latest maintenance release of Booth 1.1 is 
available immediately from GitHub at 
https://github.com/ClusterLabs/booth/releases as booth-1.1.


Booth 1.1 has been released 7 long years after 1.0, so there is a huge 
number of changes and it is really hard to highlight the most important 
ones (and I would rather not even try). Also, I guess most Booth users 
are actually running a git version, so most of the fixes are already in 
production.


Please keep in mind that this is the last release where very old 
versions of Python, Pacemaker, ... are supported. The next release will 
require at least Python 3.9, Pacemaker 2.1, ... Basically, RHEL 8 was 
chosen as the oldest supported base. I hope virtually nobody will be 
affected by this decision (and if you are, then patches are welcome ;) ).


My plan is to do regular releases once, maybe twice a year (more if 
needed of course). Also I'm expecting 1.2 later this year to prepare 
Booth for changes in Pacemaker 3.0. So stay tuned!


Complete changelog for 1.1:

Aleksei Burlakov (1):
  test: fix the delimiter in the here-string

Bin Liu (2):
  low:fix:remove unnecessary return from unit-test.py
  typo fix: there is no %{S:2} defined in booth.spec

Chris Kowalczyk (16):
  Build: create and set working directory
  Build: add correct permissions to the folders
  Config: handle hostnames in booth.conf file
  Change comment style and indentation
  Find the correct address of a local site
  Booth Daemon Arguments: disable foreground for D flag
  Booth Daemon Arguments: Added tests for DS and S flags
  Booth Daemon Arguments: Enable stderr for foregound mode
  Booth Daemon Arguments: Disable stderr for daemon mode
  Feature: add manual mode to booth tickets
  Fixed typo in help message
  Updates after review
  Updates after code review
  Handle multi-leader situation for manual tickets. Added 
manual tickets to Life Tests framework

  Refactoring after review comments
  Fix asciidoc build

Dan Streetman (1):
  Don't lock all current and future memory if can't increase 
memlock rlimit


Dejan Muhamedagic (14):
  Doc: update the before-acquire-handler description to match 
the new semantics

  Dev: extprog: rename prog to path (it can be a directory too)
  build: remove the old paxos update pre-script
  Feature: extprog: add capability to run a set of programs
  Medium: extprog: external tests timeout after renewal interval
  Medium: main: finally fix address matching
  Low: ticket: reset next state on ticket reset
  Low: extprog: fix pid test
  Dev: extprog: set progstate debug
  Low: attr: set time string to "" when time is not set
  Low: attr: add ticket name to warnings
  Medium: attr: fix wrong order for hash free functions
  Medium: extprog: fix race condition on ticket loss
  fix bashisms (use printf instead of echo)

Fabio M. Di Nitto (14):
  build: use pkg-config way to detect and use pacemaker header 
files
  build: allow to override path to python without failing 
version checks

  build: fix make distcheck
  [build] ship booth.pc with basic booth build information for 
downstream packages to use
  configure: Simplify libqb detection when libqb is not 
installed in standard paths

  configure: detect and init pkg-config with proper macro
  configure: add BOOTH_PKG_CHECK_VAR macro to wrap PKG_CHECK_VAR
  configure: use resource-agents pkg-config info to determine 
ocfdir

  configure: use PKG_CONFIG to detect pacemaker user/group
  configure: drop unnecessary macro
  configure: drop dead code
  configure: move exec_prefix sanitizer closer to prefix
  configure: drop unnecessary check and define
  rpm: use new package name for pacemaker devel on opensuse

Jan Friesse (64):
  test: Allow test running as a root
  test: Enlarge timeout for boothd exit
  test: Actively wait for lock file create/delete
  build: Make make distcheck with asciidoctor
  docs: Fix description of how to run python tests
  tests: Make test work for Debian systems
  main: Accept longer config and lock file names
  main: Delete lockfile when signal arrive too early
  build: Do not link with pcmk libraries
  pacemaker: Handle updated exit code of crm_ticket
  tests: Allow parallel running of tests
  build: Make generating of HTML man work
  build: Remove unneeded OS detection section
  build: Make sure tarball contains all needed files
  build: Delete cov directory on clean
  configure: Always let

Re: [ClusterLabs] Centreon HA Cluster - VIP issue

2023-09-04 Thread Jan Friesse

Hi,


On 02/09/2023 17:16, Adil Bouazzaoui wrote:

  Hello,

My name is Adil,i worked for Tman company, we are testing the Centreon HA
cluster to monitor our infrastructure for 13 companies, for now we are
using the 100 IT licence to test the platform, once everything is working
fine then we can purchase a licence suitable for our case.

We're stuck at *scenario 2*: setting up Centreon HA Cluster with Master &
Slave on a different datacenters.
For *scenario 1*: setting up the Cluster with Master & Slave and VIP
address on the same network (VLAN) it is working fine.

*Scenario 1: Cluster on Same network (same DC) ==> works fine*
Master in DC 1 VLAN 1: 172.30.15.10 /24
Slave in DC 1 VLAN 1: 172.30.15.20 /24
VIP in DC 1 VLAN 1: 172.30.15.30/24
Quorum in DC 1 LAN: 192.168.1.10/24
Poller in DC 1 LAN: 192.168.1.20/24

*Scenario 2: Cluster on different networks (2 separate DCs connected with
VPN) ==> still not working*


corosync on every node needs a direct connection to every other node. 
A VPN should work as long as routing is configured correctly. What exactly 
is "still not working"?



Master in DC 1 VLAN 1: 172.30.15.10 /24
Slave in DC 2 VLAN 2: 172.30.50.10 /24
VIP: example 102.84.30.XXX. We used a public static IP from our internet
service provider, we thought that using a IP from a site network won't
work, if the site goes down then the VIP won't be reachable!
Quorum: 192.168.1.10/24


No clue what you mean by Quorum, but placing it in DC1 doesn't feel right.


Poller: 192.168.1.20/24

Our *goal *is to have Master & Slave nodes on different sites, so when Site
A goes down, we keep monitoring with the slave.
The problem is that we don't know how to set up the VIP address? Nor what
kind of VIP address will work? or how can the VIP address work in this
scenario? or is there anything else that can replace the VIP address to
make things work.
Also, can we use a backup poller? so if the poller 1 on Site A goes down,
then the poller 2 on Site B can take the lead?

we looked everywhere (The watch, youtube, Reddit, Github...), and we still
couldn't get a workaround!

the guide we used to deploy the 2 Nodes Cluster:
https://docs.centreon.com/docs/installation/installation-of-centreon-ha/overview/

attached the 2 DCs architecture example.

We appreciate your support.
Thank you in advance.


Adil Bouazzaoui
IT Infrastructure Engineer
TMAN
adil.bouazza...@tmandis.ma
adilb...@gmail.com
+212 656 29 2020


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





Re: [ClusterLabs] updu transport support with corosync

2023-07-27 Thread Jan Friesse

Hi,

On 24/07/2023 18:13, Abhijeet Singh wrote:

Hello,

We have a 2-node corosync/pacemaker cluster setup. We recently updated
corosync from v2.3.4 to v.3.0.3. I have couple of questions related to
corosync transport mechanism -

1. Found below article which indicates updu support might be deprecated in
the future. Is there a timeline for when updu might be deprecated? updu


Probably in corosync 4.x, but there is currently no plan for corosync 
4.x yet :) So if you are happy with udpu (i.e. you have no need for encryption or 
multi-link), just keep using it.



seems to be performing better than knet in our setup.


Do you have any numbers? We did quite extensive testing and knet was 
always both faster and had better (lower) latency.



https://www.mail-archive.com/users@clusterlabs.org/msg12806.html

2. We are using Linux 5.15.x. Noticed that with Knet transport corosync
takes up almost double memory as compared to updu. Is this expected? Are


Knet pre-allocates its buffers on startup and the corosync send buffers are also 
larger, so yes, it is expected. It shouldn't get noticeably worse over 
time.



there any config changes which can help reduce the memory footprint?


Not really. You can change some compile-time #defines in the source code, but that's 
really asking for huge trouble.


Honza


Thanks
Abhijeet


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





Re: [ClusterLabs] Corosync 3.1.5 Fails to Autostart

2023-04-25 Thread Jan Friesse

On 24/04/2023 22:16, Tyler Phillippe via Users wrote:

Hello all,

We are currently using RHEL9 and have set up a PCS cluster. When restarting the 
servers, we noticed Corosync 3.1.5 doesn't start properly with the below error 
message:

Parse error in config: No valid name found for local host
Corosync Cluster Engine exiting with status 8 at main.c:1445.
Corosync.service: Main process exited, code=exited, status=8/n/a

These are physical, blade machines that are using a 2x Fibre Channel NIC in a Mode 6 bond as their networking interface for the cluster; other than that, there is really nothing special about these machines. We have ensured the names of the machines exist in /etc/hosts and that they can resolve those names via the hosts file first. The strange 


This is really weird. All the described symptoms point to the name 
service (DNS/NIS/...) not being available during bootup and only becoming 
available later. But if /etc/hosts really contains static entries, it 
should just work.


Could you please try to set debug: trace in corosync.conf like
```
...
logging {
to_syslog: yes
to_stderr: yes
timestamp: on
to_logfile: yes
logfile: /var/log/cluster/corosync.log

debug: trace
}
...
```

and observe the very beginning of corosync's output (either in syslog or 
in /var/log/cluster/corosync.log)? There should be something like


totemip_parse: IPv4 address of NAME resolved as IPADDR

Also compare the difference between corosync started at boot and started 
later, after multi-user.target.
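As an extra, hypothetical sanity check of what the resolver returns for the node
names (node1/node2 are placeholders for your actual host names):

```
# should print the static /etc/hosts entries for the cluster nodes
getent hosts node1 node2

# and confirm that /etc/hosts ("files") is consulted before DNS
grep '^hosts' /etc/nsswitch.conf
```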


thing is if we start Corosync manually after we can SSH into the 
machines, Corosync starts immediately and without issue. We did manage 
to get Corosync to autostart properly by modifying the service file and 
changing the After=network-online.target to After=multi-user.target. In 
doing this, at first, Pacemaker complains about mismatching dependencies 
in the service between Corosync and Pacemaker. Changing the Pacemaker 
service to After=multi-user.target fixes that self-caused issue. Any 
ideas on this one? Mostly checking to see if changing the After 
dependency will harm us in the future.


That's questionable. It's always best if the resolver uses /etc/hosts 
reliably, which is not the case now, so IMHO it is better to find the reason why 
/etc/hosts doesn't work rather than to "work around" it.


Regards,
  Honza



Thanks!

Respectfully,
  Tyler Phillippe


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





Re: [ClusterLabs] Could not initialize corosync configuration API error 2

2023-04-03 Thread Jan Friesse

Hi,

On 31/03/2023 11:36, S Sathish S wrote:

Hi Team,

Please find the corosync version.

[root@node2 ~]# rpm -qa corosync
corosync-2.4.4-2.el7.x86_64.


RHEL 7 never got 2.4.4 - there was 2.4.3 in RHEL 7.7 and 2.4.5 in RHEL 
7.8/7.9. Is this a self-compiled version? If so, please consider updating 
to the distro-provided package - the RHEL 7 package IS actively maintained.





Firewall in disable state only.

Please find the debug and trace logs

Mar 31 10:07:30 [17684] node2 corosync notice  [MAIN  ] Corosync Cluster Engine 
('UNKNOWN'): started and ready to provide service.
Mar 31 10:07:30 [17684] node2 corosync info[MAIN  ] Corosync built-in 
features: pie relro bindnow
Mar 31 10:07:30 [17684] node2 corosync warning [MAIN  ] Could not set SCHED_RR 
at priority 99: Operation not permitted (1)


This is weird - is corosync running as root?


Mar 31 10:07:30 [17684] node2 corosync debug   [QB] shm size:8388621; 
real_size:8392704; rb->word_size:2098176
Mar 31 10:07:30 [17684] node2 corosync debug   [MAIN  ] Corosync TTY detached
Mar 31 10:07:30 [17684] node2 corosync debug   [TOTEM ] waiting_trans_ack 
changed to 1
Mar 31 10:07:30 [17684] node2 corosync debug   [TOTEM ] Token Timeout (5550 ms)



...


Mar 31 10:07:30 [17684] node2 corosync debug   [TOTEM ] entering GATHER state 
from 11(merge during join).



This is important. Usually it means there is a forgotten node somewhere 
trying to connect to the existing cluster, or the config files differ 
between the nodes. The solution is:

1. Check that corosync.conf is equal on all nodes
2. Update to the distro package (2.4.5), which contains the block_unlisted_ips 
functionality/option (enabled by default), and/or generate a new crypto 
key, distribute it only to the nodes within the cluster (so node1 .. node9) and 
turn on crypto (see the sketch below).
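A hedged sketch of the crypto-key part of option 2 for corosync 2.x (the host names
are the ones from your nodelist; adjust paths as needed):

```
# on one node: generate a fresh cluster authentication key
corosync-keygen

# copy it only to the legitimate cluster members
for n in node2 node3 node4 node5 node6 node7 node8 node9; do
    scp /etc/corosync/authkey root@"$n":/etc/corosync/authkey
done

# then set "secauth: on" in the totem section of corosync.conf
# and restart corosync on all nodes
```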




Mar 31 10:07:30 [17684] node2 corosync debug   [TOTEM ] entering GATHER state 
from 11(merge during join).
Mar 31 10:07:30 [17684] node2 corosync debug   [TOTEM ] entering GATHER state from 


...






Please find the corosync conf file.

[root@node2 ~]# cat /etc/corosync/corosync.conf
totem {
 version: 2
 cluster_name: OCC
 secauth: off


it's a really good idea to turn on crypto


 transport: udpu
}



nodelist {
 node {
 ring0_addr: node1
 nodeid: 1
 }



 node {
 ring0_addr: node2
 nodeid: 2
 }



 node {
 ring0_addr: node3
 nodeid: 3
 }



 node {
 ring0_addr: node4
 nodeid: 4
 }



 node {
 ring0_addr: node5
 nodeid: 5
 }



 node {
 ring0_addr: node6
 nodeid: 6
 }



 node {
 ring0_addr: node7
 nodeid: 7
 }



 node {
 ring0_addr: node8
 nodeid: 8
 }



 node {
 ring0_addr: node9
 nodeid: 9
 }
}



quorum {
 provider: corosync_votequorum
}



logging {
 to_logfile: yes
 logfile: /var/log/cluster/corosync.log
 to_syslog: no
timestamp:on
}



Regards,
  Honza


Thanks and Regards,
S Sathish S



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Could not initialize corosync configuration API error 2

2023-03-31 Thread Jan Friesse

Hi,
more information would be needed to really find the root cause, so:
- double-check corosync.conf (IP addresses)
- check the firewall (mainly the local one)
- what is the version of corosync?
- try to set debug: on (or trace) - see the snippet below
- paste the config file
- paste the full log since corosync was started

Also keep in mind that if it is a 2.x version, it is no longer supported by 
upstream and you have to contact your distribution provider's support.
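For the debug option, a minimal logging-section sketch (the logfile path is just
the common default, adjust it to your setup):

```
logging {
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: no
    timestamp: on
    debug: on
}
```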


Regards,
  Honza

On 30/03/2023 12:08, S Sathish S via Users wrote:

Hi Team,

we are unable to start the corosync service on a node which is already part of an 
existing cluster; it had been running fine for a long time. Now the corosync
server is unable to join: "Could not initialize corosync configuration API error 
2". Please find the logs below.

[root@node1 ~]# systemctl status corosync
● corosync.service - Corosync Cluster Engine
Loaded: loaded (/usr/lib/systemd/system/corosync.service; enabled; vendor 
preset: disabled)
Active: failed (Result: exit-code) since Thu 2023-03-30 10:49:58 WAT; 7min 
ago
  Docs: man:corosync
man:corosync.conf
man:corosync_overview
   Process: 9922 ExecStop=/usr/share/corosync/corosync stop (code=exited, 
status=0/SUCCESS)
   Process: 9937 ExecStart=/usr/share/corosync/corosync start (code=exited, 
status=1/FAILURE)



Mar 30 10:48:57 node1 systemd[1]: Starting Corosync Cluster Engine...
Mar 30 10:49:58 node1 corosync[9937]: Starting Corosync Cluster Engine 
(corosync): [FAILED]
Mar 30 10:49:58 node1 systemd[1]: corosync.service: control process exited, 
code=exited status=1
Mar 30 10:49:58 node1 systemd[1]: Failed to start Corosync Cluster Engine.
Mar 30 10:49:58 node1 systemd[1]: Unit corosync.service entered failed state.
Mar 30 10:49:58 node1 systemd[1]: corosync.service failed.

Please find the corosync logs error:

Mar 30 10:49:52 [9947] node1 corosync debug   [MAIN  ] Denied connection, 
corosync is not ready
Mar 30 10:49:52 [9947] node1 corosync warning [QB] Denied connection, is 
not ready (9948-10497-23)
Mar 30 10:49:52 [9947] node1 corosync debug   [MAIN  ] 
cs_ipcs_connection_destroyed()
Mar 30 10:49:52 [9947] node1 corosync debug   [MAIN  ] Denied connection, 
corosync is not ready
Mar 30 10:49:57 [9947] node1 corosync debug   [MAIN  ] 
cs_ipcs_connection_destroyed()
Mar 30 10:49:58 [9947] node1 corosync notice  [MAIN  ] Node was shut down by a 
signal
Mar 30 10:49:58 [9947] node1 corosync notice  [SERV  ] Unloading all Corosync 
service engines.
Mar 30 10:49:58 [9947] node1 corosync info[QB] withdrawing server 
sockets
Mar 30 10:49:58 [9947] node1 corosync debug   [QB] qb_ipcs_unref() - 
destroying
Mar 30 10:49:58 [9947] node1 corosync notice  [SERV  ] Service engine unloaded: 
corosync vote quorum service v1.0
Mar 30 10:49:58 [9947] node1 corosync info[QB] withdrawing server 
sockets
Mar 30 10:49:58 [9947] node1 corosync debug   [QB] qb_ipcs_unref() - 
destroying
Mar 30 10:49:58 [9947] node1 corosync notice  [SERV  ] Service engine unloaded: 
corosync configuration map access
Mar 30 10:49:58 [9947] node1 corosync info[QB] withdrawing server 
sockets
Mar 30 10:49:58 [9947] node1 corosync debug   [QB] qb_ipcs_unref() - 
destroying
Mar 30 10:49:58 [9947] node1 corosync notice  [SERV  ] Service engine unloaded: 
corosync configuration service
Mar 30 10:49:58 [9947] node1 corosync info[QB] withdrawing server 
sockets
Mar 30 10:49:58 [9947] node1 corosync debug   [QB] qb_ipcs_unref() - 
destroying
Mar 30 10:49:58 [9947] node1 corosync notice  [SERV  ] Service engine unloaded: 
corosync cluster closed process group service v1.01
Mar 30 10:49:58 [9947] node1 corosync info[QB] withdrawing server 
sockets
Mar 30 10:49:58 [9947] node1 corosync debug   [QB] qb_ipcs_unref() - 
destroying
Mar 30 10:49:58 [9947] node1 corosync notice  [SERV  ] Service engine unloaded: 
corosync cluster quorum service v0.1
Mar 30 10:49:58 [9947] node1 corosync notice  [SERV  ] Service engine unloaded: 
corosync profile loading service
Mar 30 10:49:58 [9947] node1 corosync debug   [TOTEM ] sending join/leave 
message
Mar 30 10:49:58 [9947] node1 corosync notice  [MAIN  ] Corosync Cluster Engine 
exiting normally


While try manually start corosync service also getting below error.


[root@node1 ~]# bash -x /usr/share/corosync/corosync start
+ desc='Corosync Cluster Engine'
+ prog=corosync
+ PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/sbin
+ '[' -f /etc/sysconfig/corosync ']'
+ . /etc/sysconfig/corosync
++ COROSYNC_INIT_TIMEOUT=60
++ COROSYNC_OPTIONS=
+ case '/etc/sysconfig' in
+ '[' -f /etc/init.d/functions ']'
+ . /etc/init.d/functions
++ TEXTDOMAIN=initscripts
++ umask 022
++ PATH=/sbin:/usr/sbin:/bin:/usr/bin
++ export PATH
++ '[' 28864 -ne 1 -a -z '' ']'
++ '[' -d /run/systemd/system ']'
++ case "$0" in
++ '[' -z '' ']'
++ COLUMNS=80
++ '[' -z '' ']'
++ '[' -c /dev/stderr -a -r /dev/stderr ']'
+++ /sbin/consoletype
++ CONSOLETYPE=pty
++ '[' -z '' ']'
++ '[' -z '' ']'
++ '[' -f /etc/syscon

Re: [ClusterLabs] Totem decrypt with Wireshark

2023-03-31 Thread Jan Friesse

Hi,

On 29/03/2023 08:51, Justino, Fabiana wrote:

Hi,

I have corosync version 3.1.7-1, encrypted totem messages and would like to 
know how to decrypt them.
Tried to disable encryption with crypto_cipher set to No and crypto_hash set to 
No but it keeps encrypted.


it's definitely not encrypted when encryption is disabled. The problem 
with Wireshark is that a dissector for corosync (and knet) is not 
implemented (actually there is a corosync 1.x dissector, but the protocol was 
changed in corosync 2.x and 3.x), so the traffic may appear "encrypted" - it's 
not, it's just a binary protocol.


Regards,
  Honza



Thank you in advance.
Fabiana.


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





Re: [ClusterLabs] corosync 2.4.4 version provide secure the communication by default

2023-03-27 Thread Jan Friesse

On 26/03/2023 12:42, S Sathish S wrote:

Hi Jan,



Hi,


In which scenarios does Corosync send cpg messages, and what is the impact if we do 
not secure the communication?


It really depends on what services are used, but generally speaking 
corosync without cpg is not super useful, so I guess cpg is probably used...




   1.  Any outsider attacker can manipulate the system using unencrypted 
communication.


yes

   2.  Corosync is used for heartbeat communication; does that carry any sensitive data that really needs to be secured? If not, is any other sensitive data transferred via corosync 
communication?

I'm not sure I understand the question - but in general, modifying corosync 
messages can lead to huge problems. If an attacker can really change 
messages, it's super easy to change the membership, make it unstable, ... 
it's not really just about changing the content of the cpg data.


What is the point of turning off encryption?

Regards,
  Honza



Thanks and Regards,
S Sathish S



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Corosync-qdevice 3.0.3 is available at GitHub!

2023-03-22 Thread Jan Friesse
I am pleased to announce the latest maintenance release of 
Corosync-Qdevice 3.0.3 available immediately from GitHub at 
https://github.com/corosync/corosync-qdevice/releases as 
corosync-qdevice-3.0.3.


This release fixes a bug which made qdevice crash (abrt) when no network 
interface other than loopback exists. The bug was introduced in 
version 3.0.1; version 3.0.0 and the previous versions shipped within the 
corosync package are not affected.


Complete changelog for 3.0.3:

Jan Friesse (1):
  qdevice: Destroy non blocking client on failure

Upgrade is highly recommended.

Thanks and congratulations to everyone who contributed to this 
great milestone.


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] 2-Node cluster - both nodes unclean - can't start cluster

2023-03-13 Thread Jan Friesse

On 10/03/2023 22:29, Reid Wahl wrote:

On Fri, Mar 10, 2023 at 10:49 AM Lentes, Bernd
 wrote:


Hi,

I don’t get my cluster running. I had problems with an OCFS2 Volume, both
nodes have been fenced.
When I do now a “systemctl start pacemaker.service”, crm_mon shows for a few
seconds both nodes as UNCLEAN, then pacemaker stops.
I try to confirm the fencing with "stonith_admin -C", but it doesn't work.
Maybe the time is too short; pacemaker is only running for a few seconds.

Here is the log:

Mar 10 19:36:24 [31037] ha-idg-1 corosync notice  [MAIN  ] Corosync Cluster
Engine ('2.3.6'): started and ready to provide service.
Mar 10 19:36:24 [31037] ha-idg-1 corosync info[MAIN  ] Corosync built-in
features: debug testagents augeas systemd pie relro bindnow
Mar 10 19:36:24 [31037] ha-idg-1 corosync notice  [TOTEM ] Initializing
transport (UDP/IP Multicast).
Mar 10 19:36:24 [31037] ha-idg-1 corosync notice  [TOTEM ] Initializing
transmit/receive security (NSS) crypto: aes256 hash: sha1
Mar 10 19:36:25 [31037] ha-idg-1 corosync notice  [TOTEM ] The network
interface [192.168.100.10] is now up.
Mar 10 19:36:25 [31037] ha-idg-1 corosync notice  [SERV  ] Service engine
loaded: corosync configuration map access [0]
Mar 10 19:36:25 [31037] ha-idg-1 corosync info[QB] server name: cmap
Mar 10 19:36:25 [31037] ha-idg-1 corosync notice  [SERV  ] Service engine
loaded: corosync configuration service [1]
Mar 10 19:36:25 [31037] ha-idg-1 corosync info[QB] server name: cfg
Mar 10 19:36:25 [31037] ha-idg-1 corosync notice  [SERV  ] Service engine
loaded: corosync cluster closed process group service v1.01 [2]
Mar 10 19:36:25 [31037] ha-idg-1 corosync info[QB] server name: cpg
Mar 10 19:36:25 [31037] ha-idg-1 corosync notice  [SERV  ] Service engine
loaded: corosync profile loading service [4]
Mar 10 19:36:25 [31037] ha-idg-1 corosync notice  [QUORUM] Using quorum
provider corosync_votequorum
Mar 10 19:36:25 [31037] ha-idg-1 corosync notice  [QUORUM] This node is
within the primary component and will provide service.
Mar 10 19:36:25 [31037] ha-idg-1 corosync notice  [QUORUM] Members[0]:
Mar 10 19:36:25 [31037] ha-idg-1 corosync notice  [SERV  ] Service engine
loaded: corosync vote quorum service v1.0 [5]
Mar 10 19:36:25 [31037] ha-idg-1 corosync info[QB] server name:
votequorum
Mar 10 19:36:25 [31037] ha-idg-1 corosync notice  [SERV  ] Service engine
loaded: corosync cluster quorum service v0.1 [3]
Mar 10 19:36:25 [31037] ha-idg-1 corosync info[QB] server name:
quorum
Mar 10 19:36:25 [31037] ha-idg-1 corosync notice  [TOTEM ] A new membership
(192.168.100.10:2340) was formed. Members joined: 1084777482


Is this really the corosync node ID of one of your nodes? If not,
what's your corosync version? Is the number the same every time the
issue happens? The number is so large and seemingly random that I
wonder if there's some kind of memory corruption.


It's an autogenerated nodeid (derived from the IPv4 address). A nodeid was not required 
for Corosync < 3 (we made it required mostly for knet).
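As a rough sketch of where that specific number comes from (the exact derivation is
my assumption, but it matches here): 192.168.100.10 interpreted as a 32-bit value
with the top bit cleared is exactly 1084777482:

```
# 192.168.100.10 -> 0xC0A8640A; clearing the sign bit gives the nodeid
printf '%d\n' $(( 0xC0A8640A & 0x7FFFFFFF ))
# prints: 1084777482
```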






Mar 10 19:36:25 [31037] ha-idg-1 corosync notice  [QUORUM] Members[1]:
1084777482
Mar 10 19:36:25 [31037] ha-idg-1 corosync notice  [MAIN  ] Completed service
synchronization, ready to provide service.
Mar 10 19:36:25 [31044] ha-idg-1 pacemakerd:   notice: main:Starting
Pacemaker 1.1.24+20210811.f5abda0ee-3.27.1 | build=1.1.24+20210811.f5abda0ee
features: generated-manpages agent-manp
ages ncurses libqb-logging libqb-ipc lha-fencing systemd nagios
corosync-native atomic-attrd snmp libesmtp acls cibsecrets
Mar 10 19:36:25 [31044] ha-idg-1 pacemakerd: info: main:Maximum core
file size is: 18446744073709551615
Mar 10 19:36:25 [31044] ha-idg-1 pacemakerd: info: qb_ipcs_us_publish:
server name: pacemakerd
Mar 10 19:36:25 [31044] ha-idg-1 pacemakerd: info:
pcmk__ipc_is_authentic_process_active:   Could not connect to lrmd IPC:
Connection refused
Mar 10 19:36:25 [31044] ha-idg-1 pacemakerd: info:
pcmk__ipc_is_authentic_process_active:   Could not connect to cib_ro IPC:
Connection refused
Mar 10 19:36:25 [31044] ha-idg-1 pacemakerd: info:
pcmk__ipc_is_authentic_process_active:   Could not connect to crmd IPC:
Connection refused
Mar 10 19:36:25 [31044] ha-idg-1 pacemakerd: info:
pcmk__ipc_is_authentic_process_active:   Could not connect to attrd IPC:
Connection refused
Mar 10 19:36:25 [31044] ha-idg-1 pacemakerd: info:
pcmk__ipc_is_authentic_process_active:   Could not connect to pengine IPC:
Connection refused
Mar 10 19:36:25 [31044] ha-idg-1 pacemakerd: info:
pcmk__ipc_is_authentic_process_active:   Could not connect to stonith-ng
IPC: Connection refused
Mar 10 19:36:25 [31044] ha-idg-1 pacemakerd: info: corosync_node_name:
Unable to get node name for nodeid 1084777482
Mar 10 19:36:25 [31044] ha-idg-1 pacemakerd:   notice: get_node_name:
Could not obtain a node name for corosync nodeid 1084777482
Mar 10 19:36:25 [31044] ha-idg-1 pacemakerd: info: crm_get_peer:
Created 

Re: [ClusterLabs] Migrated to corosync 3.x knet become default protocol

2023-01-30 Thread Jan Friesse

On 30/01/2023 10:16, Jan Friesse wrote:

Hi,

On 30/01/2023 07:14, S Sathish S via Users wrote:

Hi Team,

In our application we are currently using UDPU as transport protocol 
with single ring, while migrated to corosync 3.x knet become default 
protocol.


We need to understand any maintenance overhead that any required 
certificate/key management would bring in for knet transport protocol 
(or) it will use existing authorization key /etc/corosync/authkey file 
for secure communication between nodes using 


yes, as long as secauth or crypto_cipher/crypto_hash is configured, 
corosync 3.x will happily use the existing /etc/corosync/authkey. That said, I 
would recommend generating a new one, because the new default is longer 
(2024 bits vs the old 1024).


Typo, 2048 of course ;)




knet transport protocol.




https://access.redhat.com/solutions/5963941

https://access.redhat.com/solutions/1182463


We shouldn't end up in a case where Pacemaker stops working due to 
some certificate/key expiry?


It's a symmetric key, so there is no key expiration.


Regards,
   Honza



Thanks and Regards,
S Sathish S


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/







Re: [ClusterLabs] Migrated to corosync 3.x knet become default protocol

2023-01-30 Thread Jan Friesse

Hi,

On 30/01/2023 07:14, S Sathish S via Users wrote:

Hi Team,

In our application we are currently using UDPU as transport protocol with 
single ring, while migrated to corosync 3.x knet become default protocol.

We need to understand any maintenance overhead that any required certificate/key management would bring in for knet transport protocol (or) it will use existing authorization key /etc/corosync/authkey file for secure communication between nodes using 


yes, as long as secauth or crypto_cipher/crypto_hash is configured, 
corosync 3.x will happily use the existing /etc/corosync/authkey. That said, I 
would recommend generating a new one, because the new default is longer 
(2024 bits vs the old 1024).


knet transport protocol.




https://access.redhat.com/solutions/5963941

https://access.redhat.com/solutions/1182463


We shouldn't end up in a case where Pacemaker stops working due to some 
certificate/key expiry?


It's a symmetric key, so there is no key expiration.
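For reference, a hedged sketch of what "configured" means here for corosync 3.x
with knet (the cipher/hash values are common choices, not mandates):

```
totem {
    version: 2
    transport: knet
    # enable knet encryption/authentication using the symmetric authkey
    crypto_cipher: aes256
    crypto_hash: sha256
}
# a fresh (longer) key can be generated with: corosync-keygen
# and must then be copied to /etc/corosync/authkey on every node
```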


Regards,
  Honza



Thanks and Regards,
S Sathish S


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





Re: [ClusterLabs] corosync 2.4.4 version provide secure the communication by default

2023-01-23 Thread Jan Friesse

On 23/01/2023 10:38, S Sathish S wrote:

Hi Jan/Team,

Yes , In syslog we noticed "crypto: none" during startup of corosync service.


Ok, so then communication is unencrypted.



In Corosync communication which protocols/ports transfer sensitive data which 
need to be secured ?


Corosync implements its own protocol, and for udpu it uses port 5405 
by default.




Or It will have only binary protocol like 5405 port for all corosync 
communication?


Yes

Basically, if you dump the UDP traffic on port 5405 you should see the messages sent 
via cpg.


For example I've tried:
tcpdump -i eth1  -nN -nn udp

and send "This is nice test" using testcpg (which is using CPG group 
called GROUP) and entry


"16:12:22.534234 IP 192.168.63.35.52319 > 192.168.63.36.5405: UDP, 
length 321
E..]D?@.@.?#..?$._...I.".."..?#..)...(...?#o.aGROUPU..This 
is nice test"


was logged.

Regards,
  Honza



Thanks and Regards,
S Sathish S
-Original Message-
From: Jan Friesse 
Sent: 23 January 2023 14:50
To: Cluster Labs - All topics related to open-source clustering welcomed 

Cc: S Sathish S 
Subject: Re: [ClusterLabs] corosync 2.4.4 version provide secure the 
communication by default

Hi,

On 23/01/2023 01:37, S Sathish S via Users wrote:

Hi Team,

corosync 2.4.4 version provide mechanism to secure the communication path 
between nodes of a cluster by default? bcoz in our configuration secauth is 
turned off but still communication occur is encrypted.

Note : Capture tcpdump for port 5405 and I can see that the data is already 
garbled and not in the clear.


It's a binary protocol, so don't expect some really readable format (like 
xml/json/...). But with your config it should be unencrypted. You can check for the 
message "notice  [TOTEM ] Initializing transmit/receive security
(NSS) crypto: none hash: none" during corosync startup.

Regards,
Honza




[root@node1 ~]# cat /etc/corosync/corosync.conf totem {
  version: 2
  cluster_name: OCC
 secauth: off
  transport: udpu
}

nodelist {
  node {
  ring0_addr: node1
  nodeid: 1
  }

  node {
  ring0_addr: node2
  nodeid: 2
  }

  node {
  ring0_addr: node3
  nodeid: 3
  }
}

quorum {
  provider: corosync_votequorum
}

logging {
  to_logfile: yes
  logfile: /var/log/cluster/corosync.log
  to_syslog: no
  timestamp: on
}

Thanks and Regards,
S Sathish S


___
Manage your subscription:
https://protect2.fireeye.com/v1/url?k=31323334-501d5122-313273af-45444
731-d41b18997a64a81a&q=1&e=d75dcac1-7d11-41aa-b596-47366bde2862&u=
https%3A%2F%2Flists.clusterlabs.org%2Fmailman%2Flistinfo%2Fusers

ClusterLabs home:
https://protect2.fireeye.com/v1/url?k=31323334-501d5122-313273af-45444
731-b3537e65a3f1def4&q=1&e=d75dcac1-7d11-41aa-b596-47366bde2862&u=
https%3A%2F%2Fwww.clusterlabs.org%2F





___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Re: corosync 2.4.4 version provide secure the communication by default

2023-01-23 Thread Jan Friesse

On 23/01/2023 12:51, Ulrich Windl wrote:

Jan Friesse  schrieb am 23.01.2023 um 10:20 in Nachricht

:

Hi,

On 23/01/2023 01:37, S Sathish S via Users wrote:

Hi Team,

corosync 2.4.4 version provide mechanism to secure the communication path

between nodes of a cluster by default? bcoz in our configuration secauth is
turned off but still communication occur is encrypted.


Note : Capture tcpdump for port 5405 and I can see that the data is already

garbled and not in the clear.

It's binary protocol so don't expect some really readable format (like
xml/json/...). But with your config it should be unencrypted. You can
check message "notice  [TOTEM ] Initializing transmit/receive security
(NSS) crypto: none hash: none" during start of corosync.


Probably a good example of "a false feeling of security" (you think the 
communication is encrypted, while in fact it is not).


Yeah, "none" and "none" is definitively "false feeling of security" and 
definitively suggest communication is encrypted. Sigh...







Regards,
Honza




[root@node1 ~]# cat /etc/corosync/corosync.conf
totem {
  version: 2
  cluster_name: OCC
 secauth: off
  transport: udpu
}

nodelist {
  node {
  ring0_addr: node1
  nodeid: 1
  }

  node {
  ring0_addr: node2
  nodeid: 2
  }

  node {
  ring0_addr: node3
  nodeid: 3
  }
}

quorum {
  provider: corosync_votequorum
}

logging {
  to_logfile: yes
  logfile: /var/log/cluster/corosync.log
  to_syslog: no
  timestamp: on
}

Thanks and Regards,
S Sathish S


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





Re: [ClusterLabs] corosync 2.4.4 version provide secure the communication by default

2023-01-23 Thread Jan Friesse

Hi,

On 23/01/2023 01:37, S Sathish S via Users wrote:

Hi Team,

corosync 2.4.4 version provide mechanism to secure the communication path 
between nodes of a cluster by default? bcoz in our configuration secauth is 
turned off but still communication occur is encrypted.

Note : Capture tcpdump for port 5405 and I can see that the data is already 
garbled and not in the clear.


It's a binary protocol, so don't expect some really readable format (like 
xml/json/...). But with your config it should be unencrypted. You can 
check for the message "notice  [TOTEM ] Initializing transmit/receive security 
(NSS) crypto: none hash: none" during corosync startup.


Regards,
  Honza




[root@node1 ~]# cat /etc/corosync/corosync.conf
totem {
 version: 2
 cluster_name: OCC
secauth: off
 transport: udpu
}

nodelist {
 node {
 ring0_addr: node1
 nodeid: 1
 }

 node {
 ring0_addr: node2
 nodeid: 2
 }

 node {
 ring0_addr: node3
 nodeid: 3
 }
}

quorum {
 provider: corosync_votequorum
}

logging {
 to_logfile: yes
 logfile: /var/log/cluster/corosync.log
 to_syslog: no
 timestamp: on
}

Thanks and Regards,
S Sathish S


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





[ClusterLabs] Corosync 3.1.7 is available at corosync.org!

2022-11-15 Thread Jan Friesse

I am pleased to announce the latest maintenance release of Corosync
3.1.7 available immediately from the GitHub release section at 
https://github.com/corosync/corosync/releases or our website at

http://build.clusterlabs.org/corosync/releases/.

This release contains important bugfixes and the knet_mtu feature (for more 
information please see corosync.conf(5)).
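As a hedged illustration of the new option (the value below is only a placeholder;
see corosync.conf(5) for the exact semantics):

```
totem {
    version: 2
    transport: knet
    # set a fixed knet MTU instead of relying on automatic PMTU discovery
    knet_mtu: 1397
}
```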


Complete changelog for 3.1.7:

Andreas Grueninger (1):
  totemconfig: Check uname return value correctly

Christine Caulfield (1):
  log: Configure knet logging to the same as corosync

Ferenc Wágner (1):
  Remove bashism from configure script

Jan Friesse (6):
  totemudpu: Don't block local socketpair
  pkgconfig: Export corosysconfdir
  totempg: Fix alignment handling
  logrotate: Use copytruncate method by default
  configure: Modernize configure.ac a bit
  totemconfig: Add support for knet_mtu

Upgrade is highly recommended.

Thanks and congratulations to everyone who contributed to this 
great milestone.


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Corosync 2.4.6 is available at corosync.org!

2022-11-09 Thread Jan Friesse
I am pleased to announce the last maintenance release of the old stable 
(Needle) branch of Corosync,
2.4.6, available immediately from the GitHub release section at 
https://github.com/corosync/corosync/releases or our website at

http://build.clusterlabs.org/corosync/releases/.

The Needle branch is now unsupported and no longer maintained by the 
Corosync team. The main reason for this decision is that Camelback (v3 - 
latest v3.1.6) is now almost 4 years old and has proven to be stable.


Just a few short statistics/interesting facts:
- Version 2.0.0 was released on Tue Apr 10 2012, so Needle was supported 
for more than 10 years

- There were 845 commits made by 67 people during Needle's lifetime
- 410 files changed, 51214 insertions(+), 5516 deletions(-)
- Needle was the first release without LCR support and without the AIS services 
implemented
- Corosync-qdevice was added during the Needle life cycle (and now it is 
a separate project)


Complete changelog for 2.4.6 (compared to v2.4.5):

Aleksei Burlakov (1):
  totemsrp: More informative messages

Christine Caulfield (4):
  icmap: fix the icmap_get_*_r functions
  stats: Add basic schedule-miss stats to needle
  icmap: icmap_init_r() leaks if trie_create() fails
  test: Fix cpgtest

Fabio M. Di Nitto (1):
  pkgconfig: Add libqb dependency

Ferenc Wágner (1):
  man: votequorum.5: use proper single quotes

Hideo Yamauchi (1):
  cpg: Change downlist log level

    Jan Friesse (52):
  totem: Increase ring_id seq after load
  totempg: Check sanity (length) of received message
  totemsrp: Reduce MTU to left room second mcast
  qnetd: Rename qnetd-log.c to log.c
  qnetd: Fix double -d description
  qnetd: Check log initialization error
  qnetd: Add function to set log target
  qdevice: Use log instead of libqb log
  qdevice: Import log instead of qdevice-log
  qdevice: Merge msg_decode_error functions
  qnetd: Use log-common for nodelist debug dump
  qdevice: Configurable log priority bump
  tests: Add utils_parse_bool_str test
  qdevice: Free memory used by log
  qdevice: Add log test
  qdevice: Add header files to list of test sources
  qdevice: Add chk variant of vsyslog to test-log
  qdevice: Add prototype of __vsyslog_chk
  votequorum: Ignore the icmap_get_* return value
  logconfig: Remove double free of value
  cmap: Assert copied string length
  sync: Assert sync_callbacks.name length
  votequorum: Assert copied strings length
  cpghum: Remove unused time variables and functions
  cfgtool: Remove unused callbacks
  cmapctl: Free bin_value on error
  quorumtool: Assert copied string length
  votequorum: Reflect runtime change of 2Node to WFA
  main: Add schedmiss timestamp into message
  votequorum: Change check of expected_votes
  quorumtool: Fix exit status codes
  quorumtool: exit on invalid expected votes
  votequorum: set wfa status only on startup
  Revert "totemip: Add support for sin6_scope_id"
  Revert "totemip: compare sin6_scope_id and interface_num"
  main: Make schedmiss in cmap and log equal
  totemip: Add support for sin6_scope_id
  qnetd: Do not call ffsplit_do on shutdown
  qdevice: Fix connect heuristics result callback
  qdevice: Fix connect heuristics result callback
  qdevice: Log adds newline automatically
  qnetd: Fix dpd timer
  qnetd: Add support for keep active partition vote
  common_lib: Remove trailing spaces in cs_strerror
  totemsrp: Move token received callback
  tests: Use CS_DISPATCH_BLOCKING instead of cycle
  qnetd: Fix NULL dereference of client
  qnetd: Simplify KAP Tie-breaker logic
  totem: Add cancel_hold_on_retransmit config option
  logsys: Unlock config mutex on error
  totemsrp: Switch totempg buffers at the right time
  totemudpu: Don't block local socketpair

Kai Kang (1):
  configure.ac: fix pkgconfig issue of rdma

liangxin1300 (12):
  totemip: Add support for sin6_scope_id
  totemip: compare sin6_scope_id and interface_num
  qdevice: Change log level to NOTICE on PASS
  cfgtool: output error messages to stderr
  tools: use util_strtonum for options checking
  cmapctl: return EXIT_FAILURE on failure
  quorumtool: Help shouldn't require running service
  quorumtool: strict check for -o option
  cmapctl: check NULL for key type and value for -p
  man: adjust description about interface section
  qnetd: sort by node_id when add new client
  man: replace votequorum_poll for actuall

[ClusterLabs] Corosync-qdevice 3.0.2 is available at GitHub!

2022-11-03 Thread Jan Friesse
I am pleased to announce the latest maintenance release of 
Corosync-Qdevice 3.0.2 available immediately from GitHub at 
https://github.com/corosync/corosync-qdevice/releases as 
corosync-qdevice-3.0.2.


This release contains important bug fixes.

Complete changelog for 3.0.2:

Jan Friesse (6):
  qnetd: Don't alloc host_addr
  tests: Enhance test-timer-list
  tests: Fix test-timer-list NULL check
  timer-list: Use correct english term children
  unix-socket: Check minimal length of socket path
  configure: Modernize configure.ac a bit

liangxin1300 (2):
  qnetd: sort by node_id when add new client
  man: replace votequorum_poll for actually used fn

Upgrade is highly recommended.

Thanks and congratulations to everyone who contributed to this 
great milestone.


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Re: QDevice not found after reboot but appears after cluster restart

2022-08-01 Thread Jan Friesse

Hi,

On 01/08/2022 16:18, john tillman wrote:

"john tillman"  schrieb am 29.07.2022 um 22:51 in

Nachricht
:

On Thursday 28 July 2022 at 22:17:01, john tillman wrote:


I have a two cluster setup with a qdevice. 'pcs quorum status' from a
cluster node shows the qdevice casting a vote.  On the qdevice node
'corosync‑qnetd‑tool ‑s' says I have 2 connected clients and 1
cluster.
The vote count looks correct when I shutdown either one of the
cluster
nodes or the qdevice.  So the voting seems to be working at this
point.


Indeed ‑ shutting down 1 of 3 nodes leaves quorum intact, therefore
everything
still awake knows what's going on.


 From this state, if I reboot both my cluster nodes at the same time


Ugh!


but leave the qdevice node running, the cluster will not see the
qdevice
when the nodes come back up: 'pcs quorum status' show 3 votes
expected
but
only 2 votes cast (from the cluster nodes).


I would think this is to be expected, since if you reboot 2 out of 3
nodes,
you completely lose quorum, so the single node left has no idea what
to
trust
when the other nodes return.


No, no.  I do have quorum after the reboots.  It is courtesy of the 2
cluster nodes casting their quorum votes.  However, the qdevice is not
casting a vote so I am down to 2 out of 3 nodes.

And the qdevice is not part of the cluster.  It will never have any
resources running on it.  Its job is just to vote.

‑John



I thought maybe the problem was that the network wasn't ready when
corosync.service started so I forced a "ExecStartPre=/usr/bin/sleep 10"
into it but that didn't change anything.


This type of fix is broken anyway: You are not delaying, you are waiting
for
an event (network up).
Basically the OS distribution should have configured it correctly already.

In SLES15 there is:
Requires=network-online.target
After=network-online.target



Thank you for the response.

Yes, I saw that those values were correctly set in the service
configuration file for corosync.  The delay was just a test. I just wanted
to make sure that it wasn't a race condition of bringing up the bond and
trying to connect to the quorum node.

I was grep'ing the corosync log for VOTEQ entries and noticed when it
works I see consecutively:
... [VOTEQ ] Sending quorum callback, quorate = 0
... [VOTEQ ] Received qdevice op 1 req from node 1 [QDevice]
When it does not work I never see 'Received qdevice...' line in the log.
Is there something else I can look for to find this problem?  Some other
test you can think of?  Maybe some configuration of the votequorum
service?


maybe good start is to get cluster into state of "non working" qdevice 
and then paste:

- /var/log/messages of corosync/qdevice
- output of `corosync-qdevice-tool -sv` (from nodes) and 
`corosync-qnetd-tool -lv` (from machine where qnetd is running)


"Received qdevice op 1 req from node 1 [QDevice]" it means qdevice is 
registered (= corosync-qdevice was started) - if line is really missing 
it can mean corosync-qdevice is not running - log or running 
`corosync-qdevice -f -d` should give some insights why it is not running.
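
As a rough sketch of the data gathering suggested above (commands taken from 
this thread; where to run each one is as described):

    # on each cluster node - local view of the qdevice daemon
    corosync-qdevice-tool -sv

    # on the machine running qnetd - list of connected clients
    corosync-qnetd-tool -lv

    # if corosync-qdevice turns out not to be running, start it in the
    # foreground with debug output to see why it fails
    corosync-qdevice -f -d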


Honza







I could still use some advice with debugging this oddity.  Or have I
used
up my quota of questions this year :‑)

‑John



Starting from a situation such as this, your only hope is to rebuild the
cluster from scratch, IMHO.


Antony.

‑‑
Police have found a cartoonist dead in his house.  They say that
details
are
currently sketchy.

Please reply to the list; please *don't* CC me.



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Question regarding the security of corosync

2022-06-22 Thread Jan Friesse

On 22/06/2022 07:12, Andrei Borzenkov wrote:

On 22.06.2022 02:27, Antony Stone wrote:

On Friday 17 June 2022 at 11:39:14, Mario Freytag wrote:


I’d like to ask about the security of corosync. We’re using a Proxmox HA
setup in our testing environment and need to confirm it’s compliance with
PCI guidelines.

We have a few questions:

Is the communication encrypted?
What method of encryption is used?
What method of authentication is used?
What is the recommended way of separation for the corosync network? VLAN?


Your first three questions are probably well-answered by
https://github.com/fghaas/corosync/blob/master/SECURITY



This is a thirteen-year-old file which is not present in the current
corosync sources. I hesitate to use it as the answer to anything
*today*. If it is still relevant, why was it removed?


Yup, the file is no longer relevant. The main reason to remove it was 
that corosync no longer does encryption itself - it's now knet's problem. 
Also, the file was pretty much outdated since the removal of tomcrypt and 
the move to just using NSS (so the corosync 2 era).


So the really authoritative source is the knet source code 
(https://github.com/kronosnet/kronosnet/blob/main/libknet/crypto.c and 
the other crypto*.c files).


Honza




For the fourth, I agree with Jan Friesse - a dedicated physical network is
best; a dedicated VLAN is second best.


Antony.



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Question regarding the security of corosync

2022-06-21 Thread Jan Friesse

Hi Mario,

On 17/06/2022 11:39, Mario Freytag wrote:

Dear sirs, or madams,

I’d like to ask about the security of corosync. We’re using a Proxmox HA setup 
in our testing environment and need to confirm it’s compliance with PCI 
guidelines.

We have a few questions:

Is the communication encrypted?


It depends on the configuration, but (I think) the default for Proxmox is to set 
secauth: on, so yes, communication is encrypted.



What method of encryption is used?


aes256


What method of authentication is used?


sha256


What is the recommended way of separation for the corosync network? VLAN?


A separate network card is always best. A VLAN is probably second best.
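
For reference, a minimal totem sketch matching the cipher/hash named in this 
reply (cluster name is a placeholder; whether secauth: on maps to exactly 
these values may depend on the corosync version, see corosync.conf(5)):

totem {
    version: 2
    cluster_name: mycluster
    transport: knet
    crypto_cipher: aes256
    crypto_hash: sha256
}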

Regards,
  Honza



Best regards

Mario Freytag
Systemadministrator | WEBINC GmbH & Co. KG

​Unter den Eichen 5 Geb. F | 65195 Wiesbaden | T +49 611 541075 0
Amtsgericht Wiesbaden | HRA 9610 | Geschäftsführung: Marina Maurer, Monika 
Brandes


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] No node name in corosync-cmapctl output

2022-06-01 Thread Jan Friesse

On 31/05/2022 16:28, Andreas Hasenack wrote:

Hi,

On Tue, May 31, 2022 at 1:35 PM Jan Friesse  wrote:


Hi,

On 31/05/2022 15:16, Andreas Hasenack wrote:

Hi,

corosync 3.1.6
pacemaker 2.1.2
crmsh 4.3.1

TL;DR
I only seem to get a "name" attribute in the "corosync-cmapctl | grep
nodelist" output if I set an explicit name in corosync.conf's
nodelist. If I rely on the default of "name will be uname -n if it's
not set", I get nothing.



wondering where is problem? name is not set so it's not in cmap, what is
(mostly) 1:1 mapping of config file. So this is expected, not a bug.


It was surprising to me, because the node clearly has a name (crm_node -n).


Why not also use "uname -n" when "name" is not explicitly set in the
corosync nodelist config?


Can you please share the use case for this behavior? It shouldn't be hard
to implement.


The use case is a test script[1], which installs the package, starts
the services, and then runs some quick checks. One of the tests is to
check for the node name in "crm status" output, and for that it needs
to discover the node name.


got it



Sure, plenty of ways to achieve that. Set it in the config to a known
name, or run "crm_node -n", or something else. The script is doing:
POS="$(corosync-cmapctl -q -g nodelist.local_node_pos)"
NODE="$(corosync-cmapctl -q -g nodelist.node.$POS.name)"


Ok, so you only need the local node name - then why not add
```
[ "$NODE" == "" ] && NODE=`uname -n`
```

No matter what, implementing resolving of just the local node name would be 
really easy - implementing it cluster-wide would be super hard (on the 
corosync level). On the other hand, I'm really not that keen on having 
just the local node name filled in, plus it creates a bunch of other problems 
(default value during reload, ...).
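
For reference, a minimal (untested) sketch of that fallback, building on the 
commands from the test script quoted above:

    POS="$(corosync-cmapctl -q -g nodelist.local_node_pos)"
    NODE="$(corosync-cmapctl -q -g nodelist.node.$POS.name 2>/dev/null)"
    # fall back to the hostname when no name key exists in cmap
    [ -z "$NODE" ] && NODE="$(uname -n)"
    echo "$NODE"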






and I was surprised that there was no "name" entry. In this cluster
stack, depending on which layer you ask,  you may get different
answers :)


Yup, agree. Sometimes it's confusing :( But the test is really about 
`crm` so pacemaker level...


Regards,
  Honza






1. 
https://salsa.debian.org/ha-team/crmsh/-/blob/master/debian/tests/pacemaker-node-status.sh
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] No node name in corosync-cmapctl output

2022-05-31 Thread Jan Friesse

On 31/05/2022 16:11, Ken Gaillot wrote:

On Tue, 2022-05-31 at 13:16 +, Andreas Hasenack wrote:

Hi,

corosync 3.1.6
pacemaker 2.1.2
crmsh 4.3.1

TL;DR
I only seem to get a "name" attribute in the "corosync-cmapctl | grep
nodelist" output if I set an explicit name in corosync.conf's
nodelist. If I rely on the default of "name will be uname -n if it's
not set", I get nothing.


The default is Pacemaker's; corosync doesn't actually know or care
about the node name other than that the user added a configuration
value for "name" (or not).


Just a small correction.

What you've said was 100% true for corosync 1 and 2, but is not really the 
case for corosync 3.x. The name, if defined, is used for matching the local node 
even by corosync itself. There is actually a bit of scary code taken from 
cman which tries to remove dot-separated parts of the full uname to match 
the name. If the name is not set or no match is found, ring0_addr is used instead.


This improvement was required to support configurations where 
ring0_addr is not set ... which is a valid corosync 3 configuration.
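
For illustration, such a corosync 3 nodelist might look like this (a sketch 
only; the names are placeholders and must resolve to the nodes' addresses):

nodelist {
    node {
        name: node1
        nodeid: 1
    }
    node {
        name: node2
        nodeid: 2
    }
}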


Regards,
  Honza



The equivalent query in Pacemaker would be "crm_node --name" (to get
the local host's node name in the cluster) or "crm_node --list" (to
show all known node names).




I formed a test cluster of 3 nodes, and I'm not setting the name
attribute in the nodelist, so that it defaults to `uname -n`:
nodelist {
 node {
  nodeid: 1
 ring0_addr: k1
 }
 node {
 nodeid: 2
 ring0_addr: k2
 }
 node {
 nodeid: 3
 ring0_addr: k3
 }
}

The addresses "k1", "k2" and "k3" are fully resolvable (I know IPs
are
better, but for this quick test it was simpler to use the hostnames).

crm status is happy:
root@k1:~# crm status
Cluster Summary:
   * Stack: corosync
   * Current DC: k3 (version 2.1.2-ada5c3b36e2) - partition with
quorum
   * Last updated: Tue May 31 12:53:02 2022
   * Last change:  Tue May 31 12:51:55 2022 by hacluster via crmd on
k3
   * 3 nodes configured
   * 0 resource instances configured

Node List:
   * Online: [ k1 k2 k3 ]

Full List of Resources:
   * No resources


But there is no node name in the corosync-cmapctl output:

root@k1:~# corosync-cmapctl |grep nodelist
nodelist.local_node_pos (u32) = 0
nodelist.node.0.nodeid (u32) = 1
nodelist.node.0.ring0_addr (str) = k1
nodelist.node.1.nodeid (u32) = 2
nodelist.node.1.ring0_addr (str) = k2
nodelist.node.2.nodeid (u32) = 3
nodelist.node.2.ring0_addr (str) = k3

I was expecting to have entries like "nodelist.node.0.name = k1" in
that output. Apparently I only get that if I explicitly set a node
name in nodelist.

For example, if I set the name of nodeid 1 to "explicit1":
 node {
 name: explicit1
 nodeid: 1
 ring0_addr: k1
 }

Then I get the name attribute for that nodeid only:
# corosync-cmapctl |grep nodelist
nodelist.local_node_pos (u32) = 0
nodelist.node.0.name (str) = explicit1
nodelist.node.0.nodeid (u32) = 1
nodelist.node.0.ring0_addr (str) = k1
nodelist.node.1.nodeid (u32) = 2
nodelist.node.1.ring0_addr (str) = k2
nodelist.node.2.nodeid (u32) = 3
nodelist.node.2.ring0_addr (str) = k3

Why not also use "uname -n" when "name" is not explicitly set in the
corosync nodelist config?
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] No node name in corosync-cmapctl output

2022-05-31 Thread Jan Friesse

Hi,

On 31/05/2022 15:16, Andreas Hasenack wrote:

Hi,

corosync 3.1.6
pacemaker 2.1.2
crmsh 4.3.1

TL;DR
I only seem to get a "name" attribute in the "corosync-cmapctl | grep
nodelist" output if I set an explicit name in corosync.conf's
nodelist. If I rely on the default of "name will be uname -n if it's
not set", I get nothing.



I'm wondering where the problem is. The name is not set, so it's not in cmap, which 
is (mostly) a 1:1 mapping of the config file. So this is expected, not a bug.




I formed a test cluster of 3 nodes, and I'm not setting the name
attribute in the nodelist, so that it defaults to `uname -n`:
nodelist {
 node {
  nodeid: 1
 ring0_addr: k1
 }
 node {
 nodeid: 2
 ring0_addr: k2
 }
 node {
 nodeid: 3
 ring0_addr: k3
 }
}

The addresses "k1", "k2" and "k3" are fully resolvable (I know IPs are
better, but for this quick test it was simpler to use the hostnames).

crm status is happy:
root@k1:~# crm status
Cluster Summary:
   * Stack: corosync
   * Current DC: k3 (version 2.1.2-ada5c3b36e2) - partition with quorum
   * Last updated: Tue May 31 12:53:02 2022
   * Last change:  Tue May 31 12:51:55 2022 by hacluster via crmd on k3
   * 3 nodes configured
   * 0 resource instances configured

Node List:
   * Online: [ k1 k2 k3 ]

Full List of Resources:
   * No resources


But there is no node name in the corosync-cmapctl output:

root@k1:~# corosync-cmapctl |grep nodelist
nodelist.local_node_pos (u32) = 0
nodelist.node.0.nodeid (u32) = 1
nodelist.node.0.ring0_addr (str) = k1
nodelist.node.1.nodeid (u32) = 2
nodelist.node.1.ring0_addr (str) = k2
nodelist.node.2.nodeid (u32) = 3
nodelist.node.2.ring0_addr (str) = k3

I was expecting to have entries like "nodelist.node.0.name = k1" in
that output. Apparently I only get that if I explicitly set a node
name in nodelist.

For example, if I set the name of nodeid 1 to "explicit1":
 node {
 name: explicit1
 nodeid: 1
 ring0_addr: k1
 }

Then I get the name attribute for that nodeid only:
# corosync-cmapctl |grep nodelist
nodelist.local_node_pos (u32) = 0
nodelist.node.0.name (str) = explicit1
nodelist.node.0.nodeid (u32) = 1
nodelist.node.0.ring0_addr (str) = k1
nodelist.node.1.nodeid (u32) = 2
nodelist.node.1.ring0_addr (str) = k2
nodelist.node.2.nodeid (u32) = 3
nodelist.node.2.ring0_addr (str) = k3

Why not also use "uname -n" when "name" is not explicitly set in the
corosync nodelist config?


Can you please share the use case for this behavior? It shouldn't be hard 
to implement.


Regards,
  Honza


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] corosync-cfgtool -s shows all links not connected for one particular node

2022-05-24 Thread Jan Friesse

Dirk,

On 23/05/2022 19:02, Dirk Gassen wrote:

Greetings,

I have a four-node cluster on Ubuntu Focal with the following versions:

libknet1: 1.15-1ubuntu1
corosync: 3.0.3-2ubuntu2.1
pacemaker: 2.0.3-3ubuntu4.3


The 3.0.3 corosync-cfgtool was buggy - basically the first version with a 
correctly working `cfgtool -a` is 3.1.5. If possible I would recommend 
updating (either compile from source or use the proxmox repo 
http://download.proxmox.com/debian/pve/).


Honza



Each node is connected to two networks:

testras1:
  eth0  10.1.8.24/26
  eth1  192.168.21.227/24
testras2:
  eth0  10.1.8.25/26
  eth1  192.168.21.119/24
testras3:
  eth0  10.1.8.66/26
  eth1  192.168.21.13/24
testras4:
  eth0  10.1.8.77/26
  eth1  192.168.21.19/24


The totem section of corosync.conf on all nodes:

totem {
version: 2
cluster_name: BERND-RAS
# Disable encryption
secauth: off
interface {
    linknumber: 0
    #knet_transport: udp|sctp
    #knet_link_priority: 0
}
interface {
    linknumber: 1
    #knet_transport: udp|sctp
    #knet_link_priority: 1
}
transport: knet
}

and the nodelist section:

nodelist {
    node {
    ring0_addr: 192.168.21.227
    ring1_addr: 10.1.8.24
    nodeid: 2036952047
    name: testras1
    }
    node {
    ring0_addr: 192.168.21.119
    ring1_addr: 10.1.8.25
    nodeid: 2036951939
    name: testras2
    }
    node {
    ring0_addr: 192.168.21.13
    ring1_addr: 10.1.8.66
    nodeid: 1921682113
    name: testras3
    }
    node {
    ring0_addr: 192.168.21.19
    ring1_addr: 10.1.8.77
    nodeid: 1921682119
    name: testras4
    }
}


On all nodes crm_mon shows all four nodes online:

Node List:
  * Online: [ testras1 testras2 testras3 testras4 ]

and "corosync-cfgtool -s" shows the very same:

Printing link status.
Local node ID 2036952047
LINK ID 0
addr    = 192.168.21.227
status:
    nodeid 1921682113:    link enabled:1    link connected:1
    nodeid 1921682119:    link enabled:1    link connected:1
    nodeid 2036951939:    link enabled:1    link connected:1
    nodeid 2036952047:    link enabled:1    link connected:1
LINK ID 1
addr    = 10.1.8.24
status:
    nodeid 1921682113:    link enabled:1    link connected:1
    nodeid 1921682119:    link enabled:0    link connected:1
    nodeid 2036951939:    link enabled:1    link connected:1
    nodeid 2036952047:    link enabled:1    link connected:1



However, when I add a node that doesn't exist that changes:

    node {
    ring0_addr: 192.168.120.13
    ring1_addr: 10.1.8.99
    nodeid: 2036942833
    name: testras5
    }

Now "corosync-cfgtool -s" shows:

Printing link status.
Local node ID 2036952047
LINK ID 0
addr    = 192.168.21.227
status:
    nodeid 1921682113:    link enabled:1    link connected:0
    nodeid 1921682119:    link enabled:1    link connected:1
    nodeid 2036942833:    link enabled:1    link connected:1
    nodeid 2036951939:    link enabled:1    link connected:1
    nodeid 2036952047:    link enabled:1    link connected:1
LINK ID 1
addr    = 10.1.8.24
status:
    nodeid 1921682113:    link enabled:1    link connected:0
    nodeid 1921682119:    link enabled:1    link connected:1
    nodeid 2036942833:    link enabled:0    link connected:1
    nodeid 2036951939:    link enabled:1    link connected:1
    nodeid 2036952047:    link enabled:1    link connected:1

while everything else stays the same.

Why would "link connected" show 0 for one of the existing nodes but not 
for the non-existing node (2036942833)? (All existing nodes can still 
see each other) What am I missing?


Dirk
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Cluster unable to find back together

2022-05-19 Thread Jan Friesse

Hi,

On 19/05/2022 10:16, Leditzky, Fabian via Users wrote:

Hello

We have been dealing with our pacemaker/corosync clusters becoming unstable.
The OS is Debian 10 and we use Debian packages for pacemaker and corosync,
version 3.0.1-5+deb10u1 and 3.0.1-2+deb10u1 respectively.


Seems like the pcmk version is not so important for the behavior you've 
described. Corosync 3.0.1 is super old - are you able to reproduce the 
behavior with 3.1.6? What is the version of knet? There were quite a few 
fixes, so the latest one (1.23) is really recommended.


You can try to compile it yourself, or use the proxmox repo 
(http://download.proxmox.com/debian/pve/) which contains newer versions 
of the packages.



We use knet over UDP transport.

We run multiple 2-node and 4-8 node clusters, primarily managing VIP resources.
The issue we experience presents itself as a spontaneous disagreement of
the status of cluster members. In two node clusters, each node spontaneously
sees the other node as offline, despite network connectivity being OK.
In larger clusters, the status can be inconsistent across the nodes.
E.g.: node1 sees 2,4 as offline, node 2 sees 1,4 as offline while node 3 and 4 
see every node as online.


This really shouldn't happen.


The cluster becomes generally unresponsive to resource actions in this state.


Expected


Thus far we have been unable to restore cluster health without restarting 
corosync.

We are running packet captures 24/7 on the clusters and have custom tooling
to detect lost UDP packets on knet ports. So far we could not see significant
packet loss trigger an event, at most we have seen a single UDP packet dropped
some seconds before the cluster fails.

However, even if the root cause is indeed a flaky network, we do not understand
why the cluster cannot recover on its own in any way. The issues definitely 
persist
beyond the presence of any intermittent network problem.


Try a newer version. If the problem persists, it's a good idea to monitor 
whether packets really pass through. Corosync always creates (at least) a 
single-node membership.
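
A couple of quick checks for that (standard corosync tools; run on each node):

    corosync-cfgtool -s     # per-link knet status as seen by the local node
    corosync-quorumtool -s  # membership and quorum view of the local node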


Regards,
  Honza



We were able to artificially break clusters by inducing packet loss with an 
iptables rule.
Dropping packets on a single node of an 8-node cluster can cause malfunctions on
multiple other cluster nodes. The expected behavior would be detecting that the
artificially broken node failed but keeping the rest of the cluster stable.
We were able to reproduce this also on Debian 11 with more recent 
corosync/pacemaker
versions.

Our configuration basic, we do not significantly deviate from the defaults.

We will be very grateful for any insights into this problem.

Thanks,
Fabian

// corosync.conf
totem {
 version: 2
 cluster_name: cluster01
 crypto_cipher: aes256
 crypto_hash: sha512
 transport: knet
}
logging {
 fileline: off
 to_stderr: no
 to_logfile: no
 to_syslog: yes
 debug: off
 timestamp: on
 logger_subsys {
 subsys: QUORUM
 debug: off
 }
}
quorum {
 provider: corosync_votequorum
 two_node: 1
 expected_votes: 2
}
nodelist {
 node {
 name: node01
 nodeid: 01
 ring0_addr: 10.0.0.10
 }
 node {
 name: node02
 nodeid: 02
 ring0_addr: 10.0.0.11
 }
}

// crm config show
node 1: node01 \
 attributes standby=off
node 2: node02 \
 attributes standby=off maintenance=off
primitive IP-clusterC1 IPaddr2 \
 params ip=10.0.0.20 nic=eth0 cidr_netmask=24 \
 meta migration-threshold=2 target-role=Started is-managed=true \
 op monitor interval=20 timeout=60 on-fail=restart
primitive IP-clusterC2 IPaddr2 \
 params ip=10.0.0.21 nic=eth0 cidr_netmask=24 \
 meta migration-threshold=2 target-role=Started is-managed=true \
 op monitor interval=20 timeout=60 on-fail=restart
location STICKY-IP-clusterC1 IP-clusterC1 100: node01
location STICKY-IP-clusterC2 IP-clusterC2 100: node02
property cib-bootstrap-options: \
 have-watchdog=false \
 dc-version=2.0.1-9e909a5bdd \
 cluster-infrastructure=corosync \
 cluster-name=cluster01 \
 stonith-enabled=no \
 no-quorum-policy=ignore \
 last-lrm-refresh=1632230917




This email message and any attachments may contain confidential, proprietary or 
non-public information. The information is intended solely for the designated 
recipient(s). If an addressing or transmission error has misdirected this 
email, please notify the sender immediately and destroy this email. Any review, 
dissemination, use or reliance upon this information by unintended recipients 
is prohibited. Any opinions expressed in this email are those of the author 
personally.
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/




Re: [ClusterLabs] Booth ticket multi-site and quorum /Pacemaker

2022-02-24 Thread Jan Friesse

Hi,

On 24/02/2022 14:19, Viet Nguyen wrote:

Hi,

Thank you so much! Would you please advise more on this following case:

The cluster I am trying to setup is Postgresql with replication streaming
with PAF. So, it will decide one node as a master and 3 standby nodes.

So, with this, from what I understand from Postgresql, having 2 independent
clusters (one in site A, one in site B) is not possible. I have to go with
one single cluster with 4 notes located in 2 different locations (site A
and site B).

Then, my question is:

1. Does the booth ticket work in this setup?


No, not really. Booth basically creates a cluster on top of 2+ clusters 
and an arbitrator.



2. Is Qnetd a better option than booth ticket?


It's neither better nor worse. Qdevice (qnetd) adds a vote (or votes) to the 
quorum (corosync level). Booth is able to fulfill a pacemaker ticket constraint, 
granting the ticket to only one site, in an automated way.
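
As an illustration of such a ticket constraint (a sketch only; ticket and 
resource names are placeholders, see the pcs/crmsh documentation):

    # pcs
    pcs constraint ticket add ticketA my-resource loss-policy=stop

    # crmsh equivalent
    crm configure rsc_ticket my-resource-req-ticketA ticketA: my-resource loss-policy=stop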




3. Is there any better way to manage this?


If you can really use just one big cluster, then probably neither booth 
nor qdevice is needed.



4. Since we have a distributed site and arbitrator, does fencing make it
even more complicated? How I could solve this problem?


fencing is "must", it doesn't make it more complicated. Probably sbd but 
I have virtually no knowledge about that.





Sorry if my questions sound silly as I am very new to this and thank
you so much for your help.


yw

Regards,
  Honza



Regards,
Viet

On Thu, 24 Feb 2022 at 12:17, Jan Friesse  wrote:


On 24/02/2022 10:28, Viet Nguyen wrote:

Hi,

Thank you so so much for your help. May i ask a following up question:

For the option of having one big cluster with 4 nodes without booth,

then,

if one site (having 2 nodes) is down, then the other site does not work

as

it does not have quorum, am I right? Even if we have a quorum voter in


Yup, you are right


either site A or B, then, if the site with quorum down, then, the other
site does not work.  So, how can we avoid this situation as I want
that if one site is down, the other site still services?


probably only with qnetd - so basically yet again site C.

Regards,
Honza



Regards,
Viet

On Wed, 23 Feb 2022 at 17:08, Jan Friesse  wrote:


Viet,

On 22/02/2022 22:37, Viet Nguyen wrote:

Hi,

Could you please help me out with this question?

I have 4 nodes cluster running in the same network but in 2 different

sites

(building A - 2 nodes and building B - 2 nodes). My objective is to
setup HA for this cluster with pacemaker. The expectation is if a site

is

down, the other site still services.

   From what I could understand so far, in order to make it work, it

needs

to

have booth ticket manager installed in a different location, let's say
building C which connects to both sites A and B.

With this assumption, i would like to ask few questions:

  1. Am i right that I need to setup the booth ticket manager as a

quorum

  voter as well?


Yes, booth (arbitrator) has to be installed on "site" C if you want to
use booth. Just keep in mind booth has nothing to do with quorum.


  2. What happens if  the connection between site A and B is down,

but

the

  connection between A and C, B and C still up? In this case, both

site A and

  B still have the quorum as it can connect to C, but not between

each

other?

If you use booth then it's not required site A to see site B. It's then
"site" C problem to decide which site gets ticket.



  3. Or is there any better way to manage 2 sites cluster, each has

2

  nodes? And if one site is down like environmental disaster, then,

the other

  site still services.


Basically there are (at least) two possible solutions:
- Have one big cluster without booth and use pcmk constraints
- Have two 2 node clusters and use booth. Then each of the two node
clusters is "independent" (have its own quorum) and each of the cluster
runs booth (site) as a cluster resource + "site" C running booth
(arbitrator)

Regards,
 Honza




Thank you so much for your help!
Regards,
Viet


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/













___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Booth ticket multi-site and quorum /Pacemaker

2022-02-24 Thread Jan Friesse

On 24/02/2022 10:28, Viet Nguyen wrote:

Hi,

Thank you so so much for your help. May i ask a following up question:

For the option of having one big cluster with 4 nodes without booth, then,
if one site (having 2 nodes) is down, then the other site does not work as
it does not have quorum, am I right? Even if we have a quorum voter in


Yup, you are right


either site A or B, then, if the site with quorum down, then, the other
site does not work.  So, how can we avoid this situation as I want
that if one site is down, the other site still services?


probably only with qnetd - so basically yet again site C.
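
For completeness, a rough sketch of the quorum section with such a qdevice 
pointing at "site" C (hostname is a placeholder; see corosync-qdevice(8) for 
the available algorithms):

quorum {
    provider: corosync_votequorum
    device {
        model: net
        votes: 1
        net {
            host: qnetd.site-c.example
            algorithm: ffsplit
        }
    }
}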

Regards,
  Honza



Regards,
Viet

On Wed, 23 Feb 2022 at 17:08, Jan Friesse  wrote:


Viet,

On 22/02/2022 22:37, Viet Nguyen wrote:

Hi,

Could you please help me out with this question?

I have 4 nodes cluster running in the same network but in 2 different

sites

(building A - 2 nodes and building B - 2 nodes). My objective is to
setup HA for this cluster with pacemaker. The expectation is if a site is
down, the other site still services.

  From what I could understand so far, in order to make it work, it needs

to

have booth ticket manager installed in a different location, let's say
building C which connects to both sites A and B.

With this assumption, i would like to ask few questions:

 1. Am i right that I need to setup the booth ticket manager as a

quorum

 voter as well?


Yes, booth (arbitrator) has to be installed on "site" C if you want to
use booth. Just keep in mind booth has nothing to do with quorum.


 2. What happens if  the connection between site A and B is down, but

the

 connection between A and C, B and C still up? In this case, both

site A and

 B still have the quorum as it can connect to C, but not between each

other?

If you use booth then it's not required site A to see site B. It's then
"site" C problem to decide which site gets ticket.



 3. Or is there any better way to manage 2 sites cluster, each has 2
 nodes? And if one site is down like environmental disaster, then,

the other

 site still services.


Basically there are (at least) two possible solutions:
- Have one big cluster without booth and use pcmk constraints
- Have two 2 node clusters and use booth. Then each of the two node
clusters is "independent" (have its own quorum) and each of the cluster
runs booth (site) as a cluster resource + "site" C running booth
(arbitrator)

Regards,
Honza




Thank you so much for your help!
Regards,
Viet


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/








___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Booth ticket multi-site and quorum /Pacemaker

2022-02-23 Thread Jan Friesse

Viet,

On 22/02/2022 22:37, Viet Nguyen wrote:

Hi,

Could you please help me out with this question?

I have 4 nodes cluster running in the same network but in 2 different sites
(building A - 2 nodes and building B - 2 nodes). My objective is to
setup HA for this cluster with pacemaker. The expectation is if a site is
down, the other site still services.

 From what I could understand so far, in order to make it work, it needs to
have booth ticket manager installed in a different location, let's say
building C which connects to both sites A and B.

With this assumption, i would like to ask few questions:

1. Am i right that I need to setup the booth ticket manager as a quorum
voter as well?


Yes, booth (arbitrator) has to be installed on "site" C if you want to 
use booth. Just keep in mind booth has nothing to do with quorum.



2. What happens if  the connection between site A and B is down, but the
connection between A and C, B and C still up? In this case, both site A and
B still have the quorum as it can connect to C, but not between each other?


If you use booth then it's not required site A to see site B. It's then 
"site" C problem to decide which site gets ticket.




3. Or is there any better way to manage 2 sites cluster, each has 2
nodes? And if one site is down like environmental disaster, then, the other
site still services.


Basically there are (at least) two possible solutions:
- Have one big cluster without booth and use pcmk constraints
- Have two 2-node clusters and use booth. Then each of the two-node 
clusters is "independent" (has its own quorum) and each of the clusters 
runs booth (site) as a cluster resource + "site" C running booth 
(arbitrator) - a minimal booth.conf sketch follows below
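
A minimal booth.conf sketch for the second option (the addresses are 
placeholders; see the example config shipped with booth):

transport = UDP
port = 9929
# "site" C (arbitrator)
arbitrator = 192.0.2.30
# cluster A
site = 192.0.2.10
# cluster B
site = 192.0.2.20
ticket = "ticketA"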


Regards,
  Honza




Thank you so much for your help!
Regards,
Viet


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Corosync 3.1.6 is available at corosync.org!

2021-11-15 Thread Jan Friesse

I am pleased to announce the latest maintenance release of Corosync
3.1.6 available immediately from GitHub release section at 
https://github.com/corosync/corosync/releases or our website at

http://build.clusterlabs.org/corosync/releases/.

This release contains a MAJOR bugfix of the totem protocol, which caused loss 
or corruption of messages delivered during the recovery phase. It is also 
important to pair this release with Kronosnet v1.23 (announcement 
https://lists.clusterlabs.org/pipermail/users/2021-November/029810.html) 
and Libqb 2.0.4 (announcement 
https://lists.clusterlabs.org/pipermail/users/2021-November/029811.html).


All our development team would like to thank the Proxmox VE maintainer, 
Fabian Gruenbichler, for the extremely detailed bug reports, reproducers 
and collecting all the data from the affected Proxmox VE users, and his 
dedication over the past month to debug, test and work with us.


Another important feature is the addition of the cancel_hold_on_retransmit 
option, which allows corosync to work in environments where some 
packets are delayed more than others (caused by various antivirus / IPS / 
IDS software).


Complete changelog for 3.1.6:

Christine Caulfield (1):
  cpghum: Allow to continue if corosync is restarted

Jan Friesse (4):
  totem: Add cancel_hold_on_retransmit config option
  logsys: Unlock config mutex on error
  totemsrp: Switch totempg buffers at the right time
  build: Add explicit dependency for used libraries

miharahiro (1):
  man: Fix consensus timeout

This upgrade is required.

Thanks/congratulations to all people that contributed to achieve this 
great milestone.


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Corosync 2 vs Corosync 3

2021-10-27 Thread Jan Friesse

On 25/10/2021 16:42, Ken Gaillot wrote:

On Mon, 2021-10-25 at 13:44 +, Toby Haynes wrote:

Looking at Pacemaker 2.1, I see that both corosync 2 and corosync 3
are supported. The last corosync 2 release (2.4.5) came out in 30
July 2019. Will there come a point when a future Pacemaker release
might only support the Corosync 3 (or later) series?


There's nothing that Pacemaker has to do specially for Corosync 2 vs 3,
so it's unlikely support will ever be dropped for just 2. Of course
everything changes eventually, but nothing's on the horizon.

If you're choosing between the two, keep in mind Corosync 3 doesn't
support rolling upgrades from 2, even though Pacemaker itself doesn't
care.


Are there good use cases today where Corosync 2 is a better choice?


Not that I know of. I'd only use it when it's what the OS provides.
There are a few features only supported by 2 (like node discovery via
multicast), but I don't think they're worth staying on 2.



Just to add that Corosync 2 will probably get only one final release and 
we will stop supporting it completely, so I would recommend to choose 
Corosync 3 (with exception where your distro still ships Corosync 2 only 
- then it is probably better to use distro package).


Regards,
  Honza




Thanks,
  
Toby Haynes


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Re: Sub‑clusters / super‑clusters?

2021-08-05 Thread Jan Friesse

On 05/08/2021 00:11, Frank D. Engel, Jr. wrote:
In theory if you could have an independent voting infrastructure among 
the three clusters which serves to effectively create a second cluster 
infrastructure interconnecting them to support resource D, you could 


Yes. It's called booth.

have D running on one of the clusters so long as at least two of them 
can communicate with each other.



In other words, give each cluster one vote, then as long as two of them 
can communicate there are two votes which makes quorum, thus resource D 
can run on one of those two clusters.


If all three clusters lose contact with each other, then D still cannot 
safely run.



To keep the remaining resources working when contact is lost between the 
clusters, the vote for this would need to be independent of the vote 
within each individual cluster, effectively meaning that each node would 
belong to two clusters at once: its own local cluster (A/B/C) plus a 
"global" cluster spread across the three locations.  I don't know 
offhand if that is readily possible to support with the current software.



On 8/4/21 5:01 PM, Antony Stone wrote:

On Wednesday 04 August 2021 at 22:06:39, Frank D. Engel, Jr. wrote:


There is no safe way to do what you are trying to do.

If the resource is on cluster A and contact is lost between clusters A
and B due to a network failure, how does cluster B know if the resource
is still running on cluster A or not?

It has no way of knowing if cluster A is even up and running.

In that situation it cannot safely start the resource.

I am perfectly happy to have an additional machine at a third location in
order to avoid this split-brain between two clusters.

However, what I cannot have is for the resources which should be 
running on

cluster A to get started on cluster B.

If cluster A is down, then its resources should simply not run - as 
happens

right now with two independent clusters.

Suppose for a moment I had three clusters at three locations: A, B and C.

Is there a method by which I can have:

1. Cluster A resources running on cluster A if cluster A is functional 
and not

running anywhere if cluster A is non-functional.

2. Cluster B resources running on cluster B if cluster B is functional 
and not

running anywhere if cluster B is non-functional.

3. Cluster C resources running on cluster C if cluster C is functional 
and not

running anywhere if cluster C is non-functional.

4. Resource D running _somewhere_ on clusters A, B or C, but only a 
single

instance of D at a single location at any time.

Requirements 1, 2 and 3 are easy to achieve - don't connect the clusters.

Requirement 4 is the one I'm stuck with how to implement.

If the three nodes comprising cluster A can manage resources such that 
they
run on only one of the three nodes at any time, surely there must be a 
way of

doing the same thing with a resource running on one of three clusters?


Antony.



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Corosync 3.1.5 is available at corosync.org!

2021-08-04 Thread Jan Friesse

I am pleased to announce the latest maintenance release of Corosync
3.1.5 available immediately from GitHub release section at 
https://github.com/corosync/corosync/releases or our website at

http://build.clusterlabs.org/corosync/releases/.

This release contains important bugfixes of cfgtool and support for 
cgroup v2. Please see the corosync.conf(5) man page for more information 
about cgroup v2, because cgroup v2 is very different from cgroup v1, and 
systems with the CONFIG_RT_GROUP_SCHED kernel option enabled may experience 
problems with systemd logging or an inability to enable the cpu controller.


Complete changelog for 3.1.5:

Christine Caulfield (1):
  knet: Fix node status display

Jan Friesse (9):
  main: Add support for cgroup v2 and auto mode
  totemconfig: Do not process totem.nodeid
  cfgtool: Check existence of at least one of nodeid
  totemconfig: Put autogenerated nodeid back to cmap
  cfgtool: Set nodeid indexes after sort
  cfgtool: Fix brief mode display of localhost
  cfgtool: Use CS_PRI_NODE_ID for formatting nodeid
  totemconfig: Ensure all knet hosts has a nodeid
  totemconfig: Knet nodeid must be < 65536

Upgrade is highly recommended.

Thanks/congratulations to all people that contributed to achieve this 
great milestone.


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Sub-clusters / super-clusters?

2021-08-04 Thread Jan Friesse

On 03/08/2021 10:40, Antony Stone wrote:

On Tuesday 11 May 2021 at 12:56:01, Strahil Nikolov wrote:


Here is the example I had promised:

pcs node attribute server1 city=LA
pcs node attribute server2 city=NY

# Don't run on any node that is not in LA
pcs constraint location DummyRes1 rule score=-INFINITY city ne LA

#Don't run on any node that is not in NY
pcs constraint location DummyRes2 rule score=-INFINITY city ne NY

The idea is that if you add a node and you forget to specify the attribute
with the name 'city' , DummyRes1 & DummyRes2 won't be started on it.

For resources that do not have a constraint based on the city -> they will
run everywhere unless you specify a colocation constraint between the
resources.


Excellent - thanks.  I happen to use crmsh rather than pcs, but I've adapted
the above and got it working.
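
For reference, an untested sketch of that crmsh adaptation, using the same 
attribute and resource names as in the example above:

    crm node attribute server1 set city LA
    crm node attribute server2 set city NY
    crm configure location DummyRes1-only-LA DummyRes1 rule -inf: city ne LA
    crm configure location DummyRes2-only-NY DummyRes2 rule -inf: city ne NY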

Unfortunately, there is a problem.

My current setup is:

One 3-machine cluster in city A running a bunch of resources between them, the
most important of which for this discussion is Asterisk telephony.

One 3-machine cluster in city B doing exactly the same thing.

The two clusters have no knowledge of each other.

I have high-availability routing between my clusters and my upstream telephony
provider, such that a call can be handled by Cluster A or Cluster B, and if
one is unavailable, the call gets routed to the other.

Thus, a total failure of Cluster A means I still get phone calls, via Cluster
B.


To implement the above "one resource which can run anywhere, but only a single
instance", I joined together clusters A and B, and placed the corresponding
location constraints on the resources I want only at A and the ones I want
only at B.  I then added the resource with no location constraint, and it runs
anywhere, just once.

So far, so good.


The problem is:

With the two independent clusters, if two machines in city A fail, then
Cluster A fails completely (no quorum), and Cluster B continues working.  That
means I still get phone calls.

With the new setup, if two machines in city A fail, then _both_ clusters stop
working and I have no functional resources anywhere.


So, my question now is:

How can I have a 3-machine Cluster A running local resources, and a 3-machine
Cluster B running local resources, plus one resource running on either Cluster
A or Cluster B, but without a failure of one cluster causing _everything_ to
stop?


Yes, it's called geo-clustering (multi-site) - 
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/configuring_and_managing_high_availability_clusters/assembly_configuring-multisite-cluster-configuring-and-managing-high-availability-clusters


(ignore the doc being for RHEL; other distributions with booth will work the 
same way)


Regards,
  Honza




Thanks,


Antony.



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] QDevice vs 3rd host for majority node quorum

2021-07-15 Thread Jan Friesse

On 15/07/2021 10:09, Jehan-Guillaume de Rorthais wrote:

Hi all,

On Tue, 13 Jul 2021 19:55:30 + (UTC)
Strahil Nikolov  wrote:


In some cases the third location has a single IP and it makes sense to use it
as QDevice. If it has multiple network connections to that location - use a
full blown node .


By the way, what's the point of multiple rings in corosync when we can setup
bonding or teaming on OS layer?


The main points of RRP were:
- Bonding/teaming is not possible everywhere (not supported by switches, 
...)

- To have a checkmark (other products have a redundant ring available)

Knet is a much-improved RRP plus some features which weren't implemented before 
(like PMTU).




I remember some times ago bonding was recommended over corosync rings, because
the totem protocol on multiple rings wasn't as flexible than bonding/teaming


RRP was not recommended mostly because it was fundamentally broken.


and multiple rings was only useful to corosync/pacemaker where bonding was
useful for all other services on the server.

...But that was before the knet era. Did it changed?


There is nozzle, which creates a tun device, so it is possible for other 
services to use the "corosync/knet" network.


Regards,
  Honza



Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] heartbeat rings questions

2021-07-14 Thread Jan Friesse

On 13/07/2021 19:07, Kiril Pashin wrote:

Hi , thanks for the quick reply.
I have a couple of follow up questions below in blue below

 On 12/07/2021 23:27, Kiril Pashin wrote:
  > Hi ,
  > is it valid to use the same network adapter interface on the same host 
to be
  > part of multiple
  > heart beat rings ?

 There should be no problem from technical side, but I wouldn't call this
 use case "valid". Idea of multiple rings is to have multiple independent
 connections between nodes - something what one nic simply doesn't provide.

  > The scenario is hostA has eth0 ( ip 192.10.10.1 ) interface and hostB 
has
 eth0 (
  > 192.20.20.1 ) and eth1 ( 192.20.20.2 ) .

 This is unsupported configuration
 Can you clarify here why this is unsupported, you mentioned above that 
there
 should be no problem from a technical side ?

  > Are there any restrictions to form two heartbeat rings { eth0, eth0 } 
and {
  > eth0, eth1 }

 Technically only restriction is based on IP. But to make it reliable one
 should use multiple NICs with multiple links and multiple switches.

I understand that it's not as reliable as using multiple NICs; you mention the
only restriction is based on IP. Is there a doc
mentioning such restrictions?


Closest is probably https://access.redhat.com/articles/3068841

Regards,
  Honza





  > as well as create a nozzle device to be able to ping hostA in case eth0
 or eth1
  > go down on hostB

 It is definitively possible to create noozle device which will allow to
 ping hostA in case some of nic fails, but not in the way described in
 config snip. Noozle device should have different IP subnet (noozle is
 basically yet another network card).
 I made up the example in thisemail, however , i assumed all the IPs in the
 examle are
 are using 24 bits for the subnet netmaskso the nozzle is on a different
 subnet 192.168.10
 Can you clarify the problem with my config snip ?

  > nodelist {
  >   node {
  >   ring0_addr: 192.10.10.1
  >   ring1_addr: 192.10.10.1
  >   name: node1
  >   nodeid: 1
  >   }
  >   node {
  >   ring0_addr: 192.20.20.1
  >   ring1_addr: 192.20.20.2
  >   name: node2
  >   nodeid: 2
  >   }
  > }
  > nozzle {
  >   name: noz01
  >   ipaddr: 192.168.10.0
  >   ipprefix: 24
  > }

 This config will definitively not work.

 Regards,
 Honza

  > Thanks,
  > Kiril Pashin
  > DB2 Purescale Development & Support
  > kir...@ca.ibm.com
  >
  >
  >
  > ___
  > Manage your subscription:
  > https://lists.clusterlabs.org/mailman/listinfo/users
 
  >
  > ClusterLabs home: https://www.clusterlabs.org/
 
  >





___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] heartbeat rings questions

2021-07-13 Thread Jan Friesse

On 12/07/2021 23:27, Kiril Pashin wrote:

Hi ,
is it valid to use the same network adapter interface on the same host to be
part of multiple
heart beat rings ?


There should be no problem from the technical side, but I wouldn't call this 
use case "valid". The idea of multiple rings is to have multiple independent 
connections between nodes - something that one NIC simply doesn't provide.



The scenario is hostA has eth0 ( ip 192.10.10.1 ) interface and hostB has eth0 (
192.20.20.1 ) and eth1 ( 192.20.20.2 ) .


This is an unsupported configuration.


Are there any restrictions to form two heartbeat rings { eth0, eth0 } and {
eth0, eth1 }


Technically the only restriction is based on IP. But to make it reliable, one 
should use multiple NICs with multiple links and multiple switches.



as well as create a nozzle device to be able to ping hostA in case eth0 or eth1
go down on hostB


It is definitely possible to create a nozzle device which will allow you to 
ping hostA in case one of the NICs fails, but not in the way described in the 
config snippet. The nozzle device should have a different IP subnet (nozzle is 
basically yet another network card).



nodelist {
  node {
  ring0_addr: 192.10.10.1
  ring1_addr: 192.10.10.1
  name: node1
  nodeid: 1
  }
  node {
  ring0_addr: 192.20.20.1
  ring1_addr: 192.20.20.2
  name: node2
  nodeid: 2
  }
}
nozzle {
  name: noz01
  ipaddr: 192.168.10.0
  ipprefix: 24
}


This config will definitely not work.

Regards,
  Honza


Thanks,
Kiril Pashin
DB2 Purescale Development & Support
kir...@ca.ibm.com



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Updating quorum configuration without restarting cluster

2021-06-20 Thread Jan Friesse

Gerry,


Dear community,

I would like to ask few questions regarding Corosync/Pacemaker quorum 
configuration.

When updating the Corosync's quorum configuration I added last_man_standing, and
auto_tie_breaker in corosync.conf on all hosts and refreshed with
'corosync-cfgtool -R'.
Note that that man page included with the rpm says that the -R option with "Tell
all instances of corosync in this cluster to reload corosync.conf."

Next I run 'corosync-quorumtool -s', but it did not show the new quorum flags
for auto tiebreaker and last man standing.

Once I restarted the corosync cluster, the auto tiebreaker flags and last man
standing flags appeared in the corosync-quorumtool output as I expected.

So my questions are:
1. Does corosync-quorumtool actually shows the active quorum configuration? If
not how can I query the active quorum config?


Yes, corosync-quorumtool shows the quorum configuration which is really in use 
(it's actually the only source of truth; cmap is not).




2. Is it possible to update the quorum configuration without restarting the 
cluster?


Partly.

Basically only quorum.two_node and quorum.expected_votes are changeable 
during runtime. Other options like:

- quorum.allow_downscale
- quorum.wait_for_all
- quorum.last_man_standing
- quorum.auto_tie_breaker
- quorum.auto_tie_breaker_node

are not (wait_for_all is a little bit more complicated - when not 
explicitly set/unset it follows two_node so it is possible, but only in 
this special case, to change it via changing two_node).
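
For example, expected_votes can be adjusted on a running cluster (the value is 
just an example; see corosync-quorumtool(8)):

    corosync-quorumtool -e 3   # change expected votes at runtime
    corosync-quorumtool -s     # verify the active quorum configuration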


Regards,
  Honza

btw. I've already replied to Janghyuk Boo so mostly copying same answer 
also here.




Thank you,
Gerry Sommerville
E-mail: ge...@ca.ibm.com 



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Corosync 3.1.4 is available at corosync.org!

2021-06-03 Thread Jan Friesse

I am pleased to announce the latest maintenance release of Corosync
3.1.4 available immediately from GitHub release section at 
https://github.com/corosync/corosync/releases or our website at

http://build.clusterlabs.org/corosync/releases/.

This release contains an important bugfix in the cmap stats map, where an iterate 
operation could result in a corosync crash.


Complete changelog for 3.1.4:

Christine Caulfield (1):
  stats: fix crash when iterating over deleted keys

Jan Friesse (1):
  man: Add note about single node configuration

Upgrade is highly recommended.

Thanks/congratulations to all people that contributed to achieve this 
great milestone.


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Corosync 3.1.3 is available at corosync.org!

2021-05-21 Thread Jan Friesse

I am pleased to announce the latest maintenance release of Corosync
3.1.3 available immediately from GitHub release section at 
https://github.com/corosync/corosync/releases or our website at

http://build.clusterlabs.org/corosync/releases/.

This release contains mostly smaller bugfixes and one small feature. 
It's now possible to run `corosync -v` to get a list of the supported crypto 
and compression models which can be used in `corosync.conf`.


Complete changelog for 3.1.3:

Ferenc Wágner (1):
  man: corosync-cfgtool.8: use proper single quotes

    Jan Friesse (8):
  config: Properly check crypto and compress models
  totemconfig: Ensure strncpy is always terminated
  main: Mark crypto_model key read only
  main: Add support for cgroup v2
  cfg: corosync_cfg_trackstop blocks forever
  man: Add info about cgroup v2 behavior
  Revert "man: Add info about cgroup v2 behavior"
  Revert "main: Add support for cgroup v2"

Upgrade is highly recommended.

Thanks/congratulations to all people that contributed to achieve this 
great milestone.


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] 32 nodes pacemaker cluster setup issue

2021-05-19 Thread Jan Friesse

S Sathish S:


Hi Klaus,

pacemaker/corosync we generated our own build from clusterlab source code.

[root@node1 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.4 (Maipo)
[root@node1 ~]# uname -r
3.10.0-693.82.1.el7.x86_64
[root@node1 ~]# rpm -qa | grep -iE 'pacemaker|corosync|pcs'
pcs-0.9.169-1.el7.x86_64
pacemaker-2.0.2-2.el7.x86_64
corosync-2.4.4-2.el7.x86_64


Any specific reason to use 2.4.4? Especially because it is a home-made 
build, I would recommend trying the current needle branch. I don't think it will 
help, but it is probably worth a try...


Honza



Thanks and Regards,
S Sathish S





___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Antw: [EXT] Corosync 3.1.1 is available at corosync.org!

2021-04-06 Thread Jan Friesse

Ulrich,


Jan Friesse  wrote on 31.03.2021 at 15:16 in message
<8f611847-e341-b51b-49c9-fd9ef29fb...@redhat.com>:

I am pleased to announce the latest maintenance release of Corosync
3.1.1 available immediately from GitHub release section at
https://github.com/corosync/corosync/releases or our website at
http://build.clusterlabs.org/corosync/releases/.

This release contains important bug fixes and also a few new features:

   - Added a new API to get extended node/link information which is now
used by corosync-cfgtool (especially when the new -n parameter is used).
See corosync-cfgtool(8) for more information.


This makes me wonder: "Good old ping" had always been able to display the
delay between request and response.
As the message delay (token round-trip time) is also critical in corosync, I
wonder why the current delay cannot be displayed by corosync-cfgtool for
example. My expectation is that the delay will go up significantly when load


There is latency stored in the stats map (corosync-cmapctl -m stats), so I 
can imagine displaying that in the corosync-cfgtool output.


increases, and maybe manual "re-tuning" could be done before the infamous
"retransmit timeout" kicks in.


Yup (actually ideally automatic but that's long term task).

Honza





   - The API for cfg tracking was fixed so it now working again. This API
was removed during the Corosync 2 development cycle and it means that
shutdown tracking (using cfg_try_shutdown()) had stopped working.

Now it is back, so an application may register to receive a callback
when corosync is stopped cleanly (using corosync-cfgtool -H) and can
also prohibit corosync from stopping.

There is a new --force option (which is now the default in the systemd
unit file/init script) which will always cause Corosync to shut down.

This feature is going to be used by an as-yet unreleased Pacemaker
version (currently in the master branch) to veto corosync shutdown.

Complete changelog for 3.1.1:

  Christine Caulfield (4):
cfg: New API to get extended node/link infomation
cfg: Reinstate cfg tracking
test: Add testcfg to exercise some cfg functions
main: Close race condition when moving to statedir

  Dan Streetman (2):
main: Check memlock rlimit
totemknet: retry knet_handle_new if it fails

  Fabio M. Di Nitto (5):
pkgconfig: export LOGDIR in corosync.pc
configure: detect and init pkg-config with macro
configure: drop dead code
configure: move exec_prefix sanitize
configure: drop unnecessary check and define

  Ferenc Wágner (1):
The ring id file needn't be executable

  Jan Friesse (4):
spec: Add isa version of corosync-devel provides
totemknet: Check both cipher and hash for crypto
cfg: Improve nodestatusget versioning
init: Use corosync-cfgtool for shutdown

  Johannes Krupp (1):
totemconfig: fix integer underflow and logic bug

  liangxin1300 (1):
totemconfig: change udp netmtu value as a constant


Upgrade is highly recommended.

Thanks/congratulations to all people that contributed to achieve this
great milestone.

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/






[ClusterLabs] Corosync 3.1.2 is available at corosync.org!

2021-04-06 Thread Jan Friesse

I am pleased to announce the latest maintenance release of Corosync
3.1.2 available immediately from GitHub release section at 
https://github.com/corosync/corosync/releases or our website at

http://build.clusterlabs.org/corosync/releases/.

This release contains only one (but very important) bug fix, which fixes a 
problem with the initialization of knet compression. The bug has been present 
since Corosync 3.1.0 (3.0.X is not affected) and causes memory to be overwritten 
when knet compression is enabled (knet_compression_model is set to a value 
other than none), which (usually) makes corosync crash on start.


Complete changelog for 3.1.2:

Fabio M. Di Nitto (1):
  knet: pass correct handle to knet_handle_compress

Upgrade is highly recommended.

Thanks/congratulations to all people that contributed to achieve this
great milestone.

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Corosync 3.1.1 is available at corosync.org!

2021-03-31 Thread Jan Friesse

I am pleased to announce the latest maintenance release of Corosync
3.1.1 available immediately from GitHub release section at 
https://github.com/corosync/corosync/releases or our website at

http://build.clusterlabs.org/corosync/releases/.

This release contains important bug fixes and also a few new features:

 - Added a new API to get extended node/link information which is now 
used by corosync-cfgtool (especially when the new -n parameter is used). 
See corosync-cfgtool(8) for more information.


 - The API for cfg tracking was fixed so it is now working again. This API 
was removed during the Corosync 2 development cycle and it means that 
shutdown tracking (using cfg_try_shutdown()) had stopped working.


Now it is back, so an application may register to receive a callback 
when corosync is stopped cleanly (using corosync-cfgtool -H) and can 
also prohibit corosync from stopping.


There is a new --force option (which is now the default in the systemd 
unit file/init script) which will always cause Corosync to shut down.


This feature is going to be used by an as-yet unreleased Pacemaker 
version (currently in the master branch) to veto corosync shutdown.
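
For completeness, a small sketch of how the two shutdown modes look from the 
command line (based on the description above; see corosync-cfgtool(8) for the 
authoritative details):

  corosync-cfgtool -H          # clean shutdown; applications registered via the
                               # cfg tracking API (e.g. a new enough Pacemaker)
                               # may veto it
  corosync-cfgtool -H --force  # unconditional shutdown; this is what the
                               # systemd unit file/init script now uses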


Complete changelog for 3.1.1:

Christine Caulfield (4):
  cfg: New API to get extended node/link infomation
  cfg: Reinstate cfg tracking
  test: Add testcfg to exercise some cfg functions
  main: Close race condition when moving to statedir

Dan Streetman (2):
  main: Check memlock rlimit
  totemknet: retry knet_handle_new if it fails

Fabio M. Di Nitto (5):
  pkgconfig: export LOGDIR in corosync.pc
  configure: detect and init pkg-config with macro
  configure: drop dead code
  configure: move exec_prefix sanitize
  configure: drop unnecessary check and define

Ferenc Wágner (1):
  The ring id file needn't be executable

    Jan Friesse (4):
  spec: Add isa version of corosync-devel provides
  totemknet: Check both cipher and hash for crypto
  cfg: Improve nodestatusget versioning
  init: Use corosync-cfgtool for shutdown

Johannes Krupp (1):
  totemconfig: fix integer underflow and logic bug

liangxin1300 (1):
  totemconfig: change udp netmtu value as a constant


Upgrade is highly recommended.

Thanks/congratulations to all people that contributed to achieve this
great milestone.

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Corosync - qdevice not voting

2021-03-19 Thread Jan Friesse

Marcelo,


Hello.

I have configured corosync with 2 nodes and added a qdevice to help with
the quorum.

On node1 I added firewall rules to block connections from node2 and the
qdevice, trying to simulate a network issue.


Just please make sure to block both incoming and outgoing packets. 
Qdevice will handle blocking of just one direction well (because of TCP), 
and corosync 3.x with knet will too, but corosync 2.x has a big problem with 
"asymmetric" blocking. Also, the config suggests that multicast is used - 
please make sure to block multicast as well in that case.
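
For example, a minimal sketch with iptables (the node and qdevice addresses are 
placeholders, taken from the masked config below; adjust to your setup):

  iptables -A INPUT  -s <node2-ip> -j DROP
  iptables -A OUTPUT -d <node2-ip> -j DROP
  iptables -A INPUT  -s <qdevice-host-ip> -j DROP
  iptables -A OUTPUT -d <qdevice-host-ip> -j DROP
  # and the totem multicast group from the config
  iptables -A INPUT  -d 239.255.43.2 -j DROP
  iptables -A OUTPUT -d 239.255.43.2 -j DROP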




The problem I'm having is that one node1 I can see it dropping the
service (the IP), but on node2 it never gets the IP, it is like the qdevice
is not voting.

This is my corosync.conf:

totem {
 version: 2
 cluster_name: cluster1
 token: 3000
 token_retransmits_before_loss_const: 10
 clear_node_high_bit: yes
 crypto_cipher: none
 crypto_hash: none
}
 interface {
 ringnumber: 0
 bindnetaddr: X.X.X.X
 mcastaddr: 239.255.43.2
 mcastport: 5405
 ttl: 1
 }
 nodelist{
 node {
 ring0_addr: X.X.X.2
 name: node1.domain.com
 nodeid: 2
 }
 node {
 ring0_addr: X.X.X.3
 name: node2.domain.com
 nodeid: 3
 }
 }

logging {
 to_logfile: yes
 logfile: /var/log/cluster/corosync.log
 to_syslog: yes
}

#}

quorum {
   provider: corosync_votequorum
   device {
 votes: 1
 model: net
 net {
   tls: off
   host: qdevice.domain.com
   algorithm: lms
 }
 heuristics {
   mode: on
   exec_ping: /usr/bin/ping -q -c 1 "qdevice.domain.com"
 }
   }
}


I'm getting this on the qdevice host (before adding the firewall rules), so
looks like the cluster is properly configured:

pcs qdevice status net --full



Correct. What is the status after blocking is enabled?


QNetd address: *:5403
TLS: Supported (client certificate required)
Connected clients: 2
Connected clusters: 1
Maximum send/receive size: 32768/32768 bytes
Cluster "cluster1":
 Algorithm: LMS
 Tie-breaker: Node with lowest node ID
 Node ID 3:
 Client address: :::X.X.X.3:59746
 HB interval: 8000ms
 Configured node list: 2, 3
 Ring ID: 2.95d
 Membership node list: 2, 3
 Heuristics: Pass (membership: Pass, regular: Undefined)
 TLS active: No
 Vote: ACK (ACK)
 Node ID 2:
 Client address: :::X.X.X.2:33944
 HB interval: 8000ms
 Configured node list: 2, 3
 Ring ID: 2.95d
 Membership node list: 2, 3
 Heuristics: Pass (membership: Pass, regular: Undefined)
 TLS active: No
 Vote: ACK (ACK)

These are partial logs on node2 after activating the firewall rules on
node1. These logs repeats all the time until I remove the firewall rules:

Mar 18 12:48:56 [7202] node2.domain.com stonith-ng: info: crm_cs_flush:
Sent 0 CPG messages  (1 remaining, last=16): Try again (6)
Mar 18 12:48:56 [7201] node2.domain.comcib: info: crm_cs_flush:
Sent 0 CPG messages  (2 remaining, last=87): Try again (6)
Mar 18 12:48:56 [7202] node2.domain.com stonith-ng: info: crm_cs_flush:
Sent 0 CPG messages  (1 remaining, last=16): Try again (6)
Mar 18 12:48:56 [7185] node2.domain.com pacemakerd: info: crm_cs_flush:
Sent 0 CPG messages  (1 remaining, last=13): Try again (6)
[7177] node2.domain.com corosyncinfo[VOTEQ ] waiting for quorum device
Qdevice poll (but maximum for 3 ms)
[7177] node2.domain.com corosyncnotice  [TOTEM ] A new membership
(X.X.X.3:2469) was formed. Members


^^ This is weird. I'm pretty sure something is broken with the way the 
packets are blocked (or the log is incomplete).



[7177] node2.domain.com corosyncwarning [CPG   ] downlist left_list: 0
received
[7177] node2.domain.com corosyncwarning [TOTEM ] Discarding JOIN message
during flush, nodeid=3
Mar 18 12:48:56 [7201] node2.domain.comcib: info: crm_cs_flush:
Sent 0 CPG messages  (2 remaining, last=87): Try again (6)
Mar 18 12:48:56 [7202] node2.domain.com stonith-ng: info: crm_cs_flush:
Sent 0 CPG messages  (1 remaining, last=16): Try again (6)
Mar 18 12:48:56 [7185] node2.domain.com pacemakerd: info: crm_cs_flush:
Sent 0 CPG messages  (1 remaining, last=13): Try again (6)
Mar 18 12:48:56 [7201] node2.domain.comcib: info: crm_cs_flush:
Sent 0 CPG messages  (2 remaining, last=87): Try again (6)
Mar 18 12:48:56 [7185] node2.domain.com pacemakerd: info: crm_cs_flush:
Sent 0 CPG messages  (1 remaining, last=13): Try again (6)


If it repeats over and over again then it's 99.9% because of the way the 
packets are blocked.




Also on node2:

pcs quorum status
Error:

Re: [ClusterLabs] maximum token value (knet)

2021-03-15 Thread Jan Friesse

Strahil,

I will try to get into the details on Monday, when I have access to the cluster again. I guess the 


Perfect, thanks

/var/log/cluster/corosync.log and /etc/corosync/corosync.conf are the 
most interesting.


Yup, for now these are the most important.

So far, I have a 6-node cluster with separate VLANs for HANA replication, prod and backup. Initially, I used pcs to create the corosync.conf with 2 IPs per node, token 40000, consensus 48000 and 
wait_for_all=1. Later I expanded the cluster to 3 links and added 
qnet to the setup (only after I made it run (token 29000) ), so I'm 
ruling it out. I updated the cluster nodes from RHEL 8.1 to 8.2, removed 
the consensus and enabled debug.

As knet is using udp by default, and because the problem is hitting me both in 
udp (default settings) and sctp - the problem is not in the protocol.


It is interesting. So the question is whether the problem is the number of links 
or the number of nodes (or if it somehow sums together).



I've also enabled pacemaker blackbox, although I doubt that has any effect on 
corosync.
How can I enable trace logs for corosync only ?


Nope, no effect. Corosync can dump its own blackbox by running 
corosync-blackbox command.


Trace is enabled by setting logging.debug to the "trace" value (so where you 
have debug: on, you just set debug: trace).
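
In corosync.conf terms that is just (a minimal sketch; only the debug value 
changes compared to a usual debug-enabled logging section):

  logging {
      to_logfile: yes
      logfile: /var/log/cluster/corosync.log
      to_syslog: yes
      debug: trace
  }

then reload the config (corosync-cfgtool -R) or restart corosync.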


Regards,
  Honza


Best Regards,Strahil Nikolov

  
  
   On Fri, Mar 12, 2021 at 17:01, Jan Friesse wrote:   Strahil,



Interesting...
Yet, this doesn't explain why a token of 30000 causes the nodes to never assemble 
a cluster (waiting for half an hour, using wait_for_all=1), while setting it 
to 29000 works like a charm.


Definitively.

Could you please provide a bit more info about your setup
(config/logs/how many nodes cluster has/...)? Because I've just briefly
tested two nodes setup with 30 sec token timeout and it was working
perfectly fine.



Thankfully we got an RH subscription, so RH devs will provide more detailed output 
on the issue.


As Jehan correctly noted if it would really get to RH devs it would
probably get to me ;) But before that GSS will take care of checking
configs/hw/logs/... and they are really good in finding problems with
setup/hw/...



I was hoping that I had just missed something in the documentation about the maximum token size...


Nope.

No matter what, if you can send config/logs/... we may try to find out
what is root of the problem here on ML or you can really try GSS, but as
Jehan told, it would be nice if you can post result so other people (me
included) knows what was the main problem.

Thanks and regards,
   Honza



Best Regards,
Strahil Nikolov






On Thursday, 11 March 2021 at 19:12:58 GMT+2, Jan Friesse 
 wrote:





Strahil,

Hello all,
I'm building a test cluster on RHEL8.2 and I have noticed that the cluster 
fails to assemble (nodes stay inquorate as if the network is not working) if I 
set the token at 30000 or more (30s+).


Knet waits for enough pong replies from other nodes before it marks them
as alive and starts sending/receiving packets from them. By default it
needs to receive 2 pongs and a ping is sent 4 times per token timeout, so it
means 15 sec until a node is considered up with a 30 sec token timeout.


What is the maximum token value with knet? On SLES12 (I think it was corosync 
1), I used to set the token/consensus to far greater values on some of our 
clusters.


I'm really not aware about any arbitrary limits.



Best Regards,Strahil Nikolov



Regards,

     Honza












   



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] maximum token value (knet)

2021-03-12 Thread Jan Friesse

Strahil,


Interesting...
Yet, this doesn't explain why a token of 30000 causes the nodes to never assemble 
a cluster (waiting for half an hour, using wait_for_all=1), while setting it 
to 29000 works like a charm.


Definitively.

Could you please provide a bit more info about your setup 
(config/logs/how many nodes cluster has/...)? Because I've just briefly 
tested two nodes setup with 30 sec token timeout and it was working 
perfectly fine.




Thankfully we got an RH subscription, so RH devs will provide more detailed output 
on the issue.


As Jehan correctly noted if it would really get to RH devs it would 
probably get to me ;) But before that GSS will take care of checking 
configs/hw/logs/... and they are really good in finding problems with 
setup/hw/...




I was hoping that I had just missed something in the documentation about the maximum token size...


Nope.

No matter what, if you can send config/logs/... we may try to find out 
what is root of the problem here on ML or you can really try GSS, but as 
Jehan told, it would be nice if you can post result so other people (me 
included) knows what was the main problem.


Thanks and regards,
  Honza



Best Regards,
Strahil Nikolov






On Thursday, 11 March 2021 at 19:12:58 GMT+2, Jan Friesse 
 wrote:





Strahil,

Hello all,
I'm building a test cluster on RHEL8.2 and I have noticed that the cluster 
fails to assemble (nodes stay inquorate as if the network is not working) if I 
set the token at 30000 or more (30s+).


Knet waits for enough pong replies from other nodes before it marks them
as alive and starts sending/receiving packets from them. By default it
needs to receive 2 pongs and a ping is sent 4 times per token timeout, so it
means 15 sec until a node is considered up with a 30 sec token timeout.


What is the maximum token value with knet? On SLES12 (I think it was corosync 
1), I used to set the token/consensus to far greater values on some of our 
clusters.


I'm really not aware about any arbitrary limits.



Best Regards,Strahil Nikolov



Regards,

   Honza




___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/










Re: [ClusterLabs] Antw: [EXT] Re: maximum token value (knet)

2021-03-12 Thread Jan Friesse

Ulrich,


Jan Friesse  wrote on 11.03.2021 at 18:12 in message

:

Strahil,

Hello all,
I'm building a test cluster on RHEL8.2 and I have noticed that the cluster

fails to assemble ( nodes stay inquorate as if the network is not working) if
I set the token at 3 or more (30s+).



Hi!

I know you will be bored when I say this, but anyway:
In old HP Service Guard the node connectivity was checked with ping/pong too, and you could specify the interval and the number of lost responses that declare a node unreachable. The good thing (as 


Yes, and this is exactly what 
knet_ping_interval/knet_ping_timeout/knet_pong_count are doing :) What I 
was describing were the default timeouts.
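
For reference, these can also be set explicitly in the totem section of 
corosync.conf; a minimal sketch (the values below are made up for illustration - 
by default corosync derives them from the token timeout, see corosync.conf(5)):

  totem {
      version: 2
      token: 30000
      knet_ping_interval: 1000
      knet_ping_timeout: 2000
      knet_pong_count: 2
      ...
  }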


opposed to the TOTEM protocol I know) was that single missed responses 
were logged, so you did not


Yup, missed responses are logged.

just have an OK/BAD status, but also an indicator how far you are away 
from BAD status.


So you have a token timeout of 30s, and we had 3 lost responses at an interval 
of 7 seconds (at that time the 100Mb NIC needed about 5 seconds to renegotiate 
after a link failure (like unplug/replug). Recent hard- and software is 
somewhat faster AFAIK.


Indeed. But sadly, recent hardware and software quite often run in a 
VM on heavily overprovisioned bare metal, so bad things happen; 
that's why token timeouts are (by default) set to quite big numbers.


Regards,
  Honza




Regards,
Ulrich



Knet waits for enough pong replies for other nodes before it marks them
as alive and starts sending/receiving packets from them. By default it
needs to receive 2 pongs and ping is sent 4 times in token timeout so it
means 15 sec until node is considered up for 30 sec token timeout.


What is the maximum token value with knet? On SLES12 (I think it was

corosync 1), I used to set the token/consensus to far greater values on
some of our clusters.

I'm really not aware about any arbitrary limits.


Best Regards,Strahil Nikolov



Regards,
Honza




___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





Re: [ClusterLabs] maximum token value (knet)

2021-03-11 Thread Jan Friesse

Strahil,

Hello all,
I'm building a test cluster on RHEL8.2 and I have noticed that the cluster 
fails to assemble (nodes stay inquorate as if the network is not working) if I 
set the token at 30000 or more (30s+).


Knet waits for enough pong replies from other nodes before it marks them 
as alive and starts sending/receiving packets from them. By default it 
needs to receive 2 pongs and a ping is sent 4 times per token timeout, so it 
means 15 sec until a node is considered up with a 30 sec token timeout.



What is the maximum token value with knet? On SLES12 (I think it was corosync 
1), I used to set the token/consensus to far greater values on some of our 
clusters.


I'm really not aware about any arbitrary limits.


Best Regards,Strahil Nikolov



Regards,
  Honza




___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





Re: [ClusterLabs] Antw: [EXT] Feedback wanted: OCF Resource Agent API 1.1 proposed for adoption

2021-03-10 Thread Jan Friesse

Ulrich Windl napsal(a):

Ken Gaillot  wrote on 10.03.2021 at 00:07 in

message
:

Hi all,

After many false starts over the years, we finally have a proposed 1.1
version of the resource agent standard.

Discussion is invited here and/or on the pull request:

  https://github.com/ClusterLabs/OCF‑spec/pull/24

One goal is to formalize widespread existing practices that deviate
from the 1.0 standard, such as the notify, promote, and demote actions;
exit statuses 8, 9, 190, and 191; and allowing installers to choose
where agents are installed (officially /usr/ocf/resource.d in 1.0, even
though everyone actually uses /usr/lib/ocf/resource.d).

Another goal is to add optional new meta‑data hints that user
interfaces can benefit from, such as whether a parameter is required or
deprecated.


What I always was wondering was line-lengths for metadata descriptions:
To wrap, or not to wrap?



The new standard deprecates the "unique" descriptor for parameters,
which was misused by Pacemaker, and replaces it with two new ones,
"reloadable" (to handle what Pacemaker used it for) and "unique‑group"
(to handle its original purpose more flexibly). A new "reload‑params"
action updates any "reloadable" parameters.

The last major change is completing the transition away from
master/slave terminology, renaming the roles to promoted/unpromoted.


I'm worried about all those books describing master/slave flip flops... ;-)
And all those students having a "master"...
I have my own opinion on this:
How many people were harmed by those names, and how many will benefit from
using different words for the same concept?


It is closer to newspeak https://en.wikipedia.org/wiki/Newspeak 
(specially un- prefix) and that seems to be a doubleplusgood thing.







The changes are designed to be backward‑compatible, so for the most
part, agents and software written to either standard can be used with
each other. However for agents that support promote/demote (which were
not part of 1.0), it is recommended to use 1.1 agents only with
software that explicitly supports 1.1. Once the 1.1 standard is
adopted, we intend to update all ClusterLabs software to support it.

The pull request description has a more detailed summary of all the
changes, and the standard itself can be compared with:



https://github.com/ClusterLabs/OCF‑spec/blob/master/ra/1.0/resource‑agent‑api.m


d



https://github.com/kgaillot/OCF‑spec/blob/ocf1.1/ra/1.1/resource‑agent‑api.md



My goal is to merge the pull request formally adopting 1.1 by the end
of this month.
‑‑
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/






Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-03-03 Thread Jan Friesse

Eric,


-Original Message-
From: Users  On Behalf Of Jan Friesse
Sent: Monday, March 1, 2021 3:27 AM
To: Cluster Labs - All topics related to open-source clustering welcomed


...


ha1 lost connection to qnetd so it gives up all hope immediately. ha2
retains connection to qnetd so it waits for final decision before
continuing.



Thanks for digging into logs. I believe Eric is hitting
https://github.com/corosync/corosync-qdevice/issues/10 (already fixed, but
may take some time to get into distributions) - it also contains workaround.

Honza



Reading through that linked thread, it seems that quorum timeouts are tricky to get right. I made some changes over the weekend and increased my token timeout to 5000. Are there other timeouts I 


Token timeout really doesn't help because it doesn't affect quorum timeout.

Please follow changes as described in
https://github.com/ClusterLabs/sbd/pull/76#issuecomment-486952369 comment.


should adjust to make sure I don't run into a complicated race condition 
that causes weird/random failures due to mismatched or misaligned timeouts?


Nope, not really.

Honza





In your case apparently one node was completely disconnected for 15
seconds, then connectivity resumed. The second node was still waiting
for qdevice/qnetd decision. So it appears to work as expected.

Note that fencing would not have been initiated before timeout as well.
Fencing /may/ have been initiated after nodes established connection
again and saw that one resource failed to stop. This would
automatically resolve your issue. I need to think how to reproduce stop

failure.





___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-03-01 Thread Jan Friesse

Andrei,


On 01.03.2021 15:45, Jan Friesse wrote:

Andrei,


On 01.03.2021 12:26, Jan Friesse wrote:




Thanks for digging into logs. I believe Eric is hitting
https://github.com/corosync/corosync-qdevice/issues/10 (already fixed,
but may take some time to get into distributions) - it also contains
workaround.



I tested corosync-qnetd at df3c672 which should include these fixes. It
changed behavior, still I cannot explain it.

Again, ha1+ha2+qnetd, ha2 is current DC, I disconnect ha1 (block
everything with ha1 source MAC), stonith disabled. corosync and


So ha1 is blocked on both ha2 and qnetd and blocking is symmetric (I
mean, nothing is sent to ha1 and nothing is received from ha1)?



No, it is asymmetric. ha1 cannot *send* anything to ha2 or qnetd; it
should be able to *receive* from both.


That's a problem for corosync 2.x. Corosync 3.x with knet solves this by 
establishing a connection only when a node can both send and receive packets 
from other nodes, but udpu behavior is weird (on the corosync side) when it 
is possible to receive messages but not send them (or vice versa).


It also explains why there are multiple "waiting for qdevice" messages 
logged.


Could you please try to block both outgoing and incoming packets?




corosync-qdevice on nodes are still 2.4.5 if it matters.


Shouldn't really matter as long as both corosync-qdevice and
corosync-qnetd are version 3.0.1.



corosync-qdevice on nodes is still 2.4.5. corosync-qnetd on witness is
git snapshot from last November. I was not sure I could mix corosync and
corosync-qdevice of different versions and looking at git commit all


It is (or should be) possible. I was testing this scenario (old qnetd + 
new qdevice and old qdevice + new qnetd) before releasing 3.0.1 (not 
extensively though, so there can be some bugs which I haven't spotted).



changes seem to be in qnetd anyway.


True



...



That's a bit harder to explain but it has a reason.



OK, thank you.
...


No matter what, are you able to provide a step-by-step reproducer of
that 40 sec delay?


No. As I said next time I tested I got entirely different timing. I will
try after cold boot again.



Perfect, thanks.

Regards,
  Honza

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-03-01 Thread Jan Friesse

Andrei,


On 01.03.2021 12:26, Jan Friesse wrote:




Thanks for digging into logs. I believe Eric is hitting
https://github.com/corosync/corosync-qdevice/issues/10 (already fixed,
but may take some time to get into distributions) - it also contains
workaround.



I tested corosync-qnetd at df3c672 which should include these fixes. It
changed behavior, still I cannot explain it.

Again, ha1+ha2+qnetd, ha2 is current DC, I disconnect ha1 (block
everything with ha1 source MAC), stonith disabled. corosync and


So ha1 is blocked on both ha2 and qnetd and blocking is symmetric (I 
mean, nothing is sent to ha1 and nothing is received from ha1)?



corosync-qdevice on nodes are still 2.4.5 if it matters.


Shouldn't really matter as long as both corosync-qdevice and 
corosync-qnetd are version 3.0.1.




ha2:

ar 01 13:23:27 ha2 corosync[1576]:   [TOTEM ] A processor failed,
forming new configuration.
Mar 01 13:23:28 ha2 corosync[1576]:   [VOTEQ ] waiting for quorum device
Qdevice poll (but maximum for 3 ms)
Mar 01 13:23:28 ha2 corosync[1576]:   [TOTEM ] A new membership
(192.168.1.2:3632) was formed. Members left: 1
Mar 01 13:23:28 ha2 corosync[1576]:   [TOTEM ] Failed to receive the
leave message. failed: 1
Mar 01 13:23:28 ha2 corosync[1576]:   [CPG   ] downlist left_list: 1
received
Mar 01 13:23:28 ha2 pacemaker-based[2032]:  notice: Node ha1 state is
now lost
Mar 01 13:23:28 ha2 pacemaker-based[2032]:  notice: Purged 1 peer with
id=1 and/or uname=ha1 from the membership cache
Mar 01 13:23:28 ha2 pacemaker-attrd[2035]:  notice: Lost attribute
writer ha1
Mar 01 13:23:28 ha2 pacemaker-attrd[2035]:  notice: Node ha1 state is
now lost
Mar 01 13:23:28 ha2 pacemaker-attrd[2035]:  notice: Removing all ha1
attributes for peer loss
Mar 01 13:23:28 ha2 pacemaker-attrd[2035]:  notice: Purged 1 peer with
id=1 and/or uname=ha1 from the membership cache
Mar 01 13:23:28 ha2 pacemaker-fenced[2033]:  notice: Node ha1 state is
now lost
Mar 01 13:23:28 ha2 pacemaker-fenced[2033]:  notice: Purged 1 peer with
id=1 and/or uname=ha1 from the membership cache
Mar 01 13:23:28 ha2 pacemaker-controld[2037]:  warning: Stonith/shutdown
of node ha1 was not expected
Mar 01 13:23:28 ha2 pacemaker-controld[2037]:  notice: State transition
S_IDLE -> S_POLICY_ENGINE
Mar 01 13:23:33 ha2 pacemaker-controld[2037]:  notice: High CPU load
detected: 1.20
Mar 01 13:23:35 ha2 corosync[1576]:   [QUORUM] Members[1]: 2
Mar 01 13:23:35 ha2 corosync[1576]:   [MAIN  ] Completed service
synchronization, ready to provide service.
Mar 01 13:23:35 ha2 pacemaker-attrd[2035]:  notice: Recorded local node
as attribute writer (was unset)
Mar 01 13:23:35 ha2 pacemaker-controld[2037]:  notice: Node ha1 state is
now lost
Mar 01 13:23:35 ha2 pacemaker-controld[2037]:  warning: Stonith/shutdown
of node ha1 was not expected
Mar 01 13:23:36 ha2 pacemaker-schedulerd[2036]:  notice:  * Promote
p_drbd0:0(   Slave -> Master ha2 )
Mar 01 13:23:36 ha2 pacemaker-schedulerd[2036]:  notice:  * Start
p_fs_clust01 (   ha2 )
Mar 01 13:23:36 ha2 pacemaker-schedulerd[2036]:  notice:  * Start
p_mysql_001  (   ha2 )


So it is pretty fast to react (8 seconds)

ha1:

Mar 01 13:23:27 ha1 corosync[1552]:   [TOTEM ] A processor failed,
forming new configuration.
Mar 01 13:23:30 ha1 corosync[1552]:   [VOTEQ ] waiting for quorum device
Qdevice poll (but maximum for 3 ms)
Mar 01 13:23:30 ha1 corosync[1552]:   [TOTEM ] A new membership
(192.168.1.1:3640) was formed. Members left: 2
Mar 01 13:23:30 ha1 corosync[1552]:   [TOTEM ] Failed to receive the
leave message. failed: 2
Mar 01 13:23:30 ha1 corosync[1552]:   [CPG   ] downlist left_list: 1
received
Mar 01 13:23:30 ha1 pacemaker-attrd[1738]:  notice: Node ha2 state is
now lost
Mar 01 13:23:30 ha1 pacemaker-attrd[1738]:  notice: Removing all ha2
attributes for peer loss
Mar 01 13:23:30 ha1 pacemaker-attrd[1738]:  notice: Purged 1 peer with
id=2 and/or uname=ha2 from the membership cache
Mar 01 13:23:30 ha1 pacemaker-based[1735]:  notice: Node ha2 state is
now lost
Mar 01 13:23:30 ha1 pacemaker-based[1735]:  notice: Purged 1 peer with
id=2 and/or uname=ha2 from the membership cache
Mar 01 13:23:30 ha1 pacemaker-controld[1740]:  notice: Our peer on the
DC (ha2) is dead
Mar 01 13:23:30 ha1 pacemaker-controld[1740]:  notice: State transition
S_NOT_DC -> S_ELECTION
Mar 01 13:23:30 ha1 pacemaker-fenced[1736]:  notice: Node ha2 state is
now lost
Mar 01 13:23:30 ha1 pacemaker-fenced[1736]:  notice: Purged 1 peer with
id=2 and/or uname=ha2 from the membership cache
Mar 01 13:23:32 ha1 corosync[1552]:   [VOTEQ ] waiting for quorum device
Qdevice poll (but maximum for 3 ms)
Mar 01 13:23:32 ha1 corosync[1552]:   [TOTEM ] A new membership
(192.168.1.1:3644) was formed. Members
Mar 01 13:23:32 ha1 corosync[1552]:   [CPG   ] downlist left_list: 0
received
Mar 01 13:23:33 ha1 corosync[1552]:   [VOTEQ ] waiting for quorum device
Qdevice poll (but maximum for 3 ms)
Mar 01 13:23:3

Re: [ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

2021-03-01 Thread Jan Friesse




On 27.02.2021 22:12, Andrei Borzenkov wrote:

On 27.02.2021 17:08, Eric Robinson wrote:


I agree, one node is expected to go out of quorum. Still the question is, why 
didn't 001db01b take over the services? I just remembered that 001db01b has 
services running on it, and those services did not stop, so it seems that 
001db01b did not lose quorum. So why didn't it take over the services that were 
running on 001db01a?


That I cannot answer. I cannot reproduce it using similar configuration.


Hmm ... actually I can.

Two nodes ha1 and ha2 + qdevice. I blocked all communication *from* ha1
(to be precise - all packets with ha1 source MAC are dropped). This
happened around 10:43:45. Now look:

ha1 immediately stops all services:

Feb 28 10:43:44 ha1 corosync[3692]:   [TOTEM ] A processor failed,
forming new configuration.
Feb 28 10:43:47 ha1 corosync[3692]:   [VOTEQ ] waiting for quorum device
Qdevice poll (but maximum for 3 ms)
Feb 28 10:43:47 ha1 corosync[3692]:   [TOTEM ] A new membership
(192.168.1.1:2944) was formed. Members left: 2
Feb 28 10:43:47 ha1 corosync[3692]:   [TOTEM ] Failed to receive the
leave message. failed: 2
Feb 28 10:43:47 ha1 corosync[3692]:   [CPG   ] downlist left_list: 1
received
Feb 28 10:43:47 ha1 pacemaker-attrd[3703]:  notice: Node ha2 state is
now lost
Feb 28 10:43:47 ha1 pacemaker-attrd[3703]:  notice: Removing all ha2
attributes for peer loss
Feb 28 10:43:47 ha1 pacemaker-attrd[3703]:  notice: Purged 1 peer with
id=2 and/or uname=ha2 from the membership cache
Feb 28 10:43:47 ha1 pacemaker-based[3700]:  notice: Node ha2 state is
now lost
Feb 28 10:43:47 ha1 pacemaker-based[3700]:  notice: Purged 1 peer with
id=2 and/or uname=ha2 from the membership cache
Feb 28 10:43:47 ha1 pacemaker-controld[3705]:  warning: Stonith/shutdown
of node ha2 was not expected
Feb 28 10:43:47 ha1 pacemaker-controld[3705]:  notice: State transition
S_IDLE -> S_POLICY_ENGINE
Feb 28 10:43:47 ha1 pacemaker-fenced[3701]:  notice: Node ha2 state is
now lost
Feb 28 10:43:47 ha1 pacemaker-fenced[3701]:  notice: Purged 1 peer with
id=2 and/or uname=ha2 from the membership cache
Feb 28 10:43:48 ha1 corosync[3692]:   [VOTEQ ] waiting for quorum device
Qdevice poll (but maximum for 3 ms)
Feb 28 10:43:48 ha1 corosync[3692]:   [TOTEM ] A new membership
(192.168.1.1:2948) was formed. Members
Feb 28 10:43:48 ha1 corosync[3692]:   [CPG   ] downlist left_list: 0
received
Feb 28 10:43:50 ha1 corosync[3692]:   [VOTEQ ] waiting for quorum device
Qdevice poll (but maximum for 3 ms)
Feb 28 10:43:50 ha1 corosync[3692]:   [TOTEM ] A new membership
(192.168.1.1:2952) was formed. Members
Feb 28 10:43:50 ha1 corosync[3692]:   [CPG   ] downlist left_list: 0
received
Feb 28 10:43:51 ha1 corosync[3692]:   [VOTEQ ] waiting for quorum device
Qdevice poll (but maximum for 3 ms)
Feb 28 10:43:51 ha1 corosync[3692]:   [TOTEM ] A new membership
(192.168.1.1:2956) was formed. Members
Feb 28 10:43:51 ha1 corosync[3692]:   [CPG   ] downlist left_list: 0
received
Feb 28 10:43:56 ha1 corosync-qdevice[4522]: Server didn't send echo
reply message on time
Feb 28 10:43:56 ha1 corosync-qdevice[4522]: Feb 28 10:43:56 error
Server didn't send echo reply message on time
Feb 28 10:43:56 ha1 corosync[3692]:   [QUORUM] This node is within the
non-primary component and will NOT provide any services.
Feb 28 10:43:56 ha1 corosync[3692]:   [QUORUM] Members[1]: 1
Feb 28 10:43:56 ha1 corosync[3692]:   [MAIN  ] Completed service
synchronization, ready to provide service.
Feb 28 10:43:56 ha1 pacemaker-controld[3705]:  warning: Quorum lost
Feb 28 10:43:56 ha1 pacemaker-controld[3705]:  notice: Node ha2 state is
now lost
Feb 28 10:43:56 ha1 pacemaker-controld[3705]:  warning: Stonith/shutdown
of node ha2 was not expected
Feb 28 10:43:56 ha1 pacemaker-controld[3705]:  notice: Updating quorum
status to false (call=274)
Feb 28 10:43:57 ha1 pacemaker-schedulerd[3704]:  warning: Fencing and
resource management disabled due to lack of quorum
Feb 28 10:43:57 ha1 pacemaker-schedulerd[3704]:  notice:  * Stop
p_drbd0:0(Master ha1 )   due to no quorum
Feb 28 10:43:57 ha1 pacemaker-schedulerd[3704]:  notice:  * Stop
p_drbd1:0( Slave ha1 )   due to no quorum
Feb 28 10:43:57 ha1 pacemaker-schedulerd[3704]:  notice:  * Stop
p_fs_clust01 (   ha1 )   due to no quorum
Feb 28 10:43:57 ha1 pacemaker-schedulerd[3704]:  notice:  * Start
p_fs_clust02 (   ha1 )   due to no quorum (blocked)
Feb 28 10:43:57 ha1 pacemaker-schedulerd[3704]:  notice:  * Stop
p_mysql_001  (   ha1 )   due to no quorum
Feb 28 10:43:57 ha1 pacemaker-schedulerd[3704]:  notice:  * Start
p_mysql_006  (   ha1 )   due to no quorum (blocked)



ha2 *waits for 30 seconds* before doing anything:

Feb 28 10:43:44 ha2 corosync[5389]:   [TOTEM ] A processor failed,
forming new configuration.
Feb 28 10:43:45 ha2 corosync[5389]:   [VOTEQ ] waiting for quorum device
Qdevice poll (but m

Re: [ClusterLabs] cluster loses state (randomly) every few minutes.

2021-01-18 Thread Jan Friesse

lejeczek,


hi guys,

I have a very basic two-node cluster, not even a single resource on it, 
but very troublesome - it keeps breaking.

Journal for 'pacemaker' shows constantly (on both nodes):
...
warning: Input I_DC_TIMEOUT received in state S_PENDING from 
crm_timer_popped

  notice: State transition S_ELECTION -> S_PENDING
  notice: State transition S_PENDING -> S_NOT_DC
  notice: Lost attribute writer swir
  notice: Node swir state is now lost
  notice: Our peer on the DC (swir) is dead
  notice: Purged 1 peer with id=2 and/or uname=swir from the membership 
cache

  notice: Node swir state is now lost
  notice: State transition S_NOT_DC -> S_ELECTION
  notice: Removing all swir attributes for peer loss
  notice: Purged 1 peer with id=2 and/or uname=swir from the membership 
cache

  notice: Node swir state is now lost
  notice: Node swir state is now lost
  notice: Recorded local node as attribute writer (was unset)
  notice: Purged 1 peer with id=2 and/or uname=swir from the membership 
cache

  notice: State transition S_ELECTION -> S_INTEGRATION
  warning: Blind faith: not fencing unseen nodes
  notice: Delaying fencing operations until there are resources to manage
  notice: Calculated transition 0, saving inputs in 
/var/lib/pacemaker/pengine/pe-input-627.bz2
  notice: Transition 0 (Complete=0, Pending=0, Fired=0, Skipped=0, 
Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-627.bz2): Complete

  notice: State transition S_TRANSITION_ENGINE -> S_IDLE
  notice: Node swir state is now member
  notice: Node swir state is now member
  notice: Node swir state is now member
  notice: Node swir state is now member
  notice: State transition S_IDLE -> S_INTEGRATION
  warning: Another DC detected: swir (op=noop)
  notice: Detected another attribute writer (swir), starting new election
  notice: Setting #attrd-protocol[swir]: (unset) -> 2
  notice: State transition S_ELECTION -> S_RELEASE_DC
  notice: State transition S_PENDING -> S_NOT_DC
  notice: Recorded local node as attribute writer (was unset)



Is there anything interesting in corosync.log?

It's the same hardware on which "this same" cluster ran okay and then, 
only a couple of days ago, I upgraded CentOS on these two boxes to "Stream".
I'm hoping it's something trivial I'm missing with the new version(s) of 
software that came with the upgrade, perhaps some (new) settings for a two-node 
cluster which I missed.


Actually for Corosync there is one - the increase of the token timeout to 3 sec. 
This was not a problem during my testing, but just to be sure - have you 
restarted corosync on both of the nodes? Do they have the same token timeout 
(you can check the token timeout in use by running "corosync-cmapctl -g 
runtime.config.totem.token")?


Honza


Any suggestions greatly appreciated.
many thanks, L.
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/




Re: [ClusterLabs] Corosync permanently desyncs in face of packet loss

2021-01-18 Thread Jan Friesse

Mariusz,


Hi,

We've had a hardware problem causing asynchronous packet drop on one of
our nodes that caused an unrecoverable state (it required
restarting corosync on both nodes), which then repeated the next day. Log of 
the events is in the
attachment.

It did recover a few times after the problem, but when it happened it
just spammed

Jan 13 14:28:30 [20833] node-db2 corosync notice  [TOTEM ] A new membership 
(2:72076) was formed. Members
Jan 13 14:28:30 [20833] node-db2 corosync warning [CPG   ] downlist left_list: 
0 received
Jan 13 14:28:30 [20833] node-db2 corosync notice  [QUORUM] Members[1]: 2
Jan 13 14:28:30 [20833] node-db2 corosync notice  [MAIN  ] Completed service 
synchronization, ready to provide service.

I've also seen some of


  corosync warning [KNET  ] pmtud: possible MTU misconfiguration detected. 
kernel is reporting MTU: 1500 bytes for host 1 link 0 but the other node is not 
acknowledging packets of this size.
  corosync warning [KNET  ] pmtud: This can be caused by this node interface 
MTU too big or a network device that does not support or has been misconfigured 
to manage MTU of this size, or packet loss. knet will continue to run but 
performances might be affected.

in previous failure.

After the cause of the packet loss was fixed, it also did not recover without a restart.

In limited testing with udpu protocol that did not occur but that period of 
testing was much shorter as we fixed the networking issue in the meantime.

We're using the stable version from Debian Buster (3.0.1).
Is that a known problem/bug ?


There were quite a few bugs in libknet < 1.15 and some of them may 
explain the behavior you see. I would suggest trying the backports (where 1.16 
seems to be available) or the Proxmox repositories (where a newer corosync 
is also packaged).


Regards,
  Honza




Cheers,
Mariusz





___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





Re: [ClusterLabs] corosync[3520]: [CPG ] *** 0x55ff99d211c0 can't mcast to group dlm:ls:lvm_testVG state:1, error:12

2021-01-05 Thread Jan Friesse

Ulrich,


Hi!

In my test cluster using UDPU(!) I saw this syslog message when I shut down a 
VG:
Nov 30 13:38:28 h16 pacemaker-execd[3681]:  notice: executing - 
rsc:prm_testVG_activate action:stop call_id:71
Nov 30 13:38:28 h16 LVM-activate(prm_testVG_activate)[7265]: INFO: Deactivating 
testVG
Nov 30 13:38:28 h16 corosync[3520]:   [CPG   ] *** 0x55ff99d211c0 can't mcast 
to group dlm:ls:lvm_testVG state:1, error:12
Nov 30 13:38:28 h16 kernel: dlm: lvm_testVG: leaving the lockspace group...
Nov 30 13:38:28 h16 kernel: dlm: lvm_global: dlm_recover 7
Nov 30 13:38:28 h16 kernel: dlm: lvm_global: remove member 118
Nov 30 13:38:28 h16 kernel: dlm: lvm_testVG: group event done 0 0
Nov 30 13:38:28 h16 kernel: dlm: lvm_testVG: release_lockspace final free
Nov 30 13:38:28 h16 kernel: dlm: lvm_global: dlm_recover_members 2 nodes
Nov 30 13:38:28 h16 kernel: dlm: lvm_global: generation 3 slots 2 1:116 3:119
Nov 30 13:38:28 h16 kernel: dlm: lvm_global: dlm_recover_directory
Nov 30 13:38:28 h16 kernel: dlm: lvm_global: dlm_recover_directory 0 in 0 new
Nov 30 13:38:28 h16 LVM-activate(prm_testVG_activate)[7288]: INFO: testVG: 
deactivated successfully.
Nov 30 13:38:28 h16 pacemaker-execd[3681]:  notice: prm_testVG_activate stop 
(call 71, PID 7258) exited with status 0 (execution time 251ms, queue time 0ms)
Nov 30 13:38:28 h16 pacemaker-controld[3684]:  notice: Result of stop operation 
for prm_testVG_activate on h16: ok

What kind of problem is that, and what may be causing this?


This happens when an application calls cpg_mcast_joined for a group which 
doesn't exist - so either the application hasn't called cpg_join yet or (that's 
probably the case here) it has already called cpg_leave. No matter what, this 
is probably a problem in dlm_controld, so it is best to ask a dlm expert.


Regards,
  Honza



Regards,
Ulrich



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





[ClusterLabs] Corosync-qdevice 3.0.1 is available at GitHub!

2020-11-23 Thread Jan Friesse
I am pleased to announce the latest maintenance release of 
Corosync-Qdevice 3.0.1 available immediately from GitHub at 
https://github.com/corosync/corosync-qdevice/releases as 
corosync-qdevice-3.0.1.


This release contains important bug fixes and some new features. Most 
notable ones:
- Rewrite of the DPD (dead-peer-detection) timer. Previously dead peer 
detection was implemented sub-optimally and might result in QNetd waiting 
too long before marking a Qdevice peer dead and selecting a new quorate 
partition, which made Qdevice not responsive to the corosync votequorum 
service, resulting in loss of quorum. The new implementation uses smaller 
timeouts and no sampling (a global DPD timer), so QNetd detects a dead peer 
long before sync_timeout. The configuration option dpd_interval is 
removed (no longer needed) and replaced by dpd_interval_coefficient (see 
corosync-qnetd(8) for more details). This fixes GH issue #10.
- Implement the KAP (keep active partition) tie-breaker for the ffsplit 
algorithm. This solves a problem where Corosync creates a single-node 
partition during startup and, for two-node clusters with ffsplit, this new 
node might get the vote even though there was already another quorate node. This 
option is enabled by default. To use the previous behavior it's possible to 
set keep_active_partition_tie_breaker in corosync.conf (see 
corosync-qdevice(8) for more details). Fix for GH issue #7.
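
For anyone who wants the previous behavior back, a minimal sketch of the 
corosync.conf change (the exact placement of the key is my assumption - please 
double-check it against corosync-qdevice(8)):

  quorum {
      provider: corosync_votequorum
      device {
          model: net
          net {
              ...
              keep_active_partition_tie_breaker: off
          }
      }
  }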

- Qdevice systemd unit file now restarts on-failure.
- Configurations without ring0_addr are now properly supported. Fixes GH 
issue #16.


Complete changelog for 3.0.1:

Fabio M. Di Nitto (1):
  devel: add corosync-qdevice.pc file for pcs to use

    Jan Friesse (79):
  qnetd: Check existence of NSS DB dir before fork
  spec: Use install -p and add license
  man: Fix corosync-qdevice default connect_timeout
  configure: Add user-flags configure option
  qdevice: Fix qdevice_heuristics_destroy error
  qnetd: Rename qnetd-log.c to log.c
  qnetd: Fix double -d description
  qnetd: Check log initialization error
  qnetd: Add function to set log target
  qdevice: Use log instead of libqb log
  qdevice: Import log instead of qdevice-log
  qdevice: Merge msg_decode_error functions
  qnetd: Use log-common for nodelist debug dump
  qdevice: Configurable log priority bump
  tests: Add utils_parse_bool_str test
  qdevice: Free memory used by log
  qdevice: Add log test
  qdevice: Add header files to list of test sources
  qdevice: Add chk variant of vsyslog to test-log
  qdevice: Add prototype of __vsyslog_chk
  build: Update git-version-gen
  build: Use git-version-gen during specfile build
  configure: Use default systemd path with prefix
  pr-poll-loop: Add main poll loop based on PR_Poll
  qnetd: Migrate main loop to pr-poll-loop
  qnetd: Do not call ffsplit_do on shutdown
  qdevice: Use EXIT_SUCCESS and EXIT_FAILURE codes
  qdevice: Add space before bracket
  qdevice: Fix connect heuristics result callback
  pr-poll-loop: Do not add FD when events is empty
  tests: Add pr-poll-loop test
  tests: Enhance pr-poll-loop test
  heuristics: Remove qdevice instance pointer
  pr-poll-loop: Return error code if PR_Poll fails
  qdevice: Initial port to use pr-poll-loop
  qnetd: Remove write callback on listening sockets
  qnetd: Remove unneeded pprio include
  qnetd: Log pr_poll_loop_add,del errors properly
  qnetd: Move pr_poll_loop_exec call to function
  pr-poll-loop: Add support for PR_POLL_EXCEPT
  pr-poll-loop: Pass PRPollDesc for prfd events
  pr-poll-loop: Add pre poll callbacks
  qdevice: Fix connect heuristics result callback
  pr-poll-loop: Fix set_events_cb return code
  qnetd: Return error code based on ipc closed
  qdevice-net: Log adds newline automatically
  qdevice: Port qdevice to use pr-poll-loop
  qdevice-votequorum: Fix typo in log message
  qnetd: Fix dpd timer
  timer-list: Return error on adding NULL callback
  timer-list: Add test
  README: Fix typos
  qnetd: Add support for keep active partition vote
  LICENSE: Update copyright date
  qdevice: Fix set option and set option reply
  qdevice-net-heuristics: Fix log message
  qnetd: Fix NULL dereference of client
  qnet: Add support for keep active partition vote
  qdevice-ipc: Fix dereference bug
  pr-poll-loop: Add queue header include
  timer-list: Implement heap based timer-list
  msg: Check cat result on adding msg type and size
  test-process-list: Fix few bugs
  tlv: Check dynar_cat result
  test-timer-list: Ignore poll errors
  timer

Re: [ClusterLabs] Minor bug in SLES 15 corosync-2.4.5-6.3.2.x86_64 (unicast, ttl)

2020-11-20 Thread Jan Friesse

Ulrich,


Hi!

A short notification:

I had set up a new cluster using udpu, finding that ringnumber 0 has a ttl statement 
("ttl:1"), but ringnumber 1 had not. So I added one for ringnumber 1, and 
then I reloaded corosync via  corosync-cfgtool -R.


probably a ttl with a value different from 1, right?


Amazingly when (re)starting corosync, it failed with:
Nov 19 15:29:51 h18 corosync[90724]:   [MAIN  ] parse error in config: Can only 
set ttl on multicast transport types


Yeah, the old reload was quite flaky. 3.1.0 has a much improved reload, so it 
will return an error when config parsing or sanity checks fail.




Independent of whether this may be true or not, it would be nice if it were 
handled consistently.


What do you mean by "handled consistently"? The code checks whether the 
transport != UDP (i.e. not multicast) and ttl != 1 -> display error.



Preferably "ttl" would just be ignored with a warning when not using multicast.


That doesn't sound like a good idea. The user set something and may expect 
that it is applied.


It may make sense to consider adding ttl support for other transports.

Regards,
  honza



Regards,
Ulrich




___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





Re: [ClusterLabs] egards, Q: what does " corosync-cfgtool -s" check actually?

2020-11-20 Thread Jan Friesse

Ulrich,


Hi!

having a problem, I wonder what "corosync-cfgtool -s" actually checks:
I see on all nodes and all rings "status  = ring 0 active with no faults", but 
the nodes seem unable to communicate somehow.


For UDPU/UDP without RRP it will always display "ring 0 active with no 
faults". With RRP it will display an error when one of the rings has failed.


For KNet it will display much better information:
# corosync-cfgtool -s
Printing link status.
Local node ID 5
LINK ID 0
addr= IP
status:
nodeid  5:  localhost
nodeid  6:  disconnected



Is there a kind of "corosync node ping" that actually checks that the local node 
can communicate with another given node using the corosync (TOTEM) protocol?


As Klaus wrote, for a non-knet setup corosync-quorumtool is probably the closest.
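
As a rough sketch of what I mean (the exact output is illustrative and varies 
a bit between versions):

  # corosync-quorumtool -s
  ...
  Membership information
  ----------------------
      Nodeid      Votes Name
         116          1 h16 (local)

A node that never shows up under "Membership information" is one the local 
corosync cannot talk to over totem.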



The problem is that the cluster with three nodes says (h16 is the local node):
Node List:
   * Node h18: UNCLEAN (offline)
   * Node h19: UNCLEAN (offline)
   * Online: [h16 ]

In syslog I get a message like this every few (consensus?)  seconds:
Nov 20 07:46:59 h16 corosync[6240]:   [TOTEM ] A new membership 
(172.20.16.16:36752) was formed. Members
Nov 20 07:46:59 h16 corosync[6240]:   [CPG   ] downlist left_list: 0 received
Nov 20 07:46:59 h16 corosync[6240]:   [QUORUM] Members[1]: 116
Nov 20 07:46:59 h16 corosync[6240]:   [MAIN  ] Completed service 
synchronization, ready to provide service.
Nov 20 07:47:05 h16 corosync[6240]:   [TOTEM ] A new membership 
(172.20.16.16:36756) was formed. Members
Nov 20 07:47:05 h16 corosync[6240]:   [CPG   ] downlist left_list: 0 received
Nov 20 07:47:05 h16 corosync[6240]:   [QUORUM] Members[1]: 116
Nov 20 07:47:05 h16 corosync[6240]:   [MAIN  ] Completed service 
synchronization, ready to provide service.


It shows there is just one node with id 116. This is probably a firewall 
problem.
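
If firewalld is in use (an assumption on my side), the quickest fix is usually 
the predefined high-availability service; otherwise open the totem UDP ports 
(5405 and 5404 by default) between the nodes by hand:

  firewall-cmd --permanent --add-service=high-availability
  firewall-cmd --reload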


Regards,
  Honza



The cluster is brand new and has absolutely no resources set up.

(The configuration snippet that followed was stripped by the archive; it showed 
only the node list and no resources.)


Regards,
Ulrich


___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Corosync 3.1.0 token timeout

2020-10-22 Thread Jan Friesse

Ulrich,


Jan Friesse  wrote on 20.10.2020 at 18:05 in message

<9e9edd13-847c-a81f-9b28-0ecf8f17f...@redhat.com>:

I've forgotten to mention one very important change (in the text; the release notes
at the github release are already fixed):


...


- Default token timeout was changed from 1 second to 3 seconds. Default


Hi!

The same stupid question as always: How is that value determined, assuming that 
in a LAN the per-hop delay should be less than 1 ms these days and the number of 
nodes typically is much less than 10. Is there a safety factor of 1000%, or 
what?
Or is this just black magic, and the value was determined in a sleepless 
full-moon night by throwing dice?


It's somewhere in the middle actually.

The reason for increasing the value is the number of GSS cases where an increase of 
the token timeout helped reduce the number of "unexpected" fencing events.


The proposal was to increase the value to 5 secs, but that would make 
upgrading hard, because nodes with the old version would detect token loss 
(the default config is to resend the token 4 times, so 5s/4 = 1.25 secs).


There is no such problem with 3 secs.

The main problem is that choosing timeouts is not an exact science. We have 
to choose a timeout which is high enough to give nodes enough time in case 
of spikes (various ones - cpu/blocked IO/network/...) but also low 
enough to react as quickly as possible. 1 sec was working well most of 
the time, but then something bad happened and a node was fenced "without 
a reason". So to conclude, yes, it is kind of black magic.


Regards,
  Honza



Regards,
Ulrich


token timeout of 1000 ms was often changed by users because other
workloads on the machine may make corosync respond a bit later than
needed, resulting in token loss. 3000 ms was chosen as a compromise
between increasing the token timeout and allowing live cluster upgrade (other
nodes should receive the token from a node with the new default on time). It doesn't
affect token_coefficient, so the final token timeout still depends on the
number of configured nodes (just the base is higher). This change slows
down failover a bit, so for clusters where failover times are important,
please change the token timeout in the configuration file corosync.conf as:

totem {
version: 2
token: 1000
...



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





Re: [ClusterLabs] Corosync 3.1.0 is available at corosync.org!

2020-10-20 Thread Jan Friesse
I've forgotten to mention one very important change (in the text; the release notes 
at the github release are already fixed):



I am pleased to announce the latest maintenance release of Corosync
3.1.0 available immediately from GitHub release section at 
https://github.com/corosync/corosync/releases or our website at

http://build.clusterlabs.org/corosync/releases/.

This release contains important bug fixes and also a few big new features 
(that's the reason for bumping the minor version number):


- Support for changing the crypto configuration during runtime. This 
includes turning cryptography on or off, changing crypto_cipher and 
crypto_hash and also changing the crypto key. To use this feature simply 
change corosync.conf (or authkey) and call `corosync-cfgtool -R`.


To make this feature work, compilation with knet version at least 1.18 is 
required (it is still possible to compile corosync with previous knet 
versions, but crypto reconfiguration will be disabled).


Please note that it is not supported to do crypto reconfiguration 
during an upgrade of the cluster! The cluster will split into a partition with 
nodes running the old corosync and a partition with nodes running the new 
corosync if the cryptographic configuration is changed!


- Configuration system is rewritten and reload became atomic operation. 
Previously it was possible to change configuration file such way that 
corosync would refuse to start with such configuration, but it was 
partly loaded during reconfiguration (usually with warnings but bad 
things happened anyway) creating inconsistencies.


Now, bad config file which would be refused during startup is refused 
also on reload and configuration stays intact.


- Quorum service got improved API with ability to register new totem 
members change callback. Main motivator for this change is DLM, but 
other API users may find it helpful too - see quorum_model_initialize 
(3) and tests/testquorummodel.c example. Change is fully backwards 
compatible and there is no change in the previous quorum_initialize (3).




- Default token timeout was changed from 1 second to 3 seconds. The default 
token timeout of 1000 ms was often changed by users because other 
workloads on the machine may make corosync respond a bit later than 
needed, resulting in token loss. 3000 ms was chosen as a compromise 
between increasing the token timeout and allowing live cluster upgrade (other 
nodes should receive the token from a node with the new default on time). It doesn't 
affect token_coefficient, so the final token timeout still depends on the 
number of configured nodes (just the base is higher). This change slows 
down failover a bit, so for clusters where failover times are important, 
please change the token timeout in the configuration file corosync.conf, e.g.:


totem {
  version: 2
  token: 1000
  ...
}


Complete changelog for 3.1.0:

     Aleksei Burlakov (1):
   totemsrp: More informative messages

     Christine Caulfield (8):
   config: Reorganise the config system
   cfg: Improve error return to cfgtool -R
   config: don't reload vquorum if reload fails
   config: Don't free pointers used by transports
   config: Fix crash when a reload fails twice
   test: Fix cpgtest
   config: Allow reconfiguration of crypto options
   man: reload during rolling upgrade

     Ferenc Wágner (2):
   man: fix typo: avaialable
   man: votequorum.5: use proper single quotes

     Jan Friesse (9):
   spec: Require at least knet 1.18 for crypto reload
   build: Update git-version-gen
   build: Use git-version-gen during specfile build
   configure: Use default systemd path with prefix
   common_lib: Remove trailing spaces in cs_strerror
   totemsrp: Move token received callback
   quorum: Add support for nodelist callback
   tests: Use CS_DISPATCH_BLOCKING instead of cycle
   config: Increase default token timeout to 3000 ms

     liangxin1300 (15):
   cfgtool: output error messages to stderr
   cfgtool: enhancement -a option
   tools: use util_strtonum for options checking
   cmapctl: return EXIT_FAILURE on failure
   man: update output of -s and -b for cfgtool
   cfgtool: Return error when -i doesn't match
   quorumtool: Help shouldn't require running service
   quorumtool: strict check for -o option
   cmapctl: check NULL for key type and value for -p
   cmapctl: return error on no result of print prefix
   totemconfig: validate totem.transport value
   cfg: enhance message_handler_req_lib_cfg_killnode
   totemconfig: add interface number to the error str
   totemconfig: improve linknumber checking
   totemconfig: remove redundant nodeid error log


Upgrade is highly recommended.

Note for corosync contributors (and distro maintainers). There is no 
plan to keep 3.0.

[ClusterLabs] Corosync 3.1.0 is available at corosync.org!

2020-10-20 Thread Jan Friesse

I am pleased to announce the latest maintenance release of Corosync
3.1.0 available immediately from GitHub release section at 
https://github.com/corosync/corosync/releases or our website at

http://build.clusterlabs.org/corosync/releases/.

This release contains important bug fixes and also a few big new features 
(that's the reason for bumping the minor version number):


- Support for changing crypto configuration during runtime. This 
includes turning cryptography on or off, changing crypto_cipher and 
crypto_hash and also changing of crypto key. To use this feature simply 
change corosync.conf (or authkey) and call `corosync-cfgtool -R`.


To make this feature work compilation with knet version at least 1.18 is 
required (it is still possible to compile corosync with previous knet 
versions but crypto reconfiguration will be disabled).


Please note that it is not supported to make crypto reconfiguration 
during upgrade of cluster! Cluster will split into partition with nodes 
running old corosync and partition with nodes running new corosync if 
cryptographic configuration is changed!
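As an illustration of how such a runtime change looks (cipher/hash values and the
cluster name below are only examples, not requirements), the steps are roughly:

# 1. on every node, adjust /etc/corosync/corosync.conf:
totem {
    version: 2
    cluster_name: mycluster
    crypto_cipher: aes256
    crypto_hash: sha256
    ...
}

# 2. if no key exists yet, generate /etc/corosync/authkey and copy it to all nodes:
corosync-keygen
scp /etc/corosync/authkey node2:/etc/corosync/

# 3. once file and key are identical everywhere, trigger the reload:
corosync-cfgtool -R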


- Configuration system is rewritten and reload became atomic operation. 
Previously it was possible to change configuration file such way that 
corosync would refuse to start with such configuration, but it was 
partly loaded during reconfiguration (usually with warnings but bad 
things happened anyway) creating inconsistencies.


Now, bad config file which would be refused during startup is refused 
also on reload and configuration stays intact.


- Quorum service got improved API with ability to register new totem 
members change callback. Main motivator for this change is DLM, but 
other API users may find it helpful too - see quorum_model_initialize 
(3) and tests/testquorummodel.c example. Change is fully backwards 
compatible and there is no change in the previous quorum_initialize (3).


Complete changelog for 3.1.0:

Aleksei Burlakov (1):
  totemsrp: More informative messages

Christine Caulfield (8):
  config: Reorganise the config system
  cfg: Improve error return to cfgtool -R
  config: don't reload vquorum if reload fails
  config: Don't free pointers used by transports
  config: Fix crash when a reload fails twice
  test: Fix cpgtest
  config: Allow reconfiguration of crypto options
  man: reload during rolling upgrade

Ferenc Wágner (2):
  man: fix typo: avaialable
  man: votequorum.5: use proper single quotes

Jan Friesse (9):
  spec: Require at least knet 1.18 for crypto reload
  build: Update git-version-gen
  build: Use git-version-gen during specfile build
  configure: Use default systemd path with prefix
  common_lib: Remove trailing spaces in cs_strerror
  totemsrp: Move token received callback
  quorum: Add support for nodelist callback
  tests: Use CS_DISPATCH_BLOCKING instead of cycle
  config: Increase default token timeout to 3000 ms

liangxin1300 (15):
  cfgtool: output error messages to stderr
  cfgtool: enhancement -a option
  tools: use util_strtonum for options checking
  cmapctl: return EXIT_FAILURE on failure
  man: update output of -s and -b for cfgtool
  cfgtool: Return error when -i doesn't match
  quorumtool: Help shouldn't require running service
  quorumtool: strict check for -o option
  cmapctl: check NULL for key type and value for -p
  cmapctl: return error on no result of print prefix
  totemconfig: validate totem.transport value
  cfg: enhance message_handler_req_lib_cfg_killnode
  totemconfig: add interface number to the error str
  totemconfig: improve linknumber checking
  totemconfig: remove redundant nodeid error log


Upgrade is highly recommended.

Note for corosync contributors (and distro maintainers). There is no 
plan to keep 3.0.X branch supported (as was the case with 2.x) so there 
is not going to be any camelback-3.0 (or camelback-3.1) branches - only 
camelback (which will be used for 3.1.1, ...).


Thanks/congratulations to all people that contributed to achieve this
great milestone.

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Two ethernet adapter within same subnet causing issue on Qdevice

2020-10-06 Thread Jan Friesse

Richard ,


To clarify my problem, this is more on Qdevice issue I want to fix.


The question is how much this really is a qdevice problem and, if so, whether 
there is really something we can do about it.


Qdevice itself is just using a standard connect(2) call and a standard TCP 
socket. So from the qdevice point of view it is really the kernel's decision where 
to route the packet to reach qnetd.


It is clear that the ifdown made qdevice lose its connection with qnetd 
(that's why the IP changed from ens192 to ens256) and the standard qdevice 
behavior is to try to reconnect. Qdevice itself does not bind to any 
specific address (it is really just a client), so after calling 
connect(2) qdevice reached qnetd via the other (working) interface.


So I would suggest trying the method recommended by Andrei (adding a host route).
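For example (the addresses below are placeholders; use the real qnetd IP and, if
needed, the gateway reachable via the preferred interface), pinning the qnetd traffic
to ens192 could look like:

# route only the qnetd host via ens192
ip route add 192.0.2.50/32 dev ens192
# or, if a gateway is required on that network:
ip route add 192.0.2.50/32 via 192.0.2.1 dev ens192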

Regards,
  Honza


See below for more detail.
Thank you,
Richard

 - Original message -
 From: Andrei Borzenkov 
 Sent by: "Users" 
 To: users@clusterlabs.org
 Cc:
 Subject: [EXTERNAL] Re: [ClusterLabs] Two ethernet adapter within same
 subnet causing issue on Qdevice
 Date: Thu, Oct 1, 2020 2:45 PM
 01.10.2020 20:09, Richard Seo пишет:
  > Hello everyone,
  > I'm trying to setup a cluster with two hosts:
  > both have two ethernet adapters all within the same subnet.
  > I've created resources for an adapter for each hosts.
  > Here is the example:
  > Stack: corosync
  > Current DC:  (version 2.0.2-1.el8-744a30d655) - partition with 
quorum
  > Last updated: Thu Oct  1 12:50:48 2020
  > Last change: Thu Oct  1 12:32:53 2020 by root via cibadmin on 
  > 2 nodes configured
  > 2 resources configured
  > Online: [   ]
  > Active resources:
  > db2__ens192(ocf::heartbeat:db2ethmon): Started 
  > db2__ens192(ocf::heartbeat:db2ethmon): Started 
  > I also have a qdevice setup:
  > # corosync-qnetd-tool -l
  > Cluster "hadom":
  >  Algorithm:LMS
  >  Tie-breaker:Node with lowest node ID
  >  Node ID 2:
  >  Client address:::::40044
  >  Configured node list:1, 2
  >  Membership node list:1, 2
  >  Vote:ACK (ACK)
  >  Node ID 1:
  >  Client address::::<*ip for ens192 for host 
1*>:37906
  >  Configured node list:1, 2
  >  Membership node list:1, 2
  >  Vote:ACK (ACK)
  > When I ifconfig down ens192 for host 1, looks like qdevice changes the 
Client
  > address to the other adapter and still give quorum to the lowest node ID
 (which
  > is host 1 in this case) even when the network is down for host 1.

 Network on host 1 is obviously not down as this host continues to
 communicate with the outside world. Network may be down for your
 specific application but then it is up to resource agent for this
 application to detect it and initiate failover.
 The Network (ens192) on host 1 is down. host1 can still communicate with 
the
 world, because host1 has another network adapter (ens256). However, only
 ens192 was configured as a resource. I've also configured specifically
 ens192 ip address in the corsync.conf.
 I want the network on host 1 down. that way, I can reproduce the problem
 where quorum is given to a wrong node.

  > Cluster "hadom":
  >  Algorithm:LMS
  >  Tie-breaker:Node with lowest node ID
  >  Node ID 2:
  >  Client address:::::40044
  >  Configured node list:1, 2
  >  Membership node list:1, 2
  >  Vote:ACK (ACK)
  >  Node ID 1:
  >  Client address::::<*ip for ens256 for host 
1*>:37906
  >  Configured node list:1, 2
  >  Membership node list:1, 2
  >  Vote:ACK (ACK)
  > Is there a way we can force qdevice to only route through a specified 
adapter
  > (ens192 in this case)?

 Create host route via specific device.
 I've looked over the docs, haven't found a way to do this. I've tried
 configuring corosync.conf using the specific ip addresses. Could you 
specify
 how to route to a specific network adapter from a qdevice?

  > Also while I'm on this topic, is multiple communication ring support 
with
  > pacemaker supported or will be supported in the near future?

 What exactly do you mean? What communication are you talking about?

 You seem to confuse multiple layers here. qnetd and pacemaker are two
 independent things.
 So this is a separate question regarding Pacemaker and Corosync. I want to
 know if having multiple communication rings in the nodelist in
 corosync.conf is supported by Pacemaker with Corosync right now. The
 communication protocol is called Redundant Ring

Re: [ClusterLabs] Alerts for qdevice/qnetd/booth

2020-08-17 Thread Jan Friesse

Thanks Honza. I have raised these on both upstream projects.


Thanks


I will leave upto implementer how best this can be done, considering the
technical limitations you mentioned.

https://github.com/corosync/corosync-qdevice/issues/13
https://github.com/ClusterLabs/booth/issues/99

Thanks,
Rohit

On Thu, Aug 13, 2020 at 1:03 PM Jan Friesse  wrote:


Hi Rohit,


Hi Honza,
Thanks for your reply. Please find the attached image below:

[image: image.png]

Yes, I am talking about pacemaker alerts only.

Please find my suggestions/requirements below:

*Booth:*
1. Node5 booth-arbitrator should be able to give event when any of the
booth node joins or leaves. booth-ip can be passed in event.


This is not how booth works. The ticket leader (so a site booth, never the
arbitrator) executes the election and gets replies from the other
sites/arbitrator. A follower executes the election when the leader hasn't for the
configured timeout.

What I want to say is, that there is no "membership" - as in (for
example) corosync fashion.

The best we could get is the rough estimation based on election
request/replies.


2. Event when booth-arbitrator is up successfully and has started
monitoring the booth nodes.


This is basically start of service. I think it's doable with small
change in unit file (something like

https://northernlightlabs.se/2014-07-05/systemd-status-mail-on-unit-failure.html
)


2. Geo site booth should be able to give event when its booth peers
joins/leaves. For example, Geo site1 gives an event when node5
booth-arbitrator joins/leaves OR site2 booth joins/leaves.  booth-ip can

be

passed in event.
3. On ticket movements (revoke/grant), every booth node(Site1/2 and

node5)

should give events.


That would be doable



Note: pacemaker alerts works in a cluster. Since, arbitrator is a
non-cluster node, not sure how exactly it will work there. But this is

good

to have feature.

*Qnetd/Qdevice:*
This is similar to above.
1. Node5 qnetd should be able to raise an event when any of the cluster
node joins/leaves the quorum.


Doable


2. Event when qnetd is up successfully and has started monitoring the
cluster nodes


Qnetd itself is not monitoring qdevice nodes (it doesn't have list of
nodes). It monitors node status after node joins (= it would be possible
to trigger event on leave). So that may be enough.


3. Cluster node should be able to give event when any of the quorum node
leaves/joins.


You mean qdevice should be able to trigger event when connected to qnetd?



If you see on high level, then these are kind of node/resource events wrt
booth and qnetd/qdevice.


Yeah



As of today wrt booth/qnetd, I don't see any provision where any of the
nodes gives any event when its peer leaves/joins. This makes it difficult
to know whether geo sites nodes can see booth-arbitrator or not. This is


Got it. That's exactly what would be really problematic to implement,
because of no "membership" in booth. It would be, however, possible to
implement message when ticket was granted/rejected and have a list of
other booths replies and what was their votes.


true the other way around also where booth-arbitrator cannot see geo

booth

sites.
I am not sure how others are doing it in today's deployment, but I see

need

of monitoring of every other booth/qnet node. So that on basis of event,
appropriate alarms can be raised and action can be taken accordingly.

Please let me know if you agree on the usecases. I'll raise

feature-request

I can agree on usecases, but (especially with booth) there are technical
problems on realizing them.


on the pacemaker upstream project accordingly.


Please use booth (https://github.com/ClusterLabs/booth) and qdevice
(https://github.com/corosync/corosync-qdevice) upstream rather than
pacemaker, because these requests has really nothing to do with pcmk.

Regards,
honza



Thanks,
Rohit

On Wed, Aug 12, 2020 at 8:58 PM Jan Friesse  wrote:


Hi Rohit,

Rohit Saini napsal(a):

Hi Team,

Question-1:
Similar to pcs alerts, do we have something similar for qdevice/qnetd?

This

You mean pacemaker alerts right?


is to detect asynchronously if any of the member is

unreachable/joined/left

and if that member is qdevice or qnetd.


Nope but actually shouldn't be that hard to implement. What exactly
would you like to see there?



Question-2:
Same above question for booth nodes and arbitrator. Is there any way to
receive events from booth daemon?


Not directly (again, shouldn't be that hard to implement). But pacemaker
alerts should be triggered when service changes state because of ticket
grant/reject, isn't it?



My main objective is to see if these daemons give events related to
their internal state transitions  and raise some alarms accordingly.

For

example, boothd arbitrator is unreachable, ticket moved from x to y,

etc.


I don't think "boothd arbitrator is unreachable" alert is really doable.
Ticket moved from x to y w

Re: [ClusterLabs] Alerts for qdevice/qnetd/booth

2020-08-13 Thread Jan Friesse

Hi Rohit,


Hi Honza,
Thanks for your reply. Please find the attached image below:

[image: image.png]

Yes, I am talking about pacemaker alerts only.

Please find my suggestions/requirements below:

*Booth:*
1. Node5 booth-arbitrator should be able to give event when any of the
booth node joins or leaves. booth-ip can be passed in event.


This is not how booth works. The ticket leader (so a site booth, never the 
arbitrator) executes the election and gets replies from the other 
sites/arbitrator. A follower executes the election when the leader hasn't for the 
configured timeout.


What I want to say is, that there is no "membership" - as in (for 
example) corosync fashion.


The best we could get is the rough estimation based on election 
request/replies.



2. Event when booth-arbitrator is up successfully and has started
monitoring the booth nodes.


This is basically start of service. I think it's doable with small 
change in unit file (something like 
https://northernlightlabs.se/2014-07-05/systemd-status-mail-on-unit-failure.html)
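A rough sketch of that idea (the notification unit and script below are made up for
illustration, and the real booth arbitrator unit name depends on packaging) would be a
systemd drop-in such as:

# systemctl edit booth@booth.service  -- drop-in content:
[Unit]
# start a (hypothetical) notification unit if the arbitrator fails
OnFailure=notify-admin@%n.service

[Service]
# (hypothetical) script invoked once the arbitrator has been started
ExecStartPost=/usr/local/bin/notify-admin.sh started %n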



2. Geo site booth should be able to give event when its booth peers
joins/leaves. For example, Geo site1 gives an event when node5
booth-arbitrator joins/leaves OR site2 booth joins/leaves.  booth-ip can be
passed in event.
3. On ticket movements (revoke/grant), every booth node(Site1/2 and node5)
should give events.


That would be doable



Note: pacemaker alerts works in a cluster. Since, arbitrator is a
non-cluster node, not sure how exactly it will work there. But this is good
to have feature.

*Qnetd/Qdevice:*
This is similar to above.
1. Node5 qnetd should be able to raise an event when any of the cluster
node joins/leaves the quorum.


Doable


2. Event when qnetd is up successfully and has started monitoring the
cluster nodes


Qnetd itself is not monitoring qdevice nodes (it doesn't have list of 
nodes). It monitors node status after node joins (= it would be possible 
to trigger event on leave). So that may be enough.



3. Cluster node should be able to give event when any of the quorum node
leaves/joins.


You mean qdevice should be able to trigger event when connected to qnetd?



If you see on high level, then these are kind of node/resource events wrt
booth and qnetd/qdevice.


Yeah



As of today wrt booth/qnetd, I don't see any provision where any of the
nodes gives any event when its peer leaves/joins. This makes it difficult
to know whether geo sites nodes can see booth-arbitrator or not. This is


Got it. That's exactly what would be really problematic to implement, 
because of no "membership" in booth. It would be, however, possible to 
implement message when ticket was granted/rejected and have a list of 
other booths replies and what was their votes.



true the other way around also where booth-arbitrator cannot see geo booth
sites.
I am not sure how others are doing it in today's deployment, but I see need
of monitoring of every other booth/qnet node. So that on basis of event,
appropriate alarms can be raised and action can be taken accordingly.

Please let me know if you agree on the usecases. I'll raise feature-request


I can agree on usecases, but (especially with booth) there are technical 
problems on realizing them.



on the pacemaker upstream project accordingly.


Please use booth (https://github.com/ClusterLabs/booth) and qdevice 
(https://github.com/corosync/corosync-qdevice) upstream rather than 
pacemaker, because these requests has really nothing to do with pcmk.


Regards,
  honza



Thanks,
Rohit

On Wed, Aug 12, 2020 at 8:58 PM Jan Friesse  wrote:


Hi Rohit,

Rohit Saini napsal(a):

Hi Team,

Question-1:
Similar to pcs alerts, do we have something similar for qdevice/qnetd?

This

You mean pacemaker alerts right?


is to detect asynchronously if any of the member is

unreachable/joined/left

and if that member is qdevice or qnetd.


Nope but actually shouldn't be that hard to implement. What exactly
would you like to see there?



Question-2:
Same above question for booth nodes and arbitrator. Is there any way to
receive events from booth daemon?


Not directly (again, shouldn't be that hard to implement). But pacemaker
alerts should be triggered when service changes state because of ticket
grant/reject, isn't it?



My main objective is to see if these daemons give events related to
their internal state transitions  and raise some alarms accordingly. For
example, boothd arbitrator is unreachable, ticket moved from x to y, etc.


I don't think "boothd arbitrator is unreachable" alert is really doable.
Ticket moved from x to y would be probably two alerts - 1. ticket
rejected on X and 2. granted on Y.

Would you mind to elaborate a bit more on events you would like to see
and potentially open issue for upstream project (or, if you have a RH
subscription try to contact GSS, so I get more time to work on this issue).

Regards,
Honza



Thanks,
Roh

Re: [ClusterLabs] Alerts for qdevice/qnetd/booth

2020-08-12 Thread Jan Friesse

Hi Rohit,

Rohit Saini napsal(a):

Hi Team,

Question-1:
Similar to pcs alerts, do we have something similar for qdevice/qnetd? This


You mean pacemaker alerts right?


is to detect asynchronously if any of the member is unreachable/joined/left
and if that member is qdevice or qnetd.


Nope but actually shouldn't be that hard to implement. What exactly 
would you like to see there?




Question-2:
Same above question for booth nodes and arbitrator. Is there any way to
receive events from booth daemon?


Not directly (again, shouldn't be that hard to implement). But pacemaker 
alerts should be triggered when service changes state because of ticket 
grant/reject, isn't it?
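For completeness, such an alert handler is configured roughly like this (the sample
agent path may differ between distributions, and the log path is just an example):

pcs alert create path=/usr/share/pacemaker/alerts/alert_file.sh.sample id=ticket_alert
pcs alert recipient add ticket_alert value=/var/log/pcmk_alerts.log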




My main objective is to see if these daemons give events related to
their internal state transitions  and raise some alarms accordingly. For
example, boothd arbitrator is unreachable, ticket moved from x to y, etc.


I don't think "boothd arbitrator is unreachable" alert is really doable. 
Ticket moved from x to y would be probably two alerts - 1. ticket 
rejected on X and 2. granted on Y.


Would you mind to elaborate a bit more on events you would like to see 
and potentially open issue for upstream project (or, if you have a RH 
subscription try to contact GSS, so I get more time to work on this issue).


Regards,
  Honza



Thanks,
Rohit



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] qnetd and booth arbitrator running together in a 3rd geo site

2020-07-14 Thread Jan Friesse

Rohit,


I dont think my question was very clear. I am strictly NO for STONITH.
STONITH is limited only for kvm or HP machines. That's the reason I don't


Nope, stonith is not limited only for KVM or HP machine. There is huge 
amount of fence agents for various HW and VMs 
(https://github.com/ClusterLabs/fence-agents/tree/master/agents). Also 
there is SBD.




want to use STONITH.
What my question is can I use booth with nodes of a single cluster also
(similar to qdevice)? So idea is to use booth arbitrator for cluster of


The standard way to use booth is to have 2 clusters with N nodes. On each 
cluster there is a clustered booth instance, which is running in active-passive 
fashion (so only on one of the nodes of the cluster). And this booth 
gets the ticket depending on the booth arbitrator's decision. The final piece of the 
puzzle is a pacemaker resource which depends on ownership of the ticket.
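For example (ticket and resource names here are placeholders), that dependency is
expressed on each site with a ticket constraint:

pcs constraint ticket add ticketA my-resource loss-policy=stop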




clusters AS WELL AS for a single cluster.


So something like a booth resource in one cluster depending on a booth 
ticket given by some locally running booth? In theory it is 
possible, but I would say the original idea of using both qnetd and booth 
looked a bit more "standard".


Honza




On Tue, Jul 14, 2020 at 4:42 PM Jan Friesse  wrote:


Rohit,


Thanks Honza. That's helpful.
Let's say I don't use qnetd, can I achieve the same with a booth arbitrator?


That means to have two two-node clusters. Two-node cluster without
fencing is strictly no.


Booth arbitrator works for geo-clusters, can the same arbitrator be

reused

for local clusters as well?


I'm not sure that I understand question. Booth just gives ticket to
(maximally) one of booth-sites.



Is it even possible technically?


The question is, what you are trying to achieve. If geo-cluster then
stonith for sites + booth is probably best solution. If the cluster is
more like a stretch cluster, then qnetd + stonith is enough.

And of course your idea (original one) should work too.

Honza




Regards,
Rohit

On Tue, Jul 14, 2020 at 3:32 PM Jan Friesse  wrote:


Rohit,


Hi Team,
Can I execute corosync-qnetd and booth-arbitrator on the same VM in a
different geo site? What's the recommendation? Will it have any

limitations

in a production deployment?


There is no technical limitation. Both qnetd and booth are very
lightweight and work just fine with high latency links.

But I don't really have any real-life experiences with deployment where
both booth and qnetd are used. It should work, but I would recommend
proper testing - especially what happens when arbitrator node

disappears.



Due to my architecture limitation, I have only one arbitrator available
which is on a 3rd site. To handle cluster split-brain errors, I am

thinking

to use same arbitrator for local cluster as well.
STONITH is not useful in my case as it is limited only to ILO and VIRT.


Keep in mind that neither qdevice nor booth is "replacement" for

stonith.


Regards,
 Honza



[image: image.png]

Thanks,
Rohit



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] qnetd and booth arbitrator running together in a 3rd geo site

2020-07-14 Thread Jan Friesse

Rohit,


Also, " Keep in mind that neither qdevice nor booth is "replacement" for
stonith.  "

Why not? qdevice/booth are handling the split-brain scenario, keeping one
master only even in case of local/geo network disjoints. Can you please
clarify more on this.


Yeah, you are right. I don't really want to go too deep into this 
discussion, because opinions there tend to turn into emacs/vi wars. Let's 
say I'm standing more on the "stonith (ideally HW) as a hard requirement" side.


Honza




Thanks,
Rohit

On Tue, Jul 14, 2020 at 3:40 PM Rohit Saini 
wrote:


Thanks Honza. That's helpful.
Let's say I don't use qnetd, can I achieve the same with a booth arbitrator?
Booth arbitrator works for geo-clusters, can the same arbitrator be reused
for local clusters as well?
Is it even possible technically?

Regards,
Rohit

On Tue, Jul 14, 2020 at 3:32 PM Jan Friesse  wrote:


Rohit,


Hi Team,
Can I execute corosync-qnetd and booth-arbitrator on the same VM in a
different geo site? What's the recommendation? Will it have any

limitations

in a production deployment?


There is no technical limitation. Both qnetd and booth are very
lightweight and work just fine with high latency links.

But I don't really have any real-life experiences with deployment where
both booth and qnetd are used. It should work, but I would recommend
proper testing - especially what happens when arbitrator node disappears.


Due to my architecture limitation, I have only one arbitrator available
which is on a 3rd site. To handle cluster split-brain errors, I am

thinking

to use same arbitrator for local cluster as well.
STONITH is not useful in my case as it is limited only to ILO and VIRT.


Keep in mind that neither qdevice nor booth is "replacement" for stonith.

Regards,
Honza



[image: image.png]

Thanks,
Rohit



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] qnetd and booth arbitrator running together in a 3rd geo site

2020-07-14 Thread Jan Friesse

Rohit,


Thanks Honza. That's helpful.
Let's say I don't use qnetd, can I achieve the same with a booth arbitrator?


That means to have two two-node clusters. Two-node cluster without 
fencing is strictly no.



Booth arbitrator works for geo-clusters, can the same arbitrator be reused
for local clusters as well?


I'm not sure that I understand question. Booth just gives ticket to 
(maximally) one of booth-sites.




Is it even possible technically?


The question is, what you are trying to achieve. If geo-cluster then 
stonith for sites + booth is probably best solution. If the cluster is 
more like a stretch cluster, then qnetd + stonith is enough.


And of course your idea (original one) should work too.

Honza




Regards,
Rohit

On Tue, Jul 14, 2020 at 3:32 PM Jan Friesse  wrote:


Rohit,


Hi Team,
Can I execute corosync-qnetd and booth-arbitrator on the same VM in a
different geo site? What's the recommendation? Will it have any

limitations

in a production deployment?


There is no technical limitation. Both qnetd and booth are very
lightweight and work just fine with high latency links.

But I don't really have any real-life experiences with deployment where
both booth and qnetd are used. It should work, but I would recommend
proper testing - especially what happens when arbitrator node disappears.


Due to my architecture limitation, I have only one arbitrator available
which is on a 3rd site. To handle cluster split-brain errors, I am

thinking

to use same arbitrator for local cluster as well.
STONITH is not useful in my case as it is limited only to ILO and VIRT.


Keep in mind that neither qdevice nor booth is "replacement" for stonith.

Regards,
Honza



[image: image.png]

Thanks,
Rohit



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] qnetd and booth arbitrator running together in a 3rd geo site

2020-07-14 Thread Jan Friesse

Rohit,


Hi Team,
Can I execute corosync-qnetd and booth-arbitrator on the same VM in a
different geo site? What's the recommendation? Will it have any limitations
in a production deployment?


There is no technical limitation. Both qnetd and booth are very 
lightweight and work just fine with high latency links.


But I don't really have any real-life experiences with deployment where 
both booth and qnetd are used. It should work, but I would recommend 
proper testing - especially what happens when arbitrator node disappears.
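As a rough sanity check during such testing (just the obvious status tools, run on the
respective hosts), the view from both sides can be compared:

# on the arbitrator VM
corosync-qnetd-tool -l      # qnetd's view of the connected cluster nodes
booth list                  # booth's view of the ticket state

# on a cluster node
corosync-quorumtool -s      # quorum state, including the qdevice vote
booth list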



Due to my architecture limitation, I have only one arbitrator available
which is on a 3rd site. To handle cluster split-brain errors, I am thinking
to use same arbitrator for local cluster as well.
STONITH is not useful in my case as it is limited only to ILO and VIRT.


Keep in mind that neither qdevice nor booth is "replacement" for stonith.

Regards,
  Honza



[image: image.png]

Thanks,
Rohit



___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Linux 8.2 - high totem token requires manual setting of ping_interval and ping_timeout

2020-06-25 Thread Jan Friesse

Robert,
thank you for the info/report. More comments inside.


All,
Hello.  Hope all is well.   I have been researching Oracle Linux 8.2 and ran 
across a situation that is not well documented.   I decided to provide some 
details to the community in case I am missing something.

Basically, if you increase the totem token above approximately 33000 with the 
knet transport, then a two node cluster will not properly form.   The exact 
threshold value will slightly fluctuate, depending on hardware type and 
debugging, but will consistently fail above 4.


At least corosync with 40sec timeout works just fine for me.

# corosync-cmapctl  | grep token
runtime.config.totem.token (u32) = 40650

# corosync-quorumtool
Quorum information
--
Date: Fri Jun 26 08:45:12 2020
Quorum provider:  corosync_votequorum
Nodes:2
Node ID:  1
Ring ID:  1.11be1
Quorate:  Yes

Votequorum information
--
Expected votes:   3
Highest expected: 3
Total votes:  2
Quorum:   2
Flags:Quorate

Membership information
--
Nodeid  Votes Name
 1  1 vmvlan-vmcos8-n05 (local)
 6  1 vmvlan-vmcos8-n06


It is indeed true that forming took a bit more time (30 sec to be more 
precise)




The failure to form a cluster would occur when running the "pcs cluster start 
--all" command or if I would start one cluster, let it stabilize, then start the 
second.  When it fails to form a cluster, each side would say they are ONLINE, but the 
other side is UNCLEAN(offline) (cluster state: partition WITHOUT quorum).   If I define 
proper stonith resources, then they will not fence since the cluster never makes it to an 
initial quorum state.  So, the cluster will stay in this split state indefinitely.


Maybe some timeout in pcs?



Changing the transport back to udpu or udp, the higher totem tokens worked as 
expected.


Yup. You've correctly found out that the knet_* timeouts help. Basically 
knet keeps a link marked as not working until it gets enough pongs. UDP/UDPU doesn't 
have this concept, so it will form the cluster faster.




 From the debug logging, I suspect that the Election Trigger (20 seconds) fires
before all nodes are properly identified by the knet transport.  I noticed that
with a totem token passing 32 seconds, the knet_ping* defaults were pushing up
against that 20 second mark.  The output of "corosync-cfgtool -s" will show each
node's link as enabled, but each side will state the other side's link is not
connected.  Since each side thinks the other node is not active, they fail to
properly send a join message to the other node during the election.  They will
essentially form a singleton cluster(??).


Up to this point your analysis is correct. Corosync is really unable to send 
the join message and forms a single-node cluster.



It is more puzzling when you start one node at a time, waiting for the node to 
stabilize before starting the other.   It is like the first node will never see 
the remote knet interfaces become active, regardless of how long you wait.


This shouldn't happen. Knet will eventually receive enough pongs, so 
corosync broadcasts a message to the other nodes, which find out that a new 
membership should be formed.




The solution is to manually set the knet ping_timeout and ping_interval to 
lower values than the default values derived from the totem token.  This seems 
to allow for the knet transport to determine link status of all nodes before 
the election timer pops.


These timeouts are indeed not the best ones. I have a few ideas how to 
improve them, because currently they favor clusters with multiple links. 
Single-link clusters may work better with slightly different 
defaults.




I tested this on both physical hardware and with VMs.  Both react similarly.

Bare bones test case to reproduce:
yum install pcs pacemaker fence-agents-all
firewall-cmd --permanent --add-service=high-availability
firewall-cmd --add-service=high-availability
systemctl start pcsd.service
systemctl enable pcsd.service
systemctl disable corosync
systemctl disable pacemaker
passwd hacluster
pcs host auth node1 node2
pcs cluster setup rhcs_test node1 node2 totem token=41000
pcs cluster start --all

Example command to create cluster that will properly form and get quorum:
pcs cluster setup rhcs_test node1 node2 totem token=61000 transport knet link 
ping_interval=1250 ping_timeout=2500
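(For reference, the equivalent settings expressed directly in corosync.conf would be
roughly the following; key names as per corosync.conf(5), values taken from the pcs
command above:)

totem {
    token: 61000
    interface {
        linknumber: 0
        knet_ping_interval: 1250
        knet_ping_timeout: 2500
    }
}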

Hope this helps someone in the future.


Yup. It is interesting finding and thanks for that.

Regards,
  Honza



Thanks
Robert


Robert Hayden | Lead Technology Architect | Cerner Corporation


CONFIDENTIALITY NOTICE This message and any included attachments are from 
Cerner Corporation and are intended only for the addressee. The information 
contained in this message is confidential and may constitute inside or 
non-public information under international, federal, or state securities laws. 
Unauthorized forwarding, printing

Re: [ClusterLabs] Rolling upgrade from Corosync 2.3+ to Corosync 2.99+ or Corosync 3.0+?

2020-06-11 Thread Jan Friesse

Thank you very much for your help!
We did try to go to V3.0.3-5 and then dropped to 2.99 in hope that it may work 
with rolling upgrade (we were fooled by the same major version (2)). Our fresh 
install works fine on V3.0.3-5.
Do you know if it is possible to build Pacemaker 3.0.3-5 and Corosync 2.0.3 on Fedora 22 so that I 


Good question. Fedora 22 is quite old but close to RHEL 7 for which we 
build packages automatically (https://kronosnet.org/builds/) so it 
should be possible. But you are really on your own, because I don't 
think anybody ever tried it.


Regards,
  Honza



upgrade the stack before starting "real" upgrade of the product?

Then I can do the following sequence:
1. "quick" full shutdown for HA stack upgrade to 3.0 version
2. start HA stack on the old OS and product version with Pacemaker 3.0.3 and 
bring the product online
3. start rolling upgrade for product upgrade to the new OS and product version
Thanks again for your help!
_Vitaly


On June 11, 2020 3:30 AM Jan Friesse  wrote:

  
Vitaly,



Hello everybody.
We are trying to do a rolling upgrade from Corosync 2.3.5-1 to Corosync 2.99+. 
It looks like they are not compatible and we are getting messages like:


Yes, they are not wire compatible. Also please do not use 2.99 versions;
these were alpha/beta/rc builds before 3.0, and 3.0 has actually been
released for quite a long time (3.0.4 is the latest and I would recommend using it -
there were quite a few important bugfixes between 3.0.0 and 3.0.4)



Jun 11 02:10:20 d21-22-left corosync[6349]:   [TOTEM ] Message received from 
172.18.52.44 has bad magic number (probably sent by Corosync 2.3+).. Ignoring
on the upgraded node and
Jun 11 01:02:37 d21-22-right corosync[14912]:   [TOTEM ] Invalid packet data
Jun 11 01:02:38 d21-22-right corosync[14912]:   [TOTEM ] Incoming packet has 
different crypto type. Rejecting
Jun 11 01:02:38 d21-22-right corosync[14912]:   [TOTEM ] Received message has 
invalid digest... ignoring.
on the pre-upgrade node.

Is there a good way to do this upgrade?


Usually the best way is to start from scratch in a testing environment to make
sure everything works as expected. Then you can shut down the current
cluster, upgrade and start it again - the config file is mostly compatible;
you may just consider changing the transport to knet. I don't think there is
any definitive guide for doing the upgrade without shutting down the whole cluster,
but somebody else may have an idea.

Regards,
Honza


I would appreciate it very much if you could point me to any documentation or 
articles on this issue.
Thank you very much!
_Vitaly
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Redudant Ring Network failure

2020-06-11 Thread Jan Friesse

Michael,


Jan,

actually we using this.

[root@lvm-nfscpdata-05ct::~ 100 ]# apt show corosync
Package: corosync
Version: 3.0.1-2+deb10u1

[root@lvm-nfscpdata-05ct::~]# apt show libknet1
Package: libknet1
Version: 1.8-2

This are the newest version provided on Mirror.


yup, but these are pretty old anyway and there are quite a few bugs (and 
many of them may explain the behavior you see).


I would suggest you either try an upstream code compilation or (if you 
need to stick with packages) give the sid packages or the Proxmox 
repositories a try (where knet is at version 1.15 and corosync 3.0.3).


Regards,
  Honza






Sitz der Gesellschaft / Corporate Headquarters: Deutsche Lufthansa 
Aktiengesellschaft, Koeln, Registereintragung / Registration: Amtsgericht Koeln 
HR B 2168
Vorsitzender des Aufsichtsrats / Chairman of the Supervisory Board: Dr. 
Karl-Ludwig Kley
Vorstand / Executive Board: Carsten Spohr (Vorsitzender / Chairman), Thorsten 
Dirks, Christina Foerster, Harry Hohmeister, Dr. Detlef Kayser, Dr. Michael 
Niggemann


-Ursprüngliche Nachricht-
Von: Jan Friesse 
Gesendet: Mittwoch, 10. Juni 2020 09:24
An: Cluster Labs - All topics related to open-source clustering welcomed 
; ROHWEDER-NEUBECK, MICHAEL (EXTERN) 
; us...@lists.clusterlabs.org
Betreff: Re: [ClusterLabs] Redudant Ring Network failure

Michael,
what version of knet you are using? We had quite a few problems with older 
versions of knet, so current stable is recommended (1.16). Same applies for 
corosync because 3.0.4 has vastly improved display of links status.


Hello,
We have massive problems with the redundant ring operation of our Corosync / 
pacemaker 3 Node NFS clusters.

Most of the nodes either have an entire ring offline or only 1 node in a ring.
Example: (Node1 Ring0 333 Ring1 n33 | Node2 Ring0 033 Ring1 3n3 |
Node3 Ring0 333 Ring 1 33n)


Doesn't seem completely wrong. You can ignore 'n' for ring 1, because that is 
localhost which is connected only on Ring 0 (3.0.4 has this output more 
consistent) so all nodes are connected at least via Ring 1.
Ring 0 on node 2 seems to have some trouble with connection to node 1 but node 
1 (and 3) seems to be connected to node 2 just fine, so I think it is either 
some bug in knet (probably already fixed) or some kind of firewall blocking 
just connection from node 2 to node 1 on ring 0.




corosync-cfgtool -R don't help
All nodes are VMs that build the ring together using 2 VLANs.
Which logs do you need to hopefully help me?


syslog/journal should contain everything needed especially when debug is 
enabled (corosync.conf - logging.debug: on)

Regards,
Honza



Corosync Cluster Engine, version '3.0.1'
Copyright (c) 2006-2018 Red Hat, Inc.
Debian Buster


--
Mit freundlichen Grüßen
Michael Rohweder-Neubeck

NSB GmbH – Nguyen Softwareentwicklung & Beratung GmbH Röntgenstraße 27
D-64291 Darmstadt
E-Mail: m...@nsb-software.de
Manager: Van-Hien Nguyen, Jörg Jaspert
USt-ID: DE 195 703 354; HRB 7131 Amtsgericht Darmstadt




Sitz der Gesellschaft / Corporate Headquarters: Deutsche Lufthansa
Aktiengesellschaft, Koeln, Registereintragung / Registration:
Amtsgericht Koeln HR B 2168 Vorsitzender des Aufsichtsrats / Chairman
of the Supervisory Board: Dr. Karl-Ludwig Kley Vorstand / Executive
Board: Carsten Spohr (Vorsitzender / Chairman), Thorsten Dirks,
Christina Foerster, Harry Hohmeister, Dr. Detlef Kayser, Dr. Michael
Niggemann




___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] New user needs some help stabilizing the cluster

2020-06-11 Thread Jan Friesse

Howard,



Good morning.  Thanks for reading.  We have a requirement to provide high
availability for PostgreSQL 10.  I have built a two node cluster with a
quorum device as the third vote, all running on RHEL 8.

Here are the versions installed:
[postgres@srv2 cluster]$ rpm -qa|grep
"pacemaker\|pcs\|corosync\|fence-agents-vmware-soap\|paf"
corosync-3.0.2-3.el8_1.1.x86_64
corosync-qdevice-3.0.0-2.el8.x86_64
corosync-qnetd-3.0.0-2.el8.x86_64
corosynclib-3.0.2-3.el8_1.1.x86_64
fence-agents-vmware-soap-4.2.1-41.el8.noarch
pacemaker-2.0.2-3.el8_1.2.x86_64
pacemaker-cli-2.0.2-3.el8_1.2.x86_64
pacemaker-cluster-libs-2.0.2-3.el8_1.2.x86_64
pacemaker-libs-2.0.2-3.el8_1.2.x86_64
pacemaker-schemas-2.0.2-3.el8_1.2.noarch
pcs-0.10.2-4.el8.x86_64
resource-agents-paf-2.3.0-1.noarch

These are vmare VMs so I configured the cluster to use the ESX host as the
fencing device using fence_vmware_soap.

Throughout each day things generally work very well.  The cluster remains
online and healthy. Unfortunately, when I check pcs status in the mornings,
I see that all kinds of things went wrong overnight.  It is hard to
pinpoint what the issue is as there is so much information being written to
the pacemaker.log. Scrolling through pages and pages of informational log
entries trying to find the lines that pertain to the issue.  Is there a way
to separate the logs out to make it easier to scroll through? Or maybe a
list of keywords to GREP for?


The most important info is following line:

> Jun 10 00:06:41 [10558] srv2 corosync warning [MAIN  ] Corosync main
> process was not scheduled for 13006.0615 ms (threshold is 800. ms).
> Consider token timeout increase.

There are more of these, so you can either make sure the VM is not paused 
for such a long time or increase the token timeout so corosync is able to 
handle such a pause.
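For illustration only (15000 ms is just an example sized to cover the ~13 s pause from
the log above; pick a value that fits your environment), the token can be raised in
/etc/corosync/corosync.conf on every node and the configuration reloaded:

totem {
    version: 2
    token: 15000
    ...
}

# after the file has been updated on all nodes
corosync-cfgtool -R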


Regards,
  Honza



It is clearly indicating that the server lost contact with the other node
and also the quorum device. Is there a way to make this configuration more
robust or able to recover from a connectivity blip?

Here are the pacemaker and corosync logs for this morning's failures:
pacemaker.log
/var/log/pacemaker/pacemaker.log:Jun 10 00:06:42 srv2 pacemakerd
  [10573] (pcmk_quorum_notification)   warning: Quorum lost |
membership=952 members=1
/var/log/pacemaker/pacemaker.log:Jun 10 00:06:42 srv2 pacemaker-controld
  [10579] (pcmk_quorum_notification)   warning: Quorum lost |
membership=952 members=1
/var/log/pacemaker/pacemaker.log:Jun 10 00:06:43 srv2
pacemaker-schedulerd[10578] (pe_fence_node)  warning: Cluster node srv1
will be fenced: peer is no longer part of the cluster
/var/log/pacemaker/pacemaker.log:Jun 10 00:06:43 srv2
pacemaker-schedulerd[10578] (determine_online_status)warning: Node
srv1 is unclean
/var/log/pacemaker/pacemaker.log:Jun 10 00:06:43 srv2
pacemaker-schedulerd[10578] (custom_action)  warning: Action
pgsqld:1_demote_0 on srv1 is unrunnable (offline)
/var/log/pacemaker/pacemaker.log:Jun 10 00:06:43 srv2
pacemaker-schedulerd[10578] (custom_action)  warning: Action
pgsqld:1_stop_0 on srv1 is unrunnable (offline)
/var/log/pacemaker/pacemaker.log:Jun 10 00:06:43 srv2
pacemaker-schedulerd[10578] (custom_action)  warning: Action
pgsqld:1_demote_0 on srv1 is unrunnable (offline)
/var/log/pacemaker/pacemaker.log:Jun 10 00:06:43 srv2
pacemaker-schedulerd[10578] (custom_action)  warning: Action
pgsqld:1_stop_0 on srv1 is unrunnable (offline)
/var/log/pacemaker/pacemaker.log:Jun 10 00:06:43 srv2
pacemaker-schedulerd[10578] (custom_action)  warning: Action
pgsqld:1_demote_0 on srv1 is unrunnable (offline)
/var/log/pacemaker/pacemaker.log:Jun 10 00:06:43 srv2
pacemaker-schedulerd[10578] (custom_action)  warning: Action
pgsqld:1_stop_0 on srv1 is unrunnable (offline)
/var/log/pacemaker/pacemaker.log:Jun 10 00:06:43 srv2
pacemaker-schedulerd[10578] (custom_action)  warning: Action
pgsqld:1_demote_0 on srv1 is unrunnable (offline)
/var/log/pacemaker/pacemaker.log:Jun 10 00:06:43 srv2
pacemaker-schedulerd[10578] (custom_action)  warning: Action
pgsqld:1_stop_0 on srv1 is unrunnable (offline)
/var/log/pacemaker/pacemaker.log:Jun 10 00:06:43 srv2
pacemaker-schedulerd[10578] (custom_action)  warning: Action
pgsql-master-ip_stop_0 on srv1 is unrunnable (offline)
/var/log/pacemaker/pacemaker.log:Jun 10 00:06:43 srv2
pacemaker-schedulerd[10578] (stage6) warning: Scheduling Node srv1
for STONITH
/var/log/pacemaker/pacemaker.log:Jun 10 00:06:43 srv2
pacemaker-schedulerd[10578] (pcmk__log_transition_summary)   warning:
Calculated transition 2 (with warnings), saving inputs in
/var/lib/pacemaker/pengine/pe-warn-34.bz2
/var/log/pacemaker/pacemaker.log:Jun 10 00:06:45 srv2 pacemaker-controld
  [10579] (crmd_ha_msg_filter) warning: Another DC detected: srv1
(op=join_offer)
/var/log/pacemaker/pacemaker.log:Jun 10 00:06:45 srv2 pacemaker-controld
  [10579] (destroy_action) warning: Cancelling timer for action 3
(src=307)
/var/log/pacemaker/pacemaker.log:Jun 10 00:06:45 sr

Re: [ClusterLabs] Rolling upgrade from Corosync 2.3+ to Corosync 2.99+ or Corosync 3.0+?

2020-06-11 Thread Jan Friesse

Vitaly,


Hello everybody.
We are trying to do a rolling upgrade from Corosync 2.3.5-1 to Corosync 2.99+. 
It looks like they are not compatible and we are getting messages like:


Yes, they are not wire compatible. Also please do not use 2.99 versions; 
these were alpha/beta/rc builds before 3.0, and 3.0 has actually been 
released for quite a long time (3.0.4 is the latest and I would recommend using it - 
there were quite a few important bugfixes between 3.0.0 and 3.0.4)




Jun 11 02:10:20 d21-22-left corosync[6349]:   [TOTEM ] Message received from 
172.18.52.44 has bad magic number (probably sent by Corosync 2.3+).. Ignoring
on the upgraded node and
Jun 11 01:02:37 d21-22-right corosync[14912]:   [TOTEM ] Invalid packet data
Jun 11 01:02:38 d21-22-right corosync[14912]:   [TOTEM ] Incoming packet has 
different crypto type. Rejecting
Jun 11 01:02:38 d21-22-right corosync[14912]:   [TOTEM ] Received message has 
invalid digest... ignoring.
on the pre-upgrade node.

Is there a good way to do this upgrade?


Usually the best way is to start from scratch in a testing environment to make 
sure everything works as expected. Then you can shut down the current 
cluster, upgrade and start it again - the config file is mostly compatible; 
you may just consider changing the transport to knet. I don't think there is 
any definitive guide for doing the upgrade without shutting down the whole cluster, 
but somebody else may have an idea.
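A minimal corosync 3 / knet style config for such a fresh start could look roughly like
this (cluster name, node names and addresses are placeholders; the crypto settings are
optional):

totem {
    version: 2
    cluster_name: mycluster
    transport: knet
    crypto_cipher: aes256
    crypto_hash: sha256
}

nodelist {
    node {
        ring0_addr: 192.0.2.11
        name: node1
        nodeid: 1
    }
    node {
        ring0_addr: 192.0.2.12
        name: node2
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    # for a 2-node cluster only
    two_node: 1
}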


Regards,
  Honza


I would appreciate it very much if you could point me to any documentation or 
articles on this issue.
Thank you very much!
_Vitaly
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Redudant Ring Network failure

2020-06-10 Thread Jan Friesse

Michael,
what version of knet you are using? We had quite a few problems with 
older versions of knet, so current stable is recommended (1.16). Same 
applies for corosync because 3.0.4 has vastly improved display of links 
status.



Hello,
We have massive problems with the redundant ring operation of our Corosync / 
pacemaker 3 Node NFS clusters.

Most of the nodes either have an entire ring offline or only 1 node in a ring.
Example: (Node1 Ring0 333 Ring1 n33 | Node2 Ring0 033 Ring1 3n3 | Node3 Ring0 
333 Ring 1 33n)


Doesn't seem completely wrong. You can ignore 'n' for ring 1, because 
that is localhost which is connected only on Ring 0 (3.0.4 has this 
output more consistent) so all nodes are connected at least via Ring 1. 
Ring 0 on node 2 seems to have some trouble with connection to node 1 
but node 1 (and 3) seems to be connected to node 2 just fine, so I think 
it is either some bug in knet (probably already fixed) or some kind of 
firewall blocking just connection from node 2 to node 1 on ring 0.





corosync-cfgtool -R don't help
All nodes are VMs that build the ring together using 2 VLANs.
Which logs do you need to hopefully help me?


syslog/journal should contain everything needed especially when debug is 
enabled (corosync.conf - logging.debug: on)
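i.e. something like this in the logging section (only the debug key needs to change;
the rest of the existing logging settings can stay as they are):

logging {
    to_syslog: yes
    debug: on
}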


Regards,
  Honza



Corosync Cluster Engine, version '3.0.1'
Copyright (c) 2006-2018 Red Hat, Inc.
Debian Buster


--
Mit freundlichen Grüßen
   Michael Rohweder-Neubeck

NSB GmbH – Nguyen Softwareentwicklung & Beratung GmbH Röntgenstraße 27
D-64291 Darmstadt
E-Mail: m...@nsb-software.de
Manager: Van-Hien Nguyen, Jörg Jaspert
USt-ID: DE 195 703 354; HRB 7131 Amtsgericht Darmstadt




Sitz der Gesellschaft / Corporate Headquarters: Deutsche Lufthansa 
Aktiengesellschaft, Koeln, Registereintragung / Registration: Amtsgericht Koeln 
HR B 2168
Vorsitzender des Aufsichtsrats / Chairman of the Supervisory Board: Dr. 
Karl-Ludwig Kley
Vorstand / Executive Board: Carsten Spohr (Vorsitzender / Chairman), Thorsten 
Dirks, Christina Foerster, Harry Hohmeister, Dr. Detlef Kayser, Dr. Michael 
Niggemann




___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Merging partitioned two_node cluster?

2020-05-06 Thread Jan Friesse

Richard,


So I tried an experiment.  I had tried switch over to 'udpu' unicast
transport, but corosync threw an error starting up (which I did not drill
down on yet.)

I went over to my test environment and did the same thing and it worked
fine, the cluster worked and everything.

One thing that is different there is the test environment is on a network
with CIDR /22, and the production network is on a CIDR /26.

So, for my bindnetaddr on my test network, I have the address of my VIP, or
'192.168.193.113'.  That would be a network base of '192.168.192.0'.

But my production network is on '192.168.83.131', That would be a network
base of '192.168.83.128'.

I have tried hardcoding the 'bindnetaddr' to the network base
'192.168.83.128', but it still throws an error.  Perhaps it is 'zeroing'
out the bindnetaddr least significant byte to make the network base?

Is it possible that the calculation of my base network in 'bindnetaddr'
doesn't account for networks with CIDR mask bits greater than 24?  (which
would have non-zero least significant bytes.)


I'm almost sure the binding addr checking works correctly, but as long as 
you don't use bindnetaddr - so delete the whole interface section - it is 
not used at all and matching against the complete node IP (from the nodelist) is 
used instead. bindnetaddr matching used to make sense before the nodelist config 
stanza was added, so the config file could be easily copied around the nodes 
without needing changes (which worked quite well, but only for IPv4).
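In other words, the totem section could be reduced to something like the following
sketch (keeping the existing nodelist, quorum and logging sections unchanged; udpu is
the unicast transport already discussed in this thread):

totem {
    version: 2
    cluster_name: mail
    clear_node_high_bit: yes
    transport: udpu
    crypto_cipher: none
    crypto_hash: none
}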


Honza



Thanks,

Rick





On Tue, May 5, 2020 at 12:03 PM Andrei Borzenkov 
wrote:


05.05.2020 16:44, Nickle, Richard пишет:

Thanks Honza and Andrei (and Strahil?  I might have missed a message in

the

thread...)



Yep, all messages from Strahil end up in spam folder.
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Merging partitioned two_node cluster?

2020-05-05 Thread Jan Friesse

On May 5, 2020 6:39:54 AM GMT+03:00, "Nickle, Richard"  
wrote:

I have a two node cluster managing a VIP.  The service is an SMTP
service.
This could be active/active, it doesn't matter which node accepts the
SMTP
connection, but I wanted to make sure that a VIP was in place so that
there
was a well-known address.

This service has been running for quite awhile with no problems.  All
of a
sudden, it partitioned, and now I can't work out a good way to get them
to
merge the clusters back again.  Right now one partition takes the
resource
and starts the VIP, but doesn't see the other node.  The other node
doesn't
create a resource, and can't seem to see the other node.

At this point, I am perfectly willing to create another node and make
an
odd-numbered cluster, the arguments for this being fairly persuasive.
But
I'm not sure why they are blocking.

Surely there must be some manual way to get a partitioned cluster to
merge?  Some trick?  I also had a scenario several weeks ago where an
odd-numbered cluster configured in a similar way partitioned into a 3
and 2
node cluster, and I was unable to work out how to get them to merge,
until
all of a sudden they seemed to fix themselves after doing a 'pcs node
remove/pcs node add' which had failed many times before.  I have tried
that
here but with no success so far.

I ruled out some common cases I've seen in discussions and threads,
such as
having my host name defined in host as localhost, etc.

Corosync 2.4.3, Pacemaker 0.9.164. (Ubuntu 18.04.).

Output from pcs status for both nodes:

Cluster name: mail
Stack: corosync
Current DC: mail2 (version 1.1.18-2b07d5c5a9) - partition with quorum
Last updated: Mon May  4 23:28:53 2020
Last change: Mon May  4 21:50:04 2020 by hacluster via crmd on mail2

2 nodes configured
1 resource configured

Online: [ mail2 ]
OFFLINE: [ mail3 ]

Full list of resources:

mail_vip (ocf::heartbeat:IPaddr2): Started mail2

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Cluster name: mail
Stack: corosync
Current DC: mail3 (version 1.1.18-2b07d5c5a9) - partition with quorum
Last updated: Mon May  4 22:13:10 2020
Last change: Mon May  4 22:10:34 2020 by root via cibadmin on mail3

2 nodes configured
0 resources configured

Online: [ mail3 ]
OFFLINE: [ mail2 ]

No resources

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

/etc/corosync/corosync.conf:

totem {
version: 2
cluster_name: mail
clear_node_high_bit: yes
crypto_cipher: none
crypto_hash: none

interface {
ringnumber: 0
bindnetaddr: 192.168.80.128
mcastport: 5405
}
}

logging {
fileline: off
to_stderr: no
to_logfile: no
to_syslog: yes
syslog_facility: daemon
debug: off
timestamp: on
}

quorum {
provider: corosync_votequorum
wait_for_all: 0
two_node: 1
}

nodelist {
node {
ring0_addr: mail2
name: mail2
nodeid: 1
}

node {
ring0_addr: mail3
name: mail3
nodeid: 2
}
}

Thanks!

Rick


Ah Rick, all,

Just ignore the previous one - I guess I'm too sleepy.


Honestly I think your advice was good. The current config uses the default 
transport, which for 2.4.3 means multicast, so trying unicast (udpu) may 
solve the problem.
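
For reference, with corosync 2.x that is a one-line change in the totem 
section of the corosync.conf quoted above (a sketch only - the nodelist and 
the rest of the file stay as posted):

totem {
    version: 2
    cluster_name: mail
    clear_node_high_bit: yes
    crypto_cipher: none
    crypto_hash: none
    transport: udpu
}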


If not, I would take a look at the classic things like the firewall, ...

Regards,
  Honza




Best Regards,
Strahil Nikolov
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





[ClusterLabs] Corosync 3.0.4 is available at corosync.org!

2020-04-23 Thread Jan Friesse

I am pleased to announce the latest maintenance release of Corosync
3.0.4 available immediately from GitHub release section at 
https://github.com/corosync/corosync/releases or our website at

http://build.clusterlabs.org/corosync/releases/.

This release contains important bug fixes and a few improvements. The most 
notable ones are:


- corosync-cfgtool link status output is enhanced to be more readable

- Runtime change of two_node mode also changes wait_for_all when it is 
not set explicitly


- Internal wait_for_all_status flag is set only on startup and not 
during runtime reload. This fixes possible inconsistency between quorum 
and votequorum quorate information.


- schedmiss events are now also stored in the cmap stats map. These keys 
are mostly intended for automated tools (like Insights).
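
These keys can be dumped with corosync-cmapctl; a quick sketch (the exact 
key names under the stats prefix may differ between versions):

# list the stats map; scheduler-miss entries appear under keys containing "schedmiss"
corosync-cmapctl -m stats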


Complete changelog for 3.0.4:

Christine Caulfield (2):
  stats: Add stats for scheduler misses
  icmap: icmap_init_r() leaks if trie_create() fails

Ferenc Wágner (1):
  man: move cmap_keys man page from section 8 to 7

Hideo Yamauchi (3):
  cpg: Change downlist log level
  totemknet: Change the initial value of the status
  cfgtool: Fix error code as described in MP

Jan Friesse (32):
  totemconfig: Free leaks found by coverity
  votequorum: Ignore the icmap_get_* return value
  logconfig: Remove double free of value
  totemconfig: Reuse already fetched pointer
  cmap: Assert copied string length
  stats: Assert value_len when value is needed
  totemknet: Don't mix corosync and knet error codes
  sync: Assert sync_callbacks.name length
  totemconfig: Initialize warnings variable
  totemknet: Check result of fcntl O_NONBLOCK call
  totemknet: Assert strcpy length
  votequorum: Assert copied strings length
  cpghum: Remove unused time variables and functions
  cfgtool: Remove unused callbacks
  cmapctl: Free bin_value on error
  notifyd: Check cmap_track_add result
  quorumtool: Assert copied string length
  stats: Check return code of stats_map_get
  votequorum: Reflect runtime change of 2Node to WFA
  stats: Use nanoseconds from epoch for schedmiss
  cfgtool: Improve link status display
  totemip: Add support for sin6_scope_id
  totemip: Remove unused totemip_copy_endian_convert
  totemip: Really remove totemip_copy_endian_convert
  main: Add schedmiss timestamp into message
  man: Enhance link_mode priority description
  cfgtool: Simplify output a bit for link status
  votequorum: Change check of expected_votes
  quorumtool: exit on invalid expected votes
  votequorum: set wfa status only on startup
  Revert "totemip: Add support for sin6_scope_id"
  Revert "totemip: compare sin6_scope_id and interface_num"

liangxin1300 (1):
  totemip: compare sin6_scope_id and interface_num


Upgrade is highly recommended.

Thanks/congratulations to all people that contributed to achieve this
great milestone.

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] qdevice up and running -- but questions

2020-04-14 Thread Jan Friesse



On 4/11/20 6:52 PM, Eric Robinson wrote:

 1. What command can I execute on the qdevice node which tells me which
    client nodes are connected and alive?



i use
corosync-qnetd-tool -v -l


 2. In the output of the pcs qdevice status command, what is the 
meaning of…


 Vote:   ACK (ACK)


This is documented in the corosync-qnetd-tool man page.



 3. In the output of the  pcs quorum status Command, what is the 
meaning of…


Membership information

--

 Nodeid  Votes    Qdevice Name

  1  1    A,V,NMW 001db03a

  2  1    A,V,NMW 001db03b (local)


Pcs just displays the output of corosync-quorumtool. The Nodeid and Votes 
columns are probably obvious, Name is the reverse-DNS-resolved name of the 
node's IP address (-i suppresses this behavior), and the Qdevice column 
really needs to be documented, so:
A/NA - Qdevice is alive / not alive. This is a rather internal flag, but 
it can be seen as a heartbeat between qdevice and corosync. It should 
always be alive.

V/NV - Qdevice has cast its vote / has not cast its vote. Take it as an ACK/NACK.
MW/NMW - Master wins / not master wins. This is a really internal flag. By 
default corosync-qdevice never asks for master wins, so it is going to be 
NMW. For more information on what it means, see 
votequorum_qdevice_master_wins(3)


Honza




--Eric

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





Re: [ClusterLabs] temporary loss of quorum when member starts to rejoin

2020-04-09 Thread Jan Friesse

Sherrard Burton napsal(a):



On 4/8/20 1:09 PM, Andrei Borzenkov wrote:

08.04.2020 10:12, Jan Friesse пишет:

Sherrard,


i could not determine which of these sub-threads to include this in,
so i am going to (reluctantly) top-post it.

i switched the transport to udp, and in limited testing i seem to not
be hitting the race condition. of course i have no idea whether this
will behave consistently, or which part of the knet vs udp setup makes
the most difference.

ie, is it the overhead of the crypto handshakes/setup? is there some
other knet layer that imparts additional delay in establishing
connection to other nodes? is the delay on the rebooted node, the
standing node, or both?



Very high level, what is happening in corosync when using udpu:
- Corosync starts and begins in the gather state -> it sends a "multicast"
(emulated by unicast to all expected members) message telling "I'm here
and this is my view of live nodes".
- In this state, corosync waits for answers
- When a node receives this message, it "multicasts" the same message with
an updated view of live nodes
- After all nodes agree, they move to the next state (commit/recovery and
finally operational)

With udp, this happens almost instantly, so most of the time corosync doesn't
even create a single-node membership, which would be created if no other
nodes existed and/or replies weren't delivered in time.



Is it possible to delay "creating single node membership" until some
reasonable initial timeout after corosync starts to ensure node view of
cluster is up to date? It is clear that there will always be some corner
cases, but at least this would make "obviously correct" configuration to
behave as expected.

Corosync already must have timeout to declare peers unreachable - it
sounds like most logical to use in this case.



i tossed that idea around in my head as well. basically if there was an 
analogue client_leaving called client_joining that could be used to 
allowed the qdevice to return 'ask later'.


It is there.



i think the trade-off here is that you sacrifice some responsiveness in 
your failover times, since (i'm guessing) the timeout for declaring 
peers unreachable errs on the side of caution.


the other hairy bit is determining the difference between a new 
(illegitimate) single-node membership, and the existing (legitimate) 
single-node membership. both are equally legitimate from the standpoint 
of each client, which can see the qdevice, but not the peer, and from 
the standpoint of the qdevice, which can see both clients.


Yep. Actually this is really a situation I hadn't thought about. It is 
quite special, because for more than 2 nodes it works as it should (a 
single-node partition never gets a vote then). That doesn't mean a 2-node 
cluster is not important - it's quite the opposite - this is where qdevice 
makes sense.




as such, i suspect that this all comes right back to figuring out how to 
implement issue #7.


It's not hard, it is just quite some work to do. I'm on it, but I have 
no ETA yet (and of course the current real-life situation doesn't help 
too much). When I have something, I will let you know and would be happy 
if you were able to test it.


Regards,
  Honza






Knet adds a layer which monitors the links between each of the nodes and it
will make a link active only after it has received a configured number of "pong"
packets. The idea behind this is to have evidence of a reasonably stable line. As
long as the link is not active, no data packets go through (corosync traffic is
just "data"). This basically means that the initial corosync multicast is
not delivered to the other nodes, so corosync creates a single-node membership.
After the link becomes active, the "multicast" is delivered to the other nodes and
they move to the gather state.



I would expect "reasonable timeout" to also take into account the knet delay.


So to answer your question: the "delay" is on both nodes' side, because the link is
not established between the nodes.



knet was expected to improve things, was not it? :)





___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] temporary loss of quorum when member starts to rejoin

2020-04-09 Thread Jan Friesse

Sherrard Burton napsal(a):



On 4/7/20 4:09 AM, Jan Friesse wrote:

Sherrard and Andrei



On 4/6/20 4:10 PM, Andrei Borzenkov wrote:

06.04.2020 20:57, Sherrard Burton пишет:

It looks like some timing issue or race condition. After reboot node
manages to contact qnetd first, before connection to other node is
established. Qnetd behaves as documented - it sees two equal size
partitions and favors the partition that includes tie breaker (lowest
node id). So existing node goes out of quorum. Second later both nodes
see each other and so quorum is regained.


Nice catch




thank you for taking the time to troll through my debugging output. 
your explanation seems to accurately describe what i am experiencing. 
of course i have no idea how to remedy it. :-)


It is really quite a problem. Honestly, I don't think there is really 
a way to remedy this behavior other than implementing an option to 
prefer the active partition as a tie-breaker 
(https://github.com/corosync/corosync-qdevice/issues/7).


Jan,
my curiosity got the best of me, so i spent some time trying to orient 
myself to the inner workings of ffsplit.


a) how would one identify the current active partition? i might be 


This is not tracked yet. This is the first thing to do - add the last known 
membership into the qnetd cluster structure.


starting too far in (or missing something), but it seems that by the 
time we are in qnetd_algo_ffsplit_partition_cmp(), we are comparing two 
sets of clients and node lists without the kind of context that would 
allow us to identify the current active partition. i could not easily 
identify the object that we would interrogate to answer that question.


b) is it possible to manage client->tie_breaker.mode and 
client->tie_breaker.node_id dynamically to achieve the desired goal? ie, 
if we are in a two-node cluster and one node leaves, can we "push" 
values to the remaining client such that client->tie_breaker.mode == 
TLV_TIE_BREAKER_MODE_NODE_ID and client->tie_breaker.node_id == 
client->node_id?


It would be possible and it would probably work quite well for a 2-node 
cluster. But imagine a cluster with more nodes - this is where things 
become interesting.


Regards,
  Honza



of course i may be way off base with all of this. just wanted to ask 
before i extracted myself from the rabbit hole.




___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] temporary loss of quorum when member starts to rejoin

2020-04-09 Thread Jan Friesse

Andrei Borzenkov napsal(a):

08.04.2020 10:12, Jan Friesse пишет:

Sherrard,


i could not determine which of these sub-threads to include this in,
so i am going to (reluctantly) top-post it.

i switched the transport to udp, and in limited testing i seem to not
be hitting the race condition. of course i have no idea whether this
will behave consistently, or which part of the knet vs udp setup makes
the most difference.

ie, is it the overhead of the crypto handshakes/setup? is there some
other knet layer that imparts additional delay in establishing
connection to other nodes? is the delay on the rebooted node, the
standing node, or both?



Very high level, what is happening in corosync when using udpu:
- Corosync starts and begins in the gather state -> it sends a "multicast"
(emulated by unicast to all expected members) message telling "I'm here
and this is my view of live nodes".
- In this state, corosync waits for answers
- When a node receives this message, it "multicasts" the same message with
an updated view of live nodes
- After all nodes agree, they move to the next state (commit/recovery and
finally operational)

With udp, this happens almost instantly, so most of the time corosync doesn't
even create a single-node membership, which would be created if no other
nodes existed and/or replies weren't delivered in time.



Is it possible to delay "creating single node membership" until some
reasonable initial timeout after corosync starts to ensure node view of


The thing is, totemsrp begins by creating a single-node membership. It has 
to start somewhere. Of course the question is whether it would make sense to 
slow down a bit on startup to create a "better" membership. I would say so, 
and it is something I'm considering as a TODO.




cluster is up to date? It is clear that there will always be some corner
cases, but at least this would make "obviously correct" configuration to
behave as expected.

Corosync already must have timeout to declare peers unreachable - it
sounds like most logical to use in this case.


It does - the join timeout - but enlarging it will generally slow down 
failure detection/recovery.






Knet adds a layer which monitors the links between each of the nodes and it
will make a link active only after it has received a configured number of "pong"
packets. The idea behind this is to have evidence of a reasonably stable line. As
long as the link is not active, no data packets go through (corosync traffic is
just "data"). This basically means that the initial corosync multicast is
not delivered to the other nodes, so corosync creates a single-node membership.
After the link becomes active, the "multicast" is delivered to the other nodes and
they move to the gather state.



I would expect "reasonable timeout" to also take into account the knet delay.


So to answer your question: the "delay" is on both nodes' side, because the link is
not established between the nodes.



knet was expected to improve things, was not it? :)



And I believe it does :) Actually, it now behaves more "correctly" (read: 
"as the specification says") than before. Anyway, I got the point; it's 
in the TODO (https://github.com/corosync/corosync/issues/549)


Honza

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] how to properly add/delete qdevice for an existing cluster

2020-04-08 Thread Jan Friesse

please forgive me if i have overlooked the answer somewhere.

i have an existing cluster that is already configured with a qdevice. i 
now wish to update that configuration to point at a different qdevice.


background:
for the sake of working through the initial configuration details, 
tuning, etc, i initially spun up a qdevice that lives on the same VM 
host as one of the member nodes. obviously this is not robust enough for 
production setup, so i have created a new qdevice elsewhere. now i want 
to update the existing cluster to point at the new qdevice. re-running 
corosync-qdevice-net-certutil results in:


Node  seems to be already initialized. Please delete 
/etc/corosync/qdevice/net/nssdb



obviously, i could (and possibly must) muck around in 
/etc/corosync/qdevice/net/nssdb, but i thought i would ask on-list, for 
the benefit of future googlers, whether there is a proper way to manage 
the qdevice without blowing away the existing configuration and 
re-initializing.


You mean a different qnetd? If so, then it is enough to update the nodes' 
config to point to the new IP and copy the /etc/corosync/qnetd directory to 
the new qnetd server, making sure to chown the directory to the 
coroqnetd:coroqnetd user:group (this user gets created dynamically on 
installation, so the uid/gid may differ between machines).


Regards,
  Honza
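
In shell terms, the move Honza describes might look roughly like this 
(hostnames are placeholders; the qdevice restart step depends on the 
distribution):

# on the old qnetd host: copy the qnetd NSS database and certificates
scp -r /etc/corosync/qnetd new-qnetd-host:/etc/corosync/

# on the new qnetd host: fix ownership, since the coroqnetd uid/gid may differ
chown -R coroqnetd:coroqnetd /etc/corosync/qnetd

# on every cluster node: point quorum.device.net.host in corosync.conf at the
# new qnetd address, then restart corosync-qdevice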



TIA
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





Re: [ClusterLabs] temporary loss of quorum when member starts to rejoin

2020-04-08 Thread Jan Friesse

Sherrard,

i could not determine which of these sub-threads to include this in, so 
i am going to (reluctantly) top-post it.


i switched the transport to udp, and in limited testing i seem to not be 
hitting the race condition. of course i have no idea whether this will 
behave consistently, or which part of the knet vs udp setup makes the 
most difference.


ie, is it the overhead of the crypto handshakes/setup? is there some 
other knet layer that imparts additional delay in establishing 
connection to other nodes? is the delay on the rebooted node, the 
standing node, or both?




Very high level, what is happening in corosync when using udpu:
- Corosync starts and begins in the gather state -> it sends a "multicast" 
(emulated by unicast to all expected members) message telling "I'm here 
and this is my view of live nodes".

- In this state, corosync waits for answers
- When a node receives this message, it "multicasts" the same message with 
an updated view of live nodes
- After all nodes agree, they move to the next state (commit/recovery and 
finally operational)


With udp, this happens almost instantly, so most of the time corosync doesn't 
even create a single-node membership, which would be created if no other 
nodes existed and/or replies weren't delivered in time.



Knet adds a layer which monitors the links between each of the nodes and it 
will make a link active only after it has received a configured number of "pong" 
packets. The idea behind this is to have evidence of a reasonably stable line. As 
long as the link is not active, no data packets go through (corosync traffic is 
just "data"). This basically means that the initial corosync multicast is 
not delivered to the other nodes, so corosync creates a single-node membership. 
After the link becomes active, the "multicast" is delivered to the other nodes and 
they move to the gather state.


So to answer your question: the "delay" is on both nodes' side, because the link is 
not established between the nodes.


Honza
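
For completeness, the link monitoring described above can be tuned per 
interface in corosync.conf; a sketch with purely illustrative values (see 
corosync.conf(5) for the actual defaults):

totem {
    interface {
        linknumber: 0
        knet_ping_interval: 200
        knet_ping_timeout: 2000
        knet_pong_count: 2
    }
}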

ultimately i have to remind myself that "a race condition is a race 
condition", and that you can't chase micro-second improvements that may 
lessen the chance of triggering it. you have to solve the underlying 
problem.



thanks again folks, for your help, and the great work you are doing.


On 4/7/20 4:09 AM, Jan Friesse wrote:

Sherrard and Andrei




On 4/6/20 4:10 PM, Andrei Borzenkov wrote:

06.04.2020 20:57, Sherrard Burton пишет:



On 4/6/20 1:20 PM, Sherrard Burton wrote:



On 4/6/20 12:35 PM, Andrei Borzenkov wrote:

06.04.2020 17:05, Sherrard Burton пишет:


from the quorum node:

...

Apr 05 23:10:17 debug   Client :::192.168.250.50:54462 (cluster
xen-nfs01_xen-nfs02, node_id 1) sent quorum node list.
Apr 05 23:10:17 debug msg seq num = 6
Apr 05 23:10:17 debug quorate = 0
Apr 05 23:10:17 debug node list:
Apr 05 23:10:17 debug   node_id = 1, data_center_id = 0, node_state = member


Oops. How comes that node that was rebooted formed cluster all by
itself, without seeing the second node? Do you have two_nodes and/or
wait_for_all configured?



i never thought to check the logs on the rebooted server. hopefully
someone can extract some further useful information here:


https://pastebin.com/imnYKBMN



It looks like some timing issue or race condition. After reboot node
manages to contact qnetd first, before connection to other node is
established. Qnetd behaves as documented - it sees two equal size
partitions and favors the partition that includes tie breaker (lowest
node id). So existing node goes out of quorum. Second later both nodes
see each other and so quorum is regained.


Nice catch




thank you for taking the time to troll through my debugging output. 
your explanation seems to accurately describe what i am experiencing. 
of course i have no idea how to remedy it. :-)


It is really quite a problem. Honestly, I don't think there is really 
a way how to remedy this behavior other than implement option to 
prefer active partition as a tie-breaker 
(https://github.com/corosync/corosync-qdevice/issues/7).







I cannot reproduce it, but I also do not use knet. From documentation I
have impression that knet has artificial delay before it considers 
links

operational, so may be that is the reason.


i will do some reading on how knet factors into all of this and 
respond with any questions or discoveries.


knet_pong_count/knet_ping_interval tuning may help, but I don't think 
there is really a way to prevent creation of single node membership in 
all possible cases.








BTW, great eyes. i had not picked up on that little nuance. i had
poured through this particular log a number of times, but it was very
hard for me to discern the starting and stopping points for each
logical group of messages. the indentation made some of it clear. but
when you have a series of lines beginning in the left-most column, it
is not clear whether they belong to the previous group, the next group, or 
they are their own group.

Re: [ClusterLabs] temporary loss of quorum when member starts to rejoin

2020-04-07 Thread Jan Friesse

Sherrard,




On 4/7/20 4:09 AM, Jan Friesse wrote:

Sherrard and Andrei




On 4/6/20 4:10 PM, Andrei Borzenkov wrote:

06.04.2020 20:57, Sherrard Burton пишет:



On 4/6/20 1:20 PM, Sherrard Burton wrote:



On 4/6/20 12:35 PM, Andrei Borzenkov wrote:

06.04.2020 17:05, Sherrard Burton пишет:


from the quorum node:

...

Apr 05 23:10:17 debug   Client :::192.168.250.50:54462 (cluster
xen-nfs01_xen-nfs02, node_id 1) sent quorum node list.
Apr 05 23:10:17 debug msg seq num = 6
Apr 05 23:10:17 debug quorate = 0
Apr 05 23:10:17 debug node list:
Apr 05 23:10:17 debug   node_id = 1, data_center_id = 0, node_state = member


Oops. How comes that node that was rebooted formed cluster all by
itself, without seeing the second node? Do you have two_nodes and/or
wait_for_all configured?



i never thought to check the logs on the rebooted server. hopefully
someone can extract some further useful information here:


https://pastebin.com/imnYKBMN



It looks like some timing issue or race condition. After reboot node
manages to contact qnetd first, before connection to other node is
established. Qnetd behaves as documented - it sees two equal size
partitions and favors the partition that includes tie breaker (lowest
node id). So existing node goes out of quorum. Second later both nodes
see each other and so quorum is regained.


Nice catch




thank you for taking the time to troll through my debugging output. 
your explanation seems to accurately describe what i am experiencing. 
of course i have no idea how to remedy it. :-)


It is really quite a problem. Honestly, I don't think there is really 
a way how to remedy this behavior other than implement option to 
prefer active partition as a tie-breaker 
(https://github.com/corosync/corosync-qdevice/issues/7).







I cannot reproduce it, but I also do not use knet. From documentation I
have impression that knet has artificial delay before it considers 
links

operational, so may be that is the reason.


i will do some reading on how knet factors into all of this and 
respond with any questions or discoveries.


knet_pong_count/knet_ping_interval tuning may help, but I don't think 
there is really a way to prevent creation of single node membership in 
all possible cases.


yes. in my limited thinking about it, i keep coming back around to that 
conclusion in the two-node + qdevice case, barring implementation of #7.











BTW, great eyes. i had not picked up on that little nuance. i had
poured through this particular log a number of times, but it was very
hard for me to discern the starting and stopping points for each
logical group of messages. the indentation made some of it clear. but
when you have a series of lines beginning in the left-most column, it
is not clear whether they belong to the previous group, the next
group, or they are their own group.

just wanted to note my confusion in case the relevant maintainer
happens across this thread.


Here :)

The output (especially the debug one) is really a bit cryptic, but I'm not 
entirely sure how to make it better. Qnetd events have no strict 
ordering, so I don't see a way to group related events without some 
kind of reordering and best-guessing, which I'm not too keen to do. 
Also, some of the messages relate to specific nodes and some of the 
messages relate to the whole cluster (or part of the cluster).


Of course I'm open to ideas on how to structure it in a better way.


i wish i was well-versed enough in this particular codebase to submit a 
PR. i think that some kind of tagging indicating whether messages are 


Oh, a PR is not really needed. For me it would be enough to see an example of 
a better structured log.


Honza

node-specific or cluster-specific would probably help a bit. but 
ultimately it is probably not worth the effort of changing the code, as 
long as the relevant parties can easily analyze the output.




Regards,
   Honza




___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] temporary loss of quorum when member starts to rejoin

2020-04-07 Thread Jan Friesse

On Tue, 7 Apr 2020 14:13:35 -0400
Sherrard Burton  wrote:


On 4/7/20 1:16 PM, Andrei Borzenkov wrote:

07.04.2020 00:21, Sherrard Burton пишет:


It looks like some timing issue or race condition. After reboot node
manages to contact qnetd first, before connection to other node is
established. Qnetd behaves as documented - it sees two equal size
partitions and favors the partition that includes tie breaker (lowest
node id). So existing node goes out of quorum. Second later both nodes
see each other and so quorum is regained.
  


Define the right problem to solve?

Educated guess is that your problem is not corosync but pacemaker
stopping resources. In this case just do what was done for years in two
node cluster - set no-quorum-policy=ignore and rely on stonith to
resolve split brain.

I dropped idea to use qdevice in two node cluster. If you have reliable
stonith device it is not needed and without stonith relying on watchdog
suicide has too many problems.
   


Andrei,
in a two-node cluster with stonith only, but no qdevice, how do you
avoid the dreaded stonith death match, and the resultant flip-flopping
of services?


In my understanding, two_node and wait_for_all should avoid this.


Just a tiny comment. wait_for_all is enabled automatically when two_node 
mode is set (as long as wait_for_all is not explicitly disabled) so just 
setting two_node mode does the job.
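
In corosync.conf terms that boils down to a quorum section like the sketch 
below (wait_for_all is implied and does not need to be listed):

quorum {
    provider: corosync_votequorum
    two_node: 1
}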




After a node A has been fenced, the node B keeps the quorum thanks to two_node.
When A comes back, as long as it is not able to join the corosync group, it will
not be quorate thanks to wait_for_all. No quorum, no fencing allowed.

But the best protection is to disable pacemaker on boot so an admin can
investigate the situation and join back the node safely.
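
On a systemd-based distribution that is typically just (a sketch):

systemctl disable pacemaker    # start it manually once the node has been checked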

Regards,
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





Re: [ClusterLabs] temporary loss of quorum when member starts to rejoin

2020-04-07 Thread Jan Friesse

Sherrard,





On 4/7/20 12:53 AM, Strahil Nikolov wrote:


Hi Sherrard,

Have you tried to increase the qnet timers in the corosync.conf ?



Strahil,
i have actually reduced the qnet timers in order to improve failover 
response time, per Jan's suggestion on the thread '[ClusterLabs] reducing corosync-qnetd "response time"'


This is actually a different problem, and reduced qnetd and qdevice timers 
will not help. This problem is really about a 2-node cluster which is split 
in half into two single-node memberships. Qnetd then gives its vote to the 
node with the lowest node id, which in this case is the newly restarted node.


Regards,
  Honza



https://www.mail-archive.com/users@clusterlabs.org/msg09278.html
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





Re: [ClusterLabs] temporary loss of quorum when member starts to rejoin

2020-04-07 Thread Jan Friesse

Sherrard and Andrei




On 4/6/20 4:10 PM, Andrei Borzenkov wrote:

06.04.2020 20:57, Sherrard Burton пишет:



On 4/6/20 1:20 PM, Sherrard Burton wrote:



On 4/6/20 12:35 PM, Andrei Borzenkov wrote:

06.04.2020 17:05, Sherrard Burton пишет:


from the quorum node:

...

Apr 05 23:10:17 debug   Client :::192.168.250.50:54462 (cluster
xen-nfs01_xen-nfs02, node_id 1) sent quorum node list.
Apr 05 23:10:17 debug msg seq num = 6
Apr 05 23:10:17 debug quorate = 0
Apr 05 23:10:17 debug node list:
Apr 05 23:10:17 debug   node_id = 1, data_center_id = 0, node_state = member


Oops. How comes that node that was rebooted formed cluster all by
itself, without seeing the second node? Do you have two_nodes and/or
wait_for_all configured?



i never thought to check the logs on the rebooted server. hopefully
someone can extract some further useful information here:


https://pastebin.com/imnYKBMN



It looks like some timing issue or race condition. After reboot node
manages to contact qnetd first, before connection to other node is
established. Qnetd behaves as documented - it sees two equal size
partitions and favors the partition that includes tie breaker (lowest
node id). So existing node goes out of quorum. Second later both nodes
see each other and so quorum is regained.


Nice catch




thank you for taking the time to troll through my debugging output. your 
explanation seems to accurately describe what i am experiencing. of 
course i have no idea how to remedy it. :-)


It is really quite a problem. Honestly, I don't think there is really a 
way to remedy this behavior other than implementing an option to prefer 
the active partition as a tie-breaker 
(https://github.com/corosync/corosync-qdevice/issues/7).







I cannot reproduce it, but I also do not use knet. From documentation I
have impression that knet has artificial delay before it considers links
operational, so may be that is the reason.


i will do some reading on how knet factors into all of this and respond 
with any questions or discoveries.


knet_pong_count/knet_ping_interval tuning may help, but I don't think 
there is really a way to prevent the creation of a single-node membership 
in all possible cases.








BTW, great eyes. i had not picked up on that little nuance. i had
poured through this particular log a number of times, but it was very
hard for me to discern the starting and stopping points for each
logical group of messages. the indentation made some of it clear. but
when you have a series of lines beginning in the left-most column, it
is not clear whether they belong to the previous group, the next
group, or they are their own group.

just wanted to note my confusion in case the relevant maintainer
happens across this thread.


Here :)

The output (especially the debug one) is really a bit cryptic, but I'm not 
entirely sure how to make it better. Qnetd events have no strict 
ordering, so I don't see a way to group related events without some 
kind of reordering and best-guessing, which I'm not too keen to do. Also, 
some of the messages relate to specific nodes and some of the messages 
relate to the whole cluster (or part of the cluster).


Of course I'm open to ideas on how to structure it in a better way.

Regards,
  Honza




thanks again
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/




Re: [ClusterLabs] Ugrading Ubuntu 14.04 to 16.04 with corosync/pacemaker failed

2020-02-20 Thread Jan Friesse

Rasca Gmelch napsal(a):

Am 19.02.20 um 19:20 schrieb Strahil Nikolov:

On February 19, 2020 6:31:19 PM GMT+02:00, Rasca  wrote:

Hi,

we run a 2-system cluster for Samba with Ubuntu 14.04 and Samba,
Corosync and Pacemaker from the Ubuntu repos. We wanted to update
to Ubuntu 16.04 but it failed:

I checked the versions before and because of just minor updates
of corosync and pacemaker I thought it should be possible to
update node by node.

* Put srv2 into standby
* Upgraded srv2 to Ubuntu 16.04 with reboot and so on
* Added a nodelist to corosync.conf because it looked
  like corosync on srv2 didn't know the names of the
  node ids anymore

But still it does not work on srv2. srv1 (the active
server with ubuntu 14.04) is fine. It looks like
it's an upstart/systemd issue, but may be even more.
Why does srv1 says UNCLEAN about srv2? On srv2 I see
corosync sees both systems. But srv2 says srv1 is
OFFLINE!?

crm status


srv1
Last updated: Wed Feb 19 17:22:03 2020
Last change: Tue Feb 18 11:05:47 2020 via crm_attribute on srv2
Stack: corosync
Current DC: srv1 (1084766053) - partition with quorum
Version: 1.1.10-42f2063
2 Nodes configured
9 Resources configured


Node srv2 (1084766054): UNCLEAN (offline)
Online: [ srv1 ]

Resource Group: samba_daemons
 samba-nmbd (upstart:nmbd): Started srv1
[..]


srv2
Last updated: Wed Feb 19 17:25:14 2020  Last change: Tue Feb 18
18:29:29
2020 by hacluster via crmd on srv2
Stack: corosync
Current DC: srv2 (version 1.1.14-70404b0) - partition with quorum
2 nodes and 9 resources configured

Node srv2: standby
OFFLINE: [ srv1 ]


I still don't understand the concept of corosync/pacemaker. Which part is
responsible for this "OFFLINE" statement? I don't know where to
dig deeper into this mismatch (see some lines above, where it
says "Online" for srv1).



Full list of resources:

Resource Group: samba_daemons
 samba-nmbd (upstart:nmbd): Stopped
[..]>>

Failed Actions:
* samba-nmbd_monitor_0 on srv2 'not installed' (5): call=5, status=Not
installed, exitreason='none',
last-rc-change='Wed Feb 19 14:13:20 2020', queued=0ms, exec=1ms
[..]


According to the logs it looks like the service (e.g. nmbd) is not
available (maybe because of (upstart:nmbd)) - how do I change this
configuration in pacemaker? I want to change it to "service" instead
of "upstart". I hope this will fix at least the service problems.

   crm configure primitive smbd ..
gives me:
   ERROR: smbd: id is already in use.
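
The "id is already in use" error just means a resource (or other CIB object) 
with that id already exists, so the existing definition has to be edited or 
replaced rather than added again. A possible crmsh sketch (resource name and 
agent taken from the status output above, not from a tested setup):

crm configure show samba-nmbd     # inspect the current definition
crm configure edit samba-nmbd     # change e.g. upstart:nmbd to systemd:nmbd
# or remove and re-create the primitive under the same id:
# crm configure delete samba-nmbd
# crm configure primitive samba-nmbd systemd:nmbd op monitor interval=30s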



Any suggestions, ideas? Is the a nice HowTo for this upgrade situation?

Regards,
Rasca



Are you sure that there is no cluster protocol mismatch?

A major OS upgrade (even if supported by the vendor) must be done offline 
(with proper testing in advance).

What happens when you upgrade the other node, or when you roll back the 
upgrade?

Best Regards,
Strahil Nikolov


Protocol mismatch of corosync or pacemaker? corosync-cmapctl shows that
srv1 and srv2 are members. In the corosync config I have:

service {
ver: 0
name: pacemaker
}

What about this "ver: 0"? Maybe that's wrong - even for ubuntu
14.04? The configuration itself was designed under ubuntu 12.04. Maybe
we forgot to change this parameter when we upgraded from 12.04 to
ubuntu 14.04 some years ago?


This is not used at all (was used for Pacemaker plugin for 
OpenAIS/Corosync 1.x).


Honza



Thx+Regards,
  Rasca
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





Re: [ClusterLabs] corosync-cfgtool -sb. What is the "n" indicating?

2020-02-10 Thread Jan Friesse
Just for archival purposes, this issue is now being worked on at 
https://github.com/corosync/corosync/issues/527



Hi Corosync Specialists!

I have a production cluster with two nodes (node0/1). And I have setup
for debugging this issue a completely virtual cluster also.

Both are showing the same pattern that I do not understand:

Printing link status.

Local node ID 0
LINK ID 0
     addr    = 192.168.2.132
     status    = 33
LINK ID 1
     addr    = 192.168.1.132
     status    = *n*3

What is this "n" indicating? The "n" always occurs on the second ring,
independent of the ordering of the interfaces, the ring/IP
association, or the interface states.

Without the -b switch the status becomes even more unclear.

Printing link status.
Local node ID 0
LINK ID 0
     addr    = 192.168.2.132
     status:
         node  0:    link enabled:1    link connected:1
         node  1:    link enabled:1    link connected:1
LINK ID 1
     addr    = 192.168.1.132
     status:
         node  0:    link enabled:0    link connected:1
         node  1:    link enabled:1    link connected:1

What is an "enabled" vs. a "connected" link? At first I thought about
something like spanning tree, where some interfaces are deliberately
shut down to prevent loops. But this does not correlate with my
findings when I disabled interfaces.

If I disable the interface 192.168.2.132 on Node0 I get on Node0

Local node ID 0
LINK ID 0
     addr    = 192.168.2.132
     status:
         node  0:    link enabled:1    link connected:1
         node  1:    link enabled:1    link connected:0
LINK ID 1
     addr    = 192.168.1.132
     status:
         node  0:    link enabled:0    link connected:1
         node  1:    link enabled:1    link connected:1

while I get on node1

Printing link status.
Local node ID 1
LINK ID 0
     addr    = 192.168.2.134
     status:
         node 0:    link enabled:1    link connected:0
         node 1:    link enabled:1    link connected:1
LINK ID 1
     addr    = 192.168.1.134
     status:
         node 0:    link enabled:1    link connected:1
         node 1:    link enabled:0    link connected:1

This is awkward.  I had assumed that both nodes indicate for Link0

         node 0:    link enabled:1    link connected:0

Any help appreciated.

Volker

corosync.conf

totem {
     version: 2

     cluster_name: mail

     token: 3000

     token_retransmits_before_loss_const: 10

     clear_node_high_bit: yes

     crypto_cipher: none
     crypto_hash: none

     interface {
     linknumber: 0
     knet_transport: udp
     knet_link_priority: 20
     }
     interface {
     linknumber: 1
     knet_transport: udp
     knet_link_priority: 10
     }

}

nodelist {
     node {
     ring1_addr: 192.168.1.132
     ring0_addr: 192.168.2.132
     nodeid: 0
     name: mail3
     }
     node {
     ring1_addr: 192.168.1.134
     ring0_addr: 192.168.2.134
     nodeid: 1
     name: mail4
     }
}






___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/




Re: [ClusterLabs] Corosync-Qdevice SSL Ciphers

2020-01-22 Thread Jan Friesse

Hi,


Hi ,

In case the nss.config is not configured, what will be the ciphers used.


nss.config (if it exists) contains the list of allowed ciphers. Otherwise it 
really depends on the NSS version and on what options were used during compilation.


Regards,
  Honza





With Regards
Somanath Thilak J

-Original Message-
From: Jan Friesse 
Sent: Wednesday, January 22, 2020 13:45
To: Cluster Labs - All topics related to open-source clustering welcomed 
; Somanath Jeeva 
Subject: Re: [ClusterLabs] Corosync-Qdevice SSL Ciphers

Somanath,


Hi ,

Is there a way to find/restrict the list of ciphers used by corosync-qnetd 
similar to the PCSD_SSL_CIPHERS variable in /etc/sysconfig/pcsd configuration 
file.


Nope. But qnetd is using NSS so it is possible to change the system policy in 
/etc/crypto-policies/back-ends/nss.config and qnetd will use these settings.

Regards,
Honza





With Regards
Somanath Thilak J









___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Corosync-Qdevice SSL Ciphers

2020-01-22 Thread Jan Friesse

Somanath,


Hi ,

Is there a way to find/restrict the list of ciphers used by corosync-qnetd 
similar to the PCSD_SSL_CIPHERS variable in /etc/sysconfig/pcsd configuration 
file.


Nope. But qnetd is using NSS so it is possible to change the system 
policy in /etc/crypto-policies/back-ends/nss.config and qnetd will use 
these settings.


Regards,
  Honza
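
For orientation, a rough sketch of where that policy lives on crypto-policies 
based distributions (e.g. RHEL/Fedora); there the file is generated, so the 
usual way to change it is via the system-wide policy rather than by editing 
the file directly:

# the NSS back-end policy that qnetd (through NSS) picks up
cat /etc/crypto-policies/back-ends/nss.config

# switch the system-wide policy (FUTURE is just an example policy name,
# see update-crypto-policies(8))
update-crypto-policies --set FUTURE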





With Regards
Somanath Thilak J




___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/





  1   2   3   4   >