Hello Rafael, or anyone else affected,

Accepted pacemaker into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/pacemaker/1.1.18-0ubuntu1.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and what testing has been performed on the package, and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

** Changed in: pacemaker (Ubuntu Bionic)
       Status: In Progress => Fix Committed

** Tags added: verification-needed verification-needed-bionic

** Tags added: block-proposed-bionic

--
You received this bug notification because you are a member of Ubuntu High Availability Team, which is subscribed to pacemaker in Ubuntu.
https://bugs.launchpad.net/bugs/1866119

Title:
  [bionic] fence_scsi not working properly with Pacemaker 1.1.18-2ubuntu1.1

Status in pacemaker package in Ubuntu:
  Fix Released
Status in pacemaker source package in Bionic:
  Fix Committed

Bug description:
  OBS: This bug was originally part of LP: #1865523 but it was split.

  #### SRU: pacemaker

  [Impact]

   * fence_scsi is not currently working in a shared disk environment

   * all clusters relying on fence_scsi and/or fence_scsi + watchdog won't be able to start the fencing agents OR, in worst case scenarios, the fence_scsi agent might start but won't make scsi reservations in the shared scsi disk.

   * this bug is taking care of pacemaker 1.1.18 issues with fence_scsi, since the latter was fixed at LP: #1865523.

  [Test Case]

   * having a 3-node setup, nodes called "clubionic01, clubionic02, clubionic03", with a shared scsi disk (fully supporting persistent reservations) /dev/sda, with corosync and pacemaker operational and running, one might try:

  rafaeldtinoco@clubionic01:~$ crm configure
  crm(live)configure# property stonith-enabled=on
  crm(live)configure# property stonith-action=off
  crm(live)configure# property no-quorum-policy=stop
  crm(live)configure# property have-watchdog=true
  crm(live)configure# commit
  crm(live)configure# end
  crm(live)# end
  rafaeldtinoco@clubionic01:~$ crm configure primitive fence_clubionic \
      stonith:fence_scsi params \
      pcmk_host_list="clubionic01 clubionic02 clubionic03" \
      devices="/dev/sda" \
      meta provides=unfencing

  And see the following errors:

  Failed Actions:
  * fence_clubionic_start_0 on clubionic02 'unknown error' (1): call=6, status=Error, exitreason='',
      last-rc-change='Wed Mar 4 19:53:12 2020', queued=0ms, exec=1105ms
  * fence_clubionic_start_0 on clubionic03 'unknown error' (1): call=6, status=Error, exitreason='',
      last-rc-change='Wed Mar 4 19:53:13 2020', queued=0ms, exec=1109ms
  * fence_clubionic_start_0 on clubionic01 'unknown error' (1): call=6, status=Error, exitreason='',
      last-rc-change='Wed Mar 4 19:53:11 2020', queued=0ms, exec=1108ms

  and corosync.log will show:

  warning: unpack_rsc_op_failure: Processing failed op start for fence_clubionic on clubionic01: unknown error (1)

  [Regression Potential]

   * LP: #1865523 shows fence_scsi fully operational after SRU for that bug is done.

   * LP: #1865523 used pacemaker 1.1.19 (vanilla) in order to fix fence_scsi.

   * There are changes to: cluster resource manager daemon, local resource manager daemon and policy engine. Of all the changes, the policy engine fix is the biggest, but still not big for an SRU. This could cause the policy engine, and thus cluster decisions, to malfunction.

   * All patches are based on upstream fixes made right after Pacemaker-1.1.18, the version used by Ubuntu Bionic, and were tested with fence_scsi to make sure they fixed the issues.

  [Other Info]

   * Original Description:

  Trying to set up a cluster with an iscsi shared disk, using fence_scsi as the fencing mechanism, I realized that fence_scsi is not working in Ubuntu Bionic. I first thought it was related to the Azure environment (LP: #1864419), where I was trying this setup, but then, trying locally, I figured out that somehow pacemaker 1.1.18 is not fencing the shared scsi disk properly.

  Note: I was able to "backport" vanilla 1.1.19 from upstream and fence_scsi worked. I have then tried 1.1.18 without all quilt patches and it didn't work either. I think that bisecting 1.1.18 <-> 1.1.19 might tell us which commit has fixed the behaviour needed by the fence_scsi agent.

  (k)rafaeldtinoco@clubionic01:~$ crm conf show
  node 1: clubionic01.private
  node 2: clubionic02.private
  node 3: clubionic03.private
  primitive fence_clubionic stonith:fence_scsi \
          params pcmk_host_list="10.250.3.10 10.250.3.11 10.250.3.12" devices="/dev/sda" \
          meta provides=unfencing
  property cib-bootstrap-options: \
          have-watchdog=false \
          dc-version=1.1.18-2b07d5c5a9 \
          cluster-infrastructure=corosync \
          cluster-name=clubionic \
          stonith-enabled=on \
          stonith-action=off \
          no-quorum-policy=stop \
          symmetric-cluster=true

  ----

  (k)rafaeldtinoco@clubionic02:~$ sudo crm_mon -1
  Stack: corosync
  Current DC: clubionic01.private (version 1.1.18-2b07d5c5a9) - partition with quorum
  Last updated: Mon Mar 2 15:55:30 2020
  Last change: Mon Mar 2 15:45:33 2020 by root via cibadmin on clubionic01.private

  3 nodes configured
  1 resource configured

  Online: [ clubionic01.private clubionic02.private clubionic03.private ]

  Active resources:

   fence_clubionic (stonith:fence_scsi): Started clubionic01.private

  ----

  (k)rafaeldtinoco@clubionic02:~$ sudo sg_persist --in --read-keys --device=/dev/sda
    LIO-ORG cluster.bionic. 4.0
    Peripheral device type: disk
    PR generation=0x0, there are NO registered reservation keys

  (k)rafaeldtinoco@clubionic02:~$ sudo sg_persist -r /dev/sda
    LIO-ORG cluster.bionic. 4.0
    Peripheral device type: disk
    PR generation=0x0, there is NO reservation held

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1866119/+subscriptions

_______________________________________________
Mailing list: https://launchpad.net/~ubuntu-ha
Post to     : ubuntu-ha@lists.launchpad.net
Unsubscribe : https://launchpad.net/~ubuntu-ha
More help   : https://help.launchpad.net/ListHelp