Rasca Gmelch wrote:
On 19.02.20 at 19:20, Strahil Nikolov wrote:
On February 19, 2020 6:31:19 PM GMT+02:00, Rasca <rasca.gme...@artcom.de> wrote:

We run a two-node cluster for Samba on Ubuntu 14.04, with Samba,
Corosync and Pacemaker from the Ubuntu repos. We wanted to upgrade
to Ubuntu 16.04, but it failed:

I checked the versions beforehand, and because corosync and
pacemaker only change by minor versions, I thought it should be
possible to upgrade node by node.

* Put srv2 into standby
* Upgraded srv2 to Ubuntu 16.04 with reboot and so on
* Added a nodelist to corosync.conf because it looked
  like corosync on srv2 didn't know the names of the
  node ids anymore

But it still does not work on srv2. srv1 (the active
server with Ubuntu 14.04) is fine. It looks like
an upstart/systemd issue, but maybe there is more to it.
Why does srv1 report srv2 as UNCLEAN? On srv2 I see
that corosync sees both systems. But srv2 says srv1 is

crm status

Last updated: Wed Feb 19 17:22:03 2020
Last change: Tue Feb 18 11:05:47 2020 via crm_attribute on srv2
Stack: corosync
Current DC: srv1 (1084766053) - partition with quorum
Version: 1.1.10-42f2063
2 Nodes configured
9 Resources configured

Node srv2 (1084766054): UNCLEAN (offline)
Online: [ srv1 ]

Resource Group: samba_daemons
     samba-nmbd (upstart:nmbd): Started srv1

Last updated: Wed Feb 19 17:25:14 2020          Last change: Tue Feb 18
2020 by hacluster via crmd on srv2
Stack: corosync
Current DC: srv2 (version 1.1.14-70404b0) - partition with quorum
2 nodes and 9 resources configured

Node srv2: standby
OFFLINE: [ srv1 ]

I still don't understand the concept of corosync/pacemaker. Which part is
responsible for this "OFFLINE" statement? I don't know where to
dig deeper into this mismatch (see some lines above, where it
says "Online" about srv1).
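
A minimal sketch for comparing the two layers, assuming Corosync 2.x key names: Corosync handles membership and quorum, while the Online/OFFLINE/UNCLEAN lines in crm status are Pacemaker's own view of its peers, so the two can disagree. The helper name corosync_member_status is made up for illustration; it only filters corosync-cmapctl output (the tool mentioned later in this thread) down to node id and membership status.

```shell
#!/bin/sh
# Hypothetical helper (not part of corosync or pacemaker).
# Reads `corosync-cmapctl` output on stdin and prints "<nodeid> <status>"
# for every key of the form "...members.<nodeid>.status (str) = <status>".
# Assumption: this key layout is what Corosync 2.x prints.
corosync_member_status() {
    sed -n -E 's/^.*members\.([0-9]+)\.status \(str\) = (.*)$/\1 \2/p'
}

# Usage on a cluster node (needs a running cluster, so left commented out):
#   corosync-cmapctl | corosync_member_status   # Corosync's membership view
#   crm_node -l                                  # nodes known to Pacemaker
#   crm_mon -1                                   # Pacemaker's one-shot status
```

If Corosync lists both node ids as joined on both nodes while Pacemaker still shows the peer as OFFLINE, the disagreement is more likely in the Pacemaker layer than in Corosync membership. That is an inference, not something confirmed in this thread.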

Full list of resources:

Resource Group: samba_daemons
     samba-nmbd (upstart:nmbd): Stopped

Failed Actions:
* samba-nmbd_monitor_0 on srv2 'not installed' (5): call=5, status=Not
installed, exitreason='none',
    last-rc-change='Wed Feb 19 14:13:20 2020', queued=0ms, exec=1ms

According to the logs, it looks like the service (e.g. nmbd) is not
available, maybe because of the "upstart:nmbd" resource class. How do I
change this configuration in pacemaker? I want to change it to "service"
instead of "upstart". I hope this will fix at least the service problems.

   crm configure primitive smbd ..
gives me:
   ERROR: smbd: id is already in use.
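
The "id is already in use" error is what crmsh reports when `primitive` is asked to create a resource whose id already exists; an existing resource has to be edited instead, for example with `crm configure edit`. As a hedged sketch of a scripted alternative, the hypothetical helper below rewrites `upstart:<name>` to `service:<name>` in a configuration dump. It assumes the dump comes from `crm configure show` and that the `service` class, which Pacemaker resolves to systemd, upstart or LSB on each node, suits both nodes. The file names are placeholders.

```shell
#!/bin/sh
# Hypothetical helper (not part of crmsh).
# Reads a crmsh configuration dump on stdin and rewrites every resource
# type written as "upstart:<name>" to "service:<name>". Only tokens that
# start a line or follow whitespace are touched.
rewrite_upstart_to_service() {
    sed -E 's/(^|[[:space:]])upstart:([A-Za-z0-9_.-]+)/\1service:\2/g'
}

# Usage on a cluster node (left commented out; file names are placeholders):
#   crm configure show > cib-before.crm           # keep this as a backup
#   rewrite_upstart_to_service < cib-before.crm > cib-service.crm
#   crm configure load update cib-service.crm     # apply the edited dump
```

Pacemaker normally restarts a resource whose definition changes, so this is best tried in a maintenance window and after reviewing the rewritten file.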

Any suggestions or ideas? Is there a nice HowTo for this upgrade situation?


Are you sure that there is no cluster protocol mismatch?

A major-version OS upgrade (even if supported by the vendor) must be done offline
(with proper testing in advance).

What happens when you upgrade the other node, or when you roll back the
upgrade?

Best Regards,
Strahil Nikolov

Protocol mismatch of corosync or pacemaker? corosync-cmapctl shows that
srv1 and srv2 are members. In the corosync config I have:

service {
    ver: 0
    name: pacemaker

What about this "ver: 0"? Maybe that's wrong, even for Ubuntu
14.04? The configuration itself was designed under Ubuntu 12.04. Maybe
we forgot to change this parameter when we upgraded from 12.04 to
Ubuntu 14.04 some years ago?

This is not used at all (it was used for the Pacemaker plugin for OpenAIS/Corosync 1.x).


Manage your subscription:

ClusterLabs home: https://www.clusterlabs.org/
