Re: [ClusterLabs] Corosync 3.1.5 Fails to Autostart

2023-04-24 Thread Ken Gaillot
Hi,

With Corosync 3, node names must be specified in
/etc/corosync/corosync.conf like:

node {
    ring0_addr: node1
    name: node1
    nodeid: 1
}

(ring0_addr is a resolvable name used to identify the interface, and
name is the name that should be used in the cluster)

If you set up the cluster from scratch using pcs, it should do that for
you. I'm guessing you reused an older config, or manually set up
corosync.conf.
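
To check an existing corosync.conf for the name entries described above, something like the sketch below might help. The function names are made up for illustration, and it assumes that "name:" keys appear only inside the node blocks of the nodelist. It is only a quick sanity check of the file contents, not a reproduction of how corosync itself identifies the local node.

```shell
# List every "name:" value found in a corosync.conf-style file.
# Assumption: "name:" keys only appear inside nodelist node blocks.
corosync_node_names() {
    conf="${1:-/etc/corosync/corosync.conf}"
    awk '$1 == "name:" { print $2 }' "$conf"
}

# Return success only if the given host name is one of those names.
corosync_has_node_name() {
    host="$1"
    conf="${2:-/etc/corosync/corosync.conf}"
    corosync_node_names "$conf" | grep -qxF -- "$host"
}
```

On a cluster node, running corosync_has_node_name "$(uname -n)" and checking its exit status would show whether the host's own name is listed.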

It shouldn't be necessary to change the After. If it still is an issue
after fixing the config, you might have some unusual dependency like a 
disk that gets mounted later, in which case it would be better to add
an After for the specific dependency.
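
If a specific dependency does turn out to be needed, a systemd drop-in is one way to add it without editing the packaged service file. The sketch below is only an illustration: the helper name, the drop-in file name 10-after.conf and the unit mnt-data.mount used in the usage note are all hypothetical. Substitute whatever unit your system actually waits on.

```shell
# Write a drop-in that adds After=<dependency> to a unit.
# $3 is the systemd config root; it defaults to /etc/systemd/system.
write_after_dropin() {
    unit="$1"
    dependency="$2"
    root="${3:-/etc/systemd/system}"
    dropin_dir="$root/$unit.d"
    mkdir -p "$dropin_dir" || return 1
    printf '[Unit]\nAfter=%s\n' "$dependency" > "$dropin_dir/10-after.conf"
}
```

For example, write_after_dropin corosync.service mnt-data.mount followed by systemctl daemon-reload would order corosync after that (hypothetical) mount. After= entries in a drop-in are added to the ones already in the unit file, so the existing After=network-online.target stays in effect.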

On Mon, 2023-04-24 at 22:16 +0200, Tyler Phillippe via Users wrote:
> Hello all,
> 
> We are currently using RHEL9 and have set up a PCS cluster. When
> restarting the servers, we noticed Corosync 3.1.5 doesn't start
> properly with the below error message:
> 
> Parse error in config: No valid name found for local host
> Corosync Cluster Engine exiting with status 8 at main.c:1445.
> Corosync.service: Main process exited, code=exited, status=8/n/a
> 
> These are physical, blade machines that are using a 2x Fibre Channel
> NIC in a Mode 6 bond as their networking interface for the cluster;
> other than that, there is really nothing special about these
> machines. We have ensured the names of the machines exist in
> /etc/hosts and that they can resolve those names via the hosts file
> first. The strange thing is if we start Corosync manually after we
> can SSH into the machines, Corosync starts immediately and without
> issue. We did manage to get Corosync to autostart properly by
> modifying the service file and changing the After=network-online.target
> to After=multi-user.target. In doing this, at first,
> Pacemaker complains about mismatching dependencies in the service
> between Corosync and Pacemaker. Changing the Pacemaker service to
> After=multi-user.target fixes that self-caused issue. Any ideas on
> this one? Mostly checking to see if changing the After dependency
> will harm us in the future.
> 
> Thanks!
> 
> Respectfully,
>  Tyler Phillippe
-- 
Ken Gaillot 

___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


[ClusterLabs] Corosync 3.1.5 Fails to Autostart

2023-04-24 Thread Tyler Phillippe via Users
Hello all,

We are currently using RHEL9 and have set up a PCS cluster. When restarting the 
servers, we noticed Corosync 3.1.5 doesn't start properly with the below error 
message:

Parse error in config: No valid name found for local host
Corosync Cluster Engine exiting with status 8 at main.c:1445.
Corosync.service: Main process exited, code=exited, status=8/n/a

These are physical, blade machines that are using a 2x Fibre Channel NIC in a 
Mode 6 bond as their networking interface for the cluster; other than that, 
there is really nothing special about these machines. We have ensured the names 
of the machines exist in /etc/hosts and that they can resolve those names via 
the hosts file first. The strange thing is if we start Corosync manually after 
we can SSH into the machines, Corosync starts immediately and without issue. We 
did manage to get Corosync to autostart properly by modifying the service file 
and changing the After=network-online.target to After=multi-user.target. In 
doing this, at first, Pacemaker complains about mismatching dependencies in the 
service between Corosync and Pacemaker. Changing the Pacemaker service to 
After=multi-user.target fixes that self-caused issue. Any ideas on this one? 
Mostly checking to see if changing the After dependency will harm us in the 
future.

Thanks!

Respectfully,
 Tyler Phillippe


Re: [ClusterLabs] gfs2 without pacemaker ?

2023-04-24 Thread Bernd Lentes


>-----Original Message-----
>From: Andrew Price 
>Sent: Monday, April 24, 2023 2:04 PM
>To: Bernd Lentes 
>Cc: Cluster Labs - All topics related to open-source clustering welcomed
>
>Subject: Re: [ClusterLabs] gfs2 without pacemaker ?


>For emergency access to a gfs2 filesystem from a single machine, mount it
>with 'mount -o lockproto=lock_nolock /dev/foo /mnt/dir'. Make sure that it's
>the only mounter and that it's unmounted before starting the cluster.
>
>There is some info about the lockproto mount option in the gfs2 man page.
>
>Andy

Hi Andy,

Works like a charm. This was very helpful!
Thanks a lot.

Bernd


Re: [ClusterLabs] gfs2 without pacemaker ?

2023-04-24 Thread Andrew Price

On 23/04/2023 20:36, Bernd Lentes wrote:

> Hi,
>
> I need quick access to a GFS2 partition. Is it possible to mount it without a
> running cluster?


For emergency access to a gfs2 filesystem from a single machine, mount
it with 'mount -o lockproto=lock_nolock /dev/foo /mnt/dir'. Make sure
that it's the only mounter and that it's unmounted before starting the cluster.


There is some info about the lockproto mount option in the gfs2 man page.

Andy



[ClusterLabs] OCF_HEARTBEAT_PGSQL - any good with current Postgres- ?

2023-04-24 Thread lejeczek via Users

Hi guys.

I've been looking into and fiddling with this RA, so far
unsuccessfully, which makes me wonder: is it usable with
current versions of PostgreSQL?


many thanks, L.


Re: [ClusterLabs] How to block/stop a resource from running twice?

2023-04-24 Thread Andrei Borzenkov
On Mon, Apr 24, 2023 at 11:52 AM Klaus Wenninger wrote:
> The checking for a running resource that isn't expected to be running isn't
> done periodically (at least not by default, and I don't know a way to
> achieve that off the top of my head).

op monitor role=Stopped interval=20s
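
For the httpd_service example from the original question, that operation could be added to the existing resource roughly as follows. This is a sketch that only builds and prints the command rather than running it, and the helper name is made up for illustration. It assumes the pcs "resource op add" subcommand is available in your pcs version, so check the pcs man page first. The 20s interval is chosen to differ from the existing 10s monitor.

```shell
# Build (but do not run) the pcs command that adds a monitor
# for the Stopped role to an existing resource.
stopped_monitor_cmd() {
    resource="$1"
    interval="${2:-20s}"
    printf 'pcs resource op add %s monitor role=Stopped interval=%s\n' \
        "$resource" "$interval"
}

stopped_monitor_cmd httpd_service
```

Printing the command first makes it easy to review before pasting it into a shell on one of the cluster nodes.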


Re: [ClusterLabs] How to block/stop a resource from running twice?

2023-04-24 Thread Klaus Wenninger
On Fri, Apr 21, 2023 at 12:24 PM fs3000 via Users wrote:

> Hello all,
>
> I'm configuring a two-node cluster. Pacemaker 0.9.169 on CentOS 7.

I guess this is rather the pcs version ...


> How can I configure a specific service to run on just one node and avoid
> having it run on more than one node simultaneously? If I start the
> service on the other node, it keeps running; Pacemaker does not kill it. I
> have googled and searched the docs, but can't find a solution. This is for
> systemd resources. Any ideas, please?
>
> Example:
>
> node1# pcs resource create httpd_service systemd:httpd op monitor interval=10s
> node2# systemctl start httpd
>
> httpd keeps running on both nodes. Ideally, pacemaker should kill a second
> instance of that service.
>

When you start Pacemaker on a node, it checks which resources are running
there (this is called a 'probe'). So if multiple instances of a primitive
(without a clone) were already running, Pacemaker would take care of that.
But you are starting the systemd unit after Pacemaker.
Checking for a running resource that isn't expected to be running isn't
done periodically (at least not by default, and I don't know a way to
achieve that off the top of my head).
Resources that have been started by Pacemaker (or were legitimately found
running on startup) are of course monitored at the interval you have given.

Klaus

>
> Thanks in advance for any tips you might have.