Re: [ClusterLabs] How to generate RPMs for Pacemaker release 2.x on Centos

2018-10-17 Thread Jan Pokorný
On 15/10/18 14:46 +, Lopez, Francisco Javier [Global IT] wrote:
> I could not do that way as this box does not have access to Internet.
> Will see how to deal with this.

As Ken mentioned, current upstream-devised RPM packaging practices are
wrapped around the assumption of working with git tree directly.
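
For reference, the smooth in-tree path that Ken described would look
roughly like this (just a sketch, assuming the box had network access):

$ git clone https://github.com/ClusterLabs/pacemaker.git
$ cd pacemaker
$ git checkout Pacemaker-2.0.0    # or stay on master for a snapshot
$ ./autogen.sh && ./configure
$ make rpm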

You can sort of work around that, but since the procedure varies
slightly depending on whether you are after a tagged release or an
arbitrary commit-based snapshot, I will provide a tested solution just
for the former (you may need to change some bits for the latter
scenario):

$ mkdir pcmkbuild
$ cd pcmkbuild
$ curl -LO https://github.com/ClusterLabs/pacemaker/archive/Pacemaker-2.0.0/pacemaker-2.0.0.tar.gz
$ tar xf pacemaker-2.0.0.tar.gz
$ cd pacemaker-Pacemaker-2.0.0
$ ./autogen.sh && ./configure
$ cp ../pacemaker-2.0.0.tar.gz .
$ make srpm _TAG=Pacemaker-2.0.0 'TAG=${_TAG}' 'SHORTTAG=${_TAG}'
$ mock --rebuild pacemaker-2.0.0-*.src.rpm

Note that I am explicitly avoiding having the build process carried
out as the current user, since there may be, e.g., a bug in the spec
file accidentally eating your files...  Mock is the way to isolate
such a build as much as possible in the Fedora/CentOS/EL packaging
ecosystem.  If you are OK with said risks, just replace "srpm" with
"rpm", which will produce the final packages right away (i.e. that
would then be the final command to run).
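
For completeness, the mock step pinned to a concrete target could look
like this (a sketch; epel-7-x86_64 is the stock mock chroot config for
CentOS 7, and the result path below is mock's default):

$ mock -r epel-7-x86_64 --rebuild pacemaker-2.0.0-*.src.rpm
$ ls /var/lib/mock/epel-7-x86_64/result/*.rpm    # built packages land here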

So building pacemaker RPMs the upstream way outside of the git tree
is doable, but rather clumsy.  On the other hand, I don't think
there's much demand to make it smoother.  Correct me if I am wrong.

-- 
Nazdar,
Jan (Poki)


___
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Re: How to generate RPMs for Pacemaker release 2.x on Centos

2018-10-17 Thread Ken Gaillot
On Wed, 2018-10-17 at 15:58 +, Lopez, Francisco Javier [Global IT]
wrote:
> Hello guys.
> 
> I finally created the RPMs for Pacemaker and Resource Agents. I will
> paste the procedure in this thread so it can help anyone else like me :-)
> 
> I need a final update, hopefully, from you guys about this issue ...
> 
> I need to know if there is any compatibility or certification matrix
> somewhere. I'm asking this because, as I'm creating the packages from
> source, I need to be sure that the Pacemaker, PCS, Corosync, Agents, ...
> releases match. This would guarantee that if we find an issue using the
> product, the problem is not a compatibility issue among them.

For the most part, each component is written to be independent of other
components' versions.

The big exception is pcs, which is closely tied to both corosync and
pacemaker versions.

The pcs 0.9 series is compatible with corosync 2 and pacemaker 1.1,
while the pcs 0.10 series is compatible with the new kronosnet project,
corosync 3, and pacemaker 2.0. pcs may work "most of the time" with
other versions, but full compatibility requires that line-up.
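
A quick sketch to check which line-up you actually have on a box:

$ rpm -q pcs corosync pacemaker resource-agents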

I'd say your choices from easiest to hardest are:

* Stick with what you have. If a pcs command works, you're fine. If a
pcs command fails, figure out how to do the same thing with direct
editing of corosync.conf and the pacemaker CLI tools (see the sketch
after this list).

* Switch to the pacemaker 1.1 series (either distro-provided or your
own build), and keep everything else as you have it. You don't lose
much, since most bug fixes and some features from 2.0 have been
backported to 1.1. This will likely continue for at least the next year
or two, though at some point the frequency of backports will decline.
This would buy you enough time for the distros to catch up to the newer
versions of everything.

* Build kronosnet, corosync 3, and pcs 0.10 yourself. This would put
you on the "way of the future" but obviously means more maintenance for
you when it comes to updates, and you lose the early delivery of
backported bugfixes that distro packages include.
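
As a hedged illustration of the fallback in the first option, the
lower-level tools can cover most of what pcs wraps; the resource name
and values here are made up:

$ cibadmin --query > cib.xml                        # dump the live CIB
$ crm_resource --resource my-vip --set-parameter ip \
      --parameter-value 192.168.122.10              # change a resource parameter
$ crm_attribute --name stonith-enabled --update true    # set a cluster property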

> 
> These are the releases I have now:
> 
> - Pacemaker: 2.0.0, created by me.
> - Resource Agents: 4.1.1, created by me.
> - Corosync: corosynclib-2.4.3-2.el7_5.1.x86_64
>     corosync-2.4.3-2.el7_5.1.x86_64
>   Installed from repos.
> - PCS: pcs-0.9.162-5.el7.centos.1.x86_64
>   Installed from repos.
> 
> As before, I appreciate all your help
> 
> Best regards
> 
> 
> Francisco Javier Lopez | IT System Engineer | Global IT
> O: +34 619 728 249 | M: +34 619 728 249
> franciscojavier.lo...@solera.com | Solera.com
> Audatex Datos, S.A. | Avda. de Bruselas, 36, Salida 16, A‑1 (Diversia), Alcobendas, Madrid, 28108, Spain
> 
> On 15/10/18 16:42, Ken Gaillot wrote:
> > On Mon, 2018-10-15 at 14:37 +, Lopez, Francisco Javier [Global
> > IT]
> > wrote:
> > > Klaus/Ken.
> > > 
> > > Thx. for your reply.
> > > 
> > > The issue is ...
> > > 
> > > - I downloaded the source from GIT.
> > > - Downloaded the OS required packages.
> > > - Unzipped the source.
> > > - ./autogen.sh + ./configure ---> OK
> > > - Then, indeed, I tried: make rpm 
> > >   But I got thousands of errors:
> > 
> > Ah, I forgot it uses information from the repository. Rather than
> > download the source, you'd have to git clone the repository, and
> > run
> > from there. By default you'll be in the latest master branch; if
> > you
> > prefer to run a released version, you can check it out like "git
> > checkout Pacemaker-2.0.0".
> > 
> > > $ make rpm
> > > fatal: Not a git repository (or any of the parent directories):
> > > .git
> > > fatal: Not a git repository (or any of the parent directories):
> > > .git
> > > fatal: Not a git repository (or any of the parent directories):
> > > .git
> > > fatal: Not a git repository (or any of the parent directories):
> > > .git
> > > fatal: Not a git repository (or any of the parent directories):
> > > .git
> > > fatal: Not a git repository (or any of the parent directories):
> > > .git
> > > /bin/sh: -c: line 0: syntax error near unexpected token
> > > `Pacemaker-*'
> > > /bin/sh: -c: line 0: `case  in Pacemaker-*) echo '' | cut -c11-;; 
> > > *)
> > > git log --pretty=format:%h -n 1 '';; esac'
> > > ...
> > > ...
> > > 
> > > That made me think that downloading the source to the box I'm
> > > testing on might not be the best approach, so I decided to ask
> > > the experts.
> > > 
> > > Best Regards
> > > 
> > > Francisco Javier Lopez | IT System Engineer | Global IT
> > > O: +34 619 728 249 | M: +34 619 728 249
> > > franciscojavier.lo...@solera.com | Solera.com
> > > Audatex Datos, S.A. | Avda. de Bruselas, 36, Salida 16, A‑1 (Diversia), Alcobendas, Madrid, 28108, Spain

Re: [ClusterLabs] Re: How to generate RPMs for Pacemaker release 2.x on Centos

2018-10-17 Thread Lopez, Francisco Javier [Global IT]
Hello guys.

I finally created the RPMs for Pacemaker and Resource Agents. I will paste
the procedure in this thread so it can help anyone else like me :-)

I need a final update, hopefully, from you guys about this issue ...

I need to know if there is any compatibility or certification matrix somewhere.
I'm asking this because, as I'm creating the packages from source, I need to be
sure that the Pacemaker, PCS, Corosync, Agents, ... releases match. This would
guarantee that if we find an issue using the product, the problem is not a
compatibility issue among them.

These are the releases I have now:

- Pacemaker: 2.0.0, created by me.
- Resource Agents: 4.1.1, created by me.
- Corosync: corosynclib-2.4.3-2.el7_5.1.x86_64
            corosync-2.4.3-2.el7_5.1.x86_64
  Installed from repos.
- PCS: pcs-0.9.162-5.el7.centos.1.x86_64
  Installed from repos.

As before, I appreciate all your help

Best regards


Francisco Javier Lopez | IT System Engineer | Global IT
O: +34 619 728 249 | M: +34 619 728 249
franciscojavier.lo...@solera.com | Solera.com
Audatex Datos, S.A. | Avda. de Bruselas, 36, Salida 16, A‑1 (Diversia), Alcobendas, Madrid, 28108, Spain



On 15/10/18 16:42, Ken Gaillot wrote:

On Mon, 2018-10-15 at 14:37 +, Lopez, Francisco Javier [Global IT]
wrote:


Klaus/Ken.

Thx. for your reply.

The issue is ...

- I downloaded the source from GIT.
- Downloaded the OS required packages.
- Unzipped the source.
- ./autogen.sh + ./configure ---> OK
- Then, indeed, I tried: make rpm
  But I got thousands of errors:



Ah, I forgot it uses information from the repository. Rather than
download the source, you'd have to git clone the repository, and run
from there. By default you'll be in the latest master branch; if you
prefer to run a released version, you can check it out like "git
checkout Pacemaker-2.0.0".



$ make rpm
fatal: Not a git repository (or any of the parent directories): .git
fatal: Not a git repository (or any of the parent directories): .git
fatal: Not a git repository (or any of the parent directories): .git
fatal: Not a git repository (or any of the parent directories): .git
fatal: Not a git repository (or any of the parent directories): .git
fatal: Not a git repository (or any of the parent directories): .git
/bin/sh: -c: line 0: syntax error near unexpected token `Pacemaker-*'
/bin/sh: -c: line 0: `case  in Pacemaker-*) echo '' | cut -c11-;; *)
git log --pretty=format:%h -n 1 '';; esac'
...
...

That made me think that downloading the source to the box I'm testing on
might not be the best approach, so I decided to ask the experts.

Best Regards

Francisco Javier Lopez | IT System Engineer | Global IT
O: +34 619 728 249 | M: +34 619 728 249
franciscojavier.lo...@solera.com | Solera.com
Audatex Datos, S.A. | Avda. de Bruselas, 36, Salida 16, A‑1 (Diversia), Alcobendas, Madrid, 28108, Spain

On 15/10/18 16:27, Ken Gaillot wrote:


On Mon, 2018-10-15 at 14:39 +0200, Klaus Wenninger wrote:


On 10/15/2018 01:52 PM, Lopez, Francisco Javier [Global IT]
wrote:


Hello guys !

We are planning to use Pacemaker as a base HA software in our company.

Our requirements will be:

- Centos-7
- PostgreSql-10

We did several tests with Pacemaker release 1.1.8 and fixed the problems
found with the RA. We finally created new RPMs from source (4.x).

Now we want to test Pacemaker release 2.x but, as we will have to create
some clusters, we want to create new RPMs for this release instead of
doing manual installation on each new box. As per what I see, the RPMs
for our CentOS have not been created yet.

We could run 'autogen' + 'configure' but I do not find the way to
generate the RPMs. Could anyone share with me the correct steps to do
this, please?




The spec file found in the pacemaker GitHub repo should work
straightforwardly using mock to build against the repos of your CentOS
version. Just check that you are on current corosync, libqb, knet, ...
Pacemaker 2 seems to build well against the packages coming with
CentOS 7.5. Maybe others can comment on how advisable it is to run
that combo, though.

Klaus



Also, there is a convenient target for building RPMs from the spec
file: you can just run "make rpm" (after autogen.sh + configure).



Perhaps there are some steps written somewhere and I did not find
them ...

Appreciate your help.

Regards
Javier
Francisco Javier Lopez | IT System Engineer | Global IT
O: +34 619 728 249 | M: +34 619 728 249
franciscojavier.lo...@solera.com | Solera.com
Audatex Datos, S.A. | Avda. de Bruselas, 36, Salida 16, A‑1 (Diversia), Alcobendas, Madrid, 28108, Spain

Re: [ClusterLabs] Fwd: Not getting Fencing monitor alerts

2018-10-17 Thread Ken Gaillot
On Wed, 2018-10-17 at 12:18 +0530, Rohit Saini wrote:
> Hi Klaus,
> Please see answers below for your queries:
> 
> Do you have any evidence that monitoring is happening when "resources
> are unreachable"
> ( == fence_virtd is reachable?) like logs?
> [Rohit] Yes, monitoring is happening. I have already tested this. I'm
> getting pcs alerts accurately when monitoring goes down or up.

I'm not sure I understand the question, but monitor alerts are only
sent out when the monitor status changes. If there are 10 successful
monitors in a row and then a failure, there will be one alert for the
first successful monitor and then one alert for the failure.

> I would guess that there is no monitoring unless the fencing-resource 
> is accounted
> as started successfully. 
> [Rohit] If that's the case, then I am never going to get pcs alerts.
> The only way is to check the status of resources via fence_xvm or
> fence_ilo4 to know if resources are reachable or not. Do you agree
> with me?

Again I'm not too clear on the question, but monitors are only run on
started services, unless you configure a separate monitor with
target-role=Stopped. If a start fails, an alert will be sent for the
failure; if the start succeeds, an alert will be sent for the success,
a monitor will be started, and alerts will be sent for any change in
monitor status.
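
For illustration, a sketch of such a separate monitor on one of your
fence devices; the device name is hypothetical, and this assumes pcs
accepts the same op syntax for stonith devices as for resources (the
op attribute is spelled role=Stopped):

    pcs stonith update my-fence-device op monitor interval=60s role=Stopped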

> 
> Thanks,
> Rohit 
> 
> On Tue, Oct 16, 2018 at 2:03 PM Klaus Wenninger wrote:
> > On 10/16/2018 07:43 AM, Rohit Saini wrote:
> > > Gentle Reminder!!
> > > 
> > > -- Forwarded message -
> > > From: Rohit Saini 
> > > Date: Tue, Oct 9, 2018 at 2:51 PM
> > > Subject: Not getting Fencing monitor alerts
> > > To: 
> > > 
> > > 
> > > Hi,
> > > I am facing an issue in getting pcs alerts for fencing resources.
> > > 
> > > Scenario:
> > > 1. Configure the pcs alerts
> > > 2. Add stonith resources (resources are unreachable)
> > > 3. No monitoring alerts received.
> > > 
> > > Note:
> > > If stonith resources (reachable) are successfully added, then I
> > > get pcs alert for monitor link down and up.
> >  
> > Do you have any evidence that monitoring is happening when
> > "resources are unreachable"
> > ( == fence_virtd is reachable?) like logs?
> > I would guess that there is no monitoring unless the fencing-
> > resource is accounted
> > as started successfully.
> > 
> > Regards,
> > Klaus
> > >     --PCS Alert configuration--
> > >     pcs alert create id=${PCS_ALERT_ID} path=/var/lib/pacemaker/pw_alert.sh
> > >     pcs alert recipient add ${PCS_ALERT_ID} value=/var/lib/pacemaker/pw_alert.sh
> > > 
> > >     --Starting Stonith--
> > >     my_fence_name="fence-xvm-$my_hostname"
> > >     pcs stonith show $my_fence_name
> > >     if [ $? -ne 0 ]; then
> > >         #monitor on-fail is "ignore" which means "Pretend the
> > > resource did not fail".
> > >         #Only alarm will be generated if monitoring link goes
> > > down.
> > >         pcs stonith create $my_fence_name fence_xvm \
> > >         multicast_address=$my_mcast_addr port=$my_hostport \
> > >         pcmk_host_list=$my_hostname action=$actionvalue
> > > delay=$my_fence_delay \
> > >         op start interval="100s" on-fail="restart" \
> > >         op monitor interval="5s" on-fail="ignore"
> > >         pcs constraint colocation add $my_fence_name with master
> > > unicloud-master INFINITY
> > >         pcs constraint order start $my_fence_name then promote
> > > unicloud-master
> > >         pcs stonith update $my_fence_name meta failure-timeout=3s
> > >     fi
> > >     peer_fence_name="fence-xvm-$peer_hostname"
> > >     pcs stonith show $peer_fence_name
> > >     if [ $? -ne 0 ]; then
> > >         pcs stonith create $peer_fence_name fence_xvm \
> > >         multicast_address=$peer_mcast_addr port=$peer_hostport \
> > >         pcmk_host_list=$peer_hostname action=$actionvalue
> > > delay=$peer_fence_delay \
> > >         op start interval="100s" on-fail="restart" \
> > >         op monitor interval="5s" on-fail="ignore"
> > >         pcs constraint colocation add $peer_fence_name with
> > > master unicloud-master INFINITY
> > >         pcs constraint order start $peer_fence_name then promote
> > > unicloud-master
> > >         pcs stonith update $peer_fence_name meta failure-
> > > timeout=3s
> > >     fi
> > > 
> > >     pcs property set stonith-enabled=true
> > > 
> > > 
> > > Thanks,
> > > Rohit
> > > 
> > > 

[ClusterLabs] fence-agents v4.3.1

2018-10-17 Thread Oyvind Albrigtsen

ClusterLabs is happy to announce fence-agents v4.3.1, which is a
bugfix release for v4.3.0.

The source code is available at:
https://github.com/ClusterLabs/fence-agents/releases/tag/v4.3.1

The most significant enhancements in this release are:
- bugfixes and enhancements:
 - fence_openstack: fix configure library detection and version
   parameter missing for novaclient.Client() call

Everyone is encouraged to download and test the new release.
We do many regression tests and simulations, but we can't cover all
possible use cases, so your feedback is important and appreciated.

Many thanks to all the contributors to this release.


Best,
The fence-agents maintainers


Re: [ClusterLabs] corosync in multicast mode produces lots of unicast traffic

2018-10-17 Thread Jan Friesse

Klaus,


Hi Jan!

Thanks for your answer.


I have a Proxmox cluster which uses Corosync as cluster engine.
Corosync uses the "default" multicast configuration. Nevertheless,
using tcpdump I see many more packets sent by corosync using unicast
between the node members than multicast packets.

Is this normal behavior? If yes, please point me to some documentation.


It really depends. If the cluster is quiet (no configuration changes),
it basically only heartbeats, so it's pretty normal that the unicast
traffic (used for heartbeats) is bigger than the multicast traffic.


Now I am confused. I thought that corosync uses totem and the totem
protocol has implicit keep-alive by passing the token between the ring
members. So, even if there are no messages, the token is passed on in
the ring giving implicit keep-alive. And all this is done using multicast.


Yes, it's exactly as you wrote, but these messages use unicast. What
I called heartbeat is exactly this token passing.


Multicast is used only for sending regular messages.

So if the cluster is quiet, only the token is passed between nodes,
using unicast.
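
A quick way to see this on the wire, as a sketch (assumes the default
corosync port 5405 and that eth0 carries the cluster traffic):

$ sudo tcpdump -ni eth0 'udp port 5405 and multicast'      # regular messages
$ sudo tcpdump -ni eth0 'udp port 5405 and not multicast'  # token passing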

Regards,
  Honza



So, is my understanding wrong and corosync uses totem only for message
delivery and there is an additional "heartbeat" feature which sends
unicast keep-alive to all the known members?

Thanks
Klaus





[ClusterLabs] resource-agents v4.2.0 rc1

2018-10-17 Thread Oyvind Albrigtsen

ClusterLabs is happy to announce resource-agents v4.2.0 rc1.
Source code is available at:
https://github.com/ClusterLabs/resource-agents/releases/tag/v4.2.0rc1

The most significant enhancements in this release are:
- new resource agents:
 - aliyun-vpc-move-ip
 - gcp-pd-move
 - gcp-vpc-move-ip
 - gcp-vpc-move-route (improved Python version of gcp-vpc-move-ip)
 - gcp-vpc-move-vip
 - openstack-cinder-volume
 - openstack-floating-ip
 - openstack-info
 - podman
 - sybaseASE

- bugfixes and enhancements:
 - CI: fixes for bash path, strncpy in GCC 8 and missing docbook-style-xsl
 - CTDB: fix "ctdb_recovery_lock" validation
 - CTDB: fix incorrect DB corruption reports (ensure health check is run)
 - Filesystem: support symlink as mountpoint directory
 - IPaddr2: return OCF_ERR_GENERIC when failing due to IPv4 address collision
 - LVM-activate: fix for dashes in volume group and logical volume names
 - LVM-activate: read parameters for stop-action
 - LVM-activate: return OCF_ERR_CONFIGURED for incorrect vg_access_mode
 - LVM: added missing dash for activation parameters
 - SAPDatabase: add info to meta-data
 - SAPInstance: add monitored services for ENSA2 (bsc#1092384)
 - SAPInstance: implement reload action to avoid resource restarts after a
   non-unique parameter has been changed
 - SAPInstance: improve SAP instance profile detection
 - SAPInstance: improve stop-action logging
 - Squid: use ss if netstat is not available
 - VirtualDomain: add stateless support
 - VirtualDomain: correctly create logfile and set permissions
 - Xen: add utilization support for cpu and hv_memory
 - apache: retry PID check.
 - aws-vpc-move-ip: check routing table during monitor probe action
 - aws-vpc-move-ip: fix backward-compatibility
 - aws-vpc-move-ip: use ip utility to check address
 - awseip: fix allocation_id not found error
 - awseip: update required IAM role permissions
 - awsvip: get network-id from metadata
 - awsvip: improve secondary-private-ip query
 - configure: add Python path detection
 - exportfs: fix square bracket stripping in clientspec
 - findif: improve IPv6 NIC detection
 - findif: only match lines containing netmasks (for newer iputils)
 - garbd: support netstat and ss
 - iSCSITarget: support CHAP authentication for lio-t
 - ipsec: add tunnel fallback option
 - ldirectord: add manpage to systemd unit file
 - lvmlockd: add cmirrord support
 - mysql: remove obsolete DEBUG_LOG functionality (bsc#1021689)
 - nfsserver: fix rpcpipefs_dir and nfs_shared_infodir issues
 - ocf-binaries: use SSH-path detected by configure
 - ocf.py: new Python library and dev guide
 - oracle: improve dbopen error
 - pgsql: create replication slots after promoting master
 - pgsql: don't change ownership of /dev/null
 - pgsql: support PostgreSQL 11 or later
 - portblock: support ss and netstat (partial)
 - ra-dev-guide: update instructions for GitHub
 - rabbitmq-cluster: get cluster status from mnesia during monitor
 - rabbitmq-cluster: retry start when cluster join fails
 - redis: do not use absolute path in pidof calls
 - sg_persist: correctly pickup old keys
 - syslog-ng: add Premium Edition 6 and 7 support
 - systemd-tmpfiles: configure path with --with-rsctmpdir

The full list of changes for resource-agents is available at:
https://github.com/ClusterLabs/resource-agents/blob/v4.2.0rc1/ChangeLog

Everyone is encouraged to download and test the new release candidate.
We do many regression tests and simulations, but we can't cover all
possible use cases, so your feedback is important and appreciated.

Many thanks to all the contributors to this release.


Best,
The resource-agents maintainers


Re: [ClusterLabs] corosync in multicast mode produces lots of unicast traffic

2018-10-17 Thread Klaus Darilion
Hi Jan!

Thanks for your answer.

>> I have a Proxmox cluster which uses Corosync as cluster engine.
>> Corosync uses the "default" multicast configuration. Nevertheless,
>> using tcpdump I see many more packets sent by corosync using unicast
>> between the node members than multicast packets.
>>
>> Is this normal behavior? If yes, please point me to some documentation. 
> 
> It really depends. If the cluster is quiet (no configuration changes),
> it basically only heartbeats, so it's pretty normal that the unicast
> traffic (used for heartbeats) is bigger than the multicast traffic.

Now I am confused. I thought that corosync uses totem and the totem
protocol has implicit keep-alive by passing the token between the ring
members. So, even if there are no messages, the token is passed on in
the ring giving implicit keep-alive. And all this is done using multicast.

So, is my understanding wrong and corosync uses totem only for message
delivery and there is an additional "heartbeat" feature which sends
unicast keep-alive to all the known members?

Thanks
Klaus


Re: [ClusterLabs] Fwd: Not getting Fencing monitor alerts

2018-10-17 Thread Rohit Saini
Hi Klaus,
Please see answers below for your queries:

Do you have any evidence that monitoring is happening when "resources are
unreachable"
( == fence_virtd is reachable?) like logs?
[Rohit] Yes, monitoring is happening. I have already tested this. I'm
getting pcs alerts accurately when monitoring goes down or up.

I would guess that there is no monitoring unless the fencing-resource is
accounted
as started successfully.
[Rohit] If that's the case, then I am never going to get pcs alerts. The
only way is to check the status of resources via fence_xvm or fence_ilo4
to know if resources are reachable or not. Do you agree with me?

Thanks,
Rohit

On Tue, Oct 16, 2018 at 2:03 PM Klaus Wenninger  wrote:

> On 10/16/2018 07:43 AM, Rohit Saini wrote:
>
> Gentle Reminder!!
>
> -- Forwarded message -
> From: Rohit Saini 
> Date: Tue, Oct 9, 2018 at 2:51 PM
> Subject: Not getting Fencing monitor alerts
> To: 
>
>
> Hi,
> I am facing an issue in getting pcs alerts for fencing resources.
>
> Scenario:
> 1. Configure the pcs alerts
> 2. Add stonith resources (resources are unreachable)
> 3. No monitoring alerts received.
>
> Note:
> If stonith resources (reachable) are successfully added, then I get pcs
> alert for monitor link down and up.
>
>
> Do you have any evidence that monitoring is happening when "resources are
> unreachable"
> ( == fence_virtd is reachable?) like logs?
> I would guess that there is no monitoring unless the fencing-resource is
> accounted
> as started successfully.
>
> Regards,
> Klaus
>
>
> --PCS Alert configuration--
>pcs alert create id=${PCS_ALERT_ID}
> path=/var/lib/pacemaker/pw_alert.sh
>
> pcs alert recipient add ${PCS_ALERT_ID}
> value=/var/lib/pacemaker/pw_alert.sh
>
> --Starting Stonith--
> my_fence_name="fence-xvm-$my_hostname"
> pcs stonith show $my_fence_name
> if [ $? -ne 0 ]; then
> #monitor on-fail is "ignore" which means "Pretend the resource did
> not fail".
> #Only alarm will be generated if monitoring link goes down.
> pcs stonith create $my_fence_name fence_xvm \
> multicast_address=$my_mcast_addr port=$my_hostport \
> pcmk_host_list=$my_hostname action=$actionvalue
> delay=$my_fence_delay \
> op start interval="100s" on-fail="restart" \
> op monitor interval="5s" on-fail="ignore"
> pcs constraint colocation add $my_fence_name with master
> unicloud-master INFINITY
> pcs constraint order start $my_fence_name then promote
> unicloud-master
> pcs stonith update $my_fence_name meta failure-timeout=3s
> fi
> peer_fence_name="fence-xvm-$peer_hostname"
> pcs stonith show $peer_fence_name
> if [ $? -ne 0 ]; then
> pcs stonith create $peer_fence_name fence_xvm \
> multicast_address=$peer_mcast_addr port=$peer_hostport \
> pcmk_host_list=$peer_hostname action=$actionvalue
> delay=$peer_fence_delay \
> op start interval="100s" on-fail="restart" \
> op monitor interval="5s" on-fail="ignore"
> pcs constraint colocation add $peer_fence_name with master
> unicloud-master INFINITY
> pcs constraint order start $peer_fence_name then promote
> unicloud-master
> pcs stonith update $peer_fence_name meta failure-timeout=3s
> fi
>
>
>
> pcs property set stonith-enabled=true
>
>
> Thanks,
> Rohit
>
>
___
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] corosync in multicast mode produces lots of unicast traffic

2018-10-17 Thread Jan Friesse

Klaus,


Hi!

I have a Proxmox cluster which uses Corosync as cluster engine. Corosync 
uses the "default" multicast configuration. Nevertheless, using tcpdump 
I see many more packets sent by corosync using unicast between the node 
members than multicast packets.


Is this normal behavior? If yes, please point me to some documentation. 


It really depends. If the cluster is quiet (no configuration changes),
it basically only heartbeats, so it's pretty normal that the unicast
traffic (used for heartbeats) is bigger than the multicast traffic.


If no, does this mean I have some broken configuration? Everything seems
to work fine with my setup.


So then you don't have to worry :)

Regards,
  Honza




Thanks

Klaus
