Re: [ClusterLabs] corosync race condition when node leaves immediately after joining

2017-11-13 Thread Jan Friesse

Jonathan,
I've finished (I hope) the proper fix for the problem you've seen, so could
you please test it:


https://github.com/corosync/corosync/pull/280

Thanks,
  Honza




On 31/10/17 10:41, Jan Friesse wrote:

Did you get a chance to confirm whether the workaround to remove the
final call to votequorum_exec_send_nodeinfo from votequorum_exec_init_fn
is safe?


I haven't had time to find out exactly what is happening, but I can
confirm that the workaround is safe. It's just not a full fix, and
there can still be situations where the bug appears.



The patch works well in our testing, but I'm keen to hear whether you
think this is likely to be safe for use in production.


It's safe but it's just a workaround.


Thanks for confirming.

Jonathan



___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] boothd-site/boothd-arbitrator: WARN: packet timestamp older than previous one

2017-11-13 Thread Nicolas Huillard
On Saturday, 11 November 2017 at 10:46 +0100, Dejan Muhamedagic wrote:
> Hmm, there is supposed to be a guarantee that the clock is
> monotonic (we use CLOCK_MONOTONIC). The clock_getres(3) man page does
> say, though, that it "is affected by the incremental adjustments
> performed by adjtime(3) and NTP." But normally those
> adjustments shouldn't exceed a certain threshold per second.
> Anyway, maybe you can try with hpet as a clock source.

I tried with hpet and acpi_pm (arbitrator using tsc, one site using
hpet, the other one using acpi_pm), with the same result.
Note that the booth message "packet timestamp older than previous one"
really means "not newer than", i.e. it may appear even if the clock is
monotonic but simply returns the same value two consecutive times.
There should not be any NTP abnormalities here, since the clocks have
been in sync and stable for a long time.

> > The CPUs are currently underutilised, which can lead to increased
> > discrepancy between the cores' TSCs.
> > I can either:
> > * switch to another clocksource (I don't yet know how to do that)
> 
> You can do that with sysfs: in
> /sys/devices/system/clocksource/clocksource0/ see
> available_clocksource and current_clocksource. To modify it on
> boot there are certainly some kernel parameters.

I just changed it and restarted boothd only. I don't plan on changing
the clock source in the long run.
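
For reference, the commands involved look roughly like this (the sysfs
paths are the ones Dejan mentioned; the kernel parameter at the end is
my assumption for making the change persistent):

  # list the clock sources the kernel can use
  cat /sys/devices/system/clocksource/clocksource0/available_clocksource
  # show the one currently in use
  cat /sys/devices/system/clocksource/clocksource0/current_clocksource
  # switch at runtime (takes effect immediately, not persistent)
  echo hpet > /sys/devices/system/clocksource/clocksource0/current_clocksource
  # to make it persistent, add e.g. clocksource=hpet to the kernel command line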

> No idea. If you feel like experimenting, you could add a bit more
> information into the debug messages (i.e. time difference) and
> log all time diffs. Otherwise, though it doesn't look so to me,
> maybe we do have some issue in the code, one never knows ;-)

I don't have much time to recompile and test booth, but I may
eventually do it on the arbitrator (which unfortunately is also the
machine least likely to trigger those warnings). I'll try to add the
time diff, to check whether it is 0 or negative when the warning appears.

In the meantime, I'll just ignore those warnings.

Thank you very much!

-- 
Nicolas Huillard



Re: [ClusterLabs] Users Digest, Vol 34, Issue 24

2017-11-13 Thread Kanika Satija
Hi All,

It seems that the pcsd service is not running on all three nodes.

Please check the pcsd service with the commands below and, if it is not running, start it:
1) systemctl status pcsd
2) systemctl start pcsd

After this, try to authenticate the nodes again.
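
If pcsd is active but "pcs cluster auth" still returns "noresponse", a
quick sanity check is to make sure each node can actually reach pcsd on
its default port (2224). A sketch, using one of the host names from the
log below (adjust to your nodes):

  # is the daemon up and listening?
  systemctl is-active pcsd
  ss -tlnp | grep 2224
  # can this node reach pcsd on a peer? (-k because of the self-signed cert)
  curl -k -o /dev/null -w '%{http_code}\n' https://ufm-host42-012.rdmz.labs.mlnx:2224/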

Regards,
Kanika


Re: [ClusterLabs] One cluster with two groups of nodes

2017-11-13 Thread Alberto Mijares
>
> Colocation constraints may take a "node-attribute" parameter, which
> basically means, "Put this resource on a node of the same class as the
> one running resource X".
>
> In this case, you might set a "group" node attribute on all nodes, to
> "1" on the three primary nodes and "2" on the three failover nodes.
> Pick one resource as your base resource that everything else should go
> along with. Configure colocation constraints for all the other
> resources with that one, using "node-attribute=group". That means that
> all the other resources must be on a node with the same "group"
> attribute value as the node that the base resource is running on.
>
> "node-attribute" defaults to "#uname" (node name), this giving the
> usual behavior of colocation constraints: put the resource only on a
> node with the same name, i.e. the same node.
>
> The remaining question is, how do you want the base resource to fail
> over? If the base resource can fail over to any other node, whether in
> the same group or not, then you're done. If the base resource can only
> run on one node in each group, ban it from the other nodes using
> -INFINITY location constraints. If the base resource should only fail
> over to the opposite group, that's trickier, but something roughly
> similar would be to prefer one node in each group with an equal
> positive score location constraint, and migration-threshold=1.
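
For anyone finding this thread later, here is a minimal sketch of that
setup in pcs syntax (resource and node names are placeholders; I'm
assuming pcs passes node-attribute through as a colocation constraint
option, and crm_attribute would work just as well for setting the node
attributes):

  # tag the three primary nodes and the three failover nodes
  pcs node attribute node1 group=1
  pcs node attribute node2 group=1
  pcs node attribute node3 group=1
  pcs node attribute node4 group=2
  pcs node attribute node5 group=2
  pcs node attribute node6 group=2

  # colocate every other resource with the base resource by group membership
  pcs constraint colocation add rscB with rscA INFINITY node-attribute=group
  pcs constraint colocation add rscC with rscA INFINITY node-attribute=group

  # optionally keep the base resource on one node per group
  pcs constraint location rscA avoids node2 node3 node5 node6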


I just want to let you know that this worked like a charm.

Again, thank you very much.

Best regards,


Alberto Mijares



Re: [ClusterLabs] pcs authentication fails on Centos 7.0 & 7.1

2017-11-13 Thread Jan Friesse

Digimer wrote:

On 2017-11-12 04:20 AM, Aviran Jerbby wrote:


Hi Clusterlabs mailing list,



I'm having issues running pcs authentication on RH cent os 7.0/7.1
(Please see log below).



*_It's important to mention that pcs authentication with RH cent os
7.2/7.4 and with the same setup and packages is working._*



*[root@ufm-host42-014 tmp]# cat /etc/redhat-release *

*CentOS Linux release 7.0.1406 (Core) *

*[root@ufm-host42-014 tmp]# rpm -qa | grep openssl*

*openssl-libs-1.0.2k-8.el7.x86_64*

*openssl-devel-1.0.2k-8.el7.x86_64*

*openssl-1.0.2k-8.el7.x86_64*

*[root@ufm-host42-014 tmp]# rpm -qa | grep pcs*

*pcs-0.9.158-6.el7.centos.x86_64*

*pcsc-lite-libs-1.8.8-4.el7.x86_64*

*[root@ufm-host42-014 tmp]# pcs cluster auth
ufm-host42-012.rdmz.labs.mlnx ufm-host42-013.rdmz.labs.mlnx
ufm-host42-014.rdmz.labs.mlnx -u hacluster -p "" --debug*

*Running: /usr/bin/ruby -I/usr/lib/pcsd/ /usr/lib/pcsd/pcsd-cli.rb auth*

*Environment:*

*  DISPLAY=localhost:10.0*

*  GEM_HOME=/usr/lib/pcsd/vendor/bundle/ruby*

*  HISTCONTROL=ignoredups*

*  HISTSIZE=1000*

*  HOME=/root*

*  HOSTNAME=ufm-host42-014.rdmz.labs.mlnx*

*  KDEDIRS=/usr*

*  LANG=en_US.UTF-8*

*  LC_ALL=C*

*  LESSOPEN=||/usr/bin/lesspipe.sh %s*

*  LOGNAME=root*

*
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.

v
ob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01;36:*.au=01;36:*.flac=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.axa=01;36:*.oga=01;36:*.spx=01;36:*.xspf=01;36:*


* MAIL=/var/spool/mail/root*

*  OLDPWD=/root*

*
PATH=/usr/lib64/qt-3.3/bin:/root/perl5/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin*

*  PCSD_DEBUG=true*

*  PCSD_NETWORK_TIMEOUT=60*

*  PERL5LIB=/root/perl5/lib/perl5:*

*  PERL_LOCAL_LIB_ROOT=:/root/perl5*

*  PERL_MB_OPT=--install_base /root/perl5*

*  PERL_MM_OPT=INSTALL_BASE=/root/perl5*

*  PWD=/tmp*

*  QTDIR=/usr/lib64/qt-3.3*

*  QTINC=/usr/lib64/qt-3.3/include*

*  QTLIB=/usr/lib64/qt-3.3/lib*

*  QT_GRAPHICSSYSTEM=native*

*  QT_GRAPHICSSYSTEM_CHECKED=1*

*  QT_PLUGIN_PATH=/usr/lib64/kde4/plugins:/usr/lib/kde4/plugins*

*  SHELL=/bin/bash*

*  SHLVL=1*

*  SSH_CLIENT=10.208.0.12 47232 22*

*  SSH_CONNECTION=10.208.0.12 47232 10.224.40.143 22*

*  SSH_TTY=/dev/pts/0*

*  TERM=xterm*

*  USER=root*

*  XDG_RUNTIME_DIR=/run/user/0*

*  XDG_SESSION_ID=6*

*  _=/usr/sbin/pcs*

*--Debug Input Start--*

*{"username": "hacluster", "local": false, "nodes":
["ufm-host42-014.rdmz.labs.mlnx", "ufm-host42-013.rdmz.labs.mlnx",
"ufm-host42-012.rdmz.labs.mlnx"], "password": "", "force": false}*

*--Debug Input End--*

* *

*Finished running: /usr/bin/ruby -I/usr/lib/pcsd/
/usr/lib/pcsd/pcsd-cli.rb auth*

*Return value: 0*

*--Debug Stdout Start--*

*{*

*  "status": "ok",*

*  "data": {*

*"auth_responses": {*

*  "ufm-host42-014.rdmz.labs.mlnx": {*

*"status": "noresponse"*

* },*

*  "ufm-host42-012.rdmz.labs.mlnx": {*

*"status": "noresponse"*

*  },*

*  "ufm-host42-013.rdmz.labs.mlnx": {*

*"status": "noresponse"*

*  }*

*},*

*"sync_successful": true,*

*"sync_nodes_err": [*

* *

*],*

*"sync_responses": {*

*}*

*  },*

*  "log": [*

*"I, [2017-11-07T19:52:27.434067 #25065]  INFO -- : PCSD Debugging
enabled\n",*

*"D, [2017-11-07T19:52:27.454014 #25065] DEBUG -- : Did not detect
RHEL 6\n",*

*"I, [2017-11-07T19:52:27.454076 #25065]  INFO -- : Running:
/usr/sbin/corosync-cmapctl totem.cluster_name\n",*

*"I, [2017-11-07T19:52:27.454127 #25065]  INFO -- : CIB USER:
hacluster, groups: \n",*

*"D, [2017-11-07T19:52:27.458142 #25065] DEBUG -- : []\n",*

*"D, [2017-11-07T19:52:27.458216 #25065] DEBUG -- : [\"Failed to
initialize the cmap API. Error CS_ERR_LIBRARY\\n\"]\n",*

*"D,