Hi All,

The option --ulimit memlock=536870912 worked fine.
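(For reference, a minimal sketch of the container creation with both workarounds applied; the image name is a placeholder and the ... stands for the remaining options:

    docker create --shm-size=512m --ulimit memlock=536870912 ... <my-cluster-image>

The memlock value is given in bytes, so 536870912 corresponds to 512 MiB of lockable memory.)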
I now have another strange issue. The upgrade without updating libqb (leaving 0.16.0 in place) worked fine. If, after the upgrade, I stop pacemaker and corosync, download the latest libqb version (https://github.com/ClusterLabs/libqb/releases/download/v1.0.3/libqb-1.0.3.tar.gz), and build and install it, everything works fine.

If instead I try to install everything in sequence (after the installation of the old code):

libqb 1.0.3
corosync 2.4.4
pacemaker 1.1.18
crmsh 3.0.1
resource agents 4.1.1

then when I try to start corosync I get the following error:

Starting Corosync Cluster Engine (corosync): /etc/init.d/corosync: line 99: 8470 Aborted $prog $COROSYNC_OPTIONS > /dev/null 2>&1 [FAILED]

If I launch corosync -f I get:

corosync: main.c:143: logsys_qb_init: Assertion `"implicit callsite section is populated, otherwise target's build is at fault, preventing reliable logging" && __start___verbose != __stop___verbose' failed.

Nothing is logged (even in debug mode). I do not understand why installing libqb during the normal upgrade process fails, while upgrading it after the crmsh/pacemaker/corosync/resource-agents upgrade works fine.

On 3 Jul 2018, at 11:42, Christine Caulfield <ccaul...@redhat.com> wrote:
>
> On 03/07/18 07:53, Jan Pokorný wrote:
>> On 02/07/18 17:19 +0200, Salvatore D'angelo wrote:
>>> Today I tested the two suggestions you gave me. Here is what I did,
>>> in the script where I create my five-machine cluster (I use three
>>> nodes for the pacemaker PostgreSQL cluster and two nodes for the
>>> glusterfs we use for database backups and WAL files).
>>>
>>> FIRST TEST
>>> ——————————
>>> I added --shm-size=512m to the “docker create” command. I noticed
>>> that as soon as I start the container the shm size is 512m, so I
>>> did not need to add the entry in /etc/fstab. However, I did it
>>> anyway:
>>>
>>> tmpfs /dev/shm tmpfs defaults,size=512m 0 0
>>>
>>> and then:
>>>
>>> mount -o remount /dev/shm
>>>
>>> Then I uninstalled all the pieces of software (crmsh, resource
>>> agents, corosync and pacemaker) and installed the new ones.
>>> I started corosync and pacemaker, but the same problem occurred.
>>>
>>> SECOND TEST
>>> ———————————
>>> stopped corosync and pacemaker
>>> uninstalled corosync
>>> built corosync with --enable-small-memory-footprint and installed it
>>> started corosync and pacemaker
>>>
>>> IT WORKED.
>>>
>>> I would now like to understand why it did not work in the first
>>> test and why it worked in the second. What kind of memory is being
>>> used too much here? /dev/shm does not seem to be the problem: I
>>> allocated 512m on all three docker images (obviously on my single
>>> Mac) and enabled the container option as you suggested. Am I
>>> missing something here?
>>
>> My suspicion then fully shifts towards the "maximum number of bytes
>> of memory that may be locked into RAM" per-process resource limit,
>> as raised in one of the most recent messages ...
>>
>>> For the moment I want to use Docker only for test purposes, so it
>>> would be OK to use --enable-small-memory-footprint, but is there
>>> something I can do to get corosync working even without this
>>> option?
>>
>> ... so try running the container the already suggested way:
>>
>> docker run ... --ulimit memlock=33554432 ...
>>
>> or possibly higher (as a rule of thumb, keep doubling the
>> accumulated value until some unreasonable amount is reached, like
>> the equivalent of the already used 512 MiB).
>>
>> Hope this helps.
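(For what it's worth, when experimenting with these memlock values, the limit that actually landed inside a running container can be checked with something like the following, where <container> is a placeholder for the container name:

    docker exec <container> sh -c 'ulimit -l'

Note that ulimit -l reports the value in KiB, so memlock=33554432 bytes should show up as 32768.)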
> This makes a lot of sense to me. As Poki pointed out earlier, in
> corosync 2.4.3 (I think) we fixed a regression that caused corosync
> NOT to be locked in RAM after it forked - which was causing potential
> performance issues. So if you replace an earlier corosync with 2.4.3
> or later, it will use more locked memory than before.
>
> Chrissie
>
>>> The reason I am asking is that, in the future, we may deploy our
>>> cluster in production in a containerised way (for the moment it is
>>> just an idea). This would save a lot of time in developing,
>>> maintaining and deploying our patch system. All prerequisites and
>>> dependencies would be enclosed in the container, and if the IT team
>>> does some maintenance on the bare metal (e.g. installing new
>>> dependencies) it will not affect our containers. I do not see many
>>> performance drawbacks in using containers. The point is to
>>> understand whether a containerised approach could save us a lot of
>>> maintenance headaches with this cluster without affecting
>>> performance too much. I have noticed this approach in a lot of
>>> contexts in Cloud environments.
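(Side note on the locked-memory explanation above: how much memory a running corosync actually has locked can be inspected via /proc, for example:

    grep VmLck /proc/$(pidof corosync)/status

which should make the difference between a pre-2.4.3 and a 2.4.3-or-later build directly visible.)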
_______________________________________________
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org