Hi,

Thanks for the reply and the detailed explanation. I am not using the
--network=host option.
I have a Docker image based on Ubuntu 14.04 in which I only install this
additional software:

        RUN apt-get update && apt-get install -y wget git xz-utils openssh-server \
                systemd-services make gcc pkg-config psmisc fuse libpython2.7 libopenipmi0 \
                libdbus-glib-1-2 libsnmp30 libtimedate-perl libpcap0.8

I also configure SSH with key pairs so the containers can communicate easily.
The containers are created with these simple commands:

        docker create -it --cap-add=MKNOD --cap-add SYS_ADMIN \
                --device /dev/loop0 --device /dev/fuse \
                --net ${PUBLIC_NETWORK_NAME} --publish ${PG1_SSH_PORT}:22 \
                --ip ${PG1_PUBLIC_IP} --name ${PG1_PRIVATE_NAME} \
                --hostname ${PG1_PRIVATE_NAME} \
                -v ${MOUNT_FOLDER}:/Users ngha /bin/bash

        docker create -it --cap-add=MKNOD --cap-add SYS_ADMIN \
                --device /dev/loop1 --device /dev/fuse \
                --net ${PUBLIC_NETWORK_NAME} --publish ${PG2_SSH_PORT}:22 \
                --ip ${PG2_PUBLIC_IP} --name ${PG2_PRIVATE_NAME} \
                --hostname ${PG2_PRIVATE_NAME} \
                -v ${MOUNT_FOLDER}:/Users ngha /bin/bash

        docker create -it --cap-add=MKNOD --cap-add SYS_ADMIN \
                --device /dev/loop2 --device /dev/fuse \
                --net ${PUBLIC_NETWORK_NAME} --publish ${PG3_SSH_PORT}:22 \
                --ip ${PG3_PUBLIC_IP} --name ${PG3_PRIVATE_NAME} \
                --hostname ${PG3_PRIVATE_NAME} \
                -v ${MOUNT_FOLDER}:/Users ngha /bin/bash

/dev/fuse is used to configure GlusterFS on two other nodes, and /dev/loopX is
there just to better simulate my bare-metal environment.
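
The three commands differ only in the loop device index, the published SSH
port, the IP address and the container name. A minimal sketch in Python of
generating them from a table is shown here. The function name
docker_create_args and all port, IP, network and path values are hypothetical
placeholders for the ${...} variables, not the real settings.

```python
import shlex


def docker_create_args(index, ssh_port, ip, name, network, mount_folder,
                       image="ngha"):
    """Build the argv of one `docker create` call with the same options as above."""
    return [
        "docker", "create", "-it",
        "--cap-add=MKNOD", "--cap-add", "SYS_ADMIN",
        "--device", f"/dev/loop{index}", "--device", "/dev/fuse",
        "--net", network,
        "--publish", f"{ssh_port}:22",
        "--ip", ip,
        "--name", name, "--hostname", name,
        "-v", f"{mount_folder}:/Users",
        image, "/bin/bash",
    ]


# Hypothetical stand-ins for PGn_SSH_PORT, PGn_PUBLIC_IP and PGn_PRIVATE_NAME.
NODES = [
    (0, 2221, "10.0.0.11", "pg1"),
    (1, 2222, "10.0.0.12", "pg2"),
    (2, 2223, "10.0.0.13", "pg3"),
]

for index, port, ip, name in NODES:
    args = docker_create_args(index, port, ip, name,
                              network="public_net", mount_folder="/tmp/users")
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(args))
```

Printing the commands rather than executing them keeps the sketch free of side
effects; the output can be pasted into a shell once it looks right.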

One thing I do not understand: I compared corosync 2.3.5 (the old version that
worked fine) with 2.4.4 to understand the differences, but I haven't found any
change in the code that could explain the issue. quorumtool.c and cfg.c are
almost the same in both versions. The issue is probably somewhere else.
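
As a side note on the strace output quoted below, the first 24 bytes of the
request and of the reply can be decoded with a few lines of Python. This is a
minimal sketch, assuming each message starts with three little-endian 32-bit
fields, each padded to 8 bytes. That layout is an assumption inferred from the
bytes themselves, not taken from the libqb headers, and the names below are
made up for illustration.

```python
import struct

# Bytes copied from the sendto() and recvfrom() lines of the strace output,
# split into 4-byte groups (the escapes are octal, as printed by strace).
REQUEST = (b"\377\377\377\377" b"\0\0\0\0" b"\30\0\0\0" b"\0\0\0\0"
           b"\0\0\20\0" b"\0\0\0\0")
REPLY_HEAD = (b"\377\377\377\377" b"\0\0\0\0" b"(0\0\0" b"\0\0\0\0"
              b"\365\377\377\377" b"\0\0\0\0")

# Assumed layout: int32, 4 pad bytes, int32, 4 pad bytes, int32, 4 pad bytes.
HEADER = struct.Struct("<i4xi4xi4x")


def decode(raw):
    """Return the three assumed 32-bit fields at the start of a message."""
    return HEADER.unpack(raw[:HEADER.size])


print(decode(REQUEST))     # (-1, 24, 1048576)
print(decode(REPLY_HEAD))  # (-1, 12328, -11)
```

Under that assumption the second field of each message matches the length
seen in the syscalls (24 bytes sent, 12328 bytes received), and the third
field of the reply is -11, which on Linux is -EAGAIN. That would be consistent
with the server answering the connection request with an error rather than
with a problem in the socket transport itself.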


> On 27 Jun 2018, at 08:34, Jan Pokorný <jpoko...@redhat.com> wrote:
> 
> On 26/06/18 17:56 +0200, Salvatore D'angelo wrote:
>> I did another test. I modified docker container in order to be able to run 
>> strace.
>> Running strace corosync-quorumtool -ps I got the following:
> 
>> [snipped]
>> connect(5, {sa_family=AF_LOCAL, sun_path=@"cfg"}, 110) = 0
>> setsockopt(5, SOL_SOCKET, SO_PASSCRED, [1], 4) = 0
>> sendto(5, "\377\377\377\377\0\0\0\0\30\0\0\0\0\0\0\0\0\0\20\0\0\0\0\0", 24, MSG_NOSIGNAL, NULL, 0) = 24
>> setsockopt(5, SOL_SOCKET, SO_PASSCRED, [0], 4) = 0
>> recvfrom(5, 0x7ffd73bd7ac0, 12328, 16640, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
>> poll([{fd=5, events=POLLIN}], 1, 4294967295) = 1 ([{fd=5, revents=POLLIN}])
>> recvfrom(5, "\377\377\377\377\0\0\0\0(0\0\0\0\0\0\0\365\377\377\377\0\0\0\0\0\0\0\0\0\0\0\0"..., 12328, MSG_WAITALL|MSG_NOSIGNAL, NULL, NULL) = 12328
>> shutdown(5, SHUT_RDWR)                  = 0
>> close(5)                                = 0
>> write(2, "Cannot initialise CFG service\n", 30Cannot initialise CFG service) = 30
>> [snipped]
> 
> This just demonstrated the effect of already detailed server-side
> error in the client, which communicates with the server just fine,
> but as soon as the server hits the mmap-based problem, it bails
> out the observed way, leaving client unsatisfied.
> 
> Note one thing, abstract Unix sockets are being used for the
> communication like this (observe the first line in the strace
> output excerpt above), and if you happen to run container via
> a docker command with --network=host, you may also be affected with
> issues arising from abstract sockets not being isolated but rather
> sharing the same namespace.  At least that was the case some years
> back and what asked for a switch in underlying libqb library to
> use strictly the file-backed sockets, where the isolation
> semantics matches the intuition:
> 
> https://lists.clusterlabs.org/pipermail/users/2017-May/013003.html
> 
> + way to enable (presumably only for container environments, note
> that there's no per process straightforward granularity):
> 
> https://clusterlabs.github.io/libqb/1.0.2/doxygen/qb_ipc_overview.html
> (scroll down to "IPC sockets (Linux only)")
> 
> You may test that if you are using said --network=host switch.
> 
>> I tried to understand what happens behind the scenes but it is not easy for me.
>> Hoping someone on this list can help.
> 
> Containers are tricky, just as Ansible (as shown earlier on the list)
> can be, when encumbered with false beliefs and/or misunderstandings.
> Virtual machines may serve better wrt. insights for the later bare
> metal deployments.
> 
> -- 
> Jan (Poki)
> _______________________________________________
> Users mailing list: Users@clusterlabs.org
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
