Public bug reported:

Package version: 2.3.3-1ubuntu1
Ubuntu release: 14.04

The current package does not use upstart, even though upstream ships an upstart
corosync.conf (with which I suspect the bug would not reproduce). Instead it
uses a sysv init script that starts the daemon like so:

        start-stop-daemon --start --quiet --exec $DAEMON --test > /dev/null \
                || return 1
        start-stop-daemon --start --quiet --exec $DAEMON -- $DAEMON_ARGS \
                || return 2
        # Add code here, if necessary, that waits for the process to be ready
        # to handle requests from services started subsequently which depend
        # on this one.  As a last resort, sleep for some time.
        pidof corosync > $PIDFILE

As you can see above, the check for whether the service is already running is
whether any process instance of $DAEMON exists. Processes inside lxc containers
are visible from the host, so if a container also runs Ubuntu and corosync, its
corosync process is matched as well and the service on the host will not start.

The second problem is that the naive pidof call writes the PIDs of all corosync
processes, including those in all the lxc containers, into $PIDFILE.
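
One way to confirm the second problem on an affected host is to count the
entries in the pidfile. The following is only a minimal sketch: the helper
name pidfile_pid_count is hypothetical (it is not part of the package), and
the path passed to it is whatever $PIDFILE is set to in the init script.

```shell
# Hypothetical helper: print how many PIDs a pidfile contains.
# A healthy pidfile holds exactly one entry; pidof prints every match,
# separated by spaces, so a polluted pidfile holds several.
pidfile_pid_count() {
    # wc -w counts whitespace-separated words, i.e. PIDs.
    # tr strips the padding some wc implementations add.
    wc -w < "$1" | tr -d ' '
}
```

For example, [ "$(pidfile_pid_count "$PIDFILE")" -eq 1 ] would fail on a host
where containers also run corosync.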

I recommend checking the pidfile in start-stop-daemon like so:

    start-stop-daemon --start --quiet --pidfile $PIDFILE --exec $DAEMON \
            --test > /dev/null || return 1
    start-stop-daemon --start --quiet --pidfile $PIDFILE --exec $DAEMON \
            -- $DAEMON_ARGS || return 2

As for storing the pid: since corosync forks, we cannot use the --make-pidfile
option of start-stop-daemon. Something like the following could be used
instead:

    cgroup_path() {
        # First line of the process's cgroup file, third ':'-separated field
        head -n 1 "/proc/$1/cgroup" | cut -d: -f3
    }

    # Inside the init script's start function, after the daemon was started:
    local pids own_path proc_path
    pids=""
    own_path=$(cgroup_path $$)
    for i in $(pgrep corosync); do
        proc_path=$(cgroup_path "$i")
        if [ "$own_path" = "$proc_path" ]; then
            # Only keep the last one. If there is more than one, the earlier
            # one is the parent, which exits
            pids="$i"
        fi
    done
    echo "${pids}" > "$PIDFILE"
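
The selection logic above can be tried without a running corosync by pointing
it at a fake proc tree. This is only a sketch under stated assumptions:
PROC_ROOT and pick_local_pid are hypothetical names introduced here, and the
candidate PIDs are passed as arguments instead of coming from pgrep.

```shell
# Hypothetical, testable variant of the cgroup filter.
# PROC_ROOT defaults to /proc but may point at a fake tree for experiments.
PROC_ROOT=${PROC_ROOT:-/proc}

cgroup_path() {
    # First line of the cgroup file, third ':'-separated field.
    # Errors are silenced because a candidate may already have exited.
    head -n 1 "$PROC_ROOT/$1/cgroup" 2>/dev/null | cut -d: -f3
}

# pick_local_pid OWN_PID CANDIDATE...
# Prints the last candidate whose cgroup path equals that of OWN_PID,
# or an empty line if none matches. Variables are left global on purpose,
# since 'local' is not guaranteed by POSIX sh.
pick_local_pid() {
    own_path=$(cgroup_path "$1")
    shift
    picked=""
    for pid in "$@"; do
        if [ "$(cgroup_path "$pid")" = "$own_path" ]; then
            picked=$pid
        fi
    done
    echo "$picked"
}
```

In the init script this could be called as
pick_local_pid $$ $(pgrep corosync) > "$PIDFILE", which keeps the behaviour of
the snippet above while making the filter itself easy to exercise in isolation.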

The present bug is especially damaging in juju deployments where a node runs
many services in lxc containers and one outside of them.

Note that there was a similar report for Debian[1] concerning the snmpd
package.

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?msg=31;bug=718702

** Affects: corosync (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1528069

Title:
  corosync usage of start-stop-daemon prevents it from running when the
  corosync is also running in lxc
