Bug#986325: corosync: crash with compression enabled

Ferenc Wágner Sat, 03 Apr 2021 01:09:14 -0700

Package: corosync
Version: 3.1.0-3
Severity: normal
Tags: patch upstream
Forwarded: https://github.com/corosync/corosync/issues/630


As reported by Lukey3332:

Sometimes corosync crashes at startup, but only if compression is enabled.

Distribution: Debian Bullseye

Corosync version:

Corosync Cluster Engine, version '3.1.0'
Copyright (c) 2006-2018 Red Hat, Inc.

Kronosnet version:

Package: libknet1
Source: kronosnet
Version: 1.20-4

Here is a backtrace:

#0  __GI___pthread_mutex_lock (mutex=mutex@entry=0x555558619958) at ../nptl/pthread_mutex_lock.c:67
#1  0x00007ffff7e6c728 in pmtud_reschedule (knet_h=0x555555577320 <_logsys_log_printf>, knet_h@entry=0x555558619958) at threads_common.c:42
#2  get_global_wrlock (knet_h=knet_h@entry=0x555555577320 <_logsys_log_printf>) at threads_common.c:61
#3  0x00007ffff7e64316 in knet_handle_compress (knet_h=0x555555577320 <_logsys_log_printf>, knet_handle_compress_cfg=0x7fffffffcea0) at compress.c:503
#4  0x000055555559ae8f in totemknet_configure_compression (knet_context=knet_context@entry=0x55555574d900, totem_config=totem_config@entry=0x7fffffffd310) at totemknet.c:1565
#5  0x000055555559c104 in totemknet_initialize (poll_handle=0x5555556ddd30, knet_context=0x55555574d900, totem_config=0x7fffffffd310, stats=<optimized out>, context=0x5555556eed20,
    deliver_fn=0x55555558c810 <main_deliver_fn>, iface_change_fn=0x55555558d9c0 <main_iface_change_fn>, mtu_changed=0x55555558c290 <totempg_mtu_changed>, target_set_completed=0x55555558ddd0 <target_set_completed>)
    at totemknet.c:1149
#6  0x0000555555588550 in totemnet_initialize (loop_pt=loop_pt@entry=0x5555556ddd30, net_context=net_context@entry=0x5555557026f8, totem_config=totem_config@entry=0x7fffffffd310, stats=0x555555702728,
    context=context@entry=0x5555556eed20, deliver_fn=deliver_fn@entry=0x55555558c810 <main_deliver_fn>, iface_change_fn=0x55555558d9c0 <main_iface_change_fn>, mtu_changed=0x55555558c290 <totempg_mtu_changed>,
    target_set_completed=0x55555558ddd0 <target_set_completed>) at totemnet.c:343
#7  0x000055555559541a in totemsrp_initialize (poll_handle=poll_handle@entry=0x5555556ddd30, srp_context=srp_context@entry=0x5555556760f0 <totemsrp_context>, totem_config=totem_config@entry=0x7fffffffd310,
    stats=stats@entry=0x5555556760c0 <totempg_stats>, deliver_fn=deliver_fn@entry=0x5555555970b0 <totempg_deliver_fn>, confchg_fn=confchg_fn@entry=0x5555555965a0 <totempg_confchg_fn>,
    waiting_trans_ack_cb_fn=0x555555596560 <totempg_waiting_trans_ack_cb>) at totemsrp.c:981
#8  0x0000555555597c28 in totempg_initialize (poll_handle=0x5555556ddd30, totem_config=totem_config@entry=0x7fffffffd310) at totempg.c:824
#9  0x000055555555e0de in main (argc=-11504, argv=<optimized out>, envp=<optimized out>) at main.c:1526

corosync.conf:

# Please read the corosync.conf.5 manual page
totem {
        version: 2

        cluster_name: tele-clu

        key: <snip>
        crypto_cipher: aes256
        crypto_hash: sha256

        knet_compression_model: zlib
        knet_compression_level: 6

        link_mode: passive

        interface {
                linknumber: 0
                knet_link_priority: 1
        }

        interface {
                linknumber: 1
                knet_link_priority: 0
        }
        token: 5000
}

logging {
        # Log the source file and line where messages are being
        # generated. When in doubt, leave off. Potentially useful for
        # debugging.
        fileline: off
        # Log to standard error. When in doubt, set to yes. Useful when
        # running in the foreground (when invoking "corosync -f")
        to_stderr: yes
        # Log to a log file. When set to "no", the "logfile" option
        # must not be set.
        to_logfile: yes
        logfile: /var/log/corosync/corosync.log
        # Log to the system log daemon. When in doubt, set to yes.
        to_syslog: yes
        # Log debug messages (very verbose). When in doubt, leave off.
        debug: off
        # Log messages with time stamps. When in doubt, set to hires (or on)
        #timestamp: hires
        logger_subsys {
                subsys: QUORUM
                debug: off
        }
}

quorum {
        # Enable and configure quorum subsystem (default: off)
        # see also corosync.conf.5 and votequorum.5
        provider: corosync_votequorum
}

nodelist {

        node {
                # Hostname of the node
                name: tele-clu-01
                # Cluster membership node identifier
                nodeid: 1

                ring0_addr: 192.168.233.1
                ring1_addr: 192.168.178.241
        }
        node {
                # Hostname of the node
                name: tele-clu-02
                # Cluster membership node identifier
                nodeid: 2

                ring0_addr: 192.168.233.2
                ring1_addr: 192.168.178.242
        }
        node {
                # Hostname of the node
                name: tele-clu-03
                # Cluster membership node identifier
                nodeid: 3

                ring0_addr: 192.168.233.6
                ring1_addr: 192.168.178.243
        }
}

------------------------------
As commented by fabbione:

It turns out the issue is in corosync's configuration handling when
compression is enabled.
The fix for master is here: https://github.com/corosync/corosync/pull/631
The same patch applies to 3.1.1.

A backport to 3.1.0 is attached to the upstream issue.
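
Until a fixed package is installed, one possible workaround is to leave
compression disabled, since the report above only sees the crash with
compression enabled. This is an assumption drawn from the report and has
not been verified here. A minimal sketch of the relevant totem section
("none" is the documented default for knet_compression_model; all other
options stay as in the reporter's file):

totem {
        version: 2

        cluster_name: tele-clu

        # Assumed workaround: disable knet compression until the fix
        # from the upstream pull request is applied.
        knet_compression_model: none
}

With the model set to none, the knet_compression_level line can be
dropped as well.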
