Package: ifenslave
Version: 2.13
Severity: normal

Hi,

I have a simple bonding configuration with two physical Ethernet
interfaces, both defined with the `allow-hotplug` option.  I use
allow-hotplug because the interfaces are on USB.  And since I'm
using allow-hotplug, I chose the style of configuration where I use
the `bond-master` configuration option under each slave configuration
stanza, rather than `bond-slaves` in the bonding master interface
stanza.

This configuration is similar to the one in
examples/two_hotplug_ethernet.  I'm attaching a copy of my
/etc/network/interfaces file.

The problem is that sometimes after a reboot, I find that one of my
interfaces (wlx0013efd01275: an external, wireless USB interface) is
not a member of the bond:

$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group 
default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enxb827eb9e4634: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc 
pfifo_fast master bond0 state UP group default qlen 1000
    link/ether b8:27:eb:9e:46:34 brd ff:ff:ff:ff:ff:ff
3: wlx0013efd01275: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state 
UP group default qlen 1000
    link/ether 00:13:ef:d0:12:75 brd ff:ff:ff:ff:ff:ff
    inet6 2002:ce3f:e590:2:213:efff:fed0:1275/64 scope global dynamic 
mngtmpaddr 
       valid_lft 86179sec preferred_lft 14179sec
    inet6 2002:ce3f:e590:1:213:efff:fed0:1275/64 scope global dynamic 
mngtmpaddr 
       valid_lft 86179sec preferred_lft 14179sec
    inet6 fe80::213:efff:fed0:1275/64 scope link 
       valid_lft forever preferred_lft forever
4: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state 
UP group default qlen 1000
    link/ether b8:27:eb:9e:46:34 brd ff:ff:ff:ff:ff:ff
    inet 192.168.66.19/21 brd 192.168.71.255 scope global bond0
       valid_lft forever preferred_lft forever
    inet6 2002:ce3f:e590:1:1::19/64 scope global nodad 
       valid_lft forever preferred_lft forever
    inet6 fe80::ba27:ebff:fe9e:4634/64 scope link 
       valid_lft forever preferred_lft forever
$ cat /proc/net/bonding/bond0 
Ethernet Channel Bonding Driver: v5.10.0-10-rpi

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: enxb827eb9e4634 (primary_reselect always)
Currently Active Slave: enxb827eb9e4634
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

Slave Interface: enxb827eb9e4634
MII Status: up
Speed: 100 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: b8:27:eb:9e:46:34
Slave queue ID: 0
$ /sbin/ifquery --state
lo=lo
bond0=bond0
wlx0013efd01275=wlx0013efd01275
enxb827eb9e4634=enxb827eb9e4634

In this situation, this is logged to syslog (via systemd):

sh[299]: Failed to enslave wlx0013efd01275 to bond0.

I think I understand the cause, and propose a workaround or a bug fix
(depending on how you look at it).

When both devices are present at boot, two ifup@<iface>.service units
(one for each interface) are started simultaneously.  It seems like
the ifenslave ifupdown scripts are meant to handle this, but in
/etc/network/if-pre-up.d/ifenslave, in the definition of setup_slave_device(),
there is:

    # Ensure the master is up or being configured
    export IFENSLAVE_ENV_NAME="IFUPDOWN_$IF_BOND_MASTER"
    IFUPDOWN_IF_BOND_MASTER="$(printenv "$IFENSLAVE_ENV_NAME")"
    unset IFENSLAVE_ENV_NAME
    if [ -z "$IFUPDOWN_IF_BOND_MASTER" ] ; then
            ifquery --state "$IF_BOND_MASTER" >/dev/null 2>&1 || ifup 
"$IF_BOND_MASTER"
    fi

I've added loads of debugging and done many reboot cycles to find out
that when the problem occurs (or at least in one case), both
simultaneously-running processes get to the `ifquery` line.  One of
the processes executes ifquery and gets a non-zero return code,
leading it to run `ifup "$IF_BOND_MASTER"`.  After ifup starts, the
other script process executes ifquery and gets a zero return code.

In this case, the bond interface hasn't come up yet, but the command
to bring it up has started, which I think is why ifquery is returning
zero here.  I can reproduce this behavior of ifquery with a dummy
interface:

iface dummy0 inet manual
        pre-up modprobe dummy
        pre-up sleep 5
        up ip link add dummy0 type dummy
        down ip link del dummy0

$ q() { sudo ifquery --state dummy0; echo "  => $?"; }
$ q
  => 1
$ sudo ifup dummy0 & sleep 1; q; ip link show dummy0; ps $!
[1] 7956
dummy0=dummy0
  => 0
Device "dummy0" does not exist.
  PID TTY      STAT   TIME COMMAND
 7956 pts/1    S      0:00 sudo ifup dummy0

This shows that ifquery does return 0, even while ifup is still
working.

>From my reading of ifquery(8), this is possibly a bug ("successful
code is returned if all of interfaces given as arguments are up").
Either ifquery is supposed to return failure in this case (meaning
ifquery has a bug, since the interface hasn't finished coming up yet),
or it is valid for it to return successfully since an ifup process has
been _started_ for this interface (in which case the ifenslave script
has a bug--but ifquery(8) could probably be improved).  Regardless of
whether ifquery has a bug here, the ifenslave script could be
simplified in a way that works around the issue anyway:

    # [omitted context]
    if [ -z "$IFUPDOWN_IF_BOND_MASTER" ] ; then
            ifup "$IF_BOND_MASTER"
    fi

Now, if ifup gets started concurrently for the same bond master
interface, the second process blocks (due to ifupdown locking) until
the first process completes.  Both processes running
if-pre-up.d/ifenslave would now continue after bond0 has finished its
configuration.

ifup is idempotent, so there's no harm in running it every time.  In
fact, with how the script is written now, there's a race condition
where ifquery could return failure, and in-between the return and the
ifup, the interface could come up.  If that happens, ifup gets run on
a bond interface in the up state.  Because of the idempotence of ifup,
there's no harm in this hypothetical race condition, but the script
already relies on it, so I'd suggest removing the ifquery condition
altogether.

The only downside I see to running `ifup` every time is that I now get
extra messages in my bootup scenario:

ifup[283]: ifup: waiting for lock on /run/network/ifstate.bond0
sh[311]: ifup: interface bond0 already configured

But if I understand correctly, those are informational, and not
indicating an error.

When I remove that ifquery condition, I no longer encounter the
problem, even after letting this machine reboot in a loop all night.
I'm attaching my modified version of the script.

Do you think this fix could be accepted?  If you think it'd be more
appropriate for me to file a bug against ifupdown regarding ifquery,
please let me know.

Thanks,
-Jean-Paul

-- System Information:
Debian Release: 11.2
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: armel (armv6l)

Kernel: Linux 5.10.0-10-rpi
Kernel taint flags: TAINT_CRAP, TAINT_UNSIGNED_MODULE
Locale: LANG=C, LC_CTYPE=C (charmap=UTF-8) (ignored: LC_ALL set to 
en_US.UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages ifenslave depends on:
ii  ifupdown  0.8.36
ii  iproute2  5.10.0-4

Versions of packages ifenslave recommends:
ii  net-tools  1.60+git20181103.0eebece-1

ifenslave suggests no packages.

-- no debconf information
# interfaces(5) file used by ifup(8) and ifdown(8)

auto bond0
iface bond0 inet static
        # No bond-slaves; let them come up on their own via hotplugging.
        bond-slaves none
        bond-mode active-backup
        bond-primary enxb827eb9e4634
        bond-miimon 100
        address 192.168.66.19/21
        gateway 192.168.64.1
        dns-nameserver 192.168.66.12
        dns-search thoughtcrime.us
iface bond0 inet6 static
        address 2002:ce3f:e590:1:1::19/64
        # DAD fails because there will be no slaves at the time of
        # this bond interface coming up.  DAD failure will result
        # in ifup failure.  Just skip DAD.
        dad-attempts 0

allow-hotplug enxb827eb9e4634
iface enxb827eb9e4634 inet manual
        bond-master bond0
        bond-mode active-backup
        bond-primary enxb827eb9e4634
        bond-miimon 100

allow-hotplug wlx0013efd01275
iface wlx0013efd01275 inet manual
        wpa-ssid [redacted]
        wpa-psk [redacted]
        bond-master bond0
        bond-mode active-backup
        bond-primary enxb827eb9e4634
        bond-miimon 100
#!/bin/sh

[ "$VERBOSITY" = 1 ] && set -x

[ "$ADDRFAM" = meta ] && exit 0

add_master()
{
        # Return if $IFACE is already a bonding interface.
        [ -f "/sys/class/net/$IFACE/bonding/slaves" ] && return

        ip link add dev "$IFACE" type bond
}

sysfs_change_down()
{
        # Called with :
        # $1 = basename of the file in bonding/ to write to.
        # $2 = value to write. Won't write if $2 is empty.
        if [ -n "$2" ] ; then
                # If the value we plan to write is different from the current 
one...
                if ! grep -sq "\\<$2\\>" 
"/sys/class/net/$BOND_MASTER/bonding/$1" ; then
                        # ...and the master is up...
                        if ip link show "$BOND_MASTER" | grep -sq '[<,]UP[,>]' 
; then
                                # ...bring the master down.
                                ip link set dev "$BOND_MASTER" down
                        fi
                fi
                sysfs "$1" "$2"
        fi
}

sysfs()
{
        # Called with :
        # $1 = basename of the file in bonding/ to write to.
        # $2 = value to write. Won't write if $2 is empty.
        if [ -n "$2" ] ; then
                echo "$2" > "/sys/class/net/$BOND_MASTER/bonding/$1"
                return $?
        fi
        return 0
}

sysfs_add()
{
        #??Called with :
        # $1 = target filename.
        # $2 = values to write.
        for value in $2; do
                # Do not add $2 to $1 if already present.
                if ! grep -sq "\\<$value\\>" 
"/sys/class/net/$BOND_MASTER/bonding/$1"
                then
                    sysfs "$1" "+$value"
                fi
        done
}

# early_setup_master is the place where we do master setup that need to be done 
before enslavement.
early_setup_master()
{
        # Warning: the order in which we write into the sysfs files is 
important.
        # Double check in drivers/net/bonding/bond_sysfs.c in the Linux kernel 
source tree
        # before changing anything here.

        # fail_over_mac must be set before enslavement of any slaves.
        sysfs fail_over_mac "$IF_BOND_FAIL_OVER_MAC"
}

enslave_slaves()
{
        case "$BOND_SLAVES" in
                none)
                        BOND_SLAVES=""
                        ;;
                all)
                        BOND_SLAVES=$(sed -ne 's/ *\(eth[^:]*\):.*/\1/p' 
/proc/net/dev)
                        ;;
        esac

        [ "$VERBOSITY" = 1 ] && v=-v
        for slave in $BOND_SLAVES ; do
                export IFENSLAVE_ENV_NAME="IFUPDOWN_$slave"
                IFUPDOWN_IFACE="$(printenv "$IFENSLAVE_ENV_NAME")"
                unset IFENSLAVE_ENV_NAME
                if ( ( ifquery --state "$slave" >/dev/null 2>&1 ) && ( ifquery 
"$slave" | grep -q bond-master ) ) || [ -n "$IFUPDOWN_IFACE" ] ; then
                        # Skipping interface that's already up or being 
configured and has bonding configuration
                        continue
                else
                        # Ensure $slave is down.
                        ip link set "$slave" down 2>/dev/null
                        if ! sysfs_add slaves "$slave" 2>/dev/null ; then
                                echo "Failed to enslave $slave to $BOND_MASTER. 
Is $BOND_MASTER ready and a bonding interface ?" >&2
                        else
                                # Bring up slave if it is the target of an 
allow-bondX stanza.
                                # This is useful to bring up slaves that need 
extra setup.
                                ifup $v --allow "$BOND_MASTER" "$slave"
                        fi
                fi
        done
}

setup_master()
{
        # Warning: the order in which we write into the sysfs files is 
important.
        # Double check in drivers/net/bonding/bond_sysfs.c in the Linux kernel 
source tree
        # before changing anything here.

        # use_carrier can be set anytime.
        sysfs use_carrier "$IF_BOND_USE_CARRIER"
        # num_grat_arp can be set anytime.
        sysfs num_grat_arp "$IF_BOND_NUM_GRAT_ARP"
        # num_unsol_na can be set anytime.
        sysfs num_unsol_na "$IF_BOND_NUM_UNSOL_NA"

        # arp_ip_target must be set before arp_interval.
        sysfs_add arp_ip_target "$IF_BOND_ARP_IP_TARGET"
        sysfs arp_interval "$IF_BOND_ARP_INTERVAL"

        # miimon must be set before updelay and downdelay.
        sysfs miimon "$IF_BOND_MIIMON"
        sysfs downdelay "$IF_BOND_DOWNDELAY"
        sysfs updelay "$IF_BOND_UPDELAY"

        # Changing ad_select requires $BOND_MASTER to be down.
        sysfs_change_down ad_select "$IF_BOND_AD_SELECT"

        # Changing mode requires $BOND_MASTER to be down.
        # Mode should be set after miimon or arp_interval, to avoid a warning 
in syslog.
        sysfs_change_down mode "$IF_BOND_MODE"

        # Requires $BOND_MASTER to be down and mode to be configured
        sysfs_change_down xmit_hash_policy "$IF_BOND_XMIT_HASH_POLICY"
        sysfs_change_down tlb_dynamic_lb "$IF_BOND_TLB_DYNAMIC_LB"

        # packets_per_slave allowed for mode balance-rr only.
        sysfs packets_per_slave "$IF_BOND_PACKETS_PER_SLAVE"

        # arp_validate must be after mode (because mode must be active-backup).
        sysfs arp_validate "$IF_BOND_ARP_VALIDATE"

        # lacp_rate must be set after mode (because mode must be 802.3ad).
        # Changing lacp_rate requires $BOND_MASTER to be down.
        sysfs_change_down lacp_rate "$IF_BOND_LACP_RATE"

        # queue_id must be set after enslavement.
        for iface_queue_id in $IF_BOND_QUEUE_ID ; do
                sysfs iface_queue_id "$iface_queue_id"
        done

        # active_slave must be set after mode and after enslavement.
        # The slave must be up and the underlying link must be up too.
        # FIXME: We should have a way to write an empty string to active_slave, 
to set the active_slave to none.
        if [ -n "$IF_BOND_ACTIVE_SLAVE" ] ; then
                if [ "$IF_BOND_ACTIVE_SLAVE" = "none" ] ; then
                        sysfs active_slave ""
                else
                        # Need to force interface up before. Bonding will 
refuse to activate a down interface.
                        if ifquery -l "$IF_BOND_ACTIVE_SLAVE" >/dev/null 2>&1 ; 
then
                                ifup "$IF_BOND_ACTIVE_SLAVE"
                        else
                                ip link set "$IF_BOND_ACTIVE_SLAVE" up
                        fi
                        sysfs active_slave "$IF_BOND_ACTIVE_SLAVE"
                fi
        fi
}

setup_primary() {
        # primary must be set after mode (because only supported in some modes) 
and after enslavement.
        # The first slave in bond-primary found in current slaves becomes the 
primary.
        # If no slave in bond-primary is found, then the primary does not 
change.
        for slave in $IF_BOND_PRIMARY ; do
                slaves="/sys/class/net/$BOND_MASTER/bonding/slaves"
                if grep -sq "\\<$slave\\>" "$slaves" ; then
                        sysfs primary "$slave"
                        break
                fi
        done

        # primary_reselect should be set after mode (it is only supported in 
some modes), after enslavement
        # and after primary. This is currently (2.6.35-rc1) not enforced by the 
bonding driver, but it is
        # probably safer to do it in that order.
        sysfs primary_reselect "$IF_BOND_PRIMARY_RESELECT"
}

setup_master_device() {
        add_master
        early_setup_master
        setup_master
        enslave_slaves
        setup_primary
}

setup_slave_device() {
        # Require the bond master to have an iface stanza
        if ! ifquery -l "$IF_BOND_MASTER" >/dev/null 2>&1 ; then
                echo "No iface stanza found for master $IF_BOND_MASTER" >&2
                exit 1
        fi

        # Ensure the master is up or being configured
        export IFENSLAVE_ENV_NAME="IFUPDOWN_$IF_BOND_MASTER"
        IFUPDOWN_IF_BOND_MASTER="$(printenv "$IFENSLAVE_ENV_NAME")"
        unset IFENSLAVE_ENV_NAME
        if [ -z "$IFUPDOWN_IF_BOND_MASTER" ] ; then
                ifup "$IF_BOND_MASTER"
        fi

        # Enslave it to the master
        ip link set "$1" down 2>/dev/null
        if ! sysfs_add slaves "$1" 2>/dev/null ; then
                echo "Failed to enslave $1 to $BOND_MASTER." >&2
        fi

        setup_primary
}

# Option slaves deprecated, replaced by bond-slaves, but still supported for 
backward compatibility.
IF_BOND_SLAVES=${IF_BOND_SLAVES:-$IF_SLAVES}

if [ -n "$IF_BOND_MASTER" ] ; then
        # FIXME: use function arguments instead of this variable
        BOND_MASTER="$IF_BOND_MASTER"
        setup_slave_device "$IFACE"
elif [ -n "$IF_BOND_SLAVES" ] || [ -n "$IF_BOND_MODE" ] ; then
        # FIXME: use function arguments instead of these variables
        BOND_MASTER="$IFACE"
        BOND_SLAVES="$IF_BOND_SLAVES"
        setup_master_device "$IFACE"
fi

exit 0

Reply via email to