We upgraded to the vlan 1.9-3.2ubuntu1.16.04.3 package and our
networking broke horribly in a very similar way.

Let me start with our networking configuration. Two slaves, a bond and a
vlan on top of that bond:

auto eno1
iface eno1 inet manual
   mtu 1500
   bond-master bond1
   bond-primary eno1

auto eno2
iface eno2 inet manual
   mtu 1500
   bond-master bond1

auto bond1
iface bond1 inet static
   mtu 1500
   address 10.10.10.3
   bond-miimon 100
   bond-mode active-backup
   bond-slaves none
   bond-downdelay 200
   bond-updelay 200
   dns-nameservers 10.10.0.1
   netmask 255.255.0.0

auto bond1.2
iface bond1.2 inet static
   mtu 1500
   address 10.11.10.3
   netmask 255.255.0.0
   vlan-raw-device bond1

This fails to come up correctly, both during boot and manually. Bringing
up any of eno1, eno2, bond1 or bond1.2 results in the same problem:
"ifup: waiting for lock on /run/network/ifstate.bond1".

The problem seems to be that ifup tries to bring up the base bond1
interface *again*, even if it is already up. It then gets stuck waiting
for the bond1 lock to be released, but bond1 is already up and thus
locked, so that will never happen.

We also tried bringing all interfaces down and just running "ifup
bond1.2" but that results in the same behavior.

The only workaround that worked for us was to:
1) temporarily remove the bond1.2.cfg from /etc/network/interfaces.d
2) bring up eno1, eno2 and bond1
3) put the bond1.2.cfg back in its place
4) run "ifup bond1.2"
5) using another terminal, list the running ifup processes using "ps -ef | grep ifup"
6) kill the "ifup bond1" process

The "ps -ef | grep ifup" in step 5 shows two ifup processes: one for
bond1 and one for bond1.2. As soon as we kill the "ifup bond1" process,
the "ifup bond1.2" process completes immediately and correctly
configures the vlan 2 subinterface.

This is clearly linked to vlan, because our infiniband interfaces work
just fine. It also worked just fine before upgrading the package. So my
best guess would be that something broke in the code that detects whether
the vlan-raw-device is up. Perhaps related to LP #1573272?

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to ifupdown in Ubuntu.
https://bugs.launchpad.net/bugs/1636708

Title:
  ifup -a does not start dependants last, causes deadlocks with
  vlans/bonding

Status in ifupdown package in Ubuntu:
  Confirmed
Status in ifupdown source package in Xenial:
  Confirmed

Bug description:
  This is a problem I've been struggling with since moving to 16.04.1
  from 14.04 (fresh install)

  I don't believe this problem affected 14.04. I have used an almost
  identical interfaces file on 14.04 without problem.

  On 16.04.1, however, 9/10 boots would hang during network
  configuration and leave the network incorrectly configured.

  When calling "ifup -a", all candidate interfaces appear to be started
  in parallel, leading to collisions over locks. This causes a hang
  (until timeout) during boot and leaves the network interfaces
  incorrectly configured.

  Imagine this /etc/network/interfaces

  auto eno1 bond0 bond0.5

  iface eno1 inet manual
          bond-master     bond0

  iface bond0 inet manual
          bond-slaves     eno1
          bond-mode       4
          bond-lacp-rate  1
          bond-miimon     100
          bond-updelay    200
          bond-downdelay  200

  iface bond0.5 inet dhcp
          vlan-raw-device bond0

  
  eno1 -> bond0 -> bond0.5 -> dhcp

  When calling "ifup -a" at boot time all three interfaces are started
  at the same time.

  bond0 and bond0.5 both attempt to share the same lock file:

    /run/network/ifstate.bond0

  If bond0 wins the race, the system will start correctly (1/10):

    * bond0 starts and creates the bond0 device and the ifenslave.bond0
      file to indicate the bond is ready
    * eno1 polls for the ifenslave.bond0 file, when it appears it
      attaches eno1 to bond0
    * bond0 finishes and releases the lock
    * bond0.5 now acquires the lock.
    * bond0.5 starts dhclient, which can talk to the network and
      configure the interface

  
  If, however, bond0.5 wins the lock race, the system will hang at boot
  (5 mins) and fail to set up the network.

    * bond0.5 is awarded the ifstate.bond0 lockfile
    * bond0.5 starts dhclient waiting to hear from the network
    * bond0 is blocked, so neither the bond0 device nor the
      ifenslave.bond0 file is created
    * eno1 polls but never finds the ifenslave.bond0 file so never
      attaches to bond0
    * bond0.5's dhclient is trying to talk to a disconnected network and
      never receives an answer

    ! bond0.5 is stuck running dhclient
    ! bond0 is stuck waiting for bond0.5 to finish
    ! eno1 is stuck waiting for bond0 to create the ifenslave.bond0 file
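
  The circular wait above can be checked mechanically. The following is
  a minimal Python sketch, not ifupdown code: find_cycle and the
  waits_for mapping are hypothetical names, and the edges are only one
  reading of the three "stuck" lines (bond0.5's dhclient needs a link
  through eno1, eno1 needs bond0's ifenslave file, bond0 needs the lock
  held by bond0.5).

```python
def find_cycle(waits_for):
    """Follow 'X waits for Y' edges and return a cycle as a list, or None.

    waits_for maps each waiter to the single thing it is blocked on.
    """
    for start in waits_for:
        path = [start]
        node = waits_for.get(start)
        # Walk the chain until it ends or revisits a node already on the path.
        while node is not None and node not in path:
            path.append(node)
            node = waits_for.get(node)
        if node is not None:
            # The chain came back to 'node': report the loop, closed on itself.
            return path[path.index(node):] + [node]
    return None


# Assumed edges for the case where bond0.5 wins the lock race.
losing_race = {"bond0.5": "eno1", "eno1": "bond0", "bond0": "bond0.5"}
# Assumed edges for the case where bond0 wins: everything waits on bond0 only.
winning_race = {"bond0.5": "bond0", "eno1": "bond0"}

print(find_cycle(losing_race))   # ['bond0.5', 'eno1', 'bond0', 'bond0.5']
print(find_cycle(winning_race))  # None
```

  A cycle in this graph means no participant can ever make progress on
  its own, which matches the hang until timeout described above.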

  
  I believe ifup should start interfaces (that share lock files) in
  dependency order. The most basic interface must be awarded the lock
  over its dependants. In this case:
    
    1 eno1
    2 bond0
    3 bond0.5

  but never:

    1 eno1
    2 bond0.5
    3 bond0
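
  The ordering rule can be sketched as follows. This is an illustration
  in Python under stated assumptions, not a patch for ifup: start_order
  is a hypothetical helper, and the only dependency it models is
  "a VLAN interface must follow its vlan-raw-device". Interfaces without
  such a dependency keep their configuration order.

```python
def start_order(ifaces, raw_device):
    """Order interfaces so that every VLAN follows its raw device.

    ifaces: interface names in configuration order.
    raw_device: maps a VLAN interface name to its vlan-raw-device.
    """
    ordered, placed = [], set()
    pending = list(ifaces)
    while pending:
        progressed = False
        for name in list(pending):
            parent = raw_device.get(name)
            # Ready when there is no raw device, the raw device has already
            # been placed, or the raw device is not part of this run.
            if parent is None or parent in placed or parent not in ifaces:
                ordered.append(name)
                placed.add(name)
                pending.remove(name)
                progressed = True
        if not progressed:
            raise ValueError("dependency cycle among: " + ", ".join(pending))
    return ordered


# Even if the VLAN is listed before its bond, it is started after it.
print(start_order(["eno1", "bond0.5", "bond0"], {"bond0.5": "bond0"}))
# ['eno1', 'bond0', 'bond0.5']
```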

  
  As a workaround, in /etc/network/interfaces:

  -auto       eno1 bond0 bond0.5
  +auto       eno1 bond0
  +allow-bond bond0.5

  And also in /lib/systemd/system/networking.service

   ExecStart=/sbin/ifup -a --read-environment
  +ExecStart=/sbin/ifup -a --allow=bond --read-environment
   ExecStop=/sbin/ifdown -a --read-environment

  Then run:

    systemctl daemon-reload

  This causes all "auto" interfaces to start first and then, once
  they've completed, all allow-bond interfaces to start.
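
  A rough Python sketch of the resulting two-phase start, assuming only
  that "auto" and "allow-bond" lines list interface names separated by
  whitespace. start_phases is a hypothetical helper for illustration and
  is not how ifup parses the file. The sample text uses the bond0.5 VLAN
  from the stanza above.

```python
def start_phases(interfaces_text, allow_class="bond"):
    """Split interface names into [auto phase, allow-<class> phase]."""
    auto, allowed = [], []
    for line in interfaces_text.splitlines():
        words = line.split()
        if not words:
            continue
        if words[0] == "auto":
            # Brought up by the first "ifup -a" call.
            auto.extend(words[1:])
        elif words[0] == "allow-" + allow_class:
            # Brought up only by the second call, "ifup -a --allow=bond".
            allowed.extend(words[1:])
    return [auto, allowed]


sample = """\
auto       eno1 bond0
allow-bond bond0.5
"""
print(start_phases(sample))  # [['eno1', 'bond0'], ['bond0.5']]
```

  Because the second ExecStart line only runs after the first has
  finished, bond0.5 can no longer race bond0 for the lock.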

  ProblemType: Bug
  DistroRelease: Ubuntu 16.04
  Package: ifupdown 0.8.10ubuntu1.1 [modified: lib/systemd/system/networking.service]
  ProcVersionSignature: Ubuntu 4.4.0-45.66-generic 4.4.21
  Uname: Linux 4.4.0-45-generic x86_64
  ApportVersion: 2.20.1-0ubuntu2.1
  Architecture: amd64
  Date: Wed Oct 26 06:32:57 2016
  InstallationDate: Installed on 2016-10-24 (1 days ago)
  InstallationMedia: Ubuntu-Server 16.04.1 LTS "Xenial Xerus" - Release amd64 (20160719)
  SourcePackage: ifupdown
  UpgradeStatus: No upgrade log present (probably fresh install)
  modified.conffile..etc.init.networking.conf: [modified]
  mtime.conffile..etc.init.networking.conf: 2016-10-26T04:52:05.750927

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ifupdown/+bug/1636708/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp