[Kernel-packages] [Bug 1782716] Re: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout

2019-08-23 Thread Paul Graydon
Seeing the same on Ubuntu 18.04.3 with the HWE kernel, 5.0.0-25-generic

AMD A6-9225 RADEON R4

This appears to be tied to the problems resuming the laptop from
suspend (black screen, flashing cursor).

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1782716

Title:
  [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Running the 4.17.0-5-generic kernel on a ppc64le machine with a Radeon
  R9 Fury GPU

  
  0033:01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Fiji [Radeon R9 FURY / NANO Series] (rev ff)

  [ 2361.958847] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled seq=8777, last emitted seq=8778
  [ 2362.080397] EEH: Frozen PHB#33-PE#0 detected
  [ 2362.080470] EEH: PE location: CPU2 Slot1 (16x), PHB location: N/A
  [ 2362.080568] CPU: 53 PID: 874 Comm: kworker/53:1 Not tainted 4.17.0-5-generic #6-Ubuntu
  [ 2362.080575] Workqueue: events drm_sched_job_timedout [gpu_sched]
  [ 2362.080577] Call Trace:
  [ 2362.080584] [c000fb7078f0] [c0d275ac] dump_stack+0xb0/0xf4 (unreliable)
  [ 2362.080590] [c000fb707930] [c003ba0c] eeh_dev_check_failure+0x5bc/0x5e0
  [ 2362.080593] [c000fb7079e0] [c003babc] eeh_check_failure+0x8c/0xd0
  [ 2362.080628] [c000fb707a20] [c0080cfa1b88] amdgpu_mm_rreg+0x280/0x2a0 [amdgpu]
  [ 2362.080676] [c000fb707a70] [c0080d04cf68] gmc_v8_0_check_soft_reset+0x30/0xe0 [amdgpu]
  [ 2362.080711] [c000fb707aa0] [c0080cfa1194] amdgpu_device_ip_check_soft_reset.part.1+0x8c/0x140 [amdgpu]
  [ 2362.080745] [c000fb707b30] [c0080cfa649c] amdgpu_device_gpu_recover+0x854/0xa40 [amdgpu]
  [ 2362.080799] [c000fb707c00] [c0080d0b97a4] amdgpu_job_timedout+0x5c/0x80 [amdgpu]
  [ 2362.080805] [c000fb707c70] [c0080c8f0040] drm_sched_job_timedout+0x38/0x60 [gpu_sched]
  [ 2362.080810] [c000fb707c90] [c0137928] process_one_work+0x298/0x580
  [ 2362.080813] [c000fb707d20] [c0137c98] worker_thread+0x88/0x610
  [ 2362.080817] [c000fb707dc0] [c0140958] kthread+0x1a8/0x1b0
  [ 2362.080822] [c000fb707e30] [c000b658] ret_from_kernel_thread+0x5c/0x84
  [ 2362.080827] [drm] IP block:gmc_v8_0 is hung!
  [ 2362.080832] [drm] IP block:tonga_ih is hung!
  [ 2362.080843] [drm] IP block:gfx_v8_0 is hung!
  [ 2362.080845] EEH: Detected PCI bus error on PHB#33-PE#0
  [ 2362.080847] EEH: This PCI device has failed 1 times in the last hour
  [ 2362.080849] EEH: Notify device drivers to shutdown
  [ 2362.080850] [drm] IP block:sdma_v3_0 is hung!
  [ 2362.080856] [drm] IP block:uvd_v6_0 is hung!
  [ 2362.080858] EEH: Collect temporary log
  [ 2362.080866] [drm] IP block:vce_v3_0 is hung!
  [ 2362.080867] [drm] GPU recovery disabled.
  [ 2362.080903] EEH: of node=0033:01:00.1
  [ 2362.080905] EEH: PCI device/vendor: 
  [ 2362.080907] EEH: PCI cmd/status register: 
  [ 2362.080908] EEH: PCI-E capabilities and status follow:
  [ 2362.080915] EEH: PCI-E 00:     
  [ 2362.080920] EEH: PCI-E 10:     
  [ 2362.080921] EEH: PCI-E 20:  
  [ 2362.080922] EEH: PCI-E AER capability register set follows:
  [ 2362.080928] EEH: PCI-E AER 00:     
  [ 2362.080933] EEH: PCI-E AER 10:     
  [ 2362.080938] EEH: PCI-E AER 20:     
  [ 2362.080940] EEH: PCI-E AER 30:   
  [ 2362.080941] EEH: of node=0033:01:00.0
  [ 2362.080943] EEH: PCI device/vendor: 
  [ 2362.080945] EEH: PCI cmd/status register: 
  [ 2362.080945] EEH: PCI-E capabilities and status follow:
  [ 2362.080951] EEH: PCI-E 00:     
  [ 2362.080956] EEH: PCI-E 10:     
  [ 2362.080957] EEH: PCI-E 20:  
  [ 2362.080958] EEH: PCI-E AER capability register set follows:
  [ 2362.080964] EEH: PCI-E AER 00:     
  [ 2362.080969] EEH: PCI-E AER 10:     
  [ 2362.080974] EEH: PCI-E AER 20:     
  [ 2362.080975] EEH: PCI-E AER 30:   
  [ 2362.080977] PHB4 PHB#51 Diag-data (Version: 1)
  [ 2362.080978] brdgCtl:0002
  [ 2362.080979] RootSts:00060020 00402000 c1010008 00100107 
  [ 2362.080980] RootErrSts:  0020 
  [ 2362.080981] PhbSts: 001c 001c
  [ 2362.080982] Lem:0001  0001
  [ 2362.080983] PhbErr: 00c0 0080 214898000240 a0084000
  [ 2362.080984] RegbErr:0090 0010 483c 0200
  [ 2362.080985] PE[000] A/B: 
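
When comparing reports like the ones in this thread, it can help to pull the ring timeout and the hung IP blocks out of the kernel log mechanically. The following is a minimal sketch in Python, assuming the message formats are exactly as they appear in the log above; the function name `summarize` and the sample constant are illustrative, not part of any existing tool. Reading the gap between the two sequence numbers as "emitted but not yet signaled" is an interpretation, not something the log states.

```python
import re

# Sample lines copied from the log in the bug description above.
SAMPLE_LOG = """\
[ 2361.958847] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled seq=8777, last emitted seq=8778
[ 2362.080827] [drm] IP block:gmc_v8_0 is hung!
[ 2362.080832] [drm] IP block:tonga_ih is hung!
[ 2362.080843] [drm] IP block:gfx_v8_0 is hung!
[ 2362.080867] [drm] GPU recovery disabled.
"""

# Patterns assume the exact wording seen in this report.
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, last signaled seq=(?P<signaled>\d+), "
    r"last emitted seq=(?P<emitted>\d+)"
)
HUNG_RE = re.compile(r"IP block:(?P<block>\S+) is hung!")


def summarize(log_text):
    """Return (timeouts, hung_blocks) found in kernel log text."""
    timeouts = [
        (m["ring"], int(m["signaled"]), int(m["emitted"]))
        for m in TIMEOUT_RE.finditer(log_text)
    ]
    hung = [m["block"] for m in HUNG_RE.finditer(log_text)]
    return timeouts, hung


if __name__ == "__main__":
    timeouts, hung = summarize(SAMPLE_LOG)
    for ring, signaled, emitted in timeouts:
        # Interpretation: the difference is work emitted but not yet signaled.
        print(f"ring {ring}: {emitted - signaled} outstanding")
    print("hung IP blocks:", ", ".join(hung))
```

To use it on a live system, the output of `dmesg` could be passed to `summarize` in place of the sample text.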

[Kernel-packages] [Bug 1790652] Re: Oracle cosmic image does not find broadcom network device in Shape VMStandard2.1

2019-02-15 Thread Paul Graydon
I'm confused.  Do you need verification or not?  Cosmic is not
specifically supported on our platform, and there are no plans at the
moment to support non-LTS releases that I know of.  I can certainly test
this if needs be, though.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1790652

Title:
  Oracle cosmic image does not find broadcom network device in Shape
  VMStandard2.1

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Cosmic:
  Fix Released

Bug description:
  I tried to register and boot a cosmic image to verify new changes in it and in cloud-init.
  The image failed to bring up networking in the initramfs, and thus failed to find iscsi root.
  this could be user error.

  Here is what I did to publish the image.

   - use oci build tool [1], following
     https://docs.cloud.oracle.com/iaas/Content/Compute/Tasks/imageimportexport.htm#ImportinganImage
   - Download a livefs build from cloudware
     https://launchpad.net/~cloudware/+livefs/ubuntu/cosmic/cpc/
     example: livecd.ubuntu-cpc.oracle_bare_metal.img
     My image had version 20180821.1

   - oci os bucket create --name=smoser-devel
   - oci os object put \
       --parallel-upload-count=4 \
       --part-size=10 \
       --bucket-name=smoser-devel \
       --file=/tmp/livecd.ubuntu-cpc.oracle_bare_metal.img \
       --name=cosmic-20180821.1.img

   - import the object
     $ oci compute image import from-object \
         --display-name=smoser-cosmic-20180821.1.img \
         --launch-mode=NATIVE \
         --namespace=intcanonical \
         --bucket-name=smoser-devel \
         --name=cosmic-20180821.1.img \
         --source-image-type=QCOW2

  Then I launched from the web UI a VM.Standard2.1.

  --
  [1] https://docs.cloud.oracle.com/iaas/Content/API/Concepts/cliconcepts.htm

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1790652/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1790652] Re: Oracle cosmic image does not find broadcom network device in Shape VMStandard2.1

2018-09-14 Thread Paul Graydon
Patch submitted to netdev:
https://marc.info/?l=linux-netdev&m=153695411427176&w=2

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1790652

Title:
  Oracle cosmic image does not find broadcom network device in Shape
  VMStandard2.1

Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Cosmic:
  Triaged



[Kernel-packages] [Bug 1790652] Re: Oracle cosmic image does not find broadcom network device in Shape VMStandard2.1

2018-09-14 Thread Paul Graydon
I've been able to replicate the situation with a few different
distributions.  It seems to only occur with VMs.  When I tried 4.18.7 on
a bare metal instance, there was no problem.

We believe we've isolated the kernel commit that is introducing the
problem to 707e7e96602675beb5e09bb994195663da6eb56d

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1790652

Title:
  Oracle cosmic image does not find broadcom network device in Shape
  VMStandard2.1

Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Cosmic:
  Triaged



[Kernel-packages] [Bug 1790652] Re: Oracle cosmic image does not find broadcom network device in Shape VMStandard2.1

2018-09-13 Thread Paul Graydon
I haven't specifically seen that one, but I'll check in with both the
Oracle Linux team and our Hypervisor team.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1790652

Title:
  Oracle cosmic image does not find broadcom network device in Shape
  VMStandard2.1

Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Cosmic:
  Triaged



[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2017-01-09 Thread Paul Graydon
I took a step back from doing bisecting and focussed on creating a
replication scenario, which I've done successfully.

ipconfig is struggling to handle things when two interfaces are present
and sending out DHCP requests, even if one interface doesn't get a
response.

Here's what I've done:

Using virt-manager I created a bridge, bridge1, with no IP range
associated with it (I want dnsmasq on a host to handle IP).  I created a
second, bridge2, likewise with no IP range associated with it ready for
later use.

$$$

I created an instance, named primary, with two NICs: one doing the usual
NAT so it has internet access, the other hooked up to bridge1.  I gave
it two storage devices: one (sda), 15 GB in size, to act as local storage,
and one (sdb), 40 GB in size, to be served over iSCSI (in hindsight, there
is no reason for it not to be 15 GB too).

Install Ubuntu 16.04.1 LTS on the primary instance, pretty much
following through with defaults, but leaving the second hard drive
unused.  Reboot and bring up the instance.  In my case I end up with
ens3 being the NATing interface, ens9 being hooked up to the bridge
interface.

##

sudo apt update
sudo apt upgrade

##

Add to /etc/network/interfaces:

auto ens9
iface ens9 inet static
  address 192.168.0.1/24

##

Then:

sudo apt install open-iscsi targetcli dnsmasq

##

dnsmasq config:

log-queries
log-dhcp
interface=ens9
dhcp-range=192.168.0.50,192.168.0.150,12h
dhcp-boot=script.ipxe
enable-tftp
tftp-root=/tftpd
tftp-no-fail

##

Then run targetcli and do the following commands:

backstores/iblock create uefi /dev/sdb
/iscsi create iqn.2015-02.oracle.boot:uefi
cd iqn.2015-02.oracle.boot:uefi/tpg1
luns/ create /backstores/block/uefi
portals/ create 0.0.0.0
set attribute authentication=0 demo_mode_write_protect=0 generate_node_acls=1 cache_dynamic_acls=1
exit

##

sudo mkdir /tftpd
sudo chown dnsmasq: /tftpd

##

/tftpd/script.ipxe:

#!ipxe
set initiator-iqn iqn.2015-02.oracle.boot:uefi
sanboot iscsi:192.168.0.1::::iqn.2015-02.oracle.boot:uefi

##

This gets the instance pretty much ready to act as an iSCSI target for
the second host.  It has been patched etc., so reboot it.

You may want to set up IP forwarding etc. on this instance.


$$$

Second host:

No storage.  Attach Ubuntu 16.04.1 LTS iso to the instance to boot from
initially.  Two NICs, first attached to bridge1.  Second attached to
bridge2.

Go through the installation procedure, logging in to the iscsi endpoint
on 192.168.0.1, using the details above (no username/password necessary
with this configuration) and install to the iSCSI target.  At the end,
detach the CD-ROM and ensure everything is set up to network boot.

On start-up you should see it network boot happily, everything is
awesome.  Do a "sudo apt update" and "sudo apt upgrade".  Then reboot.

On start-up you should see the bug happening.  ipconfig is sending out
DHCP requests on both interfaces and failing to accept any responses it
is being sent ("journalctl -xef -u dnsmasq" on primary shows it is
sending them).  If you remove that second NIC, you'll see that the
instance is able to boot happily.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1652348

Title:
  initrd dhcp fails / ignores valid response

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Between kernel versions 4.4.0-53 and 4.4.0-57 a bug has been
  (re?)introduced that is breaking dhcp booting in the initrd
  environment.  This is stopping instances that use iscsi storage from
  being able to connect.

  Over serial console it outputs:

  IP-Config: no response after 2 secs - giving up
  IP-Config: ens2f0 hardware address 90:e2:ba:d1:36:38 mtu 1500 DHCP RARP
  IP-Config: ens2f1 hardware address 90:e2:ba:d1:36:39 mtu 1500 DHCP RARP
  IP-Config: no response after 3 secs - giving up

  with increasing delays until it fails.  At which point a simple
  ipconfig -t dhcp -d "ens2f0"  works.  The console output is slightly
  garbled but should give you an idea:

  (initramfs) ipconfig -t dhcp -[  728.379793] ixgbe :13:00.0 ens2f0: 
changing MTU from 1500 to 9000
  d "ens2f0"
  IP-Config: ens2f0 hardware address 90:e2:ba:d1:36:38 mtu 1500 DHCP RARP
  IP-Config: ens2f0 guessed broadcast address 10.0.1.255
  IP-Config: ens2f0 complete (dhcp from 169.254.169.254):
   addres[  728.980448] ixgbe :13:00.0 ens2f0: detected SFP+: 3
  s: 10.0.1.56broadcast: 10.0.1.255   netmask: 255.255.255.0
   gateway: 10.0.1.1   [  729.148410] ixgbe :13:00.0 ens2f0: NIC Link is Up 
10 Gbps, Flow Control: RX/TX
    dns0 : 169.254.169.254  dns1   : 0.0.0.0
   

[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2017-01-03 Thread Paul Graydon
I'm continuing to bisect the mainline linux kernel, and also trying to
see if I can create a straightforward reproducible example.

First focus on bisecting was between 4.5 and 4.6, to figure out what
changed to suddenly have ipconfig working.  I've tracked it down to this
using bisect, and validated it afterwards:

commit 689de1d6ca95b3b5bd8ee446863bf81a4883ea25
Author: Linus Torvalds 
Date:   Mon May 2 12:46:42 2016 -0700

Minimal fix-up of bad hashing behavior of hash_64()

This is a fairly minimal fixup to the horribly bad behavior of hash_64()
with certain input patterns.

In particular, because the multiplicative value used for the 64-bit hash
was intentionally bit-sparse (so that the multiply could be done with
shifts and adds on architectures without hardware multipliers), some
bits did not get spread out very much.  In particular, certain fairly
common bit ranges in the input (roughly bits 12-20: commonly with the
most information in them when you hash things like byte offsets in files
or memory that have block factors that mean that the low bits are often
zero) would not necessarily show up much in the result.

There's a bigger patch-series brewing to fix up things more completely,
but this is the fairly minimal fix for the 64-bit hashing problem.  It
simply picks a much better constant multiplier, spreading the bits out a
lot better.

NOTE! For 32-bit architectures, the bad old hash_64() remains the same
for now, since 64-bit multiplies are expensive.  The bigger hashing
cleanup will replace the 32-bit case with something better.

The new constants were picked by George Spelvin who wrote that bigger
cleanup series.  I just picked out the constants and part of the comment
from that series.

Cc: sta...@vger.kernel.org
Cc: George Spelvin 
Cc: Thomas Gleixner 
Signed-off-by: Linus Torvalds 


Next up is tracking down what changed between 4.7 and 4.8.
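
To make the commit message above more concrete, here is a small Python sketch of the multiplicative hash it describes, comparing a bit-sparse multiplier with a denser one. The two constants are assumptions recalled from include/linux/hash.h before and after the commit and should be checked against the kernel tree; the helper names are illustrative only. The inputs vary only in bits 12-19, the kind of range the commit message says was poorly spread.

```python
MASK64 = (1 << 64) - 1

# Assumed values, recalled from include/linux/hash.h before and after the
# commit; verify them against the kernel source before relying on them.
OLD_MULTIPLIER = 0x9E37FFFFFFFC0001  # bit-sparse GOLDEN_RATIO_PRIME_64
NEW_MULTIPLIER = 0x61C8864680B583EB  # GOLDEN_RATIO_64


def hash_64(value, bits, multiplier):
    """Multiplicative hash: keep the top `bits` bits of the 64-bit product."""
    return ((value * multiplier) & MASK64) >> (64 - bits)


def distinct_buckets(values, bits, multiplier):
    """Count how many different hash buckets the given values land in."""
    return len({hash_64(v, bits, multiplier) for v in values})


if __name__ == "__main__":
    # 256 inputs that differ only in bits 12-19, e.g. 4 KiB-aligned offsets.
    offsets = [k << 12 for k in range(256)]
    for name, mult in (("old", OLD_MULTIPLIER), ("new", NEW_MULTIPLIER)):
        used = distinct_buckets(offsets, 8, mult)
        print(f"{name} multiplier: {used} of 256 buckets used")
```

This only illustrates the hashing change itself; it does not explain why that change affects ipconfig.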

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1652348

Title:
  initrd dhcp fails / ignores valid response

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Between kernel versions 4.4.0-53 and 4.4.0-57 a bug has been
  (re?)introduced that is breaking dhcp booting in the initrd
  environment.  This is stopping instances that use iscsi storage from
  being able to connect.

  Over serial console it outputs:

  IP-Config: no response after 2 secs - giving up
  IP-Config: ens2f0 hardware address 90:e2:ba:d1:36:38 mtu 1500 DHCP RARP
  IP-Config: ens2f1 hardware address 90:e2:ba:d1:36:39 mtu 1500 DHCP RARP
  IP-Config: no response after 3 secs - giving up

  with increasing delays until it fails.  At which point a simple
  ipconfig -t dhcp -d "ens2f0"  works.  The console output is slightly
  garbled but should give you an idea:

  (initramfs) ipconfig -t dhcp -[  728.379793] ixgbe :13:00.0 ens2f0: 
changing MTU from 1500 to 9000
  d "ens2f0"
  IP-Config: ens2f0 hardware address 90:e2:ba:d1:36:38 mtu 1500 DHCP RARP
  IP-Config: ens2f0 guessed broadcast address 10.0.1.255
  IP-Config: ens2f0 complete (dhcp from 169.254.169.254):
   addres[  728.980448] ixgbe :13:00.0 ens2f0: detected SFP+: 3
  s: 10.0.1.56broadcast: 10.0.1.255   netmask: 255.255.255.0
   gateway: 10.0.1.1   [  729.148410] ixgbe :13:00.0 ens2f0: NIC Link is Up 
10 Gbps, Flow Control: RX/TX
    dns0 : 169.254.169.254  dns1   : 0.0.0.0
   rootserver: 169.254.169.254 rootpath:
   filename  : /ipxe.efi

  tcpdumps show that dhcp requests are being received from the host, and
  responses sent, but not accepted by the host.  When the ipconfig
  command is issued manually, an identical dhcp request and response
  happens, only this time it is accepted.  It doesn't appear to be that
  the messages are being sent and received incorrectly, just silently
  ignored by ipconfig.

  I was seeing this behaviour earlier this year, which I was able to fix
  by specifying "ip=dhcp" as a kernel parameter.  About a month ago that
  was identified as causing us other problems (long story) and we
  dropped it, at which point we discovered the original bug was no
  longer an issue.

  Putting "ip=dhcp" back on with this kernel no longer fixes the
  problem.

  I've compared the two initrds and effectively the only thing that has
  changed between the two is the kernel components.

  Ubuntu kernel bisect offending commit:
  # first bad commit: [fd4b5fa6e3487d15ede746f92601af008b2abbc0] mnt: Add a per mount namespace limit on the number of mounts

  Ubuntu kernel bisect offending commit submission:
  https://lkml.org/lkml/2016/10/5/308

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1652348/+subscriptions


[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-31 Thread Paul Graydon
I've tried every version in the v4 series, and a few in v3.  None prior
to (and including) v4.0.0 will boot, none output anything on the screen
to give me a clue why they're not booting.

So far:

v4.0 = won't boot
v4.1 = ipconfig bug
v4.2 = ipconfig bug
v4.3 = ipconfig bug
v4.4 = ipconfig bug
v4.5 = ipconfig bug
v4.6 = Boots
v4.7 = Boots
v4.8 = ipconfig bug
v4.9 = ipconfig bug
v4.10 = ipconfig bug


I'm getting seriously concerned that "working" is actually the aberration.  
It's working in just two out of ten releases.

I do have two things I should probably bisect there:  1) what changed
between 4.5 and 4.6, and 2) what changed between 4.7 and 4.8.
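
The two bisect ranges can be read straight off the results list. As a minimal sketch in Python (the status labels and the function name `bisect_ranges` are made up for illustration), the following walks the list above and reports each adjacent pair of bootable versions where the behaviour flips:

```python
# Results copied from the list above.
RESULTS = [
    ("v4.0", "no-boot"),
    ("v4.1", "bug"),
    ("v4.2", "bug"),
    ("v4.3", "bug"),
    ("v4.4", "bug"),
    ("v4.5", "bug"),
    ("v4.6", "ok"),
    ("v4.7", "ok"),
    ("v4.8", "bug"),
    ("v4.9", "bug"),
    ("v4.10", "bug"),
]


def bisect_ranges(results):
    """Return (older, newer, change) triples where behaviour flips."""
    # Kernels that do not boot tell us nothing about ipconfig, so skip them.
    testable = [(v, s) for v, s in results if s != "no-boot"]
    ranges = []
    for (old_v, old_s), (new_v, new_s) in zip(testable, testable[1:]):
        if old_s != new_s:
            change = "fixed" if new_s == "ok" else "broken"
            ranges.append((old_v, new_v, change))
    return ranges


if __name__ == "__main__":
    for older, newer, change in bisect_ranges(RESULTS):
        print(f"bisect {older}..{newer}: behaviour {change}")
```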

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1652348

Title:
  initrd dhcp fails / ignores valid response

Status in linux package in Ubuntu:
  Incomplete



[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-31 Thread Paul Graydon
The more I look at this, the more I'm convinced *most* of the real
problem lies in the ipconfig tool.  Yes, various kernel changes seem to
make it alternate between working and not working under these
circumstances (which is bizarre), but unless something is specifically
interfering with the inter-process communication, ipconfig appears to be
ignoring valid DHCP responses purely based on whether you tell it "all"
interfaces or a specific interface.

A small modification could be made to initramfs-tools to have it
iterate over the interfaces in the system one at a time.  It would
marginally slow down the boot should the relevant interface not be the
first, but it would get rid of this bug entirely.  Alternatively, the
initrd environment could be modified to use dhclient instead of ipconfig
(dhclient appears to be in the initrd and works perfectly fine when
called in a generic fashion, though the other initramfs-tools scripts
seem aware that ipconfig didn't complete successfully, which I haven't
looked into).
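
The one-at-a-time idea can be sketched as follows. This is an illustration in Python of the control flow only, not a patch for initramfs-tools (which is shell); `try_dhcp` is a hypothetical caller-supplied function that runs a DHCP client against a single interface and reports success, and reading /sys/class/net to enumerate interfaces is an assumption about how the list would be obtained.

```python
import os


def list_interfaces(sys_net="/sys/class/net"):
    """Return non-loopback interface names, sorted for a stable order."""
    if not os.path.isdir(sys_net):
        return []
    return sorted(name for name in os.listdir(sys_net) if name != "lo")


def configure_one_at_a_time(interfaces, try_dhcp):
    """Try DHCP on each interface in turn; return the first that succeeds.

    `try_dhcp` is a hypothetical callable taking an interface name and
    returning True once that interface has been configured.
    """
    for iface in interfaces:
        if try_dhcp(iface):
            return iface
    return None


if __name__ == "__main__":
    # Stand-in for a real DHCP attempt: pretend only ens2f0 gets a lease.
    def fake_try_dhcp(iface):
        return iface == "ens2f0"

    print(configure_one_at_a_time(["ens2f1", "ens2f0"], fake_try_dhcp))
```

The cost mentioned above shows up here directly: every interface tried before the right one adds one full DHCP timeout to the boot.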

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1652348

Title:
  initrd dhcp fails / ignores valid response

Status in linux package in Ubuntu:
  Incomplete



[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-31 Thread Paul Graydon
My apologies for any lack of clarity.

I tested against the head of ubuntu-xenial; reverting just that commit
fixed it.

I tested against the head of the mainline kernel and it didn't (last
night I tried the 4.9, 4.8, 4.5, 4.4 and 4.2 tags of the mainline kernel,
and the general bug was in effect in every one of them).  I'll try some
larger leaps and see if I can track it down elsewhere.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1652348

Title:
  initrd dhcp fails / ignores valid response

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Between kernel versions 4.4.0-53 and 4.4.0-57 a bug has been
  (re?)introduced that is breaking dhcp booting in the initrd
  environment.  This is stopping instances that use iscsi storage from
  being able to connect.

  Over serial console it outputs:

  IP-Config: no response after 2 secs - giving up
  IP-Config: ens2f0 hardware address 90:e2:ba:d1:36:38 mtu 1500 DHCP RARP
  IP-Config: ens2f1 hardware address 90:e2:ba:d1:36:39 mtu 1500 DHCP RARP
  IP-Config: no response after 3 secs - giving up

  with increasing delays until it fails.  At which point a simple
  ipconfig -t dhcp -d "ens2f0"  works.  The console output is slightly
  garbled but should give you an idea:

  (initramfs) ipconfig -t dhcp -[  728.379793] ixgbe :13:00.0 ens2f0: changing MTU from 1500 to 9000
  d "ens2f0"
  IP-Config: ens2f0 hardware address 90:e2:ba:d1:36:38 mtu 1500 DHCP RARP
  IP-Config: ens2f0 guessed broadcast address 10.0.1.255
  IP-Config: ens2f0 complete (dhcp from 169.254.169.254):
   addres[  728.980448] ixgbe :13:00.0 ens2f0: detected SFP+: 3
  s: 10.0.1.56broadcast: 10.0.1.255   netmask: 255.255.255.0
   gateway: 10.0.1.1   [  729.148410] ixgbe :13:00.0 ens2f0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
    dns0 : 169.254.169.254  dns1   : 0.0.0.0
   rootserver: 169.254.169.254 rootpath:
   filename  : /ipxe.efi

  tcpdumps show that dhcp requests are being received from the host, and
  responses sent, but not accepted by the host.  When the ipconfig
  command is issued manually, an identical dhcp request and response
  happens, only this time it is accepted.  It doesn't appear to be that
  the messages are being sent and received incorrectly, just silently
  ignored by ipconfig.

  I was seeing this behaviour earlier this year, which I was able to fix
  by specifying "ip=dhcp" as a kernel parameter.  About a month ago that
  was identified as causing us other problems (long story) and we
  dropped it, at which point we discovered the original bug was no
  longer an issue.

  Putting "ip=dhcp" back on with this kernel no longer fixes the
  problem.

  I've compared the two initrds and effectively the only thing that has
  changed between the two is the kernel components.

  Offending commit:
  # first bad commit: [fd4b5fa6e3487d15ede746f92601af008b2abbc0] mnt: Add a per mount namespace limit on the number of mounts

  The offending commit submission:
  https://lkml.org/lkml/2016/10/5/308

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1652348/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-30 Thread Paul Graydon
I tried reverting that specific commit from upstream, but that didn't
resolve the issue.  Time for a new round of bisecting the kernel, this
time using mainline.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1652348

Title:
  initrd dhcp fails / ignores valid response

Status in linux package in Ubuntu:
  Triaged


[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-30 Thread Paul Graydon
This seems to make no sense to me, as a layman anyway.

I checked out the 4.4.0-58.79 tag, reverted that one commit and
confirmed I have a booting 4.4.0-58-generic that'll happily DHCP in the
initrd environment on multiple boots.  It really does seem like,
somehow, that commit is the source of the problems.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1652348

Title:
  initrd dhcp fails / ignores valid response

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Between kernel versions 4.4.0-53 and 4.4.0-57 a bug has been
  (re?)introduced that is breaking dhcp booting in the initrd
  environment.  This is stopping instances that use iscsi storage from
  being able to connect.

  Over serial console it outputs:

  IP-Config: no response after 2 secs - giving up
  IP-Config: ens2f0 hardware address 90:e2:ba:d1:36:38 mtu 1500 DHCP RARP
  IP-Config: ens2f1 hardware address 90:e2:ba:d1:36:39 mtu 1500 DHCP RARP
  IP-Config: no response after 3 secs - giving up

  with increasing delays until it fails.  At which point a simple
  ipconfig -t dhcp -d "ens2f0"  works.  The console output is slightly
  garbled but should give you an idea:

  (initramfs) ipconfig -t dhcp -[  728.379793] ixgbe :13:00.0 ens2f0: changing MTU from 1500 to 9000
  d "ens2f0"
  IP-Config: ens2f0 hardware address 90:e2:ba:d1:36:38 mtu 1500 DHCP RARP
  IP-Config: ens2f0 guessed broadcast address 10.0.1.255
  IP-Config: ens2f0 complete (dhcp from 169.254.169.254):
   addres[  728.980448] ixgbe :13:00.0 ens2f0: detected SFP+: 3
  s: 10.0.1.56broadcast: 10.0.1.255   netmask: 255.255.255.0
   gateway: 10.0.1.1   [  729.148410] ixgbe :13:00.0 ens2f0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
    dns0 : 169.254.169.254  dns1   : 0.0.0.0
   rootserver: 169.254.169.254 rootpath:
   filename  : /ipxe.efi

  tcpdumps show that dhcp requests are being received from the host, and
  responses sent, but not accepted by the host.  When the ipconfig
  command is issued manually, an identical dhcp request and response
  happens, only this time it is accepted.  It doesn't appear to be that
  the messages are being sent and received incorrectly, just silently
  ignored by ipconfig.

  I was seeing this behaviour earlier this year, which I was able to fix
  by specifying "ip=dhcp" as a kernel parameter.  About a month ago that
  was identified as causing us other problems (long story) and we
  dropped it, at which point we discovered the original bug was no
  longer an issue.

  Putting "ip=dhcp" back on with this kernel no longer fixes the
  problem.

  I've compared the two initrds and effectively the only thing that has
  changed between the two is the kernel components.

  I'm going to try and track back through kernel versions to see if I
  can find which version the fix happened in to maybe provide some
  additional context.  I'll also attach copies of the initrds, packet
  captures etc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1652348/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-30 Thread Paul Graydon
I bisected again, and again it came back to that mount point change.
This seems so bizarre.

$ git bisect log
# bad: [6d4f0a79e5a307b6fd3ee3cc5bbb2fcb701b09db] UBUNTU: Ubuntu-4.4.0-57.78
# good: [db5f146d309e70067dae57798c9ea679af835aa7] UBUNTU: Ubuntu-4.4.0-53.74
git bisect start 'Ubuntu-4.4.0-57.78' 'Ubuntu-4.4.0-53.74'
# bad: [02bf412367b827aa5be05a315088ef5fdcf267ca] dmaengine: at_xdmac: fix spurious flag status for mem2mem transfers
git bisect bad 02bf412367b827aa5be05a315088ef5fdcf267ca
# bad: [1e089050b800ba7d6ba1bf5814827e6cca301ad5] smc91x: avoid self-comparison warning
git bisect bad 1e089050b800ba7d6ba1bf5814827e6cca301ad5
# bad: [d7632bdaba3dd143eac3c80bb7e2b0f62259583d] xhci: use default USB_RESUME_TIMEOUT when resuming ports.
git bisect bad d7632bdaba3dd143eac3c80bb7e2b0f62259583d
# bad: [7942010de9a2fe39e72b84e628867f4ff29a70f2] libxfs: clean up _calc_dquots_per_chunk
git bisect bad 7942010de9a2fe39e72b84e628867f4ff29a70f2
# good: [9d2524b0bdeb57f80d0279f6695a833606ad0597] UBUNTU: SAUCE: Bluetooth: decrease refcount after use
git bisect good 9d2524b0bdeb57f80d0279f6695a833606ad0597
# bad: [fd4b5fa6e3487d15ede746f92601af008b2abbc0] mnt: Add a per mount namespace limit on the number of mounts
git bisect bad fd4b5fa6e3487d15ede746f92601af008b2abbc0
# good: [f2109fe47ceb77647ef7d4f545efeba43d06fb64] videobuf2-v4l2: Verify planes array in buffer dequeueing
git bisect good f2109fe47ceb77647ef7d4f545efeba43d06fb64
# good: [d5d9494d2092a7e571dee635ca254075912355c1] thinkpad_acpi: Add support for HKEY version 0x200
git bisect good d5d9494d2092a7e571dee635ca254075912355c1
# first bad commit: [fd4b5fa6e3487d15ede746f92601af008b2abbc0] mnt: Add a per mount namespace limit on the number of mounts
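
For anyone wanting to check that two bisect runs really did converge on
the same commit, here is a minimal sketch in Python.  It assumes the log
has been saved with each "# good:" / "# bad:" / "# first bad commit:"
entry on a single line, as git prints it; the function name
summarize_bisect_log is made up for illustration and is not part of git.

```python
import re

# Matches the comment lines that `git bisect log` writes, e.g.
#   # bad: [<40-hex sha>] <subject>
#   # first bad commit: [<40-hex sha>] <subject>
MARK = re.compile(r"^# (good|bad|first bad commit): \[([0-9a-f]{40})\] (.*)$")


def summarize_bisect_log(text):
    """Return (verdicts, first_bad) parsed from a saved bisect log.

    verdicts  -- list of (kind, sha, subject) for each good/bad mark
    first_bad -- (sha, subject) of the reported culprit, or None
    """
    verdicts = []
    first_bad = None
    for line in text.splitlines():
        match = MARK.match(line.strip())
        if match is None:
            continue  # skip the "git bisect ..." replay commands
        kind, sha, subject = match.groups()
        if kind == "first bad commit":
            first_bad = (sha, subject)
        else:
            verdicts.append((kind, sha, subject))
    return verdicts, first_bad


# A shortened sample taken from the log above.
SAMPLE = """\
# bad: [6d4f0a79e5a307b6fd3ee3cc5bbb2fcb701b09db] UBUNTU: Ubuntu-4.4.0-57.78
# good: [db5f146d309e70067dae57798c9ea679af835aa7] UBUNTU: Ubuntu-4.4.0-53.74
git bisect start 'Ubuntu-4.4.0-57.78' 'Ubuntu-4.4.0-53.74'
# first bad commit: [fd4b5fa6e3487d15ede746f92601af008b2abbc0] mnt: Add a per mount namespace limit on the number of mounts
"""

if __name__ == "__main__":
    marks, culprit = summarize_bisect_log(SAMPLE)
    print(len(marks), "marks; first bad commit:", culprit[0])
```

Running it over both saved logs and comparing the two first_bad values
is a quick sanity check; it says nothing about whether the individual
good/bad verdicts were themselves reliable.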


[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-29 Thread Paul Graydon
I see where I messed up.. I'll try the bisect again.


[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-29 Thread Paul Graydon
Okay... I can't help but think I made a mistake somewhere in the
bisecting process, but it seems to have isolated
fd4b5fa6e3487d15ede746f92601af008b2abbc0 as the bad commit.


$ git bisect log
# bad: [6d4f0a79e5a307b6fd3ee3cc5bbb2fcb701b09db] UBUNTU: Ubuntu-4.4.0-57.78
# good: [40a98f0e91bcc062babd017732cbf7cb20cf39fd] UBUNTU: Ubuntu-4.4.0-51.72
git bisect start 'Ubuntu-4.4.0-57.78' 'Ubuntu-4.4.0-51.72'
# bad: [cd29d2303e86529c089b1c292480c05e7a24bd16] drm/i915: Respect alternate_ddc_pin for all DDI ports
git bisect bad cd29d2303e86529c089b1c292480c05e7a24bd16
# bad: [617dec606ff9e43e64a06daef83e17da0035340a] drm/exynos: fix error handling in exynos_drm_subdrv_open
git bisect bad 617dec606ff9e43e64a06daef83e17da0035340a
# bad: [0dbd2050197ea4dd59f8957b72981cb7d2cfab1c] usb: gadget: function: u_ether: don't starve tx request queue
git bisect bad 0dbd2050197ea4dd59f8957b72981cb7d2cfab1c
# bad: [f3f9de1bd9a63b633946226ba23392ad44e2badf] i2c: core: fix NULL pointer dereference under race condition
git bisect bad f3f9de1bd9a63b633946226ba23392ad44e2badf
# good: [a0678a6643bf688bccce3c298a4a110af10988fc] ipv6: correctly add local routes when lo goes up
git bisect good a0678a6643bf688bccce3c298a4a110af10988fc
# good: [a0ae41d8ee0549161174a39d60f7316b67a87cae] Bluetooth: btusb: Add support for 0cf3:e009
git bisect good a0ae41d8ee0549161174a39d60f7316b67a87cae
# good: [d5d9494d2092a7e571dee635ca254075912355c1] thinkpad_acpi: Add support for HKEY version 0x200
git bisect good d5d9494d2092a7e571dee635ca254075912355c1
# bad: [a6e674fa25854a7dafc59555d508855ea8fe3eaa] i2c: xgene: Avoid dma_buffer overrun
git bisect bad a6e674fa25854a7dafc59555d508855ea8fe3eaa
# bad: [fd4b5fa6e3487d15ede746f92601af008b2abbc0] mnt: Add a per mount namespace limit on the number of mounts
git bisect bad fd4b5fa6e3487d15ede746f92601af008b2abbc0
# first bad commit: [fd4b5fa6e3487d15ede746f92601af008b2abbc0] mnt: Add a per mount namespace limit on the number of mounts


From a layman's perspective, it doesn't seem like that could possibly
cause the bug.

I guess one quick way forward, rather than repeat the whole bisecting
process, is to completely reset the repository, bring it up to date,
verify the bug still exists, and then revert this specific commit and
see if the bug goes away.


[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-27 Thread Paul Graydon
I'll take a fresh look in the morning, but ran into this:

make[1]: Leaving directory '/home/ubuntu/storage/ubuntu-xenial/debian/build/build-generic/zfs/module'
Debug: module-check-generic
install -d /home/ubuntu/storage/ubuntu-xenial/debian.master/abi/4.4.0-54.76/amd64
find /home/ubuntu/storage/ubuntu-xenial/debian/build/build-generic/ -name \*.ko | \
sed -e 's/.*\/\([^\/]*\)\.ko/\1/' | sort > /home/ubuntu/storage/ubuntu-xenial/debian.master/abi/4.4.0-54.76/amd64/generic.modules
II: Checking modules for generic...previous or current modules file missing!
   /home/ubuntu/storage/ubuntu-xenial/debian.master/abi/4.4.0-54.76/amd64/generic.modules
   /home/ubuntu/storage/ubuntu-xenial/debian.master/abi/4.4.0-54.75/amd64/generic.modules
debian/rules.d/4-checks.mk:12: recipe for target 'module-check-generic' failed
make: *** [module-check-generic] Error 1


[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-27 Thread Paul Graydon
I can give that a shot, following the instructions here:
https://wiki.ubuntu.com/Kernel/KernelBisection#Bisecting_Ubuntu_kernel_versions


[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-27 Thread Paul Graydon
I should clarify, I know for certain that 4.4.0-51 is stable and
reliable (and doesn't exhibit the bug).  As part of our attempt to
verify everything was correct with the installation we had a system run
from Wednesday before Thanksgiving, all the way through to the following
Monday, during which time it had an rc.local triggered reboot (so it had
to be fully booted).


[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-27 Thread Paul Graydon
Okay.. this is interesting.  It seems like the Ubuntu dev version of
4.10 is actually intermittently failing (?!)  I guess the next thing to
do here is keep rebooting on this version of the kernel and see how
often the bug occurs vs doesn't occur, so I can get a feel for a
reasonable number of times to reboot with each test kernel once I
actually start bisecting.
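
As a rough guide to how many reboots per test kernel might be enough, here is a minimal sketch. It assumes each boot fails independently with the same probability, which has not been verified for this bug, and the failure rates passed in are placeholders rather than measurements from this report. The function name boots_needed is hypothetical.

```python
import math


def boots_needed(failure_rate: float, confidence: float = 0.95) -> int:
    """Return the smallest number of consecutive clean boots n such that,
    if the kernel really were affected (failing each boot independently
    with probability failure_rate), seeing n clean boots in a row would
    happen with probability at most 1 - confidence.

    Solves (1 - failure_rate) ** n <= 1 - confidence for n.
    """
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


if __name__ == "__main__":
    # Placeholder failure rates, not measured values.
    for rate in (0.5, 0.25, 0.1):
        print(f"failure rate {rate:.2f}: {boots_needed(rate)} clean boots")
```

The rarer the failure, the more clean boots are needed before a kernel can be marked good during a bisect, so it is worth estimating the failure rate on a known-bad kernel first.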

From the dhcp server side I can't see anything different.  The requests
look the same.



[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-27 Thread Paul Graydon
Rolling that command against master fails too:

ubuntu@Beta:~/linux$ mainline-build-one afd2ff9b7e1b367172f18ba7f693dfb62bdcb2dc xenial
*** BUILDING: commit:afd2ff9b7e1b367172f18ba7f693dfb62bdcb2dc series:xenial abinum: ...
full_version<4.4.0>
version<4.4.0>
long
abinum<040400>
fatal: 'xenial' does not appear to be a git repository
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
error: pathspec 'xenial/master' did not match any file(s) known to git.
Deleted branch BUILD.040400 (was 794249c).
Checking out files: 100% (33279/33279), done.
Switched to a new branch 'BUILD.040400'
vvv - build head
commit afd2ff9b7e1b367172f18ba7f693dfb62bdcb2dc
Author: Linus Torvalds 
Date:   Sun Jan 10 15:01:32 2016 -0800

Linux 4.4
^^^ - build head
fatal: invalid reference: xenial/master
fatal: invalid reference: xenial/master-next
fatal: invalid reference: xenial/master
fatal: invalid reference: xenial/master-next
On branch BUILD.040400
nothing to commit, working directory clean
*** checking /home/ubuntu/kteam-tools/mainline-build/adhoc/0001-DISABLE-comedi.patch (drivers/staging/comedi/drivers/das08_cs.c 47a4f33c4733880faa50f0e64a6e5c8f 79236ea0358db3c7a7a8a5f081c320b4) ...
md5sum: drivers/staging/ti-st/st_kim.c: No such file or directory
*** checking /home/ubuntu/kteam-tools/mainline-build/adhoc/0002-DISABLE-ti-st.patch (drivers/staging/ti-st/st_kim.c b41944e0c30683bdedb6a66e11098892 ) ...
md5sum: drivers/staging/hv/hv_mouse.c: No such file or directory
*** checking /home/ubuntu/kteam-tools/mainline-build/adhoc/0003-DISABLE-hyperv.patch (drivers/staging/hv/hv_mouse.c afd5524c29871a8293518f0be50a7474 ) ...
*** checking /home/ubuntu/kteam-tools/mainline-build/adhoc/0004-DISABLE-olpc.patch (drivers/staging/olpc_dcon/olpc_dcon_xo_1.c 13b325ae1aeee7f8602759057ed0d1f9 9d099e35d45e22f96c4d77694a5e6c58) ...
*** checking /home/ubuntu/kteam-tools/mainline-build/adhoc/0005-UBUNTU-olpc_dcon_xo_1-needs-delay.h.patch (drivers/staging/olpc_dcon/olpc_dcon_xo_1.c 6a0ae9f73f4878052202473bb952d6e4 9d099e35d45e22f96c4d77694a5e6c58) ...
*** checking /home/ubuntu/kteam-tools/mainline-build/adhoc/0006-UBUNTU-olpc_dcon_xo_1_5-needs-delay.h.patch (drivers/staging/olpc_dcon/olpc_dcon_xo_1_5.c 55c01b13d520fa0cdde88d8d3034f21c 37460a6a542aa92444e9114105621f18) ...
*** checking /home/ubuntu/kteam-tools/mainline-build/adhoc/0007-x86-idle-APM-requires-pm_idle-always-when-it-is-a-mo.patch (arch/x86/kernel/process.c 1ded15dd3a3cb622df182d60160ff826 73538a1ff57235e73e0342d9efa681f5) ...
md5sum: debian/rules.d/2-binary-arch.mk: No such file or directory
*** checking /home/ubuntu/kteam-tools/mainline-build/adhoc/0008-UBUNTU-packaging-do-not-fail-secure-copy-on-older-ke.patch (debian/rules.d/2-binary-arch.mk 647c141b53e037781844f0c04234526e ) ...
md5sum: arch/arm/mach-highbank/clock.c: No such file or directory
*** checking /home/ubuntu/kteam-tools/mainline-build/adhoc/0009-UBUNTU-SAUCE-highbank-export-clock-functions-for-mod.patch (arch/arm/mach-highbank/clock.c 119a926bf04eae5024a3002b626ef8bc ) ...
*** applying /home/ubuntu/kteam-tools/mainline-build/adhoc/any-0001-UBUNTU-SAUCE-add-vmlinux.strip-to-BOOT_TARGETS1-on-p.patch ...
Applying: UBUNTU: SAUCE: add vmlinux.strip to BOOT_TARGETS1 on powerpc
*** applying /home/ubuntu/kteam-tools/mainline-build/adhoc/any-0001-UBUNTU-SAUCE-tools-hv-lsvmbus-add-manual-page.patch ...
Applying: UBUNTU: SAUCE: tools/hv/lsvmbus -- add manual page
*** applying /home/ubuntu/kteam-tools/mainline-build/adhoc/yakkety-0001-disable-pie-when-gcc-has-it-enabled-by-default.patch ...
Applying: UBUNTU: SAUCE: (no-up) disable -pie when gcc has it enabled by default
fatal: Not a valid object name xenial/master-next:debian.master/changelog
dpkg-parsechangelog: warning:-(l0): found end of file where expected first heading
dpkg-parsechangelog: error: fatal error occurred while parsing -
fatal: Not a valid object name xenial/master:debian.master/changelog
dpkg-parsechangelog: warning:-(l0): found end of file where expected first heading
dpkg-parsechangelog: error: fatal error occurred while parsing -
/home/ubuntu/kteam-tools/mainline-build/mainline-build-one: line 291: debian/changelog.new: No such file or directory
mv: cannot stat 'debian/changelog.new': No such file or directory
On branch BUILD.040400
nothing to commit, working directory clean
*** using configs from Ubuntu-0 () ...
fatal: invalid reference: Ubuntu-0
fatal: invalid reference: xenial/
xenial-amd64: chroot not found (::,)


[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-27 Thread Paul Graydon
Gah.. okay https://wiki.ubuntu.com/KernelTeam/GitKernelBuild



[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-27 Thread Paul Graydon
Ahh, I see where the kteam tools stuff is supposed to come from.

It's not clear if I'm supposed to go down that route and use the
mainline-build-one script or not when trying to build the kernel in this
case.  If I use the mainline-build-one tool:

$ mainline-build-one afd2ff9b7e1b367172f18ba7f693dfb62bdcb2dc xenial
*** BUILDING: commit:afd2ff9b7e1b367172f18ba7f693dfb62bdcb2dc series:xenial abinum: ...
full_version<4.4.0>
version<4.4.0>
long
abinum<040400>
fatal: 'xenial' does not appear to be a git repository
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
error: pathspec 'xenial/master' did not match any file(s) known to git.
error: Cannot delete the branch 'BUILD.040400' which you are currently on.
fatal: A branch named 'BUILD.040400' already exists.


The only way this tool works with that syntax is to switch to the master 
branch, and run it from there.  I'm not sure how that's supposed to work with 
git bisect, given bisect is setting your checked out position.



[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-27 Thread Paul Graydon
I'll get started on it. This might take a while to do.

A couple of quick observations:

1) we haven't validated that mainline 4.4.0 actually works.  I only know
certain Ubuntu versions of the 4.4.0 kernel work.  Given how much seems
to be changing between Ubuntu releases of it, that seems a risky
assumption to make.  I'll start by proving that first.

2) On the wiki you linked to: "To do this, you can use the
mainline-build-one script which can be found at
~kteam-tools/malinline-build/maineline-build-one ."  A proper link would
be useful.  Where is ~kteam-tools?



[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-26 Thread Paul Graydon
Tried and tested (the current up-to-date kernels at the time of
posting):

http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.10-rc1/linux-headers-4.10.0-041000rc1-generic_4.10.0-041000rc1.201612252031_amd64.deb

http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.10-rc1/linux-image-4.10.0-041000rc1-generic_4.10.0-041000rc1.201612252031_amd64.deb

They do not appear to suffer from the bug: dhcp was able to complete
happily via the startup scripts in the initrd environment, and the host
booted successfully.

** Tags added: kernel-fixed-upstream

** Tags added: kernel-fixed-upstream-4.10-rc1

** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed



[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-23 Thread Paul Graydon
I've also confirmed the bug is present all the way back in
4.4.0-21-generic, and is present in 4.8.0-34-generic from
yakkety-proposed.



[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-23 Thread Paul Graydon
I've worked my way back through the kernels.  The bug, as it was
(avoided by ip=dhcp in the kernel command line), was in effect in
version 4.4.0-38-generic.  It was fixed in 4.4.0-42-generic.  This is
the state of play so far with kernels I've tested:

linux-image-4.4.0-38-generic - Affected
linux-image-4.4.0-42-generic - Fine
linux-image-4.4.0-43-generic - Fine
linux-image-4.4.0-45-generic - Fine
linux-image-4.4.0-47-generic - Fine
linux-image-4.4.0-51-generic - Fine
linux-image-4.4.0-53-generic - Fine
linux-image-4.4.0-57-generic - Affected
linux-image-4.4.0-58-generic - Affected  (kernel in proposed)
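
To make the bisect boundaries in the list above explicit, here is a small sketch that parses those result lines and reports each adjacent pair of tested kernels where the status flips. The helper names parse and transitions are hypothetical, and the only data used is the list as posted.

```python
import re

# Test results copied from the list above.
RESULTS = """\
linux-image-4.4.0-38-generic - Affected
linux-image-4.4.0-42-generic - Fine
linux-image-4.4.0-43-generic - Fine
linux-image-4.4.0-45-generic - Fine
linux-image-4.4.0-47-generic - Fine
linux-image-4.4.0-51-generic - Fine
linux-image-4.4.0-53-generic - Fine
linux-image-4.4.0-57-generic - Affected
linux-image-4.4.0-58-generic - Affected  (kernel in proposed)
"""

LINE_RE = re.compile(r"linux-image-\d+\.\d+\.\d+-(\d+)-generic - (\w+)")


def parse(text):
    """Return (abi_number, status) tuples sorted by ABI number."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            rows.append((int(match.group(1)), match.group(2)))
    return sorted(rows)


def transitions(rows):
    """Return adjacent pairs of tested kernels whose status differs."""
    return [(a, b) for a, b in zip(rows, rows[1:]) if a[1] != b[1]]


if __name__ == "__main__":
    for (old_abi, old_status), (new_abi, new_status) in transitions(parse(RESULTS)):
        print(f"4.4.0-{old_abi} ({old_status}) -> 4.4.0-{new_abi} ({new_status})")
```

Each reported pair is a candidate good/bad range for a bisect between the corresponding Ubuntu kernel tags.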



[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-23 Thread Paul Graydon
apport-collect doesn't exist in initrd.  I'm unable to supply the
requested information.

** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1652348

Title:
  initrd dhcp fails / ignores valid response

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Between kernel versions 4.4.0-53 and 4.4.0-57 a bug has been
  (re?)introduced that is breaking dhcp booting in the initrd
  environment.  This is stopping instances that use iscsi storage from
  being able to connect.

  Over serial console it outputs:

  IP-Config: no response after 2 secs - giving up
  IP-Config: ens2f0 hardware address 90:e2:ba:d1:36:38 mtu 1500 DHCP RARP
  IP-Config: ens2f1 hardware address 90:e2:ba:d1:36:39 mtu 1500 DHCP RARP
  IP-Config: no response after 3 secs - giving up

  with increasing delays until it fails, at which point a simple
  ipconfig -t dhcp -d "ens2f0"  works.  The console output is slightly
  garbled but should give you an idea:

  (initramfs) ipconfig -t dhcp -[  728.379793] ixgbe :13:00.0 ens2f0: changing MTU from 1500 to 9000
  d "ens2f0"
  IP-Config: ens2f0 hardware address 90:e2:ba:d1:36:38 mtu 1500 DHCP RARP
  IP-Config: ens2f0 guessed broadcast address 10.0.1.255
  IP-Config: ens2f0 complete (dhcp from 169.254.169.254):
   addres[  728.980448] ixgbe :13:00.0 ens2f0: detected SFP+: 3
  s: 10.0.1.56broadcast: 10.0.1.255   netmask: 255.255.255.0
   gateway: 10.0.1.1   [  729.148410] ixgbe :13:00.0 ens2f0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
   dns0 : 169.254.169.254  dns1   : 0.0.0.0
   rootserver: 169.254.169.254 rootpath:
   filename  : /ipxe.efi
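
  The kernel's own messages are interleaved with the ipconfig output
  above.  A minimal Python sketch for pulling the two apart when reading
  a capture like this follows.  It assumes each kernel message starts
  with a "[ seconds.micros]" timestamp and runs to the end of its line
  (so any mail line-wrapping has to be undone first); split_console is a
  hypothetical helper, not part of klibc or any existing tool.

```python
import re

# Assumption: a kernel message begins with a printk timestamp such as
# "[  728.379793] " and continues to the end of the console line.
KERNEL_MSG = re.compile(r"\[\s*\d+\.\d{6}\] .*$")


def split_console(lines):
    """Separate interleaved kernel messages from user-space output.

    Returns (user_lines, kernel_lines). When a kernel message cuts a
    user-space line in half, the fragment before it is glued onto the
    next console line.
    """
    user, kernel = [], []
    pending = ""  # user-space fragment waiting for its continuation
    for line in lines:
        match = KERNEL_MSG.search(line)
        if match:
            kernel.append(match.group(0))
            pending += line[:match.start()]
        else:
            user.append(pending + line)
            pending = ""
    if pending:
        user.append(pending)
    return user, kernel


sample = [
    '(initramfs) ipconfig -t dhcp -[  728.379793] ixgbe :13:00.0 ens2f0: changing MTU from 1500 to 9000',
    'd "ens2f0"',
    'IP-Config: ens2f0 guessed broadcast address 10.0.1.255',
    ' addres[  728.980448] ixgbe :13:00.0 ens2f0: detected SFP+: 3',
    's: 10.0.1.56',
]
user_lines, kernel_lines = split_console(sample)
```

  With the sample above, user_lines holds the reassembled ipconfig
  command and address line, and kernel_lines holds the two ixgbe
  messages.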

  
  tcpdumps show that DHCP requests are being received from the host and
  responses sent, but the host does not accept them.  When the ipconfig
  command is issued manually, an identical DHCP request and response
  happens, only this time it is accepted.  The messages don't appear to
  be sent or received incorrectly; they are just silently ignored by
  ipconfig.
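
  One way to check the "identical request and response" observation is
  to compare the fields a DHCP client uses to pair a reply with its
  request.  Per RFC 2131 these are the transaction ID (xid) and the
  client hardware address (chaddr) in the fixed BOOTP header.  The sketch
  below is only an illustration: it assumes the UDP payloads have already
  been extracted from the pcaps by some other tool, and parse_bootp and
  reply_matches are hypothetical names.  Whether a mismatch in these
  fields is what ipconfig trips over here is not established by this
  report.

```python
import struct

# Fixed BOOTP/DHCP header up to and including chaddr (RFC 2131, figure 1):
# op, htype, hlen, hops, xid, secs, flags, ciaddr, yiaddr, siaddr, giaddr, chaddr
BOOTP_HEADER = struct.Struct("!BBBBIHH4s4s4s4s16s")
BOOTREQUEST, BOOTREPLY = 1, 2


def parse_bootp(payload):
    """Extract the fields relevant for request/reply matching."""
    (op, _htype, hlen, _hops, xid, _secs, _flags,
     _ciaddr, _yiaddr, _siaddr, _giaddr, chaddr) = BOOTP_HEADER.unpack_from(payload)
    return {"op": op, "xid": xid, "chaddr": chaddr[:hlen]}


def reply_matches(request, reply):
    """True if the reply carries the request's xid and hardware address."""
    req, rep = parse_bootp(request), parse_bootp(reply)
    return (req["op"] == BOOTREQUEST
            and rep["op"] == BOOTREPLY
            and req["xid"] == rep["xid"]
            and req["chaddr"] == rep["chaddr"])


# Illustrative packets only; the xid values are made up, the MAC is the
# ens2f0 address from the console output.
_mac = bytes.fromhex("90e2bad13638") + b"\x00" * 10
_zero = b"\x00" * 4
request = BOOTP_HEADER.pack(1, 1, 6, 0, 0x1234ABCD, 0, 0, _zero, _zero, _zero, _zero, _mac)
good_reply = BOOTP_HEADER.pack(2, 1, 6, 0, 0x1234ABCD, 0, 0, _zero, _zero, _zero, _zero, _mac)
stale_reply = BOOTP_HEADER.pack(2, 1, 6, 0, 0x0BADF00D, 0, 0, _zero, _zero, _zero, _zero, _mac)
```

  Running reply_matches over each request/reply pair from failed.pcap
  and worked.pcap would show whether the two cases really differ at this
  level.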

  I was seeing this behaviour earlier this year, which I was able to fix
  by specifying "ip=dhcp" as a kernel parameter.  About a month ago that
  was identified as causing us other problems (long story) and we
  dropped it, at which point we discovered the original bug was no
  longer an issue.

  Putting "ip=dhcp" back on with this kernel no longer fixes the
  problem.

  I've compared the two initrds and effectively the only thing that has
  changed between the two is the kernel components.

  I'm going to try and track back through kernel versions to see if I
  can find which version the fix happened in to maybe provide some
  additional context.  I'll also attach copies of the initrds, packet
  captures etc.



[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-23 Thread Paul Graydon
** Attachment added: "pcap from dhcp server side of 'ipconfig -t "dhcp" -d "ens2f0" '"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1652348/+attachment/4795819/+files/worked.pcap

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1652348

Title:
  initrd dhcp fails / ignores valid response

Status in linux package in Ubuntu:
  Incomplete



[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-23 Thread Paul Graydon
The checksum invalid mentioned in the pcap is interesting, but happens
in both failed and successful, so I'm not sure it's relevant.
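
A common benign reason for "checksum invalid" in a capture taken on the
sending machine is checksum offload: the packet is captured before the
NIC fills in the checksum.  That is an assumption about this setup, not
something the report confirms.  To rule it in or out, the checksum can
be recomputed by hand.  The sketch below implements the standard
Internet checksum (RFC 1071); for UDP the caller would pass the IPv4
pseudo-header followed by the UDP header and payload.  The function name
is illustrative.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Worked example using the byte sequence from RFC 1071, section 3.
example = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(example)
```

Data with a correct checksum appended sums to zero, which gives a quick
validity test for a captured packet.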

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1652348

Title:
  initrd dhcp fails / ignores valid response

Status in linux package in Ubuntu:
  Incomplete



[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-23 Thread Paul Graydon
** Attachment added: "pcap from dhcp server side of initrd startup doing dhcp"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1652348/+attachment/4795820/+files/failed.pcap

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1652348

Title:
  initrd dhcp fails / ignores valid response

Status in linux package in Ubuntu:
  Incomplete



[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-23 Thread Paul Graydon
** Attachment added: "Working 4.4.0-53 initrd"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1652348/+attachment/4795794/+files/initrd.img-4.4.0-53-generic

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1652348

Title:
  initrd dhcp fails / ignores valid response

Status in linux package in Ubuntu:
  New



[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response

2016-12-23 Thread Paul Graydon
** Attachment added: "4.4.0-57 "broken" initrd"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1652348/+attachment/4795793/+files/initrd.img-4.4.0-57-generic

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1652348

Title:
  initrd dhcp fails / ignores valid response

Status in linux package in Ubuntu:
  New



[Kernel-packages] [Bug 1652348] [NEW] initrd dhcp fails / ignores valid response

2016-12-23 Thread Paul Graydon
Public bug reported:

Between kernel versions 4.4.0-53 and 4.4.0-57 a bug has been
(re?)introduced that is breaking dhcp booting in the initrd environment.
This is stopping instances that use iscsi storage from being able to
connect.

Over serial console it outputs:

IP-Config: no response after 2 secs - giving up
IP-Config: ens2f0 hardware address 90:e2:ba:d1:36:38 mtu 1500 DHCP RARP
IP-Config: ens2f1 hardware address 90:e2:ba:d1:36:39 mtu 1500 DHCP RARP
IP-Config: no response after 3 secs - giving up

with increasing delays until it fails, at which point a simple ipconfig
-t dhcp -d "ens2f0"  works.  The console output is slightly garbled but
should give you an idea:

(initramfs) ipconfig -t dhcp -[  728.379793] ixgbe :13:00.0 ens2f0: changing MTU from 1500 to 9000
d "ens2f0"
IP-Config: ens2f0 hardware address 90:e2:ba:d1:36:38 mtu 1500 DHCP RARP
IP-Config: ens2f0 guessed broadcast address 10.0.1.255
IP-Config: ens2f0 complete (dhcp from 169.254.169.254):
 addres[  728.980448] ixgbe :13:00.0 ens2f0: detected SFP+: 3
s: 10.0.1.56broadcast: 10.0.1.255   netmask: 255.255.255.0
 gateway: 10.0.1.1   [  729.148410] ixgbe :13:00.0 ens2f0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
 dns0 : 169.254.169.254  dns1   : 0.0.0.0
 rootserver: 169.254.169.254 rootpath:
 filename  : /ipxe.efi


tcpdumps show that DHCP requests are being received from the host and
responses sent, but the host does not accept them.  When the ipconfig
command is issued manually, an identical DHCP request and response
happens, only this time it is accepted.  The messages don't appear to be
sent or received incorrectly; they are just silently ignored by
ipconfig.

I was seeing this behaviour earlier this year, which I was able to fix
by specifying "ip=dhcp" as a kernel parameter.  About a month ago that
was identified as causing us other problems (long story) and we dropped
it, at which point we discovered the original bug was no longer an
issue.

Putting "ip=dhcp" back on with this kernel no longer fixes the problem.

I've compared the two initrds and effectively the only thing that has
changed between the two is the kernel components.

I'm going to try and track back through kernel versions to see if I can
find which version the fix happened in to maybe provide some additional
context.  I'll also attach copies of the initrds, packet captures etc.
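
Tracking back through kernel versions is a bisection: with a known-good
and a known-bad build at the ends of an ordered list, each test boot
halves the remaining range.  A minimal sketch of that search follows.
Only 4.4.0-53 (good) and 4.4.0-57 (bad) come from this report; the
intermediate version strings and the is_bad stand-in are placeholders
for illustration, not claims about which builds exist or where the
regression landed.

```python
def first_bad(versions, is_bad):
    """Return the first bad entry in an ordered list of versions.

    Precondition: versions[0] is known good and versions[-1] is known
    bad. is_bad is called once per tested version (in practice: boot
    that kernel and see whether initrd DHCP works).
    """
    lo, hi = 0, len(versions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]


# Placeholder list: only the two endpoints are taken from the report.
candidates = ["4.4.0-53", "4.4.0-54", "4.4.0-55", "4.4.0-56", "4.4.0-57"]


def pretend_is_bad(version):
    # Stand-in for a manual boot test; the threshold is invented.
    return int(version.split("-")[1]) >= 56


culprit = first_bad(candidates, pretend_is_bad)
```

With five candidates this needs at most two test boots beyond the two
endpoints already known.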

** Affects: linux-meta (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-meta in Ubuntu.
https://bugs.launchpad.net/bugs/1652348

Title:
  initrd dhcp fails / ignores valid response

Status in linux-meta package in Ubuntu:
  New


[Kernel-packages] [Bug 1626679] Re: NVMe triggering kernel panic followed by "bad: scheduling from the idle thread!"

2016-10-05 Thread Paul Graydon
There isn't a kernel in proposed at the moment, but I've tested using
the latest in yakkety and it seems to be working fine.

I don't have a simple replication case for the bug, unfortunately.  It
just seems to happen for (hand-wavey guess) 50% of boots.

So far I've got this 4.8.0-19-generic kernel to boot several times over
without problem.  I'll keep rebooting and rebooting the server in the
background today, just in case, while I focus on other stuff.
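
To put a number on "several times over": if the old kernel really fails
on roughly half of boots, and boots are independent (both assumptions,
the 50% being a rough guess), the chance of seeing a run of clean boots
purely by luck shrinks quickly.  A small sketch of the arithmetic, with
illustrative function names:

```python
import math


def clean_streak_probability(p_fail, boots):
    """Chance that `boots` consecutive boots all succeed by luck alone."""
    return (1.0 - p_fail) ** boots


def boots_needed(p_fail, confidence):
    """Smallest number of clean boots so that a lucky streak is less
    likely than 1 - confidence, assuming independent boots."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))


three_clean = clean_streak_probability(0.5, 3)
for_99_percent = boots_needed(0.5, 0.99)
```

At a 50% failure rate, three clean boots still happen by chance one
time in eight, so a handful more reboots is worthwhile before calling
the bug fixed.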

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1626679

Title:
  NVMe triggering kernel panic followed by "bad: scheduling from the
  idle thread!"

Status in linux package in Ubuntu:
  Triaged

Bug description:
  On an NVMe system I'm using, Ubuntu 16.04.1 regularly seems to trigger
  a kernel panic in some part of the NVMe driver, after which the logs
  fill up with the same entry over and over again:

  "bad: scheduling from the idle thread!"

  Here's the initial stack trace that seems to trigger off the bug:

  Sep 22 15:51:46 ubuntu kernel: [   97.478175] [ cut here ]
  Sep 22 15:51:46 ubuntu kernel: [   97.478185] WARNING: CPU: 13 PID: 0 at /build/linux-dcxD3m/linux-4.4.0/kernel/irq/manage.c:1438 __free_irq+0x1d2/0x280()
  Sep 22 15:51:46 ubuntu kernel: [   97.478188] Trying to free IRQ 38 from IRQ context!
  Sep 22 15:51:46 ubuntu kernel: [   97.478191] Modules linked in: nls_iso8859_1 ipmi_ssif intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass ioatdma mei_me sb_edac shpchp edac_core lpc_ich mei 8250_fintek ipmi_msghandler mac_hid ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr autofs4 btrfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul ixgbe crc32_pclmul dca vxlan aesni_intel ip6_udp_tunnel udp_tunnel aes_x86_64 lrw gf128mul ptp glue_helper ahci ablk_helper pps_core cryptd nvme libahci mdio wmi fjes
  Sep 22 15:51:46 ubuntu kernel: [   97.478257] CPU: 13 PID: 0 Comm: swapper/13 Not tainted 4.4.0-31-generic #50-Ubuntu
  Sep 22 15:51:46 ubuntu kernel: [   97.478260] Hardware name: Oracle Corporation ORACLE SERVER X5-2/ASM,MOTHERBOARD,1U, BIOS 30080100 04/13/2016
  Sep 22 15:51:46 ubuntu kernel: [   97.478263]  0286 4fea3140a01056a3 883f7f743b10 813f1143
  Sep 22 15:51:46 ubuntu kernel: [   97.478267]  883f7f743b58 81cb61f8 883f7f743b48 81081102
  Sep 22 15:51:46 ubuntu kernel: [   97.478271]  0026 883f5b2ea700 0026
  Sep 22 15:51:46 ubuntu kernel: [   97.478275] Call Trace:
  Sep 22 15:51:46 ubuntu kernel: [   97.478277][] dump_stack+0x63/0x90
  Sep 22 15:51:46 ubuntu kernel: [   97.478290]  [] warn_slowpath_common+0x82/0xc0
  Sep 22 15:51:46 ubuntu kernel: [   97.478294]  [] warn_slowpath_fmt+0x5c/0x80
  Sep 22 15:51:46 ubuntu kernel: [   97.478299]  [] ? try_to_grab_pending+0xb3/0x160
  Sep 22 15:51:46 ubuntu kernel: [   97.478302]  [] __free_irq+0x1d2/0x280
  Sep 22 15:51:46 ubuntu kernel: [   97.478306]  [] free_irq+0x3c/0x90
  Sep 22 15:51:46 ubuntu kernel: [   97.478314]  [] nvme_suspend_queue+0x89/0xb0 [nvme]
  Sep 22 15:51:46 ubuntu kernel: [   97.478320]  [] nvme_disable_admin_queue+0x27/0x90 [nvme]
  Sep 22 15:51:46 ubuntu kernel: [   97.478325]  [] nvme_dev_disable+0x29e/0x2c0 [nvme]
  Sep 22 15:51:46 ubuntu kernel: [   97.478330]  [] ? __nvme_process_cq+0x210/0x210 [nvme]
  Sep 22 15:51:46 ubuntu kernel: [   97.478334]  [] ? dev_warn+0x6c/0x90
  Sep 22 15:51:46 ubuntu kernel: [   97.478340]  [] nvme_timeout+0x110/0x1d0 [nvme]
  Sep 22 15:51:46 ubuntu kernel: [   97.478344]  [] ? cpumask_next_and+0x2f/0x40
  Sep 22 15:51:46 ubuntu kernel: [   97.478348]  [] ? load_balance+0x18c/0x980
  Sep 22 15:51:46 ubuntu kernel: [   97.478354]  [] blk_mq_rq_timed_out+0x2f/0x70
  Sep 22 15:51:46 ubuntu kernel: [   97.478358]  [] blk_mq_check_expired+0x4e/0x80
  Sep 22 15:51:46 ubuntu kernel: [   97.478363]  [] bt_for_each+0xd8/0xe0
  Sep 22 15:51:46 ubuntu kernel: [   97.478367]  [] ? blk_mq_rq_timed_out+0x70/0x70
  Sep 22 15:51:46 ubuntu kernel: [   97.478370]  [] ? blk_mq_rq_timed_out+0x70/0x70
  Sep 22 15:51:46 ubuntu kernel: [   97.478375]  [] blk_mq_queue_tag_busy_iter+0x47/0xc0
  Sep 22 15:51:46 ubuntu kernel: [   97.478379]  [] ? blk_mq_attempt_merge+0xb0/0xb0
  Sep 22 15:51:46 ubuntu kernel: [   97.478383]  [] blk_mq_rq_timer+0x41/0xf0
  Sep 22 15:51:46 ubuntu kernel: [   97.478389]  [] call_timer_fn+0x35/0x120
  Sep 22 15:51:46 ubuntu kernel: [   97.478393]  [] ? blk_mq_attempt_merge+0xb0/0xb0
  Sep 22 15:51:46 ubuntu kernel: [   97.478397]  [] run_timer_softirq+0x23a/0x2f0
  Sep 22 15:51:46 ubuntu kernel: [   97.478403]  [] __do_softirq+0x101/0x290
  Sep 22 

[Kernel-packages] [Bug 1626679] Re: NVMe triggering kernel panic followed by "bad: scheduling from the idle thread!"

2016-09-22 Thread Paul Graydon
gzip'd copy of the kern.log showing the error.

** Attachment added: "kern.log.gz"
   
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1626679/+attachment/4746377/+files/kern.log.gz

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1626679

Title:
  NVMe triggering kernel panic followed by "bad: scheduling from the
  idle thread!"

Status in linux package in Ubuntu:
  New

Bug description:
  On an NVMe system I'm using, Ubuntu 16.04.1 regularly seems to trigger
  a kernel panic in some part of the NVMe driver, after which the logs
  fill up with the same entry over and over again:

  "bad: scheduling from the idle thread!"

  Here's the initial stack trace that seems to trigger off the bug:

  Sep 22 15:51:46 ubuntu kernel: [   97.478175] [ cut here ]
  Sep 22 15:51:46 ubuntu kernel: [   97.478185] WARNING: CPU: 13 PID: 0 at /build/linux-dcxD3m/linux-4.4.0/kernel/irq/manage.c:1438 __free_irq+0x1d2/0x280()
  Sep 22 15:51:46 ubuntu kernel: [   97.478188] Trying to free IRQ 38 from IRQ context!
  Sep 22 15:51:46 ubuntu kernel: [   97.478191] Modules linked in: nls_iso8859_1 ipmi_ssif intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass ioatdma mei_me sb_edac shpchp edac_core lpc_ich mei 8250_fintek ipmi_msghandler mac_hid ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr autofs4 btrfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul ixgbe crc32_pclmul dca vxlan aesni_intel ip6_udp_tunnel udp_tunnel aes_x86_64 lrw gf128mul ptp glue_helper ahci ablk_helper pps_core cryptd nvme libahci mdio wmi fjes
  Sep 22 15:51:46 ubuntu kernel: [   97.478257] CPU: 13 PID: 0 Comm: swapper/13 Not tainted 4.4.0-31-generic #50-Ubuntu
  Sep 22 15:51:46 ubuntu kernel: [   97.478260] Hardware name: Oracle Corporation ORACLE SERVER X5-2/ASM,MOTHERBOARD,1U, BIOS 30080100 04/13/2016
  Sep 22 15:51:46 ubuntu kernel: [   97.478263]  0286 4fea3140a01056a3 883f7f743b10 813f1143
  Sep 22 15:51:46 ubuntu kernel: [   97.478267]  883f7f743b58 81cb61f8 883f7f743b48 81081102
  Sep 22 15:51:46 ubuntu kernel: [   97.478271]  0026 883f5b2ea700 0026
  Sep 22 15:51:46 ubuntu kernel: [   97.478275] Call Trace:
  Sep 22 15:51:46 ubuntu kernel: [   97.478277][] dump_stack+0x63/0x90
  Sep 22 15:51:46 ubuntu kernel: [   97.478290]  [] warn_slowpath_common+0x82/0xc0
  Sep 22 15:51:46 ubuntu kernel: [   97.478294]  [] warn_slowpath_fmt+0x5c/0x80
  Sep 22 15:51:46 ubuntu kernel: [   97.478299]  [] ? try_to_grab_pending+0xb3/0x160
  Sep 22 15:51:46 ubuntu kernel: [   97.478302]  [] __free_irq+0x1d2/0x280
  Sep 22 15:51:46 ubuntu kernel: [   97.478306]  [] free_irq+0x3c/0x90
  Sep 22 15:51:46 ubuntu kernel: [   97.478314]  [] nvme_suspend_queue+0x89/0xb0 [nvme]
  Sep 22 15:51:46 ubuntu kernel: [   97.478320]  [] nvme_disable_admin_queue+0x27/0x90 [nvme]
  Sep 22 15:51:46 ubuntu kernel: [   97.478325]  [] nvme_dev_disable+0x29e/0x2c0 [nvme]
  Sep 22 15:51:46 ubuntu kernel: [   97.478330]  [] ? __nvme_process_cq+0x210/0x210 [nvme]
  Sep 22 15:51:46 ubuntu kernel: [   97.478334]  [] ? dev_warn+0x6c/0x90
  Sep 22 15:51:46 ubuntu kernel: [   97.478340]  [] nvme_timeout+0x110/0x1d0 [nvme]
  Sep 22 15:51:46 ubuntu kernel: [   97.478344]  [] ? cpumask_next_and+0x2f/0x40
  Sep 22 15:51:46 ubuntu kernel: [   97.478348]  [] ? load_balance+0x18c/0x980
  Sep 22 15:51:46 ubuntu kernel: [   97.478354]  [] blk_mq_rq_timed_out+0x2f/0x70
  Sep 22 15:51:46 ubuntu kernel: [   97.478358]  [] blk_mq_check_expired+0x4e/0x80
  Sep 22 15:51:46 ubuntu kernel: [   97.478363]  [] bt_for_each+0xd8/0xe0
  Sep 22 15:51:46 ubuntu kernel: [   97.478367]  [] ? blk_mq_rq_timed_out+0x70/0x70
  Sep 22 15:51:46 ubuntu kernel: [   97.478370]  [] ? blk_mq_rq_timed_out+0x70/0x70
  Sep 22 15:51:46 ubuntu kernel: [   97.478375]  [] 
blk_mq_queue_tag_busy_iter+0x47/0xc0
  Sep 22 15:51:46 ubuntu kernel: [   97.478379]  [] ? 
blk_mq_attempt_merge+0xb0/0xb0
  Sep 22 15:51:46 ubuntu kernel: [   97.478383]  [] 
blk_mq_rq_timer+0x41/0xf0
  Sep 22 15:51:46 ubuntu kernel: [   97.478389]  [] 
call_timer_fn+0x35/0x120
  Sep 22 15:51:46 ubuntu kernel: [   97.478393]  [] ? 
blk_mq_attempt_merge+0xb0/0xb0
  Sep 22 15:51:46 ubuntu kernel: [   97.478397]  [] 
run_timer_softirq+0x23a/0x2f0
  Sep 22 15:51:46 ubuntu kernel: [   97.478403]  [] 
__do_softirq+0x101/0x290
  Sep 22 15:51:46 ubuntu kernel: [   97.478407]  [] 
irq_exit+0xa3/0xb0
  Sep 22 15:51:46 ubuntu kernel: [   97.478413]  [] 
smp_apic_timer_interrupt+0x42/0x50
  Sep 22 15:51:46 ubuntu kernel: [   97.478417]  [] 
apic_timer_interrupt+0x82/0x90
  Sep 22 15:51:46 ubuntu kernel: [   

[Kernel-packages] [Bug 1626679] [NEW] NVMe triggering kernel panic followed by "bad: scheduling from the idle thread!"

2016-09-22 Thread Paul Graydon
Public bug reported:

On an NVMe system I'm using, Ubuntu 16.04.1 regularly triggers what
looks like a kernel panic in some part of the NVMe driver, after which
the logs fill up with repeated entries of:

"bad: scheduling from the idle thread!"

Here's the initial stack trace that seems to trigger off the bug:

Sep 22 15:51:46 ubuntu kernel: [   97.478175] ------------[ cut here ]------------
Sep 22 15:51:46 ubuntu kernel: [   97.478185] WARNING: CPU: 13 PID: 0 at 
/build/linux-dcxD3m/linux-4.4.0/kernel/irq/manage.c:1438 
__free_irq+0x1d2/0x280()
Sep 22 15:51:46 ubuntu kernel: [   97.478188] Trying to free IRQ 38 from IRQ 
context!
Sep 22 15:51:46 ubuntu kernel: [   97.478191] Modules linked in: nls_iso8859_1
ipmi_ssif intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel
kvm irqbypass ioatdma mei_me sb_edac shpchp edac_core lpc_ich mei 8250_fintek
ipmi_msghandler mac_hid ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core
ib_addr autofs4 btrfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi
raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor
raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul ixgbe
crc32_pclmul dca vxlan aesni_intel ip6_udp_tunnel udp_tunnel aes_x86_64 lrw
gf128mul ptp glue_helper ahci ablk_helper pps_core cryptd nvme libahci mdio
wmi fjes
Sep 22 15:51:46 ubuntu kernel: [   97.478257] CPU: 13 PID: 0 Comm: swapper/13 
Not tainted 4.4.0-31-generic #50-Ubuntu
Sep 22 15:51:46 ubuntu kernel: [   97.478260] Hardware name: Oracle Corporation 
ORACLE SERVER X5-2/ASM,MOTHERBOARD,1U, BIOS 30080100 04/13/2016
Sep 22 15:51:46 ubuntu kernel: [   97.478263]  0286 
4fea3140a01056a3 883f7f743b10 813f1143
Sep 22 15:51:46 ubuntu kernel: [   97.478267]  883f7f743b58 
81cb61f8 883f7f743b48 81081102
Sep 22 15:51:46 ubuntu kernel: [   97.478271]  0026 
883f5b2ea700 0026 
Sep 22 15:51:46 ubuntu kernel: [   97.478275] Call Trace:
Sep 22 15:51:46 ubuntu kernel: [   97.478277][] 
dump_stack+0x63/0x90
Sep 22 15:51:46 ubuntu kernel: [   97.478290]  [] 
warn_slowpath_common+0x82/0xc0
Sep 22 15:51:46 ubuntu kernel: [   97.478294]  [] 
warn_slowpath_fmt+0x5c/0x80
Sep 22 15:51:46 ubuntu kernel: [   97.478299]  [] ? 
try_to_grab_pending+0xb3/0x160
Sep 22 15:51:46 ubuntu kernel: [   97.478302]  [] 
__free_irq+0x1d2/0x280
Sep 22 15:51:46 ubuntu kernel: [   97.478306]  [] 
free_irq+0x3c/0x90
Sep 22 15:51:46 ubuntu kernel: [   97.478314]  [] 
nvme_suspend_queue+0x89/0xb0 [nvme]
Sep 22 15:51:46 ubuntu kernel: [   97.478320]  [] 
nvme_disable_admin_queue+0x27/0x90 [nvme]
Sep 22 15:51:46 ubuntu kernel: [   97.478325]  [] 
nvme_dev_disable+0x29e/0x2c0 [nvme]
Sep 22 15:51:46 ubuntu kernel: [   97.478330]  [] ? 
__nvme_process_cq+0x210/0x210 [nvme]
Sep 22 15:51:46 ubuntu kernel: [   97.478334]  [] ? 
dev_warn+0x6c/0x90
Sep 22 15:51:46 ubuntu kernel: [   97.478340]  [] 
nvme_timeout+0x110/0x1d0 [nvme]
Sep 22 15:51:46 ubuntu kernel: [   97.478344]  [] ? 
cpumask_next_and+0x2f/0x40
Sep 22 15:51:46 ubuntu kernel: [   97.478348]  [] ? 
load_balance+0x18c/0x980
Sep 22 15:51:46 ubuntu kernel: [   97.478354]  [] 
blk_mq_rq_timed_out+0x2f/0x70
Sep 22 15:51:46 ubuntu kernel: [   97.478358]  [] 
blk_mq_check_expired+0x4e/0x80
Sep 22 15:51:46 ubuntu kernel: [   97.478363]  [] 
bt_for_each+0xd8/0xe0
Sep 22 15:51:46 ubuntu kernel: [   97.478367]  [] ? 
blk_mq_rq_timed_out+0x70/0x70
Sep 22 15:51:46 ubuntu kernel: [   97.478370]  [] ? 
blk_mq_rq_timed_out+0x70/0x70
Sep 22 15:51:46 ubuntu kernel: [   97.478375]  [] 
blk_mq_queue_tag_busy_iter+0x47/0xc0
Sep 22 15:51:46 ubuntu kernel: [   97.478379]  [] ? 
blk_mq_attempt_merge+0xb0/0xb0
Sep 22 15:51:46 ubuntu kernel: [   97.478383]  [] 
blk_mq_rq_timer+0x41/0xf0
Sep 22 15:51:46 ubuntu kernel: [   97.478389]  [] 
call_timer_fn+0x35/0x120
Sep 22 15:51:46 ubuntu kernel: [   97.478393]  [] ? 
blk_mq_attempt_merge+0xb0/0xb0
Sep 22 15:51:46 ubuntu kernel: [   97.478397]  [] 
run_timer_softirq+0x23a/0x2f0
Sep 22 15:51:46 ubuntu kernel: [   97.478403]  [] 
__do_softirq+0x101/0x290
Sep 22 15:51:46 ubuntu kernel: [   97.478407]  [] 
irq_exit+0xa3/0xb0
Sep 22 15:51:46 ubuntu kernel: [   97.478413]  [] 
smp_apic_timer_interrupt+0x42/0x50
Sep 22 15:51:46 ubuntu kernel: [   97.478417]  [] 
apic_timer_interrupt+0x82/0x90
Sep 22 15:51:46 ubuntu kernel: [   97.478419][] ? 
cpuidle_enter_state+0x111/0x2b0
Sep 22 15:51:46 ubuntu kernel: [   97.478428]  [] 
cpuidle_enter+0x17/0x20
Sep 22 15:51:46 ubuntu kernel: [   97.478432]  [] 
call_cpuidle+0x32/0x60
Sep 22 15:51:46 ubuntu kernel: [   97.478436]  [] ? 
cpuidle_select+0x13/0x20
Sep 22 15:51:46 ubuntu kernel: [   97.478440]  [] 
cpu_startup_entry+0x290/0x350
Sep 22 15:51:46 ubuntu kernel: [   97.478444]  [] 
start_secondary+0x154/0x190
Sep 22 15:51:46 ubuntu kernel: [   97.478448] ---[ end trace 4f4c67e52b4d19ac ]---

then

Sep 22 15:51:46 ubuntu kernel: [   97.478463] BUG: