This bug is missing log files that will aid in diagnosing the problem.
While running an Ubuntu kernel (not a mainline or third-party kernel)
please enter the following command in a terminal window:

apport-collect 1886404

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable
to run this command, please add a comment stating that fact and change
the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the
Ubuntu Kernel Team.

** Changed in: linux (Ubuntu)
       Status: New => Incomplete

** Tags added: artful

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1886404

Title:
  Ubuntu 16.04 nodes getting hung without any error logs and reboots

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Hello Guys,

  In one of the Ubuntu nodes we have kernel 4.13.0-36-generic
  #40~16.04.1-Ubuntu SMP Fri Feb 16 23:25:58 UTC 2018 x86_64 x86_64
  x86_64 GNU/Linux

  We have a Ceph cluster with 3 Mon nodes and 20 Data Nodes, running
  ceph version 14.2.8 (2d095e947a02261ce61424021bb43bd3022d35cb)
  nautilus (stable)

  On Monday around 04:36 AM IST, one of the Data Nodes hung and then
  rebooted around 05:03 AM IST. The same issue occurred on 3 different
  Data Nodes on different days and at different times prior to this.
  While checking the OSD logs, we found heartbeat messages that were
  generated while the other Data Node was in the hung state. We also
  checked syslog but could not find anything unusual.
  After the node rebooted, the following messages were logged:

  Jun  8 05:03:51 nvmbdprp012682 kernel: [    0.000000] tsc: Fast TSC
  calibration failed

  Jun  8 05:03:51 nvmbdprp012682 kernel: [    0.250233] acpi PNP0A03:00:
  _OSC failed (AE_NOT_FOUND); disabling ASPM

  Jun  8 05:03:51 nvmbdprp012682 kernel: [    0.251101] acpi PNP0A03:01:
  _OSC failed (AE_NOT_FOUND); disabling ASPM

  Jun  8 05:03:51 nvmbdprp012682 kernel: [    0.310575] pci 0000:04:00.0:
  BAR 6: failed to assign [mem size 0x00100000 pref]

  Jun  8 05:03:51 nvmbdprp012682 kernel: [    0.310579] pci 0000:04:00.1:
  BAR 6: no space for [mem size 0x00100000 pref]

  Jun  8 05:03:51 nvmbdprp012682 kernel: [    0.310582] pci 0000:04:00.1:
  BAR 6: failed to assign [mem size 0x00100000 pref]

  Jun  8 05:04:04 nvmbdprp012682 kernel: [   31.107338] dsa_filter_hook:
  module verification failed: signature and/or required key missing -
  tainting kernel

  Jun  8 05:04:37 nvmbdprp012682 snapd[2836]: stateengine.go:102: state
  ensure error: Get https://api.snapcraft.io/api/v1/snaps/sections:
  net/http: request canceled while waiting for connection
  (Client.Timeout exceeded while awaiting headers)

  Can you please help me figure out the root cause? Could the log
  messages above cause such behaviour, or is something else causing this
  issue that I am not able to see?
  Let me know if any more information is needed in this regard.

  Thanks

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1886404/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
