[Bug 1596866] Re: NMI watchdog: Watchdog detected hard LOCKUP on cpu 0 - Xenial - Python

2018-04-03 Thread Michael Sherman
I am experiencing this issue on a new Xeon Scalable machine.
I can confirm that it is present on xenial with both linux-image-generic and
linux-image-generic-hwe (kernels 4.4.0-116 and 4.13.0-38), and I can reproduce
it at will.

It no longer seems to occur on kernel 4.15.0-13.

In all cases, clocksource was already set to TSC automatically on install.
The soft lockup and hard lockup were triggered each time by a waf build of the 
ns3 project.
In each case, a failure occurred while running the commands as a regular user, 
but the system was stable if run as root, or with sudo.
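
For anyone who wants to confirm which clocksource their own machine picked, here is a minimal sketch that reads the standard sysfs files. The function name read_clocksource and the idea of passing the directory as a parameter are illustrative assumptions, not something taken from this report.

```python
from pathlib import Path

# Standard sysfs location on Linux. It is a parameter of the function below so
# that the code can also be pointed at a saved copy of the directory.
SYSFS_CLOCKSOURCE = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksource(base=SYSFS_CLOCKSOURCE):
    """Return (current, available) clocksources as reported under `base`."""
    base = Path(base)
    # current_clocksource holds a single name, e.g. "tsc"
    current = (base / "current_clocksource").read_text().strip()
    # available_clocksource holds a space-separated list of names
    available = (base / "available_clocksource").read_text().split()
    return current, available
```

Calling read_clocksource() with no argument on the affected machine returns the active clocksource and the list of alternatives the kernel offers.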


** Attachment added: "lspci-vnvn.log"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1596866/+attachment/5100766/+files/lspci-vnvn.log

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1596866

Title:
  NMI watchdog: Watchdog detected hard LOCKUP on cpu 0 - Xenial - Python

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1596866/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1596866] Re: NMI watchdog: Watchdog detected hard LOCKUP on cpu 0 - Xenial - Python

2017-03-12 Thread Kevin O'Gorman
I was having this problem on two systems, one a Core i7, the other a
Xeon.  Both are x86-64, running the kernel that uname reports as
4.4.0-66-generic.

The problem has not recurred since the changes, but those were made only a
few days ago.


[Bug 1596866] Re: NMI watchdog: Watchdog detected hard LOCKUP on cpu 0 - Xenial - Python

2017-02-23 Thread Kevin O'Gorman
It looks like I'm having the same problem with a clean install of
Xubuntu 16.04.1, and I'll be trying the clocksource tweak.  If you don't
hear back, it got fixed.


[Bug 1596866] Re: NMI watchdog: Watchdog detected hard LOCKUP on cpu 0 - Xenial - Python

2016-06-30 Thread James Troup
** Changed in: linux (Ubuntu)
   Status: Incomplete => New


[Bug 1596866] Re: NMI watchdog: Watchdog detected hard LOCKUP on cpu 0 - Xenial - Python

2016-06-29 Thread Benjamin Kaehne
@jjo
So far I am having success with clocksource=tsc, despite warnings from the
kernel telling me it is unstable.
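
To double-check that the clocksource=tsc parameter actually reached the kernel, a minimal sketch like the following can parse a kernel command line string. The function name clocksource_from_cmdline is hypothetical, and keeping the last occurrence when the parameter is repeated is an assumption about how duplicates are treated.

```python
def clocksource_from_cmdline(cmdline):
    """Return the value of the clocksource= boot parameter, or None if absent."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        # Only accept the exact key "clocksource" with an explicit "=value"
        if key == "clocksource" and sep:
            value = val  # assumption: a later occurrence overrides an earlier one
    return value
```

On a running system the string to pass in is the content of /proc/cmdline, for example clocksource_from_cmdline(open("/proc/cmdline").read()).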

@jsalisbury
This is off a clean xenial install/update. I will try to post test results
from the new kernel shortly.


[Bug 1596866] Re: NMI watchdog: Watchdog detected hard LOCKUP on cpu 0 - Xenial - Python

2016-06-29 Thread Joseph Salisbury
Did this issue start happening after an update/upgrade?  Was there a
prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer
to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest
v4.7 kernel[0].

If this bug is fixed in the mainline kernel, please add the following
tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag:
'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as
"Confirmed".


Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.7-rc5-yakkety/

** Changed in: linux (Ubuntu)
   Importance: Undecided => High

** Changed in: linux (Ubuntu)
   Status: Confirmed => Incomplete


[Bug 1596866] Re: NMI watchdog: Watchdog detected hard LOCKUP on cpu 0 - Xenial - Python

2016-06-29 Thread JuanJo Ciarlante
A fairly recent, similar finding, in case it helps:
  https://github.com/TobleMiner/wintron7.0/issues/2
It was worked around with clocksource=tsc; I would guess that
ntpq also shows a large drift.


[Bug 1596866] Re: NMI watchdog: Watchdog detected hard LOCKUP on cpu 0 - Xenial - Python

2016-06-28 Thread Benjamin Kaehne
** Attachment added: "kern.log"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1596866/+attachment/4692019/+files/kern.log


[Bug 1596866] Re: NMI watchdog: Watchdog detected hard LOCKUP on cpu 0 - Xenial - Python

2016-06-28 Thread Benjamin Kaehne
Uploading kern.log too. dmesg was cleared as server needed to be
rebooted.
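
For anyone going through a kern.log like the one attached, a minimal sketch for counting the watchdog reports per CPU could look like this. The pattern is built from the message text in this bug's title; the function name count_hard_lockups is illustrative.

```python
import re
from collections import Counter

# Matches the message text quoted in this bug's title and captures the CPU number.
LOCKUP_RE = re.compile(r"NMI watchdog: Watchdog detected hard LOCKUP on cpu (\d+)")


def count_hard_lockups(lines):
    """Count hard-lockup reports per CPU number in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = LOCKUP_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

The function accepts any iterable of strings, so an open file object works directly; the usual Ubuntu location /var/log/kern.log is an assumption about where the log lives on a given system.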


[Bug 1596866] Re: NMI watchdog: Watchdog detected hard LOCKUP on cpu 0 - Xenial - Python

2016-06-28 Thread Benjamin Kaehne
apport information

** Tags added: apport-collected uec-images

** Description changed:

  I am getting quite regular hard lockups in python (2.7) on xenial:
  
  Linux rts-os-s-03 4.4.0-28-generic #47-Ubuntu SMP Fri Jun 24 10:09:13 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
  
  Ubuntu 16.04 LTS \n \l
  
  Python 2.7:
  ii  python     2.7.11-1         amd64  interactive high-level object-oriented language (default version)
  ii  python2.7  2.7.11-7ubuntu1  amd64  Interactive high-level object-oriented language (version 2.7)
  
  Python 3:
  ii  python3    3.5.1-3          amd64  interactive high-level object-oriented language (default python3 version)
  
  
  Jun 28 06:52:42  kernel: [ 1634.052991] NMI watchdog: Watchdog detected hard LOCKUP on cpu 0
  Jun 28 06:52:42  kernel: [ 1634.059516] Modules linked in: iptable_raw kvm_intel ebtable_filter ebtables ip6table_filter ip6_tables xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat xt_tcpudp iptable_filter ip_tables x_tables veth bridge stp llc bonding dcdbas intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass shpchp lpc_ich ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi openvswitch nf_defrag_ipv6 nf_conntrack autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd bnx2x ahci libahci tg3 megaraid_sas vxlan ip6_udp_tunnel udp_tunnel ptp pps_core mdio libcrc32c [last unloaded: kvm_intel]
  Jun 28 06:52:42  kernel: [ 1634.059790] CPU: 0 PID: 52914 Comm: python Not tainted 4.4.0-28-generic #47-Ubuntu
  Jun 28 06:52:42  kernel: [ 1634.059791] Hardware name: Dell Inc. PowerEdge R730/0H21J3, BIOS 1.5.4 10/002/2015
  Jun 28 06:52:42  kernel: [ 1634.059792]  0086 2732bfd7 887b254bbbd0 813eb1a3
  Jun 28 06:52:42  kernel: [ 1634.059794]  887b254bbbe8 8113b3bd
  Jun 28 06:52:42  kernel: [ 1634.059796]  887e4da1a000 887b254bbc20 81183e4c 0001
  Jun 28 06:52:42  kernel: [ 1634.059797] Call Trace:
  Jun 28 06:52:42  kernel: [ 1634.059804]  [] dump_stack+0x63/0x90
  Jun 28 06:52:42  kernel: [ 1634.059807]  [] watchdog_overflow_callback+0xbd/0xd0
  Jun 28 06:52:42  kernel: [ 1634.059810]  [] __perf_event_overflow+0x8c/0x1d0
  Jun 28 06:52:42  kernel: [ 1634.059811]  [] perf_event_overflow+0x14/0x20
  Jun 28 06:52:42  kernel: [ 1634.059814]  [] intel_pmu_handle_irq+0x1e1/0x4a0
  Jun 28 06:52:42  kernel: [ 1634.059817]  [] ? __alloc_pages_nodemask+0x1b1/0xb60
  Jun 28 06:52:42  kernel: [ 1634.059821]  [] ? try_charge+0xd4/0x640
  Jun 28 06:52:42  kernel: [ 1634.059823]  [] ? mem_cgroup_try_charge+0x6b/0x1b0
  Jun 28 06:52:42  kernel: [ 1634.059826]  [] ? lru_cache_add_active_or_unevictable+0x27/0xa0
  Jun 28 06:52:42  kernel: [ 1634.059830]  [] ? handle_mm_fault+0xcaa/0x1820
  Jun 28 06:52:42  kernel: [ 1634.059831]  [] ? vma_merge+0x22e/0x330
  Jun 28 06:52:42  kernel: [ 1634.059834]  [] perf_event_nmi_handler+0x2d/0x50
  Jun 28 06:52:42  kernel: [ 1634.059837]  [] nmi_handle+0x69/0x120
  Jun 28 06:52:42  kernel: [ 1634.059839]  [] default_do_nmi+0x40/0x100
  Jun 28 06:52:42  kernel: [ 1634.059841]  [] do_nmi+0xe2/0x130
  Jun 28 06:52:42  kernel: [ 1634.059844]  [] nmi+0x56/0xa5
  
  
  As suggested, this is causing hard lockups and/or pauses.
+ --- 
+ AlsaDevices:
+  total 0
+  crw-rw---- 1 root audio 116,  1 Jun 29 04:12 seq
+  crw-rw---- 1 root audio 116, 33 Jun 29 04:12 timer
+ AplayDevices: Error: [Errno 2] No such file or directory
+ ApportVersion: 2.20.1-0ubuntu2.1
+ Architecture: amd64
+ ArecordDevices: Error: [Errno 2] No such file or directory
+ AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
+ DistroRelease: Ubuntu 16.04
+ IwConfig: Error: [Errno 2] No such file or directory
+ Lsusb:
+  Bus 002 Device 002: ID 8087:8002 Intel Corp. 
+  Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
+  Bus 001 Device 003: ID 413c:a001 Dell Computer Corp. Hub
+  Bus 001 Device 002: ID 8087:800a Intel Corp. 
+  Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
+ MachineType: Dell Inc. PowerEdge R730
+ Package: linux (not installed)
+ PciMultimedia:
+  
+ ProcEnviron:
+  TERM=xterm-256color
+  PATH=(custom, no user)
+  LANG=en_US.UTF-8
+  SHELL=/bin/bash
+ ProcFB:
+  
+ ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-28-generic 
root=UUID=58ac0f2e-dff9-4433-9a89-a8cba7a8154b ro conso