On 2015-10-05 15:16, Udo van den Heuvel wrote:
> Did I miss anything that is needed to avoid this?
> Is this a known issue?
>
> Please let me know.

Finally I got some logging.

I booted into 4.1.10 in single-user mode. All appeared fine.
Then I went to multi-user, or whatever the systemd equivalent behind `init 3` is. No problem.
Then I went to the former runlevel 5, and after a short while the problems reappeared.

Besides the network issue in the log, I had disk issues as well: md0 lost a partition, and md1 lost a partition on a different disk.
No such issues in 4.1.8.

[ 335.353756] Bluetooth: BNEP socket layer initialized
[ 696.811881] ------------[ cut here ]------------
[ 696.811891] WARNING: CPU: 3 PID: 20 at net/sched/sch_generic.c:303 dev_watchdog+0x250/0x260()
[ 696.811893] NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
[ 696.811895] Modules linked in: bnep bluetooth fuse edac_core cpufreq_userspace eeprom msr it87 hwmon_vid nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc ip6t_REJECT nf_conntrack_netbios_ns nf_conntrack_broadcast nf_reject_ipv6 ipt_REJECT nf_conntrack_ipv6 nf_reject_ipv4 xt_tcpudp nf_defrag_ipv6 iptable_filter ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_conntrack iptable_nat nf_conntrack_ipv4 ip6table_filter nf_defrag_ipv4 nf_nat_ipv4 ip6_tables nf_nat nf_conntrack pwc uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common videodev snd_usb_audio snd_usbmidi_lib snd_hwdep snd_rawmidi ppdev kvm_amd kvm snd_hda_codec_realtek snd_hda_codec_generic cp210x usbserial microcode snd_hda_intel cdc_acm snd_hda_controller snd_hda_codec snd_hda_core snd_seq snd_seq_device evdev parport_serial
[ 696.811940] k10temp parport_pc parport snd_pcm xhci_pci snd_timer snd xhci_hcd i2c_piix4 button acpi_cpufreq binfmt_misc ip_tables x_tables ecb hid_generic usbhid ohci_pci ohci_hcd ehci_pci ehci_hcd sr_mod cdrom radeon fbcon bitblit softcursor font cfbfillrect cfbimgblt cfbcopyarea i2c_algo_bit backlight drm_kms_helper ttm drm fb fbdev autofs4
[ 696.811970] CPU: 3 PID: 20 Comm: ksoftirqd/3 Not tainted 4.1.10 #5
[ 696.811972] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./F2A85X-UP4, BIOS F5a 04/30/2013
[ 696.811974] ffffffffb46d1a21 0000000001778562 ffffffffb46d1a21 ffffffffb4552dcb
[ 696.811977] ffff88042e90bcd0 ffffffffb40730db 0000000000000000 ffff88042e1643a0
[ 696.811980] 0000000000000003 ffff88042e164000 0000000000000001 ffffffffb4073188
[ 696.811984] Call Trace:
[ 696.811989] [<ffffffffb4552dcb>] ? dump_stack+0x4a/0x74
[ 696.811993] [<ffffffffb40730db>] ? warn_slowpath_common+0x8b/0xe0
[ 696.811996] [<ffffffffb4073188>] ? warn_slowpath_fmt+0x58/0x80
[ 696.811999] [<ffffffffb4496cb0>] ? dev_watchdog+0x250/0x260
[ 696.812002] [<ffffffffb4496a60>] ? qdisc_rcu_free+0x30/0x30
[ 696.812005] [<ffffffffb40c0143>] ? call_timer_fn.isra.6+0x23/0x90
[ 696.812008] [<ffffffffb4496a60>] ? qdisc_rcu_free+0x30/0x30
[ 696.812010] [<ffffffffb40c03a8>] ? run_timer_softirq+0x1f8/0x2b0
[ 696.812013] [<ffffffffb4001470>] ? __switch_to+0x20/0x600
[ 696.812016] [<ffffffffb40760a7>] ? __do_softirq+0xf7/0x1f0
[ 696.812018] [<ffffffffb40761b9>] ? run_ksoftirqd+0x19/0x40
[ 696.812022] [<ffffffffb40909b0>] ? smpboot_thread_fn+0x170/0x250
[ 696.812025] [<ffffffffb4090840>] ? sort_range+0x20/0x20
[ 696.812027] [<ffffffffb408dc38>] ? kthread+0xc8/0xe0
[ 696.812029] [<ffffffffb408db70>] ? kthread_worker_fn+0x180/0x180
[ 696.812032] [<ffffffffb4558912>] ? ret_from_fork+0x42/0x70
[ 696.812035] [<ffffffffb408db70>] ? kthread_worker_fn+0x180/0x180
[ 696.812037] ---[ end trace 4e22e3f455b32613 ]---
[ 749.858395] INFO: rcu_preempt self-detected stall on CPU { 1} (t=15000 jiffies g=61792 c=61791 q=297918)
[ 749.858444] Task dump for CPU 1:
[ 749.858459] ksoftirqd/1 R running task 0 12 2 0x00000008
[ 749.858491] ffff88043ec83dd8 00000000b29c67b2 ffffffffb4741c80 ffffffffb40ba775
[ 749.858525] 000000000000f160 ffff88043ec95100 ffffffffb4741c80 0000000000048bbe
[ 749.858559] 0000000000000000 ffffffffb40bdf6c ffff88043ec8f7e0 0000000000000046
[ 749.858593] Call Trace:
[ 749.858604] <IRQ> [<ffffffffb40ba775>] ? rcu_dump_cpu_stacks+0x85/0xe0
[ 749.858637] [<ffffffffb40bdf6c>] ? rcu_check_callbacks+0x43c/0x840
[ 749.858663] [<ffffffffb40d0de0>] ? tick_sched_handle.isra.6+0x30/0x30
[ 749.858690] [<ffffffffb40c208a>] ? hrtimer_run_queues+0x3a/0x120
[ 749.858714] [<ffffffffb40d0de0>] ? tick_sched_handle.isra.6+0x30/0x30
[ 749.858740] [<ffffffffb40c0d59>] ? update_process_times+0x39/0x70
[ 749.858764] [<ffffffffb40d0de0>] ? tick_sched_handle.isra.6+0x30/0x30
[ 749.858789] [<ffffffffb40d0e28>] ? tick_sched_timer+0x48/0x90
[ 749.858813] [<ffffffffb40c15c0>] ? __run_hrtimer.isra.5+0x60/0x150
[ 749.858839] [<ffffffffb40c1d4d>] ? hrtimer_interrupt+0xfd/0x290
[ 749.858863] [<ffffffffb4033450>] ? smp_trace_apic_timer_interrupt+0x60/0xa0
[ 749.858891] [<ffffffffb455931b>] ? apic_timer_interrupt+0x6b/0x70
[ 749.858915] <EOI> [<ffffffffb40c004a>] ? del_timer_sync+0x3a/0x50
[ 749.858943] [<ffffffffb40c0052>] ? del_timer_sync+0x42/0x50
[ 749.858967] [<ffffffffb44ba8c7>] ? inet_csk_reqsk_queue_drop+0xa7/0x210
[ 749.858993] [<ffffffffb44bab15>] ? reqsk_timer_handler+0xe5/0x2f0
[ 749.859021] [<ffffffffb44baa30>] ? inet_csk_reqsk_queue_drop+0x210/0x210
[ 749.859047] [<ffffffffb40c0143>] ? call_timer_fn.isra.6+0x23/0x90
[ 749.859072] [<ffffffffb44baa30>] ? inet_csk_reqsk_queue_drop+0x210/0x210
[ 749.859098] [<ffffffffb40c03a8>] ? run_timer_softirq+0x1f8/0x2b0
[ 749.859122] [<ffffffffb4001470>] ? __switch_to+0x20/0x600
[ 749.859145] [<ffffffffb40760a7>] ? __do_softirq+0xf7/0x1f0
[ 749.859167] [<ffffffffb40761b9>] ? run_ksoftirqd+0x19/0x40
[ 749.859190] [<ffffffffb40909b0>] ? smpboot_thread_fn+0x170/0x250
[ 749.859214] [<ffffffffb4090840>] ? sort_range+0x20/0x20
[ 749.859235] [<ffffffffb408dc38>] ? kthread+0xc8/0xe0
[ 749.859256] [<ffffffffb408db70>] ? kthread_worker_fn+0x180/0x180
[ 749.859280] [<ffffffffb4558912>] ? ret_from_fork+0x42/0x70
[ 749.859302] [<ffffffffb408db70>] ? kthread_worker_fn+0x180/0x180
[ 756.722904] INFO: rcu_sched detected stalls on CPUs/tasks: { 1} (detected by 0, t=15002 jiffies, g=-11, c=-12, q=1)
[ 756.724309] Task dump for CPU 1:
[ 756.725643] ksoftirqd/1 R running task 0 12 2 0x00000008
[ 756.726956] ffffffffb40760a7 0000000004208040 0000000100017dc6 ffff88042e8d0000
[ 756.728221] 042080400000000a 0000000000000001 ffff88042e89d7f0 ffff88042e89a280
[ 756.729459] ffffffffb473cd00 0000000000000001 0000000000000000 0000000000000000
[ 756.730673] Call Trace:
[ 756.731862] [<ffffffffb40760a7>] ? __do_softirq+0xf7/0x1f0
[ 756.733053] [<ffffffffb40761b9>] ? run_ksoftirqd+0x19/0x40
[ 756.734251] [<ffffffffb40909b0>] ? smpboot_thread_fn+0x170/0x250
[ 756.735439] [<ffffffffb4090840>] ? sort_range+0x20/0x20
[ 756.736606] [<ffffffffb408dc38>] ? kthread+0xc8/0xe0
[ 756.737785] [<ffffffffb408db70>] ? kthread_worker_fn+0x180/0x180
[ 756.738957] [<ffffffffb4558912>] ? ret_from_fork+0x42/0x70
[ 756.740128] [<ffffffffb408db70>] ? kthread_worker_fn+0x180/0x180
[ 758.763590] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0xd0000 action 0x6 frozen
[ 758.764837] ata6: SError: { PHYRdyChg CommWake 10B8B }
[ 758.766062] ata6.00: failed command: FLUSH CACHE EXT
[ 758.767304] ata6.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 1 res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 758.769824] ata6.00: status: { DRDY }
[ 758.771080] ata6: hard resetting link
[ 759.262751] ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 764.254450] ata6.00: qc timeout (cmd 0xec)
[ 764.256699] ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 764.258946] ata6.00: revalidation failed (errno=-5)
[ 764.261185] ata6: hard resetting link
[ 764.753632] ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)

Does this shed any light on what is happening here?
The hardware appears fine with 4.1.8. Disks sync OK, no defects seen.

Kind regards,
Udo