Hi,
this didn't happen before, but after 4.16-rc1 my tg3 NIC stops working for
no apparent reason and the connection to the machine goes dead. It didn't
show anything in dmesg until today.
The IO page faults look like the device is trying to access something it
shouldn't, and maybe that's why the transmit queue times out.
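
In case it helps with triage, here is a rough sketch for pulling the faulting
addresses out of a saved dmesg and checking how they are spaced. The helper
names (fault_addresses, strides) and the regex are my own assumptions for
illustration, not anything from the kernel or the tg3 driver, and the pattern
only assumes the "IO_PAGE_FAULT domain=... address=..." layout seen in the log
quoted in this mail (including the mail-wrapped line breaks).

```python
import re

# Assumed pattern: matches the AMD-Vi event text as it appears in this mail.
# \s+ also spans the line break introduced by mail wrapping.
FAULT_RE = re.compile(
    r"IO_PAGE_FAULT\s+domain=0x[0-9a-fA-F]*\s+address=0x([0-9a-fA-F]+)"
)


def fault_addresses(log_text):
    """Return the faulting DMA addresses, in log order, as integers."""
    return [int(m.group(1), 16) for m in FAULT_RE.finditer(log_text)]


def strides(addresses):
    """Return the differences between consecutive addresses."""
    return [b - a for a, b in zip(addresses, addresses[1:])]


# The four events copied from the log in this mail.
sample = """\
[ 58.216474] tg3 :01:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT
domain=0x0001 address=0x0001f0c0 flags=0x]
[ 58.216943] tg3 :01:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT
domain=0x0001 address=0x0001f100 flags=0x]
[ 58.217395] tg3 :01:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT
domain=0x0001 address=0x0001f140 flags=0x]
[ 58.217844] tg3 :01:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT
domain=0x0001 address=0x0001f180 flags=0x]
"""

addrs = fault_addresses(sample)
print([hex(a) for a in addrs])
print([hex(s) for s in strides(addrs)])
```

For the four events quoted here, the addresses advance in steps of 0x40. What
that says about the device's access pattern is a guess on my side.
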
The hang triggers pretty quickly, so I'd call it a reliable reproducer and
can test patches... :-)
Thx.
...
[ 15.916840] random: crng init done
[ 44.792699] tg3 :01:00.0 eth0: Link is up at 100 Mbps, full duplex
[ 44.793024] tg3 :01:00.0 eth0: Flow control is on for TX and on for RX
[ 44.793315] tg3 :01:00.0 eth0: EEE is disabled
[ 44.793395] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 58.216474] tg3 :01:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT
domain=0x0001 address=0x0001f0c0 flags=0x]
[ 58.216943] tg3 :01:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT
domain=0x0001 address=0x0001f100 flags=0x]
[ 58.217395] tg3 :01:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT
domain=0x0001 address=0x0001f140 flags=0x]
[ 58.217844] tg3 :01:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT
domain=0x0001 address=0x0001f180 flags=0x]
[ 64.992145] ------------[ cut here ]------------
[ 64.992406] NETDEV WATCHDOG: eth0 (tg3): transmit queue 0 timed out
[ 64.992742] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:464
dev_watchdog+0x1fe/0x210
[ 64.992744] Modules linked in: arc4 iwlmvm mac80211 amdgpu kvm_amd kvm
iwlwifi irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel
snd_hda_codec_conexant snd_hda_codec_hdmi snd_hda_codec_generic aesni_intel
sha256_generic aes_x86_64 crypto_simd snd_hda_intel cryptd glue_helper tg3
snd_hda_codec pcspkr snd_hwdep cfg80211 joydev psmouse ptp snd_hda_core hp_wmi
pps_core snd_pcm ehci_pci chash tpm_infineon rfkill libphy i2c_piix4 snd_timer
fam15h_power xhci_pci ehci_hcd snd sg gpu_sched k10temp soundcore xhci_hcd
tpm_tis tpm_tis_core video tpm battery button ac acpi_cpufreq evdev input_leds
serio_raw sd_mod thermal pinctrl_amd
[ 64.993216] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.16.0-rc1+ #2
[ 64.993222] Hardware name: HP HP EliteBook 745 G3/807E, BIOS N73 Ver. 01.08
01/28/2016
[ 64.996048] RIP: 0010:dev_watchdog+0x1fe/0x210
[ 64.996050] RSP: 0018:88043dc83e88 EFLAGS: 00010282
[ 64.996052] RAX: RBX: RCX: 0103
[ 64.996054] RDX: 8103 RSI: 0086 RDI:
[ 64.996055] RBP: 88042b86e39c R08: 81c0a400 R09: 0001
[ 64.996057] R10: 035a R11: R12: 88042b86e3b0
[ 64.996058] R13: 88042b86e000 R14: 0005 R15: 88042a0ced80
[ 64.996061] FS: () GS:88043dc8()
knlGS:
[ 64.996063] CS: 0010 DS: ES: CR0: 80050033
[ 64.996065] CR2: 7f98ed87eb00 CR3: 000428ea CR4: 001406e0
[ 64.996068] Call Trace:
[ 64.996074]
[ 64.996082] ? qdisc_reset+0xe0/0xe0
[ 64.996085] ? qdisc_reset+0xe0/0xe0
[ 64.996092] call_timer_fn+0x2b/0x150
[ 64.996097] run_timer_softirq+0x415/0x460
[ 64.996101] ? tick_sched_timer+0x42/0x90
[ 64.996106] ? _raw_spin_lock_irq+0x1a/0x40
[ 64.996110] ? __hrtimer_run_queues+0x113/0x2d0
[ 64.996114] __do_softirq+0xeb/0x2d5
[ 64.996121] irq_exit+0xaa/0xb0
[ 64.996125] smp_apic_timer_interrupt+0x73/0x150
[ 64.996128] apic_timer_interrupt+0x7d/0x90
[ 64.996131]
[ 64.996136] RIP: 0010:cpuidle_enter_state+0xa3/0x2f0
[ 64.996138] RSP: 0018:c900019c3ea8 EFLAGS: 0246 ORIG_RAX:
ff12
[ 64.996141] RAX: 88043dc8 RBX: 000f21d4b954 RCX: 001f
[ 64.996142] RDX: 000f21d4b954 RSI: 81da4ca1 RDI: 81db2a9e
[ 64.996144] RBP: 88042a39a200 R08: 0005a0b5 R09: 000585fa
[ 64.996145] R10: 0018 R11: 00049370 R12: 0002
[ 64.996146] R13: 82095db8 R14: R15: 000f0b23994e
[ 64.996157] ? cpuidle_enter_state+0x93/0x2f0
[ 65.003171] do_idle+0x19a/0x1f0
[ 65.003176] cpu_startup_entry+0x6f/0x80
[ 65.003181] start_secondary+0x1a5/0x200
[ 65.003185] secondary_startup_64+0xa5/0xb0
[ 65.003189] Code: 00 49 63 4c 24 f0 eb 93 4c 89 ef c6 05 5b 10 af 00 01 e8
b6 67 fd ff 89 d9 48 89 c2 4c 89 ee 48 c7 c7 20 f6 df 81 e8 e2 8d a7 ff <0f> ff
eb be 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44
[ 65.003234] ---[ end trace b191673f18a75f41 ]---
[ 65.003243] tg3 :01:00.0 eth0: transmit timed out, resetting
[ 67.679695] tg3 :01:00.0 eth0: 0x: 0x168714e4, 0x10100406,
0x0210, 0x
[ 67.680053] tg3 :01:00.0 eth0: 0x0010: 0xd082000c, 0x,
0xd081000c, 0x
[ 67.680406] tg3 :01:00.0 eth0: 0x0020: 0xd08c, 0x,
0x, 0x807e103c
[ 67.680419] tg3 :01:00