Re: next-20150610 - repeated hangs at e1000e_phc_gettime+0x2e/0x60

2015-06-12 Thread Valdis Kletnieks
On Thu, 11 Jun 2015 22:57:48 -0400, Valdis Kletnieks said:

> 0) next-20150603 works, so the problem landed in linux-next in the last week.
>
> 1) All 3 times happened while I was at home, using wireless, so
> the interface didn't have link and was ifconfig'ed down.

All 3 crashes happened at almost exactly 4 hours of uptime, but here
in my office I'm now at 6 hours on the same kernel, running with
the interface plugged in and passing traffic.

I have a fighting chance of mostly finishing a bisect over the weekend;
I'll let you know where that leads.




next-20150610 - repeated hangs at e1000e_phc_gettime+0x2e/0x60

2015-06-11 Thread Valdis Kletnieks
I'm seeing repeated hard lockups on my Dell Latitude E6530.
Helpful info:

0) next-20150603 works, so the problem landed in linux-next in the last week.

1) All 3 times happened while I was at home, using wireless, so
the interface didn't have link and was ifconfig'ed down.

2) Remarkably similar times for it to blow up:

[14513.365378] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 1
[14482.271716] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 3
[14479.906820] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 0

(I suspect the offsets were caused by differences in how long it took me
to correctly enter the cryptLUKS passphrase for my encrypted root filesystem)
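
As a quick sanity check on "remarkably similar", here is a minimal Python
sketch that converts the three panic timestamps to hours and minutes of
uptime. Only the three numbers come from the log lines above; the variable
names are purely illustrative.

```python
# Panic timestamps in seconds since boot, copied from the three log lines above.
panic_uptimes = [14513.365378, 14482.271716, 14479.906820]

for t in panic_uptimes:
    # Split the uptime into whole hours and the remaining seconds.
    hours, rem = divmod(t, 3600)
    print(f"{t:.6f} s = {int(hours)} h {rem / 60:.1f} min")

# How far apart are the earliest and the latest of the three panics?
spread = max(panic_uptimes) - min(panic_uptimes)
print(f"spread between earliest and latest panic: {spread:.1f} s")
```

All three panics land less than two minutes past the 4 hour mark, and
within roughly 33 seconds of each other.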

Oddly enough, I don't see any patches to the e1000e driver in quite some
time... but that's where it keeps locking up.

This ringing any bells?

All 3 traces look like:

[14479.906908] Call Trace:
[14479.906914]  [] dump_stack+0x50/0xa8
[14479.906930]  [] panic+0xcd/0x1e4
[14479.906940]  [] ? perf_event_task_disable+0xc0/0xc0
[14479.906952]  [] watchdog_overflow_callback+0x9b/0xa0
[14479.906959]  [] __perf_event_overflow+0xc4/0x1f0
[14479.906968]  [] perf_event_overflow+0x14/0x20
[14479.906976]  [] intel_pmu_handle_irq+0x1e1/0x430
[14479.906990]  [] perf_event_nmi_handler+0x26/0x40
[14479.906999]  [] nmi_handle+0x103/0x340
[14479.907005]  [] ? nmi_handle+0x5/0x340
[14479.907017]  [] default_do_nmi+0xc3/0x120
[14479.907032]  [] do_nmi+0xe8/0x130
[14479.907044]  [] end_repeat_nmi+0x1e/0x2e
[14479.907055]  [] ? e1000e_cyclecounter_read+0x16/0xc0
[14479.907061]  [] ? e1000e_cyclecounter_read+0x16/0xc0
[14479.907069]  [] ? e1000e_cyclecounter_read+0x16/0xc0
[14479.907075]  <<EOE>>  [] timecounter_read+0x19/0x60
[14479.907088]  [] e1000e_phc_gettime+0x2e/0x60
[14479.907098]  [] e1000e_systim_overflow_work+0x31/0x70
[14479.907105]  [] process_one_work+0x3c9/0x980
[14479.907115]  [] ? process_one_work+0x312/0x980
[14479.907125]  [] ? worker_thread+0x78/0x760
[14479.907134]  [] worker_thread+0x2cc/0x760
[14479.907144]  [] ? process_one_work+0x980/0x980
[14479.907154]  [] kthread+0xfe/0x120
[14479.907163]  [] ? finish_task_switch+0x50/0x1c0
[14479.907173]  [] ? kthread_create_on_node+0x270/0x270
[14479.907179]  [] ret_from_fork+0x3f/0x70
[14479.907188]  [] ? kthread_create_on_node+0x270/0x270
[14479.907243] Kernel Offset: 0x3900 from 0x8100 (relocation range:
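
For anyone skimming the trace, here is a rough Python sketch that pulls out
the frames not marked with "?" (frames with "?" are addresses found on the
stack that the unwinder could not confirm as part of the call chain). The
sample lines are abridged from the trace above; the regex and the helper
name `frames` are assumptions made for illustration, not part of any
kernel tooling.

```python
import re

# A few frames abridged from the trace above (the addresses inside the
# brackets are already missing in the archived message).
TRACE = """\
[14479.907044]  [] end_repeat_nmi+0x1e/0x2e
[14479.907055]  [] ? e1000e_cyclecounter_read+0x16/0xc0
[14479.907075]  [] timecounter_read+0x19/0x60
[14479.907088]  [] e1000e_phc_gettime+0x2e/0x60
[14479.907098]  [] e1000e_systim_overflow_work+0x31/0x70
[14479.907105]  [] process_one_work+0x3c9/0x980
[14479.907115]  [] ? process_one_work+0x312/0x980
[14479.907134]  [] worker_thread+0x2cc/0x760
"""

# Matches "[] func+0xOFF/0xLEN", with an optional "?" before the function.
FRAME = re.compile(r"\[\]\s+(\?\s+)?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")

def frames(text):
    """Yield (function_name, is_reliable) for each stack frame line."""
    for line in text.splitlines():
        m = FRAME.search(line)
        if m:
            yield m.group(2), m.group(1) is None

# Keep only frames without "?", innermost first.
reliable = [fn for fn, ok in frames(TRACE) if ok]
print(" <- ".join(reliable))
```

Read this way, the workqueue path is worker_thread, process_one_work,
e1000e_systim_overflow_work, e1000e_phc_gettime, timecounter_read, with
the NMI arriving while e1000e_cyclecounter_read shows up only as an
unconfirmed "?" entry.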

