Re: Boot regression (was "Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements")

[email protected] Tue, 14 Feb 2017 08:36:07 -0800

> I tested today's linux-next (next-20170214) + the 2 patches just now and got
> a weird result: 
> sometimes the VM still hung with a new calltrace (BUG: spinlock bad
> magic), but sometimes the VM did boot up despite the new calltrace!
> 
> Attached is the log of a "good" boot.
> 
> It looks like we have a memory corruption issue somewhere...


Yes.

> Actually previously I saw the "BUG: spinlock bad magic" message once, but I
> couldn't repro it later, so I didn't mention it to you.

Interesting.

> 
> The good news is that now I can repro the "spinlock bad magic" message
> every time. 
> I tried to dig into this by enabling Kernel hacking -> Memory debugging,
> but didn't find anything abnormal. 
> Is it possible that the SCSI layer passes a wrong memory address?

It's possible, but this looks like it might be a different issue.

A few questions on the dmesg:

[    6.208794] sd 2:0:0:0: [storvsc] Sense Key : Illegal Request [current] 
[    6.209447] sd 2:0:0:0: [storvsc] Add. Sense: Invalid command operation code
[    6.210043] sd 3:0:0:0: [storvsc] Sense Key : Illegal Request [current] 
[    6.210618] sd 3:0:0:0: [storvsc] Add. Sense: Invalid command operation code
[    6.212272] sd 2:0:0:0: [storvsc] Sense Key : Illegal Request [current] 
[    6.212897] sd 2:0:0:0: [storvsc] Add. Sense: Invalid command operation code
[    6.213474] sd 3:0:0:0: [storvsc] Sense Key : Illegal Request [current] 
[    6.214051] sd 3:0:0:0: [storvsc] Add. Sense: Invalid command operation code

I didn't see anything like this in the other logs.  Are these messages
usual on Hyper-V VMs?

[    6.358405] XFS (sdb1): Mounting V5 Filesystem
[    6.404478] XFS (sdb1): Ending clean mount
[    7.535174] BUG: spinlock bad magic on CPU#0, swapper/0/0
[    7.536807]  lock: host_ts+0x30/0xffffffffffffe1a0 [hv_utils], .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
[    7.538436] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.0-rc8-next-20170214+ #1
[    7.539142] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  04/28/2016
[    7.539142] Call Trace:
[    7.539142]  <IRQ>
[    7.539142]  dump_stack+0x63/0x82
[    7.539142]  spin_dump+0x78/0xc0
[    7.539142]  do_raw_spin_lock+0xfd/0x160
[    7.539142]  _raw_spin_lock_irqsave+0x4c/0x60
[    7.539142]  ? timesync_onchannelcallback+0x153/0x220 [hv_utils]
[    7.539142]  timesync_onchannelcallback+0x153/0x220 [hv_utils]

Can you resolve this address using gdb to a line of code?  Once inside
gdb do:

l *(timesync_onchannelcallback+0x153)
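
If it helps, the gdb command above can be derived mechanically from any
entry in the call trace. The following is only a rough sketch: the helper
name gdb_list_command is made up for illustration, and it assumes trace
lines of the form "symbol+0xOFFSET/0xSIZE [module]" as in the dmesg output
quoted here. gdb itself still has to be started on the matching object
built with debug info (for a module entry, presumably the hv_utils.ko from
your build tree).

```python
import re

# Matches call-trace entries such as
#   "[    7.539142]  ? timesync_onchannelcallback+0x153/0x220 [hv_utils]"
# The trailing "[module]" part is optional (absent for built-in symbols).
TRACE_RE = re.compile(
    r"(?P<sym>[A-Za-z_][\w.]*)\+(?P<off>0x[0-9a-fA-F]+)/(?P<size>0x[0-9a-fA-F]+)"
    r"(?:\s+\[(?P<mod>\w+)\])?"
)


def gdb_list_command(trace_line):
    """Return (module_or_None, gdb_command) for one trace line, or None.

    Hypothetical helper: it only formats the "l *(symbol+offset)" command;
    it does not run gdb or locate the object file.
    """
    m = TRACE_RE.search(trace_line)
    if m is None:
        return None
    command = "l *({}+{})".format(m.group("sym"), m.group("off"))
    return m.group("mod"), command


if __name__ == "__main__":
    line = "[    7.539142]  timesync_onchannelcallback+0x153/0x220 [hv_utils]"
    print(gdb_list_command(line))
```

Lines without a symbol+offset pair (for example "Call Trace:" or "<IRQ>")
yield None, so the helper can be run over a whole dmesg excerpt.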
