Re: Kernel lockup, might be helpful log.

Birdsarenice Mon, 14 Dec 2015 00:28:51 -0800

I've no need for a fix. I know exactly what the underlying cause is: Those Seagate 8TB Archive drives and their known compatibility issues with some kernel versions. I just shared the log because it's a situation that btrfs handles very, very poorly, and the error handling could be improved. If a drive is unresponsive, btrfs really should be able to just cease using it and treat it as failed, or even unmount the entire filesystem - either would be preferable to what actually happens (at least for me), a system hang that leaves nothing functional whatsoever.

I've 'solved' it by removing all drives of that model. It's been running without issue since I did that.


On 14/12/15 07:36, Chris Murphy wrote:

I can't help with the call traces. But several (not all) of the hard
resetting link messages are hallmark cases where the SCSI command
timer default of 30 seconds looks like it's being hit while the drive
itself is hung up doing a sector read recovery (multiple attempts).
It's worth seeing if 'smartctl -l scterc <dev>' will report back that
SCT ERC is supported and that it's just disabled, meaning you can change
this to something sane, for example with 'smartctl -l scterc,70,70 <dev>'
(7 seconds for reads and writes), which will make the drive time out
before the linux kernel command timer. That'll
let Btrfs do the right thing, rather than constantly getting poked in
both eyes by link resets.
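
The timing relationship described above can be sketched as a small check. This is only an illustration, not part of smartctl or the kernel: the function names are made up for the example. It assumes the SCT ERC values are given in tenths of a second (as smartctl's scterc option takes them), that a value of 0 means ERC is disabled, and that the kernel command timer is read in seconds from /sys/block/<dev>/device/timeout.

```python
from pathlib import Path


def erc_seconds(deciseconds: int) -> float:
    """Convert an SCT ERC value (tenths of a second) to seconds."""
    return deciseconds / 10.0


def drive_gives_up_first(erc_deciseconds: int, kernel_timeout_s: int = 30) -> bool:
    """Return True if the drive's error recovery limit expires before the
    kernel SCSI command timer (hypothetical helper for illustration).

    An ERC value of 0 is treated as "disabled": the drive may retry for
    longer than the kernel is willing to wait, so the answer is False.
    """
    if erc_deciseconds <= 0:
        return False
    return erc_seconds(erc_deciseconds) < kernel_timeout_s


def read_kernel_timeout(dev: str, sysfs_root: str = "/sys/block") -> int:
    """Read the SCSI command timer (in seconds) for a block device such as 'sda'."""
    timeout_file = Path(sysfs_root) / dev / "device" / "timeout"
    return int(timeout_file.read_text().strip())


if __name__ == "__main__":
    # With scterc,70,70 the drive reports a read error after 7 seconds,
    # well inside the default 30 second command timer.
    print(drive_gives_up_first(70, 30))
    # With ERC disabled the command timer can fire first, causing link resets.
    print(drive_gives_up_first(0, 30))
```

If the check comes back False and the drive does not support SCT ERC at all, the other way to satisfy the same inequality is to raise the kernel timer by writing a larger value to the same sysfs file.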


Chris Murphy


