On Feb 14, 2012, at 4:34 PM, Victor Balada Diaz wrote:

> On Tue, Feb 14, 2012 at 03:09:58PM -0800, Jeremy Chadwick wrote:
>> On Tue, Feb 14, 2012 at 11:15:27PM +0100, Victor Balada Diaz wrote:
>>> On Tue, Feb 14, 2012 at 06:17:19PM +0100, Harald Schmalzbauer wrote:
>>>> Jeremy Chadwick wrote on 14.02.2012 17:50 (localtime):
>>>>> On Tue, Feb 14, 2012 at 04:55:10PM +0100, Claudius Herder wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I have got a quite similar problem with AHCI on FreeBSD 8.2 and it still
>>>>>> persists on FreeBSD 9.0 release.
>>>>>>
>>>>>> Switching from ahci to ataahci resolved the problem for me too.
>>>>>>
>>>>>> I'm using gmirror for swap, system is on a zpool and the problem first
>>>>>> occurred during a zpool scrub, but it is easily reproducible with dd.
>>>>>>
>>>>>> The timeouts only occur when writing to disks, dd if=/dev/ada{0|1}
>>>>>> of=/dev/null is not an issue.
>>>>>> Sometimes I need to power off the server because after a reboot one disk
>>>>>> is still missing.
>>>>>>
>>>>>> I really would like to help in this issue, so let me know if you need
>>>>>> any more information.
>>>>> I find it interesting that, at least so far, the only people reporting
>>>>> problems of this type with the ahci.ko driver are people using Samsung
>>>>> disks. The only difference is that your models are F1s while the OP's
>>>>> are F2s.
>>>>
>>>> I saw such timeouts long ago and mav@ had a look at my postings and he
>>>> mentioned it could be an NCQ problem.
>>>> I suspected the disks' firmware.
>>>> I never tracked it down further, because replacing the Samsung (F3
>>>> in that case) disks with Hitachi ones solved all my problems and gave a
>>>> big performance kick as well (with zfs).
>>>> You can find the discussion here:
>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-February/055374.html
>>>>
>>>
>>> You gave me a good idea: try to disable NCQ and see if that's the fault. So
>>> I went and applied the attached patch. After it, I can no longer reproduce
>>> the issue with the ahci driver.
>>>
>>> I know this is not a solution because it disables NCQ at controller level
>>> instead of disk level, but at least we know for sure where the problem is.
>>>
>>> I think the solution would be to add a new quirk ADA_Q_NONCQ in
>>> sys/cam/ata/ata_da.c.
>>> The quirks infrastructure is already built, so adding a new quirk for this
>>> seems easy.
>>>
>>> Is someone interested? Do you think there is a better solution?
>>>
>>> If someone is interested I can build a patch to add the ADA_Q_NONCQ quirk
>>> and add my drives to it.
>>
>> I took a stab at this, but I don't feel confident this is the proper
>> solution/method. I worry there's some sort of chicken-or-the-egg
>> condition here (quirk setup/matching comes *after* SATA capabilities
>> detection), or that it makes the code messier. Need mav@'s
>> recommendations on this.
>>
>> Below is for RELENG_8. I should note I haven't tested if this works, or
>> even compiles -- normally I don't provide such patches without testing
>> so I apologise in advance / user beware.
>
> You're amazingly fast. Thanks for all your help :)
>
> You start applying the quirks before
>
>     snprintf(announce_buf, sizeof(announce_buf),
>         "kern.cam.ada.%d.quirks", periph->unit_number);
>     quirks = softc->quirks;
>     TUNABLE_INT_FETCH(announce_buf, &quirks);
>
> So you're breaking quirk setting at boot time.
>
> See my attached patch. I can confirm it works for me.
>
> Regards.
>
I don't think that disabling NCQ entirely is the right solution. It's a tag starvation issue in the firmware, not a complete failure, and it can be dealt with in the CAM XPT scheduler fairly efficiently. Alexander and I talked about this recently, and though we differ on the details, a tag hack is not in order, IMHO.

In the short term, try just using "camcontrol tags ada0 -N 1" to limit the concurrent commands to 1.

Scott
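
For reference, the short-term workaround above could be scripted roughly as follows. This is only a sketch: the helper name limit_tags and the DRY_RUN switch are made up for illustration, and the device names are whatever disks show the timeouts (ada0 and ada1 in the reports quoted above). Running "camcontrol tags ada0" without -N prints the current setting, which is worth checking before and after.

```sh
#!/bin/sh
# Sketch: cap the number of concurrently queued commands (tags) on a disk.
# limit_tags and DRY_RUN are hypothetical names used only in this example.

limit_tags() {
    dev="$1"          # CAM device name, e.g. ada0
    tags="${2:-1}"    # number of tags to allow; defaults to 1

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only print the command that would be run.
        echo "camcontrol tags $dev -N $tags"
    else
        # Needs root and a FreeBSD system with the device attached.
        camcontrol tags "$dev" -N "$tags"
    fi
}

# Example, run as root on the affected machine:
#   limit_tags ada0
#   limit_tags ada1
```

Setting DRY_RUN=1 first makes the function print the command instead of running it, which is handy for checking the device list before touching a live system.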