On 06/29/2015 11:42 AM, John Snow wrote: > > > On 06/29/2015 10:24 AM, Stefan Hajnoczi wrote: >> On Fri, Jun 26, 2015 at 02:27:39PM -0400, John Snow wrote: >>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 >>> >>> >>> >>> On 06/26/2015 11:35 AM, Stefan Hajnoczi wrote: >>>> On Mon, Jun 22, 2015 at 08:21:05PM -0400, John Snow wrote: >>>>> Handle NCQ failures for cases where we want to halt the VM on >>>>> IO errors. >>>>> >>>>> Signed-off-by: John Snow <js...@redhat.com> --- >>>>> hw/ide/ahci.c | 17 +++++++++++++++-- hw/ide/ahci.h | 1 + >>>>> hw/ide/internal.h | 1 + 3 files changed, 17 insertions(+), 2 >>>>> deletions(-) >>>>> >>>>> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index >>>>> 71b5085..a838317 100644 --- a/hw/ide/ahci.c +++ >>>>> b/hw/ide/ahci.c @@ -959,13 +959,25 @@ static void ncq_cb(void >>>>> *opaque, int ret) return; } >>>>> >>>>> + ncq_tfs->halt = false; >>>> >>>> Why does halt need to be cleared here? >>>> > > I remember my thinking now. I didn't want to leave it dangling after a > successful command, so I wanted to clear it on the callback. > > It's harmless, but it's weird to have it set afterwards and I wanted > it to be very crystal clear what it meant when it was set. (With this > usage case, 'halt' being set ALWAYS means that there's a command to > retry -- you don't have to test other variables like 'used' to tell if > it's a left over flag.) > > I actually want to leave it here, now. >
I am literally not awake: if I clear it in execute_ncq_command like I offered, it will get cleared on retry and we won't leave it hanging. It takes hitting the 'send' button to realize this. >>> >>> Might make more sense to clear it just on the beginning of every >>> command, in execute(). >>> >>> There's no strong reason here other than "If there's an error and >>> it should be set, it'll get reset again pretty soon." It's just a >>> default state. >>> >>> I could move it from process to execute. >> >> By the way, does ->halt need to be cleared in ahci_reset_port()? >> >> I'm thinking of a scenario where requests have failed and the port >> is reset. We should not try to reissue those commands after >> reset. >> > > If I keep it like I have it now (where it clears itself after a > successful command), we can clear it alongside the 'used' flag to > protect against the case where the guest issues a reset almost > simultaneously with an errored read command, where it might be that > the NCQ command halts the VM, then we try to reset immediately after boot. > > > (Perilously tangential side-note: what's the default error action if > you don't set rerror=stop or werror=stop? the ide code didn't make it > particularly clear to me if it was IGNORE or REPORT.) > Still curious about this bit, though.