Re: [Qemu-devel] [Qemu-block] [PATCH 06/16] ahci: record ncq failures

John Snow Mon, 29 Jun 2015 08:49:03 -0700


On 06/29/2015 11:42 AM, John Snow wrote:
> 
> 
> On 06/29/2015 10:24 AM, Stefan Hajnoczi wrote:
>> On Fri, Jun 26, 2015 at 02:27:39PM -0400, John Snow wrote:
>>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256
>>>
>>>
>>>
>>> On 06/26/2015 11:35 AM, Stefan Hajnoczi wrote:
>>>> On Mon, Jun 22, 2015 at 08:21:05PM -0400, John Snow wrote:
>>>>> Handle NCQ failures for cases where we want to halt the VM on
>>>>> IO errors.
>>>>>
>>>>> Signed-off-by: John Snow <js...@redhat.com> ---
>>>>> hw/ide/ahci.c | 17 +++++++++++++++-- hw/ide/ahci.h     |  1 +
>>>>> hw/ide/internal.h |  1 + 3 files changed, 17 insertions(+), 2
>>>>> deletions(-)
>>>>>
>>>>> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index
>>>>> 71b5085..a838317 100644 --- a/hw/ide/ahci.c +++
>>>>> b/hw/ide/ahci.c @@ -959,13 +959,25 @@ static void ncq_cb(void
>>>>> *opaque, int ret) return; }
>>>>>
>>>>> +    ncq_tfs->halt = false;
>>>>
>>>> Why does halt need to be cleared here?
>>>>
> 
> I remember my thinking now. I didn't want to leave it dangling after a
> successful command, so I wanted to clear it on the callback.
> 
> It's harmless, but it's weird to have it set afterwards and I wanted
> it to be very crystal clear what it meant when it was set. (With this
> usage case, 'halt' being set ALWAYS means that there's a command to
> retry -- you don't have to test other variables like 'used' to tell if
> it's a left over flag.)
> 
> I actually want to leave it here, now.
>


I am literally not awake: if I clear it in execute_ncq_command like I
offered, it will get cleared on retry and we won't leave it hanging.

It takes hitting the 'send' button to realize this.

>>>
>>> Might make more sense to clear it just on the beginning of every 
>>> command, in execute().
>>>
>>> There's no strong reason here other than "If there's an error and
>>> it should be set, it'll get reset again pretty soon." It's just a
>>> default state.
>>>
>>> I could move it from process to execute.
>>
>> By the way, does ->halt need to be cleared in ahci_reset_port()?
>>
>> I'm thinking of a scenario where requests have failed and the port
>> is reset.  We should not try to reissue those commands after
>> reset.
>>
> 
> If I keep it like I have it now (where it clears itself after a
> successful command), we can clear it alongside the 'used' flag to
> protect against the case where the guest issues a reset almost
> simultaneously with an errored read command, where it might be that
> the NCQ command halts the VM, then we try to reset immediately after boot.
> 
> 
> (Perilously tangential side-note: what's the default error action if
> you don't set rerror=stop or werror=stop? the ide code didn't make it
> particularly clear to me if it was IGNORE or REPORT.)
> 

Still curious about this bit, though.

Re: [Qemu-devel] [Qemu-block] [PATCH 06/16] ahci: record ncq failures

Reply via email to